
Structured Data Beyond Google And Back




Paul: That's it, and we're live for Schema, Beyond Google and Back. I'm pleased to be on this webinar, joined by two Jedis of Schema, as I always call you two. Hello Jono and Martha. I would introduce you, but I'm sure you'll do a much better job than I would. We'll get started and I'll let you introduce yourselves. You go first, Martha, ladies first.

Martha: Sure. So my name is Martha. I'm the CEO at Schema App and I'm Canadian. I'm just really delighted to be here to talk a little bit about some advanced Schema markup tactics.

Paul: That's nice. Jono.

Jono: Hi, I'm Jono. I run special ops at Yoast. I don't know what that means either, but I spend a lot of my time thinking about things to do with Schema. And I'm an SEO and technical, digital marketing nerd.

Paul: I see there's more and more people joining. We've got quite a few people in the chat now, so I don't see any reason to wait around and let people join. Let's get this show on the road. We're going to take it that people have a general understanding of what structured data is and what schema.org is.  I believe you've got something to share with us, Jono. I believe he is going first.

Jono: For so long there's been a bit of a chicken and egg problem, that nobody really adopted Schema because it was hard to do and complex and the rewards weren't that great. Google and schema.org and the people around it didn't really get involved in accelerating it. 

It just needed to hit that tipping point and we are so far past that tipping point now. Structured data is a primary citizen of all our shared conversations and not just this kind of awkward technical thing. And it's increasingly becoming the main way that we describe our products, our websites, our services. This is an exciting time.

Paul: I've noticed that schema.org has been updating every month since about April of last year. And Jono, you said earlier this year, "It's going to be getting more complex and there's going to be more changes coming." It's definitely a hot topic and something that's needed some attention for a while.

Martha: Absolutely, especially with what we're going to cover today. We're going to see how you can think of it as a core strategy across any type of business and any type of content, and why it's about more than just getting rich snippet features. It's really about understanding that underlying piece of how things are connected.

Paul: Ready Jono.

Google’s Approach to Structured Data

Jono: Right. Let's go, yeah. We're going to do a triple-speed run through, about 15 minutes or so of some of my rambling thoughts. And then pass over to Martha, to ground us back to reality and give us something actually useful and then wrap up with some Q&A.

None of this is new. Schema is not new. You all know the basics, but as we've said, adoption has sucked to date because it's been complicated and difficult and abstract.

One of the things that's really changed recently is that Google have started pushing much, much harder for people to adopt it, for a whole bunch of reasons. Not least of which is that they want to transform the conventional search results, the 10 blue links, into something like this slide: full of widgets, cards, tools, answers and experiences.

To do that, they have to be able to do more than just scrape content; they have to understand it. And the mechanism they're using to gain that understanding is structured data and schema.org.

Strategically for them, this is hugely important. It's the foundation for how they compete with Amazon, with Netflix, with all the people who are fighting for our time and attention and taking away money that they see as theirs. Whether that's the launch of a new vertical like Google Flights or being able to buy things directly in the SERPs, they see this as their landscape, and in order to get there, they have to have structured data.

However, Google aren't perfect and sometimes not everybody agrees with their strategies and approaches, not least of which in the world of structured data. Where quite often, they come under fire for being perceived as stealing people's content, taking their clicks and so on. 

And furthermore, schema.org is like a language, which means it's prone to the challenges of grammar and interpretation and inconsistencies and pronunciation, and these are all challenges that Google face with their interpretation of Schema. It's unique, it's influenced by their processes, their politics and their technology and sometimes the details of where they get it right and wrong and have different opinions, are quite valuable to understand. 

Schema.org is a standard and it's defined. Yes, it's mostly run by Google, it's slightly awkward, but it should be an independent thing from Google's implementation of it. But for example, an article has to have an author and a publisher in order to be a valid article. However, in Google's rules, a publisher cannot be a person. 

But I have a blog, I publish content, I am the publisher of those articles. But as far as Google is concerned, that's invalid Schema, even though technically it is valid. Local businesses couldn't until recently have multiple opening hours within a single day, even though a lot of Europe takes a nap in the afternoon. There's no way to describe this.
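
As an illustration of the point above (not something shown in the webinar), here is a minimal Python sketch that builds JSON-LD for a personal blog post whose publisher is a Person. The names and URLs are placeholders, and `publisher_type` is a hypothetical helper. The schema.org vocabulary lists both Organization and Person as expected types for `publisher`; how any particular consumer treats that is a separate question.

```python
import json

# Hypothetical blogger; the name and URL are placeholders.
author = {
    "@type": "Person",
    "name": "Example Blogger",
    "url": "https://example.com/about/",
}

article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "An example post",
    "author": author,
    # schema.org expects an Organization or a Person here.
    "publisher": author,
}


def publisher_type(doc):
    """Return the @type of the publisher node, or None if there is none."""
    publisher = doc.get("publisher")
    return publisher.get("@type") if isinstance(publisher, dict) else None


print(json.dumps(article, indent=2))
print(publisher_type(article))
```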

Their interpretation is scattered and there's lots of trial and error and fiddling and faffing to try and get it right, because there's no documentation. It's radically inconsistent across and within Google. And the different teams who are responsible for different bits of each snippet have different interpretations and standards and approaches.

An article must have a mainEntityOfPage property, but FAQ markup must be the mainEntity of a page. It sounds very similar, but it's a polar opposite in approaches. Two different teams interpreting the rules in two radically different ways.
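
To make the contrast concrete, here is a minimal Python sketch, assuming placeholder URLs and text, of the two shapes being described. In the schema.org vocabulary, `mainEntityOfPage` and `mainEntity` are inverse properties: the article points at its page, while the FAQ page points at its questions.

```python
import json

PAGE_URL = "https://example.com/post/"  # placeholder URL

# The Article points "up" at the page it is the main entity of.
article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "An example post",
    "mainEntityOfPage": {"@type": "WebPage", "@id": PAGE_URL},
}

# The FAQPage is itself the page and points "down" at its questions.
faq_page = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [
        {
            "@type": "Question",
            "name": "Is this a placeholder question?",
            "acceptedAnswer": {"@type": "Answer", "text": "Yes, it is."},
        }
    ],
}

for doc in (article, faq_page):
    print(json.dumps(doc, indent=2))
```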

People get confused when they think that the warnings that Google show mean they're doing something wrong when it actually means that Google's interpretation of the standard isn't necessarily right. This is a bit of a mess.

This also leads to all sorts of interesting tactical opportunities, which I don't recommend doing, but it's worth noting that for quite some time you could put HTML links in FAQ answer content. This is an affiliate thing showing up directly within Google, from somebody's Schema, which is mind-blowing.

But I think more interestingly from a Schema perspective, the data type for an answer to a question is text. And that was a strictly defined thing at schema.org. And it means what it says. The answer to this should support text. This is HTML markup.

And when you read through Google's documentation and interpretation of these things, they have different definitions of what text means in different places. And the rules, again, are different for what text means in an article.
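
A small Python sketch of how you might spot this in your own markup. It uses only the standard library, `tags_in_answer` is a hypothetical helper name, and it simply lists any HTML tags found inside an answer's `text` value. Whether markup is acceptable there depends on whose interpretation you follow, as discussed above.

```python
from html.parser import HTMLParser


class _TagCollector(HTMLParser):
    """Collects the names of HTML start tags seen in a string."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tags_in_answer(text):
    """Return the HTML tag names found in an Answer's text value."""
    collector = _TagCollector()
    collector.feed(text)
    collector.close()
    return collector.tags


plain = "Yes, we are open on Sundays."
with_link = 'See <a href="https://example.com/offer">this offer</a>.'

print(tags_in_answer(plain))
print(tags_in_answer(with_link))
```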

Schema Beyond Google: A Connected Graph and Cross-Page Referencing

One of the things I think is really interesting to start thinking about, if we open this can of worms about Schema beyond Google, is that Google's whole paradigm, the way they think about the web, the way they think about links and content and entities, is that one page should be representing one thing.

That model has shaped the modern web as we know it. A webpage has a URL and it talks about a thing because that was the keyword you want to rank for. It's what you want to develop it for. 

I don't know if that's right. I don't know if it's the best way that we can build an internet. I struggle to conceive of other ways because this is so predominant, but it's not necessarily the best.

This is their paradigm. And the way that they define how we have to approach Schema, where every page has to have a main entity and we have to map out all the relationships between them, isn't necessarily the only way that it could be done.

This is why adoption has struggled because if I'm writing a page or a post, I can open up my tool and I can say, "This page is an article or it's about an event or a book." But that's only one aspect. What if I want to describe an article which I wrote, which is about an event, which is being held at a local business, which is a restaurant, which is a book signing by a local chef who's selling that book? Where is that option on the dropdown?

Describing the world in its nuance to unlock new possibilities and capabilities cannot be done in a model where one page is about one thing. To describe the world in a valuable and meaningful way, which allows us to get beyond Google, which I'll come on to, we need to be building a connected graph, not just bits of Schema.
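
The article-about-a-book-signing scenario can be sketched as a connected graph rather than a single typed page. This is an illustrative Python sketch, not markup from the webinar: every name and URL is a placeholder, the choice of properties (`about`, `location`, `performer`, `workFeatured`) is one reasonable reading of the scenario, and `dangling_references` is a hypothetical helper that checks every `@id` reference resolves to a node in the graph.

```python
import json

BASE = "https://example.com"  # placeholder site


def ref(fragment):
    """Build an @id-only reference to another node in the graph."""
    return {"@id": f"{BASE}/{fragment}"}


graph = {
    "@context": "https://schema.org",
    "@graph": [
        {"@type": "Article", **ref("post/#article"),
         "headline": "A book signing at the bistro",
         "author": ref("#me"), "about": ref("post/#event")},
        {"@type": "Person", **ref("#me"), "name": "Example Author"},
        {"@type": "Event", **ref("post/#event"), "name": "Book signing",
         "location": ref("#restaurant"), "performer": ref("#chef"),
         "workFeatured": ref("#book")},
        {"@type": "Restaurant", **ref("#restaurant"), "name": "Example Bistro"},
        {"@type": "Person", **ref("#chef"), "name": "Example Chef"},
        {"@type": "Book", **ref("#book"), "name": "Example Cookbook",
         "author": ref("#chef")},
    ],
}


def dangling_references(doc):
    """Return @id references that do not match any node in the graph."""
    known = {node["@id"] for node in doc["@graph"]}
    missing = set()

    def walk(value):
        if isinstance(value, dict):
            if set(value) == {"@id"} and value["@id"] not in known:
                missing.add(value["@id"])
            for child in value.values():
                walk(child)
        elif isinstance(value, list):
            for item in value:
                walk(item)

    walk(doc["@graph"])
    return missing


print(json.dumps(graph, indent=2))
print(dangling_references(graph))
```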

Google's model is simplistic and it forces us to behave as if the web is simplistic. And we only produce and write the Schema that gives us the immediate rewards that they're doling out. There is so much more that we can do with this when we get beyond what thing is on this page.

One of the other challenges when we're stuck within a Google-specific world is that they don't support cross-page referencing. Now part of the point, the very foundational point of linked structured data is to build and describe structures that span multiple pages and properties and websites and environments and posters and TVs and formats.

You cannot say, "This article is written by the author represented on that URL." They have no mechanism to allow you to describe that. You've got to build the whole graph of everything you want to represent on a page on that page. 
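
For comparison, here is a hedged Python sketch of what cross-page referencing could look like for a consumer that chose to support it. JSON-LD lets a node point at another node by `@id`, so the article page can reference an author defined on a different URL. The URLs are placeholders and `merge_by_id` is a hypothetical helper showing how a consumer might join nodes collected from several pages. It says nothing about how Google or any other real consumer behaves.

```python
# JSON-LD as it might appear on two different pages (placeholder URLs).
article_page = {
    "@context": "https://schema.org",
    "@type": "Article",
    "@id": "https://example.com/post/#article",
    "headline": "An example post",
    # Reference only; the author is described on another page.
    "author": {"@id": "https://example.com/about/#person"},
}

about_page = {
    "@context": "https://schema.org",
    "@type": "Person",
    "@id": "https://example.com/about/#person",
    "name": "Example Author",
}


def merge_by_id(*documents):
    """Merge nodes from several pages into one dict keyed by @id."""
    merged = {}
    for doc in documents:
        # A document is either a single node or a wrapper with @graph.
        for node in doc.get("@graph", [doc]):
            node_id = node.get("@id")
            if node_id is None:
                continue
            properties = {k: v for k, v in node.items() if k != "@context"}
            merged.setdefault(node_id, {}).update(properties)
    return merged


nodes = merge_by_id(article_page, about_page)
author_id = nodes["https://example.com/post/#article"]["author"]["@id"]
print(nodes[author_id]["name"])
```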

What I'm hinting at is whilst I love Google and we work closely with them on a bunch of these standards and they're doing a great job and they're lovely people, there is room for more. 

Google's handling of Schema is imperfect and, more interestingly, incomplete. They're only interested in the bits that they can reward in search results. That's the tip of the iceberg, but they exert a lot of influence over Schema for their own benefit.

Structured Data Looks Beyond Google’s Interpretation

But schema.org is open source, which means that there's some stuff we can start to think about and do. Google's largest competitive advantage is that they can crawl and scrape and store and process data from the web at scale. In a world where more websites have more connected data and structure and describe their things, other brands, other platforms, and other products can start to intelligently extract and use that information at a fraction of the cost that Google does.

That enables new search engines, new apps, new Internet of Things concepts, new machine learning capabilities. That's pretty exciting beyond just search results. It allows us to start to think about the internet as being units of things smaller than just pages, of components, and blocks, and paragraphs, and entities. If a page contains six different things, that should be fine, but in Google's paradigm it really isn't at the moment.

Some practical starting points in two minutes or less. Go look at pending.schema.org, Martha's going to mention this, this is where all the upcoming stuff is documented. If you want to get ahead of the game and beat Google to defining how these things work, we have to react to how people are already consuming Schema.

If you have a business that operates in any of the sectors which look like they are being covered here, get there first and define the rules. You want to go further on defining the rules, contribute to and extend the spec. 

You want to get really far beyond Google? Well, become a consumer. If this stuff is structured and easy to consume and understand, there's no reason why your business and website and tools can't scrape and extract it in the same way, or maybe even better than Google can, maybe with some slightly better adherence to schema.org's rules. 
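
As a starting point for becoming a consumer, here is a minimal Python sketch that pulls JSON-LD blocks out of a page's HTML using only the standard library. The sample HTML and the names `JsonLdExtractor` and `extract_jsonld` are illustrative assumptions. A real consumer would also need to fetch pages and handle other syntaxes such as Microdata and RDFa.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of application/ld+json script tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.documents.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip malformed blocks rather than failing the page.


def extract_jsonld(html):
    """Return a list of JSON-LD documents found in an HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.documents


SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Restaurant", "name": "Example Bistro"}
</script>
<script type="application/ld+json">{ this block is broken }</script>
</head><body><p>Hello</p></body></html>
"""

for document in extract_jsonld(SAMPLE_HTML):
    print(document["@type"], document["name"])
```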

There are all sorts of opportunities here. I'm going to shut up now because Martha's got much more interesting stuff to say. Good luck. Thank you.

Paul: That was absolutely great. Even I learned a lot just listening to that. 

Martha: I loved that. I especially loved the rail on the inconsistencies because I do think that that is a challenge. Like we could talk for the rest of the half-hour just about testing tools and the inconsistencies there.

Tactics for Connecting Things on Your Website with Schema

Based on that foundation that Jono has given us, I'm going to get into some tactics: how do you start connecting things on your site, and what are some tactics you can use to do it?

I always start by sharing my knowledge graph, which is actually like how I'm connected to other things on the web and in the world. I worked at Cisco for 14 years, I have a technical background. I'm one of the founders of Schema App, I'm Canadian, and I also used to own this awesome 1965 Austin-Healey Sprite. And my actual car was in the movie Losing Chase, which was directed by Kevin Bacon. Through my car, I'm like two degrees from Kevin Bacon and you wouldn't know that unless I explained how these things are connected. 

And so that really goes for your site and businesses or your blogs as well, right? You can use the Schema markup to explain how things are connected, how they relate to one another, whether that be sort of within your own site and your own business, or sort of outside of the web and on different entities.

I'm a big believer in not putting the same Schema markup on every page, because you want to explain what that exact page is. And instead what you want to do is try to use that most specific type from schema.org. And schema.org has over 800 different types. 

Previously I was talking about not chasing features, but instead thinking about what is on this page, how are those connected and how does that then connect across the business? One of the key things is around nesting markup within the primary topic of the page. 

And what this means is actually figuring out how they're connected. And so there's actually a free tool on our website called Schema Paths, you can find it under resources, tools, Schema Paths, or just Google “Schema Paths”. For example, if you had a local business page that was talking about the local business and also included some FAQ on the page and you want to figure out how to do this connectedness, how do you take step one to nest them?

We would actually tell you, "Okay, if your local business is the primary topic of the page, you can use subjectOf and then add FAQPage and nest it. Or if the page is primarily about the FAQ and you want to link it back to the local business, you can use the property about." I really encourage you, if you're trying to do this more connected, nested Schema markup, to use Schema Paths. It's a great, free tool that will help you do that.
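
The two nesting directions described here can be sketched in Python as follows. The business name and question are placeholders, not output from Schema Paths. `subjectOf` and `about` are the schema.org properties named above.

```python
import json

# Placeholder question shared by both variants.
question = {
    "@type": "Question",
    "name": "Do you take walk-ins?",
    "acceptedAnswer": {"@type": "Answer", "text": "Yes, on weekdays."},
}

# Variant 1: the page is primarily about the business; nest the FAQ in it.
business_first = {
    "@context": "https://schema.org",
    "@type": "LocalBusiness",
    "name": "Example Business",
    "subjectOf": {"@type": "FAQPage", "mainEntity": [question]},
}

# Variant 2: the page is primarily an FAQ; link it back to the business.
faq_first = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [question],
    "about": {"@type": "LocalBusiness", "name": "Example Business"},
}

for doc in (business_first, faq_first):
    print(json.dumps(doc, indent=2))
```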

The other piece that I think is really key is how we start connecting things and defining things beyond our own definition. If we define them within our own business, that's great, but there are things like Wikidata, Wikipedia and Google's Knowledge Graph that already define a lot of these entities.
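
One common way to point at those external definitions is the schema.org `sameAs` property. The Python sketch below is an assumption about how that might look, not an example from the webinar: the clinic is a placeholder, and `external_ids` is a hypothetical helper that returns the external URLs attached to a node.

```python
import json

# Hypothetical clinic; the business name and URL are placeholders.
clinic = {
    "@context": "https://schema.org",
    "@type": "MedicalClinic",
    "name": "Example Clinic",
    "url": "https://example.com/",
    "areaServed": {
        "@type": "City",
        "name": "Toronto",
        # Point at existing definitions instead of redefining the city.
        "sameAs": [
            "https://en.wikipedia.org/wiki/Toronto",
            "https://www.wikidata.org/wiki/Q172",
        ],
    },
}


def external_ids(node):
    """Return a node's sameAs values as a list (empty if there are none)."""
    value = node.get("sameAs", [])
    return [value] if isinstance(value, str) else list(value)


print(json.dumps(clinic, indent=2))
print(external_ids(clinic["areaServed"]))
```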

Within the Schema markup, instead of having multiple entities, they're all nicely connected and define exactly what it is. And you can do this ad nauseam. One of my favorites is FAQ with features within Google. You can also figure out how to nest them so that pages that might not get a rich result, you can actually start getting rich results for if you can update that content.

What we want to do here is look at, these are very specific types of Schema markup for a specific industry. And so here, I've just given you examples of the types that exist within health and life sciences. But also types that are also just in schema.org that may be used by health and life sciences, such as a product, a news article, an FAQ, et cetera.

Now, when someone does a search, the reason I like looking at this is that you have to think about how we're going to figure out the connected things that would allow us to answer the question of a Lyme doctor in Toronto.

Don't take it as a challenge to use all the properties, but know that they're there in order to help you explain things appropriately beyond the features in Google. 

Using Schema to Explain Your Business

Now we tend to like to look at describing what are the things within the business, and then looking for ways to basically use those rich results or those features within it. If it's a doctor, let's look at a rating. If it's a hospital, let's look at an FAQ, if we're describing these different things. But more importantly, know that whatever your business is, you can actually use schema.org to explain what it is and how it relates to others. Then look at ways that you can maximize the features within Google.

I often talk about this as the starting point of doing your Schema markup strategy. It isn't necessarily about what features you can get, but what is it within the business you need to be understood and explained. And then how do you optimize that, connect it to other pieces and then really think about the content on the page in order to maximize features?

Those are the tactics I had to share in order to set up our conversation to say, "How do you think really about Schema markup beyond just the feature, but really by explaining your business and understanding how it's all related."

Paul: That's it. That was great. That's the bit I love, is the way that you can connect all of the dots together, I call it connecting the dots. And it is the fact that you're able to send a much richer and a much deeper understanding of what that specific bit is that you're trying to explain. We've got lots and lots of questions in the chat here, which I was expecting to see.

What Is the Easiest Way to Write Schema?

The first one is from our friend, Simon Cox, "Other than plug-ins, what is the easiest way to write Schema?" 

Jono: Do you want to do this Martha, I think we're aligned on this, aren't we?

Martha: Sure. Well, a friend of ours recently tweeted and said, "I spent all day Saturday writing Schema markup." And Jono and I were like, "I don't know about you, but I was out playing with my kids."

You can learn how to do it and I think reading it is good. In fact, Google has a code lab, to help you make sure you understand it. But I would say generate your Schema markup, and then look to edit it and learn it. 

And so you could use a free generator (I tend to like Merkle's generator, I think it's really good). Or, depending on the type of site you have, there are plug-ins like Yoast that will serve you if you have primarily articles, FAQs and the things that they cover. But then again, if you're looking to do things that are more advanced, I would still encourage you to find something to generate it.

I would start with a free generator and try to reverse-engineer how it's done. Most people will tell you that with Schema markup, they've spent hours trying to figure out where the missing comma or quote is. Especially with nested Schema markup, it can get pretty confusing as to where to put things. I would say, start with generators to get a sense of what it is.
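
For the missing-comma hunt specifically, a JSON parser will usually tell you where it gave up. Here is a minimal Python sketch using the standard `json` module. `check_jsonld` is a hypothetical helper name, and it only checks JSON syntax, not whether the markup is valid schema.org.

```python
import json


def check_jsonld(text):
    """Return None if text parses as JSON, else a string locating the error."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None


# The comma after "FAQPage" is missing on purpose.
broken = """{
  "@context": "https://schema.org",
  "@type": "FAQPage"
  "mainEntity": []
}"""

fixed = broken.replace('"FAQPage"', '"FAQPage",')

print(check_jsonld(broken))
print(check_jsonld(fixed))
```

The parser reports the point where it could no longer continue, which is typically at or just after the actual mistake, so look at the end of the preceding line as well.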

Because your time is much better spent strategizing on how you want to represent your content and nest it and figure it out, rather than writing it. But that's from those of us that have been generating it since 2012, 2013.

Jono: I think this is really important and I see so many people talking about things like, "Oh, we're using Google Tag Manager to write our own Schema." If you're just doing that on one page, for one very specific scenario, for one type of rich result you want, great, go and do it. But when you're starting to do more complex descriptions, the type of things you referenced, it gets really hard, and it's got to be maintained over time.

Paul: I do a lot of stuff with structured data out of GTM, and on a lot of those sites it's fragile. You've only got to have a web design change or a CSS class change, and then it just breaks. If you're in safe hands, then you're halfway there, I think is the word for it.

Will Google Change Its Approach to Structured Data?

Jeff Kennedy: "Do you foresee Google shifting their approach anytime soon? I.e. taking a more holistic view on Schema?"

Jono: No. I've spoken to various teams working on various parts of this and some of them are... They differ radically in ideology and approach, et cetera. One of the things I've trained myself to do, and we're trying to do internally, is to stop thinking about Google as a monolithic organization.

Remember that it is bunches of small teams operating much more like a university of project groups, experimenting with ideas. And I think culturally, that's not going to change anytime soon. Sometimes there's obviously a directive from up high, which gives them direction. But these bits don't talk to each other. At least, not very well.

Martha: Yeah. I agree with you, Jono. It's been the same in my interactions with them; we're seeing a really different range. And I always use this example: if John Mueller says something, he's saying it within his area of knowledge, but might not understand it on a broader scale.

Choosing the Right Types of Schema for Your Business

Paul: With over 800 different types of Schema...which ones do you choose from? Which are applicable? Or are there only a few? How far is too far with markup?

Martha: I actually think it's a business question, to start. Which is how do you represent what's important for people to understand about your business? And the content that exists today to do that. And then you want to ask the question, which is like that backwards engineering piece, of what makes most sense to add to your content to help you further define and explain what it is and how it is.

We often say, don't use Schema App as a challenge to fill out every property. We don't want to do that, but they're there to help you represent what's important to your users and your business, and to prompt ideas on what you might want to put on there.

It really is, what are the important things about your business? Where do they exist on the web? And then it's just pulling in that content piece and like, can I get a rich result for that? And how do I then tie all those things together? I would say the different types really relate to your business, right?

Schema and Ambiguity

Paul: Another question from our friend Dwayne Diezel, about Schema wrap this time. What are the drawbacks of being ambiguous when applying Schema or other structures? What is the worst that can happen?

Jono: Oh, ambiguity is interesting. If you falsify stuff outright, you're obviously at risk of Google slapping you and saying you're no longer eligible for X types of Schema, or any kind of structured data. 

But there's definitely an interesting middle-ground for ambiguous stuff. I guess you could make vague statements and claims and not quite define things, but then you're not going to achieve anything. Unless you're sending strong signals about this is about that and is the main entity of this thing, and they're related in this way. Then it's just going to be code on a page, right? It's not going to describe things accurately. 

Martha: Exactly. That last line that Jono just said, you're trying to be very clear about this. If you're being ambiguous, I would say, why? And are you actually providing, are you doing a service to your users or yourself? It sounds like probably you should just go drink coffee instead of writing that Schema.

Are Other Web Ontology Languages Worth Using?

Paul: Another one from Dwayne. He's told us it's his last question. Is there any point in using other ontologies? I think that's it, yeah. Like OWL or FOAF? I don't know if you've heard of those. I certainly haven't.

Martha: What Dwayne is talking about is sort of the underlying ontology technologies, right? My co-founder, Mark, who's the semantic technologist, he would say, "schema.org is a pretty robust vocabulary that describes a lot of different things."

I would say there's enough robustness within schema.org that you don't need them, unless you're actually starting to build an ontology for an enterprise, where it needs to be extended, maybe with an internally hosted ontology that has specific definitions. But I'd say for the most part, no one needs to do that.

How to Make a Business Case for Schema Beyond Google

Paul: That's great. Another question from Jeff Kennedy. How would we make a business case for a site to implement Schema beyond what is supported by Google? 

Jono: Because Google is not the only search engine, because what they support will change tomorrow, and you absolutely want to beat your competitors to whatever shiny, new, rich results are coming. 

Increasingly, even outside of Google, Facebook, Bing, et cetera, voice is a thing. Your toaster is connected to the Internet. Your TV will use structured data to determine what to show you. Netflix will... This is the new way that systems harvest data to market to you, to understand you, to connect and solve information.

If your brand is either withholding that or not exposing it, these services aren't going to be able to use it to recommend you. And increasingly, Google is so far ahead on this. When you search and you see a list of cards and rich snippets and results, they are only selecting and exposing those from brands and content that they have understood. If you're not marking those up in a way that they can understand, you never get in front of the consumer in the first place, so it doesn't matter how well you're competing elsewhere. You can't access the market. This is the tool.

Closing Tips on Structured Data

Paul: We're at the top of the hour. Anyone else that's got any further questions, please do reach out to any of us, via any of our social profiles. And I'm sure we'll pick up on those in no time at all.

It's been absolutely great having the opportunity to sit with both of you guys and talk through this. 

Martha: Absolutely. Any time.

Paul: If you guys would like to just give two quick tips, which is probably really hard on this subject. Just for the audience to... I mean, whether it's a go and learn somewhere or where there's a good school.

Martha: Well, I'll call out Schema Paths, which I already mentioned. It's a free tool on our website, and it helps you start to think about that connectedness. And I think if anything, it'll allow advanced users to start to figure out how to use some of those, maybe, unknown properties to figure out how things are connected.

And then I would say, as you're building a business case or trying to get started or playing with nesting, I really like FAQ. And we actually shared a webinar with a ton of content and examples of how you can do that on product, organization, you name it. Those, I guess, would be two learning pieces that I think are really valuable and help you get started on this journey.

Paul: And, Jono?

Jono: Mine would be, go and read through the schema.org page that lists all of them. There's just some really fascinating stuff in there. And that's in particular, outside of the very obvious stuff. There's a whole section of intangible things like mathematical calculations, and data sets and ideas. All of these are things that can be represented. And again, only do it if you can find the business case.

And the other suggestion would be, think about what goes into an average web page. There's loads of cool stuff you can do with a few pages and eCommerce. There is so much depth and richness you can use to describe just an average web page. And that's a really fun way of learning like, how many things can I describe on this? This thing has a description, it has a name, it has a site. That's really awesome. Try that.
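
As a starting point for that exercise, here is a minimal Python sketch describing an ordinary page. The values are placeholders, the property selection is only one possibility, and `described_properties` is a hypothetical helper for counting how many things you have described so far.

```python
import json

# A plain "about" page on a placeholder site.
page = {
    "@context": "https://schema.org",
    "@type": "WebPage",
    "@id": "https://example.com/about/#webpage",
    "url": "https://example.com/about/",
    "name": "About us",
    "description": "Who we are and what we do.",
    "inLanguage": "en",
    # The site this page belongs to.
    "isPartOf": {
        "@type": "WebSite",
        "@id": "https://example.com/#website",
        "name": "Example Site",
        "url": "https://example.com/",
    },
}


def described_properties(node):
    """Return the sorted schema.org property names used on a node."""
    return sorted(key for key in node if not key.startswith("@"))


print(json.dumps(page, indent=2))
print(described_properties(page))
```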

Paul: Yeah. That's super. Well, thank you SEMrush for letting us do this. Yeah.

Martha: Thank you.

Paul: If there's any questions, anyone watching, then just please do reach out.

Jono: Good luck. Thank you very much.

