
Epic #SEOisAEO series: Voice Search is changing the game



Jason Barnard: We're going to be talking about voice search and how it's changing the game. Since I started researching this, my point of view has changed quite considerably over the last three or four weeks.

The first thing I did was look at lots of figures, and I immediately realized I needed to take them with a pinch of salt. It made me think a great deal of the line Mark Twain apparently didn't say, but made famous: "There are three kinds of lies: lies, damned lies, and statistics." And I would add to that: "there are four kinds of lies: lies, damned lies, statistics, and predictions". So, take a pinch of salt every time you look at a number, like this one, which has been doing the rounds: "40% of adults now use voice search once per day" from Location World. Or "By 2020, 50% of search will be voice" from Andrew Ng, who worked on Google Brain and is now at Baidu.

The pinch of salt is a very big pinch in my view. Eric? What do you think? You mentioned voice interactions in one of your articles - can you clarify the difference between voice search and voice interactions? And maybe help us understand why that figure of 50% is misleading.


Difference between voice interactions and voice search

Eric Enge: Well, sure, actually, the most popular "queries" are things like "call mom", "call dad", "set a timer". Things that we never would have thought of as search in traditional terms. I don't go to the Google search box and type "call mom". So I really think about it as voice interactions. That's really what's going on.

Also, I think we need to understand a little bit about how quickly this stuff is growing. There's no doubt it's growing rapidly and it's going to be very important. But there's a reason why the likes of Google and Bing don't actually share real quantitative data on the volume, and that's because it's convenient for them to have people believe the hype - they want people investing in these things.

Jason Barnard: I read an article by Will Critchlow where he goes through lots of numbers and, though he doesn't say it quite like this, he ends up with a figure of 4-5%.

Eric Enge: To deal with the broader question, it's not 50% now, it's not going to be 50% in two years. It's growing extremely rapidly. I think we can say that people are comfortable using voice with their devices - for example in one of our studies we found 25% of people in the U.S. were comfortable using voice commands with their phones in public restrooms.

And a comparable amount in theatres. And over 50% use them from time-to-time at home. That doesn't change the fact that most of what people are doing is still typing. Voice is coming strong. It's going to be incredibly important. It's going to be a huge opportunity. That's why we need to be talking about it. Not because it's taking over the world already, but because there's so much opportunity for us that if we're not positioning ourselves to take advantage of what's coming, then we're losing a huge, huge opportunity.

Jason Barnard: Thank you. I love that answer. I think that's one of the most sensible answers I've heard in a long time. And it synthesizes exactly the right position: we don't need to completely freak out now, but we definitely need to wake up so that we don't miss the opportunities that are going to come.

On to the next question: who are the players? This is, for me, a very interesting question. Google, Amazon, Apple, Microsoft - when it comes to understanding user intent, how do they stack up? How are these four major players performing today?


How do Google, Apple, Microsoft and Amazon compare in voice recognition?

Juliana Turnbull: Well, thank you, Jason. Your study, Eric, on Stone Temple's website tested nearly 5,000 queries on five devices, and the Google Assistant answered the most questions and also got the highest percentage correct. So there does seem to be quite a lot of momentum for Google. According to other studies, Apple's Siri currently has a large market share, but looking to the future, I believe it will be Google. I've seen some stats on Re-Shift Media: Google Home sold about 3.5 million units in the first quarter of this year in the U.S., compared with Amazon Alexa's 2.5 million. So Google has big potential there to gather more momentum.

But I do think that all of these voice assistants - Apple's Siri, the Google Assistant, Alexa - still have a bit of a way to go when it comes to understanding accents. The American accent seems to be the one that's most easily understood, for now. There was a study in the Washington Post saying that those with a Spanish accent were at a 4.2% disadvantage, which surprised me. But even when people speak with a slightly non-American accent there's confusion. So particularly when people use a British accent, Alexa and the other voice assistants are not 100% sure what the question is.

There is still obviously a lot to be done. I am also surprised that there isn't really much information around how to make voice search accessible across devices for people who have visual impairments or limited use of their hands. So I'm conducting a study around that, and we can share the link later.

Jason Barnard: We have Google winning the war of being available on an enormous number of devices and collecting vast amounts of data - data that will enable them to leverage machine learning and get better and better at understanding all this. So for me, Google will win the war. Just to come back to the accessibility question: I love that. I'm really keen on accessibility. Accessibility was used to cheat in SEO for years and now actually has to be done properly. That is good for SEO, but it's also good for people who have sight problems or physical disabilities. So, I love your study.

Juliana Turnbull: Great. Yeah, thank you very much. And just to home in on the point about growth: another article, on Computerworld, said that in the four months from January until the end of April, the number of hardware devices that use the Google Assistant more than tripled, rising from 1,500 to 5,000. Compare that against Amazon's 12,000 devices - they're still big, but I think they need to watch behind them, because Google is growing at a very high rate.

Jason Barnard: Yes indeed. Providing the answer. We now have these incredibly complex SERPs with answer boxes, People Also Ask boxes, knowledge panels, and so on - many more opportunities to be the answer. And that affects voice search, since getting the answer box would probably make you the answer given by one of these machines. Eric - answer boxes / featured snippets / position zero... with voice, is this all that's going to matter for us as SEOs?


Position 0 and voice search

Eric Enge: Well, I do think getting those featured snippets is incredibly important. It's not going to be the only source of answers, but the reason it's so important is that when you're in a voice environment, you generally only get one response. The reason I believe Google has been investing so heavily in featured snippets is to build an algorithm, on top of the normal search algorithm, that specializes in finding the best canonical short answer to a user's question. It builds on the normal algorithm: it looks at the top ten results and extracts the best answer. In the voice environment, it's very likely that that will be the source of your answer. But there are other sources.

Another major source: if you build yourself an Actions on Google app, that can be used to generate an answer for a voice query as well. This is something we've done at Stone Temple Consulting - we've built an Actions app, and people can invoke it without having to download or install anything, which is great. But even if they don't know about our app, if Google gets a question from a user for which our app has the answer, they might serve the answer from our app rather than from a featured snippet (via what Actions on Google calls implicit invocation). They're both opportunities. But you definitely need to be attempting to get featured snippets with your websites. Even putting voice search aside, it's a good thing to do in regular SERPs. And then in voice search, it's incredibly important.

There was a study by one agency showing that 80-something percent of voice answers were coming from featured snippets. But that does leave another 20%.

Dawn Anderson: It was Pete Meyers who did a study that found that a huge percentage were actually coming from featured snippets. Last year, when I was at the European Summer School for Information Retrieval, Google's Enrique Alfonseca said that they go to the Knowledge Graph first, pull everything out of that - which is obviously where the vast majority of the answers come from - and then they fill in the gaps from web pages; and presumably the more structured, the better.

Eric Enge: Absolutely. If somebody asks the device a question based on public domain information, like "What's the capital of Washington State?", that will be served from the Knowledge Graph, whereas featured snippets are drawn from third-party websites - or you can also get information drawn from your Actions on Google app.

Jason Barnard: Yeah. That's a good reply. I think with Dawn's interaction as well, we get a very solid picture of the knowledge graph and Position 0. But also, something I just learned that I didn't know literally five minutes ago is that actions aren't just things that people install, they can also be called implicitly by Google. Which I find really exciting.

Eric Enge: Yes. And one more thing I did want to mention: a short while ago Google released a new type of schema markup which is meant to be embedded in article content, and Google will use it as a cue from you, the publisher, that this is information you are recommending to be read back in answer to a voice query. I have not done a lot of experimentation with that yet, but it's very notable that they've got to the point of releasing such a thing.
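The markup Eric mentions is schema.org's "speakable" property. A minimal sketch of how it might look embedded in an article page (the headline, URL and CSS selectors here are hypothetical):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Voice search is changing the game",
  "url": "https://example.com/voice-search",
  "speakable": {
    "@type": "SpeakableSpecification",
    "cssSelector": [".article-headline", ".article-summary"]
  }
}
</script>
```

The cssSelector values point the assistant at the sections of the page suited to being read aloud - typically a concise headline and a short summary rather than the full article body.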

Jason Barnard: Yeah. Super! Great! On to the next question. Optimizing content for voice. If we want to provide the answer to voice, what are the tactics content-wise, Dawn?


Tactics for content for voice search

Dawn Anderson: For me, one of the biggest tips would be to add structure. Google does a lot of extraction, and content with structure helps with understanding, meaning and context. Even just adding semi-structure - using H tags, and adding ordered and unordered lists - suddenly turns a huge mass of unstructured data into something with some structure and meaning. And that's easier to extract.

Tables are very problematic. There's no really easy way to read out a table, as yet, even though tables tend to do fairly well in things like featured snippets. That raises the question: do you need almost a triple strategy? You have content for featured snippets, but then, because you know tables are not going to be easily read out, you also have maybe an ordered list to support them. Maybe even some longer-form explanatory content, because you want to capture the longer questions as well. We know that voice search queries are long, so we have to work towards meeting those fairly long and wordy questions.

But at the same time, we have this challenge of mobile first and very, very limited real estate. We have to try to get enough text in there so that there is meaning, strong semantics and plenty of clues, while realizing that people don't have a huge amount of space. So there's a potential conflict between mobile first and voice search, and it's really a balancing act between the two.

I think we also have to start thinking about the fact that a lot of past research in natural language processing has taken a notion of relatedness into consideration. And that's not the same as the relations we see in the Knowledge Graph. It's about things that co-occur together, such as "sea" tending to co-occur with "shore". And I always think about going beyond the page - when you start to link things together well, that adds to co-occurrence; it isn't necessarily just the things that spring to mind. So it's another argument for things like topical siloing, which adds that bit more context.

And obviously, answer the questions. Last year was the year we asked "how" more than anything else. Bear in mind that you need to keep the sentences short and sweet. Answer the question right at the beginning of a paragraph - there's nothing worse than a long, long, wordy paragraph with the answer at the end, as I've just done. So keep it short, brief, to the point; get the answer out quick and early. Use entities in the answers as well, because anaphora and cataphora are problematic: once you start using pronouns like "he", "they", "it" and "them", and perhaps mix multiple entities together and start dropping the names, it's very difficult for these machines to map it all together and understand who we're talking about.

In natural language processing, anaphora resolution is apparently still one of the most challenging problems going, and it still hasn't been solved.
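Dawn's tips can be sketched as a simple HTML pattern - a question as a heading, a short answer-first sentence that names the entity instead of using a pronoun, an ordered list that is easy to read aloud, and a table kept for featured snippets (the question and the figures are purely illustrative):

```html
<h2>How long should you roast a chicken?</h2>
<!-- Answer first, in one short sentence, with the entity ("chicken") named explicitly -->
<p>A 1.5 kg chicken roasts in about 75 minutes at 190 °C.</p>

<!-- Ordered list: semi-structured, easy to extract and to read out -->
<ol>
  <li>Preheat the oven to 190 °C.</li>
  <li>Roast the chicken for around 50 minutes per kilogram.</li>
  <li>Rest the chicken for 15 minutes before carving.</li>
</ol>

<!-- Table kept as well: hard to read aloud, but it performs well in featured snippets -->
<table>
  <tr><th>Weight</th><th>Roasting time</th></tr>
  <tr><td>1.5 kg</td><td>75 min</td></tr>
  <tr><td>2.0 kg</td><td>100 min</td></tr>
</table>
```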

Jason Barnard: That was incredibly complete.

Dawn Anderson: It's just a really interesting area, actually.

Jason Barnard: Per-device content. You actually started to answer this question - you were saying that you need to consider whether it's a featured snippet or whether it's being read out. What format of content are you going to have to provide to please all these different platforms?


Optimizing content per device for voice search

Dawn Anderson: I feel that you need a multimedia strategy anyway now, because people consume information differently. Some people prefer images, some people prefer video. People learn in different ways. So I don't think it's necessarily an all-or-nothing approach.

Tabular data is massively useful for things like extraction and producing information visualizations. So I wouldn't say, "Hey, you can't read an answer from a table, so just throw it out." Even in academic papers you tend to see the data visualized in a table, and then you have the insight - some people don't want to do the insight and analysis themselves. They like that long form where we pick apart all this tabular data and say what it actually means. So you definitely don't want just one or the other; both add a lot of value. We need to use the data where it best fits the purpose. Repurpose, even - there's so much great stuff out there that, to some extent, we can make better things that meet people's needs in a different informational way by using a lot of what we've already got.

Also, I think we have to make the best use of UX here. UX is massive. Using accordions, concertinas and so forth - and this is where the H2s and H3s can come in as part of those concertinas, emphasizing "this is the question, here's the answer". It's a balancing act, and that's all good fun.

Jason Barnard: Brilliant. Great answer. Wonderful. Super. Technically, what are the main points that are going to help us communicate our answer so that Google will be able to understand and present it?


Tech tips for voice search

Juliana Turnbull: Technically, what is going to help with voice search really is the foundations. I call it Brilliant Basics - in short, making sure that you have a good website. One Brilliant Basic is site speed: people are not going to stick around if the site is slow. The second, which Dawn talked about, is schema markup, so you do need to include that. And remember that the Google Assistant will read out featured snippets when it answers voice search queries, so aim to rank for position 0. The third thing is to have the content your potential clients are looking for. As Dawn said, if people are asking questions, have your answer in the first one or two sentences. If you want to know what people are asking, use Answer the Public to find out; I would also look at Google Trends. Lastly, make it easy for people to get the information, and make it easy for them to share it. We can't forget all that we've done before just because we now have voice search. It's still all about making sure that you have those great foundation blocks within your website.

Jason Barnard: Thanks, Juliana! Next is per-device tech format - a very similar question to the one I asked Dawn. If you want to get your rich snippet and also be the spoken answer, do you need to put anything in particular in place technically?


Optimizing tech per device for voice search

Juliana Turnbull: Yes, you do need to have your schema markup in place. For example, if you have a hotel, you've got customer reviews. Mark them up so you have those nice stars appearing in the search results, taking over more of the SERP real estate and making it easier and more enticing for people to click through. It's also important, if you run a site where people can ask questions - say, one offering tours around Barcelona or Paris - to put up a frequently asked questions page. Find out what people are looking for, and then answer each of those questions within a couple of sentences. Be really clear and precise.
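A minimal sketch of the two kinds of markup Juliana describes, using schema.org's AggregateRating and FAQPage types (the hotel name, figures, question and answer are all hypothetical):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Hotel",
  "name": "Example Hotel Barcelona",
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.6",
    "reviewCount": "212"
  }
}
</script>

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "Do your Barcelona tours include Park Güell?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "Yes. Every full-day tour includes a one-hour guided visit to Park Güell."
    }
  }]
}
</script>
```

Note that the FAQ answer follows Juliana's advice: clear, precise, and only a couple of sentences.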

Jason Barnard: Yeah. Super. Great. Wonderful stuff.

Dawn Anderson: I don't see a lot of people talking about using the Data Highlighter in Google Search Console. We used that for events on a project and it was massively helpful. But as soon as the dates had passed and another event rolled around, we very quickly lost the featured snippet and the voice response. So I would say you've got to keep it all up to date, especially when it's anything time-related. Everything's got to be really accurate.

I don't think it's any coincidence that Google's John Mueller said featured snippets tend to come from higher-quality sites. You've got to be accurate in the answers you're giving, and you've got to keep on top of your content. Make sure it's not old, tired and out of date.

Jason Barnard: Somebody's just posted to the chat: "we're providing the right content at the right time with the proper markup". I would add "and be credible as an answer". So, whoever posted that, absolutely wonderful.

Juliana Turnbull: This all ties in with relevancy. As Dawn said, you need to be updating the content. And I would stress that you can still keep the blog posts or some of the existing information on a website, but update it with what's coming up. I also wanted to mention that local businesses do need to optimize their local SEO. And this goes back to the point I made before: it's really about Brilliant Basics. You need to get your basics in place to be able to take advantage of new things that come onto the market, like voice search.

Jason Barnard: Sure. Yeah. Perhaps an important point about local businesses is that they can actually monetize voice search today, so it's an immediate effect we can see. That also helps us understand how this is all going to evolve. Back to Juliana. The pros and cons of creating an action or a skill. We mentioned this earlier on - actions and skills can be used implicitly. Marking up your site is also a possibility. What's your take on that?


Marking up your site for voice search

Juliana Turnbull: Yes, you do need to mark up your site. But first of all, make sure you have content that's relevant for voice search. Make sure the site is optimized, with all of the basics in place, including a fast load time. Plus E-A-T - expertise, authoritativeness and trustworthiness - which came into the algorithm earlier this year. So you really need to be even more credible and authoritative.

There was a Google data study in 2017 which said that 70% of requests to the Google Assistant are expressed in natural language. That means people wouldn't just say "bookshop"; they would ask for "secondhand bookshops in Manhattan" or "vintage clothing shops in Manhattan", and you need to make sure that your content is actually optimized for that.

Jason Barnard: Okay. Super. Thank you. Now. Making voice search profitable for Google. This is something I think intrigues us all. Eric, monetizing voice search, how is Google going to do it?


How is Google going to monetize voice search?

Eric Enge: That's a great question, because at the moment I'm pretty sure Google doesn't fully have the answer to it. But we can infer and guess to a certain extent. One way we can get at it is by looking at regular web results: the ad density you see on commercial queries versus informational queries shows you where Google determines there's a high amount of buying intent. You get much higher ad density in commercial search results than on a purely informational query. I think the same thing is going to translate to the voice environment, and for a query with commercial intent, Google may make the first response an ad rather than something organically determined. That might be one of the ways they go, keeping in mind that it's a little bit burdensome to provide multiple answers.

Of course, "this answer brought to you by so and so" is another thing that they can do. But I think the primary thing will be a focus on putting ads or monetizing obviously commercial queries.

Jason Barnard: Okay. Super. Wonderful. I like looking at “what's Google's point-of-view” and “what's our point-of-view as marketers”? So it's exactly the same question, but for businesses. What are we going to do to make money out of this?


How are businesses going to monetize voice search?

Eric Enge: Well, there are some different aspects to this question, for sure, and it really ties into part of the prior answer I gave. I do think we're going to see situations where Google creates opportunities for people to have commercialized apps. I believe Amazon is already doing this with Alexa Skills - allowing people to sponsor their skill and get it read back as an answer more often. I think those opportunities are going to emerge. Where the opportunities are today, for the most part, is actually just in reputation, branding and visibility. You build a nice Actions on Google app or an Alexa Skill, and somebody becomes aware of your Action or Skill.

Jason Barnard: Super-duper. Thank you very much.

Dawn Anderson: Can I throw in a crazy theory?

Jason Barnard: Go ahead.

Dawn Anderson: For me, we are talking about only one answer - there can only be one answer, and it needs to be the most accurate available. But because some of the queries people ask are going to get increasingly long, there isn't necessarily always going to be a single right answer, especially as people get more and more used to voice. So I suspect they won't particularly try to monetize that in the early stages, because a single answer already seems almost biased, especially if it's a paid result - there isn't any space left for anything that's not paid.

At the same time, we have Dialogflow, where all these companies are developing their conversational bots and adding all the potential answers to a question somebody might ask. So somebody might ask, "What's the best way to cook a chicken?", and a developer's conversational bot is providing answers to that: this step, that step and the other. That data then feeds into conversational search, and the Actions and Skills can be monetized because of the botability and the conversational element.

There is value in building conversational bots and using Actions on Google, isn't there?

Jason Barnard: Yeah, 100%. I love the word botability. I like the idea that you can use voice to bring somebody towards a purchase, and therefore the individual answers are worthwhile because you can bring people to your product at the end of it. Wonderful stuff. I keep coming back to the very simplistic idea, because I read it somewhere, that Google is obsessed with the Star Trek computer. I love that because it's very clear: we can easily visualize this machine that dialogues with us and comes up with an answer - and the idea is that at some point it comes up with an answer before we even know we had the question. Thoughts on making the most of voice in client acquisition?


Voice search in multi-channel, multi-device client acquisition

Eric Enge: From a client acquisition point of view, I guess we need to think about what the scenarios really are. Clearly, what we're doing with building our app is actually helping us acquire clients. And that's where branding, reputation and visibility play really well in this environment. We see people doing this in the Amazon world: WebMD has an Alexa Skill through which they're doing a great job of getting a lot of visibility, and the Mayo Clinic is playing in that space as well.

Jason Barnard: Absolutely. Perfect. Now, the last question is the timeline. We've agreed that 50% by 2020 probably isn't going to be the truth. But what is a more realistic timeline for voice becoming, perhaps not dominant, but sufficiently important that, as businesses and marketers, we need to wake up 100% to its needs?


The timeline for voice search

Dawn Anderson: I think it's going to escalate unbelievably quickly. Google's Andrei Broder (he's the guy who named navigational, transactional and informational queries) gave a keynote called "A Call To Arms In Assisted Search". Basically, I know you call the series Answer Engine Optimization, but for me it's Assistive Engine Optimization, because we're now entering an era of assistive search, of assistive systems.

They call them subordinate systems - systems that will do stuff for you and help you with the moments: the do, the buy, the ask... Here he's made a call to arms for assistive systems powered by artificial intelligence. We've seen how fast things are moving, so when that happens, it's going to hockey-stick. I track a lot of the researchers, and the speed at which the citations are coming through (and therefore the research is progressing) is massive. So whilst that 50% figure for 2020 seems very ambitious, I think we're going to see voice really start to take off massively.

Jason Barnard: Absolutely brilliant answer. I'm incredibly relieved that Assistive Engine Optimization is still AEO.

Thank you. That's a wonderful last point. So everyone can join the conversation. #SEOisAEO. Assistive Engine Optimization. That was a phenomenally interesting discussion. I learned lots of things and you guys brought an awful lot to the table and I'm really, really, really grateful for that.


This was the second episode of Epic Series by Jason.
