How To Be a Mini SEO Data Scientist

Nick Garner

Apr 03, 2015 (4 min read)

If you want to be a mini SEO data scientist, it's a case of shifting your mindset from simply accepting what you're told to challenging assumptions based on rational thinking.

Irrational thinking is accepting everything Google tells you without validating it. Most people never question why something is the way it is. As a result, they carry on with outdated assumptions and end up being the source of a lot of misdirection.

In SEO, there is no definitive answer to many huge questions. Google doesn't tell us exactly which signals the algorithm responds to; it just says there are 200 ranking factors and leaves us to work out which ones matter most. As a result, SEO is largely guesswork, and the person with the best track record and the most experience becomes the thought leader on any SEO question.

This is where being comfortable with data analysis really matters: the data gives you correlations and, with some depth of insight, you can work out what is driving those correlations.

A classic case is social shares. Lots of people say that social shares will make you rank and, as a result, there's a whole universe of sellers who will give you Facebook likes, Google+ likes and countless other social shares.

But do social shares actually get you rankings? Or is there some other force in play here?

The only way you're going to work it out is to analyze the best data you can get your hands on and start to piece together the answers.

Being a mini SEO data scientist is all about using the right data and manipulating it in the right way around a particular hypothesis. That gives you insights, which help you make better decisions and therefore be more successful in SEO.

So How Do You Become a Mini Data Scientist?

Understand Excel

The first thing is to understand Excel because it's the fastest and easiest way to manipulate information. Of course, not everyone likes Excel or has the aptitude for it. In that case, it's a question of getting hold of the right tools, ones which give you the insights you need.

Know About Metrics

The second thing is to understand what the metrics actually mean and how they relate to various websites. It's a bit like if you want to buy a car; you get to understand the costs of various cars and their features and from that you can make a buying decision.

Getting to understand the relative value and importance of metrics comes down to looking at lots of websites and associating metrics with them. After a while, you will get a feel for whether a given metric value is good or bad.

Form Useful Hypotheses

The third thing is to have good hypotheses. A typical hypothesis might be 'do these social shares help with my rankings?'. Since a hypothesis is ultimately a question, the better the question, the better the hypothesis.

Let's take a hypothesis around social shares and rankings: 'social shares get you ranked.' To test this hypothesis, we need a list of posts from a website with their social shares.

Here's a good example:

Helpfully, Lee Odden has given us a list of 25 of his top posts for 2014, with their respective shares and from that we can do some data mining and then some correlation analysis.

  • Step one: Find out how each of these pages has done on Google search. It so happens 90 DataGrabber can give me all of the key phrases that each of these URLs ranks for within the Semrush database.
  • Step two: See how the URLs have done on link acquisition by running them through 90 DataGrabber once again and getting all of the relevant Majestic metrics for them.
  • Step three: Aggregate the social shares, Google ranking data and link data, and start to look for some correlations.
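Step three is essentially a join on the URL column, much like a VLOOKUP in Excel. Here is a minimal Python sketch of that idea. The URLs, numbers and field names below are hypothetical placeholders, not the actual export format of 90 DataGrabber, Semrush or Majestic, and treating a missing URL as zero is an assumption you may want to change.

```python
# Hypothetical sample data; real inputs would be the exports from steps one and two.
shares = {
    "/post-a": 1200,
    "/post-b": 300,
    "/post-c": 750,
}
keyword_counts = {"/post-a": 40, "/post-b": 5}  # key phrases each URL ranks for
trusted_links = {"/post-a": 18, "/post-c": 9}   # followed links with Trust Flow >= 1


def join_by_url(shares, keyword_counts, trusted_links):
    """Build one row per URL, similar to a VLOOKUP keyed on the URL column.

    A URL missing from a source gets 0 for that metric (an assumption).
    """
    rows = []
    for url in sorted(shares):
        rows.append({
            "url": url,
            "shares": shares[url],
            "keywords": keyword_counts.get(url, 0),
            "trusted_links": trusted_links.get(url, 0),
        })
    return rows


table = join_by_url(shares, keyword_counts, trusted_links)
for row in table:
    print(row)
```

Once every URL sits on one row with all of its metrics, any pair of columns can be compared directly.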

Extrapolate Insights From Correlations

The first question is: what relationship is there between social shares and visibility on Google search results? Interestingly, if you use the Semrush search volume metric (which is an estimation of search traffic that comes into your site for a given key phrase), the correlation is only 1%.

However, if you compare social shares with the number of key phrases a given page ranks for, the correlation is 59%, which is a good solid number.
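The text doesn't say how these percentages were calculated. A reasonable guess is the Pearson correlation coefficient, which is what Excel's CORREL function returns, expressed as a percentage. Under that assumption, this small Python sketch shows the calculation. The sample numbers are made up for illustration and are not the data discussed here.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a column is constant")
    return cov / sqrt(var_x * var_y)


# Made-up numbers for illustration only.
example_shares = [100, 200, 300, 400, 500]
example_keywords = [2, 4, 5, 4, 5]

r = pearson(example_shares, example_keywords)
print(f"r = {r:.2f}")  # prints "r = 0.77", i.e. a 77% correlation
```

A value near 1 means the two columns rise together, a value near 0 means there is no linear relationship, and a negative value means one falls as the other rises.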

On its own, that could be taken to mean social shares somehow affect rankings directly. However, the Google team has repeatedly denied that social shares have a direct effect on the algorithm, and I agree with that.

There's a 70% correlation between the number of referring domains and social shares; the correlation climbs to 81% for the relationship between social shares and the number of followed links with a Majestic Trust Flow of one and above.

Then, if you look at the correlation between the number of key phrases a page ranks for and the number of followed links with a Trust Flow of one and above, you get a remarkable 76% correlation.
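One way to check whether some other force is in play is a partial correlation: how strongly two metrics are related once a third is held constant. This technique is not part of the original analysis. The sketch below applies the standard first-order formula to the three figures quoted above, on the assumption that they are Pearson coefficients.

```python
from math import sqrt


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    denominator = sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denominator == 0:
        raise ValueError("undefined when z is perfectly correlated with x or y")
    return (r_xy - r_xz * r_yz) / denominator


# Figures quoted in the text, read as correlation coefficients (an assumption).
r_shares_keywords = 0.59  # social shares vs. key phrases ranked for
r_shares_links = 0.81     # social shares vs. followed links with Trust Flow >= 1
r_keywords_links = 0.76   # key phrases ranked for vs. the same link count

r_partial = partial_correlation(r_shares_keywords, r_shares_links, r_keywords_links)
print(f"shares vs. key phrases, links held constant: {r_partial:.2f}")  # about -0.07
```

With these inputs, the link between shares and key phrases all but disappears once links are accounted for. That would fit the idea that links, not shares, are doing the work. With only 25 posts behind the figures, treat it as a rough indication and not proof.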

This data also validates a general hypothesis I have about rankings. My theory is that Google will simply rank a domain across more key phrases, generally irrespective of competition, if it trusts that domain.

The only real exceptions are where you get highly competitive money key phrases. In those circumstances, it does seem to be a game of link volume and strength, combined with user engagement metrics associated with search results.

Final Data Summary

Based on this data, it seems links get you rankings and social shares are indicative of the linkability of a page. So if you're writing content, you could say that a great KPI for linkable content is getting more shares. Once you have those shares, you can get out and acquire links for those pages.

As far as being a mini SEO data scientist goes, it's really a question of having good questions, getting the best data you can get your hands on and then organizing it in a nice structured way to get answers.

I make no claims about being much good at Excel; there are other people who leave me standing. But even with my limited knowledge, which goes as far as pivot tables and VLOOKUPs, I can get a lot of insight.

One of the reasons I can do this kind of data analysis is because we've built a tool called 90 DataGrabber that allows us to mine data in new ways, really easily. So I urge you to download a copy and have a play here. And welcome to the exciting world of SEO enlightenment!

Are you interested in learning more about data science and data mining? View "Data Mining with Semrush: Unlocking SEMrush for the Serious SEO'er," a webinar with Nick Garner and Semrush Marketing Director Michael Stricker.

Nick Garner is founder of 90 Digital, a specialist agency for SEO in competitive verticals, digital PR and enterprise web development. Coincidentally, 90 Digital has a team of just over 90 people. His last article for Semrush was "Data Mining with SEMrush."