Advanced Keyword Research for Epic Content

, July 11, 2014

Are you out of ideas or lacking inspiration? Do you want to write great content, but aren’t sure which topics to choose?

Let’s take a look at the pages that bring your competitors the most traffic, and let’s review what topics will make you visible on the search engines. But first, let me point out that this topic is somewhat technically challenging. This article, however, is accompanied by a script, which will do almost all of the work for you.

Using competitive research and OpenRefine, we will find new keywords for content creation.

What will the outcome look like?

The outcome will be a list of the most effective landing pages of a given domain, in terms of both organic traffic and their potential. Calculation of these metrics will be stated further in the text.

In short, SEMrush calculates traffic from the ratio of position-to-phrase search frequency. The potential of a given landing page is calculated as a ratio of search frequency of phrases, which correspond to the landing page to their competitiveness. Meaning, the landing page that contains a lot of non-competitive searched phrases will have a greater potential to rank higher.

Result

*Results should look like this (all images in this article are clickable)

In the first column you can see an overview of individual landing pages sorted by calculated potential. The second column contains a list of the five most effective phrases (effective in terms of expected traffic), belonging to a given landing page, which also matches their position (number in square brackets), their search frequency (number in round brackets) and competition (number in curly brackets). In the following columns you can see the average position, overall search frequency, average competitiveness and potential belonging to the specific URL.

Let’s consider two basic outcome analyses that should be done:

1.  Sites with the greatest potential are those that appear for many search phrases, ideally with the smallest competition possible. Which of the phrases (that my competition has) can be beneficial for me as well? And how can I compete with this content? Or, how can I use it?

2.  How do landing pages cover most phrases, and how do they bring the most traffic to my competitors? Why is this not working in my case?

The question remains: How do I get the desired outcome?

Below you will find instructions on how to achieve it.

Preparation

Before you start the analysis, it is necessary you have OpenRefine software installed on your computer. You will find setup files for all operating systems here: http://openrefine.org/download.html.

Apart from OpenRefine, you will also need to know your competition in organic results. You can use tools like SEMrush, KeywordSpy or iSpionage. Here is an example using SEMrush: http://www.semrush.com/info/example.com+(by+organic).

Getting started

You will have to export the data you gathered from your tool of choice. In the example below, I used data from SEMrush.

For those who do not have a paid account and would like to test using larger data, I have downloaded a few examples you can use:

  • Moz.com: http://podstavec.cz/tools/moz.com-domain_organic-us.csv
  • Reuters.com:http://podstavec.cz/tools/reuters.com-domain_organic-us.csv
  • Gadling.com: http://www.podstavec.cz/tools/gadling.com-domain_organic-us.csv
  • Vogue.co.uk:http://podstavec.cz/tools/vogue.co.uk-domain_organic-us.csv

Next, access OpenRefine. It should open in a new browser window as 127.0.0.1:3333. It should look like this:

SEMRush OpenRefine opened
 

Insert the downloaded file and click on “Next.” Then, do not modify anything, just name your project at the top right corner.

Click on “Create Project.” Shortly afterward, a project should open in OpenRefine. If you use the dataset from Gadling.com, it should look like this:

Gadling create project
 

So far, it is only a simple export from SEMrush. Let’s continue.

Processing data using OpenRefine

OpenRefine is a great tool for the processing, cleaning, transformation and acquisition of data. Unfortunately, discussing how it works and what it is for would require not only an article, but at least an e-book!

OpenRefine allows you to create scripts that can perform certain processes using any dataset, which is also what I did. You can download the script at http://www.podstavec.cz/tools/semrush-script.txt.

Copy this script to the clipboard (CTRL + C) and then go to the “Undo/Redo” section (found on the left). Click on “Apply.”

Undo-Redo
 

You will see a clipboard similar to this one:

Clipboard
 

Insert the content of the script and click on “Perform Operations.” Behold, you will see an outcome similar to the one at the beginning of this article. You just have to display records of individual landing pages instead of rows. You can do that by clicking on “Record” above the data.

Legend

What does the data in particular columns mean?

  • URL – The landing page of your competitor appears in the search results for the phrases listed in the “Keywords” column on the right. It’s good to filter the landing page according to the patterns that can be found in the URL (if they exist). In the example of SEOBook.com we can find through the “tools” filter only the URLs which contain the tools, and thus discover the potential of SEO tools that SEOBook.com is using.
  • Keyword [Position] (Search Volume) {Competition} – A key phrase for which the competitive landing page appears in search. In square brackets you will find the least measured position. Round brackets show the search frequency of a given phrase and curly brackets show the competitiveness of a given phrase.
  • Average position – Average position of the landing page on the SERPs.
  • Cumulated search – Total search frequency of phrases covered by given a landing page.
  • Competition – Average competitiveness of phrases to which a given landing page appears on the SERPs.
  • Potential – Potential of the landing page, calculated using the following formula:

 Cumulated search volume/(Competition + 1)

The less competitive (and more searched for at the same time) the search phrase is, the higher the score is for a given landing page.

How to work with the outcome

The simplest and most basic overview you will receive is a list of landing pages sorted by “Potential.” It is also good to filter the data and work only with outcomes that can tell you something. To do this, you can use a text filter for any column (click on the arrow next to the relevant column and select the “Text Filter”) or “Facets.”

Probably the best outcome is an Excel spreadsheet which can be used by almost everyone. Click on “Export” and then “Excel” in the top right corner. After that, the downloading of a .xls file should begin.

More experienced users who are familiar with OpenRefine can use “Add Column by fetching a URL to obtain information about social signals and reference profiles of given landing pages (this will be described in another article).

Conclusion

Within moments, you will have an overview of your competition’s most effective content, as well as what phrases bring visitors (coming from organic search) to a given landing page.

This will provide you with inspiration for new phrases and further content development of your website. With this overview, it is possible to work more deeply — or, if you are interested, we can talk more about working with data acquisition in the next article.

I wish you good luck in gaining inspiration, and a lot of great content on your website! Feel free to comment with any questions.

 


 

Author bio: Filip Podstavec is an online marketing consultant.

  • http://www.probloggingsuccess.com/ Jane

    Thanks for sharing this Filip. It is true that often times I am pumped with enthusiasm to create awesome content but don’t know what to write about. I’m all in for keyword research – I am ready to spend the time but it could get dry sometimes.

    I’m hearing about openrefine for the first time. I’ll try it out. Thanks!