On-Page Boot Camp: What Is TF-IDF and How To Use It

A.J. Ghergich

Aug 21, 2017 · 5 min read
In the previous videos in the On-Page Boot Camp series, A.J. Ghergich showed us how he used SEMrush to find the best keywords to work with, win featured snippets and make the content even better with the SEO Ideas tool.

In the last video of the series, A.J. unveils the secrets of on-page optimization and presents the secret weapon of content writing, the TF-IDF metric, which allows you to find the best semantically related keywords for your content. — SEMrush Team.

Video Transcription

Hey, SEMrushers! A.J. Ghergich of Ghergich & Co. We are on our fourth and final video, and I am obviously in a pretty festive mood because we are going to talk about TF-IDF. I will explain what the heck it is, how you use it to enhance your SEO and kinda take it to the next level of your on-page optimization with semantically related keywords. And then at the end of the video, I am gonna show you some new features in keyword tracking that SEMrush just released. I have been beta testing it, love it, and I know you’re gonna love it as well.

What Is TF-IDF

TF-IDF stands for ‘Term Frequency-Inverse Document Frequency.’ Let’s clear this up and make it easier to understand.

TF-IDF description

The overall goal of TF-IDF is to statistically measure how important a word is to a document within a collection of documents. It’s like a really useful keyword density tool on steroids. It gets less complicated when we break it down.

1. What Is TF

So let’s look at TF: term frequency. It’s exactly what it sounds like — how often the term occurs. That is what it is measuring — occurrence. That is only going to take you so far, so typically term frequency is then divided by the length of the document to account for longer or shorter documents.

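Let me give you an example: let’s say you have a 500-word article that says ‘horse’ 4 times, and a 2,000-word article that says ‘horse’ 5 times. By raw count, the longer article looks more relevant. Dividing by document length gives 4/500 = 0.008 for the first article and 5/2,000 = 0.0025 for the second, so ‘horse’ is actually more prominent in the shorter one. That is why they are dividing it by document length. We now have a good measure of occurrence.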
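The length normalization in this example can be sketched in a few lines of Python. This is only an illustration, assuming simple whitespace tokenization; the function name `term_frequency` and the filler text standing in for the two articles are made up for the sketch.

```python
def term_frequency(term: str, text: str) -> float:
    """Occurrences of `term` divided by the total word count of `text`."""
    words = text.lower().split()
    if not words:
        return 0.0
    return words.count(term.lower()) / len(words)


# Stand-ins for the two articles in the example (filler words are placeholders).
short_article = " ".join(["horse"] * 4 + ["filler"] * 496)   # 500 words
long_article = " ".join(["horse"] * 5 + ["filler"] * 1995)   # 2,000 words

tf_short = term_frequency("horse", short_article)  # 4 / 500
tf_long = term_frequency("horse", long_article)    # 5 / 2000

print(f"short article: {tf_short:.4f}, long article: {tf_long:.4f}")
```

The shorter article ends up with the higher term frequency even though it has fewer raw mentions.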

2. What Is IDF

Let’s go and look at IDF: inverse document frequency. Ooooh, scary! It is not actually that hard to understand. This is telling you how important a term is. So it essentially has two jobs.

  • The first one: It is going to weight down terms that appear frequently, like ‘is’ or ‘the,’ and a lot of the stop words that we all use.

  • The second goal is to scale up the more unique and less-used terms.
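Both jobs fall out of one common textbook formula: IDF is the logarithm of the total number of documents divided by the number of documents containing the term. The sketch below uses that unsmoothed variant. Other variants exist, and nothing here is claimed to match what SEMrush computes. The three-sentence corpus and all function names are invented for illustration.

```python
import math


def term_frequency(term, text):
    """Occurrences of `term` divided by the total word count of `text`."""
    words = text.lower().split()
    return words.count(term) / len(words) if words else 0.0


def inverse_document_frequency(term, documents):
    """log(N / n_t): N documents in total, n_t of them containing `term`."""
    containing = sum(1 for doc in documents if term in doc.lower().split())
    if containing == 0:
        return 0.0  # term never appears; avoid dividing by zero
    return math.log(len(documents) / containing)


def tf_idf(term, text, documents):
    """Term frequency in `text` scaled by the term's rarity across `documents`."""
    return term_frequency(term, text) * inverse_document_frequency(term, documents)


# A tiny made-up corpus: 'the' is in every document, 'headlamp' in only one.
corpus = [
    "the headlamp is the best hands-free light",
    "the flashlight is bright",
    "the battery is new",
]

idf_the = inverse_document_frequency("the", corpus)            # log(3/3) = 0
idf_headlamp = inverse_document_frequency("headlamp", corpus)  # log(3/1)

score_the = tf_idf("the", corpus[0], corpus)
score_headlamp = tf_idf("headlamp", corpus[0], corpus)
print(score_the, score_headlamp)
```

Even though ‘the’ appears twice in the first document and ‘headlamp’ only once, the common word is weighted down to zero while the rarer term keeps a positive score.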

Let me show you how SEMrush is implementing TF-IDF into their toolset and doing these calculations, so you don’t have to.

Make Use Of Semantic Ideas

Alright, I am over in Optimization Ideas again and we are looking under examples at energizer.com. I have selected Semantic from the drop-down menu, and I have keyed in on this hands-free lighting page.

Semantic Ideas

It has some rankings, but the page itself is a little sparse. When we look at the page before we dive in, you can see it is a very thin page. It is just a little bit of text and not a lot of optimization other than branding names.


SEMrush is obviously picking up on that as well, and they are saying ‘Hey, we have a semantically related idea.’ Semantic keywords are essentially based on your top-10 rivals’ rankings data, but they are showing you things that are related topically.


For instance, if you were trying to optimize for ‘car,’ SEMrush is gonna pick up on the fact that maybe you are not saying ‘automobiles’ or ‘used truck’ and those kinds of things. The point is that Google is way beyond just ‘Let’s stuff in 4 or 5 keywords here that are an exact match and call it a day.’

Focus On The Overall Concepts

You need to look at the concepts that others are talking about and the topical relevance of your page in general compared to others. Is it as beneficial to the end user as it could possibly be? Am I talking about the concepts that Google deems important? And so you can see here that SEMrush has a detailed analysis we can click into, and it shows us pretty much everything you’re going to want to know.

Semantic Ideas detailed view

You have, in this column, a check mark that answers ‘Hey, am I using this word or not?’ Next to it is how many of my rivals are using this word, then the more traditional percentage-based measure, and then TF-IDF, which is awesome. So we have them all, and we don’t have to run the calculations or worry about the math of it. It is already done for us. I tend to key in on two-word phrases because they are a little easier to wrap your head around. For example, we are not saying ‘hands-free’, and everyone else is.
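The idea behind that check-mark column, finding two-word phrases that rivals use and your page does not, can be approximated with a short script. This is a rough sketch and not SEMrush's method. The page texts, the `phrase_gaps` helper, and the `min_rivals` threshold are all hypothetical.

```python
import re
from collections import Counter


def bigrams(text):
    """Two-word phrases from `text`; hyphens and punctuation split words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [" ".join(pair) for pair in zip(words, words[1:])]


def phrase_gaps(my_page, rival_pages, min_rivals=2):
    """Phrases used by at least `min_rivals` rival pages but missing from my_page."""
    mine = set(bigrams(my_page))
    rival_counts = Counter()
    for page in rival_pages:
        rival_counts.update(set(bigrams(page)))  # count each page at most once
    gaps = [
        (phrase, count)
        for phrase, count in rival_counts.items()
        if count >= min_rivals and phrase not in mine
    ]
    # Most widely used phrases first, then alphabetical for stable output.
    return sorted(gaps, key=lambda item: (-item[1], item[0]))


# Made-up page copy, standing in for your page and three rival pages.
my_page = "Energizer headlamp with LED light"
rival_pages = [
    "Hands-free headlamp with 300 lumens",
    "Bright hands-free lighting for camping",
    "A hands-free work light",
]

gaps = phrase_gaps(my_page, rival_pages)
print(gaps)
```

With this toy data the only phrase that clears the threshold is ‘hands free’, used by all three rival pages and missing from ours.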

Don't Rush In!

I do want to caution, though: just because it is on here doesn’t mean you need to go and do every one of these. You are mainly looking to say ‘Ok, how do I flesh this out? How do I round this out and make this content better?’ I am not talking about some of these concepts: we are not talking about ‘power,’ and we don’t talk about ‘lumens.’ So maybe there should be some of these in there.

If it doesn’t make sense for your page, don’t do it, even if everyone else is doing it. Don’t worry about it. But, in general, this will point you to those concepts that you are not talking about enough.

You can also go over to Keyword Usage. What you get is a little bit more of the traditional ‘Are my rivals using these keywords in their title tag, their meta description, their H1, and so on?’ But there is now a TF-IDF section here as well, so check that out.

Keyword ideas with TF-IDF Section

Bonus: Check The Opportunities To Get Into SERP Features

In the intro, I mentioned I would show you some cool bonus features that SEMrush is rolling out. Let’s go and take a look at those now.

We are over in Position Tracking for Energizer, the example site we have been using, and if you click on Rank count, you will notice you can pick Local pack or Top stories, so you have more control over what you want to track for yourself or your competitors. If you scroll down, there are several new things.

Position Tracking with SERP features filters

There is a SERP Features column that shows all the different types of SERP features.

SERP Features column

And then there is this really cool bar that allows you to select the types of features and discover what your competitors might have that you don’t have.

Featured Snippet Filter Bar in Position Tracking

So let me give a quick example of that. With Energizer we talked about review stars in one of the videos, so let’s look at reviews.

Now, here are some keywords that have reviews on the SERP. But we have selected ‘Any domain across the web.’ So watch what happens when we change that to just ‘Energizer’... nothing.

No Reviews implemented by Energizer

And that is because they haven’t really implemented review stars as part of their markup. So there is a great way to use this for discovery. It is a really cool addition; I am really excited to use it.
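For context, review stars usually depend on structured data embedded in the page. Below is a minimal sketch of generating schema.org `AggregateRating` markup as JSON-LD. The product name and rating numbers are placeholders, not real Energizer data, the helper name is hypothetical, and search engines apply their own eligibility rules before showing stars.

```python
import json


def product_review_markup(name, rating_value, review_count):
    """Build a JSON-LD <script> tag describing a product's aggregate rating."""
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "aggregateRating": {
            "@type": "AggregateRating",
            "ratingValue": rating_value,
            "reviewCount": review_count,
        },
    }
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Placeholder values for illustration only.
snippet = product_review_markup("Example Headlamp", 4.5, 12)
print(snippet)
```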

Keep It Up!

These videos have been really fun to make. It is fun to share how I use SEMrush.

I would love some feedback on how you use it! Maybe you are using it in a whole different way. Hit me up on Twitter and let me know some of the things that you’re doing a little bit differently with SEMrush, and maybe I can point those out in a future video.

A.J. Ghergich is the CTO of Brado. Follow him at @SEO on Twitter for SEO tips and cat gifs.