What Is the Google Index?
The Google index is a database of all the webpages the search engine has crawled and stored to use in search results.
The index acts like a massive, searchable library of web content. It stores the text from each webpage, along with important metadata like titles, headers, links, images, and more.
All of this data gets compiled into a structured index that allows Google to instantly scan its contents and match search queries with relevant results.
So when you search for something in Google, you're searching its powerful index to find the best webpages on that topic.
Every page that appears in Google's search results has to be indexed first.
So if your page isn’t indexed, it won’t show in search results.
Here's how indexing fits into the whole process:
- Crawling: Googlebot crawls the web and looks for new or updated pages
- Indexing: Google analyzes the pages and stores them in a huge database
- Ranking: Google’s algorithm picks the best and most relevant pages from its index and shows them as search results
Google's indexing is governed by its own algorithms, so you can't control it directly. However, you can influence indexing by managing how bots discover your online content.
How Do I Check If Google Has Indexed My Site?
Google makes it easy to find out whether your site has been indexed: just use the “site:” search operator.
Here’s how to check:
- Go to Google
- In the search bar, type in the “site:” search operator followed by your domain (e.g., site:yourdomain.com)
- When you look under the search bar, you’ll see an estimate of how many of your pages Google has indexed
If zero results show up, none of your pages are indexed.
If there are indexed pages, Google will show them as search results.
That’s a quick way to check the indexing status of your pages. But it’s not the most practical way, because it’s hard to spot which specific pages haven’t been indexed.
The alternative (and more complex) way to check if Google has indexed your website is to use Google Search Console. We'll take a closer look at it in the next chapter.
Semrush’s Site Audit tool is perfect for identifying and fixing indexing issues. Give it a try.
How Do I Get Google to Index My Site?
If you have a new website, it can take Google some time to index it. Crawling can take anywhere from a few days to a few weeks. (Indexing usually happens right after that, but it’s not guaranteed.)
So, indexing can take time. You can speed up the process, though.
The easiest way is to request indexing in Google Search Console.
Google Search Console (GSC) is a free Google toolset that allows you to check your website's presence on Google and troubleshoot any related issues.
If you don't have a GSC account yet, you'll need to:
- Sign in with your Google account
- Add a new property (your website) to your account
- Verify the ownership of the website
Need help? Read our detailed guide that will help you set up your Google Search Console step by step.
Create and Submit a Sitemap
An XML sitemap is a file that lists all the URLs you want Google to index.
It tells Google that these pages are important. Which helps crawlers find your main pages faster.
It looks something like this:
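For a site on a hypothetical yourdomain.com domain, a minimal XML sitemap might look like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://yourdomain.com/</loc>
    <lastmod>2024-01-15</lastmod>
  </url>
  <url>
    <loc>https://yourdomain.com/blog/example-post/</loc>
    <lastmod>2024-01-10</lastmod>
  </url>
</urlset>
```

Each `<loc>` entry is one URL you want Google to index. The optional `<lastmod>` tag tells crawlers when the page was last changed.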
You'll likely find your sitemap at a URL like this: https://yourdomain.com/sitemap.xml
Once you have the URL of your sitemap, go to the “Sitemaps” report in GSC. You'll find it under the “Indexing” section in the left menu.
Enter the URL of your sitemap and hit “Submit.”
It may take a couple of days for your sitemap to be processed. When it’s done, you should see the link to your sitemap and a green “Success” status in the report.
Submitting the sitemap can help Google discover all the pages you deem important. And can speed up the process of indexing them.
Use the URL Inspection Tool
To check the status of a specific URL, use the URL inspection tool in GSC.
Start by entering the URL into the “URL Inspection” search bar at the top.
If you see the “URL is on Google” status, it means the inspected URL has been crawled and indexed by Google.
You can check the details to see when it was last crawled, as well as other helpful information.
If this is the case, you're all set and don't have to do anything.
But if you see the “URL is not on Google” status, it means the inspected URL is not indexed and cannot appear in Google’s search engine results pages (SERPs).
You'll probably see the reason why the page hasn't been indexed. And you'll need to address the issue (see next section).
Once that’s done, you can request indexing by clicking the “Request Indexing” link.
How to Find and Fix Indexing Issues
Sometimes, there may be deeper issues with your website's technical SEO. So there’s a chance your site (or a specific page) won’t get indexed even if you request it.
This can happen if your site isn’t mobile-friendly, your site loads too slowly, there are redirect issues, etc.
Perform a technical SEO audit to find underlying issues that may prevent your site from being indexed.
You can use an auditing tool like Semrush's Site Audit.
- Create a free Semrush account (no credit card needed)
- Set up your first crawl (We have a detailed setup guide that will help you with that.)
- Click the “Start Site Audit” button
After you run the audit, you'll get a detailed view of your site's health.
You can also see a list of all the issues by clicking the “Issues” tab:
The issues related to indexing will almost always appear at the top of the list—in the “Errors” section.
Let's take a look at some common reasons why your site may not be indexed and how to fix them.
Check Your Robots.txt File
Your robots.txt file gives instructions to search engines not to crawl certain parts of a website. It looks something like this:
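For example, a simple robots.txt file (with a hypothetical yourdomain.com domain and a made-up /private/ folder) might look like this:

```
User-agent: *
Disallow: /private/

Sitemap: https://yourdomain.com/sitemap.xml
```

This tells all crawlers they may crawl everything except the /private/ folder, and points them to the sitemap.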
You'll find yours at: https://yourdomain.com/robots.txt
(Follow this guide on how to create a robots.txt file if you don't have one.)
You may want to block Google from crawling duplicate pages, private pages, or resources like PDFs and videos.
But if your robots.txt file tells Googlebot (or web crawlers in general) that your entire site should not be crawled, there's a high chance it won't be indexed, either.
Each directive in robots.txt consists of two parts:
- “User-agent” identifies the crawler that is addressed
- The “Allow” or “Disallow” instruction tells the crawler that the site (or a part of it) can or cannot be crawled
This directive says any crawler (represented by an asterisk) cannot crawl (disallow:) the whole site (represented by a slash symbol):
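```
User-agent: *
Disallow: /
```

If you find this in your robots.txt and want your site crawled, remove the slash (or the entire directive).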
Inspect your robots.txt and make sure there is no directive that could prevent Google from crawling your site or pages/folders you want to have indexed.
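If you'd rather check programmatically, Python's standard library includes a robots.txt parser. Here's a minimal sketch; the domain, paths, and inline robots.txt content are hypothetical, and in practice you'd point `set_url()` at your live file instead of parsing a string:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. For a live site, use:
#   parser.set_url("https://yourdomain.com/robots.txt"); parser.read()
robots_txt = """
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Check whether Googlebot is allowed to crawl specific URLs
print(parser.can_fetch("Googlebot", "https://yourdomain.com/blog/post"))       # True
print(parser.can_fetch("Googlebot", "https://yourdomain.com/private/report"))  # False
```

This only tells you what your robots.txt permits; it doesn't tell you whether Google has actually crawled or indexed the page.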
Manage Your Noindex and Canonical Tags
One of the ways to tell search engines not to index your pages is to use the robots meta tags with a “noindex” attribute.
It looks like this:
<meta name="robots" content="noindex">
If a page contains this line of code, you’re telling Google not to index it.
You can check what pages on your website have noindex meta tags in Google Search Console:
- Click the “Pages” report under the “Indexing” section in the left menu
- Scroll down to the “Why pages aren’t indexed” section
- Look for “Excluded by ‘noindex’ tag.” Click it.
If the list of URLs contains a page you want indexed, simply remove the noindex meta tag from the source code of that page.
The Site Audit tool will also warn you about pages that are blocked either through the robots.txt file or the noindex tag.
It'll also notify you about resources blocked by the X-Robots-Tag, an HTTP header that's typically used for non-HTML documents (such as PDF files).
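For reference, the X-Robots-Tag isn't placed in a page's HTML; the server sends it as an HTTP response header. A noindex instruction looks like this:

```
X-Robots-Tag: noindex
```

How you set (or remove) it depends on your server configuration, so check your server or CMS settings if blocked resources show up here.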
Manage Canonical Tags
Another reason your page may not be indexed is that it contains a canonical tag.
Canonical tags tell crawlers if a certain version of a page is preferred. They prevent issues caused by duplicate content appearing on multiple URLs.
If a page has a canonical tag pointing to another URL, Googlebot assumes there’s a different preferred version of that page. And will not index the page with the canonical tag, even if there is no alternate version.
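For example, a canonical tag sits in the page's `<head>` section and looks like this (the URL here is a hypothetical placeholder):

```html
<link rel="canonical" href="https://yourdomain.com/preferred-page/" />
```

If this tag appears on a different URL, you're telling Google that /preferred-page/ is the version to index instead.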
The “Pages” report in Google Search Console will help here again.
Just scroll down to the “Why pages aren’t indexed” section and click the “Alternate page with proper canonical tag” reason.
You'll see a list of affected pages. Go through the list.
If there is a page that you want to have indexed (meaning, the canonical is used incorrectly), remove the canonical tag from that page. Or make sure it points to the version of the page you want to have indexed.
Learn more about this topic in our guide to canonical tags.
Check Your Internal Links
Internal links help crawlers find your webpages. Which can speed up the process of indexing.
If you want to audit your internal links, go to the “Internal Linking” thematic report in Site Audit.
The report will list all the issues related to internal linking.
It would help to fix all of them, of course. But these three issues are most important when it comes to crawling and indexing:
- Outgoing internal links contain nofollow attribute: Nofollow links don't pass authority. If used internally, Google may choose to ignore the target page when crawling your site. Make sure you don't use them for pages you want to have indexed.
- Pages need more than 3 clicks to be reached: If pages need more than three clicks to be reached from the homepage, there's a chance they won't be crawled and indexed. Add more internal links to these pages (and review your website architecture).
- Orphaned pages in sitemap: Pages that have no internal links pointing to them are known as “orphan pages.” They are rarely indexed. Fix this issue by linking to any orphaned pages.
To see pages that are affected by a specific problem, click on the link stating the number of found issues next to it.
Last but not least, don't forget to use internal linking strategically:
- Link to your most important pages: Google recognizes that pages are important to you if they have more internal links
- Link to your new pages: Make internal linking part of your content creation process to speed up the indexing of your new pages
Check out our guide to internal linking to learn more about best practices and strategies.
Check the Overall Quality of Your Site
Indexing problems are not always related to technical issues.
Even if your site meets all technical requirements for being indexed, Google may not index all your pages. Especially if Google doesn't consider your website to be of high quality.
Here's a quote by John Mueller from Google:
“If you have a smaller site and you’re seeing a significant part of your pages are not being indexed, then I would take a step back and try to reconsider the overall quality of the website and not focus so much on technical issues for those pages.”
If this is your case, you may want to start working on the following:
- Improving the content quality of your website
- Building quality backlinks to your website
- Improving the Expertise, Authoritativeness, and Trustworthiness (E-A-T) signals of your website
So while quality issues may negatively impact indexing, you shouldn't dismiss technical problems as potential obstacles either.
A balanced diagnostic approach assessing both quality and technical factors is key.
Monitor Your Indexing Issues
Even if you’ve fixed your indexing issues now, new ones might crop up in the future.
This is especially true if you frequently add new content or update your website’s structure.
Semrush’s Site Audit tool can help you spot new technical problems early before they escalate.
Just schedule periodic audits by going to the site audit settings for your website and clicking the “Schedule” button.
You get the option to set up automatic scans on a daily or weekly basis.
We’d recommend configuring weekly scans to start. Later, you can adjust the cadence as needed.
Now you can rest assured that technical problems will get flagged quickly so you can address them before they cause serious issues.