Why are only a few pages of my website being crawled?
If you’ve noticed that only 4-6 pages of your website are being crawled (your homepage, sitemap URLs, and robots.txt), it is most likely because our bot couldn’t find internal links on your homepage. There are two possible reasons for this:
1. Your homepage is larger than 2 MB. Site Audit is currently only equipped to parse pages up to 2 MB in size.
In this case, you will see the large HTML page size error in your audit.
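If you want to check whether your homepage exceeds that limit before re-running the audit, one quick way is to download the HTML and measure its size. Below is a minimal sketch, assuming Python with the requests library; the example URL is a placeholder, and the 2 MB threshold simply mirrors the limit described above.

```python
import requests

# Fetch the homepage and compare its raw HTML size against
# Site Audit's 2 MB parsing limit described above.
LIMIT_BYTES = 2 * 1024 * 1024  # 2 MB

response = requests.get("https://www.example.com/", timeout=30)
size = len(response.content)

print(f"HTML size: {size / 1024:.1f} KB")
if size > LIMIT_BYTES:
    print("Homepage exceeds 2 MB - Site Audit may fail to parse it.")
else:
    print("Homepage is within the 2 MB limit.")
```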
In either case, there is a way to ensure that our bot crawls your pages: change the crawl source from “Website” to “Sitemap” or “URLs from file” in your campaign settings.
“Website” is the default option. It means we crawl using a breadth-first search algorithm, navigating through the links we find in your pages’ code, starting from the homepage.
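To make that behavior concrete, here is a minimal sketch of a breadth-first crawl, assuming Python with the requests and BeautifulSoup (bs4) libraries. The start URL and page limit are placeholders, and this illustrates the general algorithm, not Semrush’s actual crawler.

```python
from collections import deque
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup

def bfs_crawl(start_url: str, max_pages: int = 50) -> list[str]:
    """Breadth-first crawl of internal links, starting from the homepage."""
    domain = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    crawled = []

    while queue and len(crawled) < max_pages:
        url = queue.popleft()
        try:
            html = requests.get(url, timeout=10).text
        except requests.RequestException:
            continue  # skip pages that fail to load
        crawled.append(url)

        # Collect internal links from the page's HTML and enqueue new ones.
        for anchor in BeautifulSoup(html, "html.parser").find_all("a", href=True):
            link = urljoin(url, anchor["href"])
            if urlparse(link).netloc == domain and link not in seen:
                seen.add(link)
                queue.append(link)
    return crawled

# If the homepage has no parseable internal links, this returns only
# the homepage itself - mirroring the "only a few pages crawled" symptom.
print(bfs_crawl("https://www.example.com/"))
```

Note how a homepage without discoverable links stops the crawl almost immediately, which is exactly the situation described at the top of this article.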
If you choose one of the other options, we will crawl links that are found in the sitemap or in the file you upload.
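For comparison, a sitemap-based crawl takes its URL list directly from the sitemap’s &lt;loc&gt; entries rather than discovering links page by page. A minimal sketch using only Python’s standard library, assuming a sitemap at the conventional /sitemap.xml path:

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen

# Hypothetical sitemap location; adjust to where your sitemap actually lives.
SITEMAP_URL = "https://www.example.com/sitemap.xml"
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

with urlopen(SITEMAP_URL, timeout=10) as resp:
    root = ET.fromstring(resp.read())

# Each <url><loc> entry is a page the crawler would visit directly,
# regardless of whether the homepage links to it.
urls = [loc.text for loc in root.iterfind(".//sm:loc", NS)]
print(f"{len(urls)} URLs found in sitemap")
```

This is why switching the crawl source works around a homepage the bot cannot parse: the URL list no longer depends on the homepage’s links at all.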