Crawling & Indexing

Robots.txt, sitemaps, crawl budget, and indexing issues.

News
Crawling & Indexing May 13, 2026

Silent soft 404s caused 90% traffic loss after site migration

Pages returned HTTP 200 but Google's quality classifiers flagged them as soft 404s, deindexing thousands of URLs before the traffic drop was visible.
News
Crawling & Indexing May 13, 2026

1,800 pages deindexed overnight despite clean GSC signals

A WordPress site lost all of its indexed pages while GSC showed no errors. The blank canonicals looked like a technical fault, but the community and Google's own docs point to content quality.
News
Crawling & Indexing May 11, 2026

Empty category pages trigger soft 404s with no clean fix

Google flags out-of-stock category pages as soft 404s, but removing them risks slow re-indexing when products return. Each option has real tradeoffs.
News
Crawling & Indexing May 5, 2026

GSC reports resource failures despite 200 OK in server logs

A WooCommerce site logs clean 200 responses to Googlebot, but GSC flags resource failures caused by CDN interception or timing gaps between crawl and render.
News
Crawling & Indexing May 4, 2026

Noindex vs. robots.txt disallow for millions of stub pages

Noindex and robots.txt disallow have different effects on crawling and indexing. Verify you have a crawl budget problem before blocking stub pages at scale.
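The difference between the two directives can be sketched in a few lines of Python. This is a minimal illustration, not Google's actual logic: the function name `crawl_index_outcome`, the `/stubs/` path, and the three outcome labels are assumptions made for the example. It relies on the documented behavior that a crawler blocked by robots.txt never fetches the page, so it never sees a noindex directive on it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks a stub-page directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /stubs/
"""

def crawl_index_outcome(robots_txt, url, x_robots_tag="", agent="Googlebot"):
    """Classify a URL as 'blocked', 'noindex', or 'indexable' (illustrative labels)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    if not parser.can_fetch(agent, url):
        # Disallow stops the fetch, so any noindex on the page is never read.
        # The bare URL can still end up indexed if other pages link to it.
        return "blocked"
    if "noindex" in x_robots_tag.lower():
        # The page is crawled (spending crawl budget), then kept out of the index.
        return "noindex"
    return "indexable"

blocked = crawl_index_outcome(ROBOTS_TXT, "https://example.com/stubs/123", "noindex")
excluded = crawl_index_outcome(ROBOTS_TXT, "https://example.com/pages/1", "noindex")
plain = crawl_index_outcome(ROBOTS_TXT, "https://example.com/pages/2")
```

In this sketch the first URL carries a noindex header that is never read because the fetch is disallowed, which is why combining both directives on the same URL tends to defeat the noindex.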
News
Crawling & Indexing April 29, 2026

OpenAI crawl activity tripled after GPT-5, led by search bot

OAI-SearchBot now generates more log events than GPTBot after a 3.5x post-GPT-5 surge, and each bot has its own robots.txt directive you need to manage.
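Because each bot answers to its own user-agent group, a robots.txt policy can be checked per bot before it is deployed. The sketch below uses Python's standard `urllib.robotparser`; the robots.txt content, the example URL, and the helper name `allowed_agents` are hypothetical and only illustrate one possible policy (allow the search bot, block the training crawler).

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: allow OpenAI's search bot, block its training crawler.
ROBOTS_TXT = """\
User-agent: OAI-SearchBot
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

def allowed_agents(robots_txt, url, agents):
    """Return {agent: bool} showing which user agents may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}

result = allowed_agents(
    ROBOTS_TXT,
    "https://example.com/blog/post",
    ["OAI-SearchBot", "GPTBot", "Googlebot"],
)
```

Here Googlebot falls through to the `*` group, so only the `/admin/` path is closed to it. The standard-library parser is a convenient approximation; individual crawlers may interpret edge cases differently.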
News
Crawling & Indexing April 25, 2026

ChatGPT uses SerpAPI to pull Google results, not its own crawler

ChatGPT pulls results from SerpAPI rather than its own index, so your Google rankings directly determine whether ChatGPT surfaces your content.
News
Crawling & Indexing April 25, 2026

AI bot traffic starves Googlebot of crawl budget on large sites

AI crawler traffic is consuming server bandwidth and crawl budget on large sites, potentially throttling Googlebot discovery and indexing of important pages.
News
Crawling & Indexing April 18, 2026

Mueller doubts freshness-based sitemap splits speed crawling

John Mueller questioned whether splitting sitemaps by content freshness influences crawl frequency. The tactic is widely used in enterprise SEO but has no confirmed crawl benefit.
News
Crawling & Indexing April 18, 2026

Blocking CSS and JS in robots.txt breaks indexing, saves no crawl budget

Blocking CSS and JS in robots.txt breaks Googlebot's page rendering and indexing without saving crawl budget. Improve cache headers on static assets instead.
News
Crawling & Indexing April 17, 2026

Wildcard DNS lets Googlebot index phantom subdomains as real pages

Wildcard DNS can cause Googlebot to index phantom subdomains as real pages, wasting crawl budget and creating duplicate content signals that may hurt rankings.
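One way to test whether a domain has wildcard DNS is to resolve a few random subdomains that should not exist. The following is a rough sketch under stated assumptions: `has_wildcard_dns` and its `resolve` and `probes` parameters are hypothetical names, and the check can give false positives on networks whose resolvers rewrite failed lookups.

```python
import socket
import uuid

def has_wildcard_dns(domain, resolve=socket.gethostbyname, probes=3):
    """Return True if several random, never-published subdomains all resolve."""
    for _ in range(probes):
        # 32 hex characters: a valid DNS label that is vanishingly unlikely to exist.
        label = uuid.uuid4().hex
        try:
            resolve(f"{label}.{domain}")
        except OSError:
            # socket.gaierror subclasses OSError; a failed lookup means no wildcard.
            return False
    return True

# Offline demonstration with stand-in resolvers (no network access needed).
def always_resolves(host):
    return "203.0.113.10"  # documentation-range address, used as a placeholder

def never_resolves(host):
    raise socket.gaierror("name not found")

wildcard_detected = has_wildcard_dns("example.com", resolve=always_resolves)
no_wildcard = has_wildcard_dns("example.com", resolve=never_resolves)
```

Passing the resolver as a parameter keeps the check testable offline; calling `has_wildcard_dns("yourdomain.example")` with the default resolver performs real lookups.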
News
Crawling & Indexing April 17, 2026

Indexing API bypasses 'Discovered - currently not indexed' queue

The Indexing API achieved 94% indexation in 48 hours versus 8.4% via sitemap, but using it outside its documented restriction to JobPosting pages risks future enforcement.
News
Crawling & Indexing April 14, 2026

Mueller lists nine reasons Google overrides your rel=canonical

John Mueller listed nine scenarios where Google picks a different canonical than your tag, from JS rendering failures to URL parameter pattern inference.