Mueller doubts freshness-based sitemap splits speed crawling
Summary
Mueller listed five reasons SEOs split XML sitemaps, but doubted the claim that separating fresh content from evergreen content influences crawl frequency.
Freshness-based splits are widespread in enterprise SEO but lack confirmation from Google that they actually work. Split sitemaps by page type or hreflang requirements instead, not by content age.
For most sites under 50,000 URLs without hreflang, keep a single sitemap file. Don't expect sitemap structure to control crawl priority.
What happened
Google’s John Mueller listed five reasons why SEOs split XML sitemaps into multiple files, but cast doubt on one of the most commonly cited strategies: separating evergreen content from fresh content to influence crawl frequency. Mueller shared the list in a discussion covered by Search Engine Journal on April 3, 2026.
An SEO had asked why anyone would choose to manage multiple sitemap files instead of keeping everything in one. Mueller responded with five reasons he’s seen:
- Tracking URL groups: Splitting by page type (e.g., product detail pages vs. category pages) to monitor indexing separately, similar to what the Page Indexing report in Search Console offers.
- Freshness-based splitting: Placing evergreen content in a separate file so search engines might check the “old” sitemap less often. Mueller added a caveat: “I don’t know if this actually happens though.”
- Proactive splitting: Breaking up a sitemap before it hits the 50,000-URL limit, so there’s no scramble to restructure later.
- Hreflang sitemaps: Multilingual markup adds so much bulk that even files within the 50,000-URL cap can exceed the 50MB uncompressed size limit.
- Automated tooling: Some CMS or deployment systems generate multiple files without anyone choosing that structure.
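Split sitemaps are tied together with a sitemap index file that lists each child sitemap. As a minimal sketch of that mechanism (the filenames and domain below are hypothetical), a page-type split might be generated like this with Python's standard library:

```python
# Minimal sketch: generate a sitemap index referencing split sitemap files.
# The filenames and example.com URLs are hypothetical placeholders.
import xml.etree.ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap_index(sitemap_urls):
    """Return a sitemap index document listing each child sitemap."""
    ET.register_namespace("", NS)
    index = ET.Element(f"{{{NS}}}sitemapindex")
    for url in sitemap_urls:
        sitemap = ET.SubElement(index, f"{{{NS}}}sitemap")
        ET.SubElement(sitemap, f"{{{NS}}}loc").text = url
    return ET.tostring(index, encoding="unicode")

xml = build_sitemap_index([
    "https://example.com/sitemap-products.xml",    # product detail pages
    "https://example.com/sitemap-categories.xml",  # category pages
    "https://example.com/sitemap-editorial.xml",   # editorial content
])
print(xml)
```

The index file itself is what gets referenced in robots.txt or submitted to Search Console; the child files hold the actual URLs.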
Why it matters
The freshness-based split is a tactic that circulates widely in enterprise SEO circles. The idea is straightforward: if Google sees a sitemap file that rarely changes, it might deprioritize re-crawling those URLs and spend more crawl budget on a separate file containing frequently updated pages. Mueller’s explicit uncertainty about whether search engines actually behave this way is a meaningful signal. It suggests Google has not committed to treating sitemap files as crawl-priority signals.
The proactive splitting and hreflang reasons are more grounded. The sitemaps protocol specification caps each file at 50,000 URLs and 50MB uncompressed. Hreflang annotations add <xhtml:link> child elements inside each <url> entry, which can push the file past the 50MB uncompressed size limit even when the <url> count stays under 50,000.
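The hreflang bloat is easy to quantify. The sketch below (domain, paths, and the 15-language list are made-up illustration values) serializes one <url> entry with its <xhtml:link> alternates and extrapolates to a full 50,000-URL file; counting the namespace declarations per entry slightly overstates the total, but the order of magnitude holds:

```python
# Rough size estimate for hreflang bulk: serialize one <url> entry with
# 15 <xhtml:link> alternates and extrapolate to the 50,000-URL cap.
# The domain, paths, and language list are hypothetical. Namespace
# declarations are counted once per entry, which overstates slightly.
import xml.etree.ElementTree as ET

SM = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML = "http://www.w3.org/1999/xhtml"

def url_entry_bytes(loc, alternates):
    """Return the serialized byte size of one <url> entry with alternates."""
    ET.register_namespace("", SM)
    ET.register_namespace("xhtml", XHTML)
    url = ET.Element(f"{{{SM}}}url")
    ET.SubElement(url, f"{{{SM}}}loc").text = loc
    for lang, href in alternates:
        ET.SubElement(url, f"{{{XHTML}}}link", {
            "rel": "alternate", "hreflang": lang, "href": href,
        })
    return len(ET.tostring(url, encoding="utf-8"))

langs = ["en", "de", "fr", "es", "it", "ja", "pt", "nl",
         "sv", "pl", "da", "fi", "no", "cs", "hu"]  # 15 alternates per page
alts = [(lang, f"https://example.com/{lang}/page") for lang in langs]
per_url = url_entry_bytes("https://example.com/en/page", alts)

print(per_url, "bytes per <url> entry")
print(per_url * 50_000 / 1_000_000, "MB at the 50,000-URL cap")
```

Even with short placeholder URLs, 15 alternates per page puts a full 50,000-URL file past the 50MB uncompressed cap, which is why hreflang sitemaps get split well before the URL limit.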
Search Engine Journal’s coverage also notes a claim from enterprise-level SEOs that keeping sitemaps well under 50,000 lines improves indexing. Mueller did not confirm or deny that claim.
What to do
For most sites under 50,000 URLs without hreflang in sitemaps, a single sitemap file works fine. The overhead of managing multiple files adds complexity without a confirmed crawl benefit.
If you’re running a large site, split sitemaps by page type rather than by freshness. Grouping product pages, category pages, and editorial content into separate files makes the Search Console Page Indexing report more useful for diagnosing coverage issues by section.
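A page-type split usually falls out of the URL structure itself. One way to sketch it, assuming section names map to the first path segment (all URLs below are hypothetical):

```python
# Sketch: bucket URLs by their first path segment so each page type
# can go into its own sitemap file. URLs and sections are hypothetical.
from collections import defaultdict
from urllib.parse import urlparse

def bucket_by_section(urls):
    """Group URLs by first path segment, e.g. /products/... -> 'products'."""
    buckets = defaultdict(list)
    for u in urls:
        segments = urlparse(u).path.strip("/").split("/")
        section = segments[0] if segments[0] else "root"
        buckets[section].append(u)
    return dict(buckets)

buckets = bucket_by_section([
    "https://example.com/products/widget-1",
    "https://example.com/products/widget-2",
    "https://example.com/categories/widgets",
    "https://example.com/blog/launch-post",
])
for section, urls in buckets.items():
    print(f"sitemap-{section}.xml -> {len(urls)} URLs")
```

Each bucket then becomes one file in the sitemap index, and the Page Indexing report can be filtered per file to isolate coverage problems in a single section.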
Sites using hreflang sitemaps should check file sizes. The 50,000-URL cap counts <url> entries only, not the <xhtml:link> child elements within them. But hreflang annotations add significant XML bulk. If your file approaches 50MB uncompressed, split by language group or site section.
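Both limits are cheap to check before publishing. A minimal sketch of that check against the protocol's caps (50,000 <url> entries, 50MB uncompressed; the inline sample sitemap is a placeholder):

```python
# Sketch: validate a sitemap against the protocol limits of
# 50,000 <url> entries and 50MB uncompressed. Note that the URL count
# looks only at <url> elements, not the <xhtml:link> children inside them.
import xml.etree.ElementTree as ET

URL_CAP = 50_000
SIZE_CAP = 50 * 1024 * 1024  # 50MB uncompressed

def check_sitemap(xml_bytes):
    """Return (url_count, byte_size, within_limits) for a sitemap document."""
    root = ET.fromstring(xml_bytes)
    ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
    count = len(root.findall("sm:url", ns))
    size = len(xml_bytes)
    return count, size, count <= URL_CAP and size <= SIZE_CAP

sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/a</loc></url>
  <url><loc>https://example.com/b</loc></url>
</urlset>"""

count, size, ok = check_sitemap(sample)
print(count, size, ok)  # count is 2 here
```

Running this against the raw (uncompressed) file on every sitemap regeneration catches a file drifting toward either cap before Google rejects it.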
Don’t rely on freshness-based splits as a crawl budget strategy. Mueller’s own doubt about the practice means there’s no confirmed mechanism backing it. If you’ve already structured sitemaps this way, there’s no reason to undo it, but don’t expect it to influence crawl frequency.
Check whether your CMS auto-generates multiple sitemap files. WordPress with Yoast, for example, creates separate sitemaps for posts, pages, and taxonomies by default. If that structure maps to how you’d want to monitor indexing, leave it. If it creates noise, consider whether a custom sitemap setup would be cleaner.
Watch out for
Assuming sitemap structure controls crawl priority. Sitemaps are a discovery mechanism, not a directive. Google decides crawl frequency based on its own signals, not on how you organize your sitemap files. Mueller’s answer reinforces that even the freshness-split theory lacks confirmation.
Hreflang file sizes being deceptive. A sitemap with 5,000 pages might look small, but if each page lists 15 language alternates via <xhtml:link> elements, the XML file size grows substantially. The 50,000-URL cap counts <url> entries, not child elements, so hreflang won’t push you past the URL limit at 5,000 pages. But the added XML markup can push the file past the 50MB uncompressed size limit. Check file size, not just URL count.