Wildcard DNS lets Googlebot index phantom subdomains as real pages

Summary

Wildcard DNS can cause Googlebot to discover and index phantom subdomains that were never built, especially when third parties link to non-existent URLs or Googlebot constructs them from patterns.

Phantom subdomains waste crawl budget and create duplicate content signals. Google may pick a phantom subdomain as canonical instead of the real page, hurting rankings. 301 redirects are the strongest fix.

Return 404s for unknown subdomains, or redirect phantoms with 301s to canonical URLs. Check Search Console for indexed duplicates and add rel="canonical" tags as backup.

What happened

A discussion in r/bigseo raised questions about how Google handles 301 redirects across a network of dating websites using wildcard DNS. The practitioner described a setup where wildcard DNS records resolve all subdomains to the same server, causing Googlebot to discover and index subdomains that were never intentionally created.

Wildcard DNS means any subdomain that someone types or links to resolves to the same server. If that server answers every hostname with a 200 and doesn’t distinguish between real and phantom subdomains, Googlebot treats them all as live pages. The result is indexed URLs that nobody built, serving duplicate or near-duplicate content.

Why it matters

Wildcard DNS is common in multi-brand or multi-location setups where operators spin up subdomains programmatically. Dating networks, regional directories, and white-label SaaS platforms often use this pattern. The problem surfaces when third parties link to arbitrary subdomains, or when Googlebot constructs URLs from patterns it finds elsewhere.

Once phantom subdomains get indexed, they compete with real pages for crawl budget. They also create duplicate content signals across what Google sees as separate hosts. Google’s canonicalization process will try to pick a preferred version, but it may not pick the one you want.

Google’s documentation on consolidating duplicate URLs lists three methods ranked by signal strength:

  • Redirects (301/302): The strongest canonicalization signal. The redirect target is treated as the canonical URL.
  • rel="canonical" annotations: A strong signal pointing Google to the preferred URL.
  • Sitemap inclusion: A weaker signal that hints which URLs should be canonical.

Google recommends combining these methods. But with wildcard DNS, the phantom subdomains may carry no canonical signals at all, leaving Google to pick on its own. A 301 closes that gap: per the HTTP semantics specification (RFC 9110, which obsoleted RFC 7231), a 301 status code means the resource has moved permanently, and Google treats it as a strong signal that the redirect target should be canonical.

The practical risk is twofold. Crawl budget gets spent on URLs that add no value. And if Google picks a phantom subdomain as canonical over a real page, the real page can lose rankings.

What to do

Audit your DNS configuration. Check whether your domain uses wildcard DNS records (an asterisk record like *.example.com). If you don’t need wildcard resolution, remove it. Explicit subdomain records are safer.
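
One rough way to run this audit without access to the zone file is to resolve a few random labels that nobody would have created on purpose. The sketch below is illustrative only: the function names and the "probe-" prefix are assumptions, and some ISP resolvers rewrite failed lookups, so a positive result should be confirmed against the actual DNS records.

```python
import secrets
import socket
from typing import Callable, Optional


def resolve(hostname: str) -> Optional[str]:
    """Return an IPv4 address for hostname, or None if it does not resolve."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


def looks_like_wildcard(domain: str,
                        resolver: Callable[[str], Optional[str]] = resolve,
                        probes: int = 3) -> bool:
    """Probe random labels under domain (hypothetical helper).

    If every random label resolves, a wildcard record is the likely cause.
    A single failed lookup is enough to rule it out.
    """
    for _ in range(probes):
        label = "probe-" + secrets.token_hex(8)
        if resolver(f"{label}.{domain}") is None:
            return False
    return True


# Example (performs real DNS lookups):
# print(looks_like_wildcard("example.com"))
```

The resolver is passed in as a parameter so the check can be exercised offline with a stub before it is pointed at a live domain.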

Return 404 or 410 for unknown subdomains. If you need wildcard DNS for legitimate reasons, configure your server to check incoming Host headers against a list of valid subdomains. Return a 404 for anything not on the list.
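
The allowlist check described above can be sketched at the application layer. This is a minimal WSGI example under stated assumptions: the hostnames in VALID_HOSTS are placeholders, and a real deployment might do the same check in the web server or load balancer instead.

```python
# Hypothetical allowlist; replace with the subdomains you actually operate.
VALID_HOSTS = {"www.example.com", "app.example.com", "blog.example.com"}


def normalize_host(host_header: str) -> str:
    """Lowercase the Host header and drop any port and trailing dot."""
    host = host_header.strip().lower()
    host = host.partition(":")[0]  # "blog.example.com:443" -> "blog.example.com"
    return host.rstrip(".")


def is_known_host(host_header: str) -> bool:
    """True only for hosts on the explicit allowlist."""
    return normalize_host(host_header) in VALID_HOSTS


def app(environ, start_response):
    """Minimal WSGI app: 404 for any Host that is not on the allowlist."""
    headers = [("Content-Type", "text/plain; charset=utf-8")]
    if not is_known_host(environ.get("HTTP_HOST", "")):
        start_response("404 Not Found", headers)
        return [b"Unknown host\n"]
    start_response("200 OK", headers)
    return [b"Real page content goes here\n"]
```

Matching against an explicit set rather than a pattern means a newly invented subdomain gets a 404 by default, which is the behavior you want Googlebot to see.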

Set up 301 redirects for known phantoms. If phantom subdomains are already indexed, redirect them to the correct canonical URL. A 301 is the strongest signal you can send.

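# Apache example: 301 any unlisted subdomain to the same path on the main host.
# Assumes www, app, and blog are the only real subdomains; adjust the allowlist.
RewriteEngine On
# Skip hosts on the allowlist (an optional :port is tolerated)
RewriteCond %{HTTP_HOST} !^(www|app|blog)\.example\.com(:\d+)?$ [NC]
# Only act on other subdomains of example.com, not unrelated hosts
RewriteCond %{HTTP_HOST} \.example\.com(:\d+)?$ [NC]
# ^/? keeps the rule working in both server config and .htaccess context
RewriteRule ^/?(.*)$ https://www.example.com/$1 [R=301,L]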

Add rel="canonical" as a fallback. On any subdomain that must remain live, add a <link rel="canonical"> tag pointing to the preferred URL. Combine with sitemap inclusion for the strongest signal set.

Check Google Search Console. Use a Domain property (verified via DNS) to see data across all subdomains. Look at the “Pages” report filtered by “Alternate page with proper canonical tag” and “Duplicate without user-selected canonical.” These statuses can reveal phantom subdomains Google has already found. Standard URL-prefix properties are scoped to a single subdomain and won’t surface phantom subdomains.
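
If you export the affected URLs from those reports, grouping them by hostname makes phantoms easy to spot. The sketch below assumes you already have the URLs as a list of strings (for example, read from an exported file); KNOWN_HOSTS and the function name are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical allowlist of subdomains you actually operate.
KNOWN_HOSTS = {"www.example.com", "app.example.com", "blog.example.com"}


def phantom_host_counts(urls, known_hosts=KNOWN_HOSTS):
    """Count URLs per hostname for every host that is not on the allowlist."""
    counts = Counter()
    for url in urls:
        # urlsplit lowercases the hostname; lines that are not URLs yield None
        host = urlsplit(url.strip()).hostname or ""
        if host and host not in known_hosts:
            counts[host] += 1
    return counts


# Example:
# for host, n in phantom_host_counts(exported_urls).most_common():
#     print(host, n)
```

Hosts with the highest counts are reasonable candidates to redirect or 404 first.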

Watch out for

Wildcard SSL certificates masking the problem. If you have a wildcard TLS certificate (*.example.com), phantom subdomains will serve over HTTPS without errors. Googlebot won’t hit any certificate warnings, so nothing flags these URLs as suspicious during crawling.

Redirect loops between phantom subdomains. If your redirect logic is based on pattern matching rather than an explicit allowlist, two phantom subdomains can redirect to each other. Test redirects with curl -I before deploying broadly.
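
Before deploying, a loop check can also be run offline against the redirect rules themselves. This sketch models the rules as a plain host-to-host mapping, which is an assumption made for illustration; real rewrite rules would first need to be expanded into such a mapping for the hosts you care about.

```python
from typing import Dict, List, Optional


def find_redirect_loop(start: str,
                       redirects: Dict[str, str],
                       max_hops: int = 10) -> Optional[List[str]]:
    """Follow redirects from start (hypothetical helper).

    Returns the chain of hosts if it revisits a host or grows past max_hops,
    or None if the chain ends at a host that does not redirect.
    """
    chain = [start]
    seen = {start}
    current = start
    while current in redirects:
        current = redirects[current]
        chain.append(current)
        if current in seen or len(chain) > max_hops:
            return chain
        seen.add(current)
    return None


# Example mapping with one loop and one healthy redirect:
rules = {
    "a.example.com": "b.example.com",
    "b.example.com": "a.example.com",
    "old.example.com": "www.example.com",
}
```

With this mapping, starting from a.example.com returns a chain that comes back to a.example.com, while old.example.com ends cleanly at www.example.com and returns None.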