Why IA migrations break without a staging crawl
Summary
Redirect mapping alone works for simple URL changes, but migrations that restructure information architecture need a staging crawl. Redirect chains, internal links left pointing at consolidated pages, and conflicts with Shopify’s auto-generated canonical tags won’t surface in post-launch monitoring until traffic has already dropped. Crawl both environments before launch, then keep monitoring for three to six months afterward.
What happened
A practitioner in r/TechSEO asked when a site migration justifies a full staging environment versus simply mapping redirects and going live. The scenario: a mid-sized ecommerce site moving from a custom CMS to Shopify. The domain stays the same, but the URL structure, navigation paths, and information architecture are all changing. Some content is being consolidated or removed entirely.
One commenter summarized the general consensus: “For any site with decent traffic I’d go the thorough route. It doesn’t actually take that much extra time.”
The question itself is worth unpacking because the answer isn’t binary. Redirect mapping alone works fine for simple migrations where URLs change but IA stays intact. The complexity threshold shifts when content consolidation, navigation restructuring, and platform-specific URL behaviors enter the picture.
Why it matters
The risk with a “go live and monitor” strategy is that monitoring is reactive: it catches problems only after they reach production. Google Search Console (GSC) reporting lags behind what is happening on the site, so by the time it shows crawl anomalies or ranking drops, the damage may already be weeks old. A staging crawl catches the same problems before they ship.
Three specific failure modes make staging valuable for IA-heavy migrations:
- Redirect chains that don’t surface in logs. When old URLs redirect to intermediate URLs that then redirect again to final destinations, the chain resolves correctly in a browser and Google will follow it. Google has confirmed that redirect chains pass signals without loss. The problem is operational: chains are harder to audit, harder to maintain, and mask mapping errors where an intermediate URL was supposed to go somewhere else entirely. Staging lets you crawl the full redirect topology and catch A-B-C chains before launch, when they’re still cheap to flatten.
- Internal links pointing to removed or consolidated pages. Content consolidation means some old URLs redirect to pages covering a broader topic. The redirect ends in a 200 at the final destination, so neither the redirect nor an internal link still pointing at the old URL shows up as a 404. But Google treats a redirect as a claim of equivalence and may assess whether the destination really is equivalent. For low-relevance redirects, signals may not transfer at all, rather than transferring at reduced strength.
- Platform-specific canonical behavior. Based on practitioner experience, Shopify generates canonical tags for product variants and collection URLs in ways that can conflict with your redirect rules. A carefully mapped redirect can land on a page where Shopify’s auto-generated canonical tag points somewhere unexpected. You won’t see that conflict in a redirect spreadsheet.
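The redirect-chain check in the first item can be run against the redirect map itself, before any crawl. The following is a minimal sketch in Python, assuming the map is available as a plain dict of old path to target path. The names `redirect_map`, `resolve_chain`, and `find_chains` and the sample paths are hypothetical, not taken from any tool.

```python
def resolve_chain(redirect_map, start, max_hops=10):
    """Follow redirects from `start` and return the list of URLs visited.

    Stops at the first URL with no further redirect, when a loop is
    detected, or after `max_hops` redirects.
    """
    path = [start]
    seen = {start}
    while path[-1] in redirect_map and len(path) <= max_hops:
        nxt = redirect_map[path[-1]]
        path.append(nxt)
        if nxt in seen:  # loop: stop and let the caller inspect the path
            break
        seen.add(nxt)
    return path


def find_chains(redirect_map):
    """Return {source: full path} for every source that needs more than one hop."""
    chains = {}
    for source in redirect_map:
        path = resolve_chain(redirect_map, source)
        if len(path) > 2:  # source -> intermediate -> ... -> final
            chains[source] = path
    return chains


# Hypothetical redirect map: old path -> target path
redirect_map = {
    "/shop/widgets.html": "/widgets",
    "/widgets": "/collections/widgets",
    "/about-us.html": "/pages/about",
}

for source, path in find_chains(redirect_map).items():
    print(" -> ".join(path))
```

Each flagged source can then be pointed directly at the last URL in its path. Before flattening, it is worth confirming that the intermediate URL was meant to lead there at all, since a chain may be hiding a mapping error. A check like this only sees rules in the map; redirects the platform adds on its own still need a crawl to surface.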
Google’s migration documentation covers planning steps for moves involving URL changes but doesn’t draw a hard line on when staging becomes necessary. The practical threshold depends on how much of the site’s link topology is changing, not just how many URLs are moving.
For ecommerce sites with faceted navigation, the risk multiplies. Old facet URLs may redirect to filtered collection pages on Shopify, but Shopify’s filtered collection URL structure varies with the theme and any installed search or filter apps. Confirm the URL patterns your specific Shopify implementation will produce before building redirect mappings from legacy facet URLs. If some facet combinations no longer exist, those redirects land on filtered pages with thin content, and Googlebot following them finds nothing useful.
What to do
Decide based on IA change, not URL count. If only URLs are changing and the site structure stays the same, redirect mapping with post-launch monitoring is reasonable. If navigation paths, content hierarchy, or page consolidation patterns are changing, staging is worth the investment.
Crawl both environments simultaneously. Use Screaming Frog or Sitebulb to crawl the staging site and compare it against a crawl of the current production site. Enable JavaScript rendering in the crawler — Screaming Frog defaults to static HTML mode, which will miss canonical tags and internal links injected by Shopify themes and apps. Look for:
- Redirect chains longer than one hop
- Internal links on the new site that point to old URLs
- Pages where Shopify’s auto-generated canonical tag conflicts with your redirect destination, or points to a collection or variant URL you intended to consolidate
- Orphaned pages that exist on staging but have no internal links pointing to them
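Two of the checks above, stale internal links and orphaned pages, reduce to set operations once the crawl data is in hand. This is a rough sketch, assuming the staging crawl has already been reduced to a set of page paths and a list of (source, target) link pairs. The function names and sample data are hypothetical, and the column names in a real crawler export will differ.

```python
def stale_internal_links(staging_links, legacy_urls):
    """Return (source, target) pairs where a staging page links to a legacy URL."""
    return sorted((src, dst) for src, dst in staging_links if dst in legacy_urls)


def orphaned_pages(staging_pages, staging_links, entry_points=("/",)):
    """Return staging pages that no other staging page links to.

    Entry points such as the home page are excluded because nothing
    needs to link to them for a crawler to find them.
    """
    linked = {dst for src, dst in staging_links if src != dst}
    return sorted(set(staging_pages) - linked - set(entry_points))


# Hypothetical crawl data
staging_pages = {
    "/",
    "/collections/widgets",
    "/products/blue-widget",
    "/pages/sizing-guide",
}
staging_links = [
    ("/", "/collections/widgets"),
    ("/collections/widgets", "/products/blue-widget"),
    ("/products/blue-widget", "/shop/widgets.html"),  # still points at an old URL
]
legacy_urls = {"/shop/widgets.html", "/about-us.html"}

print(stale_internal_links(staging_links, legacy_urls))
print(orphaned_pages(staging_pages, staging_links))
```

For the orphan check, the page list has to come from somewhere other than the crawl's own link discovery, such as a sitemap or a platform export, because a crawler that only follows links never reaches a true orphan.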
Test Shopify’s canonical and redirect behavior specifically. Shopify generates canonical tags for collections and product variants in ways that may conflict with your redirect rules. On staging, visit product variant URLs directly and check the rendered canonical tag. Compare it against what your redirect map expects. Google already overrides canonicals for several reasons; a platform-generated mismatch adds another.
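The comparison between rendered canonical and redirect map can be scripted once the rendered HTML has been saved for each staging URL. Below is a minimal sketch using Python's standard `html.parser`. It assumes the HTML was captured after JavaScript rendering, for example from a rendering crawler's export. The sample markup, URLs, and the names `CanonicalParser` and `canonical_conflicts` are invented for illustration and do not show what Shopify actually emits.

```python
from html.parser import HTMLParser


class CanonicalParser(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if "canonical" in rel and attr.get("href"):
            self.canonicals.append(attr["href"])


def canonical_conflicts(rendered_html_by_url, expected_by_url):
    """Return {url: (expected, found)} where the rendered canonical differs.

    A missing canonical or more than one canonical also counts as a conflict.
    """
    conflicts = {}
    for url, html in rendered_html_by_url.items():
        parser = CanonicalParser()
        parser.feed(html)
        expected = expected_by_url.get(url)
        if expected is not None and parser.canonicals != [expected]:
            conflicts[url] = (expected, parser.canonicals)
    return conflicts


# Hypothetical rendered HTML, keyed by the staging path it was fetched from
pages = {
    "/products/blue-widget?variant=123": (
        '<head><link rel="canonical" '
        'href="https://example.com/products/blue-widget"></head>'
    ),
    "/collections/widgets/products/blue-widget": (
        '<head><link rel="canonical" '
        'href="https://example.com/collections/widgets/products/blue-widget"></head>'
    ),
}
# What the redirect map expects each URL to canonicalize to
expected = {url: "https://example.com/products/blue-widget" for url in pages}

for url, (want, found) in canonical_conflicts(pages, expected).items():
    print(f"{url}: expected {want}, found {found}")
```

The sketch compares exact strings, so trailing slashes, protocol, and host differences will register as conflicts. Whether to normalize them first depends on how strict the audit needs to be.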
Validate content consolidation targets. For every old URL that redirects to a consolidated page, check whether the destination page actually covers the topic the old page ranked for. If the destination is a broad category page and the old page was a specific product or article, Google may not transfer ranking signals at all. Before redirecting any high-traffic or high-backlink page to a broader category page, evaluate whether that page should simply be retained or redirected to a closer equivalent. Consolidation redirects make sense for truly redundant content, not for unique pages with their own ranking signals.
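Relevance itself has to be judged by a person, but the review queue can be built mechanically. One possible heuristic, sketched here in Python, treats any destination that receives more than one old URL as a consolidation target and orders the affected old URLs by baseline traffic. The many-to-one rule, the function name `review_queue`, and the sample figures are assumptions for illustration. A one-to-one redirect to a broad category page would not be caught by this rule and still needs a manual look.

```python
from collections import defaultdict


def review_queue(redirect_map, baseline_clicks, top_n=20):
    """Old URLs that share a destination, ordered by baseline clicks (highest first).

    Returns a list of (old_url, target, clicks) tuples.
    """
    sources_by_target = defaultdict(list)
    for old_url, target in redirect_map.items():
        sources_by_target[target].append(old_url)

    consolidated = [
        (old_url, target, baseline_clicks.get(old_url, 0))
        for target, sources in sources_by_target.items()
        if len(sources) > 1  # more than one old URL folds into this target
        for old_url in sources
    ]
    consolidated.sort(key=lambda row: row[2], reverse=True)
    return consolidated[:top_n]


# Hypothetical redirect map and pre-launch click counts
redirect_map = {
    "/guides/widget-sizing.html": "/collections/widgets",
    "/shop/widgets.html": "/collections/widgets",
    "/about-us.html": "/pages/about",
}
baseline_clicks = {
    "/guides/widget-sizing.html": 650,
    "/shop/widgets.html": 900,
    "/about-us.html": 20,
}

for old_url, target, clicks in review_queue(redirect_map, baseline_clicks):
    print(f"{clicks:>6}  {old_url} -> {target}")
```

In this made-up data, the sizing guide is the kind of row worth questioning: a specific article folded into a category page.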
Set your monitoring baseline before launch. Export your current crawl stats, indexed page count, and top landing pages from GSC. After launch, compare against these baselines daily for the first month, then weekly for three to six months. IA migrations surface problems on a longer tail than simple URL moves: consolidation-related ranking loss often doesn’t appear until Google re-evaluates page quality signals weeks after the initial recrawl. Pre-define traffic drop thresholds so you know what counts as normal fluctuation versus a real problem. Pay special attention to pages marked “Crawled - currently not indexed”. This status has several potential causes, including duplicate content, soft 404s, canonical conflicts, quality signals, and Google’s crawl queue prioritization, and should be investigated case by case.
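A pre-defined threshold is easiest to apply when the baseline is rolled up to the new URLs first, since consolidation means several old landing pages may feed one destination. This is a minimal sketch in Python. The 30% threshold, the 50-click floor, the function name `flag_drops`, and all sample figures are placeholders rather than recommendations, and the inputs are assumed to be click counts per page over comparable periods.

```python
from collections import defaultdict


def flag_drops(baseline_clicks, current_clicks, redirect_map,
               threshold=0.30, min_baseline=50):
    """Return {new_url: fractional drop} for pages that fell by more than `threshold`.

    Baseline clicks are keyed by old URL and summed onto each old URL's
    redirect destination before comparing with post-launch clicks.
    """
    expected = defaultdict(int)
    for old_url, clicks in baseline_clicks.items():
        expected[redirect_map.get(old_url, old_url)] += clicks

    flagged = {}
    for url, before in expected.items():
        if before < min_baseline:  # too little traffic to separate signal from noise
            continue
        after = current_clicks.get(url, 0)
        drop = (before - after) / before
        if drop > threshold:
            flagged[url] = round(drop, 2)
    return flagged


# Hypothetical data: pre-launch clicks by old URL, post-launch clicks by new URL
baseline_clicks = {
    "/shop/widgets.html": 900,
    "/shop/widgets-blue.html": 300,
    "/shop/blue-widget.html": 400,
    "/about-us.html": 20,
}
redirect_map = {
    "/shop/widgets.html": "/collections/widgets",
    "/shop/widgets-blue.html": "/collections/widgets",
    "/shop/blue-widget.html": "/products/blue-widget",
    "/about-us.html": "/pages/about",
}
current_clicks = {
    "/collections/widgets": 1100,
    "/products/blue-widget": 180,
    "/pages/about": 5,
}

print(flag_drops(baseline_clicks, current_clicks, redirect_map))
```

A single threshold is a blunt instrument. Seasonal stores would likely want to compare against the same period in the prior year rather than the weeks just before launch.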
Watch out for
Staging environments that don’t match production. If your staging store lacks the theme JavaScript, lazy-loading behavior, or apps that the production Shopify store will run, your staging crawl results won’t match what Googlebot encounters. Make sure the staging environment mirrors the production Shopify configuration as closely as possible.
Crawls that skip JavaScript rendering. Shopify themes and apps can inject or modify canonical tags via JavaScript, so a default non-JS crawl in Screaming Frog or Sitebulb will miss those conflicts entirely. Enable JavaScript rendering in the crawler settings before validating canonical tags on Shopify staging.
“Successful” redirects masking relevance loss. A redirect that returns a 200 at the final destination looks clean in every audit tool. But if the destination page doesn’t match the search intent of the old page, Google may not transfer signals at all, and rankings will erode over weeks. No redirect audit tool flags semantic mismatches. You need to review consolidated redirect targets manually, at least for your top-traffic pages.