Battling Next.js SEO Issues on a Government Jobs Aggregator
Next.js is the default choice for React-based web applications, and Vercel makes deployment effortless. But “effortless” hides a minefield of SEO pitfalls, ones that only surface at scale, across different page types, and under the unforgiving lens of Googlebot’s rendering pipeline. (If you are weighing framework options, see our Astro vs Next.js comparison for SEOs for a side-by-side breakdown.)
This case study follows a government jobs aggregator we will call GovJobsHub, a Next.js App Router site on Vercel with roughly 20,000 pages. The site aggregates federal, state, and local government job listings into programmatically generated pages organized by location, category, and agency. It is the kind of site where technical SEO determines whether tens of thousands of pages get indexed or disappear into a crawl budget black hole.
We will walk through every pitfall we encountered, explain why each one happens in the Next.js + Vercel stack, and show how to configure the stack properly from the start.
Understanding the Page Types
Before diagnosing problems, you need to understand that different page types on a Next.js site can have completely different rendering behaviors, crawl characteristics, and SEO requirements. This was our most important lesson: never assume one working page type means they all work.
GovJobsHub has six distinct page types:
Job Detail Pages (/jobs/[id])
Individual job listings. Each has a title, description, salary range, location, agency, and application deadline. These are the most valuable pages for Google Jobs rich results. Count: ~15,000 pages, constantly churning as listings expire and new ones appear.
SEO requirements: JobPosting structured data, proper 410 status on expiry, fresh content signals, unique meta descriptions.
Location Pages (/jobs/[state], /jobs/[state]/[city])
Aggregation pages listing jobs by geography. State pages show all jobs in that state. City pages narrow further. Count: ~2,500 pages (50 states + major cities).
SEO requirements: Unique content beyond the job list itself, BreadcrumbList schema, proper pagination, no thin-content signals.
Category Pages (/jobs/category/[slug])
Jobs grouped by field: IT, healthcare, law enforcement, administration. Count: ~200 pages.
SEO requirements: Similar to location pages. Category descriptions must not be boilerplate.
Agency Pages (/jobs/federal/[agency])
Federal jobs grouped by agency: VA, DOD, USPS, etc. Count: ~150 pages.
SEO requirements: Agency-specific context, not just a filtered job list.
Hub Pages (/jobs, /jobs/federal, /jobs/remote)
Top-level entry points that aggregate across all jobs or major segments. Count: ~10 pages.
SEO requirements: Strong internal linking, SSR mandatory, canonical management for pagination.
Static Pages (homepage, /about, /faq, /checker)
Marketing and utility pages. Count: ~10 pages.
SEO requirements: Standard on-page SEO, FAQPage schema where applicable.
Each of these page types had different problems. That is the nature of Next.js at scale: the framework’s flexibility means each route can end up with a different rendering strategy, and silent inconsistencies compound.
Pitfall 1: The Rendering Gap Across Page Types
This was the most damaging issue and the hardest to detect.
GovJobsHub’s job detail pages were fully server-rendered. You could curl the URL and see complete job descriptions, salary data, and structured markup in the raw HTML. These pages looked great to Googlebot.
But the main /jobs hub page, the highest-traffic listing page and the root of the site’s internal link graph, told a different story. The raw HTML contained React Server Component flight data: serialized self.__next_f.push() arrays instead of actual job cards. The content only appeared after JavaScript execution.
The location pages were a mix. State-level pages (/jobs/california) rendered server-side. But city-level pages (/jobs/california/los-angeles) used a hybrid approach where the job list was delivered as serialized JSON in RSC payloads rather than rendered HTML.
Why This Happens
In the Next.js App Router, any component marked with 'use client' renders on the client. If your job listing grid is a Client Component, maybe because it has sorting, filtering, or pagination interactions, the actual job data is not in the initial HTML. The server sends a placeholder and the RSC payload, and the client hydrates it.
The insidious part: this works perfectly in the browser. You never notice unless you view source or disable JavaScript. As Sam Torres noted in her JavaScript SEO AMA, rendering queues, not crawl budget, are the real bottleneck for JavaScript-heavy sites.
How to Detect It
# Fetch raw HTML and check for actual content vs RSC payloads
curl -s https://yoursite.com/jobs | grep -c "self.__next_f.push"
curl -s https://yoursite.com/jobs | grep -c "<article"
# If you see many __next_f.push calls and zero <article> tags,
# the page depends on client-side rendering

Run this check against every page type. Do not assume, verify. For a more structured approach, Sitebulb’s rendering comparison (response vs. render) covers this analysis in depth.
How to Fix It
Move SEO-critical content out of Client Components. In the App Router, the default is Server Components: content renders on the server unless you explicitly opt out with 'use client'. The fix is architectural:
// BAD: Job list in a Client Component
'use client'
export function JobList({ jobs }) {
// This content is NOT in initial HTML
return jobs.map(job => <JobCard key={job.id} job={job} />)
}
// GOOD: Server Component with client interactivity separated
// JobList is a Server Component (default)
export function JobList({ jobs }) {
// This content IS in initial HTML
return (
<div>
{jobs.map(job => <JobCard key={job.id} job={job} />)}
{/* Only the interactive filter is a Client Component */}
<JobFilter />
</div>
)
}

The principle: render the content on the server, hydrate only the interactivity on the client. Every page type that you want indexed should have its primary content in Server Components.
Pitfall 2: _rsc Parameter Pollution
When a user navigates between pages on a Next.js App Router site, the framework fetches an optimized RSC payload by appending ?_rsc=XXXXX to the URL. This is an internal mechanism; it is not meant to be seen by search engines.
But Googlebot sees everything. It discovers these _rsc URLs during JavaScript rendering, follows them, and attempts to index them. The result: thousands of “Duplicate, Google chose different canonical” entries in Search Console.
GovJobsHub had over 1,300 of these entries within three months of launch.
Why It Is Hard to Fix
This is a framework-level issue with no clean solution:
- robots.txt Disallow: /*?_rsc=: Google still discovers and reports the URLs. They show as “Indexed, though blocked by robots.txt” instead of disappearing.
- Middleware redirect: Next.js strips _rsc from the NextRequest object before middleware processes it. You literally cannot see the parameter in middleware code.
- next.config.js redirects: Using has conditions to redirect _rsc URLs reduced errors from 1,300 to about 400, but did not eliminate the problem.
- Disabling prefetch: Setting prefetch={false} on all <Link> components prevents _rsc requests entirely but sacrifices the performance benefits of prefetching.
The Pragmatic Approach
There is no silver bullet. The combination that worked best for GovJobsHub:
- robots.txt Disallow: blocks most crawling of these URLs
- Canonical tags on every page: pointing to the clean URL without parameters
- Selective prefetch disabling: turn off prefetch on pages with dozens of internal links (like listing pages) where the _rsc generation is heaviest (see the sketch after this list)
- Accept the noise: some _rsc entries in Search Console are cosmetic. Focus on whether your clean URLs are indexed correctly, not on eliminating every duplicate report
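For the selective prefetch item, a minimal sketch of what that looks like in practice (the JobCardLink component is illustrative, not from the GovJobsHub codebase):

// Job-card links on listing pages opt out of prefetching to avoid dozens of _rsc
// requests per page view; primary navigation links keep the default prefetch behavior.
import Link from 'next/link'

export function JobCardLink({ job }: { job: { id: string; title: string } }) {
  return (
    <Link href={`/jobs/${job.id}`} prefetch={false}>
      {job.title}
    </Link>
  )
}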
# robots.txt
User-agent: *
Disallow: /*?_rsc=
Disallow: /*&_rsc=

The Bigger Question
This issue is tracked in multiple GitHub discussions with hundreds of participants and no official resolution from the Next.js team. Making matters worse, Google recently removed its JavaScript SEO guidance, leaving practitioners without an official render validation framework. If you are building a site where clean index coverage matters (and at 20,000 pages, it absolutely does), you need to account for this as a known, ongoing maintenance burden.
Setting It Up Right: Rendering Strategy Per Page Type
The core mistake is treating all pages the same. Each page type on a programmatic site needs its own rendering strategy based on content volume, update frequency, and SEO value.
Here is what worked for GovJobsHub after the fixes:
| Page Type | Rendering | Revalidation | Rationale |
|---|---|---|---|
| Job detail (/jobs/[id]) | ISR | 24 hours | High volume, moderate churn. Cannot SSG 15K pages at build time. |
| State pages (/jobs/[state]) | SSG | Build time | 50 pages, stable URLs, high SEO value. Pre-build all of them. |
| City pages (/jobs/[state]/[city]) | ISR | 48 hours | ~2,500 pages, moderate churn. Too many for full SSG. |
| Category pages (/jobs/category/[slug]) | SSG | Build time | ~200 pages, stable. Pre-build all. |
| Agency pages (/jobs/federal/[agency]) | SSG | Build time | ~150 pages, stable. Pre-build all. |
| Hub pages (/jobs, /jobs/federal) | ISR | 1 hour | High traffic, content changes with each new listing. |
| Static pages | SSG | Build time | Rarely changes. |
The Decision Framework
Use SSG (generateStaticParams) when:
- Page count is under 500
- URLs are stable and predictable
- Content changes infrequently (weekly or less)
- Pages are high SEO value (location hubs, category landing pages)
Use ISR when:
- Page count is in the thousands
- Content updates daily but not in real-time
- You need fresh content without full rebuilds
- Set revalidate shorter than your content’s lifespan
Never use client-side rendering for:
- Any page you want indexed
- Any page with structured data
- Any page that is a target for internal linking
// Example: generateStaticParams for state pages
// This pre-builds all 50 state pages at build time
export async function generateStaticParams() {
return US_STATES.map(state => ({
state: state.slug,
}))
}
// Example: ISR for job detail pages
// Revalidates every 24 hours
export const revalidate = 86400

Pitfall 3: Robots.txt and Meta Robots Contradictions
GovJobsHub had a resume checker tool at /checker. The robots.txt blocked it with Disallow: /checker/. But the page’s HTML included <meta name="robots" content="index, follow">. These directives conflict: robots.txt prevents crawling, while the meta tag (which Googlebot never sees because it cannot crawl the page) says to index it.
This is not just a GovJobsHub problem. It is a pattern on Next.js sites where robots.txt is managed in one file and meta robots are set in page-level metadata, two different systems with no built-in consistency check.
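There is no built-in check, but a small audit script can catch the contradiction before Google does. A minimal sketch, assuming you maintain the list of disallowed prefixes by hand to mirror robots.txt (the origin and paths are illustrative):

// audit-robots.mjs: warn when a robots.txt-blocked path still declares "index" in its meta robots
const ORIGIN = 'https://www.yoursite.com'
const BLOCKED_PREFIXES = ['/checker', '/api'] // keep in sync with robots.txt Disallow rules

for (const path of BLOCKED_PREFIXES) {
  const html = await (await fetch(`${ORIGIN}${path}`)).text()
  const content = html.match(/<meta\s+name="robots"\s+content="([^"]*)"/i)?.[1] ?? ''
  if (content.includes('index') && !content.includes('noindex')) {
    console.warn(`${path} is disallowed in robots.txt but its meta robots says "${content}"`)
  }
}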
Other robots.txt Mistakes
Blocking static assets: Several Next.js sites block /_next/static/ in robots.txt, thinking they are hiding implementation details. This prevents Googlebot from loading CSS and JavaScript needed to render pages. Only block /_next/data/ if you want to prevent JSON endpoint crawling.
Missing _rsc blocking: As covered above, _rsc parameters should be disallowed.
Overly broad API blocking: Disallow: /api/ blocks all API routes, but some sites serve structured data or public content through API routes that should be crawlable.
Proper robots.txt for Next.js on Vercel
User-agent: *
Allow: /_next/static/
Allow: /_next/image/
Disallow: /_next/data/
Disallow: /api/
Disallow: /*?_rsc=
Disallow: /*&_rsc=
Sitemap: https://www.yoursite.com/sitemap.xml

Pair this with consistent meta robots in your layout:
// app/layout.tsx — default for all pages
export const metadata = {
robots: {
index: true,
follow: true,
},
}
// app/admin/layout.tsx — override for non-public sections
export const metadata = {
robots: {
index: false,
follow: false,
},
}

Pitfall 4: Soft 404s and Wrong Status Codes
The SALT.agency study of 50 Next.js sites found that 41 out of 50 failed to return proper 404 status codes for non-existent URLs. GovJobsHub was among them initially.
The problem manifests differently per page type:
Job Detail Pages
When a job listing expires, what should happen? The page should return 410 Gone, telling Google the content existed but has been permanently removed. Instead, GovJobsHub was returning 200 with a “This job is no longer available” message. Google kept these pages in the index with stale JobPosting structured data, wasting crawl budget and showing expired listings in search results. This matters even more than you might expect: Google may skip JavaScript rendering entirely for non-200 pages, so getting the status code right determines whether your error handling is even rendered.
Dynamic Route Catchalls
Requesting /jobs/not-a-real-state returned a 200 status code with a generic “No jobs found” page instead of a 404. At scale, this means any URL under /jobs/ appears valid to crawlers, encouraging them to waste budget on non-existent paths.
The Fix
// app/jobs/[id]/page.tsx
import { notFound } from 'next/navigation'
export default async function JobPage({ params }) {
const job = await getJob(params.id)
if (!job) {
notFound() // Returns 404 status code
}
if (job.expired) {
// For expired content, return 410 Gone
return new Response(null, { status: 410 })
}
return <JobDetail job={job} />
}

Verify per page type. Curl non-existent URLs under each route pattern and check the status code:
curl -o /dev/null -s -w "%{http_code}" https://yoursite.com/jobs/fake-id-12345
curl -o /dev/null -s -w "%{http_code}" https://yoursite.com/jobs/not-a-state
curl -o /dev/null -s -w "%{http_code}" https://yoursite.com/jobs/category/fake

Every one of those should return 404, not 200.
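For the catchall problem on location routes, the same notFound() pattern applies. A sketch, reusing the US_STATES lookup from the generateStaticParams example (the StateJobList component and the lookup module path are hypothetical):

// app/jobs/[state]/page.tsx: validate the slug before rendering
import { notFound } from 'next/navigation'
import { US_STATES } from '@/lib/states' // assumed shared lookup, also used by generateStaticParams
import { StateJobList } from '@/components/StateJobList' // hypothetical listing component

export default async function StatePage({ params }: { params: { state: string } }) {
  const state = US_STATES.find(s => s.slug === params.state)
  if (!state) {
    notFound() // /jobs/not-a-real-state now returns a real 404 instead of a "No jobs found" page
  }
  return <StateJobList state={state} />
}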
Pitfall 5: Missing and Broken Structured Data
GovJobsHub’s structured data situation was a mixed bag. Job detail pages had solid JobPosting schema. Everything else was bare.
What Was Missing
| Page Type | Had | Needed |
|---|---|---|
| Job detail | JobPosting | Already good |
| Location pages | Organization only | JobPosting aggregate, BreadcrumbList |
| Category pages | Organization only | BreadcrumbList |
| Hub pages | Organization, WebSite | BreadcrumbList |
| FAQ page | Nothing | FAQPage |
| All pages | Nothing | BreadcrumbList |
Next.js-Specific JSON-LD Gotcha
In Next.js, there is no Metadata API field for JSON-LD, so you cannot drop it into the <head> the way you might in a traditional HTML site. Instead, the JSON-LD <script> tag must be rendered within a Server Component in the page body:
// app/jobs/[id]/page.tsx
export default async function JobPage({ params }) {
const job = await getJob(params.id)
const jsonLd = {
'@context': 'https://schema.org',
'@type': 'JobPosting',
title: job.title,
description: job.description,
datePosted: job.postedDate,
validThrough: job.expiryDate,
hiringOrganization: {
'@type': 'Organization',
name: job.agency,
},
jobLocation: {
'@type': 'Place',
address: {
'@type': 'PostalAddress',
addressLocality: job.city,
addressRegion: job.state,
addressCountry: 'US',
},
},
}
return (
<>
<script
type="application/ld+json"
dangerouslySetInnerHTML={{
// Sanitize to prevent XSS — replace < with unicode escape
__html: JSON.stringify(jsonLd).replace(/</g, '\\u003c'),
}}
/>
<JobDetail job={job} />
</>
)
}

Critical: The JSON.stringify XSS sanitization (replacing < with \u003c) is not optional. Without it, malicious job descriptions could inject scripts via structured data.
BreadcrumbList for Hierarchical Pages
Every page with a position in the site hierarchy should have BreadcrumbList schema. For a site with /jobs/california/los-angeles, that means:
{
"@context": "https://schema.org",
"@type": "BreadcrumbList",
"itemListElement": [
{ "@type": "ListItem", "position": 1, "name": "Jobs", "item": "https://www.govJobshub.com/jobs" },
{ "@type": "ListItem", "position": 2, "name": "California", "item": "https://www.govJobshub.com/jobs/california" },
{ "@type": "ListItem", "position": 3, "name": "Los Angeles" }
]
}

Implement this as a reusable Server Component that takes the breadcrumb trail as a prop. Use it on every page type except the homepage.
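A minimal sketch of such a component (the Breadcrumbs name and Crumb shape are assumptions, not the site’s actual code):

// components/Breadcrumbs.tsx: Server Component that emits BreadcrumbList JSON-LD
type Crumb = { name: string; url?: string }

export function Breadcrumbs({ trail }: { trail: Crumb[] }) {
  const jsonLd = {
    '@context': 'https://schema.org',
    '@type': 'BreadcrumbList',
    itemListElement: trail.map((crumb, i) => ({
      '@type': 'ListItem',
      position: i + 1,
      name: crumb.name,
      // the final crumb (the current page) omits item, as in the example above
      ...(crumb.url ? { item: crumb.url } : {}),
    })),
  }
  return (
    <script
      type="application/ld+json"
      dangerouslySetInnerHTML={{ __html: JSON.stringify(jsonLd).replace(/</g, '\\u003c') }}
    />
  )
}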
Pitfall 6: Boilerplate Content at Scale
Every state page on GovJobsHub had an “About Government Jobs in [State]” section that read almost identically:
“Government jobs in [State] offer competitive salaries, excellent benefits, and job security. Browse our latest listings from federal, state, and local agencies.”
That sentence appeared, with minor variations, on 50 state pages, hundreds of city pages, and dozens of category pages. At scale, this is a thin content signal. Google’s quality systems look for pages that add substantive, unique value. When the only difference between /jobs/california and /jobs/texas is the state name in a template sentence, both pages risk being classified as low-quality.
The Fix: Data-Driven Unique Content
Replace boilerplate with programmatically generated content that is genuinely unique per page:
// Generate unique location context
function getLocationContent(state: string, stats: StateStats) {
return {
intro: `${state} has ${stats.activeListings.toLocaleString()} open government positions across ${stats.agencyCount} agencies. The average salary is $${stats.avgSalary.toLocaleString()}.`,
topAgencies: `The largest employers are ${stats.topAgencies.slice(0, 3).join(', ')}.`,
trends: stats.monthOverMonth > 0
? `Listings are up ${stats.monthOverMonth}% compared to last month.`
: `Listings are down ${Math.abs(stats.monthOverMonth)}% compared to last month.`,
}
}

Even two or three sentences of unique, data-driven content per page significantly differentiate them. The key is pulling from real data (job counts, salary ranges, top employers, trending categories), not just swapping a place name into a template.
Category and Agency Pages
The same principle applies. A category page for “IT & Technology” should reference the specific agencies hiring for tech roles, the salary range for that category, and any notable trends. An agency page for the VA should mention its hiring volume, locations, and most common position types.
This content does not need to be hand-written. It needs to be data-driven and genuinely different per page.
Pitfall 7: Page Churn and the Expiring Content Problem
A job board is not a blog. Content does not accumulate; it churns. GovJobsHub has roughly 15,000 job detail pages at any given time, but individual listings have a lifespan of 30 to 90 days. That means 2,000 to 5,000 pages expire every month and roughly the same number of new pages appear.
This creates a cascade of SEO problems that static content sites never face.
The Index Bloat Cycle
Here is what happens without intervention:
- A job listing is posted. ISR generates the page. Googlebot crawls it. It enters the index with JobPosting rich results.
- 60 days later, the listing expires. The source data is removed.
- But the ISR cache still serves the old page. Googlebot crawls the cached version and sees active content.
- Eventually ISR revalidates and the page updates, but to what? If the code renders a “This job is no longer available” message with a 200 status code, Google keeps the URL in the index as a soft 404.
- Meanwhile, the expired listing still has JobPosting structured data in Google’s cache, showing in search results with stale salary, location, and apply links.
At scale, this means hundreds of expired listings sitting in Google’s index at any given time, damaging user trust and wasting crawl budget.
The Fix: A Page Lifecycle Strategy
Every page type with expiring content needs a defined lifecycle:
Active listing (200 OK):
- Full content, JobPosting schema, in sitemap
- ISR revalidation every 24 hours
Expired listing (410 Gone):
- Return 410 status immediately, not a soft 404, not a redirect
- Strip JobPosting schema
- Remove from sitemap on next generation
- Trigger on-demand ISR revalidation so the 410 is served immediately, not after the next cache interval
// app/jobs/[id]/page.tsx
import { notFound } from 'next/navigation'

export default async function JobPage({ params }) {
const job = await getJob(params.id)
if (!job) {
notFound() // 404 for never-existed
}
if (job.status === 'expired') {
// Return 410 Gone — this listing existed but is permanently removed
return new Response('This job listing has been removed.', {
status: 410,
headers: { 'Content-Type': 'text/html' },
})
}
return <JobDetail job={job} />
}

The Google Indexing API
For job boards specifically, Google offers the Indexing API, which supports URL_DELETED notifications. This is dramatically faster than waiting for Googlebot to recrawl: deletions are processed within minutes, not days.
// Notify Google when a listing expires
import { google } from 'googleapis'

async function notifyGoogleOfRemoval(url: string) {
const auth = new google.auth.GoogleAuth({
scopes: ['https://www.googleapis.com/auth/indexing'],
})
const client = await auth.getClient()
await client.request({
url: 'https://indexing.googleapis.com/v3/urlNotifications:publish',
method: 'POST',
data: {
url,
type: 'URL_DELETED',
},
})
}

GovJobsHub was not using this API at all initially. After implementing it, stale listings were deindexed within hours instead of lingering for weeks.
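Tying the pieces together: when a listing is marked expired, you can revalidate its page so the 410 is served immediately and notify the Indexing API in the same step. A sketch, assuming it runs inside a route handler or server action (onJobExpired is a hypothetical name):

import { revalidatePath } from 'next/cache'

export async function onJobExpired(jobId: string) {
  // Bust the ISR cache so the next crawl sees the 410 instead of the stale listing
  revalidatePath(`/jobs/${jobId}`)
  // Tell Google to drop the URL (notifyGoogleOfRemoval defined above)
  await notifyGoogleOfRemoval(`https://www.govJobshub.com/jobs/${jobId}`)
}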
Sitemap Freshness
Your sitemap must reflect page removals quickly. If you generate sitemaps at build time, expired listings stay in the sitemap until the next deploy. For a job board, sitemaps should be generated dynamically or regenerated on a schedule shorter than your content’s average lifespan.
At minimum, run sitemap regeneration daily. Include only active listings. Set lastmod to the listing’s actual post date, not the sitemap generation time.
Pitfall 8: Filter Pills, the Hidden Client Rendering Trap
GovJobsHub has filter pills on every listing page. Users click pills to filter by job category (IT, Healthcare, Law Enforcement), location type (Remote, On-site, Hybrid), salary range, and agency. These pills are a standard UI pattern: small, rounded chips that toggle on and off.
They are also an SEO disaster in a typical Next.js implementation.
The Rendering Problem
Filter pills are interactive. Users click them. They toggle state. They update the job list below. In Next.js, this means they are almost always implemented as Client Components:
// Typical implementation — entirely client-rendered
'use client'
export function FilterPills({ categories, activeFilters, onToggle }) {
return (
<div className="flex gap-2 flex-wrap">
{categories.map(cat => (
<button
key={cat.slug}
onClick={() => onToggle(cat.slug)}
className={activeFilters.includes(cat.slug) ? 'active' : ''}
>
{cat.name} ({cat.count})
</button>
))}
</div>
)
}

The problem: none of this renders in the initial HTML. Googlebot sees an empty <div> where the pills should be. The category names, the job counts, the entire navigational structure of the filter UI: all invisible to crawlers.
This matters because those pill labels are often keyword-rich terms that help search engines understand the page’s topic. “IT & Technology (342 jobs)” is a strong relevance signal for a listing page. When it is client-rendered, that signal disappears.
The URL Parameter Problem
Clicking pills typically updates the URL: /jobs?category=IT&location=remote. Each combination is a unique URL that Googlebot can discover and attempt to index. With 20 categories, 3 location types, and 5 salary ranges, that is potentially hundreds of filtered URL variations per listing page, most of which contain duplicate or near-duplicate content.
GovJobsHub had over 800 filtered URL variations discovered in Search Console, each generating a “Duplicate, Google chose different canonical” warning.
The Fix: Server-Rendered Pills with Client Interactivity
Separate the rendering from the interaction:
// Server Component — renders pill labels and counts in initial HTML
export function FilterPills({ categories, activeFilters }) {
return (
<div className="flex gap-2 flex-wrap">
{categories.map(cat => (
<PillButton
key={cat.slug}
slug={cat.slug}
label={`${cat.name} (${cat.count})`}
isActive={activeFilters.includes(cat.slug)}
/>
))}
</div>
)
}
// Client Component — only handles the click interaction
'use client'
import { useRouter } from 'next/navigation'

export function PillButton({ slug, label, isActive }) {
const router = useRouter()
return (
<button
onClick={() => {
// Update URL params and re-fetch
const params = new URLSearchParams(window.location.search)
if (isActive) {
params.delete('category', slug)
} else {
params.append('category', slug)
}
router.push(`?${params.toString()}`, { scroll: false })
}}
className={isActive ? 'active' : ''}
>
{label}
</button>
)
}

Now the pill labels and counts render in the initial HTML (visible to Googlebot), while the click behavior hydrates on the client.
Managing Filtered URLs
Even with server-rendered pills, the URL parameter problem remains. The fix is canonical management:
// All filtered views canonical to the unfiltered URL
export async function generateMetadata({ searchParams }) {
const hasFilters = Object.keys(searchParams).some(
key => ['category', 'location', 'salary'].includes(key)
)
return {
alternates: {
canonical: 'https://www.yoursite.com/jobs', // Always clean URL
},
// Noindex filtered views to prevent duplicate content
...(hasFilters && {
robots: { index: false, follow: true },
}),
}
}

The follow: true is important because even though the filtered page is noindexed, you still want Googlebot to follow the links on it to discover individual job detail pages.
Pitfall 9: Core Web Vitals Across Page Types
The SALT.agency study of 50 Next.js sites found that only 3 out of 50 passed LCP and only 1 out of 50 passed all three Core Web Vitals thresholds. GovJobsHub was not an outlier: it failed LCP on listing pages and had INP issues on interactive pages.
LCP: The Image Problem
Listing pages have hero images and dozens of job cards with agency logos. The default behavior of next/image is to lazy-load everything. But the hero image is above the fold; it should not be lazy-loaded.
// BAD: Hero image lazy-loads by default
<Image src={heroImage} alt="..." width={1200} height={600} />
// GOOD: Hero image preloaded with priority
<Image src={heroImage} alt="..." width={1200} height={600} priority />

This single prop (priority) was the difference between a 3.8s and a 2.1s LCP on GovJobsHub’s listing pages.
INP: The Hydration Problem
Interactive pages (those with search filters, sorting, and pagination) had poor Interaction to Next Paint (INP) scores. The cause: heavy hydration. When the client-side JavaScript boots up and hydrates Server Component output, the main thread is blocked. Any user interaction during hydration (clicking a filter, typing in search) queues behind the hydration work.
Mitigations:
- Reduce Client Component scope: hydrate only the interactive parts, not the entire page
- Use React.lazy and dynamic imports: defer hydration of below-the-fold interactive components (see the sketch after this list)
- Avoid CSS-in-JS: libraries like Styled Components inject styles at runtime, causing layout recalculations that block the main thread
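A sketch of the second mitigation using next/dynamic, which wraps React.lazy with Next.js-aware code splitting (the SalaryCalculator widget is hypothetical):

// Defer a below-the-fold interactive widget so its JS does not compete with initial hydration
import dynamic from 'next/dynamic'

const SalaryCalculator = dynamic(() => import('@/components/SalaryCalculator'), {
  // Reserve space while the chunk loads to avoid layout shift
  loading: () => <div style={{ minHeight: 320 }} />,
})

export function JobDetailExtras() {
  return <SalaryCalculator />
}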
CLS: The Dynamic Content Problem
Location pages that load job counts and statistics asynchronously caused Cumulative Layout Shift. The page renders, then numbers pop in and push content down.
Fix: Reserve space for dynamic content using CSS min-height or skeleton placeholders that match the final content dimensions. Better yet, fetch the data on the server so it is in the initial render.
Test Per Page Type
CWV scores vary dramatically across page types. The homepage might score 95 on Lighthouse while listing pages score 45. Test every template independently:
# Test each page type with Lighthouse CLI
lighthouse https://yoursite.com/ --output=json
lighthouse https://yoursite.com/jobs --output=json
lighthouse https://yoursite.com/jobs/california --output=json
lighthouse https://yoursite.com/jobs/12345 --output=json

Pitfall 10: Vercel-Specific Issues
Vercel makes Next.js deployment simple, but its platform constraints create SEO-specific challenges that are not obvious until you hit them.
ISR Cache Staleness
Vercel’s ISR implementation caches pages on its Edge Network. When revalidate is set to 86400 (24 hours), the page can serve stale content for up to 24 hours after the source data changes. For a job board, this means:
- Expired job listings still appear in search results with active JobPosting schema
- Google crawls the cached page and sees content that no longer exists
- When the cache finally revalidates, the page updates, but Google may not re-crawl for days
Fix: Use on-demand revalidation. When a job listing is removed from the database, call Vercel’s revalidation API:
// API route: /api/revalidate
import { revalidatePath } from 'next/cache'

export async function POST(request: Request) {
const { path, secret } = await request.json()
if (secret !== process.env.REVALIDATION_SECRET) {
return new Response('Unauthorized', { status: 401 })
}
await revalidatePath(path)
return Response.json({ revalidated: true })
}

Serverless Function Timeouts
Vercel’s default function timeout is 10 seconds on the Hobby plan, 60 seconds on Pro. Pages that query large datasets, like a hub page aggregating 20,000 job listings for sorting and filtering, can timeout during SSR.
Fix: Pre-compute aggregations. Do not query the full dataset on every request. Build summary data at deploy time or via a scheduled job, and have the SSR page read from the pre-computed summary.
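One way to do that on Vercel, sketched with a cron-triggered route handler (the route path and summary helpers are illustrative assumptions):

// app/api/rebuild-job-summary/route.ts: scheduled aggregation
// Triggered by a Vercel cron entry in vercel.json, e.g.
//   { "crons": [{ "path": "/api/rebuild-job-summary", "schedule": "0 * * * *" }] }
import { NextResponse } from 'next/server'
import { computeJobSummary, saveJobSummary } from '@/lib/summary' // assumed helpers

export async function GET() {
  const summary = await computeJobSummary() // aggregate counts, salaries, top agencies once
  await saveJobSummary(summary)             // SSR pages read this small summary, not the full dataset
  return NextResponse.json({ ok: true, updatedAt: new Date().toISOString() })
}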
Middleware Limitations for SEO
Vercel middleware runs at the Edge, which means:
- No access to Node.js APIs (no fs, no database drivers)
- _rsc parameters are stripped from the request before middleware sees them
- Response body cannot be modified (only headers and redirects)
If you need server-side SEO logic, like conditionally setting X-Robots-Tag headers based on content state, you need to do it in the route handler or page component, not middleware.
www vs non-www
Vercel does not automatically redirect between www and non-www. Both versions serve content, creating duplicate pages. Configure this in vercel.json or via middleware:
// middleware.ts — redirect non-www to www
import { NextResponse } from 'next/server'
import type { NextRequest } from 'next/server'
export function middleware(request: NextRequest) {
const hostname = request.headers.get('host') || ''
if (hostname === 'govJobshub.com') {
return NextResponse.redirect(
new URL(request.url.replace('govJobshub.com', 'www.govJobshub.com')),
301
)
}
}

Pitfall 11: Internal Linking at Scale
With 20,000 pages, internal linking is not something you do manually. It is an architectural decision that determines which pages get crawled, how link equity flows, and which pages rank.
The Hub-and-Spoke Problem
GovJobsHub’s initial linking structure was flat: the main /jobs page linked to paginated results, and each job card linked to a detail page. Location pages and category pages existed but were poorly connected to the job detail pages and to each other.
This created a shallow hub-and-spoke pattern where:
- Job detail pages were 3+ clicks from the homepage
- Location pages did not link to related category pages
- Category pages did not link to related location pages
- No cross-linking between related geographic areas
The Fix: Programmatic Cross-Linking
Build internal links into your templates:
// On a state page (/jobs/california), link to:
// 1. Child city pages
// 2. Related category pages for that state
// 3. Neighboring state pages
// 4. Parent hub page
import Link from 'next/link'

function StatePageLinks({ state, topCities, topCategories }) {
return (
<>
<nav aria-label="Cities in this state">
<h2>Top Cities in {state.name}</h2>
<ul>
{topCities.map(city => (
<li key={city.slug}>
<Link href={`/jobs/${state.slug}/${city.slug}`}>
{city.name} ({city.jobCount} jobs)
</Link>
</li>
))}
</ul>
</nav>
<nav aria-label="Job categories in this state">
<h2>Popular Categories in {state.name}</h2>
<ul>
{topCategories.map(cat => (
<li key={cat.slug}>
<Link href={`/jobs/category/${cat.slug}`}>
{cat.name}
</Link>
</li>
))}
</ul>
</nav>
</>
)
}

Crawl Depth Matters
The goal: every page should be reachable within 3 clicks from the homepage. For a 20,000-page site, that requires deliberate hub-and-spoke architecture with cross-links between spokes.
Homepage links to hub pages (jobs, federal, remote) and featured locations/categories. Hub pages link to location and category index pages. Location and category pages link to individual job listings and cross-link to each other. Job detail pages link back to their location and category parents.
Setting It Up Right: Sitemaps for 20K Pages
Next.js’s built-in sitemap.ts works for small sites. At 20,000 pages, you need a sitemap index that splits pages by type.
The Problem
A single sitemap file with 20,000 URLs is technically valid (the limit is 50,000), but it is harder to debug and monitor. When Google reports indexing issues, a monolithic sitemap gives you no granularity about which page types are affected.
The Fix: Sitemap Index by Page Type
// app/sitemap.ts — generates sitemap index
import { MetadataRoute } from 'next'
export default function sitemap(): MetadataRoute.Sitemap {
return [
// Return sitemap index entries
// Each points to a type-specific sitemap
]
}
// app/sitemaps/jobs/sitemap.ts
// app/sitemaps/locations/sitemap.ts
// app/sitemaps/categories/sitemap.ts

Or use the next-sitemap package, which handles sitemap index generation, splitting, and per-route configuration automatically.
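For reference, one type-specific sitemap might look like this (getActiveJobs is an assumed data helper that returns only non-expired listings):

// app/sitemaps/jobs/sitemap.ts (as in the structure above)
import type { MetadataRoute } from 'next'
import { getActiveJobs } from '@/lib/jobs'

export default async function sitemap(): Promise<MetadataRoute.Sitemap> {
  const jobs = await getActiveJobs()
  return jobs.map(job => ({
    url: `https://www.govJobshub.com/jobs/${job.id}`,
    lastModified: new Date(job.postedDate), // the listing's post date, not generation time
  }))
}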
Priority and Changefreq per Page Type
| Page Type | Priority | Changefreq |
|---|---|---|
| Homepage | 1.0 | daily |
| Hub pages | 0.9 | daily |
| State pages | 0.8 | daily |
| Category pages | 0.8 | weekly |
| City pages | 0.7 | daily |
| Agency pages | 0.7 | weekly |
| Job detail pages | 0.6 | weekly |
| Static pages | 0.5 | monthly |
Note: Google has stated it largely ignores priority and changefreq, but other search engines (Bing, Yandex) still use them, and they help with debugging.
lastmod Must Be Accurate
Do not set lastmod to the current build time for every page. Use the actual content modification date. For ISR pages, this means tracking when the underlying data last changed, not when the cache was last generated.
Setting It Up Right: Canonical URLs
Canonical mismanagement is the silent killer of large Next.js sites. GovJobsHub had three distinct canonical problems.
Problem 1: Pagination Canonicals
Paginated pages (/jobs?page=2, /jobs?page=3) should each have a self-referencing canonical. The second page of results is not a duplicate of the first; it is a distinct page with different content. But some Next.js SEO guides incorrectly suggest pointing all paginated pages to page 1.
// Correct: self-referencing canonical on paginated pages
export async function generateMetadata({ searchParams }) {
const page = searchParams.page || '1'
const canonicalUrl = page === '1'
? 'https://www.yoursite.com/jobs'
: `https://www.yoursite.com/jobs?page=${page}`
return {
alternates: {
canonical: canonicalUrl,
},
}
}

Problem 2: Parameter Pollution
Beyond _rsc, other query parameters can create duplicates: ?sort=salary, ?filter=remote, ?q=engineer. Each parameter combination is a unique URL to Google.
Rule: Pages with sort/filter parameters should canonical to the unfiltered version. Search result pages should be noindexed.
// Canonical always points to clean URL
export async function generateMetadata({ searchParams }) {
return {
alternates: {
canonical: 'https://www.yoursite.com/jobs',
},
// Noindex search results
...(searchParams.q && {
robots: { index: false, follow: true },
}),
}
}

Problem 3: Trailing Slashes
Next.js uses 308 redirects (not 301) for trailing slash normalization. Pick one format and enforce it in next.config.js:
// next.config.js
module.exports = {
trailingSlash: false, // /jobs, not /jobs/
}

The Setup Checklist
If you are starting a Next.js + Vercel project today and SEO matters, configure these before writing a single page component.
1. Rendering Defaults
- All page content in Server Components by default
- 'use client' only for interactive UI elements (filters, modals, forms)
- Audit with curl or view-source: before launch: if content is not in raw HTML, it is not server-rendered
2. Metadata Configuration
- Use the Metadata API (metadata object or generateMetadata function), not next/head
- Set defaults in app/layout.tsx, override per route
- Every page gets: title, description, canonical, robots, Open Graph
3. robots.txt
- Allow /_next/static/ and /_next/image/
- Disallow /_next/data/, /api/, /*?_rsc=
- Include sitemap reference
4. Sitemap
- Split by page type for sites over 1,000 pages
- Accurate lastmod from content timestamps, not build time
- Submit to Search Console immediately
5. Structured Data
- JSON-LD in Server Components, not Client Components
- Sanitize with .replace(/</g, '\\u003c')
- Match schema type to page type (JobPosting, BreadcrumbList, FAQPage, etc.)
6. Status Codes
- Use notFound() for missing content
- Return 410 for expired content
- Verify every dynamic route pattern returns 404 for invalid params
7. Vercel Configuration
- www/non-www redirect in middleware or vercel.json
- On-demand ISR revalidation for content removals
- Function timeout appropriate for your data volume
8. Internal Linking
- Programmatic cross-links between page types
- Maximum 3 clicks from homepage to any page
- Use <Link>, never router.push(), for navigation between indexable pages
9. Core Web Vitals
- priority on above-the-fold next/image components
- Use next/font for web fonts
- Avoid CSS-in-JS libraries
- Test every page template, not just the homepage
10. Monitoring
- Google Search Console coverage report: check weekly for the first 3 months
- Log file analysis if possible: see which pages Googlebot actually crawls
- Automated crawl audits with tools like Screaming Frog (or its MCP server for AI-assisted audits)
- CrUX data for real-user CWV per page type
Results and Takeaways
After implementing the fixes described above, GovJobsHub saw measurable improvements over 8 weeks:
- Indexed pages increased from ~4,000 to ~14,000 (of 20,000 total)
- _rsc duplicate entries in Search Console dropped from 1,300 to under 200
- Soft 404 errors eliminated entirely
- Average LCP on listing pages improved from 3.8s to 2.1s
- JobPosting rich results started appearing for individual job listings within 3 weeks of schema implementation
What We Learned
1. Audit every page type independently. The rendering strategy, status codes, structured data, and CWV scores can be completely different across page types on the same Next.js site. A passing grade on the homepage means nothing for your listing pages.
2. Next.js defaults are not SEO defaults. The framework does not block you from doing SEO well, but it does not do it for you. Every SEO requirement (rendering strategy, canonical management, structured data, status codes) needs explicit configuration.
3. _rsc is a fact of life. There is no clean fix. Budget time for managing the noise in Search Console and implement the mitigation stack (robots.txt + canonicals + selective prefetch disabling).
4. Vercel adds a caching layer you must account for. ISR staleness, function timeouts, and middleware limitations are platform constraints that affect SEO. On-demand revalidation is not optional for sites with expiring content.
5. Programmatic content needs programmatic quality. Templated pages at scale require data-driven unique content, not just name-swapped boilerplate. Every page type needs enough unique content to justify its existence in the index.
6. Internal linking is architecture, not afterthought. At 20,000 pages, you cannot manually link. Build cross-linking into your templates and ensure crawl depth stays under 3 clicks for every page.
The Next.js + Vercel stack is powerful, and it can absolutely support large-scale SEO. But it requires deliberate configuration at every layer: rendering, caching, metadata, structured data, and crawl management. The pitfalls are real, well-documented, and largely avoidable if you know where to look.