Screaming Frog MCP: AI-Powered SEO Audits

The Screaming Frog MCP server lets you run technical SEO audits through any MCP-compatible AI assistant. Instead of exporting CSVs and filtering spreadsheets, you ask conversational questions about your crawl data and get prioritized, actionable output. It wraps Screaming Frog’s headless CLI, exposes 8 tools for crawling, exporting, and querying, and works with Claude Desktop, Claude Code, Cursor, Windsurf, Cline, and any other MCP client. This guide covers setup, real audit workflows, multi-site automation, and how to extend the server for your own tools.

The source code is on GitHub.

What the Screaming Frog MCP Actually Does (and Doesn’t)

I built the Screaming Frog MCP server because crawling is just the first step. The real work happens after: exporting data, filtering results, cross-referencing against other sources, and extracting actionable patterns from thousands of URLs. The MCP connects this analysis phase directly to your AI assistant without the export-CSV-open-spreadsheet loop.

It wraps Screaming Frog’s headless CLI and exposes crawl data through 8 tools. You ask your AI assistant conversational questions about your crawl data. “Show me pages with missing H1 tags.” “Find redirect chains longer than 3 hops.” “Which pages have the highest crawl depth?” The assistant runs the queries, processes results, and hands you actionable output in one session.

This is not a replacement for the GUI. The GUI is where you configure crawls, set spider options, and tweak advanced settings. The MCP is where you do analysis and automation. Configure in the GUI. Analyze with your AI assistant.

Task | GUI | MCP
One-off crawl of a single site | Faster | Slower (setup overhead)
Visual inspection of results | Better (interactive filtering) | Slower (text-based)
JavaScript rendering analysis | Full support | Limited (default settings only)
Batch crawls across multiple sites | Manual loop required | Native (parallel, up to 2 concurrent)
Simple single-tab queries (missing titles, H1s) | Faster (few clicks) | Slower (but not by much)
Cross-referencing crawl data with external sources | Manual import/export | Native (orchestrated queries)
Recurring audits across a portfolio | Manual each time | Automated (schedule crawls, export, analyze)
Natural language follow-up questions | Not applicable | Native (converse with results)

The MCP shines for batch operations, cross-tool integration, and automation. For a single site where you’ll interact with the GUI once per month, stick with the GUI. For recurring audits, multi-site comparisons, or questions that require threading together data from multiple crawls, the MCP saves the repetitive work.

Five-Minute Setup

Install the MCP using pip or uvx, depending on your preferences.

pip install screaming-frog-mcp

Or, if you prefer not to install it permanently, run it directly with uvx.

uvx screaming-frog-mcp

Next, verify that Screaming Frog’s CLI is accessible. You’ll need to tell your MCP client where to find it, because the path varies by operating system.

On macOS, the CLI lives at /Applications/Screaming Frog SEO Spider.app/Contents/MacOS/ScreamingFrogSEOSpiderLauncher by default. Most clients find it automatically.

On Linux or Windows, set the SF_CLI_PATH environment variable to the full path of your Screaming Frog CLI executable.

Linux example

export SF_CLI_PATH=/usr/bin/screamingfrogseospider

Windows example

set SF_CLI_PATH=C:\Program Files (x86)\Screaming Frog SEO Spider\ScreamingFrogSEOSpiderCli.exe

Claude Desktop setup. Open ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or the equivalent on your OS, and add the following.

{
  "mcpServers": {
    "screaming-frog": {
      "command": "uvx",
      "args": ["screaming-frog-mcp"],
      "env": {
        "SF_CLI_PATH": "/path/to/ScreamingFrogSEOSpiderLauncher"
      }
    }
  }
}

Restart Claude Desktop. The MCP is now available.

Claude Code setup. Run this command.

claude mcp add screaming-frog -- uvx screaming-frog-mcp

Cursor, Windsurf, or other MCP-compatible clients. Check your client’s MCP configuration documentation. The setup is similar: point the client to the MCP server and set the SF_CLI_PATH environment variable.

Test it by asking your AI assistant to list your saved crawls. You’ll see the database IDs and sizes of every crawl you’ve stored in Screaming Frog. If you see output, you’re connected.

One gotcha on Linux and Windows. SF_CLI_PATH must point to the exact path where Screaming Frog installs the CLI executable. Install paths vary by OS and installer version. Verify the path on your machine before configuring the env var. If sf_check fails, you’ve likely got the path wrong.

The GitHub repo has the latest installation instructions and troubleshooting for platform-specific issues.

The Database Lock (Why You Close the GUI First)

Screaming Frog stores crawl data in SQLite, in a directory called ProjectInstanceData. When you open the Screaming Frog GUI, it acquires an exclusive write lock on that database. No other process can read from it while the lock is held. This is standard SQLite exclusive locking behavior and protects data consistency.

The MCP uses Screaming Frog’s headless CLI, which tries to acquire a read lock. It cannot do this while the GUI has the exclusive lock. You’ll see an error message: “The database is locked. Please quit the SF GUI first, then retry.”

Fix: Close the Screaming Frog GUI before running any MCP commands. Wait a moment for the lock to release, then run your analysis.

Exception: The list_crawls tool is read-only and works even while the GUI is open. You can run it to check which crawls exist without closing the GUI. Every other tool (crawl_site, export_crawl, delete_crawl) requires the GUI to be closed.

The lock is not a bug. It’s a safety feature that prevents simultaneous writes. If you see a database lock error, you know the fix: quit the GUI and retry.
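
If you script your audits (more on automation below), a pre-flight check that Screaming Frog isn't already running saves you from hitting the lock mid-run. Here is a minimal sketch using psutil; the process-name match is an assumption, since the exact process name varies by OS and install.

import sys
import psutil  # third-party: pip install psutil

def screaming_frog_running() -> bool:
    """Best-effort check for a running Screaming Frog process (GUI or headless CLI)."""
    for proc in psutil.process_iter(["name"]):
        name = (proc.info.get("name") or "").replace(" ", "").lower()
        # Process names vary by OS and version; a substring match is a pragmatic assumption
        if "screamingfrog" in name:
            return True
    return False

if screaming_frog_running():
    sys.exit("Screaming Frog appears to be running. Close the GUI before running MCP crawls or exports.")
print("No Screaming Frog process found. Safe to run MCP commands.")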

How to Use Screaming Frog to Improve On-Page SEO

Export the Page Titles, Meta Description, H1, and H2 tabs from a crawl, then ask your AI assistant to flag pages with missing, duplicate, or truncated elements. The MCP lets you cross-reference on-page issues with crawl depth and inlink count so you fix the highest-impact pages first.

The standard on-page audit covers five signals: title tags (missing, duplicate, over 60 characters), meta descriptions (missing, duplicate, over 160 characters), H1 tags (missing, multiple, or duplicate), H2 structure (missing on long pages), and image alt text (missing on above-the-fold images).

Ask your AI assistant:

“Export Page Titles:All, Meta Description:All, H1:All, H2:All, and Images:All from my crawl. Show me a summary of on-page issues sorted by page count.”

Your assistant exports, reads, and aggregates. The output might look like this:

On-page issues summary:

Missing meta descriptions:     342 pages
Duplicate title tags:          89 pages (31 unique duplicates)
Missing H1 tags:               23 pages
Multiple H1 tags:              156 pages
Title tags over 60 chars:      412 pages
Missing image alt text:        1,204 images across 389 pages
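
If you want to reproduce (or spot-check) that aggregation yourself, a small pandas sketch over the exported CSVs is enough. The file names and column headers (Address, Title 1, Meta Description 1, H1-1) are assumptions based on typical Screaming Frog exports; check your actual CSV headers and adjust.

import pandas as pd

EXPORT_DIR = "exports"  # hypothetical directory where the MCP wrote the CSVs

titles = pd.read_csv(f"{EXPORT_DIR}/page_titles_all.csv")
descriptions = pd.read_csv(f"{EXPORT_DIR}/meta_description_all.csv")
h1s = pd.read_csv(f"{EXPORT_DIR}/h1_all.csv")

# Missing and over-length titles (assumed columns: "Address", "Title 1")
missing_titles = titles[titles["Title 1"].fillna("").str.strip() == ""]
long_titles = titles[titles["Title 1"].str.len() > 60]

# Duplicate titles: the same text shared by more than one URL
dupes = titles[titles["Title 1"].notna() & titles.duplicated(subset=["Title 1"], keep=False)]

missing_desc = descriptions[descriptions["Meta Description 1"].isna()]
missing_h1 = h1s[h1s["H1-1"].isna()]

print(f"Missing titles:            {len(missing_titles)}")
print(f"Titles over 60 chars:      {len(long_titles)}")
print(f"Duplicate title tags:      {len(dupes)} pages ({dupes['Title 1'].nunique()} unique values)")
print(f"Missing meta descriptions: {len(missing_desc)}")
print(f"Missing H1 tags:           {len(missing_h1)}")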

Now drill in:

“Which of the 342 pages missing meta descriptions have the most inlinks? Show me the top 20.”

Your assistant cross-references with the Internal:All export. The pages with the most inlinks are your highest-priority fixes: they're the most internally linked (and likely most visited) pages without descriptions.

For duplicate titles, ask:

“Group the duplicate titles and show me how many pages share each one. Are any of these on different URL patterns like /products/ vs /category/?”

This reveals whether duplicates are a template issue (all product pages sharing a generic title) or a content issue (two distinct pages with the same title). Template issues are a one-line fix in your CMS. Content issues need individual attention.

How Do You Find Broken Pages That Actually Matter?

Export Response Codes and Internal Links data, then cross-reference to find broken pages that still have internal links pointing to them. These are the pages costing you crawl equity, and fixing them has the most immediate impact.

Most audits start with “find all 404s.” That’s too basic. The real question is which broken pages have internal links pointing to them, and how many.

Start by listing your saved crawls. Ask your AI assistant:

“Show me my saved crawls with their database IDs.”

Your assistant runs the list_crawls command and returns something like this.

Site Crawl 1, Database ID 1234 (8,234 URLs)
Site Crawl 2, Database ID 5678 (3,156 URLs)

Pick the crawl you want to analyze. Ask:

“Export the Response Codes and Internal Links data from crawl 1234.”

Your assistant exports both data sets. This takes 20-40 seconds depending on crawl size.

Next, ask:

“Show me pages that returned 4xx or 5xx errors and have at least one internal link pointing to them. Sort by the number of inlinks (highest first).”

Your assistant reads the exported data, cross-references the Response Codes tab with the Internal Links tab, filters for broken pages with inlinks, and sorts. This takes another 10-20 seconds.

The output looks something like this.

Pages with errors AND internal links (sorted by inlink count)

https://example.com/products/widget-1  | 404 | 127 inlinks
https://example.com/blog/old-post-2019 | 410 | 89 inlinks
https://example.com/category/tools      | 503 | 45 inlinks

These are the pages costing you the most crawl equity. The first page has 127 internal links pointing to a 404. The impact is severe.
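
If you'd rather verify the join yourself than take the assistant's word for it, the cross-reference is a few lines of pandas. The column names (Address, Status Code, Source, Destination) are assumptions based on typical Screaming Frog exports; adjust them to match your files.

import pandas as pd

responses = pd.read_csv("response_codes_all.csv")  # assumed columns: Address, Status Code
inlinks = pd.read_csv("all_inlinks.csv")           # assumed columns: Source, Destination

# Count how many internal links point at each destination URL
inlink_counts = inlinks.groupby("Destination").size().reset_index(name="Inlinks")

# Keep broken pages (4xx / 5xx) that still receive internal links, highest impact first
broken = responses[responses["Status Code"] >= 400].merge(
    inlink_counts, left_on="Address", right_on="Destination", how="inner"
)
print(broken.sort_values("Inlinks", ascending=False)[["Address", "Status Code", "Inlinks"]].head(20))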

Now ask:

“Which of these are linked from the homepage, site navigation, or footer?”

Your assistant filters by link source and identifies the highest-impact broken links. Then ask:

“What’s the pattern? Are all 404s on a specific section like /products or /blog?”

This kind of follow-up analysis is where the MCP shows its value. You’re not re-exporting, not re-opening spreadsheets. You’re asking questions and getting answers in real time.

Once you have your prioritized list, hand it to your development team with the link sources. They know exactly what to fix and in what order. For pages that should return 2xx (missing pages, migration targets), redirect them. For pages that should genuinely return 404, remove the internal links or add a sitemap note.

One gotcha here. The “Response Codes:All” tab gives you the full status code breakdown, so you can distinguish 404s from 410s (Gone) and separate the different 5xx error types. But response times and other performance metrics aren’t part of that tab; you’ll need to specify additional tabs. Ask your assistant: “Include Response Codes:All and Performance:All in the export.”

Can 301 Redirect Chains Hurt Your SEO?

Yes. Each hop in a redirect chain adds latency, wastes crawl budget, and dilutes link equity. Google follows up to 10 redirects but recommends keeping chains to one hop. Two hops maximum is the safe target.

Long chains signal poor site architecture and create UX friction.

You’ve crawled a site you recently migrated. You expect some redirect chains from old URLs to new ones, but you want to find the longest chains and any loops. For more on handling redirects during major site changes, see our guide on site migrations (coming soon).

Ask your AI assistant:

“Export the redirect chain data from my crawl. I need to find chains longer than 2 hops and identify any circular redirects.”

Your assistant runs an export using the bulk export option (not the standard tab exports). The export parameter is important: bulk_export='All Redirect Chains', not export_tabs='Redirect Chains:All'. This is a common gotcha. Bulk exports are a different category from standard tab exports, and they use different parameters.

The export returns a CSV with every redirect chain in the crawl. Your assistant parses it and shows you something like this.

Chain 1: /old-product.html -> /products/new-product.html -> /products/new-product-v2.html (3 hops)
Chain 2: /about.html -> /about-us.html -> /about-us.html (loop detected)
Chain 3: /blog/2019 -> /blog/2020 -> /blog/current (2 hops)

Your assistant identifies chain 1 (over your 2-hop threshold) and chain 2 (a loop, a critical issue).

Ask:

“Generate a redirect remediation map. For chains over 2 hops, show the old URL and the final destination URL (skipping the intermediate hops).”

Your assistant outputs:

/old-product.html -> /products/new-product-v2.html
/about.html -> /about-us.html (target redirects to itself: loop, needs manual intervention)

Instead of three hops, you now have a single direct redirect. Your developers can update the redirect rules in your web server config, and crawl budget improves immediately.

What is the difference between bulk_export and export_tabs? bulk_export handles cross-tab aggregate data (redirect chains, all inlinks, all outlinks). export_tabs handles individual tab views (Response Codes:All, Page Titles:All). Using the wrong one returns empty or incomplete data. For redirect chains specifically, bulk_export='All Redirect Chains' is correct. If you accidentally use export_tabs='Redirect Chains:All', you get nothing useful. When hunting for redirect loops, also export the Response Codes:All tab so your assistant can cross-reference each chain’s final destination with its HTTP status and separate true dead ends (4xx, 5xx) from working redirect targets.
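
If you want to script the chain analysis instead of asking for it, here is a rough pandas sketch over the All Redirect Chains CSV. The column names (Address, Final Address, Number of Redirects, Redirect Loop) are assumptions; print the header row of your own export before relying on them.

import pandas as pd

chains = pd.read_csv("all_redirect_chains.csv")

# Column names are assumptions; inspect chains.columns on your own export first
long_chains = chains[chains["Number of Redirects"] > 2]
loops = chains[chains["Redirect Loop"].astype(str).str.lower() == "true"]

# Remediation map: original URL straight to the final destination, skipping intermediate hops
for _, row in long_chains.iterrows():
    print(f"{row['Address']} -> {row['Final Address']}")

print(f"{len(loops)} redirect loop(s) need manual intervention")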

How Do You Find Orphan Pages in Screaming Frog?

Crawl the site normally, then compare the discovered URLs against your full sitemap. Pages in the sitemap that the crawler never reached through internal links are your orphans. The MCP automates this comparison.

Orphan pages are URLs that exist on your site but have zero internal links pointing to them. Search engines discover pages by following links. If nothing links to a page, crawlers may never find it, or they’ll deprioritize it in their crawl queue. These pages often rank poorly even if the content is strong.

The tricky part: Screaming Frog’s standard crawl only finds pages it can reach by following links from the start URL. By definition, it can’t crawl orphan pages this way. To find them, you need to feed Screaming Frog a list of all known URLs (from your sitemap or server logs) and then compare that list against the pages the crawler actually discovered through links.

Here’s the workflow with the MCP.

First, crawl the site normally. This discovers all pages reachable through internal links.

"Crawl https://example.com and label it 'example-link-crawl-march-2026'."

While that runs (or after), prepare a list of all URLs from your sitemap. You can ask your assistant to fetch and parse the sitemap, or upload a URL list file.

Once the crawl finishes, export the Internal:All data.

"Export Internal:All from the crawl I just ran."

Now ask:

“Compare the sitemap URLs against the crawled URLs. Which sitemap URLs were never discovered through internal links?”

Your assistant cross-references the two lists and returns the orphans.

Orphan pages (in sitemap but not discovered via links):

https://example.com/landing/summer-sale       (in sitemap, 0 inlinks)
https://example.com/resources/old-whitepaper   (in sitemap, 0 inlinks)
https://example.com/products/discontinued-widget (in sitemap, 0 inlinks)

For each orphan, decide: should this page exist? If yes, add internal links from relevant pages. If it’s outdated or irrelevant, remove it from the sitemap and consider adding a redirect or noindex.

Ask your assistant for a recommendation:

“For each orphan page, suggest which existing pages on the site would be the best candidates to link from, based on URL structure and topic similarity.”

Your assistant analyzes the URL patterns and suggests linking opportunities. The /landing/summer-sale page probably belongs in a promotions section. The whitepaper should be linked from the resources hub. The discontinued product should redirect to its replacement or category page.
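
The sitemap-versus-crawl comparison itself is simple enough to script. Here is a minimal sketch, assuming a single sitemap.xml (a sitemap index would need one extra level of fetching) and an Internal:All export with an Address column.

import xml.etree.ElementTree as ET
import pandas as pd
import requests

# URLs discovered by following internal links (assumed column name: Address)
crawled = set(pd.read_csv("internal_all.csv")["Address"].str.rstrip("/"))

# URLs the sitemap claims exist
resp = requests.get("https://example.com/sitemap.xml", timeout=30)
root = ET.fromstring(resp.content)
ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
sitemap_urls = {loc.text.strip().rstrip("/") for loc in root.findall(".//sm:loc", ns)}

# Orphans: in the sitemap, never reached through internal links
orphans = sorted(sitemap_urls - crawled)
print(f"{len(orphans)} orphan page(s):")
for url in orphans:
    print(f"  {url}")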

How Do You Find Duplicate Content with Screaming Frog?

Export the Content:All and Duplicate:All tabs, then group duplicate pairs by canonical status. Pages with no canonical tag set are the highest priority, because search engines are guessing which version to index.

Duplicate content causes crawl waste and can dilute ranking signals. Two URLs serving identical or near-identical content compete against each other instead of consolidating authority on a single page.

Screaming Frog identifies exact duplicates (same content hash) and near-duplicates (high similarity percentage). The MCP can export and analyze both.

Ask your AI assistant:

“Export the Content:All data and the Duplicate:All data from my crawl.”

Your assistant exports both tabs. The Duplicate tab shows URL pairs with their similarity scores.

"Show me pages with duplicate content. Group them by the canonical URL (if set) and flag any pairs where neither page has a canonical tag."

The output highlights three categories:

  1. Duplicates with correct canonicals (low priority, already handled)
  2. Duplicates with conflicting canonicals (each page points to itself, medium priority)
  3. Duplicates with no canonical at all (high priority, search engines are guessing)

For category 3, your assistant can suggest which URL should be the canonical based on factors like URL length, crawl depth, and inlink count. The shorter URL with more inlinks is usually the right canonical target.

A common pattern on e-commerce sites: parameterized URLs like /products/widget?color=blue and /products/widget?color=red serve nearly identical content. Ask your assistant:

“Are any of the duplicate pairs caused by URL parameters? Show me the parameter patterns.”

Your assistant groups the duplicates by their base URL (stripping parameters) and shows which parameters are causing the duplication. This gives you a clear list of parameters to handle via canonical tags, parameter handling in Google Search Console, or URL rewriting.
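
If you want to see the parameter breakdown without asking, stripping query strings and counting parameters takes a few lines. This sketch assumes the duplicate export has an Address column listing the affected URLs.

from collections import Counter
from urllib.parse import urlsplit, parse_qs

import pandas as pd

dupes = pd.read_csv("duplicate_all.csv")  # assumed column name: Address

# Count which query parameters show up across duplicate URLs
param_counts = Counter()
for url in dupes["Address"].dropna():
    query = urlsplit(url).query
    for param in parse_qs(query):
        param_counts[param] += 1

# Parameters responsible for the most duplicate URLs
for param, count in param_counts.most_common(10):
    print(f"{param}: {count} duplicate URLs")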

How to Use Screaming Frog for JavaScript Crawling

Save a .seospiderconfig file with JavaScript rendering enabled in the GUI, then pass it to the MCP’s crawl_site tool. After the crawl, export the JavaScript:All tab and compare rendered content against raw HTML to find pages where critical elements only appear after JS execution.

JavaScript-rendered content is a common source of indexing issues. If your site uses React, Vue, Angular, or any client-side framework, the content that Screaming Frog sees by default (raw HTML) may differ from what search engines see after rendering.

Screaming Frog supports JavaScript rendering through its built-in Chromium engine. The MCP can trigger crawls with JavaScript rendering enabled, but only if you’ve saved a .seospiderconfig file with those settings.

Here’s the workflow.

  1. Open the Screaming Frog GUI
  2. Go to Configuration > Spider > Rendering and set it to “JavaScript”
  3. Adjust the rendering timeout (5 seconds is a good starting point for most sites)
  4. Save the configuration: File > Save Configuration > js-rendering.seospiderconfig
  5. Close the GUI

Now use the MCP with that config.

"Crawl https://example.com using my js-rendering.seospiderconfig file."

Your assistant passes the config file to the crawl_site tool.

After the crawl finishes, compare the rendered content against the raw HTML.

"Export the JavaScript:All tab from this crawl. Show me pages where the rendered title or H1 differs from the raw HTML version."

Pages where the title or H1 only appears after JavaScript execution are vulnerable to indexing issues. Googlebot renders JavaScript, but with delays. Other search engines (Bing, Yandex) may not render at all.

For each flagged page, the fix depends on your stack. Server-side rendering (SSR) is the safest option. If SSR isn’t feasible, ensure critical content (titles, headings, main body text) is present in the initial HTML response, not injected by JavaScript after load.
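
One way to isolate JS-dependent content programmatically, sketched below, is to run two crawls of the same site (one with the default raw-HTML settings, one with your js-rendering config), export Internal:All from each, and diff the titles per URL. The Address and Title 1 column names are assumptions based on typical exports.

import pandas as pd

raw = pd.read_csv("internal_all_raw.csv")[["Address", "Title 1"]]
rendered = pd.read_csv("internal_all_rendered.csv")[["Address", "Title 1"]]

# Join the two crawls on URL and keep pages whose title changes after rendering
merged = raw.merge(rendered, on="Address", suffixes=("_raw", "_rendered"))
js_dependent = merged[merged["Title 1_raw"].fillna("") != merged["Title 1_rendered"].fillna("")]

print(js_dependent[["Address", "Title 1_raw", "Title 1_rendered"]].to_string(index=False))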

How Do I Perform a Sitewide SEO Audit?

Crawl the site, export all key tabs in one batch (Response Codes, Page Titles, Meta Descriptions, H1s, Images, Canonicals, Directives), then ask your AI assistant to summarize and prioritize issues by impact. The full workflow takes five steps: crawl, export, summarize, prioritize, drill in.

Here’s the complete workflow using the MCP.

Step 1: Crawl. Ask your assistant to crawl the site. For sites under 10,000 pages, a standard crawl takes 5-15 minutes. For larger sites, use a saved config with appropriate limits.

"Crawl https://example.com with a max of 50,000 URLs. Label it 'full-audit-march-2026'."

Step 2: Export everything. Once the crawl finishes, export all the data you’ll need in one go.

"Export Internal:All, Response Codes:All, Page Titles:All, Meta Description:All, H1:All, H2:All, Images:All, Canonicals:All, and Directives:All from this crawl."

Step 3: Get the summary. Ask for the big picture first.

"Give me a summary: total pages crawled, broken pages (4xx and 5xx), pages with missing titles, pages with missing meta descriptions, pages with missing H1s, pages with duplicate titles, and pages with duplicate meta descriptions."

Step 4: Prioritize. Ask your assistant to rank the issues by impact.

"Rank these issues by severity. For broken pages, weight by number of inlinks. For missing titles, weight by crawl depth (shallow pages are higher priority). Give me the top 20 items to fix first."

Step 5: Drill into specifics. Pick the highest-priority issue and drill in.

"Show me the 10 highest-traffic pages with missing meta descriptions."
"Which of our product pages have duplicate titles?"
"Are there any pages blocked by robots.txt that have inlinks?"

This is the core loop: crawl, export, summarize, prioritize, drill in. Each follow-up question costs you seconds instead of the minutes it takes to re-export and re-filter in a spreadsheet.
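
If you want a repeatable version of the prioritization step, a crude severity score over the Internal:All export works as a starting point. The column names (Address, Status Code, Title 1, Crawl Depth, Inlinks) and the weights are assumptions; tune them to your own priorities.

import pandas as pd

# Assumed columns: Address, Status Code, Title 1, Crawl Depth, Inlinks
internal = pd.read_csv("internal_all.csv")

# Crude scoring: broken pages weighted by inlinks, missing titles weighted by shallowness
internal["severity"] = 0
broken = internal["Status Code"] >= 400
internal.loc[broken, "severity"] += internal.loc[broken, "Inlinks"] * 10
no_title = internal["Title 1"].isna()
internal.loc[no_title, "severity"] += 10 - internal.loc[no_title, "Crawl Depth"].clip(upper=10)

top = internal.sort_values("severity", ascending=False).head(20)
print(top[["Address", "Status Code", "severity"]].to_string(index=False))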

How Often Should You Perform an SEO Audit?

After every major deployment, weekly for high-churn sites (e-commerce, job boards, news), monthly to quarterly for stable sites, and daily for the first week after any migration. The MCP makes weekly and post-deploy audits practical by automating the crawl-export-analyze loop.

The right cadence depends on how fast your site changes:

After every major deployment. Any code release that touches URL structure, redirects, meta tags, robots.txt, or sitemap generation should trigger an audit. The MCP makes this practical because you can automate it (see the next section).

Weekly for high-churn sites. E-commerce, classifieds, job boards, news sites. Focus on response codes and redirect health rather than full audits.

Monthly to quarterly for stable sites. Blogs, SaaS marketing sites, corporate sites. Monthly if you publish regularly, quarterly for brochure sites. Full audit covering all signal types.

After every migration or redesign. This is the most critical audit. Run it the day of migration and repeat daily for the first week. Redirect chains and broken pages surface quickly when you’re checking every day.

The MCP’s automation capabilities make weekly audits practical even for large portfolios. Set up crawls to run on a schedule and have your assistant analyze the results when they’re ready.

Automating Recurring Audits Across Multiple Sites

The GUI is fine for a single site. The MCP shows its value when you’re auditing the same portfolio of sites every month or after every deployment.

Imagine you manage five client sites, each crawled weekly. You have two hours to identify the top issues per site and compile a summary. Manually crawling and exporting takes 15-30 minutes per site, so you’re looking at 75-150 minutes of work just to get the data.

With the MCP, ask your AI assistant:

“List all my saved crawls and group them by client.”

Your assistant returns something like this.

Client A: crawl_id_001 (8,234 URLs)
Client B: crawl_id_002 (3,156 URLs)
Client C: crawl_id_003 (12,401 URLs)
Client D: crawl_id_004 (2,890 URLs)
Client E: crawl_id_005 (5,678 URLs)

Now ask:

“For each crawl, find the top 3 issues: missing meta descriptions, pages with 404 responses, and pages with no internal links. Rank by frequency and show me a summary.”

Your assistant runs a bulk analysis. For each database ID, it exports Meta Description:All and Response Codes:All, reads the CSVs, and compiles the results. Within minutes, you have a table.

Client A: 324 missing, 18 404s, 6 orphaned
Client B: 145 missing, 3 404s, 0 orphaned
Client C: 8,934 missing, 234 404s, 45 orphaned
Client D: 98 missing, 0 404s, 2 orphaned
Client E: 612 missing, 89 404s, 12 orphaned

Client C’s 404 count is alarming. Ask:

“For Client C, show me the 404 pages. Which ones are orphaned (no internal links)?”

Your assistant cross-references the 404 list with the internal link data. Result: 89 of the 234 404s are truly orphaned. The other 145 still have internal links and should redirect or be restored.

For recurring audits at scale, create a Python script that runs on a schedule.

import json
from datetime import datetime

# Sites to audit
sites = {
    "client-a": "https://clienta.com",
    "client-b": "https://clientb.com",
    "client-c": "https://clientc.com",
}

# Trigger crawls (your MCP client or assistant calls the crawl_site tool)
for site_name, url in sites.items():
    label = f"{site_name}-{datetime.now().isoformat()}"
    print(f"Starting crawl for {site_name} as '{label}'...")
    # crawl_site(url=url, label=label)

# Poll crawl_status() for each crawl until every crawl reports complete

# Export and analyze: export_crawl and read_crawl_data per site
results = {}
for site_name in sites:
    # results[site_name] = aggregated issue counts for that site
    pass

# Generate report
print(json.dumps(results, indent=2))

The workflow is trigger crawls, poll status, export results, analyze, report. For 5 sites running in parallel (up to 2 concurrent per the MCP limits), the whole process takes 30-50 minutes instead of 2+ hours.

For 50 sites, the automation saves days of manual work. This is the genuine value of the MCP: eliminating the repetitive loop of clicking-crawling-exporting-analyzing across a portfolio.

One gotcha for batch crawls. You can run a maximum of 2 concurrent crawls. If you have 10 sites to audit, queue them. Also, headless crawls via the MCP use default Screaming Frog settings unless you pass a custom .seospiderconfig file. For consistent results across sites, create a config in the GUI with your preferred settings (authentication, JavaScript rendering timeout, URL filters), save it, and reference it in your crawl_site calls.

Combining Crawl Data with Other SEO MCPs

Connect the Screaming Frog MCP alongside the Ahrefs MCP (or DataForSEO MCP) in the same AI session. Your assistant can match crawl response codes against backlink data to find dead pages with live external links, or combine crawl depth with organic traffic to find high-ranking pages buried deep in your site architecture.

The real power of MCP comes from combining tools in one session. Your AI assistant can pull crawl data from Screaming Frog, cross-reference it with backlink data from Ahrefs (via the Ahrefs MCP), and check SERP positions via DataForSEO, all in one conversation.

“Show me pages with broken backlinks that also have 404 status in my crawl” is a query that would take 30 minutes with manual exports. With MCP, it’s one question.

The workflow: export your crawl’s Response Codes data, then query Ahrefs for backlinks pointing to those URLs. Your assistant matches the two datasets and identifies external links pointing to dead pages. These are link reclamation opportunities: reach out to the linking sites and ask them to update the URL, or redirect the broken page to the most relevant live page.
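
The matching itself is a simple set intersection once you have both exports on disk. Here is a sketch, assuming a Response Codes:All CSV from the MCP and a backlink CSV exported from Ahrefs; the Ahrefs column names are placeholders, so adjust them to your export.

import pandas as pd

# Dead pages from the Screaming Frog MCP export (assumed columns: Address, Status Code)
crawl = pd.read_csv("response_codes_all.csv")
dead_pages = set(crawl.loc[crawl["Status Code"] == 404, "Address"])

# Backlink export from Ahrefs; these column names are placeholders
backlinks = pd.read_csv("ahrefs_backlinks.csv")
reclamation = backlinks[backlinks["Target URL"].isin(dead_pages)]

print(f"{len(reclamation)} external links point at dead pages")
print(reclamation.head(20).to_string(index=False))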

Similarly, you can combine crawl depth data with organic traffic from Ahrefs to find pages that rank well but sit deep in your site architecture. Those pages deserve better internal linking.

As more SEO tools ship MCP servers, the combinations multiply. Sitebulb, Lumar, and ContentKing don’t have MCP servers yet. If you build one for a tool you use, share it on the MCP registry.

Power User: Export Options and Custom Configs

What export options does the Screaming Frog MCP support? Over 30 tabs and bulk exports. The default export covers internal structure, response codes, meta tags, headings, images, and directives. Beyond that, you can export External Links, JavaScript execution status, Structured Data validation, sitemaps, AMP metadata, security headers, hreflang configuration, and pagination signals.

Say you’re auditing structured data. Ask your AI assistant:

“Export all structured data validation errors from my crawl.”

Your assistant exports using bulk_export='All Structured Data,Validation Errors' and returns a CSV like this.

URL                                | Schema Type  | Error
/product/widget-a.html             | Product      | Missing "offers" property
/article/blog-post.html            | NewsArticle  | Invalid date format
/event/conference-2026.html        | Event        | Missing "location" property

Fifteen percent of your product pages are missing the offers property. All articles have date format issues. Events are missing location data. These are fixable, ranked by frequency.

How do you use custom crawl configurations with the MCP? Create a .seospiderconfig file in the GUI (File > Save Configuration), then pass it to crawl_site via the config_file parameter. The config stores spider options, excluded patterns, JS rendering settings, and authentication, so every crawl runs with identical settings.

crawl_site(url='https://example.com', config_file='/path/to/mycrawl.seospiderconfig', label='client-site-march-2026')

How do you filter exports to specific error types? Use the exact filter name from the Screaming Frog GUI. For example, “Response Codes:Client Error (4xx)” instead of “Response Codes:All” to get only 4xx errors. The names are case-sensitive and must match the GUI labels exactly. One character off and the export silently produces nothing for that filter.

For large crawls (5,000+ URLs), the MCP handles pagination. By default, read_crawl_data returns 100 rows. Use the offset parameter to fetch the next batch.

# First call: rows 1-100
read_crawl_data(export_id='...', file='Meta Description:All.csv', limit=100, offset=0)

# Second call: rows 101-200
read_crawl_data(export_id='...', file='Meta Description:All.csv', limit=100, offset=100)

Your assistant can automate this. Ask: “Read all rows from the meta description export and categorize them by URL pattern (product, blog, category).” Your assistant batches the reads and aggregates the results.
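
If you script the pagination yourself, the loop looks like the sketch below, written in the same pseudocode style as the calls above. It assumes read_crawl_data returns an empty result once you page past the last row; the actual return shape depends on your MCP client, so treat this as a sketch rather than a drop-in script.

# Pagination loop over read_crawl_data
offset, limit = 0, 100
all_rows = []
while True:
    batch = read_crawl_data(export_id='...', file='Meta Description:All.csv',
                            limit=limit, offset=offset)
    if not batch:
        break
    all_rows.extend(batch)
    offset += limit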

To see the complete reference of all available export tabs and filters, ask your assistant: “Show me the screaming-frog://export-reference resource.” This gives you every option available for exporting.

For Builders: Extending the MCP

The Screaming Frog MCP is open source, built on FastMCP, and clocks in at roughly 905 lines of Python. If you want to add custom analysis tools, fork it and extend it.

The architecture is straightforward. Each tool is a function decorated with @mcp.tool().

# `mcp` is the FastMCP server instance defined in the server module, and json is imported there.
# _validate_input and _do_work below stand in for your own validation and analysis logic.
@mcp.tool()
async def my_custom_tool(param1: str, param2: int = 10) -> str:
    """Tool description here."""
    # Validation
    if not _validate_input(param1):
        raise ValueError("Invalid input")

    # Logic
    result = await _do_work(param1, param2)

    return json.dumps(result)

A real example. You want a tool that identifies pages with high crawl depth and suggests internal linking improvements. You’d write a function that exports the Internal:All data, filters by crawl depth, and returns the deep pages alongside shallow candidates for internal linking. Add the function to the server, decorate it, and your AI assistant can call it directly.
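
Here is what that might look like, sketched as a hypothetical tool rather than anything in the shipped server. It assumes the exported Internal:All CSV exposes Address, Crawl Depth, and Inlinks columns and that pandas is available in your fork.

import json
import pandas as pd

# Hypothetical tool, not part of the shipped server. Assumes `mcp` is the FastMCP instance
# and that the Internal:All CSV exposes Address, Crawl Depth, and Inlinks columns.
@mcp.tool()
async def find_deep_pages(internal_csv: str, max_depth: int = 4) -> str:
    """Return pages deeper than max_depth, plus well-linked shallow pages to link from."""
    df = pd.read_csv(internal_csv)
    deep = df[df["Crawl Depth"] > max_depth]
    candidates = df[df["Crawl Depth"] <= 1].nlargest(20, "Inlinks")
    return json.dumps({
        "deep_pages": deep["Address"].tolist(),
        "link_from_candidates": candidates["Address"].tolist(),
    })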

What security protections does the MCP include? URL validation blocks private IPs, localhost, and cloud metadata endpoints like metadata.google.internal. Argument validation rejects command-line injection. Database IDs are regex-validated to prevent traversal attacks. CSV reads are sandboxed to the export directory, so paths like ../../../etc/passwd are rejected.

You can adapt the MCP for other SEO tools. If a tool exposes a CLI or API, the pattern is the same: wrap the interface, expose the data through tools, and let your AI assistant query it conversationally.

Deploy your extended MCP locally.

uvx --from /path/to/your/fork screaming-frog-mcp

Or containerize with Docker.

FROM python:3.11
COPY . /app
WORKDIR /app
RUN pip install -e .
CMD ["screaming-frog-mcp"]

One gotcha for builders. The server runs as a subprocess wrapper around Screaming Frog’s CLI. Every tool call spawns a CLI process. If SF isn’t installed or the license has expired, every call fails. The sf_check tool exists for exactly this reason. Run it first in any automated setup to verify the CLI is available and licensed.

Troubleshooting

“The database is locked” error. The Screaming Frog GUI is open. Close it, wait 2-3 seconds for the SQLite lock to release, then retry. The list_crawls tool is the one exception — it works with the GUI open because it only reads metadata.

sf_check fails or returns “not found.” The SF_CLI_PATH environment variable points to the wrong location. On macOS, the default path is /Applications/Screaming Frog SEO Spider.app/Contents/MacOS/ScreamingFrogSEOSpiderLauncher. On Linux and Windows, the path depends on your installer. Run which screamingfrogseospider (Linux) or check your Start Menu shortcut properties (Windows) to find the actual path.

Exports return empty or incomplete data. Three common causes. First, you used export_tabs when you needed bulk_export (or vice versa). Redirect chains, all inlinks, and all outlinks require bulk_export. Individual tab views like Response Codes:All use export_tabs. Second, the filter name doesn’t match the GUI label exactly. Filter names are case-sensitive: “Response Codes:Client Error (4xx)” works, “response codes:client error (4xx)” doesn’t. Third, the crawl hasn’t finished. Check crawl_status before exporting.

Crawl hangs or takes much longer than expected. Headless crawls use default Screaming Frog settings unless you pass a .seospiderconfig file. If the site has millions of pages and you didn’t set a URL limit, the crawl will keep going. Use the max_urls parameter on crawl_site to cap it. For JavaScript-heavy sites, rendering adds significant time per page. Start with a 5,000-URL limit to estimate total crawl time before running a full crawl.

“License expired” or “License not found” errors. The MCP uses Screaming Frog’s CLI, which requires a valid license. A free license crawls up to 500 URLs. For larger crawls, you need a paid license. Run sf_check to verify your license status. If the license recently expired, renew it in the GUI before running MCP commands.

Two concurrent crawls maximum. The MCP limits you to 2 simultaneous crawls. If you queue a third, it waits. For large portfolios, batch your crawls in groups of 2. Each crawl still runs independently, so a slow site won’t block a fast one.

read_crawl_data only returns 100 rows. This is the default limit. Use the offset parameter to paginate through larger datasets, or ask your AI assistant to read all rows automatically. It will batch the reads and aggregate the results.

Conclusion

The Screaming Frog MCP turns crawl analysis from a manual, repetitive process into a conversational one. Export data, ask questions, drill into specifics, and get prioritized output without re-opening spreadsheets between each step.

Start with a single crawl. Ask your assistant to find missing H1 tags or redirect chains. Get comfortable with the tool sequence: list crawls, export data, read results, analyze. Once you’ve run one audit through the MCP, the patterns are clear and you can scale to weekly audits across multiple sites.

The biggest shift isn’t speed (though that matters). It’s that follow-up questions cost seconds instead of minutes. “Which of these broken pages have the most inlinks?” is one sentence, not a re-export and a VLOOKUP. That changes how thoroughly you audit, because drilling deeper stops being expensive.

The source code is on GitHub. If you build something useful on top of it, open a PR or share it on the MCP registry.