AI Overviews show negative reviews in unrelated queries

May 13, 2026

Summary

AI Overviews and ChatGPT now show negative brand reviews in queries where users never searched for complaints, but they go negative for different reasons. Google flags controversy, lawsuits, and recalls (4.5x more likely than ChatGPT). ChatGPT flags product limitations and evaluation queries (3x more likely than Google). They disagree on which brand to flag 73% of the time.

Audit both engines separately across Reddit, Trustpilot, G2, and other platforms AI tools treat as authoritative.

When responding, lead with quantitative evidence on support-style pages, since LLMs extract specific data points far more readily than marketing copy.

What happened

AI Overviews and LLM-powered search tools are showing negative brand reviews in comparison queries where users never asked about complaints. The framing comes from a Search Engine Journal post authored and sponsored by Erase.com, an online reputation management vendor with a direct commercial interest in the conclusion that brand reputation is at risk in AI search. Treat the underlying framework as a vendor-observed pattern, not independent research.

With that caveat, the post identifies four signals that it says determine which complaints get pulled into AI-generated answers: recency plus volume, specificity (naming features or outcomes), platform authority (Reddit, Trustpilot, G2, and industry forums), and recurrence across multiple sources.

When a user asks ChatGPT or Google’s AI Overview something like “which CRM should I choose,” the response can include years-old Reddit threads, forum gripes, and complaint-site entries about your brand. Per the same vendor post, a brand’s negative signal may also show up in answers about a competitor, though this is a pattern observation from Erase.com, not independently verified.

Separate BrightEdge research adds harder numbers and shows that the two major AI engines go negative for different reasons. Google AI Overviews mentions brands negatively in 2.3% of queries, versus 1.6% for ChatGPT. But the triggers diverge sharply.

Google acts like an investigative reporter: it is 4.5x more likely than ChatGPT to flag negativity tied to news events, lawsuits, boycotts, data breaches, and product recalls. ChatGPT acts like a product reviewer: it is 3x more likely than Google to go negative on product limitations, compatibility issues, and evaluative “is it worth it?” queries. On overlapping queries where both engines went negative, they disagreed on which brand to flag 73% of the time.

The source article also notes that AI engines sometimes misquote or misrepresent brand statements, a pattern documented across training, retrieval, and generation stages.

Why it matters

Traditional reputation management worked by creating and placing positive content across as many third-party sites and platforms as possible, aiming to dominate the SERPs for brand-related queries and push negatives down. The playbook was additive: outrank complaints with authoritative positive content across multiple domains.

For third-party AI search tools, that model is largely broken. ChatGPT’s training data already contains negative sentiment from the open web; Perplexity and ChatGPT with Browse actively retrieve and synthesize across sources rather than ranking them. In both cases, volume of positive content on SERPs does not control what these tools present. For Google AI Overviews specifically, traditional SEO controls only apply to content you own.

A noindex directive on a first-party page (your own corporate blog, a legacy landing page, a thin review page on your domain) can keep that URL out of AI Overview synthesis, because Googlebot must crawl and index content before Google can cite it in an AI Overview. But the actual problem the source article describes is third-party content: Reddit threads, Trustpilot reviews, G2 entries, and niche forums where you cannot set a noindex directive. For those pages, on-site directives are irrelevant.
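For first-party pages, it can help to confirm that the directive is actually present in the served HTML. The following is a minimal sketch, assuming the directive is set through a robots meta tag; it does not inspect the X-Robots-Tag HTTP header, which is the other place a noindex can be declared. The names RobotsMetaParser and has_noindex are hypothetical helpers introduced for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() in ("robots", "googlebot"):
            # Directives are comma-separated, e.g. "noindex, nofollow".
            parts = attr_map.get("content", "").split(",")
            self.directives.extend(p.strip().lower() for p in parts if p.strip())


def has_noindex(html: str) -> bool:
    """Return True if any robots meta tag in the HTML carries a noindex directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives
```

Pass the function the HTML of a page you own. It says nothing about third-party pages, where you cannot add the tag in the first place.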

Google’s Removals tool is also not a workaround for most cases. The temporary removal request only works for properties you verify in Search Console, so it cannot target third-party complaint pages. Google does offer an Outdated Content tool that works for any URL, but it only applies when the page has already been updated or removed and the cached version is stale. It does not remove content that is still live on the source page.

The extraction mechanism is different from SERP ranking. A complaint buried on page three of Google results may still get cited in an AI Overview if it hits the four-signal pattern. The working theory behind this framework is that burying a result in traditional search may no longer be sufficient to prevent it from being synthesized into an AI-generated comparison.

The engine split matters for prioritization. If your brand has been involved in a public controversy, lawsuit, or product recall, Google AI Overviews is the higher-risk channel. It pulls from news coverage and treats controversy as relevant context across query types.

If your product has known limitations or compatibility issues that users discuss on forums and review sites, ChatGPT is the bigger exposure. It treats “is it worth it?” and “best X for Y” evaluation queries as invitations to flag product shortcomings.

SaaS companies, e-commerce brands, and hospitality businesses running comparison-driven traffic are most exposed on the ChatGPT side. Brands with any history of public controversy, regulatory action, or recalls carry additional risk in Google AI Overviews regardless of query type.

In-house SEO teams managing review suppression through traditional tactics are discovering a gap. The work now extends beyond SERP management into auditing every source that LLM-powered tools treat as authoritative. Reddit threads, niche forums, Trustpilot pages, and G2 reviews all feed directly into AI answer synthesis.

What to do

The source article proposes a four-step framework mapped to the four signals AI engines weight. The steps are practical and worth walking through. We add a fifth step based on what we know about how AI engines select sources.

Audit your negative signal footprint across both engines. Open ChatGPT and Google (with AI Overviews) separately and type: “What are the pros and cons of [your brand] vs [top competitor]?” Screenshot both responses and note every negative claim. The two engines disagree on which brand to flag 73% of the time, so checking one is not enough.

Then run site:[platform].com "[your brand]" "scam" OR "complaint" on Google for each high-authority platform. Check Reddit, Trustpilot, G2, Capterra, Yelp, Google Business Profile, and any niche industry forums.
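The per-platform search can be generated rather than typed by hand. This is a minimal sketch, assuming the query pattern given above; the platform domain list and the build_audit_queries helper are illustrative assumptions, and Google Business Profile is left out because it is not searched through a site: operator in the same way.

```python
# Assumed list of review and forum domains; adjust to your own audit.
PLATFORMS = ["reddit.com", "trustpilot.com", "g2.com", "capterra.com", "yelp.com"]
NEGATIVE_TERMS = ["scam", "complaint"]


def build_audit_queries(brand, platforms=PLATFORMS, terms=NEGATIVE_TERMS):
    """Build one Google query string per platform for the given brand name."""
    # Quote each term and join with OR, e.g. "scam" OR "complaint".
    term_clause = " OR ".join(f'"{term}"' for term in terms)
    return [f'site:{platform} "{brand}" {term_clause}' for platform in platforms]


if __name__ == "__main__":
    for query in build_audit_queries("Acme CRM"):
        print(query)
```

The output is a list of strings to paste into Google manually. The sketch does not send any requests.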

Prioritize based on appearance likelihood. Not every negative mention gets pulled into AI answers. Focus on complaints that hit multiple signals simultaneously. A detailed, recent Reddit post naming a specific product flaw that also appears on Trustpilot is far more likely to show up than a vague one-star review from 2021.
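One way to make this triage repeatable is to count how many of the four signals each complaint hits and sort by that count. The sketch below is an illustration, not the vendor's method: the Complaint fields, the one-year recency window, the five-mention volume threshold, and the platform set are all assumptions chosen for the example.

```python
from dataclasses import dataclass

# Assumed set of platforms treated as authoritative (taken from the article's examples).
AUTHORITATIVE_PLATFORMS = {"reddit", "trustpilot", "g2"}


@dataclass
class Complaint:
    summary: str
    platform: str
    age_days: int
    similar_mentions: int       # other posts making the same claim
    names_specific_issue: bool  # names a feature or outcome
    distinct_sources: int       # number of platforms where the claim appears


def signals_hit(c, max_age_days=365, min_mentions=5):
    """Return the names of the signals this complaint hits (thresholds are assumptions)."""
    hits = []
    if c.age_days <= max_age_days and c.similar_mentions >= min_mentions:
        hits.append("recency_plus_volume")
    if c.names_specific_issue:
        hits.append("specificity")
    if c.platform.lower() in AUTHORITATIVE_PLATFORMS:
        hits.append("platform_authority")
    if c.distinct_sources >= 2:
        hits.append("recurrence")
    return hits


def prioritize(complaints):
    """Sort complaints so those hitting the most signals come first."""
    return sorted(complaints, key=lambda c: len(signals_hit(c)), reverse=True)
```

A plain count treats the four signals as equally important, which the source does not claim. Treat the ordering as a rough starting point for manual review.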

Remove or respond where possible. Some platforms allow dispute resolution or responses. Where removal isn’t an option, a substantive public response may provide AI tools with additional context, though there is no confirmed evidence this reliably shifts how AI engines characterize the overall sentiment of a page or thread. The goal is not to delete criticism but to ensure AI tools have updated, accurate context.

Build a positive content layer on the platforms AI engines prefer. The same four signals apply on the positive side. The hypothesis is that recent, specific, recurring positive content on authoritative platforms may shift what AI tools synthesize. Encourage detailed customer reviews on the platforms you identified in your audit. Publish case studies and comparison content on your own domain that directly address the claims AI tools are showing.

Address complaints with data, not marketing copy. If AI tools are showing a specific complaint (“slow onboarding,” “poor support response times”), create content that responds with quantitative evidence: internal benchmark data, customer satisfaction survey results, NPS scores, before-and-after metrics from product changes.

LLMs gravitate toward specific, extractable claims over vague reassurances. A support page showing “average onboarding time dropped from 14 days to 3 days after Q1 redesign” is far more likely to get synthesized into an AI answer than a blog post saying “we take onboarding seriously.” Frame this content as reference material, not promotion. AI systems already cite support pages, size guides, and help content over product and marketing pages.

Watch out for

Old complaints reappearing in unrelated queries. A 2023 Reddit thread can appear in a 2026 product comparison if it’s specific enough and corroborated elsewhere. Recency matters, but specificity and cross-source recurrence can override age. Audit older content, not just recent reviews.

Negative signal appearing in competitor queries. Your brand’s complaints can show up when someone asks about a competitor’s product. Monitor AI answers for competitor comparison queries, not just your own brand queries. The exposure is bidirectional.

Different engines, different triggers. A clean result in ChatGPT does not mean Google AI Overviews will be clean, and vice versa. Per BrightEdge data, the two engines disagree on which brand to flag 73% of the time on the same query. Google reacts to news-driven controversy; ChatGPT reacts to product evaluation signals. Audit both.
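If you log the results of a two-engine audit, you can measure how often the engines disagree for your own query set instead of relying on the industry-wide figure. A minimal sketch, assuming each audit row is a dict that records which brand, if any, each engine flagged negatively; the row format and the disagreement_rate helper are hypothetical.

```python
def disagreement_rate(audit_rows):
    """Share of queries where both engines went negative but flagged different brands.

    Each row is assumed to look like:
        {"query": "...", "google": "BrandA" or None, "chatgpt": "BrandB" or None}
    Returns None when no query had both engines going negative.
    """
    both_negative = [r for r in audit_rows if r["google"] and r["chatgpt"]]
    if not both_negative:
        return None
    differing = sum(1 for r in both_negative if r["google"] != r["chatgpt"])
    return differing / len(both_negative)
```

Rows where only one engine went negative are excluded, so the rate covers only overlapping negative queries.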
