The Agreement Gap: ChatGPT and Google AI Overviews recommend the same tools, then cite almost none of the same sources
We submitted 15 buyer-intent queries to ChatGPT and Google AI Overviews and logged 402 citations. On open-ended queries, the two engines agreed on roughly a third of the tools they named, yet shared only about one in twenty-five of the source domains they cited. The picture buyers see converges at the brand layer and fragments at the evidence layer.
How we ran the benchmark
We selected 15 buyer-intent queries across five segments of the cloud media and content infrastructure category: video hosting, video hosting for online educators, digital asset management, image CDNs, and video API and media delivery. Each segment was tested with three query types: open-ended discovery ("best X in 2026"), head-to-head comparison (naming three specific tools), and problem-solving ("how do I…").
Every query was submitted to both ChatGPT and Google AI Overviews. To reduce the effect of model non-determinism, each query was run between five and ten times per engine, and we recorded the consolidated set of brands and source citations that surfaced. In total the dataset captures 402 citation records: 157 from ChatGPT and 245 from Google AI Overviews. For each citation we logged the tool named, the cited URL and domain, the citation type (third-party, the tool’s own site, or its own YouTube channel), and a source category. We also separately logged each engine’s secondary citations: supporting sources surfaced but not tied to a specific recommendation.
Overlap is measured per query using the Jaccard index (the size of the shared set divided by the size of the combined set), then averaged within each intent type. Brand overlap measures agreement on which tools are recommended; domain overlap measures agreement on which sources are cited. Recall measures the share of one engine’s sources that also appear in the other’s. Queries where neither engine named any product are excluded from overlap averages.
- 5–10 repetitions per query per engine
- 157 ChatGPT citations · 245 Google AI Overviews citations
- 5 segments in cloud media and content infrastructure
- 3 intent types: discovery, comparison, problem-solving
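The overlap metric described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual pipeline: the function names, the record layout, and the toy brand sets are assumptions made for the example, and the number it prints is not a result from the dataset.

```python
from collections import defaultdict
from statistics import mean


def jaccard(a, b):
    """Size of the shared set divided by the size of the combined set."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return None  # neither engine named anything: excluded from averages
    return len(a & b) / len(union)


def mean_overlap_by_intent(queries):
    """Average the per-query Jaccard index within each intent type."""
    scores = defaultdict(list)
    for q in queries:
        score = jaccard(q["chatgpt"], q["google_aio"])
        if score is not None:
            scores[q["intent"]].append(score)
    return {intent: mean(vals) for intent, vals in scores.items()}


# Hypothetical toy data, for illustration only.
toy_queries = [
    {"intent": "discovery", "chatgpt": {"A", "B", "C"}, "google_aio": {"B", "C", "D"}},
    {"intent": "discovery", "chatgpt": {"A"}, "google_aio": {"A"}},
    {"intent": "problem-solving", "chatgpt": set(), "google_aio": set()},
]

print(mean_overlap_by_intent(toy_queries))  # {'discovery': 0.75}
```

The same function applies to brand overlap and domain overlap; only the contents of the two sets change. The problem-solving query drops out of the result because neither toy engine named a product, matching the exclusion rule stated above.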
They agree on the shortlist, not the evidence
The clearest pattern in the dataset is the distance between brand agreement and source agreement. Comparison queries that explicitly name three tools force both engines onto the same products, so brand overlap there is 100% by construction. The moment a buyer asks an open question and lets the engine choose, agreement drops sharply: 38% on discovery queries, 22% on problem-solving queries, and 32% across all open-ended prompts combined.
Source overlap is far lower across the board. Even when both engines are discussing the identical three named tools, they share only 30% of the source domains they cite. On open-ended queries, domain overlap collapses to 4%. The two engines are reading the same category through almost entirely separate bodies of evidence.
Brand overlap vs. source overlap, by query intent
Opposite source DNA
The two engines do not just cite different pages; they trust different kinds of pages. The signals each engine relies on when picking sources diverge sharply. ChatGPT’s citations skew toward community discussion and independent voices, with Reddit and forums making up 25% of its sources. Google AI Overviews skews institutional and commercial, with vendor and competitor pages accounting for 45% of its citations and community sources just 4%. In raw counts, ChatGPT cited Reddit 39 times across the query set; Google AI Overviews cited it 7 times.
Share of each engine’s citations, by source category
This divergence has a direct practical reading. Visibility in ChatGPT is shaped heavily by what communities say about a tool. Visibility in Google AI Overviews is shaped more by structured vendor and comparison content that Google’s index already ranks.
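The category shares quoted above are simple proportions of each engine's citation records. A minimal sketch, assuming each record carries a `category` field (the field name, category labels, and example domains are hypothetical, not the study's schema):

```python
from collections import Counter


def category_shares(citations):
    """Return each source category's share of one engine's citations."""
    counts = Counter(c["category"] for c in citations)
    total = sum(counts.values())
    return {cat: n / total for cat, n in counts.items()} if total else {}


# Hypothetical records, for illustration only.
toy_citations = [
    {"domain": "reddit.com", "category": "community"},
    {"domain": "vendor-a.example", "category": "vendor"},
    {"domain": "vendor-b.example", "category": "vendor"},
    {"domain": "review-site.example", "category": "review"},
]

print(category_shares(toy_citations))
# {'community': 0.25, 'vendor': 0.5, 'review': 0.25}
```

The raw counts reported in the study are consistent with this reading: 39 Reddit citations out of ChatGPT's 157 is about 25%, and 7 out of Google AI Overviews' 245 is about 3%.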
One engine’s footnotes are not the other’s headlines
If the two engines share so few sources, a tempting explanation is that they are working from the same library but shelving it differently. Maybe the pages Google AI Overviews promotes as primary citations are simply demoted to the background in ChatGPT. We tested this directly by comparing ChatGPT’s secondary citations (the supporting sources it surfaces but does not attach to a specific recommendation) against Google AI Overviews’ primary citations.
The explanation does not hold. ChatGPT’s secondary citations had a Jaccard overlap of just 10% with Google’s primary citations and recovered only 17% of them (recall). That is essentially no better than the result for ChatGPT’s own primary citations. Even when we pool everything ChatGPT cites, primary and secondary together, the combined footprint still recovers only 25% of the sources Google foregrounds.
Share of Google AI Overviews’ primary sources that also appear in ChatGPT, by which layer of ChatGPT’s citation footprint you compare against
Put plainly: roughly 75% of what Google AI Overviews treats as primary evidence does not appear anywhere in ChatGPT’s output, not even in its footnotes.
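The layer-by-layer comparison above is a recall calculation: take Google AI Overviews' primary sources as the reference set and ask what share of them shows up in each layer of ChatGPT's citations. A minimal sketch follows; the domains and the resulting percentages are hypothetical and do not reproduce the study's figures.

```python
def recall(reference, candidate):
    """Share of the reference set that also appears in the candidate set."""
    reference, candidate = set(reference), set(candidate)
    if not reference:
        return None  # nothing to recover
    return len(reference & candidate) / len(reference)


# Hypothetical domains, for illustration only.
google_primary = {"a.example", "b.example", "c.example", "d.example"}
chatgpt_primary = {"a.example", "x.example"}
chatgpt_secondary = {"b.example", "y.example"}

layers = {
    "primary": chatgpt_primary,
    "secondary": chatgpt_secondary,
    "pooled": chatgpt_primary | chatgpt_secondary,
}
results = {name: recall(google_primary, layer) for name, layer in layers.items()}
print(results)  # {'primary': 0.25, 'secondary': 0.25, 'pooled': 0.5}
```

Unlike the Jaccard index, recall is asymmetric: it is normalized by the reference set alone, so swapping the two arguments generally gives a different number.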
This is the strongest evidence in the dataset that the two engines are not ranking a shared corpus differently. They are drawing on genuinely separate source universes. For a brand, it means there is no single body of content that satisfies both engines at once.
Google AI Overviews casts a wider, more vendor-friendly net
Google AI Overviews surfaced more sources per answer and was more willing to send buyers to a tool’s own website. It averaged 16.3 cited sources per query, compared with ChatGPT’s 10.5, and 20% of its citations pointed to the named tool’s own domain, against 15% for ChatGPT. The split between first-party and third-party citations reads differently in each engine.
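Avg. sources cited per query: 16.3 for Google AI Overviews vs. 10.5 for ChatGPT. Higher source density per answer in AI Overviews.
Citations to the tool’s own site: 20% for Google AI Overviews vs. 15% for ChatGPT. AIO is materially more likely to link buyers to the vendor.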
For a vendor, this means a well-structured product page or documentation page has a higher chance of being cited directly inside AI Overviews than inside ChatGPT. The implication mirrors the community finding: owned content is more likely to earn citations in AIO, while a community footprint is more likely to earn them in ChatGPT.
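The per-query averages in this section are consistent with dividing each engine's citation total by the 15 queries; that this is how they were computed is an assumption. The sketch below checks that arithmetic and shows one way an own-site share could be derived from citation records. The `citation_type` field and its values are hypothetical labels for the three types the methodology describes.

```python
N_QUERIES = 15
CITATION_TOTALS = {"ChatGPT": 157, "Google AI Overviews": 245}  # from the study


def avg_per_query(total, n_queries=N_QUERIES):
    """Average citations per query, rounded to one decimal place."""
    return round(total / n_queries, 1)


def own_site_share(citations):
    """Share of citations pointing to the named tool's own site."""
    if not citations:
        return None
    own = sum(1 for c in citations if c["citation_type"] == "own_site")
    return own / len(citations)


for engine, total in CITATION_TOTALS.items():
    print(engine, avg_per_query(total))
# ChatGPT 10.5
# Google AI Overviews 16.3

# Hypothetical records, for illustration only.
toy_citations = [
    {"tool": "Tool A", "citation_type": "own_site"},
    {"tool": "Tool A", "citation_type": "third_party"},
    {"tool": "Tool B", "citation_type": "third_party"},
    {"tool": "Tool B", "citation_type": "own_youtube"},
    {"tool": "Tool C", "citation_type": "third_party"},
]
print(own_site_share(toy_citations))  # 0.2
```

In this sketch a tool's own YouTube channel is not counted as its own site, since the text refers to the tool's own domain; treating the two together would be a different design choice.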
The shared core of the category
Beneath the divergence sits a small, stable consensus. A handful of tools were named by both engines across multiple queries, forming a core that a buyer will encounter regardless of which assistant they ask.
Tools named by both engines, across all 15 queries
number of queries (of 15) in which both engines named the tool
Named more often by ChatGPT
Gumlet, Acquia DAM (Widen), Aprimo, FileCamp, Canto, Bunny CDN Optimizer.
Named more often by Google AIO
CloudImage (by Scaleflex), MediaValet, Amazon CloudFront & Fastly, FastPix, Shotstack, Vonage Video API.
How brands should compete in AI search
Treat the two engines as separate channels
A citation strategy that wins in one will not automatically transfer to the other, because the source layers barely intersect. Finding 3 (“One engine’s footnotes are not the other’s headlines”) confirms the gap is not a ranking artifact you can close with a single piece of content.
Invest where communities form opinions
ChatGPT’s heavy weighting toward Reddit and forums means presence and sentiment in community discussion are a primary lever. Treat community ops as a citation channel, not a brand-affinity tactic. See how to rank in ChatGPT for the practical surface.
Invest in structured, indexable owned and comparison content
Google AI Overviews’ higher self-citation and vendor-page share rewards well-organized product, documentation, and comparison pages that Google’s index already ranks. Comparison and head-to-head pages do disproportionate work here, and schema markup can make those pages easier for machines to parse.
Win “how do I” queries by being the named example
Both engines often answer problem-solving queries without recommending a product at all. The opportunity is to become the obvious example an engine reaches for, which means winning the citation in how-to content too, not just “best of” lists.
This is the work DerivateX does for B2B SaaS companies: separately mapping each engine’s citation surface and engineering the right content into each. See how an engagement works.
For two years the working assumption has been that the major AI engines mostly agree, and that earning a citation in one will carry over to another. This dataset complicates both. ChatGPT and Google AI Overviews converge on who belongs in the conversation but draw on almost entirely separate evidence to back it up. The playbook for earning each is different at the source level.
How to cite this study
This research is free to reference and quote with attribution. Please credit DerivateX and link to the original study, and use the phrase “the Agreement Gap” where helpful.
DerivateX is an SEO and GEO agency for B2B SaaS that helps software companies get found and cited inside ChatGPT, Perplexity, Gemini, and Google AI Overviews. This benchmark is part of our ongoing research into how AI engines choose, recommend, and cite software. See our companion B2B SaaS AI Citation Study and Authority Inversion reports, or see the top LLM SEO agencies working on this problem.
