DerivateX Research · AI search benchmark

The Agreement Gap: ChatGPT and Google AI Overviews recommend the same tools, then cite almost none of the same sources

We submitted 15 buyer-intent queries to ChatGPT and Google AI Overviews and logged 402 citations. The two engines agreed on roughly a third of the tools they named, yet shared only about one in twenty-five of the actual web pages they cited. The picture buyers see converges at the brand layer and fragments at the evidence layer.

15 buyer-intent queries · 402 citations · 2 engines · Cloud media & content infrastructure
Engine profiles · cloud media SaaS
  • ChatGPT: community-driven · Reddit share 25% · sources per query 10.5
  • Google AIO: vendor-driven · vendor share 45% · sources per query 16.3
The agreement gap (ChatGPT vs. Google AIO)
  • Same tools recommended: 32%
  • Same sources cited: 4%
Same buyers asking. Different evidence answering.
Key takeaways

The four numbers worth quoting.

For journalists, buyers, and AI answer engines

  • 32% brand overlap on open-ended buying queries. The engines name the same shortlist roughly a third of the time.
  • 4% source (domain) overlap on the same queries. About one in twenty-five cited domains is shared.
  • 39 vs 7 Reddit citations. ChatGPT cited Reddit 39 times; Google AI Overviews cited it 7 times across the same query set.
  • 75% of Google AI Overviews’ primary sources do not appear anywhere in ChatGPT, not even as footnotes.
The numbers

Key statistics

  • Overlap · Tools both engines named: 32% brand overlap on open-ended queries, averaged across discovery and problem-solving prompts.
  • Overlap · Domains both engines cited: 4% source overlap on the same open-ended queries. About 1 in 25 cited domains were shared.
  • Overlap · Forced-comparison overlap: 30% source overlap even on direct comparison queries where both engines were forced onto identical named tools.
  • The gap · Separate source universes: 75% of Google AI Overviews’ primary sources never appear anywhere in ChatGPT’s output.
  • ChatGPT · Community share: 25% of ChatGPT’s citations came from Reddit and community forums.
  • Google AIO · Vendor share: 45% of Google AI Overviews’ citations went to vendor and competitor pages.
  • Google AIO · Citation breadth per answer: 16.3 cited sources per query for Google AI Overviews. ChatGPT averaged 10.5.
  • Both engines · Self-citation share: 20% of Google AIO citations point to a tool’s own website. ChatGPT: 15%.
Methodology

How we ran the benchmark

We selected 15 buyer-intent queries across five segments of the cloud media and content infrastructure category: video hosting, video hosting for online educators, digital asset management, image CDNs, and video API and media delivery. Each segment was tested with three query types: open-ended discovery ("best X in 2026"), head-to-head comparison (naming three specific tools), and problem-solving ("how do I…").

Every query was submitted to both ChatGPT and Google AI Overviews. To reduce the effect of model non-determinism, each query was run between five and ten times per engine, and we recorded the consolidated set of brands and source citations that surfaced. In total the dataset captures 402 citation records: 157 from ChatGPT and 245 from Google AI Overviews. For each citation we logged the tool named, the cited URL and domain, the citation type (third-party, the tool’s own site, or its own YouTube channel), and a source category. We also separately logged each engine’s secondary citations: supporting sources surfaced but not tied to a specific recommendation.
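As a rough illustration of the logging and consolidation step described above, the sketch below models one citation record and merges repeated runs into a single set of brands and domains per query. The field names, the category labels, and the choice to consolidate by taking the union across runs are assumptions made for this example; the study's actual schema and consolidation rule are not published here.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class Citation:
    """One logged citation. Field names are illustrative, not the study's schema."""
    engine: str         # "chatgpt" or "google_aio"
    query_id: str
    tool: str           # tool the citation was attached to
    url: str
    citation_type: str  # "third_party", "own_site", or "own_youtube"
    category: str       # hypothetical label, e.g. "independent_blog"

    @property
    def domain(self) -> str:
        # Normalise the host so www.example.com and example.com count as one domain.
        host = urlparse(self.url).netloc.lower()
        return host[4:] if host.startswith("www.") else host


def consolidate(runs: list[list[Citation]]) -> tuple[set[str], set[str]]:
    """Union the brands and domains seen across repeated runs of one query (assumed rule)."""
    brands = {c.tool for run in runs for c in run}
    domains = {c.domain for run in runs for c in run}
    return brands, domains


# Hypothetical data: two runs of the same query on one engine.
run_1 = [
    Citation("chatgpt", "q1", "ToolA", "https://www.example.com/review",
             "third_party", "independent_blog"),
]
run_2 = [
    Citation("chatgpt", "q1", "ToolA", "https://example.com/other-page",
             "third_party", "independent_blog"),
    Citation("chatgpt", "q1", "ToolB", "https://toolb.example/pricing",
             "own_site", "vendor_self"),
]
brands, domains = consolidate([run_1, run_2])
```

With a union, a source that surfaces in only one of the five to ten runs still counts, which is one reason run-by-run frequency matters when reading the absolute figures.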

Overlap is measured per query using the Jaccard index (the size of the shared set divided by the size of the combined set), then averaged within each intent type. Brand overlap measures agreement on which tools are recommended; domain overlap measures agreement on which sources are cited. Recall measures the share of one engine’s sources that also appear in the other’s. Queries where neither engine named any product are excluded from overlap averages.
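The two measures defined above can be sketched in a few lines. This is a minimal illustration on made-up domain sets, not the study's code; treating two empty sets as zero overlap is an assumption of this sketch.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Size of the shared set divided by the size of the combined set."""
    union = a | b
    # Assumption: two empty sets are scored 0.0 rather than raising an error.
    return len(a & b) / len(union) if union else 0.0


def recall(target: set[str], other: set[str]) -> float:
    """Share of `target` items that also appear in `other`."""
    return len(target & other) / len(target) if target else 0.0


# Hypothetical domain sets for a single query.
chatgpt_domains = {"reddit.com", "blog.example", "vendor-a.example"}
aio_domains = {"vendor-a.example", "vendor-b.example", "reviews.example", "docs.example"}

overlap = jaccard(chatgpt_domains, aio_domains)     # 1 shared domain out of 6 combined
aio_recall = recall(aio_domains, chatgpt_domains)   # 1 of 4 AIO domains also in ChatGPT
```

Jaccard is symmetric, while recall depends on which engine is the target, which is why a single pair of engines can report both a 10% overlap and a 17% recall.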

15 buyer-intent queries · 3 query types tested · 402 citations logged · 2 engines benchmarked
  • 5–10 repetitions per query per engine
  • 157 ChatGPT citations · 245 Google AIO
  • 5 segments in cloud media & content infra
  • 3 intent types: discovery, comparison, problem-solving
Finding 01

They agree on the shortlist, not the evidence

The clearest pattern in the dataset is the distance between brand agreement and source agreement. Comparison queries that explicitly name three tools force both engines onto the same products, so brand overlap there is 100% by construction. The moment a buyer asks an open question and lets the engine choose, agreement drops sharply: 38% on discovery queries, 22% on problem-solving queries, and 32% across all open-ended prompts combined.

Source overlap is far lower across the board. Even when both engines are discussing the identical three named tools, they share only 30% of the web pages they cite. On open-ended queries, domain overlap collapses to 4%. The two engines are reading the same category through almost entirely separate bodies of evidence.

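Brand overlap vs. source overlap, by query intent

  • Comparison queries (three tools named by the buyer): same tools recommended 100% · same sources cited 30%
  • Discovery queries (“best X in 2026”): same tools recommended 38% · same sources cited 6%
  • Problem-solving queries (“how do I…”): same tools recommended 22% · same sources cited 2%
  • All open-ended queries (discovery + problem-solving): same tools recommended 32% · same sources cited 4%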
100% → 30%
Even when both engines are forced to discuss the same three named tools, they still disagree on 70% of the web pages they choose to cite. Brand convergence does not produce evidence convergence.
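The per-intent figures come from averaging per-query Jaccard scores within each intent type and skipping queries where neither engine named a product. A minimal sketch of that aggregation follows; the dictionary layout and the sample queries are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def jaccard(a: set[str], b: set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def average_overlap_by_intent(queries: list[dict]) -> dict[str, float]:
    """Average per-query Jaccard within each intent type.

    Each query dict holds 'intent', 'chatgpt', and 'aio' (sets of brand names).
    """
    scores = defaultdict(list)
    for q in queries:
        if not q["chatgpt"] and not q["aio"]:
            continue  # neither engine named a product: excluded from the average
        scores[q["intent"]].append(jaccard(q["chatgpt"], q["aio"]))
    return {intent: mean(vals) for intent, vals in scores.items()}


# Hypothetical brand sets, not the study's data.
queries = [
    {"intent": "discovery", "chatgpt": {"A", "B", "C"}, "aio": {"B", "C", "D"}},
    {"intent": "discovery", "chatgpt": {"A"}, "aio": {"B"}},
    {"intent": "problem_solving", "chatgpt": set(), "aio": set()},
    {"intent": "problem_solving", "chatgpt": {"A", "B"}, "aio": {"A"}},
]
result = average_overlap_by_intent(queries)
```

In this toy data the empty problem-solving query is dropped, so that intent's average rests on a single query. The same effect can make small-sample averages in a 15-query benchmark sensitive to one or two prompts.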
Finding 02

Opposite source DNA

The two engines do not just cite different pages; they trust different kinds of pages. The signals each engine relies on when picking sources diverge sharply. ChatGPT’s citations skew toward community discussion and independent voices, with Reddit and forums making up 25% of its sources. Google AI Overviews skews institutional and commercial, with vendor and competitor pages accounting for 45% of its citations and community sources just 4%. In raw counts, ChatGPT cited Reddit 39 times across the query set; Google AI Overviews cited it 7 times.

Share of each engine’s citations, by source category (ChatGPT · Google AI Overviews)

  • Reddit / community: 25% · 4%
  • Vendor / competitor: 25% · 45%
  • Vendor self-citation: 15% · 20%
  • Review aggregator: 8% · 7%
  • Independent blog: 8% · 3%
  • Media / editorial: 4% · 1%
  • YouTube: 0% · 2%

This divergence has a direct practical reading. Visibility in ChatGPT is shaped heavily by what communities say about a tool. Visibility in Google AI Overviews is shaped more by structured vendor and comparison content that Google’s index already ranks.
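Category shares like those above are simple proportions over an engine's logged citations. The sketch below shows the calculation on hypothetical category labels, then checks one published pair of numbers: 39 Reddit citations out of ChatGPT's 157 records rounds to the reported 25%, assuming the share is taken over all 157 records.

```python
from collections import Counter


def category_shares(categories: list[str]) -> dict[str, float]:
    """Fraction of citations falling in each source category."""
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cat: n / total for cat, n in counts.items()}


# Hypothetical labels for four citations from one engine.
sample = ["reddit_community", "reddit_community", "vendor_competitor", "independent_blog"]
shares = category_shares(sample)

# Consistency check on figures from the report (assumes a denominator of 157).
reddit_share_pct = round(39 / 157 * 100)
```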

39 vs 7
ChatGPT cited Reddit 39 times across the query set. Google AI Overviews cited it 7. The two engines read the same buyer market through opposite-shaped lenses, one community-driven and one vendor-driven.
Finding 03

One engine’s footnotes are not the other’s headlines

If the two engines share so few sources, a tempting explanation is that they are working from the same library but shelving it differently. Maybe the pages Google AI Overviews promotes as primary citations are simply demoted to the background in ChatGPT. We tested this directly by comparing ChatGPT’s secondary citations (the supporting sources it surfaces but does not attach to a specific recommendation) against Google AI Overviews’ primary citations.

The explanation does not hold. ChatGPT’s secondary citations had a Jaccard overlap of just 10% with Google’s primary citations and recovered only 17% of them (recall). That is no better than the 17% recovered by ChatGPT’s own primary citations. Even when we pool everything ChatGPT cites, primary and secondary together, the combined footprint still recovers only 25% of the sources Google foregrounds.

Share of Google AI Overviews’ primary sources that also appear in ChatGPT, by which layer of ChatGPT’s citation footprint you compare against

  • ChatGPT primary citations: 17%
  • ChatGPT secondary citations: 17%
  • Full footprint (primary + secondary): 25%

Put plainly: roughly 75% of what Google AI Overviews treats as primary evidence does not appear anywhere in ChatGPT’s output, not even in its footnotes.

This is the strongest evidence in the dataset that the two engines are not ranking a shared corpus differently. They are drawing on genuinely separate source universes. For a brand, it means there is no single body of content that satisfies both engines at once.
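The layered comparison can be expressed as recall of Google's primary sources against each layer of ChatGPT's footprint. The sketch below uses made-up domains to show the mechanics only; the sets and layer names are hypothetical.

```python
def recall(target: set[str], other: set[str]) -> float:
    """Share of `target` items that also appear in `other`."""
    return len(target & other) / len(target) if target else 0.0


# Hypothetical domain sets, not the study's data.
aio_primary = {"a.example", "b.example", "c.example", "d.example"}
chatgpt_primary = {"a.example", "x.example"}
chatgpt_secondary = {"b.example", "y.example"}

layers = {
    "primary": chatgpt_primary,
    "secondary": chatgpt_secondary,
    "full": chatgpt_primary | chatgpt_secondary,  # pooled footprint
}
recovered = {name: recall(aio_primary, sources) for name, sources in layers.items()}
missing_share = 1 - recovered["full"]  # AIO primary sources absent from ChatGPT entirely
```

If the "same library, different shelving" explanation were right, the pooled footprint would recover most of the target set. The report's 25% pooled recall is what rules that out.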

75%
of Google AI Overviews’ primary citations do not appear anywhere in ChatGPT’s output. The two engines are reading the category from separate source universes, not reordering the same one.
Finding 04

Google AI Overviews casts a wider, more vendor-friendly net

Google AI Overviews surfaced more sources per answer and was more willing to send buyers to a tool’s own website. It averaged 16.3 cited sources per query, compared with ChatGPT’s 10.5, and 20% of its citations pointed to the named tool’s own domain, against 15% for ChatGPT. The split between first-party and third-party citations reads differently in each engine.

Avg. sources cited per query: ChatGPT 10.5 · Google AIO 16.3. Higher source density per answer in AI Overviews.

Citations to the tool’s own site: ChatGPT 15% · Google AIO 20%. AIO is materially more likely to link buyers to the vendor.

For a vendor, this means a well-structured product page or documentation page has a materially higher chance of being cited directly inside AI Overviews than inside ChatGPT. The implication is symmetric across engines: an owned-content investment pays out faster in AIO, while a community footprint pays out faster in ChatGPT.
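Both metrics in this finding are straightforward ratios. The per-query averages follow from the published totals (157 and 245 citation records over 15 queries). The self-citation function below is a sketch on hypothetical data, and it assumes only citations typed as the tool's own site count; whether a tool's own YouTube channel is included in the study's figure is not stated.

```python
QUERIES = 15
citation_totals = {"ChatGPT": 157, "Google AIO": 245}  # totals from the methodology

# Average cited sources per query, rounded to one decimal place.
per_query = {engine: round(n / QUERIES, 1) for engine, n in citation_totals.items()}


def self_citation_share(citation_types: list[str]) -> float:
    """Fraction of citations pointing at the named tool's own site (assumed definition)."""
    if not citation_types:
        return 0.0
    own = sum(1 for t in citation_types if t == "own_site")
    return own / len(citation_types)


# Hypothetical citation types for one engine.
sample_types = ["own_site", "third_party", "third_party", "third_party", "own_youtube"]
sample_share = self_citation_share(sample_types)  # 1 of 5 citations
```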

Finding 05

The shared core of the category

Beneath the divergence sits a small, stable consensus. A handful of tools were named by both engines across multiple queries, forming a core that a buyer will encounter regardless of which assistant they ask.

Tools named by both engines, across all 15 queries

Number of queries (of 15) in which both engines named the tool:
  • Vimeo: 5
  • Wistia: 4
  • Gumlet: 4
  • Cloudinary: 3
  • Bynder: 2
  • Imgix: 2
  • Mux: 2
  • Cloudflare Stream: 2
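A count like the one above can be produced by intersecting the two engines' brand sets for each query and tallying how often each tool lands in the intersection. This is a minimal sketch with placeholder tool names, not the study's data or code.

```python
from collections import Counter


def shared_core(queries: list[tuple[set[str], set[str]]]) -> list[tuple[str, int]]:
    """Count, per tool, the queries in which both engines named it."""
    counts: Counter[str] = Counter()
    for chatgpt_tools, aio_tools in queries:
        counts.update(chatgpt_tools & aio_tools)  # only tools named by both engines
    return counts.most_common()


# Hypothetical (ChatGPT, Google AIO) brand sets for three queries.
queries = [
    ({"ToolA", "ToolB"}, {"ToolA", "ToolC"}),
    ({"ToolA", "ToolB"}, {"ToolA", "ToolB"}),
    ({"ToolC"}, {"ToolD"}),
]
core = shared_core(queries)
```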

Named more often by ChatGPT

Gumlet, Acquia DAM (Widen), Aprimo, FileCamp, Canto, Bunny CDN Optimizer.

Named more often by Google AIO

CloudImage (by Scaleflex), MediaValet, Amazon CloudFront & Fastly, FastPix, Shotstack, Vonage Video API.

What this means

How brands should compete in AI search

01 · Strategy

Treat the two engines as separate channels

A citation strategy that wins in one will not automatically transfer to the other, because the source layers barely intersect. Finding 3 confirms the gap is not a ranking artifact you can close with a single piece of content.

02 · For ChatGPT

Invest where communities form opinions

The heavy weighting toward Reddit and forums means presence and sentiment in community discussion is a primary lever. Treat community ops as a citation channel, not a brand-affinity tactic. See how to rank in ChatGPT for the practical surface.

03 · For Google AIO

Invest in structured, indexable owned and comparison content

The higher self-citation and vendor-page share rewards well-organized product, documentation, and comparison pages that Google’s index already ranks. Comparison and head-to-head pages do disproportionate work here, and schema markup measurably lifts machine-readability of those pages.

04 · Both engines

Win “how do I” queries by being the named example

Both engines often answer problem-solving queries without recommending a product at all. The opportunity is to become the obvious example an engine reaches for, which means winning the citation in how-to content too, not just “best of” lists.

This is the work DerivateX does for B2B SaaS companies: separately mapping each engine’s citation surface and engineering the right content into each. See how an engagement works.

For two years the working assumption has been that the major AI engines mostly agree, and that earning a citation in one will carry over to another. This dataset complicates both. ChatGPT and Google AI Overviews converge on who belongs in the conversation but draw on almost entirely separate evidence to back it up. The playbook for earning each is different at the source level.
Apoorv Sharma, Co-founder, DerivateX
Questions this report answers

Frequently asked questions

Partly. On open-ended buying queries, ChatGPT and Google AI Overviews recommended the same tools about 32% of the time. They converge much more on direct comparison queries (where the buyer names the tools, so brand overlap is 100% by design) and least on problem-solving queries (22%).
Almost never. Source (domain) overlap was 4% on open-ended queries and 30% even on comparison queries that discuss identical named tools. The engines surface roughly the same shortlist but back it with almost entirely different web pages.
No. Even pooling ChatGPT’s primary and secondary citations together, only 25% of Google AI Overviews’ primary sources appear in ChatGPT at all. The two engines draw on separate source universes rather than reordering a shared one.
ChatGPT, by a wide margin. 25% of its citations were community sources, versus 4% for Google AI Overviews. In raw counts, ChatGPT cited Reddit 39 times across the query set; Google AI Overviews cited it 7 times.
Google AI Overviews. 20% of its citations point to the named tool’s own domain, against 15% for ChatGPT. AIO also surfaces more sources per answer overall (16.3 vs 10.5).
In the cloud media and content infrastructure category, Vimeo, Wistia, Gumlet, and Cloudinary were the tools both engines named most consistently across queries. Beneath the source divergence sits a small, stable brand consensus.
Treat them as separate channels. For ChatGPT, invest in community presence and sentiment, especially Reddit and forums where the bulk of citations live. For Google AI Overviews, invest in structured, indexable comparison and documentation pages that Google already ranks. The same content investment will not satisfy both at once.
The exact percentages should be read as category-specific. The structural finding (brand convergence with source divergence, and separate source universes between the two engines) is consistent with how each engine is built and is likely to hold qualitatively in other B2B software categories, though magnitude will vary.
Limitations

This is a focused snapshot, not an exhaustive census. It covers 15 queries within a single category (cloud media and content infrastructure), so the absolute overlap figures should be read as category-specific rather than universal. AI engines are non-deterministic and results shift over time and by geography; we mitigated this with repeated runs per query but did not capture run-by-run frequency in the published dataset. Comparison-query brand overlap is 100% by construction because the query names the tools, which is why we report open-ended and comparison results separately. Source categorization involves judgment at the margins.
Use this research

How to cite this study

This research is free to reference and quote with attribution. Please credit DerivateX and link to the original study, and use the phrase “the Agreement Gap” where helpful.

DerivateX Research (2026). The Agreement Gap: How often ChatGPT and Google AI Overviews recommend the same tools, and why they rarely cite the same sources. DerivateX. https://derivatex.agency/report/chatgpt-vs-google-ai-overviews-benchmark/
About

DerivateX is an SEO and GEO agency for B2B SaaS that helps software companies get found and cited inside ChatGPT, Perplexity, Gemini, and Google AI Overviews. This benchmark is part of our ongoing research into how AI engines choose, recommend, and cite software. See our companion B2B SaaS AI Citation Study and Authority Inversion reports, or see the top LLM SEO agencies working on this problem.

15 buyer-intent queries · 5 segments tested · 3 query intent types · 402 citation records · 157 ChatGPT · 245 Google AIO · 5–10 runs per query · Conducted June 2026 by DerivateX