Framework
12 min read · Updated May 1, 2026

The Citation Surface Map: why top-cited B2B SaaS brands have 35+ pages on their surface.

The bottom decile outspends the top decile on content production. They publish more pages, more often, with larger teams. They lose anyway because roughly 90% of the spend lands in one layer, and that layer is the one LLMs draw from least often when answering category buyer queries. Volume does not beat architecture.

Citation surface size: distribution across B2B SaaS
Bottom Decile: <8 pages on surface. Structurally invisible for category queries. 80%+ concentrated in owned content.
Average: 14 pages on surface. Occasional citation, no compounding. Uneven layer distribution.
Top Decile: 35+ pages on surface. Citation share compounds across queries and platforms. Balanced across all four layers.
TL;DR

The framework in six lines.

Every B2B SaaS marketing lead investing in GEO in 2026 should be able to repeat these six points. Architecture, not volume, is the bottleneck.

01

A citation surface is not your website. It is the network of pages LLMs retrieve when answering buyer queries about your category, spanning four layers.

02

The 4 layers: Reference (your owned content), Evaluation (G2, analyst mentions), Comparison (third-party listicles, Reddit), and Validation (customer LinkedIn, named-operator endorsements).

03

Two metrics describe any surface: Surface Size (page count) and Layer Balance (distribution). Top decile: 35+ pages, no layer above 40% or below 10%.

04

The 5-step Surface Audit runs on a single company in 4 to 6 hours: run 20 queries, catalog every cited URL, categorize each URL by Layer, score Size and Balance, and identify the top 5 gaps to close.

05

In our 2026 Benchmark, 78% of B2B SaaS companies appear on all four AI platforms, but the average AI Presence Score is 56.9 out of 100. Appearance is not the bottleneck. Architecture is.

06

If your team is producing more content this quarter than last and citations are not following, you do not have a content problem. You have a layer problem.

The Problem

Your website is not your citation surface.

Most B2B SaaS marketing teams operate on an unstated assumption: "improving AI citations" means "publishing more or better content on our blog." Teams that hold this assumption produce more content, hire more writers, expand their pillar page program, add FAQ schema, and watch their citation share stay flat or fall.

The assumption is wrong. The citation surface that an LLM actually queries when a buyer asks about your category is a network of pages, most of which are not on your domain. When a B2B SaaS buyer asks ChatGPT for the best CRM for their use case, the model does not look at your blog and pull from it. It pulls from a constellation of sources spread across review sites, comparison listicles, Reddit threads, podcast transcripts, customer testimonials on LinkedIn, vendor directories, and yes, occasionally a page on your own domain.

Appearance is not the bottleneck for most B2B SaaS companies. Frequency, position, and breadth of mention are. And those are functions of citation surface architecture, not content volume.

The Citation Surface Map is the architectural framework. It defines what a surface contains, how to measure its quality, and how to expand it deliberately rather than accidentally. It sits underneath both the Search Budget Framework (which decides where the spend goes) and the Visibility Vacuum Theory (which decides when the spend matters most).

The Architecture

Four layers. Each retrieved differently.

A citation surface organizes by function, not by domain. The four layers map to four distinct stages of how a B2B SaaS buyer evaluates a category, and LLMs retrieve from each layer at different points in their answer construction.

Layer 04 · Validation · Top of Stack
Social proof signals · Hardest to engineer

Customer LinkedIn posts, Reddit threads with positive customer testimony, X threads from named operators, Substack and newsletter mentions, forum discussions. The model distinguishes between "vendor claims X" and "an identified operator at a real company says X." The latter is weighted more heavily for recommendation queries.

Customer LinkedIn | Reddit testimony | Substack mentions | X / operator threads
Layer 03 · Comparison · High Commercial Intent
Where commercial intent gets resolved

"Best X for Y" listicles on third-party sites, direct competitor comparison pages on third-party domains, Reddit threads asking "X vs Y," Quora answers, vendor directory pages. This is where commercial-intent buyer queries get answered. A B2B SaaS company with a strong Layer 1 but a weak Layer 3 is invisible at the moment of evaluation, regardless of how good its blog is.

PCMag · TechRadar | Software Advice | "X vs Y" pages | Reddit threads
Layer 02 · Evaluation · Third-Party Authority
Third-party validations · Easiest to expand in 90 days

G2, Capterra, and TrustRadius listings, analyst mentions in Forrester or Gartner reports, industry award listings, podcast appearances with transcripts, founder bylines on industry publications. LLMs treat these as evaluative input. This is the easiest layer to expand within 90 days, produces immediate citation lift across multiple LLMs, and is almost always neglected.

G2 · Capterra · TrustRadius | Analyst mentions | Founder bylines | Podcast transcripts
Layer 01 · Reference · Foundation
Your owned content · Anchors the brand entity

Pillar guides on your domain, product and tool pages, your pricing page, case studies, framework pages. This layer provides the canonical "this is what we are" reference that LLMs cross-check against. On its own, it cannot drive citation share for category queries. LLMs do not preferentially cite owned content for category-level recommendations because owned content is structurally biased and the model knows this.

Pillar guides | Product pages | Pricing | Case studies
Balance is insurance. Different LLMs draw from different layers. A surface with no Layer 4 loses Perplexity citation share. No Layer 3 loses ChatGPT category-query share. No Layer 2 loses Claude evaluative share. A 30-page surface concentrated in one layer is weaker than a 12-page balanced surface.
In Action · Live Simulation

Watch ChatGPT assemble a category answer.

A real B2B SaaS category answer pulls from sources across all four layers within a single response. Drop any one layer and the brand disappears from the answer entirely.

Query: What's the best video infrastructure platform for B2B SaaS that needs DRM and adaptive streaming?

Answer: For B2B SaaS teams needing DRM and adaptive streaming, Gumlet's video infrastructure is widely recommended for its API-first architecture. G2 reviewers consistently rate it 4.7/5 across 247 verified reviews, citing developer experience as the standout. In "best video hosting for SaaS" comparison threads, it ranks alongside larger incumbents while undercutting their pricing. Operators like CTOs at venture-backed SaaS companies have publicly endorsed it on LinkedIn, citing migration speed and DRM compliance.

Layer 01 · gumlet.com/features · Owned product page · Reference
Layer 02 · g2.com/products/gumlet · G2 listing · 247 reviews · Evaluation
Layer 03 · reddit.com/r/SaaS · "Best video hosting" thread · Comparison
Layer 04 · linkedin.com/post/cto-stack · Named-operator endorsement · Validation
Layers cited: 4 / 4
This single answer pulled sources from all four layers. A B2B SaaS company missing Layer 2 or Layer 3 would have been replaced in this answer by whichever competitor had those layers built. Architecture shows up at the moment of citation.
The Distribution

What top decile and bottom decile actually look like.

The defining difference between top decile and bottom decile is not page count alone. It is balance. Surface architecture, not content production budget, decides which one you are.

Top Decile
Balanced across all four layers
35+ pages on surface
Layer Balance
Reference (owned): 28% · 8–12 pages
Evaluation: 24% · 7–10 pages
Comparison: 32% · 10–15 pages
Validation: 16% · 5–10 pages
Compounding citation share. Structurally insured against any single source's volatility because it pulls from many places at once.
Bottom Decile
Concentrated in Layer 1
<8 pages on surface
Layer Balance
Reference (owned): 82% · 5–7 pages
Evaluation: 12% · 0–2 pages
Comparison: 0–1 pages
Validation: 0–1 pages
Effectively zero presence in the layers LLMs draw from for category queries. Invisible in AI search not because the work is bad, but because the work is in the wrong place.
The Counterfactual

The bottom decile outspends the top decile.

This is the part that surprises most marketing leads when we present it during onboarding audits. The companies that lose in AI search are not lazy. They are not under-investing. They are over-investing in the wrong layer of content production.

Bottom-Decile Reality
144 owned pages produced (8 posts/mo × 18 months)
14× more owned content than the top-decile competitor. Larger blog footprint, higher publishing cadence, bigger content team headcount. Invisible in AI search anyway because the entire output is concentrated in Layer 1.
vs.
Top-Decile Reality
35 total surface pages (10 owned + 25 distributed)
Roughly one fourteenth of the owned content (10 pages against 144). Top-decile citation share. The remaining 25 pages live across Layers 2, 3, and 4, the layers LLMs actually retrieve from for category queries. Architecture wins, not volume.
If your last quarterly content review was about how many posts to publish next quarter, you were running the wrong review.
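The arithmetic behind the comparison can be checked in a few lines. This is only a sketch that restates the figures quoted above (8 posts per month for 18 months, 10 owned pages, 25 distributed pages); the variable names are illustrative.

```python
# Figures taken from the bottom-decile and top-decile comparison above.
posts_per_month = 8
months = 18
bottom_owned = posts_per_month * months       # owned pages, bottom decile

top_owned = 10
top_distributed = 25
top_total = top_owned + top_distributed       # total surface pages, top decile

owned_ratio = bottom_owned / top_owned        # owned-content multiple
distributed_share = top_distributed / top_total

print(bottom_owned, top_total)
print(f"{owned_ratio:.1f}x the owned content")
print(f"{distributed_share:.0%} of the top-decile surface sits in Layers 2 to 4")
```

The bottom-decile company produces about 14 times the owned content, while roughly seven in ten pages on the top-decile surface are not owned content at all.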
The Metrics

Two numbers describe any citation surface.

Both are measurable in 4 to 6 hours of focused work. A surface that scores well on one but poorly on the other is structurally fragile and cannot compound. The benchmark to track quarter over quarter is both, side by side.

Metric 01
Surface Size

Total page count: pages on the open web that LLMs can retrieve when answering 20 representative buyer queries in your category, run across ChatGPT, Perplexity, Claude, and Gemini.

< 8 pages · Bottom Decile: Structurally invisible for category queries.
8–20 pages · Middle Band: Occasional citation, no compounding.
20–35 pages · Upper-Middle: Increasing citation share, beginning to compound.
35+ pages · Top Decile: Citation share compounds across queries and platforms.
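The Surface Size bands can be expressed as a small lookup. This is a minimal sketch, not part of the framework itself: the function name is hypothetical, and because the published bands share their edges (20 and 35), it assumes a count of exactly 20 falls in Upper-Middle and exactly 35 in Top Decile.

```python
def surface_size_band(pages: int) -> str:
    """Map a Surface Size (page count) to the benchmark bands listed above."""
    if pages < 0:
        raise ValueError("page count cannot be negative")
    if pages < 8:
        return "Bottom Decile"
    if pages < 20:
        return "Middle Band"
    # Assumption: the shared edges (20 and 35) belong to the higher band.
    if pages < 35:
        return "Upper-Middle"
    return "Top Decile"


for count in (5, 14, 28, 40):
    print(count, surface_size_band(count))
```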
Metric 02
Layer Balance

Distribution across the four layers, expressed as a percentage of total surface in each. Different LLMs draw from different layers, so balance is structural insurance against any single layer's volatility.

80%+ in one layer · Concentrated: Structurally fragile. Cannot compound.
60–80% in one layer · Heavy in one layer: One or two layers near zero. Vulnerable.
40–60% in one layer · Skewed but covered: Each layer represented but distribution uneven.
No layer above 40% or below 10% · Balanced: Top-decile architecture. Insurance against volatility.
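One way to apply the Layer Balance bands is to look at the largest and smallest layer shares. The sketch below is an illustration under stated assumptions: the function name is hypothetical, band edges are assigned to the more concentrated band, and a surface with no layer above 40% but one layer below 10% (a case the bands do not name) is treated as skewed rather than balanced.

```python
def layer_balance_band(shares: dict[str, float]) -> str:
    """Classify Layer Balance from each layer's share of the surface, in percent."""
    if abs(sum(shares.values()) - 100.0) > 0.5:
        raise ValueError("layer shares should sum to roughly 100")
    largest = max(shares.values())
    smallest = min(shares.values())
    if largest >= 80:
        return "Concentrated"
    if largest >= 60:
        return "Heavy in one layer"
    if largest > 40:
        return "Skewed but covered"
    if smallest >= 10:
        return "Balanced"
    # Assumption: no layer above 40% but one below 10% is not named by the
    # bands above, so it is reported as skewed rather than balanced.
    return "Skewed but covered"


top_decile = {"Reference": 28, "Evaluation": 24, "Comparison": 32, "Validation": 16}
print(layer_balance_band(top_decile))
```

Pass all four layers, using 0 for a layer with no pages, so that an empty layer counts against the balance check.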
The Methodology

Five steps. Four to six hours.

The audit is mechanical. Run it on the single most important category your business depends on. The output is a numbered list of 5 specific moves, not "improve our content strategy."

01
Search 20 ICP buyer queries across all 4 LLMs

Mix branded, category, comparison, and jobs-to-be-done queries. Run each query 3 times across ChatGPT, Perplexity, Claude, and Gemini to account for response variability.
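To see how much work Step 1 implies, the run plan can be enumerated up front. This sketch assumes "3 times" means three runs per query on each platform; the function name and the placeholder queries are hypothetical, and collecting the responses themselves is left out.

```python
from itertools import product

PLATFORMS = ["ChatGPT", "Perplexity", "Claude", "Gemini"]
RUNS_PER_QUERY = 3  # assumption: three runs per query on each platform


def build_run_plan(queries):
    """Return one (query, platform, run_number) tuple per response to collect."""
    runs = range(1, RUNS_PER_QUERY + 1)
    return list(product(queries, PLATFORMS, runs))


# Placeholder queries; a real audit would use 20 hand-written ICP buyer queries.
queries = [f"placeholder query {i}" for i in range(1, 21)]
plan = build_run_plan(queries)
print(len(plan))  # 20 queries x 4 platforms x 3 runs = 240
```

Under that reading, the audit collects 240 responses, which is why logging them in a consistent structure matters.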

02
Catalog every cited URL into a single sheet

For each citation, log URL, source domain, page type, citing platform, and citation frequency across runs. The resulting sheet is the raw map of your category's surface.
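The sheet in Step 2 can be modeled as one record per citation, with frequency derived by counting. This is a minimal sketch: the `Citation` class, its field names, and the example URL are hypothetical, and a spreadsheet with the same columns works just as well.

```python
from collections import Counter
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class Citation:
    url: str
    page_type: str   # assigned by hand, e.g. "review listing"
    platform: str    # the LLM that cited the URL

    @property
    def domain(self) -> str:
        """Source domain derived from the URL, without a leading 'www.'."""
        host = urlparse(self.url).netloc.lower()
        return host[4:] if host.startswith("www.") else host


def citation_frequency(citations):
    """Count how often each (url, platform) pair was cited across runs."""
    return Counter((c.url, c.platform) for c in citations)


# Hypothetical log entries: one record per citation seen in a response.
log = [
    Citation("https://www.g2.com/products/example", "review listing", "ChatGPT"),
    Citation("https://www.g2.com/products/example", "review listing", "ChatGPT"),
    Citation("https://www.g2.com/products/example", "review listing", "Perplexity"),
]
freq = citation_frequency(log)
print(log[0].domain, freq)
```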

03
Categorize each URL by Layer

Apply the 4-layer rule. Edge cases: a comparison page on your own domain is Layer 1, not Layer 3. A Reddit thread you control is Layer 4 because LLMs read it as community content.
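The layer rule and its edge cases can be written down so that every URL is categorized the same way. The mapping below is a hypothetical sketch: the page-type labels are illustrative and assigned by hand, and only the two edge cases (owned domain always Layer 1, a Reddit thread you control as Layer 4) come from the text above.

```python
# Hypothetical mapping from a hand-assigned page type to a layer number.
PAGE_TYPE_TO_LAYER = {
    "pillar guide": 1, "product page": 1, "pricing": 1, "case study": 1,
    "review listing": 2, "analyst mention": 2, "founder byline": 2,
    "podcast transcript": 2,
    "listicle": 3, "comparison page": 3, "vendor directory": 3,
    "customer post": 4, "operator thread": 4, "community thread": 4,
}


def layer_for(domain, page_type, owned_domains):
    """Return the layer (1 to 4) for a cited page."""
    # Edge case: anything on your own domain is Layer 1, even a comparison page.
    if domain in owned_domains:
        return 1
    # Edge case: a Reddit thread you control is logged as a "community thread",
    # which maps to Layer 4. Unknown page types raise KeyError on purpose.
    return PAGE_TYPE_TO_LAYER[page_type]


owned = {"example.com"}
print(layer_for("example.com", "comparison page", owned))   # owned domain
print(layer_for("reddit.com", "community thread", owned))   # community content
```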

04
Score Surface Size and Layer Balance

Total page count is your Surface Size. Per-layer percentage is your Layer Balance. Compare against the benchmark bands. Write down which decile you land in.
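Scoring reduces to counting unique URLs and taking per-layer percentages. The following is a sketch under the assumption that each URL has already been assigned one layer; the function name and the sample URLs are hypothetical.

```python
from collections import Counter


def score_surface(url_layers):
    """Return (Surface Size, per-layer share in percent) from a url -> layer map."""
    size = len(url_layers)  # dict keys are unique, so each URL counts once
    counts = Counter(url_layers.values())
    balance = {
        layer: (100 * counts[layer] / size if size else 0.0)
        for layer in (1, 2, 3, 4)
    }
    return size, balance


# Hypothetical categorized catalog: six owned pages and four third-party pages.
sample = {f"https://example.com/page-{i}": 1 for i in range(6)}
sample.update({
    "https://reviews.example/listing-a": 2,
    "https://reviews.example/listing-b": 2,
    "https://lists.example/best-tools": 3,
    "https://social.example/customer-post": 4,
})
size, balance = score_surface(sample)
print(size, balance)
```

In this made-up catalog, the size lands in the middle band while 60% of pages sit in Layer 1, which is the kind of mismatch the two metrics are meant to expose together.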

05
Identify the 5 highest-impact gaps for Q1

Prioritize by closeness to a balanced top-decile distribution, weighted by 90-day feasibility. Output a numbered list of 5 specific moves. Specific. Sequenced. Trackable.
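One possible way to rank gaps is to compare current page counts against a top-decile target and weight each shortfall by feasibility. This is a sketch, not the scoring used in the audit: the target shares assume the top-decile percentages in this article (28%, 24%, 32%, 16%) map to the layers in the order listed, and the feasibility weights are invented for illustration.

```python
# Assumed top-decile shares (see lead-in), applied to a 35-page target surface.
TARGET_SHARE = {"Reference": 0.28, "Evaluation": 0.24, "Comparison": 0.32, "Validation": 0.16}
TARGET_SIZE = 35

# Hypothetical 90-day feasibility weights (1.0 = easiest); not from the source.
FEASIBILITY = {"Reference": 0.8, "Evaluation": 1.0, "Comparison": 0.7, "Validation": 0.4}


def rank_gaps(current_pages):
    """Rank layers by page shortfall against the target, weighted by feasibility."""
    scores = {}
    for layer, share in TARGET_SHARE.items():
        shortfall = max(0.0, share * TARGET_SIZE - current_pages.get(layer, 0))
        scores[layer] = round(shortfall * FEASIBILITY[layer], 2)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical bottom-decile surface: seven pages, mostly owned content.
current = {"Reference": 6, "Evaluation": 1, "Comparison": 0, "Validation": 0}
ranked = rank_gaps(current)
for layer, score in ranked:
    print(layer, score)
```

The ranking only says which layers deserve moves first. Turning it into 5 specific, dated moves is still a judgment call.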

The pattern we see most often: bottom-decile companies need to add Layer 2 (G2 optimization, founder bylines) and Layer 3 (third-party listicles, vendor directories) first, because Layer 1 is already over-invested. Citation Engineering is how that work gets executed. If you want a faster baseline, the AI Visibility Checker runs Step 1 in minutes.
The Diagnostic

Where you are right now.

After running the audit, you are in one of three positions. Most B2B SaaS marketing leads we audit assume they are middle-band. Most are bottom-decile.

Position 01

Bottom-decile surface

Under 8 pages, 80%+ in Layer 1. Invisible for category queries. Your current content strategy will not change this regardless of volume. The first move is not more content.

The Move
Cut Layer 1 production by 30%. Redirect that effort to Layer 2 and Layer 3. Most companies can move to middle-band within one quarter.
Position 02

Middle-band surface

8–20 pages, mixed balance. Getting occasional citations but not compounding. Your gap is usually in two specific layers, not all four. The audit identifies which two.

The Move
Sequence the 5 gaps as one Layer 2 move, two Layer 3 moves, and two Layer 4 moves over the next quarter.
Position 03

Top-decile surface

35+ pages, balanced. Compounding. Your job is defense and depth, not expansion. The framework still applies, but priorities shift to deepening individual layer positions.

The Move
Deepen each layer. Monitor for volatility events. Protect against any single-layer drop.
FAQ

Common questions from operators.

What is a citation surface?
A citation surface is the total set of pages on the open web that large language models can retrieve when answering buyer queries about your category. It is not your website. It is a network of pages spanning four functional layers: Reference (your owned content), Evaluation (third-party validations like G2 and analyst mentions), Comparison (listicles, Reddit threads, vendor directories), and Validation (customer LinkedIn posts, named-operator endorsements). Across our analysis of B2B SaaS companies, the average citation surface is 14 pages, with around 10 of those pages on third-party domains. Top-cited brands have 35 or more pages distributed across all four layers.
How many pages should a B2B SaaS company have on its citation surface?
A B2B SaaS company in active GEO investment should target 35 or more pages on its citation surface, balanced across all four layers with no single layer above 40% or below 10%. Bottom-decile companies in our analysis have under 8 pages, almost all concentrated in Layer 1. Middle-band companies have 8 to 20 pages with uneven distribution. Top-decile companies have 35 or more pages with balanced layer distribution. The benchmark to track quarter over quarter is both Surface Size and Layer Balance, because a 30-page surface concentrated in one layer is weaker than a 12-page balanced surface.
What are the four layers of a citation surface?
Layer 1 is Reference: your owned content, including pillar guides, product pages, pricing, case studies, and framework pages. Layer 2 is Evaluation: third-party validations including G2, Capterra, and TrustRadius listings, analyst mentions, industry award listings, podcast appearances with transcripts, and founder bylines on industry publications. Layer 3 is Comparison: third-party listicles, direct competitor comparison pages, Reddit threads, Quora answers, and vendor directory pages where buyers compare options. Layer 4 is Validation: customer LinkedIn posts, Reddit threads with positive customer testimony, X threads from named operators, Substack mentions, and forum discussions. Each layer maps to a different point in the buyer evaluation path.
Why does publishing more content not increase AI citations?
Content volume produces pages on Layer 1 of the citation surface. LLMs draw a minority of their B2B SaaS category citations from Layer 1 because owned content is structurally biased and the models weight independent sources more heavily for evaluative answers. In our analysis and in DerivateX onboarding audits across B2B SaaS marketing leaders, bottom-decile companies tend to outspend top-decile companies on content production. They publish more pages, more often, with bigger team headcount, and they remain invisible because 90% of the spend lands in the layer LLMs draw from least often. Volume on owned content cannot overcome absence from the layers that compound.
How do I run a Surface Audit?
Block 4 to 6 hours. Step 1: search 20 representative ICP buyer queries across ChatGPT, Perplexity, Claude, and Gemini, running each query 3 times. Step 2: catalog every cited URL into a single working sheet with source domain, page type, citing platform, and frequency. Step 3: categorize each URL by Layer. Step 4: score Surface Size and Layer Balance against the benchmark bands. Step 5: identify the 5 highest-impact gaps to close in Q1, weighted by feasibility within 90 days, biased toward Layer 2 and Layer 3 because Layer 1 is almost always over-invested.
The Action

Stop publishing more. Start publishing across more layers.

Block 4 to 6 hours this week. Run the 5-step audit on the single most important category your business depends on. Output: a clear Surface Size, a clear Layer Balance, and a list of 5 specific gaps to close in Q1.

01

Block the time, pick the category

Four to six hours. Single category your business depends on most. Not three. One. Decision quality drops with each parallel category.

02

Run the 5-step audit end to end

20 queries. Catalog every cited URL. Categorize by Layer. Score Surface Size and Layer Balance. Be honest about where each URL sits.

03

Commit to the 5-gap Q1 plan

Output a numbered list of 5 specific moves with names and dates. Most teams need Layer 2 and Layer 3 moves first. Sequence them. Track them.

Volume on owned content cannot overcome absence from the layers that compound. Architecture wins. Build for all four.