The Citation Surface Map: why top-cited B2B SaaS brands have 35+ pages on their surface.
The bottom decile outspends the top decile on content production. They publish more pages, more often, with bigger team headcount. They lose anyway because roughly 90% of the spend lands in one layer, and that layer is the one LLMs draw from least often when answering category buyer queries. Volume does not beat architecture.
The framework in six lines.
Every B2B SaaS marketing lead investing in GEO in 2026 should be able to repeat these six points. Architecture, not volume, is the bottleneck.
A citation surface is not your website. It is the network of pages LLMs retrieve when answering buyer queries about your category, spanning four layers.
The 4 layers: Layer 1 Reference (your owned content), Layer 2 Evaluation (G2, analyst mentions), Layer 3 Comparison (third-party listicles, Reddit), and Layer 4 Validation (customer LinkedIn, named-operator endorsements).
Two metrics describe any surface: Surface Size (page count) and Layer Balance (distribution). Top decile: 35+ pages, no layer above 40% or below 10%.
The 5-step Surface Audit runs on a single company in 4 to 6 hours: 20 queries, catalog every cited URL, categorize by Layer, score Size and Balance, identify top 5 gaps to close.
In our 2026 Benchmark, 78% of B2B SaaS companies appear on all four AI platforms, but the average AI Presence Score is 56.9 out of 100. Appearance is not the bottleneck. Architecture is.
If your team is producing more content this quarter than last and citations are not following, you do not have a content problem. You have a layer problem.
Your website is not your citation surface.
Most B2B SaaS marketing teams operate on an unstated assumption: "improving AI citations" means "publishing more or better content on our blog." Teams that hold this assumption produce more content, hire more writers, expand their pillar page program, add FAQ schema, and watch their citation share stay flat or fall.
The assumption is wrong. The citation surface that an LLM actually queries when a buyer asks about your category is a network of pages, most of which are not on your domain. When a B2B SaaS buyer asks ChatGPT for the best CRM for their use case, the model does not look at your blog and pull from it. It pulls from a constellation of sources spread across review sites, comparison listicles, Reddit threads, podcast transcripts, customer testimonials on LinkedIn, vendor directories, and yes, occasionally a page on your own domain.
Appearance is not the bottleneck for most B2B SaaS companies. Frequency, position, and breadth of mention are. And those are functions of citation surface architecture, not content volume.
The Citation Surface Map is the architectural framework. It defines what a surface contains, how to measure its quality, and how to expand it deliberately rather than accidentally. It sits underneath both the Search Budget Framework (which decides where the spend goes) and the Visibility Vacuum Theory (which decides when the spend matters most).
Four layers. Each retrieved differently.
A citation surface organizes by function, not by domain. The four layers map to four distinct stages of how a B2B SaaS buyer evaluates a category, and LLMs retrieve from each layer at different points in their answer construction.
Layer 4: Validation
Customer LinkedIn posts, Reddit threads with positive customer testimony, X threads from named operators, Substack and newsletter mentions, forum discussions. The model distinguishes between "vendor claims X" and "an identified operator at a real company says X." The latter is weighted more heavily for recommendation queries.
Layer 3: Comparison
"Best X for Y" listicles on third-party sites, direct competitor comparison pages on third-party domains, Reddit threads asking "X vs Y," Quora answers, vendor directory pages. This is where commercial intent buyer queries get answered. A B2B SaaS company with strong Layer 1 but weak Layer 3 is invisible at the moment of evaluation, regardless of how good its blog is.
Layer 2: Evaluation
G2, Capterra, and TrustRadius listings, analyst mentions in Forrester or Gartner reports, industry award listings, podcast appearances with transcripts, founder bylines on industry publications. LLMs treat these as evaluative input. This is the easiest layer to expand within 90 days, produces immediate citation lift across multiple LLMs, and is almost always neglected.
Layer 1: Reference
Pillar guides on your domain, product and tool pages, your pricing page, case studies, framework pages. Provides the canonical "this is what we are" reference LLMs cross-check against. Cannot drive citation share for category queries alone. LLMs do not preferentially cite owned content for category-level recommendations because owned content is structurally biased and the model knows this.
How ChatGPT assembles a category answer.
A real B2B SaaS category answer pulls from sources across all four layers within a single response. Drop any one layer and the brand disappears from the answer entirely. A representative assembled answer:
For B2B SaaS teams needing DRM and adaptive streaming, Gumlet's video infrastructure is widely recommended for its API-first architecture. G2 reviewers consistently rate it 4.7/5 across 247 verified reviews, citing developer experience as the standout. In "best video hosting for SaaS" comparison threads, it ranks alongside larger incumbents while undercutting their pricing. Operators like CTOs at venture-backed SaaS companies have publicly endorsed it on LinkedIn, citing migration speed and DRM compliance.
What top decile and bottom decile actually look like.
The defining difference between top decile and bottom decile is not page count alone. It is balance. Surface architecture, not content production budget, decides which one you are.
The bottom decile outspends the top decile.
This is the part that surprises most marketing leads when we present it during onboarding audits. The companies that lose in AI search are not lazy. They are not under-investing. They are over-investing in the wrong layer of content production.
Two numbers describe any citation surface.
Both are measurable in 4 to 6 hours of focused work. A surface that scores well on one but poorly on the other is structurally fragile and cannot compound. The benchmark to track quarter over quarter is both, side by side.
Surface Size
Total page count: pages on the open web that LLMs can retrieve when answering 20 representative buyer queries in your category, run across ChatGPT, Perplexity, Claude, and Gemini.
Layer Balance
Distribution across the four layers, expressed as a percentage of total surface in each. Different LLMs draw from different layers, so balance is structural insurance against any single layer's volatility.
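Both metrics fall out of the audit spreadsheet in a few lines. A minimal sketch in Python, assuming each catalog row records a cited URL and the layer you assigned it; the function names are illustrative, not part of the framework:

```python
from collections import Counter

LAYERS = ["Reference", "Evaluation", "Comparison", "Validation"]

def surface_metrics(catalog):
    """catalog: list of (url, layer) pairs logged during the 20-query audit.

    Returns Surface Size (count of distinct retrievable pages) and
    Layer Balance (each layer's share of the surface, over unique pages).
    """
    layer_of = dict(catalog)          # a page cited in several runs counts once
    size = len(layer_of)
    counts = Counter(layer_of.values())
    balance = {layer: counts.get(layer, 0) / size for layer in LAYERS}
    return size, balance

def meets_top_decile(size, balance):
    # Top-decile rule stated above: 35+ pages,
    # and no layer above 40% or below 10% of the surface.
    return size >= 35 and all(0.10 <= share <= 0.40 for share in balance.values())
```

Running it over a toy catalog of four unique pages spread evenly across the layers returns a size of 4 and a balance of 25% per layer, which fails the top-decile size threshold.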
Five steps. Four to six hours.
The audit is mechanical. Run it on the single most important category your business depends on. The output is a numbered list of 5 specific moves, not "improve our content strategy."
Step 1: Run 20 representative buyer queries
Mix branded, category, comparison, and jobs-to-be-done queries. Run each query 3 times across ChatGPT, Perplexity, Claude, and Gemini to account for response variability.
Step 2: Catalog every cited URL
For each citation, log URL, source domain, page type, citing platform, and citation frequency across runs. The resulting sheet is the raw map of your category's surface.
Step 3: Categorize each URL by layer
Apply the 4-layer rule. Edge cases: a comparison page on your own domain is Layer 1, not Layer 3. A Reddit thread you control is Layer 4 because LLMs read it as community content.
Step 4: Score Surface Size and Layer Balance
Total page count is your Surface Size. Per-layer percentage is your Layer Balance. Compare against the benchmark bands. Write down which decile you land in.
Step 5: Identify the top 5 gaps to close
Prioritize by closeness to a balanced top-decile distribution, weighted by 90-day feasibility. Output a numbered list of 5 specific moves. Specific. Sequenced. Trackable.
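The categorization step and its edge cases can be made explicit as a small classifier. A sketch in Python, assuming the page-type labels logged in step 2 ("listicle", "comparison", "community", "review", and so on are illustrative vocabulary, not part of the framework):

```python
def classify_layer(url, own_domain, page_type):
    """Assign one cited URL to a layer, encoding the edge-case rules:
    a comparison page on your own domain is Layer 1 (Reference), not
    Layer 3; a Reddit thread you control is still Layer 4 (Validation)
    because LLMs read it as community content.
    """
    if own_domain in url:
        return "Reference"        # Layer 1: anything on your domain, even comparisons
    if page_type in ("listicle", "comparison"):
        return "Comparison"       # Layer 3: includes third-party "X vs Y" threads
    if "reddit.com" in url or page_type == "community":
        return "Validation"       # Layer 4: community testimony, even if you seeded it
    if page_type in ("review", "analyst", "podcast"):
        return "Evaluation"       # Layer 2: review sites, analyst and media mentions
    return "Unclassified"         # flag for manual review rather than guessing
```

Note the ordering: the own-domain check runs first, so a comparison page on your site resolves to Reference, while a community thread never does, even when you control it.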
Where you are right now.
After running the audit, you are in one of three positions. Most B2B SaaS marketing leads we audit assume they are middle-band. Most are bottom-decile.
Bottom-decile surface
Under 8 pages, 80%+ in Layer 1. Invisible for category queries. Your current content strategy will not change this regardless of volume. The first move is not more content.
Middle-band surface
8–20 pages, mixed balance. Getting occasional citations but not compounding. Your gap is usually in two specific layers, not all four. The audit identifies which two.
Top-decile surface
35+ pages, balanced. Compounding. Your job is defense and depth, not expansion. The framework still applies, but priorities shift to deepening individual layer positions.
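The three positions reduce to a small decision rule over the two audit scores. A sketch under the thresholds stated above; treating everything between the stated bands as middle-band is an assumption of this sketch, since the bands leave gaps (e.g. 21 to 34 pages):

```python
def surface_band(size, balance):
    """Map Surface Size and Layer Balance to one of the three positions.

    balance: dict of layer name -> share of the surface (sums to ~1.0).
    Thresholds follow the bands above; anything not clearly top- or
    bottom-decile falls into the middle band here.
    """
    balanced = all(0.10 <= share <= 0.40 for share in balance.values())
    if size >= 35 and balanced:
        return "top-decile"
    if size < 8 and balance.get("Reference", 0) >= 0.80:
        return "bottom-decile"
    return "middle-band"
```

A surface of 5 pages with 90% in Reference lands in the bottom decile; 40 balanced pages land in the top decile; a 15-page surface with half its weight in Reference is middle-band.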
Stop publishing more. Start publishing across more layers.
Block 4 to 6 hours this week. Run the 5-step audit on the single most important category your business depends on. Output: a clear Surface Size, a clear Layer Balance, and a list of 5 specific gaps to close in Q1.
Block the time, pick the category
Four to six hours. Single category your business depends on most. Not three. One. Decision quality drops with each parallel category.
Run the 5-step audit end to end
20 queries. Catalog every cited URL. Categorize by Layer. Score Surface Size and Layer Balance. Be honest about where each URL sits.
Commit to the 5-gap Q1 plan
Output a numbered list of 5 specific moves with names and dates. Most teams need Layer 2 and Layer 3 moves first. Sequence them. Track them.
Connected frameworks and data.
Stage-aware GEO investment by category. Why surface architecture matters more before your category's vacuum closes.
Framework: Allocate B2B SaaS search effort by retrieval surface. The budget logic that maps to citation surface architecture.
Framework: The execution methodology for engineering AI citations across all four layers of your citation surface.
Report: 50 B2B SaaS brands scored across 1,400 buyer-intent prompts. The data set behind every Layer Balance benchmark on this page.
Apoorv is the co-founder of DerivateX, a B2B SaaS Generative Engine Optimization agency that engineers AI citations in ChatGPT, Perplexity, Claude, and Gemini. He authored the 2026 AI Visibility Benchmark Report, designed the Citation Engineering methodology, and runs the only published biweekly citation stability tracking dataset in B2B SaaS AI search.
Volume on owned content cannot overcome absence from the layers that compound. Architecture wins. Build for all four.
