Your SEO report looks fine. It tells you nothing about whether ChatGPT recommends you.
LLM visibility is the measure of how often and how prominently AI tools recommend your brand for the queries your buyers are actually asking. Most B2B SaaS companies have no idea what that number is. We measure it, track it weekly, and build the citations that move it.
This page covers the measurement and tracking side of LLM SEO. For the full picture of how we build AI citation visibility, visit the LLM SEO overview page.
See LLM SEO overview

You cannot build what you cannot measure. Most LLM SEO programs skip this step.
"LLM visibility is the frequency and prominence with which AI tools (ChatGPT, Perplexity, Claude, Gemini) recommend your brand when buyers ask questions in your category."
Every B2B SaaS company running an SEO program tracks rankings, traffic, and domain authority. None of those metrics tell you whether ChatGPT recommends your product. A brand can rank first on Google for its primary keyword and be completely absent from AI recommendations in the same category.
This is not a future problem. Your buyers are using ChatGPT and Perplexity to build shortlists right now. If you are not in those answers, you are not on those shortlists. And no metric in your current stack will tell you that you are missing.
LLM visibility is the gap metric. It sits between the SEO work you are already doing and the AI-sourced pipeline you are not yet seeing. Measuring it is the prerequisite to building it deliberately, rather than waiting for your brand to appear in AI answers by accident.
AI Visibility Score (AVS): a 0 to 100 metric built for this specific problem.
DerivateX coined AVS and uses it as the primary reporting metric for every LLM SEO engagement. It is the only metric we know of designed specifically to measure AI citation frequency and prominence: not inferred from proxy signals, not estimated from brand mentions, but calculated from systematic weekly prompt testing across all four major AI tools.
How AVS is calculated
We define 20 buyer prompts specific to your category (the questions your ICP is actually asking ChatGPT during their research). We run all 20 prompts across ChatGPT, Perplexity, Claude, and Gemini every Monday. That creates 80 scoring events per week.
Each result is scored 0 to 5: five points if your brand is the primary recommendation, three if it is a named secondary mention, one if it appears in passing, zero if it is absent entirely.
AVS is the total raw score divided by the maximum possible score of 400, expressed as a percentage. An AVS of 34 means your brand captured 136 of 400 possible points: you are visible and building traction, but not yet the default recommendation.
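Under the rubric above, the weekly calculation is simple arithmetic. A minimal sketch of the math (the function name and data shape are illustrative, not DerivateX's actual tooling):

```python
# Illustrative AVS calculation: 20 prompts x 4 tools = 80 scoring
# events per week, each scored 0, 1, 3, or 5 per the rubric.

PROMPTS = 20
TOOLS = 4
MAX_EVENT_SCORE = 5
MAX_TOTAL = PROMPTS * TOOLS * MAX_EVENT_SCORE  # 400

def avs(event_scores):
    """Return the AI Visibility Score (0-100) for one week of events."""
    assert len(event_scores) == PROMPTS * TOOLS, "expected 80 events"
    assert all(s in (0, 1, 3, 5) for s in event_scores), "invalid rubric score"
    return round(100 * sum(event_scores) / MAX_TOTAL)

# A week totaling 136 of 400 possible points scores an AVS of 34.
week = [5] * 12 + [3] * 20 + [1] * 16 + [0] * 32
print(avs(week))  # -> 34
```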
The same 20 prompts are run every week. The score can only be trusted if the prompts are fixed. You are tracking movement on the same questions, not optimizing each week’s prompt set for the best-looking result.
What your AVS means
Week 6
An AVS of 34 at Week 6 of a GEO engagement means the brand is building genuine category presence. It is appearing in most tools for many prompts, named prominently in roughly 15% of all tested scenarios. That is meaningful movement from a Week 1 baseline that typically reads between 0 and 8.
The prompt-level breakdown is where the action lives. Low-scoring prompts tell you exactly where Citation Engineering coverage is missing. High-scoring prompts tell you which territory you already own and need to defend.
AVS is a leading indicator. Rising AVS precedes AI-sourced pipeline by 6 to 12 weeks. That expectation is set from the first report.
Three things AVS measures simultaneously. Most tools only do one.
Citation Frequency
How often your brand appears across the full set of 20 buyer prompts and 4 AI tools each week. A brand that appears in 60 of 80 scoring events is meaningfully more visible than one that appears in 20, even if both have individual high-scoring prompts. Frequency is the volume signal.
Citation Prominence
Whether your brand is the primary recommendation (five points), a secondary mention (three points), or a passing reference (one point). Two brands can appear the same number of times but have very different AVS scores depending on whether they are being cited as the answer or listed at the bottom of a bulleted list.
Cross-Platform Breadth
Whether your visibility is consistent across all four tools or concentrated in one. A brand that scores well on ChatGPT but poorly on Perplexity and Claude has a fragile position. Breadth matters because buyer AI tool preferences vary and platform market share shifts. A high AVS requires consistent citation across all four.
Where most B2B SaaS brands start. Where they need to go.
These benchmarks are based on DerivateX’s client data across B2B SaaS engagements. Brands in niche categories with fewer active competitors reach higher scores faster. Brands in crowded categories like CRM or project management should expect slower progress.
Pre-visibility
Early traction
Category presence
Category authority
Category dominance
Measure first. Build second. Report on both.
Every LLM visibility engagement at DerivateX runs in the same sequence. You know your baseline before we spend a dollar on execution. You know the gaps before we write a word of content.
That is why ChatGPT SEO is not optional for B2B SaaS. It is the channel where purchase intent is being formed before the buyer reaches you.
Establish your baseline AVS
We define 20 buyer prompts specific to your category and ICP, run them across ChatGPT, Perplexity, Claude, and Gemini, score every result on the AVS rubric, and give you your Week 1 number. You know where you stand before anything else happens. Most brands score between 0 and 8 at baseline. That gap is the opportunity.
Map the gaps by prompt
The prompt-level breakdown from Week 1 tells you exactly which buyer queries your brand is absent from and which ones you are winning. Low-scoring prompts become the content and authority priorities for the first sprint. This is the diagnostic that makes Citation Engineering targeted rather than generic.
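The gap-mapping step is just a sort on the prompt-level averages. A sketch with hypothetical prompts and scores (bracketed names are placeholders, not real data):

```python
# Hypothetical Week 1 breakdown: average score (0-5) per buyer prompt.
prompt_scores = {
    "best crm for small saas teams": 0.0,
    "alternatives to [competitor]": 0.5,
    "top tools for pipeline forecasting": 2.8,
    "is [brand] good for startups": 4.2,
}

# Lowest-scoring prompts become the first sprint's content and
# authority priorities; high scorers are territory to defend.
gaps = sorted(prompt_scores, key=prompt_scores.get)[:2]
print(gaps)
```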
Execute Citation Engineering against the gaps
Content production, digital PR placements, entity optimization, and schema implementation, all mapped to the specific prompts where your AVS is lowest. Every execution decision is tied to a measurable scoring event. If the work is effective, the AVS moves on the specific prompts it was targeting. Read the full AVS methodology for the scoring details.
Google clicks in SaaS are already declining
AI Overviews are absorbing organic clicks across SaaS categories. The buyers who used to arrive via Google search are increasingly arriving via AI answer, or not at all. ChatGPT SEO is not a future investment; it is a response to a shift that is already underway.
Get Started
Find out your brand's AVS before a competitor does.
We define your 20 buyer prompts, run them across all four AI tools, and give you your baseline AI Visibility Score with a prompt-level gap analysis. You will know where you stand and what to fix before the call ends.
