The B2B SaaS SEO Agency Evaluation Checklist (Steal This Before Hiring)

You Have Three Discovery Calls Booked. All Three Said “Data-Driven.” Here Is the Problem.
You know the scenario. Three agencies on your shortlist. All three have polished websites, SaaS client logos, and case studies showing traffic graphs that go up and to the right.
All three used “pipeline-focused” and “revenue attribution” in the same thirty minutes.
One quoted $3,500 per month. One quoted $12,000. The third sent a 47-slide deck before you had even asked for a proposal.
The problem is not a lack of options. The problem is that you have no framework that makes the evaluation objective. Generic questions like “do you have SaaS experience?” now get rehearsed answers.
Every agency has a SaaS logo on their homepage. Every agency says they focus on pipeline. The only way to find out who is actually building that capability versus just labeling it is to ask questions that require PROOF, not positioning.
There is a second problem that almost no published hiring guide covers. Agency evaluation criteria from 2022 are incomplete in 2026. A growing share of B2B buyers now research solutions in ChatGPT, Perplexity, and Gemini before they run a single Google search.
If your agency cannot explain, with specifics, how they will get your brand cited in AI-generated answers for buying-intent queries, they are covering roughly half the surface area where your buyers are active.
This checklist is built to catch both failure modes: agencies that cannot do traditional SaaS SEO well, and agencies that added “GEO” to their website without the methodology to back it up.
What you will leave with is a 25-point rubric, scored 0 to 2 per point and organized into six dimensions. It takes about 90 minutes to apply per agency. It produces a number you can compare across shortlisted partners, which makes a subjective decision as objective as possible.
This companion piece to the complete B2B SaaS SEO agency hiring guide focuses specifically on the evaluation framework itself. Start with the scoring system, then run every agency through the same six sections.
How to Use This Checklist
Before running any agency through the 25 points, set up your scoring so the comparison is clean.
Score each point as:
- 0 = Fail. The agency cannot answer, gives a vague answer, or contradicts themselves.
- 1 = Partial. They give a reasonable answer but cannot back it with proof.
- 2 = Pass. Clear answer AND demonstrable evidence.
Eleven of the 25 points are marked PROOF REQUIRED. For those points, a verbal answer scores a 1 at best. A 2 requires a document, a live demonstration, or a screenshot you can verify independently.
Apply the checklist AFTER the first discovery call and BEFORE requesting a proposal. If an agency scores below 30, do not invest time in a proposal stage.
| Score Range | What It Means |
|---|---|
| 40 to 50 | Strong shortlist candidate |
| 30 to 39 | Proceed with caution; identify which sections pulled the score down |
| Below 30 | Keep looking |
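The scoring rules above come down to a small amount of arithmetic. Here is a minimal sketch in Python, assuming you record a raw score, a PROOF REQUIRED flag, and an evidence flag for each point. The function names and the sample data are illustrative and not part of the checklist itself.

```python
# Sketch of the scoring rules: 0/1/2 per point, PROOF REQUIRED points
# capped at 1 without verifiable evidence, total mapped to a score band.
# Function names and sample data are hypothetical.

def capped_score(raw: int, proof_required: bool, has_evidence: bool) -> int:
    """Return the score for one point after applying the proof cap."""
    if raw not in (0, 1, 2):
        raise ValueError("each point is scored 0, 1, or 2")
    if proof_required and not has_evidence:
        return min(raw, 1)  # a verbal answer scores 1 at best
    return raw


def score_band(total: int) -> str:
    """Map a total out of 50 to the bands in the table."""
    if total >= 40:
        return "Strong shortlist candidate"
    if total >= 30:
        return "Proceed with caution"
    return "Keep looking"


def evaluate(points: list[tuple[int, bool, bool]]) -> tuple[int, str]:
    """points holds (raw score, proof_required, has_evidence) per checklist point."""
    total = sum(capped_score(*p) for p in points)
    return total, score_band(total)


# Example: strong verbal answers on all 25 points, but no evidence shown
# for any of the 11 PROOF REQUIRED points. Those 11 drop from 2 to 1.
sample = [(2, True, False)] * 11 + [(2, False, False)] * 14
total, band = evaluate(sample)
print(total, band)  # 39 Proceed with caution
```

The example shows why the proof cap matters: an agency that talks well but documents nothing cannot reach the top band.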
One more thing before you start: run this same rubric with every agency so you are comparing identical criteria, not whatever each agency chose to present. The score difference between two agencies almost always concentrates in Strategy and AI Visibility. That is where real differentiation lives.
1. Strategy (Points 1 to 5)

The strategy section tests whether an agency thinks about your business before thinking about your keywords. Most agencies skip this. They arrive at the first call with keyword gap analyses, traffic projections, and competitor benchmarks. That preparation looks impressive. It is also the wrong starting point.
An agency that cannot explain your funnel, your buying motion, and your ARR stage before proposing a keyword list is optimizing for deliverables, not outcomes.
Point 1: Do They Ask About Your Funnel Before Mentioning Keywords?
The fastest way to identify a pipeline-focused agency from a traffic-focused one is to clock how long it takes before they mention keywords. If your first call opens with keyword opportunities and search volume data, the agency is working backward from what they can produce, not what you need.
A pass on this point looks like this: before the first call ends, the agency has asked about your trial length, your average time-to-convert, what a qualified demo looks like for your sales team, and which content you already have that is close to converting. They are mapping your funnel before they map your keywords.
A search term with 150 searches per month that reliably converts to a $15,000 ACV customer is worth more than a 20,000-search-per-month informational article that attracts free-tier signups who never convert. An agency that knows this will act like it from the first conversation.
- Pass: Funnel questions arrive before keyword recommendations.
- Fail: The first slide in their deck is a keyword gap analysis.
Point 2: Can They Differentiate Their Strategy Across Funnel Stages?
A SaaS SEO program without a clear funnel architecture is a blog post calendar with a retainer attached to it. A pass here requires the agency to explain, with named examples from current or past clients, how their content strategy differs at each funnel stage.
Top-of-funnel informational content builds awareness and earns AI citations. Bottom-of-funnel commercial content, specifically comparison pages, alternative pages, and use-case landing pages, converts buyers who are already evaluating. The two content types require different keyword logic, different formatting, and different success metrics.
An agency that cannot articulate this distinction is likely to produce a 90-day content calendar full of informational blog posts while your commercial-intent pages sit untouched.
- Pass: They can describe funnel-stage content strategy with named examples and explain why they would prioritize one over the other at your specific stage.
- Fail: Their content strategy is a list of topics. No funnel architecture visible.
Point 3: Do They Have a Documented Methodology for AI Search Visibility? (PROOF REQUIRED)
This is the single highest-stakes question on the entire checklist. No other strength compensates for a fail here if AI search matters to your buyer journey, and in 2026 it almost certainly does.
A pass requires them to show you a WRITTEN FRAMEWORK, not talk about one. Ask them to explain how entity clarity, claim density, and answer-forward content structure work together to get a brand cited in AI-generated responses. Then ask them to open ChatGPT or Perplexity and show you a page they have built that currently appears in an AI answer for a commercial buying-intent query.
Agencies doing genuine GEO work can do this in two minutes. Agencies that added “AI SEO” to their service page without the methodology will pivot to talking about featured snippets, voice search, or traffic growth metrics. These are different things entirely.
For context on what a deliberate AI visibility strategy produces: Gumlet’s AI traffic share grew from 14.6% to 22.4% under a structured GEO program, with 20% of inbound revenue attributed to AI discovery. That is a pipeline outcome, not a traffic metric.
- Pass: Written GEO framework exists. Live AI citation demonstrable on a client page.
- Fail: They describe AI SEO in terms of featured snippets, readability, or “content AI can understand” without specifying the structural mechanics.
Point 4: Do They Separate Traditional SEO and GEO in Their Scope of Work?
Google and AI models consume content differently. Google crawls, indexes, and ranks pages based on hundreds of signals including backlinks, on-page structure, and user behavior. AI models extract discrete, attributable claims from content and surface those claims when a user’s query triggers a match.
A piece of content optimized for Google rankings is not automatically optimized for AI citation, and vice versa.
An agency that treats both as “search optimization” with one unified deliverable set either does not understand the distinction or is not operationally capable of executing both. Ask to see their scope-of-work template or proposal structure. A 2 on this point requires their SOW to list Google-facing deliverables and AI-search-facing deliverables separately, with distinct metrics for each.
- Pass: SOW explicitly distinguishes Google SEO deliverables from GEO deliverables with different KPIs.
- Fail: One deliverable set covers “all search” with no distinction between channels.
Point 5: Is Their Keyword Strategy Built Around Your Buyer’s Journey or Around Search Volume?
High search volume is not the same as high buyer intent. An agency that prioritizes keywords by monthly search volume is optimizing for traffic. An agency that prioritizes keywords by buying-stage relevance is optimizing for pipeline.
For a B2B SaaS company with a $10,000 to $50,000 ACV, the difference between these two approaches is the difference between 50,000 monthly sessions that do not move your demo bookings and 800 monthly sessions that do.
Ask the agency to walk you through how they would build a keyword strategy for your specific buying motion. If their answer involves a spreadsheet sorted by search volume with difficulty scores, probe further.
A pass requires them to identify commercial-intent targets first, specifically queries like “[your category] for [your use case],” “[competitor] alternatives,” and “[category] software comparison,” and explain why those terms convert at higher rates than informational ones even when the volume is lower.
- Pass: Keyword prioritization framework explicitly starts with buyer intent and commercial stage, not search volume.
- Fail: Primary keyword selection methodology is volume-first.
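To make the intent-first ordering concrete, here is a minimal sketch in Python. It assumes each keyword has already been labeled with a funnel stage. The stage labels, the `Keyword` class, and the sample terms and volumes are all hypothetical.

```python
# Sketch: rank keyword candidates by buying-stage intent first, volume second.
# Stage labels and sample keywords are hypothetical.
from dataclasses import dataclass

# Lower number = closer to a purchase decision (assumed tiers).
STAGE_ORDER = {"bofu": 0, "mofu": 1, "tofu": 2}


@dataclass
class Keyword:
    term: str
    stage: str  # "bofu", "mofu", or "tofu"
    monthly_searches: int


def prioritize(keywords):
    """Sort by funnel stage first; volume only breaks ties within a stage."""
    return sorted(keywords, key=lambda k: (STAGE_ORDER[k.stage], -k.monthly_searches))


candidates = [
    Keyword("what is product analytics", "tofu", 20000),
    Keyword("acme alternatives", "bofu", 150),
    Keyword("product analytics software comparison", "bofu", 400),
]
ranked = [k.term for k in prioritize(candidates)]
print(ranked)
```

A volume-first spreadsheet would put the 20,000-search informational term on top. Here it lands last, behind both commercial-intent terms.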
2. Execution (Points 6 to 10)

The execution section tests whether the agency can actually build what they describe. This is where most evaluation processes fall apart. The strategy questions get great answers because the senior team handles sales. The execution questions reveal who will actually be working on your account at week three of the engagement.
Point 6: Who Actually Does the Work? (PROOF REQUIRED)
The single most common pattern in failed B2B SaaS SEO engagements is a gap between who runs the sales process and who runs the account. An agency’s founder or senior strategist presents well, asks the right questions, and signals genuine understanding of your business.
You sign. Week one, you meet your actual account manager. They are newer, managing 15 other accounts, and learning your category as they go.
A pass requires named individuals: the account strategist assigned to your account, the writer or writers who would produce your content, and the person responsible for link outreach. Ask how many accounts each named person currently manages.
Ask to speak with a current client at a similar ARR stage who can describe what week-to-week account management actually looks like. This is not an aggressive ask. A confident agency with strong execution will facilitate this conversation without hesitation.
- Pass: Named team members, specific account loads per person, client reference available.
- Fail: “Our team,” vague answers about resource allocation, or a junior coordinator is revealed as the account lead after a senior strategist runs the pitch.
Point 7: Do They Have a Documented Content Production Process? (PROOF REQUIRED)
Content quality is the single biggest variable in whether a B2B SaaS SEO program produces pipeline, and content quality is impossible to maintain without a documented production process.
Ask them to show you a content brief template and their editorial review process. A pass requires a written brief structure (covering keyword target, funnel stage, search intent, target ICP, competitive angle, and required proof points) and a visible QC step where content is reviewed against defined standards before delivery.
An agency that describes quality control as “our editor reads it before it goes out” is relying on individual taste, not process. Individual taste does not scale, and it does not produce consistent B2B SaaS authority content.
- Pass: Written brief template visible. Defined QC criteria documented.
- Fail: “We write great content” with no process evidence.
Point 8: Can They Handle Your Marketing Site Without Touching Your Product Subdomain?
This is a SaaS-specific technical trap that generalist agencies fail repeatedly. Most B2B SaaS products live on a subdomain such as app.yourproduct.com or dashboard.yourproduct.com.
That subdomain sits behind authentication and should have no interaction with your SEO program. Your marketing site, typically on the root domain, carries your entire organic strategy.
Ask the agency how they handle the marketing site versus app divide in their technical SEO work. A credible answer names the specific risks: indexing errors if product subdomains are accidentally crawled, JavaScript rendering issues on marketing pages that sit adjacent to app pages, and canonical tag discipline across subdomain structures.
A blank look or a generic answer about “optimizing all pages” tells you this agency treats SaaS websites the same way they would treat any other B2B website. That is a meaningful capability gap.
- Pass: Immediate recognition of the marketing site versus app divide with specific technical examples.
- Fail: Generic answer about website optimization with no acknowledgment of the subdomain problem.
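One way to sanity-check the marketing site versus app divide yourself is to compare the crawl rules on each host. The sketch below uses Python's standard-library `urllib.robotparser`. The hostnames and robots.txt contents are hypothetical, and it covers crawl rules only, not authentication or noindex directives.

```python
# Sketch: confirm the app subdomain blocks crawlers while the marketing
# site stays crawlable. Hostnames and robots.txt contents are hypothetical.
from urllib.robotparser import RobotFileParser


def is_crawlable(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Parse robots.txt text and report whether the agent may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


app_robots = "User-agent: *\nDisallow: /"
marketing_robots = "User-agent: *\nAllow: /"

app_ok = is_crawlable(app_robots, "https://app.example.com/dashboard")
site_ok = is_crawlable(marketing_robots, "https://www.example.com/pricing")
print(app_ok)   # False
print(site_ok)  # True
```

In practice you would fetch each host's live robots.txt rather than hard-coding it. A disallow rule limits crawling but does not by itself guarantee a URL stays out of the index, which is part of why the canonical and indexing questions above still matter.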
Point 9: Is Technical SEO Included in the Base Retainer, and What Does Their Audit Process Actually Cover? (PROOF REQUIRED)
Technical SEO for a 15-page B2B SaaS marketing site should not take longer than three weeks. An agency billing for 60 or 90 days of “technical SEO” on a standard SaaS marketing site is padding the timeline to justify the retainer before any visible work happens.
Ask to see a sample technical audit from a comparable client. A credible audit covers: crawl configuration and indexability issues, Core Web Vitals with specific numbers, redirect chains and 404s, schema markup implementation status, and canonical tag accuracy. It also includes a prioritized fix list, not a comprehensive problem dump.
Agencies that produce 200-item technical audits without prioritization are performing analysis, not solving problems.
Ask specifically: what will be FIXED, not just identified, by the end of month one?
- Pass: Sample technical audit visible. Specific items and timeline for resolution provided. Technical SEO is in the base retainer.
- Fail: Technical SEO is a separate add-on, the audit timeline is vague, or the audit sample is a crawl error report with no prioritization.
Point 10: How Do They Handle AI Hallucinations About Your Brand?
AI models sometimes generate inaccurate descriptions of products, misattribute features, or describe companies in ways that are outdated or simply wrong. This is not a theoretical problem. It is an active one for B2B SaaS companies that have changed their positioning, updated their feature set, or rebranded.
Ask the agency what their approach is to correcting AI hallucinations about a client’s brand. A pass requires them to describe entity optimization: ensuring consistent brand descriptions appear across high-authority sources, implementing accurate schema markup, building Knowledge Panel associations, and monitoring AI-generated responses for inaccurate claims.
If they have no framework for this, they are not operating in AI search at the entity level. For B2B SaaS companies actively building category ownership, getting the entity signal right is as important as getting the content right.
- Pass: Named methodology for entity consistency, schema implementation, and AI hallucination monitoring.
- Fail: No awareness of the problem, or it gets conflated with online reputation management.
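For a sense of what "accurate schema markup" can look like in the entity-consistency work described above, here is a minimal sketch that builds schema.org `Organization` JSON-LD in Python. The company name, description, and profile URL are hypothetical placeholders, and a real implementation would carry more properties.

```python
# Sketch: generate schema.org Organization JSON-LD so the brand description
# on the site matches what appears on other sources. All values are hypothetical.
import json


def organization_jsonld(name, url, description, same_as):
    """Return a dict ready to serialize into a JSON-LD script tag."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "description": description,
        "sameAs": list(same_as),  # official profiles that describe the same entity
    }


markup = organization_jsonld(
    name="ExampleCo",
    url="https://www.example.com",
    description="ExampleCo is a video hosting platform for B2B product teams.",
    same_as=["https://www.linkedin.com/company/exampleco"],
)
print(json.dumps(markup, indent=2))
```

The useful discipline is that the `description` string is the same one-sentence positioning statement used everywhere else the brand is described.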
3. Content Quality (Points 11 to 15)

The content section tests whether the agency produces writing that ranks, gets cited by AI models, and converts buyers. These three outcomes require different things from a piece of content, and most agencies only optimize for one of them. A pass across this section requires the agency to demonstrate that their content process accounts for all three simultaneously.
Point 11: Do They Write for Dual Intent? (PROOF REQUIRED)
Every piece of B2B SaaS content should satisfy two audiences at once: the human reader making a buying decision, and the AI model deciding whether to cite it. These goals are not in conflict. Content that is specific, claim-dense, and structurally legible serves both. Content that is vague, filler-heavy, or built around keyword density serves neither.
Ask the agency to show you a published piece that currently gets cited by an LLM for a commercial query. Then ask them to walk you through the specific structural choices that produced that citation.
A credible answer covers claim density (specific, verifiable statements rather than general assertions), answer-forward structure (the H2 question is answered in the first two sentences before expanding), named examples over anonymous ones, and comparison tables or FAQ sections that give AI models a clean extraction target.
If their definition of content quality stops at word count, readability scores, or E-E-A-T signals, they are optimizing for Google alone.
- Pass: Published piece with an active AI citation. Structural explanation behind the citation is specific.
- Fail: Content quality is defined in terms of word count, readability score, or keyword density.
Point 12: Is Their Content Specific Enough to Be Citable?
Vague content does not get cited by AI models. A sentence like “many SaaS companies face challenges with organic growth” carries no extractable information.
A sentence like “B2B SaaS companies at the $5M to $20M ARR stage typically see their first meaningful organic pipeline contribution between months four and six of a structured content program” is a citable claim with a specific range, a named audience, and a defined timeframe.
Before your call, pull three recent pieces at random from the agency’s portfolio and read each against this test: does every major claim have a named example, a specific number, or a source with a year?
If the ratio of specific claims to generic statements is low, their editorial standard is too loose for B2B SaaS buyers who evaluate content against category expertise. Buyers in a $30,000 to $100,000 buying cycle notice when content reads like it was assembled from research rather than written from experience.
- Pass: Portfolio content has named examples, sourced statistics, and specific claims throughout.
- Fail: Articles are built around general advice with minimal data, unnamed examples, or phrases like “industry experts agree.”
Point 13: Can They Produce the SaaS-Specific Content Formats That Actually Convert?
Blog posts are not the highest-converting content format for B2B SaaS. They are the most produced format, and they are often the least efficient at moving buyers through the evaluation stage.
The content formats that convert buyers who are already comparing options are: comparison pages (“[Competitor A] vs [Competitor B]”), alternative pages (“Best [Competitor] alternatives”), use-case landing pages (“[Category] for [specific use case]”), integration guides, and tool or calculator pages that create product-adjacent utility.
Ask the agency to walk you through their bottom-of-funnel content architecture for a recent client. Ask specifically: do they produce comparison pages and alternative pages, and can you see examples?
An agency that delivers only informational blog posts is optimizing for TOFU traffic. Most B2B SaaS companies at the $5M to $30M ARR stage already have decent TOFU content. What they are missing are the BOFU pages that capture buyers in active evaluation. That is where the pipeline lives.
- Pass: Portfolio includes comparison pages, alternative pages, and use-case landing pages with visible evidence of commercial-intent architecture.
- Fail: Portfolio is predominantly informational blog posts. No visible BOFU content strategy.
Point 14: Do They Have Written Editorial Standards? (PROOF REQUIRED)
An agency without written editorial standards is relying on individual taste to maintain content quality across writers and accounts. Individual taste does not scale. It produces inconsistency across pieces, inconsistency across writers, and no defensible quality floor when output drops.
Ask to see their content standards document or style guide. A credible one defines: what makes a section opening acceptable (direct answer before expansion), what structural elements are required in every post (FAQ section, TL;DR block, internal links with descriptive anchor text), what words and phrases are off-limits, and what the QC reviewer checks before a piece ships. The existence of this document tells you the agency has institutionalized quality rather than depending on whoever is assigned to your account that week.
- Pass: Written standards document visible. Specific rules at the sentence, paragraph, and structural level.
- Fail: Quality described as “our editor reviews it” or “we maintain high standards” with no written criteria.
Point 15: Do They Understand How Content Strategy Changes at Different ARR Stages?
A $3M ARR SaaS company and a $30M ARR SaaS company need fundamentally different content strategies, even if they are in the same product category. The early-stage company needs to build topical authority from scratch and establish the brand in AI search before competitors do.
The growth-stage company likely has some existing authority and needs to convert that authority into pipeline by filling in BOFU gaps, expanding into adjacent categories, and defending against competitors building their own comparison pages.
Ask the agency: “If you were working with a $5M ARR PLG company versus a $25M ARR sales-led company in the same category, how would the content strategy differ?” A specific answer that distinguishes between signup-driving TOFU content for PLG and demo-driving BOFU content for sales-led motions signals genuine SaaS understanding.
A generic answer about “high-quality content for the right audience” tells you this agency applies one content framework regardless of business model.
- Pass: Clear articulation of how content strategy changes by ARR stage, buying motion (PLG vs. sales-led), and competitive position.
- Fail: Generic answer with no distinction between stages or buying motions.
4. Link Building (Points 16 to 19)

The link building section tests whether the agency’s approach to off-page authority is credible, ethical, and calibrated to how AI models weigh source authority. Traditional link building criteria, editorial quality over volume, have not changed. What HAS changed is the connection between link acquisition and AI citation authority.
The sources that carry the most weight for LLM citations are not the same as the sources that carry the most weight for PageRank, and an agency operating in both channels should understand the distinction.
Point 16: What Is Their Link Acquisition Method, Specifically? (PROOF REQUIRED)
A guaranteed number of links per month is a red flag, not a selling point. Promising 10 links per month regardless of what is earned signals a volume-over-quality operation, which in practice means link schemes, paid placements passed off as editorial, or private blog networks. All three create penalty risk and carry no weight with AI models.
Ask for the names of specific publications where they have placed links for clients in a vertical similar to yours. Ask them to describe the outreach method: digital PR, editorial pitch, broken link building, or resource page outreach.
A credible agency can name publications like G2, TechCrunch, SaaStr, or vertical-specific industry media. An agency that cannot name publications is either running a link network or building links on low-authority sites that will not move rankings or AI citation signals.
- Pass: Named publications provided. Outreach methodology described specifically. No guaranteed volume promises.
- Fail: Guaranteed monthly link volume with no named publication examples.
Point 17: Do They Understand Link Building for AI Citation Authority?
Not all links carry the same weight with AI models. When ChatGPT, Perplexity, or Gemini generates a response that cites a brand, it draws on sources those models treat as high-authority: major publications, established industry media, research organizations, and directories that AI training datasets weighted heavily. A link from a DR 30 niche blog may help a Google ranking. It contributes almost nothing to AI citation authority.
Ask the agency whether they distinguish between link acquisition for Google authority and link acquisition for AI citation authority.
A credible answer describes targeting publications and sources that AI models treat as high-authority reference points: established trade publications in your category, major tech and business media, authoritative review platforms, and research-quality content sites.
If they have no framework for this distinction, their link building program is incomplete for a full-channel organic strategy.
- Pass: Explicit framework for distinguishing Google PageRank value from AI citation authority value in link targets.
- Fail: Link building and AI search are treated as separate concerns with no connection, or no awareness of the distinction.
Point 18: How Do They Vet Domain Quality Before Pursuing a Placement?
Not every DR 50+ domain is a legitimate editorial target. Many high-DR sites are link farms with inflated metrics, expired domain networks, or sponsored content hubs where “editorial” placements are purchased. An agency that vets only by domain rating is missing the more important quality signal: does this publication have real editorial standards, real readership, and real relevance to your buyer’s information diet?
Ask the agency to describe their domain vetting process. A credible process checks: domain rating AND organic traffic trend (a high DR paired with declining traffic often points to a manipulated rating), editorial standards (does the site have an actual editorial team or is it a contributor network?), topical relevance to your category, and whether the site appears in your ICP’s reading list.
The SaaS-specific link building program at a category-specialist agency will have a materially different publication list than a generalist agency’s link network.
- Pass: Multi-factor vetting criteria described. Not just domain rating.
- Fail: Vetting process is DR score plus a brief manual review with no topical relevance or editorial standards check.
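A multi-factor vetting process like the one described above can be written down as a short checklist. The sketch below is one possible shape for it, assuming the inputs have been gathered by hand or from an SEO tool. The `Domain` fields, the DR threshold of 50, and the sample site are hypothetical.

```python
# Sketch: multi-factor domain vetting. A domain passes only if every check
# passes. Field names, the DR threshold, and the sample data are hypothetical.
from dataclasses import dataclass


@dataclass
class Domain:
    host: str
    domain_rating: int
    traffic_now: int
    traffic_year_ago: int
    has_editorial_team: bool
    topically_relevant: bool
    read_by_icp: bool


def vet(d: Domain, min_dr: int = 50) -> list[str]:
    """Return the list of failed checks; an empty list means the domain passes."""
    failures = []
    if d.domain_rating < min_dr:
        failures.append("domain rating below threshold")
    if d.traffic_now < d.traffic_year_ago:
        failures.append("organic traffic declining")
    if not d.has_editorial_team:
        failures.append("no editorial team")
    if not d.topically_relevant:
        failures.append("not topically relevant")
    if not d.read_by_icp:
        failures.append("not on ICP reading list")
    return failures


suspect = Domain("high-dr-blog.example", 72, 4_000, 30_000, False, True, False)
problems = vet(suspect)
print(problems)
```

The sample site clears the rating check and still fails three others, which is the case a DR-only process misses.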
Point 19: Are They Transparent About What They Will NOT Do?
An agency that cannot tell you what it will not do for your business is an agency with no principles around methodology. Ethical clarity is a trust signal. Ask directly: “What link building tactics are off the table for you, and why?”
A credible answer immediately names private blog networks, paid placements represented as editorial, link exchanges, and mass directory submissions, and can explain specifically why each carries risk for a B2B SaaS company’s domain health.
The follow-up question is just as useful: “Have you ever walked away from a client who wanted to pursue tactics you considered risky?” A yes answer with a brief explanation tells you the agency has a line and holds it. Any answer that hedges, deflects, or implies “we adapt to what clients want” is a flag worth noting.
- Pass: Named prohibited tactics with specific explanations of risk.
- Fail: Unwilling to define limits, or all tactics framed as “it depends on the situation.”
5. Reporting (Points 20 to 22)

The reporting section is where the gap between a traffic-focused agency and a pipeline-focused agency becomes impossible to hide. A monthly report is the agency’s argument for why they deserve next month’s retainer. Look at what they choose to put in that argument.
Point 20: Does Their Reporting Connect Organic Traffic to Pipeline Metrics? (PROOF REQUIRED)
Ask to see an actual sample report. An anonymized example is fine. What you are looking for is a report that connects organic traffic to business outcomes: which pages drove demo requests, which content pieces had organic as the first or last touch on closed deals, and what percentage of MQLs in the period had an organic source.
This kind of reporting requires proper analytics setup from month one, including UTM discipline, conversion event tracking, and attribution modeling. It is more work to build. It is the only report worth receiving.
A report built around session graphs, keyword position tables, and domain authority charts with no conversion data is not a pipeline report. It is a traffic report with a pipeline label. The distinction matters because a company spending $10,000 per month on SEO needs to know whether that spend is contributing to revenue, not whether their average keyword ranking moved from position 14 to position 11.
- Pass: Sample report includes conversion events, pipeline attribution data, and content performance tied to business outcomes.
- Fail: Report is primarily sessions, rankings, and domain authority with no connection to demos, trials, or pipeline.
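The pipeline figures described above are simple to compute once lead source and conversion events are tracked. Here is a minimal sketch in Python, assuming a flat export of MQLs with a source, a landing page, and a demo flag. The field names and sample rows are hypothetical and do not reflect any particular analytics tool's schema.

```python
# Sketch: two pipeline figures a report should contain. Field names and
# sample rows are hypothetical, not a real analytics export format.
from collections import Counter


def organic_share(leads):
    """Fraction of MQLs in the period whose source is organic."""
    if not leads:
        return 0.0
    return sum(1 for lead in leads if lead["source"] == "organic") / len(leads)


def organic_demos_by_page(leads):
    """Count demo requests from organic leads, grouped by landing page."""
    return Counter(
        lead["landing_page"]
        for lead in leads
        if lead["source"] == "organic" and lead["demo_requested"]
    )


mqls = [
    {"source": "organic", "landing_page": "/rivalco-alternatives", "demo_requested": True},
    {"source": "organic", "landing_page": "/blog/what-is-video-hosting", "demo_requested": False},
    {"source": "paid", "landing_page": "/pricing", "demo_requested": True},
    {"source": "referral", "landing_page": "/", "demo_requested": False},
]
print(organic_share(mqls))          # 0.5
print(organic_demos_by_page(mqls))  # Counter({'/rivalco-alternatives': 1})
```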
Point 21: Do They Track AI Citation Rate as a Metric? (PROOF REQUIRED)
If an agency is not measuring AI share of voice, they are not managing it. A credible AI search program requires a documented method for tracking how often your brand appears in AI-generated answers for buying-intent queries, how that rate changes month over month, and how your citation frequency compares against named competitors.
Ask what tool or methodology they use for AI citation tracking. Ask to see an example of what that tracking looks like in a client report.
DerivateX’s AI Visibility Checker and the methodology behind the 2026 AI Visibility Benchmark Report show what rigorous, platform-level citation tracking looks like across ChatGPT, Perplexity, Claude, and Gemini.
Any agency claiming a GEO practice should be able to show something comparable. If they report on AI search performance using organic traffic as a proxy, they are not tracking AI citations. They are guessing.
- Pass: Documented AI citation tracking methodology. Example report showing platform-level citation data.
- Fail: AI search performance reported via organic traffic trends. No platform-level citation monitoring.
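To show what platform-level citation data means at its simplest, the sketch below computes a brand's citation rate per platform from a hand-collected log of AI answers. It is an illustration under stated assumptions, not any vendor's methodology. The log entries, prompts, and brand names are hypothetical, and no AI API is called.

```python
# Sketch: brand citation rate per AI platform from a hand-collected log.
# Log entries, prompts, and brand names are hypothetical; no AI API is called.
from collections import defaultdict


def citation_rates(log, brand):
    """Return {platform: share of logged answers that mention the brand}."""
    seen = defaultdict(int)
    cited = defaultdict(int)
    for entry in log:
        seen[entry["platform"]] += 1
        if brand.lower() in entry["answer"].lower():
            cited[entry["platform"]] += 1
    return {platform: cited[platform] / seen[platform] for platform in seen}


log = [
    {"platform": "ChatGPT", "prompt": "best video hosting for SaaS",
     "answer": "Options include ExampleCo and RivalCo."},
    {"platform": "ChatGPT", "prompt": "video hosting alternatives",
     "answer": "RivalCo is a common choice."},
    {"platform": "Perplexity", "prompt": "best video hosting for SaaS",
     "answer": "ExampleCo is frequently recommended."},
]
ours = citation_rates(log, "ExampleCo")
theirs = citation_rates(log, "RivalCo")
print(ours)    # {'ChatGPT': 0.5, 'Perplexity': 1.0}
print(theirs)  # {'ChatGPT': 1.0, 'Perplexity': 0.0}
```

Running the same fixed prompt set each month and comparing the two dictionaries gives the month-over-month and competitor views the section asks for.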
Point 22: Will You Own All Your Data? (PROOF REQUIRED)
This is a contractual matter as much as a reporting one. Get the answer in writing before signing. You should hold Owner-level access to Google Search Console, Google Analytics, and any third-party SEO tools set up in your name during the engagement. You should leave the engagement with all content, keyword research documents, strategy files, and editorial relationships intact.
Agencies that position themselves as the sole owner of your analytics accounts are creating leverage. They are betting that the friction of transitioning data access will keep you in an engagement even after the relationship stops producing.
Ask directly: “If we end the engagement today, what do I walk away with, and how long does access transition take?” The answer should be: “Everything, immediately.” Any hesitation on this question deserves follow-up in writing.
- Pass: Owner-level analytics access confirmed. Asset ownership clause in contract covers content, strategy documents, and keyword research.
- Fail: Agency owns analytics access, or asset ownership terms are vague in the contract.
6. SaaS Understanding (Points 23 to 25)
The final section tests whether the agency understands how SaaS businesses actually grow. Every agency that targets SaaS clients puts “SaaS SEO” in their headline. Very few of them understand the difference between a PLG and sales-led motion, or can explain why the multi-stakeholder buying committee changes content strategy in a meaningful way.
The questions in this section are designed to separate the category label from the category knowledge.
Point 23: Do They Speak Fluently About SaaS Metrics Beyond Traffic?
An agency that reports “traffic up 40% this quarter” without connecting that number to CAC, trial-to-paid conversion, NRR, or pipeline contribution does not understand how SaaS companies measure growth.
You are not hiring an agency to increase traffic. You are hiring them to reduce your organic CAC and compound the contribution of the organic channel to revenue.
Ask them: “If you had to explain to our CFO why SEO is worth the retainer, what would you say?” A credible answer builds from organic traffic to qualified demos to pipeline contribution to CAC reduction to LTV multiplied over the cohort.
An answer that stays at the traffic and ranking level, even if delivered confidently, tells you the agency has learned the SaaS vocabulary without internalizing the underlying business logic.
- Pass: Unprompted use of CAC, trial-to-paid conversion, pipeline attribution, or NRR in strategic discussion.
- Fail: Success defined in terms of rankings, sessions, or Domain Authority (DA).
Point 24: Do They Understand the PLG vs. Sales-Led Content Distinction?
A product-led growth company and a sales-led SaaS company require different keyword strategies, different content formats, and different conversion architectures, even if they are in the same product category.
A PLG company needs content that converts to free signups and demonstrates product value quickly. A sales-led company needs content that generates demo requests and supports a multi-stakeholder evaluation process.
Ask them to describe how their content strategy would differ for a PLG company versus a sales-led company at the same ARR stage in the same category. The answer should immediately distinguish between signup-driving top-of-funnel content with low-friction CTAs (PLG) and demo-driving bottom-of-funnel content that addresses the evaluation committee (sales-led).
If the answer is generic, the agency is applying one content framework to all SaaS clients and calling it specialization.
- Pass: Clear, specific articulation of how buying motion changes keyword prioritization, content format, and CTA architecture.
- Fail: Generic answer about “understanding the target audience” with no motion-specific distinction.
Point 25: Can They Show You Their Own AI Visibility? (PROOF REQUIRED)
This is the fastest and most reliable test on the entire checklist. Ask the agency, during the live call, to open ChatGPT or Perplexity and search for “best SEO agency for B2B SaaS.” Then ask them to search for “[their agency name] SEO results.”
An agency that practices what it sells will appear in at least one AI-generated answer for a relevant commercial query. Their own site is the first proof of concept for their methodology.
If they do not appear, ask them to explain why. No explanation fixes the result. An agency that cannot achieve AI citation visibility for their OWN brand in their OWN category has a theoretical GEO practice, not an operational one. You are not evaluating their theory. You are evaluating their execution.
A credible SaaS SEO agency built for AI search visibility optimizes its own entity with the same rigor it brings to client accounts.
- Pass: Agency brand appears in AI-generated answers for relevant commercial queries. Live demonstration during the call.
- Fail: No AI citation for the agency’s own brand and category. Any explanation offered in place of demonstration.
The Scoring Table: Compare Up to Three Agencies Side by Side
Score each point 0, 1, or 2. Sum each column. The agency with the highest score is your strongest candidate on objective criteria.
| Point | Dimension | Agency A | Agency B | Agency C |
|---|---|---|---|---|
| 1 | Asks about funnel before keywords | | | |
| 2 | Differentiates strategy across funnel stages | | | |
| 3 | Documented AI search methodology (PROOF) | | | |
| 4 | Separates SEO and GEO in SOW | | | |
| 5 | Keyword strategy is buyer-intent-first | | | |
| 6 | Named team with defined account loads (PROOF) | | | |
| 7 | Documented content production process (PROOF) | | | |
| 8 | Understands marketing site vs. app divide | | | |
| 9 | Technical audit included in retainer (PROOF) | | | |
| 10 | Has a framework for AI hallucination correction | | | |
| 11 | Writes for dual intent with live AI citation (PROOF) | | | |
| 12 | Content is specific enough to be citable | | | |
| 13 | Produces SaaS-specific BOFU content formats | | | |
| 14 | Written editorial standards exist (PROOF) | | | |
| 15 | Content strategy adapts by ARR stage | | | |
| 16 | Link acquisition is editorial and named (PROOF) | | | |
| 17 | Distinguishes link value for Google vs. AI citation | | | |
| 18 | Multi-factor domain vetting process | | | |
| 19 | Transparent about prohibited tactics | | | |
| 20 | Reporting ties organic to pipeline (PROOF) | | | |
| 21 | Tracks AI citation rate as a metric (PROOF) | | | |
| 22 | You own all analytics access and assets (PROOF) | | | |
| 23 | Fluent in SaaS metrics beyond traffic | | | |
| 24 | Understands PLG vs. sales-led content distinction | | | |
| 25 | Can demonstrate own AI visibility live (PROOF) | | | |
| TOTAL | | /50 | /50 | /50 |
A score difference of 10 or more points between two agencies is significant. Look at which sections drove the gap. If one agency scores 8/10 on Strategy and 2/10 on AI Visibility, and another scores 5/10 on Strategy and 9/10 on AI Visibility, the right choice depends on your current state.
A company with no AI search presence should weight the AI Visibility section more heavily. A company with strong existing content and no pipeline attribution should weight the Reporting and Strategy sections.
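If you keep the scores in a spreadsheet or script, the comparison described above reduces to a few small functions. The Python below is a minimal sketch, not part of the checklist itself. The agency scores are placeholder data, the section mapping and weights are ones you would supply yourself, and the only section boundary taken from this article is SaaS Understanding (Points 23 to 25).

```python
from typing import Dict, List

NUM_POINTS = 25
SIGNIFICANT_GAP = 10  # a gap of 10 or more points is treated as significant


def total_score(scores: List[int]) -> int:
    """Sum 25 per-point scores, each 0, 1, or 2 (maximum 50)."""
    if len(scores) != NUM_POINTS:
        raise ValueError(f"expected {NUM_POINTS} scores, got {len(scores)}")
    if any(s not in (0, 1, 2) for s in scores):
        raise ValueError("each score must be 0, 1, or 2")
    return sum(scores)


def section_scores(scores: List[int], sections: Dict[str, range]) -> Dict[str, int]:
    """Subtotal per section; `sections` maps a name to 1-based point numbers."""
    return {name: sum(scores[p - 1] for p in points)
            for name, points in sections.items()}


def weighted_total(subtotals: Dict[str, int], weights: Dict[str, float]) -> float:
    """Apply your own section weights; unlisted sections default to 1.0."""
    return sum(value * weights.get(name, 1.0) for name, value in subtotals.items())


def is_significant_gap(total_a: int, total_b: int) -> bool:
    return abs(total_a - total_b) >= SIGNIFICANT_GAP


# Placeholder scores for two hypothetical agencies.
agency_a = [2] * 22 + [0] * 3
agency_b = [1] * 22 + [2] * 3

# Hypothetical grouping: everything before Point 23 lumped together.
sections = {"Sections 1-5": range(1, 23), "SaaS Understanding": range(23, 26)}
weights = {"SaaS Understanding": 2.0}  # example: double-weight one section

total_a, total_b = total_score(agency_a), total_score(agency_b)
print(total_a, total_b, is_significant_gap(total_a, total_b))
print(weighted_total(section_scores(agency_a, sections), weights))
print(weighted_total(section_scores(agency_b, sections), weights))
```

Weighting changes the ranking only when the gap is concentrated in the sections you care about, so look at the section subtotals before you look at the weighted totals.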
10 Questions to Ask on the Discovery Call
Copy these directly into your notes before each call. The “what a strong answer looks like” note is there so you can score the response immediately after the conversation, not after you have been charmed by a deck.
- “Walk me through how a piece of content you produce gets cited in ChatGPT or Perplexity.” Strong answer: specific structural decisions described with a live example ready to show.
- “Who will actually manage my account week-to-week, and how many other accounts do they currently manage?” Strong answer: a named person with a specific account count under 12.
- “Can I see the content brief template you use for B2B SaaS clients?” Strong answer: a document shared on the spot or within 24 hours of the call.
- “Show me a client whose brand now appears in an AI response for a buying-intent query.” Strong answer: opens the tool live during the call and demonstrates it.
- “What does your month-one deliverable list actually look like?” Strong answer: keyword strategy finalized, content calendar built, and first piece in progress before month one ends.
- “What would you not do for a company at our ARR stage, and why?” Strong answer: immediately names irrelevant services and explains the reasoning.
- “Show me a sample monthly report from a comparable client.” Strong answer: report includes pipeline attribution data, not just rankings and sessions.
- “What publications have you placed editorial links on for B2B SaaS clients in our category?” Strong answer: names three to five specific publications within two minutes.
- “If we end the engagement, what happens to our Google Search Console access, our content, and our keyword research?” Strong answer: “You own everything. Transition happens immediately.”
- “Why would a company NOT hire you?” Strong answer: a specific, honest answer about the types of clients they are not the right fit for. An agency that positions itself as right for everyone is right for no one.
The 10-Minute Test You Should Run Before Every Discovery Call
Before you talk to a single agency, run this test. It takes ten minutes and produces information that makes every subsequent conversation more specific.
Open ChatGPT and Perplexity in separate private browser windows. Search for “best SEO agencies for B2B SaaS.” Note which agencies appear. Then search for each shortlisted agency by name: “[Agency Name] SEO results for SaaS.” Note whether their own brand appears in AI-generated answers.
An agency that appears in AI answers for their own category is demonstrating their methodology on their own account.
An agency that does not appear has either not prioritized their own AI visibility (which tells you something about how they will prioritize yours), or does not yet have the methodology to produce that result.
Run the same test for your own company. Search: “best [your product category] for [your primary use case].” Then search: “top [your category] tools for [your buyer type].” Note which competitors appear and which positions they hold.
This is your AI visibility baseline. You can bring it to every agency conversation and ask them specifically: “Given this gap, what would your program do to close it in 90 days?”
If you want a formalized version of this baseline with scores across ChatGPT, Perplexity, Claude, and Gemini, the free AI Visibility Audit produces exactly that in about ten minutes.
It gives you an AI Presence Score, a breakdown of which queries surface your brand, and which competitors are winning citation share in your category. That number is the single most useful thing you can bring into an agency selection process.
FAQ
1. What is a SaaS SEO agency checklist and why do I need one instead of just asking questions?
A SaaS SEO agency checklist is a scored, pass/fail evaluation rubric for comparing agencies on identical criteria before signing. Discovery call questions alone are unreliable because agencies optimize for the sales process.
A checklist that requires proof for key points, not just verbal answers, removes the advantage a polished pitch gives to an agency with weak execution. It also produces a number you can compare across shortlisted agencies, which makes a subjective decision as objective as possible.
Without a structured framework, the agency that presents best wins. With one, the agency that performs best wins.
2. How do I know if an SEO agency actually understands B2B SaaS versus just claiming to?
Ask for case studies at your specific ARR stage, not “SaaS companies” as a category. Ask them to explain the difference between a PLG keyword strategy and a sales-led keyword strategy.
Ask whether they have worked with a company in your specific buying motion. Genuine SaaS specialists can explain how multi-stakeholder buying committees change content strategy, how comparison pages and alternative pages convert differently from blog posts, and why commercial-intent keywords outperform informational ones for pipeline even when search volume is lower.
Generic answers to any of these questions indicate a generalist with a SaaS label.
3. What score on the 25-point checklist means I should sign?
A score of 40 to 50 indicates a strong shortlist candidate. A score of 30 to 39 means proceed with caution and identify which specific sections pulled the score down before deciding. Scores below 30 are a clear signal to keep looking.
One additional rule: a score of 0 on Point 3 (documented AI search methodology) or Point 25 (live AI citation demonstration) is a disqualifier regardless of the total score, because those two points test the capability gap most likely to leave your AI search channel unaddressed.
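The thresholds and the disqualifier rule can be written as a single decision function. This is a minimal sketch in Python: the function name and verdict labels are illustrative, and it assumes each of the 25 points has already been scored 0, 1, or 2.

```python
from typing import List

DISQUALIFYING_POINTS = (3, 25)  # Point 3 and Point 25, per the rule above


def verdict(scores: List[int]) -> str:
    """Map 25 per-point scores (0, 1, or 2) to a hiring verdict."""
    if len(scores) != 25:
        raise ValueError(f"expected 25 scores, got {len(scores)}")
    # The disqualifier is checked first, so it overrides any total.
    for point in DISQUALIFYING_POINTS:
        if scores[point - 1] == 0:
            return "disqualified"
    total = sum(scores)
    if total >= 40:
        return "strong candidate"
    if total >= 30:
        return "proceed with caution"
    return "keep looking"


# Placeholder example: a high total that is still disqualified on Point 25.
example = [2] * 24 + [0]
print(sum(example), verdict(example))  # 48 disqualified
```

Checking the disqualifier before the total is deliberate: an agency can reach 48/50 and still fail on the one capability the rule exists to protect.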
4. What is the difference between a GEO agency and an SEO agency that says it also does GEO?
A GEO agency has a documented methodology for AI citation engineering, tracks citation rate as a primary KPI, and can demonstrate live examples of clients cited in AI-generated answers for commercial queries.
An SEO agency that says it also does GEO typically updated its service page to include GEO language while continuing to deliver the same keyword research, blog post production, and link building it always has.
The fastest way to tell the difference: ask them to open ChatGPT during the call and show you a client citation. An agency with a real GEO practice can do this in two minutes. An agency with a label cannot.
5. How long should the first contract with a B2B SaaS SEO agency be?
A 3-month minimum with 30-day notice for cancellation afterward is the right structure for a new agency relationship. It gives the agency enough runway to move past setup and into execution, while giving you the ability to exit if the engagement is clearly not working after 90 days.
A 12-month minimum with no performance clause on a new relationship is a significant financial risk. At $8,000 per month locked into 12 months, a poor-fit engagement costs $96,000 with no recourse. Always confirm asset ownership terms in the same conversation.
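The exposure math is worth running with your own numbers before you negotiate. A minimal sketch, assuming a flat monthly retainer and ignoring any setup fees or the extra month a 30-day notice period can add:

```python
def committed_cost(monthly_retainer: int, minimum_months: int) -> int:
    """Total spend you are locked into before you can exit."""
    return monthly_retainer * minimum_months


print(committed_cost(8_000, 12))  # 96000
print(committed_cost(8_000, 3))   # 24000
```

At the same $8,000 retainer, the 3-month structure caps your minimum commitment at a quarter of the 12-month figure.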
6. Is it worth running this checklist if the agency comes highly recommended?
YES. A referral tells you the agency produced results for the referring company. It does not tell you the agency can produce the same results for your specific buying motion, your ARR stage, and your AI search visibility gap. The checklist is most valuable precisely when an agency arrives with strong social proof, because social proof is the condition most likely to make you skip verification.
A referred agency that scores 45/50 on the checklist is an outstanding candidate. A referred agency that scores 22/50 is a referral you should not follow.
7. What if no agency I evaluate scores above 40?
That outcome is more common than it seems, and it tells you something specific: you are likely evaluating the wrong agencies.
Expand your search, specifically looking for agencies that publish original research or frameworks on AI search visibility, can demonstrate their own brand in AI answers, and have case studies with pipeline attribution data, not just traffic graphs. Agencies with all three are rare, but they exist.
The B2B SaaS SEO agency hiring guide covers how to build a more targeted shortlist before you run the evaluation process.
The Checklist Removes Gut Feel. What You Do With the Score Is Still Your Call.
The hardest part of hiring an SEO agency is not a lack of information. Most discovery calls give you plenty of information. The hard part is that most of the information is presented by people who are skilled at presenting, and the presentation is designed to make you feel confident before you have actually verified anything.
This checklist is not a guarantee of a perfect hire. It is a filter that removes the advantage the best presenter has over the best operator. An agency that scores 46/50 has demonstrated, with proof, that their strategy is buyer-intent-first, their execution team is named and accountable, their content process is documented, their link acquisition is editorial, their reporting connects to pipeline, and their AI search methodology is real and demonstrable.
That is a materially different basis for a hiring decision than “they seemed sharp and their case studies looked good.”
Run the free AI Visibility Audit before your first call. Know your AI Presence Score and your current citation gap. Then bring this checklist to every conversation. The agency that scores highest on objective criteria and can close the specific gap your audit reveals is the right hire. Everything else is noise.