27 SaaS SEO Agency Red Flags (Checklist to Use Before You Sign)
The 27 patterns that predict a failed engagement, organized by stage, plus a checklist you can run live on your next agency call.
TL;DR
- Most SaaS companies don’t lose money to obviously bad agencies. They lose it to agencies that looked credible at the pitch stage, asked good questions, and then delivered the wrong things for six to twelve months.
- The highest-risk red flag in 2026 is an agency claiming AI search or GEO expertise with no verifiable citations for their own brand, no named metric for tracking AI visibility, and no methodology beyond “we structure content well.”
- Vanity-metric reporting (traffic, impressions, and keyword rankings with no connection to pipeline or revenue) is not a measurement preference. It’s a misaligned incentive structure baked in from day one.
- A long contract without defined performance benchmarks protects the agency’s revenue. A shorter one with clear accountability protects yours.
- The senior-to-junior bait-and-switch is industry-standard practice at many agencies. One specific question in this checklist surfaces it before you sign.
- These 27 flags are organized by when they appear: before the pitch, during the pitch, in the proposal, in the contract, and in the first 90 days. Use the checklist section as a live vetting tool on your next evaluation call.

The short version: the biggest red flags when hiring a SaaS SEO or GEO agency in 2026 are claiming AI search expertise with no verifiable citations for their own brand, reporting traffic and rankings with no link to pipeline, guaranteeing rankings, the senior-to-junior bait-and-switch, a long contract with no performance milestones or exit clause, and no named metric for tracking AI visibility.
You already know something is off. You’ve sat through enough agency pitches to recognize the pattern: polished decks, case studies from adjacent industries, traffic charts pointing up and to the right, and a confident claim that they’ve cracked whatever channel you’re worried about this quarter.
The problem is that good agencies and bad ones use nearly identical language. Both say “SaaS-focused.” Both have client logos. Both now say “GEO” and “AI search.” The difference only becomes visible when you know the right questions to ask, and most founders don’t figure out those questions until they’re already mid-contract.
The real cost of signing the wrong agency isn’t the retainer. It’s the compounding loss: six to twelve months of content built around the wrong keywords, links pointing at pages that don’t convert, and a monthly report full of impressions data while your competitors are showing up in ChatGPT answers for the exact queries your buyers use. By the time the damage is visible, you’ve already paid for it.
If you want the full hiring framework instead (the pricing tiers that map to real execution, a month-by-month timeline of what a good engagement looks like, and the questions to ask on the call), that lives in our guide to hiring a B2B SaaS SEO agency. This piece is the narrower tool: the flags themselves, and a checklist you can run live on your next call.
This checklist was built from patterns that repeat. The 27 flags below are organized by when they surface: before you take the first call, during the pitch, in the proposal, in the contract, and in the first 90 days of an engagement. You don’t need to see all 27 to walk away. You need to know which ones actually predict failure. Start here.
Who Is This For
This checklist is written for SaaS founders and marketing leaders who are actively evaluating an SEO, GEO, or AI search agency, not for companies that are still deciding whether organic is worth the investment.
If you have an existing content program, some organic baseline, and a retainer budget somewhere between $3,000 and $20,000 a month, these flags are calibrated for your situation.
If you are pre-product-market fit or hiring your first SEO resource, some of this applies, but the sequencing will look different. The 27 flags assume you already know organic matters. The question this checklist answers is whether the agency you are talking to can actually deliver it.
Why Standard Agency Vetting Fails
The standard vetting process (checking Clutch reviews, scanning case studies, and comparing pricing decks) mostly surfaces information that agencies control. It tells you almost nothing about whether the agency can do what you need.
Organic click-through rates drop 61% on queries where Google AI Overviews appear, according to Seer Interactive’s September 2025 analysis of 25 million impressions across 42 organizations. An agency that isn’t accounting for that shift isn’t describing the right problem.

Case studies show outputs, not inputs or outcomes. A chart showing 40% organic traffic growth over 12 months tells you traffic went up. It tells you nothing about what the agency actually did to produce it, whether the traffic converted into anything, or whether the same approach applies to your business at your stage. Reviews on third-party platforms skew toward clients who had experiences good enough to bother reviewing. The ones who churned quietly aren’t there.
The most useful diagnostic in any agency evaluation is this: ask them what they would NOT do for your business. A genuine SaaS specialist will immediately name things that are irrelevant to your situation, such as local SEO, social media management, paid search, and B2C content strategies. If they refuse to exclude anything and frame every capability as potentially useful, they are a generalist with a SaaS label on their website.
The 27 flags below are not things to observe passively. They are things to expose deliberately, with specific questions and tests you can run on the call, in the proposal review, and in the contract. Here’s where to start.
Flags That Appear Before the Pitch Starts
You can run three meaningful tests before you take a single call. Each one costs about five minutes.

Flag 1: They Can’t Be Found in AI Search for Their Own Category
Open ChatGPT or Perplexity and ask: “What are the best GEO agencies for B2B SaaS?” or “Which SEO agencies specialize in AI search visibility for SaaS companies?” If the agency you’re evaluating doesn’t appear, that’s the first question to bring to the call. Their explanation will tell you more than their website will. An agency selling AI search visibility that hasn’t achieved it for their own brand is selling a capability they haven’t proven.
Flag 2: Their Case Studies Show Traffic, Not Pipeline
Scan their case studies before the call. Look for one number that connects organic performance to business outcomes: revenue attributable to organic, qualified pipeline generated, demo bookings from content, or customer acquisition cost reduction.
If every case study stops at traffic or rankings, that’s what they’re measuring and what they’ll optimize for on your engagement. Gumlet, one of the benchmark cases for what GEO-connected attribution looks like, attributes 20% of monthly inbound revenue directly to ChatGPT and Perplexity referrals. That’s the level of specificity that real attribution produces.
Flag 3: Their Own Website Is Technically Inconsistent
If their blog posts have no author attribution, their content pages lack FAQ schema, their meta titles are truncated in search results, or their internal linking is shallow, they are not applying their methodology to their own property. That tells you one of two things: the methodology is superficial, or they don’t believe it’s worth doing consistently. Either one is a problem.
Red Flags During the Pitch and Discovery Call

This section carries the most weight. The pitch call is the only moment in the relationship where you have full information leverage. Use it.
Flag 4: They Pitch Deliverables Before Diagnosing Your Situation
A proposal that opens with agency history, award logos, and a fixed deliverable count (“12 blog posts and 4 links per month”) before asking about your ARR stage, your current organic baseline, your buyer journey, or your existing content infrastructure is not a strategy proposal.
It is a production quote. Deliverables before diagnosis are the root failure from which most other pitch red flags follow.
The test: Ask them what they would recommend differently for a company at a different ARR stage or in a different category. An agency running a production line cannot answer that with any specificity. One that has actually diagnosed your situation can answer it immediately.
Flag 5: They Guarantee Rankings
Any agency that guarantees specific keyword rankings is either using tactics that create long-term penalty risk or is making a promise they have no mechanism to keep.
This applies equally to guarantees framed around AI Overview placement, featured snippet capture, or citation frequency in LLMs. No one controls where Google or ChatGPT places a given source. Agencies that make this promise are selling confidence instead of competence.
Flag 6: The Person Pitching Is Not the Person Executing
Ask directly: “Who specifically will be working on our account day-to-day, and can I speak with that person before we sign?” Get a name. Ask about their tenure at the agency and their experience with SaaS companies at your stage.
The senior strategist on the call and the 24-year-old account manager who sends your monthly report are often two different people. This is industry-standard practice at many mid-size agencies, and one direct question surfaces it before you’ve committed.
Flag 7: They Talk About SEO Without Asking About Your Sales Cycle
SaaS buying cycles range from 30 to 90-plus days and involve multiple stakeholders across different roles. An agency that discusses organic traffic without asking about your CRM, your typical deal velocity, your ICP’s evaluation behavior, or your current cost-per-SQL is planning to optimize for the wrong layer. Organic sessions that don’t convert to qualified pipeline are not a meaningful result for a B2B SaaS company.
Flag 8: They Can’t Explain How AI Models Retrieve and Cite Information
Ask this question directly: “Walk me through how ChatGPT or Perplexity decides what to surface when a user asks a question your client should answer.”
A substantive answer covers training data and retrieval patterns, content structure for extractability, entity signals (entity authority and co-entity associations), and how third-party corroboration across sources increases citation likelihood. An answer that only mentions E-E-A-T or “high-quality content” is describing the problem at the wrong level of resolution.
The test: Follow up with “Which of those factors do you actively engineer for, and what does that look like in the work?” A real answer names specific content decisions, structural choices, or distribution actions. A vague answer circles back to quality and relevance.
Flag 9: Their GEO Methodology Is a Single Sentence
A real GEO audit covers topic-level citation checks, page structure and extractability review, technical accessibility for AI crawlers, entity consistency analysis, schema review, competitor source comparison, and third-party presence gaps.
If an agency describes their GEO work as “structuring content for AI” or “optimizing for featured snippets,” they are describing the 2022 version of a 2026 problem. Ask them to walk you through the outputs of a GEO audit they’ve run for a current client. If they can’t, the capability is a feature on a services page, not an operational practice.
The test: Ask for a redacted GEO audit output from a current client. Not a description of what the audit covers, but the actual document. An agency that has run real GEO audits has one ready. An agency that hasn’t will offer to walk you through their process instead, which is not the same thing.
Flag 10: They Claim GEO Expertise But Have No Tracking for It
AI visibility requires a measurement framework, not just a delivery framework.
- What do they call their AI visibility metric?
- What does a monthly AI visibility report look like for one of their current clients?
- Which AI platforms do they track citations across, and how?
If they don’t have a named metric or a defined reporting output for AI search performance, the capability doesn’t exist in any accountable form. Talking about AI search and measuring it are two different things. DerivateX tracks this with the AI Visibility Score, a 0 to 100 metric running 20 buyer-intent prompts across ChatGPT, Perplexity, Claude, and Gemini three times a week.
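To make "a named metric" concrete, here is a minimal sketch of one way a 0 to 100 visibility number could be computed from recorded prompt runs. It is not the formula behind any vendor's score, which this article does not specify. The `PromptRun` record, the platform labels, and the scoring rule (a plain mention rate with every run weighted equally) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptRun:
    """One recorded answer from one AI platform for one buyer-intent prompt."""
    platform: str       # illustrative label, e.g. "chatgpt" or "perplexity"
    prompt: str
    answer_text: str


def mention_rate_score(runs: list[PromptRun], brand: str) -> float:
    """Share of runs whose answer mentions the brand, scaled to 0-100.

    Hypothetical rule: case-insensitive substring match, equal weight per run.
    """
    if not runs:
        return 0.0
    needle = brand.lower()
    hits = sum(1 for run in runs if needle in run.answer_text.lower())
    return round(100 * hits / len(runs), 1)


def score_by_platform(runs: list[PromptRun], brand: str) -> dict[str, float]:
    """Break the same score down per platform so gaps are visible."""
    grouped: dict[str, list[PromptRun]] = {}
    for run in runs:
        grouped.setdefault(run.platform, []).append(run)
    return {platform: mention_rate_score(group, brand)
            for platform, group in grouped.items()}
```

Whatever the real formula is, the questions to put to an agency are the same ones this sketch forces you to answer: which prompts, which platforms, how often, and what counts as a mention.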
Flag 11: They Use “Thought Leadership” as a GEO Strategy Without Explaining Citation Mechanics
Vague “thought leadership content” is not a GEO strategy. For AI models to cite a brand consistently, the brand needs stable entity associations, third-party corroboration across sources, and content structured so claims are attributable at the passage level, not just the article level.
An agency that says “we’ll build your thought leadership” without explaining the specific content and distribution mechanics that produce citations doesn’t understand why thought leadership produces citations in the first place.
The test: Ask them to name one client citation they produced and walk you through the specific cause: which content change, which entity signal, which third-party placement. An agency that engineers citations can trace one. An agency doing “thought leadership” by vibes will point at the content and call it well-written.
Flag 12: They Can’t Show You a Competitor AI Citation Analysis Before the Engagement
A competent GEO agency should be able to show you, before you sign anything, which competitors are being cited in AI answers for the queries your buyers use and why. This is diagnostic work, not deliverable work. If they haven’t run this analysis before pitching you, they either lack the tooling or don’t consider it part of the evaluation process. Both are problems.
Flag 13: They’ve Never Heard of llms.txt
A well-structured llms.txt file is designed to guide AI crawlers toward your highest-value pages, away from thin or duplicate content, and signal the topical hierarchy of your site. Adoption across AI crawlers is still inconsistent, and not every model honors it the way robots.txt is honored.
That is exactly why an agency should have a position on it: whether to implement, how to structure it, and what tradeoffs to expect. That conversation reveals a lot. An agency that hasn’t formed a position is not tracking the space closely enough to advise you on it.
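For a sense of what you would be discussing, the sketch below generates a small llms.txt file. It assumes the layout described in the public llms.txt proposal: an H1 title, a blockquote summary, then H2 sections containing Markdown link lists. The site name, URLs, and section names are placeholders, and the `build_llms_txt` helper is hypothetical, not part of any tool.

```python
def build_llms_txt(site_name, summary, sections):
    """Render an llms.txt document as a string.

    `sections` maps a heading to a list of (title, url, note) tuples.
    The layout follows the public llms.txt proposal as understood here;
    treat it as an assumption and check the current proposal before shipping.
    """
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            entry = f"- [{title}]({url})"
            if note:
                entry += f": {note}"
            lines.append(entry)
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Placeholder content: list the pages you most want AI systems to read.
example = build_llms_txt(
    "Example SaaS",
    "Video hosting and optimization for product teams.",
    {
        "Docs": [("Pricing", "https://example.com/pricing", "Plans and limits")],
        "Optional": [("Changelog", "https://example.com/changelog", "")],
    },
)
```

The useful part of the agency conversation is not the file itself but the choices behind it: which pages make the list, which are left out, and how they would verify whether any AI crawler actually reads it.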
Red Flags in the Proposal
Flag 14: The Proposal Is a Template
Pre-packaged proposals with fixed deliverable counts are the operational signature of an agency running a production line rather than a strategy practice. A proposal that doesn’t reference your current domain authority, your existing content inventory, your competitive keyword gaps, or your buyer stage has not been written for you. It has been written for the last client and lightly edited.
The question to ask: “What in this proposal is specific to our situation that you would not recommend to a company in a different category or at a different stage?”
Flag 15: The Proposed KPIs Are Vanity Metrics
If the success metrics in the proposal are impressions, sessions, and keyword rankings, those are the metrics the agency will optimize for throughout the engagement.
Ask whether there is a line in the proposal connecting organic performance to revenue, SQLs, or demo bookings. If there isn’t, ask them to add one before you sign. An agency that resists outcome-level accountability before the engagement starts will resist it even more at month six.
Flag 16: There Is No AI Search Component in the Scope
If GEO, AI visibility, or LLM citation is not explicitly defined in the scope of work, it will not be done. For B2B SaaS buyer queries, AI-generated answers from ChatGPT, Perplexity, and Gemini now appear in a substantial share of relevant searches. This is the layer sometimes called answer engine optimization, or AEO, and it is now part of the same job as SEO, not a separate add-on.
An agency optimizing only for Google’s ten blue links is managing roughly half of your buyer discovery surface and charging you for the full thing. Only 12% of URLs cited by ChatGPT appear in Google’s top 10 results for the same query, meaning ranking well on Google and being cited in AI answers are almost entirely separate problems.
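You can check how separate the two problems are for your own queries. The sketch below computes what share of AI-cited URLs also appear in the Google top 10 for the same query. It assumes you have already collected both lists by hand or with a tool of your choice; the function names, the input shape, and the URL normalization rule are illustrative assumptions.

```python
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Reduce a full URL (with scheme) to host + path for loose matching."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower().removeprefix("www.")
    return f"{host}{parts.path.rstrip('/')}"


def cited_in_top10_share(cited_by_query, top10_by_query) -> float:
    """Percent of AI-cited URLs that also rank in the top 10 for the same query.

    Both arguments map a query string to a list of URLs (assumed input shape).
    """
    total = 0
    overlap = 0
    for query, cited in cited_by_query.items():
        top10 = {normalize(u) for u in top10_by_query.get(query, [])[:10]}
        for url in {normalize(u) for u in cited}:
            total += 1
            overlap += url in top10
    return 100 * overlap / total if total else 0.0
```

A low overlap on your own buyer queries is a practical argument for putting AI citation work in the scope explicitly rather than assuming rankings will carry it.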
Flag 17: The Contract Has No Performance Accountability Clause
What a good SEO agency contract looks like is straightforward: defined scope, outcome-tied KPIs, 90-day performance milestones, an exit clause, and full data ownership on your side. What most agency contracts actually look like is the inverse of that. A long contract without defined performance milestones or an exit clause tied to outcomes protects one party: the agency.
A fair contract structure defines what leading indicators look like at 90 days, what outcomes are expected by six months, and what recourse you have if neither materializes. Duration is not automatically a red flag. Duration without accountability is.
Flag 18: The Contract Retains Ownership of Your Data or Content
Some agency contracts, particularly template contracts that haven’t been reviewed in several years, include clauses that transfer partial ownership of content, analytics accounts, or tracking configurations to the agency.
Ask explicitly: “Who owns the content, the Google Analytics account, the Search Console access, and any custom reporting configurations if we end the engagement?” If the answer is anything other than “you do, fully and immediately,” get that clause removed before signing.
Red Flags in How They Report
Flag 19: Monthly Reports Show Activity, Not Accountability
There is a hard difference between “we published eight posts and built 14 links this month” and “here is how organic contributed to the pipeline this month.” The first is an activity log. The second is an accountability report. Ask to see an actual client report before you sign. The format of that report tells you what the agency believes it is responsible for.
Flag 20: They Don’t Have CRM Access or Ask for It
Attribution that stops at the analytics layer (sessions, form fills, conversions) doesn’t show what happened to the leads after they arrived. A serious growth-oriented agency asks about CRM access in the first conversation. If they’re three months into pitching you and haven’t asked how your sales data connects to your marketing data, they’re not planning to measure it.
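As a rough illustration of what connecting the two layers means, the sketch below joins analytics leads to CRM deals and totals qualified pipeline by acquisition source. The field names (`lead_id`, `source`, `stage`, `amount`) and the stage labels are hypothetical; a real CRM export will use its own schema.

```python
from collections import defaultdict


def pipeline_by_source(leads, deals,
                       qualified_stages=("sql", "opportunity", "closed_won")):
    """Sum deal amounts per acquisition source for qualified deals only.

    `leads` comes from the analytics layer, `deals` from the CRM.
    All keys and stage names here are assumed for the example.
    """
    source_by_lead = {lead["lead_id"]: lead["source"] for lead in leads}
    totals = defaultdict(float)
    for deal in deals:
        if deal["stage"] not in qualified_stages:
            continue  # skip unqualified or lost deals
        source = source_by_lead.get(deal["lead_id"], "unknown")
        totals[source] += deal["amount"]
    return dict(totals)
```

Deals that land in "unknown" are the attribution gap. An agency with CRM access can work on shrinking it; one without access cannot see it at all.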
Flag 21: They Can’t Track AI Referral Traffic
AI-referred traffic from ChatGPT, Perplexity, and other LLMs can be isolated using UTM parameters and direct source filtering in Google Analytics. If an agency can’t show you what AI referral traffic looks like in a client dashboard, they’re not measuring the channel they’re claiming to improve. Ask to see a screenshot of a client’s analytics showing AI referral source data.
Flag 22: Every Flat Period Gets Blamed on Algorithm Updates
Algorithm volatility is real, and it does affect organic performance. An agency that attributes every flat or declining period to an external update, without showing you what they diagnosed, what they changed in response, and what the expected recovery timeline looks like, is not running a managed program. The question to ask when this happens: “What specifically did you change in the last 30 days in response to this, and what are you tracking to know if it worked?”
Red Flags in the First 90 Days
Flag 23: They Start Producing Content Before Completing a Baseline Audit
If the agency begins publishing content or building links before completing a documented technical and content audit, that is production theater: work that precedes diagnosis. A real engagement starts with a documented understanding of your current keyword coverage, technical health, content gaps, and AI citation baseline.
Content built without that foundation is frequently built around the wrong keywords, for the wrong audience, on a site with structural problems that undermine the content’s ability to rank or be cited.
The test: Ask them to walk you through what the first 30 days of an engagement look like before any content is published. If the answer includes content production in week one or two, the audit is either skipped or running in parallel with delivery. Neither is how a diagnostic-first agency operates.
Flag 24: The Keyword Strategy Doesn’t Reflect Your Buyer’s Evaluation Stage
High search-volume keywords that don’t match buyer intent generate traffic that doesn’t convert. A keyword strategy for a B2B SaaS company should be anchored in the queries your ICP uses when they are actively comparing solutions, not when they are learning about a problem category for the first time.
If the first content calendar reads like a top-of-funnel educational blog for an audience that has never heard of the problem you solve, and your buyers are mid-funnel evaluators comparing you against two or three named competitors, the audience match is wrong from the start.
Flag 25: They’re Only Building Owned Content, Not Third-Party Presence
For AI models to cite a brand reliably, the brand needs to appear consistently across third-party sources: industry review sites, comparison roundups, partner content, earned press coverage, and community platforms.
An agency focused exclusively on owned content is building one leg of what Citation Engineering (DerivateX’s framework for building third-party AI citation share) treats as a three-leg structure. AI citation requires corroboration, not just owned authority. If the first 90-day plan has no distribution or third-party presence component, ask where it is.
Flag 26: Your Competitors Are Gaining AI Citations While Yours Are Flat
By day 60 to 90, you should be able to see movement in your AI citation share for the queries your agency identified as targets at the start of the engagement. In our work with REsimpli, the company reached the top CRM recommendation in ChatGPT for real estate investors within 90 days of starting structured GEO work.
Movement at that pace is not universal, but movement in the right direction is expected. If citations are flat and the agency’s response is “these things take time,” ask them to connect specific content changes, entity updates, or third-party placements to the mechanism that’s supposed to drive citation. If they can’t, the work isn’t connected to the mechanism.
Flag 27: You Still Don’t Know What’s Being Measured at Day 90
By the end of the first quarter, you should know exactly what your agency is tracking, how those metrics connect to the outcomes you hired them for, and what the next 90 days are designed to achieve.
If the engagement is three months in and you’re still unclear on what success looks like at month six, that ambiguity is not an oversight. It’s a structural problem in the engagement. An agency that hasn’t made measurement explicit by day 90 hasn’t built accountability into the program at all.
The Before-You-Sign Checklist
Run this before you sign anything. These are the highest-signal verification tests from the 27 flags, organized by stage.
Their Own AI Visibility
- ⬜️ Ask ChatGPT or Perplexity which agencies do GEO or AI search for B2B SaaS. Does this agency appear?
- ⬜️ Check their published case studies. Do any of them connect organic performance to revenue, pipeline, or SQLs?
- ⬜️ Review their own website. Do they apply the technical and structural practices they claim to use for clients?
The Pitch and Proposal
- ⬜️ Ask: “What would you NOT recommend for our business?” Did they give a specific, confident answer?
- ⬜️ Ask who will be working on the account day-to-day. Did they give you a name and offer to introduce that person?
- ⬜️ Ask them to walk through how an AI model decides to cite a source. Was the answer specific and mechanistic, or vague?
- ⬜️ Ask to see an actual client GEO audit output. Did one exist?
- ⬜️ Confirm GEO and AI visibility are explicitly in the scope of work, not implied.
- ⬜️ Confirm that KPIs in the proposal include at least one outcome metric connected to pipeline or revenue.
- ⬜️ Ask for two client references you can actually call, including one that ended the engagement. The agency that only offers its happiest current client is curating what you hear.
The Contract
- ⬜️ Is there a performance accountability clause with defined milestones and an exit mechanism?
- ⬜️ Does the contract confirm you retain full ownership of all content, data, and account access if the engagement ends?
- ⬜️ Is the contract term under 12 months, or does it include a 90-day performance review gate?
Reporting and Measurement
- ⬜️ Ask to see an existing client monthly report. Does it include pipeline attribution or only traffic and rankings?
- ⬜️ Ask how they track AI referral traffic. Can they show you what that looks like in a client dashboard?
- ⬜️ Ask how they connect organic performance to CRM data. Do they have a process, or are they measuring at the analytics layer only?
First 90 Days
- ⬜️ Confirm a baseline audit precedes any content production or link building.
- ⬜️ Confirm the keyword strategy is reviewed against your buyer’s evaluation-stage queries, not just search volume.
- ⬜️ Confirm a third-party presence component is included in the 90-day plan.
Not sure where to start? Run a free AI Visibility Audit to see where your brand stands across ChatGPT, Perplexity, Claude, and Gemini before your next agency call.
Conclusion
The evaluation call is the only moment in the agency relationship where you have full information leverage. Once the contract is signed, that leverage shifts. The flags in this checklist are not designed to make you a difficult client. They are designed to protect a five- to six-figure investment over six to twelve months against the most predictable failure modes in the category.
The single most important thing this checklist is trying to surface is the gap between what an agency claims to do and what they can demonstrate they have done. This is the entire reason we built DerivateX the way we did.
In 2026, that gap is widest in LLM SEO. Every agency website now mentions GEO. Very few agencies can show you a citation they produced for a client, explain what produced it, and show you the metric they used to track it. That verification gap is where most SaaS companies sign the wrong contract.
You already knew something was off before you found this. Now you know what to do about it. Take the checklist into your next evaluation call.
FAQ
How do I know if an SEO agency actually understands AI search or just knows the terminology?
Ask them to show you where they or a current client appear in ChatGPT or Perplexity for a relevant query. Then ask them to explain what they did to produce that citation specifically: which content changes, which entity signals, which third-party placements. Appearing in an AI answer is not proof of expertise. Being able to reproduce and explain the citation is. If they can name the mechanism, they have a methodology. If they point to the content and say “it’s well-structured,” they’re guessing.
Is a long-term contract automatically a red flag?
No. SEO compounds over time, and a three- to six-month initial commitment is reasonable given realistic timelines for organic results. The red flag is not duration. It’s duration without accountability. Any contract longer than six months should include defined performance milestones at 90-day intervals and a written exit clause tied to those milestones. A confident agency earns renewal rather than locking it in before work starts.
What’s the single most important question to ask an SEO agency before signing?
Ask: “What would you NOT do for our business?” A genuine SaaS specialist will immediately name things that are irrelevant to your situation: local SEO, social media, paid search, and B2C content formats. If they refuse to exclude anything and frame every capability as potentially applicable, they are a generalist operating under a SaaS label. This question takes 30 seconds and surfaces more useful information than any case study review.
How quickly should I expect to see AI citation movement after starting with a GEO-focused agency?
Movement in AI citation share for targeted queries should be visible within 60 to 90 days of starting structured GEO work. This doesn’t mean category dominance in 90 days. It means detectable, directional progress on the specific queries identified at the start of the engagement. If citations are flat at day 90 and the agency responds that it takes time, ask them to connect specific actions from the past 30 days to the mechanism that’s supposed to drive citation growth.
How do I track traffic from ChatGPT and Perplexity in Google Analytics?
AI-referred traffic can be isolated in GA4 by going to Reports, then Acquisition, then Traffic Acquisition, and filtering the Session source/medium column for chatgpt.com, perplexity.ai, and chat.openai.com. For traffic arriving without a referral string, segment by direct sessions and cross-reference against time periods when your AI citations were active. Some teams add UTM parameters to links placed in third-party sources to make attribution cleaner at the session level. If your agency cannot show you what this looks like in a live client dashboard, they are not tracking the channel they’re claiming to improve.
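If you export session data instead of filtering in the GA4 interface, the same check can be sketched in a few lines. The hostnames below are the three named above. The function name, the label mapping, and the assumption that your export holds either a full referrer URL or a bare hostname are illustrative.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hostnames taken from the filter described above; extend as needed.
AI_REFERRER_DOMAINS = {
    "chatgpt.com": "ChatGPT",
    "chat.openai.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
}


def classify_ai_referrer(referrer: str):
    """Return an AI platform label for a referrer URL or hostname, else None."""
    value = referrer.strip()
    if "://" not in value:
        value = "//" + value  # lets urlsplit treat a bare hostname as a host
    host = urlsplit(value).hostname or ""
    for domain, label in AI_REFERRER_DOMAINS.items():
        if host == domain or host.endswith("." + domain):
            return label
    return None


def count_ai_sessions(referrers):
    """Tally sessions per AI platform, ignoring non-AI referrers."""
    labels = (classify_ai_referrer(r) for r in referrers)
    return Counter(label for label in labels if label)
```

Matching on the exact domain or a dot-prefixed subdomain avoids counting lookalike hosts. Sessions with no referrer at all will not show up here, which is the gap the direct-traffic cross-check is meant to cover.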
Why do so many SaaS companies end up with the wrong SEO agency?
Because the standard vetting process (checking Clutch reviews, reviewing case studies, and comparing pricing) only surfaces information that agencies control and choose to share. The metrics in case studies are selected by the agency. The reviews on third-party platforms are written by clients who had good enough experiences to bother. The pitch is optimized to close, not to qualify. Companies that vet agencies with specific, verifiable questions get better outcomes than companies that evaluate on presentation quality.
We built a separate vetting checklist for exactly this.
What is the SEO agency bait and switch?
The pitch comes from a senior strategist. The engagement is run by someone two or three levels junior to them. This is standard practice at a lot of mid-size agencies, and it is not always disclosed unless you ask. The question that surfaces it before you sign: “Who specifically will be working on our account day-to-day, and can I meet that person before we move forward?” If the answer is vague or deferred, you have your answer.
What are the red flags specific to a GEO or AI search agency, versus a regular SEO agency?
Three things separate a real GEO agency from an SEO agency that added the label. First, they can show you where they or a current client get cited in ChatGPT or Perplexity, and explain what produced it. Second, they have a named AI visibility metric and a reporting output for it, not just a slide that says “AI search.” Third, they build third-party presence, not only owned content, because AI citation needs corroboration across sources.
An agency that started saying “GEO” after the early-2026 bandwagon, with no citation it can point to and no metric to track it, is selling terminology.













