The Buyer Answer Gap Index
What the study measures, where the questions come from, and how answers are scored, written so the findings can be checked and reproduced.
B2B buyers now do most of their evaluation before they ever talk to you: on your site and, increasingly, by asking an AI. The Buyer Answer Gap is the distance between the questions those buyers ask and the answers they can actually get. This note defines exactly what we test and how we grade it. The question set was fixed before any site was evaluated, and was not changed in response to results.
What this measures, and what it doesn't
We test one thing: can a serious B2B buyer get a good answer to the questions they ask while evaluating a vendor? We check in two independent places:
- On the vendor's own site. Is the answer there, where a buyer would look, and does it actually answer (explain how), or just claim an outcome?
- From AI. When a buyer asks an AI assistant the same question, do they get a real answer grounded in the vendor's own content, or a generic one stitched from third-party and competitor sources the vendor doesn't control?
This is a measure of answer availability and quality, not product quality. A vendor can have an excellent product and score poorly here because the answers aren't reachable. The reverse is also possible.
It is explicitly not: a security audit or a substitute for a SOC 2 / CAIQ review; a design, speed, or UX assessment; or a judgment of whether a vendor is "good." Scope is limited to B2B SaaS.
The question set
Buyers evaluate across two kinds of question. We grade them separately on purpose, because they fail for different reasons.
The questions below are shown exactly as the grader asks them, in the plain language a buyer actually uses, not a formal checklist. The grader fills in the specifics for each vendor (a real competitor's name, the tools the buyer runs, the problem they're solving), so the shape is fixed but the wording a buyer sees is concrete.
Value questions. These are the questions that decide whether a buyer stays interested. There is no industry-standard list for these, so we use Gartner's six B2B "buying jobs" as the spine.
| Category | Representative question (in a buyer's words) | Frame |
|---|---|---|
| Problems solved | How do you actually solve our specific problem, what happens, specifically, not just the claim that you do? | Gartner: problem ID |
| Benefits & outcomes | What's the single biggest outcome customers point to, and how long did it take them to get there? | Gartner: selection |
| How it works | Walk me through how this works day to day for the person using it, what do they do that they didn't before? | Gartner: requirements |
| ROI / business case | What would we measure to know this is working, and what do customers typically see move? | Gartner: validation |
Diligence questions. These are the vetting questions once you're on the shortlist. Each sits behind a published standard or established practice: the Cloud Security Alliance's CAIQ, the Vendor Security Alliance's VSA, the Shared Assessments SIG, and common RFP practice.
| Category | Representative question (in a buyer's words) | Source |
|---|---|---|
| Security | Are you enterprise-security serious, SOC 2, the basics, or is security going to be a problem when I bring you to my team? | CAIQ / VSA |
| Data & privacy | Where does our data actually live, and is that going to be a problem for us (EU, regulated industry)? | CAIQ / VSA |
| Integrations | Will you actually work with the tools we already run, or is connecting you going to be a project? | RFP practice |
| Implementation | Is this going to be a heavy lift to get running, or can my team get value without a big project? | RFP practice |
| Support | If something breaks or my team gets stuck, can we get real help, or are we on our own? | RFP practice |
| Scale | Has this actually worked for companies like us, or would we be the ones figuring it out? | RFP practice |
| Differentiation | We're also looking at a competitor, why you over them, honestly, and where are they actually better? | Gartner: validation |
| Searches buyers run | For the real "X vs Y" and "X reviews" searches buyers run about you, is your own answer anywhere they'll find it? | Search-intent |
| Vendor viability | Who's actually using you, and are you established enough that this isn't a risky bet? | Shared Assessments SIG |
The full library holds roughly 90 questions across these categories; the snapshot samples one representative question per category. The sourcing is asymmetric, and that asymmetry is itself a finding, not a flaw: the diligence questions each sit behind a published standard, while the value questions have none, and the questions with no standard are the ones vendors most often can't answer.
How the questions were chosen
To remove hand-picking, each category's representative question follows one stated rule:
It is the question a typical B2B buyer asks first in that category: the obvious, first-order question, not the most obscure or the most damaging one. It was fixed before any site was run.
No question was added, removed, or reworded in response to any individual site's results, and none was selected for being one that sites tend to fail. Value questions are phrased to require the mechanism or specifics rather than a yes/no, because a buyer needs to know how, not just whether.
How answers are graded
The site and AI checks are scored separately and never blended: a vendor can answer well on its own site but be invisible to AI, or the reverse. We report one grade per axis, Coverage, and then a set of findings that show whether those answers are any good, whose they are, and which deciding questions no page can answer.
Matching by meaning, not keywords
We don't keyword-match. The grader reads your page text and a language model judges, by meaning, whether that content answers the question. So a conversational question like "is this a heavy lift to get running?" is matched against whatever you say about implementation, onboarding, and time-to-value, even if none of those exact words appear. The same judgment also decides whether an answer explains how it works or only claims an outcome, which drives the findings below.
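One way to picture this judgment is as a three-way label per question. The sketch below is an assumption about the shape of that step, not the grader's actual code: the `Verdict` names, the prompt wording, and the `judge` callable (standing in for whatever language model is used) are all hypothetical.

```python
from enum import Enum
from typing import Callable


class Verdict(Enum):
    MISSING = "missing"    # the page text does not answer the question
    CLAIMS = "claims"      # asserts an outcome without saying how
    EXPLAINS = "explains"  # describes the mechanism or specifics


def build_prompt(question: str, page_text: str) -> str:
    """Assemble a hypothetical judging prompt; the wording is illustrative only."""
    return (
        "You are judging whether website content answers a buyer's question.\n"
        "Judge by meaning, not by shared keywords.\n"
        "Reply with exactly one word: missing, claims, or explains.\n\n"
        f"Question: {question}\n\n"
        f"Content:\n{page_text}"
    )


def parse_verdict(reply: str) -> Verdict:
    """Map the model's reply to a Verdict; anything unexpected becomes MISSING."""
    word = reply.strip().lower().rstrip(".")
    for verdict in Verdict:
        if word == verdict.value:
            return verdict
    return Verdict.MISSING


def grade_question(question: str, page_text: str,
                   judge: Callable[[str], str]) -> Verdict:
    """Ask the injected judge (any text-in, text-out model call) and parse its label."""
    return parse_verdict(judge(build_prompt(question, page_text)))
```

Passing the model in as a plain callable keeps the sketch independent of any particular provider, and falling back to `MISSING` on an unparseable reply is a conservative choice made here for illustration.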
The grade: Coverage
Coverage asks the simplest question: can a buyer get an answer at all? A real answer or a claim both count; only a missing answer scores zero. It's rolled up to a percentage and banded into a letter grade.
We grade Coverage and nothing else, for a simple reason: it's the one measure that actually separates sites. The things that matter just as much, whether an answer explains anything, and whether AI's answer is even yours, come back low for almost everyone, so a letter there would tell you little. Those we report as findings instead.
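The Coverage roll-up can be sketched in a few lines. This is a minimal illustration under stated assumptions: the verdict strings are hypothetical labels, only assessed categories are passed in, and the band cutoffs are placeholders supplied by the caller, not the study's actual bands.

```python
from typing import Iterable

ANSWERED = {"claims", "explains"}  # both count toward Coverage; "missing" does not


def coverage_percent(verdicts: Iterable[str]) -> float:
    """Share of graded questions that have any answer at all, as a percentage."""
    graded = list(verdicts)
    if not graded:
        raise ValueError("no graded questions")
    answered = sum(1 for v in graded if v in ANSWERED)
    return 100.0 * answered / len(graded)


def band(percent: float, cutoffs: list[tuple[float, str]]) -> str:
    """Return the label of the first cutoff the percentage meets.

    `cutoffs` must be sorted from highest to lowest minimum, and the last
    entry should have a minimum of 0 so every percentage lands in a band.
    """
    for minimum, label in cutoffs:
        if percent >= minimum:
            return label
    raise ValueError("cutoffs do not cover this percentage")


# Placeholder cutoffs for illustration only; NOT the study's actual bands.
EXAMPLE_CUTOFFS = [(90.0, "A"), (75.0, "B"), (50.0, "C"), (0.0, "D")]
```

For example, a site with ten answered questions out of thirteen has a Coverage of 10/13, about 77%, regardless of how many of those ten answers explain and how many only claim.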
The findings: is the answer any good, and is it yours?
- Claim vs. explain (your site). Of the answers a buyer can find, how many explain how the product works, versus just claim an outcome ("we lower costs," "we're more secure")? Explaining the mechanism is the bar, and the single most important quality signal. Almost every site claims more than it explains, and unlike the deciding questions below, this one is fixable.
- Whose answer, and how good (AI). When a buyer asks AI, two separate things matter, and we report both. Grounding: does the answer come from your own pages, or is it stitched from third parties and competitors you don't control? Depth: is it a real, specific answer, or a generic gist?
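Two of these findings reduce to simple ratios once the judgments exist. The sketch below is illustrative only: it assumes hypothetical verdict labels (`"claims"`, `"explains"`, `"missing"`) and assumes grounding is approximated by the share of an AI answer's cited source URLs that sit on the vendor's own domain, which the study does not state. Depth needs a judgment of the answer text and is not shown.

```python
from urllib.parse import urlparse


def explain_rate(verdicts: list[str]) -> float:
    """Of the answers a buyer can find, the fraction that explain rather than claim."""
    found = [v for v in verdicts if v in ("claims", "explains")]
    if not found:
        return 0.0
    return sum(1 for v in found if v == "explains") / len(found)


def is_vendor_source(url: str, vendor_domain: str) -> bool:
    """True if the URL's host is the vendor's domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    vendor_domain = vendor_domain.lower()
    return host == vendor_domain or host.endswith("." + vendor_domain)


def grounding_share(source_urls: list[str], vendor_domain: str) -> float:
    """Fraction of an AI answer's cited sources that are the vendor's own pages."""
    if not source_urls:
        return 0.0
    own = sum(1 for u in source_urls if is_vendor_source(u, vendor_domain))
    return own / len(source_urls)
```

Missing answers are left out of the `explain_rate` denominator, matching the wording "of the answers a buyer can find."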
How we read the site
We fetch your pages and read the rendered content (what a browser actually shows, not the raw markup), so a modern JavaScript-built site is read for what's really on it, not mistaken for empty. We read your top-level pages and follow links one level deeper into the sections buyers check most (customers, security, integrations, docs), so the grade reflects your whole site, not just the homepage. If a page is genuinely blocked by bot protection, that category is marked "couldn't assess"; if the crawler reaches only a few pages, the site is marked "couldn't fully assess" and we hold back a grade, because it would swing from run to run.
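The one-level-deeper step and the hold-back rule can be sketched as follows. This is a rough illustration, not the crawler itself: matching sections by URL path, treating only the exact same host as on-site, and the `min_pages` threshold are all assumptions, and rendering JavaScript pages (which needs a headless browser) is not shown.

```python
from urllib.parse import urljoin, urlparse

# Section keywords taken from the text above; matching on URL path is an assumption.
BUYER_SECTIONS = ("customers", "security", "integrations", "docs")


def second_level_links(page_url: str, hrefs: list[str]) -> list[str]:
    """Resolve hrefs found on a top-level page; keep same-host links into buyer sections."""
    host = urlparse(page_url).hostname
    kept: list[str] = []
    for href in hrefs:
        absolute = urljoin(page_url, href)
        parsed = urlparse(absolute)
        if parsed.hostname != host:
            continue  # off-site link (subdomains are also skipped in this sketch)
        if any(section in parsed.path.lower() for section in BUYER_SECTIONS):
            if absolute not in kept:
                kept.append(absolute)
    return kept


def site_status(pages_read: int, min_pages: int) -> str:
    """Hold back a grade when too few pages were reached; min_pages is hypothetical."""
    return "graded" if pages_read >= min_pages else "couldn't fully assess"
```

Holding back the grade instead of scoring a thin crawl avoids reporting a band that reflects what the crawler happened to reach, not what the site says.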
The free grader and the full study read your site the same way; they differ in how many questions they ask. The instant grader samples one question per category, twelve in all. It leaves out the "searches buyers run" question, because that one needs a real competitor name and search query researched for each vendor. The full study includes it, for thirteen graded categories, and the deep-dive library holds ~90 questions in total.
Sample and limits
- Sample. 130 US B2B SaaS vendors, all in the HubSpot ecosystem, graded on the 13 frozen questions across the two axes. 108 graded cleanly; sites blocked by bot protection or reached too thinly were held back rather than failed. Mid-market and enterprise are reported as two labeled cohorts, never blended. Run on June 26, 2026.
- Snapshot, not audit. The published grade samples one question per category; the full evaluation asks ~90. A snapshot can shift a band run to run.
- Crawl limits. Sites that block automated crawlers receive "couldn't assess" on the site axis, and sites the crawler reaches only thinly receive "couldn't fully assess." In both cases the AI axis still computes; these sites are excluded from the aggregate on the site axis.
- AI variance. AI answers can vary between runs, so AI-side grades are labeled snapshots.
- Generalizability. Findings apply to B2B SaaS and should not be read as describing software buying generally.
Conflict of interest
SlateCX builds a product premised on the existence of this gap, so we have an interest in the gap being real. We state that plainly, because it's the reason the controls above exist. To keep this honest:
- the question set and its sources are frozen before any run;
- each question is derived by a stated rule, not selected to taste;
- no headline number is chosen in advance;
- the grader is free and the method is reproducible: anyone can re-run any site and check our work.
If this becomes a recurring index, the question set and grading will be held constant between editions; any change will be versioned and disclosed here, so movement in the number reflects the market, not the method.
Run it on your own site.
Run the free grader and get your two Coverage grades, what's answered, what's missing, and where AI is filling the gap with someone else's content.