What's the best way to measure AI? Five honest options compared

AI spend has exploded. Measurement hasn't kept up. Here's an honest look at the five approaches businesses actually use today — what each is good for, where each falls short, and how to pick the right combination for your team.

Every leadership team we talk to is asking some version of the same question. "We've bought seats in five different AI tools across the company. Some people seem to love them. The bills are real. Is any of this actually working?"

It's a fair question. It's also surprisingly hard to answer, because AI tools sit in an awkward spot — they don't fit neatly into the measurement frames we already have. They're not a hire, where you measure outputs. They're not a piece of software with a clear process metric, like Salesforce or HubSpot. They're a general-purpose lever, used differently by every person who touches them.

So how do you actually measure it? Five approaches are in common use today. Each has a sweet spot. None is a universal answer. The trick is picking the right one — or, more often, the right combination — for the question your team is genuinely trying to answer.

1. Annual engagement and pulse surveys

Examples: Culture Amp, Lattice, 15Five, Officevibe, the Gallup Q12.

These are the broad employee-experience tools that most mid-sized businesses already run. Quarterly or annually, you ask your team a battery of questions about how they're feeling, and you slot in two or three AI-specific questions next to the engagement and management items.

Sweet spot: tracking sentiment at a high level over time. Useful if your leadership team needs a single "are people happy with the AI rollout?" number for a board report.

Where it falls short: the response rate. Even the best-run annual surveys get 40-60% participation, and the people who respond are skewed — the enthusiasts and the complainers, with the silent majority absent. You're also asking people to remember three months back to a tool they used twice. The answers reflect vibes, not behaviour. And because AI questions live alongside thirty others, they get a few seconds of attention, not real reflection.
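To see why the response rate matters so much, here is a minimal sketch of the worst-case arithmetic. It assumes a 1-to-5 satisfaction scale and makes no assumption about how the non-responders would have answered. The function name and the example score of 4.0 are illustrative, not figures from any real survey.

```python
def true_mean_bounds(response_rate: float, respondent_mean: float,
                     scale_min: float = 1.0, scale_max: float = 5.0) -> tuple[float, float]:
    """Worst-case range for the whole team's average score.

    The people who answered contribute their observed mean. The people who
    stayed silent could, in the worst case, all sit at either end of the scale.
    """
    if not 0.0 < response_rate <= 1.0:
        raise ValueError("response_rate must be in (0, 1]")
    if not scale_min <= respondent_mean <= scale_max:
        raise ValueError("respondent_mean must lie on the scale")
    answered_part = response_rate * respondent_mean
    silent_share = 1.0 - response_rate
    lowest = answered_part + silent_share * scale_min
    highest = answered_part + silent_share * scale_max
    return lowest, highest


# Same observed score of 4.0, two different response rates (illustrative).
for rate in (0.5, 0.8):
    low, high = true_mean_bounds(rate, 4.0)
    print(f"response rate {rate:.0%}: true average between {low:.1f} and {high:.1f}")
```

At a 50% response rate the true average could sit anywhere across a two-point band. At 80% the band shrinks to under one point. Real non-responders are rarely that extreme, so treat this as an upper limit on the uncertainty, not a prediction.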

Use this if you're already running an engagement programme and you want a coarse trend line. Don't use it to make tool-by-tool budget decisions.

2. Vendor usage dashboards

Examples: the admin panel that ships with ChatGPT Enterprise, Microsoft Copilot's usage report, Claude for Work's analytics, GitHub Copilot's seat dashboard.

Every major AI vendor ships an admin dashboard for the seats you've bought. They tell you who's logged in, who's active, how many messages or completions or suggestions per user per week, sometimes with a feature breakdown.

Sweet spot: answering the activation question. "Did Sarah, who I gave a Copilot seat to, actually use it?" The data is objective, free (it's bundled with your subscription), and granular.

Where it falls short: usage is not effectiveness. Someone can use a tool a hundred times a week and find it net-unhelpful. Someone else can use it twice and save themselves a day. The dashboard tells you the count, not the consequence. Vendor metrics also tend toward the flattering: "your team submitted 1,200 prompts this quarter!" is a number designed to stop you cancelling the seats, not to help you decide whether to keep them.

Use this to spot dormant seats and to flag unhealthy patterns (e.g. a tool with strong adoption in Engineering but zero in Marketing — is that right?). Don't use it as your ROI proof.
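The dormant-seat check is easy to run yourself once you have a usage export. The sketch below assumes you have already reduced that export to a mapping of user to last-active date. The function name, the 30-day threshold, and the sample data are all hypothetical, and each vendor's actual export format will differ.

```python
from datetime import date
from typing import Dict, List, Optional


def dormant_seats(last_active: Dict[str, Optional[date]],
                  today: date,
                  max_idle_days: int = 30) -> List[str]:
    """Return users whose seat has been idle for more than max_idle_days.

    A value of None means the seat was assigned but never used.
    """
    dormant = []
    for user, last_seen in last_active.items():
        if last_seen is None or (today - last_seen).days > max_idle_days:
            dormant.append(user)
    return sorted(dormant)


# Hypothetical export: one active user, one lapsed user, one who never logged in.
seats = {
    "sarah": date(2025, 6, 25),
    "omar": date(2025, 4, 1),
    "li": None,
}
print(dormant_seats(seats, today=date(2025, 6, 30)))
```

This answers the activation question only. A seat that passes the check is being used, which says nothing yet about whether it is helping.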

3. SaaS spend management platforms

Examples: Zylo, Productiv, Vendr, Spendflo. Some also touch the AI-cost-management space (Zylo's AI Discovery, Productiv's AI tooling visibility).

These platforms started life as SaaS spend-rationalisation tools — surface every subscription, find the duplicates, negotiate them down at renewal. Most have a story about AI now: which tools are sprawling across your org, what the cost trajectory looks like, whether seats are sitting idle.

Sweet spot: cost visibility. If you've genuinely lost track of how many AI subscriptions are scattered across your company credit cards, departmental budgets, and shadow procurement, these tools find them and consolidate them.

Where it falls short: they answer the wrong question. Spend management tells you what you're spending; it tells you nothing about whether the spend is earning its place. Even the dormant-seat warnings only get you to "kill the seats nobody uses" — they don't tell you which tools are quietly working but under-adopted, or which ones people use a lot but secretly resent.

Use this if your AI spend is genuinely out of control and you need a defragmentation exercise. Don't expect it to answer "is this working?" — it isn't designed to.

4. Consultant-led audit (one-off engagement)

Examples: a six-week engagement with a Big Four consultancy, an AI-specialist boutique, or your in-house transformation team running a structured audit with interviews, focus groups, and a final report.

Done well, this is the most credible single snapshot you can get. A skilled consultant will interview a cross-section of your team, observe actual usage, build a quantitative case study or two, and hand you back a board-ready report with concrete recommendations.

Sweet spot: depth. You get qualitative texture (the "why" behind the numbers) plus a strategic recommendation written for your specific context. Hard to beat for a board presentation or a budget decision that justifies the engagement cost.

Where it falls short: it's a snapshot, not a sensor. Six weeks after the report is delivered, your team has bought two new AI tools, churned a third, and the usage patterns have shifted. By Q2 the report is a historical document. The other problem: cost. A serious AI audit from a top-tier consultancy lands at $40k–$200k+. You'll do it once and then go without measurement until you can justify another round.

Use this for a one-off strategic reset — the moment you're presenting AI ROI to the board for the first time, or making a multi-year platform decision. Don't expect it to be your ongoing instrument.

5. Continuous in-context micro-surveys

Example: The GAiGE. (We're aware this is our blog. Stick with us — we'll be honest about the trade-offs.)

The approach we built. A small browser extension delivers a thirty-second pulse to a team member right after they've used an AI tool — one or two short questions, in the moment. Pulses fire 3× a week per user. Responses flow to a dashboard that turns them into per-tool ROI, hours saved, satisfaction, adoption, and training-gap signals.

Sweet spot: continuous, defensible per-tool numbers. Because we ask only one or two questions in the moment, response rates run at 70-90% (compared with 40-60% for annual surveys), and the responses are far less self-selected: the silent majority shows up too. The methodology is published: a 2.5× extrapolation cap, your own blended hourly rate, minimum-N response thresholds before any number renders, and full aggregate-only privacy. It survives board scrutiny because every number is auditable end-to-end.
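For readers who think in code, here is one way those three guardrails could fit together. This is a sketch under stated assumptions, not The GAiGE's actual implementation. It reads the 2.5× cap as a limit on how far reported hours are scaled up to cover unsurveyed sessions, and the minimum-N value of 5, the field names, and the sample figures are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

EXTRAPOLATION_CAP = 2.5  # figure from the post; how it is applied here is an assumption
MIN_RESPONSES = 5        # hypothetical minimum-N threshold


@dataclass
class ToolPulses:
    tool: str
    minutes_saved: List[float]  # one self-reported value per pulse response
    total_sessions: int         # all sessions in the period, surveyed or not


def estimated_value(pulses: ToolPulses, blended_hourly_rate: float) -> Optional[float]:
    """Estimated value of time saved for one tool, or None if data is too thin."""
    responses = len(pulses.minutes_saved)
    if responses < MIN_RESPONSES:
        return None  # below the threshold, render no number at all
    reported_hours = sum(pulses.minutes_saved) / 60
    # Scale up to cover unsurveyed sessions, but never beyond the cap.
    scale = min(pulses.total_sessions / responses, EXTRAPOLATION_CAP)
    return reported_hours * scale * blended_hourly_rate


# Hypothetical tool: 10 responses of 30 minutes each, 100 sessions overall.
example = ToolPulses("example-tool", [30.0] * 10, total_sessions=100)
print(estimated_value(example, blended_hourly_rate=40.0))
```

In this example the raw ratio of sessions to responses is 10, so an uncapped extrapolation would claim 50 hours. The cap holds it to 12.5 hours, which is the kind of conservative figure that is easier to defend when someone checks the arithmetic.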

Where it falls short: two things, honestly. First, it requires your team to install a Chrome extension. We've kept the install warning as tame as possible (browser-page access is requested only when needed, not at install) and most teams roll it out via MDM in minutes — but it's still a step. Second, we measure browser-based AI tools. If your team uses a desktop-app AI tool we don't yet cover, those interactions flow through the extension's inbox rather than as in-page pulses. The signal still reaches you; the immediacy is reduced.

Use this when you want a defensible per-tool ROI number on an ongoing basis — for the CFO who keeps asking, for renewal decisions, for spotting which tools your team secretly resents before they show up in churn. Don't use this if your AI usage is entirely off-browser (rare, but possible).

What the right answer usually looks like

Most mid-sized businesses we work with end up with two or three of the above, layered:

Vendor dashboards for activation and seat-utilisation hygiene (free, already there).
Continuous in-context pulses for the ongoing per-tool ROI signal that drives renewal and rollout decisions.
A consultant audit every 18-24 months for strategic resets and big platform decisions.

The annual employee survey can include two or three AI questions for the sentiment trend line, but it's not where you'll find the answers that matter. SaaS spend management is worth it only if your subscription sprawl is genuinely out of control.

Where to start

If you're just beginning, the order we'd suggest is the reverse of how most companies actually start. Most start with the annual survey and the vendor dashboards because they're already paid for. The problem is, neither tells you anything you can defend in a board pack.

Start with the question your CFO will ask in six months, work backwards from that, and pick the measurement that answers it. For most teams that's a continuous per-tool ROI signal. If you'd like to see what that looks like in practice, we've published the full methodology behind The GAiGE in our 2.5× rule post, along with a deeper case for why surveys beat usage data for this specific question.

We've also opened a free trial (up to 30 days) — no credit card — if you'd rather see the dashboard with your own team's data than read about it. Start a trial here.