AI Share of Voice: How to Measure Brand Mentions Across ChatGPT and Perplexity

AEO measurement · ~9 min read

AI share of voice is the percentage of brand mentions in AI answers that go to your brand, measured over a fixed set of buyer questions. You measure it by running the same prompts repeatedly across ChatGPT, Perplexity and the other engines, then counting how often you appear versus the field. Because the answers shift from run to run, one response is only a sample; the rate, tracked week over week, is the signal.

You already know how to find out where you rank on Google. You type the query, you look at the page, the position is the position. AI search broke that habit. Ask ChatGPT, Perplexity or Gemini the same question twice and you may get two different sets of recommended companies. There is no page to refresh, no position to read off. So the honest answer to "how visible are we inside AI answers?" is usually a shrug — and you cannot fix what you cannot see.

AI share of voice is the metric that turns the shrug into a number. It tells you how often the engines name you versus everyone else in your category, and — tracked over weeks — whether that share is climbing or sliding. This note explains what it measures, why a single answer lies to you, and how to track brand mentions in AI in a way you can act on.

What AI share of voice actually measures

The definition is borrowed from traditional media and adapted for generative engines. Authority Tech frames it as "the percentage of times your brand appears in AI-generated answers for a defined set of prompts, measured across the engines that buyers actually use." The arithmetic is simple:

  • AI share of voice = (your brand mentions ÷ total brand mentions across all tracked prompts) × 100.

Run a fixed set of category questions through the engines, count how many times your name comes up, divide by the total number of brand mentions, and multiply by a hundred. If you and four competitors are mentioned 100 times in total and you account for 20 of those mentions, your share is 20%. The same idea is increasingly called share of model or LLM share of voice, the terms the industry is settling on for the awareness KPI that replaces keyword rank on AI surfaces. The name varies; the question does not. When a buyer asks a machine who solves their problem, how often is the machine's answer you?
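As a minimal sketch, the arithmetic above fits in a few lines of Python. The brand names and counts are invented for illustration; only the formula comes from the definition.

```python
def share_of_voice(mention_counts: dict[str, int], brand: str) -> float:
    """Return `brand`'s share of all brand mentions, as a percentage."""
    total = sum(mention_counts.values())
    if total == 0:
        return 0.0  # no brand was mentioned at all, so there is no share to report
    return 100 * mention_counts.get(brand, 0) / total


# Hypothetical tallies across every tracked prompt and run.
counts = {"You": 20, "Rival A": 35, "Rival B": 25, "Rival C": 12, "Rival D": 8}
print(f"{share_of_voice(counts, 'You'):.1f}%")  # 20 of 100 mentions
```

A brand that never appears gets 0 rather than an error, which keeps week-over-week comparisons simple when a competitor drops out of the answers.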

This matters because the behaviour is real, not hypothetical. Authority Tech's 2026 guide reports that around 30% of target audiences now research products through AI systems, and that traffic referred by large language models converts at 30–40% — far above what cold search or paid social tends to deliver. The reason is intent: someone who arrives having already been told you are the right fit arrives pre-sold. If you are absent from that conversation, you are not losing a ranking. You are losing the introduction.

Why one answer lies to you

Here is the part that trips most people up, and the reason casual "I asked ChatGPT and we weren't there" checks are worse than useless. Large language models are probabilistic. They sample each next piece of text from a probability distribution, so the same prompt can produce a different answer on every run. On top of that sit per-user personalisation (what the engine remembers about you) and, for tools like Perplexity, live retrieval that fetches sources at query time, so answers shift as the web changes. Otterly notes that AI search platforms store a "Memory RAG" of information about you, so your own logged-in result is among the least reliable readings you can take.

The practical consequence: a single AI answer is a sample, not a ranking. Asking once and drawing a conclusion is like polling one voter and calling the election. The fix is the same one statisticians have always used: sample repeatedly and measure the rate. Authority Tech recommends 3–5 executions per prompt per engine. Run each question several times, in a clean and ideally signed-out context, and record the proportion of runs that mention you. That proportion is far steadier than any single answer, and it firms up as runs accumulate across prompts and weeks. It is the difference between "we appeared that time" and "we appear 40% of the time", and only the second is something you can manage.
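One way to turn repeated runs into a rate, sketched in Python. The answers and brand names are made up, and the whole-word match is a simplifying assumption: real answers may need alias handling (abbreviations, product names, misspellings), and brands that begin or end with punctuation would need a different pattern.

```python
import re


def mentions(answer: str, brand: str) -> bool:
    """True if `brand` appears in the answer as a whole word, ignoring case."""
    return re.search(rf"\b{re.escape(brand)}\b", answer, re.IGNORECASE) is not None


def mention_rate(answers: list[str], brand: str) -> float:
    """Proportion of runs (0.0 to 1.0) in which the brand was named."""
    if not answers:
        return 0.0
    return sum(mentions(a, brand) for a in answers) / len(answers)


# Five hypothetical runs of one prompt on one engine.
runs = [
    "Top picks: Acme, Globex and Initech.",
    "Most teams choose Globex or Initech.",
    "Consider Acme if you need something simple.",
    "Globex is the usual recommendation.",
    "Initech and Umbrella are both strong options.",
]
print(mention_rate(runs, "Acme"))  # named in 2 of 5 runs
```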

The metrics worth tracking

Raw share of voice is the headline, but it hides things. A serious AI visibility tracker watches a small family of metrics together, because each answers a different question:

  • Mention rate / share of voice — how often you appear versus the field. The top-line score.
  • Citation rate — how often the engine links to or sources your own domain as evidence. AI citation rate tracking matters because being named is good, but being the source behind the answer is what compounds. Authority Tech reports that 82–85% of AI citations come from third-party sources rather than brand websites, which tells you most visibility is earned elsewhere — in press, reviews and forums — not on your homepage.
  • Position — where in the answer you land. Being named first carries more weight than being the fourth option in a list, the same way it does on a results page.
  • Sentiment — how the engine characterises you. Nightwatch makes the point that you want to track not just how often you appear, but how the model frames you when it does. Being recommended and being mentioned-then-dismissed are not the same outcome and should never share a number.
  • Per-engine breakdown — never a single blended figure. The engines behave like different audiences.
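To make the list above concrete, here is one possible record shape for a single run, with a summary that keeps each metric separate. The field names and the three sentiment labels are assumptions for illustration, not a standard schema, and the mention rate here is the proportion of runs that name you rather than your share against the field.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunResult:
    engine: str                # e.g. "chatgpt" or "perplexity"
    prompt: str
    mentioned: bool            # brand named anywhere in the answer
    cited_own_domain: bool     # answer linked to or sourced your own domain
    position: Optional[int]    # 1 = named first; None if not mentioned
    sentiment: Optional[str]   # "recommended", "neutral" or "dismissed"; None if not mentioned


def summarise(results: list[RunResult]) -> dict:
    """Summarise runs from ONE engine; sentiment stays as separate counts."""
    named = [r for r in results if r.mentioned]
    n = len(results)
    return {
        "mention_rate": len(named) / n if n else 0.0,
        "citation_rate": sum(r.cited_own_domain for r in results) / n if n else 0.0,
        "avg_position": sum(r.position for r in named) / len(named) if named else None,
        "sentiment": {
            label: sum(r.sentiment == label for r in named)
            for label in ("recommended", "neutral", "dismissed")
        },
    }


# Hypothetical runs of one prompt across two engines.
results = [
    RunResult("chatgpt", "best crm for startups", True, False, 1, "recommended"),
    RunResult("chatgpt", "best crm for startups", False, False, None, None),
    RunResult("perplexity", "best crm for startups", True, True, 3, "dismissed"),
    RunResult("perplexity", "best crm for startups", True, False, 2, "recommended"),
]
perplexity = [r for r in results if r.engine == "perplexity"]
print(summarise(perplexity))
```

Filtering by engine before summarising is deliberate: it keeps the per-engine breakdown as the default view instead of a blended figure.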

Why the per-engine split is the most important cut

A blended score is comforting and misleading. The same query set produces very different shares depending on which engine you ask. Authority Tech's benchmarks put a typical brand's share at roughly 28–38% on Perplexity but only 10–16% on ChatGPT, with Gemini and Claude elsewhere again — because each engine prefers different evidence. ChatGPT leans heavily on earned and editorial media; Perplexity surfaces community sources like Reddit and cites many footnotes; Gemini weights YouTube and organic signals. As the guide puts it bluntly, a 20% blended figure can hide 35% on one engine and 3% on another. Average those and you will optimise for a brand that does not exist. Measure brand visibility on ChatGPT and Perplexity separately, then decide where the gap is worth closing.
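A small worked example of how a blended figure hides the split. The tallies are invented so that the per-engine shares land on 35% and 3%; they are not benchmark data.

```python
# Hypothetical tallies: (your mentions, all brand mentions) per engine.
tallies = {
    "perplexity": (119, 340),
    "chatgpt": (9, 300),
}


def pct(yours: int, total: int) -> float:
    """Share as a percentage; 0.0 when nothing was mentioned."""
    return 100 * yours / total if total else 0.0


per_engine = {engine: pct(y, t) for engine, (y, t) in tallies.items()}
blended = pct(
    sum(y for y, _ in tallies.values()),
    sum(t for _, t in tallies.values()),
)

for engine, share in per_engine.items():
    print(f"{engine}: {share:.0f}%")
print(f"blended: {blended:.0f}%")
```

With these numbers the blended share comes out at 20%, a figure that describes neither engine.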

How to set up tracking that you'll actually use

The temptation is to track everything. Resist it — a measurement habit you abandon in three weeks is worth nothing. Build the smallest version that still tells the truth.

Start with the prompt set. This is the whole game; your share of voice is only as honest as the questions you feed in. Nightwatch suggests beginning with your top 10–20 category prompts as a baseline. Write them in real buyer language, not your internal product vocabulary: "best [thing] for [situation]", "alternatives to [the obvious incumbent]", "is [you] any good for [use case]". Authority Tech's fuller method draws roughly 40% of prompts from keyword research, 35% from conversational question forms and 25% from language you have actually heard buyers use. Keep the list fixed. The instant you change the questions you lose the ability to compare week to week.
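If you follow the 40/35/25 blend, a quick sketch shows how many prompts each source should contribute for a given list size. The category keys are shorthand invented here, and largest-remainder rounding is just one reasonable way to settle the fractions.

```python
# Percent weights from the blend described above; the key names are illustrative.
MIX = {"keyword_research": 40, "conversational": 35, "heard_from_buyers": 25}


def allocate(n_prompts: int, mix: dict[str, int] = MIX) -> dict[str, int]:
    """Split n_prompts across categories using largest-remainder rounding."""
    quotas = {name: n_prompts * weight for name, weight in mix.items()}  # in hundredths
    counts = {name: q // 100 for name, q in quotas.items()}
    leftover = n_prompts - sum(counts.values())
    # Hand any remaining prompts to the categories with the largest fractional part.
    by_remainder = sorted(quotas, key=lambda name: quotas[name] % 100, reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts


print(allocate(20))  # 8 keyword, 7 conversational, 5 heard from buyers
```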

Name your competitive set. Pick four to eight brands you genuinely compete with. Without a field to divide by, "we got mentioned eight times" means nothing. With one, it becomes "we hold 22% and the incumbent holds 41%" — a sentence a board understands.

Set a cadence and hold it. Authority Tech's rhythm is sensible: a weekly check on a handful of priority prompts, a fuller monthly sweep across every prompt and engine, a quarterly strategic review. Watch for swings of five percentage points or more in a week — that is usually a signal something changed (a new competitor page, a Reddit thread, a model update), not noise. Consistency beats completeness: the same questions, the same engines, the same way, every time.
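The five-point rule is easy to automate. A minimal sketch, assuming you keep one share-of-voice figure per week per engine; the history below is made up.

```python
def flag_swings(weekly_share: list[float], threshold: float = 5.0) -> list[tuple[int, float]]:
    """Return (week_index, change) for each week whose share moved by at
    least `threshold` percentage points versus the week before."""
    flagged = []
    for week in range(1, len(weekly_share)):
        change = weekly_share[week] - weekly_share[week - 1]
        if abs(change) >= threshold:
            flagged.append((week, change))
    return flagged


# Hypothetical weekly share of voice (%) for one engine.
history = [22.0, 23.0, 21.0, 28.0, 27.0, 20.0]
print(flag_swings(history))  # weeks 3 and 5 moved by 7 points
```

Run it per engine rather than on a blended series, so a swing on one engine is not cancelled out by an opposite move on another.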

Decide build versus buy. You can run this by hand in a spreadsheet: prompts down one axis, engines across, a tally of mentions per run. It is tedious but it works, and for a tight prompt set it is genuinely viable. Or you can use a dedicated AI search monitoring tool. Tools such as Otterly and Nightwatch automate the queries, run them from neutral contexts to dodge the personalisation problem, and track mentions, citations, position and sentiment across ChatGPT, Perplexity, Gemini, Copilot and Google's AI surfaces. The honest trade-off: manual is free but does not scale much past a dozen or so prompts, while tooling costs money but removes the discipline problem, because the tracking happens whether or not anyone remembers to do it. Brand mention monitoring across AI platforms is mostly a question of how much of your own time you want to spend running prompts.
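If you take the spreadsheet route, a short script can do the tallying from a CSV export. The column names and the semicolon-separated brand list below are assumptions about how you might lay out the sheet, not a required format.

```python
import csv
import io
from collections import Counter

# Hypothetical export: one row per run, with the brands named in that
# answer separated by semicolons.
SHEET = """prompt,engine,run,brands_mentioned
best crm for startups,chatgpt,1,Acme;Globex
best crm for startups,chatgpt,2,Globex
best crm for startups,perplexity,1,Acme;Globex;Initech
best crm for startups,perplexity,2,Acme;Initech
"""


def tally(csv_text: str) -> dict[str, Counter]:
    """Count brand mentions per engine from the exported rows."""
    per_engine: dict[str, Counter] = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        brands = [b.strip() for b in row["brands_mentioned"].split(";") if b.strip()]
        per_engine.setdefault(row["engine"], Counter()).update(brands)
    return per_engine


for engine, counts in tally(SHEET).items():
    total = sum(counts.values())
    print(engine, {brand: f"{100 * n / total:.0f}%" for brand, n in counts.items()})
```

In practice you would read the exported file with `open(...)` instead of the inline string; the tallying logic stays the same.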

From a number to a decision

A dashboard that nobody acts on is decoration. The point of an AEO KPI in 2026 is to drive a next move, and share of voice points cleanly at one. Because most AI citations are earned from third-party sources, low visibility is rarely fixed by editing your homepage. It is fixed by becoming a brand that the sources the engines trust already talk about. That means strong category pages structured so they are easy to quote, presence in the reviews and roundups your buyers read, and content recent enough to be pulled in. Authority Tech notes that roughly half of cited content was published within the last 13 weeks, so freshness is itself a lever.

Tracking does not improve your share. It tells you whether the work you are doing to improve it is landing. On Nightwatch's estimate, focused effort shows results within 60–90 days, with larger shifts taking six to twelve months. The measurement is the feedback loop. Without it you are optimising blind, guessing whether last quarter's content moved anything. With it, you know which engine, which competitor and which gap to spend the next month on.

If you are a smaller business, here is the honest framing. You do not need an enterprise platform to start, and you almost certainly should not buy one this month. A fixed set of fifteen real questions, run by hand across two engines once a week, will tell you more than most companies in your category know about themselves. Start there. Buy the tooling only when the manual version is the bottleneck — when the prompt set has outgrown the spreadsheet, not before.

Where SOLMONARC fits

We build the measurement and the plumbing behind it — the standing prompt set, the neutral-context query runs, the weekly share-of-voice and citation tracking wired into something you will actually open. If a spreadsheet and a Friday-morning habit are enough for where you are, we will tell you so, and you should keep your money. The work is worth paying for when the question stops being "are we in the answers?" and becomes "we know our share, now move it" — and you would rather the tracking ran itself than depended on someone remembering.

Straight answers

AI share of voice — common questions

What is AI share of voice?

AI share of voice is the percentage of brand mentions in AI answers that go to your brand, for a fixed set of buyer questions. It is calculated as your brand mentions divided by total brand mentions across your tracked prompts, multiplied by 100. It is the AI-search equivalent of where you rank, and is increasingly called share of model or LLM share of voice.

How do I measure brand mentions across ChatGPT and Perplexity?

Build a fixed set of 10 to 20 real buyer questions, then run each one several times through each engine — Authority Tech suggests 3 to 5 runs per prompt — in a clean, signed-out context. Count how often you appear versus your competitors, and keep ChatGPT and Perplexity as separate scores rather than blending them, because the engines cite very different sources.

Why do I get a different answer every time I ask an AI the same question?

Large language models are probabilistic — they predict likely text from a distribution, so the same prompt can produce different output each run. Personalisation and live web retrieval add more variance. This is why a single answer is a sample, not a ranking, and why you measure the rate across many runs instead of trusting one check.

What's the difference between a mention and a citation in AI search?

A mention is the engine naming your brand in its answer. A citation is the engine linking to or sourcing your own domain as evidence. Both matter, but citation rate compounds — and notably, Authority Tech reports 82 to 85% of AI citations come from third-party sources like press and reviews, not brand websites, so most visibility is earned elsewhere.

Do I need a paid tool to track AI share of voice?

No, not to start. A fixed prompt set run by hand in a spreadsheet across two engines once a week works well for a dozen or so questions. Dedicated tools like Otterly or Nightwatch are worth buying when the manual version becomes the bottleneck — they automate the runs from neutral contexts and track mentions, citations, position and sentiment so the tracking does not depend on anyone remembering.

How long before improving my AI share of voice shows results?

Nightwatch estimates consistent improvement within 60 to 90 days of focused content and citation work, with larger shifts taking 6 to 12 months. Because roughly half of cited content was published in the last 13 weeks, freshness helps — but the measurement itself is the feedback loop that tells you whether the work is landing.

Stop guessing whether the engines name you

A free Diagnostic shows where you stand right now across ChatGPT, Perplexity and the rest — your share, your competitors' share, and the gap worth closing first. If a spreadsheet is enough for where you are, we'll say so.