Traditional share of voice measures how much of the conversation your brand owns in media, social, or search. You count mentions, impressions, or ranking positions, divide by the total, and get a percentage.

LLM share of voice works on the same principle but measures something different: how often AI models cite, recommend, or mention your brand when users ask questions in your category.

This metric didn’t exist two years ago. Now it’s one of the most important indicators of whether your answer engine optimization (AEO) strategy is working.

Why This Metric Matters

When a prospect asks ChatGPT “What’s the best CRM for small businesses?” and your product appears in the answer, that’s a sales touch you didn’t pay for. When your competitor appears instead, that’s a sales touch they got for free.

AI search usage is growing rapidly quarter over quarter; Gartner, for example, projects that traditional search engine volume will drop 25% by 2026 as users shift to AI chatbots and other virtual agents. The brands that appear in AI answers capture demand at the moment of intent. The brands that don’t appear lose deals before they even know a deal existed.

Share of voice in LLMs tells you where you stand. It tells you whether your content strategy, press coverage, and entity signals are working. And it gives you a benchmark to improve against.

The Prompt Testing Framework

Measuring LLM share of voice requires a structured prompt testing process. Here’s the framework we use.

Step 1: Define Your Category Prompts

Write 15-20 prompts that your ideal customer would type into an AI model. These should cover three types:

Category prompts ask for recommendations within your space. “What are the best [category] tools?” “Which [category] platform should I use for [use case]?” “Compare the top [category] solutions.”

Problem prompts describe a challenge your product solves. “How do I [problem your product solves]?” “What’s the best way to [task]?” “I need help with [pain point], what should I use?”

Brand prompts ask about you directly. “What is [your brand]?” “Tell me about [your company].” “Is [your brand] good for [use case]?”

Write the prompts as a real customer would phrase them. Avoid jargon. Use natural language.
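The three prompt types are easy to keep as templates and expand per category. A minimal sketch in Python, with a couple of templates per type; the category, use case, pain point, and brand values are placeholders, and a production set would carry the full 15-20 prompts:

```python
# Expand prompt templates into a concrete prompt set for one category.
# All values passed in below (category, use case, pain point, brand) are placeholders.
def build_prompts(category, use_case, pain_point, brand):
    """Return (prompt_type, prompt_text) pairs covering all three types."""
    templates = {
        "category": [
            f"What are the best {category} tools?",
            f"Which {category} platform should I use for {use_case}?",
            f"Compare the top {category} solutions.",
        ],
        "problem": [
            f"How do I handle {pain_point}?",
            f"I need help with {pain_point}, what should I use?",
        ],
        "brand": [
            f"What is {brand}?",
            f"Is {brand} good for {use_case}?",
        ],
    }
    return [(ptype, text)
            for ptype, prompts in templates.items()
            for text in prompts]

prompts = build_prompts("project management", "remote teams",
                        "keeping tasks visible across time zones", "Acme")
# 7 prompts here; a real monthly set would have 15-20.
```

Keeping prompts in templates makes it trivial to rerun the identical set each month, which is what makes the trend line comparable.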

Step 2: Run Prompts Across Platforms

Test each prompt on four platforms minimum: ChatGPT (GPT-4), Claude, Perplexity, and Google AI Overview.

For each prompt on each platform, record:

Mentioned: Did the AI mention your brand? (Yes/No)

Position: Where did your brand appear in the response? First mention, middle of a list, last mention, or not mentioned.

Sentiment: Was the mention positive, neutral, or negative?

Context: Was your brand recommended, compared, or just mentioned in passing?

Competitors mentioned: Which competitors appeared in the same response?

Run the same prompts every month. Consistency matters more than volume. Twenty prompts tested monthly across four platforms gives you 80 data points per cycle.
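The fields above map naturally onto one record per prompt per platform. One possible shape for that record, assuming you log results in Python before exporting to a spreadsheet (field values and brand names are illustrative):

```python
from dataclasses import dataclass, field

@dataclass
class PromptResult:
    """One row of raw data: a single prompt tested on a single platform."""
    date: str                  # e.g. "2025-01-06"
    platform: str              # "ChatGPT", "Claude", "Perplexity", "Google AI Overview"
    prompt: str
    mentioned: bool            # did the AI mention your brand?
    position: str              # "first", "middle", "last", or "absent"
    sentiment: str             # "positive", "neutral", or "negative"
    context: str               # "recommended", "compared", or "passing"
    competitors: list[str] = field(default_factory=list)
    notes: str = ""

row = PromptResult(
    date="2025-01-06",
    platform="ChatGPT",
    prompt="What's the best CRM for small businesses?",
    mentioned=True,
    position="middle",
    sentiment="neutral",
    context="compared",
    competitors=["RivalCo", "OtherApp"],  # hypothetical competitor names
)
```

Constraining position, sentiment, and context to fixed vocabularies keeps month-over-month data comparable and makes scoring mechanical.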

Step 3: Score Each Response

Use a simple scoring system:

3 points: Your brand is mentioned first or recommended as a top option.

2 points: Your brand appears in a list of options with positive or neutral context.

1 point: Your brand is mentioned but not recommended or compared favorably.

0 points: Your brand doesn’t appear.

-1 point: Your brand appears with negative sentiment.

Calculate your total score and divide by the maximum possible score (3 points × number of prompts × number of platforms; at 20 prompts and 4 platforms, that maximum is 240). The result, expressed as a percentage, is your LLM share of voice.
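The rubric and the share-of-voice formula can be sketched as follows. Mapping “appears in a list with positive or neutral context” to a compared context is one reasonable encoding of the rubric, not the only one:

```python
def score_response(mentioned, position, sentiment, context):
    """Apply the 3/2/1/0/-1 rubric to a single logged response."""
    if not mentioned:
        return 0
    if sentiment == "negative":
        return -1
    if position == "first" or context == "recommended":
        return 3   # first mention or recommended as a top option
    if context == "compared":
        return 2   # appears in a list with positive/neutral context
    return 1       # mentioned in passing, not compared favorably

def share_of_voice(scores, n_prompts=20, n_platforms=4):
    """Total score over the maximum possible (3 points per test), as a percentage."""
    return 100 * sum(scores) / (3 * n_prompts * n_platforms)

# 80 tests (20 prompts x 4 platforms) that all land mid-list:
scores = [score_response(True, "middle", "neutral", "compared")] * 80
print(round(share_of_voice(scores), 1))  # 160 / 240 -> 66.7
```

Note that because of the -1 rule, a brand with heavy negative coverage can score slightly below zero; in practice most totals land between 0 and the maximum.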

Building the Tracking Spreadsheet

Create a spreadsheet with three tabs.

Tab 1: Monthly Raw Data

Columns: Date, Platform, Prompt, Your Brand Mentioned (Y/N), Position, Sentiment, Score, Competitors Mentioned, Notes.

One row per prompt per platform. At 20 prompts across 4 platforms, that’s 80 rows per month.

Tab 2: Monthly Summary

Columns: Month, Overall Score %, ChatGPT Score %, Claude Score %, Perplexity Score %, Google AI Overview Score %, Top Competitor Score %, Change vs. Last Month.

This tab shows your trend line. After three months, you’ll see whether your AEO efforts are moving the needle.

Tab 3: Competitor Comparison

Columns: Competitor Name, Times Mentioned (total across all prompts/platforms), Average Position, Most Common Context (recommended/compared/mentioned), Share of Voice %.

To calculate competitor share of voice: count how many times each competitor appears across all your prompts and platforms, then divide by the total mentions (yours + all competitors). This shows your relative position in the AI answer landscape.
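The competitor calculation reduces to counting mentions and normalizing. A small sketch, with illustrative brand names:

```python
from collections import Counter

def mention_shares(mention_lists):
    """mention_lists: one list of mentioned brands (yours included) per
    response. Returns each brand's share of total mentions as a percentage."""
    counts = Counter(brand for brands in mention_lists for brand in brands)
    total = sum(counts.values())
    return {brand: 100 * n / total for brand, n in counts.items()}

# Three responses; brand names are hypothetical.
responses = [
    ["Acme", "RivalCo"],
    ["RivalCo", "OtherApp"],
    ["Acme"],
]
shares = mention_shares(responses)
print(shares["Acme"])  # 2 of 5 total mentions -> 40.0
```

Because this normalizes by total mentions rather than total prompts, the shares always sum to 100%, which makes the relative-position story easy to chart.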

What the Scores Mean

Above 40% share of voice: Strong position. AI models recognize your brand as a category leader. Focus on maintaining this through continued press coverage and content freshness.

20-40% share of voice: Competitive position. You’re appearing but not dominating. Identify which prompt types you’re missing (category vs. problem vs. brand) and target those gaps.

10-20% share of voice: Emerging presence. You’re on the map but not a default recommendation. Prioritize entity-building activities: press coverage, Wikipedia, structured data, and authoritative content.

Below 10% share of voice: Low visibility. AI models don’t associate your brand with your category. This requires a foundational AEO strategy starting with press coverage and entity signals.

Benchmarking Against Competitors

The absolute score matters, but the relative score matters more. A 25% share of voice sounds modest until you realize your closest competitor has 15%.

For competitive benchmarking, pick your top 3-5 competitors and run the same prompts. Track their scores alongside yours.

Look for patterns:

Which platforms favor which brands? You might dominate on Perplexity but trail on ChatGPT. That tells you something about the content sources each platform prioritizes.

Which prompt types reveal gaps? If competitors beat you on category prompts but you win on problem prompts, your content is solving problems but your brand recognition needs work.

Which competitors are gaining ground? Month-over-month changes in competitor scores tell you who’s investing in AEO. A competitor whose score jumps 10 points in a month has likely just launched a press campaign or published a batch of authoritative content.

How to Improve Your Share of Voice

Low scores point to specific actions.

Missing from category prompts: AI models don’t associate your brand with the category. Fix this with press coverage that names your brand alongside category terms. “Acme, a project management platform” creates the association. Publish comparison content and category guides on your own site.

Missing from problem prompts: AI models don’t connect your brand to the problems you solve. Create content that matches the exact phrasing of problem prompts. If customers ask “How do I manage remote teams?” publish content titled “How to Manage Remote Teams” that mentions your product as a solution.

Low position when mentioned: AI models know about you but don’t prioritize you. This is an authority signal problem. Increase press coverage in higher-authority publications. Build more third-party mentions and reviews. Strengthen your Wikipedia and Wikidata presence.

Negative sentiment: Something in the AI’s training data or retrieval sources paints your brand poorly. Identify the source (often a negative review, Reddit thread, or news article) and address it with positive coverage that outweighs the negative signal.

Tools That Help

Several tools support LLM share of voice tracking, though none automate it completely.

Semrush tracks Google AI Overview appearances for your target keywords. It shows which brands appear in AI Overviews and how often.

Ahrefs offers similar AI Overview tracking with competitive comparison features.

Perplexity API lets you run prompts programmatically and parse responses for brand mentions. Costs roughly $5 per 1,000 queries.

OpenAI API gives you programmatic access to ChatGPT. You can script prompt tests and log responses automatically. Costs vary by model but budget $20-50 per month for monitoring.

Anthropic API provides the same capability for Claude. Similar pricing structure.

Google Sheets + manual testing remains the most accessible approach for teams without engineering resources. Run the prompts by hand, log the results, and build your trend line over time.
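If you do script prompt runs through the OpenAI or Anthropic APIs, the collected responses still need parsing before scoring. A naive mention detector, assuming exact whole-word brand names (brand names here are hypothetical; real responses often need alias and plural handling):

```python
import re

def detect_mention(response, brand, competitors):
    """Locate brand mentions in an AI response by whole-word search.
    Deliberately naive: aliases, abbreviations, and plurals are not handled."""
    def first_pos(name):
        m = re.search(rf"\b{re.escape(name)}\b", response)
        return m.start() if m else None

    positions = {b: first_pos(b) for b in [brand] + competitors}
    # Brands ordered by where they first appear in the response text.
    order = [b for _, b in sorted((p, b) for b, p in positions.items()
                                  if p is not None)]
    return {
        "mentioned": brand in order,
        "rank": order.index(brand) + 1 if brand in order else None,
        "competitors_mentioned": [b for b in order if b != brand],
    }

result = detect_mention(
    "For small teams, RivalCo and Acme are both solid; Acme is simpler.",
    "Acme", ["RivalCo", "OtherApp"],  # hypothetical brand names
)
print(result["rank"])  # Acme is the second brand mentioned -> 2
```

Sentiment and context still need human judgment (or a second model pass), but automating mention detection and ordering removes most of the tedium from the 80-row monthly log.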

Reporting for Stakeholders

When presenting share of voice data to leadership, clients, or stakeholders, focus on three views:

The trend line. A simple chart showing your share of voice percentage over time. This answers “Are we improving?”

The competitive scorecard. A table comparing your score to top competitors. This answers “Where do we stand?”

The gap analysis. A breakdown of which prompt types and platforms represent the biggest opportunities. This answers “Where should we invest next?”

Avoid drowning stakeholders in raw data. The 80 data points per month matter for your analysis, but the executive summary needs three numbers: your current score, the change from last month, and the gap to the top competitor.

The Monthly Ritual

Set a recurring calendar event for the first week of each month. Run your 20 prompts across 4 platforms. Log the results. Calculate your scores. Compare to last month and to competitors.

This takes about 2-3 hours per month if you’re doing it manually. With API automation, you can cut that to 30 minutes of review time.

The data compounds. After six months, you’ll see which of your AEO investments are producing results and which aren’t. After twelve months, you’ll have a clear picture of your brand’s trajectory in AI search.

Share of voice in LLMs is the metric that separates brands with an AEO strategy from brands that are guessing. Start measuring it this month. The sooner you have a baseline, the sooner you can improve it.