Share of voice in traditional media tells you what slice of the market hears your message. In LLMs, the metric shifts: it’s about presence in AI-generated answers.
When someone asks an LLM about your category, does your brand appear? How often? How do you stack up against competitors? That’s share of voice in LLMs, and it matters because LLMs are becoming a primary way people discover information.
This post covers how to measure it, build a tracking system, benchmark against competitors, and improve your presence in AI responses.
What Is Share of Voice in LLMs?
Share of voice in LLMs measures how often your brand appears in AI-generated responses relative to your competitors, across one or multiple language models.
The metric answers three questions:
- Visibility: Does my brand get mentioned when someone asks about my category?
- Frequency: How many times does it appear versus competitors?
- Prominence: Does it appear early in the response, or buried at the end?
In traditional media, share of voice tracks ad impressions and spend. In LLMs, it tracks mentions in answer text.
Why measure it? Because LLMs are becoming a primary discovery channel. Perplexity has reported rapid usage growth, OpenAI has reported more than 200 million weekly ChatGPT users, and Google’s AI Overviews appear on a growing share of search results. If your brand doesn’t show up in these systems, a growing share of customers won’t see you.
How to Measure: The Prompt Testing Method
The most reliable way to measure share of voice in LLMs is direct testing—a structured set of prompts run across models, scored for presence and prominence.
Step 1: Define Your Test Prompts
Build a list of 15–30 queries that mirror real customer research. These should:
- Cover your main product categories
- Include comparison queries (“best X for Y”)
- Use competitor names (“how does X compare to Y”)
- Target intent that drives purchase (not vanity searches)
Example queries for an AEO tools company:
- “What tools help with answer engine optimization?”
- “How do I optimize my site for AI search?”
- “Best AEO platforms for content creators”
- “Compare answer engine optimization tools”
- “How to measure share of voice in LLMs”
- “Does Perplexity use my website content?”
Create a shared Google Sheet with one row per query. You’ll add columns for each model (ChatGPT, Claude, Perplexity, Gemini) and score results.
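If you would rather generate the sheet than type it, here is a minimal sketch that builds the same layout as CSV text you can import into Google Sheets. The `build_template` helper and the column names are illustrative assumptions, not a required format.

```python
import csv
import io

MODELS = ["ChatGPT", "Claude", "Perplexity", "Gemini"]

def build_template(queries, models=MODELS):
    """Return CSV text: one row per query, one empty score column per model."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Query"] + [f"{m} Score" for m in models])
    for query in queries:
        # Score cells start empty and are filled in during scoring.
        writer.writerow([query] + [""] * len(models))
    return buffer.getvalue()

queries = [
    "What tools help with answer engine optimization?",
    "Best AEO platforms for content creators",
]
template = build_template(queries)
print(template)
```

Save the printed text as a .csv file and import it; add or remove model names in `MODELS` to match the models you test.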
Step 2: Run Prompts Across Models
Use the web versions (or API if you have it) of:
- ChatGPT: openai.com/chatgpt
- Claude: claude.ai
- Perplexity: perplexity.ai
- Gemini: gemini.google.com
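Copy the exact prompt into each model, ideally in a fresh chat with no prior context. Capture the full response (screenshot or copy text). Responses vary from run to run, so either run each prompt once per model and treat single scores as noisy, or repeat each prompt a few times and average the scores.
Pro tip: Keep test conditions consistent from month to month: same prompts, same account settings, and the same model versions where you can choose them. Responses shift because of sampling randomness, model updates, and live web retrieval, not because of the time of day.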
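If you move from the web versions to APIs, the run loop can be sketched as below. This is only a skeleton: `AskFn` is a hypothetical stand-in for whatever function calls a given vendor's API, and `fake_client` exists purely so the example runs without network access or API keys. The `repeats` parameter defaults to one run per prompt and can be raised if you want to average out run-to-run variation.

```python
from typing import Callable, Dict, List

# Hypothetical type: a function that sends one prompt to one model and returns text.
AskFn = Callable[[str], str]

def run_prompts(prompts: List[str], clients: Dict[str, AskFn], repeats: int = 1):
    """Run every prompt against every model `repeats` times and keep the raw text."""
    results = []
    for prompt in prompts:
        for model_name, ask in clients.items():
            for run in range(1, repeats + 1):
                results.append({
                    "prompt": prompt,
                    "model": model_name,
                    "run": run,
                    "response": ask(prompt),
                })
    return results

# Stand-in client (an assumption for illustration); replace with real API wrappers.
def fake_client(prompt: str) -> str:
    return f"Placeholder answer to: {prompt}"

results = run_prompts(
    ["Compare answer engine optimization tools"],
    {"ModelA": fake_client, "ModelB": fake_client},
    repeats=2,
)
print(len(results))
```

Keeping the raw response text alongside the prompt, model, and run number means you can rescore later if your rubric changes.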
Step 3: Score Each Response
Use a simple scoring system:
| Score | Criteria |
|---|---|
| 3 | Brand mentioned in top 2 results or sentences; primary recommendation |
| 2 | Brand mentioned mid-response (after 1–2 other solutions); secondary recommendation |
| 1 | Brand mentioned at end or in list only; not highlighted |
| 0 | Brand not mentioned at all |
Enter the score in your spreadsheet for each brand (yours and competitors) in each model.
Example: For “best AEO platforms,” ChatGPT scores:
- Your brand: 3 (mentioned in opening sentence as top pick)
- Competitor A: 2 (mentioned 2nd, treated as alternative)
- Competitor B: 1 (listed at end)
- Competitor C: 0 (not mentioned)
Score every brand you care about. Usually 3–5 per category.
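A first-pass score can be automated from the position of the first mention. The sketch below is a rough approximation of the rubric, not a replacement for reading the response: it assumes simple sentence splitting and substring matching, and it cannot judge whether a mention is a "primary recommendation." The brand names are made up.

```python
import re

def score_mention(response: str, brand: str) -> int:
    """Approximate the 0-3 rubric from where the brand is first mentioned."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", response.strip()) if s]
    for index, sentence in enumerate(sentences):
        if brand.lower() in sentence.lower():
            if index < 2:
                return 3  # mentioned within the first two sentences
            if index < len(sentences) - 1:
                return 2  # mentioned mid-response
            return 1      # mentioned only in the final sentence
    return 0              # not mentioned at all

response = (
    "Acme is the top pick for AEO. It covers most needs. "
    "Beta is a solid alternative. Smaller teams may prefer it. "
    "Other options include Gamma."
)
for brand in ["Acme", "Beta", "Gamma", "Delta"]:
    print(brand, score_mention(response, brand))
```

Treat the output as a draft score and spot-check it by hand, especially for short brand names that can match inside other words.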
Building a Tracking Spreadsheet
Create a master tracking sheet with this structure:
Sheet 1: Monthly Tests
- Column A: Query
- Columns B–E: ChatGPT Score, Claude Score, Perplexity Score, Gemini Score
- Column F: Average Score
- Column G: Mentions (count of mentions per brand per model)
- Column H: Notes (e.g., “Position shifted after new content”)
Sheet 2: Competitor Scorecard
- Rows: Each competitor (including you)
- Columns: ChatGPT Avg, Claude Avg, Perplexity Avg, Gemini Avg, Total, % of Queries Where Mentioned
- Add a monthly tab—copy the scorecard forward each month to track trends
Sheet 3: Benchmark Targets
- Row 1: Your brand targets for Q2, Q3, Q4
- Row 2: Competitive average (baseline)
- Row 3: Actual performance (pull from Scorecard)
- Color-code green if you hit target, red if you miss
Update this sheet monthly. The trend line matters more than individual scores.
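The Competitor Scorecard columns can be derived from the raw monthly scores. Here is a minimal sketch, assuming each score is stored as a (query, model, brand, score) record and that "Total" means the sum of the per-model averages; both are assumptions, and the sample values are invented.

```python
from collections import defaultdict
from statistics import mean

# Each record: (query, model, brand, score on the 0-3 scale). Values are made up.
records = [
    ("best AEO platforms", "ChatGPT", "YourBrand", 3),
    ("best AEO platforms", "Claude", "YourBrand", 1),
    ("compare AEO tools", "ChatGPT", "YourBrand", 0),
    ("compare AEO tools", "Claude", "YourBrand", 2),
    ("best AEO platforms", "ChatGPT", "CompetitorA", 2),
    ("best AEO platforms", "Claude", "CompetitorA", 0),
    ("compare AEO tools", "ChatGPT", "CompetitorA", 0),
    ("compare AEO tools", "Claude", "CompetitorA", 0),
]

def build_scorecard(records):
    per_model = defaultdict(lambda: defaultdict(list))  # brand -> model -> scores
    per_query = defaultdict(lambda: defaultdict(list))  # brand -> query -> scores
    for query, model, brand, score in records:
        per_model[brand][model].append(score)
        per_query[brand][query].append(score)
    scorecard = {}
    for brand in per_model:
        model_avgs = {m: mean(s) for m, s in per_model[brand].items()}
        # A query counts as "mentioned" if any model scored the brand above 0.
        mentioned = sum(1 for s in per_query[brand].values() if max(s) > 0)
        scorecard[brand] = {
            "model_avgs": model_avgs,
            "total": sum(model_avgs.values()),
            "pct_queries_mentioned": 100 * mentioned / len(per_query[brand]),
        }
    return scorecard

scorecard = build_scorecard(records)
for brand, row in scorecard.items():
    print(brand, row)
```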
Advanced Scoring: Prominence & Authority
Basic scoring captures whether you appear. Advanced scoring captures how you appear.
Add a Prominence Multiplier based on context:
- 1.5x if mentioned as the solution (e.g., “Use X for AEO”)
- 1.0x if mentioned as an option among several
- 0.5x if mentioned only in a list
Add an Authority Score based on supporting language:
- “X is industry standard” = high authority language (multiplier 1.3x)
- “X offers” = neutral (1.0x)
- “X also exists” = low authority (0.7x)
Example: If Claude says, “Use Brand X—it’s the standard for AEO,” that’s a score of 3 (top mention) × 1.5 (solution language) × 1.3 (authority language) = 5.85.
This takes longer but gives you actionable insight into quality of mentions, not just quantity.
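The multipliers combine by simple multiplication. A small sketch, using the values listed above and hypothetical label names ("solution", "option", "list", "high", "neutral", "low") for the categories:

```python
PROMINENCE = {"solution": 1.5, "option": 1.0, "list": 0.5}
AUTHORITY = {"high": 1.3, "neutral": 1.0, "low": 0.7}

def weighted_score(base_score: int, prominence: str, authority: str) -> float:
    """Base score (0-3) times the prominence and authority multipliers."""
    return base_score * PROMINENCE[prominence] * AUTHORITY[authority]

# "Use Brand X, it's the standard for AEO": top mention, solution, high authority.
top = weighted_score(3, "solution", "high")
# A brand that only appears in a closing list with weak language.
weak = weighted_score(1, "list", "low")
print(round(top, 2), round(weak, 2))
```

The maximum possible score under this scheme is 5.85, so charts that mix basic and advanced scores need separate axes.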
Benchmarking Against Competitors
Share of voice only matters in context. Benchmark yourself against:
- Competitor average: Calculate the mean score of top 3–5 competitors per query. Your target should beat this by at least 15%.
- Market saturation: If only 40% of queries mention your category at all, getting a 2.5 average is strong. If 90% of queries mention it, you need a 2.8+.
- Your own baseline: Measure month 1 (baseline), then month 2, 3, etc. If you go from 1.2 to 1.8 average, that’s a 50% improvement.
Example benchmark framework:
- Underperforming: Your avg < competitor avg − 0.3
- At parity: Your avg within ± 0.3 of competitor avg
- Outperforming: Your avg > competitor avg + 0.3
Track this on your Scorecard sheet. Update targets quarterly.
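The benchmark bands and the baseline comparison can be sketched as two small functions. This assumes the ± 0.3 band defines parity and anything below the band counts as underperforming; the function names are illustrative.

```python
def classify(your_avg: float, competitor_avg: float, band: float = 0.3) -> str:
    """Place your average relative to the competitor average, using a parity band."""
    gap = your_avg - competitor_avg
    if gap > band:
        return "Outperforming"
    if gap < -band:
        return "Underperforming"
    return "At parity"

def percent_change(baseline: float, current: float) -> float:
    """Improvement over your own baseline, as a percentage."""
    return 100 * (current - baseline) / baseline

print(classify(2.1, 1.6))
print(classify(1.5, 1.6))
print(classify(1.0, 1.6))
print(round(percent_change(1.2, 1.8), 1))
```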
How to Improve Your Share of Voice
Scoring reveals where you stand. Now improve it.
1. Content Optimization for LLM Consumption
LLMs train on web data, and content with these traits appears more likely to be picked up and cited:
- Comprehensive articles (1500–2500 words covering full topic)
- Clear structure (H2s, bullet lists, tables—easier for models to extract)
- Original data (research, case studies, benchmarks—models cite original sources)
- Answer-ready language (“X is a method to…” vs “We provide solutions”)
Write content that answers the exact queries you’re testing. If you’re not showing up for “share of voice in LLMs,” write a guide on exactly that. Title it clearly. Make it comprehensive.
2. Technical SEO Signals
LLMs don’t directly see rankings, but SEO still matters indirectly:
- Content that ranks well gets linked, quoted, and crawled more often, so it is more likely to end up in training data
- Search-connected models pull from pages that already rank for the query
- Fast, crawlable sites are easier for AI crawlers to fetch
Strong SEO ≠ guaranteed LLM mentions, but it helps. Focus on: core web vitals, site architecture, internal linking, E-E-A-T signals (experience, expertise, authority, trustworthiness).
3. Citation & Byline Strategy
Search-connected models such as Perplexity cite sources inline, and other models are more likely to name a brand or author when their training data carries clear attribution. Build:
- Bylined content on industry publications (get real attribution in training data)
- Original research (most cited source type)
- Thought leadership positioning (personal brand + company)
Publish quarterly research reports. Get quoted in relevant publications. Build a POV that models learn to associate with your name.
4. Answer Engine Optimization
AEO-specific tactics:
- Answer featured snippets (position 0) on your own site. These concise answers are easy for search-connected models to lift
- FAQ schema markup (helps LLMs understand Q&A content)
- Definitions & glossaries (LLMs love reference content)
- Data-backed claims (“73% of marketers report…” gets cited)
If Perplexity or Claude references your site in responses, you’re visible. Make that easier by creating content models actually want to cite.
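For the FAQ schema item, the markup is a schema.org `FAQPage` object in JSON-LD. A minimal sketch that generates it from question and answer pairs; the `faq_jsonld` helper and the sample text are illustrative assumptions.

```python
import json

def faq_jsonld(pairs):
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }

markup = faq_jsonld([
    (
        "What is share of voice in LLMs?",
        "It measures how often a brand appears in AI-generated responses "
        "relative to competitors.",
    ),
])
print(json.dumps(markup, indent=2))
```

Place the printed JSON inside a `<script type="application/ld+json">` tag on the page that contains the matching visible questions and answers.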
5. Direct Model Integration
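Some vendors offer ways to work with them directly, though as far as public documentation shows, none lets you simply request inclusion in training data:
- Perplexity runs a publisher partnership program
- Anthropic publishes no submission process for Claude that we know of; check its current documentation before assuming one exists
- ChatGPT can draw on recent web content through search (SEO helps here)
This isn’t guaranteed, and programs change, so verify what each vendor currently offers before committing resources.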
Tools That Help
A few tools accelerate share of voice measurement:
Typeform + Zapier: Automate prompt submission and result logging (build a form, have team members submit responses, Zapier logs to spreadsheet). Costs: $25–50/month.
Perplexity API: Run prompts programmatically at scale, log results automatically. Costs: roughly $20 per 1M tokens (about $5 per 1000 queries) as a ballpark; pricing changes often, so check the current pricing page before budgeting. Best for large-scale testing.
Google Sheets + Apps Script: Write a custom script that queries (via API) and logs results. Free but requires coding.
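OpenAI API (the models behind ChatGPT): Run prompts at scale. Pricing is per token and varies widely by model, so treat $0.50 per 1M tokens as a rough figure and check current rates. Use for baseline testing, keeping in mind that API responses can differ from what the ChatGPT web app returns.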
Manual + Spreadsheet: Honestly, most companies start here. 30 prompts × 4 models = 120 runs, or about 2 hours/month. Free, low overhead, easy to iterate.
Reporting to Stakeholders
Present share of voice in three formats:
1. Trend Line
- Y-axis: Average score (0–3 with basic scoring; up to 5.85 with the advanced multipliers)
- X-axis: Months
- Lines for: Your brand, top competitor, market average
- Tells executives whether share is growing or shrinking
2. Scorecard
- Row per brand
- Columns: ChatGPT, Claude, Perplexity, Gemini, Total, % of Queries
- Green/red highlighting
- Tells teams where to focus effort (which models, which competitors to beat)
3. Query-Level Insights
- Table: Query, Your Score, Competitor A Score, Competitor B Score, Opportunity
- Identifies specific gaps (“We don’t appear in comparison queries”)
- Drives content and optimization priorities
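The Opportunity column can be filled in by comparing your score with the best competitor score on each query. A minimal sketch, assuming the row structure shown here and invented scores:

```python
# One row per query; competitor labels and scores are made up for illustration.
rows = [
    {"query": "best AEO platforms", "you": 3, "competitors": {"A": 2, "B": 1}},
    {"query": "compare AEO tools", "you": 0, "competitors": {"A": 3, "B": 1}},
    {"query": "optimize site for AI search", "you": 1, "competitors": {"A": 1, "B": 2}},
]

def find_opportunities(rows):
    """Return (query, leading competitor, gap) for queries where a competitor leads."""
    opportunities = []
    for row in rows:
        best_name, best_score = max(row["competitors"].items(), key=lambda kv: kv[1])
        gap = best_score - row["you"]
        if gap > 0:
            opportunities.append((row["query"], best_name, gap))
    # Largest gaps first, so the biggest content priorities surface at the top.
    return sorted(opportunities, key=lambda item: item[2], reverse=True)

opportunities = find_opportunities(rows)
for query, competitor, gap in opportunities:
    print(f"{query}: Competitor {competitor} leads by {gap}")
```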
Pair metrics with wins: “Share of voice in Claude increased 40% after publishing the AEO guide. We now show up in 85% of methodology queries.”
Building Momentum
Start with a baseline month (Month 1). Set modest targets (Month 2: 10% improvement). Measure, optimize, measure again.
After three months, you’ll see patterns:
- Which models favor your content
- Which queries are hardest to crack
- Whether your optimizations actually move the needle
Share of voice in LLMs isn’t the only metric that matters, but it’s the one that tells you whether your brand appears when customers ask AI for answers. That’s worth measuring.
Next step: Build your query list this week. Run your first test next week. Use this framework to track it monthly. In 90 days, you’ll know exactly where you stand and what to do next.