What is an AEO audit?

An AEO audit measures whether AI answer engines like ChatGPT, Perplexity, and Google's AI Overviews can find, trust, and cite your brand. It checks your entity data, your presence on the sources those engines pull from, and how often you actually surface in answers to the questions your buyers ask.

How is an AEO audit different from an SEO audit?

An SEO audit asks whether you rank on a page of blue links. An AEO audit asks whether you get named inside a generated answer, where there is no page two and usually only three or four brands get mentioned. The inputs overlap, but the scoring and the stakes are different.

How often should you run an AEO audit?

Quarterly for most brands, monthly if you are in a fast-moving category or actively running an AEO campaign. AI models get retrained and citation patterns shift faster than Google's index changes, so a score from six months ago tells you little about today.

Can you run an AEO audit yourself?

Yes. The nine checks below need no special software beyond access to the major AI engines and your own analytics. The harder part is acting on what you find, because most fixes touch sources you do not own, like Wikipedia, Reddit, and trade publications.

AEO Audit Framework: 9 Checks for AI Visibility

Most brands have no idea whether ChatGPT recommends them or skips them, and that blind spot is getting expensive. A buyer asks an AI engine “who are the best options for X,” gets four names back, and three competitors are on the list while you are not. You never see the loss. No analytics dashboard logs the sale that went to whoever the model named. The AEO audit framework below exists to make that invisible loss visible, by scoring your AI visibility across nine specific checks you can run this week.

I built this as a numbered sequence on purpose. An AEO audit that hands you a vague grade (“your AI presence is weak”) changes nothing. One that tells you exactly which of nine layers is broken tells you what to fix first. Call it the nine-layer visibility score. Each layer is worth points, each maps to a concrete action, and the lowest-scoring layer is almost always where your next win hides.

Layer 1: the direct-question test

Start where your buyers start. Open ChatGPT, Perplexity, Claude, and Google’s AI Overviews, and ask each one the exact question a customer would type. Not your brand name, the category question: “best AEO agency for SaaS,” “top press release distribution services,” “who helps with Google Knowledge Panels.” Write down whether you appear, where in the answer, and who appears alongside you.

This is layer one because it is the truth the other eight layers explain. If you surface in three of four engines, you score high here and your problem is consistency. If you surface in zero, you have a foundational gap and the remaining checks will tell you which one. Run each question three times, because generated answers vary between sessions, and a single test can flatter or punish you by luck.
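
If you log those runs by hand, a few lines of code can turn them into a per-engine appearance rate. This is a minimal sketch, assuming you record each run as an (engine, question, appeared) tuple. The function name `appearance_rates` and the sample log are hypothetical, not results from a real audit.

```python
from collections import defaultdict

def appearance_rates(observations):
    """Return, per engine, the fraction of logged runs in which the brand appeared.

    observations: iterable of (engine, question, appeared) tuples recorded by hand.
    """
    seen = defaultdict(int)
    total = defaultdict(int)
    for engine, _question, appeared in observations:
        total[engine] += 1
        if appeared:
            seen[engine] += 1
    return {engine: seen[engine] / total[engine] for engine in total}

# Hypothetical log: one category question, three runs per engine.
runs = [
    ("ChatGPT", "best AEO agency for SaaS", True),
    ("ChatGPT", "best AEO agency for SaaS", False),
    ("ChatGPT", "best AEO agency for SaaS", True),
    ("Perplexity", "best AEO agency for SaaS", False),
    ("Perplexity", "best AEO agency for SaaS", False),
    ("Perplexity", "best AEO agency for SaaS", False),
]
rates = appearance_rates(runs)
```

A rate built from three runs is still a small sample, so treat it as a rough signal rather than a measurement.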

Layer 2: entity recognition

The models cannot recommend a brand they do not understand as a distinct thing. Ask each engine “what is [your brand]” and read the answer cold. Does it know what you do, who you serve, and what makes you different? Or does it hedge, confuse you with a similarly named company, or admit it has no information?

A clean, confident description means the model holds a solid entity for you, which is the precondition for every recommendation downstream. A muddled answer means your entity data is thin or contradictory across the web, and no amount of clever content fixes that until the underlying facts line up. Entity recognition is the rung every other layer stands on, and most brands that fail layer one do so because they failed here first.

Layer 3: source presence

AI engines do not invent recommendations. They synthesize them from sources: Wikipedia, Reddit threads, trade publications, listicles, review sites, and the handful of pages they trust on a topic. Layer three audits whether you exist on those sources at all. Pick your top five category questions, look at which pages the AI citations point to, then check whether your brand appears on any of them.

This is usually where the audit gets uncomfortable. Brands discover that the deciding sources for their category are a Reddit thread from 2024, two roundup articles they have never heard of, and a G2 grid, none of which mention them. You cannot rank your way onto a generated answer. You earn your way onto the sources behind it, which is a publication and reputation problem more than a content one.

Layer 4: citation share

Presence is binary. Citation share is the gradient. For each category question, count how many of the cited sources mention you versus your strongest competitor. If a rival shows up in nine of ten sources and you show up in two, the model has nine reasons to name them and two to name you, and it will name them.

Score this as a ratio, not a yes or no, because it tells you how far you have to climb. Closing a citation-share gap is slow work, and seeing the real number keeps you honest about the timeline. A brand at twenty percent citation share does not get to parity in a month, and pretending otherwise just sets up a disappointed client call later.
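
The ratio itself is simple arithmetic. The sketch below assumes you have listed, for one category question, which brands each cited source mentions. The `citation_share` helper, the source labels, and the brand names are all made up for illustration.

```python
def citation_share(cited_sources, brand):
    """Fraction of cited sources whose mentioned brands include `brand`."""
    if not cited_sources:
        return 0.0
    hits = sum(1 for brands in cited_sources.values() if brand in brands)
    return hits / len(cited_sources)

# Hypothetical source pool for one question: source label -> brands it mentions.
sources = {
    "reddit-thread": {"RivalCo"},
    "roundup-a": {"RivalCo", "YourBrand"},
    "roundup-b": {"RivalCo"},
    "review-grid": {"RivalCo", "YourBrand"},
    "trade-pub": {"OtherCo"},
}
yours = citation_share(sources, "YourBrand")  # 2 of 5 sources
rival = citation_share(sources, "RivalCo")    # 4 of 5 sources
gap = rival - yours
```

Tracking `gap` per question, rather than one blended number, shows which questions are closest to parity.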

Layer 5: freshness and recency

Models favor sources that look current, and they increasingly timestamp their reasoning. A brand whose last press mention was 2023 reads as stale, even if the company is thriving. Check the dates on the sources that mention you. If your freshest citation is two years old while a competitor earned coverage last month, recency is quietly working against you.

The fix is a steady drip of new mentions rather than one big splash. Ten modest placements spread across a year beat a single feature that then ages out. Freshness is less about volume than about never going dark, because a brand that publishes and earns coverage on a rhythm signals to every engine that it is active and worth surfacing now.
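
Both halves of the freshness check, how old your newest mention is and how long you have gone dark between mentions, reduce to date arithmetic. This is a rough sketch, assuming you have collected the publication dates of the sources that mention you. The function names and dates are illustrative only.

```python
from datetime import date

def days_since_newest(mention_dates, today):
    """Days between `today` and the most recent mention; None if there are none."""
    if not mention_dates:
        return None
    return (today - max(mention_dates)).days

def longest_gap_days(mention_dates):
    """Longest silence, in days, between consecutive mentions."""
    ordered = sorted(mention_dates)
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    return max(gaps, default=0)

# Hypothetical mention dates for one brand.
mentions = [date(2024, 1, 10), date(2024, 3, 10), date(2024, 9, 10)]
age = days_since_newest(mentions, today=date(2024, 10, 10))
silence = longest_gap_days(mentions)
```

Running the same two functions on a competitor's mention dates gives you the comparison this layer is really about.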

Layer 6: consistency of facts

When the web disagrees about your basic facts, the model gets cautious. One source says you were founded in 2019, another says 2021. Your headquarters is listed in two cities. Your service list reads differently on three directories. Each contradiction lowers the model’s confidence and makes it likelier to hedge or skip you.

Audit your core facts across every place they appear: your site, your LinkedIn, Crunchbase, directories, and any profile you do not control. Reconcile them to one canonical version. This is unglamorous and it moves the needle more than most content work, because the models reward the boring brand whose story is identical everywhere over the interesting one whose facts wobble.
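
Once the facts are copied into one place, spotting disagreements can be automated. The sketch below assumes you have typed each profile's facts into a dictionary by hand. `find_conflicts`, the field names, and the sample values are hypothetical, and the normalization (trim and lowercase) is deliberately crude.

```python
from collections import defaultdict

def find_conflicts(profiles):
    """Return {field: {value: [sources]}} for every field with more than one distinct value."""
    values = defaultdict(lambda: defaultdict(list))
    for source, facts in profiles.items():
        for field, value in facts.items():
            # Normalize lightly so "Austin" and "austin " count as the same value.
            values[field][str(value).strip().lower()].append(source)
    return {
        field: dict(by_value)
        for field, by_value in values.items()
        if len(by_value) > 1
    }

# Hypothetical facts gathered from three profiles.
profiles = {
    "website": {"founded": "2019", "hq": "Austin"},
    "linkedin": {"founded": "2019", "hq": "Austin"},
    "directory": {"founded": "2021", "hq": "austin "},
}
conflicts = find_conflicts(profiles)
```

Here only `founded` is flagged, and the output names the directory as the odd one out, which is the profile to correct first.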

Layer 7: sentiment in the source pool

Being mentioned is not the same as being mentioned well. Read the actual language in the sources the models pull from. A Reddit thread that names you as the overpriced option is a citation that hurts. Layer seven scores the tone of your presence, not just its existence, because models read sentiment and pass it through into how they frame you.

Negative or lukewarm sentiment in a high-authority source is worth fixing before you chase new placements, since a single trusted critical thread can outweigh several glowing ones. Map where the sentiment is poor, decide what is fair criticism worth answering and what is noise, and address the fair part in public where the models can see the response.

Layer 8: structured data and machine readability

Layer eight is the one engineers love because it is concrete. Does your site expose clean structured data: organization markup, clear service descriptions, FAQ schema, an about page that states plainly who you are and what you do? Machines read your site differently than humans, and a page that looks fine to a visitor can be a fog to a crawler.

This will not save a brand that fails layers two through seven, so do not over-index on it. But once the off-site fundamentals are in place, clean structured data is the cheap, fully-in-your-control multiplier that helps the models parse you correctly. Treat it as the finishing layer, not the foundation.
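
For a sense of what organization markup looks like, here is a small sketch that builds a schema.org `Organization` object as JSON-LD. The properties used (`name`, `url`, `description`, `sameAs`) are standard schema.org fields. The helper name `organization_jsonld` and every value passed to it are placeholders, not a recommended template.

```python
import json

def organization_jsonld(name, url, description, same_as):
    """Build a schema.org Organization object and return it as a JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "description": description,
        "sameAs": list(same_as),  # profiles elsewhere that describe the same entity
    }
    return json.dumps(data, indent=2)

# Placeholder values for illustration.
markup = organization_jsonld(
    name="Example Co",
    url="https://www.example.com",
    description="Example Co provides press release distribution for SaaS companies.",
    same_as=["https://www.linkedin.com/company/example-co"],
)
```

The resulting string is typically embedded in the page inside a `<script type="application/ld+json">` tag. Keeping its values identical to the canonical facts from layer six is what makes it useful.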

Layer 9: the re-ask and the trend line

The final layer is time. Run layers one through eight today, write down the scores, then run them again next quarter. A single audit is a snapshot. The trend across audits is the real signal, because AEO is not a project you finish but a position you hold against competitors who are also working at it.
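
Comparing two audits is a per-layer subtraction. This is a minimal sketch, assuming each layer was given a numeric score in both audits. The layer keys, the `layer_trend` helper, and the scores are invented for the example.

```python
def layer_trend(previous, current):
    """Per-layer change between two audits (positive means improved)."""
    return {
        layer: current[layer] - previous[layer]
        for layer in current
        if layer in previous
    }

# Hypothetical scores from two consecutive quarterly audits.
last_quarter = {"direct_question": 1, "entity_recognition": 2, "source_presence": 0}
this_quarter = {"direct_question": 2, "entity_recognition": 2, "source_presence": 1}

delta = layer_trend(last_quarter, this_quarter)
slipping = [layer for layer, change in delta.items() if change < 0]
```

An empty `slipping` list with a flat total is still worth a look, since holding position while competitors gain is a loss in relative terms.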

How to score it without fooling yourself

The framework only helps if you score it honestly, and the temptation to grade generously is strong. Give each layer a simple score from zero to three: zero for absent, one for weak, two for solid, three for dominant. Add them up, and your total out of twenty-seven is a number you can track over time. The absolute score matters less than two things: which layers are lowest, and whether the total is climbing or falling against your own past audits. A brand obsessing over its total while ignoring its weakest layer is measuring the wrong thing.
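
The scoring described above fits in a short script. The sketch below is one way to do it, assuming integer scores from zero to three for all nine layers. The layer keys, the `summarize` function, and the sample scores are my own labels, not part of any standard.

```python
LAYERS = [
    "direct_question", "entity_recognition", "source_presence",
    "citation_share", "freshness", "consistency",
    "sentiment", "structured_data", "trend",
]

def summarize(scores):
    """Validate a nine-layer score sheet and return (total, weakest_layers)."""
    missing = [layer for layer in LAYERS if layer not in scores]
    if missing:
        raise ValueError(f"unscored layers: {missing}")
    for layer in LAYERS:
        if scores[layer] not in (0, 1, 2, 3):
            raise ValueError(f"{layer} must be scored 0, 1, 2 or 3")
    total = sum(scores[layer] for layer in LAYERS)  # maximum is 9 * 3 = 27
    lowest = min(scores[layer] for layer in LAYERS)
    weakest = [layer for layer in LAYERS if scores[layer] == lowest]
    return total, weakest

# Hypothetical score sheet.
audit = {
    "direct_question": 1, "entity_recognition": 2, "source_presence": 0,
    "citation_share": 1, "freshness": 2, "consistency": 3,
    "sentiment": 2, "structured_data": 3, "trend": 1,
}
total, weakest = summarize(audit)
```

Returning the weakest layers alongside the total is deliberate, since the total alone hides exactly the information the audit exists to surface.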

Run the test queries more than once and across more than one engine, because a single session can flatter or punish you by chance, and the engines disagree with each other often enough that any one of them is a partial picture. Score what you actually observe, not what you hope, and resist the urge to credit yourself for a mention that was lukewarm or a presence that was really your competitor’s. The audit that tells you a comfortable story is worthless. The one that tells you exactly where you are losing is the one that lets you win.

Turn the audit into an order of operations

A score is diagnosis, not treatment, and the value of the framework is that it tells you what to fix first. Work from the foundation up. If layers two and three, entity recognition and source presence, are weak, fix those before you touch anything else, because no clever content or clean schema compensates for a model that does not understand who you are or never encounters you in its sources. Most brands that fail the direct-question test fail it here, and most try to fix it at the wrong layer, polishing their own website while the actual gap is that they appear nowhere the model reads.

Once the foundation is solid, move up the stack to citation share, freshness, sentiment, and structured data in roughly that order, because each builds on the last. Citation share grows slowly as you earn more presence in trusted sources, freshness comes from a steady rhythm of new mentions, sentiment improves as you address fair criticism in public, and structured data is the cheap finishing multiplier once everything beneath it is in place. The order matters because effort spent on a high layer while a low one is broken is effort wasted. Find your weakest foundational layer, fix that, re-audit, and repeat, and the trend line that actually predicts your AI visibility starts to climb.
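
The foundation-up order can be written down as a simple priority list. This sketch assumes the zero-to-three scale and treats a score of one or below as weak, which is my threshold rather than a rule from the framework. It covers only the layers named in the ordering above, so consistency of facts is left out of `FIX_ORDER`. All names are hypothetical.

```python
FIX_ORDER = [
    "entity_recognition", "source_presence",  # foundation first
    "citation_share", "freshness", "sentiment", "structured_data",
]

def next_fix(scores, weak_threshold=1):
    """Return the first layer in FIX_ORDER scoring at or below the threshold, else None."""
    for layer in FIX_ORDER:
        # A layer with no recorded score is treated as absent (0).
        if scores.get(layer, 0) <= weak_threshold:
            return layer
    return None

# Hypothetical scores: citation share is lowest, but a foundation layer is also weak.
scores = {
    "entity_recognition": 2, "source_presence": 1, "citation_share": 0,
    "freshness": 2, "sentiment": 2, "structured_data": 0,
}
first = next_fix(scores)
```

In this example the function picks `source_presence` even though `citation_share` scores lower, because a weak foundation layer takes priority over a weaker layer higher up the stack.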

Score your nine layers, total them, and the lowest number is your starting point. For most brands the floor is layer three or four, source presence and citation share, which is exactly why publication and reputation work sits at the center of any serious AEO effort. Run the framework, find your weakest layer, and fix that one before you touch anything else.
