Most AEO advice online is written at the level of “publish good content and earn citations.” That’s true as far as it goes, but it leaves out the actual mechanism — what happens inside the AI system when it decides to mention your brand. Without understanding the mechanism, the advice becomes a set of superstitions: “I tried this and it worked” versus “I tried this and it didn’t,” with no way to explain why.

This post goes one layer deeper. Not deep enough to satisfy an ML researcher, but deep enough to make the tactical advice actually make sense. (For the practical playbook, see the AEO guide for 2026.)

The two pathways

When you ask a language model a question and it produces an answer that names specific brands, the brand names got into the answer through one of two pathways. Sometimes both.

Pre-training pathway. The model was trained on a massive corpus of internet text. During training, the model learned statistical associations between words and concepts. If your brand name appeared frequently in the training corpus alongside certain topics, products, or descriptions, the model learned those associations. When a user later asks a question that activates those associations, the model generates an answer that includes your brand name. This pathway doesn’t require any live internet access. A model that’s been cut off from the internet since its training date can still mention your brand if you were in the corpus.

Retrieval pathway. At query time, the AI system runs a search against a live web index or a specialized retrieval corpus, pulls a handful of relevant documents, and passes them into the model’s context window along with the original question. The model generates an answer that’s grounded in those documents, and typically cites them with inline links. This pathway does require live internet access, and the citations reflect what exists on the web right now rather than what existed when the model was trained.

Different AI products weight these pathways differently. Perplexity and Google’s AI Overviews lean heavily on retrieval; a model answering without browsing relies entirely on pre-training; most consumer assistants blend the two depending on the query.

Knowing which product you’re targeting changes which pathway you should optimize for. (For a focused look at one platform, see how to get cited in ChatGPT.)

Pre-training: how brands get baked in

During pre-training, a language model processes a huge corpus of text — hundreds of billions of tokens from crawled web pages, books, code, and specialized datasets. For every token in the corpus, the model updates its parameters to make the token more predictable given the surrounding context.

For a brand like “Acme Payroll” to become something the model will mention in an answer about small business payroll, two things need to be true in the training corpus.

First, the brand name needs to appear frequently. A brand that’s mentioned 10 times in a trillion-token corpus is statistical noise. A brand that’s mentioned 10,000 times has signal. The specific threshold varies by how distinctive the brand name is and how clustered the mentions are, but the general rule is that volume matters.

Second, the mentions need to appear in contexts that associate the brand with the topics you want to be recalled for. If Acme Payroll is mentioned 10,000 times but 9,000 of those are in a single subreddit thread about a specific controversy, the model learns to associate the brand with the controversy, not with the payroll category. You want your brand mentioned in contexts that look like “Acme Payroll, a payroll platform for small businesses, processed X in transactions this year” — topic-aligned mentions, across many different sources, with consistent framing.
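
The underlying statistic is just co-occurrence at enormous scale. As a toy illustration (the corpus, brand, and topic words below are invented for this post’s running “Acme Payroll” example, and real LLM training is far more than counting), here is the kind of association that frequent, topic-aligned mentions create:

```python
# Toy illustration: count how often a brand co-occurs with topic words in
# a tiny "corpus". A model's learned associations reflect this same basic
# statistic, at vastly larger scale and through gradient updates rather
# than explicit counting.
from collections import Counter

corpus = [
    "Acme Payroll, a payroll platform for small businesses, raised prices",
    "best payroll software for small business: Acme Payroll tops the list",
    "Acme Payroll review: payroll automation for small teams",
    "the subreddit thread about the Acme Payroll controversy",
]

brand = "acme payroll"
topic_words = {"payroll", "small", "business", "businesses"}

cooccurrence = Counter()
for doc in corpus:
    text = doc.lower()
    if brand in text:  # only documents that actually mention the brand
        for word in text.split():
            word = word.strip(":,.")
            if word in topic_words:
                cooccurrence[word] += 1

print(cooccurrence.most_common(3))
```

If the controversy thread dominated the corpus instead, the top co-occurring words would be controversy-related, which is exactly the failure mode described above.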

This is exactly what press coverage, trade publication features, review roundups, and knowledge panel establishment do. Not one giant mention but thousands of small, consistent mentions in trusted sources over time. That’s the pattern that moves pre-training.

Retrieval: how citations get picked at query time

Retrieval is more tractable because it’s happening in real time and the decisions are observable.

When a retrieval-based AI product receives a query, the workflow is roughly:

  1. The query gets rewritten or expanded into one or more search queries.
  2. Those queries get sent to a search index (Google’s index for AI Overviews, Bing’s for Copilot, their own crawler for Perplexity).
  3. The top N results are retrieved. N is usually between 5 and 20.
  4. The retrieved pages are fetched, parsed, and ranked for relevance to the original question.
  5. The top few pages (often 3 to 8) are passed into the model’s context window as source material.
  6. The model generates an answer that’s grounded in those sources, with inline citations.
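
The six steps above can be sketched as a single function. Everything here is a hypothetical stand-in — the function names, the query-expansion templates, and the callables for search, fetching, ranking, and generation are placeholders, since each product (AI Overviews, Copilot, Perplexity) has its own internals:

```python
# Minimal sketch of the retrieval workflow, with the search backend,
# ranker, and generator injected as callables. Illustrative only.

def answer_with_retrieval(question, search, fetch, rank, generate,
                          top_n=10, context_k=5):
    """Steps 1-6: rewrite, search, retrieve, rank, ground, generate."""
    # 1. Rewrite/expand the user question into one or more search queries.
    queries = [question, f"{question} best options", f"{question} comparison"]

    # 2-3. Run each query against the index; keep the top N results each.
    results = []
    for q in queries:
        results.extend(search(q, limit=top_n))

    # 4. Fetch and parse the candidate pages, then rank them for
    #    relevance to the original question.
    pages = [fetch(r) for r in results]
    ranked = rank(question, pages)

    # 5. Pass only the top few pages into the model's context window.
    sources = ranked[:context_k]

    # 6. Generate an answer grounded in (and citing) those sources.
    return generate(question, sources)
```

The structural point is step 5: however well you rank in step 3, only a handful of pages survive into the context window, so the competition for citations is tighter than the competition for a first-page ranking.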

Winning citations in this pathway is essentially a two-step problem. First, your page needs to rank in the underlying search. Second, your page needs to look useful to the model when it’s reading the retrieved content.

The first step is classical SEO. The second step is where AEO-specific tactics come in. The model is looking for pages that directly answer the question, use clean structure, contain specific facts and numbers, and look like authoritative sources. A page that ranks fifth on Google but has the clearest direct answer to the question sometimes gets cited over the page that ranks first with a hedged marketing answer.
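
These signals are mechanical enough to check. The scoring function below is purely illustrative — the weights and patterns are invented, not drawn from any real system — but it shows that “directly answers, clean structure, specific facts, no hedging” are properties you can audit on your own pages:

```python
import re

# Purely illustrative heuristic: the weights are made up. The point is
# only that extractability signals are checkable, not subjective.
def extractability_score(page_text):
    first_para = page_text.strip().split("\n\n")[0]
    score = 0
    # Direct answer up front: a short first paragraph that commits to a claim.
    if len(first_para) < 400 and not first_para.endswith("?"):
        score += 2
    # Specific facts: numbers, prices, dates (capped so length doesn't win).
    score += min(len(re.findall(r"\d+", page_text)), 3)
    # Clean structure: headings or list items anywhere on the page.
    if re.search(r"^#+ |^- |^\d+\. ", page_text, re.MULTILINE):
        score += 2
    # Hedged marketing language hurts extraction.
    for hedge in ("industry-leading", "best-in-class", "revolutionary"):
        if hedge in page_text.lower():
            score -= 1
    return score

direct = "Acme Payroll costs $40 per month for up to 10 employees.\n\n## Pricing\n- Base plan: $40"
vague = "Our revolutionary, industry-leading platform transforms payroll."
print(extractability_score(direct), extractability_score(vague))
```

The page with a priced, committed answer scores well; the marketing sentence scores near zero despite being about the same product.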

This is why featured snippet optimization and AEO optimization overlap so heavily. The signals that make a page extractable for a featured snippet also make it extractable for a retrieval-based AI citation.

The hybrid case

Most consumer AI products use both pathways, weighted differently for different query types.

On a factual query with a clear single answer (“what year did WWII end”), retrieval usually dominates. The model pulls a reliable source and uses it.

On a subjective or exploratory query (“what are some good podcasts about history”), pre-training matters more. The model draws on associations baked in during training, with retrieval supplementing.

On a brand-specific query (“tell me about Acme Payroll”), both matter. The model uses pre-training associations to frame the answer and retrieval to get current facts.

Your AEO program has to address both. The on-site content and structural work addresses the retrieval pathway. The press, review, and citation work addresses the pre-training pathway. Neither alone is enough.

Why schema helps less than you think

Structured data markup (Schema.org / JSON-LD) is often pitched as a major AEO lever. The honest answer is that it helps, but less than the marketing suggests.

Schema directly helps retrieval-based systems that parse it — Google’s AI Overviews, and some of Bing’s systems. It helps those systems understand page structure, entity types, and relationships.
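
For concreteness, here is what that markup looks like for the running Acme Payroll example (the company, URL, and description are invented; the schema.org `@context`, `@type`, and property names are real vocabulary):

```python
import json

# A minimal schema.org Organization block of the kind retrieval systems
# parse. The company details are this post's fictional running example.
markup = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Acme Payroll",
    "url": "https://example.com",
    "description": "Payroll platform for small businesses.",
}

# Embedded in a page as a JSON-LD script tag:
script_tag = (
    '<script type="application/ld+json">'
    + json.dumps(markup)
    + "</script>"
)
print(script_tag)
```

A parser that understands JSON-LD gets the entity type and relationships for free; a language model reading the raw page just sees more tokens.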

Schema does not directly help pre-training. Language models trained on raw web text don’t get any special signal from JSON-LD markup because the markup looks like weird JSON to them, not structured data.

That said, the indirect effect is real. Pages with good schema tend to rank better in search, which means they get retrieved more often, which means the text of those pages ends up in more AI contexts. Schema is more a ranking input than a direct AEO lever.

Worth doing. Not worth overhyping. Spend 10 percent of your AEO effort on schema and 90 percent on content and off-site work.

The retrieval signal list

Here’s the concrete list of things the retrieval-based systems seem to weight highly, based on observed behavior and some public documentation:

  1. Ranking in the underlying search index for the queries the system generates.
  2. A direct, committed answer to the question near the top of the page.
  3. Clean structure: descriptive headings, short paragraphs, lists and tables where they fit.
  4. Specific facts and numbers rather than hedged marketing language.
  5. Signals of authority: the page reads like a source, not a pitch.

If your page checks most of these boxes, it will start showing up as a retrieved source for queries in its topic area.

The pre-training signal list

This pathway is less observable, but here’s what the pattern suggests, based on brand-frequency analysis in answers from models trained on different snapshots:

  1. Raw mention volume across the corpus.
  2. Topic-aligned context: mentions that tie the brand to the category you want recall for.
  3. Source diversity: many small mentions across many trusted sites, not one big cluster.
  4. Consistent framing of what the company is and does.

This is the harder pathway to influence because the feedback loop is slow (training happens every 6 to 18 months) and you can’t see what’s in the corpus directly. But the tactical moves are clear: earn press, build entity recognition, publish on authoritative sources, be consistent in how you describe your company.

The takeaway

AEO is two different optimization problems layered on top of each other. One is near-real-time retrieval optimization that rewards the same signals SEO has always rewarded, plus some new structural ones around question-answer formatting. The other is long-cycle brand building through citation density in authoritative sources, which happens over months and shows up in future model training runs.

The brands winning at AEO are doing both. The ones that treat it as pure content work are getting some retrieval wins but missing the pre-training pathway. The ones that only do PR are getting the pre-training work but failing at retrieval because their on-site content isn’t extractable.

The combined program is harder, but it’s also more defensible — the companies that establish strong positions in both pathways compound their advantage over competitors who only do one.