Every month, a bigger share of buyer research starts inside ChatGPT instead of Google. The brands named in those answers win the buyer. The brands skipped do not get a second chance at that query. LLM optimization is the discipline of making sure you're the one named.
This guide walks through LLM optimization end to end — what it is, how LLMs actually decide which brands to mention, the four levers that move the needle, the measurement workflow that proves it's working, and the mistakes that waste budget without producing results. The discipline goes by several names in 2026. LLM optimization is the technical term. AEO — Answer Engine Optimization — is what most agencies use in marketing copy. GEO, Generative Engine Optimization, is the Search Engine Land framing. Same workflow, three labels. Pick whichever one your stakeholders respond to and keep going.
What LLM optimization actually is
Traditional SEO optimizes for a list of ranked URLs returned by a search engine. LLM optimization optimizes for a paragraph of generated text returned by a language model. The goal isn't to rank on page one. The goal is to be the brand the model names when it composes the answer.
The shift matters because the output surface is smaller. A Google results page can list ten brands and a buyer might scroll to the fifth. An LLM answer names three or four brands and that's the whole result. There is no "position five." You are in the answer or you are invisible to that buyer for that query. The upside of the smaller surface is that the brands that win compound harder — being named in one answer tends to correlate with being named in the next one, because the same authority signals that drove the first mention drive the second.
LLM optimization overlaps with SEO but is not a subset of it. The inputs share common ground — authority, backlinks, content depth, structured data — but LLM optimization adds layers SEO traditionally ignores: tier-one press placements weighted for training data, entity disambiguation through Wikipedia and knowledge graphs, community presence on Reddit and Quora, and cross-platform consistency for the model to resolve your brand as a distinct entity.
How LLMs decide which brands to name
Every major LLM answers brand questions from two distinct information layers. Understanding both is the foundation of everything else.
Layer one: training data
Training data is everything the model saw during pre-training. For ChatGPT that includes Common Crawl, licensed publisher deals (OpenAI has signed agreements with the Financial Times, News Corp, Axel Springer, and others), Reddit, Wikipedia, Stack Overflow, books, code, and a long tail of general web content. For Claude, Anthropic uses a different mix with heavier licensed publisher weighting. For Gemini, Google draws on its own crawl and licensed partnerships. The exact composition varies, but the sources that carry the most weight in each are broadly similar — tier-one publishers, Wikipedia, and high-authority community forums.
When a model answers a brand query without browsing the live web, it's drawing on training data. The names that come out are the names the training data made prominent. If your brand wasn't in the corpus with enough weight, the model doesn't know you exist. Simple as that.
Layer two: live retrieval
Live retrieval is what the model fetches at query time. Perplexity does this for every query by design. ChatGPT does it when web browsing kicks in. Claude does it through its web_search tool. Gemini pulls from Google's live index. Google AI Overviews do it as part of every query. When retrieval is active, the model reads current web pages and synthesizes them into the answer — which means recent coverage can influence today's output, even if it wasn't in the training data.
Retrieval is the fast lane. A new Forbes feature published this morning can appear in a Perplexity answer by tonight. The training lane is slower — that same Forbes feature gets baked into the training data on the next model refresh, which might be months out, and once it's baked it keeps paying dividends for every subsequent version of the model.
Training is the durable lane; retrieval is the fast one. Brands that win both get named most often.
The four levers that move the needle
Every serious LLM optimization workflow is some combination of four levers. Skip any of them and you cap your upside.
Lever one: tier-one press placements
Land real editorial coverage in the publications LLMs weight most — Forbes, Reuters, Bloomberg, the Financial Times, the Wall Street Journal, the BBC, Business Insider, USA Today, Entrepreneur. A single placement is a signal. Ten placements across 90 days is a pattern that gets baked into the next training refresh. This is the heaviest lever and the one most agencies skip because it requires editorial relationships they don't have. Instant Press exists because tier-one coverage is now a durable LLM optimization lever, not a vanity metric.
Lever two: entity disambiguation
LLMs need to know what your brand is before they can name it. Wikipedia article if you qualify. Wikidata entry. Google Knowledge Panel. Complete and consistent profiles on LinkedIn, Crunchbase, G2, Product Hunt, and any relevant directory. Schema markup on your site that matches how the model describes your category. None of this moves rankings on its own. All of it moves LLM citation because it resolves the entity so the model stops confusing you with a competitor or skipping you entirely.
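As a minimal sketch of the schema-markup piece, a JSON-LD Organization block on your homepage might look like the following. The brand name, URLs, and description are placeholders, not a real company; the point is that the sameAs array links out to the same profiles a model will read elsewhere.

```json
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Acme Billing",
  "url": "https://www.acmebilling.example",
  "description": "AI-powered invoicing platform for freelancers.",
  "sameAs": [
    "https://www.linkedin.com/company/acme-billing",
    "https://www.crunchbase.com/organization/acme-billing",
    "https://en.wikipedia.org/wiki/Acme_Billing"
  ]
}
```

The description here should match your LinkedIn and Crunchbase copy word for word, and the sameAs links are what let a model collapse those separate profiles into one entity instead of three weak ones.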
Lever three: community seeding
Reddit drives roughly 21% of LLM answer citations in recent studies. Quora drives about 14%. These aren't fringe sources. They're the second and third most-cited layers in most categories. A brand with real community presence in its category — honest participation, not astroturfing — starts appearing in LLM answers within twelve months even without any press coverage changes. The community layer pulls its own weight through the training corpus on every refresh.
Lever four: cross-platform consistency
Your tagline, category positioning, founder name, and core product claims need to read the same across every source a model might read. Same description on LinkedIn, on your homepage, in press releases, in directory profiles. LLMs penalize inconsistent signals because inconsistent signals suggest untrustworthy entities. A brand that describes itself one way on its homepage and a different way in a press release reads to the model as two weak entities instead of one strong one, and gets named as neither.
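One way to audit consistency is a quick script. The sketch below is illustrative only: the descriptions are placeholder copy, the consistency_report helper is hypothetical, and the 0.8 threshold is an arbitrary assumption, not an industry standard. It compares each surface's description against the homepage baseline using Python's standard-library difflib.

```python
from difflib import SequenceMatcher

# Placeholder descriptions pulled from each surface (not real brand copy).
descriptions = {
    "homepage": "Acme is an AI-powered invoicing platform for freelancers.",
    "linkedin": "Acme is an AI-powered invoicing platform for freelancers.",
    "crunchbase": "Acme builds billing software for small agencies.",
}

def consistency_report(descriptions, baseline="homepage", threshold=0.8):
    """Flag surfaces whose description drifts from the baseline copy."""
    base = descriptions[baseline].lower()
    report = {}
    for source, text in descriptions.items():
        # ratio() returns 0.0-1.0; identical strings score 1.0.
        score = SequenceMatcher(None, base, text.lower()).ratio()
        report[source] = (round(score, 2), score >= threshold)
    return report

for source, (score, ok) in consistency_report(descriptions).items():
    print(f"{source}: similarity {score} {'OK' if ok else 'DRIFT'}")
```

In this toy run the Crunchbase copy would be flagged as drift, which is exactly the kind of mismatch lever four says to eliminate.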