How does ChatGPT decide which sites to cite?

When the question needs current information, ChatGPT runs a web search, opens a handful of results, reads their raw HTML, and quotes short passages it can attribute. It leans toward pages it can read without running JavaScript, pages that answer the question directly near the top, and sources it already sees mentioned across the web. Being readable, quotable and trusted is what gets you named.

Why isn't my site getting cited by ChatGPT even though it ranks on Google?

The usual reason is that your content only appears after JavaScript runs, so the AI crawler sees a near-empty page even though Google renders it fine. The next most common reasons are a robots.txt that blocks GPTBot or OAI-SearchBot, an answer buried deep in the page instead of stated up front, and missing schema. Run an AEO scan to see which of these is hitting you.

Do I need to allow GPTBot to get cited by ChatGPT?

Yes, in most cases. ChatGPT reaches your pages through named crawlers like OAI-SearchBot, ChatGPT-User and GPTBot. If your robots.txt blocks them, you opt out of being read and quoted. Plenty of sites block these bots by accident in a robots.txt copied from somewhere else, so check yours and allow the ones you want answering questions about you.

What is an llms.txt file and does it help with ChatGPT citations?

An llms.txt is a plain text file at the root of your site that points AI models at your most important pages in clean Markdown. It does not force anything, but it makes your best content easier to find and read, which helps models pick the right page to quote. It is quick to add and there is little reason not to.

How long does it take to get cited by ChatGPT?

The technical fixes work fast: once your content is in the raw HTML and the crawlers are allowed, the next crawl can pick you up within days to a few weeks. The trust side is slower, because being mentioned across the web and building a track record on a topic compounds over months. Most sites see the quickest wins from the technical layer, which is why it comes first.

How to Get Cited by ChatGPT in 2026

Q: How do I check if my site is ready to be cited by ChatGPT?

Run an AEO scan. Amabrik's SEO/AEO scan crawls your site, scores it for answer engines, and flags the exact gaps that keep ChatGPT from quoting you: content only rendered by JavaScript, missing schema, headings an AI can't follow, blocked AI crawlers and a missing llms.txt. Each issue comes with a plain-English explanation and a copy-paste fix prompt.

A few years ago, the goal was simple: rank on Google, earn the click, win the customer. That goal still matters, but a second one now sits next to it. People open ChatGPT, ask a full question, and read the answer it writes back, often without visiting a single website. When ChatGPT answers, it pulls the facts from a small set of sources and names some of them. Being one of those named sources is the new version of ranking on page one.

The catch is that getting cited by ChatGPT is part content and part plumbing, and most guides only cover the content half. They tell you to write helpful pages and earn mentions, which is true, then skip the technical reasons ChatGPT never sees your page in the first place. This guide covers both halves, in the order that actually moves the needle, so you fix the silent technical failures first and then earn the trust that keeps you in the answer. If you want the bigger picture on how classic search and answer engines differ, the SEO vs AEO guide sets it up.

How does ChatGPT decide what to cite?

ChatGPT cites pages it can read, quote cleanly, and already trusts. When you ask it something that needs current information, it runs a web search behind the scenes, opens a handful of the results, reads their raw HTML, and lifts short passages it can attribute back to a source. The page that gets named is usually the one that made all three of those steps easy.

ChatGPT sources list with nature.com selected and tagged readable, quotable and trusted, the basis of how to rank on ChatGPT

Readable comes first, because the model works from the raw HTML rather than the polished page a human sees in a browser. Quotable comes next, since a clear, self-contained answer near the top of the page is far easier to lift than one buried in the eighth paragraph. Trusted comes last and compounds, because models lean on sources they already see referenced across the web. Get those three working together and you stop hoping for a citation and start earning it on purpose.

It helps to picture the actual sequence. When your question needs fresh information, ChatGPT sends a search query to a web index, gets back a ranked list of pages, opens several of them, and reads their text before it writes a single word of the answer. It then builds a response from the passages it found most relevant and attaches citations to the handful of sources it leaned on. Every one of those steps is a filter, and a page that loads slowly, hides behind scripts, or stays vague in its wording quietly drops out before the citation is ever decided. The pages that survive are the ones that are easy to fetch, easy to read, and clearly about the exact thing that was asked.

First, make sure ChatGPT can actually read your page

If your content only appears after JavaScript runs, ChatGPT often sees a blank page where your answer should be. This is the single most common reason a site that ranks fine on Google still never shows up in an AI answer. Google has spent years getting good at rendering JavaScript, while many AI crawlers simply pull the raw HTML, take what’s there, and move on without running your scripts.

Split code editor showing server-rendered HTML marked visible to AI next to an empty JavaScript shell marked empty to crawlers, a core chatgpt seo fix

The fix is to serve your real content in the HTML that comes back on the first request, which usually means server-side rendering or a static build rather than a client-only single-page app. You can check this yourself in about ten seconds: open your page, view the page source, and search for a full sentence from your main content. If the words are sitting right there in the HTML, an AI crawler can read them too. If the source is mostly empty markup and script tags, your answer is invisible to the model, no matter how good it is.

This trips up modern sites more than old ones, because a lot of today’s frameworks render everything in the browser by default. A single-page app built with plain React or Vue often ships an almost empty HTML shell and paints the real content with JavaScript a moment later, which is exactly what a crawler that skips scripts will miss. The fix is to render on the server or build the pages ahead of time, and frameworks like Next.js, Nuxt, Astro and SvelteKit all support that out of the box. If you run WordPress, Webflow, Framer or a static site generator, your content usually sits in the HTML already, so this is mostly a worry for custom-built apps. When you’re unsure, run the view-source test on your most important pages and fix any that come back empty.

Let the AI crawlers in

ChatGPT reaches your site through named bots, and if your robots.txt blocks them you have quietly opted out of every answer they generate. The crawlers worth knowing are OAI-SearchBot and ChatGPT-User, which fetch pages for ChatGPT search and live browsing, GPTBot, which gathers training and retrieval data, and the equivalents from other engines like ClaudeBot, PerplexityBot and Google-Extended. A surprising number of sites block these without meaning to, usually in a robots.txt copied from another project or bolted on by a plugin.

A robots.txt file in an editor allowing GPTBot, OAI-SearchBot, PerplexityBot, ClaudeBot and Google-Extended so you get cited by AI

Open your robots.txt and confirm the bots you want answering questions about you are allowed rather than disallowed. While you’re in there, add an llms.txt at the root of your domain, a plain text file that points models at your most important pages in clean Markdown. It doesn’t force anything, but it makes your best content easier to find and read, which helps an engine pick the right page to quote instead of guessing.

A minimal robots.txt that welcomes the main answer engines is just a few short blocks: User-agent: GPTBot followed by Allow: /, then the same pair for OAI-SearchBot, PerplexityBot and ClaudeBot, and a User-agent: Google-Extended block with Allow: / so Google’s AI products can use your content too. You can be selective if you want, since blocking GPTBot from training while still allowing OAI-SearchBot to browse is a perfectly reasonable stance when you only care about live answers. What matters is making that choice on purpose rather than finding out months later that a default rule has quietly kept you out the whole time. After you edit the file, open yourdomain.com/robots.txt in a browser to confirm the live version says what you think it says.

Structure your content so it’s easy to quote

Lead every section with the exact question a real person would ask, written as a heading, then answer it in the first sentence or two before you expand. Models lift short, self-contained passages that make sense on their own, so the closer your page sits to a clean question-and-answer shape, the more of it an engine can quote with confidence. Bury the answer in the middle of a long paragraph and you hand the citation to a competitor who put theirs up front.

A page leading with a question heading and an answer-first sentence highlighted beneath it, the content shape that wins generative engine optimization

Keep each answer specific and standalone, the kind of passage that would still make sense if someone pasted it into a chat with no other context. A tidy heading followed by a direct two-sentence answer, then the detail underneath, gives the model an obvious capsule to grab. This is the same structure that wins featured snippets on Google, which is part of why it pays off twice.

A few formats travel especially well into AI answers. A short definition placed right after a “what is” heading hands the model a clean sentence it can quote without editing. A comparison laid out as a simple table lets it pull a single row without losing the meaning around it. A numbered set of steps under a “how to” heading maps directly onto the kind of instructions people ask ChatGPT for in the first place. Depth still matters underneath all of this, because the model rewards a page that goes well beyond its one-line answer. The move is to lead each block with the cleanest version of that answer, then earn the reader’s time with everything that follows.

Mark it up with schema

Schema markup tells a machine what each part of your page actually means, so a price reads as a price, a question reads as a question, and a step reads as a step. Adding FAQ, HowTo, Article and Organization schema in JSON-LD turns a wall of text into labeled, structured facts that an answer engine can extract without guessing, which makes it far more comfortable quoting you.

FAQPage schema written as JSON-LD in a code editor with question and acceptedAnswer objects, the markup that helps you get cited by AI

You don’t need to mark up everything, just the parts you want pulled into answers. A FAQ block with proper FAQPage schema, an article with author and date, and an Organization entry that ties your brand together cover most of what matters. The markup never shows on the page, so it costs your reader nothing while giving the model a cleaner, more confident read of what you offer.

In practice this is a small block of JSON-LD in your page head. A FAQ section becomes a FAQPage object holding a list of questions and their accepted answers, an article carries an Article object with a headline, an author and a date, and your whole site gets one Organization object with your name, logo and social profiles so engines can tie your mentions together. Most content systems can output this for you, and you can confirm it parses cleanly with Google’s Rich Results Test or the Schema.org validator. The markup won’t rescue a weak page, but on one that already answers the question well, it removes the last bit of ambiguity that might otherwise stop a model from quoting you.

Give ChatGPT something genuinely worth quoting

Models reach for specific, original, current facts over generic advice anyone could have written. A page with your own numbers, a clear method, or a result you actually measured gives an engine something it can’t find in ten other places, which is exactly what makes a passage worth lifting and attributing. Round, vague claims get skipped in favor of the source that committed to a real figure.

A report leading with the original stat 47% of buyers, an Updated June 2026 badge and a chart, the kind of page chatgpt seo rewards

Freshness and reputation do the rest. A page with this year’s data and a visible updated date beats a stale one that hedges, because the model has its own credibility on the line and favors sources that look maintained. Being mentioned across the web, in forums, comparisons and credible articles, teaches the model that you’re a real authority on the topic, so the work you do off your own site feeds directly back into whether you get named on it.

Some kinds of pages earn citations far more often than others. Original research and your own numbers lead the list, because a real figure that exists nowhere else is exactly what a model reaches for when it wants to sound specific. Clear definitions, honest comparisons and detailed how-to guides follow close behind, since each one answers a common shape of question in a way that’s easy to lift cleanly. Thin opinion pieces and restatements of what everyone else already wrote tend to get skipped, because they add nothing a model couldn’t have generated on its own.

The trust half of the equation lives largely off your own site. ChatGPT leans on sources the wider web already treats as credible, so a mention in a respected publication, a recommendation in a Reddit thread, a comparison on a third-party site, and a steady presence on review platforms all feed back into whether you get named. You don’t control those directly, but you earn them by being genuinely useful and worth referencing. The sites that get cited the most are usually the ones people were already talking about, which is why the off-site work and the on-page work pull in the same direction.

Check what ChatGPT actually sees

You can’t judge any of this by looking at your site in a browser, because the browser runs all the JavaScript and hides the exact gaps an AI crawler trips over. You need to see your pages the way a bot does, scored against what answer engines care about. This is the part most people skip, and it’s usually where the real problem is hiding.

Amabrik SEO and AEO scan scoring a site 84 out of 100 and flagging JavaScript content, missing schema and blocked AI crawlers, so you learn how to rank on ChatGPT

Amabrik’s SEO/AEO scan crawls your site and grades it for answer engines, then flags the precise issues keeping ChatGPT from quoting you: content only rendered by JavaScript, missing schema, headings an AI can’t follow, blocked AI crawlers and a missing llms.txt. Each finding comes with a plain-English explanation and a copy-paste fix prompt, so you go from a vague worry to a ranked list of what to change. The scan docs walk through what each score means and how to read it.

Measure your ChatGPT citations

Once the fixes are in, track whether ChatGPT is actually naming you, because the old metrics won’t show it. Start by watching your analytics for referral traffic from domains like chatgpt.com and perplexity.ai, since that traffic is small today and growing fast, and it tends to arrive already half-convinced. A rising trickle of AI referrals is the first concrete sign that the work is landing.

An analytics referrers table showing rising AI referral traffic from chatgpt.com, perplexity.ai and gemini.google.com, proof you are getting cited by AI

Then test it directly by asking ChatGPT the questions your customers ask and noting whether you get named, how often, and against which competitors. That share of the answer is the AEO version of a ranking, and watching it over time tells you whether you’re gaining ground or quietly losing it. Pair the two views, the referral numbers and the prompt tests, and you can see both that you’re being cited and roughly how much.

A few habits turn this from a guess into something you can actually track. Keep a short list of the questions that matter most to your business and run them through ChatGPT, Perplexity and Gemini on a regular schedule, writing down who gets named each time. Watch your referral report for the AI domains as a group, since a steady climb there is the clearest proof that citations are turning into real visits. If you want to move faster, a tool that monitors AI mentions can watch far more prompts than you ever could by hand and flag the moment your share of the answer shifts. You’re really after a reliable trend you can read and act on, more than any single perfect number.

So, where do you start?

Work the order that gets you the fastest wins. Make your content readable in raw HTML, open the AI crawlers in robots.txt, add an llms.txt, structure each page answer-first, mark up the important parts with schema, then earn the original content and mentions that build trust. The technical layer pays off in days, the trust layer compounds over months, and doing them in that order means you stop losing citations you could already be winning.

If you want a fast read on where your site stands today, run the SEO/AEO scan, fix the issues it ranks by impact, then rescan to confirm they’re gone. ChatGPT is already answering questions about your market right now, and the sites built so a machine can read, quote and trust them are the ones getting named while everyone else wonders why the traffic dried up.

How to Get Cited by ChatGPT in 2026

How does ChatGPT decide what to cite?

First, make sure ChatGPT can actually read your page

Let the AI crawlers in

Structure your content so it’s easy to quote

Mark it up with schema

Give ChatGPT something genuinely worth quoting

Check what ChatGPT actually sees

Measure your ChatGPT citations

So, where do you start?

Questions, answered

Get the next guide in your inbox

How does ChatGPT decide what to cite?

First, make sure ChatGPT can actually read your page

Let the AI crawlers in

Structure your content so it’s easy to quote

Mark it up with schema

Give ChatGPT something genuinely worth quoting

Check what ChatGPT actually sees

Measure your ChatGPT citations

So, where do you start?

Questions, answered

Keep reading

SEO vs AEO: What's the Difference in 2026?

Get the next guide in your inbox