# GEO Guide 2026: How to Make ChatGPT and Claude Cite Your Website

> Clean Markdown version of https://claudecodeguia.com/en/geo-guide/, optimized for AI agent consumption. Last updated: May 2026.

Traditional SEO is a fight to enter Google's top 10. **GEO (Generative Engine Optimization)** plays a different game: 83% of Google AI Overview citations come from pages outside the top 10. This guide shows, with real data and copy-paste code, how to configure your site in one hour so ChatGPT, Claude, Perplexity, and Copilot find and cite it.

## What is GEO and why it matters in 2026

**GEO (Generative Engine Optimization)** is the set of practices that make your website visible and citable by generative search engines like ChatGPT, Claude, Perplexity, Google AI Overview, and Microsoft Copilot. Unlike classic SEO, GEO isn't about climbing positions: it's about **helping AI understand what you already have**.

The term was coined in the paper ["GEO: Generative Engine Optimization"](https://arxiv.org/abs/2311.09735), published by Princeton and IIT Delhi researchers at KDD 2024. The central finding: visibility in AI responses increases up to **115% by adding authoritative citations**, **43% with direct quotations** from credible sources, and **33% with relevant statistics**.

Traffic data also explains why we're talking about GEO now: AI search grew **527% year-over-year** in H1 2025, ChatGPT reached **900 million weekly active users** by February 2026, and AI-referred traffic converts at **5x the rate** of traditional search. Even so, it still represents less than 1% of total traffic.

> "GEO is a brand visibility strategy, not a traffic strategy. Worth an hour of setup, not a week." — [@HiTw93](https://x.com/HiTw93/status/2050931710066565374)

## SEO vs GEO: four key differences

| Aspect | Traditional SEO | GEO |
|---|---|---|
| Goal | Top 10 in Google | Get cited in AI answers |
| Key metric | Position + clicks | Citations + retrieval-to-citation rate |
| Signals that matter | PageRank, backlinks, CTR | Clear structure, reliable sources, specific data |
| Where citations come from | Top 10 results | 83% from outside the top 10 |

The PageRank moat no longer protects giants in the AI era. If your README or documentation is well-written, you can outrank a massive site whose content is thin.

## The four types of AI crawlers

| Type | Examples | What they do | Recommendation |
|---|---|---|---|
| Training | GPTBot, ClaudeBot, CCBot | Take content to train models | Block if you want to opt out |
| Search and retrieval | OAI-SearchBot, Claude-SearchBot, PerplexityBot | Fetch in real time to answer queries | **Always allow** |
| User-triggered | ChatGPT-User, Claude-User | Fire when someone pastes your URL | **Always allow** |
| Undeclared | Bytespider | Don't follow rules | Block |

The most expensive mistake: blocking OAI-SearchBot in the belief that you're protecting your content. What you've actually done is disappear from ChatGPT's search results.

## How to configure robots.txt for AI

```
# Search and retrieval: allow
User-agent: OAI-SearchBot
Allow: /

User-agent: Claude-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /

# User-triggered: allow
User-agent: ChatGPT-User
Allow: /

# Training: block (optional)
User-agent: GPTBot
Disallow: /

# Undeclared: block
User-agent: Bytespider
Disallow: /

Sitemap: https://yoursite.com/sitemap.xml
```
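Before deploying, you can sanity-check the rules with Python's standard-library `urllib.robotparser`. A quick sketch; the domain and the trimmed-down rule set are illustrative:

```python
# Verify that the categorized robots.txt behaves as intended.
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: OAI-SearchBot
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: Bytespider
Disallow: /
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

page = "https://yoursite.com/docs/page"
assert rp.can_fetch("OAI-SearchBot", page)      # retrieval crawler: allowed
assert not rp.can_fetch("GPTBot", page)         # training crawler: blocked
assert not rp.can_fetch("Bytespider", page)     # undeclared crawler: blocked
```

Run this against your real file before pushing: a typo in a `User-agent` line fails silently in production, but fails loudly here.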

## How to create your llms.txt

`llms.txt` is a new standard, similar to `robots.txt` but designed for AI consumption. According to BuiltWith, more than **840,000 sites** have already deployed it (Anthropic, Cloudflare, Stripe, Vercel). But SE Ranking's survey of 300,000 domains shows real adoption at only **10%**: you're early, and that's an advantage.

Simple format at `/llms.txt`:

```markdown
# Your project name

> One-line description of what this is.

## Links

- [Documentation](https://yoursite.com/docs)
- [GitHub](https://github.com/you/project)
- [Blog](https://yoursite.com/blog)

## About

Short paragraph explaining the project, purpose, key features.
```

After creating it, submit to [directory.llmstxt.cloud](https://directory.llmstxt.cloud/), [llmstxt.site](https://llmstxt.site/), and the `llms-txt-hub` repo on GitHub.

## Why you also need llms-full.txt

While `llms.txt` is the summary, `llms-full.txt` is the complete version: typically 30–60 KB with project descriptions, use cases, comparisons, and README excerpts. Mintlify's CDN analysis shows that **`llms-full.txt` receives 3 to 4 times more traffic than `llms.txt`**. When AI finds the summary, it almost always goes for the full version.

## Markdown routes: feed AI clean content

A typical 15,000-token HTML page becomes a 3,000-token Markdown document. **That's 80% less noise for AI**. Add to your `<head>`:

```html
<link rel="alternate" type="text/markdown" href="/page.md" />
```

Claude Code and Cursor already send `Accept: text/markdown` headers by default. This is **standard HTTP/1.1 content negotiation since 1997** — not magic, just protocol.
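A minimal sketch of the server-side check; the helper is hypothetical and framework-agnostic, and a production server should also weigh `q` values per RFC 9110:

```python
# Decide whether to serve the Markdown version based on the Accept header.
def prefers_markdown(accept_header: str) -> bool:
    """Return True if the client explicitly lists text/markdown."""
    offered = [part.split(";")[0].strip().lower()
               for part in accept_header.split(",")]
    # Require an explicit text/markdown entry; a bare */* from a
    # browser should still get the normal HTML page.
    return "text/markdown" in offered

# An agent like Claude Code sends Accept: text/markdown
assert prefers_markdown("text/markdown, text/html;q=0.8")
# A typical browser does not
assert not prefers_markdown("text/html,application/xhtml+xml,*/*;q=0.8")
```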

**Important**: never serve different content to bots and humans based on User-Agent. That's cloaking and Google penalizes.

## Register with search platforms

1. **Google Search Console**: verify your domain, submit sitemap.xml.
2. **Bing Webmaster Tools**: underrated but critical — Copilot, DuckDuckGo, and Yahoo all run on Bing's index.
3. **IndexNow**: a protocol (configured in Bing Webmaster Tools) for notifying search engines the moment you publish. URLs get indexed in minutes.
4. **Perplexity Publisher Program**: apply at [pplx.ai/publisher-program](https://pplx.ai/publisher-program). Once approved, you get an 80/20 revenue share and citation analytics.
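The IndexNow ping itself is just a GET request. A sketch of building it with the standard library (the key is illustrative; in practice you generate one and host it at `/<key>.txt` on your domain):

```python
# Build an IndexNow notification URL for a freshly published page.
from urllib.parse import urlencode

def indexnow_ping_url(page_url: str, key: str) -> str:
    """Return the GET endpoint that notifies IndexNow of a new/updated URL."""
    params = urlencode({"url": page_url, "key": key})
    return f"https://api.indexnow.org/indexnow?{params}"

ping = indexnow_ping_url("https://yoursite.com/en/geo-guide/", "abc123")
# → https://api.indexnow.org/indexnow?url=https%3A%2F%2Fyoursite.com%2Fen%2Fgeo-guide%2F&key=abc123
```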

## Each project needs its own page

Cited pages have titles with higher semantic similarity to user queries, and natural-language slugs (`/projects/pake`) are cited more than opaque IDs (`/page?id=47`). AI makes decisions before reading body content.

Don't concentrate everything on one giant page with anchors. AI's citation granularity is the URL, not the anchor.
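If your stack generates URLs from titles, a tiny helper shows the idea (the title here is made up for illustration):

```python
# Turn a human-readable title into a natural-language slug,
# the kind AI cites more readily than opaque query-string IDs.
import re

def slugify(title: str) -> str:
    """Lowercase, collapse non-alphanumeric runs into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

assert slugify("Pake: Turn Any Webpage Into an App") == \
    "pake-turn-any-webpage-into-an-app"
```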

## Princeton paper data

Three highest-impact factors:

- **+115%**: adding authoritative citations with links to original source
- **+43%**: including direct quotations from credible sources
- **+33%**: adding relevant statistics with concrete numbers

Practical findings from [geo-citation-lab](https://github.com/yaojingang/geo-citation-lab) (602 prompts, tens of thousands of pages):

- **Specificity**: pages with real data, clear definitions, and comparisons have over 50% higher impact.
- **Depth**: high-impact pages average 2,000 words and 10+ headings. Low-impact: 170 words (10x gap).
- **Sweet spot**: 1,000–3,000 words.
- **FAQ doesn't work**: pure FAQ format hurts citation rate.

## Platform differences

| Platform | Citation style | Optimal strategy |
|---|---|---|
| ChatGPT | Cites few sources, in depth; per-citation impact is 5x Google's. | Depth: a few excellent, long pages. |
| Perplexity | Cites more than 2x as many sources as ChatGPT. | Volume: multiple medium-length pages. |
| Claude | Conservative; prioritizes verifiable sources. | Authority: external citations and concrete data. |
| Bing/Copilot | The only AI where JSON-LD directly helps. | Keep schema markup clean. |

**83% of global citations are English content**. If you have an international audience, you need an English version.

## What doesn't work in GEO

- `<meta name="ai-content-url">` and `<meta name="llms">`: no specification, no adoption.
- `/.well-known/ai.txt`: competing proposals, no winner.
- HTML comments with hints for AI: parsers strip them.
- Serving different Markdown to bots via User-Agent: cloaking, Google penalizes.
- Unofficial "AI-friendly" meta tags: noise, not signal.

**JSON-LD isn't as useful**: SearchVIU tested 5 AI systems and none found data placed only in JSON-LD. The only confirmed exception is Bing/Copilot. Keep existing JSON-LD for Bing and Google rich results, but don't expect ChatGPT or Claude to cite you more for adding it.

## How to verify your GEO works

1. **Direct prompt testing**: once a week, run the same 5 prompts in ChatGPT, Claude, Perplexity, and Google AI Overview, in incognito mode to avoid personalization.
2. **Server logs / Cloudflare panel**: filter `OAI-SearchBot`, `Claude-SearchBot`, `PerplexityBot`. Seeing crawler download your `llms.txt` is the strongest signal.
3. **Bing Webmaster Tools → AI Performance**: only official panel with citation data (covers Copilot, DuckDuckGo, Yahoo).
4. **Referrers in analytics**: watch traffic from `chat.openai.com`, `claude.ai`, `perplexity.ai`. The definitive proof.
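A sketch of the log filter, with made-up sample lines; point it at your real access log or Cloudflare export:

```python
# Tally requests per AI crawler, based on the User-Agent string
# at the end of each access-log line.
from collections import Counter

AI_CRAWLERS = ("OAI-SearchBot", "Claude-SearchBot", "PerplexityBot")

def crawler_hits(log_lines):
    """Count hits per known AI retrieval crawler."""
    hits = Counter()
    for line in log_lines:
        for bot in AI_CRAWLERS:
            if bot in line:
                hits[bot] += 1
    return hits

sample = [
    '1.2.3.4 - - [01/May/2026] "GET /llms.txt HTTP/1.1" 200 "OAI-SearchBot/1.0"',
    '5.6.7.8 - - [01/May/2026] "GET /llms-full.txt HTTP/1.1" 200 "PerplexityBot/1.0"',
]
assert crawler_hits(sample)["OAI-SearchBot"] == 1
```

A crawler fetching `/llms.txt` or `/llms-full.txt` showing up in this tally is exactly the signal point 2 describes.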

The CJR Tow Center analyzed 200 AI-generated citations and found **153 with errors** (partial or complete). Do the structural work, but don't assume that a citation quotes your exact words.

## How to implement GEO with Claude Code in one hour

1. **Open your project in Claude Code**: navigate to folder, run `claude`.
2. **Ask it to create the categorized robots.txt**: "Create a robots.txt allowing OAI-SearchBot, Claude-SearchBot, and PerplexityBot, blocking Bytespider, and including the sitemap."
3. **Generate your llms.txt and llms-full.txt**: "Read README.md and main project files. Generate summary llms.txt and complete llms-full.txt following llmstxt.org standard."
4. **Add Markdown routes**: "For each HTML page, generate equivalent .md version and add `link rel='alternate' type='text/markdown'` to `<head>`."
5. **Verify**: ask Claude Code to validate resulting llms.txt. Commit and deploy.
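As a sketch of step 5, here is a minimal validator based on the llmstxt.org layout (one H1 title, a blockquote summary, H2 link sections). Treat the checks as illustrative, not the official spec:

```python
# Minimal structural check for a generated llms.txt.
def validate_llms_txt(text: str) -> list[str]:
    """Return a list of structural problems; empty means it looks fine."""
    problems = []
    lines = [l for l in text.splitlines() if l.strip()]
    if not lines or not lines[0].startswith("# "):
        problems.append("must start with a single H1 title")
    if not any(l.startswith("> ") for l in lines):
        problems.append("missing one-line blockquote summary")
    if not any(l.startswith("## ") for l in lines):
        problems.append("no H2 sections with links")
    return problems

sample = (
    "# My project\n\n"
    "> One-line description.\n\n"
    "## Links\n\n"
    "- [Docs](https://yoursite.com/docs)\n"
)
assert validate_llms_txt(sample) == []
```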

## Official resources

- [GEO: Generative Engine Optimization (Princeton/IIT Delhi, KDD 2024)](https://arxiv.org/abs/2311.09735)
- [llmstxt.org](https://llmstxt.org/) — Standard specification
- [geo-citation-lab](https://github.com/yaojingang/geo-citation-lab) — Open research with 602 prompts
- [Why ChatGPT Cites One Page Over Another (Ahrefs)](https://ahrefs.com/blog/why-chatgpt-cites-pages/)
- [IndexNow Documentation](https://www.indexnow.org/documentation)
- [Original article by @HiTw93](https://x.com/HiTw93/status/2050931710066565374)

---

**Web version (HTML)**: https://claudecodeguia.com/en/geo-guide/  
**Spanish version**: https://claudecodeguia.com/geo-guia/  
**Main site**: https://claudecodeguia.com/en/
