How to Make Your Website AI-Agent Ready in 2026 (Full Checklist)
Publish seven machine-readable entry points and your site becomes discoverable and usable by AI agents: an llms.txt, RFC 8288 Link headers, an RFC 9727 API catalog at /.well-known/api-catalog, an MCP Server Card at /.well-known/mcp/server-card.json, an Agent Skills index at /.well-known/agent-skills/index.json, Content Signals in robots.txt, and Markdown-for-Agents content negotiation. This guide walks through each one with copy-paste examples you can adapt today. FindUtils (findutils.com) implements every single one, and our AI Agent Starter Guide shows the full setup interactively.
If you only have 10 minutes, skip to the Minimum Viable Agent-Ready Site section near the end — it lists the three highest-leverage files to ship first.
Why Agent-Readiness Matters in 2026
AI agents make up a fast-growing slice of web traffic, and they don't read websites the way humans do.
- AI search engines (ChatGPT, Claude, Perplexity, Gemini, Bing Copilot) cite sites with structured, machine-readable content far more often.
- MCP clients (Claude Desktop, Cursor, Zed, Windsurf) look for an MCP Server Card before they'll connect to your tools.
- Autonomous agents built on LangChain, CrewAI, the Agents SDK, or raw tool-calling loops discover APIs by fetching `/.well-known/api-catalog`.
- Browser agents running WebMCP-enabled Chromium use the tools a page exposes via `navigator.modelContext.provideContext()` — no human click required.
- Training and citation pipelines respect Content Signals in robots.txt.
A site that publishes none of these is invisible to the agent layer. A site that publishes all of them becomes a first-class programmable surface.
The Seven-File Agent-Ready Stack
| File | Purpose | Spec |
|---|---|---|
| `/llms.txt` | LLM-friendly site overview | llmstxt.org |
| `/robots.txt` + `Content-Signal:` | Train/search/input preferences | contentsignals.org |
| `/.well-known/api-catalog` | API discovery linkset | RFC 9727 |
| `/.well-known/mcp/server-card.json` | MCP server descriptor | SEP-1649 |
| `/.well-known/agent-skills/index.json` | Skills discovery index | agentskills.io RFC v0.2.0 |
| `Link:` response headers | Inline discovery for all the above | RFC 8288 |
| `Accept: text/markdown` negotiation | Machine-readable page variant | Cloudflare Markdown-for-Agents |
Optional but high-leverage:
- `navigator.modelContext.provideContext()` for browser-side tool exposure (WebMCP)
- OpenAPI 3.1 spec at `/api/openapi.json`
- `/.well-known/oauth-protected-resource` if your APIs are authenticated (RFC 9728)
Step 1: Publish /llms.txt
Open your favorite text editor and create `public/llms.txt` (or the equivalent for your platform) with a short markdown document that lists your site's purpose and key pages.
```markdown
> Example Corp (https://example.com) builds X for Y. All processing is client-side.

## Pages

- [Pricing](https://example.com/pricing): Plans and limits
- [Docs](https://example.com/docs): Developer documentation
- [API](https://example.com/api): REST endpoints

## AI-friendly

- llms.txt: https://example.com/llms.txt
- llms-full.txt: https://example.com/llms-full.txt (optional, expanded)
```
AI crawlers fetch this before diving into your HTML. It's your elevator pitch to the model.
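If your page list already lives in a build manifest, you can generate llms.txt at build time instead of hand-editing it. A minimal Node sketch — the manifest shape and the page entries are illustrative assumptions, not anything the llmstxt.org spec prescribes:

```javascript
// Build an llms.txt body from a page manifest. The `pages` array is a
// hypothetical stand-in for whatever manifest your build already has.
const pages = [
  { title: 'Pricing', url: 'https://example.com/pricing', note: 'Plans and limits' },
  { title: 'Docs', url: 'https://example.com/docs', note: 'Developer documentation' },
];

const llmsTxt = [
  '> Example Corp (https://example.com) builds X for Y.',
  '',
  '## Pages',
  ...pages.map((p) => `- [${p.title}](${p.url}): ${p.note}`),
  '',
].join('\n');

// In a real build you'd write this to public/llms.txt instead of logging it.
console.log(llmsTxt);
```

Regenerating the file on every deploy keeps it from drifting out of sync with your actual pages.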
Step 2: Add Content Signals to robots.txt
Open your site's robots.txt and add a Content-Signal: directive as the first content line. This declares your AI content usage preferences per the contentsignals.org / IETF draft-romm-aipref-contentsignals specification.
```txt
# Content Signals — declare AI content usage preferences
Content-Signal: search=yes, ai-train=yes, ai-input=yes

User-agent: *
Allow: /
```
Three signals, each yes or no:
- `search` — allow indexing in search engines
- `ai-train` — allow use in AI training datasets
- `ai-input` — allow use as context/input in AI answers (citation)
If you want AI to cite you but not train on your content, use search=yes, ai-train=no, ai-input=yes. If you run a free public resource like FindUtils, opt in across the board.
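On the consuming side, the directive is one line of comma-separated key=value pairs, so agents can read it cheaply. A rough sketch of such a reader — `parseContentSignal` is a hypothetical helper for illustration, not part of any published library:

```javascript
// Pull the Content-Signal line out of a robots.txt body and map each
// signal to a boolean. Returns null if no directive is present.
function parseContentSignal(robotsTxt) {
  const line = robotsTxt
    .split('\n')
    .map((l) => l.trim())
    .find((l) => l.toLowerCase().startsWith('content-signal:'));
  if (!line) return null;
  const signals = {};
  for (const pair of line.slice(line.indexOf(':') + 1).split(',')) {
    const [key, value] = pair.split('=').map((s) => s.trim());
    if (key) signals[key] = value === 'yes';
  }
  return signals;
}

const robots = `Content-Signal: search=yes, ai-train=no, ai-input=yes
User-agent: *
Allow: /`;

console.log(parseContentSignal(robots));
// → { search: true, 'ai-train': false, 'ai-input': true }
```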
Step 3: Publish /.well-known/api-catalog (RFC 9727)
Create a JSON file at /.well-known/api-catalog with MIME type application/linkset+json. Each entry anchors an API and links to its service-desc (OpenAPI), service-doc (human docs), and optionally status (health endpoint).
```json
{
  "linkset": [
    {
      "anchor": "https://api.example.com/",
      "service-desc": [
        {
          "href": "https://example.com/api/openapi.json",
          "type": "application/vnd.oai.openapi+json;version=3.1"
        }
      ],
      "service-doc": [
        { "href": "https://example.com/api", "type": "text/html" }
      ],
      "status": [
        { "href": "https://api.example.com/health", "type": "application/json" }
      ]
    }
  ]
}
```

Agents discover your API by fetching this one file. They no longer need to crawl your docs to find the spec.
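From the agent's side, consumption is one fetch plus a walk over the linkset. A sketch, assuming the catalog JSON has already been fetched and parsed — `extractSpecUrls` is a hypothetical helper, not from any library:

```javascript
// Given a parsed RFC 9727-style linkset, collect every OpenAPI spec URL
// advertised under the "service-desc" relation.
function extractSpecUrls(catalog) {
  return (catalog.linkset || []).flatMap((entry) =>
    (entry['service-desc'] || []).map((link) => link.href)
  );
}

const catalog = {
  linkset: [
    {
      anchor: 'https://api.example.com/',
      'service-desc': [
        {
          href: 'https://example.com/api/openapi.json',
          type: 'application/vnd.oai.openapi+json;version=3.1',
        },
      ],
    },
  ],
};

console.log(extractSpecUrls(catalog));
// → [ 'https://example.com/api/openapi.json' ]
```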
Step 4: Publish an MCP Server Card
If you run an MCP server, publish a Server Card at /.well-known/mcp/server-card.json. The spec is standardized at SEP-1649.
```json
{
  "protocolVersion": "2025-03-26",
  "serverInfo": {
    "name": "example",
    "title": "Example MCP Server",
    "version": "1.0.0",
    "description": "12 utilities for X and Y.",
    "homepage": "https://example.com",
    "license": "MIT"
  },
  "transports": [
    { "type": "streamable-http", "url": "https://mcp.example.com/" }
  ],
  "capabilities": { "tools": { "listChanged": false } },
  "authentication": { "required": false },
  "rateLimit": { "perMinute": 120, "perDay": 1000 }
}
```

MCP clients that support discovery (Claude Desktop, Cursor, MCP Inspector) can import your server by URL alone — they fetch the card and know exactly how to connect.
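Before connecting, a cautious client can sanity-check the card's basic shape. A sketch — the required-field list here is an assumption for illustration, not the SEP-1649 schema:

```javascript
// Return a list of problems with a fetched Server Card; an empty list
// means the fields this sketch cares about are present.
function validateServerCard(card) {
  const errors = [];
  if (!card.protocolVersion) errors.push('missing protocolVersion');
  if (!card.serverInfo?.name) errors.push('missing serverInfo.name');
  if (!Array.isArray(card.transports) || card.transports.length === 0) {
    errors.push('missing transports');
  }
  return errors;
}

const card = {
  protocolVersion: '2025-03-26',
  serverInfo: { name: 'example', version: '1.0.0' },
  transports: [{ type: 'streamable-http', url: 'https://mcp.example.com/' }],
};

console.log(validateServerCard(card)); // → []
```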
Step 5: Publish an Agent Skills Index
The Agent Skills Discovery RFC (agentskills.io, v0.2.0) defines a format for exposing machine-readable skills — step-by-step playbooks an agent can follow.
Create /.well-known/agent-skills/index.json:
```json
{
  "$schema": "https://raw.githubusercontent.com/cloudflare/agent-skills-discovery-rfc/main/schemas/index.schema.json",
  "version": "0.2.0",
  "name": "Example Agent Skills",
  "skills": [
    {
      "name": "search-catalog",
      "type": "markdown",
      "description": "How to search the Example catalog via the REST API.",
      "url": "https://example.com/.well-known/agent-skills/search-catalog/SKILL.md",
      "sha256": "7b96a62daec09466fb3faa3fccd09770664412326803bd3489aba52e611435e0"
    }
  ]
}
```

Then create each SKILL.md with frontmatter and a step-by-step markdown body. Compute the sha256 digest so agents can verify the file hasn't been tampered with:
```shell
shasum -a 256 public/.well-known/agent-skills/search-catalog/SKILL.md
```
Step 6: Add Link Response Headers (RFC 8288)
Every HTML page on your site should return Link: headers pointing at the resources above. On Cloudflare Pages this lives in _headers:
```txt
/*
  Link: </.well-known/api-catalog>; rel="api-catalog"; type="application/linkset+json"
  Link: </api/openapi.json>; rel="service-desc"; type="application/vnd.oai.openapi+json;version=3.1"
  Link: </api>; rel="service-doc"; type="text/html"
  Link: </llms.txt>; rel="describedby"; type="text/plain"
  Link: </.well-known/agent-skills/index.json>; rel="https://agentskills.io/rels/skills-index"; type="application/json"
  Link: </.well-known/mcp/server-card.json>; rel="https://modelcontextprotocol.io/rels/server-card"; type="application/json"
```
On Nginx:
```nginx
add_header Link '</.well-known/api-catalog>; rel="api-catalog"; type="application/linkset+json"' always;
add_header Link '</api/openapi.json>; rel="service-desc"; type="application/vnd.oai.openapi+json;version=3.1"' always;
```
On Express:
```javascript
app.use((req, res, next) => {
  res.setHeader('Link', [
    '</.well-known/api-catalog>; rel="api-catalog"; type="application/linkset+json"',
    '</api/openapi.json>; rel="service-desc"; type="application/vnd.oai.openapi+json;version=3.1"',
    '</llms.txt>; rel="describedby"; type="text/plain"',
  ].join(', '));
  next();
});
```

Verify with:

```shell
curl -sI https://your-site.com/ | grep -i ^link
```
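If you'd rather check the headers programmatically than eyeball curl output, a simplified parser covers headers like the ones above. This is a sketch, not a full RFC 8288 implementation — attribute values containing commas or semicolons would need a real parser:

```javascript
// Split a Link header into { target, rel, type, ... } objects. Splits only
// on commas that precede a "<", which is enough for the headers in this guide.
function parseLinkHeader(header) {
  return header.split(/,\s*(?=<)/).map((part) => {
    const [, target, params = ''] = part.match(/^<([^>]*)>(.*)$/);
    const link = { target };
    for (const m of params.matchAll(/;\s*([a-zA-Z-]+)="?([^";]*)"?/g)) {
      link[m[1]] = m[2];
    }
    return link;
  });
}

const header =
  '</.well-known/api-catalog>; rel="api-catalog"; type="application/linkset+json", ' +
  '</llms.txt>; rel="describedby"; type="text/plain"';

console.log(parseLinkHeader(header));
```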
Step 7: Add Markdown-for-Agents Content Negotiation
When a client sends Accept: text/markdown, serve a markdown version of the page instead of HTML. On Cloudflare Pages the easiest route is an Advanced-Mode _worker.js:
```javascript
export default {
  async fetch(request, env) {
    const accept = request.headers.get('accept') || '';
    const wantsMarkdown =
      /text\/markdown/i.test(accept) && !/text\/html/i.test(accept);
    const url = new URL(request.url);
    if (wantsMarkdown && (url.pathname.endsWith('/') || !url.pathname.includes('.'))) {
      const mdUrl = new URL(
        (url.pathname.endsWith('/') ? url.pathname : url.pathname + '/') + 'markdown.md',
        url.origin
      );
      const mdResp = await env.ASSETS.fetch(new Request(mdUrl, request));
      if (mdResp.ok) {
        const headers = new Headers(mdResp.headers);
        headers.set('Content-Type', 'text/markdown; charset=utf-8');
        headers.set('Vary', 'Accept');
        return new Response(mdResp.body, { status: 200, headers });
      }
    }
    return env.ASSETS.fetch(request);
  },
};
```
Generate the sibling markdown.md files at build time. Static site generators can emit them via templates; for Astro, add a dynamic route that renders the same content as markdown.
Test it:
```shell
curl -H "Accept: text/markdown" https://your-site.com/some-page/
```
You should get a Content-Type: text/markdown response with the page's content as clean markdown.
Step 8: Optional — Add WebMCP Browser Tools
If your site has actions that make sense from an agent (search, navigate, execute), expose them via WebMCP. Add this to your site-wide layout:
```html
<script>
  if (navigator.modelContext?.provideContext) {
    navigator.modelContext.provideContext({
      tools: [
        {
          name: 'searchCatalog',
          description: 'Search the Example catalog by keyword.',
          inputSchema: {
            type: 'object',
            properties: { query: { type: 'string' } },
            required: ['query'],
          },
          execute: (args) => {
            const url = '/search?q=' + encodeURIComponent(args.query);
            window.location.href = url;
            return { url };
          },
        },
      ],
    });
  }
</script>
```
WebMCP-enabled browsers (Chrome Origin Trial) will expose searchCatalog to any running agent on your page. Non-WebMCP browsers ignore the call (the feature-check guards it).
Real-World Scenarios
Scenario 1: You run a SaaS with a REST API
- Ship `/llms.txt` describing your product.
- Publish `/api/openapi.json` (most API frameworks emit this automatically).
- Add `/.well-known/api-catalog` pointing at the OpenAPI spec.
- Add `Link: service-desc` headers.
Result: agents building integrations discover your API in one fetch instead of scraping your docs.
Scenario 2: You run a content site (blog, docs, news)
- Ship `/llms.txt` with your top pages.
- Add Content Signals (typically `search=yes, ai-train=yes, ai-input=yes` for public content).
- Add Markdown-for-Agents negotiation.
- Add `Link: describedby` pointing at llms.txt.
Result: ChatGPT, Claude, and Perplexity cite you more often because your content is cheaper and cleaner to extract.
Scenario 3: You run a developer tool (like FindUtils)
- Ship all seven files above.
- Add an MCP server wrapping your core functions — JSON-RPC 2.0 over HTTP is the simplest transport.
- Publish the MCP Server Card.
- Ship SKILL.md files for common agent workflows.
- Add WebMCP tools for in-browser invocation.
Result: agents can use your tools as first-class callable surfaces, not just reference material. This is what FindUtils (findutils.com) did — AI Agent Starter Guide walks through the setup with live examples.
Agent-Ready vs Not-Agent-Ready: Tool Comparison
| Feature | Agent-ready site | Non-agent-ready site |
|---|---|---|
| Time to integrate into an agent | Minutes (one curl) | Days (custom scrapers, DOM parsing) |
| AI citation frequency | High (structured extraction) | Low (HTML noise + ad clutter) |
| Works with Claude Desktop / Cursor out of the box | Yes (MCP Server Card) | No (requires custom server) |
| Works with browser agents | Yes (WebMCP) | No |
| Cost per token to LLMs citing your content | Low (markdown variant) | High (HTML/JS noise) |
| Future-proof as new agent protocols land | Incremental adds | Full rebuild each time |
Competitor Comparison: How Agent-Readiness Tools Stack Up
| Approach | Effort | Coverage | Free? |
|---|---|---|---|
| FindUtils open-standard stack (this guide) | ~1 day | All agent protocols | Yes |
| Custom scrapers per site | Weeks per site | Agent-specific | Yes (labor cost) |
| Third-party "agent gateway" SaaS | Low setup | Proxy, not native | Usually paid |
| Do nothing | None | Invisible to agents | Free (but costly to visibility) |
The open-standard stack is the highest-leverage option because every new MCP client, every new autonomous agent, and every new AI search engine builds on the same specs. You ship the files once; everyone benefits forever.
Common Mistakes and Fixes
Mistake 1: Serving /.well-known/api-catalog as application/octet-stream
Files without an extension frequently get the wrong MIME type. Pin application/linkset+json; charset=utf-8 explicitly in your host's headers config. Verify:
```shell
curl -sI https://your-site.com/.well-known/api-catalog | grep -i content-type
```
Mistake 2: Forgetting to compute sha256 for SKILL.md files
The Agent Skills index requires a sha256 digest for each skill. Recompute on every change or agents will reject stale entries:
```shell
shasum -a 256 public/.well-known/agent-skills/*/SKILL.md
```
Mistake 3: Setting ai-train=no on a public resource
Unless you're monetizing your content directly, ai-train=no keeps your content out of the training data that later becomes model knowledge. Most public sites benefit from ai-train=yes.
Mistake 4: Exposing WebMCP tools that need a human in the loop
WebMCP tools should be idempotent and side-effect-free by default. Navigation and search are fine. Don't expose deleteAccount or purchase via WebMCP — those need a human in the loop.
Mistake 5: Returning the same HTML for Accept: text/markdown
Content negotiation is worthless if the markdown variant is just HTML with a different header. Generate real markdown — strip nav, ads, and JS chrome.
Minimum Viable Agent-Ready Site
If you only have an afternoon, ship these three files:
- `/llms.txt` — a short markdown document describing your site and key pages.
- `Content-Signal:` directive — one line at the top of `robots.txt`.
- `Link: </llms.txt>; rel="describedby"; type="text/plain"` header — one header in your host's config.
That's it. You've just become more agent-discoverable than 99% of the web. Everything else in this guide stacks on top.
Tools Used in This Guide
- AI Agent Starter Guide — Interactive playground for Claude Code, Copilot, Cursor, Gemini, Codex, Windsurf
- Robots.txt Generator — Build a robots.txt with Content Signals in seconds
- AI Model Picker — Compare Claude, GPT, Gemini, and local models by context window and price
- LLM Requirements Calculator — Estimate RAM, VRAM, and hardware needs for local models
- Claude Code Usage Analyzer — Parse your Claude Code session logs and visualize usage
- FindUtils Tool API — A working example of RFC 9727, OpenAPI 3.1, and `api-catalog` in production
- FindUtils MCP Server — A working example of an MCP Server Card at `/.well-known/mcp/server-card.json`
Next Steps
- Read the companion post: One of the Most Agent-Ready Websites on the Internet — the real-world rollout FindUtils did in one day.
- Scan your own site with isitagentready.com to see which files you're missing.
- Build an MCP server wrapping your core API — JSON-RPC 2.0 over HTTP is the simplest transport.
- Submit your updated sitemap and llms.txt to IndexNow so search + AI crawlers pick up the changes within hours.
FAQ
Q1: Is making a website agent-ready free?
A: Yes. Every protocol in this guide is an open standard with free reference implementations. The files are small — the total payload across all seven agent-readiness files on FindUtils (findutils.com) is under 15 KB. You pay only in developer time.
Q2: Do I need to run an MCP server to be agent-ready?
A: No. An MCP server is required only if you want agents to execute tools on your site. Pure content sites (blogs, docs, news) become agent-ready with just llms.txt, Link headers, Content Signals, and Markdown-for-Agents negotiation.
Q3: What's the single highest-leverage agent-readiness file?
A: llms.txt at the site root. It takes 30 minutes to author, requires no infrastructure changes, and is the first thing AI search engines look for. Every other file on the checklist is a multiplier; llms.txt is the base.
Q4: How do I test my agent-ready setup?
A: Three checks. (1) curl -sI https://yoursite.com/ | grep -i ^link should show multiple Link headers. (2) curl -s https://yoursite.com/.well-known/api-catalog | jq should return valid linkset+json. (3) curl -H "Accept: text/markdown" https://yoursite.com/some-page/ should return markdown, not HTML. Also run isitagentready.com for an external audit.
Q5: Will adding these files slow down my site?
A: No. The static files are under 15 KB total and cached by the CDN. The Link response headers add ~400 bytes per HTML response. Markdown-for-Agents negotiation runs at the edge with no round-trip to origin. Measurable impact: zero.
Q6: Is WebMCP production-ready in 2026?
A: WebMCP is an Origin-Trial Proposal in Chromium browsers as of 2026. Ship the navigator.modelContext.provideContext() call behind a feature check — it's a no-op in browsers that don't implement the API and a functional integration in those that do. Zero-risk progressive enhancement.
Q7: Do I need to authenticate my MCP server?
A: Only if your tools mutate data or cost money to run. Public, pure-computation tools (like FindUtils' 54 utilities) are safe to expose unauthenticated with rate limits. If you need auth, publish /.well-known/oauth-protected-resource per RFC 9728 and follow the OAuth 2.0 discovery flow.
Q8: How does agent-readiness interact with GEO and traditional SEO?
A: It complements both. GEO (Generative Engine Optimization) focuses on content structure — answer capsules, comparison tables, FAQs. Agent-readiness focuses on discovery infrastructure — machine-readable entry points. Traditional SEO focuses on search rankings. The three stack: a GEO-optimized site with an agent-ready discovery layer and solid SEO fundamentals wins on all three channels.
Q9: What happens if agent-readiness standards evolve?
A: The foundational pieces (Link headers, llms.txt, OpenAPI) are stable and backwards-compatible. Newer specs (WebMCP, Content Signals, Agent Skills) are versioned — declare your version in the JSON and update when the spec changes. FindUtils commits to updating the stack as specs mature, and this guide gets refreshed with each major change.