v0.2.0 web survey · June 25, 2026 · 50,074 sites

The State of Agent Readability on the Web

Sites scored
50,074
Median score
52 / 100
Scored "excellent"
none (best is 83)
llms.txt
27%
AGENTS.md
25%

Summary

On most of the popular web, an AI agent burns roughly twice the tokens it should.

We scored the 50,074 most-visited websites for how well an agent can discover, parse, and comprehend them. The plumbing search engines asked for is everywhere: 87% expose a robots.txt. The layer agents need is not. Only 27% ship an llms.txt and 25% an AGENTS.md. The median site scores 52/100, and not one of the 50,074 scored "excellent." Raising a site's a14y score roughly halves the tokens an agent spends to use it. The web is wide open for improvement and very few sites have moved.


The question

AI agents are becoming a primary way people reach the web. They read sites, follow links, and synthesize answers on someone's behalf. That raises a concrete question for every site owner: can an agent use my site efficiently? The a14y scorecard answers it for one site. We ran it across the web at scale to ask the bigger version: is the web ready for agents?

"Ready" here is mechanical, not subjective. The question is whether the site exposes what an agent needs to find and parse it: a usable llms.txt, an AGENTS.md, a sitemap, clean robots rules, and semantic HTML, all measured by the v0.2.0 scorecard's checks. The survey does not judge the quality of the content itself.

Methodology

A single point-in-time scan of the CrUX most-visited list, scored in page mode against the published scorecard. Run on cloud infrastructure, off any residential IP.

Source
CrUX top-100k most-visited origins (zakird/crux-top-lists, global)
Sample
50,074 scored: 100,000 rows → 64,746 after dedup to one per registrable domain → 50,074 reachable & not adult/gambling/spam
Mode
Page mode: each site's homepage, not a full-site crawl
Scorecard
Published v0.2.0 (38 checks) and the in-flight v0.3.0-draft (43 checks: the same 38 plus 5 new ones), scored 0–100. Headline scores use v0.2.0; the analysis below draws on both, tagged.
Tool
npx a14y (the same audit anyone can run)
Infra
Sharded across 64 Cloud Run tasks; run data archived in object storage
Source label
crux batch-2026-06-15-100k
Date

Page mode is deliberate. Scoring one homepage per site keeps a survey this size tractable and comparable. It proxies a site's agent readiness; it does not capture readiness that lives deeper in the site.
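The methodology lists the run as sharded across 64 Cloud Run tasks. A minimal sketch of how such a fan-out could be partitioned, assuming a round-robin split by list position. The function name `shard_for_task`, the split rule, and the placeholder origins are illustrative assumptions, not the survey's actual pipeline. Cloud Run jobs do expose each task's position through the CLOUD_RUN_TASK_INDEX and CLOUD_RUN_TASK_COUNT environment variables.

```python
import os


def shard_for_task(origins, task_index, task_count):
    """Return the slice of origins one task should audit (round-robin by position)."""
    if not 0 <= task_index < task_count:
        raise ValueError("task_index must be in [0, task_count)")
    return origins[task_index::task_count]


if __name__ == "__main__":
    # Cloud Run jobs set these per task; default to a single local task.
    index = int(os.environ.get("CLOUD_RUN_TASK_INDEX", "0"))
    count = int(os.environ.get("CLOUD_RUN_TASK_COUNT", "1"))
    # Placeholder origins standing in for the real CrUX list (hypothetical).
    origins = [f"https://site-{n}.example" for n in range(10)]
    for origin in shard_for_task(origins, index, count):
        print(origin)
```

A round-robin split keeps shard sizes within one site of each other, so no single task becomes the long pole.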

Results: the web scores low

Distribution of agent-readability scores across all 50,074 sites. Mean 50.5, median 52.

Score distribution: number of sites by score range.
0–19: 232
20–39: 10,392
40–59: 25,729
60–79: 13,716
80–100: 5
Band | Sites | Share
Excellent (85–100) | 0 | 0%
Good (70–84) | 1,678 | 3.4%
Fair (50–69) | 26,530 | 53%
Poor (0–49) | 21,866 | 43.7%

The popular web clusters in the middle and below. 43.7% of sites score under 50, and only 3.4% clear 70. Not one site in the top 50,074 scored "excellent" (85 or above); the single best managed 83. Whatever the web was optimized for, it wasn't agents.
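The band cut-offs and shares above can be checked with a few lines. This is a sketch, not the scorecard's own code: the function name `band` is an assumption, while the thresholds and counts are the ones reported in the table.

```python
def band(score):
    """Map a 0-100 score to the band names used in the results table."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 85:
        return "Excellent"
    if score >= 70:
        return "Good"
    if score >= 50:
        return "Fair"
    return "Poor"


# Band counts as reported in the table.
counts = {"Excellent": 0, "Good": 1678, "Fair": 26530, "Poor": 21866}
total = sum(counts.values())
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

print(total, shares)
print(band(83), band(52))  # the best site and the median site
```

The best site (83) lands in "Good" and the median (52) in "Fair", which is why the "Excellent" row is empty.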

Explore all 50,074 sites on the web leaderboard →

Results: the agent-era layer is missing

What the most-visited sites actually expose to agents, against the published v0.2.0 scorecard and the in-flight v0.3.0-draft.

The signals search engines taught the web to ship are nearly everywhere: 87% expose a robots.txt and 63% a sitemap. But only 78% actually let the major AI crawlers in, so more than a fifth of the most-visited sites block them outright.
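Whether a given robots.txt shuts out AI crawlers can be tested locally with Python's standard `urllib.robotparser`. The sketch below checks the four crawlers the survey names (GPTBot, ClaudeBot, CCBot, Google-Extended); the function name `blocked_ai_bots` and the sample file are assumptions, and the scorecard's own robots-txt.allows-ai-bots check may apply different rules.

```python
from urllib.robotparser import RobotFileParser

# Crawler names taken from the survey's results table.
AI_BOTS = ("GPTBot", "ClaudeBot", "CCBot", "Google-Extended")


def blocked_ai_bots(robots_txt, url="https://example.com/"):
    """Return the AI crawler names that this robots.txt disallows for `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_BOTS if not parser.can_fetch(bot, url)]


# Hypothetical robots.txt: blocks one AI crawler, allows everyone else.
sample = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

print(blocked_ai_bots(sample))
```

A site with no robots.txt at all blocks nothing under this check, which is consistent with more sites allowing AI bots than having no file to begin with.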

The signals agents need are not just rare, they are often a mirage. 29% ship an llms.txt, and most are real (97% are non-empty). The other agent files mostly answer with a 200 but not a usable document: 30% appear to have an AGENTS.md while only 0.2% carry the expected sections, and 24% answer at /sitemap.md while only 0.1% are a real structured sitemap.
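One plausible way a path like /AGENTS.md "answers with a 200" without being a usable document is a catch-all route that returns the site's HTML page for any URL. The heuristic below sketches that case only. The function name, the media types checked, and the 200-character sniff window are assumptions; the scorecard's real checks (expected sections, structure) go further than this.

```python
def is_usable_text_file(status, content_type, body):
    """Heuristic: True when a response looks like a plain-text or markdown
    document rather than an error or an HTML fallback page."""
    if status != 200 or not body.strip():
        return False
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type in ("text/html", "application/xhtml+xml"):
        return False
    # Some servers mislabel HTML as text/plain, so sniff the start of the body.
    head = body.lstrip()[:200].lower()
    return not (head.startswith("<!doctype html") or head.startswith("<html"))


print(is_usable_text_file(200, "text/markdown; charset=utf-8", "# Agents\n"))
print(is_usable_text_file(200, "text/html", "<!DOCTYPE html><html></html>"))
```

An agent that only looks at the status code would count both responses as a hit, which is how "exists" rates can sit far above "usable" rates.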

The markdown mirror tells the same story, and the draft scorecard makes it sharper. 27% advertise one, but under v0.3.0-draft only 1.8% of those mirrors are actually markdown rather than HTML. And the page itself often needs a browser to read: v0.3.0-draft finds 74% serve real homepage content in the initial HTML, which leaves about a quarter as JavaScript shells that agents without a JS engine (Claude, Perplexity, OpenAI's SearchBot) see as blank.
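To see why a JavaScript shell reads as blank, it helps to look at what is left once scripts and styles are stripped from the initial HTML. The sketch below uses the standard `html.parser`. The class and function names and the 200-character threshold are assumptions chosen for illustration, not the html.ssr-content check itself.

```python
from html.parser import HTMLParser


class _VisibleText(HTMLParser):
    """Collect text that is not inside script, style, noscript, or template."""

    SKIP = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def visible_text(html):
    """Return the whitespace-normalized text a non-JS agent would see."""
    parser = _VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def looks_like_js_shell(html, min_chars=200):
    """Heuristic: too little visible text means content arrives via JavaScript."""
    return len(visible_text(html)) < min_chars
```

A page whose body is a single empty mount point plus a script bundle yields only its title under this extraction, so an agent without a JS engine has nothing to read.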

Signal | Sites | What's behind the number | From
robots.txt allows AI bots | 78% | 22% block GPTBot, ClaudeBot, CCBot, or Google-Extended | v0.2.0
llms.txt | 29% | 97% of them are non-empty, a real file | v0.2.0
AGENTS.md | 30% | only 0.2% carry the expected sections | v0.2.0
sitemap.md | 24% | only 0.1% are a real structured sitemap | v0.2.0
Markdown mirror advertised | 27% | 4% serve it on request via content negotiation | v0.2.0
Mirror is valid markdown | 1.8% | of advertised mirrors are actually markdown, not HTML | v0.3.0-draft
Homepage server-renders content | 74% | 26% are JavaScript shells, invisible to agents that don't run JS | v0.3.0-draft
No consent interstitial | 93% | 7% gate content behind a wall an agent can't click | v0.3.0-draft
JSON-LD structured data | 37% | 6% include a dateModified | v0.2.0
Glossary link | 0.5% | effectively nobody | v0.2.0

Full reference: every check, by theme

Adoption for all 38 v0.2.0 checks, plus the 5 checks v0.3.0-draft adds. "Of applicable" excludes sites where a check does not apply (the markdown sub-checks, for example, only apply once a mirror exists). Three crawl-dependent checks (discovery.in-page-link, discovery.indexed, discovery.no-duplicate-content) need a multi-page crawl and were not measured by this single-page survey.
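The two columns differ only in the denominator. A small sketch of that arithmetic follows; the function name `adoption` and the counts in the example are made up for illustration and are not survey data.

```python
def adoption(passed, applicable, total):
    """Return (of_all, of_applicable) pass rates as percentages.

    of_applicable is None when the check applies to no site at all.
    """
    of_all = 100 * passed / total
    of_applicable = 100 * passed / applicable if applicable else None
    return of_all, of_applicable


# Hypothetical counts: 1,000 sites, 270 have a mirror, 140 of those pass.
of_all, of_applicable = adoption(passed=140, applicable=270, total=1000)
print(f"of all: {of_all:.0f}%, of applicable: {of_applicable:.0f}%")
```

A sub-check can therefore look weak "of all" simply because few sites have the prerequisite, while "of applicable" shows how well sites do once they have it.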

Discoverability | Of all | Of applicable
sitemap-md.has-structure | 0% | 0%
llms-txt.md-extensions | 0% | 0%
agents-md.has-min-sections | 0% | 0%
llms-txt.content-type | 5% | 19%
sitemap-md.exists | 24% | n/a
sitemap-xml.has-lastmod | 24% | 50%
llms-txt.non-empty | 28% | 97%
llms-txt.exists | 29% | n/a
agents-md.exists | 30% | n/a
sitemap-xml.valid | 47% | 76%
sitemap-xml.exists | 63% | n/a
robots-txt.allows-ai-bots | 78% | n/a
robots-txt.exists | 87% | n/a
robots-txt.allows-llms-txt | 91% | n/a

Markdown mirror | Of all | Of applicable
markdown.frontmatter | 0% | 0%
markdown.sitemap-section | 0% | 0%
markdown.alternate-link | 0% | 0%
markdown.canonical-header | 0% | 0%
markdown.content-negotiation | 4% | 4%
markdown.mirror-suffix | 27% | 27%

Structured data | Of all | Of applicable
html.json-ld.date-modified | 6% | 17%
html.json-ld.breadcrumb | 10% | 27%
html.json-ld | 37% | 37%

Content structure | Of all | Of applicable
html.glossary-link | 0% | 0%
html.text-ratio | 45% | 45%
html.headings | 61% | 61%

HTML metadata | Of all | Of applicable
html.og-description | 52% | 52%
html.canonical-link | 55% | 55%
html.og-title | 55% | 55%
html.meta-description | 65% | 65%
html.lang-attribute | 82% | 82%

HTTP | Of all | Of applicable
http.content-type-html | 83% | 83%
http.redirect-chain | 95% | n/a
http.status-200 | 97% | n/a
http.no-noindex-noai | 100% | n/a

Added in v0.3.0-draft | Of all | Of applicable
markdown.valid-markdown | 0% | 2%
markdown.size-reduction | 5% | 17%
markdown.navigation-stripped | 14% | 52%
html.ssr-content | 74% | 74%
http.no-interstitial | 93% | 93%

And the fix works

The gap matters because closing it pays off. In a controlled A/B (full case study), serving the same site with the agent-readiness layer, versus without it, sharply cut what an AI agent spent to use it, with no loss in answer quality.

a14y score
37 → 89
Agent tokens
−49%
Tool calls
−52%
Wall-clock
−30%
Answer quality
tied (84 vs 83)

Raising one site's a14y score from 37 to 89 cut the agent's token use by about 49% and its tool calls by about 52%, while an independent judge rated the answers statistically indistinguishable. Put the two findings together: the agent-readiness layer roughly halves what an agent spends to use a site, and 73% of the most-visited web hasn't shipped it.

What it means

For site owners

Agent readiness is still a near-empty field. Roughly three in four popular sites haven't shipped the layer, and none has it fully dialed in. That makes it a cheap, early-mover advantage: run npx a14y on your site, ship the top fixes, and roughly halve what every agent spends to use you.

For agent & model builders

The signals you'd want to lean on can't be assumed yet. Only about 1 in 4 popular sites expose an llms.txt or AGENTS.md, so most of the time agents still fall back to the expensive path: fetching and parsing raw HTML. This dataset quantifies exactly where the gaps are.

For the a14y project

This is the baseline. We'll re-run the survey on each scorecard release and track whether the web's agent readiness moves: a standing measure of how the agent-readable web is (or isn't) being built.

Caveats

What this survey does not show, and where to read the numbers with care.

Reproduce

Score any single site with the same tool the survey uses:

npx a14y https://example.com --scorecard 0.2.0

The survey itself is the same audit fanned out across the CrUX list. Batch provenance: crux batch-2026-06-15-100k, 50,074 sites, generated June 18, 2026.