v0.2.0 ablation · June 25, 2026 · 24 variants
Which agent-readiness features actually pay off
- Full layer (Claude)
- −47% tokens
- Best single feature
- −48.6% md-mirrors
- Answer quality
- flat ~75/100 across all variants
Summary
Ship a markdown mirror and a real meta description first. Skip the agent-skills directory for now.
We took a14y.dev, toggled each of its 11 agent-readiness features on and off, and measured what each one
is worth to an AI agent doing a real retrieval task. For Claude, the full discovery layer cuts token use
about 47%, and a markdown mirror alone does most of that. But the features do not simply add up: some
substitute for each other, one (the agent-skills directory) actively makes things worse, and a judge
re-grade found the answers stayed exactly as accurate. The features make the agent faster and more
efficient, not smarter. We ran the same probe through Codex and Cursor too, but only Claude's runs were
cleanly instrumented enough to quantify, which is a finding in itself (more on that below).
← All research