The Guardian
Site checks · 2/7 passed
Evaluated once against the site's origin. These checks cover discoverability
surfaces such as llms.txt, AGENTS.md, and sitemap signals.
- FAIL llms-txt.exists No llms.txt or llms-full.txt found at /, /.well-known/, or /docs/
- PASS robots-txt.exists https://www.theguardian.com/robots.txt
- FAIL robots-txt.allows-ai-bots Blocks: ClaudeBot, CCBot
- PASS robots-txt.allows-llms-txt
- FAIL sitemap-xml.exists sitemap.xml not reachable
- FAIL sitemap-md.exists sitemap.md not reachable
- FAIL agents-md.exists No agent skill file found
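
A check like robots-txt.allows-ai-bots can be approximated with Python's standard urllib.robotparser module. This is a minimal sketch, not the audit tool's actual implementation: the function name `blocked_bots`, the list of bot names, and the sample robots.txt text are all assumptions made for illustration, and the sample is not The Guardian's real file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only to illustrate the check.
SAMPLE_ROBOTS_TXT = """\
User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Disallow: /private/
"""

# Assumed list of AI crawler user agents; the real audit may test others.
AI_BOTS = ["GPTBot", "ClaudeBot", "CCBot"]


def blocked_bots(robots_text, bots, path="/"):
    """Return the bots from `bots` that may not fetch `path`."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_text.splitlines())
    return [bot for bot in bots if not parser.can_fetch(bot, path)]


if __name__ == "__main__":
    blocked = blocked_bots(SAMPLE_ROBOTS_TXT, AI_BOTS)
    status = "FAIL" if blocked else "PASS"
    print(f"- {status} robots-txt.allows-ai-bots Blocks: {', '.join(blocked)}")
```

Testing the site root is a simplification. A bot that is allowed at / could still be disallowed on the paths that matter, so a stricter check would test each audited page URL as well.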
Pages · 1
Single page audited: https://www.theguardian.com/us.
- PASS http.status-200 200
- PASS http.redirect-chain 1 hop
- PASS http.content-type-html text/html; charset=utf-8
- PASS http.no-noindex-noai bingbot: noarchive
- PASS html.canonical-link https://www.theguardian.com
- PASS html.meta-description 128 chars
- FAIL html.og-title missing
- FAIL html.og-description missing
- PASS html.lang-attribute en
- FAIL html.json-ld no parseable JSON-LD found
- PASS html.headings 199 headings
- FAIL html.text-ratio 2.4%
- FAIL html.glossary-link no glossary/terminology link
- FAIL markdown.mirror-suffix no .md/.mdx mirror found
- FAIL markdown.alternate-link no <link rel="alternate" type="text/markdown">
- FAIL markdown.content-negotiation text/html; charset=utf-8
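
The last two markdown checks above can be sketched offline with the standard library. This is an illustration under stated assumptions, not the audit tool's code: the helper names `find_markdown_alternates` and `is_markdown_response` are hypothetical, and the sketch assumes the content-negotiation check passes only when a request sent with `Accept: text/markdown` returns a `text/markdown` Content-Type.

```python
from html.parser import HTMLParser


class AlternateMarkdownFinder(HTMLParser):
    """Collect href values of <link rel="alternate" type="text/markdown">."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # Attribute values can be None for bare attributes; normalise them.
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        if "alternate" in rels and attr.get("type", "").lower() == "text/markdown":
            self.hrefs.append(attr.get("href", ""))


def find_markdown_alternates(html):
    """Return the hrefs of all markdown alternate links in `html`."""
    finder = AlternateMarkdownFinder()
    finder.feed(html)
    return finder.hrefs


def is_markdown_response(content_type):
    """True when a Content-Type header value names text/markdown."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "text/markdown"


if __name__ == "__main__":
    # The header value reported for the audited page.
    print(is_markdown_response("text/html; charset=utf-8"))
```

Under these assumptions, the reported `text/html; charset=utf-8` response is classified as not markdown, which is consistent with the FAIL shown for markdown.content-negotiation.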