Scorecard v0.3.0-draft

Site checks

Discoverability · 16 checks

llms-txt.exists: llms.txt is published. Pass if llms.txt or llms-full.txt is reachable at /, /.well-known/, or /docs/ and is not an HTML page (a soft-200 SPA shell or styled 404 does not count).
llms-txt.content-type: llms.txt served as text/plain. Pass if the llms.txt response Content-Type starts with text/plain.
llms-txt.non-empty: llms.txt is not empty. Pass if the llms.txt body has any non-whitespace content.
llms-txt.md-extensions: llms.txt links use .md or .mdx. Pass if every link in llms.txt points at a .md or .mdx URL (the format agents can ingest cleanly).
robots-txt.exists: robots.txt is published. Pass if /robots.txt returns a 2xx response.
robots-txt.allows-ai-bots: robots.txt allows AI bots. Pass if robots.txt does not disallow GPTBot, ClaudeBot, CCBot, or Google-Extended from fetching the site root.
robots-txt.allows-llms-txt: robots.txt does not disallow llms.txt. Pass if the robots.txt rules leave /llms.txt and /.well-known/llms.txt fetchable by every user-agent.
sitemap-xml.exists: sitemap.xml is published. Pass if /sitemap.xml (or sitemap_index.xml / sitemap-index.xml) returns a 2xx response.
sitemap-xml.valid: sitemap.xml parses as urlset or sitemapindex. Pass if the sitemap parses as XML and contains <urlset> or <sitemapindex>.
sitemap-xml.has-lastmod: sitemap entries include <lastmod>. Pass if every <url> in sitemap.xml has a <lastmod> child whose value parses as a W3C Datetime (the format sitemaps.org requires).
sitemap-md.exists: sitemap.md is published. Pass if /sitemap.md, /docs/sitemap.md, or /.well-known/sitemap.md returns a 2xx response that is not an HTML page (a soft-200 SPA shell or styled 404 does not count).
sitemap-md.has-structure: sitemap.md has headings and links. Pass if sitemap.md contains at least one heading and one link.
agents-md.exists: AGENTS.md (or equivalent) is published. Pass if an agent skill file is found that returns a 2xx response and is not an HTML page (a soft-200 SPA shell or styled 404 does not count).
agents-md.has-min-sections: agent skill file documents at least 2 of install/config/usage. Pass if the discovered skill file has heading-level sections matching at least 2 of: installation, configuration, usage/examples.
discovery.no-duplicate-content: No URL shares a canonical with another announced URL. Pass if no two crawled URLs collapse to the same canonical. N/A in single-page mode (no cross-page view).
discovery.in-page-link: Agent files are linked in-page. Pass if a top-level page (the root URL or a first-level path like /docs) links in-page (in-DOM <a href>) to an agent-discovery file (/llms.txt, /llms-full.txt, /sitemap.md, /AGENTS.md, or the page's .md mirror); warn if only a deeper page does; fail if no crawled page does. N/A in single-page mode.
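
As an illustration of the robots-txt.allows-ai-bots criterion above, here is a minimal sketch using Python's standard urllib.robotparser. The function name blocked_ai_bots and the example.com root URL are hypothetical, and the scorecard's own implementation may interpret robots.txt differently (for instance in how it treats wildcard rules).

```python
from urllib.robotparser import RobotFileParser

# User-agents named by the robots-txt.allows-ai-bots check.
AI_BOTS = ("GPTBot", "ClaudeBot", "CCBot", "Google-Extended")


def blocked_ai_bots(robots_txt: str, site_root: str = "https://example.com/") -> list[str]:
    """Return the AI bots that robots_txt forbids from fetching site_root.

    Hypothetical helper: an empty result corresponds to a pass.
    """
    parser = RobotFileParser()
    # parse() takes the file as a list of lines; no network access is needed.
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_BOTS if not parser.can_fetch(bot, site_root)]


if __name__ == "__main__":
    sample = "User-agent: GPTBot\nDisallow: /\n\nUser-agent: *\nAllow: /\n"
    print(blocked_ai_bots(sample))
```

Only the site root is tested here, so a rule such as Disallow: /private does not count as blocking a bot, which matches the wording of the criterion.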

Page checks

HTTP · 5 checks

HTML metadata · 5 checks

Structured data · 3 checks

Content structure · 4 checks

Markdown mirror · 9 checks

Code · 1 check

code.language-tags: Code blocks declare a language. Pass if every <pre><code> block has a language-* or lang-* class on either the <code> or its parent <pre>.
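
The code.language-tags rule can be approximated with the standard html.parser module. This is a sketch under two assumptions that the text above does not settle: any <code> nested inside a <pre> counts as a block (not only a direct child), and a page with no code blocks passes vacuously. The names CodeBlockScanner and all_code_blocks_tagged are illustrative.

```python
from html.parser import HTMLParser


def _has_language_class(attrs) -> bool:
    """True if the class attribute holds a language-* or lang-* token."""
    classes = (dict(attrs).get("class") or "").split()
    return any(c.startswith(("language-", "lang-")) for c in classes)


class CodeBlockScanner(HTMLParser):
    """Counts <code> elements inside <pre> and those lacking a language class."""

    def __init__(self):
        super().__init__()
        self._pre_stack = []  # one bool per open <pre>: does it carry a language class?
        self.total = 0
        self.untagged = 0

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self._pre_stack.append(_has_language_class(attrs))
        elif tag == "code" and self._pre_stack:
            self.total += 1
            # The class may sit on the <code> itself or on the enclosing <pre>.
            if not (self._pre_stack[-1] or _has_language_class(attrs)):
                self.untagged += 1

    def handle_endtag(self, tag):
        if tag == "pre" and self._pre_stack:
            self._pre_stack.pop()


def all_code_blocks_tagged(html: str) -> bool:
    scanner = CodeBlockScanner()
    scanner.feed(html)
    scanner.close()
    return scanner.untagged == 0
```

Inline <code> outside a <pre> is ignored, since the criterion only mentions <pre><code> blocks.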

API · 1 check

api.schema-link: API pages link to a machine-readable schema. Applies only to URLs whose path looks like API documentation. Pass if the page links to openapi.json, swagger.json, swagger.yaml, or schema.json; returns "na" for non-API pages.
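
A rough sketch of the three outcomes of api.schema-link follows. The text does not define what "looks like API documentation" means, so looks_like_api_path below is a purely hypothetical stand-in (it only checks for an "api" path segment). Counting both <a> and <link> elements as links is likewise an assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Schema file names listed by the check.
SCHEMA_FILES = {"openapi.json", "swagger.json", "swagger.yaml", "schema.json"}


class _HrefCollector(HTMLParser):
    """Collects href values from <a> and <link> elements (assumed scope)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag in ("a", "link"):
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def looks_like_api_path(url: str) -> bool:
    """Hypothetical heuristic: the URL path has a segment named 'api'."""
    segments = [s.lower() for s in urlparse(url).path.split("/") if s]
    return "api" in segments


def schema_link_check(page_url: str, html: str) -> str:
    """Return 'na', 'pass', or 'fail' for one page."""
    if not looks_like_api_path(page_url):
        return "na"
    collector = _HrefCollector()
    collector.feed(html)
    collector.close()
    for href in collector.hrefs:
        # Compare only the final path component, ignoring query and fragment.
        filename = urlparse(href).path.rsplit("/", 1)[-1].lower()
        if filename in SCHEMA_FILES:
            return "pass"
    return "fail"
```

Returning "na" before parsing the HTML keeps non-API pages out of the pass/fail tally entirely.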

Discoverability · 1 check

discovery.indexed: Page is indexed by sitemap, llms.txt, or sitemap.md. Pass if the page URL appears in at least one of the discovered indexes (sitemap.xml, llms.txt, sitemap.md). Pages found only by link crawling fail this check.
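
The membership test behind discovery.indexed might look like the sketch below. How URLs are compared is not stated above, so the normalization (lowercased host, trailing slash ignored, query and fragment dropped) is an assumption, as are the names normalize and indexed_check. The sketch also does not map a .md mirror URL in llms.txt back to its HTML page.

```python
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Assumed normalization: lowercase scheme and host, drop trailing slash, query, fragment."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    return f"{parts.scheme.lower()}://{parts.netloc.lower()}{path}"


def indexed_check(page_url: str, indexes: dict[str, set[str]]) -> tuple[bool, list[str]]:
    """Return (passed, names of the indexes that list page_url).

    `indexes` maps an index name (e.g. "sitemap.xml") to the URLs it announces.
    """
    target = normalize(page_url)
    found_in = [
        name
        for name, urls in indexes.items()
        if target in {normalize(u) for u in urls}
    ]
    return bool(found_in), found_in


if __name__ == "__main__":
    indexes = {
        "sitemap.xml": {"https://example.com/docs/"},
        "llms.txt": set(),
        "sitemap.md": set(),
    }
    print(indexed_check("https://example.com/docs", indexes))
    print(indexed_check("https://example.com/orphan", indexes))
```

A page reached only by following links appears in none of the sets, so it fails, as the criterion requires.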

Changes vs v0.2.0
