llms.txt directory

A curated directory of websites offering /llms.txt and /llms-full.txt to help LLMs consume site context.

Introduction

llms.txt directory: Technical overview

The llms.txt directory is a curated, searchable index of websites that publish machine-readable /llms.txt and /llms-full.txt files. For each site it records metadata (category, description, main domain), token counts for the short and full manifests, availability status, and direct links to the site's /llms.txt and /llms-full.txt endpoints.

Key features

  • Aggregated registry of sites exposing /llms.txt and /llms-full.txt, including token counts and availability flags.
  • Short and full manifest tracking: records /llms.txt and, where available, /llms-full.txt, so context budgets can be planned per file.
  • Filtering & search: search bar, category filters and sorting to find sites by domain, category or token footprint.
  • Useful metadata: category, description, mainDomain, down flag, skipCheck, and token-size metrics to assist ingestion planning.
  • Submission flow: self-service "Submit" form for site owners to add or update entries.
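
The metadata and filtering features above can be pictured with a minimal Python sketch. The fields category, description, mainDomain, down and skipCheck come from the list above; the token-count field names (llms_tokens, llms_full_tokens), the meaning assigned to each flag, the search() helper and the sample entries are assumptions made for illustration, not the directory's actual schema or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DirectoryEntry:
    # Field names listed in the directory's metadata.
    mainDomain: str
    category: str
    description: str
    down: bool = False       # assumed: True when the site's endpoints were unreachable
    skipCheck: bool = False  # assumed: True when availability checks are skipped
    # Hypothetical names for the token-size metrics; None means "not computed".
    llms_tokens: Optional[int] = None
    llms_full_tokens: Optional[int] = None


def search(entries, query="", category=None):
    """Return entries that are not flagged down and match a substring query and
    an optional category, sorted by short-manifest token count (unknown counts last)."""
    q = query.lower()
    hits = [
        e for e in entries
        if not e.down
        and (category is None or e.category == category)
        and (q in e.mainDomain.lower() or q in e.description.lower())
    ]
    return sorted(hits, key=lambda e: (e.llms_tokens is None, e.llms_tokens or 0))


# Made-up sample data on reserved example domains.
SAMPLE = [
    DirectoryEntry("docs.example.com", "developer-tools", "API docs",
                   llms_tokens=1200, llms_full_tokens=48000),
    DirectoryEntry("shop.example.org", "ecommerce", "Store catalogue", llms_tokens=300),
    DirectoryEntry("old.example.net", "developer-tools", "Legacy docs",
                   down=True, llms_tokens=50),
    DirectoryEntry("sdk.example.io", "developer-tools", "SDK reference"),
]
```

Calling search(SAMPLE, category="developer-tools") would return the docs.example.com and sdk.example.io entries in that order, since the entry flagged as down is dropped and the entry without a token count sorts last.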

Technical use cases

  • Data ingestion planning for RAG pipelines: pick domains and files by token counts to control context size and cost.
  • Automated crawlers and indexers: use the directory as a source of canonical /llms.txt and /llms-full.txt endpoints from which to fetch site manifests.
  • AI agents and browsers: discover sites that explicitly publish machine-readable usage instructions and context to improve prompt safety and fidelity at inference time.
  • SEO & GEO (generative engine optimization) tooling: identify sites that surface content specifically for LLMs, in order to optimize visibility in generative search.
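
As a rough illustration of the ingestion-planning use case, the sketch below picks one file per site under a fixed token budget. It assumes each entry is a plain dict with hypothetical keys (domain, short_tokens, full_tokens) and uses a simple greedy rule; it is one possible strategy, not something the directory prescribes.

```python
def plan_ingestion(entries, budget):
    """Greedy plan: visit entries in the given order, preferring the full manifest
    when it fits in the remaining budget, else the short one, else skipping."""
    plan = []
    remaining = budget
    for entry in entries:
        # Try the larger file first, then fall back to the smaller one.
        for path, key in (("/llms-full.txt", "full_tokens"), ("/llms.txt", "short_tokens")):
            tokens = entry.get(key)
            if tokens is not None and tokens <= remaining:
                plan.append((entry["domain"], path, tokens))
                remaining -= tokens
                break
    return plan, remaining


# Made-up token counts; None means no count is available for that file.
ENTRIES = [
    {"domain": "docs.example.com", "short_tokens": 1200, "full_tokens": 48000},
    {"domain": "shop.example.org", "short_tokens": 300, "full_tokens": None},
    {"domain": "blog.example.net", "short_tokens": 900, "full_tokens": 7000},
]

PLAN, LEFTOVER = plan_ingestion(ENTRIES, budget=10_000)
```

With a 10,000-token budget, the full manifest of docs.example.com does not fit, so its short manifest is chosen instead, while blog.example.net still has room for its full manifest. A greedy pass depends on input order; sorting entries by priority beforehand is a natural refinement.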

Target users

  • LLM engineers building retrieval-augmented systems
  • API and platform engineers who need canonical site manifests for automated crawlers
  • SEO and growth engineers optimizing for generative search engines
  • Site owners who want to expose structured guidance to AI systems

Implementation notes

  • Each entry includes direct links to /llms.txt and /llms-full.txt; many entries include precomputed token counts to aid selection.
  • The directory is built with Astro + shadcn/ui + Tailwind CSS and offers an open submission link for additions.
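
For readers who only have a site's main domain, the direct links can usually be derived rather than looked up. The sketch below assumes both files sit at the site root and defaults to HTTPS; endpoint_urls() and fetch_targets() are hypothetical helpers, and the dict keys mainDomain and down simply mirror the metadata names used by the directory.

```python
from urllib.parse import urlsplit, urlunsplit


def endpoint_urls(main_domain):
    """Build the two manifest URLs for a bare domain or a full URL."""
    # Accept "example.com" as well as "https://example.com/some/page".
    target = main_domain if "://" in main_domain else "https://" + main_domain
    parts = urlsplit(target)
    return {
        name: urlunsplit((parts.scheme, parts.netloc, "/" + name, "", ""))
        for name in ("llms.txt", "llms-full.txt")
    }


def fetch_targets(entries):
    """List manifest URLs for every entry that is not flagged as down."""
    return [
        url
        for entry in entries
        if not entry.get("down")
        for url in endpoint_urls(entry["mainDomain"]).values()
    ]
```

Not every site publishes /llms-full.txt, so a crawler built on this should treat a missing full manifest as normal rather than as an error.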

Unique selling points

  • Focused on the emerging llms.txt convention: aims to provide a single index of LLM-aware site manifests.
  • Token-count-aware: helps teams design cost-efficient retrieval and context injection strategies.
  • Community-driven: easy submission and curated entries help maintain quality and relevance.
