Does llms.txt Actually Work? An Honest Read of the 2026 Evidence

TL;DR

llms.txt sits at roughly 10% adoption, but the largest analysis found no correlation with AI citations. Google has said it has no plans to use it. Verdict: ship it cheaply, do not expect a visibility spike.

Published 2026-05-30 · Last reviewed 2026-05-30

In May 2026, llms.txt sits in the same category as the keywords meta tag did in 2009: widely deployed, easy to ship, loudly recommended by a slice of the SEO market, and unsupported by the dominant search and answer engines that are supposed to consume it. That gap between adoption and impact is the whole story.

This article is a deliberately honest read of the public evidence rather than a hype piece or a hit piece. The data points come from the largest available study (SE Ranking’s analysis of roughly 300,000 domains), Google’s own stated position, a counter-argument from Wix’s AI Search Lab, and the on-the-ground experience of teams who shipped llms.txt early. It is aimed at site owners, founders, and content strategists who want to know whether to spend the next billable hour writing one.

The short version: llms.txt is cheap to add and will not hurt you, but the available evidence does not back the “publish an llms.txt and watch your AI citations spike” framing that is doing the rounds. The interesting question is no longer “does it work” but “what is the smallest reasonable bet you can place on it without overpromising.”

What llms.txt is, and what it promises

The proposal landed on 3 September 2024 from Jeremy Howard and the team at Answer.AI, hosted at llmstxt.org. The pitch was straightforward. Modern websites are bloated with navigation, scripts, ads, and template chrome that consume context budget without carrying useful signal for a model trying to answer a question. A site owner who already knows which pages matter could publish a small Markdown index, llms.txt, at the root of the domain, listing the primary documentation, the canonical FAQ, the most-cited blog posts. A model with a small context window could fetch that file first and follow only the links that are actually relevant.

The format itself is intentionally trivial. An llms.txt is a Markdown file with an H1 title, an optional paragraph summary, then sections of links grouped by topic. Many implementations also publish an expanded variant called llms-full.txt, which inlines the body of each linked document so a single fetch returns the full text. Both are static files. There is no schema, no validator, no runtime API. A site owner who can write a README can write an llms.txt in under an hour.
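To make the shape concrete, here is a minimal sketch of a renderer for that layout. The `Link`, `Section`, and `render_llms_txt` names are illustrative assumptions rather than part of any official tooling, `example.com` is a placeholder domain, and the summary is written as a Markdown blockquote on the assumption that this matches the llmstxt.org proposal.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    title: str
    url: str
    note: str = ""  # optional one-line description shown after the link


@dataclass
class Section:
    heading: str
    links: list[Link] = field(default_factory=list)


def render_llms_txt(title: str, summary: str, sections: list[Section]) -> str:
    """Render an llms.txt body: H1 title, optional summary, then H2 link sections."""
    lines = [f"# {title}", ""]
    if summary:
        # Assumption: the summary is emitted as a Markdown blockquote.
        lines += [f"> {summary}", ""]
    for section in sections:
        lines += [f"## {section.heading}", ""]
        for link in section.links:
            suffix = f": {link.note}" if link.note else ""
            lines.append(f"- [{link.title}]({link.url}){suffix}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site content, for illustration only.
example = render_llms_txt(
    "Example Docs",
    "Reference documentation and FAQ for the Example product.",
    [
        Section("Docs", [Link("Quickstart", "https://example.com/docs/quickstart.md", "Install and first run")]),
        Section("FAQ", [Link("Billing FAQ", "https://example.com/faq/billing.md")]),
    ],
)
print(example)
```

Generating the file from one canonical list of pages, rather than editing it by hand in several places, is one way to keep it from drifting out of date.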

The promise wrapped around the format is bigger than the file itself. Vendors marketing llms.txt frame it as the entry point that large language models will prefer when answering questions about your site, the rough analog of what robots.txt is for crawlers or sitemap.xml is for search engines. Some go further: ship one, the pitch runs, and your odds of being cited in ChatGPT, Perplexity, and Google’s AI Overviews go up. That is the load-bearing claim, and it is the one the evidence has to support for the file to matter.

Two distinctions are worth pinning down before looking at the evidence. First, llms.txt is not a Google standard or an OpenAI standard; it is a proposal from Jeremy Howard that has spread through SEO blogs and starter templates. Second, llms.txt is not the same as AGENTS.md. AGENTS.md is a separate convention stewarded by the Agentic AI Foundation that documents build and test instructions for coding agents working inside a repository. The two files are sometimes lumped together, but they target different agents in different contexts and neither was designed to be paired with the other.

What the data actually shows

The largest available analysis comes from SE Ranking, published in 2025 and re-circulated in 2026. The team pulled roughly 300,000 domains, recorded whether each had an llms.txt at the root, and then measured whether those domains were cited more or less often in answers from major LLM-powered assistants. To test the hypothesis that publishing the file lifts AI citation rates, they used two statistical methods: a correlation analysis, and an XGBoost machine-learning model trained both with and without the llms.txt variable.

The headline finding is the one nobody marketing llms.txt wants to repeat. Across the 300,000 domains, there was no observed correlation between having an llms.txt and being cited more often in LLM answers. The XGBoost model actually got slightly better at predicting citation likelihood when the llms.txt variable was removed, which is the pattern data scientists expect when a feature is adding noise rather than signal. SE Ranking’s own write-up summarises this with rare bluntness for the SEO trade: “if the goal is a near-term visibility bump in AI answers, the data says you should not expect one.”
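For readers who want to see what "no correlation" means mechanically, the sketch below computes a Pearson coefficient between a 0/1 "has llms.txt" flag and a citation count. It is an illustration of the general technique, not a reconstruction of SE Ranking's method, and the numbers are made-up toy values chosen so both groups have the same citation spread.

```python
from math import sqrt


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation; with a 0/1 xs this is the point-biserial coefficient."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Toy, made-up numbers for illustration only (not data from any study).
has_llms_txt = [1, 1, 1, 0, 0, 0]    # 1 = domain publishes llms.txt
citation_count = [2, 4, 6, 2, 4, 6]  # citations observed per domain

r = pearson_r(has_llms_txt, citation_count)
print(f"r = {r:.2f}")  # prints "r = 0.00": identical citation spread in both groups
```

A coefficient near zero says the flag carries no linear information about citation counts in the sample; it does not by itself rule out effects that a different measurement window might reveal.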

A second study from Trakkr, scanning 37,894 domains, reached the same verdict from a different angle. They looked at citation-rate differences between domains that had shipped an llms.txt and matched controls that had not, and reported no measurable citation advantage. The two studies use different samples and different methodologies; the agreement on the null result is the data point that matters.

Adoption itself is real. SE Ranking found that 10.13% of the measured domains carried an llms.txt file. The distribution across traffic tiers is also worth knowing: 9.88% of low-traffic sites (0 to 100 visits), 10.54% of mid-traffic sites (1,001 to 5,000 visits), and 8.27% of high-traffic sites (100,001+ visits). The largest sites are actually slightly less likely to ship the file than mid-tier ones. That pattern, broad adoption that does not skew toward the most sophisticated publishers, is the fingerprint of a low-cost convention rather than a competitive moat.

None of this proves llms.txt cannot work. The honest caveat is that the studies measure observable citation behavior in a window where the file has been around for roughly twenty months and the dominant runtime consumers (Google, OpenAI, Anthropic) have not committed to using it. The absence of an effect in May 2026 is what the data shows. Whether that changes if a major provider switches on runtime use is a separate question, taken up in the next section.

What Google has said, and what other providers have not

The clearest signal on the provider side came from Google. At the Search Central Deep Dive event in Bangkok in July 2025, Gary Illyes, who has been Google’s public voice on crawling and indexing for years, stated that Google does not support llms.txt and has no plans to start. The wording, reported in detail by Search Engine Land in the months that followed, mapped llms.txt onto a familiar Google argument: site-owner-controlled signals about the importance or summary of a page are inherently gameable, and Google has spent years pulling that class of signal out of its ranking inputs rather than adding more in. John Mueller, another Google search advocate, made the analogy explicit by comparing llms.txt to the keywords meta tag, which Google has effectively ignored since the late 2000s.

In December 2025 a small irony made the rounds: an llms.txt briefly appeared on Google’s own developer documentation site and was removed the same day. The exact reason was not explained publicly, but the timing reinforced the official position. Google’s broader 2026 guidance is that publishers who want to appear in AI Overviews should focus on standard SEO practice, not on emitting AI-specific files.

OpenAI, Anthropic, and Perplexity have been quieter. None has published a clear statement on whether ChatGPT, Claude, Perplexity Pro, or the assistant retrieval layers they expose to enterprise customers fetch or weight an llms.txt at runtime. There are scattered third-party observations (Cloudflare logs of bot user-agents pulling the file, anecdotal operator reports of small upticks after publishing) but nothing that rises to the level of an on-the-record provider commitment.
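Site owners can make the same kind of observation on their own traffic. The sketch below counts successful fetches of llms.txt and llms-full.txt per user agent, assuming access logs in the common "combined" format used by many web servers; other log formats (including JSON exports) would need a different parser. The sample lines use documentation-range IP addresses and invented bot names.

```python
import re
from collections import Counter

# Request, status, and user-agent fields of a combined-format access log line.
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)
TARGETS = {"/llms.txt", "/llms-full.txt"}


def count_llms_txt_fetches(lines) -> Counter:
    """Count successful fetches of llms.txt files, keyed by (path, user agent)."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        path = match["path"].split("?", 1)[0]  # ignore any query string
        if path in TARGETS and match["status"] == "200":
            hits[(path, match["agent"])] += 1
    return hits


# Invented sample lines; ExampleBot and OtherBot are hypothetical user agents.
sample = [
    '203.0.113.7 - - [30/May/2026:10:00:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
    '203.0.113.7 - - [30/May/2026:10:05:00 +0000] "GET /llms.txt?src=x HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
    '198.51.100.9 - - [30/May/2026:10:06:00 +0000] "GET /llms-full.txt HTTP/1.1" 200 9000 "-" "Mozilla/5.0 (compatible; OtherBot/2.1)"',
    '203.0.113.7 - - [30/May/2026:10:07:00 +0000] "GET /llms.txt HTTP/1.1" 404 0 "-" "ExampleBot/1.0"',
    '192.0.2.1 - - [30/May/2026:10:08:00 +0000] "GET /index.html HTTP/1.1" 200 4096 "-" "Mozilla/5.0"',
]

for (path, agent), count in count_llms_txt_fetches(sample).most_common():
    print(f"{count:3d}  {path}  {agent}")
```

A count like this shows that something fetched the file. As with the index counts discussed later, it says nothing about whether the fetched content influenced any answer.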

The honest read is narrow and important. The dominant runtime consumers of web content for AI answers, in May 2026, have either said they do not use llms.txt or have not committed publicly to using it. That is not the same as saying they will never use it. It is the basis on which a site owner has to decide today whether the file is a probable lever or a speculative one. The current evidence places it firmly in the speculative bucket.

The counterpoint: why Wix and others still ship it

The strongest published counter-argument in 2026 comes from Wix’s AI Search Lab. In a post titled “Debunking LLMs.txt Myths: What You Need to Know for AI Visibility,” the team made three concrete claims. Google’s own index contained between 30,000 and 60,000 llms.txt files as of October 2025, which they read as evidence that Google is in fact crawling the file even if it has publicly said otherwise. The format is token-efficient (a clean Markdown index can require a small fraction of the tokens of a rendered HTML page), which becomes meaningful as more workflows shift to agentic retrieval over limited context windows. And the author had reviewed more than 1,400 llms.txt files in the wild, which is the largest hand-graded sample reported in the trade press.

Each of those claims is worth taking seriously, and each has a fair caveat. Index counts tell you Google fetched the file; they do not tell you any model used the file to generate an answer. The same crawler that pulls llms.txt also pulls robots.txt, sitemap variants, and a long tail of other well-known files, and indexing presence is not the same as ranking input. The token-efficiency argument is real but applies only to whichever consumer chooses to fetch the file; if Google’s mainline retrieval ignores it, the savings accrue only to the consumers that do fetch it, not to Google’s answers. And the 1,400-file review is qualitative; it documents what people write inside llms.txt, not whether writing it changed their odds of being cited.
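The token-efficiency point can be checked roughly on any page. The sketch below measures what fraction of a page's raw HTML is markup, scripts, and styles rather than visible text. Character counts are used here as a crude stand-in for tokens, which is an assumption: real tokenizers differ, so treat the result as an order-of-magnitude indicator only. The function names are illustrative.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def visible_text(html: str) -> str:
    """Return the page's visible text with markup removed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def markup_overhead(html: str) -> float:
    """Fraction of the raw HTML's characters that are not visible text."""
    if not html:
        return 0.0
    return 1 - len(visible_text(html)) / len(html)


# Tiny invented page, for illustration only.
page = (
    "<html><head><style>p{color:red}</style><script>var a=1;</script></head>"
    "<body><nav>Home</nav><p>Hello world</p></body></html>"
)
print(visible_text(page))
print(f"{markup_overhead(page):.0%} of characters are markup, scripts, or styles")
```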

A second class of pro voices comes from the broader “agent-ready web” SEO community, which argues that the file is cheap insurance for an agentic future and that early adoption builds operational muscle for the conventions that will eventually matter. That argument is honest as long as it is framed as a future-leaning bet rather than a present-day citation lever. It collapses the moment it gets re-translated, on the way to a client deck, into “publish this and your AI traffic goes up.”

Read together, the counterpoint does not flip the verdict. It clarifies it. Some providers are fetching llms.txt today; none of the dominant answer engines have publicly committed to using it as a ranking or citation input. The market is hedging, the providers are not, and the honest middle is the one the next section spells out.

The honest verdict for 2026

If llms.txt costs an hour to ship and your site has documentation or a content library already in Markdown, the calculus is straightforward: do it, treat it as table-stakes hygiene the way you treat robots.txt, and move on. The downside is small (a stale file readers can ignore) and the upside, if a major provider quietly starts using it, is captured for free. The mistake is paying to have the file built and maintained as a recurring service line on the promise of an AI-citation lift the data does not back.

Three concrete rules cover most situations. First, write llms.txt by hand or generate it from canonical source, then commit to keeping it accurate. A stale or aspirational llms.txt that lists pages that no longer exist or summaries that no longer match the body is a credibility tax in the small set of cases where a model does fetch it. Second, treat it as a static convention, not a signal. The conventional file (Markdown, root path, links and short summaries) is what will be portable if a standard ever crystallises around it; the bolted-on “AI optimization” wrappers some plugins generate add code paths without adding signal. Third, do not bill llms.txt as the AI visibility line item on the proposal. The verifiable AI visibility levers in 2026 are the boring ones: substantive content on a fast, well-structured, server-rendered site that already ranks for human queries, with clean schema and a coherent topical narrative. llms.txt is, at best, an accessory to that work. It is one row in a broader agent-readiness picture; the audit checklist of what the scorers do and do not catch puts the file in context.
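The first rule, keeping the file accurate, is easy to automate. The sketch below extracts the Markdown links from an llms.txt body and reports any that are missing from a set of URLs known to be live. Where that set comes from is an assumption left to the reader (a sitemap, a build manifest, or a crawl of the site); the function names and sample content are hypothetical.

```python
import re

# Markdown inline links of the form [title](url).
MD_LINK = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def extract_links(llms_txt: str) -> list[tuple[str, str]]:
    """Return every (title, url) pair linked from the file."""
    return MD_LINK.findall(llms_txt)


def find_stale_links(llms_txt: str, live_urls: set[str]) -> list[tuple[str, str]]:
    """Return (title, url) pairs listed in llms.txt but absent from live_urls."""
    return [(title, url) for title, url in extract_links(llms_txt) if url not in live_urls]


# Hypothetical file content and live-URL set, for illustration only.
sample_llms_txt = """# Example Docs

## Docs

- [Quickstart](https://example.com/docs/quickstart.md): Install and first run
- [Old API](https://example.com/docs/v1-api.md): Retired endpoints
"""
live = {"https://example.com/docs/quickstart.md"}

for title, url in find_stale_links(sample_llms_txt, live):
    print(f"stale: {title} -> {url}")
```

Run as part of a deploy or a scheduled job, a check like this can turn "commit to keeping it accurate" from a promise into a failing build.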

If the file does become load-bearing later (a major provider switches on runtime use, a regulator nudges publishers toward an AI-readable index, the spec gets cleaner), the cheapest position to be in is “we ship one, it is accurate, we did not over-promise it.” That is the honest middle, and it is the one this article is recommending.

How this connects to Tobira

llms.txt is a readability play. The file makes a site easier for a language model to read once a model already knows the site exists and has chosen to fetch it. That is genuinely useful, and it is genuinely narrow. It does not help an agent acting on behalf of a person decide which expert to contact, whether to trust the entity behind a website, or how to exchange identity before sending a real message. Those are different layers.

Those other layers are what Tobira is building. A Tobira @handle adds the addressability and identity row that llms.txt is not designed to fill: a human-readable name for an agent (or for the site agent representing a business), a public profile other agents can qualify against, and a mutual-reveal step before either side learns who the other is. The two layers are complementary. A well-written llms.txt makes your content cleaner to read; a Tobira @handle makes the entity behind that content addressable and qualifiable by other agents. Neither replaces the other, and a site owner who cares about both is solving two genuinely different problems. For the longer version of the agent-readable vs agent-addressable distinction, see “Why your AI agent needs a name, not a wallet address.”

Takeaways

llms.txt is a Markdown index at the root of your site, proposed by Answer.AI in September 2024 to give large language models a cleaner reading path.
The largest public analysis (SE Ranking, ~300,000 domains) found no correlation between having llms.txt and AI citation frequency. A second study (Trakkr, 37,894 domains) reached the same conclusion.
Adoption is real (about 10% of measured domains, slightly lower on the largest sites) but does not predict citation lift.
Google has publicly said it does not use llms.txt and has no plans to. OpenAI, Anthropic, and Perplexity have not publicly committed either way.
The strongest counter-argument (Wix AI Search Lab) shows Google indexing llms.txt files, which is not the same as using them in answers.
Honest verdict for 2026: ship one if it costs you an hour, keep it accurate, treat it as future insurance rather than a present-day AI visibility lever.
The verifiable AI visibility levers in 2026 are still substantive content, fast clean rendering, structured data, and topical depth.

FAQ

Is llms.txt the same as AGENTS.md?

No. llms.txt is a Markdown index at the root of a website, proposed by Answer.AI in September 2024 to give language models a clean reading path. AGENTS.md is a separate convention, stewarded by the Agentic AI Foundation under the Linux Foundation, that documents build and test instructions for coding agents working inside a code repository. They target different agents in different contexts and were not designed as a pair.

Will llms.txt help me rank in Google AI Overviews?

The available evidence says no. Google has stated publicly (Gary Illyes, Search Central Deep Dive, July 2025) that it does not use llms.txt and has no plans to. John Mueller compared the file to the keywords meta tag, which Google has ignored since the late 2000s. Google’s broader 2026 guidance is that publishers who want to appear in AI Overviews should focus on standard SEO practice.

Should I publish llms.txt if I run a small business website?

If it costs about an hour and you can keep it accurate, yes. Treat it as table-stakes hygiene the way you treat robots.txt. Do not pay an agency to maintain it as a recurring service line on the promise of an AI-citation lift the data does not back. The verifiable AI visibility levers in 2026 are still substantive content, fast clean rendering, structured data, and topical depth.

What about llms-full.txt, the expanded variant?

llms-full.txt inlines the body of each linked document into a single file so a model can fetch the whole content set in one request. It is the same convention scaled up, and the same caveats apply: it is cheap to ship, it does not hurt, and there is no public evidence that the dominant runtime consumers cite domains more often because they emit one. Useful as a self-contained documentation export, not yet a verified visibility lever.
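As a rough illustration of that inlining, the sketch below concatenates Markdown documents into a single body under one title. There is no fixed schema for llms-full.txt, so the layout chosen here (an H1 title followed by one H2 section per document) is an assumption, as are the function names.

```python
from pathlib import Path


def build_llms_full(title: str, docs: list[tuple[str, str]]) -> str:
    """Concatenate (heading, markdown_body) pairs into one llms-full.txt body."""
    parts = [f"# {title}"]
    for heading, body in docs:
        parts.append(f"## {heading}\n\n{body.strip()}")
    return "\n\n".join(parts) + "\n"


def load_docs(paths: list[Path]) -> list[tuple[str, str]]:
    """Read local Markdown files; the file stem stands in for the section heading."""
    return [(path.stem, path.read_text(encoding="utf-8")) for path in paths]


# Inline sample documents, for illustration only.
print(build_llms_full(
    "Example Docs",
    [("Quickstart", "Install the package, then run the setup command.\n"),
     ("Billing FAQ", "Invoices are issued monthly.\n")],
))
```

One caveat with this simple approach: source documents that carry their own top-level headings will nest awkwardly under the H2 sections, so a real export may need to shift heading levels.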

Will Google or OpenAI ever start using llms.txt?

Maybe. The honest answer is that nobody outside those companies can promise it, and they have not committed to it. The cheapest position today is to ship an accurate, hand-written file, treat it as future insurance rather than a present-day lever, and avoid framing it to clients or stakeholders as something it is not yet.

Sources

Jeremy Howard / Answer.AI, llms.txt proposal: https://llmstxt.org/
SE Ranking, “LLMs.txt: Why Brands Rely On It and Why It Doesn’t Work” (~300k-domain analysis): https://seranking.com/blog/llms-txt/
Search Engine Journal, “LLMs.txt Does Not Boost AI Citations, New Analysis Finds”: https://www.searchenginejournal.com/llms-txt-shows-no-clear-effect-on-ai-citations-based-on-300k-domains/561542/
Search Engine Land, “Google says normal SEO works for ranking in AI Overviews and LLMS.txt won’t be used”: https://searchengineland.com/google-says-normal-seo-works-for-ranking-in-ai-overviews-and-llms-txt-wont-be-used-459422
Wix AI Search Lab, “Debunking LLMs.txt Myths: What You Need to Know for AI Visibility”: https://www.wix.com/studio/ai-search-lab/llms-txt-myths
Trakkr, “The llms.txt Effect: 37,894 Domains Scanned, Zero Citation Advantage”: https://trakkr.ai/trakkr-research/llmstxt-effect