Source: Search Engine Roundtable by Barry Schwartz.
TL;DR Summary of OpenAI Crawling LLMS.txt Files Despite Google’s Stance
OpenAI appears to be crawling LLMS.txt files on websites, rechecking them approximately every 15 minutes. This contrasts with Google’s official position: the company has said it does not use or support LLMS.txt files, and Gary Illyes of Google has confirmed that stance. Meanwhile, other AI companies such as Anthropic, ElevenLabs, and Pinecone acknowledge or support LLMS.txt.
Optimixed’s Overview: Emerging Trends in AI Crawling with LLMS.txt Files
Introduction to LLMS.txt and Its Current Usage
LLMS.txt is a proposed plain-text file, placed at the root of a website, that gives large language models (LLMs) a curated overview of the site’s content. It is drawing attention as AI companies explore ways to read websites more reliably. While Google has publicly stated that it does not crawl or use LLMS.txt files, recent evidence suggests that OpenAI is requesting them.
Evidence of OpenAI Crawling Behavior
- Log File Analysis: Ray Martinez shared screenshots of server logs showing OpenAI requesting LLMS.txt files on his servers approximately every 15 minutes, which suggests a scheduled recrawl to check whether the file has changed.
- Community Awareness: This behavior has sparked discussions among webmasters and AI developers on platforms such as X and LinkedIn.
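Site owners who want to check their own logs for the same pattern could start from something like the following Python sketch. It is only an illustration under stated assumptions: the log is assumed to be in Combined Log Format, the file is assumed to be served at /llms.txt, and the user-agent tokens (GPTBot, OAI-SearchBot, ChatGPT-User) are the crawler names OpenAI publishes, since the source does not say which agent appeared in the screenshots. The sample lines are invented for demonstration and are not taken from the reported logs.

```python
import re
from datetime import datetime
from statistics import median

# Assumption: access log lines follow the Combined Log Format.
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Assumption: tokens from OpenAI's published crawler user agents.
OPENAI_AGENT_TOKENS = ("GPTBot", "OAI-SearchBot", "ChatGPT-User")


def llms_txt_hits(lines):
    """Return sorted timestamps of OpenAI-agent requests for /llms.txt."""
    hits = []
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip lines that are not in the assumed format
        path = m.group("path").split("?", 1)[0].lower()
        if path != "/llms.txt":
            continue
        if any(token in m.group("agent") for token in OPENAI_AGENT_TOKENS):
            hits.append(datetime.strptime(m.group("ts"), "%d/%b/%Y:%H:%M:%S %z"))
    return sorted(hits)


def median_interval_minutes(hits):
    """Median gap between consecutive hits, in minutes (None if fewer than 2)."""
    gaps = [(b - a).total_seconds() / 60 for a, b in zip(hits, hits[1:])]
    return median(gaps) if gaps else None


# Invented sample lines for illustration only (documentation IP ranges).
SAMPLE_LINES = [
    '203.0.113.7 - - [10/Jul/2025:10:00:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; OAI-SearchBot/1.0)"',
    '198.51.100.4 - - [10/Jul/2025:10:05:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; ExampleBot/2.0)"',
    '203.0.113.7 - - [10/Jul/2025:10:15:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; OAI-SearchBot/1.0)"',
    '203.0.113.7 - - [10/Jul/2025:10:20:00 +0000] "GET /index.html HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (compatible; OAI-SearchBot/1.0)"',
    '203.0.113.7 - - [10/Jul/2025:10:30:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; OAI-SearchBot/1.0)"',
]
```

Calling median_interval_minutes(llms_txt_hits(SAMPLE_LINES)) on the invented sample gives 15.0. On a real log, passing an open file object in place of SAMPLE_LINES should work the same way. User-agent strings can be spoofed, so a match here is a hint rather than proof that the requests came from OpenAI.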
Google’s Official Position
- Gary Illyes Statements: At the Google Search Central Live Deep Dive event, Illyes confirmed Google neither supports nor plans to support the LLMS.txt protocol.
- Contrast with Other AI Providers: Companies like Anthropic, ElevenLabs, and Pinecone have documentation or support for LLMS.txt, showing a divergence in industry approaches.
Implications for Website Owners and AI Development
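Website administrators should expect AI crawlers to request LLMS.txt files, possibly on a frequent schedule, and should account for the extra requests in their logs and server load (one request every 15 minutes is about 96 per day per crawler) as well as for what the file reveals about the site. Support for LLMS.txt among some AI providers points to a possible emerging convention for presenting site content to AI systems, though widespread acceptance remains uncertain.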