Cloudflare has just issued the AI industry a new deadline to separate the web crawlers used for traditional search purposes, like Google Search, from those used for AI agents and training. Starting on September 15, 2026, Cloudflare’s default settings will block “mixed-use” crawlers from any pages that host ads, the company announced on Wednesday.
That means crawlers that blend search, agent use, and training will be blocked from these sites by default unless the site owner changes the settings. The new defaults will apply to new Cloudflare customers, new sites set up by existing customers, and all existing free customers, the company says.
The move could impact how AI model providers are able to access web content for training purposes and to help power their agentic services.
Cloudflare points out that most website owners want their content to be discoverable via search and often through AI services as well, but they want protections against having their intellectual property given away for free.
Cloudflare specifically calls out the “world’s largest search engine” (clearly a Google reference!) as having access to about “2x more information” than other AI companies because the search giant makes it difficult for site owners to remain discoverable in search without also having their content used for AI.
Google has pushed back against this generalization in the past, noting that it provides a control called Google-Extended that lets site owners opt out of having their content used for training and for AI products and services like Gemini Apps and the Vertex AI API. Using it doesn’t affect a site’s inclusion in Google Search. However, the tech giant’s flagship Googlebot crawls for Search, which includes AI features like AI Overviews and AI Mode.
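To make the opt-out concrete, here is a minimal sketch of how a site owner could sanity-check such a rule. The robots.txt content and the example.com URL are hypothetical, and the script only evaluates robots.txt rules locally with Python’s standard library parser; it says nothing about how Google or Cloudflare actually enforce them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt out of the Google-Extended token,
# while leaving every other crawler (including Googlebot) allowed.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def allowed(user_agent: str, url: str = "https://example.com/article") -> bool:
    """Return True if the rules above permit `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Googlebot", "Google-Extended"):
        print(agent, "allowed:", allowed(agent))
```

Under these rules, Googlebot falls through to the catch-all group and stays allowed, while the Google-Extended group is disallowed. That mirrors the split described above: the opt-out covers training use, but crawling for Search, and the AI features built into it, continues.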
“Now that the majority of traffic on the Internet is non-human, we must go further and act faster so that a sustainable ecosystem can emerge,” said Cloudflare co-founder and CEO Matthew Prince in his announcement of the news, referring to the recent milestone in which bot traffic surpassed human traffic online for the first time. That shift was not expected to occur until next year.
“Cloudflare’s new tools and partnerships give website owners increased visibility and commercial opportunities and benefit AI companies that have bots with clear and transparent intent. We hope that our proposed default changes encourage mixed-use crawlers to separate out search from agent use and training,” Prince said.
While Cloudflare offers a number of products to help users launch their own AI systems, the company has also released a range of tools to give publishers more control over their content in the AI era. In recent years, Cloudflare has launched tools to combat AI bots, including a marketplace, dubbed Pay Per Crawl, that lets websites charge AI bots for scraping.
The latter is now also evolving into “Pay Per Use,” the company said, which will allow publishers to charge AI companies when their content creates value, not just when it’s fetched.
The change could also help conserve bandwidth for publishers and compute resources for AI model providers, as Cloudflare’s data suggests that over 50% of traffic from AI crawlers is spent re-fetching unchanged pages.
To put this into action, Cloudflare is initially working with two partners, Ceramic.ai and You.com. When a publisher opts in, they’re paid when their content appears in Ceramic’s AI search results or when You.com accesses a piece of their premium content.
Other AI companies can customize this model for how they work, Cloudflare says.