- Cloudflare is introducing a way to charge AI web scrapers
- Content creators can protect their sites from unwanted scrapers
- Specific crawlers can be granted free access, charged, or blocked
Online creators often have very little control over the types of crawlers that can access their content, but Cloudflare may have a solution.
The company has revived the rarely used HTTP response code 402 as a way to block AI crawlers or charge them for access to a site, in a new feature it calls ‘pay per crawl’.
The best part is that it is not an all-or-nothing control: site owners will be able to allow specific crawlers to access their site for free, charge others for access, and block the ones they don’t want trawling their content.
Charging AI crawlers for access
HTTP response code 402, otherwise known as the 402 Payment Required status code, tells a crawler that payment is needed to access the content. The crawler can then either respond with an intent to pay or be blocked from accessing the content.
As an added bonus, content creators with a block on their site can effectively ‘tell’ AI crawlers that they are open to potential payments in the future.
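As a rough illustration, the allow, charge, or block choice and the resulting 402 response could be sketched as follows. This is not Cloudflare's implementation: the crawler names, the policy table, the `respond` function, the price string format, and the use of 403 for blocked crawlers are all assumptions made for the example.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"    # free access
    CHARGE = "charge"  # access requires payment
    BLOCK = "block"    # no access


# Hypothetical per-crawler policy; these crawler names are invented.
POLICY = {
    "friendly-bot": Action.ALLOW,
    "paying-bot": Action.CHARGE,
    "unwanted-bot": Action.BLOCK,
}

# Assumed price format, for illustration only.
PRICE = "USD 0.01"


def respond(crawler, policy=POLICY, default=Action.BLOCK):
    """Return (status_code, headers) for a request from `crawler`."""
    action = policy.get(crawler, default)
    if action is Action.ALLOW:
        return 200, {}
    if action is Action.CHARGE:
        # 402 Payment Required, with the proposed price attached.
        return 402, {"crawler-price": PRICE}
    # Assumption: blocked crawlers get a plain 403 Forbidden.
    return 403, {}
```

In this sketch, a crawler that is not listed falls back to the default action, which is set to block.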
For those thinking that someone could simply spoof a crawler that has access to the site, Cloudflare is one step ahead. An authentic crawler will use the ‘signature-agent’, ‘signature-input’, and ‘signature’ headers to authenticate itself with Cloudflare.
Each crawler operator generates an Ed25519 key pair, publishes the public key in a hosted directory, and registers the URL of that key directory and its user agent information with Cloudflare. Cloudflare then checks incoming requests against the registered details and the published public key, allowing the authentic crawler through and blocking any spoofed crawlers.
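The checks around that signature can be sketched in a few lines. This is a simplified outline under stated assumptions, not Cloudflare's verification logic: the registry, the `is_authentic` function, and the idea that ‘signature-agent’ carries the key directory URL in quotes are assumptions, and the actual Ed25519 verification is left to a function supplied by the caller rather than implemented here.

```python
REQUIRED_HEADERS = ("signature-agent", "signature-input", "signature")

# Hypothetical registry: key directory URL -> registered user agent.
REGISTERED = {
    "https://crawler.example/keys": "ExampleBot/1.0",
}


def is_authentic(headers, user_agent, verify_signature, registry=REGISTERED):
    """Return True only if the request passes all three checks.

    `verify_signature(directory_url, signature_input, signature)` is a
    caller-supplied function standing in for the real Ed25519 check.
    """
    # Header names are case-insensitive, so normalise them first.
    h = {name.lower(): value for name, value in headers.items()}

    # 1. All three signature headers must be present.
    if any(name not in h for name in REQUIRED_HEADERS):
        return False

    # 2. The key directory must be registered for this user agent.
    #    Assumption: the header value is the directory URL in quotes.
    directory_url = h["signature-agent"].strip('"')
    if registry.get(directory_url) != user_agent:
        return False

    # 3. The signature must verify against the published public key.
    return bool(verify_signature(directory_url, h["signature-input"], h["signature"]))
```

A spoofed crawler that copies a registered user agent fails here, either because it sends no signature headers or because it cannot produce a signature that verifies without the private key.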
Crawlers will also be able to crawl the web with a set budget for accessing protected sites. A crawler can send the ‘crawler-exact-price’ header to accept the price a site proposes in its ‘crawler-price’ header, or preemptively send the ‘crawler-max-price’ header, which grants access if the site’s price is equal to or less than the crawler’s budget.
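The price handshake can be sketched from both sides. Only the three header names come from the description above; the `serve`, `retry_headers`, and `parse_price` functions and the "USD 0.01" price format are assumptions for illustration, not a definitive account of how Cloudflare handles the exchange.

```python
from decimal import Decimal


def parse_price(value):
    """Parse an assumed 'USD <amount>' price string into a Decimal."""
    currency, amount = value.split()
    if currency != "USD":
        raise ValueError("unsupported currency: " + currency)
    return Decimal(amount)


def serve(configured_price, request_headers):
    """Site side: return (status_code, headers) for a crawler request."""
    price = parse_price(configured_price)
    exact = request_headers.get("crawler-exact-price")
    maximum = request_headers.get("crawler-max-price")
    # The crawler accepts the exact price the site proposed.
    if exact is not None and parse_price(exact) == price:
        return 200, {}
    # The crawler's stated maximum covers the site's price.
    if maximum is not None and parse_price(maximum) >= price:
        return 200, {}
    # Otherwise ask for payment and state the price.
    return 402, {"crawler-price": configured_price}


def retry_headers(response_headers, budget):
    """Crawler side: accept the proposed price if it fits the budget."""
    offered = response_headers["crawler-price"]
    if parse_price(offered) <= budget:
        return {"crawler-exact-price": offered}
    return None  # too expensive, so do not retry
```

For example, a first request with no price headers gets a 402 with the proposed price, and the crawler can then retry with the headers returned by `retry_headers` if the price fits its budget.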
Cloudflare also has some ideas about where pay per crawl could go in the future. For example, an AI agent could be given a budget to crawl the web when responding to a prompt, giving the user access to high-quality, relevant content.
Pay per crawl is currently available only in private beta, but interested parties can reach out to Cloudflare via the link at the bottom of its blog post.