Cloudflare and Perplexity Battle Over ‘Stealth Crawling Behavior’

Cloudflare:

We are observing stealth crawling behavior from Perplexity, an AI-powered answer engine. Although Perplexity initially crawls from their declared user agent, when they are presented with a network block, they appear to obscure their crawling identity in an attempt to circumvent the website’s preferences. We see continued evidence that Perplexity is repeatedly modifying their user agent and changing their source ASNs to hide their crawling activity, as well as ignoring — or sometimes failing to even fetch — *robots.txt* files. […]

There are clear preferences that crawlers should be transparent, serve a clear purpose, perform a specific activity, and, most importantly, follow website directives and preferences. Based on Perplexity’s observed behavior, which is incompatible with those preferences, we have de-listed them as a verified bot and added heuristics to our managed rules that block this stealth crawling.

That’s a pretty damning accusation, and a harsh penalty. Perplexity, for its part, is crying foul:

Because Cloudflare has conveniently obfuscated their methodology and declined to answer questions helping our teams understand, we can only narrow this down to two possible explanations.

Cloudflare needed a clever publicity moment and we–their own customer–happened to be a useful name to get them one.

Cloudflare fundamentally misattributed 3–6M daily requests from BrowserBase’s automated browser service to Perplexity, a basic traffic analysis failure that’s particularly embarrassing for a company whose core business is understanding and categorizing web traffic.

Whichever explanation is the truth, the technical errors in Cloudflare’s analysis aren’t just embarrassing—they’re disqualifying. When you misattribute millions of requests, publish completely inaccurate technical diagrams, and demonstrate a fundamental misunderstanding of how modern AI assistants work, you’ve forfeited any claim to expertise in this space.

This controversy reveals that Cloudflare’s systems are fundamentally inadequate for distinguishing between legitimate AI assistants and actual threats. If you can’t tell a helpful digital assistant from a malicious scraper, then you probably shouldn’t be making decisions about what constitutes legitimate web traffic.

I don’t know which multibillion-dollar behemoth is right. On the one hand, I don’t like how much control Cloudflare has over the internet. I’ve often been stymied by one of their “Checking if the site connection is secure” loops, and their “Content Independence Day” is an obvious—if unstated—cash grab. On the other hand, Perplexity has a multitude of issues—and they just signed up to power searches for Donald Trump’s “Truth” Social.

A pox on both their houses?
