This idea that you could have Cloudflare help by telling off AI crawlers sounds nice, but how long until it becomes a premium feature that requires loads of money to operate because AI companies lobby them to make it inaccessible to the masses? Or something equally as bad happens?
This could further accelerate the arms race between malicious scrapers and websites.
My fear is this would create collateral damage: blocking legitimate scrapers and visitors, and hassling people with an increasing number of CAPTCHAs.
Yeah, I think there’s a good chance that VPN users will be harassed by anti-AI measures.
I don’t believe they have that capability
Oh yes, they have, and more. They are one of the most powerful security and AI companies, with a ton of services, and perfectly capable of pulling the plug on any service or website. Sadly, they come with privacy concerns similar to Google’s.
Okay great. Go ahead and explain to me how they plan to fight an army of bots doing everything they can to be invisible?
How about you just read up on Cloudflare Turnstile instead of acting like you know anything? Here are some notable methods:
- Residential IP requirements
- TLS Fingerprinting
- Canvas Fingerprinting
It’s still possible to get around these, but it’s not easy. You must either have network engineers on staff as good as Cloudflare’s or pay some third-party service to unlock it for you. All Cloudflare needs to do is keep their prices lower than the third-party services.
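To make the TLS fingerprinting item a bit more concrete, here is a minimal sketch in the style of the publicly documented JA3 method: the parameters a client offers in its TLS ClientHello are joined into a string and hashed. Whether Cloudflare uses exactly this scheme is an assumption on my part, and the numeric values below are hypothetical examples, not captured traffic.

```python
import hashlib


def ja3_style_fingerprint(tls_version, ciphers, extensions, curves, point_formats):
    """Build a JA3-style string from ClientHello parameters and hash it.

    Values are kept in the order the client offered them, so two clients
    that support the same ciphers but list them differently get
    different fingerprints.
    """
    fields = [
        str(tls_version),
        "-".join(map(str, ciphers)),
        "-".join(map(str, extensions)),
        "-".join(map(str, curves)),
        "-".join(map(str, point_formats)),
    ]
    raw = ",".join(fields)
    return raw, hashlib.md5(raw.encode("ascii")).hexdigest()


# Hypothetical ClientHello values for a browser and for a scripted HTTP client.
browser_raw, browser_fp = ja3_style_fingerprint(
    771, [4865, 4866, 4867], [0, 23, 65281, 10, 11], [29, 23, 24], [0]
)
script_raw, script_fp = ja3_style_fingerprint(
    771, [4866, 4867, 4865], [0, 10, 11], [23, 24], [0]
)

print(browser_raw)
print(browser_fp != script_fp)
```

The point is that a scraper can fake its User-Agent header easily, but its TLS library still has to be made to produce a browser-like handshake, which is where the extra engineering effort comes in.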
- They can already block VPN traffic (unless you use their VPN).
- Their whole business model is based on being a man in the middle that decrypts SSL and analyses the packets in plaintext.
- About a third of websites worldwide use Cloudflare, so they have a pretty good bird’s-eye view of the behaviour of any machine, datacenter, or IP range that visits a lot of websites, which in turn will trivially reveal whether it is normal user behaviour or a crawler.
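A rough sketch of what that cross-site view could look like: given a request log spanning many sites, flag any IP that touches more distinct sites within a time window than a person plausibly would. The function name, the log format, and the thresholds are all hypothetical; nothing here is claimed to be how Cloudflare actually does it.

```python
from collections import defaultdict


def flag_wide_crawlers(requests, window_seconds=3600, max_sites=50):
    """Return the IPs that hit more than max_sites distinct sites
    within any sliding window of window_seconds.

    requests: iterable of (timestamp_seconds, ip, site) tuples.
    """
    by_ip = defaultdict(list)
    for ts, ip, site in requests:
        by_ip[ip].append((ts, site))

    flagged = set()
    for ip, events in by_ip.items():
        events.sort()
        counts = defaultdict(int)  # site -> hits inside the current window
        start = 0
        for ts, site in events:
            counts[site] += 1
            # Drop events that fell out of the window ending at ts.
            while events[start][0] < ts - window_seconds:
                old_site = events[start][1]
                counts[old_site] -= 1
                if counts[old_site] == 0:
                    del counts[old_site]
                start += 1
            if len(counts) > max_sites:
                flagged.add(ip)
                break
    return flagged


# Hypothetical log: one IP sweeps 100 sites in 100 seconds, another visits 3.
log = [(t, "203.0.113.7", f"site{t}.example") for t in range(100)]
log += [(t * 600, "198.51.100.2", f"blog{t}.example") for t in range(3)]
print(flag_wide_crawlers(log))
```

A crawler that spreads its requests over many IPs or slows down enough stays under a threshold like this, which is the counterargument raised further down the thread.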
Not the first time that, with all my privacy measures on, instead of the page I wanted, I see a Cloudflare page checking whether I am a bot before it lets me through. Being invisible on the web is just a bad joke. Anybody is visible the moment he goes online, irrelevant if he uses VPN, TOR or whatever; those times have passed. Believing otherwise is as hilarious as the scene in the movie Independence Day where they infect an alien mothership with a virus using a crappy laptop (I laughed a lot at that scene).
Anybody is visible in the moment he goes online, irrelevant if he uses …, TOR
No
Yes, TOR was never secure against secret services and governments, even less so nowadays with the AI and massive server power they have. Don’t forget who developed the TOR network and whose servers it runs on. Drug barons long ago turned to pen and paper for their orders and communication, because the web and even the Dark Web aren’t really private anymore (traffic analysis, exploiting software vulnerabilities, monitoring exit nodes, using honeypot nodes…)
Just calling it “Dark Web” gives away you have no idea what you’re talking about.
they can already block VPN traffic unless it goes through their VPN
Yeah that’s how most VPNs work.
their whole business model is based on them being a man in the middle that decrypts ssl and analyses the requests plainly
Okay? Analyze all you want. They can’t stop bots on any of the other sites they regulate either.
about a third of the worldwide websites are using cloudflare so they have a pretty good bird’s-eye view on behaviour of any machine that will be visiting a lot of websites
Great. Bots intentionally change up their behavior and identifying information so as to go undetected.
They can’t stop bots on any of the other sites they regulate either.
Why not? They are doing edge caching, they can literally just block the connection from visiting the site just like they do with their DDoS mitigation.
they can literally just block the connection
block which connection? Again, these AI companies know people don’t want them crawling their sites and they do everything they can to be invisible. This has been an issue for years at this point.
just like they do with their DDoS mitigation
blocking DDoS is trivial by comparison.
They can’t stop bots on any of the other sites they regulate either.
They can and do. What is blocked depends on what the website owner sets as settings in cloudflare.
Bots intentionally change up their behavior and identifying information so as to go undetected.
If they have to crawl the web while behaving like a normal human, it will be orders of magnitude slower and more costly.
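A back-of-the-envelope sketch of that slowdown, with every number being a hypothetical assumption rather than a measurement: the same crawl job, run once at an aggressive request rate and once at a human-like rate per identity.

```python
def crawl_days(total_pages, pages_per_minute_per_identity, identities):
    """Days needed to fetch total_pages at a fixed rate per identity."""
    pages_per_day = pages_per_minute_per_identity * 60 * 24 * identities
    return total_pages / pages_per_day


# Hypothetical job: one billion pages spread over 1,000 identities (IPs/browsers).
TOTAL_PAGES = 1_000_000_000
IDENTITIES = 1_000

# Assumed rates: an unthrottled bot vs. something that looks like a person reading.
fast_days = crawl_days(TOTAL_PAGES, pages_per_minute_per_identity=6000, identities=IDENTITIES)
human_days = crawl_days(TOTAL_PAGES, pages_per_minute_per_identity=2, identities=IDENTITIES)

print(f"aggressive: {fast_days:.2f} days")
print(f"human-like: {human_days:.2f} days")
print(f"slowdown factor: {human_days / fast_days:.0f}x")
```

To get the time back, the crawler needs proportionally more identities, and each extra residential IP or browser instance costs money.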
What is blocked depends on what the website owner sets as settings in cloudflare.
And how does the owner know which connections are bots?
If they have to crawl the web while behaving like a normal human, it will be orders of magnitude slower and more costly.
They don’t care, they have trillions of dollars of VC money to power through.
The owner sets the level. If they set a strict level, all bots are blocked.
They do care. VC funding happens because the result is profitable. If it is less profitable, there will be less funding because of higher investment risk.
Based on the headline, this is not about blocking AI scrapers but about making them pay to do it.
Based on the discussion below, which moved that goalpost, the most likely answer is by making it cheaper to scrape “legally” than it costs to mimic millions of individual residential browsers with human users.
I don’t know how many aces Cloudflare has up its sleeve to detect secret AI, but they definitely have the tools to make it pretty costly and difficult. There is also a difference in bandwidth impact between a few capitalist megapigs scraping secretly and loads of global basement dwellers and smaller companies scraping worry-free.
It’s literally what their entire business is based on. Filtering good and bad traffic.
Their business is largely based on security and preventing specific types of actions on a site. Not just the mere act of visiting it.
It is. And they have been blocking AI crawlers for a while now. https://www.wired.com/story/cloudflare-tools-detect-block-ai-bots/
Believe all you want, reality doesn’t care.
I agree!
They probably can. Link