Nearly every website today seems to be hosted behind Cloudflare, which is really concerning for the future of privacy on the internet.
Cloudflare no doubt logs, stores, and correlates network telemetry that can be used for a wide array of deanonymization attacks. Not only that, but Cloudflare acts as a man-in-the-middle for all encrypted traffic, which means that not even TLS will prevent Cloudflare from snooping on you. Their position across the internet also gives them the ability to conduct netflow and traffic correlation attacks.
Even my proposed workaround of using archive.org as a proxy doesn't hold up, since I found out today that archive.org is also hosted behind Cloudflare…
So what options do we even have? What privacy concerns did I miss, and are there any workarounds?
It only does if you upload your private keys to them or if you use their certificate.
By "you", do you mean the user or the site owner? Do I, as the user, have a choice in the matter? And as far as I know, CDNs are for delivering frontend bundles. How does TLS come into play here?
No. As an end user you have no choice. My employer uses Akamai for CDN, WAF, and other services. All customer-facing connections use certs for which Akamai has the private keys.
The CDN needs to know the content in order to handle it properly. When a website serves a response, it includes a bunch of headers that tell the browser and the CDN whether the content should be cached and for how long. It might tell you to cache a static image for 30 days, but a dynamic image like one from a webcam for only 10 minutes. And there’s some content, like pages from banking sites, that should never be cached.
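To make the header logic concrete, here is a minimal Python sketch of how a shared cache could turn a Cache-Control header into a storage lifetime. The function name `shared_cache_ttl` is made up for illustration, and it only looks at `no-store`, `private`, `s-maxage`, and `max-age`. It ignores `no-cache`, `Expires`, and heuristic freshness, and a real CDN layers its own rules and overrides on top.

```python
from typing import Mapping, Optional


def shared_cache_ttl(headers: Mapping[str, str]) -> Optional[int]:
    """Return how many seconds a shared cache may store the response.

    Returns None when the response must not be stored, or when no
    lifetime is given (a simplification; real caches may guess one).
    """
    # Header names are case-insensitive, so look the header up manually.
    raw = ""
    for name, value in headers.items():
        if name.lower() == "cache-control":
            raw = value
            break

    # Split "public, max-age=600" into {"public": "", "max-age": "600"}.
    directives = {}
    for part in raw.split(","):
        part = part.strip().lower()
        if not part:
            continue
        key, _, val = part.partition("=")
        directives[key.strip()] = val.strip().strip('"')

    # Content that a shared cache such as a CDN is not allowed to keep.
    if "no-store" in directives or "private" in directives:
        return None

    # s-maxage is aimed at shared caches and takes priority over max-age.
    for key in ("s-maxage", "max-age"):
        if key in directives:
            try:
                return max(0, int(directives[key]))
            except ValueError:
                return None
    return None


print(shared_cache_ttl({"Cache-Control": "public, max-age=2592000"}))  # static image, 30 days
print(shared_cache_ttl({"Cache-Control": "public, max-age=600"}))      # webcam image, 10 minutes
print(shared_cache_ttl({"Cache-Control": "no-store"}))                 # banking page, never stored
```

The point is only that the cache has to read these headers in the clear, which it cannot do if the response passes through it still encrypted.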
Providers like Akamai also offer other services to optimize the speed of sites. Their Image Manager will analyze and optimize JPG, PNG, etc. images if you want. They can also “minify” JavaScript and compress some content with gzip or Brotli to speed things up as well. All these sorts of optimizations require access to the unencrypted content.
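A quick way to see why compression needs the unencrypted content: the rough Python sketch below gzips some repetitive markup and then gzips random bytes of the same length. The random bytes are an assumption standing in for TLS ciphertext, which is designed to look random to anyone without the key. The helper name `compression_ratio` is made up for this example.

```python
import gzip
import os


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (smaller is better)."""
    return len(gzip.compress(data)) / len(data)


# Repetitive markup, similar in spirit to an HTML or JavaScript response body.
plaintext = b"<div class='item'>hello</div>\n" * 500

# Assumption: random bytes stand in for ciphertext here.
ciphertext_like = os.urandom(len(plaintext))

print(f"plaintext ratio:       {compression_ratio(plaintext):.3f}")
print(f"ciphertext-like ratio: {compression_ratio(ciphertext_like):.3f}")
```

The markup shrinks to a small fraction of its size, while the random data does not shrink at all. An intermediary that only ever sees ciphertext has nothing to optimize.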
Then there are WAFs (web application firewalls) that site owners use to protect themselves from malicious traffic. Cloudflare, Akamai, AWS, etc. all have WAFs that analyze inbound requests and block any that they deem malicious. Again, the WAF needs access to the unencrypted request to do this.
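As a toy illustration of what that inspection involves, here is a Python sketch that checks a decoded request path and query string against a few patterns. The rules and the `is_blocked` name are hypothetical and far cruder than any vendor's real ruleset; the only point is that the check runs on the plaintext request.

```python
import re
from urllib.parse import unquote_plus

# Hypothetical rules for illustration only, not any vendor's real ruleset.
RULES = [
    re.compile(r"<\s*script", re.IGNORECASE),      # naive cross-site scripting probe
    re.compile(r"union\s+select", re.IGNORECASE),  # naive SQL injection probe
    re.compile(r"\.\./"),                          # path traversal attempt
]


def is_blocked(request_target: str) -> bool:
    """Return True if the decoded path and query match any rule."""
    # Percent-decoding requires the plaintext request line, which only
    # exists after TLS has been terminated.
    decoded = unquote_plus(request_target)
    return any(rule.search(decoded) for rule in RULES)


print(is_blocked("/search?q=%3Cscript%3Ealert(1)%3C%2Fscript%3E"))
print(is_blocked("/search?q=cloudflare+privacy"))
```

The first request is flagged and the second is allowed. Neither decision is possible on an encrypted byte stream.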
Bingo. This. That’s so obvious it’s bizarre how many people continue to believe that CF does not see their traffic, as if CF can process requests it cannot see. I can’t get my head around why so many have trouble grasping this. If CF cannot decrypt the payload, it obviously can only pass it through to the source webserver. And obviously if everything is passed through, then the owner’s webserver must be able to handle the load, which defeats the purpose website owners use CF for.
That’s what a vast majority of sites do. CF is not gratis if you use your own keys.