> At least when it comes to CDN and DNS there’s literally no vendor lock-in.
ehhhh, really depends on which CDN features you're using, and at what volume. Using ESI? VCL? Signed URLs or auth? Any other custom functionality? Are you depending on your provider's bot management features which are "CONTACT FOR PRICE" with other providers? Does your CDN provider have a special egress deal with your cloud provider?
It's possible to picture this being easy in the same way that being multi-cloud or multi-region is easy.
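Signed URLs are a good concrete example: every provider signs a different canonical string, with a different digest and different query-parameter names, so even this "standard" feature has to be reimplemented when you migrate. A minimal sketch, assuming a hypothetical HMAC-based scheme rather than any particular vendor's:

```python
import hashlib
import hmac
import time
from urllib.parse import urlencode

# Hypothetical HMAC-based signed URL, loosely in the style of what CDNs
# offer. The canonical string, digest, and parameter names here are made
# up; each real provider uses its own, which is exactly the lock-in.
def sign_url(base_url: str, secret: bytes, ttl_seconds: int = 3600) -> str:
    expires = int(time.time()) + ttl_seconds
    payload = f"{base_url}?expires={expires}".encode()
    signature = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return f"{base_url}?{urlencode({'expires': expires, 'sig': signature})}"

print(sign_url("https://cdn.example.com/video.mp4", b"edge-shared-secret"))
```

Switching vendors means re-deriving this for the new provider's scheme and rotating every URL you've already handed out.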
>Using ESI? VCL? Signed URLs or auth? Any other custom functionality? Are you depending on your provider's bot management features which are "CONTACT FOR PRICE" with other providers?
I have no idea what two of those acronyms mean. None of this is part of what a CDN offers.
Yes, if you use DDoS protection, Cloudflare's Zero Trust, or embrace $X proprietary features, then what I said no longer applies.
ESI = Edge Side Includes; think Server Side Includes running on the CDN edge. It's supported by Akamai and used by sites like Ikea to deliver a fast, maintainable experience (toy sketch below).
VCL = Varnish Configuration Language, i.e. how you configure your Fastly services.
If you're just using a CDN as a proxy then there's no lock-in, but plenty of sites are using CDNs for much more than that.
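To make ESI concrete, here's a toy model of what an ESI-capable edge does, in Python for illustration only (real ESI also handles per-fragment caching, TTLs, and error fallbacks; the URLs are placeholders):

```python
import re
import urllib.request

# Toy model of an edge server assembling a page: scan the origin's HTML
# for <esi:include> tags and splice each fetched fragment into the page
# before it reaches the client.
ESI_TAG = re.compile(r'<esi:include\s+src="([^"]+)"\s*/>')

def render_esi(html: str) -> str:
    def fetch_fragment(match: re.Match) -> str:
        with urllib.request.urlopen(match.group(1)) as resp:
            return resp.read().decode()
    return ESI_TAG.sub(fetch_fragment, html)

page = '<html><body><esi:include src="https://origin.example.com/header"/></body></html>'
# render_esi(page) would splice the header fragment into the page inline.
```

Once your templates are full of those tags and your VCL encodes your routing and caching logic, moving providers is a rewrite, not a DNS change.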
> I've never seen an SLA which is clear cut enough to be worth pursuing if you want more than a free t-shirt.
I have, regularly. I am not sure what kind of business you are running, but parties that rely on service providers for critical (primary-business-process-driving) components routinely agree to SLAs with large penalties and the ability to reopen an existing contract in case of non-performance. Obviously you would have to be willing to pay for such a service in the first place, otherwise there is no point in setting up an SLA; this won't be cheap. But we're definitely not talking about 'free t-shirts' here, more about direct liability, per-hour penalties and so on.
By the time SLA thresholds are being breached you've been through months (or years) of pain. They're not strong enough or specific enough to save you from a bad provider. ymmv
Colo and cloud providers that provide real SLAs exist. But they're pricey, because they tend to insure against breach of that SLA and they pass on the cost of that insurance. If you're a run-of-the-mill e-commerce company then it probably doesn't make much sense. But if you yourself are providing critical services to others, and they have you by the short hairs in case you don't perform, you'd better make sure that you're not going to end up holding the bag.
One simple example: energy market services. The 15-minute-ahead and day-ahead markets require participants to have the ability to perform or they will be penalized severely, to the point where they can lose that access; depending on their size, the damage could easily be in the tens to hundreds of millions. Asset owners and utilities would both be able to hit them hard if they do not perform: the asset owners for lost income, and the utilities for government penalties and possibly for outages and all associated costs. These are not the kind of contracts you enter into lightly.
I'm criticizing the readability of that first "Agent Loop" section.
It's basically a slideshow that presents several content areas which are intended to be read, all while they advance and resize themselves.
Pausing and stepping through manually is also pretty obnoxious.
I would much rather just see the content all laid out at once.
> A popular theory is that this is because of sloppy coding, AI companies are too rich to care, but then again that doesn't really add up
I can substantiate this a bit. Verified traffic from Amazonbot is too dumb to do anything with 429s. They will happily slam your site with more traffic than you can handle, and will completely ignore the fact that over half the responses are useless rate limits.
They say they honor REP (the Robots Exclusion Protocol), but Amazonbot will still hit you pretty persistently even with a full disallow directive in robots.txt.
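For contrast, here's roughly what a well-behaved crawler is supposed to do, and what Amazonbot apparently doesn't: check robots.txt first, then back off on 429 using Retry-After. A minimal sketch; the user agent and URLs are placeholders:

```python
import time
import urllib.request
from urllib.error import HTTPError
from urllib.robotparser import RobotFileParser

AGENT = "examplebot"
robots = RobotFileParser("https://example.com/robots.txt")
robots.read()

def polite_fetch(url: str, max_retries: int = 3) -> bytes | None:
    # A full disallow in robots.txt should end things right here.
    if not robots.can_fetch(AGENT, url):
        return None
    for attempt in range(max_retries):
        try:
            req = urllib.request.Request(url, headers={"User-Agent": AGENT})
            with urllib.request.urlopen(req) as resp:
                return resp.read()
        except HTTPError as err:
            if err.code != 429:
                raise
            # Honor the server's rate limit instead of slamming it again.
            time.sleep(int(err.headers.get("Retry-After", 2 ** attempt)))
    return None
```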
Theft isn't far off; it seems a closer fit to me than using the word for IP violations.
When a crawler aggressively crawls your site, it's permanently depriving you of the use of those resources for their intended purpose. Arguably, it looks a lot like conversion.
The root sources of the traffic from residential proxies get murky very quickly.
It's easy to follow the chain partway for some traffic, e.g. "Why are we receiving all this traffic from Digital Ocean? ... oh, it's their hero client Firecrawl, using a deceptive User-Agent" ... but it still leaves the obvious question of who the Firecrawl client is.
Residential proxy traffic is insane these days. There are also plenty of grey-market snowshoe IPs available for the right price, from a handful of ASNs. I regularly see unified crawling missions by unknown agents using 1000+ "clean" IP addresses an hour.