For this particular situation though, make sure to replicate your files across multiple buckets, one in each GCP zone; otherwise you incur the cross-zone transfer costs. We do something similar to serve Julia downloads, because it turns out that most downloads are from people running in the cloud (so we basically replicate our binaries to every cloud provider, and then to every region within that cloud provider). Everything else then goes through Fastly for us (we also use Fastly to serve custom redirects if your request comes from one of the aforementioned cloud providers).

You do have to monitor it, though, to make sure that load doesn't suddenly shift to a cloud provider you didn't account for. For example, after GitHub Actions became widely used, we suddenly started seeing TB/day of traffic from Azure, which we hadn't deployed any caches to, so our bandwidth utilization on Fastly shot through the roof. (Side note: shout out to Fastly for hosting our binaries for free!) Without the cloud-provider caching setup, we'd probably be at similar $/day costs, but this way it's basically free (even without the free Fastly service, we'd only be at ~$1000/month or so).

They are seeing 60k downloads of 6GB models per day, which is about 33Gbps of bandwidth (assuming no burstiness in when people visit, which is a poor assumption). That is starting to get out of the range of what is easily available: 10Gbps circuits are commodities (I had one at my desk at my last job), but 100Gbps circuits are still pretty pricey. And it's not necessarily trivial to get that kind of throughput on file serving out of the box; this bandwidth is in CPU-to-video-card territory, not disk-to-CPU or CPU-to-network.
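The 33Gbps back-of-envelope figure is easy to verify (using decimal gigabytes and assuming perfectly uniform traffic over the day, which, as noted, is optimistic):

```python
# Sustained-bandwidth estimate for 60k downloads/day of a 6 GB file.
downloads_per_day = 60_000
file_size_bytes = 6e9        # 6 GB, decimal
seconds_per_day = 86_400

total_bits_per_day = downloads_per_day * file_size_bytes * 8
gbps = total_bits_per_day / seconds_per_day / 1e9
print(f"{gbps:.1f} Gbps sustained")  # -> 33.3 Gbps sustained
```

Any burstiness pushes the peak well above that average, which is why a single 10Gbps commodity circuit doesn't come close.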
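The per-region replication step can be sketched roughly as follows. This is a minimal illustration, not the project's actual tooling: the bucket names, the `julia-dl-<region>` naming convention, and the region list are all hypothetical, and it assumes `gsutil rsync` (a real gsutil command) for mirroring:

```python
import subprocess

# Hypothetical setup: one bucket per GCP region you serve from, so that
# downloads never cross region boundaries (which would incur transfer fees).
SOURCE = "gs://julia-dl-source"                          # illustrative name
REGIONS = ["us-central1", "europe-west1", "asia-east1"]  # illustrative list

def replication_commands(source=SOURCE, regions=REGIONS):
    """Build one `gsutil rsync` command per regional mirror bucket."""
    return [
        ["gsutil", "-m", "rsync", "-r", source, f"gs://julia-dl-{region}"]
        for region in regions
    ]

if __name__ == "__main__":
    for cmd in replication_commands():
        print(" ".join(cmd))             # inspect before running for real
        # subprocess.run(cmd, check=True)  # uncomment to actually replicate
```

The same shape works for other providers (S3, Azure Blob) by swapping in their respective sync tools.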