How One Little Amazon Error Can Destroy the Internet

The fact that Amazon controls a vast swath of cloud computing services became dreadfully clear on Tuesday afternoon when a string of errors brought countless websites to their knees. This consolidation of power is, perhaps suddenly, a very big problem.

Unlike its internet marketplace, Amazon Web Services (AWS) works more like a house of cards than a traditional retail business. After all, instead of selling books and reasonably priced electronics, AWS caters to enterprise clients to provide cloud-computing services. Amazon Simple Storage Service (S3), the product that suffered errors and knocked out a solid portion of the web on Tuesday, provides storage for cloud-based apps like Slack and Trello. Amazon says that its S3 service is “designed to deliver 99.999999999 percent durability.” But when one piece of the infrastructure fails, AWS fails big.

This is because Amazon controls a ridiculous share of the market when it comes to cloud computing and, specifically, cloud storage. A Gartner study from August 2016 claims that AWS controls 31 percent of the market in global cloud infrastructure, and the business is growing. The same study said that AWS accounted for 51 percent of Amazon’s profits. (Another study from the same time period puts Amazon’s market share at 45 percent.) Microsoft, IBM, and Google are all expanding their cloud offerings as well, but Amazon has been the leader in the space since 2006.

So for over a decade, Amazon has been king of the cloud. During that span of time, the company’s business model, which Jeff Bezos once compared to the early days of electricity, enabled startups to scale and yet still afford the cost of hosting. Ingrid Burrington explained in The Atlantic last year:

In practice, this meant that pricing for services was entirely contingent on actual use, an approach that allowed developers to rapidly scale small startups into massive companies by paying for infrastructure support on an as-needed basis and scaffolding as needs grew. Thanks to AWS, the initial overhead for starting a service like Airbnb or Slack (both AWS customers) is so low that those companies can afford to expand quickly.

But what happens when any service gets so big that its tentacles touch the entire industry? Its failures become amplified to a destructive degree. In the case of AWS, that 0.000000001 percent of the time when things don’t work just right means that over a third of the internet ceases to function well. Amazon won’t say how many cloud computing customers it has or the exact percentage of internet traffic that’s affected when an error happens. But Tuesday’s outage showed that it could bring entire networks of websites grinding to a halt. (Gizmodo Media is an AWS customer, so I can confirm that this was a fucked-up day.)

Meanwhile, the fact that many of Amazon’s AWS servers are located in northern Virginia, where an unholy number of tubes come together to form one of the most congested bottlenecks of internet traffic, certainly doesn’t help. Amazon says that this region, known as US-EAST-1, was the source of Tuesday’s outage.

So while this week’s paralyzing series of errors gave Amazon engineers a terrible headache, cloud computing competitors like Microsoft, IBM, and Google must be thrilled. As mentioned earlier, they’re all gaining on Amazon’s absurd market share, and now their salespeople will have a single incident to show that AWS is not 100 percent durable. The fact that added competition should improve services and lower prices for everyone is undeniably a good thing, too.

Amazon still hasn’t explained exactly what went down on Tuesday. In response to a Gizmodo request for comment, the company said:

We continue to experience high error rates with S3 in US-EAST-1, which is impacting various AWS services. We are working hard at repairing S3, believe we understand root cause, and are working on implementing what we believe will remediate the issue.

That’s basically a different version of the error notice posted on the AWS website. We’ll update this post as we learn more. In the meantime, good luck using the internet. It’s a mess out there.


