On the last day of February 2017, the internet almost ground to a halt. The reason was a four-hour outage at Amazon’s cloud computing division, Amazon Web Services (AWS), which caused hundreds of thousands of websites throughout the US to slow down or go dark. Wait, an outage at Amazon, the online shopping site, disrupted the rest of the web? How can that be?
Amazon, the largest online retailer in the Western world, began AWS as a side business. Today AWS is among the largest web services providers and accounts for about 8% of Amazon’s revenue – in other words, AWS is a money-maker.
What Caused AWS to Go Down?
Amazon Web Services is a major hosting provider for websites like:
Other companies large and small lost their service too. Estimates of affected sites run into the hundreds of thousands. AWS provides web services and cloud storage for companies that choose not to invest heavily in computer hardware, which frees them from spending capital on building and outfitting their own server farms.
The affected part of AWS was S3 (Simple Storage Service), which went offline that February afternoon. Not all AWS clients were affected, but many sites slowed down or became unresponsive.
Dave Bartoletti, a cloud analyst with Forrester, said:
“This is a pretty big outage, AWS had not had a lot of outages and when they happen, they’re famous. People still talk about the one in September of 2015 that lasted five hours.”
The outage seems to have started a few minutes past 12:30 PM Eastern Time, and operations were completely restored some four hours later.
Lydia Leong, a cloud analyst with Gartner, commented on the outage’s cause:
“The most common causes of this type of outage are software related, either a bug in the code or human error. Right now, we don’t know what it was.”
Amazon later explained that the cause was human error, not an attack on AWS or a failure of hardware or software. An authorized Amazon employee who was debugging an issue with the S3 billing system inadvertently entered a wrong command, shutting down more than the few servers that needed attention. As a result, Amazon had to restart all the affected servers, and service was completely recovered by 4:40 PM.
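The failure described above, a single mistyped command taking down more servers than intended, is the kind of mistake a pre-flight check can catch. The sketch below is only an illustration of that idea and is not Amazon's actual tooling: the function name `check_removal`, the parameters `min_capacity` and `max_batch`, and the server names are all hypothetical.

```python
from typing import List


def check_removal(requested: List[str], fleet: List[str],
                  min_capacity: int, max_batch: int) -> List[str]:
    """Return the servers that may be removed, or raise if the request is unsafe.

    All names and thresholds here are hypothetical, chosen for illustration.
    """
    # Reject typos: every requested server must exist in the fleet.
    unknown = [s for s in requested if s not in fleet]
    if unknown:
        raise ValueError(f"unknown servers: {unknown}")

    # Limit how many servers one command may take out at once.
    if len(requested) > max_batch:
        raise ValueError(
            f"refusing to remove {len(requested)} servers (limit {max_batch})"
        )

    # Never let the fleet drop below the capacity needed to keep serving.
    remaining = len(fleet) - len(set(requested))
    if remaining < min_capacity:
        raise ValueError(
            f"only {remaining} servers would remain (minimum {min_capacity})"
        )
    return requested


fleet = [f"server-{i}" for i in range(10)]
print(check_removal(["server-1", "server-2"], fleet, min_capacity=6, max_batch=3))
```

With a check like this in front of the command, a request that names half the fleet is rejected with an error instead of being carried out.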
The scale of the outage serves as a reminder to companies of the risks associated with depending on just a few companies for cloud computing. Other providers of similar services include:
Leong commented on the impact of the outage:
“More than anything else, S3 clients need to be able to get at their data, because often S3 is used to store images. So, no S3, no nice picture or fancy logo on your website.”
Most Websites Did Not Fail Completely
Most modern websites pull data from more than one cloud service, so when a storage service like S3 goes down, an image may not be available, but other information still appears.
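A minimal sketch of that behavior follows, assuming a page assembled from independent sources. The function `render_page` and the two fetchers are hypothetical stand-ins, with the failing one simulating an unreachable image store; no real AWS API is called.

```python
from typing import Callable, Dict


def render_page(sources: Dict[str, Callable[[], str]],
                placeholder: str = "[unavailable]") -> Dict[str, str]:
    """Fetch each page component, substituting a placeholder for any that fails."""
    page = {}
    for name, fetch in sources.items():
        try:
            page[name] = fetch()
        except OSError:
            # ConnectionError and TimeoutError are both subclasses of OSError,
            # so one unreachable backend does not take down the whole page.
            page[name] = placeholder
    return page


def fetch_logo() -> str:
    # Hypothetical: simulates the image store being offline.
    raise ConnectionError("image store unreachable")


def fetch_article() -> str:
    # Hypothetical: text served from a different, healthy backend.
    return "Article text from a separate backend"


page = render_page({"logo": fetch_logo, "article": fetch_article})
print(page)
```

The page still carries its article text, and only the logo is replaced by the placeholder, which matches the "no nice picture or fancy logo" outcome Leong describes.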