{"id":520857,"date":"2025-10-23T00:17:18","date_gmt":"2025-10-23T00:17:18","guid":{"rendered":"https:\/\/www.europesays.com\/uk\/520857\/"},"modified":"2025-10-23T00:17:18","modified_gmt":"2025-10-23T00:17:18","slug":"the-long-tail-of-the-aws-outage","status":"publish","type":"post","link":"https:\/\/www.europesays.com\/uk\/520857\/","title":{"rendered":"The Long Tail of the AWS Outage"},"content":{"rendered":"<p>A sprawling Amazon Web Services <a href=\"https:\/\/www.wired.com\/story\/what-that-huge-aws-outage-reveals-about-the-internet\/\" target=\"_blank\" rel=\"noopener\">cloud outage<\/a> that began early Monday morning illustrated the fragile interdependencies of the internet as major communication, financial, health care, education, and government platforms around the world suffered disruptions. As the <a href=\"https:\/\/www.wired.com\/story\/the-aws-outage-was-a-nightmare-for-college-students\/\" target=\"_blank\" rel=\"noopener\">day wore on<\/a>, AWS diagnosed and began working to correct the issue, which stemmed from the company&#8217;s critical US-EAST-1 region based in northern Virginia. But the cascade of impacts took time to fully resolve.<\/p>\n<p class=\"paywall\">Researchers reflecting on the incident particularly highlighted the length of the outage, which started around 3 am ET on Monday, October 20. AWS said in status updates that by 6:01 pm ET on Monday \u201call AWS services returned to normal operations.\u201d The outage directly stemmed from Amazon&#8217;s DynamoDB database application programming interfaces and, according to the company, \u201cimpacted\u201d 141 other AWS services. Multiple network engineers and infrastructure specialists emphasized to WIRED that errors are understandable and inevitable for so-called \u201chyperscalers\u201d like AWS, Microsoft Azure, and Google Cloud Platform, given their complexity and sheer size. But they noted, too, that this reality shouldn&#8217;t simply absolve cloud providers when they have prolonged downtime.<\/p>\n<p class=\"paywall\">\u201cThe word hindsight is key. It&#8217;s easy to find out what went wrong after the fact, but the overall reliability of AWS shows how difficult it is to prevent every failure,\u201d says Ira Winkler, chief information security officer of the reliability and cybersecurity firm CYE. \u201cIdeally, this will be a lesson learned, and Amazon will implement more redundancies that would prevent a disaster like this from happening in the future\u2014or at least prevent them staying down as long as they did.\u201d<\/p>\n<p class=\"paywall\">AWS did not respond to questions from WIRED about the long tail of the recovery for customers. An AWS spokesperson says the company plans to publish one of its \u201cpost-event summaries\u201d about the incident.<\/p>\n<p class=\"paywall\">\u201cI don&#8217;t think this was just a \u2018stuff happens\u2019 outage. I would have expected a full remediation much faster,\u201d says Jake Williams, vice president of research and development at Hunter Strategy. \u201cTo give them their due, cascading failures aren&#8217;t something that they get a lot of experience working with because they don&#8217;t have outages very often. So that&#8217;s to their credit. But it&#8217;s really easy to get into the mindset of giving these companies a pass, and we shouldn&#8217;t forget that they create this situation by actively trying to attract ever more customers to their infrastructure. Clients don&#8217;t control whether they are overextending themselves or what they may have going on financially.\u201d<\/p>\n<p class=\"paywall\">The incident was caused by a familiar culprit in web outages\u2014\u201cdomain name system\u201d resolution issues. DNS is essentially the internet&#8217;s phonebook mechanism to direct web browsers to the right servers. As a result, DNS issues are a common source of outages, because they can cause requests to fail and keep content from loading.<\/p>\n","protected":false},"excerpt":{"rendered":"A sprawling Amazon Web Services cloud outage that began early Monday morning illustrated the fragile interdependencies of the&hellip;\n","protected":false},"author":2,"featured_media":520858,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6],"tags":[324,4121,51,14270,168544,6084,3082,16,15],"class_list":{"0":"post-520857","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-business","8":"tag-amazon","9":"tag-aws","10":"tag-business","11":"tag-cloud-computing","12":"tag-dns","13":"tag-infrastructure","14":"tag-internet","15":"tag-uk","16":"tag-united-kingdom"},"share_on_mastodon":{"url":"https:\/\/pubeurope.com\/@uk\/115420610293032805","error":""},"_links":{"self":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/posts\/520857","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/comments?post=520857"}],"version-history":[{"count":0,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/posts\/520857\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/media\/520858"}],"wp:attachment":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/media?parent=520857"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/categories?post=520857"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/tags?post=520857"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}