Amazon says AWS cloud service back to normal after global outage hits thousands of sites

Amazon Web Services (AWS) has restored normal operations after a global outage disrupted thousands of websites and applications, including major platforms such as Snapchat, Reddit, and Venmo. The incident, which began on Monday, was the largest internet disruption since last year’s CrowdStrike malfunction, which affected hospitals, banks, and airports.

AWS attributed the outage to a failure in a subsystem of its network health monitoring system in the US-EAST-1 region in northern Virginia, which has a history of similar problems. The problem stemmed from a Domain Name System (DNS) issue that prevented applications from reaching the API for DynamoDB, a critical AWS cloud database. Most services were restored by Monday afternoon, but some, including AWS Config and Redshift, had message backlogs that needed additional time to process.

Organizations affected included Lloyds Bank, Vodafone, and HMRC, and more than 4 million users reported issues. The outage underscored the fragility of global cloud infrastructure and the widespread reliance on a few dominant providers. Experts emphasized the need for better fault tolerance and diversified cloud strategies to mitigate future disruptions. Despite the disruption, Amazon’s stock rose 1.6%, reflecting Wall Street’s muted reaction to the incident.