GMail outage blamed on overloaded servers
By Stuart Turton
Posted on 2 Sep 2009 at 07:59
Google has blamed the outage which knocked GMail offline for two hours last night on server maintenance intended "to improve service availability".
The problems began at around 8:30pm last night, when Google took several mail servers offline for routine maintenance - a common procedure that generally goes unnoticed by users. Unfortunately, Google had recently made a number of changes to its servers, and underestimated the extra load these changes would place on its request routers.
"As we now know, we had slightly underestimated the load which some recent changes (ironically, some designed to improve service availability) placed on the request routers - servers which direct web queries to the appropriate GMail server for response," writes Ben Treynor, vice president of engineering and site reliability.
"The request routers became overloaded and in effect told the rest of the system 'stop sending us traffic, we're too slow!' This transferred the load onto the remaining request routers, causing a few more of them to also become overloaded, and within minutes nearly all of the request routers were overloaded," he adds.
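The cascade Treynor describes can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions, not a model of Google's actual systems: the function name, the router capacities and the load figures are all invented for illustration, and traffic is assumed to be split evenly across whichever routers are still accepting it.

```python
def simulate_cascade(capacities, total_load):
    """Return the number of routers still accepting traffic after each round.

    Assumption (hypothetical model): load is shared evenly among healthy
    routers, and a router drops out as soon as its share exceeds its capacity.
    """
    healthy = list(capacities)
    history = [len(healthy)]
    while healthy:
        share = total_load / len(healthy)
        # Routers whose share exceeds their capacity stop taking traffic.
        survivors = [c for c in healthy if c >= share]
        if len(survivors) == len(healthy):
            break  # stable: every remaining router can handle its share
        healthy = survivors
        history.append(len(healthy))
    return history


# Ten made-up routers with slightly different capacities (requests per second).
routers = [100, 105, 110, 115, 120, 125, 130, 135, 140, 145]

print(simulate_cascade(routers, 990))   # load just within every router's limit
print(simulate_cascade(routers, 1010))  # load slightly too high for the weakest
print(simulate_cascade(routers + [120, 120], 1010))  # same load, two extra routers
```

In this toy model a load only around 2% higher is enough to tip the weakest router over, after which each round of redistribution knocks out more routers until none are left. Adding spare routers, as Google's engineers did, lowers each router's share and stops the first failure from happening at all.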
The problem was fixed when Google's engineering team tapped its network to bring additional request routers online. The company is promising that in future it will ensure the network has sufficient capacity to handle spikes in traffic, and is also looking at isolating processes so this sort of cascade can't happen again.
"We'll be hard at work over the next few weeks implementing these and other GMail reliability improvements. GMail remains more than 99.9% available to all users, and we're committed to keeping events like today's notable for their rarity," Treynor concludes.
The outage will come as a blow to Google, which is struggling to convince businesses of the merits of cloud computing services, and which faces a renewed online challenge from Microsoft.
