BBC knocked offline by network failure
By Nicole Kobie
Posted on 30 Mar 2011 at 08:21
The BBC is looking into the causes of a technical failure that knocked its entire web estate down for an hour last night.
BBC News editor Steve Herrmann said the sites were down for a full hour, but came back live at about midnight. "We haven't yet had a full technical debrief, but it's clear it was a major network problem," he said in a post on the BBC blog.
BBC staff suggested via Twitter it was down to a "faulty switch" or configuration problems. Others suggested that it looked like a problem with the BBC's DNS servers.
A BBC news report said Siemens, which provides the broadcaster's IT support, was looking into the problem, but its engineers got the sites back online by powering down equipment and turning it back on again.
During the outage, many took to Twitter to suggest more dramatic causes, including a distributed-denial-of-service (DDoS) attack by Anonymous.
"It's not often we get a message from the BBC's technical support teams saying, 'Total outage of all BBC websites'," Herrmann said.
"We'd like to apologise to everyone who couldn't get onto the BBC News website during that time," he added.
Update: The BBC's controller for digital distribution, Richard Cooper, has issued an explanation for the outage, saying multiple failures knocked the sites offline.
"Our systems are designed to be sufficiently resilient (multiple systems, and multiple data centres) to make an outage like this extremely unlikely," he said in a blog post. "However, I'm afraid that last night we suffered multiple failures, with the result that the whole site went down."
"For the more technically minded, this was a failure in the systems that perform two functions," he added. "The first is the aggregation of network traffic from the BBC's hosting centres to the internet. The second is the announcement of 'routes' onto the internet that allows BBC Online to be 'found.' With both of these having failed, we really were down."
He said the BBC would take a "hard look" at its systems to ensure such a fault didn't happen again.
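For readers who want a more concrete picture of what the "announcement of routes" means, here is a minimal sketch in Python. It is a toy model, not the BBC's actual setup: the class and method names are hypothetical, the address block is a reserved documentation range, and the article does not say which protocol was involved (on the public internet this job is typically done by BGP). The idea is that outside networks can only reach a site while a route to its address block is being announced; once the announcement disappears, lookups find nothing.

```python
import ipaddress


class ToyRoutingTable:
    """Toy model of the routes an outside network has learned (illustrative only)."""

    def __init__(self):
        # Maps an announced address block to the name of the network announcing it.
        self.routes = {}

    def announce(self, prefix, origin):
        """Record that `origin` is announcing a route to `prefix`."""
        self.routes[ipaddress.ip_network(prefix)] = origin

    def withdraw(self, prefix):
        """Forget the route to `prefix`, as happens when announcements stop."""
        self.routes.pop(ipaddress.ip_network(prefix), None)

    def lookup(self, address):
        """Return the origin of the most specific matching route, or None."""
        addr = ipaddress.ip_address(address)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None  # no route: the destination cannot be "found"
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best]


# 192.0.2.0/24 is a documentation-only range, used here as a stand-in.
table = ToyRoutingTable()
table.announce("192.0.2.0/24", "example-broadcaster")
before = table.lookup("192.0.2.10")   # route known, site reachable

table.withdraw("192.0.2.0/24")
after = table.lookup("192.0.2.10")    # route gone, site unreachable

print(before, after)
```

In this sketch the web servers themselves never change; only the route disappears, which is enough to make every address in the block unreachable from outside.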
The fall-back position of IT managers everywhere.
"Have you tried switching it off and back on again, madam?"
Funny how it works 9/10 times though. :-)
By Jaberwocky on 30 Mar 2011
Wondered why my streaming radio app could not get BBC last night ...
By mike0whit on 30 Mar 2011
I've no idea what he means by "announcement of 'routes'". I'd prefer an actual technical explanation.
By peterm2k on 30 Mar 2011
I agree with peterm2k: his description for the technically minded is very vague.
By DaChimp on 30 Mar 2011
It is vague, but I imagine he means either a DNS problem or all of their routers started sending faulty updates to each other.
By PCmaster1000 on 31 Mar 2011
1 hour?
The website may have been "down" for an hour, but it was completely unusable for a great deal longer than that, more like five or six hours.
By howardabates1 on 1 Apr 2011