BBC knocked offline by network failure
By Nicole Kobie
Posted on 30 Mar 2011 at 08:21
The BBC is looking into the causes of a technical failure that knocked its entire web estate down for an hour last night.
BBC News editor Steve Herrmann said the sites were down for a full hour, but came back live at about midnight. "We haven't yet had a full technical debrief, but it's clear it was a major network problem," he said in a post on the BBC blog.
BBC staff suggested via Twitter it was down to a "faulty switch" or configuration problems. Others suggested that it looked like a problem with the BBC's DNS servers.
A BBC news report said Siemens, which provides the broadcaster's IT support, was looking into the problem, and that its engineers got the sites back online by powering down equipment and turning it back on again.
During the outage, many took to Twitter to suggest more dramatic causes, including a distributed-denial-of-service (DDoS) attack by Anonymous.
"It's not often we get a message from the BBC's technical support teams saying, 'Total outage of all BBC websites'," Herrmann said.
"We'd like to apologise to everyone who couldn't get onto the BBC News website during that time," he added.
Update: The BBC's controller for digital distribution, Richard Cooper, has issued an explanation for the outage, saying multiple failures knocked the sites offline.
"Our systems are designed to be sufficiently resilient (multiple systems, and multiple data centres) to make an outage like this extremely unlikely," he said in a blog post. "However, I'm afraid that last night we suffered multiple failures, with the result that the whole site went down."
"For the more technically minded, this was a failure in the systems that perform two functions," he added. "The first is the aggregation of network traffic from the BBC's hosting centres to the internet. The second is the announcement of 'routes' onto the internet that allows BBC Online to be 'found.' With both of these having failed, we really were down."
He said the BBC would take a "hard look" at its systems to ensure such a fault didn't happen again.
The fallback position of IT managers everywhere.
"have you tried switching it off and back on again madam"
Funny how it works 9/10 times though.:-)
By Jaberwocky on 30 Mar 2011
Wondered why my streaming radio app could not get BBC last night ...
By mike0whit on 30 Mar 2011
I've no idea what he means by "announcement of 'routes'". I'd prefer an actual technical explanation.
By peterm2k on 30 Mar 2011
I agree with peterm2k, his description for the technically minded is very vague
By DaChimp on 30 Mar 2011
It is vague but I imagine he means either a dns problem or all of their routers started sending faulty updates to each other
By PCmaster1000 on 31 Mar 2011
1 hour ?
The website may have been "down" for an hour, but it was completely unusable for a great deal longer than that, more like five or six hours.
By howardabates1 on 1 Apr 2011