The dark side of the web
Posted on 9 Mar 2010 at 15:47
Google sees only a fraction of the content that appears on the internet. Stuart Andrews finds out what's lurking in the deep web
When Google indexes so many billions of web pages that it doesn’t even bother listing the number any more, it’s hard to imagine that much lies beyond its far-reaching tentacles.
Beneath, however, lies an online world that few know exists. It’s a realm of huge, untapped reserves of valuable information containing sprawling databases, hidden websites and murky forums. It’s a world where academics and researchers might find the data required to solve some of mankind’s biggest problems, but also where criminal syndicates operate, and terrorist handbooks and child pornography are freely distributed.
At the same time, the underground web is the best hope for those who want to escape the bonds of totalitarian state censorship, and share their ideas or experiences with the outside world.
Interested? You’re not alone. The deep web and its “darknets” are a new battleground for those who want to uphold the right to privacy online, and those who feel that rights need to be sacrificed for the safety of society. The deep web is also the new frontier for those who want to rival Google in the field of search. Take a journey with us to the other side of the internet.
Deep webs, the dark web and darknets
The first thing to grasp is that, while the elements that make up this other web have aspects in common, we’re not talking about a single, unified entity. Those in the know will often talk in terms of the deep or invisible web, darknets and the dark web, and you might think these are all the same thing. In fact, they’re separate phenomena, albeit linked by common themes, properties or interests.
The deep web isn’t half as strange or sinister as it sounds. In computer-science speak, it refers to those portions of the web that, for whatever reason, have been invisible to conventional search engines such as Google.
The majority of this deep web is made up of dynamically created pages and database entries that are accessible only through manual completion of an HTML form. A smaller proportion has been accidentally or purposefully made inaccessible to Google’s crawlers, while other areas sit behind password-protected or subscription-only sites.
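To see why form-gated pages stay invisible, consider a minimal sketch of a crawler that discovers pages only by following hyperlinks. Everything here is an illustrative assumption rather than a description of how Google's crawlers work: the tiny in-memory SITE, its paths, and the LinkExtractor and crawl names are all hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical in-memory site: page paths mapped to their HTML.
# The search results page exists, but no hyperlink points to it;
# it is produced only when a visitor fills in the form on "/".
SITE = {
    "/": '<a href="/about">About</a> <form action="/search"><input name="q"></form>',
    "/about": '<a href="/">Home</a>',
    "/search?q=sky+survey": "<p>Record 1 of a large database</p>",
}


class LinkExtractor(HTMLParser):
    """Collects href targets from <a> tags and ignores all other tags, forms included."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def crawl(start):
    """Breadth-first crawl that follows hyperlinks only and never submits forms."""
    seen, queue = set(), [start]
    while queue:
        path = queue.pop(0)
        if path in seen or path not in SITE:
            continue
        seen.add(path)
        parser = LinkExtractor()
        parser.feed(SITE[path])
        queue.extend(parser.links)
    return seen


reached = crawl("/")
hidden = set(SITE) - reached
print("Reached by following links:", sorted(reached))
print("Never reached:", sorted(hidden))
```

In this toy example the crawler reaches the home and about pages, but the database record behind the search form is never visited, because nothing links to it.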
Make no mistake, the deep web is huge. Michael Bergman’s pioneering 2001 study, The Deep Web: Surfacing Hidden Value, estimated that it accounted for 7,500TB of data at a time when search engines could index only 19TB.
Even the more conservative estimates in a 2007 paper written by Google’s Jayant Madhavan, Alon Halevy and colleagues suggest that there are more than 25 million different sources of deep web content, many of which are huge repositories.
“There is a prevailing sense in the database community that we missed the boat with the WWW,” the Google paper concluded. “The over-arching message of this paper is that a second boat is here, with staggering volumes of structured data, and that boat should be ours.”
Treasures of the deep
“There’s a lot of legitimate and valuable content in the deep web,” said Dr Juliana Freire, the leader of a University of Utah project, DeepPeep, which aims to make deep web content more accessible.
“For example, there are several scientific data sets (such as the Sloan Digital Sky Survey and the Center for Coastal Margin Observation & Prediction), documents and databases, and these are useful to society and have many important applications.”
and those who feel that rights need to be sacrificed for the safety of society
"They who can give up essential liberty to obtain a little temporary safety, deserve neither liberty nor safety."
Benjamin Franklin, 1775.
By Lacrobat on 12 Mar 2010
accounted for 7,500TB of data at a time when search engines could index only 19.
What, 19 pages, sites, TBs?
By greemble on 12 Mar 2010
TB. It's self explanatory really
By TimoGunt on 18 Mar 2010
Here is a good article that adds some additional detail to the topic and a good set of links to the deep web search engines and other helpful sites.
By theTribster on 19 Mar 2010
The dark side of the web
Fascinating article. I had no idea that there was an "underworld" web.
I agree with the article author about content and use. When you put together any number of people doing something, there will always be those whose purposes are less than honorable. But that does not change the fact that the good of its use can outweigh the bad.
Thank you for this post. I learned a lot from it.
By moomoosweetbaby on 25 Mar 2010