The dark side of the web
Posted on 9 Mar 2010 at 15:47
Google sees only a fraction of the content that appears on the internet. Stuart Andrews finds out what's lurking in the deep web
When Google indexes so many billions of web pages that it doesn’t even bother listing the number any more, it’s hard to imagine that much lies beyond its far-reaching tentacles.
Beneath, however, lies an online world that few know exists. It’s a realm of huge, untapped reserves of valuable information containing sprawling databases, hidden websites and murky forums. It’s a world where academics and researchers might find the data required to solve some of mankind’s biggest problems, but also where criminal syndicates operate, and terrorist handbooks and child pornography are freely distributed.
At the same time, the underground web is the best hope for those who want to escape the bonds of totalitarian state censorship, and share their ideas or experiences with the outside world.
Interested? You’re not alone. The deep web and its “darknets” are a new battleground for those who want to uphold the right to privacy online, and those who feel that rights need to be sacrificed for the safety of society. The deep web is also the new frontier for those who want to rival Google in the field of search. Take a journey with us to the other side of the internet.
Deep webs, the dark web and darknets
The first thing to grasp is that, while the elements that make up this other web have aspects in common, we’re not talking about a single, unified entity. Those in the know will often talk in terms of the deep or invisible web, darknets and the dark web, and you might think these are all the same thing. In fact, they’re separate phenomena, albeit linked by common themes, properties or interests.
The deep web isn’t half as strange or sinister as it sounds. In computer-science speak, it refers to those portions of the web that, for whatever reason, have been invisible to conventional search engines such as Google.
The majority of this deep web is made up of dynamically created pages and database entries that are accessible only through manual completion of an HTML form. A smaller proportion has been accidentally or purposefully made inaccessible to Google’s crawlers, while other areas sit behind password-protected or subscription-only sites.
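To see why form-driven pages stay hidden, here is a minimal Python sketch of a crawler that only follows links. It is an illustration rather than a description of how Google's crawler works: the sample page, the `LinkOnlyCrawler` class and the `survey` helper are all hypothetical names invented for this example.

```python
from html.parser import HTMLParser

# Hypothetical page: two ordinary links plus a search form in front of a database.
SAMPLE_PAGE = """
<html><body>
  <a href="/about.html">About</a>
  <a href="/contact.html">Contact</a>
  <form action="/search" method="get">
    <input type="text" name="q">
    <input type="submit" value="Search">
  </form>
</body></html>
"""


class LinkOnlyCrawler(HTMLParser):
    """Records what a naive link-following crawler can and cannot reach."""

    def __init__(self):
        super().__init__()
        self.links = []  # URLs the crawler can follow directly
        self.forms = []  # form targets that need typed input first

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            self.links.append(attrs["href"])
        elif tag == "form":
            self.forms.append(attrs.get("action", ""))


def survey(html):
    """Return (followable links, form targets) found in an HTML string."""
    parser = LinkOnlyCrawler()
    parser.feed(html)
    parser.close()
    return parser.links, parser.forms


links, forms = survey(SAMPLE_PAGE)
print("Reachable by following links:", links)
print("Hidden behind a form:", forms)
```

The crawler can queue the two linked pages, but the records behind `/search` only appear once someone types a query into the form, so a link-follower never sees them.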
Make no mistake, the deep web is huge. Michael Bergman’s pioneering 2001 study, The Deep Web: Surfacing Hidden Value, estimated that it accounted for 7,500TB of data at a time when search engines could index only 19TB.
Even the more conservative estimates in a 2007 paper written by Google’s Jayant Madhavan, Alon Halevy and colleagues suggest that there are more than 25 million different sources of deep web content, many of which are huge repositories.
“There is a prevailing sense in the database community that we missed the boat with the WWW,” the Google paper concluded. “The over-arching message of this paper is that a second boat is here, with staggering volumes of structured data, and that boat should be ours.”
Treasures of the deep
“There’s a lot of legitimate and valuable content in the deep web,” said Dr Juliana Freire, the leader of a University of Utah project, DeepPeep, which aims to make deep web content more accessible.
“For example, there are several scientific data sets (such as the Sloan Digital Sky Survey and the Center for Coastal Margin Observation & Prediction), documents and databases, and these are useful to society and have many important applications.”
and those who feel that rights need to be sacrificed for the safety of society
"They who can give up essential liberty to obtain a little temporary safety, deserve neither liberty nor safety."
Benjamin Franklin, 1775.
By Lacrobat on 12 Mar 2010
accounted for 7,500TB of data at a time when search engines could index only 19.
What, 19 pages, sites, TBs?
By greemble on 12 Mar 2010
TB. It's self explanatory really
By TimoGunt on 18 Mar 2010
Here is a good article that adds some additional detail to the topic and a good set of links to the deep web search engines and other helpful sites.
By theTribster on 19 Mar 2010
Fascinating article. I had no idea that there was an "underworld" web.
I agree with the article author about content and use. When you put together any number of people in doing something, there will always be those whose purposes are less than honorable. But, that does not change the fact that the good of its use can outweigh the bad.
Thank you for this post. I learned a lot from it.
By moomoosweetbaby on 25 Mar 2010