The dark side of the web
Posted on 9 Mar 2010 at 15:47
Google sees only a fraction of the content that appears on the internet. Stuart Andrews finds out what's lurking in the deep web
When Google indexes so many billions of web pages that it doesn’t even bother listing the number any more, it’s hard to imagine that much lies beyond its far-reaching tentacles.
Beneath, however, lies an online world that few know exists. It’s a realm of huge, untapped reserves of valuable information containing sprawling databases, hidden websites and murky forums. It’s a world where academics and researchers might find the data required to solve some of mankind’s biggest problems, but also where criminal syndicates operate, and terrorist handbooks and child pornography are freely distributed.
At the same time, the underground web is the best hope for those who want to escape the bonds of totalitarian state censorship, and share their ideas or experiences with the outside world.
Interested? You’re not alone. The deep web and its “darknets” are a new battleground for those who want to uphold the right to privacy online, and those who feel that rights need to be sacrificed for the safety of society. The deep web is also the new frontier for those who want to rival Google in the field of search. Take a journey with us to the other side of the internet.
Deep webs, the dark web and darknets
The first thing to grasp is that, while the elements that make up this other web have aspects in common, we’re not talking about a single, unified entity. Those in the know will often talk in terms of the deep or invisible web, darknets and the dark web, and you might think these are all the same thing. In fact, they’re separate phenomena, albeit linked by common themes, properties or interests.
The deep web isn’t half as strange or sinister as it sounds. In computer-science speak, it refers to those portions of the web that, for whatever reason, have been invisible to conventional search engines such as Google.
The majority of this deep web is made up of dynamically created pages and database entries that are accessible only through manual completion of an HTML form. A smaller proportion has been accidentally or purposefully made inaccessible to Google’s crawlers, while other areas sit behind password-protected or subscription-only sites.
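To see why form-backed pages stay hidden, here is a minimal Python sketch of what a simple link-following crawler can and cannot pick up from a page. It is an illustration under stated assumptions, not how Google's crawler works: the `LinkAndFormScanner` class, the `scan` helper and the sample `PAGE` markup are all hypothetical names invented for this example.

```python
from html.parser import HTMLParser


class LinkAndFormScanner(HTMLParser):
    """Hypothetical scanner: records plain links and form targets on a page."""

    def __init__(self):
        super().__init__()
        self.links = []  # URLs a crawler can follow without any input
        self.forms = []  # form targets whose results need a typed query

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            self.links.append(attrs["href"])
        elif tag == "form":
            self.forms.append(attrs.get("action", ""))


# Made-up page: one ordinary link plus a search form in front of a database.
PAGE = """
<html><body>
  <a href="/about.html">About</a>
  <form action="/search" method="get">
    <input type="text" name="query">
    <input type="submit" value="Search">
  </form>
</body></html>
"""


def scan(html):
    """Return (links, forms) found in the given HTML string."""
    scanner = LinkAndFormScanner()
    scanner.feed(html)
    return scanner.links, scanner.forms


links, forms = scan(PAGE)
print("Crawlable links:", links)
print("Forms hiding content:", forms)
```

A crawler that only follows the `links` list will reach the "About" page, but the records behind `/search` are generated only once someone fills in the `query` field, so they never appear as pages to index.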
Make no mistake, the deep web is huge. Michael Bergman’s pioneering 2001 study, The Deep Web: Surfacing Hidden Value, estimated that it accounted for 7,500TB of data at a time when search engines could index only 19TB.
Even the more conservative estimates in a 2007 paper written by Google’s Jayant Madhavan, Alon Halevy and colleagues suggest that there are more than 25 million different sources of deep web content, many of which are huge repositories.
“There is a prevailing sense in the database community that we missed the boat with the WWW,” the Google paper concluded. “The over-arching message of this paper is that a second boat is here, with staggering volumes of structured data, and that boat should be ours.”
Treasures of the deep
“There’s a lot of legitimate and valuable content in the deep web,” said Dr Juliana Freire, the leader of a University of Utah project, DeepPeep, which aims to make deep web content more accessible.
“For example, there are several scientific data sets (such as the Sloan Digital Sky Survey and the Center for Coastal Margin Observation & Prediction), documents and databases, and these are useful to society and have many important applications.”
and those who feel that rights need to be sacrificed for the safety of society
"They who can give up essential liberty to obtain a little temporary safety, deserve neither liberty nor safety."
Benjamin Franklin, 1775.
By Lacrobat on 12 Mar 2010
accounted for 7,500TB of data at a time when search engines could index only 19.
What, 19 pages, sites, TBs?
By greemble on 12 Mar 2010
TB. It's self explanatory really
By TimoGunt on 18 Mar 2010
Here is a good article that adds some additional detail to the topic and a good set of links to the deep web search engines and other helpful sites.
By theTribster on 19 Mar 2010
The dark side of the web
Fascinating article. I had no idea that there was an "underworld" web.
I agree with the article author about content and use. When you put together any number of people doing something, there will always be those whose purposes are less than honorable. But that does not change the fact that the good of its use can outweigh the bad.
Thank you for this post. I learned a lot from it.
By moomoosweetbaby on 25 Mar 2010