Features
The new seekers
By November last year, there were over 100 million unique websites on the internet, and that's not counting other items such as blogs and MySpace pages. That's a lot, and if even just a fraction contain useful information you might actually want, finding those pages among the rest can be frustrating and difficult. Here we show you how to find exactly the results you want, and how you can ensure your own site is easier to find.
Web directories
When the web was first made accessible to the public in the early 1990s, it was still mainly used by academics to exchange research and talk to their peers. Even at this early stage, however, there was a great deal of uncatalogued information online. If you didn't already know where a particular site or newsgroup was hosted, the chances of finding it were very slim.
In keeping with the enthusiastic 'have-a-go' ethos of the early internet, most of the early solutions to this problem were developed by students working in their own time. Web directories were among the first attempts to bring order to the internet. These were simple lists of websites, grouped by category. There are considerable disadvantages to this approach. Finding the right information in a list of websites can be a slow, hit-and-miss process. Keeping a web directory current enough to be useful is also incredibly time-consuming.
One of the best-known web directories of the early internet was Yahoo.com. Originally created by PhD students David Filo and Jerry Yang as a way of keeping track of their own favourite sites, the directory became
Search engine evolution
At the same time as the early web directories were being created, other groups of students were developing search engines for the web. The very first search engine actually predates the World Wide Web. Called Archie, it was developed by Barbadian computer science student Alan Emtage to search for information in public FTP archives. Wandex and Aliweb, the first search engines designed for the web, appeared in 1993. Unlike their modern counterparts, however, these early search engines looked only at the words in the website's URL. The first search engine to index all the words on a website was WebCrawler, launched in 1994. Within a year, several other search engines had been launched, including the now well-known AltaVista and Lycos. If it works well, a search engine will help you find the information you need more quickly than a web directory.
Although the science of web searching can be tremendously complicated when you examine it in detail, the basics are simple. Behind every search engine is a database of websites. Every site in that database is matched to a number of search criteria, such as words taken from the page titles and headers, or words that are repeated many times in the page's body text. In order for the search engine to be useful, the database must be updated continually. This is done by programs called bots, or spiders. These programs download websites, just as your browser would. Each site is checked against the search engine's index. If it is new, it's added to the index. If it is already in the index, the information associated with the site is refreshed. While this is happening, the bot gets to work on all the links from the downloaded site to other sites. In this way, new sites are discovered and the information about old sites is kept up to date. Two of the big differences between search engines are the size of their databases and how efficiently they keep those databases up to date.
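To make the crawl-and-index cycle described above more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how any real search engine is built. The "web" is a hypothetical in-memory dictionary (FAKE_WEB) so the example runs without a network connection, the names fetch, crawl and search are invented for this sketch, and ranking by raw word count is a deliberately crude stand-in for real search criteria.

```python
from collections import Counter, deque
from html.parser import HTMLParser

# Hypothetical stand-in for the live web: URL -> HTML source.
FAKE_WEB = {
    "http://a.example/": "<title>Tea guide</title><h1>Tea</h1>"
                         "<p>green tea, black tea</p>"
                         "<a href='http://b.example/'>more</a>",
    "http://b.example/": "<title>Coffee notes</title><p>coffee beans and tea</p>"
                         "<a href='http://a.example/'>back</a>"
                         "<a href='http://c.example/'>next</a>",
    "http://c.example/": "<title>Cocoa</title><p>cocoa cocoa cocoa</p>",
}


class PageParser(HTMLParser):
    """Collects outgoing links and word counts from one page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = Counter()

    def handle_starttag(self, tag, attrs):
        # Remember the target of every <a href="..."> link.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        # Count every word in titles, headers, body text and link text.
        for word in data.lower().split():
            word = word.strip(".,;:!?")
            if word:
                self.words[word] += 1


def fetch(url):
    """Assumed download step; a real bot would make an HTTP request."""
    return FAKE_WEB.get(url)


def crawl(seed_urls):
    """Follow links outward from the seeds and build a simple index."""
    index = {}                 # URL -> word counts for that page
    queue = deque(seed_urls)   # pages waiting to be downloaded
    seen = set(seed_urls)      # avoids queueing the same URL twice
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue           # unreachable page: skip it
        parser = PageParser()
        parser.feed(html)
        # New pages are added; pages already indexed are refreshed.
        index[url] = parser.words
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index


def search(index, term):
    """Return URLs containing the term, most occurrences first."""
    term = term.lower()
    hits = [(counts[term], url) for url, counts in index.items()
            if counts[term] > 0]
    return [url for _, url in sorted(hits, reverse=True)]
```

Calling crawl(["http://a.example/"]) starts from a single known page and discovers the other two purely by following links, which is the same way a bot finds new sites. Running crawl again over the same seeds simply overwrites each entry, which corresponds to refreshing the information already held in the index.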





