Indexing

When you use a search engine, it sorts through millions of pages almost instantly and presents you with the ones that best match your topic. The matches are even ranked, with the most relevant ones coming first.

Search engines don’t always get it right, but considering the vast amount of information they deal with and the speed at which they operate, they usually do a pretty amazing job.

Search engine software sifts through the millions of pages recorded in the index to find matches to a search and ranks them in order of what it believes is most relevant.

Bots, robots, web crawlers, or spiders are software agents that search engines regularly send out to ‘crawl’ or ‘spider’ websites and web pages. They suck in page content, following links from page to page, and return the information to the engines, which incorporate it into the enormous indexes of content maintained in their databases.
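To make the crawling loop concrete, here is a minimal sketch in Python using only the standard library. The seed URL and ten-page limit are purely illustrative; a real crawler adds politeness delays, robots.txt handling, deduplication, and scheduling across billions of pages.

```python
import urllib.request
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(seed_url, max_pages=10):
    """Breadth-first crawl: fetch a page, record its content,
    queue up the links it contains, and repeat."""
    queue = deque([seed_url])
    seen = set()
    pages = {}  # url -> raw HTML: the crawler's haul for the indexer
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        if url in seen:
            continue
        seen.add(url)
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                html = response.read().decode("utf-8", errors="replace")
        except Exception:
            continue  # skip pages that fail to load
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            queue.append(urljoin(url, link))  # resolve relative links
    return pages

# pages = crawl("https://example.com")  # hypothetical seed URL
```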

Everything the spider finds goes into the index. The index, sometimes called the ‘catalog’, is like a giant book containing a copy of every web page the spider has found. If a web page changes, the book is updated with the new information.
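Here is a toy version of that catalog in Python: an inverted index that maps each word to the pages containing it, replaces a page’s entry when the page changes, and ranks query results by how many matching words each page contains. The URLs and page text are invented for illustration, and real engines use far richer relevance signals, but the basic shape is the same.

```python
import re
from collections import Counter, defaultdict

class Index:
    """A toy 'catalog': stores a copy of each page and an inverted
    index mapping every word to the pages it appears on."""
    def __init__(self):
        self.pages = {}                       # url -> page text (the 'book')
        self.postings = defaultdict(Counter)  # word -> {url: count}

    def add(self, url, text):
        """Add or update a page. Re-adding a changed page replaces
        its old entry, just as the 'book' is updated."""
        if url in self.pages:  # remove the stale postings first
            for word in set(re.findall(r"\w+", self.pages[url].lower())):
                del self.postings[word][url]
        self.pages[url] = text
        for word in re.findall(r"\w+", text.lower()):
            self.postings[word][url] += 1

    def search(self, query):
        """Score each page by how often the query words appear on it,
        and return the best matches first (a crude stand-in for real
        relevance ranking)."""
        scores = Counter()
        for word in re.findall(r"\w+", query.lower()):
            scores.update(self.postings[word])
        return [url for url, _ in scores.most_common()]

index = Index()
index.add("https://example.com/a", "spiders crawl the web and index pages")
index.add("https://example.com/b", "search engines rank pages by relevance")
# /b matches more of the query words, so it is ranked first
print(index.search("rank pages by relevance"))
```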
