
Web search engines work by storing information about many web pages, which they retrieve from the Web itself. These pages are fetched by a Web crawler (sometimes also known as a spider), an automated Web browser that follows every link it finds. The contents of each page are then examined to determine how it should be indexed (for example, words are extracted from the titles, headings, or special fields called meta tags).
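The extraction step described above can be sketched in a few lines of Python. This is only an illustration, not how any particular search engine does it: the names `PageScanner` and `scan_page` are made up for this example, and the choice to index only the title, `h1` to `h3` headings, and the `keywords` and `description` meta tags is an assumption. It uses the standard library's `html.parser` and does no network fetching; a real crawler would download each page and then queue up the links it found.

```python
from html.parser import HTMLParser
import re


class PageScanner(HTMLParser):
    """Collects outgoing links and indexable words from one HTML page."""

    # Assumption: only these tags contribute words to the index.
    INDEXED_TAGS = {"title", "h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.links = []        # href values a crawler would follow next
        self.words = []        # lowercase words taken from indexed fields
        self._open_tags = []   # stack of indexed tags currently open

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "meta" and attrs.get("name") in ("keywords", "description"):
            self._add_words(attrs.get("content") or "")
        elif tag in self.INDEXED_TAGS:
            self._open_tags.append(tag)

    def handle_endtag(self, tag):
        if self._open_tags and self._open_tags[-1] == tag:
            self._open_tags.pop()

    def handle_data(self, data):
        # Text counts only while we are inside a title or heading.
        if self._open_tags:
            self._add_words(data)

    def _add_words(self, text):
        self.words.extend(re.findall(r"[a-z0-9]+", text.lower()))


def scan_page(html):
    """Return (links, words) for a single page of HTML text."""
    scanner = PageScanner()
    scanner.feed(html)
    scanner.close()
    return scanner.links, scanner.words
```

Calling `scan_page` on a page's HTML returns the links to visit next and the words to hand to the indexer; ordinary body text is skipped here purely to keep the sketch short.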

Data about web pages are stored in an index database for use in later queries. A query can be just a single word. The purpose of an index is to allow the needed information to be located as quickly as possible. When a person enters a query into a search engine (commonly by using keywords), the engine examines its index and provides a listing of the best-matching web pages, usually with a short summary containing the document's title and often parts of the text. The usefulness of a search engine depends on the relevance of the result set it gives back. While there may be millions of web pages that contain a particular word or phrase, some pages may be more relevant, popular, or authoritative than others.
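To make the idea of an index concrete, here is a minimal sketch of an inverted index, a structure that maps each word to the pages containing it so a query does not have to scan every page. The function names (`tokenize`, `build_index`, `search`) and the sample pages are invented for this example. Ranking by raw word count is an assumption chosen for simplicity; it is only a stand-in for the relevance, popularity, and authority signals mentioned above.

```python
from collections import defaultdict
import re


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to {url: number of times the word occurs on that page}."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word][url] = index[word].get(url, 0) + 1
    return index


def search(index, query):
    """Return URLs containing every query word, best match first."""
    words = tokenize(query)
    if not words:
        return []
    # One {url: count} mapping per query word (empty if the word is unknown).
    postings = [index.get(word, {}) for word in words]
    # Keep only pages that appear in every mapping.
    matches = set(postings[0]).intersection(*postings[1:])

    def score(url):
        # Assumed ranking: total occurrences of the query words on the page.
        return sum(p[url] for p in postings)

    # Highest score first; ties are broken alphabetically by URL.
    return sorted(matches, key=lambda url: (-score(url), url))


pages = {
    "a.html": "cats and dogs",
    "b.html": "cats cats cats",
    "c.html": "dogs only",
}
index = build_index(pages)
print(search(index, "cats"))       # pages mentioning "cats", most mentions first
print(search(index, "cats dogs"))  # only pages that contain both words
```

Each lookup touches only the entries for the words in the query, which is why an index can answer quickly even when the collection is large.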
