Steps a search engine typically performs:
- Downloads (crawls) as much of the web as it can, before it ever serves results for a search query.
- Extracts words from the text on the pages it has downloaded.
- Creates a big index associating each word found with a list of documents containing that word.
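
The indexing steps above can be sketched in a few lines of Python. This is only an illustration: the `pages` dictionary stands in for downloaded pages, and the names `tokenize` and `build_index` are hypothetical helpers, not part of any real search engine. A production index would also handle markup, stemming, and storage on disk, none of which is shown here.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and pull out runs of letters and digits as words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Map each word to the set of document ids whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in pages.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index

# Hypothetical stand-in for pages that were already downloaded.
pages = {
    "doc1": "The quick brown fox",
    "doc2": "The lazy brown dog",
    "doc3": "A quick red fox jumps",
}

index = build_index(pages)
print(sorted(index["brown"]))
```

A structure like this, which maps words to the documents containing them, is commonly called an inverted index.
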
At this point, the search engine is ready to handle queries. To handle a query, it might:
- Look up each word in the query in the word-document index.
- Intersect the lists of documents found for each word, producing a list of documents that each contain all of the query words.
- Group related documents.
- Try to order the documents by how relevant they seem to be to the query.
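
As a rough sketch of the query steps above, the following Python looks up each query word, intersects the resulting document sets, and orders the matches. Everything here is an assumption made for illustration: the small hand-written `index` maps each word to a dictionary of document id to occurrence count, `search` is a hypothetical function name, and the relevance score (the sum of occurrence counts) is a deliberately crude stand-in for whatever ranking a real engine uses. The grouping step is left out.

```python
def search(index, query):
    """Return ids of documents containing every query word, best match first."""
    words = query.lower().split()
    if not words:
        return []

    # Look up each query word; a word missing from the index matches nothing.
    postings = [index.get(word, {}) for word in words]

    # Intersect: keep only documents that appear in every word's list.
    matching = set(postings[0])
    for posting in postings[1:]:
        matching &= set(posting)

    # Assumed scoring rule: total occurrences of the query words in the document.
    scores = {doc: sum(posting[doc] for posting in postings) for doc in matching}

    # Highest score first; ties are broken by document id for stable output.
    return sorted(matching, key=lambda doc: (-scores[doc], doc))

# Hypothetical index: word -> {document id: number of occurrences}.
index = {
    "quick": {"doc1": 1, "doc3": 2},
    "fox": {"doc1": 1, "doc3": 1},
    "brown": {"doc1": 1, "doc2": 1},
}

print(search(index, "quick fox"))
```

Intersecting before scoring keeps the ranking work limited to documents that already contain every query word.
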