Static Inverted Indices

CS267

Chris Pollett

Oct. 10, 2011

Outline

Introduction

Quiz

Which of the following is true?

  1. "Jump" is an example of a stopword for "jumping".
  2. The number of character 5-grams in a document grows linearly with the number of characters in that document.
  3. A segmenter is usually not used when a natural language supports dynamic compound words.
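
As an aid for checking statement 2, here is a minimal sketch of counting character n-grams with a sliding window. The helper name `char_ngrams` is hypothetical, and the sketch assumes every overlapping occurrence is counted (not distinct n-grams) and that no padding is added at word boundaries; other definitions may differ by a constant.

```python
def char_ngrams(text: str, n: int = 5) -> list[str]:
    """Return every run of n consecutive characters in text, in order.

    Assumption: overlapping windows, no boundary padding.
    A text shorter than n characters yields an empty list.
    """
    return [text[i:i + n] for i in range(len(text) - n + 1)]


# Compare document length with the number of 5-grams it contains.
for doc in ["inverted", "static inverted indices"]:
    grams = char_ngrams(doc)
    print(len(doc), len(grams))
# prints:
# 8 4
# 23 19
```

Under these assumptions a document of m characters (m at least 5) produces m - 4 windows, which is the relationship the statement asks about.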

The Dictionary

Dictionary Types

Storing Dictionary Terms

Sort-based versus Hash-based dictionaries

Posting Lists