Posting Lists, Index Construction




CS267

Chris Pollett

Oct 10, 2018

Outline

Posting Lists

Random Accesses of Posting Lists

Prefix Queries

Interleaving Dictionary and Posting Lists

An example of the dictionary interleaving strategy

In-Class Exercise

Suppose we have documents 1, 2, 3. In the whole corpus we have twenty-five terms: A, B, C, D, E, AA, BB, CC, DD, EE, ..., AAAAA, BBBBB, CCCCC, DDDDD, EEEEE. Document i contains a term if:

(len(term) + alphabet_pos(term)) mod 3 = i mod 3.

Here alphabet_pos(term) is 0 if the term uses the letter A, 1 if the term uses B, etc.

Let's assume A and C occur in all documents.
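As a way to check the membership rule by hand, here is a minimal Python sketch that lists which documents contain each term. It assumes that "A and C occur in all documents" refers to the single-letter terms A and C only, and that this assumption overrides the mod-3 rule; the names LETTERS, IN_ALL_DOCS, contains, and build_postings are made up for this illustration.

```python
LETTERS = "ABCDE"
DOCS = [1, 2, 3]
IN_ALL_DOCS = {"A", "C"}  # assumption: only the single-letter terms A and C


def alphabet_pos(term):
    # 0 for terms made of A, 1 for B, ..., 4 for E
    return ord(term[0]) - ord("A")


def contains(doc_id, term):
    # The exercise's rule, with the A/C assumption applied first
    if term in IN_ALL_DOCS:
        return True
    return (len(term) + alphabet_pos(term)) % 3 == doc_id % 3


def build_postings():
    # Terms are each letter repeated 1 to 5 times, sorted as in a dictionary
    terms = sorted(letter * n for letter in LETTERS for n in range(1, 6))
    return {t: [d for d in DOCS if contains(d, t)] for t in terms}


if __name__ == "__main__":
    for term, docs in build_postings().items():
        print(term, docs)
```

Because 1, 2, and 3 have distinct remainders mod 3, every term other than A and C ends up in exactly one document under this reading.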

(1) Suppose we have a per-term index with a synchronization entry every second posting. Using ASCII art, draw what the auxiliary index and posting list look like.

(2) Draw what an interleaved index might look like for the above corpus.

Post your answer to the Oct 10 Thread.
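To make the idea of synchronization entries in part (1) concrete, here is a rough Python sketch. It assumes a sync entry records the first posting of each block together with its offset in the posting list, and it uses a made-up posting list rather than the exercise corpus; build_sync_points and next_posting are hypothetical names.

```python
from bisect import bisect_right


def build_sync_points(postings, every=2):
    # One (first posting of block, offset in list) pair per block of `every` postings
    return [(postings[i], i) for i in range(0, len(postings), every)]


def next_posting(postings, sync_points, target):
    """Return the smallest posting >= target, or None if there is none."""
    keys = [doc for doc, _ in sync_points]
    # Binary search the auxiliary index for the last block starting at or before target
    block = bisect_right(keys, target) - 1
    start = sync_points[block][1] if block >= 0 else 0
    # Short linear scan from the start of that block
    for p in postings[start:]:
        if p >= target:
            return p
    return None


if __name__ == "__main__":
    postings = [2, 5, 9, 14, 20]  # hypothetical posting list
    sync = build_sync_points(postings, every=2)
    print(sync)
    print(next_posting(postings, sync, 10))
```

The binary search touches only the small auxiliary index, and the scan that follows reads at most one block plus the first posting of the next block.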

Dropping the distinction between terms and postings