CS267
Chris Pollett
Oct 10, 2018
Suppose we have documents 1, 2, 3. In the whole corpus we have twenty-five terms A,B,C,D,E,AA,BB,CC,DD,EE,..., AAAAA,BBBBB,CCCCC,DDDDD,EEEEE. Document i contains a term if:
(len(term) + alphabet_pos(term)) mod 3 = i mod 3.
Here alphabet_pos(term) is 0 if the term uses the letter A, 1 if the term uses B, etc.
Let's assume A and C occur in all documents.
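The membership rule above can be checked mechanically. The following is a minimal Python sketch, not part of the original exercise: the names LETTERS, DOCS, ALWAYS_PRESENT, contains, and build_postings are hypothetical, and it assumes that "A and C occur in all documents" refers to the two single-letter terms A and C only, overriding the formula for those two terms.

```python
LETTERS = "ABCDE"
DOCS = (1, 2, 3)

# Assumption: only the one-letter terms "A" and "C" are forced into every document.
ALWAYS_PRESENT = {"A", "C"}


def alphabet_pos(term):
    """Return 0 for a term made of A's, 1 for B's, and so on."""
    return ord(term[0]) - ord("A")


def contains(doc_id, term):
    """Apply the rule (len(term) + alphabet_pos(term)) mod 3 = i mod 3."""
    if term in ALWAYS_PRESENT:
        return True
    return (len(term) + alphabet_pos(term)) % 3 == doc_id % 3


def build_postings():
    """Map each of the twenty-five terms to the sorted list of documents containing it."""
    terms = [letter * n for n in range(1, 6) for letter in LETTERS]
    return {term: [d for d in DOCS if contains(d, term)] for term in terms}


postings = build_postings()
```

Under these assumptions, every term other than A and C lands in exactly one document, since the rule fixes a single residue mod 3.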
(1) Suppose we have a per-term index where we have a synchronization entry every second posting. Using ASCII art, draw what the auxiliary index and posting list look like.
(2) Draw what an interleaved index might look like for the above corpus.
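As background for question (1), the idea of a synchronization entry every second posting can be sketched in Python. This is an illustrative sketch only, using a made-up posting list rather than the exercise's corpus. The function names sync_points and find_block are hypothetical, and it assumes each synchronization entry stores a posting value together with its offset in the posting list.

```python
import bisect


def sync_points(postings, every=2):
    """Record (posting value, offset) for every `every`-th posting, starting at offset 0."""
    return [(postings[i], i) for i in range(0, len(postings), every)]


def find_block(sync, target):
    """Return the offset of the last synchronization entry whose value is <= target.

    Returns None when target is smaller than the first posting.
    """
    values = [value for value, _ in sync]
    k = bisect.bisect_right(values, target) - 1
    return sync[k][1] if k >= 0 else None


example_postings = [3, 7, 12, 20, 31]   # hypothetical posting list
example_sync = sync_points(example_postings)
```

A lookup first searches the small list of synchronization entries, then scans at most `every` postings starting from the returned offset.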
Post your answer to the Oct 10 Thread.