CS267
Chris Pollett
Oct 31, 2018
Consider the posting list [89, 101, 112, 122, 130, 145]. Suppose the corpus size is 200 documents. (a) Compute the `\Delta`-list for this list. Then encode the `\Delta`-list using (b) a `\gamma`-code and (c) a Golomb code, using the formula from today's lecture to determine the best modulus.
Post your answer to the above to the Oct 31 In-Class Exercise Thread.
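As a way to check parts (a) and (b), here is a minimal Python sketch. It rests on two assumptions that are not stated in the exercise: the first `\Delta`-value is the first docid itself (a gap from 0), and the `\gamma`-code of k is written as floor(log2 k) zeros followed by the binary representation of k. If the lecture uses a different bit convention, the codewords will differ accordingly. The function names `delta_list` and `gamma_encode` are illustrative only.

```python
def delta_list(postings):
    """Return the gaps between consecutive postings.

    Assumption: the first entry is kept as-is, i.e. treated as a gap from 0.
    """
    gaps = []
    previous = 0
    for posting in postings:
        gaps.append(posting - previous)
        previous = posting
    return gaps


def gamma_encode(k):
    """Return the gamma code of a positive integer k as a bit string.

    Assumed convention: (number of binary digits of k, minus 1) zeros,
    followed by the binary representation of k.
    """
    if k < 1:
        raise ValueError("gamma codes are only defined for k >= 1")
    binary = format(k, "b")
    return "0" * (len(binary) - 1) + binary


postings = [89, 101, 112, 122, 130, 145]
gaps = delta_list(postings)
codes = [gamma_encode(gap) for gap in gaps]
print(gaps)
print(" ".join(codes))
```

Each gap is encoded independently, so the encoded posting list is the concatenation of the codewords.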
buildIndex_ImmediateMerge(M) {
    I[mem] := empty index  // init in-memory index
    currentPosting := 1;
    while (there are more tokens to index) {
        T := next token;
        I[mem].addPosting(T, currentPosting);
        currentPosting++;
        if (I[mem] contains more than M-1 tokens) {
            if (I[disk] exists) {
                I[disk] := mergeIndices(I[mem], I[disk]);
            } else {
                I[disk] := I[mem];
            }
            I[mem] := empty index
        }
    }
    return;
}
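The pseudocode above can be sketched in Python roughly as follows. This is an illustration under simplifying assumptions, not the lecture's implementation: both indices are plain dictionaries mapping a term to its list of positions, the "on-disk" index is simulated in memory, and the names `merge_indices` and `build_index_immediate_merge` are hypothetical. Unlike the pseudocode, the sketch returns both indices so that any tokens still buffered in memory at the end remain visible.

```python
from collections import defaultdict


def merge_indices(mem, disk):
    """Merge the in-memory index into a copy of the (simulated) on-disk index."""
    merged = {term: list(postings) for term, postings in disk.items()}
    for term, postings in mem.items():
        # In-memory postings are always newer, so appending keeps lists sorted.
        merged.setdefault(term, []).extend(postings)
    return merged


def build_index_immediate_merge(tokens, m):
    """Index tokens, merging into the on-disk index after every m tokens."""
    mem = defaultdict(list)   # in-memory index
    disk = None               # simulated on-disk index; None until first flush
    buffered = 0              # number of tokens currently held in memory
    for position, token in enumerate(tokens, start=1):
        mem[token].append(position)
        buffered += 1
        if buffered > m - 1:
            if disk is not None:
                disk = merge_indices(mem, disk)
            else:
                disk = dict(mem)
            mem = defaultdict(list)
            buffered = 0
    return dict(mem), disk


tokens = "a b a c b a".split()
mem_index, disk_index = build_index_immediate_merge(tokens, 2)
print(disk_index)
```

Because every flush re-reads and rewrites the whole on-disk index, the total merge work grows with both the collection size and the number of flushes, which is the cost this strategy trades for keeping a single on-disk index at all times.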