CS267
Chris Pollett
Oct 31, 2018
Consider the posting list [89, 101, 112, 122, 130, 145]. Suppose the corpus size is 200 documents. (a) Compute the `\Delta`-list for this list. Then encode the `\Delta`-list using (b) a `\gamma`-code and (c) a Golomb code, using the formula from today to determine the best modulus.
Post your answer to the above to the Oct 31 In-Class Exercise Thread.
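Parts (a) and (b) can be checked mechanically. The following is a minimal Python sketch, not a reference solution. It assumes the `\gamma`-code convention in which the selector is the length of the binary representation written in unary as zeros followed by a 1, so that the codeword is (length - 1) zeros followed by the binary representation of the number. Other texts use the complementary convention (ones followed by a 0). The helper names `delta_list` and `gamma_code` are hypothetical.

```python
def delta_list(postings):
    """Return the gap (Delta) list: first posting, then successive differences."""
    return [postings[0]] + [b - a for a, b in zip(postings, postings[1:])]


def gamma_code(k):
    """Gamma-encode a positive integer k as a string of '0'/'1' characters.

    Assumed convention: (len - 1) zeros, then the binary form of k,
    where len is the number of bits in the binary form of k.
    """
    if k < 1:
        raise ValueError("gamma codes are defined for positive integers only")
    body = bin(k)[2:]                      # binary representation, no '0b' prefix
    return "0" * (len(body) - 1) + body    # unary length prefix, then the bits


if __name__ == "__main__":
    gaps = delta_list([89, 101, 112, 122, 130, 145])
    print(gaps)
    print(" ".join(gamma_code(g) for g in gaps))
```

Under this convention a gap of 12 (binary 1100) is encoded as 0001100.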
buildIndex_ImmediateMerge(M)
{
    I[mem] := empty index; // init in-memory index
    currentPosting := 1;
    while (there are more tokens to index) {
        T := next token;
        I[mem].addPosting(T, currentPosting);
        currentPosting++;
        if (I[mem] contains more than M-1 tokens) {
            if (I[disk] exists) {
                I[disk] := mergeIndices(I[mem], I[disk]);
            } else {
                I[disk] := I[mem];
            }
            I[mem] := empty index; // reset the in-memory index
        }
    }
    return;
}
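The pseudocode above can be tried out with a small Python sketch. This is an illustration under stated assumptions rather than a real implementation: the "on-disk" index is simulated by an ordinary dictionary, each index maps a term to its list of positions, and the function names `merge_indices` and `build_index_immediate_merge` are hypothetical. Unlike the pseudocode, the sketch also returns whatever is still buffered in memory when the tokens run out, so the leftover postings are visible.

```python
from collections import defaultdict


def merge_indices(mem, disk):
    """Merge two term -> postings maps into a new map.

    Every posting in `disk` was added before every posting in `mem`,
    so concatenating disk postings then mem postings keeps lists sorted.
    """
    merged = {}
    for term in sorted(set(mem) | set(disk)):
        merged[term] = disk.get(term, []) + mem.get(term, [])
    return merged


def build_index_immediate_merge(tokens, M):
    """Index `tokens`, merging into the (simulated) disk index every M tokens."""
    mem = defaultdict(list)   # in-memory index
    disk = None               # simulated on-disk index; None means "does not exist"
    buffered = 0              # number of postings currently held in memory
    current_posting = 1
    for t in tokens:
        mem[t].append(current_posting)
        current_posting += 1
        buffered += 1
        if buffered > M - 1:  # memory limit reached
            if disk is not None:
                disk = merge_indices(mem, disk)
            else:
                disk = dict(mem)
            mem = defaultdict(list)
            buffered = 0
    return disk, dict(mem)


if __name__ == "__main__":
    disk, leftover = build_index_immediate_merge("a b a c b a d".split(), 3)
    print(disk)
    print(leftover)
```

Note that each merge rewrites the whole simulated disk index, so the amount of index data read and written over the run grows with the number of merges. That repeated rewriting is the main cost of this strategy.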