CS267
Chris Pollett
Oct. 17, 2011
Which of the following is true?
buildIndex_sortBased(inputTokenizer)
{
    position := 0;
    while (inputTokenizer.hasNext()) {
        T := inputTokenizer.getNext();
        obtain dictionary entry for T, create new entry if necessary;
        termID := unique termID of T;
        write record R[position] := (termID, position) to disk;
        position++;
    }
    tokenCount := position;
    sort R[0], ..., R[tokenCount-1] by first component; break ties with second component;
    perform a sequential scan of R[0], ..., R[tokenCount-1] creating the final index;
    return;
}
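
The pseudocode above can be sketched in Python as follows. This is a minimal in-memory illustration, not the on-disk procedure from the slide: the list `records` stands in for the records R[0], ..., R[tokenCount-1] that the pseudocode writes to disk, and a real implementation would presumably use an external (disk-based) sort when the records do not fit in memory. The names `build_index_sort_based`, `dictionary`, and `records` are assumptions chosen for this example.

```python
def build_index_sort_based(tokens):
    """Sketch of sort-based index construction, kept entirely in memory.

    `tokens` is any iterable of term strings (a stand-in for inputTokenizer).
    Returns a dict mapping each term to its sorted list of positions.
    """
    dictionary = {}  # term -> termID, assigned in order of first appearance
    records = []     # assumption: replaces the on-disk records R[...]

    for position, term in enumerate(tokens):
        # Obtain the dictionary entry for the term, creating it if necessary.
        term_id = dictionary.setdefault(term, len(dictionary))
        records.append((term_id, position))

    # Tuples compare component by component, so this sorts by termID
    # and breaks ties by position.
    records.sort()

    # Sequential scan: records with the same termID are now adjacent,
    # so each postings list is filled in increasing position order.
    id_to_term = {term_id: term for term, term_id in dictionary.items()}
    index = {}
    for term_id, position in records:
        index.setdefault(id_to_term[term_id], []).append(position)
    return index


if __name__ == "__main__":
    print(build_index_sort_based("to be or not to be".split()))
```

Because the sort places all records for one term next to each other, the final scan only ever appends to the postings list it is currently building, which is what lets the on-disk version create the index in a single sequential pass.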
