CS267
Chris Pollett
Feb 24, 2021
rankProximity(t[1],.., t[n], k)
// t[] term vector
// k number of results to return
{
    u := -infty;
    [u,v] := nextCover(t[1],.., t[n], u);
    d := docid(u);
    score := 0;
    j := 0;
    while (u < infty) do
        if (d < docid(u)) then
            // if docid changes record info about last docid
            j := j + 1;
            Result[j].docid := d;
            Result[j].score := score;
            d := docid(u);
            score := 0;
        score := score + 1/(v - u + 1);
        [u, v] := nextCover(t[1],.., t[n], u);
    if (d < infty) then
        // record last score if not recorded
        j := j + 1;
        Result[j].docid := d;
        Result[j].score := score;
    sort Result[1..j] by score;
    return Result[1..k];
}
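The pseudocode above can be tried out with a minimal Python sketch. Several things here are assumptions for illustration and not part of the slides: positions are modelled as `(docid, offset)` tuples, the index is a plain dict from term to a sorted list of such tuples, `next_pos` and `prev_at_or_before` use binary search from the standard `bisect` module (not galloping search), and `next_cover` follows the usual textbook definition of a cover restricted to a single document.

```python
import math
from bisect import bisect_right

INF = (math.inf, math.inf)        # position "after the end" of the collection
NEG_INF = (-math.inf, -math.inf)  # position "before the start"

def next_pos(postings, pos):
    """First position strictly after pos, or INF if there is none."""
    i = bisect_right(postings, pos)
    return postings[i] if i < len(postings) else INF

def prev_at_or_before(postings, pos):
    """Last position at or before pos (plays the role of prev(t, pos + 1))."""
    i = bisect_right(postings, pos)
    return postings[i - 1] if i > 0 else NEG_INF

def next_cover(index, terms, pos):
    """Next minimal interval [u, v] after pos containing every term in one document."""
    while True:
        v = max(next_pos(index.get(t, []), pos) for t in terms)
        if v == INF:
            return INF, INF
        u = min(prev_at_or_before(index.get(t, []), v) for t in terms)
        if u[0] == v[0]:   # both ends lie in the same document
            return u, v
        pos = u            # cover straddles a document boundary: keep looking

def rank_proximity(index, terms, k):
    """Score each document by the sum of 1/length over its covers; return top k."""
    u, v = next_cover(index, terms, NEG_INF)
    d = u[0]
    score = 0.0
    results = []
    while u < INF:
        if d < u[0]:
            # docid changed: record the score of the previous document
            results.append((d, score))
            d = u[0]
            score = 0.0
        score += 1.0 / (v[1] - u[1] + 1)
        u, v = next_cover(index, terms, u)
    if d < math.inf:
        # record the last document's score
        results.append((d, score))
    results.sort(key=lambda r: r[1], reverse=True)  # highest score first
    return results[:k]

# Hypothetical two-document index: doc 1 = "a b x a b", doc 2 = "a x x b".
example_index = {
    "a": [(1, 1), (1, 4), (2, 1)],
    "b": [(1, 2), (1, 5), (2, 4)],
}
ranked = rank_proximity(example_index, ["a", "b"], 2)
```

In this toy index, document 1 has three covers of lengths 2, 3 and 2, and document 2 has one cover of length 4, so document 1 should rank first.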
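Using an analysis similar to that used for galloping search in the book, one can show that this algorithm runs in time `O(n^2 l cdot log(L/l))`, where `n` is the number of query terms, `l` is the length of the shortest postings list, and `L` is the length of the longest.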
docRight(A AND B, u) := max(docRight(A, u), docRight(B, u))
docLeft(A AND B, v) := min(docLeft(A, v), docLeft(B, v))
docRight(A OR B, u) := min(docRight(A, u), docRight(B, u))
docLeft(A OR B, v) := max(docLeft(A, v), docLeft(B, v))
nextSolution(Q, position)
{
    v := docRight(Q, position);
    if (v == infty) then
        return infty;
    u := docLeft(Q, v+1);
    if (u == v) then
        return u;
    else
        return nextSolution(Q, v);
}
u := -infty;
while (u < infty) do
    u := nextSolution(Q, u);
    if (u < infty) then
        report docid(u);
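As a rough illustration of how the docRight/docLeft rules and nextSolution fit together, here is a small Python sketch. The representation is an assumption made for this example only: a query is either a term string or a tuple `("AND", A, B)` / `("OR", A, B)`, the index maps each term to a sorted list of integer docids, and for a single term docRight and docLeft are taken to be `next_doc` and `prev_doc`, implemented here with plain binary search. The recursion in nextSolution is written as a loop, which behaves the same way.

```python
import math
from bisect import bisect_left, bisect_right

def next_doc(postings, u):
    """Smallest docid strictly greater than u, or infinity."""
    i = bisect_right(postings, u)
    return postings[i] if i < len(postings) else math.inf

def prev_doc(postings, v):
    """Largest docid strictly less than v, or -infinity."""
    i = bisect_left(postings, v)
    return postings[i - 1] if i > 0 else -math.inf

def doc_right(index, q, u):
    if isinstance(q, str):                      # base case: a single term
        return next_doc(index.get(q, []), u)
    op, a, b = q
    ra, rb = doc_right(index, a, u), doc_right(index, b, u)
    if op == "AND":
        return max(ra, rb)
    if op == "OR":
        return min(ra, rb)
    raise ValueError("unknown operator: " + op)

def doc_left(index, q, v):
    if isinstance(q, str):                      # base case: a single term
        return prev_doc(index.get(q, []), v)
    op, a, b = q
    la, lb = doc_left(index, a, v), doc_left(index, b, v)
    if op == "AND":
        return min(la, lb)
    if op == "OR":
        return max(la, lb)
    raise ValueError("unknown operator: " + op)

def next_solution(index, q, position):
    """First docid after position that satisfies q, or infinity."""
    while True:
        v = doc_right(index, q, position)
        if v == math.inf:
            return math.inf
        u = doc_left(index, q, v + 1)
        if u == v:
            return u
        position = v                            # [u, v] spans several docs: retry from v

def all_solutions(index, q):
    """Driver loop: collect every docid that satisfies q."""
    found = []
    u = -math.inf
    while u < math.inf:
        u = next_solution(index, q, u)
        if u < math.inf:
            found.append(u)
    return found

# Hypothetical postings lists of docids.
example_index = {"a": [1, 3, 5], "b": [2, 3, 6], "c": [2, 5, 7]}
matches = all_solutions(example_index, ("AND", ("OR", "a", "b"), "c"))
```

With these toy postings, `a OR b` matches documents 1, 2, 3, 5 and 6, so intersecting with `c` should leave documents 2 and 5.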
If we implement nextDoc and prevDoc with galloping search, the complexity of this algorithm is `O(n cdot l cdot log(L/l))`.
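For readers who have not seen it, galloping search can be sketched as follows. This is a minimal version assuming a sorted Python list of integer docids and a cached index remembered between calls; the class name `PostingsList` and the method name `next_doc` are hypothetical, and the final binary search step is delegated to the standard `bisect_right`.

```python
import math
from bisect import bisect_right

class PostingsList:
    """Sorted docids with a galloping next_doc that caches its last answer."""

    def __init__(self, postings):
        self.postings = postings
        self.cache = 0                     # index of the most recent answer

    def next_doc(self, current):
        P, n = self.postings, len(self.postings)
        if n == 0 or P[-1] <= current:
            return math.inf                # nothing lies after current
        if P[0] > current:
            self.cache = 0
            return P[0]
        # Resume just before the cached answer if that is still valid,
        # otherwise restart from the front of the list.
        if self.cache > 0 and P[self.cache - 1] <= current:
            low = self.cache - 1
        else:
            low = 0
        # Gallop: double the step until we overshoot current or the list end.
        jump = 1
        high = low + jump
        while high < n and P[high] <= current:
            low = high
            jump *= 2
            high = low + jump
        if high >= n:
            high = n - 1
        # Now P[low] <= current < P[high]; binary search inside (low, high].
        self.cache = bisect_right(P, current, low + 1, high + 1)
        return P[self.cache]

plist = PostingsList([2, 5, 7, 11, 13, 17, 19, 23])
first_after_12 = plist.next_doc(12)
```

Because the galloping phase only examines a range whose size is proportional to the distance moved since the previous call, a sequence of calls with increasing arguments tends to be much cheaper on a long list than repeating a full binary search each time.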