CS267
Chris Pollett
Feb 24, 2021
rankProximity(t[1],.., t[n], k)
// t[] term vector
// k number of results to return
{
    u := - infty;
    [u,v] := nextCover(t[1],.., t[n], u);
    d := docid(u);
    score := 0;
    j := 0;
    while( u < infty) do
        if(d < docid(u) ) then
            // if docid changes record info about last docid
            j := j + 1;
            Result[j].docid := d;
            Result[j].score := score;
            d := docid(u);
            score := 0;
        score := score + 1/(v - u + 1);
        [u, v] := nextCover(t[1],.., t[n], u);
    if(d < infty) then
        // record last score if not recorded
        j := j + 1;
        Result[j].docid := d;
        Result[j].score := score;
    sort Result[1..j] by score;
    return Result[1..k];
}
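Using an analysis similar to that used for galloping search in the book, you can prove this algorithm has running time
`O(n^2 l cdot log(L/l))`,
where `n` is the number of query terms and `l` and `L` are the lengths of the shortest and longest postings lists among those terms.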
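The rankProximity pseudocode can be tried out with a small Python sketch. It is an illustration only. The index layout (a dict `index` mapping each term to a sorted list of global positions, plus a `doc_starts` list holding the first position of each document), the helper names `next_pos`, `prev_pos`, `docid`, and `next_cover`, and the use of `bisect` binary search in place of galloping search are all assumptions made for this example. `next_cover` is one plausible implementation of the nextCover routine that the pseudocode relies on.

```python
import bisect
import math

INF = math.inf

def next_pos(postings, current):
    """Smallest position in postings strictly greater than current, else INF."""
    i = bisect.bisect_right(postings, current)
    return postings[i] if i < len(postings) else INF

def prev_pos(postings, current):
    """Largest position in postings strictly less than current, else -INF."""
    i = bisect.bisect_left(postings, current)
    return postings[i - 1] if i > 0 else -INF

def docid(doc_starts, pos):
    """1-based id of the document containing global position pos (assumed layout)."""
    if pos == INF:
        return INF
    return bisect.bisect_right(doc_starts, pos)

def next_cover(index, doc_starts, terms, position):
    """Next interval [u, v] after position that contains every term within one document."""
    while True:
        v = max(next_pos(index[t], position) for t in terms)
        if v == INF:
            return INF, INF
        u = min(prev_pos(index[t], v + 1) for t in terms)
        if docid(doc_starts, u) == docid(doc_starts, v):
            return u, v
        position = u  # cover straddles a document boundary, keep searching

def rank_proximity(index, doc_starts, terms, k):
    """Score each document by summing 1/(cover length) over its covers; return the top k."""
    u, v = next_cover(index, doc_starts, terms, -INF)
    d = docid(doc_starts, u)
    score = 0.0
    results = []
    while u < INF:
        if d < docid(doc_starts, u):
            # docid changed: record the score of the previous document
            results.append((d, score))
            d = docid(doc_starts, u)
            score = 0.0
        score += 1.0 / (v - u + 1)
        u, v = next_cover(index, doc_starts, terms, u)
    if d < INF:
        results.append((d, score))  # record the last document
    results.sort(key=lambda r: r[1], reverse=True)  # highest score first
    return results[:k]

# Toy collection: doc 1 = "a b c a b" (positions 0..4), doc 2 = "a c c b" (positions 5..8)
index = {"a": [0, 3, 5], "b": [1, 4, 8]}
doc_starts = [0, 5]
print(rank_proximity(index, doc_starts, ["a", "b"], 2))
```

On this toy collection, document 1 has covers [0,1], [1,3], and [3,4], so it scores 1/2 + 1/3 + 1/2, while document 2 has the single cover [5,8] and scores 1/4.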
docRight(A AND B, u) := max(docRight(A, u), docRight(B,u))
docLeft(A AND B, v) := min(docLeft(A,v), docLeft(B,v))
docRight(A OR B, u) := min(docRight(A, u), docRight(B,u))
docLeft(A OR B, v) := max(docLeft(A,v), docLeft(B,v))
nextSolution(Q, position)
{
    v := docRight(Q, position);
    if v = infty then
        return infty;
    u := docLeft(Q, v+1);
    if(u == v) then
        return u;
    else
        return nextSolution(Q, v);
}
u := -infty
while u < infty do
    u := nextSolution(Q, u);
    if(u < infty) then
        report docid(u);
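As a rough illustration of how docRight, docLeft, and nextSolution fit together, here is a minimal Python sketch. Several things are assumptions for the example rather than part of the slides: queries are nested tuples such as `("AND", "a", ("OR", "b", "c"))`, `index` maps each term to a sorted list of docids, the base case for a single term uses `next_doc` and `prev_doc`, those two helpers use `bisect` binary search rather than galloping search, and the recursion in nextSolution is written as a loop.

```python
import bisect
import math

INF = math.inf

def next_doc(postings, current):
    """Smallest docid in postings strictly greater than current, else INF."""
    i = bisect.bisect_right(postings, current)
    return postings[i] if i < len(postings) else INF

def prev_doc(postings, current):
    """Largest docid in postings strictly less than current, else -INF."""
    i = bisect.bisect_left(postings, current)
    return postings[i - 1] if i > 0 else -INF

def doc_right(index, q, u):
    """End of the first candidate solution to q that starts after u."""
    if isinstance(q, str):  # assumed base case: a single term
        return next_doc(index.get(q, []), u)
    op, a, b = q
    ra, rb = doc_right(index, a, u), doc_right(index, b, u)
    return max(ra, rb) if op == "AND" else min(ra, rb)

def doc_left(index, q, v):
    """Start of the last candidate solution to q that ends before v."""
    if isinstance(q, str):
        return prev_doc(index.get(q, []), v)
    op, a, b = q
    la, lb = doc_left(index, a, v), doc_left(index, b, v)
    return min(la, lb) if op == "AND" else max(la, lb)

def next_solution(index, q, position):
    """Next docid after position that satisfies q, or INF if there is none."""
    while True:
        v = doc_right(index, q, position)
        if v == INF:
            return INF
        u = doc_left(index, q, v + 1)
        if u == v:
            return u
        position = v  # candidate spans several documents, so keep going

def all_solutions(index, q):
    """Driver loop: collect every docid that satisfies q."""
    found = []
    u = -INF
    while u < INF:
        u = next_solution(index, q, u)
        if u < INF:
            found.append(u)
    return found

index = {"a": [1, 3, 5], "b": [2, 3, 6], "c": [5]}
print(all_solutions(index, ("AND", "a", ("OR", "b", "c"))))  # prints [3, 5]
```

In this toy index, document 3 contains both a and b and document 5 contains both a and c, so those are the only two documents that match.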
If we implement nextDoc and prevDoc with galloping search, the complexity of this algorithm is `O(n cdot l cdot log(L/l))`.
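For reference, a galloping-search version of nextDoc might look like the following sketch. It is one common formulation, not necessarily the exact one intended here: start from a cached index left by the previous call, double the jump size until the target is bracketed, then binary search inside the bracket. The class name `PostingsList` and its `cache` field are hypothetical names chosen for the example.

```python
import bisect
import math

INF = math.inf

class PostingsList:
    """Sorted docids for one term, with a cached index to speed up forward scans."""

    def __init__(self, docids):
        self.docids = docids
        self.cache = 0  # index of the docid returned by the previous call

    def next_doc(self, current):
        """Smallest docid strictly greater than current, else INF."""
        P = self.docids
        n = len(P)
        if n == 0 or P[-1] <= current:
            return INF
        if P[0] > current:
            self.cache = 0
            return P[0]
        # Here P[0] <= current < P[n-1]. Resume from the cache if it is still behind the answer.
        if self.cache > 0 and P[self.cache - 1] <= current:
            low = self.cache - 1
        else:
            low = 0
        jump = 1
        high = low + jump
        # Gallop: double the step until P[high] > current or the end of the list is reached.
        while high < n - 1 and P[high] <= current:
            low = high
            jump *= 2
            high = low + jump
        if high > n - 1:
            high = n - 1
        # Invariant: P[low] <= current < P[high]; finish with a binary search in that range.
        self.cache = bisect.bisect_right(P, current, low, high + 1)
        return P[self.cache]

pl = PostingsList([1, 3, 5, 8, 13, 21, 34])
print(pl.next_doc(4), pl.next_doc(20), pl.next_doc(34))  # prints 5 21 inf
```

When successive calls move forward through the list, the galloping phase starts near the previous answer, which is where the `log(L/l)` factor in the bound comes from. prevDoc can be written symmetrically.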