Concordance Lists, trec_eval




CS267

Chris Pollett

Oct. 22, 2012

Outline

Introduction

Concordance Lists

Quiz

Which of the following is true?

  1. `TF_(BM25)(t,d)` is upper bounded by `b + 1`.
  2. The MaxScore heuristic allows one to remove terms from the term heap when doing query processing with heaps.
  3. Accumulator pruning was used in our document-at-a-time query processing algorithms.

Properties of GC-lists

Operators

Examples

Implementation

More Implementation

  • The book defines four operations on a GC-list `S`: `tau(S, k)`, `rho(S, k)`, `tau'(S, k)`, and `rho'(S, k)`.
  • `tau(S, k)` returns the first interval in `S` starting at or after the position `k`; `tau'(S, k)` returns the last interval in `S` ending at or before the position `k`.
  • `rho(S, k)` returns the first interval in `S` ending at or after the position `k`; `rho'(S, k)` returns the last interval in `S` starting at or before the position `k`.
  • Using these four operations, the book shows how to define each of our binary operators.
  • The book's definition of the four binary operators
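The four access operations can be sketched in Python as follows. This is only an illustration, not the book's code: the class name `GCList`, the method names, and the choice to return `(inf, inf)` or `(-inf, -inf)` when no interval qualifies are assumptions made for this example. It relies on the GC-list property that no interval contains another, so that sorting by start position also sorts by end position and binary search works on either.

```python
from bisect import bisect_left, bisect_right

INF = float("inf")


class GCList:
    """Hypothetical in-memory GC-list: intervals (u, v) with no nesting."""

    def __init__(self, intervals):
        self.intervals = sorted(intervals)
        self.starts = [u for u, _ in self.intervals]
        self.ends = [v for _, v in self.intervals]
        # With no nested intervals, end positions must increase with starts.
        if any(a >= b for a, b in zip(self.ends, self.ends[1:])):
            raise ValueError("intervals nest, so this is not a GC-list")

    def tau(self, k):
        """First interval starting at or after k."""
        i = bisect_left(self.starts, k)
        return self.intervals[i] if i < len(self.intervals) else (INF, INF)

    def rho(self, k):
        """First interval ending at or after k."""
        i = bisect_left(self.ends, k)
        return self.intervals[i] if i < len(self.intervals) else (INF, INF)

    def tau_prime(self, k):
        """Last interval ending at or before k."""
        i = bisect_right(self.ends, k) - 1
        return self.intervals[i] if i >= 0 else (-INF, -INF)

    def rho_prime(self, k):
        """Last interval starting at or before k."""
        i = bisect_right(self.starts, k) - 1
        return self.intervals[i] if i >= 0 else (-INF, -INF)


S = GCList([(1, 3), (2, 5), (6, 9)])
print(S.tau(2))        # (2, 5)
print(S.rho(4))        # (2, 5)
print(S.tau_prime(5))  # (2, 5)
print(S.rho_prime(5))  # (2, 5)
```

Here `S.tau(k)` plays the role of `tau(S, k)`, and similarly for the other three. Each call is a single binary search, so it costs O(log n) on a list of n intervals.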

trec_eval

trec_eval command-line arguments

trec_rel_file and trec_top_file

Example trec_rel_file

    351 0 FR940104-0-00001 0
    351 0 FR940104-0-00002 1
    351 0 FR940104-0-00003 1
    351 0 FR940104-0-00004 1
    351 0 FR940104-0-00005 1
    351 0 FR940104-0-00006 1
    351 0 FR940104-0-00007 1
    351 0 FR940104-0-00008 1
    351 0 FR940104-0-00009 1
    351 0 FR940104-0-00010 1
    351 0 FR940104-0-00011 0
    351 0 FR940104-0-00012 1
    

Example trec_top_file

    351   0  FR940104-0-00001  1   102.38   run-nam2
    351   0  FR940104-0-00002  1   101.38   run-nam2
    351   0  FR940104-0-00003  1   91.38   run-nam2
    351   0  FR940104-0-00004  1   81.38   run-nam2
    351   0  FR940104-0-00005  1   71.38   run-nam2
    351   0  FR940104-0-00006  1   61.38   run-nam2
    351   0  FR940104-0-00007  1   51.38   run-nam2
    351   0  FR940104-0-00008  1   41.38   run-nam2
    351   0  FR940104-0-00009  1   31.38   run-nam2
    351   0  FR940104-0-00010  1   22.38   run-nam2
    351   0  FR940104-0-00011  1   21.38   run-nam2
    351   0  FR940104-0-00012  1   11.38   run-nam2
    

Example trec_eval Output

    runid                 	all	run-nam2
    num_q                 	all	1
    num_ret               	all	12
    num_rel               	all	10
    num_rel_ret           	all	10
    map                   	all	0.7904
    gm_map                	all	0.7904
    Rprec                 	all	0.9000
    bpref                 	all	0.4500
    recip_rank            	all	0.5000
    iprec_at_recall_0.00  	all	0.9000
    iprec_at_recall_0.10  	all	0.9000
    iprec_at_recall_0.20  	all	0.9000
    iprec_at_recall_0.30  	all	0.9000
    iprec_at_recall_0.40  	all	0.9000
    iprec_at_recall_0.50  	all	0.9000
    iprec_at_recall_0.60  	all	0.9000
    iprec_at_recall_0.70  	all	0.9000
    iprec_at_recall_0.80  	all	0.9000
    iprec_at_recall_0.90  	all	0.9000
    iprec_at_recall_1.00  	all	0.8333
    P_5                   	all	0.8000
    P_10                  	all	0.9000
    P_15                  	all	0.6667
    P_20                  	all	0.5000
    P_30                  	all	0.3333
    P_100                 	all	0.1000
    P_200                 	all	0.0500
    P_500                 	all	0.0200
    P_1000                	all	0.0100
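To see where some of these numbers come from, the following Python sketch recomputes `map`, `recip_rank`, and a few `P_k` values from the two example files. It is not trec_eval's code: the function names are made up for this example, it handles only a single topic, and it assumes that results are ordered by the score column (descending) rather than by the rank column.

```python
def parse_qrels(text):
    """Read 'topic iter docno rel' lines; return the set of relevant docnos."""
    relevant = set()
    for line in text.splitlines():
        _topic, _iter, docno, rel = line.split()
        if int(rel) > 0:
            relevant.add(docno)
    return relevant


def parse_run(text):
    """Read 'topic iter docno rank sim run_id' lines; return docnos by score."""
    rows = []
    for line in text.splitlines():
        _topic, _iter, docno, _rank, sim, _run_id = line.split()
        rows.append((float(sim), docno))
    rows.sort(key=lambda row: -row[0])  # highest score first
    return [docno for _sim, docno in rows]


def precision_at(ranked, relevant, k):
    """Fraction of the top k that is relevant; missing ranks count as misses."""
    return sum(d in relevant for d in ranked[:k]) / k


def average_precision(ranked, relevant):
    """Mean over relevant documents of the precision at each one's rank."""
    hits, total = 0, 0.0
    for rank, docno in enumerate(ranked, start=1):
        if docno in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant document, or 0 if there is none."""
    for rank, docno in enumerate(ranked, start=1):
        if docno in relevant:
            return 1.0 / rank
    return 0.0


# Rebuild the two example files above, line for line.
rels = [0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1]
sims = [102.38, 101.38, 91.38, 81.38, 71.38, 61.38,
        51.38, 41.38, 31.38, 22.38, 21.38, 11.38]
docnos = [f"FR940104-0-{i:05d}" for i in range(1, 13)]
qrels_text = "\n".join(f"351 0 {d} {r}" for d, r in zip(docnos, rels))
run_text = "\n".join(f"351 0 {d} 1 {s} run-nam2" for d, s in zip(docnos, sims))

relevant = parse_qrels(qrels_text)
ranked = parse_run(run_text)

print(f"map        {average_precision(ranked, relevant):.4f}")  # 0.7904
print(f"recip_rank {reciprocal_rank(ranked, relevant):.4f}")    # 0.5000
print(f"P_5        {precision_at(ranked, relevant, 5):.4f}")    # 0.8000
print(f"P_10       {precision_at(ranked, relevant, 10):.4f}")   # 0.9000
print(f"P_15       {precision_at(ranked, relevant, 15):.4f}")   # 0.6667
```

The top-ranked document is judged non-relevant, so the first relevant document sits at rank 2, which gives `recip_rank` = 0.5. Only 12 documents were retrieved, so `P_15` is 10/15.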