Evaluating Results, Token and Term Processing




CS267

Chris Pollett

Feb 28, 2022

Outline

Measuring Retrieval Effectiveness

Recall and Precision

F-measures

Measures for the first `k` results

Example Precision Recall Plot from the Book

Average Precision

Precision at k and MAP scores for different ranking schemes

Quiz

Which of the following is true?

  1. The proximity ranking algorithm incorporates the number of documents that contain query terms directly in the proximity score of a document.
  2. The IDF score as presented in class depends on the document but not the query term.
  3. If we are using VSM for ranking and retrieval, we can use a frequency index for our inverted index.

Building a Test Collection

Efficiency Measures

Tokens and Terms

Punctuation and Capitalization

Stemming

Finishing up Stemming

Example of stemmed text

Stopping