Language Modeling, Test Collections, Open-Source IR Systems, Inverted Indexes




CS267

Chris Pollett

Aug 29, 2018

Outline

Higher-order Models

On Monday, we were talking about language modeling. Let's briefly recall some of the things we learned...

Example

In-Class Exercise

The phrase "second time" appears 20 times in Shakespeare and "second" appears 128 times.

What is `M_0(mbox("second time"))`? What is `M_1(mbox("time") | mbox("second"))`?

Post your solutions to the Aug. 29 In-Class Exercise Thread.
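As a rough check on the conditional part of the exercise, the counts above can be plugged into a maximum-likelihood estimate. This is a minimal sketch that assumes the first-order model estimates the probability of a term given its predecessor as count(bigram) / count(first term), with no smoothing. The function name `first_order_estimate` is hypothetical and not part of the course materials.

```python
from fractions import Fraction


def first_order_estimate(bigram_count: int, first_term_count: int) -> Fraction:
    """Unsmoothed estimate of P(second term | first term).

    Assumption: the estimate is count(first second) / count(first).
    """
    if first_term_count <= 0:
        raise ValueError("first_term_count must be positive")
    return Fraction(bigram_count, first_term_count)


# Counts taken from the exercise statement: "second time" = 20, "second" = 128.
estimate = first_order_estimate(20, 128)
print(estimate, float(estimate))  # 5/32 0.15625
```

Using `Fraction` keeps the ratio exact, which makes it easy to compare against a hand calculation before converting to a decimal.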

Smoothing

Markov Models

Markov Model for "to be or not to be"

More Markov Models

Test Collections

TREC Tasks

Open-Source IR Systems

Inverted Indexes

[Figure: dictionary and postings lists of an inverted index]

ADT Example: Phrase Search
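One common way to do phrase search over a positional inverted index is sketched below. It assumes the index ADT offers two operations: the next occurrence of a term after a position and the previous occurrence before a position. All names here (`build_index`, `next_pos`, `prev_pos`, `next_phrase`, `all_phrases`) are hypothetical, the in-memory dictionary stands in for a real index, and the slide's own algorithm may differ in its details.

```python
from bisect import bisect_left, bisect_right
from math import inf


def build_index(tokens):
    """Map each term to a sorted list of its 1-based positions."""
    index = {}
    for pos, tok in enumerate(tokens, start=1):
        index.setdefault(tok, []).append(pos)
    return index


def next_pos(index, term, current):
    """First position of term strictly after current, or inf if none."""
    postings = index.get(term, [])
    i = bisect_right(postings, current)
    return postings[i] if i < len(postings) else inf


def prev_pos(index, term, current):
    """Last position of term strictly before current, or -inf if none."""
    postings = index.get(term, [])
    i = bisect_left(postings, current)
    return postings[i - 1] if i > 0 else -inf


def next_phrase(index, terms, position):
    """First (start, end) occurrence of the phrase starting after position."""
    while True:
        v = position
        for t in terms:  # walk forward through the phrase terms
            v = next_pos(index, t, v)
        if v == inf:
            return None  # some term has no further occurrence
        u = v
        for t in reversed(terms[:-1]):  # walk backward to tighten the window
            u = prev_pos(index, t, u)
        if v - u == len(terms) - 1:
            return (u, v)  # the terms are adjacent, so this is a match
        position = u  # otherwise retry from the candidate start


def all_phrases(index, terms):
    """Collect every occurrence of the phrase in the indexed text."""
    results, position = [], 0
    while (hit := next_phrase(index, terms, position)) is not None:
        results.append(hit)
        position = hit[0]
    return results


index = build_index("to be or not to be".split())
print(all_phrases(index, ["to", "be"]))  # [(1, 2), (5, 6)]
```

The forward pass finds the earliest place the phrase could end, and the backward pass finds the latest place it could start before that end. A match is reported only when the window is exactly as long as the phrase.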