Language Modeling, Test Collections, Open-Source IR Systems, Inverted Indexes




CS267

Chris Pollett

Feb 2, 2022

Outline

Higher-order Models

On Monday, we were talking about language modeling. Let's briefly recall some of the things we learned...

Example

Smoothing

In-Class Exercise

Suppose the phrase "before me" appears 4 times in Shakespeare and "before" appears 2048 times.

What is `M_0(mbox("before me"))`? What is `M_1(mbox("me") | mbox("before"))`?

Post your solutions to the Feb 2 In-Class Exercise Thread.
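
To check answers to exercises of this kind, here is a minimal sketch of count-based estimates on a toy text. It assumes `M_0` is the relative frequency of a term in the text and `M_1(b | a)` is the number of times "a b" occurs divided by the number of times "a" occurs; the slides may define or smooth these differently. The function names `train`, `m0`, and `m1` are hypothetical, chosen for illustration.

```python
from collections import Counter


def train(tokens):
    """Collect unigram and bigram counts from a list of tokens."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def m0(term, unigrams):
    """Zero-order estimate (assumed): relative frequency of `term`."""
    total = sum(unigrams.values())
    return unigrams[term] / total


def m1(term, prev, unigrams, bigrams):
    """First-order estimate (assumed): count("prev term") / count("prev")."""
    if unigrams[prev] == 0:
        return 0.0  # unseen context; no smoothing is applied in this sketch
    return bigrams[(prev, term)] / unigrams[prev]


tokens = "to be or not to be".split()
unigrams, bigrams = train(tokens)

print(m0("to", unigrams))                   # "to" is 2 of the 6 tokens
print(m1("be", "to", unigrams, bigrams))    # every "to" is followed by "be"
print(m1("or", "be", unigrams, bigrams))    # one of the two "be" tokens is followed by "or"
```

The same two functions apply to the exercise once the counts for "before" and "before me" are plugged in; no smoothing is performed, so unseen pairs get probability 0.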

Markov Models

Markov Model for "to be or not to be"

More Markov Models

Test Collections

TREC Tasks

Open-Source IR Systems

Inverted Indexes

Figure: diagram of the dictionary and posting lists of an inverted index

ADT Example: Phrase Search
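
As a rough illustration of phrase search over an inverted index, the sketch below maps each term (the dictionary) to the list of positions where it occurs (its posting list) and checks for consecutive positions. This is a naive scan written for clarity, not necessarily the ADT or algorithm presented in lecture; `build_index` and `phrase_positions` are hypothetical names.

```python
from collections import defaultdict


def build_index(tokens):
    """Map each term to the sorted list of positions where it occurs."""
    index = defaultdict(list)
    for pos, term in enumerate(tokens):
        index[term].append(pos)
    return index


def phrase_positions(index, phrase):
    """Return the start positions at which `phrase` occurs in the indexed text."""
    terms = phrase.split()
    if not terms:
        return []
    hits = []
    # Try each occurrence of the first term as a candidate start.
    for start in index.get(terms[0], []):
        # The k-th following term must appear at position start + k.
        if all((start + k) in index.get(t, [])
               for k, t in enumerate(terms[1:], 1)):
            hits.append(start)
    return hits


tokens = "to be or not to be".split()
index = build_index(tokens)

print(index["be"])                          # positions of "be"
print(phrase_positions(index, "to be"))     # start positions of the phrase
print(phrase_positions(index, "not to be"))
```

The membership test on a list is linear in the length of the posting list; a more careful version would exploit the sorted order of the positions, for example with binary search.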