Document Quality Measures

CS267

Chris Pollett

Dec 3, 2018

Outline

Introduction

Traffic Rank

Online Page Importance Computation (OPIC)

Quiz

Which of the following is true?

  1. `P_1` in DFR was estimated using Laplace's law of succession.
  2. A document is said to be elite for the term `t` when it is "about" the topic associated with the term.
  3. Intra-query parallelism is parallelism achieved by having whole queries computed by different machines.

Bonus Quiz

Which of the following is true?

  1. A variation of OPIC is used by Yioop.
  2. Traffic Rank is what's used in Yioop.
  3. DFR is a document quality measure.

Page Rank

The Power Method

Tweaks on the basic matrix

Topic Specific Page Rank

More Algorithms -- HITS

SALSA