CS267 Fall 2019 Lecture Notes

Topics in Database Systems

Videos of lectures are available.

Below are my lecture notes for the class so far. They should serve as a rough guide to what was covered on any given day. However, I frequently say more in class than appears in these notes, and I tend to correct on the board, as I go, typos that may still appear in the notes. So caveat emptor.

Week 1: [Aug 21 - Introduction to Information Retrieval]

Week 2: [Aug 26 - Text Formats, Tokenization, Term Distributions, Language Models] [Aug 28 - Language Modeling, Test Collections, Open-Source IR Systems, Inverted Indexes]

Week 3: [Sep 2 - Labor Day] [Sep 4 - Learning to Crawl]

Week 4: [Sep 9 - PHP] [Sep 11 - Galloping Search; Document-Oriented Indexes]

Week 5: [Sep 16 - Vector Space Model, Proximity Ranking, Boolean Retrieval] [Sep 18 - Evaluating Results, Token and Term Processing]

Week 6: [Sep 23 - More PHP] [Sep 25 - Char-gramming, Language Processing, Static Inverted Indices]

Week 7: [Sep 30 - Dictionaries, Posting Lists, Index Construction] [Oct 2 - Index Construction]

Week 8: [Oct 4 - Practice Midterm (Oct 7 - no class)] [Oct 9 - Midterm]

Week 9: [Oct 14 - Query Processing] [Oct 16 - Accumulator Pruning, Concordance Lists]

Week 10: [Oct 21 - trec_eval, Index Compression] [Oct 23 - Huffman and Arithmetic Coding, Posting List Compression]

Week 11: [Oct 28 - Gap Compression, Dynamic Inverted Indexes] [Oct 30 - Logarithmic Merging, BM25F, PRF]

Week 12: [Nov 4 - Ranking using Language Models] [Nov 6 - Divergence-from-randomness, Parallel Information Retrieval]

Week 13: [Nov 11 - Veterans Day] [Nov 13 - More Parallel Information Retrieval]

Week 14: [Nov 18 - Document Quality Measures] [Nov 20 - MapReduce, PageRank, Hadoop]

Week 15: [Nov 25 - Hadoop, Categorization and Filtering] [Nov 27 - Thanksgiving Holiday]

Week 16: [Dec 2 - More Categorization and Filtering] [Dec 4 - Online Adaptive Spam Filtering]