Chris Pollett>
CS267
CS267 Spring 2022 Practice Final
To study for the final I would suggest you:
- Know how to do (by heart) all the practice problems.
- Go over your notes at least three times. On the second and third passes, try to see how much you can remember from the first.
- Go over the homework problems.
- Try to create your own problems similar to the ones I have given and solve them.
- Skim the relevant sections from the book.
- If you want to study in groups, at this point you are ready to quiz each other.
Here are some facts about the actual final:
- It is comprehensive.
- It is closed book, closed notes. Nothing will be permitted on your desk except your pen (pencil) and test.
- You should bring photo ID.
- There will be more than one version of the test. Each version will be of comparable difficulty.
- It has 10 problems (3 pts each): 6 problems will be on material since the second midterm, and 4 problems will be from the topics of the midterm.
- Two problems will be taken exactly (less typos) from the practice final, and one will be taken from the practice midterm.
The practice final is below:
- Prove there is an upper bound on how much a single term's BM25 score for a single document can contribute to the overall BM25 score for that document with respect to a query.
- Give the context in which one might use accumulator pruning for ranking. Then explain and give an example of how the accumulator pruning algorithm from class works.
- Express the following as region algebra queries: (a) Your first name within 8 terms of your last name. (b) Your zip code inside `a` tags before the phrase "doxing is fun".
- Give a concrete example involving your name of how a Canonical Huffman code might be written as a preamble to encoding a string.
- Briefly describe how each of the following coding schemes for compressing lists works: (a) `gamma`-code, (b) LLRUN, (c) Rice code.
- Give a situation with numbers where it would make sense to rebuild an index rather than remerge.
- Suppose there is a 1,000,000 word corpus. Document `d` is 450 words long. The terms blue and suede occur 800 and 100 times in the corpus, respectively. In document `d`, blue occurs twice and suede once. Let `mu=1000`. What is the LMD score for `d` for the query "blue suede"?
- Distinguish intra-query and inter-query parallelism as far as information retrieval goes. What bottlenecks exist to intra-query parallelism if partition-by-document is used?
- Explain how hub and authority scores are calculated in SALSA.
- Briefly define aggregate P@k and aggregate AP and explain why they might be used.
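As a study aid for the LMD problem above, here is a minimal sketch for checking the arithmetic. It assumes the common rank-equivalent form of language modeling with Dirichlet smoothing, `sum over query terms of log(1 + (f_td/mu)*(l_C/l_t)) - n*log(1 + l_d/mu)`, with each query term occurring once in the query. The function name `lmd_score`, its parameters, and the choice of natural log are illustrative assumptions; use the formula and log base given in lecture if they differ.

```python
import math

def lmd_score(term_stats, doc_len, corpus_len, mu, log=math.log):
    """Sketch of an LMD score (assumed formula, see lead-in).

    term_stats: list of (f_td, l_t) pairs, one per query term, where
      f_td is the term's count in the document and
      l_t is the term's count in the whole corpus.
    """
    n = len(term_stats)  # number of query terms
    total = 0.0
    for f_td, l_t in term_stats:
        total += log(1 + (f_td / mu) * (corpus_len / l_t))
    # Length normalization term, applied once per query term.
    return total - n * log(1 + doc_len / mu)

# Numbers from the practice problem: blue (2 in d, 800 in corpus),
# suede (1 in d, 100 in corpus), |d| = 450, corpus = 1,000,000, mu = 1000.
score = lmd_score([(2, 800), (1, 100)], doc_len=450,
                  corpus_len=1_000_000, mu=1000)
print(round(score, 4))
```

Under these assumptions the per-term arguments are `1 + 2.5` for blue and `1 + 10` for suede, and the normalization is `2*log(1.45)`. Passing `log=math.log2` switches the base.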
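For the `gamma`-code part of the compression problem, the following is a minimal sketch, assuming the standard Elias gamma convention: a positive integer whose binary representation has k bits is written as k-1 zeros followed by those k bits. Some texts write the unary length prefix differently while using the same number of bits, so treat the exact bit layout, and the helper names `gamma_encode` and `gamma_decode`, as assumptions.

```python
def gamma_encode(n: int) -> str:
    """Encode a positive integer as a string of '0'/'1' characters."""
    if n < 1:
        raise ValueError("gamma codes are defined for integers >= 1")
    body = bin(n)[2:]                      # binary digits, leading bit is 1
    return "0" * (len(body) - 1) + body    # length prefix, then the value

def gamma_decode(bits: str) -> list[int]:
    """Decode a concatenation of gamma codes (assumes well-formed input)."""
    values = []
    i = 0
    while i < len(bits):
        zeros = 0
        while bits[i] == "0":              # count the prefix zeros
            zeros += 1
            i += 1
        # The next zeros + 1 bits are the binary representation.
        values.append(int(bits[i:i + zeros + 1], 2))
        i += zeros + 1
    return values

gaps = [3, 1, 9]                           # e.g. gaps between docids
encoded = "".join(gamma_encode(g) for g in gaps)
print(encoded, gamma_decode(encoded))
```

Small gaps get short codes (a gap of 1 costs a single bit), which is why this kind of code is applied to the gaps of a postings list rather than to raw docids.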