Chris Pollett>
Old Classses >
CS267
( Print View )
Student Corner:
[Submit Sec1]
[Grades Sec1]
[Lecture Notes]
[Discussion Board]
Course Info:
[Texts & Links]
[Description]
[Course Outcomes]
[Outcomes Matrix]
[Course Schedule]
[Grading]
[Requirements/HW/Quizzes]
[Class Protocols]
[Exam Info]
[Regrades]
[University Policies]
[Announcements]
HW Assignments:
[Hw1]
[Hw2]
[Hw3]
[Hw4]
[Hw5]
[Quizzes]
Practice Exams:
[Midterm]
[Final]
|
CS267 Fall 2019Practice Final
To study for the final I would suggest you:
- Know how to do (by heart) all the practice problems.
- Go over your notes at least three times. Second and third time try to see how much you can remember from the first time.
- Go over the homework problems.
- Try to create your own problems similar to the ones I have given and solve them.
- Skim the relevant sections from the book.
- If you want to study in groups, at this point you are ready to quiz each other.
Here are some facts about the actual final:
- It is comprehensive.
- It is closed book, closed notes. Nothing will be permitted on your desk except your pen (pencil) and test.
- You should bring photo ID.
- There will be more than one version of the test. Each version will be of comparable difficulty.
- It is 10 problems (2pts each), 6 problems will be on materials since the second midterm, 4 problems will be from the topics of the midterm.
- Two problems will be exactly (less typos) off of the practice final, and one will be off of the practice midterm.
The practice final is below:
- Prove there is an upper bound that a single term's BM25 score for a single document can contributed to the overall BM25 score for that document with respect to a query.
- Give the context in which one might use accumulator pruning for ranking. Then explain and give an example of how the accumulator pruning algorithm from class works.
- What file does the trec_eval software take as input? Briefly give the format of each. What kind of output does it produce?
- Suppose 'a' occurs with probability 0.65 and 'b' with probability 0.35. Suppose our symbols are bi-grams of these two letters. Draw the Huffman tree for these bi-grams. What would the preamble for the canonical Huffman code look like?
- Briefly explain how the following index compression codes work: (a) `gamma`-code. (b) Golomb code.
- Explain what kind of incremental updates logarithmic merging is a suitable algorithhm for. Explain how this algorithm works.
- Give the following ranking functions together with common values for any magic numbers they have: (a)LMD, (b) LMJM, (c) DFR. Define each of the variables your formulas have and give an example of using them.
- Suppose we are using document partitioning to split our index across 5 machines. When a user enters a query our search engines returns the top 10 results. What is the probability machine 1 has half of the top 10 results?
- Briefly explain how the page rank algorithm might be implemented as a map reduce job.
- What is Batch filtering? What is aggregate P@k?
|