Chris Pollett>
CS267
( Print View )
Student Corner:
[Lecture Notes]
[Discussion Board]
Course Info:
[Texts & Links]
[Description]
[Course Outcomes]
[Outcomes Matrix]
[Course Schedule]
[Grading]
[Requirements/HW/Quizzes]
[Class Protocols]
[Exam Info]
[Regrades]
[University Policies]
[Announcements]
HW Assignments:
[Hw1]
[Hw2]
[Hw3]
[Hw4]
[Hw5]
[Quizzes]
Practice Exams:
[Midterm]
[Final]
|
CS267 Fall 2023Practice Final
To study for the final I would suggest you:
- Know how to do (by heart) all the practice problems.
- Go over your notes at least three times. Second and third time try to see how much you can remember from the first time.
- Go over the homework problems.
- Try to create your own problems similar to the ones I have given and solve them.
- Skim the relevant sections from the book.
- If you want to study in groups, at this point you are ready to quiz each other.
Here are some facts about the actual final:
- It is comprehensive.
- It is closed book, closed notes. Nothing will be permitted on your desk except your pen (pencil) and test.
- You should bring photo ID.
- There will be more than one version of the test. Each version will be of comparable difficulty.
- It is 10 problems (3pts each), 6 problems will be on materials since the second midterm, 4 problems will be from the topics of the midterm.
- Two problems will be exactly (less typos) off of the practice final, and one will be off of the practice midterm.
The practice final is below:
- Briefly give the API operations expected of an inverted index dictionary that supports indexing, queries, including prefix queries. Briefly describe the insert-at-back and move-to-front heuristic heuristics for dictionary term collisions.
- Describe the merge-based index construction algorithm (disk-based). How is it modified if logarithmic merging is done? Give an example.
- What is a generalized concordance list? Using the operators we had for such lists, express the query, "Return all intervals that begin with a <problem> and end with a </problem>" that contain the term "statement". Pick one of the GC-operators and explain how it could be implemented.
- What is Gap Compression? What is LLRUN? What is a Golomb code? Make sure to go into details. Give an example for each explanation.
- Suppose we have `m=4` many accumulators. Give examples that would trigger the QUIT and the CONTINUE accumulator pruning heuristics if term at a time processing was being used.
- Suppose there is a 500,000 word corpus. Document `d` is 250 words long. The terms grand and canyon occur 2000, 400 times respectively. In document `d` grad occurs thrice and canyon once. Let `\mu=7500`. What is the LMD score for `d` for the query "crand canyon"?
- Derive how the factor `P_2` is estimated in the DFR formula.
- Explain how hub and authority scores are calculated in HITS.
- Describe where/in what steps a distributed file system might be used in the computation of Page rank as a map reduce job.
- Briefly define size specific P@k and aggregate P@k and explain why they might be used.
|