CS267 Spring 2016Practice Final
- Briefly explain the QUIT and CONTINUE accumulator pruning strategies for Term-at-a-Time Query Processing.
- Express the following using our operators for GC-Lists. The contents of the body of any html document whose head contains at least two meta tags.
- Briefly explain the following coding schemes and what they are used for: Canonical Huffman Code, LLRUN.
- Consider the list `L=langle 5,8,15,25 rangle`. Compute its `Delta`-list, then encode it using a `gamma`-code.
- Define the REBUILD and REMERGE batch update operations. Give a concrete (non-trivial, i.e., some work is done) situation in which they would both have the same performance.
- Describe how the Logarithmic Merge Dynamic Index algorithm works.
- Suppose there is a 1,000,000 word corpus. Document `d` is 300 words long. The terms blue and bayou occur 1000, 300 times respectively. In document `d` blue occurs twice and bayou once. Let `mu=1000`. What is the LMD score for `d` for the query "blue bayou"?
- Derive how the factor `P_1` is estimated in the DFR formula.
- Briefly explain how query documents are returned in the parallel setting when query processing with term partitioning is done.
- Describe how the HITS algorithm measures the quality of a document. How does SALSA differ from HITS?
|