Huffman and Arithmetic Coding




CS267

Chris Pollett

Oct. 29, 2012

Outline

Huffman Coding

Huffman Tree Construction

  • Suppose we have a compression model `M` with `M(sigma_i) = Pr(sigma_i)`.
  • To construct a Huffman code, we:
    1. Start with a set of trees `T_i`, one for each symbol `sigma_i`; initially each tree is just the single node `sigma_i`. The probability of a tree is the sum of the probabilities of the symbols in it.
    2. Repeat the following until only one tree is left:
      Take the two trees of least probability, say `T_j` and `T_k`, and merge them into a single new tree consisting of a new root node labeled with the sum of the probabilities of `T_j` and `T_k`, with the more probable tree as its left child and the less probable tree as its right child. The edge to the left child is labeled 0; the edge to the right child is labeled 1.
  • Once this tree is constructed, the code for a symbol is the sequence of edge labels on the path from the root down to that symbol's leaf.
  • Example
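The construction steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the course's reference implementation: the function name `huffman_codes`, the tuple-based tree representation, and the use of a counter to break ties between equally probable trees are all assumptions made for this sketch. It also assumes the model contains at least one symbol, and it gives a lone symbol the one-bit code `0`.

```python
import heapq
from itertools import count


def huffman_codes(probs):
    """Return a dict symbol -> bit string for a dict symbol -> probability.

    Hypothetical helper for illustration; assumes probs is non-empty.
    """
    tiebreak = count()  # unique counter so heap entries never compare trees
    # Heap entries are (probability, tiebreak, tree). A tree is either
    # ("leaf", symbol) or ("node", left_subtree, right_subtree).
    heap = [(p, next(tiebreak), ("leaf", s)) for s, p in probs.items()]
    heapq.heapify(heap)

    # Repeatedly merge the two least probable trees until one remains.
    while len(heap) > 1:
        p_lo, _, t_lo = heapq.heappop(heap)  # least probable tree
        p_hi, _, t_hi = heapq.heappop(heap)  # second least probable tree
        # More probable tree becomes the left child (edge 0),
        # less probable tree becomes the right child (edge 1).
        merged = ("node", t_hi, t_lo)
        heapq.heappush(heap, (p_lo + p_hi, next(tiebreak), merged))

    _, _, root = heap[0]
    codes = {}

    def walk(tree, path):
        # The code for a symbol is the edge labels on the root-to-leaf path.
        if tree[0] == "leaf":
            codes[tree[1]] = path or "0"  # lone symbol: assumed 1-bit code
        else:
            walk(tree[1], path + "0")
            walk(tree[2], path + "1")

    walk(root, "")
    return codes
```

For instance, calling `huffman_codes({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125})` gives `a` a 1-bit code, `b` a 2-bit code, and `c` and `d` 3-bit codes. Which of two equally probable trees ends up on the left depends on the tie-breaking rule assumed here, so the exact bit patterns may differ from a hand-drawn tree while the code lengths stay the same.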

Huffman Tree Example

Facts About Huffman Codes

Canonical Huffman Codes

Quiz

Which of the following is true?

1. A generalized concordance list is any sequence of text intervals.
2. trec_eval can be used to generate search result judgments.
3. Compression techniques often involve a modeling and coding phase.

Motivating Arithmetic Coding

Arithmetic Coding

More Arithmetic Coding

Redux