Linear Time Sorting




CS146

Chris Pollett

Mar 10, 2014

Outline

Introduction

Lower Bounds for Comparison Sorting

Decision Trees

An example decision tree

Worst case lower bound

Quiz (Sec 5)

Which of the following statements is true?

  1. The worst-case recurrence for quicksort determined in class was `T(n) = T(n-1) + Theta(n)`.
  2. The best-case recurrence for quicksort determined in class was `T(n) = 2T(n/2) + Theta(log n)`.
  3. For some inputs the random variable `X_(ij)` used in our quicksort analysis could take on values greater than `1`.

Quiz (Sec 6)

Which of the following statements is true?

  1. It is possible that on some inputs quicksort and randomized-quicksort as presented in class give different outputs.
  2. In our quicksort analysis
    `E[X_(ij)] = 1 cdot Pr{z_i \text( is compared to )z_j} + 0 cdot Pr{z_i \text( is not compared to )z_j}`
    `= Pr{z_i \text( is first pivot chosen from )Z_(ij)} + Pr{z_j \text( is first pivot chosen from )Z_(ij)}`
    `= 1/(j - i + 1) + 1/(j - i + 1)`
    `= 2/(j - i + 1)`.
  3. The quicksort PARTITION procedure is `Omega(n log n)`.

Counting Sort

Counting Sort - Pseudocode

Properties of Counting Sort

Radix Sort

Picture of a Blue Punch Card

Example and Pseudo-code

Radix Sort Example

Correctness and Runtime

Lemma. Given `n` `d`-digit numbers in which each digit can take on up to `k` possible values, RADIX-SORT correctly sorts these numbers in `Theta(d(n+k))` time if the stable sort it uses takes `Theta(n + k)` time.

Proof. The correctness of radix sort follows by induction on the columns being sorted. The induction hypothesis is that after the `m`th stable sort the numbers are in sorted order according to their `m` least significant digits. Stability is what makes the induction step work: two numbers that agree in digit `m` keep the relative order established by the earlier passes on their `m-1` least significant digits. The analysis of the running time depends on the stable sort used as the intermediate sorting algorithm. When each digit is in the range `0` to `k-1`, and `k` is not too large, counting sort is the obvious choice. Each pass over `n` `d`-digit numbers then takes time `Theta(n+k)`. There are `d` passes, and so the total time for radix sort is `Theta(d(n + k))`. QED
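The argument in the proof can be sketched in Python. This is a minimal illustration, not the pseudocode from the slides: the function names `counting_sort_by_digit` and `radix_sort` are made up for this example, and it assumes the keys are non-negative integers with at most `d` digits in base `k`.

```python
def counting_sort_by_digit(nums, exp, k):
    """Stable counting sort of nums, keyed on the base-k digit at place value exp.

    Assumes every element of nums is a non-negative integer.
    Runs in Theta(n + k) time.
    """
    counts = [0] * k
    for x in nums:
        counts[(x // exp) % k] += 1
    # Prefix sums: counts[v] becomes the number of keys whose digit is <= v.
    for v in range(1, k):
        counts[v] += counts[v - 1]
    out = [0] * len(nums)
    # Scan right to left so keys with equal digits keep their relative order.
    for x in reversed(nums):
        digit = (x // exp) % k
        counts[digit] -= 1
        out[counts[digit]] = x
    return out


def radix_sort(nums, d, k=10):
    """Sort non-negative integers with at most d base-k digits.

    Performs d stable passes, least significant digit first,
    for Theta(d(n + k)) total time.
    """
    exp = 1
    for _ in range(d):
        nums = counting_sort_by_digit(nums, exp, k)
        exp *= k
    return nums


if __name__ == "__main__":
    print(radix_sort([329, 457, 657, 839, 436, 720, 355], d=3))
```

After pass `m` the list is sorted on its `m` least significant digits, which is exactly the induction hypothesis; replacing the right-to-left scan with a left-to-right one would break stability and, with it, correctness.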

Making Things Binary

Lemma. Given `n` `b`-bit numbers and any positive integer `r le b`, RADIX-SORT correctly sorts these numbers in `Theta((b/r)(n+2^r))` time if the stable sort it uses takes `Theta(n + k)` time for inputs in the range `0` to `k`.

Here we can view `r` as the number of bits needed to represent one digit in base `2^r` rather than in binary.

Proof. For a value `r le b`, we view each key as having `d = |~b/r~|` digits of `r` bits each. Each digit is an integer in the range `0` to `2^(r) - 1`, so we can use counting sort with `k = 2^r - 1`. Each pass of counting sort takes time `Theta(n + k) = Theta(n + 2^r)` and there are `d` passes, for a total running time of `Theta(d(n+2^r)) = Theta((b/r)(n+2^r))`, where the last step uses `|~b/r~| = Theta(b/r)` for `r le b`. QED
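The construction in this proof can be sketched by extracting `r`-bit digits with shifts and masks. This is an illustrative sketch only: the name `radix_sort_bits` is hypothetical, and the code assumes every key is a non-negative integer that fits in `b` bits.

```python
def radix_sort_bits(nums, b, r):
    """Radix sort of non-negative b-bit integers using r-bit digits.

    Makes d = ceil(b / r) stable counting-sort passes, each over
    2**r buckets, for Theta((b / r)(n + 2**r)) total time.
    """
    d = -(-b // r)           # ceil(b / r) without floating point
    mask = (1 << r) - 1      # largest digit value, 2**r - 1
    for p in range(d):
        shift = p * r        # bit offset of the p-th digit
        counts = [0] * (mask + 1)
        for x in nums:
            counts[(x >> shift) & mask] += 1
        # Prefix sums give, for each digit value, the end of its output block.
        for v in range(1, mask + 1):
            counts[v] += counts[v - 1]
        out = [0] * len(nums)
        # Right-to-left scan keeps the pass stable.
        for x in reversed(nums):
            digit = (x >> shift) & mask
            counts[digit] -= 1
            out[counts[digit]] = x
        nums = out
    return nums


if __name__ == "__main__":
    keys = [0xFFFF, 0, 12345, 54321, 255, 256, 1]
    print(radix_sort_bits(keys, b=16, r=4))
```

For example, with `b = 32` and `r = 8` the sort makes `d = 4` passes over 256 buckets each. Larger `r` means fewer passes but more buckets per pass, which is the trade-off the `Theta((b/r)(n+2^r))` bound expresses.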