Analyzing the Hiring Problem, Generating Random Permutations




CS255

Chris Pollett

Jan 31, 2022

Outline

Introduction

Probabilistic analysis

Randomized algorithms

Distributions

To do probabilistic analysis and to understand randomized algorithms, we need to know a little probability -- so let's review.

Conditional Probability and Independence

Quiz

Which of the following statements is true?

  1. If you have 10 insurance points and you score 10 out of 20 on the midterm, your recorded score on the midterm will be 20/20.
  2. Of the five homework scores and total quiz score, the lowest one will be dropped in computing your final grade.
  3. The best case cost for the hiring problem is also `O(n cdot c_h)`.

Discrete Random Variables

Expectation, Variance, Linearity of Expectation

Indicator Random Variables

Example

Lemma 5.1

Given a sample space `S` and an event `A subseteq S`, let `X_A=I{A}`. Then `E[X_A]=Pr{A}`.

Proof: `E[X_A] = E[I{A}] = 1 cdot Pr{A} + 0 cdot Pr{bar(A)} = Pr{A}`.
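
As a quick sanity check on Lemma 5.1, here is a minimal Python simulation (the die-rolling event `A` is an illustrative choice of mine, not from the lecture): the sample mean of the indicator `I{A}` converges to `Pr{A}`.

```python
import random

def estimate_indicator_expectation(trials=100_000):
    """Estimate E[X_A] for the event A = "a fair die shows 1 or 2",
    whose true probability is Pr{A} = 1/3."""
    total = 0
    for _ in range(trials):
        roll = random.randint(1, 6)   # sample a point from S = {1, ..., 6}
        x_a = 1 if roll <= 2 else 0   # the indicator I{A}
        total += x_a
    return total / trials             # sample mean approximates E[X_A]

print(estimate_indicator_expectation())  # prints roughly 0.333 = Pr{A}
```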

More Indicator Variables

Analysis of the Hiring Problem

More analysis of hiring problem

Lemma. Assume that the candidates are presented in random order. Then algorithm Hire-Assistant has an expected hiring cost of `O(c_h ln n)`.

Proof. From before, the hiring cost is `O(m cdot c_h)`, where `m` is the number of candidates hired. From the previous slide, the expected number of hires is `E[m] = O(ln n)`, which gives the bound.
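
To see the lemma numerically, here is a minimal Python sketch of Hire-Assistant (the function names are mine): averaging the number of hires `m` over many random candidate orders gives roughly the harmonic number `H_n ~ ln n`, so the cost is `O(c_h ln n)` in expectation.

```python
import math
import random

def hire_assistant(ranks):
    """Hire-Assistant on a list of candidate ranks: hire whenever the
    current candidate is better than the best seen so far.  Returns m,
    the number of candidates hired."""
    best = float('-inf')
    hires = 0
    for rank in ranks:
        if rank > best:     # candidate beats the current assistant
            best = rank
            hires += 1
    return hires

def average_hires(n, trials=10_000):
    """Average m over random candidate orders; this should approach
    H_n = 1 + 1/2 + ... + 1/n, which is about ln n."""
    total = 0
    for _ in range(trials):
        ranks = list(range(n))
        random.shuffle(ranks)          # candidates arrive in random order
        total += hire_assistant(ranks)
    return total / trials

n = 1000
print(average_hires(n), math.log(n))   # about 7.49 vs 6.91 for n = 1000
```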

Finishing up the Hiring Problem

Randomly Permuting Arrays (Method 1)

Analyzing Method 1

Lemma. Procedure Permute-By-Sorting produces a uniform random permutation of the input, assuming that the priorities are distinct.

Proof. Let `sigma: [1..n] -> [1..n]` be a permutation, with `sigma(i)` being where `i` goes under this permutation. Let `X_i` be the indicator that `A[i]` receives the `sigma(i)`th smallest priority; that is, `X_i` indicates that position `i` will be mapped correctly after sorting by priorities. So if `X_i` holds, then after sorting, the element originally stored in `A[i]` ends up in `A[sigma(i)]`. By the definition of conditional probability,
`Pr{Y|X} = (Pr{X cap Y})/(Pr{X})`, so `Pr{X cap Y} = Pr{X} cdot Pr{Y|X}`.
Using this, we have
`Pr{X_1 cap ... cap X_n} = Pr{X_1 cap ... cap X_(n-1)} cdot Pr{X_n | X_1 cap ... cap X_(n-1)}`.
Continuing to expand, we get: `Pr{X_1 cap ... cap X_n} = Pr{X_1} cdot Pr{X_2|X_1} cdots Pr{X_n | X_1 cap ... cap X_(n-1)}`.
We can now fill in some of these values:
`Pr{X_1} = 1/n`, the probability that the first priority chosen out of `n` is the `sigma(1)`th smallest, and
`Pr{X_i | X_1 cap ... cap X_(i-1)} = 1/(n - i + 1)`.
This is because, of the remaining elements `i, i+1, ..., n`, each is equally likely to be the one receiving the `sigma(i)`th smallest priority. So
`Pr{X_1 cap ... cap X_n} = 1/n cdot 1/(n-1) cdot ... cdot 1/2 cdot 1/1 = 1/(n!)`.
As `sigma` was arbitrary, any permutation is equally likely.
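
Here is a minimal Python sketch of Permute-By-Sorting as the lemma describes it; the `n^3` priority range follows CLRS's choice to make duplicate priorities unlikely, and the uniformity argument above assumes the drawn priorities are distinct.

```python
import random

def permute_by_sorting(A):
    """Permute-By-Sorting: give each element a random priority from a
    large range, then sort by priority.  With distinct priorities, every
    one of the n! orderings is equally likely."""
    n = len(A)
    # Priorities from [1, n^3] make collisions unlikely (CLRS's range);
    # if two priorities do collide, the permutation may be biased.
    P = [random.randint(1, n**3) for _ in range(n)]
    paired = sorted(zip(P, A))        # sort A using P as the sort key
    return [a for _, a in paired]

print(permute_by_sorting([1, 2, 3, 4, 5]))  # e.g. [3, 1, 5, 2, 4]
```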