Outline
- Conditional Probability
- In-Class Exercise
- Random Variables
- Analysis of the Hiring Problem
- Generating Random Permutations
Introduction
- Last class, we started talking about probabilistic analysis and randomized algorithms.
- We considered the Hiring Problem, which asks for the total cost of interviewing `n` candidates,
where, whenever candidate `i` is better than the best candidate seen so far, we fire the current candidate and hire candidate `i`.
- We assumed there is a cost `c_i` to interview a candidate and a cost `c_h` to fire the current candidate and hire a new one, where `c_h` is much greater than `c_i`.
- We said that if `m` is the number of people hired, then the total cost is `O(n cdot c_i + m cdot c_h)`.
- Only `m` depends on the order of candidates. Asking for the expected value of this leads to a probabilistic analysis of this problem.
- Since a priori we don't know the input distribution of the candidates, we might randomly permute the candidates as a preprocessing step before running our hiring procedure. This gives rise to a randomized algorithm.
- To analyze our hiring problem, we started reviewing probability. We defined the notions of sample space, elementary event, event, probability distribution, and showed a couple of simple properties of the last.
- We start today by continuing our review of probability theory.
Conditional Probability and Independence
- The conditional probability of an event `A` given an event `B` with `Pr{B} != 0` is defined to be:
`Pr{A|B} = (Pr{A cap B})/(Pr{B})`.
- Two events are independent if
`Pr{A cap B} = Pr{A} cdot Pr{B}`
- Given a collection `A_1, A_2, ..., A_k` of events, we say
they are pairwise independent if
`Pr{A_i cap A_j} = Pr{A_i} cdot Pr{A_j}` for all `i != j`.
- They are mutually independent if for every subset `A_(i_1), A_(i_2), ..., A_(i_m)` of them,
`Pr{A_(i_1) cap cdots cap A_(i_m)} = Pr{A_(i_1)} cdot cdots cdot Pr{A_(i_m)}`.
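To make these definitions concrete, here is a minimal sketch (the two-dice sample space and the events `A` and `B` are assumptions chosen purely for illustration) that checks independence by direct enumeration:

```python
from itertools import product
from fractions import Fraction

# Sample space: ordered outcomes of two fair six-sided dice; each of the
# 36 elementary events has probability 1/36.
S = list(product(range(1, 7), repeat=2))

def pr(event):
    """Probability of an event, given as a predicate on outcomes."""
    return Fraction(sum(1 for s in S if event(s)), len(S))

A = lambda s: s[0] % 2 == 0     # first die comes up even
B = lambda s: s[1] > 4          # second die shows 5 or 6

# Independence: Pr{A cap B} = Pr{A} * Pr{B}.
assert pr(lambda s: A(s) and B(s)) == pr(A) * pr(B)
print(pr(A), pr(B), pr(lambda s: A(s) and B(s)))  # 1/2 1/3 1/6
```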
In-Class Exercise
Consider three events `X`, `Y`, `Z` in the sample space of two independent fair coin tosses.
- `X` is the event that the first coin toss comes up heads.
- `Y` is the event that the second coin toss comes up heads.
- `Z` is the event that the two coin tosses come up differently.
Write down the sample space. Then show each of these events is pairwise independent, but the three events are not mutually independent. Post your solutions to the Jan 30 In-Class Exercise Thread.
Discrete Random Variables
- A discrete random variable `X` is a map `X:S -> RR`.
- Given such a function `X`, we can define the probability density function (in the discrete setting, also called the probability mass function) for `X` as:
`f(x) = Pr{X = x}`
for each value `x in RR`.
Expectation and Variance
- The expected value of a random variable `X` is defined to be:
`E[X] = sum_x x cdot Pr{X=x}`
- The variance of `X`, `Var[X]`, is defined to be `E[(X - E[X])^2]`, which simplifies to `E[X^2] - (E[X])^2`.
- The standard deviation of `X`, `sigma_X`, is defined to be `(Var[X])^(1/2)`.
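As a short illustrative sketch (the fair-die distribution is an assumed toy example), these quantities can be computed directly from a probability mass function:

```python
from fractions import Fraction

# Toy pmf: a fair six-sided die, Pr{X = x} = 1/6 for x = 1, ..., 6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

E = sum(x * p for x, p in pmf.items())           # E[X] = sum_x x * Pr{X = x}
E_sq = sum(x * x * p for x, p in pmf.items())    # E[X^2]
var = E_sq - E ** 2                              # Var[X] = E[X^2] - (E[X])^2

# Cross-check against the defining form E[(X - E[X])^2].
assert var == sum((x - E) ** 2 * p for x, p in pmf.items())
print(E, var)  # 7/2 35/12
```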
Indicator Random Variables
- In order to analyze the hiring problem we need a convenient way to convert between probabilities and expectations.
- We will use indicator random variables to help us do this.
- Given a sample space `S` and an event `A`, the indicator random variable `I{A}` associated with event `A` is defined as:
`I{A}={(1, mbox(if A occurs) ),(0,mbox(if A does not occur.)):}`
Example
- Suppose our sample space `S={H,T}` with `Pr{H}=Pr{T}=1/2`.
- We can define an indicator random variable `X_H` associated with the coin coming up heads:
`X_H = I{H}={(1, mbox(if H occurs) ),(0,mbox(if T occurs.)):}`
- The expected number of heads in one coin flip is then:
\begin{eqnarray*}
E[X_H] &=& E[I\{H\}]\\
&=& 1 \cdot Pr\{H\} + 0 \cdot Pr\{T\}\\
&=& 1 \cdot (1/2) + 0 \cdot (1/2)\\
&=& 1/2
\end{eqnarray*}
Lemma 5.1
Given a sample space `S` and an event `A sube S`, let `X_A = I{A}`. Then `E[X_A] = Pr{A}`.
Proof: `E[X_A] = E[I{A}] = 1 cdot Pr{A} + 0 cdot Pr{bar(A)} = Pr{A}`.
More Indicator Variables
- Indicator random variables become more useful when we deal with more than one coin flip.
- Let `X_i` be the indicator random variable for the event that the `i`th coin flip comes up heads.
- Consider the random variable:
`X = sum_(i=1)^n X_i`
- Then, by linearity of expectation (which holds even when the `X_i` are not independent), the expected number of heads in `n` tosses is:
`E[X] = E[sum_(i=1)^n X_i] = sum_(i=1)^n E[X_i] = sum_(i=1)^n 1/2 = n/2`
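A quick simulation sketch (the values of `n` and `trials` are arbitrary assumptions) showing the empirical average agreeing with `n/2`:

```python
import random

n, trials = 10, 100_000

# X = X_1 + ... + X_n, where X_i is the indicator of flip i coming up heads.
total = sum(sum(random.randint(0, 1) for _ in range(n)) for _ in range(trials))
print(total / trials)  # should be close to n/2 = 5.0
```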
Analysis of the Hiring Problem
- Let `X_i` be the indicator random variable which is `1`
if candidate `i` is hired and `0` otherwise.
- Let
`X = sum_(i=1)^n X_i`
- By Lemma 5.1, `E[X_i] = Pr{mbox(candidate i is hired)}`.
- Candidate `i` will be hired if `i` is better than each of candidates 1 through `i - 1`.
- Since the candidates arrive in random order, any one of the first `i` candidates is equally likely to be the best candidate so far. So `E[X_i] = 1/i`, and by linearity of expectation
`E[X] = E[sum_(i=1)^n X_i] = sum_(i=1)^n E[X_i] = sum_(i=1)^n 1/i = ln n + O(1)`
(via the integral bound on the harmonic series; see the simulation sketch below).
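Here is a small simulation sketch of the hiring process (representing candidates by distinct numeric qualities is an assumption made for illustration); it compares the average number of hires against `H_n = sum_(i=1)^n 1/i`:

```python
import random

def hires(order):
    """Number of hires Hire-Assistant makes on a given candidate order."""
    best, count = float("-inf"), 0
    for quality in order:
        if quality > best:        # better than every candidate seen so far
            best, count = quality, count + 1
    return count

n, trials = 20, 50_000
candidates = list(range(n))       # distinct qualities; only ranks matter

total = 0
for _ in range(trials):
    random.shuffle(candidates)    # candidates arrive in random order
    total += hires(candidates)

H_n = sum(1 / i for i in range(1, n + 1))
print(total / trials, H_n)        # empirical mean vs. H_n = ln n + O(1)
```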
More Analysis of the Hiring Problem
Lemma. Assume that the candidates are presented in random order. Then algorithm Hire-Assistant has an expected hiring cost of `O(c_h ln n)`.
Proof. From before, the hiring cost is `O(m cdot c_h)`, where `m` is the number of candidates hired. By the previous slide, `E[m] = ln n + O(1)`, so the expected hiring cost is `O(c_h ln n)`.
Finishing up the Hiring Problem
Randomly Permuting Arrays (Method 1)
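A minimal sketch of the Permute-By-Sorting idea, under the assumptions used in the analysis below: each element independently draws a priority uniformly from `[1, n^3]`, we redraw on a collision (as discussed under "More on Method 1"), and the array is then sorted by priority:

```python
import random

def permute_by_sorting(A):
    """Randomly permute A by attaching random priorities and sorting.

    The uniformity argument below assumes the priorities are distinct,
    so if any two collide we simply draw a fresh list.
    """
    n = len(A)
    while True:
        P = [random.randint(1, n ** 3) for _ in range(n)]
        if len(set(P)) == n:      # all priorities distinct
            break
    # Sort the elements of A, using the priorities as the sort keys.
    return [a for _, a in sorted(zip(P, A))]
```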
Analyzing Method 1
Lemma. Procedure Permute-By-Sorting produces a uniform random permutation of the input, assuming that the priorities are distinct.
Proof. Let `sigma: [1..n] -> [1..n]` be a permutation, with `sigma(i)` being where `i` goes under this permutation. Let `X_i` be the indicator random variable for the event that `A[i]` receives the `sigma(i)`th smallest priority; that is, `X_i` indicates that position `i` is mapped correctly after sorting by priorities. So if `X_i` holds, then after sorting, the element originally stored in `A[i]` ends up in `A[sigma(i)]`. By the definition of conditional probability,
`Pr{Y|X} = (Pr{X cap Y})/(Pr{X})`, so `Pr{X cap Y} = Pr{X} cdot Pr{Y|X}`.
Using this, we have
`Pr{X_1 cap ... cap X_n} = Pr{X_1 cap ... cap X_(n-1)} cdot Pr{X_n | X_1 cap ... cap X_(n-1)}`.
Continuing to expand, we get:
`Pr{X_1 cap ... cap X_n} = Pr{X_1} cdot Pr{X_2|X_1} cdots Pr{X_n | X_1 cap ... cap X_(n-1)}`.
We can now fill in some of these values:
`Pr{X_1} = 1/n`, since the first element's priority is equally likely to be any of the `n` ranks, and exactly one of them is the `sigma(1)`th smallest.
`Pr{X_i|X_1 cap... cap X_(i-1)} = 1/(n - i + 1)`.
This is because, given that the first `i - 1` elements received their required ranks, the priority of element `i` is equally likely to be any of the `n - i + 1` remaining ranks, exactly one of which is the `sigma(i)`th smallest.
So
`Pr{X_1 cap ... cap X_n} = 1/n cdot 1/(n-1) cdot ... cdot 1/2 cdot 1/1 = 1/(n!)`.
As `sigma` was arbitrary, any permutation is equally likely.
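As a sanity check of the lemma, the `permute_by_sorting` sketch above can be run on a small array and the outcomes tallied; every permutation should appear with frequency near `1/(n!)`:

```python
from collections import Counter

counts = Counter()
for _ in range(60_000):
    counts[tuple(permute_by_sorting([1, 2, 3]))] += 1

# Each of the 3! = 6 permutations should occur roughly 10,000 times.
for perm, c in sorted(counts.items()):
    print(perm, c)
```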
More on Method 1
- What do we do if the priorities aren't all distinct?
- Well, we just try again and draw a new list of priorities.
- What's the likelihood this bad situation happens?
Claim. If each priority is drawn independently and uniformly from `[1, n^3]`, then the probability that all the priorities are distinct is at least `1 - 1/n`.
Proof. Let `X_i` be the indicator random variable for the event that the `i`th priority differs from priorities `1` through `i - 1`. Again,
`Pr{X_1 cap ... cap X_n} = Pr{X_1} cdot Pr{X_2|X_1} cdots Pr{X_n | X_1 cap ... cap X_(n-1)}`.
`= n^3/n^3 cdot (n^3 -1)/n^3 cdots (n^3 - (n-1))/n^3`
`ge (n^3 -n)/n^3 cdot (n^3 -n)/n^3 cdots (n^3 -n)/n^3`
`= (1 - 1/n^2)^(n-1)`
`ge 1 - (n-1)/n^2`, as `(1-a)(1-b) ge 1 - a - b` for nonnegative `a` and `b` (apply this repeatedly to the `n - 1` factors)
`> 1 - 1/n`, since `(n-1)/n^2 < n/n^2 = 1/n`.
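An empirical sketch of the claim (the trial count is an arbitrary assumption): estimate the probability that `n` independent draws from `[1, n^3]` are all distinct and compare it against the `1 - 1/n` lower bound:

```python
import random

n, trials = 50, 20_000

# Count the trials in which all n priorities come out distinct.
distinct = sum(
    len({random.randint(1, n ** 3) for _ in range(n)}) == n
    for _ in range(trials)
)
print(distinct / trials, 1 - 1 / n)  # estimate vs. the 1 - 1/n lower bound
```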