Chris Pollett > Old Classses > CS255
( Print View )

Student Corner:
  [Submit Sec1]
  [Grades Sec1]

  [
Lecture Notes]
  [Discussion Board]

Course Info:
  [Texts & Links]
  [Description]
  [Course Outcomes]
  [Outcomes Matrix]
  [Course Schedule]
  [Grading]
  [Requirements/HW/Quizzes]
  [Class Protocols]
  [Exam Info]
  [Regrades]
  [University Policies]
  [Announcements]

HW Assignments:
  [Hw1]  [Hw2]  [Hw3]
  [Hw4]  [Hw5]  [Quizzes]

Practice Exams:
  [Midterm]  [Final]

                           












HW1 Solutions Page

After grading, I selected from amongst the better homeworks to post the solution below. The student or students that had their homework chosen received 1 bonus point a fter curving for having their homework selected. If you were chosen and would rather your homework not be used as the solution let me know and I will choose someone else's homework and they will receive the bonus point instead. The only changes I made were to convert the names listed in the homework files to SJSU student. No one solved problem 3 adequately, and as I said in class, the problem was harder than I intended, so I gave everyone a point. Below I give my solution which is tight up to a constant.

[Student Hw1 Solution-ZIP]

Problem 3. Suppose we toss balls into one of `n` bins. Assume each bin is equally likely. Calculate with work the expected number of balls you would need to toss until there are two bins with at least two balls.

Solution. Let our sample space `S` be sequences of ball tosses. Let `X` be the random variable which maps a particular sequence of ball tosses to the integer `n` at which two bins have at least two balls. We are asked to compute:

`E[X] = sum_{i=1}^{infty} i \cdot Pr{X=i}.`

We will show `E[X] = Theta(sqrt(n))`. To prove this we will show `E[X] gt 1/4 sqrt(n)` and `E[X] lt 7 sqrt(n)`.

Claim 1. `E[X] gt 1/4 sqrt(n)`.

Proof. We first note
`E[X] gt sum_{i=1/2 sqrt(n)}^{infty} i \cdot Pr{X=i} \qquad\qquad (\star)`
since it is possible to get two balls into two distinct bins before the `1/2 sqrt(n)`th toss. We also have
`sum_{i=1/2 sqrt(n)}^{infty} i \cdot Pr{X=i} ge 1/2 sqrt(n) sum_{i=1/2 sqrt(n)}^{infty} Pr{X=i} \qquad\qquad (\star\star)`
since `i` in the sum is always greater than or equal to `1/2 sqrt(n)`. We next note that if the first `1/2 sqrt(n)` balls all end up in distinct bins the `X` will have value greater than `1/2 sqrt(n)`. Let `Y` be the event that in a sequence of ball tosses the first `1/2 sqrt(n)` balls all end up in distinct bins. We thus have:
`sum_{i=1/2 sqrt(n)}^{infty} Pr{X=i} ge Pr{Y} \qquad\qquad (\star\star\star)`.
Computing `Pr{Y}` we get
`Pr{Y} = prod_{i=1}^{1/2sqrt(n)}(1 - i/n)`
`gt (1 - (1/2sqrt(n))/n)^{1/2sqrt(n)} = (1 - (1/2)\cdot n^{-1/2})^{1/2sqrt(n)} = e^{1/2sqrt(n) \cdot ln (1 - (1/2)\cdot n^{-1/2})}`

Using the Taylor expansion of `ln(1 - x)`, we get
`ln(1 - x) = -x - (x^2)/2 - (x^3)/3 - ... gt -x - x^2` if `0 lt x le 1/2`. So from the above
`Pr{Y} gt e^{1/2sqrt(n) \cdot ln (1 - (1/2)\cdot n^{-1/2})} gt e^(1/2sqrt(n) \cdot [-(1/2)\cdot n^{-1/2} - ((1/2)\cdot n^{-1/2})^2] ) = e^(-1/4 - 1/8\cdotn^{-1/2}) ge e^(-1/4 - 1/8) gt .687 gt 1/2.`

Since `Pr{Y} gt 1/2`, `(\star\star\star)`, `(\star\star)`, and `(\star)` together imply the claim. Q.E.D.

Claim 2. `E[X] lt 7 sqrt(n)`.

Proof. First we note:
`E[X] = sum_{i=1}^{infty} i \cdot Pr{X=i} = sum_{i=1}^{2 + sqrt(n)} i \cdot Pr{X=i} + sum_{i=2 + sqrt(n)}^{infty} i \cdot Pr{X=i}`
`lt (2 + sqrt(n))sum_{i=1}^{2 + sqrt(n)} Pr{X=i} + sum_{i=2 + sqrt(n)}^{infty} i \cdot Pr{X=i}`
`lt (2 + sqrt(n)) + sum_{i=2 + sqrt(n)}^{infty} i \cdot Pr{X=i}`

Let `(#)` denote the sum on the right of the last line above. Now consider the random variable `X'(s)` which if the toss sequence `s` has two bins with two balls before the `2 + sqrt(n)`th toss is 0; otherwise, `X'(s) = X(s)`. It follows that:
`E(X') = sum_{i=2 + sqrt(n)}^{infty} i \cdot Pr{X'=i} = sum_{i=2 + sqrt(n)}^{infty} i \cdot Pr{X=i} = (#)`.
Let `Z(s)` be the random variable which is `0` if the sequence of tosses does not have at least `1+sqrt(n)` many distinct bins with at least one ball after the first `2 + sqrt(n)` ball tosses; otherwise, its value is the first toss `i` where two of these first `1+sqrt(n)` bins have at least two balls. Notices on all sequences `s`, `X'(s) \leq Z(s)`. Therefore,
`E[X'] = sum_{s in S}X'(s)Pr(s) \leq sum_{s in S}Z(s)Pr(s) = E[Z]`.
Hence, we have
`E[X] lt (2 + sqrt(n)) + E[X'] \leq (2 + sqrt(n)) + E[Z]`.

The probability that a ball lands in 1 of these first `1+sqrt(n)` bins is at least `(1+sqrt(n))/n gt 1/sqrt(n)`. So by the same argument as we did for our second ball and question, we would expect a ball to land in one of these bins in at most `sqrt(n)` steps. After the first ball lands in one of these `1+sqrt(n)` bins, the odds that a second ball lands in one of the `sqrt(n)` remaining ins is `(sqrt(n))/n = 1/sqrt(n)`, so we would expect to get the second ball in at most `sqrt(n)` additional steps. Hence,
`E[Z] lt 1+sqrt(n) + sqrt(n) + sqrt(n) = 1 + 3 sqrt(n)`
and so
`E[X] lt (2 + sqrt(n)) + 1 + 3 sqrt(n) = 3 + 4 sqrt(n) < 7 sqrt(n)`
as claimed. Q.E.D.