Solutions to Problem Set #2, CS 255, March 15, 2006

 

    1. The probability that a jth trial will be required is p = [k/M]j. So the expected number of trials is the sum of these values over all nonnegative integers j, which is 1/(1 - k/M), or M/M-k.
    2. The expected number of random number computations is the sum of M/M-k as k ranges from 0 through n-1. This is asymptotically equal to M[ln M - ln (M-n)], or M ln M/(M-n)
    3. Yes. Suppose that M = 2n. Then steps 2 and 3 take time 2n ln 2n/n, which is O(n). Or put another way, from part 1 we know that the expected number of comparisons in the ith iteration of the loop of steps 2 and 3 is at most 2. In any case, since one pass of bucket sort is sufficient, step 4 will take time O(M + n), which is O(n). So the algorithm's overall expected time complexity is O(n). Note that if item i is placed into a bucket as soon as the ith iteration of the loop is complete, then checking whether a computed random value is acceptable is trivial. But also note that the modified algorithm computes significantly more random numbers that the original algorithm, which may outweigh the advantage of using a simpler sorting algorithm.

  1. The hint can be shown by induction on n. In the base case, the sum is 0 and the product is 1, so the desired conclusion holds.

    Induction step: if we let P be the product from 1 to n-1, then the left-hand side of the inequality becomes P(1-x(n)). By induction, this is at least (1-S)(1-x(n)), where S is the sum corresponding to P. This product equals 1 - S - x(n) + Sx(n). Since Sx(n) is nonnegative this is at least
    1 - S - x(n), which is the right-hand side of the desired inequality.

    For the main argument, we have that the probability that all choices are different is 1(1-1/n3)(1-2/n3) ... (1-(n-1)/n3). By the hint, this is at least 1 - Q/n3, where Q is the sum of the first n-1 integers. Since Q < n2, the desired probability is at least 1 - n2/n3, or 1 - 1/n.

    1. The integer n has size Θ(log n). So it takes time Θ(log n) to guess the bits of two candidate factors whose lengths are no greater than the length of n. To verify that these candidate factors are actual factors, it takes time quadratic in their length. So the entire algorithm, both guessing and verifying, takes time polynomial in the length of n.

    2. Each step of the argument is correct. The problem is that NP-completeness is defined in terms of the size of the input, and the size of the input is Θ(log n), and not n. So in terms of the input size, there are exponentially many divisions required.

  2. Using the notation of the text, first note that the sum of the items of the set S has 2 in each digit labeled by a variable and 6 in each digit labeled by a clause. The desired target value t will be exactly half of the sum of the items in S if we add another item X to S with a 0 in each digit labeled by a variable and a 2 in each variable labeled by a clause.

    So for any instance I of 3-CNF-SAT, let T(I) be the union of {X(I)} and S(I), where S(I) is the set S that the text's transformation would construct for I, and X(I) is the value of X for I described above. To show that T(I) is equivalent to I, it's enough to show that T(I) has a solution iff S(I) has a subset summing to the target value t(I) for I.

    First suppose that S(I) has such a subset S'(I). Then we may partition T(I) into S'(I) and T(I)-S'(I). Since T(I) sums to 2t(I), each subset will sum to t(I).

    Conversely, if T(I) has a partition, let T'(I) be the subset of the partition that does not contain X(I). Then all members in T'(I) are members of S(I). Again, since T(I) sums to 2t(I), T'(I) must sum to t(I), so T'(I) gives a solution to the subset sum problem.

    . Finally, to show that the partition problem is in NP for an input set of n elements, we need only make n binary guesses about whether to include each item in the first subset of the partition. The sums of the two subsets may be found by making O(n) additions, each taking time polynomial in the size of the instance. Since n is polynomial in the size of the instance, and since comparing the two sums requires only time polynomial in the size of the instance, this nondeterministic solution algorithm has polynomial time complexity. So the partition problem is in NP.

Grading scale: 82 A, 77 A-, 72 B+, 64 B, 59 B-, 54 C+, 46 C