Solutions to Problem Set #5

CS 155 -- due December 8, 1999

    1. 3,5,11,21,43,85

    2. When i=11, the sequence consists of 5 elements of L and 10 elements of W. As i decreases by 1, the sequence contains one more element of L and one fewer element of W, until when i=6 it consists of 10 elements of L and 5 elements of W.

    3. When i=m(k), the sequence consists of m(k-1) elements of L and m(k)-1 elements of W. By the defining recurrence, this equals 2k-1. As in the previous case, the size of this sequence remains unchanged within a block of subscripts.

    4. The winners of the pairwise comparisons are (8 9 7 6 2); the losers are (5 4 1 3 0). This takes 5 comparisons.
      In sorting the sequence of winners, its sequence of winners is (9 7). Finding these winners takes 2 comparisons; sorting it to get (7 9) requires 1 additional comparison. The sequence L corresponding to this is (6 8 2). Adding 6 takes 0 comparisons, merging 2 and then 8 takes 4 comparisons (2 each).
      The resulting sequence W is (2 6 7 8 9). The corresponding sequence L is (0 3 1 5 4). Adding 0 to W requires 0 comparisons, merging 1 and then 3 requires 4 comparisons (2 each), and merging 4 and then 5 requires 6 comparisons (3 each). The total number of comparisons required is thus 5+2+1+4+4+6=22 comparisons. The theoretical minimum is given by log2 10!, or log23628800, which when rounded up is also 22.

    5. When n=21, there are 10 elements of W to be sorted recursively. This takes 22 comparisons as seen above, in addition to the 10 comparisons required to find W in the first place. In addition, the elements of L from positions 2 through 11 must be merged. Elements 2 and 3 require 2 comparisons each, elements 3 and 4 need 3 comparisons each, and elements 6 through 11 need 4 comparisons each. The overall number of comparisons required is thus 10 + 22 + 2(2) + 2(3) + 6(4) = 66.
      Since log221! is roughly 65.5, the theoretical minimum is achieved in this case as well. Note that n log n is less than 64, so that this value gives a somewhat less precise estimate.

  1. Since the input representation must list all the strings explicitly, we may assume that the length of the input is greater than the sum Z of their lengths. We may also assume that K<=Z, since the answer is clearly "yes" otherwise. So a nondeterministic algorithm could simply guess K characters of a string over S, and verify that each string of R is a substring of this string.

    The guessing phrase takes time K, so is bounded by Z, and is thus polynomial in the input length. In the worst case a substring match requires testing every position of the candidate substring against every position of the target string. Here this would take time at most Z2 per string, or |R|Z2, altogether. Since |R|<=Z, this is also polynomial in Z, and hence in the length of the input.

    Since there is a nondeterministic polynomial-time algorithm to solve the problem, the problem is in NP.

  2. Consider the following algorithm: To show that this algorithm has polynomial time complexity, note that both the number of variables and the number of clauses are less than the input size. Thus the first step takes only polynomial time.

    The propagation step requires modifying only polynomially many clauses (each in bounded time) and thus only polynomially many iterations. Checking whether an assignment leads to some clause having both literals assigned F requires only checking through the polynomially many clauses, as does finding a clause with one literal assigned F and the other unassigned. So each iteration takes polynomial time, and therefore the second and third steps of the algorithm require only polynomial time. The final step requires only a polynomial-time pass through the clauses, so the overall algorithm has polynomial time complexity.

    Note that what makes the algorithm polynomial is that the backtracking is severely limited -- at most one decision has to be redone at any one time. It may not be clear that this is good enough (it certainly isn't for 3SAT), but the correctness of the algorithm may be shown as follows:

    At the end of each successful propagation, each clause involving a variable with a truth assignment has a truth assignment of one of the forms TT, TF, FT, T*, *T, or **, where "*" stands for an unassigned value. This is because all singleton clauses have already been handled in step 1, the case FF cannot occur after a successful propagation, and propagation cannot terminate if there is an assignment of the form F* or *F.

    No demonstration of unsatisfiability can involve a clause with one of the first five of the truth assignments, so any such demonstration must come from the clauses with values **. But by assumption these clauses involve completely new variables, so we may treat these clauses as a separate (and smaller) instance. So, under our assumption that the most recent propagation was successful, the old instance was unsatisfiable iff this new one is, and that is what the algorithm says.

The sorting algorithm of this problem set is due to Ford and Johnson, and is described on pp. 466-8 of the Horowitz & Sahni text Fundamentals of Computer Algorithms.