Solutions: Formal languages and theory of computation, part 2, CS 288, May 9, 2005

  1. a) One grammar has start symbol E and rules
       E -> EF | EG | LH | x | y
       F -> PE
       P -> +
       G -> TE
       T -> *
       L -> (
       H -> ER
       R -> ) 
    
    b) The parse tree has n leaves, with n distinct parents. If the leaves are removed, then the resulting tree is a full binary tree with n leaves, and thus has 2n-1 nodes. So the original tree has 3n-1 nodes.

    c) Under any reasonable model of nondeterminism, the answer will be Θ(n). Since the parse tree has n nodes, it will take time Ω(n) to construct, even nondeterministically. On the other hand, the parse tree may be constructed at a cost of O(1) per node, and thus in time O(n) overall, by selecting the next node to expand (this may be done deterministically, using a stack or queue), guessing which rule to use (since the grammar has size independent of n, this takes nondeterministic time O(1) under our assumptions), and constructing and attaching the children of the node in deterministic time O(1). Updating the stack or queue will also take time O(1) per node.

    d) The worst-case time complexity is Θ(n3). Recall that there are Θ(n2) table locations to fill, and each takes time O(n) to fill. This shows that the time complexity is O(n3); a more careful argument would show that it is Θ(n3).

     

  2. a) Find the set of generating symbols of G, where X is a generating symbols iff (1) X is a terminal, or (2) X → γ, where each symbol of γ is a generating symbol. A variant of breadth-first search can be used to find this (finite) set. L(G) is empty iff S is not a member of this set.

    b) Find the set of states accessible from the start state by means of a path in M's underlying graph. Again, a variant of breadth-first search can be used here. L(M) is empty iff this set does not contain a final state.

    c) There is a simple algorithm that will take a PDA that accepts by empty stack and construct an equivalent CFG. The algorithm of (a) can then be applied to this resulting CFG. But note that L(M), according to the notation of Hopcroft et al, refers to the language accepted by final state. So here one first needs to construct a PDA that is equivalent to M, but that accepts by empty stack. There is an algorithm to do this. Then one can covert this new PDA to a CFG and apply the algorithm of (a).

    d) There is no such algorithm, by Rice's Theorem (or by the special case stated as Theorem 9.10 in Hopcroft et al).

    e) First eliminate all dead states from M (that is, those states from which no final state can be reached), and eliminate all inaccessible states from M. In both cases, this may be done by a variant of breadth-first search. Then L(M) is infinite if and only if there is a cycle in the resulting graph. Here loops (edges from a state to itself) count as cycles. Once again, cycles may be found by a variant of breadth-first search.

  3. a) Let n = |S|. A nondeterministic solution algorithm needs to guess a subset, find the sum of the elements in the subset, and compare this sum to J. The guessing takes time O(1) per element of S, or O(n) in all. The sum can be found in time O(s), where s is the sum of the lengths of the elements of S. Comparing with the target sum can easily be done in time O(s). Since s > n and the instance of the problem has size at least s, the total time is polynomial in the size of the instance.

    b) Since x1 and x2 are to be true and x3 false, the SUBSET-SUM solution should include v1, v2, and w3, giving a total so far of 11132110. The desired target can then be attained only by also including p4, q3, p2 and q2, and p1 and q1. Note that there can be no carrying in any addition of numbers from S.

    c) Here the target J is 1114440 and the elements of S and their weights are:

      v1: 0011100
      v2: 0101010
      v3: 1000100
      w1: 0010010
      w2: 0100100
      w3: 1001010
      p1: 0000010
      p2: 0000100
      p3: 0001000
      q1: 0000020
      q2: 0000200
      q3: 0002000
    d) If I has a solution, we begin by including in the subset vi whenever xi is assigned 1 in the solution, and wi whenever xi is assigned 0. This will give values of exactly 1 for those digit positions corresponding to variables and 0 for the rightmost digit. For the position corresponding to clause j, the value is at least 1 (the existence of a solution ensures that each clauses has a variable that is assigned 1), and at most 3 (since each clause has at most 3 literals). So this position of the sum can be given the desired value of 4 (without affecting the other positions) by including one or both of {pj, qj}.

    If T(I) has a solution, then to get 1's in the leftmost columns of the sum, exactly one of {vi, wi} must be included. So we let xi have value 1 iff it's vi that is included. To get 4's in the appropriate positions of the sum, at least one 1 must come from a vi or xi in the subset (since the p's and q's can contribute at most 3). This corresponds to satisfying each clause of I.

    e) T needs to be shown to have polynomial time complexity.

GRADING SCALE Formal languages and theory of computation: 85 A, 80 A-, 75 B+, 60 B, 55 B-, 50 C+, 35 C