The first is to construct an NFA for the regular expression, perhaps with ε transitions. Of the many possible NFAs, one without ε transitions has the transition function given below. Converting this to a DFA gives a DFA isomorphic to the given one.
δ | q0 q1 q2 q3
-------------------------------------
0 | {} {q0} {q3} {}
1 | {q1, q2} {} {} {q0}
A second strategy is to find a regular expression for the DFA. Again
there are several ways to do this, and some algebraic manipulation may be
needed afterwards to get the expression into the correct form.
If we let A, B, C, and D, be the sets of strings that go respectively to q0, q1, q2, and q3, then the DFA gives equations
A = ε B = A1 + D1 C = B0 + D0 D = C1By eliminating A and B and substituting, we get
C = 10 + D10 + D0 = 10 + C110 + C10 D = 101 + D101 + D01So C = 10(110 + 10)* = 1(011 + 01)*0 = (101 + 10)*10, and
A third strategy is to argue directly. If r is the given regular expression, we can show by induction that L(r) is a subset of L(M). Clearly ε is in L(M). And inductively, if x is in L(M), then x must be taken to q0, q2, or q3. But in each case, x10 goes to q2 and x101 goes to q3, so both x10 and x101 are in L(M).
To show that L(M) is a subset of L(r), we may use the terminology of the previous strategy, and strong induction on the length of x. If x has length 0, then x = ε, so x is in L(r). This is the only case where x is in A.
So if x in L(M) has nonzero length, x is in C or D. If x is in C, then inspection of M shows that x is of the form y10, where y is in A or C or D. By induction y is in L(r), so x must be as well. A similar argument handles the case where x is in D (since x must be of the form y101 where y is in A or C or D).
3SAT, which is decidable.
Let w be a string derived from any variable A of the CFG.
case
1:
if w has a + that is outside of parentheses, then start with the
rule E -> E+T,
where the + matches the rightmost + that is
outside of parentheses
case 2:
if w has a * but no + that is
outside of parentheses, then start with the rule E->T or T->T*F
(depending on whether A=E or A=T)
where the * matches the
rightmost * that is outside of parentheses
case 3:
if w begins
with a left parenthesis and ends with a right one, then start with the
rule
E -> T or T -> F or F -> (E), depending on whether A =
E or T or F
case 4:
if w=x, then start with the rule
E
-> T or T -> F or F -> x, depending on whether A = E or T or
F
case 5:
if w=y, then start with the rule
E -> T or
T -> F or F -> y, depending on whether A = E or T or F
We may show by induction on the length of w that this is the only possible parsing strategy, and thus that the parse tree is completely determined by w. First note that the cases are disjoint -- in particular, in cases 3 and 4 and 5 there is no * or + outside of parentheses.
We next observe that F cannot derive a string with a + or * that occurs outside of parentheses. An easy induction on the length of w shows that if T derives w, then w can have no + outside of parentheses. Finally, note that no variable can derive a string with a different number of left parentheses or right parentheses.
So in case 1, w can be derived only from E. The initial use of the rule E -> T would not allow matching of the + that is outside of parentheses, so we must begin with the rule E -> E + T. The + in the rule must in fact match the rightmost + outside of parentheses, since a string containing this + would have to be derivable from T. Since w is derivable, the portion of w to the left of this + has a parse tree rooted at E, and the portion to the right has a parse tree rooted at T. By induction, these parse trees are unique.
In case 2, w cannot be derived from F, and the rule E -> E+T cannot be used initially (since the + would have to be outside of parentheses). So a derivation from T must begin with T*F. A derivation of w from E must begin, E -> T, and must then continue with T -> T*F. In either case, as in case 1 this * must match the rightmost * outside of parentheses, and the derivations of the substrings to the left and right from T and F are unique by induction.
In case 3, we cannot use the rules E -> E+T or T -> T*F initially, since these would give a + or * outside of parentheses. Nor can we use the rules F -> x or F -> y. So a derivation from E must start with E => T => F => (E), a derivation from T must start with T => F => (E), and a derivation from F must start with F => (E). In each of these subcases, the derivation from the E inside parentheses is unique by induction. The argument for cases 4 and 5 is similar.
Note that whenever multiple steps of a derivation are given in the first three paragraph, every step is consistent with the given strategy. Also note that all of these derivations are leftmost derivations, so that they uniquely determine a parse tree. Since we have handled all 5 cases, the proof is complete.