Chomsky Normal Form




CS154

Chris Pollett

Mar 18, 2020

Outline

Methods for Transforming Grammars

More Methods of Transforming Grammars

Removing `epsilon`-rules/productions

Eliminate Unit Productions

Chomsky Normal Form

Conversion to Chomsky Normal Form (Chomsky 1959)

Theorem. Any CFL `L` can be generated by a CFG in Chomsky Normal Form.

Proof. Let `G` be a CFG for `L` with start variable `S`. First, we add a new start variable `S_0` and the rule `S_0 -> S`. This guarantees that the start variable does not occur on the RHS of any rule.

Second, we remove every `epsilon`-rule `A -> epsilon` where `A` is not the start variable. After removing such a rule, for each occurrence of `A` on the RHS of a rule, say `R -> uAv`, we add the rule `R -> uv`. We do this for every combination of occurrences of `A`, so for `R -> uAvAw` we would add the rules `R -> uvAw`, `R -> uAvw`, and `R -> uvw`. If we had the rule `R -> A`, we add the rule `R -> epsilon` unless we previously removed the rule `R -> epsilon`. This new rule will be removed in turn when we perform these steps for the variable `R`. We cycle over the variables, repeating these steps until the only `epsilon`-rule that may remain is one for the start variable.

Third, we handle unit rules `A -> B`. We delete the rule and then, for each rule of the form `B -> u`, we add the rule `A -> u`, unless this is a unit rule that was previously removed. We repeat until all unit rules have been eliminated.

Finally, we convert all the remaining rules to the proper form. For any rule `A -> u_1u_2 ldots u_k` where `k geq 3` and where each `u_i` is a variable or a terminal symbol, we replace the rule with `A -> u_1A_1`, `A_1 -> u_2A_2`, `ldots`, `A_(k-2) -> u_(k-1)u_k`, where the `A_i` are new variables. For any rule with `k=2`, including the rules just created, we replace each terminal `u_i` with a new variable `U_i` and add the rule `U_i -> u_i`.
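The steps of the proof above can be sketched in Python. This is a minimal illustration under stated assumptions, not code from the course: the function name `to_cnf`, the dictionary-of-sets representation, and the generated names (`S_0`, names like `A_1`, and `U_a` for a terminal `a`) are all hypothetical and are assumed not to clash with symbols already in the grammar. Every variable is assumed to appear as a key of the dictionary; any other symbol is treated as a terminal. The sketch replaces terminals while it splits long rules, which gives the same result as doing the two sub-steps one after the other. The sample grammar at the end is only an illustration.

```python
from itertools import combinations

def to_cnf(rules, start):
    """rules: dict variable -> set of RHS tuples; () stands for epsilon.
    Symbols that are not keys of rules are treated as terminals."""
    g = {a: set(rhss) for a, rhss in rules.items()}

    # Step 1: new start variable (the name start + "_0" is assumed unused).
    s0 = start + "_0"
    g[s0] = {(start,)}

    # Step 2: remove A -> epsilon for every variable A other than s0.
    gone_eps = set()
    while True:
        a = next((v for v in g if v != s0 and () in g[v]), None)
        if a is None:
            break
        g[a].discard(())
        gone_eps.add(a)
        for r in g:
            for rhs in list(g[r]):
                where = [i for i, x in enumerate(rhs) if x == a]
                # Drop every non-empty subset of the occurrences of a.
                for n in range(1, len(where) + 1):
                    for drop in combinations(where, n):
                        new = tuple(x for i, x in enumerate(rhs) if i not in drop)
                        # Do not re-add an epsilon-rule removed earlier.
                        if new or r not in gone_eps:
                            g[r].add(new)

    # Step 3: remove unit rules A -> B.
    gone_unit = set()
    while True:
        unit = next(((a, rhs[0]) for a in g for rhs in g[a]
                     if len(rhs) == 1 and rhs[0] in g), None)
        if unit is None:
            break
        a, b = unit
        g[a].discard((b,))
        gone_unit.add((a, b))
        for rhs in list(g[b]):
            is_unit = len(rhs) == 1 and rhs[0] in g
            # Do not re-add a unit rule removed earlier.
            if not (is_unit and (a, rhs[0]) in gone_unit):
                g[a].add(rhs)

    # Step 4: split long rules and replace terminals in rules of length 2.
    variables = set(g)

    def as_var(x):
        # A terminal t becomes the (assumed fresh) variable U_t with U_t -> t.
        if x in variables:
            return x
        g.setdefault("U_" + x, set()).add((x,))
        return "U_" + x

    count = 0
    for a in sorted(variables):
        for rhs in sorted(r for r in g[a] if len(r) >= 2):
            g[a].discard(rhs)
            head, syms = a, rhs
            while len(syms) > 2:
                count += 1
                nxt = f"{a}_{count}"  # assumed fresh helper variable
                variables.add(nxt)
                g.setdefault(head, set()).add((as_var(syms[0]), nxt))
                head, syms = nxt, syms[1:]
            g.setdefault(head, set()).add((as_var(syms[0]), as_var(syms[1])))
    return s0, g

# Illustrative grammar: S -> ASA | aB, A -> B | S, B -> b | epsilon
example = {
    "S": {("A", "S", "A"), ("a", "B")},
    "A": {("B",), ("S",)},
    "B": {("b",), ()},
}
start, cnf = to_cnf(example, "S")
for a in sorted(cnf):
    print(a, "->", " | ".join(" ".join(r) for r in sorted(cnf[a])))
```

Each long rule gets its own chain of helper variables, as in the proof, so the result is not minimal: helper variables with identical rules could be merged by hand afterwards.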

Example

In-Class Exercise

Convert the CFG with production rules: `S->SS|(S)|a` to Chomsky Normal Form.

Post your solution to the March 18 In-Class Exercise Thread.