A context-free grammar (CFG) is a set of rules of the form:
VARIABLE ::= PATTERN1 | PATTERN2 | ...
This rule says that the variable on the left side of the
"consists of" sign (::=) can be replaced by any of the patterns on the
right. The patterns are regular expressions that may also contain variables.
For example, the following grammar says that an expression
is one of: two expressions separated by a + or * operator, an expression
surrounded by parentheses, an identifier, or an integer:
EXP ::= EXP \+ EXP | EXP \* EXP | \(EXP\) | ID | INT
ID ::= [a-zA-Z][0-9a-zA-Z]*
INT ::= [1-9][0-9]*
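The ID and INT rules are ordinary regular expressions, so they can be tried out directly. The following is a minimal sketch using Python's re module; the function name classify is hypothetical and not part of the grammar.

```python
import re

# Terminal patterns copied from the ID and INT rules of the grammar.
ID = re.compile(r"[a-zA-Z][0-9a-zA-Z]*")
INT = re.compile(r"[1-9][0-9]*")

def classify(token):
    """Return "ID", "INT", or None for a single token string."""
    if ID.fullmatch(token):
        return "ID"
    if INT.fullmatch(token):
        return "INT"
    return None

for token in ["x", "y2", "42", "007"]:
    print(token, classify(token))
```

Note that INT as written requires a nonzero first digit, so strings such as 007 or 0 match neither pattern.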
Assume G is a grammar and u and v are strings that may
contain variables from G. A production u => v means that v can be obtained
from u by replacing one variable in u with a pattern according to one of the
rules in G.
A derivation u =>* v is a sequence of productions:
u => u1 => u2 => ... => v.
For example, let G be the EXP grammar above. Then the string
x + (42 * y) can be derived from EXP, that is, EXP =>* x + (42 * y). Here's a
derivation:
EXP =>
EXP + EXP =>
EXP + (EXP) =>
EXP + (EXP * EXP) =>
ID + (EXP * EXP) =>
x + (EXP * EXP) =>
x + (INT * EXP) =>
x + (42 * EXP) =>
x + (42 * ID) =>
x + (42 * y)
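Each step of this derivation replaces one occurrence of a variable with a pattern, which can be replayed mechanically. The sketch below assumes plain string replacement; the helper names apply_step and derive are hypothetical, and the concrete strings x, 42, and y stand in for matches of the ID and INT patterns.

```python
def apply_step(u, variable, pattern, occurrence=0):
    """Replace the chosen occurrence (0-based) of `variable` in `u` with `pattern`."""
    start = -1
    for _ in range(occurrence + 1):
        start = u.find(variable, start + 1)
        if start < 0:
            raise ValueError(f"{variable} does not occur often enough in {u!r}")
    return u[:start] + pattern + u[start + len(variable):]

def derive(start, steps):
    """Apply the steps in order and return every intermediate string."""
    forms = [start]
    for variable, pattern, occurrence in steps:
        forms.append(apply_step(forms[-1], variable, pattern, occurrence))
    return forms

# (variable, replacement, which occurrence of the variable to replace)
steps = [
    ("EXP", "EXP + EXP", 0),
    ("EXP", "(EXP)", 1),
    ("EXP", "EXP * EXP", 1),
    ("EXP", "ID", 0),
    ("ID", "x", 0),
    ("EXP", "INT", 0),
    ("INT", "42", 0),
    ("EXP", "ID", 0),
    ("ID", "y", 0),
]

forms = derive("EXP", steps)
print(" =>\n".join(forms))
```

The occurrence index matters because the derivation above does not always replace the leftmost variable: its second step rewrites the second EXP.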
Derivations can be represented as parse trees. The parent nodes of a
parse tree are labeled by variables, and the leaves are variable-free strings.
Concatenated, the children of a parent form a pattern from one of that
variable's rules. For example, here's a tree representation of the previous
derivation:
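One way to write the parse tree for EXP =>* x + (42 * y) down in code is as nested tuples. This is a sketch under an assumed encoding: a parent is (variable, [children]) and a leaf is a plain string; the names tree, leaves, and show are hypothetical.

```python
# A parent is (variable, [children]); a leaf is a plain string.
tree = ("EXP", [
    ("EXP", [("ID", ["x"])]),
    "+",
    ("EXP", [
        "(",
        ("EXP", [
            ("EXP", [("INT", ["42"])]),
            "*",
            ("EXP", [("ID", ["y"])]),
        ]),
        ")",
    ]),
])

def leaves(node):
    """Collect the leaves of the tree from left to right."""
    if isinstance(node, str):
        return [node]
    _, children = node
    result = []
    for child in children:
        result.extend(leaves(child))
    return result

def show(node, depth=0):
    """Return the tree as indented lines, one node per line."""
    if isinstance(node, str):
        return ["  " * depth + node]
    label, children = node
    lines = ["  " * depth + label]
    for child in children:
        lines.extend(show(child, depth + 1))
    return lines

print("\n".join(show(tree)))
print(" ".join(leaves(tree)))
```

Reading the leaves from left to right gives back the derived string, up to spacing.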
Recall that a simplified language processor (compiler or
interpreter) is a pipeline:
In tree form, it's easy to execute or translate a program
using recursion.
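As an illustration of that recursion, here is a minimal sketch of an evaluator for EXP parse trees. It assumes the nested-tuple encoding (variable, [children]) with string leaves, and an environment dictionary env that supplies values for identifiers; both are assumptions made for this example, as is the function name evaluate.

```python
def evaluate(node, env):
    """Recursively compute the value of an EXP parse tree."""
    label, children = node
    if label == "INT":
        return int(children[0])
    if label == "ID":
        return env[children[0]]      # look the identifier up in the environment
    # label == "EXP": decide which rule produced this node
    if len(children) == 1:           # EXP ::= ID | INT
        return evaluate(children[0], env)
    if children[0] == "(":           # EXP ::= (EXP)
        return evaluate(children[1], env)
    left, op, right = children       # EXP ::= EXP + EXP | EXP * EXP
    a = evaluate(left, env)
    b = evaluate(right, env)
    return a + b if op == "+" else a * b

# Parse tree for x + (42 * y)
tree = ("EXP", [
    ("EXP", [("ID", ["x"])]),
    "+",
    ("EXP", [
        "(",
        ("EXP", [
            ("EXP", [("INT", ["42"])]),
            "*",
            ("EXP", [("ID", ["y"])]),
        ]),
        ")",
    ]),
])

print(evaluate(tree, {"x": 1, "y": 2}))
```

Each branch of evaluate corresponds to one pattern of the EXP rule, so the shape of the function follows the shape of the grammar.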
Assume G is a CFG, and assume S is a designated "start" variable. The
language generated by G, written L(G), is:
L(G) = all variable-free strings that can be derived from S
     = {w | S =>* w and w is variable-free}
CFL = the family of context-free languages = {L | L = L(G) for some CFG G}
S ::= 0S0 | 1
e.g., S => 0S0 => 00S00 => 000S000 => 0001000
L(S) = {0^n 1 0^n | 0 <= n}
(n = 0 gives the string 1, which comes directly from the rule S ::= 1.)
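Membership in this language can be tested by running the two patterns of S backwards. The following is a minimal sketch; the function name in_language is hypothetical.

```python
def in_language(w):
    """Return True if w can be derived from S, where S ::= 0S0 | 1."""
    if w == "1":                                      # S ::= 1
        return True
    if len(w) >= 3 and w[0] == "0" and w[-1] == "0":  # S ::= 0S0
        return in_language(w[1:-1])
    return False

for w in ["1", "010", "0001000", "0010", "00", ""]:
    print(repr(w), in_language(w))
```

The string 1 is accepted because it is derived in a single step by the rule S ::= 1. Strings with unequal numbers of 0s on the two sides, such as 0010, are rejected.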
BEXP ::= DISJUNCTION
DISJUNCTION ::= CONJUNCTION~(||~DISJUNCTION)?
CONJUNCTION ::= VALUE~(&&~VALUE)*
VALUE ::= true | false | !VALUE
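A grammar of this shape translates almost directly into a recursive-descent parser, with one method per variable. The sketch below makes several assumptions that the grammar itself does not state: ~ is read as "followed by", whitespace between tokens is ignored, and the operators ||, &&, and ! get their usual Boolean meaning. The names tokenize, Parser, and evaluate are hypothetical.

```python
import re

TOKEN = re.compile(r"\s*(true|false|\|\||&&|!)")

def tokenize(text):
    """Split the input into tokens, skipping whitespace (an assumption)."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected input at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def disjunction(self):
        # DISJUNCTION ::= CONJUNCTION~(||~DISJUNCTION)?
        value = self.conjunction()
        if self.peek() == "||":
            self.pos += 1
            rhs = self.disjunction()   # parse first, then combine
            value = value or rhs
        return value

    def conjunction(self):
        # CONJUNCTION ::= VALUE~(&&~VALUE)*
        value = self.value()
        while self.peek() == "&&":
            self.pos += 1
            rhs = self.value()
            value = value and rhs
        return value

    def value(self):
        # VALUE ::= true | false | !VALUE
        token = self.peek()
        self.pos += 1
        if token == "true":
            return True
        if token == "false":
            return False
        if token == "!":
            return not self.value()
        raise SyntaxError(f"expected true, false, or !, got {token!r}")

def evaluate(text):
    """Parse text as a BEXP and return its Boolean value."""
    parser = Parser(tokenize(text))
    result = parser.disjunction()      # BEXP ::= DISJUNCTION
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return result

print(evaluate("true && false || true"))
print(evaluate("!true || false && true"))
```

Because CONJUNCTION sits below DISJUNCTION in the grammar, && binds more tightly than ||, and the parser inherits that precedence without any extra logic.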