Grammar class, and a method findFirst for this class, which under the assumptions of this assignment may be used to determine whether a given grammar can be parsed with a recursive-descent parser. For this assignment, you are to use Java.
Your Grammar class is also to have a method AddRule, which takes a string representing a rule, and adds it to the grammar. This class is also to have a constructor that takes a symbol (represented as a string) and interprets it as the grammar's start symbol. If this string is empty, your constructor should throw an IllegalArgumentException. If it contains white space, you may throw such an exception, or simply accept the substring before the first whitespace character.
The format of the input string to the AddRule method is a sequence of symbols, separated by white space. No extensions to BNF should be assumed in the rule format -- not even the vertical bar convention. The first symbol is assumed to be the left-hand side of the rule, so that "S NP VP" would be interpreted as the rule S -> NP VP. If there are fewer than two symbols in the rule, this method is to throw an IllegalArgumentException (with a useful message).
The AddRule method is to have a boolean return value. It should detect and reject an attempt to add a rule to the grammar that is already present in the grammar. In this case the method should return false; otherwise it should return true.
You should assume that a grammar symbol is a terminal symbol unless it appears on the left-hand side of a rule.
The findFirst method is to compute the value of the First function, as described on p. 108 of the Louden text, for every nonterminal symbol of the grammar. It should return its value in an object of class
HashMap<String, TreeSet<String> >. If recursive descent parsing cannot be used for the grammar, you should throw an IllegalArgumentException with an informative message.
Because of the constraints enforced by the AddRule method, your findFirst method may assume that grammars have no epsilon rules, and do not have multiple copies of the same rule. You may also assume -- but needn't enforce -- that grammars not have any useless symbols (that is, you may assume that every symbol is accessible from the start symbol, and derives some string of terminals). These assumptions make the computation of the findFirst method relatively straightforward. In particular, you needn't concern yourself with any Follow function. Also note that under our assumptions, no value of the First function will ever be empty for any grammar symbol.
Since the First function is recursively defined, and since you will be computing it for several different values, you might consider the technique of memoization. In other words, you can enter the recursively computed values into the map as soon as they are computed. If you do this, a call to the function that computes First for a particular argument can check the map initially, to see if the value has already been computed for that argument. Of course this technique won't work if the map is local to this function.
Recall that recursive descent parsing cannot be used with grammars with left recursion. You may assume this fact for this assignment.
You may define any classes and methods that you find useful, in addition to those mentioned above. How you represent grammar rules is up to you.
You are to test your program with the test class A2.java available from the class web site. Note that this class assumes a particular package name. Also note that it prints objects of class
HashMap<String, TreeSet<String> > using default formatting. Please don't try to improve the formatting.
You are to turn in both hard and soft copies of all of your code (but not A2.java, unless you modify it) as well as a hard copy of the results of your testing. The soft copy should be sent by email or turned in on diskette.