Assignment #5

CS 152
due December 7, 2006
100 points, plus 20 points of extra credit

Final version

In this exercise you are to write a somewhat more realistic parser and interpreter than for Assignment 1, for a language L(G) generated by a given context-free grammar G. For this assignment, you may use Java or Scheme.

Specifically, you are to define an integer-valued function evaluate, which takes a string input and return its value if it is a well-formed string. In Java, this function should be a public method of an Interpreter class. If there is a syntactic or semantic error, the method should throw an InputMismatchException whose message is an informative error message. In Scheme, the function should deal with erroneous input by returning a string representing an informative error message. In the ill-formed case, the method (or function) needn't try to find all of the errors or correct any errors; reporting one error is sufficient.

Note that in Scheme, much of your code will be taken up with propagation of errors, since we have not discussed the Scheme analog of Java exceptions.

Your evaluate function is to use recursive descent parsing and an explicit dynamic scoping environment with a stack of frames. The representation of parsed program consituents, including terminal symbols, is up to you, although I recommend that you use syntax trees for expressions. I also recommend the use of a scanner or tokenzier (a tokenization function for Scheme is given in Assignment 4 of the Fall 2005 version of CS 152, on my web site). The representation of frames, as well as of parsed definitions, is up to you; it of course should be easy to construct a frame for a function from its parsed definition. It's also up to you whether and how you explicitly represent the predefined functions + and *.

The grammar G has the productions given below. The symbols |, {, and } are metasymbols of the grammar, while +, *, (, and ) are terminal symbols. The start symbol is <program>. The grammar may be parsed by recursive descent.

  <program> ::= {<def>} <expr> 
  <expr> ::= <id> | <number> | ( <fname> <expr> <expr> )
  <def> ::= define <id> ( <id> <id> ) {<def>} <expr> end
  <fname> ::= <id> | + | *
As in Assignments 1 and 4, you may assume that the tokens of a program in L(G) are separated by those whitespace characters that are recognized as such by the programming language you use.

Note that the grammar gives no rules for the symbols <id> and <number>. Normally identifiers and numbers would be recognized by a scanner. Your recursive descent parser needn't contain recognizers for these symbols. You should assume that identifiers consist of one or more letters and numbers consist of one or more digits. How you enforce this is up to you. You may assume that identifiers are case-senstiive.

The intended interpretation of the rules and their corresponding program constructs is as follows:

You may define any classes and methods that you find useful, in addition to those mentioned above. You may use any part or parts of my solutions to any of the earlier assignments, if you give credit.

If you use Java, you are to test your program with the test class A5.java, available from the class web site. Note that this class assumes a particular package name. If you use Scheme, you are to test with the test function of the file a5-test.scm. In any case, you are to turn in both hard and soft copies of all of your code (but not the test "program", unless you modify it) as well as a hard copy of the results of your testing. The soft copy should be sent by email or turned in on diskette. Submissions that extend over multiple files are to be zipped, single files should not. Scheme submissions should consist of a single file. Single-file submissions are to use the naming convention of Assignment 4.

For this assignment, late submissions will be accepted only until 5 p.m. on December 8. Note that the department office will be closed that afternoon. I expect to be in my office until that time, but if you don't find me there, please leave your hard copies under my office door.

Extra credit

Do the assignment using static scoping rather than dynamic scoping. Note that in Java you might want to use frames with their own link field(s), rather than relying on the Java LinkedList class.