San Jose State University


Working in Mars Mission Control, JPL

Ronald Mak

Department of Computer Science
Fall Semester 2014

Office hours: MW: 6:00-7:00 PM
Office location: MacQuarrie Hall, room 413
Mission Control, Jet Propulsion Laboratory (JPL)
NASA Mars Exploration Rover Mission


CS 153/SE 153: Concepts of Compiler Design

MW 9:00-10:15 AM, SCI 311


#  Assigned  Due      Assignment
1  Aug 25    Sept 5   Write a Pascal program
                      Input file: employees.txt
2  Sept 8    Sept 17  Java scanner
                      Input file:
3  Sept 17   Oct 1    Pascal set expressions and assignments
                      Input files: sets.txt    seterrors.txt
4  Oct 1     Oct 17   Pascal set type definitions and set variable declarations
                      Input files: input.txt    errors.txt
5  Oct 22    Oct 31   Generate a Java scanner with JavaCC
                      Input files: input.txt
6  Oct 29    Nov 10   Generate a parser with JavaCC
7  Nov 10    Nov 21   Generate parse trees with JJTree


Date Content
Aug 25 Slides: Goals; compiler as translator; definitions; overview; conceptual designs; parser; scanner; token; front end; intermediate tier; backend; intermediate code; symbol table; code generator; compiler project; grading
Aug 27 Slides: Interpreters; compiler vs. interpreter; key steps for success; Java packages; class relationships; abstract parser and scanner classes; token and source classes; currentChar() vs. nextChar(); messages

Sample Pascal programs:
newton.pas   newton.j (Jasmin assembly language object program)
wolfisland.pas (Wolf Island input data)
Sept 3 Slides: Initial Pascal-specific parser and scanner; token class; factory classes; initial back end classes; first end-to-end test; Pascal tokens; Pascal token classes; syntax diagrams; how to scan for tokens; basic scanning algorithm
Sept 8 Slides: Pascal scanner and token classes: word, string, and number; syntax error handling; Pascal tokenizer; Assignment #2; symbol table conceptual design; symbol table interfaces; symbol table factory class; symbol table and symbol table stack implementation
Sept 10 Slides: Symbol table entries; what to store in each entry; cross-reference listing; statement syntax diagrams; parse tree conceptual design; syntax diagrams: Pascal statements and expressions; parse tree: conceptual design, basic operations, implementation, building; Pascal precedence rules; example expression decomposition; parsing expressions
Sept 15 Slides: Parsing expressions; printing parse trees; temporary hacks; statement and expression executor classes; executing compound statements, assignment statements, and expressions; runtime error handling; Simple Interpreter I; Pascal control statements; statement parser classes; parsing REPEAT and WHILE statements; synchronization and error recovery; Pascal Syntax Checker II
Sept 17 Slides: Assignment #3; parsing FOR, IF, and CASE statements; top-down recursive descent parsing; syntax and semantics; executing REPEAT, WHILE, FOR, and IF statements
Sept 22 Slides: Executing CASE statements; jump tables and jump caches; multipass compilers; parsing declarations; syntax diagrams for Pascal declarations; constant definitions; type definitions: simple, array, and record; variable declarations; declarations and the symbol table; scope and the symbol table stack

An article about the FORTRAN compiler for the IBM 1401 computer system. The compiler made 63 passes and ran in 8K of memory; each pass had at most 300 instructions! Assembly language source code of the compiler. Just read the comments for enlightenment.
Sept 24 Slides: Type specification interfaces and attributes; string types; type factory; how identifiers are defined; predefined identifiers and types; BlockParser; DeclarationsParser; ConstantDefinitionsParser; type definition structures
Sept 29 Slides: TypeDefinitionsParser; TypeSpecificationParser; SimpleTypeParser; SubrangeTypeParser; EnumerationTypeParser; RecordTypeParser; Pascal multidimensional arrays; ArrayTypeParser; VariableDeclarationsParser
Oct 1 Slides: Type checking; assignment compatible; comparison compatible; variables with subscripts and fields; adding type information to parse tree nodes; Assignment #4; program header; procedure and function declarations; nested scopes and the symbol table stack
Oct 6 Slides: Parsing programs, procedures, and functions; parsing calls to procedures and functions; class Predefined; formal and actual parameters; parsing calls to the standard routines; compile time vs. run time; runtime memory management; runtime stack; runtime activation records; accessing nonlocal values
Oct 8 Slides: Runtime display; allocating and initializing activation records; passing parameters during a call; memory management classes; executing procedure and function calls; runtime error checking; executing entire Pascal programs
Oct 13 Slides: Review for the midterm
Oct 20 Slides: Midterm solutions; minimum acceptable compiler project; deterministic finite automata (DFA); state transition matrix; DFA for Pascal identifiers and numbers; a simple DFA scanner

DFA scanner:    SimpleDFAInput.txt
Oct 22 Slides: BNF; Extended BNF (EBNF); grammars and languages; derivations and productions; compiler-compilers; JavaCC; JavaCC regular expressions; "hello world" examples of .jj files; JavaCC Eclipse plug-in; compiler project

Example .jj files:
helloworld1.jj   helloworld2.jj   helloworld3.jj   helloworld4.jj   helloworld5.jj
Oct 27 Slides: JavaCC parser specification; production rule methods; choice conflicts; left factoring; backtracking; lookahead; left recursion; right recursion; iteration; JJDoc

Example .jj files:
phone.jj   phone_method_param.jj   phone_choice.jj   phone_left_factored.jj   phone_lookahead.jj   expression_left_recursion.jj   expression_iteration.jj   expression_right_recursion.jj
Oct 29 Slides: JJDoc command line; Assignment #6; JJTree; calculator example; visitor design pattern; node names; node descriptors; tree shaping; Pcl

Example .jj and .jjt files:
calculator_simple.jj   calculator_tree.jjt   calculator_tree_image.jjt   calculator_calculating.jjt   calculator_visitor.jjt   node_name_change.jjt   node_void.jjt   calculator_tree_shape_1.jjt   calculator_tree_shape_2.jjt

Nov 3 Slides: JavaCC grammar repository; syntax error handling with JavaCC; token errors; synchronize the parser; repair the parse tree; target machines; JVM architecture; runtime stack and stack frame; JVM instructions; Jasmin assembler; example Jasmin program

Sample .jj .jjt and .j files:
logo_tokenizer.jj   logo_skip_chars.jj   logo_synchronize.jj   logo_tree_recover.jjt   hello.j
Nov 5 Slides: Jasmin assembly instructions; using Java to test Jasmin code; code templates; compilation strategy; Jasmin type descriptors; Pascal program, program fields, main program; loading and storing program variables; Pcl2 code generation; code for procedures and functions

Jasmin multiply engine: MultiplyEngine.j
Nov 10 Slides: Local variables; expressions; javap disassembler; Jasmin instructions for comparisons; code templates for relational expressions, assignment statements, if statements, looping statements, and select statements; Jasmin code for procedures and functions
Nov 12 Slides: Code for System.out.println(), String.format(), scalar arrays, and subscripted variables
Nov 17 Slides: Code for non-scalar arrays; code for records and fields
Nov 19 Slides: JavaCC error handling with synchronization; Pascal runtime library; passing parameters by value and by reference; wrapping scalars; cloning arrays; compiling Wolf Island and Hilbert; compiled code vs. interpreted code speed comparison

Parameter passing: parmswrap.pas   parmsclone.pas   parmsbad.pas
Performance tester: hilbert2.pas
Pcl with error handling:
Nov 24 Slides: Code generation; instruction selection; register allocation; data flow analysis; instruction scheduling; code optimization: safety, profitability, constant folding, constant propagation, strength reduction, dead code elimination, loop unrolling, common subexpression elimination; debugging and optimizing compilers; compiling object-oriented languages

Nov 26 Slides: Static and dynamic scoping; runtime memory management; Java command-line arguments; heap management; garbage collection algorithms: reference counting, mark and sweep, stop and copy, generational; aggressive heap management; garbage collection research
Dec 1 Slides: Team presentations
Dec 3 Slides: Team presentations
Dec 8 Slides: Context-free and context-sensitive grammars; top-down parsers; bottom-up parsers; shift-reduce bottom-up parser; yacc and lex; simple calculator; LL and LR parsers

Example yacc and lex files (Simple Calculator): calc.y   calc.l
Dec 10 Slides: Review for the final examination; source-level debugger; machine-level debugger; debugger architecture; back end messages; IDE; interprocess communication


First goal: You'll learn the concepts of writing compilers and interpreters for high-level programming languages. You'll write your compilers and interpreters in Java.

In the first half of the course, you'll learn about the front end of a compiler or interpreter (the scanner and the parser) and its middle tier (symbol tables and intermediate code in the form of parse trees). You'll also learn how an interpreter executes source programs in its back end. You'll understand and modify the Java code of a working interpreter. The interpreter will execute source programs written in the Pascal programming language.
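
To make the three tiers concrete, here is a minimal sketch in Java of a scanner, a top-down recursive descent parser that builds a parse tree, and a back end that executes the tree. It handles only integer arithmetic expressions, not Pascal, and it is not the course's interpreter code: the class and method names (MiniInterpreter, scan, Parser, Node, evaluate) are hypothetical and chosen only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not the interpreter used in the course. */
class MiniInterpreter {
    /** Intermediate code: a parse tree node that the back end can execute. */
    interface Node { int execute(); }

    /** Front end, part 1: the scanner breaks source text into tokens. */
    static List<String> scan(String source) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < source.length()) {
            char ch = source.charAt(i);
            if (Character.isWhitespace(ch)) {
                i++;                                   // skip white space
            } else if (Character.isDigit(ch)) {
                int start = i;                         // number token
                while (i < source.length() && Character.isDigit(source.charAt(i))) i++;
                tokens.add(source.substring(start, i));
            } else if ("+-*/()".indexOf(ch) >= 0) {
                tokens.add(String.valueOf(ch));        // one-character special symbol
                i++;
            } else {
                throw new IllegalArgumentException("Unexpected character: " + ch);
            }
        }
        return tokens;
    }

    /** Front end, part 2: a recursive descent parser that builds the parse tree. */
    static class Parser {
        private final List<String> tokens;
        private int pos = 0;

        Parser(List<String> tokens) { this.tokens = tokens; }

        private String peek() { return pos < tokens.size() ? tokens.get(pos) : ""; }

        Node parse() {
            Node root = expression();
            if (pos != tokens.size()) {
                throw new IllegalArgumentException("Unexpected token: " + peek());
            }
            return root;
        }

        // expression ::= term { ("+" | "-") term }
        private Node expression() {
            Node left = term();
            while (peek().equals("+") || peek().equals("-")) {
                String op = tokens.get(pos++);
                Node l = left, r = term();
                left = op.equals("+") ? () -> l.execute() + r.execute()
                                      : () -> l.execute() - r.execute();
            }
            return left;
        }

        // term ::= factor { ("*" | "/") factor }
        private Node term() {
            Node left = factor();
            while (peek().equals("*") || peek().equals("/")) {
                String op = tokens.get(pos++);
                Node l = left, r = factor();
                left = op.equals("*") ? () -> l.execute() * r.execute()
                                      : () -> l.execute() / r.execute();
            }
            return left;
        }

        // factor ::= number | "(" expression ")"
        private Node factor() {
            String token = peek();
            if (token.equals("(")) {
                pos++;
                Node inner = expression();
                if (!peek().equals(")")) throw new IllegalArgumentException("Missing )");
                pos++;
                return inner;
            }
            if (token.isEmpty() || !Character.isDigit(token.charAt(0))) {
                throw new IllegalArgumentException("Expected a number, found: " + token);
            }
            pos++;
            int value = Integer.parseInt(token);
            return () -> value;
        }
    }

    /** Back end: execute the parse tree produced by the front end. */
    static int evaluate(String source) {
        return new Parser(scan(source)).parse().execute();
    }

    public static void main(String[] args) {
        System.out.println(evaluate("2 + 3 * (4 - 1)"));  // prints 11
    }
}
```

Each grammar rule in the comments maps to one parsing method, which is the essence of recursive descent. A real interpreter would also need a symbol table for variables and syntax error recovery, both of which this sketch leaves out.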

In the second half of the course, you'll write your own compiler using JavaCC, the Java compiler-compiler. Your compiler will translate programs written in the high-level language of your choice — or perhaps in a language that you invent — first to Jasmin assembly language and then to .class files that will execute directly on the Java Virtual Machine (JVM).
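
As a rough idea of what translating to Jasmin involves, the sketch below walks a small expression tree in postorder and emits instructions for the JVM's operand stack. The mnemonics (ldc, iadd, isub, imul, idiv) are standard JVM instructions that Jasmin accepts; everything else, including the class name JasminSketch and its Node type, is hypothetical and is not the code generator you'll build with JavaCC. A complete Jasmin file would also need class and method directives, which are omitted here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; emits Jasmin instruction text for an expression. */
class JasminSketch {
    /** Parse tree node: an integer constant (op == null) or a binary operation. */
    static class Node {
        final String op;
        final int value;
        final Node left, right;

        Node(int value) {
            this.op = null; this.value = value; this.left = null; this.right = null;
        }

        Node(String op, Node left, Node right) {
            this.op = op; this.value = 0; this.left = left; this.right = right;
        }
    }

    /** Postorder walk: emit code for both operands, then the operator. */
    static void generate(Node node, List<String> code) {
        if (node.op == null) {
            code.add("ldc " + node.value);   // push the constant onto the operand stack
            return;
        }
        generate(node.left, code);
        generate(node.right, code);
        switch (node.op) {
            case "+": code.add("iadd"); break;
            case "-": code.add("isub"); break;
            case "*": code.add("imul"); break;
            case "/": code.add("idiv"); break;
            default: throw new IllegalArgumentException("Unknown operator: " + node.op);
        }
    }

    static List<String> compile(Node root) {
        List<String> code = new ArrayList<>();
        generate(root, code);
        return code;
    }

    public static void main(String[] args) {
        // Tree for 2 + 3 * 4
        Node tree = new Node("+", new Node(2), new Node("*", new Node(3), new Node(4)));
        for (String instruction : compile(tree)) {
            System.out.println(instruction);
        }
    }
}
```

Because the JVM is a stack machine, the operands are pushed first and the operator comes last, so the output for 2 + 3 * 4 reads ldc 2, ldc 3, ldc 4, imul, iadd.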

Compilers and interpreters are complex programs, and writing them successfully is hard work. To tackle the complexity, we'll take a strong software engineering approach in this course. We'll use design patterns, Unified Modeling Language (UML) diagrams, XML, and other modern object-oriented design practices to make the code understandable and manageable.

Second goal: You'll learn critical job skills that employers look for in new college hires:


Prerequisites

CS 47 Introduction to Computer Organization or CMPE 102 Fundamentals of Embedded Software: grade C- or better
CS 146 Data Structures and Algorithms: grade C- or better
CS 154 Formal Languages and Computability: grade C- or better

Department policy is to enforce all course prerequisites strictly, especially for a deep course.

Required books

Writing Compilers and Interpreters, 3rd edition
Ronald Mak
Wiley Publishing, Inc., 2009
ISBN: 978-0-470-17707-5
Source files from each chapter
Generating Parsers with JavaCC, 2nd edition (PDF)
Tom Copeland
Centennial Books, 2009
ISBN: 0-9762214-3-8

Recommended books

Programming for the Java Virtual Machine
Joshua Engel
Addison-Wesley Professional
ISBN: 0201309726
The Essentials of Pascal I
Gary W. Wester
Research and Education Association
ISBN: 0-87891-694-6
The Essentials of Pascal II
Gary W. Wester
Research and Education Association
ISBN: 0-87891-718-7

Online Pascal tutorials

Learn Pascal: Looks good, although it doesn't appear to cover set types.
Pascal Programming: Ignore Lesson 3. Also doesn't cover set types.
Pascal - Sets: A tutorial on Pascal set types.