CS 153: Concepts of Compiler Design

 

Fall Semester 2009

Department of Computer Science
San Jose State University

Instructor: Ron Mak

Assignment #5

Assigned: Thursday, October 22

Due: Thursday, October 29 at 11:59 pm

 

Team assignment, 100 points max

Generate a C scanner with JavaCC

Use JavaCC to generate a tokenizer (i.e., scanner) that recognizes the tokens of the simplified C programming language. Refer to Assignment #2 for the tokens your tokenizer must recognize. (NOTE: Add >= to the list of special symbol tokens.)

 

For this assignment, you do not need to recognize the special meaning of the \ character in character and string tokens. Therefore:

 

  • A character token is a single character (only) surrounded by single quotes (').
  • The \ character has no special meaning inside a string token.
  • A string token cannot span multiple lines. (However, a comment surrounded by /* and */ can still span multiple lines.)
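
As a rough illustration of the three simplifications above, the following Java sketch expresses them as ordinary `java.util.regex` patterns. It is not JavaCC code; in a .jj file the same ideas would be written as regular-expression productions. The class and method names are hypothetical, and the choice to exclude a single quote from the character inside a character token is an assumption, not something the assignment states.

```java
import java.util.regex.Pattern;

// Hypothetical helper: the class, field, and method names are illustrative only.
final class SimpleLiteralRules {
    // One character between single quotes. Assumption: the enclosed character
    // may be anything except a single quote or a line break, so '\' is valid.
    static final Pattern CHARACTER = Pattern.compile("'[^'\\r\\n]'");

    // Zero or more characters between double quotes on a single line.
    // A backslash is ordinary text, so it cannot escape the closing quote.
    static final Pattern STRING = Pattern.compile("\"[^\"\\r\\n]*\"");

    static boolean isCharacterToken(String text) {
        return CHARACTER.matcher(text).matches();
    }

    static boolean isStringToken(String text) {
        return STRING.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(isCharacterToken("'\\'"));            // the three characters '\'
        System.out.println(isCharacterToken("'ab'"));            // two characters inside the quotes
        System.out.println(isStringToken("\"It's Friday!\""));   // single quote inside a string
        System.out.println(isStringToken("\"two\nlines\""));     // string broken across lines
    }
}
```

Under these assumptions the first and third calls report a match and the second and fourth do not, which mirrors the `'\'` and `"It's Friday!"` cases in the test input.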

 

The purpose of this assignment is to become familiar with JavaCC. Therefore:

 

  • The output does not need to be pretty, just readable. For each recognized token, print one line that names the token type (e.g., identifier, reserved word auto, colon-equals) followed by the token string. You do not have to compute the value of number tokens.
  • Recognition of reserved words should be case sensitive.
  • The tokenizer should skip over comments.
  • You may assume that the input file contains no errors.
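
The requirements above (one line per token, case-sensitive reserved words, skipped comments) can be pictured with a small hand-written Java sketch. It is only a stand-in to show the expected behavior, not the JavaCC tokenizer the assignment asks for; in a .jj file the skipping would typically live in a SKIP section and the words in a TOKEN section. Everything named here is an assumption: the class `TokenLineSketch`, the `describe` method, the `type: text` output format, and the three-word reserved list, which is a placeholder for the real list in Assignment #2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: names and output format are illustrative only.
final class TokenLineSketch {
    // Placeholder subset; the real reserved words come from Assignment #2.
    static final Set<String> RESERVED =
            new HashSet<>(Arrays.asList("auto", "int", "return"));

    // Alternatives are tried in order at each position: comments and white
    // space first, then words, then any other single character.
    static final Pattern LEXEME = Pattern.compile(
            "(?<skip>/\\*[\\s\\S]*?\\*/|//[^\\n]*|\\s+)"
          + "|(?<word>[A-Za-z_][A-Za-z0-9_]*)"
          + "|(?<other>\\S)");

    // Returns one descriptive line per token; skipped text produces no line.
    static List<String> describe(String source) {
        List<String> lines = new ArrayList<>();
        Matcher m = LEXEME.matcher(source);
        while (m.find()) {
            if (m.group("skip") != null) {
                continue; // comments and white space are discarded
            }
            String word = m.group("word");
            if (word != null) {
                // Exact-case lookup, so "Auto" and "AUTO" are plain identifiers.
                String type = RESERVED.contains(word) ? "reserved word" : "identifier";
                lines.add(type + ": " + word);
            } else {
                lines.add("other: " + m.group("other"));
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        String sample = "/* This is /* not a nested comment. */\nauto Auto AUTO // tail";
        for (String line : describe(sample)) {
            System.out.println(line);
        }
    }
}
```

The block comment pattern stops at the first `*/`, so an inner `/*` does not start a nested comment, which matches the behavior the test input expects.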

Input file testdata.txt

You must run your program with the following input file. (There should be no invisible control characters other than blanks and line feeds.)

 

/* This is a comment. */

// So is this.

 

/* Here's a comment

   that spans several

   source lines. */

 

Two/*comments in*//***a row***/ here.

/* This is /* not a nested comment. */

// Nor is /* this */ one.

 

{ Not a comment. }

 

// Word tokens

Hello world

auto Auto AUTO AuTo

What?

 

// Character tokens

'x' 'A' '\' 'b'

 

// String tokens

"Hello, world."

"It's Friday!"

""

 

// Special symbol tokens

+ - * / := . , ; : = <> < <= >= > ( ) [ ] { } } ^ ..

<<=  >>=

:=<>=<==>>===

 

// Number tokens

0 1 12.0 00000000000000000012 .12 12.

012 0128 0x12

12. 0.12 .12 1.2e+2 12e+2 12e2 0e2

12e-2 12e-5 .12e+2 12.e+2 12e-1 12e9

.31415926 3.1415926

0.00031415926E4  0.00031415926e+00004  31415.926e-4

3141592600000000000000000000000e-30

 

"That's all, folks!"

What to turn in

This is a team assignment. Each team turns in one assignment, and every team member receives the same score.

 

  • Your .jj source file. (Do not turn in the generated .java files.)
  • Any extra input files to show off special features.
  • Text files containing the output from running your program.

 

E-mail the files as attachments in a message to mak@cs.sjsu.edu. In the subject line, include the name of your team and ASSIGNMENT #5. In your e-mail message, you may explain any assumptions you made or point out anything especially clever about your program.