Assignment 4

CS 146
due April 26, 2001
125 points

Implement a template for simulating a B-tree, with at least the following public methods:
  1. a constructor that takes the order of the tree as a parameter
  2. an insert method that takes a key and a record and updates the B-tree so that the record is stored in the tree at the position given by the key
  3. a search method that takes a key and returns the record associated with it. If there is no record for the given key, then return the record that is returned by the zero-argument constructor.
  4. a traverse method that takes an output stream and prints to it each record in the tree, in the order given by the keys of the records.
  5. a draw method that takes an output stream and writes a humanly readable representation of the tree to the stream. This needn't and shouldn't be fancy. I recommend simply writing each key on a different line, in key order, with each line indented a distance proportional to its level in the tree.
  6. a clear method
  7. the two additional methods described below
  8. any destructors, copy constructors, and overloaded assignment operators that are appropriate
You do not need to worry about deletion.

Your template parameters should correspond to types or classes for the key and for the record. The key is to be part of the record, but it will sometimes need to be manipulated separately. The key type may be assumed to have a < operator. The record type may be assumed to have a << operator and a zero-argument constructor. Note that the value returned by this constructor should, when it is printed using <<, give an appropriate error message for unsuccessful search.

B-tree leaves should not store entire records, but only pointers to records (as well as keys). These pointers may be integer indices to a separate, unsorted data structure. These indices are not the same as the keys, which need not be integers. B-tree nonleaves may store keys and links, but not pointers to records. Keys that appear in nonleaves should also appear in leaves (so in this sense, all data is stored in leaves). Any key appearing in a particular position of a nonleaf should be the smallest key that appears in any leaf to the right of that position.

For each B-tree you should maintain a count of

  1. the number of element comparisons, and
  2. the number of simulated disk accesses. Here you should assume that the highest two levels of the B-tree are stored in internal memory, and that references to any other node require a disk access. You should also assume that accessing a full record requires a separate disk access. You need only consider read accesses, and not write accesses.
These counters should be readable by a method
int void report(ostream&)
and simultaneously resettable by a method
void reset()
The draw method need not increment these counters.

Several of the test cases involve processing a moderately large file of data for US counties. This text file consists of a line for each county, with fields for the population rank, an identifier, the county name, the state name, the 1990 population, the 2000 population, the change in population, and the percentage change. This last field is not necessarily an integer. The fields are separated by white space; however the county names have white space internally. To disambiguate, you may assume that any upper-case string of length 2 is a state name, and thus terminates the county name field.

You may assume that this file is well-formed in the sense that there are no blank lines, that all fields are present for every line, that there is no white space between the last field and the end of a line, and that there are no blank lines at the end of the file.

The corresponding record class should have a component for each of the fields. It should have a constructor with an istream& parameter that reads the next record (as well as the zero-argument constructor, and any other appropriate constructors). It should also have 2 getKey methods -- one that returns the string consisting of the state name followed by a single space followed by the county name, and one that returns the string consisting of the county name followed by a single space followed by the state name.

When printing a record, you may print the component in whatever order you want, as long as the components that make up the key are printed in the order in which they appear in the key. Don't worry about the fact that the order of states when sorted by their two letter abbreviations is different than when they are sorted by their full names.

Test your implementations by linking to the file a4.cpp, obtainable from the class web site.

As always, you may introduce any additional methods or classes you find useful.