3. Creating New Classes from Old Classes

Links and Association

In UML the relationship:

Every instance of class A uses/has n instances of class B

is shown by an association arrow pointing from A to B. The B-end of the arrow is labeled by n. Additionally, the arrow might be labeled by the name of the relationship.

For example, if every instance of A uses three instances of B, then we would draw:

The endpoints of an association are called roles. The A-role of an association arrow can be labeled by the number of instances of A that use a single instance of B. For example, if every instance of B is used by two instances of A, then we would draw:

Of course the user may become the used and vice versa. If this can happen, we can make our association arrow bi-directional. For example, if instances of B can use the instances of A that use them, then we would draw:

In an object diagram we can represent the uses relationship using link arrows connecting objects. For example, we would expect every instance of B to have two links pointing to the instances of A it uses. We would expect every instance of A to have three links pointing to the three instances of B it uses:

Relational Modeling

Entity-Relationship models focus on the important relationships in the application domain. Almost all of the important features of a domain relationship can be captured in UML by an association.

For example, an on-line world atlas might represent the relationship "City X is in Country Y" and the relationship "City X is capital of Country Y" as two associations between the class City and the class Country. In UML an association is a line segment connecting the icons of the related classes:

In this example we have labeled our associations with names and directions. This is the only way to distinguish an association from its inverse association in UML, although in most of our examples the distinction won't be important.

In an object diagram we represent the links "Washington D.C. is capital of the USA," "Philadelphia is in the USA," and "Washington D.C. is in the USA" with line segments connecting object icons:

The relationship between links and associations is analogous to the relationship between objects and classes. Sometimes links are referred to as association instances.

Example

Suppose we want to enhance our flight simulator program. We begin by adding Pilot, Fleet, and Wing classes to our class diagram.

Clearly there are some important relationships between these classes that we must also model. For example, Pilot P can fly Airplane A, Wing W is part of Airplane A, and Airplane A belongs to Fleet F.

Specifying Navigation

Next, we specify the navigability of these associations. An association between airplanes and wings doesn't necessarily imply that an airplane knows which wings hold it up or that a wing knows which airplane it is attached to. However, an airplane will probably need to send signals to its wings and vice-versa, so it is important that an Airplane object can quickly determine which Wing objects are holding it up, and that a Wing object can quickly determine which Airplane object it is attached to. The association between Airplane and Wing has bi-directional navigability. We can represent this by drawing barbed arrowheads at each end of the association. (Don't get creative with arrowheads. They have very specific meanings in UML.)

Clearly an object representing a fleet of planes should know which planes belong to it, but planes move from one fleet to another, and which fleet a plane belongs to will make no difference in how the plane works, so the association between Fleet and Airplane has unidirectional navigability, which UML indicates by drawing an arrowhead only on the Airplane end of the association:

Navigation arrowheads can be placed on links, too. For example, the following object diagram shows a fleet containing two airplanes. Airplane a is connected to two wings, w1 and w2:

Of course the links between an airplane and its wings are bi-directional. In other words, if Airplane a is connected to Wing w1, then w1 is connected to a.

Specifying Multiplicities

Next, we specify the multiplicities of these associations. Suppose every airplane in a fleet has exactly three pilots who are authorized to fly it, and every pilot is authorized to fly exactly two planes in the fleet. Then the relationship "Pilot X is authorized to fly airplane Y" is a 2-to-3 relationship. In UML we can indicate the multiplicity of a relationship by labeling each role with the multiplicity of its adjacent class:

In a corresponding object diagram, we would expect to see every pilot object linked to two distinct airplane objects, and every airplane object linked to three distinct pilot objects:

If every airplane has 3 or 5 pilots who are authorized to fly it, then we can represent this multiplicity as a sequence:

If every airplane has 3 to 5 pilots who are authorized to fly it, then we can represent this multiplicity as a range:

Clearly, every wing is attached to exactly one plane, and every plane has exactly two wings, so the association between Airplane and Wing has 1-to-2 multiplicity. An airplane can only belong to one fleet at a time, but the number of airplanes that belong to a fleet is zero or more. In UML we use an asterisk to indicate zero or more:
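
As a rough preview of how a "zero or more" multiplicity might surface in code, the following sketch models the unidirectional Fleet-to-Airplane association with a container of pointers. The member names add(), remove(), contains(), and size() are assumptions made for this sketch; they are not part of the flight simulator described above.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

class Airplane { /* altitude, speed, etc. omitted */ };

// Hypothetical Fleet: the '*' multiplicity becomes a container of links.
class Fleet
{
public:
   // add a link to p unless p is null or already linked
   void add(Airplane* p)
   {
      if (p && !contains(p)) planes.push_back(p);
   }
   // remove the link to p, if there is one
   void remove(Airplane* p)
   {
      planes.erase(std::remove(planes.begin(), planes.end(), p), planes.end());
   }
   bool contains(const Airplane* p) const
   {
      return std::find(planes.begin(), planes.end(), p) != planes.end();
   }
   std::size_t size() const { return planes.size(); }
private:
   std::vector<Airplane*> planes; // zero or more airplanes
};
```

Because navigation runs only from Fleet to Airplane, an Airplane object in this sketch cannot tell which fleet holds it. The rule that a plane belongs to one fleet at a time therefore has to be respected by whoever moves planes between fleets.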

Implementing Associations

In most examples in this book we will represent links using C++ pointers. For example, our idealized CASE tool interprets the bi-directional, 1-to-2 association between Airplane and Wing by adding two Wing pointers to the Airplane class:

class Airplane
{
public:
   Airplane();
   void setLeftWing(Wing* w); // see below
   void setRightWing(Wing* w); // see below
   Wing* getLeftWing() const { return leftWing; }
   Wing* getRightWing() const { return rightWing; }
   // etc.
protected:
   Wing *leftWing, *rightWing;
   // etc.
};

and an Airplane pointer to the Wing class:

class Wing
{
public:
   Wing();
   virtual ~Wing() {}
   Airplane* getAirplane() const { return airplane; }
   void setAirplane(Airplane* p) { airplane = p; }
private:
   Airplane* airplane;
};

Of course the usual getter and setter functions are automatically provided, and the pointers are initialized to 0 by the default constructors:

Airplane::Airplane()
{
   leftWing = 0;
   rightWing = 0;
   // etc.
}

Wing::Wing()
{
   airplane = 0;
}

However, there are several constraints implicit in our class diagram that require some effort to enforce in C++. For example, the class diagram tells us that every airplane object is linked to two distinct wing objects, but there is nothing to prevent the left and right wing pointers of a C++ airplane from pointing to the same C++ wing. We should attempt to enforce this in setLeftWing() and setRightWing(). For example:

void Airplane::setLeftWing(Wing* w)
{
   if (w != rightWing) leftWing = w;
}

Another constraint implicit in our class diagram is that if airplane a is connected to wing w, then wing w is connected to airplane a, and not some other airplane. Conversely, if wing w is connected to airplane a, then airplane a is connected to wing w. We could enforce this in setLeftWing() and setRightWing(), too. For example, setLeftWing() can disconnect the plane from the old left wing, connect the new wing to the plane, then connect the plane to the new wing:

void Airplane::setLeftWing(Wing* w)
{
   if (w != rightWing)
   {
      // disconnect this from old wing:
      if (leftWing) leftWing->setAirplane(0);
      // connect new wing to this:
      leftWing = w;
      // connect this to new wing:
      if (leftWing) leftWing->setAirplane(this);
   }
}

Of course a fumbling user might directly connect a wing to the wrong airplane:

Airplane a, b;
Wing w;
a.setLeftWing(&w); // a connected to w, w connected to a
w.setAirplane(&b); // a connected to w, w connected to b

This creates an inconsistency because the setAirplane() function simply sets the airplane pointer to a new airplane. Of course an improved setAirplane() function could attempt to:

1. disconnect this wing from the old airplane;
2. connect this wing to the new airplane;
3. connect the new airplane to this wing.

There are several problems with this approach. First, was this wing the left or right wing of the old plane, and is it to be the left or right wing of the new plane? Second, how will the wing be attached to the new plane? For example, will we call the setLeftWing() function to do this? If we are not careful, this could result in a non-terminating recursion.

Perhaps we are giving too much freedom to our users. Perhaps users should only be allowed to connect wings to airplanes, but not airplanes to wings. We can enforce this by making setAirplane() a private member function. But then Airplane can't call this function either. In C++ we can solve this problem by declaring the Airplane class to be a friend of the Wing class:

class Wing
{
public:
   Wing();
   virtual ~Wing();
   Airplane* getAirplane() const { return airplane; }
private:
   friend class Airplane;
   void setAirplane(Airplane* p) { airplane = p; }
   Airplane* airplane;
};
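
To see the pieces working together, here is a minimal, self-contained sketch of the friend-based design. It assumes a setRightWing() that mirrors setLeftWing(), and it adds one step the functions above do not take: if the new wing is still attached to another airplane, that airplane lets go of it first. The detach() and attach() helpers are inventions of this sketch.

```cpp
class Airplane; // forward reference

class Wing
{
public:
   Wing(): airplane(0) {}
   Airplane* getAirplane() const { return airplane; }
private:
   friend class Airplane; // only Airplane may relink a wing
   void setAirplane(Airplane* p) { airplane = p; }
   Airplane* airplane;
};

class Airplane
{
public:
   Airplane(): leftWing(0), rightWing(0) {}
   Wing* getLeftWing() const { return leftWing; }
   Wing* getRightWing() const { return rightWing; }
   void setLeftWing(Wing* w) { attach(leftWing, rightWing, w); }
   void setRightWing(Wing* w) { attach(rightWing, leftWing, w); }
private:
   // hypothetical helper: forget w if this plane currently holds it
   void detach(Wing* w)
   {
      if (leftWing == w) leftWing = 0;
      if (rightWing == w) rightWing = 0;
   }
   // hypothetical helper: put w in slot unless w occupies the other slot
   void attach(Wing*& slot, Wing* other, Wing* w)
   {
      if (w && w == other) return; // the two wings must be distinct
      if (w == slot) return;       // already linked, nothing to do
      // release the wing currently in this slot:
      if (slot) slot->setAirplane(0);
      // release w from the airplane it was attached to, if any:
      if (w && w->getAirplane()) w->getAirplane()->detach(w);
      // link both directions:
      slot = w;
      if (slot) slot->setAirplane(this);
   }
   Wing *leftWing, *rightWing;
};
```

With these declarations a call such as w.setAirplane(&b) from outside the two classes no longer compiles, so every link change has to go through an Airplane, which can keep both ends consistent.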

Naming Roles

Notice that our CASE tool cleverly selected the names leftWing and rightWing for the two Wing pointers encapsulated by an Airplane. Of course a real CASE tool wouldn't know how to differentiate between the two wing pointers, and so would probably choose to store them in a small array:

class Airplane
{
public:
   Airplane();
   void setWing(Wing* w, int i);
   Wing* getWing(int i) const { return wing[i]; }
   // etc.
private:
   Wing* wing[2];
   // etc.
};
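
The array version leaves setWing() undefined and lets getWing() index the array unchecked. One possible way to fill in those details is sketched below. Unlike the declaration above, this setWing() returns a bool so the caller can tell whether the request was accepted; that return type, and the choice to return 0 for a bad index, are assumptions of the sketch.

```cpp
class Wing { /* details omitted */ };

class Airplane
{
public:
   Airplane() { wing[0] = 0; wing[1] = 0; }
   // Returns false and changes nothing if i is out of range
   // or if w already occupies the other slot.
   bool setWing(Wing* w, int i)
   {
      if (i < 0 || i > 1) return false;
      if (w && w == wing[1 - i]) return false;
      wing[i] = w;
      return true;
   }
   // Returns 0 for an index outside 0..1.
   Wing* getWing(int i) const
   {
      return (0 <= i && i < 2) ? wing[i] : 0;
   }
private:
   Wing* wing[2];
};
```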

Explicit Constraints

Suppose we add an attribute called side to our Wing class that distinguishes between left and right wings:

class Wing
{
public:
   enum Orientation { LEFT, RIGHT };
   Orientation getSide() const { return side; }
   void setSide(Orientation s) { side = s; }
   // etc.
private:
   Orientation side;
   // etc.
};

We can attach constraints to the Wing role of our Airplane-to-Wing associations to require that an airplane's left wing is a left-side wing and its right wing is a right-side wing. A constraint is a Boolean-valued condition bracketed by curly braces. Although OCL, the Object Constraint Language, is a proposed standard language for expressing constraints, we will often simply use Boolean-valued C++ expressions or even informal expressions.

Unlike notes, constraints do impact the implementation, even though the implementation may be ad hoc. For example, we might enforce the orientation constraints on wings in the Airplane's wing setter functions:

void Airplane::setLeftWing(Wing* w)
{
   // a null w simply detaches the current left wing
   if (w == 0 || w->getSide() == Wing::LEFT)
   {
      // disconnect this from old wing:
      if (leftWing) leftWing->setAirplane(0);
      // connect new wing to this:
      leftWing = w;
      // connect this to new wing:
      if (leftWing) leftWing->setAirplane(this);
   }
}

Links = References in Java

Java doesn't support friends. The best we can do is to give setAirplane() package scope:

class Wing {
   private Airplane airplane;
   void setAirplane(Airplane a) { airplane = a; }
   public Airplane getAirplane() { return airplane; }
}

class Airplane {
   protected double altitude, speed;
   protected Wing leftWing, rightWing;

   // getters & setters:
   public Wing getLeftWing() { return leftWing; }
   public Wing getRightWing() { return rightWing; }

   public void setLeftWing(Wing w) {
      if (w != rightWing) {
         if (leftWing != null) leftWing.setAirplane(null);
         leftWing = w;
         if (leftWing != null) leftWing.setAirplane(this);
      }
   }
   // etc.
}

Resolving File Dependencies

Our CASE tool generates separate header and implementation files for each class icon. We can represent these files and the dependencies between them by drawing a dependency graph:

In this diagram (which is not a UML diagram), the dashed arrows indicate dependencies. For example, if the programmer changes wing.h, say by renaming setAirplane() to setPlane(), then this could force changes to wing.cpp, airplane.h, and, by transitivity, airplane.cpp.

Normally, dependencies between files are resolved using include directives. For example, airplane.cpp and wing.h can include airplane.h, and wing.cpp and airplane.h can include wing.h. However, some preprocessors will get confused by the fact that airplane.h includes wing.h and wing.h includes airplane.h.[1] The preprocessor might get caught in an infinite loop. One common trick for solving this problem is to resolve the dependency from wing.h to airplane.h by using a forward reference:

// wing.h
class Airplane; // forward reference

class Wing
{
public:
   Wing();
   virtual ~Wing();
   Airplane* getAirplane() const { return airplane; }
private:
   friend class Airplane;
   void setAirplane(Airplane* p) { airplane = p; }
   Airplane* airplane;
};

This works provided wing.h doesn't attempt to call any Airplane member functions. After all, the compiler only knows that Airplane is the name of a class. No other information is given. Calls to Airplane member functions must be moved to wing.cpp, which creates a direct dependency from wing.cpp to airplane.h. But this dependency can be resolved without circularities by including airplane.h in wing.cpp:
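
The arrangement can be sketched in one listing, with comment banners marking where each file would begin. The #include directives appear only as comments because the sketch is a single file. The report() and wingReported() functions are hypothetical; they exist only so that wing.cpp has an Airplane member function to call, and setAirplane() is left public to keep the sketch short.

```cpp
// ---- wing.h ----
#ifndef WING_H
#define WING_H

class Airplane; // forward reference: only the name is known here

class Wing
{
public:
   Wing(): airplane(0) {}
   Airplane* getAirplane() const { return airplane; }
   void setAirplane(Airplane* p) { airplane = p; }
   void report(); // calls an Airplane member, so it is defined in wing.cpp
private:
   Airplane* airplane;
};

#endif

// ---- airplane.h (would begin with: #include "wing.h") ----
#ifndef AIRPLANE_H
#define AIRPLANE_H

class Airplane
{
public:
   Airplane(): reports(0) {}
   void wingReported() { ++reports; }
   int getReports() const { return reports; }
private:
   int reports;
};

#endif

// ---- wing.cpp (would begin with: #include "wing.h" and #include "airplane.h") ----
void Wing::report()
{
   // legal here: the full Airplane declaration has been seen
   if (airplane) airplane->wingReported();
}
```

Moving the body of report() into the Wing class declaration would fail to compile, since at that point the compiler has seen nothing of Airplane but its name.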

In summary, the simple bi-directional association between the Airplane and Wing class hides many unpleasant implementation details: How will the implicit constraints be enforced in C++? How will the bi-directional dependency between wing.h and airplane.h be resolved? Both of these issues need to be resolved by an expert programmer during early iterations through the implementation phase.

Extension and Inheritance

In UML the relationship "Every instance of A is an instance of B" (in other words, A is a subclass of B) is shown by a generalization arrow pointing from A to B.

Object-oriented languages allow us to define subclasses. Instances of a subclass inherit the features of the super class. In addition, subclasses can add new features or modify inherited features.

For example, a Polygon class in a graphics program provides member functions for computing area and perimeter. Triangle and Rectangle subclasses might redefine these functions using simpler formulas or add new formulas and attributes peculiar to triangles or rectangles. Obtuse and acute triangles are instances of subclasses of the Triangle class, while squares are instances of a subclass of Rectangle. In a class diagram we can express the relationship between a subclass and a super class by connecting the class icons with a generalization arrow:

In C++ we can create subclasses using derivation. In this context Polygon is called the base class while Triangle and Rectangle are called derived classes. Of course Triangle is the base class for its derived classes, Obtuse and Acute, while Rectangle is the base class for Square:[2]

class Polygon { ... };
class Rectangle: public Polygon { ... };
class Square: public Rectangle { ... };
class Triangle: public Polygon { ... };
class Obtuse: public Triangle { ... };
class Acute: public Triangle { ... };
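
As a hedged illustration of what might hide behind the ellipses, the sketch below gives Polygon a general area() based on the shoelace formula and lets Rectangle redefine it with the simpler width-times-height formula. The Point struct, the constructors, and the corners() helper are assumptions made for this sketch.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Point { double x, y; };

class Polygon
{
public:
   explicit Polygon(const std::vector<Point>& v): vertices(v) {}
   virtual ~Polygon() {}
   // general case: shoelace formula over the vertex list
   virtual double area() const
   {
      double sum = 0;
      for (std::size_t i = 0; i < vertices.size(); ++i)
      {
         const Point& p = vertices[i];
         const Point& q = vertices[(i + 1) % vertices.size()];
         sum += p.x * q.y - q.x * p.y;
      }
      return std::fabs(sum) / 2;
   }
protected:
   std::vector<Point> vertices;
};

class Rectangle: public Polygon
{
public:
   Rectangle(double w, double h)
      : Polygon(corners(w, h)), width(w), height(h) {}
   // redefined using a simpler formula
   virtual double area() const { return width * height; }
private:
   // hypothetical helper: an axis-aligned rectangle anchored at the origin
   static std::vector<Point> corners(double w, double h)
   {
      std::vector<Point> v(4);
      v[0].x = 0; v[0].y = 0;
      v[1].x = w; v[1].y = 0;
      v[2].x = w; v[2].y = h;
      v[3].x = 0; v[3].y = h;
      return v;
   }
   double width, height;
};

class Square: public Rectangle
{
public:
   explicit Square(double side): Rectangle(side, side) {}
};
```

Square adds nothing of its own here; it inherits the redefined area() from Rectangle, which in turn inherits its vertex list from Polygon.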

Java syntax makes the relationship between the two classes clearer by declaring that the features of subclasses extend the features inherited from super-classes:

class Polygon { ... }
class Rectangle extends Polygon { ... }
class Square extends Rectangle { ... }
class Triangle extends Polygon { ... }
class Obtuse extends Triangle { ... }
class Acute extends Triangle { ... }

As another example, assume a new release of our flight simulator program will make a distinction between military planes and passenger planes. Of course we will still need to keep track of the altitude and speed of both types of planes, and both types of planes will still need to take off, fly, and land. In addition, a military plane can drop bombs and has a Boolean attribute indicating if it is flying at a supersonic speed, while a passenger plane can show movies to its passengers and has an integer attribute indicating how many passengers are on board.

We could replace the Airplane class with PassengerPlane and MilitaryPlane classes, but then we would need to re-implement the takeoff(), fly(), and land() operations. Worse yet, we would need to re-implement these operations twice, once for each new class. Instead, we make the PassengerPlane and MilitaryPlane subclasses of the Airplane class. Now each new class inherits the takeoff(), fly(), and land() operations as well as the altitude and speed attributes from the Airplane super class:

Our class diagram introduces two new features. First, instead of drawing two generalization arrows, we combined them into a single forked arrow. This makes the diagram easier to read and makes it easier to add new subclasses later. Second, we have attached a note to the PassengerPlane class icon. A note is a dog-eared box containing a comment. It has no impact on the implementation, but it can make the class diagram easier to understand.

Assume we create one airplane of each type and take off in each:

MilitaryPlane a;
PassengerPlane b;
a.takeoff();
b.takeoff();

Assume our military plane has reached an altitude of 50,000 feet and a speed of 700 miles per hour, while our passenger plane, which has 80 passengers on board, has reached an altitude of 28,000 feet and a speed of 550 miles per hour. The object diagram depicting this scenario clearly shows the altitude and speed attributes each plane inherits from the Airplane super class:

Note that the inherited attributes are listed above the subclass attributes. In fact, this corresponds to the way these objects would typically be laid out in memory in Java and C++. Thus, instances of MilitaryPlane and PassengerPlane literally are instances of Airplane, but with additional fields hanging off the bottom.
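
A minimal C++ sketch of this hierarchy follows. The takeoff() body is a placeholder that sets arbitrary values, since real flight dynamics are outside the scope of the example, and board(), isSupersonic(), and the getters are names assumed for illustration.

```cpp
class Airplane
{
public:
   Airplane(): altitude(0), speed(0) {}
   virtual ~Airplane() {}
   // placeholder behavior: arbitrary values, not real flight dynamics
   void takeoff() { altitude = 1000; speed = 200; }
   double getAltitude() const { return altitude; }
   double getSpeed() const { return speed; }
protected:
   double altitude, speed;
};

class MilitaryPlane: public Airplane
{
public:
   MilitaryPlane(): supersonic(false) {}
   bool isSupersonic() const { return supersonic; }
   void dropBombs() { /* omitted */ }
private:
   bool supersonic; // added by the subclass
};

class PassengerPlane: public Airplane
{
public:
   PassengerPlane(): passengers(0) {}
   int getPassengers() const { return passengers; }
   void board(int n) { passengers += n; } // hypothetical
   void showMovie() { /* omitted */ }
private:
   int passengers; // added by the subclass
};
```

Neither subclass mentions takeoff(), altitude, or speed, yet both have them. A pointer to a PassengerPlane also converts implicitly to a pointer to Airplane, which is the C++ counterpart of saying that every passenger plane is an airplane.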

Encapsulation (Scope and Extent)

The scope of a declaration is the region of the program where the declaration is valid (hence visible, unless shadowed by another declaration). Scopes are well defined. The possible scopes, in order of decreasing size, are program, subclass, package, class, and block:

Program or Global scope means the declaration is valid throughout the entire program or model. Subclass scope implies validity in all subclasses of a particular class. Package scope implies validity throughout a package. Class scope implies visibility in a particular class, only.

Many programming languages provide nestable block statements:

{
   int x;
   int y;
   // more statements and declarations
   {
      int z;
      int y; // an error in Java
      // still more statements and declarations
   }
}

The body of a method is a block:

void test(int x) {
   // etc.
}

The scope of a declaration that appears in a block extends from the point of declaration to the end of the enclosing block. This is called block scope. Of course the block might contain a nested block that contains a declaration of a name declared in the outer block (like y, above). In this case the inner declaration shadows the outer declaration.
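
In C++, unlike Java, the inner redeclaration is legal, which makes the shadowing easy to observe. The function name shadowExample() and the particular values below are chosen only for illustration.

```cpp
int shadowExample()
{
   int x = 1;
   int y = 2;
   {
      int z = 10;
      int y = 20; // legal in C++: shadows the outer y until the block ends
      x = y + z;  // uses the inner y, so x becomes 30
   }
   return x + y;  // the outer y (still 2) is visible again: returns 32
}
```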

We have already seen that UML allows modelers to specify the scope or visibility of its members (attributes and methods):

UML indicates visibility using "+" for public, "#" for protected, and "-" for private.

A private member of class A has class scope. It is only visible to other members of A. A protected member of A has subclass scope. It is only visible to members of subclasses of A. (In this context we are thinking of the subclass relationship as transitive and reflexive. Hence a subclass of a subclass of A is a subclass of A, and its members can access A's protected members. Naturally, any member of A can access any other member of A, private, protected, or public.) In Java, protected members are also visible throughout A's package, so subclass scope matters only for subclasses outside of that package. Public members have global scope. UML marks package scope with "~".
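
The C++ access specifiers line up with these three scopes. In the sketch below, classes A, B, and C and all of their members are hypothetical, and the commented-out function shows the access that the compiler would reject.

```cpp
class A
{
public:
   A(): open(3), shared(2), secret(1) {}
   int open;   // "+" public: visible everywhere
   // members of A can see all three
   int sum() const { return secret + shared + open; }
protected:
   int shared; // "#" protected: visible in A and its subclasses
private:
   int secret; // "-" private: visible in A only
};

class B: public A
{
public:
   int peek() const { return shared + open; } // OK
   // int pry() const { return secret; }      // error: secret is private to A
};

class C: public B
{
public:
   int peekToo() const { return shared; }     // OK: a subclass of a subclass
};
```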

Stereotypes

UML icons can be stereotyped:

<<stereotype>>

A stereotype indicates either that the icon doesn't have its usual meaning or what role the icon plays in the model.

Example: Factories and Products

A GUI tool kit (like AWT or Swing) has a method that creates GUI components:

class GUIToolKit {
   GUIComponent makeComponent(...) {
      return new GUIComponent(...);
   }
   // etc.
}

This is a common sight: one class has a method that creates new instances of a second class. In this context we often refer to the first class as a factory, the second class as a product, the method as a factory method, and the relationship between the classes as the creation relationship. We can communicate all of this by adding stereotypes to our diagram:

Example: Coad's "Archetypes"

<<entity>>
<<role>>
<<event>>
<<place>>

Example: Jacobson's Stereotypes

<<control>> = Instances represent system control objects

<<boundary>> = Instances represent system interface objects

<<entity>> = Instances represent application domain entities

J2EE Stereotypes

<<bean>>
<<session>>

UML Stereotypes

<<powertype>> = Instances represent subclasses of another class

<<metatype>> = Instances represent other classes

<<active>> = Instances own their own thread of control

<<persistent>> = Instances can be saved to secondary memory

<<actor>> = Instances represent external systems or users

In some cases a stereotype is so common that it earns its own icon. For example, in UML actors are sometimes represented by stick figures:

Interaction Diagrams

By itself, an object diagram isn't very useful. It becomes much more useful when it shows typical interaction sequences between the objects. An interaction occurs when a client object sends a message or invokes a member function of a server object. The server may or may not return a value to the client.

UML provides two types of interaction diagrams: collaboration diagrams and sequence diagrams. In this book we will use sequence diagrams. At the top of a sequence diagram is a row of object icons. A life line hangs below each icon. If a and b are objects, and if a calls b.fun() at time t, then we would draw a horizontal arrow labeled "fun" that emanates from a's life line at time t, and terminates at b's life line. The exact location of time t on a life line isn't as important as its relative position. Time flows from the top of the diagram to the bottom.

For example, assume a point of sale terminal (POST) records a sale by:

1. Checking inventory to see if the item is in stock.
2. Debiting the customer's account.
3. Crediting the retailer's account.
4. Updating inventory.
5. Printing a receipt.
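
The five steps above can be sketched in C++ as five calls made by a POST object, each of which would appear as one horizontal arrow in a sequence diagram. Every class and member name here (Inventory, Account, Printer, recordSale(), and so on) is an assumption of this sketch; the text does not specify them. A shared trace records the order in which the messages arrive.

```cpp
#include <string>
#include <vector>

typedef std::vector<std::string> Trace; // order of received messages

class Inventory
{
public:
   Inventory(Trace& t, int n): trace(t), count(n) {}
   bool inStock() { trace.push_back("inStock"); return count > 0; }
   void update() { trace.push_back("update"); --count; }
private:
   Trace& trace;
   int count;
};

class Account
{
public:
   explicit Account(Trace& t): trace(t), balance(0) {}
   void debit(double amt) { trace.push_back("debit"); balance -= amt; }
   void credit(double amt) { trace.push_back("credit"); balance += amt; }
   double getBalance() const { return balance; }
private:
   Trace& trace;
   double balance;
};

class Printer
{
public:
   explicit Printer(Trace& t): trace(t) {}
   void printReceipt() { trace.push_back("printReceipt"); }
private:
   Trace& trace;
};

class POST
{
public:
   POST(Inventory& i, Account& c, Account& r, Printer& p)
      : inventory(i), customer(c), retailer(r), printer(p) {}
   // Returns false, after step 1 only, if the item is out of stock.
   bool recordSale(double price)
   {
      if (!inventory.inStock()) return false; // step 1
      customer.debit(price);                  // step 2
      retailer.credit(price);                 // step 3
      inventory.update();                     // step 4
      printer.printReceipt();                 // step 5
      return true;
   }
private:
   Inventory& inventory;
   Account& customer;
   Account& retailer;
   Printer& printer;
};
```

Reading recordSale() from top to bottom gives the same ordering that the life lines of a sequence diagram show from top to bottom.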

Here is the corresponding sequence diagram:

Problems

Modeling Application Domains

An application domain is the real world context of an application: bank, warehouse, space ship, etc. Often, a specification document includes a UML class diagram that represents the application domain's important concepts and relationships as classes and associations respectively. In each of the following problems draw a UML class diagram that models the important concepts and their relationships in the application domain described. You may draw the diagram by hand, with a diagram editor, or by using a CASE tool.

Next, faithfully translate your class diagram into Java class declarations. Be sure to include all supporting functions implied by your diagram. For example, each member variable requires initialization as well as setter and getter functions (getAAA(), setAAA()). Each container member should provide clients with functions for adding and removing elements as well as traversing the container. Do not invent new member variables or member functions. Each class should be declared in its own source file (even if the class body is empty).

A scenario description follows each domain description. Draw an object diagram that instantiates your class diagram and models the scenario. Implement a School.main() function in School.java so that it creates the objects and links in your object diagram. Insert diagnostic messages in main() to prove your program compiles and runs.

Hints:

1. When analyzing a domain description, important nouns often make good classes, while important verbs often make good associations.

2. Deciding if a property should be an attribute or an association can be tricky. One rule of thumb is this: if the type of the property in question is already a class in your diagram, then the property should be shown as an association, not an attribute. Attribute types tend to be primitive types (number, Boolean, char) or foundational classes (Date, String, etc). Whatever you do, don't show a property as both an attribute and an association. In either case, you will need to provide the necessary supporting functions for the property.

3. Remember: this exercise is about domain modeling, not program designing. Don't try to anticipate the application by adding things to your model that you think the application might need. There is no application yet.

4. However, when deciding if an association should be uni-directional or bi-directional, it's often useful to imagine that the ultimate application might be some sort of object-oriented database. Imagine what sort of queries this database might reasonably be expected to answer.

Problem

Domain:

A school offers many courses. Courses are identified by their course number, section number, and the semester taught (e.g. Spring 2003). There are at least two types of special courses: lectures and seminars. A lecture has an additional location property (e.g. Evans Hall room 35). A seminar may or may not qualify for graduation credit.

A student has a name and ID number. A student may take one to five courses in one semester. A student knows the courses he takes. A course knows the students who are enrolled.

A teacher has a name, ID number, and department. A teacher may teach at most two courses per semester. A teacher knows the courses he teaches. A course knows the teacher who teaches it.

Scenario

Here's an excerpt from UC Berkeley's Spring 2003 catalog:

COURSE             SECTION       TEACHER    NOTES
Physics 150        1             Newton     seminar (credit)
Physics 2A         3             Newton     lecture (213 Durant)
Econ 1             1             Keynes     lecture (415 Evans)
Econ 1             2             Keynes     lecture (415 Evans)

Bill Jones and Sue Smith take Physics 150. Bill takes Econ 1 sec 1. Sue takes Econ 1 sec 2 with her fellow student, Howard Johnson, who takes Physics 2A. (You can make up your own ID numbers.)

[1] Actually, the circular includes won't pose a problem as long as they appear within the #ifndef/#endif directives we conventionally place in our header files. Programmers still need to be aware of other tricks for resolving circular dependencies.

[2] Throughout the text we will use ellipsis "..." to indicate unseen code.