A system S can be viewed as a tree called the component hierarchy. S is the root of the tree. The other nodes of the tree include the packages, classes, and methods of S. Node M is a child of node N if M is declared inside of N. For example, N might be a package and M a class or sub-package declared inside N. Or N might be a class and M an inner class or method of N.
Assume A and B are components of some system, S. A and B can be packages, files, classes (and interfaces), objects, or methods. A critical relationship between components is the dependency relationship. A depends on B if some change in B causes a change in A. Formally, we write:
A dep B
For example, assume A and B are packages and A imports a class defined in B. If the author of B changes the name of this class, then clearly the author of A will need to make some modifications. Thus, A imports from B implies A depends on B.
A imports-from B => B exports-to A => A dep B
The dependency relationship is clearly reflexive and transitive:
A dep A (reflexive)
A dep B & B dep C => A dep C (transitive)
The dependency relationship is the reflexive, transitive closure of a more basic relationship. I will call this the references relationship[1]:
A ref B
Where:
A ref B implies A dep B
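The step from ref to dep can be sketched in code. The Java below is only an illustration under stated assumptions: the references relationship is given as a map from each component name to the names it references directly, and the class name DependencyClosure and the method dependenciesOf are hypothetical. It collects {B | A dep B} by following references until nothing new is reached, and it includes A itself because dep is reflexive.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: derives "dep" from "ref" by reachability.
final class DependencyClosure {

    // refs.get(X) lists the components X references directly (X ref Y).
    static Set<String> dependenciesOf(String start, Map<String, List<String>> refs) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        seen.add(start);                 // A dep A (reflexive)
        pending.push(start);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            for (String next : refs.getOrDefault(current, List.of())) {
                if (seen.add(next)) {    // first visit: follow its references too
                    pending.push(next);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // A ref B and B ref C, so A dep C by transitivity.
        Map<String, List<String>> refs = Map.of(
            "A", List.of("B"),
            "B", List.of("C"),
            "C", List.of());
        System.out.println(dependenciesOf("A", refs)); // prints [A, B, C]
    }
}
```

Because each component is visited at most once, the walk also terminates when the references form a cycle.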
Graphically, we use the UML dependency arrow to represent the references relationship:
We call A the client and B the provider, supplier, or server.
For packages, A references B if A uses or imports components declared in B. In UML:
In a package-level dependency graph the nodes are the package children of system S or of some package within S and the arrows represent the references between packages. We could go further and define the recursive package-level dependency graph. This includes all of the packages in system S.
If package A contains the declaration of sub-package B, and if A also uses some of the components declared in B, then this is regarded as a self-reference and therefore does not show up as an arrow in the dependency graph.
Object A references object B if a state change of B can cause a state change in A. How would A know that B had changed state? A would need to access a field of B or send a query to B. To do this A would need a reference or pointer to B (but not a copy of B). For example, if B represents a nuclear reactor and A represents a thermometer that indicates the temperature of B, then a change in the temperature of B would cause a change in the thermometer's temperature indicator. Maybe some method in A executes Java code such as this:
this.currentTemp = B.getTemp(); // this = A
this.updateIndicator();
In an object-level dependency graph nodes are objects and references are links between objects. Be careful: the object-level dependency graph is dynamic. For example, the thermometer in the last example might indicate the temperatures of different reactors at different times.
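The thermometer and reactor can be fleshed out a little to show why the object-level graph is dynamic. This is a minimal sketch; the class names Reactor and Thermometer and the methods attachTo, refresh, and reading are assumptions made for illustration. The thermometer holds a reference to a reactor, queries it, and can later be re-pointed at a different reactor, which replaces one link in the graph with another.

```java
// Hypothetical provider: the object whose state is observed.
final class Reactor {
    private double temp;

    Reactor(double temp) { this.temp = temp; }

    double getTemp() { return temp; }

    void setTemp(double temp) { this.temp = temp; }
}

// Hypothetical client: holds a reference (not a copy) to its reactor.
final class Thermometer {
    private Reactor reactor;       // the link: thermometer ref reactor
    private double currentTemp;

    // Re-pointing the link changes the object-level dependency graph.
    void attachTo(Reactor r) {
        reactor = r;
        refresh();
    }

    // Query the provider and copy its state into the indicator.
    void refresh() { currentTemp = reactor.getTemp(); }

    double reading() { return currentTemp; }

    public static void main(String[] args) {
        Reactor r1 = new Reactor(300.0);
        Reactor r2 = new Reactor(450.0);
        Thermometer t = new Thermometer();

        t.attachTo(r1);
        r1.setTemp(310.0);         // state change in the provider...
        t.refresh();               // ...shows up in the client
        System.out.println(t.reading()); // prints 310.0

        t.attachTo(r2);            // the same thermometer now depends on r2
        System.out.println(t.reading()); // prints 450.0
    }
}
```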
B is local to A if it is a sub-object of A (this can happen in C++ if A has a field of type B instead of a reference or pointer to B) or if the scope and lifetime of B are restricted to a method invocation of A.
For example, consider the following C++ code:
class AAA {
BBB B1;
void m() {
BBB B2;
// etc.
}
// etc.
};
If A is an instance of AAA, then A.B1 is local to A. Also, when A.m() is called, B2 is local to A. B1 and B2 can't be shared by other objects. For example, if object C calls
x = A.B1;
it only receives a clone of A.B1 thanks to the copy semantics of C++.
In these cases B only exists for A and cannot be shared by other objects (although clones of B could possibly be shared). In these cases the references relationship is regarded as a self-reference, hence no arrow is drawn from A to B in the dependency graph.
From the software engineering perspective class-level dependencies are the most interesting because the state of a class is identified with the current version of its source code. Assume A and B are classes. We might begin by saying that A references B if A extends B or if the declaration of A contains the declaration of a variable, parameter, or constant of type B. For example, A might declare a field of type B or a method of A might have a parameter of type B or a local variable of type B.
Consider the following declaration:
class A extends B1 {
B2 x;
void m(B3 y) {
B4 z;
// etc.
}
// etc.
}
Clearly:
A ref B1, A ref B2, A ref B3, and A ref B4
We need to be a little more cautious, because B might only appear in a more complex type expression contained in A. For example, consider the following C++ declaration:
class A: public list<B1> {
B2& x;
void m(B3* y) {
B4 z[10]; // a local array needs a bound
// etc.
}
// etc.
};
Once again:
A ref B1, A ref B2, A ref B3, and A ref B4
In all of these cases we can say that A explicitly references B. It is also possible that A implicitly references B. This can happen if A contains an anonymous expression of type B. For example, consider the following C++ declarations:
B1 x; // a global variable
class C {
public: // members must be accessible to A
B2 y;
B2 getY() { return y; }
B3* makeB3() { return new B3(); }
};
class A {
C c;
void m() {
x.m1();
c.y.m2();
c.getY().m2();
c.makeB3()->m3(); // makeB3 returns a pointer
// etc.
}
// etc.
};
In this example A implicitly references B1, B2, and B3 because A contains the expressions x (of type B1), c.y and c.getY() (of type B2), and c.makeB3() (of type B3*). In any case, we still claim:
A ref B1, A ref B2, and A ref B3
The Law of Demeter discourages many implicit dependencies. We say that B is an acquaintance of A if A contains an implicit reference to B. B is a preferred acquaintance of A if the implicit reference is to a global variable or if the instance of B is created in some method of A. In the example above B1, B2, and B3 are all acquaintances of A, but only B1 and B3 are preferred acquaintances of A. The Law of Demeter bans or limits non-preferred acquaintances.
The Acyclic Dependency Principle shuns cycles in the package-level dependency graph. Cycles in the class-level dependency graph can also lead to overly complex code.
For example:
class Student {
Teacher myTeacher;
void setTeacher(Teacher t) { myTeacher = t; }
}
class Teacher {
Student[] myStudents = new Student[100];
int size = 0;
void addStudent(Student s) {
if (size < 100)
myStudents[size++] = s;
}
// etc.
}
This code has problems. Adding or removing a student, s, from a teacher's array of students doesn't automatically update s.myTeacher. Modifying s.myTeacher doesn't automatically update the student arrays of the old and new teacher. One class needs to be designated as the manager of the bi-directional dependency between students and teachers. In addition, some way needs to be devised to enforce this decision. For example, if Teacher manages the dependency, then users should be forbidden to modify the myTeacher field of a student object. But then how can Teacher modify this field?
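One possible answer, offered only as a sketch: in Java, Teacher can be made the manager by hiding the field behind a package-private setter. The version below rests on assumptions not in the original: both classes live in a package that client code stays out of, the fixed array is swapped for an ArrayList, and the names getTeacher, setTeacherInternal, removeStudent, and getStudents are hypothetical. In a real code base each class would be public and sit in its own file.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch: Teacher is the designated manager of the two-way link.
// Assumption: Student and Teacher share a package that clients do not join.
class Student {
    private Teacher myTeacher;

    public Teacher getTeacher() { return myTeacher; }

    // Package-private on purpose: only Teacher is expected to call this.
    void setTeacherInternal(Teacher t) { myTeacher = t; }
}

class Teacher {
    private final List<Student> myStudents = new ArrayList<>();

    public void addStudent(Student s) {
        Teacher old = s.getTeacher();
        if (old == this) {
            return;                      // already linked to this teacher
        }
        if (old != null) {
            old.myStudents.remove(s);    // unlink from the previous teacher
        }
        myStudents.add(s);
        s.setTeacherInternal(this);      // keep both ends consistent
    }

    public void removeStudent(Student s) {
        if (myStudents.remove(s)) {
            s.setTeacherInternal(null);
        }
    }

    // Read-only view, so callers cannot bypass addStudent/removeStudent.
    public List<Student> getStudents() {
        return Collections.unmodifiableList(myStudents);
    }
}
```

Package-private access is a convention rather than a guarantee, since any class placed in the same package can still call setTeacherInternal. In C++ a friend declaration would play a similar role.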
Assume A is a class in a class-level dependency graph. We define the following sets, where TC denotes the transitive closure:
clients(A) = {B | B ref A}
dependents(A) = {B | B dep A} = TC(clients(A))
providers(A) = suppliers(A) = {B | A ref B}
dependencies(A) = {B | A dep B} = TC(providers(A))
We call the references from A to its providers efferent or exiting couplings. We call references from the clients of A to A afferent or arriving couplings.
The encumbrance of A is the cardinality of its set of dependencies:
encumbrance(A) = #dependencies(A)
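Encumbrance is a crude measure of the stability, reusability, and portability of A: the larger the encumbrance, the more components must accompany A wherever it is reused, and the more places there are where a change can force a change in A.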
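The definitions above translate fairly directly into code. The following Java is a rough sketch, not a definitive metric tool: the class CouplingMetrics, its method names, and the sample classes Order, Customer, Item, and Money are all hypothetical. It assumes the class-level references are available as a map from each class name to the set of classes it references, and it counts A among its own dependencies because dep is reflexive.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical metrics helper over a class-level "ref" graph.
final class CouplingMetrics {
    private final Map<String, Set<String>> refs;   // refs.get(A) = providers(A)

    CouplingMetrics(Map<String, Set<String>> refs) { this.refs = refs; }

    // providers(A) = {B | A ref B}: the efferent couplings of A
    Set<String> providers(String a) { return refs.getOrDefault(a, Set.of()); }

    // clients(A) = {B | B ref A}: the afferent couplings of A
    Set<String> clients(String a) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : refs.entrySet()) {
            if (e.getValue().contains(a)) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    // dependencies(A) = {B | A dep B}, found by following references.
    Set<String> dependencies(String a) {
        Set<String> seen = new TreeSet<>();
        Deque<String> pending = new ArrayDeque<>();
        seen.add(a);                                // A dep A (reflexive)
        pending.push(a);
        while (!pending.isEmpty()) {
            for (String next : providers(pending.pop())) {
                if (seen.add(next)) {
                    pending.push(next);
                }
            }
        }
        return seen;
    }

    // encumbrance(A) = #dependencies(A)
    int encumbrance(String a) { return dependencies(a).size(); }

    public static void main(String[] args) {
        // Order ref Customer, Order ref Item, Item ref Money.
        CouplingMetrics m = new CouplingMetrics(Map.of(
            "Order", Set.of("Customer", "Item"),
            "Item", Set.of("Money")));
        System.out.println(m.clients("Item"));       // prints [Order]
        System.out.println(m.dependencies("Order")); // prints [Customer, Item, Money, Order]
        System.out.println(m.encumbrance("Order"));  // prints 4
    }
}
```

If A should not count toward its own encumbrance, subtract one from the result.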
There are also various forms of control dependency. An arrow connecting component A to component B might mean that object A sends a message to object B, function A calls function B, or statement B is executed after statement A.
[1] I wrestled over the best term for this relationship. In UML this is the dependency relationship. I don't like this because logically, depends is transitive while the relationship "connected by a depends arrow" is not. Also, I wanted a definition that was syntactic in nature so as to suggest algorithms for possible design metrics.