Cohesion

The Modularity Principle states that programs should be built from cohesive, loosely coupled modules.

Intuitively, a module is cohesive if the members of the module (i.e., the public functions, variables, classes, etc.) are related.

Modules can exhibit varying degrees of cohesion: coincidental, logical, temporal, procedural, informational.

Bloated classes—classes with too many responsibilities—often lack cohesion. Such classes should be refactored into several classes.

A cohesion analysis tool examines the declaration of a class and measures its degree of cohesion on a scale from 0 (low cohesion) to 1 (high cohesion):

0 <= cohesion("someClass.java") <= 1

Of course, a low score doesn't necessarily mean that the class lacks cohesion, only that closer scrutiny is warranted.

There are several algorithms for implementing our cohesion analyzer. They are all based on the same principle:

Closely related methods often reference the same fields.

One technique is to build a method-attribute graph (MAG). Methods and fields are the nodes of this graph. A method node is connected to a field node if the method references the field. The cohesion degree of the class is the number of edges divided by the number of possible edges. (The number of possible edges is simply the number of method nodes multiplied by the number of field nodes.)
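The edge-counting rule can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the MAG is assumed to be already available as a map from each method name to the set of fields it references (extracting that map from a .java file is not shown), and the names MagCohesion, cohesion, deposit, report, balance, and owner are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: the MAG is given as a map from method name to the
// set of field names that the method references.
class MagCohesion {

    // Returns (number of edges) / (number of methods * number of fields).
    static double cohesion(Map<String, Set<String>> mag, Set<String> fields) {
        int possible = mag.size() * fields.size();
        if (possible == 0) {
            // Assumption: a class with no methods or no fields scores 0.
            return 0.0;
        }
        int edges = 0;
        for (Set<String> referenced : mag.values()) {
            for (String field : referenced) {
                // Count only references to fields declared in the class.
                if (fields.contains(field)) {
                    edges++;
                }
            }
        }
        return (double) edges / possible;
    }

    public static void main(String[] args) {
        Set<String> fields = Set.of("balance", "owner");
        Map<String, Set<String>> mag = new LinkedHashMap<>();
        mag.put("deposit", Set.of("balance"));
        mag.put("report", Set.of("balance", "owner"));
        // 3 edges out of 2 * 2 possible edges
        System.out.println(cohesion(mag, fields)); // prints 0.75
    }
}
```

Representing the graph as a map keeps the sketch independent of how the source file is parsed; any front end that can list the fields each method touches could feed it.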

Example

Assume Test.java contains the following class declaration:

class Test {
   int a, x, y, z;
   void m1() {
      System.out.println(x);
      System.out.println(y);
   }
   void m2() {
      System.out.println(y);
      System.out.println(z);
   }
   void m3() {
      System.out.println(x);
      System.out.println(z);
   }
   void m4() {
      System.out.println(a);
   } 
   void m5() {
      System.out.println(a);
   }
}

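The cohesion analyzer builds a MAG with five method nodes (m1 through m5), four field nodes (a, x, y, z), and the following eight edges:

m1 -- x, m1 -- y
m2 -- y, m2 -- z
m3 -- x, m3 -- z
m4 -- a
m5 -- a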

Then:

cohesion("Test.java") = 8/(5 * 4) = 2/5

Examining the graph suggests that this low score arises because methods m4 and m5 don't access any of the fields shared by m1, m2, and m3. Perhaps Test should be refactored into two classes:

class Test1 {
   int x, y, z;
   void m1() {
      System.out.println(x);
      System.out.println(y);
   }
   void m2() {
      System.out.println(y);
      System.out.println(z);
   }
   void m3() {
      System.out.println(x);
      System.out.println(z);
   }
}

class Test2 {
   int a;
   void m4() {
      System.out.println(a);
   } 
   void m5() {
      System.out.println(a);
   }
}
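The split into Test1 and Test2 matches the connected components of the MAG for Test: m1, m2, and m3 are linked through x, y, and z, while m4 and m5 are linked only through a. The sketch below shows one way such groups might be found automatically. It is not part of the cohesion score described earlier; it is an added illustration, and the names MagComponents and methodGroups are hypothetical. As before, the MAG is assumed to be given as a map from method name to referenced fields.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: groups methods that are connected, directly or
// through other methods, by the fields they share.
class MagComponents {

    static List<Set<String>> methodGroups(Map<String, Set<String>> mag) {
        List<Set<String>> groups = new ArrayList<>();      // methods per group
        List<Set<String>> groupFields = new ArrayList<>(); // fields per group
        for (Map.Entry<String, Set<String>> entry : mag.entrySet()) {
            Set<String> methods = new TreeSet<>();
            methods.add(entry.getKey());
            Set<String> fields = new TreeSet<>(entry.getValue());
            // Merge every existing group that shares a field with this method.
            for (int i = groups.size() - 1; i >= 0; i--) {
                if (!Collections.disjoint(groupFields.get(i), fields)) {
                    methods.addAll(groups.remove(i));
                    fields.addAll(groupFields.remove(i));
                }
            }
            groups.add(methods);
            groupFields.add(fields);
        }
        return groups;
    }

    public static void main(String[] args) {
        // The MAG of the Test class from the example.
        Map<String, Set<String>> mag = new LinkedHashMap<>();
        mag.put("m1", Set.of("x", "y"));
        mag.put("m2", Set.of("y", "z"));
        mag.put("m3", Set.of("x", "z"));
        mag.put("m4", Set.of("a"));
        mag.put("m5", Set.of("a"));
        System.out.println(methodGroups(mag)); // prints [[m1, m2, m3], [m4, m5]]
    }
}
```

Each group is a candidate class. A method that references no fields ends up in a group of its own, so the output is a hint for closer scrutiny rather than a refactoring recipe.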

What are the cohesion degrees of Test1 and Test2?

Reference

My simplistic algorithm is based on one of Chidamber and Kemerer's algorithms for measuring the lack of cohesion in a class.