Complexity

The worst-case runtime of an algorithm A, TA(N), is one way to measure the complexity of A. In this case O(1), O(log N), O(N), O(N2), O(2N), etc. can be viewed as complexity classes-- significant markers on a yardstick for measuring algorithm complexity.

Let's define the complexity of a problem, P, as the complexity of the most efficient algorithm that solves P. Specifically:

complexity(P) = TA(N)

where algorithm A solves P and if algorithm A' also solves P, then

TA(N) = O(TA'(N))

Of course there are often infinitely many different algorithms that solve a given problem P, including many that have yet to be discovered; so it practice it can be difficult to prove that a particular algorithm is the most efficient, and therefore in practice it can be difficult to determine the complexity of a given problem.

With this definition in place, O(1), O(log N), O(N), O(N2), O(2N), etc. may also be viewed as complexity classes of problems.

Polynomial-Time Problems

Three important complexity classes for problems are the Linear, Polynomial, and Exponential classes:

L = Linear = all problems that have O(N) algorithms
P = Polynomial = all problems that have O(Nk) algorithms for some k
E = Exponential = all problems that have O(kN) algorithms for some k

Clearly:

L Ì P Ì E

Undecidable Problems

At the extreme end of the complexity spectrum are the undecidable problems. P is an undecidable problem if there is no algorithm A that solves P.  Note: we are not saying that no algorithm has yet been found that solves P, we are saying no algorithm will ever be found that solves P! And yet, we know that P has a solution.

The Halting Problem

Assume a compiler implemented in Java represents programs as objects:

interface Program {
   Object run(Object input);
}

The compiler provides static methods for parsing, optimizing, and compiling programs. Assume it also provides a method that determines if a given program processing a given input will eventually return a value:

public class Compiler {
   public static Program parse(File source) { ... }
   public static Program optimize(Program p) { ... }
   public static File compile(Program p) { ... }
   public static boolean halts(Program p, Object input) {
      // = true if p.run(input) eventually returns a result
      // = false otherwise
   }
   // etc.
}

Now consider the following program:

class LoopProgram implements Program {
   public Object run(Object input) {
      if (input instanceof Program) {
         Program p = (Program)input;
         if (!Compiler.halts(p, p)) return null; // halt
      }
      for(;;); // loop forever
   }
   public static void main(String[] args) {
      LoopProgram loop = new LoopProgram();
      boolean willHalt = Compiler.halts(loop, loop);
      System.out.println("will halt = " + willhalt); // ???
   }
}

Assume loop is an instance of LoopProgram. If the input to loop is a program, p, then loop halts if p.run(p) doesn't halt, otherwise loop goes into an infinite loop. Thus:

loop.run(p) halts <=> p.run(p) doesn't halt

We then ask if loop.run(loop) halts. In this case p = loop, hence we get the apparent paradox:

loop.run(loop) halts <=> loop.run(loop) doesn't halt

The only way to resolve this paradox is to conclude that there is no algorithm that implements Compiler.halts!

Non-Deterministic Polynomial -Time Problems

A non-deterministic algorithm may occasionally prompt the user for decisions regarding which of several alternate choices it should make. We make the unrealistic assumption that the user's decisions are infallible. An infallible user is called an oracle.

We define a new complexity class of algorithms and problems:

NP = non-deterministic algorithms that run in O(Nk) time for some k

Basically, a problem is NP if the problem of verifying a proposed solution can be done in polynomial time.

The Traveling Salesman Problem (TSP)

A map is a network of nodes representing cities connected by links representing roads. A route is a sequence of roads R0, R2, ..., RN such that Ri connects city Ci to city Ci+1. In other words, Ri+1 begins in the city where Ri ends. The length of a route is simply the sum of the lengths of every road on the route. A circuit is a route that visits every city.

The Traveling Salesman Problem is this:

Given any map, m, is there a circuit c that has length < k miles?

We can represent a map as an N x N array of distances, where N = the number of cities:

map[i][j] = distance from city i to city j

If there is no road from i to j, we can represent this as positive infinity:

map[i][j] = Double.POSITIVE_INFINITY

A circuit is simply a length N array of city indices such that

i != j => circuit[i] != circuit[j]

In other words, a circuit is simply a permutation of the integers from 0 to N - 1. (Of course some of the roads in this circuit may have infinite length.)

 Here's a pseudo code implementation:

class TSP {
   // this algorithm is O(N)
   static double length(double map[][], int[] circuit) {
      double result = 0;
      for(int i = 0; i < circuit.length - 1; i++) {
         result += map[circuit[i]][circuit[i + 1]];
      }
      return result;
   }
   // this algorithm is O(N!)
   static boolean tsp1(double[][] map, double k) {
      for each permutation c of indices < map[0].length {
         if (length(m, c) < k) return true;
      }
      return false;
   }
   // this non-deterministic algorithm is O(N)
   static boolean tsp2(Map m, double k) {
      int c = new int[map[0].length];
      for(int i = 0; i < map[0].length; i++) {
         System.out.println("pick next city on shortest circuit");
         System.in.read(c[i]);
      }
      return length(m, c) < k;
   }
}
     

TSP.tsp1 is a deterministic algorithm that solves TSP in O(N!) time. This is actually worse that exponential time. TSP.tsp2 solves the problem in O(N) time, but is non-deterministic. There is no known deterministic algorithm that solves TSP in polynomial time. Thus TSP is an NP problem, but it is unknown if it is a P problem.

P=NP

The biggest problem in Computer Science is to prove or disprove:

Every problem that can be solved in polynomial time by a non-deterministic algorithm can also be solved in polynomnial time by a deterministic algorithm.

This problem is abbreviated P=NP. If P=NP is true, if we can prove every problem with an NP algorithm also has a P algorithm, then of course we would know that a deterministic polynomial time solution to TSP exists, although we still might not be able to find it.

Most computer scientists believe P¹NP.

NP Complete Problems

An NP problem, P, is NP-complete if it is NP and if for every other NP problem, Q:

Q P P

Problem P1 can be reduced to problem P2 in polynomial time if there is a deterministic, polynomial time algorithm that transforms every instance of P1 into an instance of P2, and every solution of the instance of P2 back into a solution of the original instance of P1. We write:

P1 P P2

For example, the problem of adding two decimal numbers can be reduced to the problem of adding two binary numbers.

An NP problem, P, is NP-complete if it is NP and if for every other NP problem, Q:

Q P P

We can prove P=NP if we can find a deterministic, polynomial-time algorithm for a single NP-complete problem. Assume P is an NP-Complete problem that has a polynomial-time deterministic algorithm, A. Let Q be any NP problem. To solve Q:

1. reduce Q to P
2. find a solution to P using A
3. transform solution of P into a solution to Q

Notice each step is polynomial-time and deterministic.

There are many known NP-Complete problems, including TSP.