Finish Primality Checking; Start NP-completeness

CS255

Chris Pollett

Apr 15, 2015

Outline

Last day, we began to show that if `n` is an odd composite number, then the number of witnesses to the compositeness of `n` is at least `(n-1)/2`.
We did this by first showing any non-witness must be in `ZZ_n^star` and that the non-witnesses are contained in a subgroup of `ZZ_n^star`.
We then wanted to show that this subgroup is a proper subgroup to give the result.
The case where there is an `x in ZZ_n^star` such that `x^(n-1)` is not congruent to `1 (mod n)` was handled last day.
Suppose instead that for all `x in ZZ_n^star`, `x^(n-1) equiv 1 (mod n)`.

Then `n` is a Carmichael number.
First, notice `n` can't be a prime power. To see this, suppose `n = p^e` with `e ge 2` (as `n` is composite). Since `n` is odd, `p` must also be odd, so `ZZ_n^star` is cyclic and has a generator `g`, and by assumption `g^(n-1) equiv 1 mod n`. On the other hand, `ord(g) = phi(n) = (p-1)p^(e-1)`, and the discrete logarithm theorem implies `n-1 equiv 0 mod phi(n)`. I.e., `(p-1)p^(e-1) | p^e - 1`, which is impossible as the left-hand side is divisible by `p`, but the right-hand side is not.
So suppose `n` is odd, composite, and not a prime power. We can then decompose it as `n = n_1 cdot n_2`, where `n_1` and `n_2` are odd, greater than `1`, and share no prime factor (so they are relatively prime).
Recall, `t` and `u` are defined so that `n - 1 = 2^t u` and `u` is odd.
The Witness procedure of last day computes the sequence
`X = langle a^u, a^(2u), a^((2^2)u), ..., a^((2^t)u) rangle` (all mod n)
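To make the sequence `X` concrete, here is a minimal Python sketch that computes `t`, `u`, and `X` for a given base `a` and odd `n ge 3`. The function names `split_power_of_two` and `witness_sequence` are made up for this illustration and are not taken from the Witness procedure itself.

```python
def split_power_of_two(n):
    """Return (t, u) with n - 1 == 2**t * u and u odd (n assumed odd, n >= 3)."""
    t, u = 0, n - 1
    while u % 2 == 0:
        t += 1
        u //= 2
    return t, u


def witness_sequence(a, n):
    """Return [a^u, a^(2u), ..., a^((2^t)u)], every entry reduced mod n."""
    t, u = split_power_of_two(n)
    x = pow(a, u, n)          # fast modular exponentiation
    seq = [x]
    for _ in range(t):        # t repeated squarings
        x = (x * x) % n
        seq.append(x)
    return seq


print(split_power_of_two(561))    # (4, 35)
print(witness_sequence(2, 561))   # [263, 166, 67, 1, 1]
```

For the Carmichael number `561`, the sequence for `a = 2` ends in `1`, but it reaches `1` from `67`, which is neither `1` nor `-1 mod 561`. So `67` is a nontrivial square root of `1`, and `2` is still a witness to compositeness.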
Call a pair `(v, j)` acceptable if `v in ZZ_n^star` and `v^((2^j)u) equiv -1 mod n`.
For example, `v = n-1` and `j=0` is acceptable: since `u` is odd, `(n-1)^u equiv (-1)^u equiv -1 mod n`.
Pick an acceptable pair `(v, j)` with the largest possible value `j le t`.
Then one can show
`B = {x in ZZ_n^star |x^((2^j)u) equiv +-1 mod n}`
is a subgroup of `ZZ_n^star`.

Every non-witness must be a member of `B`, since the sequence `X` produced by a non-witness must be all `1`'s or else have a `-1` no later than the `j`th position, by the maximality of `j`.
We now use the existence of `v` such that `v^((2^j)u) equiv -1 mod n` to show there exists a `w` in `ZZ_n^star setminus B`.
Since `v^((2^j)u) equiv -1 mod n` we have `v^((2^j)u) equiv -1 mod n_1`.
So we can find by the Chinese Remainder Theorem a `w` such that `w equiv v mod n_1` and `w equiv 1 mod n_2`.
In which case, `w^((2^j)u) equiv -1 mod n_1` and `w^((2^j)u) equiv 1 mod n_2`.
So using the Chinese Remainder Theorem, we get that `w^((2^j)u)` is not congruent to `+-1 mod n`.
So `w` is not in `B`. Nevertheless, one can show `gcd(w, n) = 1` using the Chinese Remainder Theorem together with the fact that `v` is in `ZZ_n^star`. So `w` is in `ZZ_n^star`, completing the proof.
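The construction above can be checked by brute force on a small Carmichael number. The sketch below is only an illustration under stated assumptions: the choice `n = 561 = 3 cdot 187`, the helper names, and the exhaustive search for the acceptable pair are all made up for this example, and `pow(n1, -1, n2)` needs Python 3.8 or later.

```python
from math import gcd


def split_power_of_two(n):
    """Return (t, u) with n - 1 == 2**t * u and u odd."""
    t, u = 0, n - 1
    while u % 2 == 0:
        t, u = t + 1, u // 2
    return t, u


def is_nonwitness(a, n):
    """True if the sequence X for a is all 1's, or hits -1 before position t."""
    t, u = split_power_of_two(n)
    x = pow(a, u, n)
    if x == 1 or x == n - 1:
        return True
    for _ in range(t - 1):
        x = (x * x) % n
        if x == n - 1:
            return True
    return False


def largest_acceptable_pair(n):
    """Exhaustively find an acceptable pair (v, j) with j as large as possible."""
    t, u = split_power_of_two(n)
    units = [v for v in range(1, n) if gcd(v, n) == 1]
    for j in range(t, -1, -1):
        for v in units:
            if pow(v, 2**j * u, n) == n - 1:
                return v, j
    raise AssertionError("unreachable: (n - 1, 0) is always acceptable")


def crt(a1, n1, a2, n2):
    """The x in [0, n1*n2) with x = a1 (mod n1) and x = a2 (mod n2); needs gcd(n1, n2) = 1."""
    inv = pow(n1, -1, n2)                      # inverse of n1 modulo n2
    return (a1 + n1 * (((a2 - a1) * inv) % n2)) % (n1 * n2)


# Assumed example: n = 561 with the coprime split n1 = 3, n2 = 187.
n, n1, n2 = 561, 3, 187
t, u = split_power_of_two(n)
v, j = largest_acceptable_pair(n)
exponent = 2**j * u
units = [x for x in range(1, n) if gcd(x, n) == 1]
B = {x for x in units if pow(x, exponent, n) in (1, n - 1)}
nonwitnesses = {a for a in range(1, n) if is_nonwitness(a, n)}
w = crt(v % n1, n1, 1, n2)                     # w = v mod n1, w = 1 mod n2

print("j =", j, " w =", w, " |B| =", len(B), " non-witnesses:", len(nonwitnesses))
```

In this run, every non-witness lies in `B`, `w` is a unit outside `B`, and so `B` has at most half the elements of `ZZ_n^star`, which is what the proof claims in general.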

Most algorithms we have studied run in polynomial time or some randomized variant.
That is, on all inputs of length `n`, the algorithms we've considered run in time at most `O(n^k)` for some fixed `k`.
We'll start looking today at some problems for which it is unknown if such efficient algorithms exist.
First we make formal what we mean by polynomial time, then we'll consider variants which might be harder.

We need a framework for describing problems and reasoning about their runtimes.
We define an abstract problem `Q` to consist of a set of instances `I` and a set of solutions `S`.
For example, for SHORTEST-PATH the instances might be triples consisting of a graph and two vertices. A solution might be a sequence of vertices giving a path of shortest length between those two vertices in the graph.
We will be interested in a subclass of problems called decision problems, where the answers are always yes (1) or no (0).
For example, does there exist a path of length at most `k` between the two vertices?
Given a way to solve the decision problem, it is usually straightforward to solve the associated optimization problem by binary searching on the bound `k`.
Here optimization problems are where we want to find a largest or smallest value.
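As a rough sketch of the binary search idea, the code below recovers the shortest path length using only yes/no answers from a decision procedure. It assumes an unweighted graph given as a dictionary of adjacency lists. The decision procedure `path_of_length_at_most` is a stand-in written with breadth-first search for illustration, and all names here are made up for this example.

```python
from collections import deque


def path_of_length_at_most(graph, s, t, k):
    """Decision problem: return 1 if some s-t path uses at most k edges, else 0."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        x = queue.popleft()
        if x == t:
            return 1
        if dist[x] == k:          # do not explore beyond depth k
            continue
        for y in graph[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return 0


def shortest_path_length(graph, s, t, decide=path_of_length_at_most):
    """Optimization problem solved by binary search over k, using only `decide`."""
    hi = len(graph) - 1           # a simple path has at most |V| - 1 edges
    if not decide(graph, s, t, hi):
        return None               # no path at all
    lo = 0
    while lo < hi:                # invariant: decide(hi) == 1, decide(lo - 1) == 0
        mid = (lo + hi) // 2
        if decide(graph, s, t, mid):
            hi = mid
        else:
            lo = mid + 1
    return lo


g = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"], "e": []}
print(shortest_path_length(g, "a", "d"))   # 3
```

The search makes `O(log |V|)` calls to the decision procedure, so a polynomial-time decision procedure gives a polynomial-time way to find the optimal value.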

An encoding of a set `S` of abstract objects is a mapping from `S` to binary strings.
For example, one can encode the natural numbers `{0, 1, 2, ...}` as strings `{0, 1, 10, ...}`.
One can encode legal English sentences using ASCII, etc.
A computer algorithm "solves" some abstract decision problem by going from an encoding of a problem instance as an input to `0` or `1` as output.
We call a problem whose instance set is the set of binary strings a concrete problem.
We say an algorithm solves the problem in `O(T(n))` time if when provided a problem instance `i in I` of length `n = |i|`, the algorithm can produce the solution using `O(T(n))` steps.
A concrete problem is called polynomial time decidable if there is an algorithm that solves it which on all instances of length `n` runs in time `O(n^k)` for some fixed `k`.
We write `P` for the class of all such decision problems.
Similarly, we can define the class of polynomial-time computable functions `f:{0,1}^star -> {0,1}^star`.
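To illustrate encodings and instance length, here is a small sketch of the binary encoding of natural numbers together with a toy concrete decision problem on the encoded strings. The function names and the choice of the evenness problem are assumptions made only for this example.

```python
def encode_nat(m):
    """Binary encoding of a natural number: 0 -> '0', 1 -> '1', 2 -> '10', ..."""
    return bin(m)[2:]


def decode_nat(s):
    """Inverse of encode_nat."""
    return int(s, 2)


def is_even(encoded):
    """A concrete decision problem: output 1 if the encoded number is even, else 0."""
    return 1 if encoded[-1] == "0" else 0


print([encode_nat(m) for m in range(3)])   # ['0', '1', '10']
print(len(encode_nat(1000)))               # 10
```

Note that the instance length `n = |i|` is the number of bits, about `log_2 m` for the number `m`, so running times are measured against this length rather than against the value `m`.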

In order to study decision problems it is useful to have an understanding of formal languages.
An alphabet `Sigma` is a finite set of symbols.
A language is a set of strings over the symbols in an alphabet.
Some common ways to create new languages from old ones are union, concatenation, and star.
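For finite languages these operations can be sketched with Python sets of strings. Since `L^star` is usually infinite, the sketch only lists its strings up to a chosen length. The function names and the length cutoff `max_len` are assumptions introduced for this illustration.

```python
from itertools import product


def union(L1, L2):
    """Strings in L1 or in L2."""
    return L1 | L2


def concat(L1, L2):
    """All strings xy with x in L1 and y in L2."""
    return {x + y for x, y in product(L1, L2)}


def star_up_to(L, max_len):
    """The strings of L* (zero or more concatenated strings of L) of length <= max_len."""
    result = {""}                 # the empty string is always in L*
    frontier = {""}
    while frontier:
        # Extend the newest strings by one more string of L, keeping only new, short ones.
        frontier = {x + y for x in frontier for y in L
                    if len(x + y) <= max_len} - result
        result |= frontier
    return result


print(sorted(concat({"0", "1"}, {"0"})))    # ['00', '10']
print(sorted(star_up_to({"01"}, 4)))        # ['', '01', '0101']
```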

We want to connect algorithms with languages.
We say an algorithm `A` accepts a string `x` if `A` run on `x` outputs `1`.
If it outputs `0`, it rejects the string.
We say an algorithm `A` accepts a language `L` if `L` is exactly the set of strings that `A` accepts.
We say a language is decided by `A` if `A` accepts the language and strings not in the language are rejected.
A complexity class is a set of languages membership in which is determined by some complexity measure, for instance, runtime.
For example, `P` is the complexity class of languages decided in polynomial time.
It is equivalently formulated as the class of languages accepted in polynomial time. (Just run the accepting algorithm for polynomially many steps; if it hasn't accepted by then, reject.)
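The clocking argument can be sketched as follows. As a modelling assumption, an acceptor is written as a Python generator that yields once per step and returns `True` when it accepts, and it may run forever on strings outside the language. The names `decide_with_clock` and `accepts_palindromes`, and the step bound `|x| + 1`, are hypothetical choices for this illustration.

```python
def accepts_palindromes(x):
    """Hypothetical acceptor: accepts palindromes, loops forever on everything else."""
    i, j = 0, len(x) - 1
    while i < j:
        yield                     # one step of computation
        if x[i] != x[j]:
            while True:           # never halts on a non-palindrome
                yield
        i, j = i + 1, j - 1
    return True                   # accept


def decide_with_clock(acceptor, x, step_bound):
    """Run the acceptor for at most step_bound steps; reject if it has not accepted."""
    run = acceptor(x)
    for _ in range(step_bound):
        try:
            next(run)
        except StopIteration as done:
            return 1 if done.value else 0
    return 0                      # clock ran out: reject


def decide_palindromes(x):
    # The acceptor above accepts within |x|/2 + 1 steps, so |x| + 1 steps suffice.
    return decide_with_clock(accepts_palindromes, x, len(x) + 1)


print(decide_palindromes("0110"), decide_palindromes("0111"))   # 1 0
```

The decider always halts, and it agrees with the acceptor on every string the acceptor accepts within the bound, which is the content of the parenthetical remark above.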

We now look at algorithms which can verify membership in languages.
As an example...
Call an undirected graph G Hamiltonian if it contains a Hamiltonian cycle, that is, a simple cycle which contains each vertex of G.
Let HAM-CYCLE `= { langle G rangle | G mbox( is a Hamiltonian graph )}`
How might one decide this problem? One could try each possible permutation of vertices. Let `m` be the number of vertices of the graph. Typically, `m = Omega(sqrt(|langle G rangle|))`. There are `m!` many permutations. So this algorithm would have exponential runtime.
On the other hand, consider the language
`H = {langle G, P rangle | P mbox( is a Hamiltonian cycle in ) G}`.
This language has a polynomial time decision algorithm. Further, the size of `P` is polynomial in the size of `G`, so we could rewrite HAM-CYCLE as:
`{ langle G rangle | exists P, |P| le |G| and langle G, P rangle in H}`
`H` can be viewed as verifying HAM-CYCLE in polynomial time.
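The contrast between deciding and verifying can be sketched in code. Below, `verify_ham_cycle` plays the role of a decision algorithm for `H`, and `ham_cycle_brute_force` is the try-every-permutation approach described above. The graph representation (a dictionary mapping each vertex to its set of neighbours) and both function names are assumptions made for this illustration.

```python
from itertools import permutations


def verify_ham_cycle(graph, cycle):
    """Polynomial-time check: does `cycle` list every vertex exactly once, with
    consecutive vertices (wrapping around) adjacent in the graph?"""
    n = len(graph)
    if n < 3 or len(cycle) != n or set(cycle) != set(graph):
        return 0
    for i in range(n):
        if cycle[(i + 1) % n] not in graph[cycle[i]]:
            return 0
    return 1


def ham_cycle_brute_force(graph):
    """Exponential-time decision: try all m! orderings of the m vertices."""
    for perm in permutations(list(graph)):
        if verify_ham_cycle(graph, perm):
            return 1
    return 0


square = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}   # a 4-cycle
path = {0: {1}, 1: {0, 2}, 2: {1}}                      # a path on 3 vertices

print(verify_ham_cycle(square, [0, 1, 2, 3]))   # 1
print(ham_cycle_brute_force(path))              # 0
```

The verifier does a linear number of adjacency checks, while the brute-force decider may call it `m!` times.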

We are now ready to define the complexity class `NP`.
We say a language `L` belongs to `NP` if there exists a two input polynomial-time algorithm `A` and a constant `c` such that
`L= {x in {0,1}^star : exists y, |y| = O(|x|^c) mbox( and ) A(x,y)=1}`
I.e., it is the class of languages that have polynomial time verification algorithms. So HAM-CYCLE `in NP`.
It is not hard to see `P subseteq NP`, but it is unknown if `P=NP`.
In fact, there is a million dollar prize to anyone who can solve this problem.
Given a complexity class `C`, let `co-C` denote the class of languages whose complement is in `C`.
One can see `P subseteq NP cap co-NP`, but it is unknown if equality holds.
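To see the shape of the `NP` definition in code, here is a minimal sketch with a two-input verifier `A(x, y)` and an exhaustive search over certificates `y` with `|y| le |x|^c`. The example language (binary encodings of composite numbers, with a nontrivial divisor as certificate) and the names `verify_composite` and `in_language` are chosen only for illustration.

```python
from itertools import product


def verify_composite(x, y):
    """Verifier A(x, y) on binary strings: accept iff y encodes a nontrivial divisor
    of the number encoded by x. Runs in time polynomial in |x| + |y|."""
    if not x or not y:
        return 0
    n, d = int(x, 2), int(y, 2)
    return 1 if 1 < d < n and n % d == 0 else 0


def in_language(x, verifier, c=1):
    """Decide membership by trying every certificate y with |y| <= |x|**c.
    This takes exponential time in general; it only illustrates the definition."""
    bound = len(x) ** c
    for length in range(bound + 1):
        for bits in product("01", repeat=length):
            if verifier(x, "".join(bits)):
                return 1
    return 0


print(in_language("1111", verify_composite))   # 15 = 3 * 5, so 1
print(in_language("1101", verify_composite))   # 13 is prime, so 0
```

Checking one certificate is fast, while the search over all certificates is what makes the obvious decision procedure slow.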