Outline
- Intermediate Problems between `P` and `NP`.
- Oracles.
- `P` versus `NP` with oracles.
Intermediate Languages
- If a problem is in `NP` but not in `P`, it is quite often the case that one can show it is `NP`-complete.
- If `P=NP` then this will always be the case.
- However, suppose `P ne NP`. Are there languages in `NP - P` that are not `NP`-complete?
- Ladner (1975) showed that there are.
Ladner's Theorem
Theorem. Suppose `P ne NP`. Then there exists a language `L in NP - P` that is not `NP`-complete.
Proof. For every function `H: NN -> NN`, we define the language `SAT_H` to contain all satisfiable formulas of length `n`, padded by `n^(H(n))` `1`'s. That is,
`SAT_H = { psi01^(n^(H(n))) : psi in SAT mbox( and ) n = |psi| }`.
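The padding construction can be illustrated with a toy membership test, assuming a brute-force SAT check and treating `H` as a given function. The clause-list encoding, the choice `|psi| =` total number of literals, and the dropped `0` separator are all simplifications of the string encoding above.

```python
from itertools import product

def brute_force_sat(cnf, num_vars):
    # cnf: list of clauses; each clause is a list of nonzero ints,
    # +i / -i meaning variable i appears positively / negatively
    return any(all(any((lit > 0) == bits[abs(lit) - 1] for lit in clause)
                   for clause in cnf)
               for bits in product([False, True], repeat=num_vars))

def in_SAT_H(cnf, num_vars, padding, H):
    # Toy encoding: |psi| = total number of literals; a member of SAT_H is
    # a satisfiable psi followed by exactly n**H(n) ones, where n = |psi|.
    n = sum(len(clause) for clause in cnf)
    return padding == "1" * (n ** H(n)) and brute_force_sat(cnf, num_vars)

# With a hypothetical constant H(n) = 2 (the "SAT_H in P" regime):
cnf = [[1, 2], [-1]]          # satisfiable: x1 = False, x2 = True
print(in_SAT_H(cnf, 2, "1" * (3 ** 2), lambda n: 2))  # True
```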
We now define a particular `H: NN -> NN ` as follows:
`H(n)` is the smallest number `i < log log n` such that for every `x in {0, 1}^star` with `|x| le log n`, `M_i` outputs `SAT_H(x)` within `i|x|^i` steps. If there is no such `i` then `H(n) = log log n`.
`H` is well-defined since `H(n)` determines membership in `SAT_H` only of strings whose length is greater than `n`, while the definition of `H(n)` relies only on checking the status of strings of length at most `log n`. In fact, the definition of `H` implies an `O(n^3)`-time algorithm to compute `H(n)` from `n`. (In-Class Exercise in a moment.) `H` was defined to ensure the following claim is true:
More Ladner's Theorem
Claim. `SAT_H in P` iff `H(n) = O(1)`. Moreover, if `SAT_H !in P` then `H(n)` tends to infinity with `n`.
Proof. (`SAT_H in P => H(n) = O(1)`): Suppose there is a machine `M` solving `SAT_H` in at most `cn^c` steps. Since `M` is represented by infinitely many strings, there is a number `i > c` such that `M = M_i`. The definition of `H(n)` implies that for `n > 2^(2^i)`, `H(n) le i`. Thus `H(n) = O(1)`.
(`H(n) = O(1) => SAT_H in P`): If `H(n) = O(1)` then `H` can take only finitely many values, and hence there exists an `i` such that `H(n) = i` for infinitely many `n`'s. But this implies that the TM `M_i` solves `SAT_H` in time `i n^i`: otherwise, if there were an input `x` on which `M_i` fails to output the right answer within this bound, then for every `n > 2^(|x|)` we would have `H(n) ne i`. Note this holds even if we only assume that there is some constant `C` such that `H(n) < C` for infinitely many `n`'s, which proves the "moreover" part of the claim.
Finish Ladner's Theorem
Using the claim, we can show that if `P ne NP` then `SAT_H` is neither in `P` nor `NP`-complete:
- Suppose that `SAT_H in P`. Then by the claim, `H(n) le C` for some constant `C`, implying that `SAT_H` is simply `SAT` padded by at most polynomially many (namely, `n^C`) `1`'s. But then a p-time algorithm for `SAT_H` can be used to solve `SAT` in p-time, implying `P = NP`!
- Suppose that `SAT_H` is `NP`-complete. This means there is a reduction `f` from `SAT` to `SAT_H` that runs in `O(n^i)` time for some constant `i`. Since we already know `SAT_H !in P`, the claim above implies that `H(n)` tends to infinity. Since the reduction runs in `O(n^i)` time, for large enough `n` it must map `SAT` instances of size `n` to `SAT_H` instances of size smaller than `n^(H(n))`. Thus for a large enough formula `phi`, the reduction `f` must map it to a string of the form `psi01^(m^(H(m)))` where `m = |psi|` is smaller than `n = |phi|` by a polynomial factor, say `|psi| le n^(1/3)`. But the existence of such a reduction yields a simple polynomial-time algorithm for `SAT`, contradicting `P ne NP`! (Just iteratively apply the reduction until the `SAT` instance one needs to solve is small enough for brute force.)
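The parenthetical algorithm can be sketched as follows. Here `shrink` stands in for the hypothetical size-reducing, equisatisfiability-preserving map derived from `f`; the toy version used in testing merely drops an unused variable, which is sound only for such instances.

```python
from itertools import product

def brute_force_sat(cnf, num_vars):
    # cnf: list of clauses; each clause is a list of nonzero ints,
    # +i / -i meaning variable i appears positively / negatively
    return any(all(any((lit > 0) == bits[abs(lit) - 1] for lit in clause)
                   for clause in cnf)
               for bits in product([False, True], repeat=num_vars))

def solve_by_iterated_reduction(cnf, num_vars, shrink, threshold=4):
    # Iterate the (hypothetical) size-reducing map until the instance is
    # small enough, then brute-force the constant-size remainder.
    while num_vars > threshold:
        cnf, num_vars = shrink(cnf, num_vars)
    return brute_force_sat(cnf, num_vars)
```

With an actual reduction as in the proof, each iteration runs in polynomial time and shrinks the instance polynomially, so the whole loop is polynomial.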
In-class Exercise
At the start of Ladner's Theorem we asserted that `H(n)` can be computed in `O(n^3)` time.
Write up why this is true and post it to the Mar 15 Class Thread.
Oracles
- We now consider TMs which have access to a black box called an oracle.
- It turns out many of the proofs about relationships between complexity classes carry over to the oracle setting.
- So oracle results give us bounds on what can happen for the usual complexity classes without oracles.
- The oracle setting also tells us something about the strength of reductions.
- Namely, one might ask: can an "`NP`-reduction" be more powerful than a "`P`-reduction"?
Oracle Machines
Definition. An oracle Turing machine `M^?` is a multi-tape DTM (a similar definition works for NDTMs) with a special read-write query tape. It also has three distinguished states `q_?` (the book calls it `q_(query)`, but that's harder to type), `q_(yes)`, and `q_(no)`. We feed into the "?" slot of `M^?` an oracle language `A subseteq Sigma^star` to get a machine `M^A`. On input `x`, `M^A` computes as normal unless it enters the state `q_?`, in which case, if `y` is the contents of the query tape, then the next state will be `q_(yes)` if `y` is in `A` and `q_(no)` if `y` is not in `A`. The computation continues until a halt state is reached.
Here are a couple points to keep in mind about oracle machines:
- `M^A` might enter the query state `q_?` several times during its computation, so it might ask about the membership of several different strings in `A`.
- Given a space- or time-bounded complexity class `C` defined using DTMs or NDTMs, let `C^A` denote the class of languages one gets by allowing the machines in `C` to be oracle machines with access to `A`. For example, `P^A` is the class of languages recognized in p-time by oracle DTMs `M^A`.
Examples
- Let `bar {SAT}` denote the language consisting of unsatisfiable formulas. Then `bar {SAT} in P^(SAT)`: given a formula `phi`, a p-time machine with an oracle for `SAT` can write `phi` on the query tape, enter the query state, and learn whether the formula is satisfiable. If not, output `1`; if so, output `0`.
- Let `A in P`; then `P^A = P`. Given a p-time `M^A`, a non-oracle machine can simulate `M^A` until it enters the query state. Since `A` is in `P`, the non-oracle machine can then simulate the p-time machine for `A` until it gets an answer, and then use this answer to continue simulating `M^A`. The total run time of this machine is bounded by a polynomial whose degree is the product of the degrees of the run times of `M^A` and of the machine for `A`.
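Both examples can be sketched together in a few lines, modeling the oracle as a callback. `brute_force_sat` (exponential time, so purely illustrative) stands in for the oracle.

```python
from itertools import product

def brute_force_sat(cnf, num_vars):
    # A (non-polynomial!) decider for SAT on small clause-list instances;
    # clauses are lists of nonzero ints, +i / -i for variable i.
    return any(all(any((lit > 0) == bits[abs(lit) - 1] for lit in clause)
                   for clause in cnf)
               for bits in product([False, True], repeat=num_vars))

def co_sat(cnf, num_vars, sat_oracle):
    # bar{SAT} in P^SAT: "write" the formula on the query tape (here: pass
    # it to the callback), ask once, and flip the answer.
    return not sat_oracle(cnf, num_vars)

# The P^A = P argument is the same substitution: wherever the oracle
# machine would enter q_?, run A's own decider instead.
print(co_sat([[1], [-1]], 1, brute_force_sat))  # True: x1 AND NOT x1 is unsatisfiable
```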
Baker-Gill-Solovay
Theorem. There are oracle sets `A`, `B` such that `P^A=NP^A` and `P^B ne NP^B`.
Proof. Consider the following canonical `EXP`-complete language:
`A = {langle M, x, 1^n rangle | M mbox( outputs 1 on x within ) 2^n mbox( steps) }`
Notice `A in EXP`. Recall we showed `P subseteq NP subseteq EXP`. First, `EXP subseteq P^A`: given an `L(M)` in `EXP` and an input `x`, in p-time we can write `langle M, x, 1^n rangle` on the query tape and then use the oracle to determine whether `x in L(M)`. On the other hand, by the same argument as on the previous slide, since `A in EXP` we have `NP^A subseteq EXP^A = EXP`. So `EXP subseteq P^A subseteq NP^A subseteq EXP`, giving `P^A = NP^A = EXP`.
The construction for `B` is a little more involved. Let `L_B` be
the following language:
`L_B = { 0^n | mbox( There is an ) x in B mbox( with ) |x|=n}`.
This language is in `NP^B`: we guess an `x` of length `n` and check whether it is in `B` using the oracle. We will show that we can choose `B` so that `L_B` is not in `P^B`.
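The guess-and-check step can be written as a verifier, with the oracle again modeled as a callback and a toy finite `B` (all names here are illustrative):

```python
def verify_L_B(n, certificate, oracle_B):
    # Certificate: a guessed string x of length n; verification is a
    # single oracle query "x in B?".
    return len(certificate) == n and oracle_B(certificate)

B = {"010"}                                    # toy oracle set
print(verify_L_B(3, "010", lambda x: x in B))  # True: so 0^3 is in L_B
```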
BGS proof cont'd
- To build `B` we enumerate oracle DTMs, `M_1^?`, `M_2^?`, `...` by listing out strings in lex order and then checking if they are oracle DTMs.
- We define `B` in stages `B_i` so that `B = cup_i B_i` based on which oracle DTM we have just enumerated.
- Our construction has the property that `B_i` contains all strings in `B` of length `le i`.
- `B_0` is the empty set.
- Assume we have constructed `B_(i-1)` and have just enumerated `M_i^?`. We then simulate `M_i^?` on input `0^i` for `i^(log i)` steps.
- Notice this is more than polynomially many steps.
- Since we haven't completed `B` yet how do we answer oracle queries? ...
Yet More proof
- To answer a query "`y in B`?":
- If `|y| < i` then answer according to `B_(i-1)`.
- If `|y| ge i` then answer "no" and make sure to remember `y` in a "no" set stored on another tape, so that we never add `y` to `B`.
- Suppose after `i^(log i)` steps `M_i^B` rejects. Then we pick some string of length `i` that was never queried by any `M_j^B` for `j le i` and add it to `B_i`. Now `0^i in L_B`, yet `M_i^B` rejects `0^i`.
- This is possible since the total number of queries made so far is at most
`sum_(j=1)^i j^(log j) le i cdot i^(log i) = i cdot 2^(log^2 i) < 2^i`,
while there are `2^i` strings of length `i`.
- On the other hand, if `M_i^B` accepts, we set `B_i = B_(i-1)`, so that `B` contains no strings of length `i` and `0^i !in L_B`; thus `M_i^B` again answers incorrectly on `0^i`.
- The last case is that `M_i^B` did not halt within `i^(log i)` steps. This can happen even if `M_i^B` is p-time, if the coefficients of the polynomial `p(i)` bounding its run time are such that `i^(log i) le p(i)`. Again, we set `B_i = B_(i-1)`. An equivalent machine will eventually appear in the enumeration with an index `I` large enough that `I^(log I) ge p(I)`, in which case the first two cases ensure that the language of that machine with oracle `B` is not `L_B`.
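The counting bound used in the rejection case can be checked numerically (a sanity check, not a proof; the bound holds for all sufficiently large `i`, here verified for `i` from 40 to 200):

```python
import math

def total_query_bound(i):
    # By stage i, machines M_1, ..., M_i have run for at most j**(log j)
    # steps each, so in total they queried at most this many strings.
    return sum(j ** math.log2(j) for j in range(1, i + 1))

for i in range(40, 201):
    # Fewer queries than the 2**i strings of length i, so some string of
    # length i was never queried.
    assert total_query_bound(i) < 2 ** i
print("bound verified for i in [40, 200]")
```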