`NP`-completeness




CS254

Chris Pollett

Feb 22, 2017

Outline

Introduction

`NP`-complete problems

Theorem.
`mbox(TMSAT) := {langle alpha, w, 1^n, 1^t rangle |`
`mbox(there exists a ) u in {0,1}^n mbox( such that ) M_alpha mbox( accepts ) w, u mbox( in at most ) t mbox( steps) }`
is Karp-complete for `NP`.

Proof. We first show `mbox(TMSAT)` is in `NP`. A nondeterministic algorithm to recognize this language is as follows: on input `langle alpha, w, 1^n, 1^t rangle`, nondeterministically guess a string `u` of length `n`, simulate `M_alpha` on `langle w, u rangle` for at most `t` steps, and accept iff `M_alpha` accepts. Because `n` and `t` are given in unary, both the guess and the simulation take time polynomial in the input length. To see this language is hard for `NP`, suppose `L` is an `NP` language. Then there are polynomials `p` and `q` and a verifier `M` such that `x in L` iff there is a `u in {0,1}^(p(|x|))` satisfying `M(x,u) = 1`, where `M` runs in time `q(m)` on inputs of total length `m`. To reduce `L` to `mbox(TMSAT)`, we simply map every string `x in {0, 1}^star` to `langle lfloor M rfloor, x, 1^(p(|x|)), 1^(q(m)) rangle`, where `m = |x| + p(|x|)`. This mapping is computable in polynomial time and
`langle lfloor M rfloor, x, 1^(p(|x|)), 1^(q(m)) rangle in mbox(TMSAT) iff`
`exists u in {0,1}^(p(|x|)) mbox( such that ) M mbox( accepts ) x, u mbox( in at most ) q(m) mbox( steps ) iff`
`x in L`.
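The reduction in the proof can be sketched in Python. This is only an illustration under stated assumptions: the names `reduce_to_tmsat`, `brute_force_tmsat`, and `toy_run` are hypothetical, and `toy_run` stands in for a real step-bounded Turing machine simulator (it ignores the machine description and the step bound). The toy language is the set of binary strings containing at least one `1`, with `p(k) = k` and `q(m) = m` chosen arbitrarily.

```python
from itertools import product

def reduce_to_tmsat(x, machine_desc, p, q):
    """Map x to <desc(M), x, 1^p(|x|), 1^q(m)> with m = |x| + p(|x|)."""
    n = p(len(x))
    m = len(x) + n
    return (machine_desc, x, "1" * n, "1" * q(m))

def brute_force_tmsat(instance, run):
    """Decide TMSAT by trying every u in {0,1}^n (exponential time).

    `run(desc, w, u, t)` is a hypothetical simulator returning True iff
    the machine described by `desc` accepts (w, u) within t steps.
    """
    desc, w, ones_n, ones_t = instance
    n, t = len(ones_n), len(ones_t)
    return any(run(desc, w, "".join(bits), t)
               for bits in product("01", repeat=n))

def toy_run(desc, w, u, t):
    """Toy verifier (assumption): accept iff u marks a position where w has a 1."""
    return len(u) == len(w) and any(a == "1" and b == "1" for a, b in zip(w, u))

# x is in the toy language iff the reduced instance is in TMSAT.
inst = reduce_to_tmsat("010", "toy", lambda k: k, lambda m: m)
in_tmsat = brute_force_tmsat(inst, toy_run)
```

The reduction itself only writes down the tuple, which is why it runs in polynomial time; the exponential search lives entirely in the brute-force decider, which is not part of the reduction.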

In-class Exercise

Below is a problem from machine learning. Argue that it is in `NP`. It is actually `NP`-complete (Brucker 1978), but you don't have to show completeness.

Clustering: Given a finite set `X`, a distance function `d(x,y)` which returns nonnegative integers for any inputs `x,y in X`, and two positive integers `k` and `B`, is there a partition of `X` into disjoint sets `X_1, ..., X_k` such that, for `1 \leq i \leq k` and all pairs `x,y in X_i`, `d(x, y) \leq B`?

Post your solution to the Feb 22 Class Thread.

Boolean Formulas

Satisfiability, Unsatisfiability, and Validity

Conjunctive Normal Form

The Cook-Levin Theorem (1971, 1973)

Theorem.
(1) SAT is `NP`-complete.
(2) 3SAT is `NP`-complete.

Proof. First notice that, given a truth assignment, we can check in polynomial time whether each clause in a CNF formula is satisfied. So both SAT and 3SAT are in `NP`, and it suffices to show they are hard for `NP`. We will prove this over the next couple of slides.
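The polynomial-time check of a truth assignment can be sketched as follows. The encoding is an assumption made for illustration: a clause is a list of nonzero integers in the DIMACS style, where `k` stands for variable `k` and `-k` for its negation, and the function name `satisfies` is hypothetical.

```python
def satisfies(cnf, assignment):
    """Return True iff every clause has at least one true literal.

    cnf: list of clauses, each a list of nonzero ints (k = x_k, -k = NOT x_k).
    assignment: dict mapping each variable index to a bool.
    Runs in time linear in the total number of literals.
    """
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )

# (x1 OR NOT x2) AND (x2 OR x3) AND (NOT x1 OR NOT x3)
cnf = [[1, -2], [2, 3], [-1, -3]]
ok = satisfies(cnf, {1: True, 2: True, 3: False})
bad = satisfies(cnf, {1: True, 2: False, 3: True})
```

The same check works unchanged for 3SAT, since a 3CNF formula is just a CNF formula whose clauses have at most three literals.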

CNFs are universal

Claim. For every Boolean function `f:{0, 1}^l -> {0,1}`, there is an `l`-variable CNF formula `phi` of size at most `l2^l` such that `phi(u) = f(u)` for every truth assignment `u in {0, 1}^l`. Here the size of a CNF formula is defined to be the number of `^^`'s/`vv`'s appearing in it.

Proof. For each `v in {0,1}^l` we make a clause `C_v(u_1, ..., u_l)` in which `u_i` appears negated if bit `i` of `v` is `1`, and otherwise appears un-negated. Notice this clause has `l` literals, joined by `l - 1` ORs. Also notice `C_v(v) = 0` and `C_v(u) = 1` for `u ne v`. Using these `C_v`'s we can define a CNF formula for `f` as:
`phi = ^^_(v:f(v) = 0) C_v(u_1, ..., u_l)`.
If `f(u) = 0`, then the clause `C_u` is in `phi` and is falsified by `u`, so `phi(u) = 0`; if `f(u) = 1`, then every clause in `phi` is some `C_v` with `v ne u`, so `phi(u) = 1`. As there are at most `2^l` strings `v` with `f(v) = 0`, the total size of this CNF is at most `l 2^l`.
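The construction in this proof can be sketched in Python. This is a minimal illustration and makes its own assumptions: literals are encoded as nonzero integers (`i` for `u_i`, `-i` for its negation), `f` takes a tuple of bits, and the names `cnf_from_function` and `evaluate` are hypothetical.

```python
from itertools import product

def cnf_from_function(f, l):
    """Build the CNF with one clause C_v for every v in {0,1}^l with f(v) = 0."""
    cnf = []
    for v in product((0, 1), repeat=l):
        if f(v) == 0:
            # In C_v, variable u_i is negated iff bit i of v is 1,
            # so v is the only assignment that falsifies C_v.
            cnf.append([-(i + 1) if bit else (i + 1) for i, bit in enumerate(v)])
    return cnf

def evaluate(cnf, u):
    """Return 1 if the bit tuple u satisfies every clause, else 0."""
    return int(all(
        any((u[abs(lit) - 1] == 1) == (lit > 0) for lit in clause)
        for clause in cnf
    ))

# Example: XOR on two bits is 0 exactly on 00 and 11.
xor = lambda v: v[0] ^ v[1]
phi = cnf_from_function(xor, 2)
agrees = all(evaluate(phi, u) == xor(u) for u in product((0, 1), repeat=2))
```

For XOR the sketch produces the two clauses `(u_1 vv u_2)` and `(not u_1 vv not u_2)`. The loop visits all `2^l` assignments, so this is only practical for small `l`, which matches the exponential size bound in the claim.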

Example of Converting to CNF

The NP-hardness of SAT is to be continued next day....