Outline
- Weighted Vertex Cover
- Quiz
- Approximating Subset Sum
Introduction
- We have been looking at approximation algorithms for NP-hard problems.
- We have given constant factor approximations for the optimization versions of vertex cover and TSP with triangle inequality.
- We showed that general TSP does not admit a constant factor approximation algorithm.
- We showed for set cover that sometimes one can come up with approximation algorithms whose approximation ratio grows with some property
of the instance.
- We gave a one-sided error randomized quadratic algorithm for 2-SAT based on random walks. Although this algorithm runs in quadratic time,
in the general k-SAT case the walks turn out to need exponentially many steps to have a good chance of finding a satisfying assignment.
- We did however give a constant factor randomized approximation algorithm for 3-SAT.
- Recall an approximation scheme for an optimization problem is an algorithm that takes
both an instance of the problem as well as a constant `epsilon` and then runs a
`(1 + epsilon)`-approximation on the instance.
- If for any `epsilon`, the approximation scheme runs in `p`-time,
then it is called a polynomial time approximation scheme.
- We say that an approximation scheme is a fully `p`-time approximation scheme if it is an
approximation scheme and its run time is `p`-time in both `1/epsilon` and the instance size `n`.
For example, the scheme might have a running time of `O((1/epsilon)^2 n^3)`.
- We start today by giving one more approximation result,
this time making use of linear programming, then we give a fully `p`-time approximation scheme for subset sum.
Weighted Vertex Cover
- The minimum-weight vertex cover problem is given a graph `G=(V, E)` and a positive weight function `w(v)` on vertices,
find a vertex cover `V' subseteq V`, such that `w(V') = sum_(v in V') w(v)` is as small as possible.
- Our vertex cover algorithm from before treats all vertices equally and may return a very heavy cover, so a new algorithm is needed for this problem.
- We are going to give a linear programming-based algorithm to get a solution.
0-1 Program for Minimum Weight Vertex Cover
- In a linear program, we are trying to maximize or minimize a function of several variables, subject to a set of linear inequalities on those variables.
- In our case, to each vertex `v in V` associate a 0 or 1 valued variable `x(v)`. We put `v` in the vertex cover iff `x(v)=1`.
- The constraint that for any edge `(u,v)` one of `u` or `v` must be in the vertex cover, can be represented as the constraint
`x(u) + x(v) ge 1`.
- This gives us the following 0-1 integer program for minimum weight vertex cover:
minimize
`sum_(v in V) w(v) x(v)`
subject to
`x(u) + x(v) ge 1` for each `(u,v) in E`
`x(v) in {0, 1}` for each `v in V`.
- The special case where all the `w(v)` are 1 is the optimization version of the NP-hard vertex cover problem, so the above kinds of programs are hard
to solve.
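To make the 0-1 program concrete, here is a minimal Python sketch (the graph, weights, and function name are our own illustrative choices) that solves it by brute force over all 0-1 assignments. This takes `2^n` time, which is exactly why we turn to relaxation next.

```python
from itertools import product

def min_weight_vc_bruteforce(vertices, edges, w):
    """Solve the 0-1 program for minimum-weight vertex cover
    by enumerating all 2^n assignments x(v) in {0, 1}."""
    best_cover, best_weight = None, float("inf")
    for bits in product([0, 1], repeat=len(vertices)):
        x = dict(zip(vertices, bits))
        # Feasibility: x(u) + x(v) >= 1 for every edge (u, v).
        if all(x[u] + x[v] >= 1 for (u, v) in edges):
            weight = sum(w[v] for v in vertices if x[v] == 1)
            if weight < best_weight:
                best_weight = weight
                best_cover = {v for v in vertices if x[v] == 1}
    return best_cover, best_weight

# A path a - b - c with a heavy middle vertex: {a, c} is optimal.
V = ["a", "b", "c"]
E = [("a", "b"), ("b", "c")]
w = {"a": 1, "b": 5, "c": 1}
cover, weight = min_weight_vc_bruteforce(V, E, w)
print(weight)  # 2
```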
Using Relaxation to Approximately Solve Problems
- If instead of requiring `x(v) in {0,1}`, we only require `0 le x(v) le 1`, we obtain the following linear program:
minimize
`sum_(v in V) w(v) x(v)`
subject to
`x(u) + x(v) ge 1` for each `(u,v) in E`
`0 le x(v) le 1` for each `v in V`.
- This is known as the linear programming relaxation of our 0-1 program.
- Notice any feasible solution (a setting to the variables that satisfies all the constraints) to the original 0-1 integer program is also a feasible solution to the linear program.
- So an optimal solution to the linear program gives a lower bound on the value of an optimal solution to the 0-1 integer program, and hence a lower bound on the optimal weight in the minimum weight vertex-cover problem.
Approximation Algorithm For Minimum Weight Vertex Cover
- We now use the linear program from the last slide together with rounding to get an approximation algorithm for minimum weight vertex-cover.
APPROX-MIN-WEIGHT-VC(G, w)
1 C = ∅
2 Compute x, an optimal solution to the
linear program of the previous slide
3 for each v ∈ V
4 if x(v) ≥ 1/2
5 C = C ∪ {v}
6 return C
- One can use the ellipsoid method to find an optimal solution to the linear program in `p`-time (`O(n^4 L)`, where `L` is the number of input digits; see Chapter 29 of the book). We will assume we are just calling some software package that implements this.
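As a sketch of the rounding step (lines 3-5), the snippet below takes a fractional LP solution as given, since we treat the LP solver as a black box. The triangle graph and its fractional optimum `x(v) = 1/2` are our own illustrative assumption, not computed here.

```python
def round_lp_solution(x, w, edges):
    """Round a fractional vertex-cover LP solution x: keep every
    vertex v with x[v] >= 1/2 (lines 3-5 of APPROX-MIN-WEIGHT-VC)."""
    C = {v for v in x if x[v] >= 0.5}
    # Sanity check: C really is a vertex cover.
    assert all(u in C or v in C for (u, v) in edges)
    return C

# Triangle graph, unit weights; the LP optimum sets every
# x(v) = 1/2 with value z* = 1.5 (assumed, not computed here).
edges = [(0, 1), (1, 2), (0, 2)]
w = {0: 1, 1: 1, 2: 1}
x = {0: 0.5, 1: 0.5, 2: 0.5}
C = round_lp_solution(x, w, edges)
zstar = sum(w[v] * x[v] for v in x)
print(C, sum(w[v] for v in C), 2 * zstar)  # {0, 1, 2} 3 3.0
```

On the triangle, the rounded cover has weight `3 = 2z^star`, showing the factor-2 analysis can be tight for this rounding.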
APPROX-MIN-WEIGHT-VC is a 2-approximation algorithm
Theorem. APPROX-MIN-WEIGHT-VC is a polynomial time 2-approximation algorithm for the minimum-weight vertex-cover problem.
Proof. As we have already mentioned, line 2 in the algorithm can be done in p-time using the ellipsoid method.
Lines 3-5 are linear time in the number of vertices, so the whole algorithm is p-time.
Let `C^star` be an optimal solution to the minimum-weight vertex-cover problem, and let `z^{\star}` be the value of an
optimal solution to the linear program described on the previous slides. Since an optimal cover is a feasible solution to the
linear program, we have
`z^{\star} le w(C^star)`.
The Theorem follows from the following claim which we prove on the next slide:
Claim. The rounding of variables `x(v)` in APPROX-MIN-WEIGHT-VC produces a set `C`
that is a vertex cover and satisfies `w(C) le 2z^{\star}`.
Proof of Claim
As one of our constraints is `x(u) + x(v) ge 1`, at least one of `x(u)` or `x(v)` must be at least 1/2. Therefore, at least one of `u` or `v` is included in `C`, and so every edge is covered.
Consider the weight of the cover. We have
`z^{\star} = sum_(v in V) w(v) x(v)`
`ge sum_(v in V; x(v) ge 1/2)w(v) x(v)`
`ge sum_(v in V; x(v) ge 1/2)w(v) 1/2`
`= sum_(v in C)w(v) 1/2`
`= 1/2 sum_(v in C)w(v)`
`= 1/2 w(C)`
So this gives:
`w(C) le 2z^{\star} le 2w( C^star)`
completing the proof.
Quiz
Which of the following statements is true?
- Our APPROX-TSP-TOUR algorithm was a p-time, 2-approximation algorithm for general TSP.
- Our GREEDY-SET-COVER algorithm was a p-time, 2-approximation algorithm for SET COVER.
- Picking a satisfying assignment uniformly at random is a randomized `8/7`-approximation algorithm for MAX-3SAT.
The Optimization Version of Subset Sum
- In subset sum we are given a pair `(S,t)` where `S` is a set of positive integers `{x_1, ..., x_n}` and `t` is a positive
integer and the goal is to find a subset of `S` which sums to `t`.
- We showed this problem was `NP`-complete last week.
- In the optimization problem, we wish to find a subset of `S` whose sum is as large as possible but is still `le t`.
- For example, we may have a ship that when loaded with more than `t` tonnes of goods will sink.
- We can imagine having `n` containers `{1, ..., n}` that need to be shipped such that container `i` has weight `x_i`.
- If we get paid by the tonne, we want to load as much as possible on the ship without sinking it.
An Exponential-time Exact Algorithm
Improving the Exponential Time Algorithm
- If `L` is a list of positive integers and `x` is another positive integer, let `L +x` denote the
list of positive integers obtained by increasing each element of `L` by `x`.
- So if `L = langle 1, 2, 3, 5, 9 rangle` then `L + 2 = langle 3, 4, 5, 7, 11 rangle`.
- We use a similar notation for sets rather than lists:
`S + x = {s +x | s in S}`.
- Suppose `L` and `L'` are sorted lists, let MERGE-LISTS(L, L') be the procedure which uses
two counters to scan through the elements of `L` and `L'` once and outputs the sorted list with all the elements of each of these lists and with duplicates removed.
As this is just one scan of the two lists, its runtime is `O(|L| + |L'|)`.
- Using this, our improved procedure for exact subset sum is:
EXACT-SUBSET-SUM(S, t)
Here S = {x[1], ... x[n]}
1 n = |S|
2 L[0] = (0)
3 for i = 1 to n
4 L[i] = MERGE-LISTS(L[i-1], L[i-1] + x[i])
5 remove from L[i] every element that is greater than t
6 return the largest element in L[n]
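The pseudocode above translates directly into Python; a minimal sketch (function names are our own):

```python
def merge_lists(L1, L2):
    """Merge two sorted lists into one sorted list,
    removing duplicates, in O(|L1| + |L2|) time."""
    out, i, j = [], 0, 0
    while i < len(L1) or j < len(L2):
        if j == len(L2) or (i < len(L1) and L1[i] <= L2[j]):
            v = L1[i]; i += 1
        else:
            v = L2[j]; j += 1
        if not out or out[-1] != v:   # skip duplicates
            out.append(v)
    return out

def exact_subset_sum(S, t):
    """Return the largest subset sum of S that is at most t."""
    L = [0]
    for x in S:
        L = merge_lists(L, [y + x for y in L])
        L = [y for y in L if y <= t]   # line 5: drop sums above t
    return L[-1]

print(exact_subset_sum([1, 4, 5], 9))               # 9
print(exact_subset_sum([104, 102, 201, 101], 308))  # 307
```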
Example of Exponential Time Algorithm
- Let `P_i` denote the set of all values obtained by selecting a subset of `{x_1, ..., x_i}` and summing its members.
- For example, if `S = {1, 4, 5}`, then
`P_1 = {0, 1}`
`P_2 = {0, 1, 4, 5}`
`P_3 = {0, 1, 4, 5, 6, 9, 10}`.
- Given the identity
`P_i = P_(i-1) cup (P_(i-1) + x_i)`,
by induction on `i`, we have that the list `L_i` is a sorted list containing every element of `P_i` whose value is not more than
`t`.
- Unfortunately, the length of `L_i` can be as much as `2^i`, so this is an exponential time algorithm.
- If, however, `t` is polynomial in `|S|`, or if all the numbers in `S` are bounded by a polynomial in `|S|` then this is
a `p`-time algorithm.
A Fully p-time Approximation Scheme
- We will use the idea of our previous algorithm to get a fully `p`-time approximation scheme for subset sum
by "tossing out" (trimming) some of the values in each list `L_i` as it is created, to keep the list length from getting too large.
- To do this, we use a trimming parameter `delta` such that `0 < delta < 1`.
- A list `L'` is a trimmed list of `L` by `delta` if it contains only elements from `L` and if for every `y in L` there is a `z in L'`
such that
`y/(1+delta) leq z leq y`.
- For example, if `delta = 0.1` and
`L = langle 10, 11, 12, 15, 20, 21, 22, 23, 24, 29 rangle`,
then one possible `delta` trimmed list of `L` is
`L' = langle 10, 12, 15, 20, 23, 29 rangle`.
A Trimming Procedure
- We next give a `Theta(m)` time algorithm to create a `delta` trimmed list from
a list `L = langle y_1, ..., y_m rangle` assuming `L` is sorted into monotonically increasing order.
TRIM(L, δ)
We assume L = (y[1], ..., y[m])
1 let m be the length of L
2 L' = (y[1])
3 last = y[1]
4 for i = 2 to m
5 if y[i] > last * (1 + δ) // y[i] ≥ last because L is sorted
6 append y[i] to L'
7 last = y[i]
8 return L'
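A direct Python transcription of TRIM (the example list and `delta = 0.1` come from the earlier slide):

```python
def trim(L, delta):
    """Return a delta-trimmed copy of the sorted list L: keep an
    element only if it exceeds the last kept element by a factor
    of more than (1 + delta)."""
    trimmed = [L[0]]
    last = L[0]
    for y in L[1:]:              # y >= last because L is sorted
        if y > last * (1 + delta):
            trimmed.append(y)
            last = y
    return trimmed

L = [10, 11, 12, 15, 20, 21, 22, 23, 24, 29]
print(trim(L, 0.1))  # [10, 12, 15, 20, 23, 29]
```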
The Subset Sum Approximation Algorithm
- We now give our approximation algorithm for subset sum.
- It takes as input a set `S = {x[1], ..., x[n]}` of integers (in arbitrary order),
a target `t` and an "approximation parameter" `epsilon` where `0 < epsilon < 1`.
- It returns a value `zstar` that is within a `1 + epsilon` factor of the optimal solution.
APPROX-SUBSET-SUM(S, t, ε)
1 n = |S|
2 L[0] = (0)
3 for i = 1 to n
4 L[i] = MERGE-LISTS(L[i-1], L[i-1] + x[i])
5 L[i] = TRIM(L[i], ε/2n)
6 remove from L[i] every element that is greater than t
7 let zstar be the largest value in L[n]
8 return zstar
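Putting the pieces together, here is a minimal Python sketch of APPROX-SUBSET-SUM that inlines simple versions of MERGE-LISTS and TRIM; it is run on the same inputs as the worked example on the next slide.

```python
def approx_subset_sum(S, t, eps):
    """Return a subset sum z* of S with z* <= t and
    y*/z* <= 1 + eps, where y* is the optimal value."""
    n = len(S)
    delta = eps / (2 * n)        # per-round trimming parameter
    L = [0]
    for x in S:
        # MERGE-LISTS: sorted union without duplicates.
        L = sorted(set(L) | {y + x for y in L})
        # TRIM by delta.
        trimmed, last = [L[0]], L[0]
        for y in L[1:]:
            if y > last * (1 + delta):
                trimmed.append(y)
                last = y
        # Remove every element greater than t.
        L = [y for y in trimmed if y <= t]
    return L[-1]

print(approx_subset_sum([104, 102, 201, 101], 308, 0.40))  # 302
```

On this input the optimal value is `y^star = 104 + 102 + 101 = 307`, and the returned value 302 is well within the guarantee: `307/302 approx 1.017 le 1.4`.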
An Example of Our Algorithm in Action
- Suppose we had `S = {104, 102, 201, 101}`, `t = 308`, and `epsilon = 0.40`.
- The trimming parameter is `delta = epsilon/(2n) = 0.4/8 = 0.05`.
- Below are the values of the `L[i]`'s at various places as we step through our algorithm:
line 2: L[0] = ( 0 )
line 4: L[1] = ( 0, 104 )
line 5: L[1] = ( 0, 104 )
line 6: L[1] = ( 0, 104 )
line 4: L[2] = ( 0, 102, 104, 206 )
line 5: L[2] = ( 0, 102, 206 )
line 6: L[2] = ( 0, 102, 206 )
line 4: L[3] = ( 0, 102, 201, 206, 303, 407 )
line 5: L[3] = ( 0, 102, 201, 303, 407 )
line 6: L[3] = ( 0, 102, 201, 303 )
line 4: L[4] = ( 0, 101, 102, 201, 203, 302, 303, 404 )
line 5: L[4] = ( 0, 101, 201, 302, 404 )
line 6: L[4] = ( 0, 101, 201, 302 )
Proof That APPROX-SUBSET-SUM works
Theorem. APPROX-SUBSET-SUM is a fully `p`-time approximation scheme for the subset-sum problem.
Proof. Trimming and removing elements of value greater than `t` maintain the property that every element of
`L[i]`, which from now on we'll write as `L_i`, is also a member of `P_i`. So zstar, which we'll write from now on as `z^star`, returned by line 8
is the sum of some subset of `S`. Let `y^star in P_n` denote an optimal solution to the subset-sum problem.
Since line 6 removes every element greater than `t` and `z^star in P_n`, we know that `z^(star) leq y^star`. So we need to show `y^star/z^star le 1 + epsilon` and that
the algorithm runs in polynomial time in both `1/epsilon` and the input size.
From the definition of trimming, one can show by induction on `i` that for every element `y in P_i` that is at most `t` there exists a `z in L_i` such that
`y/(1+epsilon/(2n))^i le z le y`.
In particular as `y^star in P_n`, there exists an element `z in L_n` such that
`y^star/(1+epsilon/(2n))^n le z le y^star`,
and so,
`y^star/z leq (1+epsilon/(2n))^n`.
Since there exists a `z in L_n` satisfying the above, it must also hold for `z^star`, the largest element of `L_n`. Therefore
`y^star/z^star le (1+epsilon/(2n))^n`.
We now argue `(1+epsilon/(2n))^n le 1 + epsilon` to get `y^star/z^star le 1 + epsilon`. To see this notice
`lim_(n -> infty) (1 + epsilon/(2n))^n = e^(epsilon/2)` and as `d/(dn)(1 + epsilon/(2n))^n > 0`, `(1 + epsilon/(2n))^n` increases with
`n`. So we have
`(1 + epsilon/(2n))^n le e^(epsilon/2)`
`le 1 + epsilon/2 + (epsilon/2)^2` (looking at terms in Taylor series)
`le 1 + epsilon`.
So we have that the algorithm satisfies the desired approximation ratio, to complete the proof we need to show a bound
on the length of `L_i`.
Bound the Length of the `L_i`'s
After trimming, successive elements `z` and `z'` of `L_i` must satisfy `(z')/z > 1 + epsilon/(2n)`. That is, they must
differ by a factor of more than `1 + epsilon/(2n)`. So each list contains the value 0, possibly the value 1, and up to `floor(log_(1+epsilon/(2n)) t)` additional
values. So the number of elements in each list `L_i` is at most
`log_(1+epsilon/(2n)) t + 2 = (ln t)/(ln(1+epsilon/(2n))) + 2`
`le (2n(1 + epsilon/(2n)) ln t)/(epsilon) + 2` (using `ln(1+x) ge x/(1+x)`)
`< (3n ln t)/epsilon + 2`,
which is polynomial in both `n` and `1/epsilon`, since `t` is given as part of the input and so `ln t` is less than the input size.