The Optimization Version of Subset Sum

Last day, we were talking about approximation algorithms for subset-sum.
Since this was toward the end of lecture, let's start today by quickly reviewing these slides.
In subset sum we are given a pair `(S,t)` where `S` is a set of positive integers `{x_1, ..., x_n}` and `t` is a positive integer and the goal is to find a subset of `S` which sums to `t`.
We showed this problem was `NP`-complete last week.
In the optimization problem, we wish to find a subset of `S` whose sum is as large as possible but is still `le t`.
For example, we may have a ship that when loaded with more than `t` tonnes of goods will sink.
We can imagine having `n` containers `{1, ..., n}` that need to be shipped, where container `i` has weight `x_i`.
If we get paid by the tonne, we want to load as much as possible on the ship without sinking it.

An Exponential-time Exact Algorithm

One way to solve the previous task is the following:

bestS' := ∅
bestW := 0
foreach S' ⊆ S:
   w := ∑_{x ∈ S'}x
   if w > bestW and w ≤ t:
        bestW := w
        bestS' := S'
return bestS'

As there are `2^(|S|)` subsets to cycle over, this is an exponential-time algorithm.
We can do a little better than the above: when computing the sum, if it ever exceeds `t`, we can skip to the next subset.
We can thus modify the algorithm so that it iteratively computes `L_i`, the list of sums of all subsets of `{x_1, ..., x_i}` that do not exceed `t`, and then returns the maximum value of `L_n`.
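
The brute-force procedure above can be sketched in Python as follows. This is only an illustrative sketch: the function name `brute_force_subset_sum` and the sample input are assumptions made for the example, not part of the lecture.

```python
from itertools import combinations

def brute_force_subset_sum(S, t):
    """Try every subset of S; return (best_sum, best_subset) with best_sum <= t."""
    items = list(S)
    best_w, best_subset = 0, ()
    # Enumerate subsets by size r = 0, 1, ..., n; there are 2^n of them in total.
    for r in range(len(items) + 1):
        for subset in combinations(items, r):
            w = sum(subset)
            if best_w < w <= t:
                best_w, best_subset = w, subset
    return best_w, best_subset

# Hypothetical input: S = {1, 4, 5}, t = 8.
print(brute_force_subset_sum([1, 4, 5], 8))
```

The loop visits all `2^n` subsets, which is what makes the approach impractical beyond small `n`.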

Improving the Exponential Time Algorithm

If `L` is a list of positive integers and `x` is another positive integer, let `L + x` denote the list of positive integers obtained by increasing each element of `L` by `x`.
So if `L = langle 1,2,3,5,9 rangle` then `L + 2 = langle 3, 4, 5, 7,11 rangle`.
We use a similar notation for sets rather than lists:
`S + x = {s +x | s in S}`.
Suppose `L` and `L'` are sorted lists. Let MERGE-LISTS(L, L') be the procedure that uses two counters to scan through the elements of `L` and `L'` once and outputs the sorted list containing all the elements of both lists, with duplicates removed. As this is just one scan of the two lists, its runtime is `O(|L| + |L'|)`.
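
A minimal Python sketch of the `L + x` operation and of MERGE-LISTS, assuming both inputs are already sorted in increasing order. The names `add_to_all` and `merge_lists` are chosen here for illustration.

```python
def add_to_all(L, x):
    """Return the list L + x: every element of L increased by x."""
    return [y + x for y in L]

def merge_lists(L1, L2):
    """Merge two sorted lists into one sorted list without duplicates.

    Uses two counters (i and j) and a single scan, so it runs in O(|L1| + |L2|).
    """
    merged = []
    i = j = 0
    while i < len(L1) or j < len(L2):
        # Take the smaller front element; fall back to L1 when L2 is exhausted.
        if j == len(L2) or (i < len(L1) and L1[i] <= L2[j]):
            value = L1[i]
            i += 1
        else:
            value = L2[j]
            j += 1
        # Equal values arrive next to each other, so comparing with the
        # last output element is enough to drop duplicates.
        if not merged or merged[-1] != value:
            merged.append(value)
    return merged

L = [1, 2, 3, 5, 9]
print(add_to_all(L, 2))                  # [3, 4, 5, 7, 11]
print(merge_lists(L, add_to_all(L, 2)))  # [1, 2, 3, 4, 5, 7, 9, 11]
```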

Using this, our improved procedure for exact subset sum is:

EXACT-SUBSET-SUM(S, t)
Here S = {x[1], ..., x[n]}
1 n = |S|
2 L[0] = (0)
3 for i = 1 to n
4     L[i] = MERGE-LISTS(L[i-1], L[i-1] + x[i])
5     remove from L[i] every element that is greater than t
6 return the largest element in L[n]
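
One possible Python rendering of EXACT-SUBSET-SUM is sketched below. As an assumption of this sketch, the merge step uses the standard library's `heapq.merge` (which lazily merges sorted inputs) in place of a hand-written MERGE-LISTS, and duplicates and values above `t` are dropped during the same scan.

```python
from heapq import merge

def exact_subset_sum(S, t):
    """Return the largest subset sum of S that does not exceed t."""
    sums = [0]  # L[0]: only the empty subset
    for x in S:
        shifted = [y + x for y in sums]  # L[i-1] + x[i]
        merged = []
        for value in merge(sums, shifted):  # values arrive in sorted order
            if value > t:
                break  # everything after this point is larger still
            if not merged or merged[-1] != value:
                merged.append(value)  # skip duplicates
        sums = merged  # L[i]
    return sums[-1]  # the list is sorted, so the last element is the largest

print(exact_subset_sum([1, 4, 5], 8))  # 6
```

The list `sums` can still double in length on each iteration, so this remains exponential in the worst case.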

Example of Exponential Time Algorithm

Let `P_i` denote the set of all values obtained by selecting a subset of `{x_1, ..., x_i}` and summing its members.
For example, if `S = {1, 4, 5}`, then
`P_1 = {0, 1}`
`P_2 = {0, 1, 4, 5}`
`P_3 = {0, 1, 4, 5, 6, 9, 10}`.
Given the identity
`P_i = P_(i-1) cup (P_(i-1) + x_i)`,
by induction on `i`, we have that the list `L_i` is a sorted list containing every element of `P_i` whose value is not more than `t`.
Unfortunately, the length of `L_i` can be as much as `2^i`, so this is an exponential time algorithm.
If, however, `t` is polynomial in `|S|`, or if all the numbers in `S` are bounded by a polynomial in `|S|`, then this is a `p`-time algorithm.
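
The identity for `P_i` can be checked directly on the example `S = {1, 4, 5}`. The helper name `subset_sum_sets` below is an assumption made for this sketch.

```python
def subset_sum_sets(S):
    """Return [P_0, P_1, ..., P_n] using P_i = P_(i-1) union (P_(i-1) + x_i)."""
    P = [{0}]  # P_0 contains only the sum of the empty subset
    for x in S:
        prev = P[-1]
        P.append(prev | {s + x for s in prev})
    return P

for i, P_i in enumerate(subset_sum_sets([1, 4, 5])):
    print(i, sorted(P_i))
```

Each step can at most double the size of the set, which is where the `2^i` bound on the length of `L_i` comes from.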

A Fully p-time Approximation Scheme

We will use the idea of our previous algorithm to get a fully `p`-time approximation scheme for the subset-sum problem by "tossing out/trimming" some of the values in each list `L_i` as it is created, to keep the list length from getting too large.
To do this, we use a trimming parameter `delta` such that `0 < delta < 1`.
A list `L'` is a trimmed list of `L` by `delta` if it contains only elements from `L` and if for every `y in L` there is a `z in L'` such that
`y/(1+delta) leq z leq y`.
For example, if `delta = 0.1` and
`L = langle 10, 11, 12, 15, 20, 21, 22, 23, 24, 29 rangle`,
then one possible `delta` trimmed list of `L` is `L' = langle 10, 12, 15, 20, 23, 29 rangle`.
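
The definition can be turned into a small checker. The sketch below is an illustration only: the name `is_trimmed` is made up here, and `delta` is handled as an exact `Fraction` so that boundary cases such as `11/1.1 = 10` are not affected by floating-point round-off.

```python
from fractions import Fraction

def is_trimmed(L_prime, L, delta):
    """Return True if L_prime is a delta-trimmed list of L."""
    delta = Fraction(delta)
    # L_prime may only contain elements of L.
    if not set(L_prime) <= set(L):
        return False
    # y/(1+delta) <= z <= y is rewritten as z <= y <= z*(1+delta),
    # which is equivalent because 1 + delta > 0.
    return all(any(z <= y <= z * (1 + delta) for z in L_prime) for y in L)

L = [10, 11, 12, 15, 20, 21, 22, 23, 24, 29]
print(is_trimmed([10, 12, 15, 20, 23, 29], L, Fraction(1, 10)))  # True
print(is_trimmed([10, 15, 20, 29], L, Fraction(1, 10)))          # False: 12 has no representative
```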

In-Class Exercise

Suppose we had the list `L= langle 50,51,55,70,71,82,83,84,99 rangle`.
Let `delta = 0.1`. Come up with a `delta` trimmed list for `L`.
Post your solutions to the May 8 In-Class Exercise Thread.

A Trimming Procedure

We next give a `Theta(m)` time algorithm to create a `delta` trimmed list from a list `L = langle y_1, ..., y_m rangle` assuming `L` is sorted into monotonically increasing order.

TRIM(L, δ)
We assume L = (y[1], ..., y[m])
1 let m be the length of L
2 L' = (y[1])
3 last = y[1]
4 for i = 2 to m
5     if y[i] > last * (1 + δ) // y[i] ≥ last because L is sorted
6         append y[i] to L'
7         last = y[i]
8 return L'
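
A direct Python transcription of TRIM might look like the following sketch. It assumes `L` is non-empty and sorted in increasing order, and it uses ordinary floating-point arithmetic for the comparison.

```python
def trim(L, delta):
    """Return a delta-trimmed copy of the sorted, non-empty list L in one pass."""
    trimmed = [L[0]]
    last = L[0]
    for y in L[1:]:
        # Keep y only if the last kept value cannot stand in for it.
        if y > last * (1 + delta):
            trimmed.append(y)
            last = y
    return trimmed

print(trim([10, 11, 12, 15, 20, 21, 22, 23, 24, 29], 0.1))
# [10, 12, 15, 20, 23, 29]
```

Every dropped element `y` satisfies `last le y le last * (1 + delta)`, so `last` serves as its representative in the trimmed list.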

The Subset Sum Approximation Algorithm

We now give our approximation algorithm for subset sum.
It takes as input a set `S = {x[1], ..., x[n]}` of positive integers (in arbitrary order), a target `t`, and an "approximation parameter" `epsilon`, where `0 < epsilon < 1`.
It returns a value `z` that is within a `1 + epsilon` factor of the optimal solution.

APPROX-SUBSET-SUM(S, t, ε)
1 n = |S|
2 L[0] = (0)
3 for i = 1 to n
4     L[i] = MERGE-LISTS(L[i-1], L[i-1] + x[i])
5     L[i] = TRIM(L[i], ε/(2n))
6     remove from L[i] every element that is greater than t
7 let zstar be the largest value in L[n]
8 return zstar
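
Putting the pieces together, APPROX-SUBSET-SUM could be sketched in Python as below. The helper names are assumptions of this sketch, the merge uses the standard library's `heapq.merge` rather than a hand-written MERGE-LISTS, and the trimming comparison uses floating-point arithmetic.

```python
from heapq import merge

def trim(L, delta):
    """One-pass delta-trimming of a sorted, non-empty list."""
    trimmed, last = [L[0]], L[0]
    for y in L[1:]:
        if y > last * (1 + delta):
            trimmed.append(y)
            last = y
    return trimmed

def approx_subset_sum(S, t, eps):
    """Return a subset sum z <= t that is within a factor 1 + eps of the optimum."""
    n = len(S)
    delta = eps / (2 * n)  # trimming parameter from line 5
    L = [0]
    for x in S:
        merged = []
        for value in merge(L, [y + x for y in L]):   # line 4
            if not merged or merged[-1] != value:
                merged.append(value)
        L = trim(merged, delta)                       # line 5
        L = [value for value in L if value <= t]      # line 6
    return L[-1]  # largest remaining value

print(approx_subset_sum([104, 102, 201, 101], 308, 0.40))  # 302
```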

An Example of Our Algorithm in Action

Suppose we had `S=(104, 102, 201, 101)`, `t = 308`, and `epsilon = 0.40`.
The trimming parameter is `delta = epsilon/(2n) = epsilon/8 = 0.05`.

Below are the values of the `L[i]`'s at various places as we step through our algorithm:

line 2: L[0] = ( 0 )
line 4: L[1] = ( 0, 104 )
line 5: L[1] = ( 0, 104 )
line 6: L[1] = ( 0, 104 )
line 4: L[2] = ( 0, 102, 104, 206 )
line 5: L[2] = ( 0, 102, 206 )
line 6: L[2] = ( 0, 102, 206 )
line 4: L[3] = ( 0, 102, 201, 206, 303, 407 )
line 5: L[3] = ( 0, 102, 201, 303, 407 )
line 6: L[3] = ( 0, 102, 201, 303 )
line 4: L[4] = ( 0, 101, 102, 201, 203, 302, 303, 404 )
line 5: L[4] = ( 0, 101, 201, 302, 404 )
line 6: L[4] = ( 0, 101, 201, 302 )
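
The trace ends with 302 as the largest value in `L[4]`. As a sanity check, the short sketch below finds the true optimum by brute force and compares the two; the variable names are chosen for this illustration.

```python
from itertools import combinations

S, t, eps = (104, 102, 201, 101), 308, 0.40

# All subset sums that do not exceed t (brute force is fine for n = 4).
feasible = [
    sum(subset)
    for r in range(len(S) + 1)
    for subset in combinations(S, r)
    if sum(subset) <= t
]
y_star = max(feasible)  # optimal value: 104 + 102 + 101 = 307
z_star = 302            # value taken from the last line of the trace
ratio = y_star / z_star

print(y_star, round(ratio, 4), ratio <= 1 + eps)  # 307 1.0166 True
```

The answer is off by about 1.7%, well inside the allowed 40%.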

Proof That APPROX-SUBSET-SUM works

Theorem. APPROX-SUBSET-SUM is a fully `p`-time approximation scheme for the subset-sum problem.

Proof. Trimming and removing elements of value greater than `t` maintain the property that every element of `L[i]`, which from now on we'll write as `L_i`, is also a member of `P_i`. So zstar, which we'll write from now on as `z^star`, returned by line 8 is the sum of some subset of `S`. Let `y^star in P_n` denote an optimal solution to the subset-sum problem. Since `z^star` is the sum of a subset of `S` and, by line 6, `z^star le t`, we know that `z^star le y^star`. So we need to show `y^star/z^star le 1 +epsilon` and that the algorithm runs in time polynomial in both `1/epsilon` and the input size.

From the definition of trimming one can show for every element `y in P_i` that is at most `t` there exists a `z in L_i` such that
`y/(1+epsilon/(2n))^i le z le y`.
In particular as `y^star in P_n`, there exists an element `z in L_n` such that
`y^star/(1+epsilon/(2n))^n le z le y^star`,
and so,
`y^star/z leq (1+epsilon/(2n))^n`.
Since there exists a `z in L_n` satisfying the above, it must also hold for `z^star`, the largest element of `L_n`. Therefore
`y^star/z^star le (1+epsilon/(2n))^n`.
We now argue `(1+epsilon/(2n))^n le 1 + epsilon` to get `y^star/z^star le 1 + epsilon`. To see this, notice that
`lim_(n -> infty) (1 + epsilon/(2n))^n = e^(epsilon/2)` and, as `d/(dn)(1 + epsilon/(2n))^n > 0`, `(1 + epsilon/(2n))^n` increases with `n`. So we have
`(1 + epsilon/(2n))^n le e^(epsilon/2)`
`le 1 + epsilon/2 + (epsilon/2)^2` (looking at terms in Taylor series)
`le 1 + epsilon`.
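
The chain of inequalities can be spot-checked numerically. The sketch below is not a proof; it evaluates the four quantities for a few sample values of `epsilon` and `n` chosen here for illustration.

```python
import math

def bound_chain(eps, n):
    """Return the four quantities in the chain, from left to right."""
    x = eps / 2
    return (
        (1 + eps / (2 * n)) ** n,  # (1 + eps/(2n))^n
        math.exp(x),               # e^(eps/2)
        1 + x + x ** 2,            # 1 + eps/2 + (eps/2)^2
        1 + eps,                   # 1 + eps
    )

for eps in (0.1, 0.4, 0.9):
    for n in (1, 4, 1000):
        a, b, c, d = bound_chain(eps, n)
        print(eps, n, a <= b <= c <= d)
```

The last step uses `(epsilon/2)^2 le epsilon/2`, which holds because `0 < epsilon < 1`.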

So the algorithm satisfies the desired approximation ratio. To complete the proof, we need to show a bound on the length of `L_i`.

Bound the Length of the `L_i`'s

After trimming, successive elements `z` and `z'` of `L_i` must have the relationship `(z')/z > 1 + epsilon/(2n)`. That is, they must differ by a factor of at least `1 + epsilon/(2n)`. So each list contains the value 0, possibly the value 1, and up to `|__ log_(1+epsilon/(2n)) t__|` additional values. So the number of elements in each list `L_i` is at most
`log_(1+epsilon/(2n)) t +2 = (ln t)/(ln(1+epsilon/(2n))) + 2`
`le (2n(1 + epsilon/(2n)) ln t)/(epsilon) + 2`
`< (3n ln t)/epsilon +2`
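which is polynomial in both the input size and `1/epsilon`: writing down `t` takes about `log_2 t` bits of the input, so `ln t` is bounded by the input size. Since the running time of APPROX-SUBSET-SUM is polynomial in the lengths of the `L_i`, this completes the proof.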
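
To get a feel for the bound, the sketch below evaluates the three expressions in the chain for the earlier example values `n = 4`, `t = 308`, `epsilon = 0.40`. The function name `length_bounds` is an assumption made for this illustration.

```python
import math

def length_bounds(n, t, eps):
    """Return the three successive upper bounds on the length of each L_i."""
    delta = eps / (2 * n)
    first = math.log(t) / math.log(1 + delta) + 2          # log_(1+delta) t + 2
    second = 2 * n * (1 + delta) * math.log(t) / eps + 2   # uses ln(1+x) >= x/(1+x)
    third = 3 * n * math.log(t) / eps + 2                  # uses 2(1 + delta) < 3
    return first, second, third

first, second, third = length_bounds(4, 308, 0.40)
print(round(first, 1), round(second, 1), round(third, 1))
```

In the worked example the lists never held more than five elements after trimming, so the bound is far from tight on small inputs; its role is only to guarantee polynomial growth.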

The Probabilistic Method

The probabilistic method is a technique for showing the existence of combinatorial objects, which in turn can be used in algorithm construction.
There are two common ways to assert the existence of something from knowledge about probabilities; using either can be said to be using the probabilistic method. The two ideas are:
1. Any random variable assumes at least one value that is no smaller than its expectation, and at least one value that is no greater than its expectation. For example, if we know the average salary of a computer scientist is $20,000, then we know there must be at least one computer scientist with a salary less than or equal to $20,000.
2. If an object chosen randomly from a universe satisfies a property with positive probability, then there must be an object in the universe that satisfies that property. For example, if we know that a ball randomly chosen from a bin is red with probability 1/3, then we know there is at least one red ball.

Finding Cuts in Graphs

Given a graph `G=(V, E)`, a cut is a partition of the vertex set `V` into two disjoint sets `A`, `B`.
The size of a cut is the number of edges `(a,b) in E` such that `a in A` and `b in B`. (If the graph had weighted edges, then we would sum the weights of these edges.)
The problem of finding a minimum cut (a cut of smallest size) is connected with finding flows in a network and can be solved in polynomial time.
On the other hand, the optimization problem of finding a maximum cut, the max-cut problem, is known to be `NP`-hard.

Cut-size and the Probabilistic Method

Theorem. For any undirected graph `G=(V, E)` with `n` vertices and `m` edges there is a cut `A`, `B` such that
`|{(u, v) in E | u in A and v in B}| ge m/2`

Proof. Consider the following experiment: For each vertex, flip an unbiased coin, and if it is heads put the vertex in `A`; otherwise, put the vertex in `B`.

For an edge `(u,v)`, the probability that its end-points are in different sets is 1/2. By linearity of expectation, the expected number of edges with end-points in different sets is thus `m/2`. It follows by the probabilistic method that there must be a partition satisfying the theorem. QED.

Remark. The above experiment essentially gives us an algorithm to find a cut of expected size `m/2`. In general, the probabilistic method will be closely tied with randomized algorithms for constructing objects.
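
The coin-flipping experiment can be sketched in Python as follows. The function names and the small test graph are assumptions made for this illustration, and the graph is assumed to have no self-loops. The helper `average_cut_size` enumerates all `2^n` outcomes of the coin flips, so it is only meant for tiny graphs.

```python
import random
from itertools import product

def cut_size(edges, side):
    """Count edges whose end-points lie on different sides of the cut."""
    return sum(1 for u, v in edges if side[u] != side[v])

def random_cut(vertices, edges, rng=random):
    """Flip a fair coin per vertex: True means side A, False means side B."""
    side = {v: rng.random() < 0.5 for v in vertices}
    return side, cut_size(edges, side)

def average_cut_size(vertices, edges):
    """Average cut size over all 2^n equally likely coin-flip outcomes."""
    vertices = list(vertices)
    total = sum(
        cut_size(edges, dict(zip(vertices, bits)))
        for bits in product([False, True], repeat=len(vertices))
    )
    return total / 2 ** len(vertices)

# Hypothetical graph: a 4-cycle plus one diagonal, so m = 5.
V = [0, 1, 2, 3]
E = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
print(average_cut_size(V, E))  # 2.5, i.e. m/2
print(random_cut(V, E)[1])     # size of one random cut
```

Since the average is exactly `m/2`, at least one of the enumerated partitions must have size at least `m/2`, which is the first form of the probabilistic method applied to this experiment.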

Subset Sum Approximation - The Probabilistic Method

Outline