The Role of Algorithms in Computing

CS146

Chris Pollett

Jan 27, 2014

Outline

An algorithm is any well-defined computational procedure that takes some values, or set of values, as input, and produces some values, or set of values, as output.

For example, an algorithm to compute the product `n cdot m` might be:

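// Computes n * m by repeated addition; assumes m >= 0.
int product(int n, int m)
{
    int p = 0;    // running total
    int cnt = 0;  // number of copies of n added so far
    while (cnt < m) {
        p += n;
        cnt++;
    }
    return p;
}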

The inputs are `n` and `m`, and we have specified a well-defined sequence of steps to produce the output `p`, which is their product.
We can view algorithms as tools (recipes) for solving a well-specified computational problem.

Often we pose a computational problem and then try to come up with different ways of solving it.
For example,
The Sorting Problem
Input: A sequence of `n` numbers `langle a_1, ..., a_n rangle`.
Output: A permutation (reordering) `langle a'_1, ..., a'_n rangle` of the input sequence such that `a'_1 leq a'_2 leq ... leq a'_n`.
Given the input sequence `langle 99, 5, 377, 4, 66, 8 rangle`, a sorting algorithm (a method to solve the above problem) returns as output `langle 4, 5, 8, 66, 99, 377 rangle`.
Such an input sequence is called an instance of the sorting problem.
An instance of a problem consists of an input needed to compute a solution to the problem.
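The input/output specification above can be turned into a small checker. This is only a sketch: the class name `SortingSpec` and the method `isValidOutput` are hypothetical names chosen for illustration, and the check relies on the library routine `Arrays.sort` to compare the two sequences as multisets.

```java
import java.util.Arrays;

// Hypothetical helper; names are illustrative and not part of the lecture.
class SortingSpec {
    // Returns true if `output` is a valid answer for the instance `input`:
    // it must be in nondecreasing order and be a permutation of `input`.
    static boolean isValidOutput(int[] input, int[] output) {
        if (input.length != output.length) {
            return false;
        }
        // Condition 1: a'_1 <= a'_2 <= ... <= a'_n.
        for (int i = 1; i < output.length; i++) {
            if (output[i - 1] > output[i]) {
                return false;
            }
        }
        // Condition 2: same values with the same multiplicities.
        // Since output is already known to be sorted, compare it with a
        // sorted copy of the input.
        int[] sortedInput = input.clone();
        Arrays.sort(sortedInput);
        return Arrays.equals(sortedInput, output);
    }

    public static void main(String[] args) {
        int[] instance = {99, 5, 377, 4, 66, 8};
        int[] candidate = {4, 5, 8, 66, 99, 377};
        System.out.println(isValidOutput(instance, candidate)); // prints true
    }
}
```

A checker like this says nothing about how to sort; it only states what any sorting algorithm must deliver for a given instance.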

There are many different ways we could try to solve a problem such as sorting.
For example, we could move left to right through the array, comparing adjacent elements and swapping them if they are out of order. If we make as many passes as the length of the sequence, we end up with a sorted list.
Alternatively, we might split the sequence into halves, sort each half and merge the results into a sorted list.
Which algorithm we choose might depend on the number of items to be sorted, how sorted the items already are, possible restrictions on the items' values, space constraints, etc.
We say an algorithm is correct if, for every input instance, it halts with the correct output.
We say that a correct algorithm solves the given computational problem.
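The two approaches to sorting described above might be sketched in Java as follows. The class name `SortSketches` and the method names are assumptions made for this example; the first method follows the adjacent-swap idea (commonly called bubble sort) and the second follows the split-and-merge idea (commonly called merge sort).

```java
import java.util.Arrays;

// Illustrative sketch only; names are hypothetical.
class SortSketches {
    // Approach 1: make n left-to-right passes, swapping adjacent
    // elements that are out of order. Sorts the array in place.
    static void bubbleSort(int[] a) {
        int n = a.length;
        for (int pass = 0; pass < n; pass++) {
            for (int i = 0; i + 1 < n; i++) {
                if (a[i] > a[i + 1]) {
                    int tmp = a[i];
                    a[i] = a[i + 1];
                    a[i + 1] = tmp;
                }
            }
        }
    }

    // Approach 2: split the sequence into halves, sort each half
    // recursively, then merge. Returns a new sorted array.
    static int[] mergeSort(int[] a) {
        if (a.length <= 1) {
            return a.clone();
        }
        int mid = a.length / 2;
        int[] left = mergeSort(Arrays.copyOfRange(a, 0, mid));
        int[] right = mergeSort(Arrays.copyOfRange(a, mid, a.length));
        int[] merged = new int[a.length];
        int i = 0, j = 0, k = 0;
        // Repeatedly take the smaller front element of the two halves.
        while (i < left.length && j < right.length) {
            merged[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        }
        // Copy whatever remains in either half.
        while (i < left.length) merged[k++] = left[i++];
        while (j < right.length) merged[k++] = right[j++];
        return merged;
    }

    public static void main(String[] args) {
        int[] a = {99, 5, 377, 4, 66, 8};
        System.out.println(Arrays.toString(mergeSort(a))); // prints [4, 5, 8, 66, 99, 377]
        bubbleSort(a);
        System.out.println(Arrays.toString(a));            // prints [4, 5, 8, 66, 99, 377]
    }
}
```

Both methods halt with the correct output on every instance, so both are correct in the sense defined above; they differ in how much time and extra space they use.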

The human mind is a computation device with tight memory constraints.
Experiments have shown we can only attend to or remember 3 to 7 things at the same time.
To multiply things quickly in one's head we might look for special cases.
For example, given `n` and `m`, we might look for an `a` such that:
`n = x - a` and
`m = x + a` then
`n cdot m = (x - a)(x + a) = x^2 - a^2`
and the latter might be easy to calculate given our limited memory. So `49 cdot 51 = 50^2 - 1^2 = 2500 - 1 = 2499`.
We can't always find an integer `a` of this form for which we also know the squares of `x` and `a`, so the above algorithm doesn't solve the general multiplication problem; nevertheless, it can be useful for mental math.
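As a sketch of when the shortcut applies: solving `n = x - a` and `m = x + a` gives `x = (n + m)/2` and `a = (m - n)/2`, which are integers exactly when `n` and `m` have the same parity. The class `MentalMath` and the method `productBySquares` below are hypothetical names for illustration; the method refuses inputs where the shortcut does not apply rather than returning a wrong answer.

```java
// Illustrative sketch only; names are hypothetical.
class MentalMath {
    // Computes n * m as x^2 - a^2, where x = (n + m) / 2 and a = (m - n) / 2.
    // Throws if n and m differ in parity, since x and a would not be integers.
    static long productBySquares(int n, int m) {
        long sum = (long) n + m;
        if (sum % 2 != 0) {
            throw new IllegalArgumentException("n and m must have the same parity");
        }
        long x = sum / 2;
        long a = ((long) m - n) / 2;
        return x * x - a * a;
    }

    public static void main(String[] args) {
        System.out.println(productBySquares(49, 51)); // prints 2499
    }
}
```

A computer gains nothing from this rewrite, since it still needs two multiplications; the benefit is for a human who happens to remember the two squares.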
There are other cases where we are willing to allow some errors provided the error rate is small. This comes up in the fastest algorithms for checking if a number is prime.
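Java's standard library exposes one such error-tolerant primality test through `BigInteger.isProbablePrime(int certainty)`. According to its documentation, a result of `false` means the number is definitely composite, while `true` means it is prime with probability exceeding `1 - 1/2^certainty`. The wrapper class `ProbablePrimeDemo` and the method `looksPrime` are hypothetical names for this sketch.

```java
import java.math.BigInteger;

// Illustrative sketch only; the wrapper names are hypothetical.
class ProbablePrimeDemo {
    // Returns false if n is definitely composite, true if n is prime
    // with probability greater than 1 - 1/2^certainty.
    static boolean looksPrime(long n, int certainty) {
        return BigInteger.valueOf(n).isProbablePrime(certainty);
    }

    public static void main(String[] args) {
        // 2^31 - 1 is a known prime, so the test always reports true.
        System.out.println(looksPrime(2147483647L, 30));
        // 561 = 3 * 11 * 17 is composite; a wrong answer of true is
        // possible in principle but extremely unlikely.
        System.out.println(looksPrime(561L, 30));
    }
}
```

Raising `certainty` lowers the chance of a wrong `true` at the cost of more running time, which is the trade-off described above.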

Human Genome Project - we wanted to identify all the genes in human DNA (estimated at the time to number about 100,000) and determine the 3 billion chemical base pairs that make up human DNA. This involved sophisticated string matching algorithms.
The Internet - algorithms are behind finding good routes for data through the network, from the place where the data was stored to whoever requested it.
Electronic Commerce - the core cryptographic and signature schemes rely on numerical algorithms and number theory.
Manufacturing - an oil company might need to find the best place to drill to maximize expected profits, an airline might want to assign crews to flights in the least expensive way possible, etc. These kinds of problems can be solved with linear programming algorithms.
This semester we will look at algorithms to solve a variety of useful problems.

Besides looking at algorithms, this semester we will also look at data structures.
A data structure is a way to store and organize data in order to facilitate access and modifications.
For example, we might use a tree-like structure to aid in a sorting process.
Different data structures work well for different purposes. For example, sometimes we might prefer using a tree to a hash table or vice-versa.
It is useful to know the strengths and weaknesses of a variety of data structures.
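To make the contrast concrete, here is a minimal sketch using two standard Java collections: `TreeMap`, which keeps its keys in sorted order, and `HashSet`, which is backed by a hash table and offers fast membership tests but no ordering. The class `DataStructureSketch` and its method names are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Illustrative sketch only; names are hypothetical.
class DataStructureSketch {
    // Uses a tree to aid sorting: TreeMap iterates its keys in ascending
    // order, and the values count how often each key occurred.
    static List<Integer> treeSort(int[] items) {
        TreeMap<Integer, Integer> counts = new TreeMap<>();
        for (int item : items) {
            counts.merge(item, 1, Integer::sum);
        }
        List<Integer> sorted = new ArrayList<>();
        for (Map.Entry<Integer, Integer> e : counts.entrySet()) {
            for (int c = 0; c < e.getValue(); c++) {
                sorted.add(e.getKey());
            }
        }
        return sorted;
    }

    // Uses a hash table for a membership test, where order is irrelevant.
    static boolean contains(int[] items, int target) {
        Set<Integer> seen = new HashSet<>();
        for (int item : items) {
            seen.add(item);
        }
        return seen.contains(target);
    }

    public static void main(String[] args) {
        int[] items = {99, 5, 377, 4, 66, 8};
        System.out.println(treeSort(items));     // prints [4, 5, 8, 66, 99, 377]
        System.out.println(contains(items, 66)); // prints true
    }
}
```

The tree pays for keeping its keys ordered on every insertion, while the hash table gives up ordering in exchange for cheaper lookups; which trade is better depends on what the program needs.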

Java and most modern languages come with a variety of prebuilt data structures and algorithms, so why do we still need to learn this material?
Here are a few reasons:
1. Even to pick between the different class choices, you need to know something about the underlying algorithms.
2. You might run into situations where no existing class does what you want, but because you know the general techniques, you can put together a good solution.
3. Given a solution to a computational problem, you might find it is too slow. We are going to learn the tools to understand and analyze what makes something slow, so we can speed it up.

Toward the end of this semester, we will look at some problems for which no efficient algorithm is known.
One class of such problems is the NP-complete problems. These may have efficient algorithms (or, more probably, may not). Computer scientists do not currently know how to solve these problems efficiently.
On the other hand, these kinds of problems come up fairly often, and people have developed a lot of techniques for special cases, approximation techniques, etc., which might make the particular instances you are interested in tractable.
Next day we will look at algorithms as a technology and start looking at how to analyze algorithms.