Efficiency




CS254

Chris Pollett

Aug. 31, 2011

Outline

Efficiency

We begin today by formalizing the notion of running time. As every nontrivial computational task requires at least reading the input, we count the number of basic steps of a computation as a function of the input length.

Definition. Let `f:{0,1}^star -> {0,1}^star` and `T: NN -> NN` be functions, and let `M` be a TM. We say that `M` computes `f` if for every `x in {0, 1}^star`, whenever `M` is started in its start configuration on input `x`, it halts with `f(x)` written on its output tape. We say that `M` computes `f` in time `T(n)` if its computation on every input `x` requires at most `T(|x|)` steps.

Example. The TM we gave last time for palindrome recognition runs in time `3n`.
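As a sanity check on that count, here is a toy Python sketch. It is not a real TM; it just counts the three `n`-step phases the machine performs (copy the input to a work tape, rewind the input head, then compare the two tapes in opposite directions):

```python
# Toy step-count sketch of the 3n palindrome TM (assumed 3-phase machine;
# each phase costs one step per input cell).
def palindrome_steps(x: str):
    steps = 0
    work = []
    # Phase 1: copy the input onto the work tape, left to right.
    for ch in x:
        work.append(ch)
        steps += 1
    # Phase 2: rewind the input head back to the start.
    for _ in x:
        steps += 1
    # Phase 3: compare input left-to-right against work tape right-to-left.
    ok = True
    for i, ch in enumerate(x):
        steps += 1
        if ch != work[len(x) - 1 - i]:
            ok = False
            break
    return ok, steps
```

On a length-`n` palindrome all three phases run to completion, for `3n` counted steps; on a non-palindrome the machine may halt early in phase 3.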

What should be the legal functions `T` to consider when measuring run time? Weirdly oscillating functions might be difficult to reason about. The next definition tries to capture the notion of a reasonable run time bound...

Definition. A function `T:NN -> NN` is time constructible if `T(n) ge n` and there is a TM `M` that computes the function `x mapsto T(|x|)` (with `T(|x|)` written in binary) in time `T(n)`.

In this definition, we are assuming that we can expand the alphabet of our TMs and use this to speed up computations by up to a linear factor.

You should verify that functions like `n log n`, `n^2`, and `2^n` are all time constructible.

Robustness of our definition

Alphabet doesn't Matter

Claim. For every `f:{0, 1}^star -> {0, 1}` and time constructible `T: NN -> NN`, if `f` is computable in time `T(n)` by a TM `M` using alphabet `Gamma`, then it is computable in time `4 (log |Gamma|) T(n)` by a machine `bar(M)` using alphabet `{0, 1, square, Delta}`.

Proof. Suppose `M` has `k` tapes and uses states `Q`. We will define a machine `bar(M)` which also uses `k` tapes but a new set of states `Q'`. The idea is to encode the symbols of `Gamma` as strings over `{0, 1}` of length `log |Gamma|`. Each tape cell of the original machine will then be represented by `log |Gamma|` cells of `bar(M)`.

To simulate one step of `M`, `bar(M)` will (1) use `log |Gamma|` steps to read from each tape the `log |Gamma|` bits encoding a symbol of `Gamma`, (2) use its state register to store the symbols read, (3) use `M`'s transition function to compute the symbols `M` writes and `M`'s new state, (4) store this information in its state register, (5) use `log |Gamma|` steps to write the encodings of these symbols on its tapes, and (6) use up to `log |Gamma|` steps to move each of the simulating tape heads appropriately.

`bar(M)` needs a register that can store `M`'s state, the `k` symbols from `Gamma`, and a counter from `1` to `log |Gamma|` for reading symbols. So to build `bar(M)` we will need `O(|Q| |Gamma|^{k+1})` states. The runtime bound follows from the time to simulate one step: read the symbol on each tape, write a symbol on each tape, rewind to the start of the symbol, and, if necessary, move to the next symbol on each tape. Each of these takes at most `log |Gamma|` steps, giving at most `4 log |Gamma|` steps of `bar(M)` per step of `M`.
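The encoding underlying this proof can be illustrated with a small Python sketch. The alphabet `gamma` below is a made-up example; a real `bar(M)` would hard-wire this table into its transition function:

```python
import math

# Sketch of the alphabet-reduction encoding: each symbol of Gamma is
# represented by a fixed-width binary string of length ceil(log2 |Gamma|).
def make_codec(gamma):
    width = max(1, math.ceil(math.log2(len(gamma))))
    enc = {a: format(i, f'0{width}b') for i, a in enumerate(gamma)}
    dec = {code: a for a, code in enc.items()}
    return enc, dec, width

def encode_tape(tape, enc):
    # One cell of M becomes `width` cells of bar(M).
    return ''.join(enc[a] for a in tape)

gamma = ['0', '1', '#', '_']          # example alphabet, |Gamma| = 4
enc, dec, width = make_codec(gamma)
encoded = encode_tape(['1', '#', '0'], enc)
# Reading one simulated cell costs `width` = log |Gamma| bit-reads:
first_symbol = dec[encoded[:width]]
```

Note the head of `bar(M)` always sits on a `width`-aligned boundary, which is what the counter in the state register keeps track of.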

Number of Tapes doesn't Matter

Claim. Define a single-tape machine to be a TM that has only one read-write tape, which serves as input, work, and output tape. For every `f:{0, 1}^star -> {0, 1}` and time constructible `T: NN -> NN`, if `f` is computable in time `T(n)` by a TM `M` using `k` tapes, then it is computable in time `5k(T(n))^2` by a single-tape TM `bar(M)`.

Proof. We encode the `k` tapes on a single tape by using locations `1`, `k+1`, `2k+1`, ... of the single tape to encode the squares of the first tape, locations `2`, `k+2`, `2k+2`, ... to encode the squares of the second tape, and so on. For every symbol `a` in `M`'s alphabet, `bar(M)`'s alphabet will have both `a` and a marked version `bar(a)`; the marked version is used to record the position of the corresponding tape head. `bar(M)` does not touch the `n` input squares, but instead takes `O(n^2)` steps to copy the input bit by bit into the rest of the tape, using the encoding scheme above.

To simulate one step of `M`, `bar(M)` makes two sweeps of its work tape. First, it sweeps the tape in the left-to-right direction and records in its state register the `k` symbols marked with a bar. Then it uses `M`'s transition function to determine the new state, symbols, and head movements. Finally, it sweeps the tape back in the right-to-left direction to update the encoding accordingly.

After carrying out its simulation, `bar(M)` produces as output the encoding, in the above scheme, of the output of `M`. Since `M` never reads more than `T(n)` squares on any tape, `bar(M)` never needs to read more than `2n + kT(n) le (k+2) cdot T(n) le 2k cdot T(n)` locations. So each of the at most `T(n)` simulated steps costs `bar(M)` at most `5 cdot k cdot T(n)` steps of its own, and the run-time bound follows.

Oblivion doesn't Matter

Remark. We can modify the proof of the claim on the last slide so that `bar(M)`'s head movements do not depend on the input, but only on the input length. That is, for every input `x in {0,1}^star` and `i in NN`, the location of each of `bar(M)`'s heads at the `i`th step of its execution on input `x` is a function only of `|x|` and `i`. A machine with this property is called oblivious.

Doubly-Infinite Doesn't Matter

Claim. Define a bi-directional TM to be a TM whose tapes are infinite in both directions. For every `f:{0, 1}^star -> {0, 1}` and time constructible `T: NN -> NN`, if `f` is computable in time `T(n)` by a bi-directional TM `M`, then it is computable in time `4T(n)` by a standard (unidirectional) TM `bar(M)`.

Proof. The idea is to use the `i`th square of a one-way tape to represent both squares `-i` and `i` of a two-way tape. We can do this by making the new machine's alphabet `Gamma cup (Gamma times Gamma)`, where `Gamma` was the original machine's alphabet. `bar(M)` uses `k+1` tapes where `M` had `k` tapes. First, `bar(M)` copies its input into the second coordinates of the squares of its second tape. We then treat the last `k` tapes of `bar(M)` as simulating the tapes of `M`. To simulate a step of `M`, `bar(M)` performs on each tape the corresponding operation of `M`, applied to whichever coordinate it is currently reading. If `M` moves left off the `0`th position of a tape, `bar(M)` switches to using the left coordinate for that tape, and from then on left moves of `M` map to right moves of `bar(M)` and vice-versa. Similarly, if `M` later moves right back past the `0`th position, `bar(M)` switches back to using the right coordinate for that tape.
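The folding of a two-way tape onto a one-way tape can be sketched as an index map (hypothetical helper; `'L'` and `'R'` name the left and right coordinates of the paired alphabet):

```python
# Square i of the one-way tape holds the pair (square -i, square i) of
# the two-way tape; square 0 is kept on the right coordinate.
def fold_index(i: int):
    """Map a two-way tape position to (one-way position, coordinate)."""
    return (abs(i), 'L' if i < 0 else 'R')

# A left move of M on the negative side becomes a right move of bar(M):
before = fold_index(-2)   # (2, 'L')
after = fold_index(-3)    # (3, 'L')
```

The direction flip on the negative side is exactly the left/right swap described in the proof: decreasing `i` below zero increases `abs(i)`.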