Outline
- Our Distributed Model
- In-Class Exercise
- Choice Coordination Problem
Introduction
- In the PRAM model all of the processors shared the same clock, so the processors' computation steps were in sync.
- We will now consider the situation where we have `n` processors, each with its own clock. So a step on one processor might be longer or shorter than a step on another processor.
- There is a global memory consisting of `m` registers.
- As several processors might attempt to simultaneously read/modify a register, we will assume that before a processor accesses a global register it must first acquire a lock on it.
- While a processor holds the lock on a register, no other processor can access that register; any processor that tries must wait.
- When the lock is released, all waiting processors are notified and can contend for the lock again.
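This lock discipline can be sketched in Python with one lock per global register (the class and method names here are illustrative, not from the source):

```python
import threading

class Registers:
    """m global registers, each guarded by its own lock.

    threading.Lock blocks contending acquirers and lets them race for the
    lock again on release, matching the model described above.
    """
    def __init__(self, m):
        self.values = [0] * m
        self.locks = [threading.Lock() for _ in range(m)]

    def with_register(self, i, update):
        """Atomically read register i, write update(old) back, return old."""
        with self.locks[i]:        # other processors block here until release
            old = self.values[i]
            self.values[i] = update(old)
            return old

regs = Registers(2)
regs.with_register(0, lambda v: v + 1)   # exclusive read-modify-write
```

The `with` block guarantees the lock is released even if `update` raises, so no processor can starve the others by crashing while holding a register.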
The Choice Coordination Problem
- A memorably disgusting motivating story:
- There are mites which survive by forming colonies in the ears of moths.
- If they infect both ears of the same moth, the moth can't hear bat sonar calls, and it and the mite colonies will be eaten.
- How can the mites agree on only one ear to infect?
- We will be interested in a computer science variation on this problem called the Choice Coordination Problem.
- We will have `n` processors in our distributed setting.
- We want them to agree on a value between `1` and `m`.
- We will know agreement has been reached at the point when exactly one of the registers our processors have access to contains a special symbol #.
- An `Omega(n^(1/3))` time deterministic lower bound is known for this problem in the distributed model.
- We will show there is an expected constant-time randomized algorithm.
Warm-Up Algorithm
- We first consider the simplified case of only two processors which are synchronized.
- Let `P[i]` denote processor `i` and `C[i]` denote the register it initially scans.
- Finally, let `B[i]` be a value which is local to processor `P[i]` only.
Synch-CCP:
Input: Registers C[0] and C[1] initialized to 0.
Output: Exactly one of the registers has the value #.
0 P[i] is initially scanning the register C[i] and
has its local variable B[i] initialized to 0.
1 Read the current register and obtain a bit R[i].
2 Select one of three cases:
(a) case [R[i] = #]: halt;
(b) case [R[i] = 0, B[i] = 1]: write # into the current register and halt;
(c) case [otherwise]: assign an unbiased random bit to B[i]
and write B[i] into the current register.
3 P[i] exchanges its current register with P[1-i] and returns to step 1.
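The steps above can be simulated in a few lines of Python (the harness is an assumption of this sketch; the case logic follows steps 1–3). Since the two processors always scan different registers, running them one after the other within a round preserves the lockstep semantics:

```python
import random

def synch_ccp(rng=None):
    """Simulate Synch-CCP; returns the final registers and rounds taken."""
    rng = rng or random.Random()
    C = [0, 0]                    # shared registers, initialized to 0
    B = [0, 0]                    # local bits B[i]
    cur = [0, 1]                  # P[i] starts at register C[i]
    halted = [False, False]
    rounds = 0
    while not all(halted):
        rounds += 1
        for i in (0, 1):
            if halted[i]:
                continue
            R = C[cur[i]]                 # step 1: read current register
            if R == '#':                  # case 2(a): agreement reached
                halted[i] = True
            elif R == 0 and B[i] == 1:    # case 2(b): write # and halt
                C[cur[i]] = '#'
                halted[i] = True
            else:                         # case 2(c): fresh random bit
                B[i] = rng.randint(0, 1)
                C[cur[i]] = B[i]
            cur[i] = 1 - cur[i]           # step 3: exchange registers
    return C, rounds
```

Every run ends with exactly one register holding `#`, matching the correctness claim analyzed on the next slides.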
In-Class Exercise
- Use coin flip to simulate the unbiased random bit.
- Carry out a step by step simulation of two processors using the above algorithm.
- Post your solutions to the Feb 28 In-Class Exercise Thread.
Analysis of Synch-CCP
- Let's look at the correctness of Synch-CCP:
- First notice that at most one register can ever have # written into it. Why?
If both registers get the value `#`, then by case 2(a) both processors must have written `#` in the same iteration (otherwise the later writer would have read `#` and halted).
Suppose this happens on the `k`th iteration. Let `B[i]_(k)` and `R[i]_(k)` denote the values used by `P[i]`
just after Step 1 of the `k`th iteration, i.e., just before case 2(b) could be applied in the `k`th round.
The previous round must have used case 2(c), so we know
`R[0]_(k) = B[1]_(k)` and `R[1]_(k) = B[0]_(k)`. The only way `P[i]` could write `#` is if
`R[i]_(k) = 0` and `B[i]_(k) = 1`; but then `R[1-i]_(k) = 1` and `B[1-i]_(k) = 0`, so `P[1-i]` can't write `#` in that iteration.
- Notice that during each iteration, the probability that both `B[i]` have the same value is `1/2`.
If the two bits are ever different, then within two stages the algorithm stops. So after `k` steps
the probability that the algorithm has not stopped is `O(1/2^k)`.
- So with probability `1-O(1/2^k)` the algorithm terminates within `k` steps.
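This geometric tail can be checked with a small Monte-Carlo sketch (an illustration under the simplifying assumption that the protocol keeps looping exactly while the two fresh bits agree):

```python
import random

def iterations_until_bits_differ(rng):
    """Count iterations until two fresh unbiased bits first disagree."""
    k = 0
    while True:
        k += 1
        if rng.randint(0, 1) != rng.randint(0, 1):
            return k

rng = random.Random(0)
trials = 100_000
samples = [iterations_until_bits_differ(rng) for _ in range(trials)]

# Geometric with p = 1/2: mean 2, and Pr[more than 3 iterations] = 1/8.
mean = sum(samples) / trials
frac_beyond_3 = sum(s > 3 for s in samples) / trials
```

The empirical mean sits near 2 and the fraction of runs lasting beyond `k` iterations near `1/2^k`, as the analysis predicts.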
The Asynchronous Problem
- We now assume the two processors may be executing at varying speeds
and cannot exchange the registers after each iteration.
- We no longer assume that the two processors begin by scanning different registers.
- We assume that each processor chooses its starting register at random.
- The two processors could conflict at the very first step, so locking needs to be used.
- To do coordination we want to use timestamps.
- We will assume a read of `C[i]` will yield a pair `( t[i], R[i] )`, where `t[i]` is the timestamp and `R[i]` is the value stored in the register.
Asynchronous-CCP
Input: Registers C[0] and C[1] initialized to (0,0).
Output: Exactly one of the two registers has value #.
0. P[i] is initially scanning a randomly chosen register.
Thereafter, it changes its current register at the end of each iteration.
The local variables T[i] and B[i] are initialized to 0.
1. P[i] obtains a lock on the current register and reads (t[i],R[i]).
2. P[i] selects one of five cases:
(a) case [R[i] = #]: halt;
(b) case [T[i] < t[i]]: set T[i] = t[i] and B[i] = R[i]
(c) case [T[i] > t[i]]: write # into the current register and halt;
(d) case [T[i] = t[i], R[i] = 0, B[i] = 1]:
write # into the current register and halt
(e) case [otherwise]: Set T[i] = T[i] + 1 and t[i] = t[i] + 1,
assign a random bit-value to B[i] ,
and write (t[i] , B[i]) to the current register.
3. P[i] releases the lock on its current register,
moves to the other register, and returns to Step 1.
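The protocol above can be exercised with two Python threads, one lock per register, letting the thread scheduler supply the asynchrony (the harness and names are assumptions of this sketch; the five cases follow the pseudocode directly):

```python
import random
import threading

def asynchronous_ccp(seed=0):
    """Run Asynchronous-CCP as two threads; returns the final registers."""
    regs = [[0, 0], [0, 0]]                       # C[i] holds the pair (t, R)
    locks = [threading.Lock(), threading.Lock()]
    starts_rng = random.Random(seed)

    def processor(i, cur):
        T, B = 0, 0                               # local timestamp and bit
        rng = random.Random(2 * seed + i)
        while True:
            with locks[cur]:                      # step 1: lock and read
                t, R = regs[cur]
                if R == '#':                      # case 2(a)
                    return
                if T < t:                         # case 2(b): catch up
                    T, B = t, R
                elif T > t:                       # case 2(c): we are ahead
                    regs[cur][1] = '#'
                    return
                elif R == 0 and B == 1:           # case 2(d)
                    regs[cur][1] = '#'
                    return
                else:                             # case 2(e): here T = t
                    T += 1
                    B = rng.randint(0, 1)
                    regs[cur] = [T, B]
            cur = 1 - cur                         # step 3: release and move

    threads = [threading.Thread(target=processor,
                                args=(i, starts_rng.randint(0, 1)))
               for i in (0, 1)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return regs
```

Whatever interleaving the scheduler produces, exactly one register ends up holding `#`, matching the theorem below.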
Analysis of Asynchronous-CCP
Theorem. For any `c gt 0`, the total number of operations executed by Asynchronous-CCP
exceeds `c` with probability at most `2^(-Omega(c))`.
Proof. The main difference between this and the synchronous case is in
Step 2 (b) and 2 (c):
- Case 2(b) handles the situation where the processor is
playing catch-up with the other processor.
- Case 2(c) handles the situation where the processor is ahead of the other processor.
To prove correctness of the protocol, we consider
the two cases (2(c), 2(d)) where a processor can write a # to its current cell.
At the end of an iteration, a processor's timestamp `T[i]` will equal the timestamp `t[i]` of its current
register. Furthermore, `#` cannot be written in the first iteration by either processor.
(Proof cont'd next slide).
Proof cont'd
Suppose `P[i]` has just entered case 2(c), with some timestamp `T^star[i]`,
and its current cell is `C[i]` with timestamp `t^star[i] lt T^star[i]`. The only possible problem is
that `P[1-i]` might write `#` into register `C[1-i]` . Suppose this error occurs, and
let `t^star[1-i]` and `T^star[1-i]` be the timestamp during the iteration for the
other processor.
As `P[i]` comes to `C[i]` with a timestamp of `T^star[i]`, it must have left `C[1-i]`
with timestamp `T^star[i]` before `P[1-i]` could have written `#` into it. Since timestamps never
decrease, `t^star[1-i] ge T^star[i]`. Further, `P[1-i]` cannot have its timestamp
`T^star[1-i]` exceed `t^star[i]` since it must go to `C[1-i]` from `C[i]` and
the timestamp of that register never exceeds `t^star[i]`. So we have
`T^star[1-i] le t^star[i] lt T^star[i] le t^star[1-i]`.
This means `P[1-i]` must enter case 2(b) as `T^star[1-i] lt t^star[1-i]`. This contradicts it
being able to write a #.
We can analyze the case where `P[i]` has just entered case 2(d) in a similar way, except we
would reach the conclusion that `T^star[1-i] le t^star[i] = T^star[i] le t^star[1-i]` and so
it is possible that `T^star[1-i] = t^star[1-i]`. But if this event happens, we are in
the synchronous situation so our earlier correctness argument works. Thus, we have
established correctness of the algorithm.
Runtime of Asynchronous-CCP
- We now just need to analyze the runtime of our above algorithm.
- The cost is proportional to the largest timestamp.
- Only case 2(e) increases the timestamp of a register,
and this happens only if case 2(d) does not apply.
- Furthermore, the processor that raises the timestamp must have
chosen its current `B[i]` value during a visit to the other register,
so the analysis of the synchronous case applies.