Introduction

On Monday, we started talking about distributed algorithms -- algorithms run on multiple machines/processors that share a global memory, but don't share a clock and so might execute instructions of their programs at different rates.
We then started looking at the choice coordination problem, where the different machines need to agree on a value in the range 0,...,m-1.
We first looked at the subcase where two synchronized processors had to agree on a register to write a # symbol into, with the aim of later relaxing the synchronization requirement.
We said the processors each use three accumulators cur[i], B[i], R[i], and we denoted the two global registers C[0], C[1]. All of these accumulators and registers except cur[i] are initialized to 0. cur[0] is initialized to 0 and cur[1] is initialized to 1.
The algorithm proceeded by having each processor read C[cur[i]] into R[i] and halt if its value was #. The processor then checks whether B[i] = 1 (representing its own last coin flip) and R[i] = 0 (representing the other processor's last coin flip). If so, it writes # into C[cur[i]] and halts. Otherwise, it flips a coin, writes the result to B[i] and C[cur[i]], and then sets cur[i] = (cur[i] + 1) % 2.
We showed that with probability `1-O(1/(2^k))` the algorithm terminates within `k` steps (it stops in any round in which the two coin flips differ, which happens half the time).
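The synchronous protocol recalled above can be simulated in a few lines. The following is only a minimal sketch: the function name `synchronous_ccp`, the `max_rounds` cap, and the use of the string "#" for the special symbol are assumptions made for illustration, not part of the lecture.

```python
import random

def synchronous_ccp(rng, max_rounds=1000):
    """Simulate the synchronous two-processor protocol; return (C, rounds)."""
    C = [0, 0]                  # shared registers C[0], C[1]
    cur = [0, 1]                # the processors start on different registers
    B = [0, 0]                  # each processor's last coin flip
    halted = [False, False]
    for rounds in range(1, max_rounds + 1):
        # Active processors always sit on different registers, so handling
        # them one after the other within a round matches lockstep execution.
        for i in (0, 1):
            if halted[i]:
                continue
            R = C[cur[i]]
            if R == "#":
                halted[i] = True            # the choice has already been made
            elif B[i] == 1 and R == 0:
                C[cur[i]] = "#"             # break the symmetry and halt
                halted[i] = True
            else:
                B[i] = rng.randint(0, 1)    # fresh coin flip
                C[cur[i]] = B[i]
                cur[i] = (cur[i] + 1) % 2   # exchange registers
        if all(halted):
            return C, rounds
    raise RuntimeError("no termination within max_rounds (very unlikely)")
```

Calling `synchronous_ccp(random.Random(0))` returns the final register contents and the number of rounds used; the seed is arbitrary.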
We begin today by looking at the asynchronous version of this problem.

The Asynchronous Problem

We now assume the two processors may be executing at varying speeds and cannot exchange the registers after each iteration.
We no longer assume that the two processors begin by scanning different registers.
We assume that each processor chooses its starting register at random.
The two processors could be in a conflict at the very first step so locking needs to be used.
To do coordination we want to use timestamps.
We will assume a read on `C[i]` will yield a pair `( t[i], R[i] )` where `t[i]` is the timestamp and `R[i]` is the value stored in the register.

Asynchronous-CCP

Input: Registers C[0] and C[1] initialized to (0,0).
Output: Exactly one of the two registers has value #.
0. P[i] is initially scanning a randomly chosen register. 
   Thereafter, it changes its current register at the end of each iteration. 
   The local variables T[i] and B[i] are initialized to 0.
1. P[i] obtains a lock on the current register and reads (t[i],R[i]).
2. P[i] selects one of five cases:
   (a) case [R[i] = #]: halt;
   (b) case [T[i] < t[i]]: set T[i] = t[i] and B[i] = R[i]
   (c) case [T[i] > t[i]]: write # into the current register and halt;
   (d) case [T[i] = t[i], R[i] = 0, B[i] = 1]: 
       write # into the current register and halt
   (e) case [otherwise]: Set T[i] = T[i] + 1 and t[i] = t[i] + 1, 
       assign a random bit-value to B[i] , 
       and write (t[i] , B[i]) to the current register.
3. P[i] releases the lock on its current register, 
   moves to the other register, and returns to Step 1.
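
The steps of Asynchronous-CCP can be sketched as a small simulation. This is an illustration under stated assumptions rather than a definitive implementation: asynchrony is modeled by a random scheduler that picks which processor runs next, the lock is modeled by letting each iteration (Steps 1 to 3) run atomically, and the names `asynchronous_ccp` and `max_steps` are hypothetical.

```python
import random

def asynchronous_ccp(rng, max_steps=10_000):
    """Simulate Asynchronous-CCP; return (C, number of iterations run)."""
    C = [(0, 0), (0, 0)]                         # registers hold (timestamp, value)
    cur = [rng.randint(0, 1) for _ in range(2)]  # Step 0: random starting register
    T = [0, 0]                                   # local timestamps
    B = [0, 0]                                   # local bits
    halted = [False, False]
    for step in range(max_steps):
        if all(halted):
            return C, step
        # Assumed scheduler: choose uniformly among the active processors.
        i = rng.choice([p for p in (0, 1) if not halted[p]])
        t, R = C[cur[i]]                         # Step 1: lock and read
        if R == "#":                             # case (a)
            halted[i] = True
        elif T[i] < t:                           # case (b): catch up
            T[i], B[i] = t, R
        elif T[i] > t:                           # case (c): ahead of the other
            C[cur[i]] = (t, "#")
            halted[i] = True
        elif R == 0 and B[i] == 1:               # case (d): T[i] == t here
            C[cur[i]] = (t, "#")
            halted[i] = True
        else:                                    # case (e)
            T[i] += 1
            B[i] = rng.randint(0, 1)
            C[cur[i]] = (T[i], B[i])
        if not halted[i]:
            cur[i] = 1 - cur[i]                  # Step 3: move to the other register
    raise RuntimeError("no termination within max_steps (very unlikely)")
```

Running it with different seeds, for example `asynchronous_ccp(random.Random(1))`, gives different interleavings, which can be compared against a hand simulation of the protocol.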

In-Class Exercise

In teams of two, where each person pretends to be a separate processor, simulate this algorithm step by step. Have a common area on the desk where the values of C[i] are written.
Each person should work at their own pace and write down for each round the values of their accumulators and registers as well as how they change.
Please post your solution to the Mar 9 In-Class Exercise Thread.

Analysis of Asynchronous-CCP

Theorem. For any `c gt 0`, the probability that Asynchronous-CCP executes more than `c` operations in total is at most `2^(-Omega(c))`.

Proof. The main difference between this and the synchronous case is in Step 2 (b) and 2 (c):

Case 2(b) handles the situation where the processor is playing catch-up with the other processor.
Case 2(c) handles the situation where the processor is ahead of the other processor.

To prove correctness of the protocol, we consider the two cases (2(c), 2(d)) where a processor can write a # to its current cell. At the end of an iteration, a processor's timestamp `T[i]` will equal the timestamp of the current register `t[i]`. Further # cannot be written in the first iteration by either processor. (Proof cont'd next slide).

Proof cont'd

Suppose `P[i]` has just entered case 2(c), with some timestamp `T^star[i]`, and its current cell is `C[i]` with timestamp `t^star[i] lt T^star[i]`. The only possible problem is that `P[1-i]` might write `#` into register `C[1-i]`. Suppose this error occurs, and let `t^star[1-i]` and `T^star[1-i]` be the timestamps of that register and of the other processor during that iteration.

As `P[i]` comes to `C[i]` with a timestamp of `T^star[i]`, it must have left `C[1-i]` with that same timestamp before `P[1-i]` could write `#` into it. Since timestamps don't decrease, `t^star[1-i] ge T^star[i]`. Further, `P[1-i]` cannot have its timestamp `T^star[1-i]` exceed `t^star[i]`, since it must go to `C[1-i]` from `C[i]` and the timestamp of that register never exceeds `t^star[i]`. So we have `T^star[1-i] le t^star[i] lt T^star[i] le t^star[1-i]`. This means `P[1-i]` must enter case 2(b), as `T^star[1-i] lt t^star[1-i]`. This contradicts the assumption that it writes a #.

We can analyze the case where P[i] has just entered case 2(d) in a similar way, except we would reach the conclusion that `T^star[1-i] le t^star[i] = T^star[i] le t^star[1-i]` and so it is possible that `T^star[1-i] = t^star[1-i]`. But if this event happens, we are in the synchronous situation so our earlier correctness argument works. Thus, we have established correctness of the algorithm.

Runtime of Asynchronous-CCP

We now just need to analyze the runtime of our above algorithm.
The cost is proportional to the largest timestamp.
Only case 2(e) increases the timestamp of a register, and this only happens if case 2(d) does not apply.
Furthermore, the processor that raises the timestamp must have its current `B[i]` value chosen during a visit to the other register. So the analysis of the synchronous case applies.

Byzantine Agreement Problem

The problem was introduced by Pease, Shostak, and Lamport in 1980.
It is similar to the Choice Coordination Problem.
The original name of the problem was the Byzantine Generals Problem, in which `n` Byzantine generals have to agree on a battle plan.
In our set-up, we want to agree on one of two possible values (say, heads or tails, `{0,1}`).
We have `n` processors, `t` of which may be faulty.
We require that the decision reached by a protocol satisfy:
1. All good processors finish with the same decision.
2. If all the good processors begin with the same value `v`, then they should all finish with the same value `v`.

Some More Remarks Before We Begin

Our solution to the choice coordination problem was only for two processors choosing between two values.
Neither of these processors was faulty.
The algorithm we present today works for `n` processors choosing between two values, so is already more general, ignoring the allowance for faulty processors.
The original choice coordination problem was for `n` processors to choose among `m` choices.
Notice that by repeating the Byzantine procedure `log_2 m` times, we can have our processors agree on the first bit of a number between `1` and `m`, then on the second bit, and so on.
If each such single bit agreement can be done in constant time as a function of `n`, then agreeing on a number between 1 and `m` can be done in `O(log m)` time. Notice this does not depend on the number of processors.
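The bit-by-bit reduction can be sketched as follows. This is a hedged illustration only: `agree_on_value` is a hypothetical helper, values are taken from the range 0,...,m-1, and `majority_bit` is a stand-in for a real one-bit agreement protocol (it simply takes the majority of the proposed bits).

```python
def majority_bit(bits):
    """Hypothetical stand-in for one-bit agreement: majority bit, ties go to 0."""
    return 1 if 2 * sum(bits) > len(bits) else 0

def agree_on_value(proposals, m, agree_bit=majority_bit):
    """Agree on a value in 0..m-1 using one single-bit agreement per bit."""
    num_bits = max(1, (m - 1).bit_length())   # about log_2 m bits
    result = 0
    for k in reversed(range(num_bits)):       # most significant bit first
        bits = [(p >> k) & 1 for p in proposals]
        result = (result << 1) | agree_bit(bits)
    return result
```

When all proposals are equal, the agreed value is that proposal. When they differ, the bits are agreed on independently, so in this simple sketch the result need not equal any single proposal.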

Randomized Algorithm for Byzantine Agreement

In the Byzantine Agreement Problem, we have `n` asynchronous processors (generals) which may communicate in rounds their votes on whether to carry out some action (attack/not-attack).
At the end of each round, each processor sends its vote to each other processor.
Some number `t` of processors may be faulty.
We want a protocol such that after some number of rounds all the good processors agree on an action. Further, if at the start of the procedure all the good processors were thinking the same thing, then at the end of our protocol they wouldn't have changed their decision.
We will assume that at the start of each round a trusted third party flips a fair coin.
All of the processors have access to this coin.
We will show having access to such a coin will allow us to solve the problem in an expected constant number of rounds.
We will assume that the number of faulty processors is a number `t < n/8`.
Each round a good processor sends the same vote to all the other processors.
A faulty processor may send arbitrary or even inconsistent votes to each other processor.
Let `L=((5n)/8)+1` , `H=((3n)/4)+1` , and `G=(7n)/8`.

What the ith Processor does (if it is good).

Input: A value for b[i], our current decision choice. 
Output: A decision d[i].
1. vote = b[i].
2. For each round, do
3.    Broadcast vote;
4.    Receive votes from all the other processors.
5.    Set maj = majority (0 or 1) value among the votes cast
6.    Set tally = the number of votes that maj received.
7.    If coin = heads then set threshold = L; else set threshold = H
8.    If tally >= threshold then set vote = maj; else vote = 0
9.    If tally >= G then set d[i] = maj permanently.
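
The per-processor steps above can be put together into a small simulation. This is a sketch under several assumptions that are not in the lecture: the name `byzantine_agreement` is hypothetical, `n` is taken to be a multiple of 8 so the thresholds are integers, each processor counts its own vote in the tally, and the faulty processors are modeled as sending an independent random bit to every recipient (a weak adversary, not a worst-case one).

```python
import random

def byzantine_agreement(n, initial_votes, faulty, rng, max_rounds=200):
    """Return ({good processor: decision}, rounds until all good ones decided)."""
    L, H, G = 5 * n / 8 + 1, 3 * n / 4 + 1, 7 * n / 8
    good = [i for i in range(n) if i not in faulty]
    vote = {i: initial_votes[i] for i in good}
    decision = {}
    for rounds in range(1, max_rounds + 1):
        # The shared fair coin picks the threshold for this round (Step 7).
        threshold = L if rng.random() < 0.5 else H
        good_ones = sum(vote[j] for j in good)
        new_vote = {}
        for i in good:
            # Assumed adversary: each faulty processor sends i a random bit.
            ones = good_ones + sum(rng.randint(0, 1) for _ in faulty)
            zeros = n - ones
            maj = 1 if ones > zeros else 0           # Step 5
            tally = max(ones, zeros)                 # Step 6
            new_vote[i] = maj if tally >= threshold else 0   # Step 8
            if tally >= G and i not in decision:     # Step 9
                decision[i] = maj
        vote = new_vote
        if len(decision) == len(good):
            return decision, rounds
    raise RuntimeError("no agreement within max_rounds (very unlikely)")
```

For instance, with `n = 16` the thresholds are `L = 11`, `H = 13`, and `G = 14`, and at most one processor may be faulty since `t < 2`.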

Analysis

First, if all good processors begin the round with the same vote, then Step 9 will apply and so this value will be the value eventually settled upon.
Suppose the processors begin the round with different values for the vote.
If two good processors compute different values for maj in Step 5, then, since fewer than 1/8 of the processors are faulty, the tally does not exceed the threshold regardless of whether L or H was chosen as threshold. So all good processors would set their votes to 0 and an agreement would be reached.
We say the faulty processors foil a threshold `x in {L,H}` in a round if, by sending different messages to the good processors, they cause tally to exceed `x` for at least one good processor, and to be no more than `x` for at least one good processor.
Since the difference between the two possible thresholds is at least `t`, the faulty processors can foil at most one threshold in a round.
Since the threshold is chosen with equal probability from `{L,H}`, it is foiled with probability at most `1/2`.
Thus, the expected number of rounds before we have an unfoiled threshold is at most `2`. If the threshold is not foiled then all good processors compute the same vote in Step 8.
Since more than 7/8 of the processors are good, this ensures that Step 9 applies by the next round at the latest.

Asynchronous CCP, Byzantine Agreement
