Wednesday 2013-12-18

Several Google engineers wrote up their experience implementing Paxos for Chubby, their distributed lock service. It's fun to read until you hit the following:

The first version of Chubby was based on a commercial, third-party, fault-tolerant database; we will refer to this database as “3DB” for the rest of this paper. This database had a history of bugs related to replication. In fact, as far as we know, the replication mechanism was not based on a proven replication algorithm and we do not know if it is correct. Given the history of problems associated with that product and the importance of Chubby, we eventually decided to replace 3DB with our own solution based on the Paxos algorithm.
And then you wonder just how much pain it took before "eventually" arrived.