In the world of blockchains, consensus protocols are widely discussed. However, it’s often hard to distinguish between competing protocols and evaluate them, especially because it is not always clear how a particular protocol solves a consensus problem, or how that particular problem relates to the practical concerns of blockchains (e.g., when is my transaction confirmed? what fork am I on?). In this post, I will outline the problem that consensus protocols are trying to solve, detail what a valid consensus protocol should achieve, and hopefully this will help convey which approaches to this problem are likely to bear fruit.
Roughly speaking, consensus protocols aim to achieve agreement between mutually distrusted parties. In networks like the Bitcoin network, nodes are attempting to reach agreement on what to include in the next block. In this network, blocks tell us what transactions have been broadcasted, and the order of those transactions. Other consensus protocols are usually similar — they tell clients the order in which operations should be applied.
As a user of a consensus protocol, you might initially think “Well, I’m not so concerned about the order and timing of operations, so long as I can guarantee those operations are correct.” Indeed, correctness is an important property. However, this property is completely linked to the order and timing of operations. To see why, let’s look at some example transactions.
T1: Public Key A transfers coin “Z” to Public Key B
T2: Public Key A transfers coin “Z” to Public Key C
T3: Public Key B transfers coin “Z” to Public Key D
Now, when examined individually, confirming that these transactions are correct is trivial. For transactions 1 and 2, a verifier need only confirm that the transaction was signed by the private key corresponding to A. This is achievable with rather boring cryptography, and has been a solved problem for decades. But examining these transactions individually isn’t interesting, and doesn’t provide a useful measure of correctness. I want to know whether or not I “own” coin Z, in the sense that nobody else can convincingly attempt to transfer that coin without my permission. What do consensus protocols have to say about this?
Well, importantly, in a consensus protocol, clients and nodes in the network reach an agreed upon view of the world, and that view of the world includes an ordering of transactions. Once we can order transactions, we can begin to evaluate whether or not they are valid at a specific point in time. For example, if a consensus protocol establishes that at Time 0, coin “Z” is owned by Public Key A, and that the transactions are ordered as T1, T3, T2 — then clearly T1 and T3 are valid, but T2 is not (because after T1 broadcasts, Z is owned by Public Key B, and no longer Public Key A). However, if the protocol establishes that the transactions are ordered T2, T1, T3, then only T2 is valid. Clearly, order and timing guarantees matter. Fortunately, this is exactly what consensus protocols attempt to provide.
Consensus protocols provide two guarantees with respect to order and timing:
- Safety: As long as the protocol does not have more than some threshold of faulty participants, other participants cannot convince a client to accept an incorrect or invalid message.
- Liveness: As long as the protocol does not have more than some threshold of faulty participants, other participants cannot indefinitely delay the acceptance of a correct message.
What do these guarantees usually mean in the context of blockchains? As we saw before, incorrect messages are nearly trivial to validate as such on the blockchain. Does that mean safety is a trivial property to achieve? Why is liveness important when it comes to timing of transactions, and the order of transactions? What does any of this have to do with forks?
To map this into the context of blockchains, we have to first understand what, exactly, blockchains are trying to achieve consensus on. The answer, it turns out, isn’t messages, or even transactions: it’s histories. A blockchain client is continuously trying to figure out which history of transactions is the “true” history. From the previous example of transactions, a client would need to conclude which of these two possible histories is the correct one:
History 1: Transaction 1, Transaction 3.
History 2: Transaction 2.
Nakamoto consensus uses Proof-of-Work mining and a “longest chain wins” metric to decide on which history is the correct history. And this history should match the expected history if all nodes in the network were honest.
With this understanding, we can now frame our two consensus guarantees as they relate to blockchains:
- Safety: As long as the protocol does not have more than some threshold of faulty participants, other participants cannot convince a client to accept a history which is the wrong history,
- Liveness: As long as the protocol does not have more than some threshold of faulty participants, other participants cannot prevent clients from accepting some history as the correct history.
Interestingly, when it comes to blockchains (though this is true of most consensus protocols), our usual understanding of properties like correctness, validity, ordering, and timing require guarantees of both safety and liveness simultaneously. This is just the nature of blockchains. If I want to be confident that I cannot be attacked via a “double-spend”, I need to know, first, that the transaction will be accepted in a timely fashion, and second, that once accepted, the transaction is very unlikely to be in an orphaned block.
This is why safety and liveness are inseparable properties of fault-tolerant protocols. Fault tolerant protocols use techniques to guarantee liveness which affect safety and vice-versa. For example, in Practical Byzantine Fault Tolerance (one of the most influential fault-tolerant consensus protocols– I recommend reading the ’99 paper from MIT), the two-phase commit mechanism used by replicas is what ensures that the replicas make progress, but it is also how new messages are accepted into the protocol history. If this mechanism were faulty, it would affect both safety and liveness. These properties cannot be effectively treated or proven in the absence of the other— a protocol which assumes liveness can be trivially safe. A protocol which waves away concerns about safety can always make progress.
If blockchains are going to revolutionize the way that people interact on the internet, and I believe they will, then the proper functioning of those blockchains is massively important. When evaluating the technologies and plans for various platforms, it’s essential to remember the fundamental problems that a blockchain is trying to solve. At a basic level, that problem is an old one: achieving consensus while tolerating some number of faulty or malicious actors. In both theory and practice, this means providing liveness and safety guarantees. However, these guarantees are inseparable, because the mechanisms for providing one always affect the other.
Now, when evaluating projects that are trying to make progress on new mechanisms for achieving consensus in support of open blockchains, projects that sufficiently treat these twin concerns are the most likely to bear fruit. Some projects like Stellar use consensus protocols in a federated model (non-open), but provide sound fault-tolerance guarantees. Algorand has also made significant progress towards building a fault-tolerant protocol, though the question of dealing with open-membership sets is still open. Our own blockchain design work provides fault-tolerance guarantees while also enabling open-membership through Proof-of-Burn mining. Progress on that work can be tracked in pull requests on our blockstack-core Github repository.