What is Tendermint Core?…
Cosmos is one of the most promising projects out there. With people like Jae Kwon and Ethan Buchman in their team, it has a lot of potential. At its heart and soul lies Tendermint Core.
Tendermint Core combines the Tendermint consensus algorithm along with a p2p gossip protocol. So, when you put it all together in the software stack, you get the Tendermint Core along with the Cosmos-SDK application layer.
Anyway, so before we go in any further, let’s look at why Tendermint is such a necessity.
Bitcoin and Blockchain
When Satoshi Nakamoto created Bitcoin, he made the first-ever decentralized cryptographic system. The really remarkable part about this discovery was that he was able to solve the Byzantine General’s Problem, which helped a wide area network (WAN) to come to a consensus in a trustless environment. Bitcoin uses a proof-of-work algorithm to take care of its consensus.
Having said that, Bitcoin’s main contribution may very well be the fact that it introduced the whole world to the blockchain technology.
A blockchain is, in the simplest of terms, a time-stamped series of immutable records of data that is managed by a cluster of computers not owned by any single entity. Each of these blocks of data (i.e. block) is secured and bound to the others using cryptographic principles (i.e. chain).
In other words, a blockchain is a deterministic state machine replicated on nodes that do not necessarily trust each other.
By deterministic we mean that if the same specific steps are taken, then it will always lead to the same result.
Eg. 1+2 will always be 3.
So, what does state mean? Let’s look at it wrt Bitcoin and Ethereum.
In Bitcoin, the state is the set of Unspent Transaction Outputs (UTXOs), from which the balance of each address is derived. This state gets modified via transactions, which spend existing outputs and create new ones.
On the other hand, in Ethereum, the application is a virtual machine which runs smart contracts. Each transaction goes through the Ethereum Virtual Machine and modifies the state according to the specific smart contract that is called within it.
If you look at the architecture of the blockchain technology, then you will see three specific layers:
- Networking: The propagation of the transaction/information throughout the nodes
- Consensus: Allows the nodes to come to a decision provided >2/3rd of the nodes are non-malicious
- Application: Responsible for updating the state given a set of transactions, i.e. processing transactions. Given a transaction and a state, the application will return a new state.
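The application layer's contract ("given a transaction and a state, return a new state") can be sketched as a pure function. This is only a minimal illustration in Python: the account-balance state and the `Transfer` transaction type are hypothetical and are not taken from Bitcoin, Ethereum, or Tendermint.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    """Hypothetical transaction: move `amount` from `sender` to `recipient`."""
    sender: str
    recipient: str
    amount: int


def apply_tx(state: dict, tx: Transfer) -> dict:
    """Return a new state; the input state is never mutated."""
    if tx.amount <= 0 or state.get(tx.sender, 0) < tx.amount:
        return state  # invalid transaction: the state is left unchanged
    new_state = dict(state)
    new_state[tx.sender] -= tx.amount
    new_state[tx.recipient] = new_state.get(tx.recipient, 0) + tx.amount
    return new_state


def apply_block(state: dict, txs: list) -> dict:
    """Apply transactions in order; the same inputs always give the same output."""
    for tx in txs:
        state = apply_tx(state, tx)
    return state


genesis = {"alice": 10, "bob": 0}
block = [Transfer("alice", "bob", 4), Transfer("bob", "alice", 1)]
final_state = apply_block(genesis, block)
```

Because `apply_block` depends only on its inputs, every node that replays the same block on the same starting state ends up with the same result, which is what "deterministic" means here.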
For visual cues:
Problems With The Current Blockchain Architecture
Turns out that building a blockchain from the ground up with all these 3 layers is really hard work. So, many projects preferred to build by forking the Bitcoin codebase. Now, while this does save a lot of time, the fact is that they are still handcuffed by the limitations of the Bitcoin protocol. Obviously, you can’t execute complex projects when you are using a protocol which is well-known for its throughput issues.
Things got a whole lot better when Ethereum came into play. Ethereum actually gave developers a platform that they could use to create their own customized code, aka smart contracts, and projects. However, Ethereum suffers from the same problem as Bitcoin. They both have a monolithic architecture as opposed to a modular one.
Monolithic Architecture vs Modular Architecture
Monolithic architecture means that everything is composed all in one piece. When software is deemed “monolithic” the components are interconnected and interdependent with each other and the design is more self-contained. In this case, the architecture is more tightly-coupled and the associated components must be all present in order for the code to be executed or compiled.
While this makes the system it was created for more robust, you can’t really derive from it and build custom code on top. It is not the most flexible of systems. Plus, there is another problem with this system. If any component of the program needs to be updated, the whole application will have to be reworked. This is not really the most ideal of situations now, is it?
On the other hand, we have modular architecture. Unlike in a monolithic design, the layers are not tightly linked to each other. So, while it may not be as robust, it is quite easy to update the whole application by working with the different separate modules.
Since the modules are so independent, modular architecture allows you to actually update a particular section without causing unforeseen changes to the rest of the system. Iterative processes are also much more simple in modular programs, as opposed to monolithic.
Tendermint’s Architecture and Goals
Tendermint uses a modular architecture. Its goals are as follows:
- Provide the networking and consensus layers of a blockchain as a platform where different decentralized applications can be built
- Developers only need to worry about the application layer of the blockchain, saving all the hours that they would have wasted working on the consensus and the networking layer as well.
- Tendermint also includes the Tendermint consensus protocol which is the Byzantine fault tolerant consensus algorithm used within the Tendermint Core engine
Let’s look at how Tendermint’s architecture will look:
As you can see, the application is connected to Tendermint Core via a socket protocol called the ABCI or Application Blockchain Interface. Since Tendermint Core and the application running on it run in separate UNIX processes, they need to have a method to speak with each other. ABCI helps these two in their communication.
So, what does ABCI’s design look like? ABCI will have some distinct design components:
#1 Message Protocol
- Pairs of request and response messages
- Requests are made by the consensus engine while the application takes care of the responses
- It is defined using protobuf
#2 Server/Client
- The consensus engine runs the client
- The application runs the server
- There are two proper implementations: async raw bytes and gRPC
#3 Blockchain Protocol
ABCI is very connection oriented. The three connections for Tendermint Core are as follows:
- Mempool connection: This checks if the transactions should be relayed before they get committed. It can only use CheckTx
- Consensus connection: This connection helps in executing transactions that have been committed. Message sequence is, for every block, BeginBlock, [DeliverTx, …], EndBlock, Commit
- Query Connection: Helps in querying the application state. This part only uses Query and Info
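To make the three connections concrete, here is a toy application whose methods are named after the ABCI messages mentioned above. It is a sketch under loud assumptions: it is plain Python rather than a real ABCI server, the `b"key=value"` transaction format is hypothetical, and the hash returned by `commit` is only a stand-in for a real application hash.

```python
import hashlib


class KVStoreApp:
    """Toy key-value application (not a real ABCI server)."""

    def __init__(self):
        self.committed = {}  # state as of the last Commit
        self.pending = {}    # state being built for the current block
        self.height = 0

    # --- Mempool connection: only CheckTx ---
    def check_tx(self, tx: bytes) -> bool:
        """Cheap validity check before a transaction is relayed."""
        return tx.count(b"=") == 1 and not tx.startswith(b"=")

    # --- Consensus connection: BeginBlock, DeliverTx..., EndBlock, Commit ---
    def begin_block(self) -> None:
        self.pending = dict(self.committed)

    def deliver_tx(self, tx: bytes) -> bool:
        if not self.check_tx(tx):
            return False
        key, value = tx.split(b"=")
        self.pending[key] = value
        return True

    def end_block(self) -> None:
        self.height += 1

    def commit(self) -> str:
        self.committed = self.pending
        encoded = repr(sorted(self.committed.items())).encode()
        return hashlib.sha256(encoded).hexdigest()

    # --- Query connection: Info and Query ---
    def info(self) -> int:
        return self.height

    def query(self, key: bytes):
        return self.committed.get(key)


def run_block(app: KVStoreApp, txs: list):
    """Drive the per-block message sequence described in the text."""
    app.begin_block()
    results = [app.deliver_tx(tx) for tx in txs]
    app.end_block()
    return results, app.commit()


app = KVStoreApp()
results, app_hash = run_block(app, [b"name=tendermint", b"malformed"])
```

The split between `pending` and `committed` mirrors the idea that queries should only ever see state from blocks that have been fully committed.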
All in all, the main goal of Tendermint is to provide developers with a tool that is not only practical but also has a high throughput. Here are the properties of Tendermint that make it so alluring:
#1 Public or Private Blockchain Compatible
Different projects have different needs. Some projects need to have an open system where anyone can join in and contribute, like Ethereum. On the other hand, we have industries like healthcare, which can’t expose their data to just about everyone. They require something like a permissioned blockchain.
Ok, so how can Tendermint help in satisfying both these needs? Remember that Tendermint only handles the networking and consensus for the blockchain. So, it helps in:
- Propagating transactions between the nodes via the gossip protocol
- Getting the validators to agree on the set of transactions that gets appended to the blockchain.
What this means is that the application layer is free to be defined any way that the developers want. It is up to the developers to define how the validator set is chosen within the ecosystem.
- The developers can allow the application to have an election system which elects validators based on how many native tokens they have staked within the ecosystem, aka proof-of-stake, and create a public blockchain.
- Alternatively, the developers can create an application that defines a restricted set of pre-approved validators who take care of the consensus and of the new nodes that get to enter the ecosystem. This is called proof-of-authority and is the hallmark of a permissioned or private blockchain.
#2 High Performance
Applications made via Tendermint Core can expect exceptional performance. Tendermint Core has a block time of just 1 second. It can also handle a transaction volume of 10,000 transactions per second for 250-byte transactions, as long as the application allows it to do so.
What is finality?
In simple terms, it means that once a certain action has been executed, it cannot be taken back. So, let’s take the example of a simple financial transaction. Suppose you buy some stocks in a company. You shouldn’t lose ownership of your stocks just because of a glitch in their system. As you can imagine, finality is super critical for a financial system. Imagine doing a million dollar transaction and then the very next day, that transaction isn’t valid anymore because of a glitch.
Bitcoin and Ethereum (until full implementation of Casper FFG) don’t really have settlement finality. In the event of a hardfork or a 51% attack, transactions have a chance of getting reverted.
Tendermint, on the other hand, gives instant finality within 1 second of the transaction completion. Forks are never created in the system, as long as less than 1/3rd of the validators are malicious. As soon as a block is created (which is within a second) the users can rest assured that their transaction is finalized.
Tendermint is secure and forces its participants to be accountable for their actions as well. Like we have said before, Tendermint can never be forked as long as less than 1/3rd of the validators are malicious. If in some case, the blockchain does fork, there is a way to determine liability. Plus, Tendermint consensus is not only fault tolerant, it’s optimally Byzantine fault-tolerant.
Another great thing about Tendermint is its user-friendliness. Like we have mentioned before, they have a modular architecture where the application layer can be suitably customized. This makes it possible for existing blockchain codebases to be effortlessly linked on to Tendermint via ABCI. The perfect example of this is Ethermint, which is basically the Ethereum virtual machine codebase plugged on top of Tendermint.
Ethermint works exactly like Ethereum but also benefits from all the positive features that we have listed above. All the Ethereum tools like Metamask and Truffle are compatible with Ethermint.
Tendermint’s proof-of-stake implementation is a lot more scalable than a traditional proof-of-work consensus algorithm. The main reason is that POW-based systems can’t do sharding safely.
Sharding basically horizontally partitions a database and creates smaller databases, or shards, which are then processed in parallel by the nodes. In a POW system the mining power gets split across the shards as well, so a strong mining pool can easily take over a single shard.
Tendermint will allow the implementation of sharding which will greatly increase the scalability.
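As a rough picture of what horizontal partitioning means, the sketch below assigns records to shards by hashing their keys. It is a generic illustration, not how Tendermint or any particular blockchain shards its state; the `account-N` keys and the choice of 4 shards are made up for the example.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard deterministically using a stable hash."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(records: dict, num_shards: int) -> list:
    """Split one big key-value store into `num_shards` smaller ones."""
    shards = [{} for _ in range(num_shards)]
    for key, value in records.items():
        shards[shard_for(key, num_shards)][key] = value
    return shards


accounts = {f"account-{i}": i * 10 for i in range(100)}
shards = partition(accounts, 4)
```

Every record lands in exactly one shard, so each shard can be processed by a different group of nodes in parallel.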
Tendermint Consensus Protocol
Ok, so let’s look into how the Tendermint consensus protocol works. What exactly is a consensus protocol?
This is how Wikipedia defines consensus decision-making:
“Consensus decision-making is a group decision-making process in which group members develop, and agree to support a decision in the best interest of the whole. Consensus may be defined professionally as an acceptable resolution, one that can be supported, even if not the “favourite” of each individual. “Consensus” is defined by Merriam-Webster as, first, general agreement, and second, group solidarity of belief or sentiment.”
In simpler terms, the consensus is a dynamic way of reaching agreement in a group. While voting just settles for a majority rule without any thought for the feelings and well-being of the minority, a consensus, on the other hand, makes sure that an agreement is reached which could benefit the entire group as a whole.
From a more idealistic point-of-view, Consensus can be used by a group of people scattered around the world to create a more equal and fair society.
A method by which consensus decision-making is achieved is called “consensus mechanism”.
So now that we have defined what a consensus is, let’s look at what the objectives of a consensus mechanism are (data taken from Wikipedia).
- Agreement Seeking: A consensus mechanism should bring about as much agreement from the group as possible.
- Collaborative: All the participants should aim to work together to achieve a result that puts the best interest of the group first.
- Cooperative: All the participants shouldn’t put their own interests first and work as a team more than individuals.
- Participatory: The consensus mechanism should be such that everyone should actively participate in the overall process.
- Inclusive: As many people as possible should be involved in the consensus process. It shouldn’t be like normal voting where people don’t really feel like voting because they believe that their vote won’t have any weight in the long run.
- Egalitarian: A group trying to achieve consensus should be as egalitarian as possible. What this basically means is that each and every vote has equal weight. One person’s vote can’t be more important than another’s.
Now that we have defined what consensus mechanisms are and what they should aim for, we need to think of the other elephant in the room.
Which consensus mechanisms should be used for an entity like blockchain?
Before Bitcoin, there were loads of iterations of peer-to-peer decentralized currency systems which failed because they were unable to answer the biggest problem when it came to reaching a consensus. This problem is called “Byzantine Generals Problem”.
Byzantine General’s Problem
In order to get anything done in a peer-to-peer network, all the nodes should be able to come to a consensus. The thing is though, for this system to work, it lays a lot of emphasis on people to act in the best interest of the overall network. However, as we know already, people aren’t really trustworthy when it comes to acting in an ethical manner. This is where the Byzantine General’s problem comes in.
Imagine this situation.
There is an army surrounding a well-fortified castle. The only way that they can win is if they attack the castle together as a unit. However, they are facing a big problem. The divisions of the army are far apart from each other, so the generals can’t directly communicate and coordinate the attack. On top of that, some of the generals are corrupt.
The only thing that they can do is to send a messenger from general to general. However, a lot of things could happen to the messenger. The corrupt generals can intercept the messenger and change the message. So, what can the generals do to make sure that they launch a coordinated attack without relying on the ethics of each individual general? How can they come to a consensus in a trustless way to do what needs to be done?
That’s the Byzantine General’s Problem and Satoshi Nakamoto solved this problem by using the Proof-of-Work (POW) consensus mechanism.
What is Proof-of-Work?
Let’s check how POW works in the context of our example given above. Suppose a general wants to communicate with another general. How do you think it will go down?
- A “nonce” is added to the original message. The nonce is a random hexadecimal value.
- This new message is then hashed. Suppose the generals agree beforehand that they will only send messages which, when hashed, begin with 4 “0”s.
- If the hash does not give the desired number of 0s, the nonce is changed and the message is hashed again. This process keeps repeating until the desired hash is found.
- The entire process is extremely time-consuming and takes up a lot of computational power.
- Now when they finally get the hashed value, the messenger is given the original message and the nonce and told to communicate with the other generals. So what does happen if someone does try to intercept the message? Well, remember the avalanche effect of the hash functions? The message will change drastically and since it won’t start with the required number of “0”s anymore, people will realize that the message has been tampered with.
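The steps above can be sketched in a few lines of Python. Some assumptions: SHA-256 stands in for the hash function, the nonce is tried in increasing order rather than picked at random, and the message `"attack at dawn"` is made up for the example.

```python
import hashlib


def digest(message: str, nonce: int) -> str:
    """Hash the message with the nonce (as hexadecimal) appended."""
    return hashlib.sha256(f"{message}{nonce:x}".encode()).hexdigest()


def find_nonce(message: str, zeros: int = 4) -> int:
    """Expensive part: try nonces until the hash starts with enough 0s."""
    nonce = 0
    while not digest(message, nonce).startswith("0" * zeros):
        nonce += 1
    return nonce


def verify(message: str, nonce: int, zeros: int = 4) -> bool:
    """Cheap part: a single hash tells the receiver if the pair is valid."""
    return digest(message, nonce).startswith("0" * zeros)


message = "attack at dawn"
nonce = find_nonce(message)
is_valid = verify(message, nonce)
```

Finding the nonce takes tens of thousands of hashes on average for 4 leading zeros, while checking it takes exactly one. If the message is changed in transit, the hash changes completely and `verify` will almost certainly fail for the old nonce.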
So, to put POW in the context of crypto mining:
- The miners try to solve cryptographic puzzles to add a block to the blockchain.
- The process requires a lot of effort and computational power.
- The miners then present their block to the bitcoin network.
- The network then checks the authenticity of the block by simply checking the hash. If it is correct, then the block gets appended to the blockchain.
So, discovering the required nonce and hash should be difficult, however checking whether it is valid or not should be simple. That is the essence of proof-of-work.
Now, you are probably wondering, why should the miners sacrifice their time and resources to mine bitcoins? Well, turns out that they have a pretty healthy economic incentive:
- When you discover a block, you receive a block reward, which is 12.5 bitcoins at the time of writing. The reward halves every 210,000 blocks.
- Once you have mined a block, you become the temporary dictator of the block. You are the one responsible for putting transactions inside the block and are hence entitled to transaction fees.
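The halving schedule can be written down as a small function. The 210,000-block interval comes from the text above; the starting reward of 50 bitcoins and the unit of satoshis (1 bitcoin = 100,000,000 satoshis) are not stated in this article and are filled in here from Bitcoin's well-known parameters.

```python
HALVING_INTERVAL = 210_000          # blocks between halvings
INITIAL_SUBSIDY = 50 * 100_000_000  # 50 BTC expressed in satoshis


def block_subsidy(height: int) -> int:
    """Block reward in satoshis at a given block height."""
    halvings = height // HALVING_INTERVAL
    return INITIAL_SUBSIDY >> halvings  # integer halving, rounds down


def total_supply() -> int:
    """Sum the rewards of every era until the reward rounds down to zero."""
    total, era = 0, 0
    while (INITIAL_SUBSIDY >> era) > 0:
        total += (INITIAL_SUBSIDY >> era) * HALVING_INTERVAL
        era += 1
    return total


reward_after_two_halvings = block_subsidy(420_000) / 100_000_000
supply_in_btc = total_supply() / 100_000_000
```

Two halvings take the reward from 50 to 12.5 bitcoins, and because each era pays out half of the previous one, the total converges to just under 21 million bitcoins.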
There is only a limited number of bitcoins out there, 21 million to be exact. So, what is stopping these miners from mining out all the bitcoins at once?
Turns out that bitcoin mining gets harder as more computational power joins the network. This feature is called “difficulty”: the network readjusts it every 2016 blocks so that, however much power is mining, a new block is still found roughly every 10 minutes.
This is why it is pretty much impossible nowadays for solo miners to mine Bitcoins using just their computers. Miners have now joined forces and created “mining pools” to pool their computational power together and mine as a group. These pools use ASICs (Application-Specific Integrated Circuits) specifically created for mining to mine bitcoins.
Problems with POW
There are three main problems with the Proof-of-Work algorithms. We have talked about this in detail before, so we are just going to do a general overview.
- Energy wastage: Bitcoin eats up more power than Ireland and the Slovak Republic. This huge wastage of energy is one of the principles of Bitcoin. It is wastage for the sake of wastage.
- Centralization: As we have already told you, Bitcoin uses ASICs for mining. The problem with that is ASICs are expensive, and pools with more money tend to have more ASICs and, consequently, more mining power. As such, Bitcoin is not as decentralized as it wants to be.
- Scalability: The very architecture of POW prevents scalability. Bitcoin manages a mere 7 transactions per second. For a modern-day financial system, it is simply not adequate.
So, to counteract the many problems with the Proof-of-Work consensus system, Jae Kwon, a computer science and system engineering graduate, created Tendermint. Tendermint is a purely BFT-based protocol, built in a permissionless setting with the Proof-of-Stake(PoS) as the underlying security mechanism.
Because of the complexity, Tendermint has taken almost 4 years to be completed.
Jae Kwon and Tendermint CTO Ethan Buchman were inspired by Raft and PBFT to create a consensus system which solved the Byzantine General’s problem. It is
“modeled as a deterministic protocol, live under partial synchrony, which achieves throughput within the bounds of the latency of the network and individual processes themselves.”
Alright, we know that’s a lot of complicated words to throw one after another but in order to understand what Tendermint consensus is and why it was designed the way it was designed, you need to understand what some of those complicated terms mean. You will see how all of them link up with each other like an intricate puzzle.
#1 FLP Impossibility
The FLP (Fischer Lynch Paterson) Impossibility states that a consensus algorithm can only have 2 of the following 3 properties:
- Safety or agreement (consistency)
- Guaranteed termination or liveness
- Fault Tolerance
In other words, the FLP impossibility states that
“both termination and agreement (liveness and safety) cannot be satisfied in a timebound manner in an asynchronous distributed system, if it is to be resilient to at least one fault (they prove their result for general fault tolerance, which is weaker than Byzantine fault tolerance, since it only requires one fail-stop node — so BFT is included inside FLP impossibility claims).”
So, basically, it is quite impossible for an asynchronous WAN to come to a consensus as there is no specific amount of time that the nodes will take to receive, process, and respond to messages. This is obviously a big problem because it is extremely impractical for a large network of nodes like Bitcoin to assume that they are going to synchronize.
Ok, so synchronicity was going to be a problem. However, researchers Dwork, Lynch and Stockmeyer threw a lifeline here with their paper called “Consensus in the Presence of Partial Synchrony.” The result is known as DLS consensus.
#2 DLS Consensus and Partial Synchrony
The DLS paper states that between a synchronous system and an asynchronous system, there exists a special system which is “partially synchronous”. In a partially synchronous system, an upper bound on message delay exists, but it is either not known in advance or only holds after some unknown point in time. Because such a bound exists, it is possible to design a feasible BFT protocol for it.
According to DLS, the real challenge in designing protocols is to have one that works correctly in a partially synchronous system.
So, let’s see how popular decentralized protocols like Bitcoin and Ethereum work in that regard.
Bitcoin has a known upper bound which is around 10 mins. So, a block of transactions is produced every 10 mins. This timing assumption is imposed on the network so that the nodes get 10 whole mins to collect the information and transmit it along via gossip.
On the other hand, we have Ethereum, which makes synchrony assumptions for its blocks and network by keeping the block time at around 15 seconds. With such a low block time, it is more scalable than Bitcoin; however, it is not really that efficient. Ethereum miners produce a lot of orphan blocks.
#3 Liveness and Termination
Termination is a property which states that each correct processor should eventually make a decision. Most consensus algorithms that we have right now rely on synchronous models for their safety and termination. They have fixed bounds and rules that are known so, in the event that they don’t hold up, the chain forks into multiple versions.
Sure there are consensus protocols that work in asynchronous networks, however going by the FLP impossibility theorem, they cannot be deterministic. Which brings us to….
#4 Deterministic vs. Nondeterministic Protocols
Usually, purely asynchronous consensus protocols depend upon nondeterministic members such as oracles, which involve a high degree of uncertainty and complexity.
So How Does Tendermint Deal with All these Factors?
Tendermint is a mostly asynchronous, deterministic, BFT consensus where validators have a stake which denotes their voting power. In the FLP impossibility triangle, it prefers Fault-tolerance and safety (consistency) over liveness.
Tendermint constantly swings between periods of synchrony and asynchrony. This means that, while it relies upon timing assumptions to make progress, the speed of the said progress doesn’t depend on system parameters but instead depends on real network speed.
Also, Tendermint never forks in the presence of asynchrony if less than 1/3rd of the validators are corrupt/careless. This is the very reason why Tendermint is Byzantine Fault Tolerant. Like we have said before, Tendermint focuses on safety above liveness. So, if a third or more of the validators are offline or malicious, instead of the network forking, the Tendermint blockchain will simply come to a temporary halt until more than 2/3rd of the validators come to a consensus.
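The 1/3 and 2/3 thresholds are easy to mix up, so here is the arithmetic as a short sketch. The function names are made up for this example, and voting power is treated as a plain integer.

```python
def has_quorum(voting_power: int, total_power: int) -> bool:
    """True when strictly more than 2/3 of the total voting power agrees."""
    return 3 * voting_power > 2 * total_power


def max_faulty(total_power: int) -> int:
    """Largest faulty voting power f that still satisfies 3f < total."""
    return (total_power - 1) // 3


total = 100
tolerated = max_faulty(total)                         # 33
still_live = has_quorum(total - tolerated, total)     # 67 honest votes
halted = not has_quorum(total - tolerated - 1, total) # only 66 honest votes
```

With 100 units of voting power, up to 33 can misbehave and the remaining 67 still clear the >2/3 bar. Lose one more and no block can gather enough votes, so the chain pauses instead of forking.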
Tendermint is also completely deterministic and there is no randomness in the protocol. The leaders in the system are all elected in a deterministic way, via a defined mathematical function. So, we can actually mathematically prove that the system is behaving in the way that it is supposed to behave.
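One way to elect leaders without randomness is a weighted round-robin, where validators with more voting power get proportionally more turns. The sketch below is a simplified illustration of that idea and should not be read as Tendermint Core's actual implementation; the validator names, the powers, and the alphabetical tie-break are assumptions made for the example.

```python
def proposer_schedule(powers: dict, rounds: int) -> list:
    """Deterministic weighted round-robin over validators."""
    priority = {name: 0 for name in powers}
    total = sum(powers.values())
    schedule = []
    for _ in range(rounds):
        # Every validator gains priority equal to its voting power.
        for name, power in powers.items():
            priority[name] += power
        # Highest priority proposes; ties go to the alphabetically first name.
        proposer = min(priority, key=lambda n: (-priority[n], n))
        # The proposer pays back the total power, moving to the back of the line.
        priority[proposer] -= total
        schedule.append(proposer)
    return schedule


schedule = proposer_schedule({"A": 1, "B": 2, "C": 3}, 6)
# schedule == ["C", "B", "A", "C", "B", "C"]
```

Over six rounds A proposes once, B twice and C three times, matching their voting power, and any node running the same function gets the same schedule.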
Tendermint – The Proof of Stake System
In a proof of stake (POS) system, we have certain people called “validators”. These validators lock up a stake inside the system. After that, they have the responsibility of betting on the block that they feel is going to be added next to the blockchain. When the block gets added, they get a reward proportional to their stake.
Alright, so that’s how a generic POS works. Now, let’s look into how Tendermint works.
Let’s first familiarize ourselves with some of the terms that we will be using:
- A network is composed of a lot of nodes. Nodes that are connected to a particular node are called its peers.
- The consensus process takes place at a particular block height H. The process to determine the next block consists of multiple rounds.
- The round consists of many states which are: NewHeight, Propose, Prevote, Precommit, and Commit. Each state is called a Roundstep or just “step”.
- A node is said to be at a given height, round, and step, or at (H,R,S), or at (H,R) in short to omit the step.
- To prevote or precommit something means to broadcast a prevote vote or precommit vote for something.
- A set of >2/3 of the prevotes for a particular block at (H,R) is called a proof-of-lock-change or PoLC.
What is the State machine?
The state machine is the engine of the Tendermint protocol so to speak. The following diagram gives you a good idea of what it will look like:
Ok, so what is going on here?
Remember the states that each round goes through? NewHeight, Propose, Prevote, Precommit, and Commit.
Of these, “Propose, Prevote, Precommit” make up one round while the other two are special steps. In an ideal scenario, the state transition would act like this:
NewHeight -> (Propose -> Prevote -> Precommit)+ -> Commit -> NewHeight ->…
However, that’s not how it may always work. Multiple rounds may be required before the block is committed. The following are the reasons why multiple rounds may be needed:
- The designated proposer may be absent.
- The block proposed may be invalid.
- The block didn’t propagate in time.
- >2/3 of prevotes weren’t received in time by the validator nodes.
- Even though >2/3 of prevotes are necessary to progress to the next step, at least one validator may have voted <nil> or maliciously voted for something else.
- >2/3 of precommits for the block weren’t received even though prevotes may have been received.
What happens during each state?
Alright… so now let’s look into each and every state and see how the whole thing comes together.
In the Propose stage, the designated proposer, i.e. the node selected for this round, proposes a block to be added at (H, R). This stage ends in one of two ways:
- The block gets proposed and the node enters the prevote stage.
- The proposer’s time to propose the block expires, upon which the node enters the prevote stage anyway.
Now we come to the prevote stage. In this stage, every validator needs to make a decision.
- If the validator is locked on a proposed block from some previous round, they automatically sign and broadcast a prevote for that block.
- If the validator had received an acceptable proposal for the current round, then they sign and broadcast a prevote for the proposed block.
- However, if they find something fishy with the proposal or have not received any proposal at all (eg. if the proposer’s time runs out), then they sign with a “nil” prevote.
- No block-locking happens during this stage.
- During this period, all the nodes propagate the prevotes throughout the system via the gossip protocol.
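The prevote rules above boil down to a three-way decision. Here is a minimal sketch, where blocks are represented by plain string IDs and `None` stands for a nil prevote; both representations are assumptions made for the example.

```python
from typing import Optional


def choose_prevote(locked_block: Optional[str],
                   proposal: Optional[str],
                   proposal_is_valid: bool) -> Optional[str]:
    """Return the block ID to prevote for, or None for a nil prevote."""
    if locked_block is not None:
        return locked_block  # already locked from an earlier round
    if proposal is not None and proposal_is_valid:
        return proposal      # acceptable proposal for this round
    return None              # missing, late, or fishy proposal: prevote nil


vote_when_locked = choose_prevote("block-7a", "block-9f", True)
vote_on_good_proposal = choose_prevote(None, "block-9f", True)
vote_on_timeout = choose_prevote(None, None, False)
```

Note that the function never changes the lock, which matches the rule that no block-locking happens during this stage.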
Now we enter the final step of the “round” called “precommit.” Upon entering this stage, the validators precommit to their decision by broadcasting their precommits. One of the following three scenarios can happen:
- If the validator receives >2/3 of the prevotes for a particular acceptable block then the validator signs off and broadcasts their precommit for that block. They also get locked on to that block. One validator can lock on to only one block at a time.
- However, if the validator receives more than 2/3rd of nil prevotes then they unlock and precommit “nil”.
- Finally, if they haven’t received a supermajority of more than 2/3rd at all then they don’t sign off or lock on anything.
Throughout this stage, the nodes keep on continuously gossiping about the precommits throughout the network.
In the end, if the proposed block gets more than 2/3rd precommits then we move towards the “Commit” step. However, if they don’t reach that stage then they enter the “Propose” stage of the next round.
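The three precommit scenarios can be sketched as one function that tallies prevotes by voting power. The data shapes are assumptions for the example: prevotes map a validator name to a block ID or the string `"nil"`, and the function returns a `(precommit, lock)` pair where a precommit of `None` means nothing is signed. Keeping an existing lock in the no-supermajority case is also an assumption.

```python
from collections import Counter
from typing import Optional, Tuple


def choose_precommit(prevotes: dict, powers: dict,
                     locked: Optional[str]) -> Tuple[Optional[str], Optional[str]]:
    """Decide what to precommit and what to stay locked on."""
    total = sum(powers.values())
    tally = Counter()
    for validator, vote in prevotes.items():
        tally[vote] += powers[validator]
    for vote, power in tally.items():
        if 3 * power > 2 * total:      # strictly more than 2/3
            if vote == "nil":
                return "nil", None     # unlock and precommit nil
            return vote, vote          # precommit the block and lock on it
    return None, locked                # no supermajority: sign nothing


powers = {"v1": 1, "v2": 1, "v3": 1, "v4": 1}
block_case = choose_precommit(
    {"v1": "X", "v2": "X", "v3": "X", "v4": "nil"}, powers, None)
nil_case = choose_precommit(
    {"v1": "nil", "v2": "nil", "v3": "nil", "v4": "X"}, powers, "X")
split_case = choose_precommit(
    {"v1": "X", "v2": "X", "v3": "Y", "v4": "Y"}, powers, None)
```

At most one value can exceed 2/3 of the voting power, so the order in which the tally is scanned does not affect the result.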
The Commit state isn’t a part of the “round”. Along with NewHeight, it’s one of the two special steps. During the commit state, two parallel conditions are checked to see if they are getting fulfilled or not.
- Firstly, the validators must receive the block that has been precommitted by the network. Once that is done, they sign off and broadcast their commitment.
- Secondly, they must wait until they have received more than 2/3rd of the precommits for the block.
Once this is done, the block gets committed to the network.
Finally, the NewHeight step simply increments the block height by 1 to show that the block has been added.
Choosing the Validators
As you may have understood by now, choosing the initial set of validators is critical for Cosmos to function. So, how exactly are they going to be chosen?
Unlike Bitcoin where anyone can become a miner anytime, there are only so many validators that the Tendermint system can take in. Since validators will individually need to do a lot of functions, increasing the count of validators will only lead to delay.
This is why Cosmos decided to choose 100 validators during Genesis day (i.e. the day of the fundraiser). The number of validators will increase by 13% every year for 10 years, after which it will settle at 300.
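The growth figures can be checked with a little arithmetic. The sketch below assumes the 13% is compounded once a year starting from year 0, that fractional validators are rounded down, and that growth stops in year 9 (the tenth year); those details are an interpretation and are not spelled out in the text above.

```python
import math
from fractions import Fraction


def validator_cap(year: int, initial: int = 100,
                  growth: Fraction = Fraction(13, 100),
                  final_year: int = 9) -> int:
    """Maximum number of validators in a given year (year 0 = genesis)."""
    capped_year = min(year, final_year)
    # Exact rational arithmetic avoids floating-point rounding surprises.
    return math.floor(initial * (1 + growth) ** capped_year)


caps = [validator_cap(y) for y in range(12)]
# caps == [100, 113, 127, 144, 163, 184, 208, 235, 265, 300, 300, 300]
```

Under these assumptions, nine rounds of 13% growth take the cap from 100 to 300, since 1.13 raised to the ninth power is just over 3.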
So, what about results?
As the cosmos whitepaper states:
“Tendermint provides exceptional performance. In benchmarks of 64 nodes distributed across 7 datacenters on 5 continents, on commodity cloud instances, Tendermint consensus can process thousands of transactions per second, with commit latencies on the order of one to two seconds. Notably, the performance of well over a thousand transactions per second is maintained even in harsh adversarial conditions, with validators crashing or broadcasting maliciously crafted votes.”
The graph below supports the claim made above:
Casper vs Tendermint
Along with Tendermint, Casper is another popular implementation of POS protocol.
While Tendermint focuses on safety, Casper’s focus is on liveness, wrt the FLP impossibility. So, what happens in Casper during a fork?
Casper FFG will allow a blockchain to continue being built, while also having the property that all nodes will be aware that this chain is not finalized. So, the blockchain can remain available without any finality. The validators on the chain have the choice to move to the forked chain. If more than 2/3rd of the validators vote, then they switch chains.
Plus, Casper has a famous slashing mechanism. Any sort of malicious attack will lead to validators getting their stake instantly slashed.
So, there you have it. We hope that we gave you as much valuable information as possible. What do you think of Tendermint and its potential? Sound off in the comments section below!