Concensus Algorithm

 Consensus Algorithm

  • A consensus algorithm is a distributed computing protocol that enables a group of distributed nodes to agree on a single value or decision 
  • In a distributed system, where multiple nodes must coordinate to perform a task or provide a service, a consensus algorithm ensures that all nodes are in agreement on the state of the system, even in the face of node failures, network delays, or other forms of disruption. 
Some common use cases for consensus algorithms are: 
  • Distributed databases: Consensus algorithms are used to ensure that all nodes in a distributed database agree on the state of the data, even in the presence of node failures or network delays. Examples Apache Cassandra and Riak. 
  • Distributed ledgers: Consensus algorithms are used in blockchain systems and other distributed ledgers to ensure that all nodes in the network agree on the state of the ledger. Examples Bitcoin and Ethereum. 
  • Distributed file systems: Consensus algorithms are used in distributed file systems to ensure that all nodes have a consistent view of the file system's state, even in the presence of node failures or network delays. Examples Google File System and Hadoop Distributed File System. 
  • Distributed messaging systems: Consensus algorithms are used in distributed messaging systems to ensure that all nodes agree on the ordering of messages, even in the presence of network delays or message loss. Examples Apache Kafka and RabbitMQ. 
  • Cloud computing: Consensus algorithms are used in cloud computing to ensure that all nodes in a cluster agree on the state of the system, even in the presence of node failures or network delays. Examples Apache Mesos and Kubernetes. 
  • Distributed Transactions: Consensus algorithms can be used to ensure that all nodes agree on the outcome of the transaction, even in the presence of node failures or network delays.  
    • Example, in the two-phase commit protocol, a consensus algorithm is used to coordinate the commit or abort decision across all nodes involved in the transaction. In this protocol, a coordinator node sends a prepare message to all nodes involved in the transaction, asking them to prepare for the commit. If all nodes agree to commit, the coordinator sends a commit message to all nodes, and the transaction is committed. If any node fails or does not respond, the coordinator sends an abort message to all nodes, and the transaction is aborted. 
    • Consensus algorithms can also be used in other distributed transaction protocols, such as three-phase commit and commit protocols based on Paxos or Raft 
  • Distributed Lock: a consensus algorithm can be used in a distributed lock system to ensure that only one node in the system can access a shared resource at a time. The node that successfully proposes the lock value is granted access to the shared resource, while other nodes must wait until the lock value is released before they can propose their own lock value. This ensures that only one node can access the shared resource at a time, even in the presence of failures or network partitions. 
  • Leader Election: a consensus algorithm can be used in leader election to ensure that a distributed system can elect a single leader node to coordinate the activities of the other nodes.  
    • For example, in a Paxos-based leader election algorithm, the nodes in the system exchange messages to propose and agree upon a sequence of values, with each value representing a proposed leader node. The node that successfully proposes a value that is accepted by a quorum of the nodes becomes the leader node, while the other nodes become followers. 
    • Similarly, in a Raft-based leader election algorithm, the nodes in the system participate in a series of elections to elect a leader node. The election process consists of two phases: a candidate phase and a leader phase. During the candidate phase, nodes send vote requests to each other to determine if they are eligible to become the leader. The node that receives a majority of votes becomes the leader. If no node receives a majority of votes, a new election is triggered. 
    • In both Paxos and Raft, the consensus algorithm ensures that only one node is elected as the leader, even in the presence of failures or network partitions. 
Popular Consensus Algorithms:
  • Paxos: This is a consensus algorithm designed to allow a distributed system to reach consensus on a value or a decision even in the presence of failures or network partitions. Paxos is commonly used in systems that require high availability, such as databases and distributed file systems. 
  • Raft: This is a consensus algorithm designed to be more understandable and easier to implement than Paxos. Raft is often used in systems that require a leader node, such as distributed key-value stores and distributed databases. 
  • Byzantine fault tolerance (BFT): This is a class of consensus algorithms designed to tolerate faults or attacks by a minority of nodes in a distributed system. BFT algorithms are commonly used in systems that require high security, such as cryptocurrency networks and financial systems. 
  • Practical Byzantine Fault Tolerance (PBFT): This is a variant of the Byzantine fault tolerance algorithm that is designed to be practical for use in real-world systems. PBFT is commonly used in distributed systems that require high throughput and low latency, such as financial trading systems. 
  • Hashgraph: This is a relatively new consensus algorithm that is designed to be fast and secure. Hashgraph is often used in systems that require high performance, such as gaming and social media platforms. 
  • Zab (Zookeeper Atomic Broadcast) : It is developed by the Apache ZooKeeper project. Zab ensures that messages are delivered in the order they were broadcasted, and that every message is delivered to every node in the same order. This ensures that the distributed system remains consistent and can recover from failures. 
  • Viewstamped Replication (VSR) : The key concept in VSR is the use of a view number to manage changes in the system configuration due to leader failures or network partitions. The view number represents the current state of the system and is incremented whenever a new leader is elected or a new partition is created. The view number ensures that replicas are always working with the latest state of the system. 

Consensus algorithm must satisfy the following properties:

  • Uniform Agreement: No two nodes decide differently
  • Integrity: No nodes decide twice
  • Validity: If a node decides value v, the v was proposed by some node.
  • Termination: Every node that does not crash eventually decides same value.
  • The Validity Property exists mostly to rule out trivial solutions: For example;
    • You could have an algorithm that always decides null, no matter what was proposed. this algorithm would satisfy uniform and integrity properties but not the validity properties.
  • With Termination property, even if some node fail, the other nodes must still reach a decision.
    • The system model of consensus assumes that when a node "crashes", it suddenly disappears and never come back.
    • Most consensus algorithm's termination property is subject to the assumption that fewer than half of the node unreachable. The majority can easily form a quorum.
  • Most consensus algorithm assume that there are no Byzantine faults: that is if a node does not correctly follow the protocol (e.g. it sends contradictory message to different node), it may break the safety properties of the protocol.

Limitation of Consensus:

  • You need minimum three nodes in order to tolerate one failure, five nodes in order to tolerate two failures.
  • Most consensus algorithm assume a fixed set of nodes that participate in voting, which means that you can not just add or remove nodes in cluster.
    • Dynamic membership extension to consensus algorithm allow the set of nodes in cluster to change over time.
  • Consensus system generally rely on timeouts to detect failed nodes, causing frequent leader election results in terrible performance.


    

Comments

Popular posts from this blog

Distributed Lock with Redlock

Distributed Transaction

Storage Engine