Validators are at the heart of the security of the Ethereum ecosystem. Before Ethereum shifted to proof of stake (PoS), it relied on a proof-of-work (PoW) mechanism that was inefficient. The shift from PoW to PoS came with the development of the Casper FFG and LMD GHOST consensus protocols, with Casper FFG drawing inspiration from PBFT and Tendermint. In the PoS mechanism, validators have two main roles: (1) producing blocks, in which case they are called proposers in the post-PBS era, and (2) producing attestations for the proposed blocks. Unlike miners in the PoW system, validators are not in competition with one another; instead, they are algorithmically selected to propose and vote on new blocks.
Ethereum has roughly 800,000 validators, and for them missed attestations are penalized and malicious behavior is slashed. These rules are part of the Casper protocol and disincentivize validators from going offline or acting maliciously. If the validator client software cannot produce timely messages to perform its duties, the validator is penalized and its balance shrinks; during periods when the chain is not finalizing, an inactivity leak reduces it further. Beyond these penalties, validators face other risks that could lead to loss of stake.
These risks come mainly from loss or theft of the validator's private key, which can happen for a variety of reasons, including trusting a single operator to run the validator client or using staking services that hold the key. The signing key is needed every epoch, i.e. every 6.4 minutes. There is also the risk of node failure due to hardware faults, software bugs, disk failure, and the like. Let us look at ways to address each of these problems and then see what a combined solution looks like.
Implementing distributed validator technology (DVT) requires a Distributed Key Generation (DKG) protocol to share a secret key (in this case, the validator signing key) among several committee members in the form of partial signing keys called shares, while keeping the full secret key unknown to any individual party. With a threshold signature scheme (TSS), the shareholders sign together, and a valid signature is produced as long as at least m of the n shares take part, so the scheme tolerates up to n - m failed or malicious members. With DKG and TSS in place, theft or loss of the key at a single party no longer causes validator failure.
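The threshold idea behind key shares can be sketched with Shamir secret sharing. This is only an illustration under loud assumptions: it uses a trusted dealer who knows the whole key (exactly what a real DKG avoids), a toy prime field instead of the BLS12-381 scalar field, and hypothetical function names (`split_secret`, `recover_secret`) that do not come from any DVT implementation.

```python
import secrets

# Toy prime field (assumption); real validator keys live in the BLS12-381 scalar field.
PRIME = 2**127 - 1  # a Mersenne prime, chosen only for illustration


def split_secret(secret: int, threshold: int, num_shares: int) -> list[tuple[int, int]]:
    """Split `secret` into `num_shares` points; any `threshold` of them recover it."""
    if not 1 <= threshold <= num_shares:
        raise ValueError("threshold must be between 1 and num_shares")
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret % PRIME] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x: int) -> int:
        result = 0
        for c in reversed(coeffs):  # Horner's rule
            result = (result * x + c) % PRIME
        return result

    return [(x, evaluate(x)) for x in range(1, num_shares + 1)]


def recover_secret(shares: list[tuple[int, int]]) -> int:
    """Lagrange-interpolate the polynomial at x = 0 to obtain the secret."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

With `split_secret(key, 3, 4)`, any three of the four shares passed to `recover_secret` return the key, while two shares alone reveal nothing about it. In a distributed validator the key is never recombined like this; the shares are used to sign, and only the signatures are combined.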
For node failure, which can have a variety of causes as mentioned before, redundancy in the form of multiple clients running the validator helps. It is important that the nodes are in the same state and in sync so that they do not vote differently and get slashed. These instances, possibly of different clients, therefore need consensus among themselves to protect against slashing. This is not the Ethereum consensus, but a layer of consensus over the redundant nodes that run the validator.
Taking these into consideration, let us dive deeper into the validator architecture and how the solution incorporates these techniques to come up with a new design that protects against key theft and node failure while keeping the redundant nodes in consensus.
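To make "vote differently and get slashed" concrete, here is a minimal sketch of the two slashable attestation conditions on Ethereum, the double vote and the surround vote. The `Vote` type is a hypothetical, simplified stand-in for real attestation data, and `is_slashable_pair` is an illustrative name, not an API of any client.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    """Hypothetical, simplified stand-in for an attestation's checkpoint vote."""
    source_epoch: int
    target_epoch: int
    block_root: str  # what the vote points at


def is_slashable_pair(a: Vote, b: Vote) -> bool:
    """True if a single validator signing both votes would be slashable."""
    # Double vote: two different votes for the same target epoch.
    double_vote = a != b and a.target_epoch == b.target_epoch
    # Surround vote: one vote's source/target span strictly contains the other's.
    surround_vote = (
        (a.source_epoch < b.source_epoch and b.target_epoch < a.target_epoch)
        or (b.source_epoch < a.source_epoch and a.target_epoch < b.target_epoch)
    )
    return double_vote or surround_vote
```

Two redundant instances that sign `Vote(1, 2, "0xaa")` and `Vote(1, 2, "0xbb")` with the same validator key would produce a double vote, which is why the instances must agree on what to sign before any of them signs.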
The current validator is made up of a beacon node and a validator client. The beacon node is the outward-facing client: it participates in the P2P (peer-to-peer) network, keeps track of the chain, and directly faces the Ethereum network. Beacon nodes communicate their processed blocks to their peers over the P2P network and also manage the lifecycle of active validator clients. While the beacon nodes handle network synchronization, consensus, and several other low-level functions, the validators who stake ETH in order to perform block proposals and attestations are an equally critical component of the Ethereum beacon chain. A validator begins participating in the network once 32 ETH is locked up in the validator deposit contract. The validator client also handles the key used for signing attestations and block proposals.
I know it’s getting a bit tiring seeing modular blockchains, modular ZK-proof designs, modular searchers, modular smart contracts, and everything modular. But modular architecture for validators is not something made up.
A validator can be considered three things bundled together: a single machine, a single operator, and a single private key. If we treat each of these as critical and look for ways to improve it, we are indeed talking about a modular architecture that leads to specialization.
Using DKG and key shares, the private key is divided into a number of parts, 4 or 16 today and possibly more in the future. The private key is thus distributed and is no longer a single point of failure. The operators are distributed using the same technology, which gives more flexibility in the operator set and less reliance on a single operator. A validator, or a validator cluster, can be run jointly by multiple operators, which leads to better diversity and, again, improves the validator's fault tolerance. The last part is spreading the validator across multiple machines and multiple client implementations for protection against bugs and similar failures.
DVT = DKG + TSS + Consensus
We have covered the problems of key theft and node failure, and a solution design consisting of distributed key generation, threshold signature schemes, redundant instances, and consensus among them. Let us see how these fit together in a middleware that orchestrates all of it.
We first split the validator key into multiple shares using DKG and TSS, and create multiple redundant validator client instances, one per share, that together run the same validator. Each instance has access only to its own private key share; the instances sign separately, and their partial signatures are combined into a single signature. These separate clients could be connected to the same beacon node, but since we are aiming for a fault-tolerant system, the beacon nodes are also replicated to remove the single point of failure. If the beacon node to which a validator client is connected has a fault, the validator may end up following a minority fork, which makes it appear offline to the rest of the PoS protocol. The solution is a middleware containing distributed validator clients, which handle communication between the beacon nodes and the validator clients, with a consensus layer in between that uses Istanbul BFT (IBFT) for consensus among both the validator instances and the beacon node instances.
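The step where instances sign with their own shares and the results are combined can be sketched with a deliberately insecure toy scheme. The assumptions are loud ones: the "signature" here is simply share × H(m) modulo a prime, which mimics only the linearity that BLS threshold signing relies on; the shares come from a trusted dealer rather than a DKG; and all names (`make_shares`, `partial_sign`, `combine`) are hypothetical.

```python
import hashlib
import secrets

Q = 2**127 - 1  # toy prime modulus (assumption) standing in for the BLS12-381 group order


def make_shares(secret: int, threshold: int, n: int) -> dict[int, int]:
    """Shamir shares {operator_id: share} of `secret` (dealer-based, for illustration)."""
    coeffs = [secret % Q] + [secrets.randbelow(Q) for _ in range(threshold - 1)]
    return {
        x: sum(c * pow(x, k, Q) for k, c in enumerate(coeffs)) % Q
        for x in range(1, n + 1)
    }


def message_scalar(message: bytes) -> int:
    """Hash the message to a scalar; real BLS hashes to a curve point instead."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % Q


def partial_sign(share: int, message: bytes) -> int:
    """Toy partial signature share * H(m): linear in the share, but NOT secure."""
    return share * message_scalar(message) % Q


def combine(partials: dict[int, int]) -> int:
    """Lagrange-combine partial signatures from any `threshold` operators."""
    total = 0
    for i, sig in partials.items():
        coeff = 1
        for j in partials:
            if j != i:
                # Lagrange coefficient at x = 0: product of j / (j - i).
                coeff = coeff * j * pow((j - i) % Q, -1, Q) % Q
        total = (total + coeff * sig) % Q
    return total
```

Combining the partial signatures of any three operators in a 3-of-4 setup yields the same value as signing with the full key, yet no operator ever holds or reconstructs that key. That property, not the toy arithmetic, is what carries over to a real threshold signature scheme.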
For the middleware consensus, we need a protocol that tolerates fewer than one-third of the nodes being faulty and changes leader quickly, so that attestations and proposals are produced on time. Istanbul BFT was chosen because it satisfies these conditions.
The result of this middleware is a redundant validator system that stays online through failures that would take down a single instance.
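The one-third bound translates into concrete cluster sizes. The sketch below uses the standard BFT sizing argument (n ≥ 3f + 1 nodes to tolerate f Byzantine faults, with a quorum large enough that any two quorums share at least one honest node); it is not taken from any particular DVT implementation, and `bft_parameters` is a hypothetical helper name.

```python
def bft_parameters(n: int) -> tuple[int, int]:
    """Return (max_faulty, quorum) for a cluster of n nodes."""
    if n < 4:
        raise ValueError("need at least 4 nodes to tolerate one Byzantine fault")
    max_faulty = (n - 1) // 3            # largest f with n >= 3f + 1
    quorum = (n + max_faulty) // 2 + 1   # any two quorums overlap in >= f + 1 nodes
    return max_faulty, quorum
```

For clusters of 4, 7, and 16 nodes this gives quorums of 3, 5, and 11, which line up with the 3-of-4, 5-of-7, and 11-of-16 thresholds mentioned elsewhere in this article.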
The DVT middleware has other benefits as well.
Every validator utilizing DVT could be made up of different client implementations, so that bugs affecting a particular client do not affect the functioning of the validator as a whole. If many validators adopt DVT, the overall effect on the network would be validators that are composed of different clients, spread across different parts of the world, and therefore much more resilient and diverse.
This was a general overview of the problem, the cryptographic primitives used in the solution, the solution design, and its effects. Let's have a look at the three leading solutions in the DVT vertical and the differences between them.
The Obol Network consists of four core public goods:
Charon is a Go-based HTTP middleware built by Obol that enables any existing Ethereum validator client to operate as part of a distributed validator. Charon's networking model can be divided into two parts: the internal validator stack and the external P2P network. Each operator should run the whole validator stack (all 4 client software types), either on the same machine or on different machines. The networking between these nodes should be private and not exposed to the public internet.
Obol has partnered with Nethermind to research and develop a specification for Obol V2. The partnership enables Nethermind to become the second core development team of the Obol Network. The aim of the partnership is to create a protocol from Obol design and allow for different implementations.
For its alpha and later releases, Obol is working with these partners to run different DV clusters. Below are the results of how different validator clients performed their duties on Obol's public testnet.
The SSV Network functions as a fully decentralized and open-source DVT Network, offering a reusable solution to decentralize Ethereum validators. By employing a threshold signature scheme (TSS), it divides the private keys of validators into multiple shares, which are then distributed to distinct operator nodes. This design guarantees that control over the private key remains dispersed, ensuring the validator's immunity against compromise, even when confronted with malicious operators.
As a security measure, SSV Network uses a remote signer to manage signing tasks. This remote signer operates within an isolated software enclave, which acts as a defense against potential attackers.
SSV Network also uses supplementary security measures including Shamir secret sharing, multi-party computation (MPC), and the Istanbul Byzantine Fault Tolerance (IBFT) consensus.
SSV Network is now live on mainnet after being on testnet for 2 years. SSV is running 224 validators across 24 operators. As of August 2023, 7,168 ETH has been staked through SSV Network.
Lido ran pilots with SSV Network and Obol Network and the following are their respective validator performances. Improvements in block proposer effectiveness, distributed key, and deposit data generation will come along with further stress testing.
Diva's DVT uses a threshold signature scheme to split the validator key into 16 key shares, which are then distributed to different operators. This ensures that no single operator has control of the private key and that a single malicious operator cannot get the validator slashed. Diva's DVT also uses a rotational nondeterministic consensus mechanism that is twice as fast as classic DVT. This allows Diva to operate a network of permissionless and trustless nodes, which has a number of benefits, such as a lower risk of offline penalties, lost keys, slashing, and MEV stealing.
Diva is currently running its Early Stakers & Operators Program and will likely launch on mainnet in Q4 of 2023. On its testnet, Diva currently has 32 stakers with a total of 290 staked ETH.
Liquid staking is currently one of the fastest-growing markets in DeFi. It allows people to stake their ETH with liquid staking protocols and receive, in return, assets that represent a claim on the staked ETH. These assets can be traded, borrowed, lent, used to provide liquidity, held as collateral, or used in any other DeFi activity. An entire market called LSDfi is emerging, which uses liquid staking tokens (LSTs, also called liquid staking derivatives or LSDs) for different purposes.
LSDs are used to back yield-bearing stablecoins, as collateral for lending, and in baskets of assets that provide yield and a market for trading yield. Overall, the liquid staking token asset class is seeing a lot of demand from existing DeFi solutions.
The liquid staking market is dominated by Lido and Rocket Pool, which target slightly different users. Lido and Rocket Pool contribute to 85% of the total staked ETH, which is a critical level for the Ethereum ecosystem. Both have operators who manage and run the validators. Lido has a trusted set of 54 operators controlling most of the pools, while Rocket Pool has 3000+ operators but requires each of them to deposit 8 or 16 ETH. Operators therefore face either a trust barrier or a capital barrier, which could be problematic since liquid staking represents most of the staking on Ethereum.
Liquid staking pools are run by operators, and a malicious operator or a failure on an operator's side can threaten the validator as a whole and the stake of others. The pools in liquid staking protocols therefore have a clear use case for DVT to ensure Byzantine fault tolerance and the safety of the pools. Liquid staking protocols also need to decentralize their validator sets eventually, and DVT is one way to do so.
Using DVT, multiple operators can deposit and jointly run fault-tolerant validators, which addresses the trust issue, reduces the capital required per operator, and reduces slashing and correlation risks for the operators.
Lido has already been running pilots with SSV and Obol to operate its validators using DVT. The pilots ran 2 validators, with 3-of-4 and 5-of-7 thresholds. Since activation, both validators have had near-perfect attestation performance. On occasion, an operator in a cluster experienced connection issues; despite this, the validator performed its attestation duties without interruption. There were also two successful block proposals by cluster DKCSBES-Lido.
Because Lido worked with a curated set of trusted node operators, its previous architecture could not accommodate new participants such as DVT nodes. Lido has therefore come up with the Staking Router mechanism to modularize the validator set at the smart contract level.
DVT gives the large stake pools on liquid staking protocols geographical and client diversity, which also decreases correlation risks. And since liquid staking solutions contribute the largest share of staking, this would lead to broader validator diversity and better overall health of the network.
DVT is a useful technology for liquid staking protocols, and Diva has gone a step further with a liquid staking protocol that integrates DVT from the start. It reduces the minimum requirement for node operators to 1 ETH and has a dynamic operator set. Each validator is run by 16 DVT key shares, and operators must lock 1 divETH of collateral per key share instead of Ethereum's 32 ETH per validator. While Lido and Rocket Pool are looking to integrate the benefits of DVT, Diva has them from the get-go. Diva uses a rotational nondeterministic consensus with 2x lower latency than classic DVT, which requires 2 round trips. Since Diva needs only 11 out of 16 nodes to be active, its downtime is expected to be less than 0.01%.
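How an 11-of-16 threshold turns into an expected downtime can be sketched with a binomial model. This is an illustration, not Diva's own calculation: it assumes every node is online independently with the same probability `node_uptime`, and the function name `cluster_downtime` is hypothetical.

```python
from math import comb


def cluster_downtime(n: int, threshold: int, node_uptime: float) -> float:
    """Probability that fewer than `threshold` of `n` independent nodes are online."""
    if not 0.0 <= node_uptime <= 1.0:
        raise ValueError("node_uptime must be a probability")
    p_down = 1.0 - node_uptime
    # Sum the probabilities of exactly k nodes being online, for k below the threshold.
    return sum(
        comb(n, k) * node_uptime**k * p_down ** (n - k)
        for k in range(threshold)
    )
```

Calling `cluster_downtime(16, 11, node_uptime)` for a few plausible uptimes shows how strongly the cluster-level figure depends on per-node reliability. Correlated failures, such as a shared client bug or cloud provider outage, break the independence assumption and would make real downtime higher than this model suggests.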
The risk of slashing and penalties has led web3 insurance leaders like Nexus Mutual to offer an ETH slashing cover, which covers slashing penalties as well as the rewards validators miss while offline. StakeWise, a leading validator operator and liquid staking service provider, recently purchased ETH2 Staking Cover from Nexus Mutual to protect all its validators against slashing risk. Chorus One is also one of the first node operators to purchase on-chain staking coverage to protect its customers through Nexus Mutual's tokenized cover. Its diverse clients, including custodians, trusts, enterprises, and individuals, are protected against slashing and missed rewards through this partnership.
Figment claims to manage over $3 billion in total assets staked, with nearly 5% of all staked ether on Figment validators. It is also working with Nexus Mutual to provide slashing protection against double signing of blocks.
DVT should have a significant impact on the premiums of staking cover products. As staking grows into a much larger market, and because DVT addresses the risks associated with slashing, downtime, and malicious activity, premiums are likely to trend downward.
MEV-Boost is one of the most successful implementations of PBS. It provides an API for the builder market and a relay service that gives proposers access to more blocks and reduces their effort in creating blocks. Validators that use DVT may come to form a significant percentage of all validators, and they will get elected as proposers. In that case, the key shares (partial validators) would have to come to a consensus among themselves on which block to propose.
Another middleware primitive that has become popular recently is restaking. EigenLayer makes it possible for validators to extend their guarantees to other middleware services. Restaking could use DVT to obtain a better validator set with high uptime and low Byzantine risk. Validator clusters could also have different operators specializing in different middleware use cases, so that slashing would be limited to those operators instead of the entire validator using DVT.