Web3 has introduced groundbreaking answers to Web2’s challenges, such as decentralization, open-source frameworks, and trustless systems. However, even these innovations leave pain points unaddressed. Most notably, many projects still do not rely entirely on trustless, verified data, which leaves their data feeds vulnerable to malicious attacks.
Today, anyone can bring data on-chain and claim it’s true, creating a significant risk for any project that builds on it. The idea of a trustless environment emerged from the need to eliminate these untrustworthy sources and rely instead on network participants, with distributed trust and incentives, to bring forward only valid data. In theory, this is a great process; in practice, improper implementation of the node infrastructure and overlooked details can easily introduce risks of their own.
Take the Ronin Bridge attack of March 2022, for example. A hacker compromised five of the network’s nine validator nodes, gaining the majority needed to approve the withdrawal of over 173.6k ETH and 25.5M USDC from the bridge. The Ronin Bridge team did not suspect anything at the time, as there was no process in place to raise suspicion about the true incentives of each node. The attack highlighted a vulnerability that is still relevant today: weak node infrastructure that enables attacks.
This raises the question: how exactly can data be brought into future Web3 builds in a truly trustless way? And once it is on-chain, how can we validate it in a decentralized way, so that there is no further risk of tampering or manipulation?
When projects source off-chain data, oracles are their go-to tool, as they provide an easy access point to deterministic Web2 data. However, whether or not an oracle claims to verify the data, there is an underlying issue: a lack of trustlessness. How can we be sure that the data the oracle fetched came from the correct source and hasn’t been tampered with beforehand?
Why Oracles Aren’t Fully Trustless:
If a project automatically consumes incorrect data from an oracle without double-checking it, the project becomes highly vulnerable to attacks, losses, improper pricing, and data-processing discrepancies, because it exposes itself to an attack vector from a single source.
For example, in October 2022, Mango Markets, a trading platform on Solana, suffered a targeted attack via oracle discrepancies. Two malicious accounts took an outsized position in MNGO-PERP, leading to a 5–10x jump in the MNGO price. In response, two oracles updated their MNGO benchmark, which in turn caused a mark-to-market increase in the value of the position, all from unrealized profit.
As the Mango Markets team put it, “…neither oracle providers have any fault here. The oracle price reporting worked as it should have.” This clearly shows how limited oracles can be when it comes to providing “valid” data.
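As a minimal sketch of the kind of double-checking described above, the Go snippet below accepts a value only when a strict majority of independent sources agree with the median within a tolerance. The function name `CrossCheck`, the error value, and the 5% tolerance are all hypothetical and are not taken from any oracle's or KYVE's API.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"sort"
)

// ErrNoAgreement is a hypothetical error for "too few sources agree".
var ErrNoAgreement = errors.New("sources disagree beyond tolerance")

// CrossCheck returns the median of the reported values, but only if a strict
// majority of reports lie within maxDeviation (a fraction, e.g. 0.05 for 5%)
// of that median. Name and thresholds are illustrative assumptions.
func CrossCheck(reports []float64, maxDeviation float64) (float64, error) {
	if len(reports) == 0 {
		return 0, ErrNoAgreement
	}
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]float64(nil), reports...)
	sort.Float64s(sorted)

	median := sorted[len(sorted)/2]
	if len(sorted)%2 == 0 {
		median = (sorted[len(sorted)/2-1] + sorted[len(sorted)/2]) / 2
	}

	// Count how many reports sit close enough to the median.
	agreeing := 0
	for _, v := range sorted {
		if math.Abs(v-median) <= maxDeviation*math.Abs(median) {
			agreeing++
		}
	}
	if agreeing*2 <= len(sorted) {
		return 0, ErrNoAgreement
	}
	return median, nil
}

func main() {
	// Two sources agree, one is far off: the outlier is outvoted.
	fmt.Println(CrossCheck([]float64{0.038, 0.039, 0.30}, 0.05))
	// Two sources that disagree with each other: the value is rejected.
	fmt.Println(CrossCheck([]float64{0.04, 0.30}, 0.05))
}
```

A check like this only catches a single faulty or tampered source. It would not help in a case like the Mango Markets one, where the sources agreed with each other and faithfully reported a manipulated market price.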
Another typical attack occurs when a hacker targets the majority of validator nodes in a network in order to approve certain actions, also known as a 51% attack. This can happen if, for example, a network doesn’t have strong enough security in its node infrastructure or simply doesn’t have enough nodes. Since any project can be exposed to this type of attack, it’s crucial to ensure proper decentralization within the node infrastructure.
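To make the "not enough nodes" point concrete, here is a small Go sketch that computes how few nodes an attacker would need to compromise to control more than half of the voting power. The function name and the idea of representing each node by a single voting-power number are assumptions made for illustration. With nine equally weighted nodes, the answer is five, which matches the Ronin Bridge case above.

```go
package main

import (
	"fmt"
	"sort"
)

// MinNodesForMajority returns the smallest number of nodes whose combined
// voting power is strictly more than half of the total. It returns 0 when
// there is no voting power at all. Hypothetical helper, small values assumed.
func MinNodesForMajority(power []uint64) int {
	// Sort a copy in descending order: the cheapest attack targets the
	// most powerful nodes first.
	sorted := append([]uint64(nil), power...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] > sorted[j] })

	var total uint64
	for _, p := range sorted {
		total += p
	}

	var acc uint64
	for i, p := range sorted {
		acc += p
		if acc*2 > total {
			return i + 1
		}
	}
	return 0
}

func main() {
	equal := []uint64{1, 1, 1, 1, 1, 1, 1, 1, 1}
	fmt.Println(MinNodesForMajority(equal)) // 5

	skewed := []uint64{40, 30, 10, 10, 10}
	fmt.Println(MinNodesForMajority(skewed)) // 2
}
```

The second example shows why node count alone is not enough: when power is concentrated in a few nodes, compromising just two of five already yields a majority.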
There are many ways to reduce the risk of a 51% attack. Today, KYVE is heavily focusing on this topic by implementing the right recipe of incentivization, high stake, weighted power, and more to create a secure, fully trustless environment for introducing data.
Once this is achieved, the next hurdle appears: how can we make sure the data introduced into the space is truly correct?
Since anyone can upload data and claim that it’s true, ending up with multiple competing sources of truth is a likely outcome. How do we ensure data accuracy in a trustless environment? The answer lies in decentralization.
Decentralization is the key pillar of the Web3 ethos, distributing power, trust, actions, and more among stakeholders and network participants. In general, there is no single generic way to determine whether a piece of data is valid, so developers need to create custom validation methods per data set. What has been lacking is a way to manage these different runtimes and ensure that all data sets are properly sourced and validated quickly and efficiently.
Enter KYVE, the decentralized data hub built to ensure all types of on- and off-chain data are validated, truly decentralized, and continuously updated, providing the tooling developers need to write these custom solutions.
KYVE enables projects to store blockchain data in a decentralized way by organizing it into data pools, where validators upload and verify the data before it is used.
Here’s how it works:
In each pool on the KYVE data lake, one node is responsible for uploading the data, while the rest are responsible for voting on whether that data is valid. Once the vote is final, the uploader role passes to another randomly selected node. This rotation combats the risk of centralization: if a single node uploaded data at all times, it would be a much more attractive target for an attack.
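The rotation described above can be sketched in a few lines of Go. This is an illustration only: the function name, the node identifiers, and the uniform choice among the remaining nodes are assumptions, and KYVE's actual selection logic is not shown here. The random source is passed in as a function so that a real chain could supply a deterministic, consensus-safe one.

```go
package main

import (
	"fmt"
	"math/rand"
)

// NextUploader picks the next uploader from the pool, excluding the current
// uploader whenever another node is available. pick must return an index in
// [0, n). Hypothetical helper, not KYVE's implementation.
func NextUploader(nodes []string, current string, pick func(n int) int) string {
	candidates := make([]string, 0, len(nodes))
	for _, n := range nodes {
		if n != current {
			candidates = append(candidates, n)
		}
	}
	if len(candidates) == 0 {
		// Only one node in the pool: it has to keep uploading.
		return current
	}
	return candidates[pick(len(candidates))]
}

func main() {
	pool := []string{"node-a", "node-b", "node-c", "node-d"}
	uploader := "node-a"
	for round := 1; round <= 3; round++ {
		// math/rand is fine for a demo but is not suitable on-chain.
		uploader = NextUploader(pool, uploader, rand.Intn)
		fmt.Printf("round %d uploader: %s\n", round, uploader)
	}
}
```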
Below you can see KYVE’s current code for evaluating the vote distribution:
func (k Keeper) GetVoteDistribution(ctx sdk.Context, poolId uint64) (voteDistribution types.VoteDistribution) {
	bundleProposal, found := k.GetBundleProposal(ctx, poolId)
	if !found {
		return
	}

	// get $KYVE voted for valid
	for _, voter := range bundleProposal.VotersValid {
		// valaccount was found the voter is active in the pool
		if _, foundValaccount := k.stakerKeeper.GetValaccount(ctx, poolId, voter); foundValaccount {
			delegation := k.delegationKeeper.GetDelegationAmount(ctx, voter)
			voteDistribution.Valid += delegation
		}
	}

	// get $KYVE voted for invalid
	for _, voter := range bundleProposal.VotersInvalid {
		// valaccount was found the voter is active in the pool
		if _, foundValaccount := k.stakerKeeper.GetValaccount(ctx, poolId, voter); foundValaccount {
			delegation := k.delegationKeeper.GetDelegationAmount(ctx, voter)
			voteDistribution.Invalid += delegation
		}
	}

	// get $KYVE voted for abstain
	for _, voter := range bundleProposal.VotersAbstain {
		// valaccount was found the voter is active in the pool
		if _, foundValaccount := k.stakerKeeper.GetValaccount(ctx, poolId, voter); foundValaccount {
			delegation := k.delegationKeeper.GetDelegationAmount(ctx, voter)
			voteDistribution.Abstain += delegation
		}
	}

	voteDistribution.Total = k.delegationKeeper.GetDelegationOfPool(ctx, poolId)

	if voteDistribution.Total == 0 {
		// if total voting power is zero no quorum can be reached
		voteDistribution.Status = types.BUNDLE_STATUS_NO_QUORUM
	} else if voteDistribution.Valid*2 > voteDistribution.Total {
		// if more than 50% voted for valid quorum is reached
		voteDistribution.Status = types.BUNDLE_STATUS_VALID
	} else if voteDistribution.Invalid*2 >= voteDistribution.Total {
		// if more or equal than 50% voted for invalid quorum is reached
		voteDistribution.Status = types.BUNDLE_STATUS_INVALID
	} else {
		// if neither valid nor invalid reached 50% no quorum was reached
		voteDistribution.Status = types.BUNDLE_STATUS_NO_QUORUM
	}

	return
}
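To see how the thresholds in that function behave, the following standalone Go sketch reproduces only the final status decision using plain integers. The `Status` type, its constants, and the `Tally` function are stand-ins invented for this example; they are not the keeper or the types from the KYVE codebase.

```go
package main

import "fmt"

// Status is a hypothetical stand-in for the bundle status enum.
type Status string

const (
	StatusValid    Status = "VALID"
	StatusInvalid  Status = "INVALID"
	StatusNoQuorum Status = "NO_QUORUM"
)

// Tally applies the same comparisons as the function above: valid needs
// strictly more than half of the total, invalid needs at least half.
func Tally(valid, invalid, total uint64) Status {
	switch {
	case total == 0:
		return StatusNoQuorum
	case valid*2 > total:
		return StatusValid
	case invalid*2 >= total:
		return StatusInvalid
	default:
		return StatusNoQuorum
	}
}

func main() {
	fmt.Println(Tally(51, 0, 100))  // VALID
	fmt.Println(Tally(50, 50, 100)) // INVALID
	fmt.Println(Tally(50, 30, 100)) // NO_QUORUM
	fmt.Println(Tally(0, 0, 0))     // NO_QUORUM
}
```

Two details stand out. An exact 50/50 split resolves to invalid, because valid requires a strict majority while invalid only requires half. And since the total is the pool's full delegation, abstaining or absent voting power counts toward the total without helping either side reach its threshold.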
Lastly, to incentivize good node behavior and maintain a proper flow of valid data, we introduced specific pool economics. To put it simply, those who require direct and easy access to trustless data act as “funders”, supplying $KYVE tokens as rewards for well-behaved pool participants. There are also “delegators”, who delegate their tokens to support nodes in exchange for token rewards. However, if a node misbehaves, its tokens are slashed.
To further improve the funding experience and make it more collaborative, the multi-coin funding update now allows data pools to be funded in $KYVE and other tokens, so other projects can fund their integrations on KYVE with their own tokens.
With such initiatives, KYVE keeps exploring new ways to ensure that the KYVE Network and its overall infrastructure are highly incentivized and fully decentralized, in service of its mission: providing truly trustless data that anyone can use to build secure and scalable Web3 projects.
In 2024, the global user base for digital currencies reached 562 million people, up from 420 million in 2023 (Source: Triple A). Even with global Web3 adoption soaring, there’s no doubt we have a long way to go, but that doesn’t mean it’s too early to start creating a secure baseline for all the projects to come.
Building with trustless data validated in a decentralized way is a necessity for developers sourcing data for their dApp or blockchain. Doing so decreases their project’s risk of data manipulation and attacks and contributes to improving Web3’s data foundation.
Follow KYVE’s journey as it takes a lead role in this movement, enabling everyone to easily access trustless data validated in a decentralized way via our data lake protocol, and removing data doubt and heavy lifting for builders, node runners, and more.
For more information on KYVE, including how to get started and technical resources, visit the KYVE documentation.
Join KYVE’s community: Twitter | Discord | Telegram
Blog Author: Margaux, KYVE Head of Marketing