Ethereum Future Blueprint: The Purge, shedding complexity and historical burden

Vitalik: The Possible Future of Ethereum, The Purge

One of the challenges facing Ethereum is that, by default, the bloat and complexity of any blockchain protocol tend to increase over time. This occurs in two areas:

Historical Data: Any transaction made and any account created at any point in history needs to be stored permanently by all clients, and downloaded by any new client performing a full sync with the network. This causes client load and sync time to keep increasing over time, even if the chain's capacity remains unchanged.

Protocol Functionality: Adding new features is much easier than removing old ones, leading to increased code complexity over time.

To ensure that Ethereum can sustain itself in the long term, we need to apply strong counter-pressure to these two trends, gradually reducing complexity and bloat over time. However, at the same time, we need to preserve one of the key attributes that makes blockchains great: persistence. You can put an NFT, a love letter in transaction calldata, or a smart contract holding 1 million dollars on the chain, enter a cave for ten years, and come out to find it still there waiting for you to read and interact with. For DApps to feel comfortable fully decentralizing and removing their upgrade keys, they need to be confident that their dependencies will not upgrade in ways that break them - especially L1 itself.

If we are determined to strike a balance between these two demands, minimizing or reversing bloat, complexity, and decay while maintaining continuity, it is absolutely possible. Organisms can do this: while most organisms age over time, a lucky few do not. Even social systems can have very long lifespans. In some cases, Ethereum has already succeeded: proof of work is gone, the SELFDESTRUCT opcode is mostly gone, and beacon chain nodes already store only up to six months of old data. Finding this path for Ethereum in a more general way, and moving toward a long-term stable end result, is the ultimate challenge for Ethereum's long-term scalability, technical sustainability, and even security.

The Purge: main goals

Reduce client storage requirements by minimizing or eliminating the need for each node to permanently store all history, and perhaps eventually even the state.

Reduce protocol complexity by eliminating unnecessary features.

Article Directory:

History expiry

State expiry

Feature cleanup

History expiry

What problem does it solve?

As of the time of writing, a fully synced Ethereum node requires approximately 1.1 TB of disk space for the execution client, plus several hundred GB more for the consensus client. The vast majority of this is history: data about historical blocks, transactions, and receipts, most of which are several years old. This means that even if the gas limit does not increase at all, the size of a node will continue to grow by several hundred GB each year.

What is it and how does it work?

A key simplifying feature of the history storage problem is that, because each block is linked to the previous block via a hash (and other structures), reaching consensus on the present is sufficient to reach consensus on the past. As long as the network reaches consensus on the latest block, any historical block, transaction, or state (account balance, nonce, code, storage) can be provided by any single participant along with a Merkle proof, and that proof allows anyone else to verify its correctness. While consensus is an N/2-of-N trust model, history is a 1-of-N trust model.
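The verification step described above can be sketched in a few lines. This is a minimal illustration only: it assumes a plain binary Merkle tree over SHA-256, whereas Ethereum itself uses different structures and hash functions, and the names `merkle_root_and_proof` and `verify` are hypothetical. The point it shows is that a verifier who trusts only the root can check data handed over by any single, untrusted participant.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Hash helper (SHA-256 is an assumption made for this sketch)."""
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(leaves, index):
    """Build a binary Merkle tree and return (root, proof) for leaves[index].

    The proof is a list of (sibling_hash, sibling_is_left) pairs, bottom-up.
    """
    layer = [h(leaf) for leaf in leaves]
    proof = []
    while len(layer) > 1:
        if len(layer) % 2 == 1:
            layer.append(layer[-1])  # pad odd layers by duplicating the last node
        sibling = index ^ 1
        proof.append((layer[sibling], sibling < index))
        layer = [h(layer[i] + layer[i + 1]) for i in range(0, len(layer), 2)]
        index //= 2
    return layer[0], proof


def verify(leaf: bytes, proof, root: bytes) -> bool:
    """Recompute the path from the leaf to the root and compare with the trusted root."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


# The root stands in for what the network has reached consensus on.
history = [b"tx0", b"tx1", b"tx2", b"tx3", b"tx4"]
root, proof = merkle_root_and_proof(history, 2)

print(verify(b"tx2", proof, root))     # True: genuine data from any one peer checks out
print(verify(b"forged", proof, root))  # False: altered data is rejected
```

Because the check depends only on the root, it does not matter who serves the data, which is why one honest holder is enough.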

This provides us with many options for how to store history. A natural choice is a network where each node stores only a small part of the data. This is how torrent networks have operated for decades: while the network as a whole stores and distributes millions of files, each participant only stores and distributes a few of them. Perhaps counterintuitively, this approach does not even necessarily reduce the robustness of the data. If we can establish a network with 100,000 nodes, where each node stores a random 10% of the history, then each piece of data will be replicated 10,000 times - the same replication factor as a network of 10,000 nodes, where each node stores everything.
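The arithmetic behind this comparison is easy to check. The helper below is a hypothetical illustration, not part of any client; it computes the expected number of copies of each piece of data when every node stores a random share of the history.

```python
def replication_factor(num_nodes: int, percent_stored: int) -> int:
    """Expected copies of each piece of data if every node stores a random
    `percent_stored` percent of the history."""
    return num_nodes * percent_stored // 100


sharded = replication_factor(100_000, 10)   # many nodes, each storing 10%
full = replication_factor(10_000, 100)      # fewer nodes, each storing everything

print(sharded, full)  # 10000 10000
```

Both configurations yield the same expected number of copies, while the per-node storage burden in the sharded case is one tenth of the full case.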

Ethereum has already begun to move away from the model where all nodes permanently store all history. Consensus blocks (i.e., the parts related to proof-of-stake consensus) are only stored for about 6 months. Blobs are only stored for about 18 days. EIP-4444 aims to introduce a one-year storage period for historical blocks and receipts. The long-term goal is to establish a unified period (possibly around 18 days) during which each node is responsible for storing everything, and then to have a peer-to-peer network made up of Ethereum nodes store older data in a distributed manner.

Erasure codes can be used to improve robustness while keeping the replication factor the same. In fact, blobs are already erasure-coded in order to support data availability sampling. The simplest solution is likely to reuse this erasure coding and also place execution and consensus block data into blobs.
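To make the idea of erasure coding concrete, here is a toy Reed-Solomon-style sketch. It is an assumption-laden illustration: the small prime field, the evaluation points 0..n-1, and the function names are chosen for readability and do not reflect the parameters of the actual blob encoding. The data is treated as evaluations of a polynomial, the polynomial is evaluated at extra points, and any k of the n resulting shares are enough to recover the original.

```python
P = 2**31 - 1  # small prime modulus, chosen only for illustration


def lagrange_eval(points, x):
    """Evaluate at x (mod P) the unique polynomial of degree < len(points)
    that passes through the given (xi, yi) points."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data, n):
    """Extend k data values into n shares; the first k shares equal the data."""
    points = list(enumerate(data))
    return [lagrange_eval(points, x) for x in range(n)]


def recover(shares, k):
    """Rebuild the k original values from any k shares given as {index: value}."""
    points = list(shares.items())[:k]
    return [lagrange_eval(points, x) for x in range(k)]


data = [5, 17, 42, 7]
coded = encode(data, 8)                          # 2x extension: 8 shares for 4 values
surviving = {i: coded[i] for i in (1, 4, 6, 7)}  # half of the shares are lost
print(recover(surviving, 4) == data)             # True
```

With a 2x extension like this, the data survives the loss of any half of the shares, which is the robustness gain the paragraph above refers to.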

What are some links to existing research?

EIP-4444;

Torrents and EIP-4444;

Portal Network;

Portal Network and EIP-4444;

Distributed storage and retrieval of SSZ objects in Portal;

How to increase gas limit (Paradigm).

What is left to do, and what are the tradeoffs?

The main remaining work involves building and integrating a concrete distributed solution for storing history: at least execution history, but ultimately also consensus data and blobs. The simplest options are (i) to introduce an existing torrent library, and (ii) an Ethereum-native solution called the Portal network. Once either of these is introduced, we can turn on EIP-4444. EIP-4444 itself does not require a hard fork, but it does require a new version of the network protocol. For this reason, there is value in enabling it for all clients at the same time; otherwise, clients risk malfunctioning because they connect to other nodes expecting to download the full history but do not actually get it.

The main tradeoff involves how hard we try to keep "ancient" historical data available. The simplest solution is to stop storing ancient history tomorrow and rely on existing archive nodes and various centralized providers for replication. This is easy, but it undermines Ethereum's status as a place to make permanent records. A more difficult but safer approach is to first build and integrate a torrent network that stores history in a distributed manner. Here, "how hard we try" has two dimensions:

(1) How hard do we try to make sure that a maximally large set of nodes really is storing all the data?

(2) How deeply do we integrate historical storage into the protocol?

A maximally paranoid approach to (1) would involve proof of custody: essentially requiring each proof-of-stake validator to store a certain proportion of the history, and regularly checking cryptographically that they are doing so. A more moderate approach is to set a voluntary standard for the percentage of history that each client stores.

For (2), the basic implementation only involves work that is already done today: Portal already stores ERA files containing the entire Ethereum history. A more thorough implementation would involve actually connecting this to the sync process, so that if someone wants to sync a full-history node or an archive node, they can do so by syncing directly from the Portal network, even if no other archive nodes are online.

How does it interact with other parts of the roadmap?

If we want to make it extremely easy to run or spin up a node, then reducing history storage requirements is arguably more important than statelessness: out of the 1.1 TB that a node needs, about 300 GB is state, and the remaining approximately 800 GB is history. The vision of an Ethereum node running on a smartwatch and taking only a few minutes to set up is achievable only if both statelessness and EIP-4444 are implemented.

Limiting history storage also makes it more viable for newer Ethereum node implementations to support only recent versions of the protocol, which makes them much simpler. For example, many lines of code can be safely removed now that all the empty storage slots created during the 2016 DoS attacks have been removed. Now that the switch to proof of stake is history, clients can safely remove all code related to proof of work.

State expiry

What problem does it solve?

Even if we eliminate the need for clients to store history, client storage requirements will continue to grow, by approximately 50 GB each year, because the state keeps growing: account balances and nonces, contract code, and contract storage. Users can pay a one-time fee and thereby impose a permanent burden on present and future Ethereum clients.

The state is harder to "expire" than history, because the EVM is fundamentally designed around the assumption that once a state object is created, it will always exist and can be read by any transaction at any time. If we introduce statelessness, some argue that the problem may not be that bad: only a specialized class of block builders would need to actually store state, while all other nodes (even those producing inclusion lists!) could operate statelessly. However, there is also the view that we do not want to rely too much on statelessness, and that ultimately we may want to make state expire in order to keep Ethereum decentralized.

What is it and how does it work?

Today, when you create a new state object (which can happen in one of three ways: (i) sending ETH to a new account, (ii) creating a new account with code, (iii) setting a previously untouched storage slot), that state object remains in the state forever. What we want instead is for objects to expire automatically over time. The key challenge is to do this in a way that achieves three goals:

Efficiency: No need for a large amount of additional computation to execute the expiration process.

User-friendliness: If someone enters a cave for five years and comes back, they should not lose access to their ETH, ERC20 tokens, NFTs, or CDP positions.

Developer-friendliness: Developers should not have to switch to a completely unfamiliar mental model. Additionally, applications that are ossified today and no longer updated should continue to work reasonably well.

It is easy to solve the problem if these goals do not have to be met. For example, you could have each state object also store an expiry date counter (which could be extended by burning ETH, possibly automatically whenever the object is read or written), and have a process that loops through the state to remove objects whose expiry date has passed. However, this introduces extra computation (and even extra storage requirements), and it definitely does not satisfy the user-friendliness requirement. Developers would also have a hard time reasoning about edge cases in which stored values sometimes reset to zero. Setting the expiry timer at the contract level technically makes life easier for developers, but it makes the economics harder: developers would have to think about how to "pass through" the ongoing storage costs to their users.
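The naive design described above can be sketched as follows. This is purely illustrative: the class and method names are hypothetical, the ETH-burning fee is left out, and expiry is measured in block numbers as a simplifying assumption. The sketch makes the drawbacks visible: every sweep walks the whole state, every object carries an extra field, and an owner who stays away too long finds their value reset to zero.

```python
from dataclasses import dataclass


@dataclass
class StateObject:
    value: int
    expires_at: int  # block number after which the object may be deleted


class NaiveExpiringState:
    """Toy model of per-object expiry counters (not a real client design)."""

    def __init__(self, ttl_blocks: int):
        self.ttl = ttl_blocks
        self.objects = {}

    def write(self, key: str, value: int, block: int) -> None:
        self.objects[key] = StateObject(value, block + self.ttl)

    def read(self, key: str, block: int) -> int:
        obj = self.objects.get(key)
        if obj is None:
            return 0  # expired and never-set objects both read as zero
        obj.expires_at = block + self.ttl  # touching an object extends its life
        return obj.value

    def sweep(self, block: int) -> int:
        """Loop over the entire state and delete expired objects (extra computation)."""
        expired = [k for k, o in self.objects.items() if o.expires_at < block]
        for k in expired:
            del self.objects[k]
        return len(expired)


state = NaiveExpiringState(ttl_blocks=100)
state.write("alice", 5, block=0)
print(state.read("alice", block=50))   # 5, and expiry is pushed out to block 150
print(state.sweep(block=151))          # 1: the object has expired and is removed
print(state.read("alice", block=152))  # 0: the owner who "went into a cave" lost the value
```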

These are problems that the Ethereum core development community has been working on for many years, including proposals such as "blockchain rent" and "regenesis". Ultimately, we combined the best parts of these proposals and converged on two categories of "known least bad solutions":

  • Partial state expiry solutions
  • Address-period-based state expiry proposals

Partial state expiry

Partial state expiry proposals all follow the same principle. We split the state into chunks. Everyone permanently stores the "top-level map".
