Former Arbitrum Technical Ambassador Explains Arbitrum's Component Structure (Part I)
Author: Benben Luo, former Arbitrum technical ambassador and Geek Web3 contributor
**This article is a technical interpretation of Arbitrum One by Benben Luo, former Arbitrum technical ambassador and former co-founder of Goplus Security, a smart contract automated auditing company.**
**Chinese-language articles and materials on Layer 2 lack a professional interpretation of Arbitrum, and even of OP Rollups in general, so this article tries to fill that gap by explaining how Arbitrum works. Because Arbitrum’s structure is complex, the full text still runs to more than 10,000 words even after simplifying as much as possible, so it is divided into two articles. Readers are encouraged to bookmark and share them as reference material.**
A brief description of the Rollup sequencer
The principle of rollup scaling can be summarized in two points:
Cost optimization: Move most of the computation and storage off L1 to an off-chain environment, i.e., L2. An L2 is mostly a chain running on a single server, i.e., a sequencer/operator.
The sequencer looks much like a centralized server: it gives up the “decentralization” corner of the blockchain trilemma in exchange for TPS and cost advantages. Users can have L2 process their transactions instead of Ethereum, at a much lower cost than transacting on Ethereum directly.
Security: The transaction content on L2 and the post-transaction state are synchronized to Ethereum L1, where a contract verifies the validity of the state transition. The L2 history is also retained on Ethereum, so even if the sequencer goes down permanently, others can restore the entire L2 state from the records on Ethereum.
Fundamentally, the security of Rollups rests on Ethereum. If the sequencer does not know an account’s private key, it cannot initiate transactions in that account’s name or tamper with the account’s asset balance (and even if it tried, the tampering would quickly be detected).
Although the sequencer, as the backbone of the system, is centralized, in relatively mature Rollup schemes a centralized sequencer can only misbehave in “soft” ways, such as censoring transactions or maliciously going offline. An ideal Rollup scheme has mechanisms to curb even these (censorship-resistance mechanisms such as forced withdrawals or ordering proofs).
There are two kinds of state verification that prevent the Rollup sequencer from acting maliciously: Fraud Proofs and Validity Proofs. Rollups that use fraud proofs are called OP Rollups (Optimistic Rollups, OPRs); for historical reasons, Rollups that use validity proofs are usually called ZK Rollups (Zero-Knowledge Proof Rollups, ZKRs) rather than Validity Rollups.
Arbitrum One is a typical OPR: the contracts it deploys on L1 do not actively validate the submitted data and optimistically assume it is correct. If the submitted data contains an error, an L2 validator node initiates a challenge.
OPR therefore carries a trust assumption: at any given time there is at least one honest L2 validator node. A ZKR’s contract, by contrast, uses cryptography to verify the data submitted by the sequencer actively yet cheaply.
In this article we take an in-depth look at Arbitrum One, the leading optimistic rollup project, covering every part of the system. After reading it carefully, you should have a deep understanding of Arbitrum and of optimistic rollups (OPRs).
Arbitrum’s core components and workflows
Core Contracts:
Arbitrum’s most important contracts include SequencerInbox, DelayedInbox, L1 Gateways, L2 Gateways, Outbox, RollupCore, Bridge, etc. More on that later.
Sequencer:
The sequencer receives and orders user transactions, computes their results, and quickly returns a receipt to the user (usually in under 1s). Users can often see their transactions on the L2 chain within seconds, much like on a Web2 platform.
At the same time, the sequencer broadcasts the latest L2 blocks in real time off the Ethereum chain, and any Layer 2 node can receive them asynchronously. At this point, however, these L2 blocks are not finalized and can still be rolled back by the sequencer.
Every few minutes, the sequencer compresses the ordered L2 transaction data, aggregates it into batches, and submits them to the SequencerInbox contract on Layer 1 to ensure data availability and keep the Rollup protocol running. In general, L2 data that has been submitted to Layer 1 cannot be rolled back and can be treated as final.
From this process we can conclude that Layer 2 has its own network of nodes, but these nodes are few and generally do not run the kind of consensus protocol a public chain uses. On its own, the network’s security is therefore very poor, and it must rely on Ethereum to guarantee reliable data publication and valid state transitions.
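The batching step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Tx` fields, the JSON encoding, and the use of `zlib` are stand-ins chosen for this example and are not Arbitrum’s actual wire format or compression scheme.

```python
import json
import zlib
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Tx:
    # Hypothetical minimal transaction shape, for illustration only.
    sender: str
    to: str
    value: int
    nonce: int


def build_batch(ordered_txs: list[Tx]) -> bytes:
    """Serialize already-ordered transactions and compress them into one batch."""
    raw = json.dumps([asdict(tx) for tx in ordered_txs], separators=(",", ":"))
    return zlib.compress(raw.encode(), level=9)


def read_batch(batch: bytes) -> list[Tx]:
    """Anyone holding the published batch can recover the exact ordered inputs."""
    return [Tx(**fields) for fields in json.loads(zlib.decompress(batch))]
```

The point of the round trip is data availability: once the compressed batch is on L1, any node can decode it and obtain the same ordered transaction list the sequencer used.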
Arbitrum Rollup Protocol:
A series of contracts that define the structure of an RBlock (the block of the Rollup chain), how the chain is extended, how RBlocks are published, and how the challenge process works. Note that the Rollup chain mentioned here is not the Layer 2 ledger as people usually understand it, but an abstract “chain-like data structure” that Arbitrum One set up in order to implement its fraud proof mechanism.
An RBlock can contain the results of multiple L2 blocks, and its data also differs considerably from that of an L2 block. The RBlock data itself is stored in the RollupCore series of contracts. If an RBlock is problematic, a Validator challenges whoever committed it.
Validator:
Arbitrum’s validator nodes are actually a special subset of Layer 2 full nodes, and access is currently restricted by an allowlist.
The Validator creates a new RBlock (Rollup Block, also known as assertion) based on the batch of transactions submitted by the sequencer to the SequencerInbox contract, and monitors the state of the current Rollup chain to challenge the erroneous data submitted by the sequencer.
Active validators must first stake assets on the Ethereum chain, which is why they are sometimes called stakers. Layer 2 nodes that do not stake can still monitor the Rollup as it runs and send anomaly alerts to users, but they cannot intervene directly on the Ethereum chain against erroneous data submitted by the sequencer.
Challenge:
The basic steps can be summarized as multi-round interactive segmentation followed by a single-step proof. In the segmentation phase, the two parties to the challenge take turns subdividing the disputed transaction data over multiple rounds, until the disputed opcode instruction has been isolated and can be verified. Arbitrum’s developers consider this “multi-round segmentation, single-step proof” paradigm the most gas-efficient way to implement fraud proofs. Every step is controlled by contracts, so neither party can cheat.
Challenge Period:
Because an OP Rollup is optimistic, the contract does not actively check an RBlock after it is submitted on-chain; instead it leaves a time window in which validators can dispute it. This window is the challenge period, which is 1 week on Arbitrum One mainnet. Once the challenge period ends, the RBlock is finalized, and the L2-to-L1 messages it contains (such as a withdrawal made through the official bridge) can be released.
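As a rough sketch of the rule just described, the check below treats an RBlock as confirmable only when it is unchallenged and the window has elapsed. The `RBlock` fields and the use of wall-clock seconds are assumptions made for this example; the real contracts may measure the window differently.

```python
from dataclasses import dataclass

CHALLENGE_PERIOD_SECONDS = 7 * 24 * 60 * 60  # 1 week, as stated in the text


@dataclass
class RBlock:
    # Hypothetical fields, for illustration only.
    state_root: str
    created_at: int          # timestamp at which the assertion was posted
    challenged: bool = False


def can_confirm(rblock: RBlock, now: int) -> bool:
    """An RBlock may be finalized only if nobody disputed it during the window."""
    window_elapsed = now - rblock.created_at >= CHALLENGE_PERIOD_SECONDS
    return window_elapsed and not rblock.challenged


def can_release_withdrawal(rblock: RBlock, now: int) -> bool:
    """L2-to-L1 messages in the RBlock are released only after confirmation."""
    return can_confirm(rblock, now)
```

This is why a withdrawal through the official bridge has to wait out the challenge period: the L1 contract has no other evidence that the asserted state is correct.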
ArbOS, Geth, WAVM:
The virtual machine used by Arbitrum is called AVM, which consists of two parts: Geth and ArbOS. Geth is Ethereum’s most commonly used client software, and Arbitrum has made lightweight modifications to it. ArbOS is responsible for all L2-related special functions, such as network resource management, generating L2 blocks, working with the EVM, etc. We think of the combination of the two as a Native AVM, which is the virtual machine used by Arbitrum. WAVM is the result of compiling AVM’s code into Wasm. In the Arbitrum challenge process, the final “single-step proof” verifies the WAVM instructions.
Here, we can illustrate the relationship and workflow between the above components in the following diagram:
L2 transaction lifecycle
Here’s how an L2 transaction works:
The user sends a transaction to the sequencer.
The sequencer first checks data such as the digital signature of each pending transaction, discards invalid transactions, then orders the rest and computes the results.
The sequencer sends the transaction receipt to the user (usually very quickly). This is only the sequencer’s off-chain “preprocessing”, however: the transaction has Soft Finality and is not yet reliable. Users who trust the sequencer (most users) can optimistically assume that the transaction is complete and will not be rolled back.
The sequencer heavily compresses the raw data of the preprocessed transactions and packs it into a batch.
At intervals (which depend on factors such as data volume and Ethereum congestion), the sequencer publishes transaction batches to the SequencerInbox contract on L1. At this point, the transaction can be considered to have Hard Finality.
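The lifecycle above can be pictured as a small state machine. The sketch below is a toy model: `ToySequencer`, its method names, and the boolean `signature_ok` flag are hypothetical and only mirror the five steps in the list.

```python
from enum import Enum
from typing import Optional


class Finality(Enum):
    SOFT = "soft"  # receipt issued by the sequencer only
    HARD = "hard"  # the batch containing the transaction is published on L1


class ToySequencer:
    def __init__(self) -> None:
        self.pending: list[str] = []            # tx ids waiting for the next batch
        self.status: dict[str, Finality] = {}
        self.l1_batches: list[list[str]] = []   # stand-in for SequencerInbox

    def submit(self, tx_id: str, signature_ok: bool) -> Optional[Finality]:
        """Steps 1-3: validate, order, and return a soft-finality receipt."""
        if not signature_ok:
            return None  # invalid transactions are discarded
        self.pending.append(tx_id)
        self.status[tx_id] = Finality.SOFT
        return Finality.SOFT

    def post_batch(self) -> None:
        """Steps 4-5: pack pending transactions into a batch and publish it."""
        if not self.pending:
            return
        self.l1_batches.append(self.pending)
        for tx_id in self.pending:
            self.status[tx_id] = Finality.HARD
        self.pending = []
```

Between `submit` and `post_batch` the user is relying on the sequencer’s word; only after the batch lands on L1 does the ordering stop depending on the sequencer’s honesty.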
Sequencer Inbox contract
This contract receives the transaction batches submitted by the sequencer in order to ensure data availability. Looking deeper, the batch data in SequencerInbox is a complete record of Layer 2’s transaction inputs: even if the sequencer goes down permanently, anyone can restore the current Layer 2 state from the batch records and replace the failed or absconding (rug-pull) sequencer.
To use a physical analogy, the L2 we see is just the shadow cast by the batches in SequencerInbox, and the light source is the state transition function (STF). Because the STF does not change easily, the shape of the shadow is determined solely by the batches, which act as the object.
The SequencerInbox contract is also known as the Fastbox (fast inbox): the sequencer submits preprocessed transactions to it, and only the sequencer can do so. Its counterpart is the slow inbox, the Delayed Inbox, whose role is described later in the process.
Validators continuously watch the SequencerInbox contract. Whenever the sequencer publishes a batch to it, the contract emits an on-chain event; on seeing this event, a Validator downloads the batch data, executes it locally, and then publishes an RBlock to the Rollup protocol contract on the Ethereum chain.
Arbitrum’s bridge contract has a parameter called the accumulator, which records the number of newly submitted L2 batches, as well as the number of newly received transactions and information on the slow inbox.
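The projection analogy amounts to replaying a deterministic function over the published inputs. The sketch below assumes a toy STF that only handles balance transfers; the transaction shape and the rule that invalid transfers are skipped are illustrative assumptions, not the real ArbOS logic.

```python
def stf(state: dict[str, int], tx: dict) -> dict[str, int]:
    """Toy state transition function: apply one transfer, skip it if invalid."""
    sender, to, value = tx["from"], tx["to"], tx["value"]
    if value < 0 or state.get(sender, 0) < value:
        return state  # invalid input leaves the state unchanged
    new_state = dict(state)
    new_state[sender] = new_state.get(sender, 0) - value
    new_state[to] = new_state.get(to, 0) + value
    return new_state


def replay(genesis: dict[str, int], batches: list[list[dict]]) -> dict[str, int]:
    """Rebuild the L2 state from nothing but the genesis state and the batches."""
    state = dict(genesis)
    for batch in batches:
        for tx in batch:
            state = stf(state, tx)
    return state
```

Because `stf` is deterministic, two independent nodes replaying the same batches always reach the same state, which is what lets anyone replace a failed sequencer.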
The SequencerInbox contract has two main functions:
addSequencerL2BatchFromOrigin(), which the sequencer calls every time it submits batch data to the SequencerInbox contract.
forceInclusion(), which anyone can call and which is used to implement censorship-resistant transactions. How this function works is explained in more detail later, in the discussion of the Delayed Inbox contract.
Both functions call bridge.enqueueSequencerMessage() to update the accumulator in the bridge contract.
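One way to picture an accumulator that is updated on every submitted batch is a running hash chain, sketched below. Everything here is an assumption for illustration: the class `ToyBridge`, the SHA-256 hash, and the exact fields being hashed are stand-ins, and the text does not spell out how the real bridge contract computes its value.

```python
import hashlib

ZERO = b"\x00" * 32


def next_accumulator(prev_acc: bytes, batch_data: bytes, delayed_acc: bytes) -> bytes:
    """Fold a new batch (and the slow-inbox position) into the running value."""
    batch_hash = hashlib.sha256(batch_data).digest()
    return hashlib.sha256(prev_acc + batch_hash + delayed_acc).digest()


class ToyBridge:
    def __init__(self) -> None:
        self.sequencer_accs: list[bytes] = []  # one accumulator per batch

    def enqueue_sequencer_message(self, batch_data: bytes, delayed_acc: bytes = ZERO) -> int:
        """Append a batch and return its sequence number."""
        prev = self.sequencer_accs[-1] if self.sequencer_accs else ZERO
        self.sequencer_accs.append(next_accumulator(prev, batch_data, delayed_acc))
        return len(self.sequencer_accs) - 1
```

With this construction the latest accumulator commits to every earlier batch and to their order, and the length of the list gives the number of batches submitted so far.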
Gas pricing
Obviously, L2 transactions cannot be free: the fee has to deter DoS attacks and cover both the running costs of the L2 sequencer itself and the overhead of submitting data to L1. When a user initiates a transaction on the Layer 2 network, the gas fee is made up as follows:
The data publishing cost for occupying Layer 1 resources. It comes mainly from the batches submitted by the sequencer (each batch contains many user transactions), and it is ultimately shared among the transaction initiators. The pricing algorithm for this data publishing fee is dynamic: the sequencer prices it based on its recent profit and loss, the batch size, and the current Ethereum gas price.
The cost for occupying Layer 2 resources. A maximum amount of gas processed per second is set at a level that keeps the system running stably (currently 7 million for Arbitrum One). The reference gas prices for both L1 and L2 are tracked and adjusted by ArbOS; the formulas are not covered here.
Although the actual gas price calculation is fairly complicated, users do not need to be aware of these details, and Rollup transaction fees are clearly much cheaper than on Ethereum mainnet.
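The two-part fee structure can be written down as simple arithmetic. This sketch deliberately ignores the dynamic pricing mentioned above: the function names, the per-byte L1 price, and the rule of splitting a batch’s cost in proportion to transaction size are all assumptions made for illustration, not ArbOS’s actual formulas.

```python
def l2_tx_fee_wei(
    l2_gas_used: int,
    l2_gas_price_wei: int,
    tx_data_bytes: int,
    l1_price_per_byte_wei: int,
) -> int:
    """Total fee = L2 execution component + L1 data publishing component."""
    l2_execution_fee = l2_gas_used * l2_gas_price_wei
    l1_data_fee = tx_data_bytes * l1_price_per_byte_wei
    return l2_execution_fee + l1_data_fee


def split_batch_cost(batch_cost_wei: int, tx_sizes: dict[str, int]) -> dict[str, int]:
    """Apportion one batch's L1 posting cost across its transactions by size."""
    total_bytes = sum(tx_sizes.values())
    return {tx: batch_cost_wei * size // total_bytes for tx, size in tx_sizes.items()}
```

Since a single L1 submission is shared by every transaction in the batch, each user’s share of the L1 cost shrinks as batches carry more transactions.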
Optimistic fraud proof
To recap, L2 is really just a projection of the transaction input batches that the sequencer submits to the Fastbox, i.e.:
Transaction Inputs -> STF -> State Outputs. The input is fixed and the STF is constant, so the output is fixed as well. The fraud proof system and the Arbitrum Rollup protocol together publish the state root of that output to L1 in the form of an RBlock (aka assertion) and prove it optimistically.
L1 holds both the input data published by the sequencer and the output state published by the validators. Let’s think this through: is it really necessary to publish the Layer 2 state on-chain?
Because the input fully determines the output and the input data is publicly visible, submitting the output state seems redundant. But this view overlooks the fact that the L1 and L2 systems need to settle state between them: a withdrawal from L2 to L1 requires a proof of state.
When building a rollup, one of the core ideas is to move most computation and storage to L2 to avoid the high cost of L1. This means L1 does not know the L2 state: it only helps the L2 sequencer publish the input data of all transactions and is not responsible for computing the L2 state.
A withdrawal essentially unlocks the corresponding funds from a contract on L1, based on the cross-chain message issued by L2, and transfers them to the user’s L1 account or performs some other action.
At this point, the Layer 1 contract will ask: what is your state on Layer 2, and how can you prove that you really own the assets you claim to be withdrawing? The user then has to supply the corresponding Merkle Proof and so on.
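To make the Merkle Proof step concrete, here is a minimal sketch of building and checking an inclusion proof against a committed root. It assumes a plain binary tree over SHA-256 with the last node duplicated on odd levels and a non-empty leaf list; the real contracts use their own hash function and tree layout, so treat this as a generic illustration only.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _next_level(level: list[bytes]) -> list[bytes]:
    return [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves: list[bytes]) -> bytes:
    """Root committed on L1; `leaves` must be non-empty."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = _next_level(level)
    return level[0]


def merkle_proof(leaves: list[bytes], index: int) -> list[bytes]:
    """Sibling hashes from the leaf at `index` up to the root."""
    level, proof = [h(leaf) for leaf in leaves], []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        proof.append(level[index ^ 1])  # sibling of the current node
        level = _next_level(level)
        index //= 2
    return proof


def verify(leaf: bytes, index: int, proof: list[bytes], root: bytes) -> bool:
    """What the L1 side checks: does this leaf hash up to the known root?"""
    node = h(leaf)
    for sibling in proof:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root
```

The verifier needs only the root and a logarithmic number of sibling hashes, which is why L1 can check a withdrawal claim without knowing the rest of the L2 state.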
So, if we build a rollup with no withdrawal function, it is theoretically possible not to synchronize the state to L1, and there is no need for a proof-of-state system such as fraud proof (although it may cause other problems). But in practice, this is clearly not feasible.
In a so-called optimistic proof, the contract does not check whether the output state submitted to L1 is correct and optimistically assumes that everything is fine. The optimistic proof system assumes that there is at least one honest validator at any given time; if an erroneous state appears, it is challenged through a fraud proof.
The advantage of this design is that the contract does not need to actively validate every RBlock published to L1, which avoids wasting gas. In fact, it would be impractical for an OPR to verify every assertion: each RBlock covers one or more L2 blocks, and re-executing every transaction on L1 would be no different from executing the L2 transactions directly on L1, which defeats the purpose of Layer 2 scaling.
ZKR does not have this problem, because a ZK proof is succinct: the contract only needs to verify a small proof and does not have to execute the many transactions behind it. ZKR is therefore not optimistic; every published state is mathematically verified by the Verifier contract.
Although fraud proofs cannot be as succinct as zero-knowledge proofs, Arbitrum uses a turn-based “multi-round segmentation, single-step proof” interaction, so in the end only a single virtual machine opcode has to be proved, which keeps the cost relatively low.
Rollup Protocol
Let’s first look at how the Rollup protocol, the entry point for initiating challenges and submitting proofs, works.
The core contract of the Rollup protocol is RollupProxy.sol. To keep the data structure consistent, it uses a rare dual-proxy structure in which one proxy corresponds to two implementations, RollupUserLogic.sol and RollupAdminLogic.sol. Tools such as Scan cannot parse this structure well.
In addition, there is the ChallengeManager.sol contract to manage the challenge, and the OneStepProver series of contracts to determine fraud proofs.
RollupProxy records a series of RBlocks (aka assertions) submitted by different validators. They are the squares in the following diagram: green for confirmed, blue for unconfirmed, and yellow for falsified.
An RBlock contains the final state after executing the one or more L2 blocks produced since the previous RBlock. Together these RBlocks form a formal Rollup Chain (note that this is not the L2 ledger itself). In the optimistic scenario this Rollup Chain should never fork, because a fork means that validators have submitted conflicting Rollup Blocks.
To make or agree with an assertion, the validator needs to first stake a certain amount of ETH for the assertion and become a staker. In this way, in the event of a challenge/fraud proof, the loser’s collateral will be slashed, which is the economic basis for guaranteeing the honest behavior of validators.
The blue block 111 in the lower right corner of the image will eventually be falsified, because its parent, block 104, is wrong (yellow).
In addition, validator A proposes Rollup Block 106, and B disagrees and challenges it.
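The rule that a block built on a wrong parent is itself doomed can be sketched as a walk up the parent pointers. Block numbers 104, 106 and 111 come from the description above; block 100 as a common ancestor, and the parent assigned to 106, are hypothetical values added only so the example is complete.

```python
from typing import Optional


def is_rejected(
    block_id: int,
    parent: dict[int, Optional[int]],
    falsified: set[int],
) -> bool:
    """A block is rejected if it, or any of its ancestors, has been falsified."""
    current: Optional[int] = block_id
    while current is not None:
        if current in falsified:
            return True
        current = parent.get(current)
    return False


# Hypothetical Rollup Chain fragment: 100 <- 104 <- 111 and 100 <- 106.
PARENT = {100: None, 104: 100, 111: 104, 106: 100}
FALSIFIED = {104}
```

With this data, `is_rejected(111, PARENT, FALSIFIED)` holds because 111 inherits the error from 104, while block 106 on the other branch is unaffected.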
After B initiates a challenge, the ChallengeManager contract is responsible for validating the breakdown of the challenge steps:
Segmentation is a process in which the two parties take turns: one party splits the historical data contained in a Rollup Block into segments, and the other points out which segment is problematic. It resembles a bisection (in practice, an N/K split) that gradually narrows the range.
This continues until the disputed transaction and its result have been located, and is then subdivided further down to the single disputed machine instruction within that transaction.
The ChallengeManager contract only checks that the “data segments” produced by subdividing the original data are valid.
Once the challenger and the challenged party have located the machine instruction in dispute, the challenger calls oneStepProveExecution() and submits a single-step fraud proof showing that the execution result of that machine instruction is wrong.
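The narrowing process can be sketched as a search over two execution traces that agree at the start and disagree at the end. For simplicity this example splits in two each round (K = 2) and represents each intermediate state as an integer; both choices, and the function name, are assumptions for illustration rather than the contract’s actual data layout.

```python
from typing import Sequence


def find_disputed_step(asserted: Sequence[int], challenger: Sequence[int]) -> int:
    """Return i such that both traces agree at state i but differ at state i + 1.

    Each trace lists the machine state after 0, 1, 2, ... executed steps.
    """
    if len(asserted) != len(challenger) or len(asserted) < 2:
        raise ValueError("traces must have the same length and at least one step")
    if asserted[0] != challenger[0] or asserted[-1] == challenger[-1]:
        raise ValueError("parties must agree on the start and disagree on the end")

    lo, hi = 0, len(asserted) - 1  # invariant: agree at lo, disagree at hi
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if asserted[mid] == challenger[mid]:
            lo = mid   # the dispute lies in the later half
        else:
            hi = mid   # the dispute lies in the earlier half
    return lo
```

Each round halves the disputed range, so a trace of n steps is reduced to a single step in about log2(n) rounds, and only that one step has to be checked on-chain.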
One-step proof
Single-step proofs are at the heart of fraud proofs throughout Arbitrum. Let’s look at what exactly a single-step proof proves.
This requires an understanding of WAVM, the Wasm Arbitrum Virtual Machine, a virtual machine compiled from the ArbOS module together with the core modules of Geth (the Ethereum client). Because L2 differs from L1 in many ways, the original Geth core had to be lightly modified so that it works together with ArbOS.
State transitions on L2 are therefore the joint work of ArbOS and the Geth core.
Arbitrum’s node clients (sequencer, validator, full node, etc.) compile this ArbOS + Geth core program into native machine code (for x86, ARM, PC, Mac, etc.) that the node host can execute directly.
If the compilation target is changed to Wasm, the result is the WAVM that validators use to generate fraud proofs; the contract that verifies single-step proofs also emulates the WAVM virtual machine.
The main reason for choosing Wasm is that the contract verifying the single-step fraud proof has to use an Ethereum smart contract to simulate a virtual machine that can handle a given instruction set, and Wasm is easy to implement in a contract.
However, Wasm runs slightly slower than native machine code, so Arbitrum’s nodes and contracts only use WAVM when fraud proofs are generated and verified.
After the preceding rounds of interactive segmentation, what the single-step proof ultimately proves is a single instruction in the WAVM instruction set.
In the contract code, OneStepProofEntry first determines which category the opcode of the instruction to be proved belongs to, then calls the corresponding prover (Mem, Math, etc.) and passes the single-step instruction to that prover contract.
The resulting afterHash is returned to ChallengeManager. If it is inconsistent with the hash recorded on the Rollup Block, the challenge succeeds. If it is consistent, the execution result recorded on the Rollup Block is fine and the challenge fails.
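The dispatch-then-compare flow can be mimicked with a toy stack machine. The opcodes, the `Machine` layout, and the way the state is hashed are all invented for this sketch; only the overall shape (pick a prover by opcode category, execute one instruction, compare the resulting hash with the recorded one) follows the description above.

```python
import copy
import hashlib
from dataclasses import dataclass, field


@dataclass
class Machine:
    # Hypothetical machine state, far simpler than a real WAVM state.
    stack: list[int] = field(default_factory=list)
    memory: dict[int, int] = field(default_factory=dict)
    pc: int = 0

    def hash(self) -> str:
        payload = repr((self.stack, sorted(self.memory.items()), self.pc))
        return hashlib.sha256(payload.encode()).hexdigest()


def prove_math(m: Machine, opcode: str, arg: int) -> None:
    if opcode == "push":
        m.stack.append(arg)
    elif opcode == "add":
        b, a = m.stack.pop(), m.stack.pop()
        m.stack.append(a + b)


def prove_mem(m: Machine, opcode: str, arg: int) -> None:
    if opcode == "store":
        m.memory[arg] = m.stack.pop()
    elif opcode == "load":
        m.stack.append(m.memory.get(arg, 0))


# Opcode category -> prover, mirroring the dispatch step.
PROVERS = {"push": prove_math, "add": prove_math, "store": prove_mem, "load": prove_mem}


def one_step_after_hash(before: Machine, opcode: str, arg: int = 0) -> str:
    """Execute exactly one instruction on a copy and return the after-state hash."""
    machine = copy.deepcopy(before)
    PROVERS[opcode](machine, opcode, arg)
    machine.pc += 1
    return machine.hash()


def challenge_succeeds(before: Machine, opcode: str, arg: int, recorded_after_hash: str) -> bool:
    """The challenge wins only if re-execution contradicts the recorded hash."""
    return one_step_after_hash(before, opcode, arg) != recorded_after_hash
```

For example, starting from a stack of `[2, 3]`, an honest `add` yields the hash of a machine with stack `[5]` and `pc` 1, so a recorded hash for any other result makes `challenge_succeeds` return `True`.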
In the next article, we will analyze the contract modules that handle cross-chain messaging and bridging between Arbitrum (and Layer 2 in general) and Layer 1, and further clarify how a true Layer 2 should achieve censorship resistance.