Why is the Optimistic Rollup challenge period 7 days?

Have you ever tried using the "standard" bridge of an Optimistic Rollup like Optimism to withdraw assets back to Ethereum? If so, then you're probably aware that this default withdrawal process takes a full 7 days. Fast bridges have popped up to speed up the process of moving assets between chains (and they actually account for the majority of bridge activity now), but the "standard" bridge still takes quite a while. None of this is new, and if you're reading this post then you probably knew it already.

But isn't it weird that this "slow" withdrawal process takes exactly 7 days on most Optimistic Rollup protocols today? Why is this period exactly 7 days? Why not 3 days? Or 10 days?

The answer to this question is significantly more interesting than you might imagine. It touches on all sorts of weird crypto things like cryptoeconomics, rational behavior, and the power of making everyone in the Ethereum ecosystem mad at the same time. Let's dive in!

But first, why any delay at all?

Before we can dig into the length of the withdrawal period, we first need to establish exactly why we need a withdrawal period at all. Why can't we just let users withdraw their funds right away?

The withdrawal delay is a fundamental part of the Optimistic Rollup. I'll try to keep this section relatively concise because many other people have already written about the nature of this withdrawal delay. But in a nutshell, the idea is that a withdrawal starts when a user makes a claim to Ethereum about the state of the Optimistic Rollup. For example, this claim might be "I burned 20 tokens on Optimism, so let me withdraw 20 tokens on Ethereum."

Since the whole point of the Optimistic Rollup is that the L1 isn't actually executing the L2 chain, the L1 doesn't know if this claim is valid or not. ZK Rollups solve this problem by giving the L1 a cryptographic proof that a given claim is valid. Optimistic Rollups solve this problem by requiring that claims pass through a challenge process before they can be considered valid. Each claim must wait out a challenge period during which a challenger can state that the claim is invalid. If someone challenges a claim, then an on-chain game begins that determines whether or not the claim is actually valid.
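To make the mechanics a bit more concrete, here's a minimal sketch of the claim lifecycle in Python. The `Claim` class, its fields, and its method names are all made up for illustration, and the dispute game itself is left out entirely: a challenged claim here is simply blocked from finalizing, whereas a real protocol would resolve it through the on-chain game.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 24 * 60 * 60


@dataclass
class Claim:
    """A hypothetical withdrawal claim posted to L1 (illustrative only)."""
    description: str
    submitted_at: int          # L1 timestamp, in seconds, when the claim was posted
    challenged: bool = False   # set once someone disputes the claim in time

    def challenge(self, now: int, challenge_period: int) -> bool:
        """Record a challenge if the window is still open; return whether it landed."""
        if now < self.submitted_at + challenge_period:
            self.challenged = True
            return True
        return False

    def can_finalize(self, now: int, challenge_period: int) -> bool:
        """An unchallenged claim is treated as valid once the window has elapsed."""
        return not self.challenged and now >= self.submitted_at + challenge_period


period = 7 * SECONDS_PER_DAY
claim = Claim("I burned 20 tokens on Optimism", submitted_at=0)
print(claim.can_finalize(now=6 * SECONDS_PER_DAY, challenge_period=period))  # False
print(claim.can_finalize(now=7 * SECONDS_PER_DAY, challenge_period=period))  # True
```

Note that the L1 never checks whether the claim is true. It only checks whether anyone objected in time, which is why the length of the window matters so much.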

Because it can take time for someone to detect an invalid claim and submit the challenge, it's inevitable that we need the challenge period to be greater than zero. After all, if the duration of the challenge period were zero seconds then there would be no chance to submit a challenge. Our question then becomes: how long should the challenge period be?

Some basic constraints

Let's put down some basic constraints. Modern Optimistic Rollup challenge games, at least in their intended end state, essentially take the form of a back-and-forth between the user who made a claim and the user who's challenging that claim (in practice, these protocols are typically designed so that anyone is allowed to participate on either "team" but let's keep this simple for now). For the sake of example, let's assume there are approximately 10 back-and-forth steps during the whole process (the exact number varies but isn't important here).

If both parties are really, really fast then you're looking at a minimum of 10 Ethereum blocks (2 minutes at 12 seconds per block) for the entire challenge process to play out. Of course, users aren't perfectly fast like this, so you likely want to add some padding of at least 10x that baseline number, so about 100 blocks (20 minutes). Still, 100 blocks is significantly shorter than the roughly 50,400 blocks that go into a 7 day challenge period. There must be something else here.
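If you want to check the block arithmetic yourself, here's a quick sketch. It assumes Ethereum's 12 second slot time and that every slot actually produces a block. The helper names are my own.

```python
SECONDS_PER_BLOCK = 12  # Ethereum slot time; assumes no missed slots


def blocks_to_minutes(blocks: int) -> float:
    """Wall-clock minutes covered by a number of blocks."""
    return blocks * SECONDS_PER_BLOCK / 60


def days_to_blocks(days: int) -> int:
    """Number of blocks produced over a number of days."""
    return days * 24 * 60 * 60 // SECONDS_PER_BLOCK


game_steps = 10       # back-and-forth steps in the challenge game (illustrative)
padding_factor = 10   # slack for participants who aren't perfectly fast

print(blocks_to_minutes(game_steps))                   # 2.0
print(blocks_to_minutes(game_steps * padding_factor))  # 20.0
print(days_to_blocks(7))                               # 50400
```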

Malicious actors

Our 100 block number establishes a simple lower bound, but this lower bound only really applies if neither party is being malicious. In Optimistic Rollup land, the stakes are much higher. An attacker can use an invalid claim to potentially steal assets worth hundreds of millions, if not billions, of dollars. An attacker who stands to earn this much should also be willing to spend up to that amount to execute the attack: someone who's confident they'll make $1b should be willing to part with up to $1b to pull it off.

What could this attacker do with their money? Well, it's the attacker's goal to prevent challengers from being able to include their challenge transactions on-chain. After all, if the challenge transactions make it through, then the attack fails. The attacker essentially has three potential tactics here:

  1. Run direct DoS attacks on the challengers to prevent them from being able to interact with the L1 network in the first place
  2. Spam the L1 network with expensive transactions to drive up the gas price and prevent challengers from being able to transact
  3. Censor challengers directly by controlling a large number of validators

In practice, it's likely that a motivated attacker would use some combination of all three of these attack vectors to prevent challengers from being able to interact. We often ignore vector #1 because it's relatively easy to prevent and hard to quantify, but vectors #2 and #3 are really what we need to look out for.

Now we land at the underlying principle that, in theory, should determine the length of the Optimistic Rollup challenge period: the challenge period must be greater than the amount of time that the attacker can censor all available challengers.

Attacking the gas price

Oh, if it were only that simple.

Determining exactly how long an attacker can censor the chain with a given amount of capital is not an easy task. Censoring by increasing the cost of gas is very expensive. Remember that we assumed there were a total of 10 back-and-forth steps, so the challengers only need to land a total of 5 transactions. The total cost to attack is extremely lopsided in favor of the challengers. Let's say there's $1b at risk and the attacker is willing to spend $1m worth of ETH on priority fees in every single block. The attacker can keep this up for 1000 blocks, but the total cost to defend is only marginally greater than $5m. Note also that the cost of the attack increases as time goes on because the Ethereum base fee will adjust upwards in response to the many full blocks.
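Here's that lopsidedness as a rough sketch. It ignores the base fee (like the numbers above) and assumes the attacker burns the entire amount at risk on priority fees. The function names are just for illustration.

```python
def blocks_attacker_can_buy(attacker_budget_usd: int, bid_per_block_usd: int) -> int:
    """Number of full blocks the attacker can afford to buy out (base fee ignored)."""
    return attacker_budget_usd // bid_per_block_usd


def cost_to_defend(bid_per_block_usd: int, challenger_txns: int) -> int:
    """Floor on the defenders' cost: match the attacker's bid once per transaction.

    Real bids would need to be marginally higher than this to win the block.
    """
    return bid_per_block_usd * challenger_txns


budget = 1_000_000_000   # $1b at risk, all of it spent on the attack
bid = 1_000_000          # $1m of priority fees per block

print(blocks_attacker_can_buy(budget, bid))    # 1000
print(cost_to_defend(bid, challenger_txns=5))  # 5000000
```

The attacker has to win every single block, while the challengers only have to win five of them.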

We can throw this into a formula to get a good lower bound (ignoring the base fee):

challenger_spend_per_tx = max_challenger_spend / num_challenger_txns
challenge_period_blocks = amount_at_risk / challenger_spend_per_tx

Optimism's TVL is currently ~$2b, so if we assume a conservative max challenger spend of $1m and that there are 5 challenger transactions in total, then the challenge period could be:

challenger_spend_per_tx = 1000000 / 5 = 200000
challenge_period_blocks = 2000000000 / 200000 = 10000

That's about 33 hours at 12 seconds per block. Already more than a day's worth of blocks!
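The same napkin math as a small Python function, in case you want to plug in your own numbers. The variable names follow the formula above. The 12 second block time used for the conversion to days is an assumption on my part, and the base fee is still ignored.

```python
SECONDS_PER_BLOCK = 12  # assumed Ethereum slot time, no missed slots
SECONDS_PER_DAY = 24 * 60 * 60


def challenge_period_blocks(amount_at_risk: float,
                            max_challenger_spend: float,
                            num_challenger_txns: int) -> float:
    """Blocks an attacker can censor if they must outbid the challengers every block."""
    challenger_spend_per_tx = max_challenger_spend / num_challenger_txns
    return amount_at_risk / challenger_spend_per_tx


blocks = challenge_period_blocks(
    amount_at_risk=2_000_000_000,    # ~$2b TVL
    max_challenger_spend=1_000_000,  # $1m total defense budget
    num_challenger_txns=5,
)
days = blocks * SECONDS_PER_BLOCK / SECONDS_PER_DAY

print(blocks)          # 10000.0
print(round(days, 2))  # 1.39
```

Doubling what challengers are willing to spend halves the result, so the output is only as good as your guess for `max_challenger_spend`.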

Of course, we're oversimplifying a lot. Challengers are likely willing to spend much more than $1m to protect the chain because, collectively, the people and protocols that would act as challengers likely have significantly more than that amount locked up on the network. But, you know, finger to the wind, we've already got a challenge period of more than a day's worth of blocks and we haven't even talked about validator incentives.

Attacking the validator set

Another way to attack the network is to attack the validator set. This can take multiple different forms. One way to "attack" the validator set would be to stake a significant amount of ETH and run your own validators. Each % of the validator set that you control is an additional % of blocks that you do not need to buy out when executing your gas attack. When one of your own validators is chosen to produce a block, you can simply leave the challengers' transactions out of that block at no extra cost.

If you control 20% of the validator set, then you'll be selected to produce 20% of blocks on average. Your gas budget only has to cover the other 80%, so the same money keeps the attack going about 25% longer. You'll even start to make some money from the block production rewards and from the 2 users who are still willing to transact with a 50k gwei base fee! At 500k validators it'll only require an investment of $5b worth of ETH to grab 20% of the validator set. Nice!
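A rough sketch of the validator math, with a couple of assumptions that are mine rather than anything precise: a fixed fee budget only has to cover the blocks you don't propose yourself, and "20% of the validator set" means 20% of the set after your new validators have joined. It uses integer percentages to keep the arithmetic exact.

```python
STAKE_PER_VALIDATOR_ETH = 32  # deposit required to activate one validator


def budget_stretch(share_percent: int) -> float:
    """Factor by which a fixed fee budget lasts longer when you propose
    `share_percent` of blocks yourself and only buy out the rest."""
    return 100 / (100 - share_percent)


def validators_to_add(existing_validators: int, share_percent: int) -> int:
    """Validators you must add so that you hold `share_percent` of the enlarged set."""
    needed = existing_validators * share_percent
    return -(-needed // (100 - share_percent))  # ceiling division


new_validators = validators_to_add(existing_validators=500_000, share_percent=20)

print(budget_stretch(20))                        # 1.25
print(new_validators)                            # 125000
print(new_validators * STAKE_PER_VALIDATOR_ETH)  # 4000000
```

Under this reading you'd need about 4m ETH, so a $5b price tag works out to roughly $1,250 per ETH. If you instead count 20% of the existing 500k validators, you get 100k validators and 3.2m ETH.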

You could also try to DoS other validators to prevent them from producing blocks during their assigned slots, further increasing the length of your attack while keeping the cost the same. If you're already putting in this much effort into attacking the chain, why not?

Things only get more confusing

Why stop there? If you control a supermajority of the validator set then you effectively own the entire chain and can censor whatever you want! I am untethered and my rage knows no bounds!

Well, good luck with that. Ethereum is much more... err... subjective after the Merge. Your Ethereum is not my Ethereum! I find it relatively difficult to believe that the ecosystem would accept a fork built by validators who just did everything possible to destroy large parts of the network.

This sort of subjective stuff is what really throws a wrench in any reasonable attempt to come up with the perfect withdrawal period. I mean, just think about it. Can you even get $2b worth of ETH for gas without being spotted? Can you run the hundreds of thousands of nodes required to attack the validator set? Won't the entire Ethereum ecosystem begin to get very angry at the ludicrously high gas price after day 2? Will there be "honest" validators who choose to censor the attacker and include the challengers' transactions instead?

Oh boy.

How I learned to stop worrying and accept the 7 day challenge period

The more you think about this problem, the more confusing it becomes. There are simply so many second-order effects and practical constraints that it's close to impossible to figure out the perfect challenge period. You then also have to consider the value at stake across all Optimistic Rollups at the same time. Why only attack one network when you can attack them all at once? You could spend years coming up with the perfect model, by which point we'll all have upgraded to ZK Rollups anyway.

So, let's finally get to the question we've been trying to answer this whole time.

Why is the challenge period 7 days? Well, because it's much longer than conservative napkin math lower bounds and, perhaps more importantly, it leaves enough time for the entire Ethereum community to be thrown into a massive fit. Simply put, the type of attack required to exploit an Optimistic Rollup significantly degrades the experience of transacting on Ethereum for a very long time. Everyone is going to be very angry. Honest validators will pop out of the woodwork willing to submit the challenge transactions in order to stop the attack. A week gives us enough time to coordinate this sort of recovery on the social layer.

Why not a shorter delay like 3 days? Three days sounds fine, but there isn't a significant improvement in the user experience once you've already passed the 1 day mark. People still need to sit around and wait. Waiting 7 days might even be easier because you can remember it as "this time last week". A withdrawal period of 1 day is also already starting to cut it close with some of our conservative estimates, and you want to give yourself breathing room to coordinate mass action within the Ethereum community if necessary. You don't want to get rekt because everyone is offline for a long weekend! Basically, by the time you get past 1 day, you might as well give yourself more time just in case.

But why is everyone using a 7 day withdrawal period? When you don't have much evidence for any specific number, you might as well just use the same period that everyone else is using. Nobody ever got fired for buying IBM. Nobody wants to be the team that picked a 1 day withdrawal period and got hacked because there was some second-order effect they didn't consider. Many Optimistic Rollups are also built on top of Optimism's OP Stack, which means they inherit this withdrawal period by default. Plus, the uniformity across the ecosystem ends up being beneficial because users have some common expectations across different chains.

So, yeah. That's the answer for why Optimistic Rollups have a 7 day challenge period. It's a chaotic answer to a chaotic question. It's an excessively safe upper bound that leaves enough time to consider social layer solutions to a hack if necessary. It's like bringing a nuclear bomb to a gun fight. No one wants to be the team that gets put in the dirt because they decided it'd be easier to walk around with a rifle instead.

Next steps?

I still think this problem is very interesting. Trying to model the perfect challenge period seems like an amazing research project for someone who has significantly more time than I do.

Still, the ZK Rollup gives a better experience in this regard, so I wouldn't be surprised to see most existing Optimistic Rollups decide to shift towards ZK relatively soon. Optimism's Bedrock upgrade is an explicit step in this direction with the modularization of the Optimism proof mechanism. It may be the case that a principled approach to solving this problem simply isn't worth it for Optimistic Rollups as long as 7 days is enough time to coordinate the social layer in the worst case.

That said, Optimistic protocols (in the general sense) are likely going to be around for a long, long time. Protocols that don't have billions at stake are likely much more interested in determining a reasonable-but-safe challenge period. I'm sure many people will appreciate your work if you do decide to tackle this problem.

I hope you enjoyed this post! I also hope it was equal parts satisfying and dissatisfying. After all, that's the nature of the Optimistic Rollup challenge period!