Wednesday, July 13, 2022

Wrapping my head around network sharding

It is well known that traditional blockchains have all nodes store and verify the entire state, as well as process all transactions. As a result, the transaction throughput is much lower (<15 tx/s) than for transactions on centralised networks (visa ~1500 tx/s). Sharding is a promising concept which aims to improve this transaction rate for blockchains while upholding decentralisation and security.

The basic question is: do we really need all nodes in the network to verify every single transaction? 

Ideally, we want to partition the network into $K$ sets of validator nodes, each able to verify and add transactions securely in parallel, thus increasing the transaction throughput by a factor of $K$.

This is called network sharding, where we form groups of validator nodes in order to process transactions in parallel. However, if the groups are formed naively, it is possible to overpower a shard with fewer resources because the number of validators in a shard are < the total number of validators. To overcome this we can sample the validors in each shard at random, which means that the attacker can not force himself into a single shard so easily. In practice, we can achieve this by randomly shuffle the list of validators and allocating the first set of validators in the shuffled list to shard $1$, the next set to shard $2$ and so on.

To illustrate this, we analyse the scenario of having $N$ validator nodes in total (all of equal weight/power in the consensus mechanism, PoW/PoS/etc...) and $K$ shards of $X=N/K$ validators each. Suppose there are $M$ malicious nodes and that the rest ($N-M$) are honest. Fix a shard $1\leq i\leq K$, we want to know how large $M$ needs to be in order to have $>X/2$ malicious nodes in shard $i$.

We have ${N \choose X}$ validator groups possible for shard $i$, each of which is equally likely and
$$S=\sum_{j=0}^{X/2}{{M \choose j}{N-M \choose X-j}}$$
ways to choose validator groups for shard $i$ with $\leq X/2$ malicious nodes. This means that the probability of having $\leq X/2$ malicious nodes in shard $i$ is $\frac{S}{{N \choose X}}$.

Say $N=100$ and $K=10$ so that $N/K=10$. If we assume that $M=30$, then we have a
$$\frac{\sum_{j=0}^{5}{{30 \choose j}{100-30 \choose 10-j}}}{{100 \choose 10}}=0.96$$
probability of shard $i$ having $\leq 50$ malicious nodes. This makes it almost infeasible to force ones way into a shard. Below is a graph showing the probability of malicious nodes overwhelming a shard as $M$ varies.



The next question is then: how do we allocate transactions to our validator groups/shards?

This question is decided by transaction sharding mechanisms, but I will end this post here.

No comments:

Post a Comment

AtCoder shananigans

Here is the problem , and below is the raw transcript of my thoughts. Obs: It is always optimal to use all $K$ moves. intuitive proof. While...