Tuesday, August 9, 2022

AtCoder shenanigans

Here is the problem, and below is the raw transcript of my thoughts.

Obs: It is always optimal to use all $K$ moves.

Intuitive proof: while we have moves left, each move can be used to improve the permutation. If everything is already in increasing order, we can shorten it by deleting from the end; otherwise, if we have say 1 4 2, we can improve it by deleting the 4.

we want to prioritise getting 1 in first position

Suppose $1$ is already at the head, i.e. $p[1]=1$. If we shift, we are worsening our permutation, so shifts become useless. Once the $1$, or the lowest value possible, is positioned at the head, we will only use deletes.

We have some options for _ ... _ 1 _ ... _ to be improved

1. Delete the prefix.
2. Shift $1$ to the head.
3. Delete part of the suffix and then shift $1$ to head.

If we are going to shift, we should delete from the suffix first.
Otherwise we waste moves shifting values that will be deleted.

We can simulate this by shifting first without deleting, then
anything in the suffix before the shift can be deleted for free.

As long as we take this into account, we can ignore option 3 and just use option 2.

Sub-problem: given a permutation, how do we optimally improve it with only deletes? Just delete whenever there is a decrease. For example:

Given 1 3 2 4, we would first delete the $3$, then the $4$. We can use a stack: push values while they increase, and as soon as we see a decrease, pop until the sequence is increasing again.
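The stack idea above can be sketched as follows. This is a minimal sketch, assuming a budget of $k$ deletions and that "improve" means making the kept sequence lexicographically smallest; the exact move semantics of the original problem are not spelled out here.

```python
def improve_with_deletes(p, k):
    """Greedily delete up to k elements so that the remaining sequence
    is lexicographically smallest (a sketch of the stack idea above;
    the deletion budget semantics are an assumption)."""
    stack = []
    for x in p:
        # Pop while the previously kept value is bigger than the
        # current one and we still have deletions left.
        while stack and k > 0 and stack[-1] > x:
            stack.pop()
            k -= 1
        stack.append(x)
    # If deletions remain, the argument above says to use them all;
    # spend the rest on the (now increasing) tail.
    return stack[:len(stack) - k] if k else stack
```

On the example 1 3 2 4 with two deletions this removes the $3$ and then the $4$, leaving 1 2.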

Thursday, August 4, 2022

Programming again

The problem is given here.

Def 1: A zig-zag path is a path starting from $(i,j)$ which moves according to the rule $(i,j)\rightarrow (3-i, j)\rightarrow (3-i, j+1)$, where the rows are numbered $1$ and $2$. This rule may be repeated as many times as we like.

Claim:
Suppose we are at $(1,j)$ after making two consecutive right moves. Then we are forced to move $(1,j+1)\rightarrow ... \rightarrow (1,m)\rightarrow (2,m)\rightarrow ... \rightarrow (2,j-1)$. Call such a path a forced path.

Then the path that we must take will firstly consist of a zig-zag path starting from $(1,1)$ (possibly of length $0$) followed by a forced path (possibly of length $0$). 

Suppose that for each cell $(i,j)$ we have pre-computed the time needed to complete the forced path in $f[i,j]$, and the time needed to zig-zag to this cell from $(1,1)$ in $z[i,j]$. We define $t[i,j]$ to be the minimum time needed to visit each cell exactly once given that we have zig-zagged to $(i,j)$. Then

$$t[i,j]=z[i,j]+\min(t[3-i, j+1], f[i,j+1])$$

The problem now reduces to computing $z[i,j]$ and $f[i,j]$ efficiently, i.e. in $O(m)$. The problem of computing $z[i,j]$ can be done in $O(m)$ by directly simulating the unique zig-zag path.

We now focus on the only remaining task of computing $f[i,j]$, i.e. the time to complete the forced path starting at $(i,j)$. This is where I got stuck in the contest. I am not sure how to efficiently compute $f[i,j]$ by dynamic programming, although I suspect this approach can be made to work. It feels like a hell of a lot of work though, so I am curious to see if there is a more straightforward approach. I strongly believe that most solutions should leverage the claim above.

The key observation is that we need at least $k$ time to move through $k$ cells; any additional time is spent waiting at some cells. Observe that it makes no difference where we wait: if we wait at one cell for 2 seconds and at another for 4 seconds, we can convert this to a path where we wait 6 seconds at the first cell and then move through all cells with no further waiting. So we compute the minimum waiting time for a path instead.

For the $k$th cell in a path, at $(x,y)$, the minimum waiting time $t$ satisfies $t\geq a_{x,y}-k$. So we can take the maximum of these lower bounds over all cells in the forced path as the minimum waiting time for that path. If we know the minimum waiting time for a path over cells $[i,j]$, call it $t_{[i,j]}$, then we can compute $t_{[i,j+1]}=\max(t_{[i,j]}, a_{j}-(j+1-i))$. Also $t_{[i-1,j]}=\max(t_{[i,j]}+1, a_{i-1})$. This gives us everything we need to write an $O(m)$ solution.
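The waiting-time argument for a single fixed path can be sketched directly. Here `a[k-1]` is assumed to be the earliest time the $k$th cell of the path may be entered; the function name and input format are hypothetical.

```python
def min_wait(a):
    """Minimum total waiting time along a fixed path, where a[k-1] is
    the earliest time the k-th cell may be entered (assumed semantics).
    Moving through k cells takes at least k time, so at the k-th cell
    we must already have waited at least a[k-1] - k."""
    return max(0, max(a[k] - (k + 1) for k in range(len(a))))
```

The total traversal time for the path is then its length plus `min_wait(a)`, since all waiting can be pushed to the first cell.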

Saturday, July 23, 2022

More programming problems

I kind of failed today's round, but oh well.


This is a very simple implementation.


We just keep track of the sum of the differences between increasing columns over each prefix $[0,i]$ in an array $d$. Then we can get the jump damage for any interval $[i,j]$ as $d[j]-d[i]$.
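The prefix-sum idea can be sketched as below. This is a minimal sketch: the exact damage rule is an assumption (here each step contributes the positive height drop), and the function names are hypothetical.

```python
def build_prefix(a):
    """d[i] = total 'jump damage' accumulated moving from column 0 to
    column i, where each step contributes the positive height drop
    max(0, a[k] - a[k+1]) -- the exact damage rule is an assumption."""
    d = [0] * len(a)
    for k in range(len(a) - 1):
        d[k + 1] = d[k] + max(0, a[k] - a[k + 1])
    return d

def damage(d, i, j):
    """Damage for travelling from column i to column j (i <= j),
    answered in O(1) from the prefix array."""
    return d[j] - d[i]
```

Each query is then answered in $O(1)$ after an $O(n)$ precomputation.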


A bracket sequence is regular (RBS) if and only if:
  1. number of '(' = number of ')',
  2. for every prefix: number of '(' $\geq$ number of ')'.
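The two conditions above translate directly into a linear scan; this checker only validates a fully replaced string (no '?' handling).

```python
def is_rbs(s):
    """Check the two RBS conditions: every prefix has at least as many
    '(' as ')', and the total counts are equal."""
    depth = 0
    for c in s:
        depth += 1 if c == '(' else -1
        if depth < 0:          # condition (2) violated on this prefix
            return False
    return depth == 0          # condition (1)
```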
Idea 1: Iterate from left to right through $s$; whenever we meet a '?', determine whether it may be replaced by a '(' or a ')'.

Replacing '?' with '(' can never break condition (2); however, to satisfy (1) we need to ensure that we do not do this so often that the number of '(' exceeds the number of ')'.

Replacing '?' with ')' requires that the current nesting depth is $>0$. Furthermore, we must not use so many that the number of '(' ends up less than the number of ')'. Additionally, we need to ensure that this replacement does not violate (2), which can happen even when the nesting depth is $>0$. For example, in '(?)?' the first question mark is at nesting depth $1$, but if we replace it with ')', the sub-segment $[1,3]$ has more ')' than '('. However, I am not sure how to make this work.

In the end I used a hint from the editorial. Apparently problem D was easier than C, I noticed this in contest but was too stubborn to switch.

Saturday, July 16, 2022

Reviewing codeforces 808

Problem A:

We observe that for $a_1$ to be turned into $0$, we need $a_1$ to be a multiple of $a_0$. If this is not the case, then the answer is no. Otherwise we can make $a_1=a_0$, so we can repeat the argument inductively. Hence we only need to check that $a_i$ is a multiple of $a_0$ for $1\leq i<n$.
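The check above is a one-liner; the "Yes"/"No" output format is an assumption in the usual Codeforces style.

```python
def can_zero_out(a):
    """All of a[1:] must be multiples of a[0], per the inductive
    argument above."""
    return "Yes" if all(x % a[0] == 0 for x in a[1:]) else "No"
```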


We observe that $\gcd(i, a_i)\in\{1,...,i\}$. Then we must have $\gcd(i,a_i)=i$: suppose not; then $\gcd(i-1,a_{i-1})$ has $i-2$ possible values, $\gcd(i-2, a_{i-2})$ has $i-3$ possible values, ..., and $\gcd(1,a_1)$ has no possible values. This means that $a_i$ can be any multiple of $i$ in the range $[l,r]$, for $1\leq i\leq n$.
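A construction following this argument: for each $i$, pick the smallest multiple of $i$ that is at least $l$ and check it fits in $[l,r]$. The function name and None-on-failure convention are mine.

```python
def construct(n, l, r):
    """For each i = 1..n pick the smallest multiple of i in [l, r]
    (the argument above shows gcd(i, a_i) = i is forced).
    Returns None if some i has no multiple in the range."""
    b = []
    for i in range(1, n + 1):
        m = ((l + i - 1) // i) * i      # smallest multiple of i >= l
        if m > r:
            return None
        b.append(m)
    return b
```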


We observe that if we can take $x$ tests, then we can take $x-1$ tests. Therefore, we can do a binary search. How can we check whether we can take $\geq x$ tests? We iterate through $a$, maintaining the current IQ $k$ and the current number of tests taken $c$. If $k\geq a[i]$, then it definitely won't hurt to take test $i$. If $k<a[i]$, then we should only take test $i$ if $n-i\leq x-c$, i.e. the number of tests remaining is at most the number of tests still needed to hit our minimum target of $x$ tests taken. The intuition is that taking such a test earlier decreases our IQ, so it may prevent us from taking a test we could otherwise have taken.
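The binary search plus greedy check can be sketched as below. The mechanics are assumed (taking a test with $a[i]\leq k$ is free, otherwise it costs 1 IQ); this is a sketch of the idea in the paragraph above, not verified against the original statement.

```python
def max_tests(a, q):
    """Binary search on the answer x with the greedy check described
    above. Assumed mechanics: taking test i with a[i] <= k is free,
    otherwise it costs 1 IQ (k decreases by 1)."""
    n = len(a)

    def can(x):
        k, c = q, 0
        for i in range(n):
            if a[i] <= k:
                c += 1                    # never hurts to take it
            elif n - i <= x - c:          # forced to take it to reach x
                k -= 1
                c += 1
        return c >= x

    lo, hi = 0, n
    while lo < hi:                        # largest feasible x
        mid = (lo + hi + 1) // 2
        if can(mid):
            lo = mid
        else:
            hi = mid - 1
    return lo
```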


I did not get very far on this problem. I will update this section later once/if I solve it.

Wednesday, July 13, 2022

Wrapping my head around network sharding

It is well known that traditional blockchains have all nodes store and verify the entire state, as well as process all transactions. As a result, the transaction throughput is much lower (<15 tx/s) than on centralised networks (Visa ~1500 tx/s). Sharding is a promising concept which aims to improve this transaction rate for blockchains while upholding decentralisation and security.

The basic question is: do we really need all nodes in the network to verify every single transaction? 

Ideally, we want to partition the network into $K$ sets of validator nodes, each able to verify and add transactions securely in parallel, thus increasing the transaction throughput by a factor of $K$.

This is called network sharding: we form groups of validator nodes in order to process transactions in parallel. However, if the groups are formed naively, it is possible to overpower a shard with fewer resources, because the number of validators in a shard is smaller than the total number of validators. To overcome this we can sample the validators in each shard at random, which means that an attacker cannot force himself into a single shard so easily. In practice, we can achieve this by randomly shuffling the list of validators and allocating the first set of validators in the shuffled list to shard $1$, the next set to shard $2$, and so on.
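The shuffle-and-slice allocation can be sketched as follows; the function is a hypothetical helper, and it assumes $K$ divides the number of validators.

```python
import random

def assign_shards(validators, k, seed=None):
    """Randomly partition validators into k equal shards by shuffling
    the list and slicing it, as described above."""
    rng = random.Random(seed)
    v = list(validators)
    rng.shuffle(v)
    size = len(v) // k          # assume k divides the validator count
    return [v[i * size:(i + 1) * size] for i in range(k)]
```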

To illustrate this, we analyse the scenario of having $N$ validator nodes in total (all of equal weight/power in the consensus mechanism, PoW/PoS/etc...) and $K$ shards of $X=N/K$ validators each. Suppose there are $M$ malicious nodes and that the rest ($N-M$) are honest. Fix a shard $1\leq i\leq K$, we want to know how large $M$ needs to be in order to have $>X/2$ malicious nodes in shard $i$.

We have ${N \choose X}$ validator groups possible for shard $i$, each of which is equally likely and
$$S=\sum_{j=0}^{X/2}{{M \choose j}{N-M \choose X-j}}$$
ways to choose validator groups for shard $i$ with $\leq X/2$ malicious nodes. This means that the probability of having $\leq X/2$ malicious nodes in shard $i$ is $\frac{S}{{N \choose X}}$.

Say $N=100$ and $K=10$ so that $N/K=10$. If we assume that $M=30$, then we have a
$$\frac{\sum_{j=0}^{5}{{30 \choose j}{100-30 \choose 10-j}}}{{100 \choose 10}}=0.96$$
probability of shard $i$ having at most $X/2=5$ malicious nodes. This makes it almost infeasible to force one's way into a shard. Below is a graph showing the probability of malicious nodes overwhelming a shard as $M$ varies.
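The formula above is hypergeometric and straightforward to evaluate with `math.comb`; the function name is mine.

```python
from math import comb

def p_safe(n, m, x):
    """Probability that a random x-validator shard contains at most
    x/2 malicious nodes, out of n validators of which m are malicious
    (the formula S / C(n, x) above)."""
    s = sum(comb(m, j) * comb(n - m, x - j) for j in range(x // 2 + 1))
    return s / comb(n, x)
```

With $N=100$, $M=30$, $X=10$ this evaluates to roughly $0.96$, matching the worked example.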



The next question is then: how do we allocate transactions to our validator groups/shards?

This question is decided by transaction sharding mechanisms, but I will end this post here.

Tuesday, July 5, 2022

Codeforces: The third problem

I couldn't take part in the round today because I was playing tennis =). Anyway, problem C had 2500 solves, so I thought I would give it a go.

We are given a permutation $a$ of the numbers from $0$ to $n-1$ and are asked to find the number of permutations $b$ of the same numbers such that for every $1\leq l<r\leq n$, MEX$(a_l, ..., a_r)=$MEX$(b_l,...,b_r)$. Here the MEX of a sequence of numbers is the smallest non-negative integer not in the sequence.

So we first observe that any sub-array which includes $0$ in $a$ must also include $0$ in $b$. Therefore, the position of $0$ in $b$ is the same as its position in $a$. Similarly, the position of $1$ is the same in both $a$ and $b$, since if it changes, some sub-array whose MEX was $1$ will now have MEX $>1$, or some sub-array whose MEX was $>1$ will now have MEX $1$. This led me to the following observation.

We want to make sure that all sub-arrays with MEX$=0,1,2,...,n-1$ keep their MEX unchanged. For MEX=0, this means the $0$ staying in the same position. For MEX$=1$, the same. Now for MEX$=2$, we observe that these sub-arrays must contain $0$ and $1$. Let pos$[i]$ be the position of $i$ in $a$. Then we have two cases.

Case 1: $\min(\text{pos}[0],\text{pos}[1])<\text{pos}[2]<\max(\text{pos}[0],\text{pos}[1])$.

Then the exact position of $2$, so long as it remains in this range, does not change MEX$=2$ intervals. This is because for a MEX of $2$, we need to include both $0$ and $1$ and so moving the $2$ around will not change the MEX of intervals which do not include either $0$ or $1$ in this case.

Case 2: $\min(\text{pos}[0],\text{pos}[1])>\text{pos}[2]$ or $\text{pos}[2]>\max(\text{pos}[0],\text{pos}[1])$.

Then $2$ cannot be moved at all in $b$.

In general for any $i$, we let $l=\min(\text{pos}[0],...,\text{pos}[i-1])$ and $r=\max(\text{pos}[0],...,\text{pos}[i-1])$. Then for case 1 we need $l<\text{pos}[i]<r$. 

So now we can easily determine whether an $i$ can be moved, and a rough range of positions it must stay within (namely $(l,r)$). However, we need to figure out, for each $i$, how many positions it can be moved to, i.e. how many elements it can swap with in $a$. Note that this would result in double counting, so for each $i$ we only consider the number of elements $>i$ that it can swap with.

If we have case 2 for some $i$, then the number of elements it can swap with is $0$. If we have case 1, then there are $r-l+1$ elements in the range $[l,r]$, of which $i$ are $<i$. The remaining $r-l+1-i$ elements (these include $i$ itself) are $\geq i$, so $i$ can safely be placed at any of these $r-l+1-i$ positions. Therefore, we multiply our answer by $r-l+1-i$ every time we enter case 1.

This leads to an $O(n)$ solution basically using a two pointers method with $l$ and $r$ to maintain the range. Quite a nice problem! For me the hardest part was figuring out the correct approach to count the number of elements each $i$ can swap with.
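The two-pointers counting described above fits in a few lines. A sketch, assuming the usual $10^9+7$ modulus (the original modulus is not stated here):

```python
def count_permutations(a, mod=10**9 + 7):
    """Two-pointer count: maintain [l, r], the span of the positions of
    values 0..i-1. If pos[i] lies strictly inside, i has r - l + 1 - i
    possible placements (case 1); otherwise it is fixed and the span
    grows (case 2)."""
    n = len(a)
    pos = [0] * n
    for idx, v in enumerate(a):
        pos[v] = idx
    l = r = pos[0]
    ans = 1
    for i in range(1, n):
        if l < pos[i] < r:
            ans = ans * (r - l + 1 - i) % mod
        else:
            l = min(l, pos[i])
            r = max(r, pos[i])
    return ans
```

For example, $a = (0, 3, 2, 4, 1)$ gives $3\cdot 2\cdot 1 = 6$: once $0$ and $1$ pin the span to $[0,4]$, the $2$ has three valid placements and the $3$ has two.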

Saturday, July 2, 2022

Codeforces: Permutation Graph

We are given a permutation $a_1, a_2, ..., a_n$ of $1,2,...,n$. Construct the undirected graph with vertex set $1, 2, ..., n$ and an edge $(i, j)$ iff $\min_{k=i}^{j}\{a_k\}=a_i$ and $\max_{k=i}^{j}\{a_k\}=a_j$, or the reverse. Find the shortest path from vertex $1$ to $n$. Here $n\leq 2.5\times 10^5$.

The main idea:

Firstly, we observe that a $(1,n)$-path always exists, since for any adjacent vertices $i$ and $i+1$ we have the edge $(i,i+1)$.

Consider the element with value $1$, say at position $i$, and the element with value $n$, say at position $j$.

$$a_1, ..., a_i, ..., a_j, ..., a_n$$

We have the edge $(i, j)$. Additionally, there can be no edge $(x,y)$ with $x<i$ and $y>i$ or $x < j$ and $y>j$. Therefore, the shortest path must take the edge $(i,j)$, and we can apply a divide-and-conquer strategy on the sub-arrays $a_1, ..., a_{i-1}$ and $a_{j+1}, ..., a_n$. This can be implemented in linear time.
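The divide-and-conquer argument can be sketched recursively. Note this sketch uses naive min/max scans, so it is $O(n^2)$ in the worst case; the linear-time version would replace the scans with precomputed positions.

```python
def shortest_path(a):
    """Length of the shortest (1, n)-path via the divide-and-conquer
    argument above: within a[l..r], the path must use the edge between
    the positions of the minimum and maximum of that sub-array."""
    def solve(l, r):            # shortest path from index l to index r
        if l >= r:
            return 0
        i = min(range(l, r + 1), key=lambda k: a[k])   # position of min
        j = max(range(l, r + 1), key=lambda k: a[k])   # position of max
        p, q = min(i, j), max(i, j)
        # take the forced edge (p, q) and recurse on the outer parts
        return 1 + solve(l, p) + solve(q, r)
    return solve(0, len(a) - 1)
```

For $a = (2, 1, 3)$ the only edges are $(1,2)$ and $(2,3)$, so the answer is $2$.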
