Sunday, October 31, 2021

My thoughts on ICPC UK problem D

We are given an array $a_1, a_2, ..., a_n$ where $-10^9\leq a_i\leq 10^9$ and $2\leq n\leq 3000$, and we want to compute the maximum possible sum of a subsequence of this array such that the gaps between the positions of consecutive subsequence elements are non-increasing. We call this condition $C$ for short.

The first thing that jumps to mind is to use dynamic programming. We can define $\text{dp}[i, l]$ to be the maximum sum of a subsequence ending at $a_i$ which satisfies $C$, where the previous subsequence element is at position $i-l$.

Our base case is

$$\forall 1\leq l<n : \text{dp}[1, l] = a_1$$

then our transition function is

$$\text{dp}[i, l] = \max_{l'\geq l \,:\; i-l-l' \geq 1 \,\text{ or }\, i-l=1}\{\text{dp}[i-l, l']\} + a_i$$

since if we arrive at $a_i$ with the last jump having length $l$, we must have arrived from $a_{i-l}$, which in turn was reached with some jump of length $l'\geq l$ (the condition $i-l-l'\geq 1$ ensures that jump fits inside the array, and is waived when $i-l=1$ because of the base case). In the above transition equation, we assume that $i>l$; otherwise $\text{dp}[i, l] = -\text{INF}$, since a jump of length $l$ would be longer than possible.

However, a straightforward implementation yields an $O(n^3)$ solution, which is too slow; we are looking for $O(n^2)$ or better, though during the contest I did not find anything better.
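Here is a direct $O(n^3)$ sketch of the recurrence above (my own code, not a reference solution; as in the base case, it assumes every subsequence starts at $a_1$):

```python
def max_c_subsequence(a):
    """O(n^3) DP over dp[i][l]: best sum of a C-subsequence ending at
    1-indexed position i whose previous element sits at position i - l."""
    n = len(a)
    NEG = float("-inf")
    dp = [[NEG] * n for _ in range(n + 1)]  # dp[i][l], 1 <= i <= n, 1 <= l < n
    for l in range(1, n):
        dp[1][l] = a[0]                     # base case: start at a_1
    for i in range(2, n + 1):
        for l in range(1, i):               # previous element at j = i - l >= 1
            j = i - l
            # previous jump l' must be at least l (non-increasing gaps);
            # unreachable states stay at NEG, which also enforces i-l-l' >= 1
            best_prev = max((dp[j][lp] for lp in range(l, n)), default=NEG)
            if best_prev > NEG:
                dp[i][l] = best_prev + a[i - 1]
    return max(dp[i][l] for i in range(1, n + 1) for l in range(1, n))
```

For example, on $[1, -100, 10]$ the best choice is positions $1$ and $3$ (a single gap of $2$), with sum $11$.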

Saturday, October 30, 2021

UK ICPC problem F

The problem is essentially: given $2\leq n\leq 10^5$ people, where the $i^{\text{th}}$ person has a skill level of $b_i$ in billiards and $p_i$ in pool, can you split the people into teams of two such that all teams have the same combined skill in billiards, and likewise in pool?

Checking through all pairs is not an option since that would be $O(n^2)$. So first we notice the following conditions for impossibility:

  • $n$ is odd
  • $\frac{\sum_{1\leq i\leq n}{b_i}}{n/2}$ is not whole, and the same for the $p_i$'s.

But even when both checks pass, the instance can still be impossible if some person has no valid partner. The intuition for making progress is that we need to pair a small skill with a large skill, otherwise they won't balance. This hints at the following observation.

Obs: The person with the minimum skill in billiards must be paired with the person of maximum skill.

proof. Suppose this is not the case. Consider the skill levels of billiards in non-decreasing order
$$b_1', b_2', ..., b_n'$$
then some $b_i'$ with $i>1$ is paired with $b_n'$, and $b_1'$ is paired with some $b_j'$ with $j<n$.
Since $b_1' \leq b_i'$ and $b_j' \leq b_n'$, we know that $b_1' + b_j' \leq b_i' + b_n'$. If the inequality is strict, the two pair sums differ, a contradiction; if it is an equality, then $b_1' = b_i'$ and $b_j' = b_n'$, and swapping the two pairings pairs a minimum with a maximum anyway.

This directly suggests the following approach: sort the people by billiards skill; check whether the minimum- and maximum-skilled people can be teamed up (their combined billiards and pool skills must both match the required per-team sums); if not, it is impossible; if yes, move on to the next min/max pair, until everyone has been matched. This requires only $O(n\log{n} + n) = O(n\log{n})$ time and we are done.
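A sketch of this approach (my own implementation; the per-team targets $\frac{\sum b_i}{n/2}$ and $\frac{\sum p_i}{n/2}$ come from the divisibility conditions above):

```python
def can_split(b, p):
    """Return True iff the n people can be split into balanced teams of two."""
    n = len(b)
    if n % 2 == 1:                            # odd n: impossible
        return False
    half = n // 2
    if sum(b) % half or sum(p) % half:        # per-team sums must be whole
        return False
    tb, tp = sum(b) // half, sum(p) // half   # required per-team sums
    people = sorted(zip(b, p))                # sort by billiards, ties by pool
    for lo in range(half):                    # pair current min with current max
        hi = n - 1 - lo
        if people[lo][0] + people[hi][0] != tb:
            return False
        if people[lo][1] + people[hi][1] != tp:
            return False
    return True
```

Sorting pairs `(b_i, p_i)` lexicographically also handles ties in billiards: within a block of equal $b$, the smallest pool skill on the low side meets the largest pool skill on the high side.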

Friday, October 22, 2021

$k$-gossip in dynamic graphs.

This post walks through the existing work on the $k$-gossip problem in dynamic graphs from [KLO10]. Our main objective is to understand their $\Omega(n\log{k})$ lower bound.

Problem: In the $k$-gossip problem ($k\leq n$) on a $1$-interval connected dynamic graph $\mathcal{G}$, a set of $k$ tokens, $\mathcal{T} = \{1, 2, ..., k\}$, is allocated amongst the $n$ nodes of $\mathcal{G}$ according to a function $f : V \to 2^{\mathcal{T}}$ where $|\bigcup_{v\in V}{f(v)}| = k$. The problem is solved once every node has received all $k$ tokens and terminated.

Question 1: Let's first look at the $1$-gossip problem. How quickly can we solve it?

The first thing to try is for every node to broadcast the token to all its neighbors in every round. Since the dynamic graph is $1$-interval connected, at least one new node receives the token each round: as long as some node is still missing the token, connectivity guarantees an edge between a node that has it and a node that does not. Thus, after $n-1$ rounds all nodes are guaranteed to have the token. So we have the following fact.

Fact 1: $1$-gossip can be solved in $n-1$ rounds on $1$-interval connected dynamic graphs.
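To convince myself of Fact 1, here is a toy simulation (my own construction, not from [KLO10]): the adversary rewires the graph into a fresh random path each round — paths are connected, so the sequence is $1$-interval connected — and flooding still finishes within $n-1$ rounds.

```python
import random

def one_gossip_rounds(n, seed=0):
    """Simulate flooding of a single token on a random dynamic path graph;
    return the number of rounds until every node has the token."""
    rng = random.Random(seed)
    has_token = [False] * n
    has_token[0] = True                    # token starts at node 0
    rounds = 0
    while not all(has_token):
        order = list(range(n))
        rng.shuffle(order)                 # adversary picks G_r: a random path
        new = has_token[:]
        for u, v in zip(order, order[1:]): # simultaneous broadcast on each edge
            if has_token[u]:
                new[v] = True
            if has_token[v]:
                new[u] = True
        has_token = new
        rounds += 1
    return rounds
```

However the adversary shuffles, each round the path contains an edge between a token-holder and a non-holder whenever one remains, so the loop runs at most $n-1$ times.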

Notice Fact $1$ holds independently of the chosen adversary controlling the evolution of $\mathcal{G}$. It also implies that any global aggregate of the nodes' inputs under an associative and commutative operator can be computed in $n-1$ rounds. This can be useful as a primitive in dynamic graph algorithms.

Also by the same logic, $k$-gossip can be solved in $n-1$ rounds if we allow message sizes of $O(n\log{n})$, since a single node can then broadcast all the tokens it knows in a single round. So in $n-1$ rounds every token will have propagated to every node.

Finally, if we assume messages of size $O(\log{n})$, then we can solve $k$-gossip in $O(nk)$ time by running $k$ phases of $n-1$ rounds each: in the first phase token number $1$ is flooded to all nodes, then token number $2$, and so on. However, this does assume that the nodes know the value of $n$.

Question 2: What is a lower bound for the $k$-gossip problem on $1$-interval connected dynamic graphs?

[KLO10] prove a lower bound on the $k$-gossip problem for an online adaptive adversary. 

If a single node knows all the tokens, we know that in $n+k-1$ rounds all nodes will have received all tokens (can be proved by induction). Thus we look at the opposite case when $k$ nodes start with one token each.

We observe that in the earlier rounds lots of nodes learn new tokens, since any token they receive must be new to them.

Observation 1: In the first round of broadcast in the $k$-gossip algorithm, at least $k$ new token discoveries occur.

proof idea. By connectivity, each of the $k$ nodes holding a token has at least one neighbor. In the first round a neighbor knows no token other than the one it may hold itself, so each of the $k$ distinct tokens is learned by at least one new node.

As rounds progress, it becomes more likely that nodes will receive tokens they have already seen. This motivates a cost scheme on the order in which tokens are discovered for each node; where tokens discovered in earlier rounds are cheaper, since they are easier to find, but tokens discovered in later rounds are more costly.

We can give the $i^{\text{th}}$ token discovered by a node a cost of $1/(k-i+1)$: the first token a node discovers costs $1/k$, whereas its last token costs $1$.

We denote the total cost of all tokens discovered by the end of round $r$ as $\Phi(r)$. Initially $\Phi(1)=1$, since each of the $k$ tokens starts out discovered by exactly one node, at cost $1/k$ each; and for the final round $r_f$ we have $\Phi(r_f)=n\cdot(1/k + 1/(k-1) + ... + 1)$, since every one of the $n$ nodes must discover all $k$ tokens, where

$$n\cdot\left(\frac{1}{2}\log_2{(k+1)}-1\right)\leq\Phi(r_f)\leq n\cdot\log_2{(k+1)}$$
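Since $\Phi(r_f)=n\cdot H_k$ where $H_k = 1/k + \cdots + 1$ is the $k$-th harmonic number, these bounds are just the standard logarithmic sandwich on $H_k$. A quick numeric sanity check (my own, not from [KLO10]):

```python
import math

def harmonic(k):
    """H_k = 1 + 1/2 + ... + 1/k."""
    return sum(1.0 / i for i in range(1, k + 1))

# check the per-node sandwich (1/2)log2(k+1) - 1 <= H_k <= log2(k+1)
for k in (2, 10, 100, 10_000):
    h = harmonic(k)
    assert 0.5 * math.log2(k + 1) - 1 <= h <= math.log2(k + 1)
```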

The total discovered cost is very useful: if, for some instance of $k$-gossip, any algorithm can increase $\Phi$ by at most some constant $c$ each round, then we need at least

$$\lceil(\Phi(r_f)-\Phi(1))/c\rceil\geq \frac{n}{c}\cdot(\frac{1}{2}\log_2{(k+1)}-1) -\frac{1}{c}=\Omega(n\log{k})$$

rounds to solve $k$-gossip on that instance. Thus any algorithm for $k$-gossip must have round complexity $\Omega(n\log{k})$. Now we just need to construct a dynamic graph where the total cost from round to round increases by at most some constant.

Note that if every centralized algorithm for $k$-gossip needs $X$ rounds, then every distributed algorithm needs at least $X$ rounds too, since centralized algorithms have access to global information whereas distributed algorithms only have direct access to local information. Therefore, suppose we have a centralized algorithm which decides the message $t_v(r)$ broadcast by node $v\in V$ in round $r$. Additionally, we denote the set of tokens known to $v\in V$ in round $r$ as $A_v(r)$.

Observation 2: Given two vertices $u,v\in V$ in round $r$, if $t_u(r)\in A_v(r)$ and $t_v(r)\in A_u(r)$ then $u$ and $v$ do not contribute to each other discovering new tokens. Thus, connecting them by an edge will not change the total cost $\Phi(r)$. We call such an edge $(u,v)\in E(r)$ a free edge.

The first step of our construction of $G_r$ is thus to add all free edges. Suppose after adding all free edges, $G_r$ has connected components $C_1, C_2, ..., C_l$. We want to connect these components while incurring minimal cost. A simple idea is to arbitrarily fix representative nodes $v_1\in C_1, v_2 \in C_2, ..., v_l\in C_l$ for each connected component, and then construct a connected subgraph over $V_l=\{v_1, v_2, ..., v_l\}$ which has constant cost.

Intuition: We want to avoid/be careful about connecting to nodes who know "many" tokens since if one of them discovers a new token, the cost is much higher than when a node knowing "few" tokens discovers a new token.

To formalize the idea of nodes knowing many tokens we define 

$$\text{many}=\{v_i\in V_l | v_i \text{ is missing } \leq l/6 \text{ tokens}\}$$

to be the set of all nodes knowing "many" tokens, and 

$$\text{few}=\{v_i\in V_l | v_i \text{ is missing } > l/6 \text{ tokens}\}$$ to be the set of all nodes knowing "few" tokens. 

From our intuition, it seems reasonable to simply connect all nodes in $\text{few}$ in a line, since for them the cost of discovering a token is low. For nodes in $\text{many}$, however, we would ideally like them to learn no new tokens, since for them the cost is higher. A way for $u\in \text{many}$ to learn no new tokens is to connect it to a $v\in\text{few}$ with $t_v(r)\in A_u(r)$. The question is: can we always do this for every node $u\in\text{many}$? Supposing for a moment that we can, the cost contributed by this construction is bounded by

$$\sum_{u\in\text{few}}{\sum_{i=1}^{\min(3,\text{missing}(u))}{\frac{1}{\text{missing}(u)-(i-1)}}}\leq |\text{few}|\cdot\frac{3}{\frac{l}{6} - 2}\leq |\text{few}| \cdot \frac{6}{\frac{l}{6}} \leq l \cdot (36/l) = 36$$
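The middle step of this chain, $3/(\frac{l}{6}-2)\leq 36/l$, only kicks in once $l\geq 24$ (with equality at $l=24$); smaller $l$ contributes at most a constant amount of cost anyway. A quick numeric check (my own):

```python
# verify 3/(l/6 - 2) <= 36/l for all l >= 24 (the chain's middle inequality)
for l in range(24, 5000):
    assert 3 / (l / 6 - 2) <= 36 / l
```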

Now, we just need a way to match each $v\in\text{many}$ to its own node $u\in\text{few}$ without $v$ discovering any new tokens. For this we use a counting argument: if $|\text{few}|$ is sufficiently larger than $|\text{many}|$, then each node $v$ has plenty of candidate nodes $u$ to connect to, each of which knows few tokens.

Bounding $|\text{many}|$ and $|\text{few}|$:

We easily get that $\sum_{u\in\text{many}}\text{missing}(u) \leq |\text{many}|\cdot l/6$ from the definition of the $\text{many}$ set.

Secondly, note that ${|\text{many}| \choose 2}\leq\sum_{u\in\text{many}}\text{missing}(u)$: if we added an edge $\{u, v\}$ for any pair $u, v\in\text{many}$, at least one of the two would learn a new token, since otherwise $\{u,v\}$ would be a free edge, implying that $u$ and $v$ lie in the same connected component, a contradiction.

Thus we conclude that ${|\text{many}| \choose 2} \leq |\text{many}|\cdot l/6$. We can solve this inequality to get the bound $|\text{many}| \leq l/3 + 1$, which implies $|\text{few}|=l - |\text{many}| \geq 2l/3 - 1$.

Now, to argue that our matching between $\text{many}$ and $\text{few}$ always exists, we first observe that the nodes in $\text{few}$ all broadcast distinct tokens: if any two broadcast the same token, there would be a free edge between them, a contradiction. For any node $v\in\text{many}$, we know $v$ is missing at most $l/6$ tokens by definition, hence $v$ already knows at least $|\text{few}| - l/6$ of the tokens broadcast collectively by the nodes in $\text{few}$. Finally, note

$$|\text{many}| \leq l/3 + 1 \leq (2l/3 - 1) - l/6 \leq |\text{few}| - l/6$$

where the middle inequality holds once $l\geq 12$ (for smaller $l$, connecting the constantly many components arbitrarily costs only $O(1)$ anyway). So such a matching must exist, and we are done, since this completes the proof that a construction with constant cost increase per round exists.

Note that this proof only works against an online adaptive adversary: the construction above requires the adversary to know $t_u(r)$ for each node and round, which an oblivious adversary cannot know in advance.

[KLO10] Fabian Kuhn, Nancy Lynch, and Rotem Oshman. "Distributed Computation in Dynamic Networks". In: Proceedings of the Forty-Second ACM Symposium on Theory of Computing (STOC '10). Cambridge, Massachusetts, USA: Association for Computing Machinery, 2010, pp. 513–522. ISBN: 9781450300506. DOI: 10.1145/1806689.1806760. URL: https://doi.org/10.1145/1806689.1806760.

Friday, October 15, 2021

BMO1 1996-97 Problem 3

The Dwarves in the Land-under-the-Mountain have just adopted a completely decimal currency system based on the Pippin, with gold coins to the value of $1$ Pippin, $10$ Pippins, $100$ Pippins and $1000$ Pippins. In how many ways is it possible for a Dwarf to pay, in exact coinage, a bill of $1997$ Pippins?

We notice that a coin of value $v$ cannot be used towards any remaining amount $p<v$ of Pippins. This motivates the following.

Obs 1: We have to pay 

  1. $7$ Pippins using $1$ value coins only. (only $1$ way to do this.)
  2. $90$ Pippins using $1$ and $10$ value coins only. (only $10$ ways to do this.)
  3. $900$ Pippins using $1, 10$ and $100$ value coins only.
  4. $1000$ Pippins using $1, 10, 100$ and $1000$ value coins only.
This motivates the idea of considering the problem in simpler stages. First we restrict ourselves to paying with only $1$ value coins. Then we extend to the case of $1$ and $10$ value coins and so on until we can tackle all four coin values given in the problem.

From $(1)$, we can consider our bill to be of $1990$ Pippins, since the remaining $7$ Pippins can only be paid with $1$ value coins.

Let $a_n, b_n, c_n$ and $d_n$ be the number of ways to pay for a bill of $n$ Pippins using coin values $\{1\}, \{1,10\}, \{1,10,100\}$ and $\{1,10,100,1000\}$ respectively.

Clearly, $a_n=1$ for all $n$. For $b_n$, we can use the $10$ value coin any of $i\in\{0, 1, ..., \lfloor n/10\rfloor\}$ times. Thus $b_n= \lfloor n/10\rfloor + 1$.

For $c_n$ we can use the $100$ value coin any of $i\in\{0,1,..., \lfloor n/100\rfloor\}$ times. The $n-100i$ remaining Pippins can be paid in $b_{n-100i}$ ways. Hence
$$c_n=\sum_{i=0}^{\lfloor n/100\rfloor}{b_{n-100i}}$$

Since $n=1990$, the $1000$ value coin can be used either once or not at all, so $d_{1990} = c_{1990} + c_{990}$. Now
$$c_{1990} = \sum_{i=0}^{19}{b_{1990-100i}} = b_{1990} + b_{1890} + ... + b_{90} = 200 + 190 +...+ 10 = (210)(20/2) = 2100$$
and finally
$$c_{990} = b_{990} + b_{890} + ... + b_{90} = 100 + 90 +...+ 10 = (110)(10/2) = 550$$
Hence we can pay a bill of $1997$ Pippins, in exact coinage, in exactly $2100 + 550 = 2650$ ways.
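As an independent check of this answer (my own, not part of the original solution), here is a brute-force count over the larger denominations, with the $1$ value coins filling whatever remains:

```python
def count_ways(amount, coins=(1000, 100, 10)):
    """Count ways to pay `amount` with coins of value 1, 10, 100, 1000.
    The 1-Pippin coins always cover the remainder, so only the larger
    denominations need to be enumerated."""
    if not coins:
        return 1                       # pay the rest with 1-Pippin coins
    c, rest = coins[0], coins[1:]
    return sum(count_ways(amount - k * c, rest) for k in range(amount // c + 1))
```

This also confirms items $(1)$ and $(2)$ of the observation: `count_ways(7)` is $1$ and `count_ways(90)` is $10$.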

Monday, October 11, 2021

Dynamic algorithms and reading research papers.

Although most graph theory textbooks focus on graphs $G=(V, E)$ where $V$ and $E$ remain unchanged, most of the real world applications of graph algorithms run on graphs whose topology changes: edges can be added or removed, and vertices can enter and leave. So, how can we define a dynamic graph?

Model: We define a dynamic graph $\mathcal{G}$ to be a sequence of static graphs $G_1, G_2, ...$ where $G_i=(V, E_i)$ for an unchanging set of vertices $V$. We can view $\mathcal{G}$ as the complete description of how the dynamic graph evolves. We can think of $\mathcal{G}$ as evolving round by round, where in round $i$, the dynamic graph is represented by $G_i$.

We are assuming that computation occurs in synchronous distributed rounds: in round $r$, nodes first do some local computation based on their internal state, then the graph topology changes to $G_r$, and finally each node sends a message of size $O(\log{n})$ to each of its neighbors in $G_r$.

The way that $\mathcal{G}$ evolves can be controlled by an adversary. In the offline setting, the adversary decides the sequence of graphs $G_1, G_2,...$ before the first round begins. This is called an oblivious adversary. In the online setting, for a given round $r$, the adversary decides $G_{r}$ based on the internal states of all nodes in round $r$. This is an adaptive adversary.

Of course we can restrict the class of dynamic graphs that our algorithms will run on. For example, we introduce the idea of $T$-interval-connectedness.

Property 1: A dynamic graph $\mathcal{G}$ is $T$-interval-connected if for every round $t$, the graphs $G_t, G_{t+1},..., G_{t+T-1}$ share a common connected spanning subgraph.
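As a sanity check on this definition, here is a small checker sketch (my own; edges are assumed to be normalized tuples $(u,v)$ with $u<v$): a dynamic graph is $T$-interval-connected exactly when, for every window of $T$ consecutive graphs, the intersection of their edge sets still connects all of $V$.

```python
def connected(n, edges):
    """DFS from vertex 0: is the graph on vertices 0..n-1 connected?"""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, stack = {0}, [0]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n

def t_interval_connected(n, graphs, T):
    """Check T-interval-connectedness of a finite prefix of a dynamic graph:
    every window of T consecutive edge sets must intersect to a connected graph."""
    return all(
        connected(n, set.intersection(*(set(g) for g in graphs[t:t + T])))
        for t in range(len(graphs) - T + 1)
    )
```

For example, two consecutive path graphs that only share the edge $(0,1)$ are $1$-interval connected but not $2$-interval connected on three vertices.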

We can choose the class of dynamic graphs based on any properties and this may lead us to more specialized dynamic algorithms for this graph class. For example, another class of dynamic graphs may require all $G_i$ to be $d$-regular.

Some problems in dynamic graphs have strong lower bounds implying poor performance. However, this is not what we observe in practice. How can we explain this? Maybe the instances that match the lower bound are extremely rare so if we ignored them the algorithm would actually be "efficient".

For example, the hitting time of a pair of vertices $u, v$ is the expected number of rounds for a random walk to go from $u$ to $v$. For dynamic graphs, the hitting time can be lower bounded by $\Omega (2^n)$ by considering the star graph with the addition of a self-loop at each vertex. However, is this bound really useful in practice? We want to explore such questions by looking at the smoothed lower bound.

Def 1: Given a graph $G=(V, E)$ a $k$-smoothed graph of $G$ is any graph in $\{G' | G' \text{ has edit distance at most } k \text{ from } G\}$. The edit distance between $G$ and $G'$ is just the minimum number of edges needed to be added/removed from $G$ to get $G'$. We can thus $k$-smooth a dynamic graph $H$ by $k$-smoothing each of its $G_i\in H$.

The idea behind smoothing is that by changing our graph at random by a small amount, we can "ruin" any specific graph instances which achieve the worst-case runtime. Hence in expectation we can analyse how fast these "ruined" worst case graphs run on our algorithms. If only small perturbations of these worst-case graphs weaken the lower bounds significantly, then it must be that these instances are very artificial/unlikely to occur in practice.

Smoothed analysis has been applied to the hitting time of dynamic graphs giving a $k$-smoothed lower bound of $\Omega (n^{5/2}/(\sqrt{k}\log{n}))$.

Over the following $20$ weeks or so, I will be attempting to explore other problems on dynamic graphs which have vastly different lower bounds to $k$-smoothed lower bounds. Many different smoothing models exist and so an exploration of those will also be conducted. Finally, I may specialize the dynamic graph class depending on the problem being studied.

AtCoder shenanigans

Here is the problem, and below is the raw transcript of my thoughts. Obs: It is always optimal to use all $K$ moves. intuitive proof. While...