Tom's blog: December 2021

Monday, December 13, 2021

Flooding requires $\Omega(n\log{k}/k)$ rounds in a $k$-smoothed adversarial dynamic graph

We have presented the $O(n\sqrt{\log{n}/k})$ upper bound for flooding to complete in a previous blog post. We now complete the adversarial $k$-smoothed analysis of flooding by presenting the corresponding lower bound for flooding against an adaptive adversary which was given by [cite].

We adapt the spooling graph used by the oblivious adversary, to construct the adaptive spooling graph. The idea is the same as in the oblivious case: all informed nodes will be in the left star, and all uniformed in the right star. Since the adversary is adaptive, he can keep this property even with smoothing.

Def 1: We define the dynamic adversarial spooling graph $\mathcal{G}=(V,\mathcal{E})$ as follows.

For every round $r$, let the set of informed nodes be $I_r$, and $v_r$ the minimum uninformed vertex. Then $G_r=(V,\mathcal{E}(r))$ where

$$\mathcal{E}(r)=(I_r\setminus \{1\}) \times \{1\} \cup (\overline{I_r}\setminus v_r) \times \{v_r\} \cup \{\{1,v_r\}\}$$

We want to use this adaptive version of the spooling graph to argue that flooding must use some number of rounds $R$ to complete with probability at least $1/2$. Our strategy will be to bound the expected number of newly informed nodes per round and then use Markov's inequality to finish the proof.

Goal: To upper bound the value of $\mathbb{E}[I_r]$.

We know that at least one node, namely, $v_r$ will learn the flooded message in round $r$. However, the smoothed edges between informed nodes in the left star and uninformed nodes in the right star will cause more nodes to become informed.

We introduce the indicator variable $X_{u,r}$ where:

$X_{u,r}=1$ if there is an edge between $u\in \overline{I_{r-1}}\setminus \{v_{r-1}\}$ and a vertex in the right star.
$X_{u,r}=0$ if there is no edge between $u\in \overline{I_{r-1}}\setminus \{v_{r-1}\}$ and a vertex in the right star.

Then we have $\mathbb{E}[I_r] \leq |I_{r-1}| + \mathbb{E}[\sum_{u\in \overline{I_{r-1}}\setminus \{v_{r-1}\}}{X_{u,r}}]+1$ since we may have two distinct edges between the left and right spools pointing to the same node in the right spool. By linearity of expectation

$$\mathbb{E}[I_r] \leq |I_{r-1}|+\sum_{u\in \overline{I_{r-1}}\setminus \{v_{r-1}\}}{\mathbb{E}[X_{u,r}]}+1$$

$$\leq |I_{r-1}| + c_2k(|\overline{I_{r-1}}|-1)|I_{r-1}|/n^2+1$$

$$\leq |I_{r-1}|\cdot(1+c_2k/n) + 1$$

We achieve the last inequality using the fact that $|I_{r-1}|\leq n$. (Don't understand this next part), but using the linearity of this term, we can write

$$\mathbb{E}[I_{r}]\leq \mathbb{E}[I_{r-1}]\cdot(1+c_2k/n) + 1$$

Now we have a recurrence which we want to upper bound. We let $\mathbb{E}[I_{r}]=A_{r}$ to abstract away from the exact meaning of the recurrence. So we have

$$A_{r+1}\leq A_{r}\cdot(1+c_2k/n) +1$$

Observe that $A_{r+1}\geq A_{r}+1$ because of connectivity of our dynamic graph. We split the analysis of $A_r$ into two cases.

Case 1: $A_r\leq n/c_2k$. Then $A_r\leq 1$ which means the additive term dominates the $A_{r}c_2k/n$ multiplicative term. Hence we have $A_{r}+1\leq A_{r+1}\leq A_{r} + 2$. Because $A_1=1$, we have $\Theta(n/k)$ iterations where stage $1$ applies.

Case 2: $A_r\geq n/c_2k$. Then $A_{r}c_2k/n \geq 1$, so the multiplicative term is larger than the additive term. Therefore we have $A_{r+1}\leq A_r(1+2c_2k/n)$ for $r\geq r_0$. Hence we can write

$$A_{r_0+r_1}\leq A_{r_0}(1+2c_2k/n)^{r_1}$$

where $r_0$ is the maximum $r$ such that $A_{r_0}\leq n/c_2k$. Using $A_{r_0}=\Theta(n/k)$ we have

$$A_{r_0+r_1}\leq \Theta(n/k)(1+2c_2k/n)^{r_1}$$

Letting $r_1=\alpha n\log{k}/k$, where $\alpha$ is small enough, we have a round $A_{r_0+r_1}\leq n/10$.

Then there exists a round, namely $r_0+r_1$, where the expectation is bounded by $n/10$, which means that flooding has not completed at this round in expectation. Now using Markov's inequality we get

$$\mathbb{P}[|I_{r_0+r_1}|\geq n] \leq \frac{\mathbb{E}[I_{r_0+r_1}]}{n}\leq 1/10$$

Hence $\mathbb{P}[|I_{r_0+r_1}|< n] \geq 9/10$, which implies a $k$-smoothed lower bound of $\Omega(n\log{k}/k)$ for flooding.

Theorem: Fix $0 < k \leq n/16$, not necessarily an integer. For any $k$-smoothed adaptive dynamic graph, for the flooding process to succeed with probability at least $1/2$ it must run for $\Omega(\min\{n, n\log{k}/k\})$ rounds.

Saturday, December 11, 2021

Flooding in $O(n\sqrt{\log{n}/k})$ on a $k$-smoothed adaptive dynamic graph

We have previously shown an $O(n^{2/3}\log{n}/k^{1/3})$ upper bound for flooding on $k$-smoothed connected dynamic graphs. However, our adversary was constrained to select the dynamic graph sequence $G_1,G_2,...$ without knowing the outcomes of $k$-smoothing. In this post we present [MPS20]'s work which extends this analysis to an adaptive adversary, i.e. one who can construct $G_r$ based on the previous outcomes of $k$-smoothing on $G_1,...,G_{r-1}$.

The intuitive idea is similar to what we have seen before:

In the first $r$ rounds of flooding we are guaranteed to have $r$ nodes learn the flooded message.
We then show that it is likely for some node in a support sequence to receive the message in the next $r$ rounds.
If a node in the support sequence has the flooded message, then it is likely that our target node $u$ will receive the message within the next $r$ rounds.

However, since our adversary is adaptive we don't have the last bullet point because the adversary can see which nodes have the flooded message and can make it hard for $u$ to receive the message. The adaptive adversary can't do anything about the first two bullet points though since these are a consequence of the connectivity of $\mathcal{G}$ only.

To solve the issue of the last bullet point no longer being a valid strategy, we ask how likely it is for a smoothed edge to appear in between the set of $r$ nodes who know the message after the first $r$ rounds and $u$, since the adversary can't control this.

So consider the set $I$ of informed nodes from the first $r$ rounds of flooding. We know that $|I|=r$ by the connectivity of $\mathcal{G}$. Before we looked at the potential edge set $I\times S_l$ for a support sequence set $S_l$, but now we look at the potential edge set $I\times \{u\}$ where $u$ is our target node.

The probability that at least one edge from $I\times \{u\}$ is in the smoothed version of $\mathcal{G}$ is at least $c_1k|I\times \{u\}|/n^2 = c_1kr/n^2$. Therefore, the probability of no edge from $I\times \{u\}$ being in the smoothed version of $\mathcal{G}$ across $r$ rounds is at most $(1-c_1kr/n^2)\leq e^{-c_1r^2k/(n^2)}$ which is upper bounded by $1/n^2$ for $r=2n\sqrt{\log{n}/(c_1k)}$.

Thus the probability that $u$ does not receive the flooded message after $2n\sqrt{\log{n}/(c_1k)}$ rounds is at most $1/n^2$, which implies that $u$ will receive the flooded message with high probability. A union bound over all vertices in $\mathcal{G}$ gives us that flooding will complete in $O(n\cdot\sqrt{\log{n}/k})$ rounds with high probability.

Theorem: Given a dynamic graph $\mathcal{G}=(V,\mathcal{E})$ constructed by an adaptive adversary, some $1\leq k \leq n/16$, flooding completes in $O(n\sqrt{\log{n}/k})$ rounds with high probability on the $k$-smoothed version of $\mathcal{G}$.

[MPS20] Uri Meir, Ami Paz, and Gregory Schwartzman. “Models of Smooth-

ing in Dynamic Networks”. In: CoRR abs/2009.13006 (2020).

arXiv: 2009.13006. url: https://arxiv.org/abs/2009.13006.

Thursday, December 9, 2021

$O(n^{2/3}\log{n}/k^{1/3})$ $k$-smoothed algorithm for flooding.

[Din+15] present an algorithm which runs in $O(n^{2/3}\log{n}/k^{1/3})$ rounds with high probability on a $k$-smoothed dynamic graph.

At the core of their argument is the concept of a support sequence, i.e. a temporally connected subgraph across some number of rounds. The motivation being that if any node in the support sequence receives the flooded message, then it is likely to later be propagated to the target node in the support sequence. In essence a support sequence is just a "fat" target the flooded message needs to hit for $u$ to get the message within some bounded number of rounds dependent on the size of the support sequence. We use this to show that each node receives the flooded message with high probability.

Def 1: Fix integers $t$ and $l$ s.t. $1\leq l < t$, a dynamic graph $\mathcal{G}=(V, \mathcal{E})$, and a target node $u\in V$. A $(t,l)$-support sequence for $u$ in $\mathcal{G}$ is a sequence $S_0,S_1,...,S_l$ such that

$S_0=\{u\}$,
for every $i\in [0,l]$ we have $S_i \subseteq V$,
for every $i\in [1,l]$ we have $S_{i-1}\subset S_i$ and $S_i\setminus S_{i-1}=\{v\}$ for some $v\in V$ such that $v$ is adjacent to at least one node of $S_{i-1}$ in $G_{t-i}$.

Since we are interested in connected dynamic graphs, it seems reasonable that any such dynamic graph has a $(t,l)$-support sequence.

Lemma 1: Given a connected dynamic graph $\mathcal{G}=(V,\mathcal{E})$, some node $u\in V$ and rounds $t$ and $l$ such that $1\leq l < t$, there exists a $(t,l)$-support sequence for $u$ in $\mathcal{G}$.

proof. We can construct such a support sequence inductively as follows. Let $S_0=\{u\}$. Now suppose we have validly constructed $S_0,...,S_i$. We define $S_{i+1}=S_i \cup \{v\}$, where $\{x,v\}\in \mathcal{E}(t-i)$ and $v\in V\setminus S_i$ and $x\in S_i$. Since $\mathcal{G}$ is connected, such a node $v$ must exist, otherwise $S_i$ would be disconnected from the rest of the graph.

Suppose we have a connected dynamic graph $\mathcal{G}$, from lemma 1, there exists a $(t,l)$-support sequence $S_0,...,S_l$ for a target node $u$ in $\mathcal{G}$. If some $v\in S_l$ receives the flooded message in round $t-l$, then we are certain that $u$ will receive the flooded message no later than round $t$. However, since $\mathcal{G}$ will be smoothed, it is possible for our support sequence to get "messed up". We want to show that this is unlikely. Take the following example support sequence.

Here we let $v_i=S_i\setminus S_{i-1}$ and let $v_0=u$. An edge between nodes $v_i$ and $v_j$ for $j<i$ indicates that $\{v_i, v_j\}\in\mathcal{E}(t-i)$. Suppose $v_7$ begins with the flooded message. Node $u$ will receive the message if a $\{v_7,u\}$-journey exists after $k$-smoothing. What is the probability of this in general? We will use a handy lemma that we showed in a previous blog post.

Handy Dandy Lemma: There exists constant $c_2 > 0$ such that the following holds. Consider any graph $G = (V, E)\in\text{allowed}(\mathcal{G}_{\text{conn}})$. Consider also any nonempty set $S \subseteq E$ of edges in the graph and smoothing value $k \leq n/16$. Then with probability at most $c_2k|S| /n^2$, the $k$-smoothed graph of $G$ removes an edge from $S$.

So let's say that $v_i$ has the message, and that $N(v_i)=\{v_j\text{ }|\text{ }\{v_i,v_j\}\in\mathcal{E}(t-i)\text{ and }j<i\}$ is the neighborhood of $v_i$ in $G_{t-i}$. For the message not to proceed in the support sequence, then all $|N(v_i)|$ edges need to be removed by smoothing. Using the handy dandy lemma we know that the probability that $v_i$ can't pass its message on to any $N(v_i)$ is at most $(c_2k/n^2)^{|N(v_i)|}\leq c_2k/n^2$. We can then take a union bound over the up to $l$ rounds needed for the message to reach the target node $u$. This gives us a probability of at most $c_2kl/n^2$ for $u$ not receiving the flooded message.

If we choose $l=n^{2/3}/k^{1/3}$, we get $c_2 k^{2/3}/n^{4/3}\leq 1/4$, which holds for sufficiently large $n$ and $k\leq n$. Thus the probability that $u$ receives the flooded message is at least $3/4$. This gives us the following lemma.

Lemma 2: Fix a connected dynamic graph $\mathcal{G}=(V,\mathcal{E})$ and any $(t,l)$-support sequence $S_0,...,S_l$ where $S_0=\{u\}$. If any $v\in S_l$ receives the flooded message by round $t-l$, then $u$ receives the flooded message with probability at least $3/4$.

Lemma 2 assumes that a node in the support sequence set $S_l$ already has the flooded message by round $t-l$. However, we don't know that this is the case so let's bound the probability of that happening.

We know that in round $l=n^{2/3}/k^{1/3}$ at least $l$ nodes know the message because of the connectivity of $\mathcal{G}$. Thus we have $l|S_l|$ possible edges between nodes that know the message and nodes in $S_l$. The probability that any of these edges are present in a round $r\geq l$ is at least $c_1kl|S_l|/n^2$. If any of these edges are present from rounds $l$ to $t-l$, then $S_l$ receives the flooded message by round $t-l$. Thus we have a probability of at most $(1-c_1kl|S_l|/n^2)^{t-2l}$ that none of these edges are present in all rounds from $l$ to $l-t$.

Choosing $t=\alpha n^{2/3}/k^{1/3}$ for $\alpha \geq 3$, we have probability at most

$$(1-c_1k^{1/3}|S_l|/n^{2/3})^{(\alpha-2)n^{2/3}k^{1/3}}\leq e^{-c_1(\alpha-2)}$$

which for a suitable choice of the constant $\alpha$ is at most $1/4$.

So if either the flooded message does not reach $S_l$ by round $t-l$ or the message does not reach its target in the support structure, $u$, then $u$ will not receive the message. So $u$ does not receive the message with probability at most $1/4 + 1/4 = 1/2$. This gives us the following key result.

Lemma 3: Fix a connected dynamic graph $\mathcal{G}=(V,\mathcal{E})$, any node $u\in V$, and a consecutive interval of $\alpha n^{2/3}/k^{1/3}$ rounds, where $\alpha\geq 3$ is a constant. For $k\leq n/16$, node $u$ receives the flooded message in the $k$-smoothed graph of $\mathcal{G}$ with probability at least $1/2$.

Now that we know each node has a constant probability of receiving the flooded message after $\alpha n^{2/3}/k^{1/3}$ rounds, we can give a high probability guarantee that every node receives the flooded message over a larger round interval.

If we repeat lemma 3 $\Theta(\log{n})$ times, we get that node $u$ does not receive the flooded message with probability at most $2^{-\log{n}}$, and thus it will receive the message with probability at least $1-\frac{1}{2^{\log{n}}}$, i.e. with high probability since $2^{\log{n}}=O(n)$.

Theorem 1: For any connected dynamic graph $\mathcal{G}$ and smoothing value $k\leq n/16$, flooding completes in $O(n^{2/3}\log{n}/k^{1/3})$ rounds on the $k$-smoothed version of $\mathcal{G}$ with high probability.

[Din+15] Michael Dinitz et al. “Smoothed Analysis of Dynamic Networks”.

In: Proceedings of the 29th International Symposium on Distributed

Computing - Volume 9363. DISC 2015. Tokyo, Japan: Springer-

Verlag, 2015, pp. 513–527. isbn: 9783662486528. doi: 10.1007/

978- 3- 662- 48653- 5_34. url: https://doi.org/10.1007/978-3-662-48653-5_34.

Tuesday, December 7, 2021

Self-reducibility

When studying the complexity of problems, we often focus on decision problems. For example, NP-complete problems must be decision problems since they must be in NP. What about problems which aren't answered by a simple yes or no? What about finding the minimum vertex cover, or finding an actual $3$-coloring?

Well it turns out that most of the time, if we have an oracle TM with access to the decider for $X$, we can solve the search/optimisation version of $X$. This means that the search version of $X$ cook reduces to the decision version of $X$.

We take the search problem of finding a $3$-coloring of a graph $G=(V,E)$ as an example. Suppose we have access to an oracle which decides if its input graph has a valid $3$-coloring. We want to use this oracle to actually construct a $3$-coloring on $G$ in polynomial time.

First, we should check if $G$ has a valid $3$-coloring to begin with. If it does not then we halt. Otherwise, we can start building the $3$-coloring.

Let's impose an ordering on the vertices, $v_1<v_2<...<v_n$. We will color these vertices one by one starting from $v_1$. Suppose we have colored all vertices $v_1,...,v_{i-1}$ in a way that can be extended to the entire graph $G$. Let $C_i$ be the set of valid colors for $v_i$, for each $c\in C_i$ we assign $c$ to $v_i$. Now we need a way to test if our new coloring can be extended to the entire graph $G$.

Since our oracle takes as input a graph and that is all, we can't just call the oracle on our graph since it will just ignore the colors we have assigned to it. We need a way to encode the choices of colors into a new graph $G'$ which is $3$-colorable iff the partial coloring can be extended to the entire graph $G$.

We propose the following construction for $G'$: merge all vertices with the same color into one super-vertex whose neighborhood is the union of the neighborhoods of all the vertices being merged. Uncolored vertices are left as in $G$.

Claim: The partial coloring over $v_1,...,v_i$ is extendable to the entire graph $G$ iff $G'$ is $3$-colorable.

proof. Suppose $v_1,...,v_i$ is extendable to the entire graph $G$. Then uncolored vertices in $G$ have a valid assignment of colors. If we construct $G'$, then all uncolored vertices are still adjacent to the same set of colors, so the coloring is still valid, hence $G'$ is $3$-colorable. Suppose $G'$ is $3$-colorable. Then if we undo our construction, we can still color the uncolored vertices since they will be adjacent to the same colors as in $G'$.

Thus we can simply construct $G'$ and call the oracle on it. If the oracle returns yes, then we fix $c$ as the color assigned to $v_i$. Otherwise we move to the next valid color in $C_i$ and do the same. Note that for some color in $c\in C_i$, the assignment of $c$ to $v_i$ must be a valid assignment for a $3$-coloring over the entire graph $G$, since we know that the coloring over $v_1,...,v_{i-1}$ is extendable to the entire graph $G$ and we know that $G$ is $3$-colorable to begin with.

This informal walkthrough demonstrates an example of self-reducibility. In this case we call the oracle $O(n)$ times and the construction of $G'$ can be done in polynomial time. Pretty neat!

Sunday, December 5, 2021

A huge sum

Problem E in the latest atCoder beginner round was quite interesting. The problem statement is quite simple, we are asked to compute

$$\sum_{i=1}^{N}{\lfloor \frac{N}{i} \rfloor}$$

for $1\leq N \leq 10^{12}$.

This problem would be easy if $N$ wasn't so huge. But since it is, we need a faster way to compute the sum than just doing it directly.

The first thing to notice is that all $\lfloor \frac{N}{2}\rfloor < i \leq N$ have $\lfloor \frac{N}{i}\rfloor =1$. More generally, $\lfloor \frac{N}{k+1}\rfloor < i \leq \lfloor \frac{N}{k}\rfloor$ have $\lfloor\frac{N}{i}\rfloor = k$. So our strategy becomes to group equal values together.

If we do this for say $X$ groups we will have

$$N-((N-\lfloor \frac{N}{2}\rfloor)+(\lfloor \frac{N}{2}\rfloor - \lfloor \frac{N}{3}\rfloor) + ... + (\lfloor \frac{N}{X}\rfloor - \lfloor \frac{N}{X+1}\rfloor))=\lfloor \frac{N}{X+1}\rfloor$$

values of $i$ for which we have still not computed the value of $\lfloor \frac{N}{i}\rfloor$ for. If we choose $X=\lfloor\sqrt{N}\rfloor$, then we have $O(\sqrt{N})$ groups and $\lfloor \frac{N}{\sqrt{N}+1}\rfloor = O(\sqrt{N})$ indices $i$ for which we can directly compute their contribution.

Thus we compute the sum using the following.

$$\sum_{k=1}^{\lfloor\sqrt{N}\rfloor}{k\cdot\left(\lfloor \frac{N}{k}\rfloor - \lfloor \frac{N}{k+1}\rfloor\right)} + \sum_{i=1}^{\lfloor \frac{N}{\sqrt{N}+1}\rfloor}{\lfloor \frac{N}{i}\rfloor}$$

This is done in $O(\sqrt{N})$ which is fast enough for the given constraints.

Pages