A Short Proof of the Nash-Williams Partition Theorem

Notations

  1. \(\mathbb{N}\) – the set of natural numbers;
  2. \(\binom{M}{k}\) – the family of all subsets of \(M\) of size \(k\);
  3. \(\binom{M}{<\omega}\) – the family of all finite subsets of \(M\);
  4. \(\binom{M}{\omega}\) – the family of all infinite subsets of \(M\);

The infinite Ramsey theorem, in its simplest form, states that for every partition \(\binom{\mathbb{N}}{k} = \mathcal{F}_1 \sqcup \dots \sqcup \mathcal{F}_r\), there exists an infinite set \(M\subset \mathbb{N}\) such that \(\binom{M}{k}\subset \mathcal{F}_i\) for some \(i\in [r]\). The Nash-Williams partition theorem can be seen as a strengthening of the infinite Ramsey theorem, one that considers a partition of a subset of \(\binom{\mathbb{N}}{<\omega}\).

Notations

  1. \(\mathcal{F}\restriction M\) – \(\mathcal{F}\cap 2^M\), that is, the set \(\{s\in\mathcal{F} : s\subset M\}\).
  2. \(s \sqsubset t \), where \(s,t\) are subsets of \(\mathbb{N}\) – \(s\) is an initial segment of \(t\), that is \(s = \{n\in t : n \le \max s\}\).

Definition Let \(\mathcal{F} \subset \binom{\mathbb{N}}{<\omega}\).

  1. \(\mathcal{F}\) is Ramsey if for every partition \(\mathcal{F}=\mathcal{F}_1\sqcup \dots\sqcup\mathcal{F}_r\) and every \(M\in\binom{\mathbb{N}}{\omega}\), there is \(N\in\binom{M}{\omega}\) such that \(\mathcal{F}_i\restriction N = \emptyset\) for all but at most one \(i\in[r]\).
  2. \(\mathcal{F}\) is a Nash-Williams family if for all \(s, t\in\mathcal{F}, s\sqsubset t \implies s = t\).

Theorem [NASH-WILLIAMS 1965] Every Nash-Williams family is Ramsey.

The proof presented here is based on the proof given by Prof. James Cummings in his Infinite Ramsey Theory class. The purpose of this rewrite is to have a proof that resembles that of the infinite Ramsey theorem.

Notation Let \(s\in\binom{\mathbb{N}}{<\omega}\) and \(M\in\binom{\mathbb{N}}{\omega}\). Denote $$[s, M] = \left\{t \in \binom{\mathbb{N}}{<\omega} : t \sqsubset s \text{ or } (s \sqsubset t \text{ and } t\setminus s \subset M)\right\}.$$

Definition Fix \(\mathcal{F}\subset \binom{\mathbb{N}}{<\omega}\) and \(s\in \binom{\mathbb{N}}{<\omega}\).

  1. \(M\) accepts \(s\) if \([s, M]\cap \mathcal{F}\neq \emptyset\) and \(M\) rejects \(s\) otherwise;
  2. \(M\) strongly accepts \(s\) if every infinite subset of \(M\) accepts \(s\);
  3. \(M\) decides \(s\) if \(M\) either rejects \(s\) or strongly accepts it.

We list some properties that encapsulate the combinatorial characteristics of the definitions above.

Properties

  1. If \(M\) decides (or strongly accepts, or rejects) \(s\) and \(N\subset M\), then \(N\) decides (respectively strongly accepts, rejects) \(s\) as well.
  2. For every \(M\in\binom{\mathbb{N}}{\omega}\) and \(s\in\binom{\mathbb{N}}{<\omega}\), there is \(N_1\in\binom{M}{\omega}\) deciding \(s\): indeed, either some infinite subset of \(M\) rejects \(s\), or else every infinite subset of \(M\) accepts \(s\), which means \(M\) itself strongly accepts \(s\). Consequently, by handling the finitely many subsets of \(s\) one at a time and passing to further infinite subsets, there is \(N_2\in\binom{M}{\omega}\) deciding every subset of \(s\).

Proof of Theorem It is enough to show that if \(\mathcal{F} = \mathcal{F}_1\sqcup \mathcal{F}_2\), then for every \(M\in\binom{\mathbb{N}}{\omega}\), there is \(N\in \binom{M}{\omega}\) such that \(\mathcal{F}_i \restriction N = \emptyset\) for some \(i\in[2]\).

We are going to use \(\mathcal{F}_1\) instead of \(\mathcal{F}\) in the definitions of “accept”, “reject”, “strongly accept” and “decide”. Find \(N\in \binom{M}{\omega}\) that decides \(\emptyset\). If \(N\) rejects \(\emptyset\), by definition \(\mathcal{F}_1\restriction N = [\emptyset, N]\cap \mathcal{F}_1 = \emptyset\). Otherwise \(N\) strongly accepts \(\emptyset\).

Inductively, we build a decreasing sequence of infinite sets \(N \supset N_1 \supset N_2\supset \dots \), an increasing sequence of natural numbers \(n_1, n_2, \dots\), and maintain that \(n_i\in N_i, n_i < \min N_{i+1}\) and that \(N_i\) strongly accepts every \(s\subset \{n_j : j < i\}\). Initially, we take \(N_1 = N\) as \(N\) strongly accepts \(\emptyset\).

A mental picture of how \(n_i\) and \(N_i\) look.
Suppose \(N_1 \supset \dots \supset N_i\) and \(n_1 < \dots < n_{i-1}\) have been constructed. Applying the following lemma with \(M = N_i\) and \(s = \{n_j : j < i\}\) spits out \(m\) and \(N\), which are exactly what we need for \(n_i\) and \(N_{i+1}\) to finish the inductive step.

Lemma Suppose \(M\in\binom{\mathbb{N}}{\omega}\), \(s\in\binom{\mathbb{N}}{<\omega}\) and \(\max s < \min M\). If \(M\) strongly accepts every subset of \(s\), then there are \(m \in M\) and \(N \in \binom{M}{\omega}\) such that \(m < \min N\) and \(N\) strongly accepts every subset of \(s\cup \{m\}\).

Proof of lemma We can build \(M = M_0 \supset M_1\supset M_2 \supset \dots\) such that for every \(i\), \(m_i := \min M_i < \min M_{i+1}\) and \(M_{i+1}\) decides every subset of \(s\cup \{m_i\}\). It might happen that \(M_{i+1}\) rejects some subset of \(s\cup \{m_i\}\). However, we claim that this can happen only finitely many times.

Otherwise, by the pigeonhole principle, there is \(t\subset s\) such that \(I = \{i : M_{i+1} \text{ rejects } t\cup\{m_{i}\}\}\) is infinite. Let \(M' = \{m_i : i\in I\}\). Note that \([t, M'] \subset \bigcup_{i\in I} [t\cup\{m_i\}, M_{i+1}]\), and so \([t,M']\cap \mathcal{F}_1\subset \bigcup_{i\in I} \left([t\cup\{m_i\}, M_{i+1}]\cap\mathcal{F}_1\right) = \emptyset\). Hence \(M'\subset M\) rejects \(t\subset s\), which is a contradiction.

Now we pick one \(i\) such that \(M_{i+1}\) strongly accepts every subset of \(s\cup\{m_i\}\), and it is easy to check that \(m = m_i\) and \(N = M_{i+1}\) suffice. QED for lemma.

Finally, we take \(N_\infty = \{n_1, n_2, \dots\}\). For any \(s\in\binom{N_\infty}{<\omega}\), there is \(i\) such that \(s\subset \{n_1, \dots, n_{i-1}\}\). Note that \(N_i\) strongly accepts \(s\) and \(N_\infty\subset N_i\). Therefore \(N_\infty\) (strongly) accepts \(s\), that is, \([s, N_\infty]\cap \mathcal{F}_1 \neq \emptyset\); say \(t\in [s, N_\infty]\cap \mathcal{F}_1\). If \(s\) were in \(\mathcal{F}_2\), then \(s\) and \(t\) would be two elements of the Nash-Williams family \(\mathcal{F} = \mathcal{F}_1 \sqcup \mathcal{F}_2\) with one an initial segment of the other, forcing \(s = t\in\mathcal{F}_1\), a contradiction. Hence \(\mathcal{F}_2\restriction N_\infty = \emptyset\). QED.


An Upper Bound on the Stirling Number of the Second Kind

We shall show an upper bound on the Stirling number of the second kind, a byproduct of a homework exercise from the Probabilistic Combinatorics course offered by Prof. Tom Bohman.

Definition A Stirling number of the second kind (or Stirling partition number) is the number of ways to partition a set of \(n\) objects into \(k\) non-empty subsets and is denoted by \(S(n,k)\).

Proposition For all \(n, k\), we have $$S(n,k) \leq \frac{k^n}{k!}\left(1-(1-1/k)^n\right)^k.$$

Proof Consider a random bipartite graph with partite sets \(U:=[n], V:=[k]\), where each vertex \(u\in U\) is (independently) joined to exactly one vertex of \(V\) chosen uniformly at random. Let \(X\) be the set of non-isolated vertices in \(V\). It is easy to see that $$\operatorname{Pr}\left(X=V\right) = \frac{\text{number of surjections from }U\text{ to }V}{k^n} = \frac{k!S(n,k)}{k^n}.$$

On the other hand, we claim that for any \(\emptyset \neq A \subset [k]\) and \(i \in [k]\setminus A\), $$\operatorname{Pr}\left(i\in X \mid A\subset X\right) \leq \operatorname{Pr}\left(i\in X\right).$$ Note that the claim is equivalent to $$\operatorname{Pr}\left(A\subset X \mid i\notin X\right) \geq \operatorname{Pr}\left(A\subset X\right).$$ Consider the same random bipartite graph with \(V\) replaced by \(V':=[k]\setminus \{i\}\) and let \(X'\) be the set of non-isolated vertices in \(V'\). The claim is justified since $$\operatorname{Pr}\left(A\subset X\mid i\notin X\right) = \operatorname{Pr}\left(A\subset X'\right) \geq \operatorname{Pr}\left(A\subset X\right),$$ where the last inequality follows by coupling: redirect every vertex of \(U\) that chose \(i\) to a uniformly random vertex of \([k]\setminus\{i\}\); any \(a\in A\) that was hit before is still hit.

Set \(A:=[i-1]\) above for \(i = 2, \ldots, k\). Using the multiplication rule and telescoping the conditional probabilities, we obtain $$\begin{eqnarray}\operatorname{Pr}\left(X=V\right) &=& \operatorname{Pr}\left(1\in X\right)\operatorname{Pr}\left(2\in X \mid [1]\subset X\right)\ldots \operatorname{Pr}\left(k\in X\mid [k-1]\subset X\right)\\ & \leq & \operatorname{Pr}\left(1\in X\right)\operatorname{Pr}\left(2\in X\right)\ldots\operatorname{Pr}\left(k\in X\right) \\ & = & \left(1-(1-1/k)^n\right)^k,\end{eqnarray}$$ where the last equality uses \(\operatorname{Pr}(i\in X) = 1-(1-1/k)^n\) for every \(i\). Combining with the first display gives the proposition. QED.
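
As a quick numerical sanity check (my own addition, not part of the original exercise), one can compare the bound with exact values of \(S(n,k)\) computed from the standard recurrence \(S(n,k)=kS(n-1,k)+S(n-1,k-1)\). A minimal sketch in Python:

from math import factorial

def stirling2(n, k):
    # Exact S(n, k) via the recurrence S(n, k) = k*S(n-1, k) + S(n-1, k-1).
    table = [[0] * (k + 1) for _ in range(n + 1)]
    table[0][0] = 1
    for i in range(1, n + 1):
        for j in range(1, k + 1):
            table[i][j] = j * table[i - 1][j] + table[i - 1][j - 1]
    return table[n][k]

def upper_bound(n, k):
    # The bound from the proposition: (k^n / k!) * (1 - (1 - 1/k)^n)^k.
    return k ** n / factorial(k) * (1 - (1 - 1 / k) ** n) ** k

for n in range(1, 13):
    for k in range(1, n + 1):
        assert stirling2(n, k) <= upper_bound(n, k) * (1 + 1e-9)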


Fun with Hex (Cont.)

In the previous post Fun with Hex, I asserted without proof that the Hex Game can never end in a draw. In Jiri Matousek's An Invitation to Discrete Mathematics, I found an elegant proof of this assertion, which requires only a little bit of elementary graph theory. The same idea can also be used to prove Sperner's Lemma. Moreover, later in this post I will show how to push the idea further to discover something new. The only bit of elementary graph theory we are going to use is the Handshaking lemma.

Handshaking Lemma: Every finite undirected graph has an even number of vertices with odd degree.

To prove the lemma, simply observe that the sum of the degrees of all vertices is equal to twice the number of edges, which is even.

Now let us review the Hex Game. Recall two versions of the Hex Game.

Voronoi duality

In the original version of the game, the players color the hexagons of the game board (illustrated in black), and each player tries to link the opposite sides of the board with a connected chain of hexagons of his or her color.

Alternatively, one can play with the Voronoi dual of the hexagon game board (illustrated in red). This time, two players color the vertices and compete to link the opposite sides with a path of vertices of their own color. For instance, player one would try to form a red path from the northwest side to the southeast side, while player two would strive to build a blue one from the northeast side to the southwest side.

We can tilt and rescale the dual graph without changing the game as follows.

Dual graph which can be easily drawn

In this new graph, the first player wants to build a red path from the west to the east and the second player wants to build a blue path from the north to the south. Furthermore, for the purpose of the proof, we add four auxiliary vertices, one on each side, and color each of them according to the player to whom that side belongs.

Dual graph with four auxiliary vertices, one on each side

As shown in the graph above, the vertices on the west and the east are red and the ones on the north and the south are blue. Our goal is to prove that no matter how the remaining vertices are colored with red and blue, there is always a monochromatic path connecting two of the four auxiliary vertices. For the coloring below, there is a red path from W to E.

A red path from W to E

Proof: Assume, for the sake of contradiction, that some coloring does not form any red path from W to E or blue path from N to S. First, for every red vertex unreachable by any red path starting from W, we change it to green. Similarly, for every blue vertex unreachable by any blue path starting from N, we change it to green, too.

Under the assumption that there is no red path from W to E and no blue path from N to S, the vertices E and S are both changed to green. Now we have a triangulated planar graph whose four vertices on the external face, W, N, E, S, are colored red, blue, green, green respectively.

Lemma: Given a 3-coloring of a triangulated planar graph whose four vertices on the external face, say W, N, E, S, are colored red, blue, green, green respectively, there is an internal triangular face whose vertices receive three different colors.

By this lemma, we can conclude that there exists an internal triangular face, say ABC, such that A is red, B blue and C green. This implies that there is a red path from W to A and a blue path from N to B. If the vertex C were originally red, it would still be red, because the edge between A and C would extend the red path from W through A to reach C. For the same reason, if C were originally blue, it would still be blue. Either way, this gives a contradiction! To finish the proof for the Hex Game, we only need to prove the lemma.

Proof of Lemma: For each face of the triangulated planar graph, including the external face, we put a vertex in it. Two vertices are adjacent if and only if their corresponding faces share a common edge whose endpoints are colored red and blue.

3-coloring of a triangulated planar graph and its corresponding 'face graph'

Note that the degree of the vertex in the external face is 1, since the only red-blue edge on the external face is the one between W and N. By the handshaking lemma, there is a vertex, say \(v\), in an internal face, say \(F\), whose degree is odd. It is easy to see that the only coloring of the vertices of \(F\) giving \(v\) odd degree is one using 3 different colors. Q.E.D.

Now I am going to show something new. The idea is to consider another game board, which looks like this:

Symmetric Hex Game of size 3

The idea is to get a symmetric board. To build such a game board, one can start with a hexagon in the center and then add several layers of hexagons around it. In the picture above, the hexagon labeled 1 is the one we start with; we call it the 1st layer. Then we add 6 hexagons labeled 2 around it to form the 2nd layer, and another 12 hexagons labeled 3 around the 2nd layer. In general, to build a symmetric Hex board of size \(n\), one only needs to repeat the process \(n\) times.

For this symmetric Hex game board, we have three pairs of opposite sides. One reasonable goal, analogous to the goal of the Hex Game, is to build a monochromatic path that links two opposite sides. However, the two players can cooperate to tie the game in this case.

Two players can cooperate to not link any pair of opposite sides

Therefore, for the symmetric Hex, the goal is to build a monochromatic connected component that touches two opposite sides or three nonadjacent sides. For simplicity, we call such a monochromatic connected component a winning component. In other words, if we label the six sides 1 through 6, both players want to use their own colors to connect sides 1 and 4, or sides 2 and 5, or sides 3 and 6, or sides 1, 3 and 5, or sides 2, 4 and 6. Again, we will only consider its dual variant, in the sense that each player colors the vertices of the following graph.

Dual variant of the symmetric Hex Game

Proposition: The Symmetric Hex Game can never end in a tie.

Proof: Assume, for the sake of contradiction, that there is a 2-coloring of the symmetric Hex that contains no winning component. Consider the graph with 4 extra vertices W, E, N, S which connect to the vertices of the original graph as follows.

Graph with 4 extra vertices

To be specific, W connects to all the vertices on side 5, E to the vertices on side 2, N to the vertices on sides 1 and 6, and S to the vertices on sides 3 and 4.

By the same argument we used in the proof of the Hex Game, there is either a red path from W to E or a blue path from N to S. In other words, we should have one of the following schematic pictures.

Five possibilities

By our assumption, it must look like one of the pictures on the second row. Without loss of generality, assume that we have the first picture on the second row and that the blue path connects vertices \(a\) and \(b\).

Now we change the way the extra 4 vertices connect to the vertices on the boundary as follows.

The 4 extra vertices connect to the vertices on the sides in a different way

Again, there is either a blue path from W to E or a red path from N to S. If there is a blue path from W to E, then this path combined with the blue path from \(a\) to \(b\) gives us a connected component that touches two opposite sides or 3 nonadjacent sides. Therefore it has to be the case that a red path connects N to S.

A red path on the blue path's right side

To summarize, whenever we have a blue path \(P_1\) that connects sides 4 and 6, there is a red path \(P_2\) connecting the same sides on its right side. By symmetry, there is a blue path \(P_3\) connecting sides 4 and 6 on \(P_2\)'s right side. We can keep going and get an infinite sequence of paths connecting sides 4 and 6 with alternating colors, which is impossible! Q.E.D.

Corollary For any 2-coloring of the symmetric Hex board of size \(n\), there is a monochromatic connected component of size at least \(2n+1\).

Proof By the symmetric Hex theorem, there is a monochromatic connected component that touches 2 opposite sides or 3 nonadjacent sides. In either case, the size of the component is at least \(2n+1\). Q.E.D.


Communicating Information through Randomness

This week Staren sent me a puzzle on WeChat. After we discovered a solution to the puzzle, I tried to track down its source. I found that it appeared in the blog post Yet another prisoner puzzle by Oliver Nash. The author seemed to be working at the quantitative trading company SIG at that time. Given that Staren works at JP Morgan, I guess this is one of many brain-teasers that were circulating among traders around 2010. Anyway, here is the version of the puzzle used in that blog post.

A room contains a normal 8 by 8 chessboard together with 64 identical coins, each with one “heads” side and one “tails” side. Two prisoners are at the mercy of a typically eccentric jailer who has decided to play a game with them for their freedom. The rules of the game are as follows.

The jailer will take one of the prisoners (let us call him the “first” prisoner) with him into the aforementioned room, leaving the second prisoner outside. Inside the room the jailer will place exactly one coin on each square of the chess board, choosing to show heads or tails as he sees fit (e.g. randomly). Having done this he will then choose one square of the chess board and declare to the first prisoner that this is the “magic” square. The first prisoner must then turn over exactly one of the coins and exit the room. After the first prisoner has left the room, the second prisoner is admitted. The jailer will ask him to identify the magic square. If he is able to do this, both prisoners will be granted their freedom.

These rules are explained to both prisoners before the game begins and they are allowed some time together to discuss their strategy. Of course the question is: what strategy should the prisoners adopt?

There is an extremely simple and clever solution that uses the XOR operation. However we will detour into the realm of linear algebra and information theory and come back to the elegant one-line solution at the end of the post.

In this puzzle, it doesn't matter that we have an 8 by 8 chessboard to place the coins on. All that matters is that we have \(n=64\) places to put the coins in. Also, it doesn't matter that the jailer declares a magic square. All that matters is that he picks a magic number from \(0, 1, \ldots, n-1\), which corresponds to the position of the magic square, and tells the first prisoner the number.

Henceforth, we can consider the following puzzle.

The jailer chooses an \(n\)-bit binary and picks a magic number from \(1, 2, \ldots, n\). The first prisoner flips at most one bit of the binary (not necessarily the magic bit) and shows the second prisoner the altered binary. What strategy should the prisoners adopt so that the second prisoner can tell the magic number?

I have changed the puzzle a little bit. In the original puzzle, the first prisoner must flip a bit. Now we allow him to do nothing if he wants to. We shall solve this weaker version of the puzzle first and come back to the original one.

Think of the set of all \(n\)-bit binaries as a metric space under the Hamming distance. I claim that such a strategy exists if and only if one can partition all the \(n\)-bit binaries into \(n\) groups, denoted \(A_1, \ldots, A_n\), such that for each \(k\) the 1-neighborhood of \(A_k\) covers all the \(n\)-bit binaries, i.e., any \(n\)-bit binary is at most 1 bit away from some binary in \(A_k\).

The reason is as follows. Given such a partition \(A_1, \ldots, A_n\), suppose the jailer chooses the initial binary \(x\) and the magic number \(k\). Because there is some binary \(y\in A_k\) such that \(x\) and \(y\) differ in at most 1 bit, the first prisoner can turn the initial binary into \(y\) by flipping at most 1 bit of \(x\). The second prisoner then takes this binary \(y\), looks up the partition \(A_1, \ldots, A_n\), and finds the index of the group containing \(y\). Conversely, one can argue that such a partition is necessary for a strategy to exist.

For a reason we will see later, we call a subset \(A\) of the \(n\)-bit binaries a covering code provided its 1-neighborhood covers all \(n\)-bit binaries. In terms of covering codes, our goal is to find \(n\) disjoint covering codes. By the way, in information theory, a set of binaries is called a code.

Notice that if we can find \(n+1\) disjoint covering codes, we get a strategy that works even if the jailer picks the magic number from \(1, 2, \ldots, n+1\). However, we cannot hope for more than \(n+1\) disjoint covering codes. Here is why. Suppose we have \(N\) disjoint covering codes. By the pigeonhole principle, there is a covering code, say \(A\), of size at most \(2^n/N\). Note that the 1-neighborhood of a binary contains exactly \(n+1\) binaries. Hence the 1-neighborhood of \(A\) contains at most \(2^n\times\frac{n+1}{N}\) binaries. For the 1-neighborhood of \(A\) to cover all the binaries, we must have \(N\leq n+1\). Moreover, if \(N=n+1\), all the codes must have the same size \(2^n/(n+1)\).

Let us start with small \(n\).

  • For \(n=1\), we can find 2 disjoint covering codes \(A_1=\{0\}, A_2=\{1\}\).
  • For \(n=2\), we can find 2 disjoint covering codes \(A_1=\{00, 11\}, A_2=\{01, 10\}\) or \(A_1=\{00, 01\}, A_2=\{10, 11\}\).
  • For \(n=3\), we can find 4 disjoint covering codes \(A_1=\{000,111\}, A_2=\{001,110\}, A_3=\{010,101\}, A_4=\{100, 011\}\).
  • For \(n=4\), we can take 4 disjoint covering codes by appending 0,1 to the covering codes for \(n=3\). These covering codes are \(A_1=\{*000,*111\}, A_2=\{*001,*110\}, A_3=\{*010,*101\}, A_4=\{*100, *011\}\) where \(*\) is a wild card that can be either 0 or 1.
  • For \(n=5\), it is impossible to find 5 disjoint covering codes: by the pigeonhole principle, one of the 5 disjoint covering codes would contain no more than \(\lfloor 2^5/5\rfloor = 6\) binaries, and one can check by a computer program that the 1-neighborhood of any 6-element subset always leaves at least 2 binaries uncovered (see the sketch after this list).
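
Here is a minimal brute-force sketch (my own check, not from the original post) of the claims above: it verifies that the four codes listed for \(n=3\) are covering codes, and that for \(n=5\) no 6-element code is covering.

from itertools import combinations

def neighborhood(x, n):
    # x together with the n binaries at Hamming distance 1 from x
    return {x} | {x ^ (1 << i) for i in range(n)}

def is_covering(code, n):
    covered = set()
    for x in code:
        covered |= neighborhood(x, n)
    return len(covered) == 2 ** n

# The four disjoint covering codes listed above for n = 3.
codes = [{0b000, 0b111}, {0b001, 0b110}, {0b010, 0b101}, {0b100, 0b011}]
assert all(is_covering(c, 3) for c in codes)

# For n = 5, no 6-element code is covering (exhaustive check; takes a moment).
assert not any(is_covering(c, 5) for c in combinations(range(32), 6))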

It is interesting to see that for some \(n\) we can find \(n\) or even \(n+1\) disjoint covering codes, while for other \(n\), for instance 5, there are no \(n\) disjoint covering codes at all. The situation is quite complicated here.

Therefore, we had better focus on the extremal cases in which we can find \(n+1\) disjoint covering codes. If \(n+1\) disjoint covering codes exist, all the covering codes must be of the same size \(2^n/(n+1)\), and for each covering code \(A\), the 1-neighborhoods of two distinct points in \(A\) do not overlap. In particular, \(n+1\) divides \(2^n\), which implies \(n=2^m-1\) for some \(m\). Since we are interested in the extremal case, henceforth we always assume \(n=2^m-1\).

Finding even one covering code is hard, let alone \(n+1\) disjoint ones. However, in 1950 Richard Hamming invented the first error-correcting code, the Hamming code \(H\), which has the following properties:

  1. it is a linear code, i.e., \(H\) is an \((n-m)\)-dimensional linear subspace of \(\mathbb{Z}_2^n\), the linear space of all the \(n\)-bit binaries over the finite field \(\mathbb{Z}_2\);
  2. the minimum Hamming distance between two distinct elements in \(H\) is exactly three.

For an introduction to Hamming code, I recommend Thirty-three Miniatures by Jiri Matousek. The 5th miniature, Error-Correcting Codes, includes detailed construction of the Hamming code.

Because of the 1st property, \(H\) has \(2^{n-m}\) elements. The 2nd property says that the 1-neighborhoods of any two distinct elements in \(H\) are disjoint, since \(1+1<3\). Hence the 1-neighborhood of \(H\) contains \(2^{n-m}(n+1)=2^{n-m}2^m=2^n\) elements. In other words, the Hamming code is a linear covering code!

Good! We have found a linear covering code \(H\). By translating the Hamming code \(H\) by \(\mathbf{e_1}=100\ldots 00, \mathbf{e_2}=010\ldots 00, \ldots, \mathbf{e_n}=000\ldots 01\) respectively, we have \(n+1\) covering codes in total (including \(H\) itself), since a translate of a covering code is still a covering code. Moreover, they are pairwise disjoint because of the 2nd property of the Hamming code: if \(H+\mathbf{e_i}\) and \(H+\mathbf{e_j}\) shared an element, \(H\) would contain two distinct elements at Hamming distance at most 2.

Great! We have found \(n+1=2^m\) disjoint covering codes for \(n\)-bit binaries. We can take this construction and use the wild card trick to get \(n+1\) disjoint covering codes for \((n+1)\)-bit binaries. In fact, for any \(n+1\) disjoint covering codes for \(n\)-bit binaries, say \(A_0, \ldots, A_n\), define \(A_k^* = \{0,1\}\times A_k \). For each \(k\), \(A_k^*\) is a covering code for \((n+1)\)-bit binaries, and what's more, they are disjoint.

Hence the weaker version of the puzzle, in which the first prisoner can opt not to flip any coin, is completely settled, since \(8\times 8=64=2^6\). The same strategy works for the stronger version of the puzzle: if, unfortunately, the initial binary is an element of \(A_k^* = \{0,1\}\times A_k \) and the magic number is also \(k\), the first prisoner can flip the leftmost bit, so that the outcome stays in \(A_k^*\).

Practically, in order to carry out such a strategy, the first prisoner needs to memorize the Hamming code. However, there is an easy way to generate the Hamming code, as it can be characterized as follows.

The Hamming code is the solution set of the linear system \(R\mathbf{x}=\mathbf{0}\), where the matrix \(R=[\mathbf{c_1}, \mathbf{c_2}, \ldots, \mathbf{c_n}]\) is an \(m \times n\) matrix whose \(k\)th column \(\mathbf{c_k}\) is the binary expansion of \(k\) for \(k=1,2,\ldots, n\).

For instance, when \(m=3, n=2^m-1=7\), the matrix $$R=\begin{bmatrix}0 & 0 & 0 & 1 & 1 & 1 & 1\\ 0  & 1 & 1 & 0 & 0 & 1 & 1\\ 1 & 0 & 1 & 0 & 1 & 0 & 1\end{bmatrix}.$$
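
To make this concrete, here is a small sketch (my own, not from the original post) for \(m=3\), \(n=7\). Anticipating the XOR-signature description below, it uses the fact that \(R\mathbf{x}=\mathbf{0}\) says exactly that the XOR of the positions of the non-zero bits of \(\mathbf{x}\) is 0, and it checks that the \(n+1\) resulting classes are equal-sized disjoint covering codes:

m = 3
n = 2 ** m - 1  # 7

def signature(x):
    # XOR of the (1-based) positions of the non-zero bits of x;
    # R x = 0 holds if and only if signature(x) == 0.
    s = 0
    for k in range(1, n + 1):
        if x & (1 << (k - 1)):
            s ^= k
    return s

# The Hamming code H and its translates H + e_k, as signature classes.
classes = [[x for x in range(2 ** n) if signature(x) == k] for k in range(n + 1)]
assert len(classes[0]) == 2 ** (n - m)                    # |H| = 16
assert all(len(c) == 2 ** n // (n + 1) for c in classes)  # equal sizes, disjoint

def neighborhood(x):
    return {x} | {x ^ (1 << i) for i in range(n)}

for c in classes:  # each class is a covering code
    covered = set()
    for x in c:
        covered |= neighborhood(x)
    assert len(covered) == 2 ** n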

Similarly, we can characterize \(\{0,1\}\times H\) as the solution set of $$\begin{bmatrix}\mathbf{0} & R\end{bmatrix}\mathbf{x}=\mathbf{0}.$$ We denote by \(R^*=[\mathbf{c_0}, \mathbf{c_1}, \ldots, \mathbf{c_n}]\) the matrix \(R\) prepended with a zero column.

Suppose \(x_0x_1\ldots x_n\) is in \(\{0,1\}\times H\). This means $$\sum_{x_k\neq 0}\mathbf{c_k}=\sum_{k=0}^nx_k\mathbf{c_k}=\mathbf{0},$$ where all summations take place in \(\mathbb{Z}_2^m\). The left hand side of the equation above can be thought of as the following procedure: we take all indices of the non-zero bits in the binary, \(\{k: x_k\neq 0\}\), write out their binary expansions, \(\{\mathbf{c_k}: x_k\neq 0\}\), and then add them up bitwise. Recall that adding binaries bitwise is equivalent to the XOR operation on numbers. The number corresponding to the left hand side is \(\mathrm{XOR}_{x_k\neq 0}\,k\). We call this number the XOR-signature of the binary \(x_0x_1\ldots x_n\). Therefore the equation $$\mathrm{XOR}_{x_k\neq 0}\,k=0$$ is another characterization of the set \(\{0,1\}\times H\). Similarly, a binary \(x_0x_1\ldots x_n\) is in \(\{0,1\}\times (H+\mathbf{e_k})\) if and only if its XOR-signature is equal to \(k\).

All right, now we have a convenient way to tell which covering code a binary belongs to according to its XOR-signature. How does this help the first prisoner figure out which coin to flip? Given the initial binary \(x_0x_1\ldots x_n\), the first prisoner can compute its XOR-signature, denoted by \(s\). He wants to flip the \(k\)th bit of the binary so as to change its XOR-signature from \(s\) to \(s'\), the magic number the jailer has picked from \(0,1,\ldots, n(=2^m-1)\). Note that when he flips the \(k\)th bit of the binary, he changes the XOR-signature from \(s\) to \(s\,\mathrm{XOR}\,k\). So \(k\) must satisfy \(s\,\mathrm{XOR}\,k=s'\). One can solve this equation by XORing both sides with \(s\) to get \(k=s\,\mathrm{XOR}\,s\,\mathrm{XOR}\,k = s\,\mathrm{XOR}\,s'\).

In other words, the first prisoner only needs to compute the XOR-signature of the initial binary, say \(s\), and flip the \((s\,\mathrm{XOR}\,s')\)th bit of the binary, which changes the XOR-signature of the binary from \(s\) to \(s'\). Then the second prisoner comes in and computes the XOR-signature of the new binary, which gives the magic number.
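
The whole strategy fits in a few lines of Python. This sketch is my own rendering, indexing the 64 squares from 0 to 63 so that the XOR-signature is simply the XOR of the indices of the heads-up coins:

import random
from functools import reduce
from operator import xor

def xor_signature(board):
    # XOR of the indices of all heads-up squares (0 if there are none)
    return reduce(xor, (i for i, heads in enumerate(board) if heads), 0)

def first_prisoner(board, magic):
    s = xor_signature(board)
    # Flipping square j changes the signature from s to s XOR j, so flip
    # j = s XOR magic; if s == magic already, j = 0 and flipping square 0
    # leaves the signature unchanged.
    board[s ^ magic] ^= 1
    return board

def second_prisoner(board):
    return xor_signature(board)

# Self-test over random boards and magic squares.
for _ in range(10000):
    board = [random.randint(0, 1) for _ in range(64)]
    magic = random.randrange(64)
    assert second_prisoner(first_prisoner(board, magic)) == magic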

Pretty neat, isn’t it?


Number of Non-isomorphic Graphs

This expository essay is to test my understanding of the techniques used in More Bricks – More Walls? from Jiří Matoušek's Thirty-three Miniatures.

We shall prove that the sequence \(g_n(0),\ldots, g_n({n\choose 2})\) is unimodal, i.e., it is first non-decreasing and then, from some point on, non-increasing, where \(g_n(k)\) is the number of non-isomorphic graphs with \(n\) vertices and \(k\) edges. In particular, we shall prove \(g_n(0)\leq g_n(1)\leq\ldots\leq g_n(\lfloor m/2\rfloor)=g_n(\lceil m/2\rceil)\geq\ldots\geq g_n(m-1)\geq g_n(m)\), where \(m={n\choose 2}\).

Throughout the article, the number of vertices, denoted \(n\), is fixed, and \(m\) always stands for the maximal number of edges between \(n\) vertices.

Notice that taking the complement of graphs establishes a bijection between graphs with \(k\) edges and graphs with \(m-k\) edges. Moreover, two graphs are isomorphic if and only if their complement graphs are isomorphic. Thus we have \(g_n(k)=g_n(m-k)\). Hence it is enough to show that \(g_n(k)\leq g_n(l)\) for all \(k\leq l\leq m/2\).
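
Before the proof, here is a small brute-force sketch (my own addition) that computes \(g_n(k)\) for a small \(n\) by canonizing each graph as the lexicographically smallest edge bitmask over all vertex relabelings; it illustrates both the symmetry \(g_n(k)=g_n(m-k)\) and the unimodality to be proved:

from itertools import combinations, permutations

def g(n):
    # g(n)[k] = number of non-isomorphic graphs with n vertices and k edges
    edges = list(combinations(range(n), 2))
    m = len(edges)
    index = {e: i for i, e in enumerate(edges)}
    classes = [set() for _ in range(m + 1)]
    for mask in range(2 ** m):
        chosen = [edges[i] for i in range(m) if mask >> i & 1]
        # canonical form: smallest bitmask over all vertex permutations
        canon = min(
            sum(1 << index[tuple(sorted((p[u], p[v])))] for u, v in chosen)
            for p in permutations(range(n))
        )
        classes[len(chosen)].add(canon)
    return [len(s) for s in classes]

print(g(4))  # [1, 1, 2, 3, 2, 1, 1]: symmetric and unimodal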

Let \(U, V\) be the sets of all graphs with \(k, l\) edges on the fixed vertex set \([n]\), respectively, and let \(r, s\) denote the numbers of non-isomorphic graphs in \(U, V\). In the notation above, \(r=g_n(k), s=g_n(l)\). We shall show \(r\leq s\). Let \(G\) be the bipartite graph between \(U\) and \(V\) with \(u\sim v\) if and only if \(u\) is a subgraph of \(v\).

Let \(B=(b_{uv})_{u\in U, v\in V}\) be the bipartite adjacency matrix of \(G\), where \(b_{uv}=1\) if \(u\) and \(v\) are adjacent in \(G\), and \(0\) otherwise.

We claim that the matrix \(B\) has full rank. As it is easy to see that the size of \(U\) is no greater than the size of \(V\), we shall prove the equivalent statement that \(B\) has full row rank, that is, the rows of \(B\) are linearly independent.

Suppose not. Then there is a non-zero row vector \(y\in\mathbb{R}^{U}\) such that \(yB=0\). Notice the coordinates of \(y\) are indexed by the elements of \(U\). Let \(K^*\in U\) be such that \(y_{K^*}\neq 0\).

Now we partition each of \(U, V\) into \(k+1\) classes according to the number of edges shared with \(K^*\): \(K_i\) is the set of graphs in \(U\) that share \(i\) common edges with \(K^*\), and \(L_j\) is the set of graphs in \(V\) that share \(j\) common edges with \(K^*\), for \(i, j=0,\ldots, k\). Remember that the number of edges of \(K^*\) is \(k\), so \(K_k\) contains only one element, namely \(K^*\). Also, all the \(K_i\) and \(L_j\) defined above are non-empty because of the assumption \(k<l\leq m/2\).

Note that for any \(i, j\in \{0,\ldots, k\}\), all vertices in \(K_i\) have the same degree to \(L_j\) in the bipartite graph \(G\). This is because the number of ways to extend a graph with \(k\) edges, \(i\) of them in common with \(K^*\), to a graph with \(l\) edges, \(j\) of them in common with \(K^*\), does not depend on the specific graph we start with. Denote this number by \(d_{ij}\) and let \(D=(d_{ij})\) be the corresponding \((k+1)\times(k+1)\) matrix. Since extending a graph can only increase the number of edges shared with \(K^*\), \(D\) is an upper triangular matrix with \(d_{ii}\neq 0\). Thus \(D\) is non-singular. On the other hand, \(k_i\cdot d_{ij}=\sum_{K\in K_i, L\in L_j}B_{KL}\), where \(k_i\) is the size of \(K_i\), so \(D=\mathrm{diag}(k_0^{-1},\ldots, k_k^{-1})FBG\), where \(F=(f_{i, u})_{i\in\{0,\ldots, k\}, u\in U}\) and \(G=(g_{v, j})_{v\in V, j\in\{0,\ldots, k\}}\) are the 0-1 matrices with \(f_{i, u}=1\) if and only if \(u\in K_i\), and \(g_{v, j}=1\) if and only if \(v\in L_j\).

Let \(x\) be the \((k+1)\)-dimensional row vector \(x=yE\), where \(E=(e_{ui})_{u\in U, i\in\{0,\ldots, k\}}\) is the matrix with \(e_{ui}=1\) if \(u\in K_i\) and \(0\) otherwise. In fact, \(x_i=\sum_{u\in U}y_ue_{ui}=\sum_{K\in K_i}y_K\). Hence \(x_k=y_{K^*}\neq 0\), as \(K_k\) contains only the element \(K^*\).

Now we have \(xD=yED\). It is easy to check that \(ED=BG\): the \((u, j)\) entries of both sides count the neighbors of \(u\) in \(L_j\). Since \(yB=0\), we get \(xD=yBG=0\), contradicting the non-singularity of \(D\), as \(x\neq 0\).

Now, according to graph isomorphism, the graphs in \(U, V\) are classified into equivalence classes, denoted \(U_1,\ldots, U_r, V_1, \ldots, V_s\).

Similarly, we observe that all vertices in \(V_j\) have the same degree to \(U_i\), since the number of ways to restrict isomorphic graphs to subgraphs lying in a given isomorphism class is the same. Denote this degree by \(d_{ij}\) as well, and again we claim that the matrix \(D=(d_{ij})_{r\times s}\) (different from the one defined before) has full row rank. This will finish the proof, since an \(r\times s\) matrix of full row rank satisfies \(r\leq s\).

Suppose not. Then we have a non-zero \(r\)-dimensional row vector \(y\) such that \(yD=0\). Let \(x\in\mathbb{R}^U\) be the row vector indexed by \(U\) such that \(x_u=y_i\) for all \(u\in U_i\). Then for all \(v\in V_j\) we have \((xB)_v=\sum_{u\in U}x_uB_{uv}=\sum_{i\in[r]}\sum_{u\in U_i}x_uB_{uv}=\sum_{i\in[r]}y_i\sum_{u\in U_i}B_{uv}=\sum_{i\in[r]}y_id_{ij}=(yD)_j=0\). Notice \(x\neq 0\) but \(xB=0\), which contradicts the fact that \(B\) has full row rank.

Comment: This proof strikes me as an intricate variant of the idea of finding an injection. To prove that one set is smaller than another, one would naturally hope to find an injection from the first set into the second. But this is hard here because the two sets under consideration are sets of equivalence classes. On the other hand, the inclusion mapping works just fine at the level of graphs, though it breaks down at the level of equivalence classes. Thus we need a delicate analysis of the bipartite graph induced by the inclusion mapping, and I think this is the motivation behind the proof.


Fun with Hex

According to folklore,

The Hex game was invented by the Danish mathematician Piet Hein, who introduced it in 1942 at the Niels Bohr Institute. It was independently re-invented in 1947 by the mathematician John Nash at Princeton University. The rules are really simple. Each player has an allocated color, Red and Blue being conventional. Players take turns placing a stone of their color on a single cell within the overall playing board. The goal is to form a connected path of your stones linking the opposing sides of the board marked by your colors, before your opponent connects his or her sides in a similar fashion. The first player to complete his or her connection wins the game. The four corner hexagons each belong to both adjacent sides… The game can never end in a tie, a fact proved by John Nash: the only way to prevent your opponent from forming a connecting path is to form a path yourself. In other words, Hex is a determined game.

– Hex (board game), Wikipedia

For instance, on the following 2 by 2 board, player 1 intends to build a connected path linking \(x_1=0\) and \(x_1=2\), while player 2 intends to build a connected path linking \(x_2=0\) and \(x_2=2\).

A 2 by 2 Hex Game Board with a coordinate attached to each cell.

Now we would like to play the Hex game in higher dimensions. The first question is: how can we set up the game board in \(n\)-dimensional space? The answer all depends on your point of view. If we visualize the 2-dimensional game board as follows, we can view each cell as a cube in 3-dimensional space.

Hex Board from another perspective.

Here is a rework of the 2-dimensional board which can be easily generalized to higher dimensions. Consider all unit cubes in \(\mathbb{R}^3\) that are translations of the canonical unit cube (whose vertices are \((0,0,0), (1,0,0), (0,1,0), (0,0,1), (1,1,0), (1,0,1), (0,1,1),(1,1,1)\)) along a vector \((x,y,z)\), where \((x,y,z)\in[n]\times[n]\times\mathbb{Z}\) (\([n]=\{1,2,\ldots, n\}\)) and \(x+y+z=0\). Basically, those unit cubes are the ones illustrated above. Now project all those cubes onto the 2-dimensional plane \(x+y+z=0\). This gives us the hexagon board.

For higher dimensions, say dimension \(d\), just project all unit cubes in \(\mathbb{R}^{d+1}\) that are translations of the \((d+1)\)-dimensional canonical unit cube along a vector \((x_0, \ldots, x_d)\), where \((x_0,\ldots, x_d)\in\mathbb{Z}^{d+1}\) and \(x_0+\ldots+x_d=0\), onto the \(d\)-dimensional hyperplane given by the equation \(x_0+\ldots+x_d=0\).

With the help of Mathematica, each cell of the 3D Hex board looks like the following.

Building Block of 3D Hex

Source code is attached here for those who want to play with it a little bit in Mathematica.

(* Orthonormal basis of the hyperplane x + y + z + t = 0 in R^4 *)
v = Orthogonalize[{{1, -1, 0, 0}, {0, 1, -1, 0}, {0, 0, 1, -1}}];
(* Project the 16 vertices of the canonical unit 4-cube onto that hyperplane *)
data = Flatten[Table[{{x, y, z, t}.v[[1]], {x, y, z, t}.v[[2]], {x, y, z, t}.v[[3]]}, {x, 0, 1}, {y, 0, 1}, {z, 0, 1}, {t, 0, 1}], 3];
(* The convex hull of the projected vertices is one cell of the 3D Hex board *)
Needs["TetGenLink`"];
{p, s} = TetGenConvexHull[data];
Graphics3D[GraphicsComplex[p, Polygon[s]]]

I forgot to mention that in the \(d\)-dimensional Hex game, we can have up to \(d\) players. Again, the game can never end in a tie. For those who are interested in going beyond this mere fact, please refer to David Gale's great exposition on this topic [DG]1.

However, as a paper-and-pencil game, drawing a real Hex board is time-consuming. Here is another version of Hex with an easy-to-draw game board. The vertices represent the hexagonal regions of the original board and the edges represent the borders between them. Two players take turns coloring the vertices and try to form a monochromatic path from one side to the opposite side.

A 3 by 3 Dual Game Board

Indeed, mathematically speaking, this is the dual diagram of the original board. Its generalization to higher dimensions is easy to imagine: define a graph on the vertex set \(V=[n]^d\), where two distinct vertices \(u, v\in V\) are adjacent if \(u_i-v_i\in\{0,1\}\) for all \(i\in[d]\) or, the other way around, \(v_i-u_i\in\{0,1\}\) for all \(i\in[d]\), as in the sketch below.
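
Here is a quick sketch (my own, under the adjacency rule just stated) that builds this graph and sanity-checks the \(d=2\), \(n=3\) dual board above:

from itertools import product

def hex_dual_graph(n, d):
    # Vertices are the points of [n]^d; u ~ v iff all coordinate differences
    # u_i - v_i lie in {0, 1}, or all lie in {-1, 0} ("the other way around").
    vertices = list(product(range(1, n + 1), repeat=d))
    def adjacent(u, v):
        diffs = {ui - vi for ui, vi in zip(u, v)}
        return u != v and (diffs <= {0, 1} or diffs <= {-1, 0})
    return {u: [v for v in vertices if adjacent(u, v)] for u in vertices}

graph = hex_dual_graph(3, 2)    # the 3 by 3 dual board above
assert len(graph[(2, 2)]) == 6  # an interior vertex has 6 neighbors, as in Hex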

  1. David Gale, The Game of Hex and the Brouwer Fixed-Point Theorem, The American Mathematical Monthly, vol. 86, 1979, pp. 818-827.

The Liar's Guessing Game (Part 1)

Last month, while I was in Singapore, the International Biology Olympiad happened to be taking place at the National University of Singapore, so I could not resist looking up the problems of IMO 2012 and the results of the participating countries.

Officially, the International Mathematical Olympiad never ranks countries. But privately... you know.

The team champion this time was Korea, and the top individual scorer was Jeck Lim from Singapore, who has competed for the Singapore national team four times and has collected all four outcomes: gold, silver, bronze, and a perfect score. The Chinese team placed second overall; its highest-ranked contestant was Yiyang She, in fourth place.

Back to the main topic: this year's problems. The most difficult, and also the most interesting, problem this year was Problem 3, the liar's guessing game.

A commercial break: this year's theatrical release of Liar Game replaced the leading actress; I no longer believe in love.

Here is Problem 3 of IMO 2012:

The "liar's guessing game" is played between two players A and B, and depends on two positive integers \(k\) and \(n\) known to both of them.

At the start of the game, A chooses two integers \(x\) and \(N\) with \(1\leq x\leq N\). A truthfully tells B the value of \(N\), but keeps \(x\) secret. B now tries to obtain information about \(x\) by asking questions of the following form: in each question, B chooses an arbitrary set \(S\) of positive integers (possibly one used in a previous question) and asks A whether \(x\) belongs to \(S\). B may ask as many questions as he wishes. After each question, A must immediately answer "yes" or "no"; A may lie, and there is no limit on the number of lies, the only restriction being that among any \(k+1\) consecutive answers, at least one must be truthful.

After B has asked all the questions he wants, he must specify a set \(X\) of at most \(n\) positive integers. If \(x\) belongs to \(X\), B wins; otherwise, A wins. Prove that:

  1. if \(n\geq 2^k\), then B can guarantee a win;
  2. for all sufficiently large \(k\), there exists an integer \(n\geq 1.99^k\) such that B cannot guarantee a win.

A digression: although I had not done competition problems for many years, when I picked this one up I found that my mind has matured quite a bit, and that I now know how to work through a problem carefully.

Back in high school, whether I was reading competition books or attending training sessions, for the problems I could not solve I would mostly read the solution in the book or the one given by the teacher, marvel at its ingenuity, and yet never really understand where the solution came from.

The competition training in high school turned me into a skilled laborer: for a standard problem, it sufficed to try the standard tools one by one, but when facing a non-mainstream problem I was helpless. How to improve was always a question for me; in hindsight, what I lacked was training on combinatorial problems and the ability to treat a concrete problem on its own terms.

However, once out of that environment, I could calm down and really think about problems, and I came to understand why mathematics is called the glory of the human mind.

Now I am going to analyze the puzzle, so stop reading here if you want to solve it yourself. Another reason I strongly recommend this problem is that it can be worked out entirely in one's head; you can also find a partner and play the game to get a feel for it.

Step one: transform the problem into an equivalent form that is easier to think about. In other words, we need to answer the question: what does it really mean that among any \(k+1\) consecutive answers at least one is truthful?

Each time B asks a question "does \(x\) belong to \(S\)?", if A answers "yes", we say that \(S\) is a set that \(x\) possibly belongs to; otherwise, we say that the complement of \(S\) is a set that \(x\) possibly belongs to. For brevity, we call this set a possible set.

After a number of rounds of questions and answers, we obtain a sequence of possible sets. The condition that among any \(k+1\) consecutive answers at least one is truthful means: in this sequence, among any \(k+1\) consecutive possible sets, at least one really does contain the element \(x\). In other words, \(x\) belongs to the union of any \(k+1\) consecutive possible sets.

Step two: consider the winning strategies from the perspectives of A and B separately. For the first part, we look at the problem from B's point of view.

As B's advisers, we need to pose a series of tough questions to A so that the range in which \(x\) can lie keeps shrinking. Let us first concentrate on the first \(k+1\) questions; our goal is to make the union of the corresponding \(k+1\) possible sets as small as possible.

Before long, one comes up with a bisection strategy.

At the beginning, the range of \(x\) consists of all the numbers from \(1\) to \(N\). For our first question, we take half of the numbers in this set and ask A whether \(x\) is among them. No matter how A answers, after the first round the resulting possible set, denoted \(A_0\), has size at most \(\frac{N}{2}\). To keep the analysis clean, we temporarily ignore all the rounding details.

How should we design the second question so that the union of the possible sets after two rounds is as small as possible?

Aha! We simply take half of the elements outside \(A_0\) as the set for the second question. Then no matter which possible set A produces this time, denoted \(A_1\), the union of the two sets has size at most \(\frac{N}{2}+\frac{N}{4}\).

Continuing in this way, each time we take half of the elements lying outside the previously produced possible sets and question A with them. After \(k+1\) rounds, the union of the possible sets has size at most \(\frac{N}{2}+\cdots+\frac{N}{2^{k+1}}\), so by the earlier analysis we can safely rule out every element outside this union, that is, at least \(\frac{N}{2^{k+1}}\) elements.

Putting back the rounding details we ignored, you will immediately find that as long as \(N\geq 2^{k+1}\), after such \(k+1\) rounds we can always rule out at least one number, thereby shrinking the range of \(x\).
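
A tiny sketch (my own addition) of the counting behind this paragraph: writing \(u\) for the number of numbers not yet covered by any possible set, B can always keep at least \(\lfloor u/2\rfloor\) of them uncovered for another round, whatever A answers.

def uncovered_after(N, k):
    # B asks about half (rounded up) of the uncovered numbers each round;
    # whichever possible set A picks, at least floor(u / 2) of them stay
    # outside the union of the possible sets so far.
    u = N
    for _ in range(k + 1):
        u //= 2
    return u

for k in range(1, 8):
    assert uncovered_after(2 ** (k + 1), k) >= 1      # one number gets ruled out
    assert uncovered_after(2 ** (k + 1) - 1, k) == 0  # and the bound is tight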

We just apply this tactic repeatedly: after every \(k+1\) rounds, as long as the size of the range of \(x\) is at least \(2^{k+1}\), we can keep ruling numbers out.

In other words, we can always shrink the range of \(x\) to a size below \(2^{k+1}\).

But the first part essentially asks us to shrink this range to size \(2^k\), so our method needs improvement; nevertheless, we are off to a good start.

Observe that under this strategy, A's first answer should pick as large a possible set as he can; if A ever picked a possible set with only one element, that would be bad news for him.

So, can we force A to pick a one-element possible set? The answer is simple.

Just keep asking A whether \(x\) equals \(1\). If A answers "no" for \(k+1\) consecutive rounds, we have successfully ruled out the number \(1\). If at some round A answers "yes", then the possible set is \(\{1\}\), and in the following \(k\) questions we run the bisection method as before on the remaining numbers, successfully ruling out \(\frac{N-1}{2^k}\) numbers.

To summarize, as long as \(N\) is greater than \(2^k\), we can keep ruling out numbers. Therefore, when \(n\geq 2^k\), B has a winning strategy!

The second subproblem will be addressed in the next installment.


Hall’s Theorem

In 1935, the mathematician Philip Hall discovered a criterion for the existence of a perfect matching in a bipartite graph, known as Hall's theorem, aka the marriage theorem.

Consider two sets of vertices, denoted \(A=\{a_1, \cdots, a_m\}\) and \(B=\{b_1, \cdots, b_n\}\). Edges are drawn between \(a_i\) and \(b_j\) for some pairs \((i, j)\); here, we allow multiple edges between two vertices. This graph, named \(G\), is called a bipartite graph. One may think of the bipartite graph as a model of a marriage system. Taking the sets \(A\) and \(B\) as boys and girls, an edge between a boy and a girl indicates that they might become a couple. A natural question arises: does there exist a perfect matching, that is, can every boy find his girl without conflicting with the other boys? Put formally, a perfect matching is an injective map from \(A\) to \(B\) which induces a subgraph of \(G\).

Hall's theorem asserts that there is a perfect matching in the bipartite graph \(G\) above if and only if for every subset \(S\) of \(A\), the cardinality of \(S\) (the number of members of the set) does not exceed the number of neighbors of \(S\). Here, the neighbors of \(S\) are all the vertices in \(B\) that are connected to some vertex in \(S\).

The necessity of the condition obviously holds. On the other hand, the proof of sufficiency is a little bit tricky. Challenge yourself with this problem; a small matching sketch follows for experimentation.
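
Here is a short sketch (my own, not part of the original post) that computes a maximum matching by augmenting paths (Kuhn's algorithm); Hall's condition holds exactly when the matching saturates \(A\):

def max_matching(adj, size_B):
    # adj[a] lists the neighbors in B (0-indexed) of vertex a in A.
    match_b = [-1] * size_B            # match_b[b] = vertex of A matched to b

    def try_augment(a, seen):
        for b in adj[a]:
            if b not in seen:
                seen.add(b)
                # b is free, or its current partner can be re-matched elsewhere
                if match_b[b] == -1 or try_augment(match_b[b], seen):
                    match_b[b] = a
                    return True
        return False

    return sum(try_augment(a, set()) for a in range(len(adj)))

# Boys 0, 1, 2 and the girls each of them may marry.
adj = [[0, 1], [0], [1, 2]]
assert max_matching(adj, 3) == len(adj)  # saturates A: perfect matching exists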

Now we present several applications of Hall's theorem.

  • Let \(A=(a_{ij})\) be a matrix of order \(n\) with nonnegative integer entries, such that every row and column of \(A\) has sum \(l\). Then \(A\) is the sum of \(l\) permutation matrices.
  • Suppose all vertices of a directed graph \(G\) have the same in-degree and out-degree \(l\), a fixed positive integer. Then the edges of the graph can be partitioned into \(l\) groups of cycles such that every vertex appears exactly once in each group and each edge appears in exactly one cycle.
  • (Birkhoff-von Neumann theorem) A doubly stochastic matrix is a square matrix of nonnegative real numbers, each of whose rows and columns sums to 1. Assertion: every doubly stochastic matrix is merely a convex combination of permutation matrices.

The first two problems are equivalent to each other, if one notices the relation between a graph and its adjacency matrix. For the first problem, we construct a bipartite graph \(G\) consisting of \(2n\) vertices, \(u_1, \cdots, u_n\) and \(v_1, \cdots, v_n\), with \(a_{ij}\) edges between each pair of vertices \(u_i\) and \(v_j\). It is easily verified, by a contradiction argument, that this graph meets the condition required in Hall's theorem. Thus we get a perfect matching in this graph, which in turn represents a permutation matrix. After eliminating the perfect matching from the graph, we arrive at the situation with row and column sums \(l-1\). This allows us to use mathematical induction on the parameter \(l\).

If the entries of the matrix in the third problem are all rational, the result reduces to the first problem. Still, we have to cover the gap between the rational numbers and the reals. Denote the permutation matrices by \(M_1, \cdots, M_{n!}\). Each doubly stochastic matrix \((a_{ij})\) can be approximated by doubly stochastic matrices \((a_{ij}^n)\) with rational entries. Each such rational matrix decomposes into a convex combination of permutation matrices with rational coefficients \(c_k^n\), \(k \in \{1, \cdots, n!\}\). By extracting convergent subsequences of each \((c_k^n)_{n=1}^\infty\), chosen along the same indices for all \(k\), we obtain limits \(c_k\). In the end, the original matrix \((a_{ij})\) is expressed as a convex combination of the permutation matrices with coefficients \(c_k\).
