Home > theory > DP and the Erdős–Rényi model

DP and the Erdős–Rényi model

Yesterday I was in a pub with Vasilis Syrgkanis and Elisa Celis and we were discussing about how to calculate the expected size of a connected component in G(n,p), the Erdős–Rényi model. G(n,p) is the classical random graph obtained by considering n nodes and adding each edge (i,j) independently with probability p. A lot is known about its properties, which very interestingly change qualitatively as the value of p changes relativeto n. For example, for p <\frac{1}{n} then there is no component greater than O(\log n) with high probability. When p = \frac{c}{n}, c>1 and n \rightarrow \infty, then the graph has a giant component. All those phenomena are very well studied in the context of probabilistic combinatorics and also in social networks. I remember learning about them in Jon Kleinberg’s Structure of Information Networks class.

So, coming back to our conversation, we were thinking on how to calculate the size of a connected component. Fix some node u in G(n,p) – it doesn’t matter which node, since all nodes are equivalent before we start tossing the random coins. Now, let C_u be the size of the connected component of node u. The question is how to calculate C(n,p) = \mathbb{E} [C_u].

Recently I’ve been learning MATLAB (actually, I am learning Octave, but it is the same) and I am very amazed by it and impressed about why I haven’t learned it before. It is a programming language that somehow knows exactly how mathematicians think and the syntax is very intuitive. All the operations that you think of performing when doing mathematics, they have implemented. Not that you can’t do that in C++ or Python, in fact, I’ve been doing that all my life, but in Octave, things are so simple. So, I thought this was a nice opportunity for playing a bit with it.

We can calculate C(n,p) using a dynamic programming algorithm in time O(n^2) – well, maybe we can do it more efficiently, but the DP I thought was the following: let’s calculate \mathcal{C}(n,s,p) where it is the expected size of the u-connected component of a random graph with n nodes where the edges between u and other nodes have probability p' = 1 - (1-p)^s and an edge between v_1 and v_2 have probability p. What we want to compute is C(n,p) = \mathcal{C}(n,1,p).

What we can do is to use the Principle of Deferred Decisions,  and toss the coins for the n-1 edges between u and the other nodes. With probability bin(k,n,p') = {n \choose k} (p')^k (1-'p)^{n-k}, there are k edges between u and the other nodes, say nodes w_1, \hdots, w_k. If we collapse those nodes to u we end up with a graph of n-k nodes and the problem is equivalent to k plus the size of the connected component of u in the collapsed graph.

One difference, however is that the probability that the collapsed node u is connected to a node v of the n-1-k nodes is the probability that at least one of w_i is connected to v, which is 1-(1-p)^k. In this way, we can write:

\mathcal{C}(n,s,p) = 1 \cdot bin(0,n-1,p') + \sum_{k=1}^{n-1} bin(k,n-1,p') [ k +  \mathcal{C}(n-k,k,p)]

where p' = 1-(1-p)^s. Now, we can calculate C(n,p) by using DP, simply by filling an n \times n table. In Octave, we can do it this way:


function component = C(N,p)
  C_table = zeros(N,N);
  for n = 1:N for s =1:N
    C_table(n,s) = binopdf(0,n-1,1-((1-p)^s)) ;
    for k = 1:n-1
      C_table(n,s) += binopdf(k,n-1,1-((1-p)^s)) * (k + C_table(n-k,k));
    end
  end end
  component = C_table(N,1);
endfunction

And in fact we can call C(n,p) for say n = 200 and p = 0.01 .. 0.3 and see how C(n,p) varies. This allows us, for example, to observe the sharp transition that happens before the giant component is formed. The plot we get is:


Erdős–Rényi model

Categories: theory Tags:
  1. No comments yet.
  1. No trackbacks yet.