3 Random Graphs: Degree Sequences

A limitation of the Erdős–Rényi model is that it always generates graphs whose degree distributions are approximately Poisson. This isn’t very flexible, and most real-world networks have very different-looking degree sequences. In this set of lecture notes, we’ll discuss two more classes of random graph that emphasize the important role played by the degrees of nodes in determining graph structure.

3.1 The Configuration Model

Definition 3.1 (Configuration Model) A graph with \(n\) nodes has degree sequence \(\mathbf{k} \in \mathbb{Z}^n_+\) if each node \(i\) has degree \(k_i\).

A configuration model random graph with degree sequence \(\mathbf{k}\in \mathbb{Z}^n_+\) is a random graph sampled uniformly at random from the set of all graphs with degree sequence \(\mathbf{k}\).

While sampling from the Erdős–Rényi model was as easy as flipping \(\binom{n}{2}\) weighted coins, sampling from the configuration model is actually rather challenging. A classical approach for sampling from the configuration model is to perform a stub-matching algorithm. Here’s how it works.

See Fosdick et al. (2018) for a comprehensive discussion.

First, each node is assigned \(k_i\) stubs or half-edges.
Until there are no more stubs left:
- Two stubs are selected uniformly at random and matched, creating an edge.

The stub-matching algorithm can produce multi-edges and self-loops, which can cause the graph to not be simple. However, it can be proven (Bollobás 1980) that, when the graph is sparse, the expected number of multi-edges and self-loops does not grow with network size. One can, as a result, show these structures are rare, and can often be ignored in arguments.

Moments of the Degree Sequence

Let \(p_k\) be the proportion of nodes with degree \(k\) in a configuration model. Then, \(p_kn\) is the number of nodes with degree \(k\). The mean degree is \(\langle k \rangle \triangleq \sum_{i \in N} k p_k\). More generally, the \(\ell\)th moment of the degree sequence is \(\langle k^{\ell} \rangle \triangleq \sum_{i \in N}k^\ell p_k\).

Configuration Models with Constant Moments

Recall that, in the case of Erdős–Rényi random graphs, we called a sequence of models sparse if the mean degree \(c\) did not increase as a function of \(n\). Here, we use a similar working definition

A sequence of configuration model random graphs has constant moments if the moments \(\langle k^\ell \rangle\) converge to constants as \(n\) grows large, for each finite \(\ell\).

We’ll assume that we’re considering configuration models with constant moments. This assumption justifies “large \(n\)” reasoning of the same type we did with Erdős–Rényi random graphs.

Branching Process Approximation for Giant Components

Recall that the branching process approximation in Erdős–Rényi random graphs suggests that the size of the component containing a uniformly sampled node has expected size approximately \(1 / (1-c)\), where \(c\) is the average degree. On the other hand, when \(c > 1\), the branching process approximation suggests that the component containing a uniformly sampled node could be very large indeed.

Definition 2.5

We can perform a similar kind of analysis for the configuration model. There is only one new thing to figure out.

Definition 3.2 (Edge-Biased Degree Distribution) In a configuration model random graph, consider the following random process:

First, choose an edge uniformly at random.
Then, take one of the nodes attached to this edge and log its degree \(K\).

The random variable \(K-1\) is distributed according to the edge-biased degree distribution of the graph.

Exercise: show that the edge-biased degree distribution is \(q_k \triangleq \mathbb{P}(K = k) = \frac{kp_k}{\langle k \rangle}\).

Definition 3.3 (Excess degree distribution)

In a configuration model random graph, if \(K\) is distributed according to the edge-biased degree distribution, the variable \(H = K-1\) is distributed according to the excess degree distribution.

The reason to care about the excess degree distribution is that it answers the following question:

Suppose I follow an edge in a configuration model random graph, arriving at node \(i\). How many additional edges are attached to node \(i\)?

This is an important question, because it is precisely the question that we used to determine whether the branching process would die out quickly or continue forever in the case of the Erdős–Rényi random graph. The same heuristic works here:

Theorem 3.1 (Giant component in configuration models)

The configuration model with constant moments has a giant component as \(n\) grows large if \(\mathbb{E}[H] > 1\).

Indeed, this result can be formally proven by rigorous arguments, although these are beyond our scope here.

Molloy and Reed (1995)

Exercise: Show that \(\mathbb{E}[H] = \frac{\langle k^2 \rangle - \langle k \rangle}{\langle k \rangle}\). Infer that the criterion \(\mathbb{E}[H] > 1\) in Theorem 3.1 is equivalent to the criterion \[ \langle k^2 \rangle - 2\langle k \rangle > 0\;. \]

Friendship Paradox

The “friendship paradox” refers to the observation which is commonly phrased as:

Your friends have more friends than you do.

This is only true in a statistical sense: on average, in a graph, if you check the degree of each node and compare that to the mean degree of its neighbors, you’ll find that the average neighbor degree is higher than the average original degree.

The math here is that the neighbor degree nothing other than the degree of a node that we reached by following an edge. That is, its distribution is the edge-biased degree distribution. The mean of this distribution is \(\mu = \langle k^2 \rangle/\langle k \rangle\). The Cauchy-Schwartz inequality implies that \(\langle k^2 \rangle \geq \langle k \rangle^2\), which in turn implies that \(\mu \geq k\).