Home > theory > Eigenvectors and communities

## Eigenvectors and communities

I liked a lot the talk by Luca Trevisan in STOC’09 about the “Max Cut and the Smallest Eigenvalue”. In this paper he proposes an ${.531}$-approximation algorithm for MAX-CUT, which is the first approximation algorithm for MAX-CUT that doesn’t use semi-definite programming (SDP). The best possible approximation for that problem is the 0.878-approximation given by Goemans and Williamson, based on SDP (which is the best possible approximation assuming the Unique Games Conjecture).

I won’t discuss the paper in much detail here, but its heart is the very beautiful idea of analyzing properties of a graph looking at the spectrum of the matrix that defines it. For example, represent a graph ${G}$ by its adjacent matrix ${A}$, i.e., ${A}$ is an ${n \times n}$ matrix (where ${n}$ is the number of nodes of ${G}$) and ${A_{ij} = 1}$ iff the edge ${(i,j)}$ is in ${G}$. Let also ${D}$ be the diagonal matrix for which ${D_{ii}}$ is the degree of node ${i}$. The normalized adjacency matrix is given by ${D^{-1/2} A D^{-1/2}}$. Let ${\lambda_1 \geq \lambda_2 \geq \hdots \geq \lambda_n}$ be its eigenvalues. It is easy to see that all the eigenvalues are in ${[-1,1]}$, since:

$\displaystyle 0 \leq \sum_{ij} A_{ij} (x_i - x_j)^2 = 2 x^t D x - 2 x^t A x \Rightarrow \frac{x^t A x}{x^t D x} \leq 1$

taking ${y = D^{1/2}x}$ we have that: ${\frac{y^t D^{-1/2} A D^{-1/2} y}{y^t y} \leq 1}$ for all ${y \in {\mathbb R}^n}$. It is not hard to see the the maximum eigenvalue is:

$\displaystyle \lambda_1 = \max_{y \in {\mathbb R}^n} \frac{y^t D^{-1/2} A D^{-1/2} y}{y^t y}$

To prove ${\lambda_n \geq -1}$ it is analogous, just by analyzing ${\sum_{ij} A_{ij} (x_i + y_i)^2 \geq 0}$. The next natural questions is: do those bounds hold with equality?

In fact, ${\lambda_1 = 1}$ for all graphs. It is very easy to see that, just take the vector ${x \in {\mathbb R}^n}$ with ${x_i = \sqrt{d_i}}$ where ${d_i}$ is the degree of node ${i}$. And for ${\lambda_n}$, it is not difficult to see that ${\lambda_n = -1}$ iff the graph has a bipartite connected component. It is also easy to see that if the graph has ${k}$ connected components then ${\lambda_1 = \hdots = \lambda_k = 1}$.

So, given a connected graph ${\lambda_2 < 1}$. But what happens if ${\lambda_2}$ is very close to ${1}$. Intuitively, it means that the graph is almost disconnected in the sense that it has a very sparse cut. For ${d}$-regular graphs (a graph where all nodes have degree ${d}$), this intuition is captured by Cheeger’s inequality.

$\displaystyle \sqrt{2 (1-\lambda_2) } \geq h_G \geq \frac{1}{2} (1-\lambda_2)$

where ${h_G}$ is the sparsity of ${G}$, defined as ${\min_{S \subseteq V} \frac{\vert e(S, V \setminus S) \vert}{d \min\{\vert S \vert, \vert V \setminus S \vert \}}}$. In the paper, Luca Trevisan discusses an analogue os Cheeger’s Inequality for the smallest eigenvalues ${\lambda_n}$. Instead of measuring how close we are from a disconnected graph, we measure how close we are from a a bipartite graph.

Let’s try to get some intuition: if a ${d}$-regular graph has a good sparse cut, like in the picture below, we could divide it in two parts and try to get an eigenvalue close to ${1}$ by the following way: set “almost” ${1}$ in one part and almost ${-1}$ in the other. Since there are very few edges crossing the cut, after being transformed by ${\frac{1}{d}A}$ the values are almost still the same. Also if the graph is almost bipartite, we can set almost ${1}$ in one part and almost ${-1}$ in the other, and after applying ${\frac{1}{d}A}$, what was ${1}$ becomes ${-1}$ and vice versa.

Ok, we can use eigenvalues to have a good idea if the graph has good expansion or not, but how to use that to actually find good partitions of it? Let ${x_2}$ be the eigenvector corresponding to ${\lambda_2}$. The intuition in the last paragraph tells us that maybe we should think of getting ${S_1 = \{v \in V; x_2(v) < 0\}}$ and ${S_1 = \{v \in V; x_2(v) \geq 0\}}$. More generally we can substitute ${0}$ by a threshold ${t}$. This gives us only ${n}$ possible partitions to test and choose the best, instead of ${2^n}$.

Finding Communities in Graphs

Real world graphs like social networks have clusters of nodes that can be identified as communities. There is a lot of work in finding those communities using all sorts of methods – more references about that can be found in Jon Kleinberg’s Networks course website. Spectral techniques are a natural candidate for doing that. I did a small experiment using Orkut, the most used social network in Brasil. Let ${G}$ be the entire graph of Orkut and ${u}$ be the node representing my profile on ${G}$ and let: ${\partial \{u\} = \{ v; (u,v) \in E(G)\}}$ be the set of all my friends and ${G[\partial \{u\}]}$ be the graph induced on my friends.

First, I crawled the web to reconstruct ${G[\partial \{u\}]}$ (notice I am not in the graph, so I am not using the fact that I am connecting all them. I am just using the connections between them). First, I have ${211}$ friends, which form a huge ${194}$ nodes component, and smaller components (one of size ${5}$, two of size ${3}$ and six of size ${1}$). Let’s forget about the smaller components and care about the huge component: in this component there are my friends from undergrad, my friends from high-school, my family and some miscellaneous friends. Although all of them are interconnected, I wanted to be able to separate them just by using the topology of the graph.

I did the following experiment: I calculated the eigenvalues and eigenvectors of ${D^{-1/2} A D^{-1/2}}$. Let ${1 = \lambda_1 \geq \hdots \lambda_n}$ be the eigenvalues and ${x_1, \hdots, x_n}$ be the eigenvectors. For each node ${v \in V}$, I associated with coordinates ${(x_2(v), x_n(v))}$ in a way that a vertical cut would be a good approximation of the sparsest cut and an horizontal line would be a good approximation of the max-cut. The result is there:

After doing that I labeled the nodes of the graph with colors: my friends from IME (my undergrad university), I labeled blue. My friends from high-school, I labeled green, my family nodes, I labeled yellow and friends from other places, I labeled red. Also, there were some people that used strange characters in their names I could no parse, so I also labeled them red. I was happy to see that the algorithm efficiently separated the green from the blue nodes: clearly distinguishing between my high-school and my undergrad friends.

More about spectral methods

Luca Trevisan has a lot of very cool blog posts about Cheeger’s inequality and other interesting proofs of spectral facts. This MIT course has a lot of material about Spectral Graph Theory.

Categories: theory Tags:
1. January 29th, 2015 at 08:59 | #1

Thanks!
Helps a lot.

1. No trackbacks yet.