
## Sparsest Cut via Metric Embeddings

A very powerful technique for designing approximation algorithms is the use of metric embeddings. Many problems become easy when the graph has some special structure, like a tree or a cycle, or when the vertices are encoded as points in ${{\mathbb R}^n}$. This gives the problem more structure and allows us to make geometric considerations about the object we are studying – and this is sometimes the key to finding a good approximation. The basic idea is to start with a space without much structure and try to approximate it by some space with stronger geometric or topological properties. Much has been done recently with this approach, for example in finding cuts in graphs or solving Steiner tree problems.

Consider the problem of finding the sparsest cut in a graph. Given a graph ${G = (V,E)}$, we define the sparsity of a cut ${(S, \overline{S})}$ as:

$\displaystyle \alpha(S) = \frac{\vert \partial S \vert}{\min\{ \vert S \vert, \vert V \setminus S \vert \}}$

where ${\partial S }$ is the set of edges with one endpoint in ${S}$ and the other in ${\overline{S}}$. The sparsest cut is the cut minimizing the sparsity. A related problem is the flux problem, which asks for a cut minimizing:

$\displaystyle f(S) = \frac{\vert \partial S \vert}{ \vert S \vert \cdot \vert V \setminus S \vert}$

If we can minimize flux, we get a ${2}$-approximation for sparsity, since for all ${S}$ we have: ${\frac{\alpha(S)}{\vert V \vert} \leq f(S) \leq \frac{2 \alpha(S)}{\vert V \vert}}$. More generally, if we can approximate the minimum flux within some factor, we can approximate the sparsest cut within the same factor multiplied by ${2}$.
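As a quick sanity check of the relation between sparsity and flux, here is a small Python sketch (the 5-cycle instance and the function names are my own toy choices) that enumerates all cuts of a graph and verifies ${\alpha(S)/\vert V \vert \leq f(S) \leq 2\alpha(S)/\vert V \vert}$:

```python
from itertools import combinations

def cut_edges(edges, S):
    """Number of edges with exactly one endpoint in S."""
    return sum((u in S) != (v in S) for u, v in edges)

def sparsity(edges, S, V):
    return cut_edges(edges, S) / min(len(S), len(V) - len(S))

def flux(edges, S, V):
    return cut_edges(edges, S) / (len(S) * (len(V) - len(S)))

# Toy instance: the 5-cycle 0-1-2-3-4-0.
V = set(range(5))
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]

# Check alpha(S)/|V| <= f(S) <= 2*alpha(S)/|V| over all nontrivial cuts.
for k in range(1, len(V)):
    for T in combinations(V, k):
        S = set(T)
        a, f = sparsity(edges, S, V), flux(edges, S, V)
        assert a / len(V) - 1e-12 <= f <= 2 * a / len(V) + 1e-12
```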

Let’s formulate flux as an embedding problem: we would like a mapping ${\phi: V \longrightarrow \{0, 1\}}$ that minimizes:

$\displaystyle \frac{\sum_{(u,v) \in E} \vert \phi(u) - \phi(v) \vert }{ \sum_{(u,v) \in V^2} \vert \phi(u) - \phi(v) \vert }$

Let’s relax this a bit. Substitute ${\vert \phi(u) - \phi(v) \vert}$ by ${d_{uv}}$, so we want to minimize ${\frac{\sum_{(u,v) \in E} d_{uv} }{\sum_{(u,v) \in V^2} d_{uv} }}$. A standard trick for minimizing a fraction where both numerator and denominator are linear functions of the same variables is to fix the denominator to be ${1}$ and minimize the numerator. So, we want to minimize ${\sum_{(u,v) \in E} d_{uv}}$ subject to ${\sum_{(u,v) \in V^2} d_{uv} = 1}$. Originally we also wanted ${d_{uv} \in \{ 0, 1\}}$. But this is still not enough: we want ${d_{uv}}$ to be of a specific form called a cut metric, i.e., there is a partition of ${V}$ into two sets and ${d_{uv} = 1}$ iff ${u}$ and ${v}$ are in different sets. We relax this by asking only that ${d}$ be a metric, i.e., non-negative and satisfying the triangle inequality:

$\displaystyle d_{uv} \leq d_{uw} + d_{wv}$

By relaxing a minimization problem we might get a solution whose value is smaller than that of the original problem, so the relaxation gives a lower bound on the optimum. In general, finding a good lower bound is the first step in the analysis of approximation algorithms.

Theorem 1 The optimal value of the flux problem is lower-bounded by the optimal value of the following LP:

$\displaystyle \begin{aligned} \min \quad & \sum_{(u,v) \in E} d_{uv} \\ \text{s.t.} \quad & \sum_{(u,v) \in V^2} d_{uv} = 1 \\ & d_{uv} \leq d_{uw} + d_{wv} \\ & d_{uv} \geq 0 \end{aligned} \ \ \ \ \ (1)$
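For small instances, LP (1) can be solved directly with any off-the-shelf LP solver. Here is a sketch using `scipy.optimize.linprog` (the solver choice and the function name are my own; for simplicity it keeps one variable per unordered pair and normalizes their sum to ${1}$, which only rescales the optimum relative to the ${V^2}$ normalization):

```python
from itertools import combinations
import numpy as np
from scipy.optimize import linprog

def sparsest_cut_lp(n, edges):
    """Solve LP (1) over unordered pairs; returns (optimal value, metric)."""
    pairs = list(combinations(range(n), 2))
    idx = {p: i for i, p in enumerate(pairs)}
    key = lambda u, v: idx[(u, v) if u < v else (v, u)]

    # Objective: sum of d_uv over the edges of the graph.
    c = np.zeros(len(pairs))
    for u, v in edges:
        c[key(u, v)] += 1.0

    # Normalization: sum of all d_uv equals 1.
    A_eq = np.ones((1, len(pairs)))
    b_eq = [1.0]

    # Triangle inequalities d_uv - d_uw - d_wv <= 0, for every triple.
    A_ub, b_ub = [], []
    for u, v, w in combinations(range(n), 3):
        for a, b, m in [(u, v, w), (u, w, v), (v, w, u)]:
            row = np.zeros(len(pairs))
            row[key(a, b)] = 1.0
            row[key(a, m)] -= 1.0
            row[key(m, b)] -= 1.0
            A_ub.append(row)
            b_ub.append(0.0)

    res = linprog(c, A_ub=np.array(A_ub), b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * len(pairs))
    return res.fun, dict(zip(pairs, res.x))
```

For example, on the path ${0 - 1 - 2}$ the optimum is ${1/2}$, attained by stretching the non-edge pair ${(0,2)}$ as much as the triangle inequality allows.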

The idea for approximating flux is to drop the integrality constraint, solve the LP to get a general metric, and then round it to obtain a cut. Suppose ${\{d_{uv}\}}$ is the optimal solution to the LP version of the program above. How do we get a cut from it?

To do it, we will use a powerful result due to Bourgain, but first, let’s discuss some basics about metrics and embeddings. A metric space is a pair ${(X,d)}$ where ${X}$ is a set and ${d:X^2 \longrightarrow {\mathbb R}_+}$ is symmetric and satisfies the triangle inequality. Examples of metric spaces are:

1. ${X}$ is the set of vertices of a weighted graph ${G}$ and ${d(u,v)}$ is the distance from ${u}$ to ${v}$ in the graph;
2. ${X = {\mathbb R}^n}$ and ${d}$ is the ${L_p}$-norm, i.e., ${d(u,v) = ( \sum_{i=1}^n \vert u_i - v_i \vert^p )^{1/p}}$; when ${p = 2}$, this is the Euclidean norm, and when ${p = 1}$, it is the Manhattan distance. This space is called ${L^n_p}$;
3. ${X = {\mathbb R}^n}$ and ${d}$ is the ${L_\infty}$-norm, i.e., ${d(u,v) = \max_i \vert u_i - v_i \vert}$. It is called the ${L_\infty}$ norm because ${\Vert u - v \Vert_\infty = \lim_{p \rightarrow \infty} \Vert u - v \Vert_p}$, where ${\Vert u - v \Vert_p}$ denotes the distance in the ${L_p}$ norm. This space is called ${L^n_\infty}$.

Given two metric spaces ${(X,d_X)}$ and ${(Y, d_Y)}$, we say that a mapping ${\phi:X \longrightarrow Y}$ is a distortion ${D}$ embedding if for all ${u, v \in X}$:

$\displaystyle d_X (u,v) \leq d_Y(\phi(u), \phi(v)) \leq D d_X (u,v)$

To make our lives easier, we will most of the time use the following (not much) weaker definition: we say it is a distortion ${D}$ embedding if there is some factor ${\gamma > 0}$ such that:

$\displaystyle \gamma d_X (u,v) \leq d_Y(\phi(u), \phi(v)) \leq D \gamma d_X (u,v)$

This is the same definition, but allowing some scaling.
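Under the scaled definition, the best distortion of a given finite map can be computed directly: it is the ratio between the largest and smallest expansion factor ${d_Y(\phi(u), \phi(v)) / d_X(u,v)}$ over all pairs. A small sketch (the function name and the unit-square example are my own):

```python
import math

def distortion(points, dX, dY):
    """Best distortion D of the identity map between (points, dX) and
    (points, dY) under the scaled definition: max/min of dY/dX ratios."""
    ratios = [dY(p, q) / dX(p, q)
              for i, p in enumerate(points) for q in points[i + 1:]]
    return max(ratios) / min(ratios)

l1 = lambda p, q: sum(abs(a - b) for a, b in zip(p, q))
l2 = lambda p, q: math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Corners of the unit square: viewing the same points in L2 instead of L1
# distorts distances by exactly sqrt(2) (sides keep ratio 1, diagonals shrink).
square = [(0, 0), (1, 0), (0, 1), (1, 1)]
D = distortion(square, l1, l2)
```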

Bourgain’s Theorem says that:

Theorem 2 (Bourgain) For any metric space ${(X,d)}$ with ${|X| = n}$ there is an embedding into ${L_1^{O(\log^2 n)}}$ with distortion ${O(\log n)}$. And we can construct this embedding in poly-time using a randomized algorithm.

Suppose ${\phi}$ is such an embedding. Then we have a function ${\phi : V \rightarrow {\mathbb R}^{O(\log^2 n)}}$ such that ${d_{uv} \leq \Vert \phi(u) - \phi(v) \Vert_1 \leq d_{uv} \cdot O(\log n)}$. Now, this gives us ${O(n \log^2 n)}$ cuts obtained in the following way: for each coordinate ${i}$ of ${{\mathbb R}^{O(\log^2 n)}}$, order the ${n}$ points according to ${\phi_i(u)}$ and for each ${j = 1 \ldots n-1}$ take the cut ${S}$ formed by the first ${j}$ points. Call this cut ${S_{ij}}$. Then, take the minimum of those cuts.
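The sweep-cut rounding just described can be sketched in a few lines of Python (the function name and the toy 4-cycle embedding are my own illustrations, not part of the algorithm's specification):

```python
def best_sweep_cut(V, edges, phi):
    """Enumerate the n-1 sweep cuts per coordinate of the embedding phi
    (a dict vertex -> tuple) and return the (flux, cut) of minimum flux."""
    def flux(S):
        cut = sum((u in S) != (v in S) for u, v in edges)
        return cut / (len(S) * (len(V) - len(S)))

    best, best_S = float("inf"), None
    dim = len(next(iter(phi.values())))
    for i in range(dim):                      # one sweep per coordinate
        order = sorted(V, key=lambda v: phi[v][i])
        for j in range(1, len(order)):        # first j points form the cut
            S = set(order[:j])
            if flux(S) < best:
                best, best_S = flux(S), S
    return best, best_S

# Toy example: the 4-cycle embedded as the corners of the unit square.
V = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
phi = {0: (0, 0), 1: (1, 0), 2: (1, 1), 3: (0, 1)}
f, S = best_sweep_cut(V, edges, phi)
```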

Theorem 3 The procedure described above generates a cut within a factor of ${O(\log n)}$ to the optimal in poly-time.

To prove this, first we need some notation. Given the cut ${(S_{ij}, \overline{S_{ij}})}$, let ${d_{S_{ij}}}$ be the cut metric associated with it, i.e., ${d_{S_{ij}} (u,v)}$ is ${0}$ if ${u}$ and ${v}$ are on the same side of the cut and ${1}$ otherwise. Let also ${\{ x_{i1} \leq x_{i2} \leq \ldots \leq x_{in} \} = \{ \phi_i(v); v \in V \}}$. It is not difficult to see that:

$\displaystyle \vert \phi_i(u) - \phi_i(v) \vert = \sum_{j=1}^{n-1} (x_{i,j+1} - x_{i,j}) d_{S_{ij}} (u,v)$

and, therefore:

$\displaystyle \Vert \phi(u) - \phi(v) \Vert_1 = \sum_{i=1}^{O(\log^2 n)} \sum_{j=1}^{n-1} (x_{i,j+1} - x_{i,j}) d_{S_{ij}} (u,v)$
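The decomposition above is the standard fact that a line metric is a non-negative combination of cut metrics. A quick numerical check of the single-coordinate identity (the function name and sample values are arbitrary):

```python
def max_decomposition_error(xs):
    """For points on a line, compare |x_u - x_v| with the sum over sweep cuts
    S_j = {first j points in sorted order} of (x_{j+1} - x_j) * d_{S_j}(u, v)."""
    n = len(xs)
    order = sorted(range(n), key=lambda v: xs[v])  # vertices sorted by position
    sx = sorted(xs)                                # sorted positions x_1 <= ... <= x_n
    err = 0.0
    for u in range(n):
        for v in range(n):
            # Add the gap (x_{j+1} - x_j) whenever the cut S_j separates u and v.
            total = sum(sx[j + 1] - sx[j]
                        for j in range(n - 1)
                        if (u in order[:j + 1]) != (v in order[:j + 1]))
            err = max(err, abs(total - abs(xs[u] - xs[v])))
    return err
```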

Now, we can go ahead and prove the theorem: Let ${f^*}$ be the optimal flux and ${f}$ the flux obtained by the algorithm. We have that:

$\displaystyle f^* \leq f = \min_{ij} \frac{ \sum_{(u,v)\in E} d_{S_{ij}}(u,v)}{ \sum_{(u,v)\in V^2} d_{S_{ij}}(u,v)} = \min_{ij} \frac{ c_{ij} \sum_{(u,v)\in E} d_{S_{ij}}(u,v)}{ c_{ij} \sum_{(u,v)\in V^2} d_{S_{ij}}(u,v)}$

where ${c_{ij} = x_{i,j+1} - x_{i,j}}$. Since the minimum of a set of fractions is at most the ratio of the sums of their numerators and denominators (the mediant inequality), we get:

$\displaystyle f \leq \frac{\sum_{ij} \left[ c_{ij} \sum_{(u,v)\in E} d_{S_{ij}}(u,v) \right] }{\sum_{ij} \left[ c_{ij} \sum_{(u,v)\in V^2} d_{S_{ij}}(u,v) \right] } = \frac{\sum_{(u,v)\in E} \left[ \sum_{ij} c_{ij} d_{S_{ij}}(u,v) \right] }{ \sum_{(u,v)\in V^2} \left[ \sum_{ij} c_{ij} d_{S_{ij}}(u,v) \right] }$

but this is exactly the expression for the ${L_1}$-norm, so:

$\displaystyle f \leq \frac{\sum_{(u,v)\in E} \Vert \phi(u) - \phi(v) \Vert_1 }{ \sum_{(u,v)\in V^2} \Vert \phi(u) - \phi(v) \Vert_1 } \leq \frac{\sum_{(u,v)\in E} O(\log n) d_{uv} }{ \sum_{(u,v)\in V^2} d_{uv} }$

which is at most ${f^* O(\log n)}$, since the last ratio is the value of the LP, which is a relaxation of the original problem and hence at most ${f^*}$. And this finishes the proof.

This is one of the first applications of metric embeddings. It was introduced in a paper by Linial, London and Rabinovich. There are very nice resources on the web about metric embeddings, such as the course by Anupam Gupta at CMU, the course by Michael Goemans at MIT and the course by Tim Roughgarden at Stanford.

This is also an area full of exciting open problems. It is known, for example, that it is hard to decide whether a given metric is embeddable in ${L_1}$ or not. But given a metric that we know is isometrically embeddable in ${L_1}$, how do we find the embedding? This is an open problem. In fact, it is unknown even how to approximate this embedding by a factor better than the ${O(\log n)}$ given by Bourgain’s Theorem.
