<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Big Red Bits &#187; probability</title>
	<atom:link href="http://www.bigredbits.com/archives/tag/probability/feed" rel="self" type="application/rss+xml" />
	<link>http://www.bigredbits.com</link>
	<description>Theory, Distributed Systems, and Other Random Bits</description>
	<lastBuildDate>Thu, 29 Sep 2011 07:13:29 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Bayesian updates and the Lake Wobegon effect</title>
		<link>http://www.bigredbits.com/archives/643</link>
		<comments>http://www.bigredbits.com/archives/643#comments</comments>
		<pubDate>Mon, 26 Sep 2011 01:48:27 +0000</pubDate>
		<dc:creator>renatoppl</dc:creator>
				<category><![CDATA[theory]]></category>
		<category><![CDATA[economics]]></category>
		<category><![CDATA[probability]]></category>

		<guid isPermaLink="false">http://www.bigredbits.com/?p=643</guid>
		<description><![CDATA[We seem to have a good mathematical understanding of Bayesian updates, but somehow a very poor understanding of their practical implications. There are many situations in practice that we easily perceive as irrational. One of the most famous is the so-called Lake Wobegon effect, named after the fictional town in Minnesota, where &#8220;all the women are [...]]]></description>
			<content:encoded><![CDATA[<p>We seem to have a good mathematical understanding of Bayesian updates, but somehow a very poor understanding of their practical implications. There are many situations in practice that we easily perceive as irrational. One of the most famous is the so-called <a href="http://en.wikipedia.org/wiki/Illusory_superiority">Lake Wobegon effect</a>, named after the <a href="http://en.wikipedia.org/wiki/Lake_Wobegon">fictional town in Minnesota</a>, where &#8220;all the women are strong, all the men are good looking, and all the children are above average&#8221;. It is described as a cognitive bias where individuals tend to overestimate their own capabilities. For example, when drivers are asked to rate their own skill relative to the average by placing themselves in one of three groups (low-skilled, medium-skilled and high-skilled), most rate themselves above the average.</p>
<p>In fact, the behavioral economics literature is full of examples like this, where the observed data is far from what you would expect to observe if all agents were rational &#8211; and those are normally attributed to cognitive biases. I was always a bit suspicious of such arguments: it was never clear whether agents were simply not being rational or whether their true objective wasn&#8217;t being captured by the model. I always thought the second was a lot more likely.</p>
<p>One of the main problems of the irrationality argument is that it ignores the fact that agents live in a world whose state is not completely observed. In a beautiful paper in Econometrica called &#8220;<a href="http://www2.um.edu.uy/dubraj/documentos/Apparentfinal.pdf">Apparent Overconfidence</a>&#8221;, Benoit and Dubra argue that:</p>
<blockquote><p>&#8220;But the simple truism that most people cannot be better than the median does not imply that most people cannot rationally rate themselves above the median.&#8221;</p></blockquote>
<p>The authors show that it is possible to reverse engineer a signaling scheme such that rational behavior is consistent with the observed data. Here is a simple example from their introduction: suppose each driver has one of three skill types, low, medium or high, <img src='http://s.wordpress.com/latex.php?latex=%5C%7BL%2CM%2CH%5C%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\{L,M,H\}' title='\{L,M,H\}' class='latex' /> with <img src='http://s.wordpress.com/latex.php?latex=%5Cmathbb%7BP%7D%28L%29%20%3D%20%5Cmathbb%7BP%7D%28M%29%20%3D%20%5Cmathbb%7BP%7D%28H%29%20%3D%20%5Cfrac%7B1%7D%7B3%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\mathbb{P}(L) = \mathbb{P}(M) = \mathbb{P}(H) = \frac{1}{3}' title='\mathbb{P}(L) = \mathbb{P}(M) = \mathbb{P}(H) = \frac{1}{3}' class='latex' />. However, drivers can&#8217;t observe their own type. They can only observe some sample of their driving. Let&#8217;s say for simplicity that they observe a signal <img src='http://s.wordpress.com/latex.php?latex=A&#038;bg=T&#038;fg=000000&#038;s=0' alt='A' title='A' class='latex' /> that says whether they caused an accident or not. Assume also that the higher the skill of a driver, the lower his probability of causing an accident, say:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Cmathbb%7BP%7D%28A%20%5Cvert%20L%29%20%3D%20%5Cfrac%7B47%7D%7B80%7D%2C%20%5Cmathbb%7BP%7D%28A%20%5Cvert%20M%29%20%3D%20%5Cfrac%7B9%7D%7B16%7D%2C%20%5Cmathbb%7BP%7D%28A%20%5Cvert%20H%29%20%3D%20%5Cfrac%7B1%7D%7B20%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\mathbb{P}(A \vert L) = \frac{47}{80}, \mathbb{P}(A \vert M) = \frac{9}{16}, \mathbb{P}(A \vert H) = \frac{1}{20}' title='\mathbb{P}(A \vert L) = \frac{47}{80}, \mathbb{P}(A \vert M) = \frac{9}{16}, \mathbb{P}(A \vert H) = \frac{1}{20}' class='latex' /></p>
<p style="text-align: left;">Before observing <img src='http://s.wordpress.com/latex.php?latex=A&#038;bg=T&#038;fg=000000&#038;s=0' alt='A' title='A' class='latex' /> each driver thinks of himself as having probability <img src='http://s.wordpress.com/latex.php?latex=%5Cfrac%7B1%7D%7B3%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\frac{1}{3}' title='\frac{1}{3}' class='latex' /> of having each type of skill. Now, after observing <img src='http://s.wordpress.com/latex.php?latex=A&#038;bg=T&#038;fg=000000&#038;s=0' alt='A' title='A' class='latex' />, they update their belief according to Bayes&#8217; rule, i.e.,</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Cmathbb%7BP%7D%28s%20%5Cvert%20A%29%20%3D%20%5Cfrac%7B%20%5Cmathbb%7BP%7D%28A%20%5Cvert%20s%29%20%20%20%5Cmathbb%7BP%7D%28s%29%20%20%7D%7B%20%5Csum_%7Bs%27%7D%20%20%5Cmathbb%7BP%7D%28A%20%5Cvert%20s%27%29%20%20%20%5Cmathbb%7BP%7D%28s%27%29%20%20%7D%20&#038;bg=T&#038;fg=000000&#038;s=0' alt='\mathbb{P}(s \vert A) = \frac{ \mathbb{P}(A \vert s)   \mathbb{P}(s)  }{ \sum_{s&#039;}  \mathbb{P}(A \vert s&#039;)   \mathbb{P}(s&#039;)  } ' title='\mathbb{P}(s \vert A) = \frac{ \mathbb{P}(A \vert s)   \mathbb{P}(s)  }{ \sum_{s&#039;}  \mathbb{P}(A \vert s&#039;)   \mathbb{P}(s&#039;)  } ' class='latex' /></p>
<p style="text-align: left;">Doing the calculations, we have that <img src='http://s.wordpress.com/latex.php?latex=%5Cmathbb%7BP%7D%28A%29%20%3D%20%5Cfrac%7B2%7D%7B5%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\mathbb{P}(A) = \frac{2}{5}' title='\mathbb{P}(A) = \frac{2}{5}' class='latex' />, and the <img src='http://s.wordpress.com/latex.php?latex=%5Cfrac%7B3%7D%7B5%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\frac{3}{5}' title='\frac{3}{5}' class='latex' /> of the drivers who didn&#8217;t cause an accident will evaluate <img src='http://s.wordpress.com/latex.php?latex=%5Cmathbb%7BP%7D%28L%20%5Cvert%20%5Cneg%20A%29%20%3D%20%5Cfrac%7B11%7D%7B48%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\mathbb{P}(L \vert \neg A) = \frac{11}{48}' title='\mathbb{P}(L \vert \neg A) = \frac{11}{48}' class='latex' />, <img src='http://s.wordpress.com/latex.php?latex=%5Cmathbb%7BP%7D%28M%20%5Cvert%20%5Cneg%20A%29%20%3D%20%5Cfrac%7B35%7D%7B144%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\mathbb{P}(M \vert \neg A) = \frac{35}{144}' title='\mathbb{P}(M \vert \neg A) = \frac{35}{144}' class='latex' />, <img src='http://s.wordpress.com/latex.php?latex=%5Cmathbb%7BP%7D%28H%20%5Cvert%20%5Cneg%20A%29%20%3D%20%5Cfrac%7B19%7D%7B36%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\mathbb{P}(H \vert \neg A) = \frac{19}{36}' title='\mathbb{P}(H \vert \neg A) = \frac{19}{36}' class='latex' />, so:</p>
<div id="_mcePaste" style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Cmathbb%7BP%7D%28H%20%5Cvert%20%5Cneg%20A%29%20%3E%20%5Cmathbb%7BP%7D%28L%20%5Ccup%20M%20%5Cvert%20%5Cneg%20A%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='\mathbb{P}(H \vert \neg A) &gt; \mathbb{P}(L \cup M \vert \neg A)' title='\mathbb{P}(H \vert \neg A) &gt; \mathbb{P}(L \cup M \vert \neg A)' class='latex' /></div>
<p>and therefore will report high-skill. So three fifths of the drivers place themselves in the top group, even though only one third of them actually belong there. Notice this is totally consistent with rational Bayesian-updaters. The main question in the paper is: &#8220;when is it possible to reverse engineer a signaling scheme?&#8221;. More formally, let <img src='http://s.wordpress.com/latex.php?latex=%5CTheta&#038;bg=T&#038;fg=000000&#038;s=0' alt='\Theta' title='\Theta' class='latex' /> be a set of types of users and let <img src='http://s.wordpress.com/latex.php?latex=%5Ctheta%20%5Csim%20H%20%5Cin%20%5CDelta%28%5CTheta%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='\theta \sim H \in \Delta(\Theta)' title='\theta \sim H \in \Delta(\Theta)' class='latex' />, i.e., <img src='http://s.wordpress.com/latex.php?latex=H&#038;bg=T&#038;fg=000000&#038;s=0' alt='H' title='H' class='latex' /> is a distribution on the types which is common knowledge. Now, if we ask agents to report their type, their reports follow some <img src='http://s.wordpress.com/latex.php?latex=H%27%20%5Cin%20%5CDelta%28%5CTheta%29%2C%20H%27%20%5Cneq%20H&#038;bg=T&#038;fg=000000&#038;s=0' alt='H&#039; \in \Delta(\Theta), H&#039; \neq H' title='H&#039; \in \Delta(\Theta), H&#039; \neq H' class='latex' />. Is there a signaling scheme <img src='http://s.wordpress.com/latex.php?latex=S&#038;bg=T&#038;fg=000000&#038;s=0' alt='S' title='S' class='latex' />, which can be interpreted as a random variable correlated with <img src='http://s.wordpress.com/latex.php?latex=%5Ctheta&#038;bg=T&#038;fg=000000&#038;s=0' alt='\theta' title='\theta' class='latex' />, such that <img src='http://s.wordpress.com/latex.php?latex=H%27&#038;bg=T&#038;fg=000000&#038;s=0' alt='H&#039;' title='H&#039;' class='latex' /> is the distribution rational Bayesian updaters would report based on what they observed from <img src='http://s.wordpress.com/latex.php?latex=S&#038;bg=T&#038;fg=000000&#038;s=0' alt='S' title='S' class='latex' />?
The authors give necessary and sufficient conditions for when this is possible given <img src='http://s.wordpress.com/latex.php?latex=H%2CH%27&#038;bg=T&#038;fg=000000&#038;s=0' alt='H,H&#039;' title='H,H&#039;' class='latex' />.</p>
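<p>The arithmetic of the driver example can be checked with a few lines of Python. This is only a sketch: it assumes the three accident probabilities 47/80, 9/16 and 1/20 belong to the low, medium and high types in that order, and the names <code>prior</code>, <code>p_accident</code> and <code>posterior</code> are made up for the illustration.</p>

```python
from fractions import Fraction

# Prior over skill types and accident probabilities (assumed order: L, M, H).
prior = {"L": Fraction(1, 3), "M": Fraction(1, 3), "H": Fraction(1, 3)}
p_accident = {"L": Fraction(47, 80), "M": Fraction(9, 16), "H": Fraction(1, 20)}


def posterior(prior, likelihood):
    """Bayes rule: returns P(s | signal) and the marginal P(signal)."""
    joint = {s: likelihood[s] * prior[s] for s in prior}
    total = sum(joint.values())  # marginal probability of the signal
    return {s: joint[s] / total for s in joint}, total


# The likelihood of observing no accident is the complement of p_accident.
p_no_accident = {s: 1 - p for s, p in p_accident.items()}
post_no_acc, prob_no_acc = posterior(prior, p_no_accident)

print("P(no accident) =", prob_no_acc)
for s in "LMH":
    print(f"P({s} | no accident) =", post_no_acc[s])
```

<p>Exact fractions are used so that the posteriors can be compared directly with the values quoted in the text, with no rounding involved.</p>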
<p style="text-align: center;">&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8211;</p>
<p style="text-align: left;">A note also related to the Lake Wobegon effect: I started reading a very nice book by Duncan Watts called &#8220;<a href="http://www.amazon.com/Everything-Obvious-Once-Know-Answer/dp/0385531680/ref=sr_1_1?s=books&amp;ie=UTF8&amp;qid=1317001355&amp;sr=1-1">Everything Is Obvious: *Once You Know the Answer</a>&#8221; about the traps of common sense. The discussion is different than the one above, but it also talks about the dangers of applying our usual common sense, which is very useful in our daily life, to scientific results. I highly recommend reading the intro of the book, which is open on Amazon. He gives examples of social phenomena where, once you are told them, you think: &#8220;oh yeah, this is obvious&#8221;. But then if you were told the exact opposite (in fact, he begins the example by telling you the opposite of what is observed in the data), you&#8217;d also think &#8220;yes, yes, this is obvious&#8221; and come up with very natural explanations. His point is that common sense is very useful for explaining observed data, especially observations of social data. On the other hand, it performs very poorly at predicting what the data will look like before actually seeing it.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.bigredbits.com/archives/643/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>MHR, Regular Distributions and Myerson&#8217;s Lemma</title>
		<link>http://www.bigredbits.com/archives/539</link>
		<comments>http://www.bigredbits.com/archives/539#comments</comments>
		<pubDate>Mon, 30 May 2011 10:46:08 +0000</pubDate>
		<dc:creator>renatoppl</dc:creator>
				<category><![CDATA[theory]]></category>
		<category><![CDATA[probability]]></category>
		<category><![CDATA[profit maximization]]></category>

		<guid isPermaLink="false">http://www.bigredbits.com/?p=539</guid>
		<description><![CDATA[Monotone Hazard Rate (MHR) distributions and their superclass regular distributions keep appearing in the Mechanism Design literature and this is due to a very good reason: they are the class of distributions for which Myerson&#8217;s Optimal Auction is simple and natural. Let&#8217;s briefly discuss some properties of those distributions. First, two definitions: Hazard rate of [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Monotone Hazard Rate</strong> (MHR) distributions and their superclass <strong>regular distributions</strong> keep appearing in the Mechanism Design literature and this is due to a very good reason: they are the class of distributions for which <a href="http://www.econ.yale.edu/~dirkb/teach/521b-08-09/reading/1981%20optimal%20auction.pdf">Myerson&#8217;s Optimal Auction</a> is simple and natural. Let&#8217;s briefly discuss some properties of those distributions. First, two definitions:</p>
<ol>
<li>Hazard rate of a distribution <img src='http://s.wordpress.com/latex.php?latex=f&#038;bg=T&#038;fg=000000&#038;s=0' alt='f' title='f' class='latex' /> : <img src='http://s.wordpress.com/latex.php?latex=h%28z%29%20%3D%20%5Cfrac%7Bf%28z%29%7D%7B1-F%28z%29%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='h(z) = \frac{f(z)}{1-F(z)}' title='h(z) = \frac{f(z)}{1-F(z)}' class='latex' /></li>
<li>Myerson virtual value of a distribution <img src='http://s.wordpress.com/latex.php?latex=f&#038;bg=T&#038;fg=000000&#038;s=0' alt='f' title='f' class='latex' /> : <img src='http://s.wordpress.com/latex.php?latex=%5Cphi%28z%29%20%3D%20z%20-%20%5Cfrac%7B1-F%28z%29%7D%7Bf%28z%29%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\phi(z) = z - \frac{1-F(z)}{f(z)}' title='\phi(z) = z - \frac{1-F(z)}{f(z)}' class='latex' /></li>
</ol>
<p>We can interpret the hazard rate in the following way: think of <img src='http://s.wordpress.com/latex.php?latex=T%20%5Csim%20f&#038;bg=T&#038;fg=000000&#038;s=0' alt='T \sim f' title='T \sim f' class='latex' /> as a random variable that indicates the time that a light bulb will take to burn out. If we are at time <img src='http://s.wordpress.com/latex.php?latex=t&#038;bg=T&#038;fg=000000&#038;s=0' alt='t' title='t' class='latex' /> and the light bulb hasn&#8217;t burned out so far, the probability that it will burn out within the next <img src='http://s.wordpress.com/latex.php?latex=%5Cdelta&#038;bg=T&#038;fg=000000&#038;s=0' alt='\delta' title='\delta' class='latex' /> units of time is:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Cmathbb%7BP%7D%5BT%20%5Cleq%20t%2B%5Cdelta%20%5Cvert%20T%20%3E%20t%5D%20%5Capprox%20%5Cfrac%7Bf%28t%29%20%5Cdelta%7D%7B1-F%28t%29%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\mathbb{P}[T \leq t+\delta \vert T &gt; t] \approx \frac{f(t) \delta}{1-F(t)}' title='\mathbb{P}[T \leq t+\delta \vert T &gt; t] \approx \frac{f(t) \delta}{1-F(t)}' class='latex' /></p>
<p style="text-align: left;">We say that a distribution is monotone hazard rate, if <img src='http://s.wordpress.com/latex.php?latex=h%28z%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='h(z)' title='h(z)' class='latex' /> is non-decreasing. This is very natural for light bulbs, for example. Many of the distributions that we are used to are MHR, for example, uniform, exponential and normal. The way that I like to think about MHR distributions is the following: if some distribution has hazard rate <img src='http://s.wordpress.com/latex.php?latex=h%28z%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='h(z)' title='h(z)' class='latex' />, then it means that <img src='http://s.wordpress.com/latex.php?latex=F%27%28z%29%20%3D%20%281-F%28z%29%29%20h%28z%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='F&#039;(z) = (1-F(z)) h(z)' title='F&#039;(z) = (1-F(z)) h(z)' class='latex' />. If we define <img src='http://s.wordpress.com/latex.php?latex=G%28z%29%20%3D%201-F%28z%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='G(z) = 1-F(z)' title='G(z) = 1-F(z)' class='latex' />, then <img src='http://s.wordpress.com/latex.php?latex=%28log%20G%28z%29%29%27%20%3D%20%5Cfrac%7BG%27%28z%29%7D%7BG%28z%29%7D%20%3D%20-h%28z%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='(log G(z))&#039; = \frac{G&#039;(z)}{G(z)} = -h(z)' title='(log G(z))&#039; = \frac{G&#039;(z)}{G(z)} = -h(z)' class='latex' />, so:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=F%28z%29%20%3D%201-%5Ctext%7Bexp%7D%28-%5Cint_0%5Ez%20h%28u%29%20du%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='F(z) = 1-\text{exp}(-\int_0^z h(u) du)' title='F(z) = 1-\text{exp}(-\int_0^z h(u) du)' class='latex' /></p>
<p style="text-align: left;">From this characterization, it is simple to see that the extremal distributions for this class, i.e. the distributions that are on the edge between MHR and non-MHR, are those with constant hazard rate, which correspond to the exponential distribution <img src='http://s.wordpress.com/latex.php?latex=F%28z%29%20%3D%201-e%5E%7B-%5Clambda%20z%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='F(z) = 1-e^{-\lambda z}' title='F(z) = 1-e^{-\lambda z}' class='latex' /> for <img src='http://s.wordpress.com/latex.php?latex=z%20%5Cin%20%5B0%2C%5Cinfty%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='z \in [0,\infty)' title='z \in [0,\infty)' class='latex' />. The way I like to think about those distributions is that whenever you are able to prove something about the exponential distribution, then you can prove a similar statement about MHR distributions. Consider these three examples:</p>
<p style="text-align: left;"><strong>Example 1: </strong><img src='http://s.wordpress.com/latex.php?latex=%5Cmathbb%7BP%7D%5B%5Cphi%28z%29%20%5Cgeq%200%5D%20%5Cgeq%20%5Cfrac%7B1%7D%7Be%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\mathbb{P}[\phi(z) \geq 0] \geq \frac{1}{e}' title='\mathbb{P}[\phi(z) \geq 0] \geq \frac{1}{e}' class='latex' /> for MHR distributions. This fact is straightforward for the exponential distribution. For the exponential distribution <img src='http://s.wordpress.com/latex.php?latex=%5Cphi%28z%29%20%3D%20z-%5Clambda%5E%7B-1%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\phi(z) = z-\lambda^{-1}' title='\phi(z) = z-\lambda^{-1}' class='latex' /> and therefore</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Cmathbb%7BP%7D%5B%5Cphi%28z%29%20%5Cgeq%200%5D%20%5Cgeq%20%5Cmathbb%7BP%7D%5Bz%20%3E%20%5Clambda%5E%7B-1%7D%5D%20%3D%201-F%28%5Clambda%5E%7B-1%7D%29%20%3D%20e%5E%7B-1%7D%20&#038;bg=T&#038;fg=000000&#038;s=0' alt='\mathbb{P}[\phi(z) \geq 0] \geq \mathbb{P}[z &gt; \lambda^{-1}] = 1-F(\lambda^{-1}) = e^{-1} ' title='\mathbb{P}[\phi(z) \geq 0] \geq \mathbb{P}[z &gt; \lambda^{-1}] = 1-F(\lambda^{-1}) = e^{-1} ' class='latex' /></p>
<p style="text-align: left;">but the proof for MHR is equally simple: Let <img src='http://s.wordpress.com/latex.php?latex=r%20%3D%20%5Cinf%20%5C%7Bz%3B%20%5Cphi%28z%29%20%5Cgeq%200%5C%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='r = \inf \{z; \phi(z) \geq 0\}' title='r = \inf \{z; \phi(z) \geq 0\}' class='latex' />, therefore <img src='http://s.wordpress.com/latex.php?latex=r%20h%28r%29%20%5Cleq%201&#038;bg=T&#038;fg=000000&#038;s=0' alt='r h(r) \leq 1' title='r h(r) \leq 1' class='latex' />.<br />
<img src='http://s.wordpress.com/latex.php?latex=P%5C%7B%20%5Cphi%28v%29%20%5Cgeq%200%5C%7D%20%3D%20P%5C%7B%20v%20%5Cgeq%20r%20%5C%7D%20%3D%201%20-%20F%28r%29%20%3D%20e%5E%7B-%5Cint_0%5Er%20h%28u%29%20du%7D%20%5Cgeq%20e%5E%7B-r%20h%28r%29%7D%20%5Cgeq%20e%5E%7B-1%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='P\{ \phi(v) \geq 0\} = P\{ v \geq r \} = 1 - F(r) = e^{-\int_0^r h(u) du} \geq e^{-r h(r)} \geq e^{-1}' title='P\{ \phi(v) \geq 0\} = P\{ v \geq r \} = 1 - F(r) = e^{-\int_0^r h(u) du} \geq e^{-r h(r)} \geq e^{-1}' class='latex' /></p>
<p><strong>Example 2</strong>: Given <img src='http://s.wordpress.com/latex.php?latex=z_1%2C%20z_2%20%5Csim%20f&#038;bg=T&#038;fg=000000&#038;s=0' alt='z_1, z_2 \sim f' title='z_1, z_2 \sim f' class='latex' /> iid where <img src='http://s.wordpress.com/latex.php?latex=f&#038;bg=T&#038;fg=000000&#038;s=0' alt='f' title='f' class='latex' /> is MHR and <img src='http://s.wordpress.com/latex.php?latex=v_1%20%3D%20%5Cmax%20%5C%7Bz_1%2C%20z_2%5C%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='v_1 = \max \{z_1, z_2\}' title='v_1 = \max \{z_1, z_2\}' class='latex' /> and <img src='http://s.wordpress.com/latex.php?latex=v_2%20%3D%20%5Cmin%20%5C%7Bz_1%2C%20z_2%5C%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='v_2 = \min \{z_1, z_2\}' title='v_2 = \min \{z_1, z_2\}' class='latex' />, then <img src='http://s.wordpress.com/latex.php?latex=%5Cmathbb%7BE%7D%5Bv_2%5D%20%5Cgeq%20%5Cfrac%7B1%7D%7B3%7D%20%5Cmathbb%7BE%7D%5Bv_1%5D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\mathbb{E}[v_2] \geq \frac{1}{3} \mathbb{E}[v_1]' title='\mathbb{E}[v_2] \geq \frac{1}{3} \mathbb{E}[v_1]' class='latex' />. The proof for the exponential distribution is trivial and, in fact, the bound is tight for the exponential. For general MHR distributions, the trick is to use the convexity of <img src='http://s.wordpress.com/latex.php?latex=z%20%5Cmapsto%20%5Cint_0%5Ez%20h%28u%29%20du&#038;bg=T&#038;fg=000000&#038;s=0' alt='z \mapsto \int_0^z h(u) du' title='z \mapsto \int_0^z h(u) du' class='latex' />, which holds because its derivative is the non-decreasing hazard rate. Since this function vanishes at zero, convexity gives <img src='http://s.wordpress.com/latex.php?latex=%5Cint_0%5E%7B2z%7D%20h%20%5Cgeq%202%20%5Cint_0%5Ez%20h&#038;bg=T&#038;fg=000000&#038;s=0' alt='\int_0^{2z} h \geq 2 \int_0^z h' title='\int_0^{2z} h \geq 2 \int_0^z h' class='latex' />, which we use in the following way:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Cmathbb%7BE%7D%20%5Bv_2%5D%20%3D%20%5Cint_0%5E%5Cinfty%20%281%20-%20F%28z%29%29%5E2%20dz%20%3D%20%5Cint_0%5E%5Cinfty%20e%5E%7B-2%20%5Cint_0%5Ez%20h%7D%20dz%20&#038;bg=T&#038;fg=000000&#038;s=0' alt='\mathbb{E} [v_2] = \int_0^\infty (1 - F(z))^2 dz = \int_0^\infty e^{-2 \int_0^z h} dz ' title='\mathbb{E} [v_2] = \int_0^\infty (1 - F(z))^2 dz = \int_0^\infty e^{-2 \int_0^z h} dz ' class='latex' /></p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Cgeq%20%5Cint_0%5E%5Cinfty%20e%5E%7B-%5Cint_0%5E%7B2z%7D%20h%7D%20dz%3D%20%5Cfrac%7B1%7D%7B2%7D%20%5Cint_0%5E%5Cinfty%201%20-%20F%28z%29%20dz%20%3D%20%5Cfrac%7B1%7D%7B2%7D%20%5Cmathbb%7BE%7D%20%5Bz%5D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\geq \int_0^\infty e^{-\int_0^{2z} h} dz= \frac{1}{2} \int_0^\infty 1 - F(z) dz = \frac{1}{2} \mathbb{E} [z]' title='\geq \int_0^\infty e^{-\int_0^{2z} h} dz= \frac{1}{2} \int_0^\infty 1 - F(z) dz = \frac{1}{2} \mathbb{E} [z]' class='latex' /></p>
<p>Since <img src='http://s.wordpress.com/latex.php?latex=%5Cmathbb%7BE%7D%20%5Bv_1%20%2B%20v_2%5D%20%3D%20%5Cmathbb%7BE%7D%20%5Bz_1%20%2B%20z_2%5D%20%3D%202%20%5Cmathbb%7BE%7D%20%5Bz%5D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\mathbb{E} [v_1 + v_2] = \mathbb{E} [z_1 + z_2] = 2 \mathbb{E} [z]' title='\mathbb{E} [v_1 + v_2] = \mathbb{E} [z_1 + z_2] = 2 \mathbb{E} [z]' class='latex' />, we have that <img src='http://s.wordpress.com/latex.php?latex=%5Cmathbb%7BE%7D%5Bv_1%5D%20%3D%202%20%5Cmathbb%7BE%7D%5Bz%5D%20-%20%5Cmathbb%7BE%7D%5Bv_2%5D%20%5Cleq%20%5Cfrac%7B3%7D%7B2%7D%20%5Cmathbb%7BE%7D%5Bz%5D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\mathbb{E}[v_1] = 2 \mathbb{E}[z] - \mathbb{E}[v_2] \leq \frac{3}{2} \mathbb{E}[z]' title='\mathbb{E}[v_1] = 2 \mathbb{E}[z] - \mathbb{E}[v_2] \leq \frac{3}{2} \mathbb{E}[z]' class='latex' />. This way, we get: <img src='http://s.wordpress.com/latex.php?latex=%5Cmathbb%7BE%7D%5Bv_2%5D%20%5Cgeq%20%5Cfrac%7B1%7D%7B2%7D%5Cmathbb%7BE%7D%5Bz%5D%20%5Cgeq%20%5Cfrac%7B1%7D%7B2%7D%20%5Ccdot%20%5Cfrac%7B2%7D%7B3%7D%20%5Cmathbb%7BE%7D%5Bv_1%5D%20%3D%20%5Cfrac%7B1%7D%7B3%7D%20%5Cmathbb%7BE%7D%5Bv_1%5D%20&#038;bg=T&#038;fg=000000&#038;s=0' alt='\mathbb{E}[v_2] \geq \frac{1}{2}\mathbb{E}[z] \geq \frac{1}{2} \cdot \frac{2}{3} \mathbb{E}[v_1] = \frac{1}{3} \mathbb{E}[v_1] ' title='\mathbb{E}[v_2] \geq \frac{1}{2}\mathbb{E}[z] \geq \frac{1}{2} \cdot \frac{2}{3} \mathbb{E}[v_1] = \frac{1}{3} \mathbb{E}[v_1] ' class='latex' /></p>
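<p>As a sanity check of the bound in Example 2, here is a small numerical sketch for two MHR distributions, the exponential with rate 1 and the uniform on [0, 1]. The helper names (<code>integrate</code>, <code>min_max_means</code>) and the truncation of the exponential at 50 are choices made for this illustration only; the expectations are computed from the tail-integral formulas used in the derivation.</p>

```python
import math


def integrate(g, upper, steps=100_000):
    """Midpoint rule for the integral of g over [0, upper]."""
    width = upper / steps
    return width * sum(g((i + 0.5) * width) for i in range(steps))


def min_max_means(cdf, upper):
    """E[min] and E[max] of two iid draws, via tail integrals of the CDF."""
    e_min = integrate(lambda z: (1 - cdf(z)) ** 2, upper)
    e_max = integrate(lambda z: 1 - cdf(z) ** 2, upper)
    return e_min, e_max


# Each entry is (CDF, upper integration limit). The exponential tail beyond 50
# is negligible, so truncating there is an approximation, not an exact value.
examples = {
    "exponential": (lambda z: 1 - math.exp(-z), 50.0),
    "uniform": (lambda z: min(z, 1.0), 1.0),
}

ratios = {}
for name, (cdf, upper) in examples.items():
    e_min, e_max = min_max_means(cdf, upper)
    ratios[name] = e_min / e_max
    print(name, round(e_min, 4), round(e_max, 4), round(ratios[name], 4))
```

<p>The ratio for the exponential comes out at approximately 1/3, which is the tight case, while the uniform gives approximately 1/2, comfortably above the bound.</p>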
<p><strong>Example 3: </strong>For MHR distributions, there is a simple lemma that relates the virtual value and the real value and this lemma is quite useful in various settings: let <img src='http://s.wordpress.com/latex.php?latex=r%20%3D%20%5Cinf%20%5C%7Bz%3B%20%5Cphi%28z%29%20%3E%200%20%5C%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='r = \inf \{z; \phi(z) &gt; 0 \}' title='r = \inf \{z; \phi(z) &gt; 0 \}' class='latex' />, then for <img src='http://s.wordpress.com/latex.php?latex=z%20%5Cgeq%20r&#038;bg=T&#038;fg=000000&#038;s=0' alt='z \geq r' title='z \geq r' class='latex' />, <img src='http://s.wordpress.com/latex.php?latex=%5Cphi%28z%29%20%5Cgeq%20z%20-%20r&#038;bg=T&#038;fg=000000&#038;s=0' alt='\phi(z) \geq z - r' title='\phi(z) \geq z - r' class='latex' />. Again, this is tight for the exponential distribution. The proof is short: the inverse hazard rate is non-increasing for MHR distributions, and the virtual value vanishes at the threshold, so:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=x%20-%20%5Cphi%28x%29%20%3D%20%5Cfrac%7B1-F%28x%29%7D%7Bf%28x%29%7D%20%5Cleq%20%5Cfrac%7B1-F%28r%29%7D%7Bf%28r%29%7D%20%3D%20r&#038;bg=T&#038;fg=000000&#038;s=0' alt='x - \phi(x) = \frac{1-F(x)}{f(x)} \leq \frac{1-F(r)}{f(r)} = r' title='x - \phi(x) = \frac{1-F(x)}{f(x)} \leq \frac{1-F(r)}{f(r)} = r' class='latex' /></p>
<p>Now, MHR distributions are a subclass of regular distributions, which are the distributions for which Myerson&#8217;s virtual value <img src='http://s.wordpress.com/latex.php?latex=%5Cphi%28z%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='\phi(z)' title='\phi(z)' class='latex' /> is a monotone function. I usually find it harder to think about regular distributions than about MHR ones (in fact, I don&#8217;t know many examples that are regular but not MHR). Here is one, though, called the <em>equal-revenue-distribution</em>. Consider <img src='http://s.wordpress.com/latex.php?latex=z%20%5Cin%20%5B1%2C%20%5Cinfty%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='z \in [1, \infty)' title='z \in [1, \infty)' class='latex' /> distributed according to <img src='http://s.wordpress.com/latex.php?latex=f%28z%29%20%3D%201%2Fz%5E2&#038;bg=T&#038;fg=000000&#038;s=0' alt='f(z) = 1/z^2' title='f(z) = 1/z^2' class='latex' />. The cumulative distribution is given by <img src='http://s.wordpress.com/latex.php?latex=F%28z%29%20%3D%201-1%2Fz&#038;bg=T&#038;fg=000000&#038;s=0' alt='F(z) = 1-1/z' title='F(z) = 1-1/z' class='latex' />. The interesting thing about this distribution is that posted prices get the same revenue regardless of the price. 
For example, if we post any price <img src='http://s.wordpress.com/latex.php?latex=r%20%5Cin%20%5B1%2C%5Cinfty%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='r \in [1,\infty)' title='r \in [1,\infty)' class='latex' />, then a customer with valuation <img src='http://s.wordpress.com/latex.php?latex=z%20%5Csim%20f&#038;bg=T&#038;fg=000000&#038;s=0' alt='z \sim f' title='z \sim f' class='latex' /> buys the item at price <img src='http://s.wordpress.com/latex.php?latex=r&#038;bg=T&#038;fg=000000&#038;s=0' alt='r' title='r' class='latex' /> if <img src='http://s.wordpress.com/latex.php?latex=z%20%3E%20r&#038;bg=T&#038;fg=000000&#038;s=0' alt='z &gt; r' title='z &gt; r' class='latex' />, so the expected revenue is <img src='http://s.wordpress.com/latex.php?latex=r%20%281-F%28r%29%29%20%3D%201&#038;bg=T&#038;fg=000000&#038;s=0' alt='r (1-F(r)) = 1' title='r (1-F(r)) = 1' class='latex' />. This can be expressed by the fact that <img src='http://s.wordpress.com/latex.php?latex=%5Cphi%28z%29%20%3D%200&#038;bg=T&#038;fg=000000&#038;s=0' alt='\phi(z) = 0' title='\phi(z) = 0' class='latex' />. I was a bit puzzled by this fact, because of Myerson&#8217;s Lemma:</p>
<blockquote>
<p style="text-align: left;"><strong>Myerson Lemma: </strong>If a mechanism sells to some player that has valuation <img src='http://s.wordpress.com/latex.php?latex=v%20%5Csim%20f&#038;bg=T&#038;fg=000000&#038;s=0' alt='v \sim f' title='v \sim f' class='latex' /> with probability <img src='http://s.wordpress.com/latex.php?latex=x%28v%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='x(v)' title='x(v)' class='latex' /> when he has value <img src='http://s.wordpress.com/latex.php?latex=v&#038;bg=T&#038;fg=000000&#038;s=0' alt='v' title='v' class='latex' />, then the revenue is <img src='http://s.wordpress.com/latex.php?latex=%5Cmathbb%7BE%7D%20%5Bx%28v%29%20%5Cphi%28v%29%5D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\mathbb{E} [x(v) \phi(v)]' title='\mathbb{E} [x(v) \phi(v)]' class='latex' />.</p>
</blockquote>
<p style="text-align: left;">And it seemed that the auctioneer was doomed to get zero revenue, since <img src='http://s.wordpress.com/latex.php?latex=%5Cphi%28z%29%20%3D%200&#038;bg=T&#038;fg=000000&#038;s=0' alt='\phi(z) = 0' title='\phi(z) = 0' class='latex' />. For example, suppose we fix some price <img src='http://s.wordpress.com/latex.php?latex=r&#038;bg=T&#038;fg=000000&#038;s=0' alt='r' title='r' class='latex' /> and we sell the item at price <img src='http://s.wordpress.com/latex.php?latex=r&#038;bg=T&#038;fg=000000&#038;s=0' alt='r' title='r' class='latex' /> if <img src='http://s.wordpress.com/latex.php?latex=v%20%5Cgeq%20r&#038;bg=T&#038;fg=000000&#038;s=0' alt='v \geq r' title='v \geq r' class='latex' />. Then it seems that Myerson&#8217;s Lemma should go through by a derivation like this one (for this special case, although the general proof is quite similar):</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Cmathbb%7BE%7D%20%5Bx%28v%29%20%5Cphi%28v%29%5D%20%3D%20%5Cint_r%5E%5Cinfty%20%5Cphi%28z%29%20f%28z%29%20dz%20%3D%20%5Cint_r%5E%5Cinfty%20z%20f%28z%29%20-%20%281-F%28z%29%29%20dz%20%3D%20&#038;bg=T&#038;fg=000000&#038;s=0' alt='\mathbb{E} [x(v) \phi(v)] = \int_r^\infty \phi(z) f(z) dz = \int_r^\infty z f(z) - (1-F(z)) dz = ' title='\mathbb{E} [x(v) \phi(v)] = \int_r^\infty \phi(z) f(z) dz = \int_r^\infty z f(z) - (1-F(z)) dz = ' class='latex' /></p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%3D%20%5Cint_r%5E%5Cinfty%20%5B%20z%20f%28z%29%20-%20%5Cint_z%5E%5Cinfty%20f%28u%29%20du%20%5D%20dz%20%3D%20%5Cint_r%5E%5Cinfty%20z%20f%28z%29%20dz%20-%20%5Cint_r%5E%5Cinfty%20%5Cint_r%5Eu%20f%28u%29%20dz%20du%20&#038;bg=T&#038;fg=000000&#038;s=0' alt='= \int_r^\infty [ z f(z) - \int_z^\infty f(u) du ] dz = \int_r^\infty z f(z) dz - \int_r^\infty \int_r^u f(u) dz du ' title='= \int_r^\infty [ z f(z) - \int_z^\infty f(u) du ] dz = \int_r^\infty z f(z) dz - \int_r^\infty \int_r^u f(u) dz du ' class='latex' /></p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%3D%20r%20%281-F%28r%29%29%20&#038;bg=T&#038;fg=000000&#038;s=0' alt='= r (1-F(r)) ' title='= r (1-F(r)) ' class='latex' /></p>
<p style="text-align: left;">but those don&#8217;t seem to match, since one side is zero and the other is 1. The mistake we made above is a classic one, which is to calculate <img src='http://s.wordpress.com/latex.php?latex=%5Cinfty%20-%20%5Cinfty&#038;bg=T&#038;fg=000000&#038;s=0' alt='\infty - \infty' title='\infty - \infty' class='latex' />. We wrote:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Cmathbb%7BE%7D%5B%5Cphi%28v%29%5D%20%3D%20%5Cint_r%5E%5Cinfty%20z%20f%28z%29%20dz%20-%20%5Cint_r%5E%5Cinfty%201-F%28z%29%20dz&#038;bg=T&#038;fg=000000&#038;s=0' alt='\mathbb{E}[\phi(v)] = \int_r^\infty z f(z) dz - \int_r^\infty 1-F(z) dz' title='\mathbb{E}[\phi(v)] = \int_r^\infty z f(z) dz - \int_r^\infty 1-F(z) dz' class='latex' /></p>
<p style="text-align: left;">but both integrals are infinite! This made me realize that Myerson&#8217;s Lemma needs the condition that <img src='http://s.wordpress.com/latex.php?latex=%5Cmathbb%7BE%7D%5Bz%5D%20%3C%20%5Cinfty&#038;bg=T&#038;fg=000000&#038;s=0' alt='\mathbb{E}[z] &lt; \infty' title='\mathbb{E}[z] &lt; \infty' class='latex' />, which is quite a natural condition for a distribution over valuations of a good. So, one of the bugs of the equal-revenue distribution is that <img src='http://s.wordpress.com/latex.php?latex=%5Cmathbb%7BE%7D%5Bz%5D%20%3D%20%5Cinfty&#038;bg=T&#038;fg=000000&#038;s=0' alt='\mathbb{E}[z] = \infty' title='\mathbb{E}[z] = \infty' class='latex' />. A family that is close to it, but doesn&#8217;t suffer from this bug, is: <img src='http://s.wordpress.com/latex.php?latex=f%28z%29%20%3D%20%5Cfrac%7B%5Calpha-1%7D%7Bz%5E%5Calpha%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='f(z) = \frac{\alpha-1}{z^\alpha}' title='f(z) = \frac{\alpha-1}{z^\alpha}' class='latex' /> for <img src='http://s.wordpress.com/latex.php?latex=z%20%5Cin%20%5B1%2C%5Cinfty%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='z \in [1,\infty)' title='z \in [1,\infty)' class='latex' />, for which <img src='http://s.wordpress.com/latex.php?latex=F%28z%29%20%3D%201%20-%20z%5E%7B1-%5Calpha%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='F(z) = 1 - z^{1-\alpha}' title='F(z) = 1 - z^{1-\alpha}' class='latex' />. 
For <img src='http://s.wordpress.com/latex.php?latex=%5Calpha%20%3E%202&#038;bg=T&#038;fg=000000&#038;s=0' alt='\alpha &gt; 2' title='\alpha &gt; 2' class='latex' /> we have <img src='http://s.wordpress.com/latex.php?latex=%5Cmathbb%7BE%7D%5Bv%5D%20%3C%20%5Cinfty&#038;bg=T&#038;fg=000000&#038;s=0' alt='\mathbb{E}[v] &lt; \infty' title='\mathbb{E}[v] &lt; \infty' class='latex' />, and we get <img src='http://s.wordpress.com/latex.php?latex=%5Cphi%28z%29%20%3D%20%5Cfrac%7B%5Calpha-2%7D%7B%5Calpha-1%7D%20z&#038;bg=T&#038;fg=000000&#038;s=0' alt='\phi(z) = \frac{\alpha-2}{\alpha-1} z' title='\phi(z) = \frac{\alpha-2}{\alpha-1} z' class='latex' />.</p>
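<p>As a quick sanity check of this last formula, here is a minimal Python sketch (Python rather than Octave, purely for illustration). The function names are my own, not from any library, and I am assuming the usual virtual value phi(z) = z - (1 - F(z)) / f(z), which is the form that appears inside the integrals above. It simply compares that definition with the closed form for this family.</p>
<pre><code>
def pareto_pdf(z, alpha):
    """Density f(z) = (alpha - 1) / z**alpha on [1, infinity)."""
    return (alpha - 1.0) / z ** alpha


def pareto_cdf(z, alpha):
    """Distribution function F(z) = 1 - z**(1 - alpha)."""
    return 1.0 - z ** (1.0 - alpha)


def virtual_value(z, alpha):
    """Virtual value from the definition: z - (1 - F(z)) / f(z)."""
    return z - (1.0 - pareto_cdf(z, alpha)) / pareto_pdf(z, alpha)


def closed_form(z, alpha):
    """Closed form claimed in the text: (alpha - 2) / (alpha - 1) * z."""
    return (alpha - 2.0) / (alpha - 1.0) * z


if __name__ == "__main__":
    for alpha in (2.0, 2.5, 3.0, 5.0):
        for z in (1.0, 2.0, 10.0):
            print(alpha, z, virtual_value(z, alpha), closed_form(z, alpha))
</code></pre>
<p>Taking alpha = 2 gives back the equal-revenue density, and both functions return (numerically) zero for every z, which matches the zero we found on one side of the identity above.</p>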
]]></content:encoded>
			<wfw:commentRss>http://www.bigredbits.com/archives/539/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>DP and the Erdős–Rényi model</title>
		<link>http://www.bigredbits.com/archives/443</link>
		<comments>http://www.bigredbits.com/archives/443#comments</comments>
		<pubDate>Mon, 16 May 2011 21:41:28 +0000</pubDate>
		<dc:creator>renatoppl</dc:creator>
				<category><![CDATA[theory]]></category>
		<category><![CDATA[probability]]></category>

		<guid isPermaLink="false">http://www.bigredbits.com/?p=443</guid>
		<description><![CDATA[Yesterday I was in a pub with Vasilis Syrgkanis and Elisa Celis and we were discussing how to calculate the expected size of a connected component in G(n,p), the Erdős–Rényi model. G(n,p) is the classical random graph obtained by considering n nodes and adding each edge independently with probability p. A lot is known about its [...]]]></description>
			<content:encoded><![CDATA[<p>Yesterday I was in a pub with <a href="http://www.cs.cornell.edu/~vasilis/">Vasilis Syrgkanis</a> and <a href="http://www.cs.washington.edu/homes/ecelis/other.html">Elisa Celis</a> and we were discussing how to calculate the expected size of a connected component in <img src='http://s.wordpress.com/latex.php?latex=G%28n%2Cp%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='G(n,p)' title='G(n,p)' class='latex' />, the <a href="http://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93R%C3%A9nyi_model">Erdős–Rényi model</a>. <img src='http://s.wordpress.com/latex.php?latex=G%28n%2Cp%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='G(n,p)' title='G(n,p)' class='latex' /> is the classical random graph obtained by considering <img src='http://s.wordpress.com/latex.php?latex=n&#038;bg=T&#038;fg=000000&#038;s=0' alt='n' title='n' class='latex' /> nodes and adding each edge <img src='http://s.wordpress.com/latex.php?latex=%28i%2Cj%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='(i,j)' title='(i,j)' class='latex' /> independently with probability <img src='http://s.wordpress.com/latex.php?latex=p&#038;bg=T&#038;fg=000000&#038;s=0' alt='p' title='p' class='latex' />. A lot is known about its properties, which, very interestingly, change qualitatively as the value of <img src='http://s.wordpress.com/latex.php?latex=p&#038;bg=T&#038;fg=000000&#038;s=0' alt='p' title='p' class='latex' /> changes relative to <img src='http://s.wordpress.com/latex.php?latex=n&#038;bg=T&#038;fg=000000&#038;s=0' alt='n' title='n' class='latex' />. For example, for <img src='http://s.wordpress.com/latex.php?latex=p%20%3C%5Cfrac%7B1%7D%7Bn%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='p &lt;\frac{1}{n}' title='p &lt;\frac{1}{n}' class='latex' />, with high probability there is no component of size greater than <img src='http://s.wordpress.com/latex.php?latex=O%28%5Clog%20n%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='O(\log n)' title='O(\log n)' class='latex' />. 
When <img src='http://s.wordpress.com/latex.php?latex=p%20%3D%20%5Cfrac%7Bc%7D%7Bn%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='p = \frac{c}{n}' title='p = \frac{c}{n}' class='latex' />, <img src='http://s.wordpress.com/latex.php?latex=c%3E1&#038;bg=T&#038;fg=000000&#038;s=0' alt='c&gt;1' title='c&gt;1' class='latex' /> and <img src='http://s.wordpress.com/latex.php?latex=n%20%5Crightarrow%20%5Cinfty&#038;bg=T&#038;fg=000000&#038;s=0' alt='n \rightarrow \infty' title='n \rightarrow \infty' class='latex' />, then the graph has a giant component. All those phenomena are very well studied in the context of probabilistic combinatorics and also in social networks. I remember learning about them in Jon Kleinberg&#8217;s <a href="http://www.cs.cornell.edu/Courses/cs6850/2011sp/">Structure of Information Networks</a> class.</p>
<p style="text-align: center;"><a rel="attachment wp-att-447" href="http://www.bigredbits.com/archives/443/pic1"><img class="aligncenter size-full wp-image-447" title="pic1" src="http://www.bigredbits.com/wp-content/uploads/2011/05/pic1.png" alt="" width="444" height="234" /></a></p>
<p>So, coming back to our conversation, we were thinking on how to calculate the size of a connected component. Fix some node <img src='http://s.wordpress.com/latex.php?latex=u&#038;bg=T&#038;fg=000000&#038;s=0' alt='u' title='u' class='latex' /> in <img src='http://s.wordpress.com/latex.php?latex=G%28n%2Cp%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='G(n,p)' title='G(n,p)' class='latex' /> &#8211; it doesn&#8217;t matter which node, since all nodes are equivalent before we start tossing the random coins. Now, let <img src='http://s.wordpress.com/latex.php?latex=C_u&#038;bg=T&#038;fg=000000&#038;s=0' alt='C_u' title='C_u' class='latex' /> be the size of the connected component of node <img src='http://s.wordpress.com/latex.php?latex=u&#038;bg=T&#038;fg=000000&#038;s=0' alt='u' title='u' class='latex' />. The question is how to calculate <img src='http://s.wordpress.com/latex.php?latex=C%28n%2Cp%29%20%3D%20%5Cmathbb%7BE%7D%20%5BC_u%5D&#038;bg=T&#038;fg=000000&#038;s=0' alt='C(n,p) = \mathbb{E} [C_u]' title='C(n,p) = \mathbb{E} [C_u]' class='latex' />.</p>
<p>Recently I&#8217;ve been learning MATLAB (actually, I am learning <a href="http://www.gnu.org/software/octave/">Octave</a>, but it is the same) and I am amazed by it and wonder why I hadn&#8217;t learned it before. It is a programming language that somehow knows exactly how mathematicians think, and the syntax is very intuitive. All the operations that you think of performing when doing mathematics are already implemented. Not that you can&#8217;t do that in C++ or Python (in fact, I&#8217;ve been doing that all my life), but in Octave things are so simple. So, I thought this was a nice opportunity to play a bit with it.</p>
<p>We can calculate <img src='http://s.wordpress.com/latex.php?latex=C%28n%2Cp%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='C(n,p)' title='C(n,p)' class='latex' /> using a dynamic programming algorithm in time <img src='http://s.wordpress.com/latex.php?latex=O%28n%5E3%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='O(n^3)' title='O(n^3)' class='latex' /> (a quadratic-size table with a linear-time sum per entry) &#8211; well, maybe we can do it more efficiently, but the DP I thought of was the following: let&#8217;s calculate <img src='http://s.wordpress.com/latex.php?latex=%5Cmathcal%7BC%7D%28n%2Cs%2Cp%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='\mathcal{C}(n,s,p)' title='\mathcal{C}(n,s,p)' class='latex' />, defined as the expected size of the connected component of <img src='http://s.wordpress.com/latex.php?latex=u&#038;bg=T&#038;fg=000000&#038;s=0' alt='u' title='u' class='latex' /> in a random graph with <img src='http://s.wordpress.com/latex.php?latex=n&#038;bg=T&#038;fg=000000&#038;s=0' alt='n' title='n' class='latex' /> nodes where each edge between <img src='http://s.wordpress.com/latex.php?latex=u&#038;bg=T&#038;fg=000000&#038;s=0' alt='u' title='u' class='latex' /> and another node has probability <img src='http://s.wordpress.com/latex.php?latex=p%27%20%3D%201%20-%20%281-p%29%5Es&#038;bg=T&#038;fg=000000&#038;s=0' alt='p&#039; = 1 - (1-p)^s' title='p&#039; = 1 - (1-p)^s' class='latex' /> and each edge between two other nodes <img src='http://s.wordpress.com/latex.php?latex=v_1&#038;bg=T&#038;fg=000000&#038;s=0' alt='v_1' title='v_1' class='latex' /> and <img src='http://s.wordpress.com/latex.php?latex=v_2&#038;bg=T&#038;fg=000000&#038;s=0' alt='v_2' title='v_2' class='latex' /> has probability <img src='http://s.wordpress.com/latex.php?latex=p&#038;bg=T&#038;fg=000000&#038;s=0' alt='p' title='p' class='latex' />. 
What we want to compute is <img src='http://s.wordpress.com/latex.php?latex=C%28n%2Cp%29%20%3D%20%5Cmathcal%7BC%7D%28n%2C1%2Cp%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='C(n,p) = \mathcal{C}(n,1,p)' title='C(n,p) = \mathcal{C}(n,1,p)' class='latex' />.</p>
<p style="text-align: left;"><a rel="attachment wp-att-454" href="http://www.bigredbits.com/archives/443/pic2"><img class="aligncenter size-full wp-image-454" title="pic2" src="http://www.bigredbits.com/wp-content/uploads/2011/05/pic2.png" alt="" width="338" height="187" /></a>What we can do is use the <a href="http://en.wikipedia.org/wiki/Principle_of_deferred_decision">Principle of Deferred Decisions</a> and toss the coins for the <img src='http://s.wordpress.com/latex.php?latex=n-1&#038;bg=T&#038;fg=000000&#038;s=0' alt='n-1' title='n-1' class='latex' /> edges between <img src='http://s.wordpress.com/latex.php?latex=u&#038;bg=T&#038;fg=000000&#038;s=0' alt='u' title='u' class='latex' /> and the other nodes. Writing <img src='http://s.wordpress.com/latex.php?latex=bin%28k%2Cn%2Cp%27%29%20%3D%20%7Bn%20%5Cchoose%20k%7D%20%28p%27%29%5Ek%20%281-p%27%29%5E%7Bn-k%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='bin(k,n,p&#039;) = {n \choose k} (p&#039;)^k (1-p&#039;)^{n-k}' title='bin(k,n,p&#039;) = {n \choose k} (p&#039;)^k (1-p&#039;)^{n-k}' class='latex' /> for the binomial probability, with probability <img src='http://s.wordpress.com/latex.php?latex=bin%28k%2Cn-1%2Cp%27%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='bin(k,n-1,p&#039;)' title='bin(k,n-1,p&#039;)' class='latex' /> there are <img src='http://s.wordpress.com/latex.php?latex=k&#038;bg=T&#038;fg=000000&#038;s=0' alt='k' title='k' class='latex' /> edges between <img src='http://s.wordpress.com/latex.php?latex=u&#038;bg=T&#038;fg=000000&#038;s=0' alt='u' title='u' class='latex' /> and the other nodes, say nodes <img src='http://s.wordpress.com/latex.php?latex=w_1%2C%20%5Chdots%2C%20w_k&#038;bg=T&#038;fg=000000&#038;s=0' alt='w_1, \hdots, w_k' title='w_1, \hdots, w_k' class='latex' />. 
If we collapse those nodes to <img src='http://s.wordpress.com/latex.php?latex=u&#038;bg=T&#038;fg=000000&#038;s=0' alt='u' title='u' class='latex' /> we end up with a graph of <img src='http://s.wordpress.com/latex.php?latex=n-k&#038;bg=T&#038;fg=000000&#038;s=0' alt='n-k' title='n-k' class='latex' /> nodes and the problem is equivalent to <img src='http://s.wordpress.com/latex.php?latex=k&#038;bg=T&#038;fg=000000&#038;s=0' alt='k' title='k' class='latex' /> plus the size of the connected component of <img src='http://s.wordpress.com/latex.php?latex=u&#038;bg=T&#038;fg=000000&#038;s=0' alt='u' title='u' class='latex' /> in the collapsed graph.</p>
<p style="text-align: left;"><a rel="attachment wp-att-462" href="http://www.bigredbits.com/archives/443/pic3"><img class="aligncenter size-full wp-image-462" title="pic3" src="http://www.bigredbits.com/wp-content/uploads/2011/05/pic3.png" alt="" width="489" height="238" /></a>One difference, however is that the probability that the collapsed node <img src='http://s.wordpress.com/latex.php?latex=u&#038;bg=T&#038;fg=000000&#038;s=0' alt='u' title='u' class='latex' /> is connected to a node <img src='http://s.wordpress.com/latex.php?latex=v&#038;bg=T&#038;fg=000000&#038;s=0' alt='v' title='v' class='latex' /> of the <img src='http://s.wordpress.com/latex.php?latex=n-1-k&#038;bg=T&#038;fg=000000&#038;s=0' alt='n-1-k' title='n-1-k' class='latex' /> nodes is the probability that at least one of <img src='http://s.wordpress.com/latex.php?latex=w_i&#038;bg=T&#038;fg=000000&#038;s=0' alt='w_i' title='w_i' class='latex' /> is connected to <img src='http://s.wordpress.com/latex.php?latex=v&#038;bg=T&#038;fg=000000&#038;s=0' alt='v' title='v' class='latex' />, which is <img src='http://s.wordpress.com/latex.php?latex=1-%281-p%29%5Ek&#038;bg=T&#038;fg=000000&#038;s=0' alt='1-(1-p)^k' title='1-(1-p)^k' class='latex' />. In this way, we can write:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Cmathcal%7BC%7D%28n%2Cs%2Cp%29%20%3D%201%20%5Ccdot%20bin%280%2Cn-1%2Cp%27%29%20%2B%20%5Csum_%7Bk%3D1%7D%5E%7Bn-1%7D%20bin%28k%2Cn-1%2Cp%27%29%20%5B%20k%20%2B%20%20%5Cmathcal%7BC%7D%28n-k%2Ck%2Cp%29%5D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\mathcal{C}(n,s,p) = 1 \cdot bin(0,n-1,p&#039;) + \sum_{k=1}^{n-1} bin(k,n-1,p&#039;) [ k +  \mathcal{C}(n-k,k,p)]' title='\mathcal{C}(n,s,p) = 1 \cdot bin(0,n-1,p&#039;) + \sum_{k=1}^{n-1} bin(k,n-1,p&#039;) [ k +  \mathcal{C}(n-k,k,p)]' class='latex' /></p>
<p style="text-align: left;">where <img src='http://s.wordpress.com/latex.php?latex=p%27%20%3D%201-%281-p%29%5Es&#038;bg=T&#038;fg=000000&#038;s=0' alt='p&#039; = 1-(1-p)^s' title='p&#039; = 1-(1-p)^s' class='latex' />. Now, we can calculate <img src='http://s.wordpress.com/latex.php?latex=C%28n%2Cp%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='C(n,p)' title='C(n,p)' class='latex' /> by using DP, simply by filling an <img src='http://s.wordpress.com/latex.php?latex=n%20%5Ctimes%20n&#038;bg=T&#038;fg=000000&#038;s=0' alt='n \times n' title='n \times n' class='latex' /> table. In Octave, we can do it this way:</p>
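<pre><code>
% Expected size of the component of a fixed node u in G(N,p).
<strong>function</strong> component = C(N,p)
  % C_table(n,s): n nodes, u linked to each other node w.p. 1-(1-p)^s
  C_table = zeros(N,N);
  <strong>for</strong> n = 1:N <strong>for</strong> s =1:N
    % k = 0: u has no neighbors, so its component is just u itself
    C_table(n,s) = binopdf(0,n-1,1-((1-p)^s)) ;
    <strong>for</strong> k = 1:n-1
      % k neighbors: collapse them into u and recurse on n-k nodes
      C_table(n,s) += binopdf(k,n-1,1-((1-p)^s)) * (k + C_table(n-k,k));
    <strong>end</strong>
  <strong>end end</strong>
  component = C_table(N,1);
<strong>endfunction</strong>
</code>
</pre>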
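<p>For readers without Octave at hand, the same recurrence can be sketched in Python. This is only a port under my own naming (<em>binom_pmf</em> and <em>expected_component_size</em> are hypothetical helper names, not library functions), and it assumes Python 3.8 or later for <em>math.comb</em>.</p>
<pre><code>
from math import comb


def binom_pmf(k, n, q):
    """Probability of exactly k successes in n independent trials of bias q."""
    return comb(n, k) * q ** k * (1.0 - q) ** (n - k)


def expected_component_size(N, p):
    """Expected size of the component of a fixed node u in G(N, p)."""
    # table[n][s]: n nodes, u linked to each other node with
    # probability 1 - (1 - p)**s (row and column 0 are unused).
    table = [[0.0] * (N + 1) for _ in range(N + 1)]
    for n in range(1, N + 1):
        for s in range(1, N + 1):
            q = 1.0 - (1.0 - p) ** s
            # k = 0 neighbors: the component is u alone
            total = binom_pmf(0, n - 1, q)
            for k in range(1, n):
                # k neighbors get collapsed into u, leaving n - k nodes
                total += binom_pmf(k, n - 1, q) * (k + table[n - k][k])
            table[n][s] = total
    return table[N][1]


if __name__ == "__main__":
    for p in (0.0, 0.25, 0.5, 1.0):
        print(p, expected_component_size(3, p))
</code></pre>
<p>Two easy checks: with two nodes the answer should be 1 + p, and with p = 1 the whole graph is a single component of size N. The triple loop is cubic in N, so expect a short wait for values of N in the hundreds.</p>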
<p style="text-align: left;">And in fact we can call <img src='http://s.wordpress.com/latex.php?latex=C%28n%2Cp%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='C(n,p)' title='C(n,p)' class='latex' /> for say <img src='http://s.wordpress.com/latex.php?latex=n%20%3D%20200&#038;bg=T&#038;fg=000000&#038;s=0' alt='n = 200' title='n = 200' class='latex' /> and <img src='http://s.wordpress.com/latex.php?latex=p%20%3D%200.01%20..%200.3&#038;bg=T&#038;fg=000000&#038;s=0' alt='p = 0.01 .. 0.3' title='p = 0.01 .. 0.3' class='latex' /> and see how <img src='http://s.wordpress.com/latex.php?latex=C%28n%2Cp%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='C(n,p)' title='C(n,p)' class='latex' /> varies. This allows us, for example, to observe the sharp transition that happens before the giant component is formed. The plot we get is:</p>
<p style="text-align: left;"><a rel="attachment wp-att-484" href="http://www.bigredbits.com/archives/443/pic4"><img class="aligncenter size-full wp-image-484" title="pic4" src="http://www.bigredbits.com/wp-content/uploads/2011/05/pic4.png" alt="" width="233" height="279" /></a></p>
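<p>One way to gain some confidence in numbers like these is to compare them against a direct simulation. The following is a rough Monte Carlo sketch in Python, with hypothetical function names of my own. It explores the component of node 0 and, in the spirit of deferred decisions, only tosses the coin for an edge when the exploration first needs it.</p>
<pre><code>
import random


def component_size(n, p, rng):
    """Size of the component of node 0 in one sample of G(n, p)."""
    unvisited = set(range(1, n))
    frontier = [0]
    size = 1
    while frontier:
        frontier.pop()
        # Toss the coins for the edges from the popped node to every node
        # not reached yet; each such edge is examined exactly once.
        reached = [v for v in unvisited if p > rng.random()]
        unvisited.difference_update(reached)
        frontier.extend(reached)
        size += len(reached)
    return size


def estimate_expected_size(n, p, trials=2000, seed=0):
    """Average component size of node 0 over independent samples."""
    rng = random.Random(seed)
    return sum(component_size(n, p, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    for p in (0.002, 0.005, 0.01, 0.02):
        print(p, estimate_expected_size(200, p))
</code></pre>
<p>For three nodes and p = 1/2 the exact expectation is 2.25 (enumerate the eight graphs), so the estimate should land close to that value.</p>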
]]></content:encoded>
			<wfw:commentRss>http://www.bigredbits.com/archives/443/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Probability Puzzles</title>
		<link>http://www.bigredbits.com/archives/278</link>
		<comments>http://www.bigredbits.com/archives/278#comments</comments>
		<pubDate>Wed, 17 Feb 2010 02:54:53 +0000</pubDate>
		<dc:creator>renatoppl</dc:creator>
				<category><![CDATA[puzzles]]></category>
		<category><![CDATA[probability]]></category>

		<guid isPermaLink="false">http://www.bigredbits.com/?p=278</guid>
		<description><![CDATA[Today at dinner with Thanh, Hu and Joel I heard about a paradox I hadn&#8217;t heard before. Probability is full of cute problems that challenge our understanding of the basic concepts. The most famous of them is the Monty Hall Problem, which asks: You are on a TV game show and there are [...]]]></description>
			<content:encoded><![CDATA[<p>Today at dinner with Thanh, Hu and Joel I heard about a paradox I hadn&#8217;t heard before. Probability is full of cute problems that challenge our understanding of the basic concepts. The most famous of them is the <a class="snap_noshots" href="http://en.wikipedia.org/wiki/Monty_Hall_problem">Monty Hall Problem</a>, which asks:</p>
<blockquote><p>You are on a TV game show and there are <img src='http://s.wordpress.com/latex.php?latex=%7B3%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{3}' title='{3}' class='latex' /> doors &#8211; one of them contains a prize, say a car, and the other two doors contain things you don&#8217;t care about, say goats. You choose a door. Then the TV host, who knows where the prize is, opens one door you haven&#8217;t chosen and that he knows has a goat. Then he asks if you want to stick to the door you have chosen or if you want to change to the other door. What should you do?</p></blockquote>
<p>Probably you&#8217;ve already come across this question at some point in your life, and the answer is that changing doors doubles your probability of getting the prize. There are several ways of convincing your intuition:</p>
<ul>
<li> Do the math: when you chose the door, there were three options so the prize is in the door you chose with <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cfrac%7B1%7D%7B3%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\frac{1}{3}}' title='{\frac{1}{3}}' class='latex' /> probability and in the other door with probability <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cfrac%7B2%7D%7B3%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\frac{2}{3}}' title='{\frac{2}{3}}' class='latex' /> (note that the presenter can always open some door with a goat, so conditioning on that event doesn&#8217;t give you any new information).</li>
<li> Do the actual experiment (computationally) as done <a class="snap_noshots" href="http://igor-nav.livejournal.com/16784.html">here</a>. One can always ask a friend to help, get some goats and perform the actual experiment.</li>
<li> To convince yourself that &#8220;it doesn&#8217;t matter&#8221; is not correct, think of <img src='http://s.wordpress.com/latex.php?latex=%7B100%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{100}' title='{100}' class='latex' /> doors. You choose one and the TV host opens <img src='http://s.wordpress.com/latex.php?latex=%7B98%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{98}' title='{98}' class='latex' /> of them and asks if you want to change or stick with your first choice. Wouldn&#8217;t you change?</li>
</ul>
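<p>If you would rather run the computational experiment yourself, here is a minimal Python sketch of it. The function names are my own, and the <em>doors</em> parameter is an assumption I added so that the same code also covers the version with 100 doors: the host always leaves exactly one other door closed and never reveals the prize.</p>
<pre><code>
import random


def play(switch, rng, doors=3):
    """Play one round; return True if the contestant ends up with the prize."""
    prize = rng.randrange(doors)
    choice = rng.randrange(doors)
    # The host opens every other door except one and never opens the prize,
    # so the door left closed is the prize unless we already picked it.
    if choice == prize:
        other = rng.choice([d for d in range(doors) if d != choice])
    else:
        other = prize
    final = other if switch else choice
    return final == prize


def win_rate(switch, trials=100000, seed=0, doors=3):
    """Fraction of rounds won when always switching or always staying."""
    rng = random.Random(seed)
    return sum(play(switch, rng, doors) for _ in range(trials)) / trials


if __name__ == "__main__":
    print("stay:  ", win_rate(False))
    print("switch:", win_rate(True))
    print("switch with 100 doors:", win_rate(True, doors=100))
</code></pre>
<p>The two rates should come out near 1/3 and 2/3, and with 100 doors switching should win almost always.</p>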
<p>I&#8217;ve seen TV shows where this happened and I acknowledge that other things may be involved: there might be behavioral and psychological issues associated with the Monty Hall problem &#8211; and possibly those would interest <a class="snap_noshots" href="http://www.amazon.com/Predictably-Irrational-Revised-Intl-Decisions/dp/0062018205/ref=sr_1_2?ie=UTF8&amp;s=books&amp;qid=1266295712&amp;sr=8-2">Dan Ariely</a>, whose book I began reading today and which looks quite fun. But the problem they told me about today at dinner was another one: <a class="snap_noshots" href="http://en.wikipedia.org/wiki/Two_envelope_problem">the envelope problem</a>:</p>
<blockquote><p>There are two envelopes and you are told that in one of them there is twice the amount that there is in the other. You choose one of the envelopes at random and open it: it contains <img src='http://s.wordpress.com/latex.php?latex=%7B100%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{100}' title='{100}' class='latex' /> bucks. Now, you don&#8217;t know if the other envelope has <img src='http://s.wordpress.com/latex.php?latex=%7B50%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{50}' title='{50}' class='latex' /> bucks or <img src='http://s.wordpress.com/latex.php?latex=%7B200%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{200}' title='{200}' class='latex' /> bucks. Then someone asks if you want to pay <img src='http://s.wordpress.com/latex.php?latex=%7B10%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{10}' title='{10}' class='latex' /> bucks to change to the other envelope. Should you change?</p></blockquote>
<p>Now, consider two different solutions to this problem: the first is fallacious and the second is correct:</p>
<ol>
<li> If I don&#8217;t change, I get <img src='http://s.wordpress.com/latex.php?latex=%7B100%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{100}' title='{100}' class='latex' /> bucks, if I change I pay a penalty of <img src='http://s.wordpress.com/latex.php?latex=%7B10%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{10}' title='{10}' class='latex' /> and I get either <img src='http://s.wordpress.com/latex.php?latex=%7B50%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{50}' title='{50}' class='latex' /> or <img src='http://s.wordpress.com/latex.php?latex=%7B200%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{200}' title='{200}' class='latex' /> with equal probability, so my expected prize if I change is <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cfrac%7B200%2B50%7D%7B2%7D-10%20%3D%20115%20%3E%20100%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\frac{200+50}{2}-10 = 115 &gt; 100}' title='{\frac{200+50}{2}-10 = 115 &gt; 100}' class='latex' />, so I should change.</li>
<li> I know there is one envelope with <img src='http://s.wordpress.com/latex.php?latex=%7Bx%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{x}' title='{x}' class='latex' /> and one with <img src='http://s.wordpress.com/latex.php?latex=%7B2x%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{2x}' title='{2x}' class='latex' />, then my expected prize if I don&#8217;t change is <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cfrac%7Bx%20%2B%202x%7D%7B2%7D%20%3D%20%5Cfrac%7B3%7D%7B2%7Dx%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\frac{x + 2x}{2} = \frac{3}{2}x}' title='{\frac{x + 2x}{2} = \frac{3}{2}x}' class='latex' />. If I change, my expected prize is <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cfrac%7Bx%20%2B%202x%7D%7B2%7D%20-%2010%20%3C%20%5Cfrac%7B3%7D%7B2%7Dx%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\frac{x + 2x}{2} - 10 &lt; \frac{3}{2}x}' title='{\frac{x + 2x}{2} - 10 &lt; \frac{3}{2}x}' class='latex' />, so I should not change.</li>
</ol>
<p>The fallacy in the first argument is perceiving a probability distribution where there is none. Either the other envelope contains <img src='http://s.wordpress.com/latex.php?latex=%7B50%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{50}' title='{50}' class='latex' /> bucks or it contains <img src='http://s.wordpress.com/latex.php?latex=%7B200%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{200}' title='{200}' class='latex' /> bucks &#8211; we just don&#8217;t know, but there is no probability distribution there &#8211; it is a deterministic choice by the game designer. Most of those paradoxes are a result of either an ill-defined probability space, as in <a class="snap_noshots" href="http://en.wikipedia.org/wiki/Bertrand%27s_paradox_%28probability%29">Bertrand&#8217;s Paradox</a>, or a wrong comprehension of the probability space, as in Monty Hall or in several paradoxes exploring the same idea, such as: <a class="snap_noshots" href="http://en.wikipedia.org/wiki/Three_Prisoners_problem">Three Prisoners</a>, <a class="snap_noshots" href="http://en.wikipedia.org/wiki/Sleeping_Beauty_problem">Sleeping Beauty</a>, <a class="snap_noshots" href="http://en.wikipedia.org/wiki/Boy_or_Girl_paradox">Boy or Girl Paradox</a>, &#8230;</p>
<p style="text-align: center;"><img class="aligncenter size-medium wp-image-280" title="73a_humpty-dumpty" src="http://www.bigredbits.com/wp-content/uploads/2010/02/73a_humpty-dumpty-300x184.jpg" alt="73a_humpty-dumpty" width="300" height="184" /></p>
<p>There was very recently a thrilling discussion about a variant on the envelope paradox in the <a class="snap_noshots" href="http://blag.xkcd.com/">xkcd blag</a> &#8211; which is the blog accompanying that <a class="snap_noshots" href="http://xkcd.com/">amazing webcomic</a>. There was a recent blog post with <a class="snap_noshots" href="http://blog.xkcd.com/2010/02/09/math-puzzle/">a very intriguing problem</a>. A better idea is to go there and read the discussion, but if you are not doing so, let me summarize it here. The problem is:</p>
<blockquote><p>There are two envelopes, each containing a distinct real number. You pick one envelope at random, open it and see the number, then you are asked to guess if the number in the other envelope is larger or smaller than the one you saw. Can you guess correctly with more than <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cfrac%7B1%7D%7B2%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\frac{1}{2}}' title='{\frac{1}{2}}' class='latex' /> probability?</p>
<p>A related problem is: suppose you are playing the envelope game and the envelopes contain numbers <img src='http://s.wordpress.com/latex.php?latex=%7BA%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{A}' title='{A}' class='latex' /> and <img src='http://s.wordpress.com/latex.php?latex=%7BB%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{B}' title='{B}' class='latex' /> (with <img src='http://s.wordpress.com/latex.php?latex=%7BA%20%3C%20B%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{A &lt; B}' title='{A &lt; B}' class='latex' />). You pick one envelope at random, look at the content of the envelope you opened, and then decide whether to switch or not. Is there a strategy that gives you expected earnings greater than <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cfrac%7BA%2BB%7D%7B2%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\frac{A+B}{2}}' title='{\frac{A+B}{2}}' class='latex' />?</p></blockquote>
<p>The very unexpected answer is <strong>yes</strong>! The strategy that Randall presents in the blog (there is a <a class="snap_noshots" href="http://www.iwr.uni-heidelberg.de/groups/ngg/People/winckler/PU/p008.html">link to the source here</a>) is: let <img src='http://s.wordpress.com/latex.php?latex=%7BX%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{X}' title='{X}' class='latex' /> be a random variable on <img src='http://s.wordpress.com/latex.php?latex=%7B%7B%5Cmathbb%20R%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{{\mathbb R}}' title='{{\mathbb R}}' class='latex' /> such that for each <img src='http://s.wordpress.com/latex.php?latex=%7Ba%3Cb%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{a&lt;b}' title='{a&lt;b}' class='latex' /> we have <img src='http://s.wordpress.com/latex.php?latex=%7BP%28a%20%3C%20X%20%3C%20b%29%20%3E%200%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{P(a &lt; X &lt; b) &gt; 0}' title='{P(a &lt; X &lt; b) &gt; 0}' class='latex' />, for example, the normal distribution or the logistic distribution.</p>
<p>Sample <img src='http://s.wordpress.com/latex.php?latex=%7BX%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{X}' title='{X}' class='latex' />, then open the envelope and find a number <img src='http://s.wordpress.com/latex.php?latex=%7BS%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{S}' title='{S}' class='latex' />. Now, if <img src='http://s.wordpress.com/latex.php?latex=%7BX%20%3C%20S%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{X &lt; S}' title='{X &lt; S}' class='latex' />, say the other number is lower, and if <img src='http://s.wordpress.com/latex.php?latex=%7BX%20%3E%20S%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{X &gt; S}' title='{X &gt; S}' class='latex' />, say the other number is higher. You get it right with probability</p>
<p align="center"><img src='http://s.wordpress.com/latex.php?latex=%5Cdisplaystyle%20%20P%28%5Ctext%7Bpicked%20%7DA%29%20P%28X%20%3E%20A%29%20%2B%20P%28%5Ctext%7Bpicked%20%7DB%29%20P%28X%20%3C%20B%29%20%3D%20%5Cfrac%7B1%7D%7B2%7D%20%281%20%2B%20P%28A%20%3C%20X%20%3C%20B%29%29%20&#038;bg=T&#038;fg=000000&#038;s=0' alt='\displaystyle  P(\text{picked }A) P(X &gt; A) + P(\text{picked }B) P(X &lt; B) = \frac{1}{2} (1 + P(A &lt; X &lt; B)) ' title='\displaystyle  P(\text{picked }A) P(X &gt; A) + P(\text{picked }B) P(X &lt; B) = \frac{1}{2} (1 + P(A &lt; X &lt; B)) ' class='latex' /></p>
<p>which is impressive. If you follow your guess, your expected earning <img src='http://s.wordpress.com/latex.php?latex=%7BY%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{Y}' title='{Y}' class='latex' /> is:</p>
<p align="center"><img src='http://s.wordpress.com/latex.php?latex=%5Cdisplaystyle%20%5Cbegin%7Baligned%7D%20%26P%28%5Ctext%7Bpicked%20%7DA%29%20%5Cmathop%7B%5Cmathbb%20E%7D%5BY%20%5Cvert%20%5Ctext%7Bpicked%20%7DA%5D%20%2B%20P%28%5Ctext%7Bpicked%20%7DB%29%20%5Cmathop%7B%5Cmathbb%20E%7D%5BY%20%5Cvert%20%5Ctext%7Bpicked%20%7DB%5D%20%3D%20%5C%5C%20%26%20%3D%20%5Cfrac%7B1%7D%7B2%7D%20%5BP%28X%3CA%29%20A%20%2B%20P%28X%3EA%29%20B%5D%20%2B%20%5Cfrac%7B1%7D%7B2%7D%20%5BP%28X%3CB%29%20B%20%2B%20P%28X%3EB%29%20A%5D%20%5C%5C%20%26%3D%20%5Cfrac%7B1%7D%7B2%7D%5BA%20%5BP%28X%3CA%29%20%2B%20P%28X%3EB%29%5D%20%2B%20B%20%5BP%28X%3EA%29%20%2B%20P%28X%3CB%29%5D%5D%20%3E%20%5Cfrac%7BA%2BB%7D%7B2%7D%20%5C%5C%20%5Cend%7Baligned%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\displaystyle \begin{aligned} &amp;P(\text{picked }A) \mathop{\mathbb E}[Y \vert \text{picked }A] + P(\text{picked }B) \mathop{\mathbb E}[Y \vert \text{picked }B] = \\ &amp; = \frac{1}{2} [P(X&lt;A) A + P(X&gt;A) B] + \frac{1}{2} [P(X&lt;B) B + P(X&gt;B) A] \\ &amp;= \frac{1}{2}[A [P(X&lt;A) + P(X&gt;B)] + B [P(X&gt;A) + P(X&lt;B)]] &gt; \frac{A+B}{2} \\ \end{aligned}' title='\displaystyle \begin{aligned} &amp;P(\text{picked }A) \mathop{\mathbb E}[Y \vert \text{picked }A] + P(\text{picked }B) \mathop{\mathbb E}[Y \vert \text{picked }B] = \\ &amp; = \frac{1}{2} [P(X&lt;A) A + P(X&gt;A) B] + \frac{1}{2} [P(X&lt;B) B + P(X&gt;B) A] \\ &amp;= \frac{1}{2}[A [P(X&lt;A) + P(X&gt;B)] + B [P(X&gt;A) + P(X&lt;B)]] &gt; \frac{A+B}{2} \\ \end{aligned}' class='latex' /></p>
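<p>Both formulas are easy to test numerically. Below is a small Python sketch under assumptions of my own: the threshold X is a standard normal, the envelopes hold two fixed numbers a and b chosen by the caller, and the function names are hypothetical. It computes the exact probability of a correct guess and compares it with a simulation of the strategy.</p>
<pre><code>
import math
import random


def normal_cdf(x):
    """Distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def exact_win_probability(a, b):
    """1/2 * (1 + P(X falls between a and b)) for a standard normal X."""
    lo, hi = min(a, b), max(a, b)
    return 0.5 * (1.0 + normal_cdf(hi) - normal_cdf(lo))


def simulate(a, b, trials=200000, seed=0):
    """Return (fraction of correct guesses, average earnings)."""
    rng = random.Random(seed)
    correct = 0
    earnings = 0.0
    for _ in range(trials):
        seen, other = rng.choice([(a, b), (b, a)])
        threshold = rng.gauss(0.0, 1.0)
        # Guess that the other number is higher when the threshold
        # exceeds the number we saw, and switch in that case.
        guess_other_higher = threshold > seen
        if guess_other_higher == (other > seen):
            correct += 1
        earnings += other if guess_other_higher else seen
    return correct / trials, earnings / trials


if __name__ == "__main__":
    print("exact:    ", exact_win_probability(1.0, 2.0))
    print("simulated:", simulate(1.0, 2.0))
</code></pre>
<p>The advantage over 1/2 is exactly half the probability that X lands between the two numbers, so it can be tiny when the numbers sit far out in the tail of the chosen distribution, but it is never zero.</p>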
<p>The xkcd post pointed to this cool <a class="snap_noshots" href="http://www.iwr.uni-heidelberg.de/groups/ngg/People/winckler/PU/">archive of puzzles and riddles</a>. I was also told that the <a class="snap_noshots" href="http://forums.xkcd.com/viewforum.php?f=3&amp;sid=0a47f2eeadd72be7890309b1c685c503">xkcd puzzle forum</a> is a source of excellent puzzles, such as this one:</p>
<blockquote><p>You are the most eligible bachelor in the kingdom, and as such the King has invited you to his castle so that you may choose one of his three daughters to marry. The eldest princess is honest and always tells the truth. The youngest princess is dishonest and always lies. The middle princess is mischievous and tells the truth sometimes and lies the rest of the time. As you will be forever married to one of the princesses, you want to marry the eldest (truth-teller) or the youngest (liar) because at least you know where you stand with them. The problem is that you cannot tell which sister is which just by their appearance, and the King will only grant you ONE yes or no question which you may only address to ONE of the sisters. What yes or no question can you ask which will ensure you do not marry the middle sister?</p></blockquote>
<p>Copied from <a class="snap_noshots" href="http://forums.xkcd.com/viewtopic.php?f=3&amp;t=87">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.bigredbits.com/archives/278/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Looking at probability distributions</title>
		<link>http://www.bigredbits.com/archives/231</link>
		<comments>http://www.bigredbits.com/archives/231#comments</comments>
		<pubDate>Fri, 13 Nov 2009 03:16:27 +0000</pubDate>
		<dc:creator>renatoppl</dc:creator>
				<category><![CDATA[theory]]></category>
		<category><![CDATA[mathematics]]></category>
		<category><![CDATA[probability]]></category>

		<guid isPermaLink="false">http://www.bigredbits.com/?p=231</guid>
		<description><![CDATA[I&#8217;ve been taking two classes in probability this semester, and in them I saw the proofs of many interesting theorems that I already knew but had never seen proved, such as the Central Limit Theorem, the Laws of Large Numbers and so on&#8230; Also, some theory which looks somewhat ugly [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been taking two classes in probability this semester, and in them I saw the proofs of many interesting theorems that I already knew but had never seen proved, such as the Central Limit Theorem, the Laws of Large Numbers and so on&#8230; Also, some theory which looks somewhat ugly in undergrad courses becomes very clear with the proper formal treatment. Today I was thinking about the main take-home message that a computer scientist could get from those classes and, at least for me, this message is the various ways of looking at probability distributions. I had heard about moments, the Laplace transform, the Fourier transform and other tools like that, but I never realized their true power before. Probably most of their true power is still hidden from me, but I am starting to look at them in a different way. Let me go over a few examples of different ways we can look at probability distributions and show cases where they are interesting.</p>
<p>Most ways of looking at probability distributions are associated with a multiplicative system: a multiplicative system <img src='http://s.wordpress.com/latex.php?latex=%7BQ%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{Q}' title='{Q}' class='latex' /> is a set of real-valued functions with the property that if <img src='http://s.wordpress.com/latex.php?latex=%7Bf_1%2C%20f_2%20%5Cin%20Q%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{f_1, f_2 \in Q}' title='{f_1, f_2 \in Q}' class='latex' /> then <img src='http://s.wordpress.com/latex.php?latex=%7Bf_1%20%5Ccdot%20f_2%20%5Cin%20Q%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{f_1 \cdot f_2 \in Q}' title='{f_1 \cdot f_2 \in Q}' class='latex' />. Those kinds of sets are powerful because of the Multiplicative Systems Theorem:</p>
<blockquote><p><strong>Theorem 1 (Multiplicative Systems Theorem)</strong> <em> If <img src='http://s.wordpress.com/latex.php?latex=%7BQ%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{Q}' title='{Q}' class='latex' /> is a multiplicative system, <img src='http://s.wordpress.com/latex.php?latex=%7BH%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{H}' title='{H}' class='latex' /> is a linear space containing <img src='http://s.wordpress.com/latex.php?latex=%7B1%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{1}' title='{1}' class='latex' /> (the constant function <img src='http://s.wordpress.com/latex.php?latex=%7B1%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{1}' title='{1}' class='latex' />) and is closed under bounded convergence, then <img src='http://s.wordpress.com/latex.php?latex=%7BQ%20%5Csubseteq%20H%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{Q \subseteq H}' title='{Q \subseteq H}' class='latex' /> implies that <img src='http://s.wordpress.com/latex.php?latex=%7BH%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{H}' title='{H}' class='latex' /> contains all bounded <img src='http://s.wordpress.com/latex.php?latex=%7B%5Csigma%28Q%29%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\sigma(Q)}' title='{\sigma(Q)}' class='latex' />-measurable functions. </em></p></blockquote>
<p>The theorem might look a bit cryptic if you are not familiar with the definitions, but it boils down to the following translation:</p>
<blockquote><p><strong>Theorem 2 (Translation of the Multiplicative Systems Theorem)</strong> <em> If <img src='http://s.wordpress.com/latex.php?latex=%7BQ%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{Q}' title='{Q}' class='latex' /> is a &#8220;general&#8221; multiplicative system, <img src='http://s.wordpress.com/latex.php?latex=%7BX%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{X}' title='{X}' class='latex' /> and <img src='http://s.wordpress.com/latex.php?latex=%7BY%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{Y}' title='{Y}' class='latex' /> are random variables such that <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cmathop%7B%5Cmathbb%20E%7D%7Bf%28X%29%7D%20%3D%20%5Cmathop%7B%5Cmathbb%20E%7D%7Bf%28Y%29%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\mathop{\mathbb E}{f(X)} = \mathop{\mathbb E}{f(Y)}}' title='{\mathop{\mathbb E}{f(X)} = \mathop{\mathbb E}{f(Y)}}' class='latex' /> for all <img src='http://s.wordpress.com/latex.php?latex=%7Bf%20%5Cin%20Q%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{f \in Q}' title='{f \in Q}' class='latex' /> then <img src='http://s.wordpress.com/latex.php?latex=%7BX%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{X}' title='{X}' class='latex' /> and <img src='http://s.wordpress.com/latex.php?latex=%7BY%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{Y}' title='{Y}' class='latex' /> have the same distribution. </em></p></blockquote>
<p>where &#8220;general&#8221; excludes some troublesome cases like <img src='http://s.wordpress.com/latex.php?latex=%7BQ%20%3D%20%5C%7B1%5C%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{Q = \{1\}}' title='{Q = \{1\}}' class='latex' /> or all constant functions, for example. In technical terms, we want <img src='http://s.wordpress.com/latex.php?latex=%7B%5Csigma%28Q%29%20%3D%20%5Csigma%5C%7Bf%5E%7B-1%7D%28%28-%5Cinfty%2Ca%5D%29%3B%20a%5Cin%20%7B%5Cmathbb%20R%7D%2C%20f%20%5Cin%20Q%20%5C%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\sigma(Q) = \sigma\{f^{-1}((-\infty,a]); a\in {\mathbb R}, f \in Q \}}' title='{\sigma(Q) = \sigma\{f^{-1}((-\infty,a]); a\in {\mathbb R}, f \in Q \}}' class='latex' /> to be the Borel <img src='http://s.wordpress.com/latex.php?latex=%7B%5Csigma%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\sigma}' title='{\sigma}' class='latex' />-algebra. But let&#8217;s not worry about those technical details and just look at the translated version. We now discuss several kinds of multiplicative systems:</p>
<ol>
<li> The most common description of a random variable is its cumulative distribution function <img src='http://s.wordpress.com/latex.php?latex=%7BF%28u%29%20%3D%20P%5C%7BX%20%5Cleq%20u%5C%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{F(u) = P\{X \leq u\}}' title='{F(u) = P\{X \leq u\}}' class='latex' />. This is associated with <img src='http://s.wordpress.com/latex.php?latex=%7BQ%20%3D%20%5C%7B%201_%7B%28-%5Cinfty%2Cu%5D%7D%28x%29%3B%20u%20%5Cin%20%7B%5Cmathbb%20R%7D%20%5C%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{Q = \{ 1_{(-\infty,u]}(x); u \in {\mathbb R} \}}' title='{Q = \{ 1_{(-\infty,u]}(x); u \in {\mathbb R} \}}' class='latex' />; notice that simply <img src='http://s.wordpress.com/latex.php?latex=%7BF%28u%29%20%3D%20%5Cmathop%7B%5Cmathbb%20E%7D%7B1_%7B%28-%5Cinfty%2Cu%5D%7D%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{F(u) = \mathop{\mathbb E}{1_{(-\infty,u]}}}' title='{F(u) = \mathop{\mathbb E}{1_{(-\infty,u]}}}' class='latex' />.</li>
<li> We can characterize a random variable by its moments: the variable <img src='http://s.wordpress.com/latex.php?latex=%7BX%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{X}' title='{X}' class='latex' /> is characterized by the sequence <img src='http://s.wordpress.com/latex.php?latex=%7BM_n%20%3D%20%5Cint_%7B%5Cmathbb%20R%7D%20x%5En%20%5Cmu_X%28dx%29%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{M_n = \int_{\mathbb R} x^n \mu_X(dx)}' title='{M_n = \int_{\mathbb R} x^n \mu_X(dx)}' class='latex' />. Given the moments <img src='http://s.wordpress.com/latex.php?latex=%7BM_1%2C%20M_2%2C%20%5Chdots%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{M_1, M_2, \hdots}' title='{M_1, M_2, \hdots}' class='latex' />, the variable is totally characterized (at least under suitable conditions, for example when the variable is bounded), i.e., if two such variables have the same moments, then they have the same distribution by the Multiplicative Systems Theorem. This description is associated with the system <img src='http://s.wordpress.com/latex.php?latex=%7BQ%20%3D%20%5C%7Bx%5En%3B%20n%20%3D%201%2C%202%2C%20%5Chdots%5C%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{Q = \{x^n; n = 1, 2, \hdots\}}' title='{Q = \{x^n; n = 1, 2, \hdots\}}' class='latex' />.</li>
<li> <strong>Moment Generating Function</strong>: If <img src='http://s.wordpress.com/latex.php?latex=%7BX%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{X}' title='{X}' class='latex' /> is a variable that assumes only non-negative integer values, we can describe it by <img src='http://s.wordpress.com/latex.php?latex=%7Bp_0%2C%20p_1%2C%20p_2%2C%20%5Chdots%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{p_0, p_1, p_2, \hdots}' title='{p_0, p_1, p_2, \hdots}' class='latex' />, where <img src='http://s.wordpress.com/latex.php?latex=%7Bp_n%20%3D%20P%5C%7B%20X%20%3D%20n%20%5C%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{p_n = P\{ X = n \}}' title='{p_n = P\{ X = n \}}' class='latex' />. An interesting way of representing those probabilities is as the moment generating function <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cpsi_X%28z%29%20%3D%20%5Cmathop%7B%5Cmathbb%20E%7D%7Bz%5EX%7D%20%3D%20%5Csum_%7Bn%3D0%7D%5E%5Cinfty%20p_n%20z%5En%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\psi_X(z) = \mathop{\mathbb E}{z^X} = \sum_{n=0}^\infty p_n z^n}' title='{\psi_X(z) = \mathop{\mathbb E}{z^X} = \sum_{n=0}^\infty p_n z^n}' class='latex' /> (in the literature this function is usually called the probability generating function). This is associated with the multiplicative system <img src='http://s.wordpress.com/latex.php?latex=%7BQ%20%3D%20%5C%7Bz%5Ex%2C%200%20%5Cleq%20z%20%3C%201%5C%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{Q = \{z^x, 0 \leq z &lt; 1\}}' title='{Q = \{z^x, 0 \leq z &lt; 1\}}' class='latex' />. Now suppose we are given two discrete independent variables <img src='http://s.wordpress.com/latex.php?latex=%7BX%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{X}' title='{X}' class='latex' /> and <img src='http://s.wordpress.com/latex.php?latex=%7BY%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{Y}' title='{Y}' class='latex' />. What do we know about <img src='http://s.wordpress.com/latex.php?latex=%7BX%2BY%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{X+Y}' title='{X+Y}' class='latex' />?
It is easy to know its expectation, its variance, &#8230; but what about more complicated things? What is the distribution of <img src='http://s.wordpress.com/latex.php?latex=%7BX%2BY%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{X+Y}' title='{X+Y}' class='latex' /> ? Moment generating functions answer this question very easily, since:
<p align="center"><img src='http://s.wordpress.com/latex.php?latex=%5Cdisplaystyle%20%5Cpsi_%7BX%2BY%7D%20%28z%29%20%3D%20%5Cmathop%7B%5Cmathbb%20E%7D%20%5Bz%5E%7BX%2BY%7D%5D%20%3D%20%5Cmathop%7B%5Cmathbb%20E%7D%20%5Bz%5E%7BX%7D%5D%20%5Ccdot%20%5Cmathop%7B%5Cmathbb%20E%7D%5Bz%5E%7BY%7D%5D%20%3D%20%5Cpsi_X%28z%29%20%5Ccdot%20%5Cpsi_Y%28z%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='\displaystyle \psi_{X+Y} (z) = \mathop{\mathbb E} [z^{X+Y}] = \mathop{\mathbb E} [z^{X}] \cdot \mathop{\mathbb E}[z^{Y}] = \psi_X(z) \cdot \psi_Y(z)' title='\displaystyle \psi_{X+Y} (z) = \mathop{\mathbb E} [z^{X+Y}] = \mathop{\mathbb E} [z^{X}] \cdot \mathop{\mathbb E}[z^{Y}] = \psi_X(z) \cdot \psi_Y(z)' class='latex' /></p>
<p>If we know moment generating functions, we can calculate expectations very easily, since <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cmathop%7B%5Cmathbb%20E%7D%5BX%5D%20%3D%20%5Cpsi%27_X%281%29%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\mathop{\mathbb E}[X] = \psi&#039;_X(1)}' title='{\mathop{\mathbb E}[X] = \psi&#039;_X(1)}' class='latex' />. For example, suppose we have the following process: there is one bacterium at time <img src='http://s.wordpress.com/latex.php?latex=%7Bt%20%3D%200%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{t = 0}' title='{t = 0}' class='latex' />. In each timestep, this bacterium either dies (with probability <img src='http://s.wordpress.com/latex.php?latex=%7Bp_0%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{p_0}' title='{p_0}' class='latex' />), stays alive without reproducing (with probability <img src='http://s.wordpress.com/latex.php?latex=%7Bp_1%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{p_1}' title='{p_1}' class='latex' />), or has <img src='http://s.wordpress.com/latex.php?latex=%7Bk%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{k}' title='{k}' class='latex' /> offspring (with probability <img src='http://s.wordpress.com/latex.php?latex=%7Bp_%7B1%2Bk%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{p_{1+k}}' title='{p_{1+k}}' class='latex' />). In that case <img src='http://s.wordpress.com/latex.php?latex=%7B%5Csum_0%5E%5Cinfty%20p_n%20%3D%201%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\sum_0^\infty p_n = 1}' title='{\sum_0^\infty p_n = 1}' class='latex' />. At each timestep, the same happens, independently, to each of the bacteria alive at that moment. The question is: what is the expected number of bacteria at time <img src='http://s.wordpress.com/latex.php?latex=%7Bt%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{t}' title='{t}' class='latex' />?</p>
<p>It looks like a complicated problem if we only have elementary tools, but it is a simple problem if we have moment generating functions. Just let <img src='http://s.wordpress.com/latex.php?latex=%7BX_%7Bti%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{X_{ti}}' title='{X_{ti}}' class='latex' /> be the variable associated with the <img src='http://s.wordpress.com/latex.php?latex=%7Bi%5E%7Bth%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{i^{th}}' title='{i^{th}}' class='latex' /> bacterium in timestep <img src='http://s.wordpress.com/latex.php?latex=%7Bt%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{t}' title='{t}' class='latex' />. It is zero if the bacterium dies, <img src='http://s.wordpress.com/latex.php?latex=%7B1%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{1}' title='{1}' class='latex' /> if it stays the same and <img src='http://s.wordpress.com/latex.php?latex=%7Bk%2B1%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{k+1}' title='{k+1}' class='latex' /> if it has <img src='http://s.wordpress.com/latex.php?latex=%7Bk%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{k}' title='{k}' class='latex' /> offspring. Let also <img src='http://s.wordpress.com/latex.php?latex=%7BN_t%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{N_t}' title='{N_t}' class='latex' /> be the number of bacteria at time <img src='http://s.wordpress.com/latex.php?latex=%7Bt%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{t}' title='{t}' class='latex' />. We want to know <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cmathop%7B%5Cmathbb%20E%7D%20N_t%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\mathop{\mathbb E} N_t}' title='{\mathop{\mathbb E} N_t}' class='latex' />. First, see that:</p>
<p align="center"><img src='http://s.wordpress.com/latex.php?latex=%5Cdisplaystyle%20N_t%20%3D%20%5Csum_%7Bi%3D1%7D%5E%7BN_%7Bt-1%7D%7D%20X_%7Bti%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\displaystyle N_t = \sum_{i=1}^{N_{t-1}} X_{ti}' title='\displaystyle N_t = \sum_{i=1}^{N_{t-1}} X_{ti}' class='latex' /></p>
<p>Now, let&#8217;s write that in terms of moment generating functions:</p>
<p align="center"><img src='http://s.wordpress.com/latex.php?latex=%5Cdisplaystyle%20%5Cpsi_%7BN_t%7D%28z%29%20%3D%20%5Cmathop%7B%5Cmathbb%20E%7D%7Bz%5E%7BX_%7Bt1%7D%20%2B%20%5Chdots%20%2B%20X_%7BtN_%7Bt-1%7D%7D%7D%7D%20%3D%20%5Csum_%7Bk%3D0%7D%5E%5Cinfty%20P%5C%7BN_%7Bt-1%7D%20%3D%20k%5C%7D%20%5Ccdot%20%5Cmathop%7B%5Cmathbb%20E%7D%5Bz%5E%7BX_%7Bt1%7D%20%2B%20%5Chdots%20%2B%20X_%7BtN_%7Bt-1%7D%7D%7D%20%5Cvert%20N_%7Bt-1%7D%20%3D%20k%5D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\displaystyle \psi_{N_t}(z) = \mathop{\mathbb E}{z^{X_{t1} + \hdots + X_{tN_{t-1}}}} = \sum_{k=0}^\infty P\{N_{t-1} = k\} \cdot \mathop{\mathbb E}[z^{X_{t1} + \hdots + X_{tN_{t-1}}} \vert N_{t-1} = k]' title='\displaystyle \psi_{N_t}(z) = \mathop{\mathbb E}{z^{X_{t1} + \hdots + X_{tN_{t-1}}}} = \sum_{k=0}^\infty P\{N_{t-1} = k\} \cdot \mathop{\mathbb E}[z^{X_{t1} + \hdots + X_{tN_{t-1}}} \vert N_{t-1} = k]' class='latex' /></p>
<p>which is just:</p>
<p align="center"><img src='http://s.wordpress.com/latex.php?latex=%5Cdisplaystyle%20%5Cpsi_%7BN_t%7D%28z%29%20%3D%20%5Csum_%7Bk%3D0%7D%5E%5Cinfty%20P%5C%7BN_%7Bt-1%7D%20%3D%20k%5C%7D%20%5Ccdot%20%5Cmathop%7B%5Cmathbb%20E%7D%5Bz%5E%7BX_%7Bt1%7D%20%2B%20%5Chdots%20%2B%20X_%7Btk%7D%7D%5D%20%3D%20%5Csum_%7Bk%3D0%7D%5E%5Cinfty%20P%5C%7BN_%7Bt-1%7D%20%3D%20k%20%5C%7D%20%5Ccdot%20%5Cpsi_X%20%28z%29%5Ek&#038;bg=T&#038;fg=000000&#038;s=0' alt='\displaystyle \psi_{N_t}(z) = \sum_{k=0}^\infty P\{N_{t-1} = k\} \cdot \mathop{\mathbb E}[z^{X_{t1} + \hdots + X_{tk}}] = \sum_{k=0}^\infty P\{N_{t-1} = k \} \cdot \psi_X (z)^k' title='\displaystyle \psi_{N_t}(z) = \sum_{k=0}^\infty P\{N_{t-1} = k\} \cdot \mathop{\mathbb E}[z^{X_{t1} + \hdots + X_{tk}}] = \sum_{k=0}^\infty P\{N_{t-1} = k \} \cdot \psi_X (z)^k' class='latex' /></p>
<p>since the variables are all independent and identically distributed. Now, notice that:</p>
<p align="center"><img src='http://s.wordpress.com/latex.php?latex=%5Cdisplaystyle%20%5Cpsi_%7BN_%7Bt-1%7D%7D%28z%29%20%3D%20%5Csum_%7Bk%3D0%7D%5E%5Cinfty%20P%5C%7BN_%7Bt-1%7D%20%3D%20k%5C%7D%20%5Ccdot%20z%5Ek&#038;bg=T&#038;fg=000000&#038;s=0' alt='\displaystyle \psi_{N_{t-1}}(z) = \sum_{k=0}^\infty P\{N_{t-1} = k\} \cdot z^k' title='\displaystyle \psi_{N_{t-1}}(z) = \sum_{k=0}^\infty P\{N_{t-1} = k\} \cdot z^k' class='latex' /></p>
<p>by the definition of moment generating function, so we effectively proved that:</p>
<p align="center"><img src='http://s.wordpress.com/latex.php?latex=%5Cdisplaystyle%20%5Cpsi_%7BN_%7Bt%7D%7D%28z%29%20%3D%20%5Cpsi_%7BN_%7Bt-1%7D%7D%28%5Cpsi_X%28z%29%29%20%3D%20%5Cpsi_X%20%5Cpsi_X%20%5Chdots%20%5Cpsi_X%28z%29%20&#038;bg=T&#038;fg=000000&#038;s=0' alt='\displaystyle \psi_{N_{t}}(z) = \psi_{N_{t-1}}(\psi_X(z)) = \psi_X \psi_X \hdots \psi_X(z) ' title='\displaystyle \psi_{N_{t}}(z) = \psi_{N_{t-1}}(\psi_X(z)) = \psi_X \psi_X \hdots \psi_X(z) ' class='latex' /></p>
<p>We proved that <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cpsi_%7BN_%7Bt%7D%7D%28z%29%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\psi_{N_{t}}(z)}' title='{\psi_{N_{t}}(z)}' class='latex' /> is just <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cpsi%28z%29%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\psi(z)}' title='{\psi(z)}' class='latex' /> iterated <img src='http://s.wordpress.com/latex.php?latex=%7Bt%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{t}' title='{t}' class='latex' /> times. Now, calculating the expectation is easy, using the fact that <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cpsi_X%281%29%20%3D%201%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\psi_X(1) = 1}' title='{\psi_X(1) = 1}' class='latex' /> and <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cpsi%27_X%281%29%20%3D%20%5Cmathop%7B%5Cmathbb%20E%7D%20X%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\psi&#039;_X(1) = \mathop{\mathbb E} X}' title='{\psi&#039;_X(1) = \mathop{\mathbb E} X}' class='latex' />. Just see that: <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cmathop%7B%5Cmathbb%20E%7D%20N_t%20%3D%20%5Cpsi_%7BN_%7Bt%7D%7D%27%281%29%20%3D%20%5Cpsi_%7BN_%7Bt-1%7D%7D%27%28%5Cpsi_X%281%29%29%20%5Ccdot%20%5Cpsi%27_X%281%29%20%3D%20%5Cpsi_%7BN_%7Bt-1%7D%7D%27%281%29%20%5Ccdot%20%5Cmathop%7B%5Cmathbb%20E%7D%20X%20%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\mathop{\mathbb E} N_t = \psi_{N_{t}}&#039;(1) = \psi_{N_{t-1}}&#039;(\psi_X(1)) \cdot \psi&#039;_X(1) = \psi_{N_{t-1}}&#039;(1) \cdot \mathop{\mathbb E} X }' title='{\mathop{\mathbb E} N_t = \psi_{N_{t}}&#039;(1) = \psi_{N_{t-1}}&#039;(\psi_X(1)) \cdot \psi&#039;_X(1) = \psi_{N_{t-1}}&#039;(1) \cdot \mathop{\mathbb E} X }' class='latex' />. 
Then, clearly <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cmathop%7B%5Cmathbb%20E%7D%20N_t%20%3D%20%28%5Cmathop%7B%5Cmathbb%20E%7D%20X%29%5Et%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\mathop{\mathbb E} N_t = (\mathop{\mathbb E} X)^t}' title='{\mathop{\mathbb E} N_t = (\mathop{\mathbb E} X)^t}' class='latex' />. Using similar techniques we can prove a lot more things about this process, just by analyzing the behavior of the moment generating function.</li>
<li> <strong>Laplace Transform</strong>: Now, moving to continuous variables, if <img src='http://s.wordpress.com/latex.php?latex=%7BX%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{X}' title='{X}' class='latex' /> is a continuous non-negative variable we can define its Laplace transform as: <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cphi_X%28u%29%20%3D%20%5Cmathop%7B%5Cmathbb%20E%7D%20%5Be%5E%7B-u%20X%7D%5D%20%3D%20%5Cint_0%5E%5Cinfty%20e%5E%7B-ux%7D%20%5Cmu_X%28dx%29%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\phi_X(u) = \mathop{\mathbb E} [e^{-u X}] = \int_0^\infty e^{-ux} \mu_X(dx)}' title='{\phi_X(u) = \mathop{\mathbb E} [e^{-u X}] = \int_0^\infty e^{-ux} \mu_X(dx)}' class='latex' />, where <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cmu_X%28dx%29%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\mu_X(dx)}' title='{\mu_X(dx)}' class='latex' /> stands for the distribution of <img src='http://s.wordpress.com/latex.php?latex=%7BX%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{X}' title='{X}' class='latex' />, for example, <img src='http://s.wordpress.com/latex.php?latex=%7B%5Crho_X%28x%29%20dx%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\rho_X(x) dx}' title='{\rho_X(x) dx}' class='latex' />. This is associated with the multiplicative system <img src='http://s.wordpress.com/latex.php?latex=%7BQ%20%3D%20%5C%7Be%5E%7B-ux%7D%3B%20u%20%5Cgeq%200%5C%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{Q = \{e^{-ux}; u \geq 0\}}' title='{Q = \{e^{-ux}; u \geq 0\}}' class='latex' />. Again, by the Multiplicative Systems Theorem, if <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cphi_X%20%3D%20%5Cphi_Y%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\phi_X = \phi_Y}' title='{\phi_X = \phi_Y}' class='latex' />, then the two variables have the same distribution. 
The Laplace transform has the same nice properties as the Moment Generating Function. For example, for independent variables, <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cphi_%7BX%2BY%7D%28u%29%20%3D%20%5Cmathop%7B%5Cmathbb%20E%7D%5Be%5E%7B-%28X%2BY%29u%7D%5D%20%3D%20%5Cmathop%7B%5Cmathbb%20E%7D%5Be%5E%7B-Xu%7D%5D%20%5Ccdot%20%5Cmathop%7B%5Cmathbb%20E%7D%5Be%5E%7B-Yu%7D%5D%20%3D%20%5Cphi_X%28u%29%20%5Ccdot%20%5Cphi_Y%28u%29%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\phi_{X+Y}(u) = \mathop{\mathbb E}[e^{-(X+Y)u}] = \mathop{\mathbb E}[e^{-Xu}] \cdot \mathop{\mathbb E}[e^{-Yu}] = \phi_X(u) \cdot \phi_Y(u)}' title='{\phi_{X+Y}(u) = \mathop{\mathbb E}[e^{-(X+Y)u}] = \mathop{\mathbb E}[e^{-Xu}] \cdot \mathop{\mathbb E}[e^{-Yu}] = \phi_X(u) \cdot \phi_Y(u)}' class='latex' />. It also allows us to do tricks similar to the one I just showed for Moment Generating Functions. A closely related trick, used for example in the proof of Chernoff bounds, takes the exponential with a positive exponent. Given independent non-negative random variables:
<p align="center"><img src='http://s.wordpress.com/latex.php?latex=%5Cdisplaystyle%20P%5Cleft%5C%7B%5Csum_i%20X_i%20%3E%20u%5Cright%5C%7D%20%3D%20P%5Cleft%5C%7Be%5E%7B%5Csum_i%20X_i%7D%20%3E%20e%5Eu%5Cright%5C%7D%20%5Cleq%20%5Cfrac%7B%5Cmathop%7B%5Cmathbb%20E%7D%5Be%5E%7B%5Csum_i%20X_i%7D%20%5D%7D%7Be%5Eu%7D%20%3D%20%5Cfrac%7B%5Cprod_i%20%5Cmathop%7B%5Cmathbb%20E%7D%5Be%5E%7BX_i%7D%20%5D%7D%7Be%5Eu%7D%20&#038;bg=T&#038;fg=000000&#038;s=0' alt='\displaystyle P\left\{\sum_i X_i &gt; u\right\} = P\left\{e^{\sum_i X_i} &gt; e^u\right\} \leq \frac{\mathop{\mathbb E}[e^{\sum_i X_i} ]}{e^u} = \frac{\prod_i \mathop{\mathbb E}[e^{X_i} ]}{e^u} ' title='\displaystyle P\left\{\sum_i X_i &gt; u\right\} = P\left\{e^{\sum_i X_i} &gt; e^u\right\} \leq \frac{\mathop{\mathbb E}[e^{\sum_i X_i} ]}{e^u} = \frac{\prod_i \mathop{\mathbb E}[e^{X_i} ]}{e^u} ' class='latex' /></p>
<p>where we also used Markov&#8217;s inequality: <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cmathop%7B%5Cmathbb%20E%7D%5BX%5D%20%5Cgeq%20%5Cbeta%20P%5C%7BX%20%5Cgeq%20%5Cbeta%5C%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\mathop{\mathbb E}[X] \geq \beta P\{X \geq \beta\}}' title='{\mathop{\mathbb E}[X] \geq \beta P\{X \geq \beta\}}' class='latex' />. Passing to exponentials, as the Laplace transform does, is the main ingredient in the Chernoff bound and it allows us to sort of &#8220;decouple&#8221; the random variables in the sum. There are several other cases where the Laplace transform proves itself very useful and turns things that looked very complicated in undergrad courses into simple and clear things. One clear example of that is the motivation for the Poisson random variable:</p>
<p>If <img src='http://s.wordpress.com/latex.php?latex=%7BT_i%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{T_i}' title='{T_i}' class='latex' /> are independent exponentially distributed random variables with rate <img src='http://s.wordpress.com/latex.php?latex=%7B%5Clambda%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\lambda}' title='{\lambda}' class='latex' /> (that is, with mean <img src='http://s.wordpress.com/latex.php?latex=%7B1%2F%5Clambda%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{1/\lambda}' title='{1/\lambda}' class='latex' />), then <img src='http://s.wordpress.com/latex.php?latex=%7B%5Crho_%7BT_i%7D%28t%29%20%3D%20%5Clambda%20e%5E%7B-%20%5Clambda%20t%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\rho_{T_i}(t) = \lambda e^{- \lambda t}}' title='{\rho_{T_i}(t) = \lambda e^{- \lambda t}}' class='latex' />. An elementary calculation shows that its Laplace transform is <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cphi_%7BT_i%7D%28u%29%20%3D%20%5Cfrac%7B%5Clambda%7D%7B%5Clambda%2Bu%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\phi_{T_i}(u) = \frac{\lambda}{\lambda+u}}' title='{\phi_{T_i}(u) = \frac{\lambda}{\lambda+u}}' class='latex' />. Let <img src='http://s.wordpress.com/latex.php?latex=%7BS_n%20%3D%20T_0%20%2B%20T_1%20%2B%20T_2%20%2B%20%5Chdots%20%2B%20T_n%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{S_n = T_0 + T_1 + T_2 + \hdots + T_n}' title='{S_n = T_0 + T_1 + T_2 + \hdots + T_n}' class='latex' />, i.e., the time of the <img src='http://s.wordpress.com/latex.php?latex=%7B%28n%2B1%29%5E%7Bth%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{(n+1)^{th}}' title='{(n+1)^{th}}' class='latex' /> arrival. We want to know the distribution of <img src='http://s.wordpress.com/latex.php?latex=%7BS_n%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{S_n}' title='{S_n}' class='latex' />. How do we do that?</p>
<p align="center"><img src='http://s.wordpress.com/latex.php?latex=%5Cdisplaystyle%20%5Cphi_%7BS_n%7D%28u%29%20%3D%20%5Cmathop%7B%5Cmathbb%20E%7D%5Be%5E%7B-u%28T_0%20%2B%20%5Chdots%20%2B%20T_n%29%7D%5D%20%3D%20%5Cphi_T%28u%29%5E%7Bn%2B1%7D%20%3D%20%5Cleft%28%20%5Cfrac%7B%5Clambda%7D%7B%5Clambda%2Bu%7D%20%5Cright%29%5E%7Bn%2B1%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\displaystyle \phi_{S_n}(u) = \mathop{\mathbb E}[e^{-u(T_0 + \hdots + T_n)}] = \phi_T(u)^{n+1} = \left( \frac{\lambda}{\lambda+u} \right)^{n+1}' title='\displaystyle \phi_{S_n}(u) = \mathop{\mathbb E}[e^{-u(T_0 + \hdots + T_n)}] = \phi_T(u)^{n+1} = \left( \frac{\lambda}{\lambda+u} \right)^{n+1}' class='latex' /></p>
<p>Now, we need to find <img src='http://s.wordpress.com/latex.php?latex=%7B%5Crho_%7BS_n%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\rho_{S_n}}' title='{\rho_{S_n}}' class='latex' /> such that <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cint_0%5E%5Cinfty%20%5Crho_%7BS_n%7D%28t%29%20e%5E%7B-ut%7D%20dt%20%3D%20%5Cleft%28%20%5Cfrac%7B%5Clambda%7D%7B%5Clambda%2Bu%7D%20%5Cright%29%5E%7Bn%2B1%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\int_0^\infty \rho_{S_n}(t) e^{-ut} dt = \left( \frac{\lambda}{\lambda+u} \right)^{n+1}}' title='{\int_0^\infty \rho_{S_n}(t) e^{-ut} dt = \left( \frac{\lambda}{\lambda+u} \right)^{n+1}}' class='latex' />. Now it is just a matter of solving this equation and we get: <img src='http://s.wordpress.com/latex.php?latex=%7B%5Crho_%7BS_n%7D%28t%29%20%3D%20%5Cfrac%7B%5Clambda%20%28%5Clambda%20t%29%5En%7D%7Bn%21%7D%20e%5E%7B-%5Clambda%20t%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\rho_{S_n}(t) = \frac{\lambda (\lambda t)^n}{n!} e^{-\lambda t}}' title='{\rho_{S_n}(t) = \frac{\lambda (\lambda t)^n}{n!} e^{-\lambda t}}' class='latex' />. Now, the Poisson variable <img src='http://s.wordpress.com/latex.php?latex=%7BN_t%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{N_t}' title='{N_t}' class='latex' /> measures the number of arrivals in <img src='http://s.wordpress.com/latex.php?latex=%7B%5B0%2Ct%5D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{[0,t]}' title='{[0,t]}' class='latex' /> and therefore:</p>
<p align="center"><img src='http://s.wordpress.com/latex.php?latex=%5Cdisplaystyle%20%5Cbegin%7Baligned%7D%20P%5C%7BN_t%20%3D%20n%5C%7D%20%26%20%3D%20P%5C%7BS_%7Bn-1%7D%20%3C%20t%20%3C%20S_n%5C%7D%20%3D%20P%5C%7BS_n%20%3E%20t%5C%7D%20-%20P%5C%7BS_%7Bn-1%7D%20%5Cgeq%20t%5C%7D%20%5C%5C%20%26%20%3D%20%5Cint_t%5E%5Cinfty%20%5Crho_%7BS_n%7D%28t%29%20dt%20-%20%5Cint_t%5E%5Cinfty%20%5Crho_%7BS_%7Bn-1%7D%7D%28t%29%20dt%20%3D%20%5Cfrac%7B%28%5Clambda%20t%29%5En%7D%7Bn%21%7D%20e%5E%7B-%5Clambda%20t%7D%20%5Cend%7Baligned%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\displaystyle \begin{aligned} P\{N_t = n\} &amp; = P\{S_{n-1} &lt; t &lt; S_n\} = P\{S_n &gt; t\} - P\{S_{n-1} \geq t\} \\ &amp; = \int_t^\infty \rho_{S_n}(t) dt - \int_t^\infty \rho_{S_{n-1}}(t) dt = \frac{(\lambda t)^n}{n!} e^{-\lambda t} \end{aligned}' title='\displaystyle \begin{aligned} P\{N_t = n\} &amp; = P\{S_{n-1} &lt; t &lt; S_n\} = P\{S_n &gt; t\} - P\{S_{n-1} \geq t\} \\ &amp; = \int_t^\infty \rho_{S_n}(t) dt - \int_t^\infty \rho_{S_{n-1}}(t) dt = \frac{(\lambda t)^n}{n!} e^{-\lambda t} \end{aligned}' class='latex' /></p>
</li>
<li> <strong>Characteristic Function or Fourier Transform</strong>: Taking <img src='http://s.wordpress.com/latex.php?latex=%7BQ%20%3D%20%5C%7Be%5E%7Bi%5Clambda%20x%7D%3B%20%5Clambda%20%5Cin%20%5Cmathbb%7BR%7D%5C%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{Q = \{e^{i\lambda x}; \lambda \in \mathbb{R}\}}' title='{Q = \{e^{i\lambda x}; \lambda \in \mathbb{R}\}}' class='latex' /> we get the Fourier Transform: <img src='http://s.wordpress.com/latex.php?latex=%7Bf_X%28%5Clambda%29%20%3D%20%5Cmathop%7B%5Cmathbb%20E%7D%20e%5E%7Bi%20%5Clambda%20X%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{f_X(\lambda) = \mathop{\mathbb E} e^{i \lambda X}}' title='{f_X(\lambda) = \mathop{\mathbb E} e^{i \lambda X}}' class='latex' /> which also has some of the nice properties of the previous ones and some additional ones. The characteristic functions were the main actors in the development of all the probability techniques that led to the main result of 19th century Probability Theory: the Central Limit Theorem. We know that moment generating functions and Laplace transforms completely characterize the distributions, but it is not clear how to recover a distribution once we have a transform. For the Fourier Transform there is a clear and simple way of doing that by means of the Inversion Formula:
<p align="center"><img src='http://s.wordpress.com/latex.php?latex=%5Cdisplaystyle%20%5Crho_X%28x%29%20%3D%20%5Cfrac%7B1%7D%7B2%20%5Cpi%7D%20%5Cint_%7B%5Cmathbb%20R%7D%20e%5E%7B-i%20%5Clambda%20x%7D%20f_X%28%5Clambda%29%20d%5Clambda&#038;bg=T&#038;fg=000000&#038;s=0' alt='\displaystyle \rho_X(x) = \frac{1}{2 \pi} \int_{\mathbb R} e^{-i \lambda x} f_X(\lambda) d\lambda' title='\displaystyle \rho_X(x) = \frac{1}{2 \pi} \int_{\mathbb R} e^{-i \lambda x} f_X(\lambda) d\lambda' class='latex' /></p>
<p>One fact that always puzzled me was: why is the normal distribution <img src='http://s.wordpress.com/latex.php?latex=%7B%5Crho_N%20%28x%29%20%3D%20%5Cfrac%7B1%7D%7B%5Csqrt%7B2%20%5Cpi%7D%7D%20e%5E%7B-x%5E2%2F2%7D%20%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\rho_N (x) = \frac{1}{\sqrt{2 \pi}} e^{-x^2/2} }' title='{\rho_N (x) = \frac{1}{\sqrt{2 \pi}} e^{-x^2/2} }' class='latex' /> so important? What is so special about it that makes it the limiting distribution in the Central Limit Theorem, i.e., if <img src='http://s.wordpress.com/latex.php?latex=%7BX_1%2C%20X_2%2C%20%5Chdots%2C%20X_n%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{X_1, X_2, \hdots, X_n}' title='{X_1, X_2, \hdots, X_n}' class='latex' /> is a sequence of independent random variables and <img src='http://s.wordpress.com/latex.php?latex=%7BS_n%20%3D%20%5Csum_1%5En%20X_i%20-%20%5Csum_1%5En%20%5Cmathop%7B%5Cmathbb%20E%7D%20X_i%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{S_n = \sum_1^n X_i - \sum_1^n \mathop{\mathbb E} X_i}' title='{S_n = \sum_1^n X_i - \sum_1^n \mathop{\mathbb E} X_i}' class='latex' />, then <img src='http://s.wordpress.com/latex.php?latex=%7BS_n%20%2F%20%5Csqrt%7Bvar%20S_n%7D%20%5Crightarrow%20N%280%2C1%29%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{S_n / \sqrt{var S_n} \rightarrow N(0,1)}' title='{S_n / \sqrt{var S_n} \rightarrow N(0,1)}' class='latex' /> under some natural conditions on the variables? The reason the normal is so special is that it is a &#8220;fixed point&#8221; of the Fourier Transform, up to a constant factor. We can see that <img src='http://s.wordpress.com/latex.php?latex=%7Bf_N%28%5Clambda%29%20%3D%20e%5E%7B-%5Clambda%5E2%2F2%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{f_N(\lambda) = e^{-\lambda^2/2}}' title='{f_N(\lambda) = e^{-\lambda^2/2}}' class='latex' />, which has the same shape as the density itself. And there we have something special about it that makes me believe the Central Limit Theorem.</li>
</ol>
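<p>The fixed-point claim is easy to check numerically. The sketch below assumes that the characteristic function is defined as the expectation of exp(i &lambda; X), the convention that matches the inversion formula above; the function name and the integration parameters are made up for this illustration. For a standard normal the imaginary part vanishes by symmetry, so only the cosine term is integrated.</p>

```python
import math


def char_fn_normal(lam, half_width=10.0, steps=20000):
    """Midpoint-rule approximation of E[exp(i*lam*X)] for X ~ N(0, 1).

    Only the real part is computed: the sine term integrates to zero
    because the normal density is symmetric around the origin.
    """
    h = 2 * half_width / steps
    total = 0.0
    for k in range(steps):
        x = -half_width + (k + 0.5) * h
        density = math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
        total += math.cos(lam * x) * density * h
    return total


# Compare the numerical integral with the closed form exp(-lam^2 / 2).
for lam in (0.0, 0.5, 1.0, 2.0):
    print(lam, char_fn_normal(lam), math.exp(-lam * lam / 2))
```

<p>The two printed columns should agree to many decimal places, which is all the check is meant to show; it is not a proof.</p>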
<p style="text-align: center;">&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;-</p>
<p>This blog post was based on lectures by Professor Dynkin at Cornell.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.bigredbits.com/archives/231/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Random Spanning Trees</title>
		<link>http://www.bigredbits.com/archives/226</link>
		<comments>http://www.bigredbits.com/archives/226#comments</comments>
		<pubDate>Wed, 04 Nov 2009 04:51:59 +0000</pubDate>
		<dc:creator>renatoppl</dc:creator>
				<category><![CDATA[theory]]></category>
		<category><![CDATA[combinatorics]]></category>
		<category><![CDATA[probability]]></category>

		<guid isPermaLink="false">http://www.bigredbits.com/?p=226</guid>
		<description><![CDATA[BigRedBits is again pleased to have Igor Gorodezky as a guest blogger directly from UCLA. I leave you with his excellent post on Wilson&#8217;s algorithm. &#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212; Igor again, with another mathematical dispatch from UCLA, where I&#8217;m spending the semester eating and breathing combinatorics as part of the 2009 program on combinatorics and its applications [...]]]></description>
			<content:encoded><![CDATA[<p>BigRedBits is again pleased to have <a href="http://jay.cam.cornell.edu/~igor/">Igor Gorodezky</a> as a guest blogger directly from UCLA. I leave you with his excellent post on Wilson&#8217;s algorithm.</p>
<p style="text-align: center;">&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;</p>
<p>Igor again, with another mathematical dispatch from UCLA, where I&#8217;m spending the semester eating and breathing combinatorics as part of the 2009 program on combinatorics and its applications at IPAM. In the course of some reading related to a problem with which I&#8217;ve been occupying myself, I ran across a neat algorithmic result &#8211; Wilson&#8217;s algorithm for uniformly generating spanning trees of a graph. With Renato&#8217;s kind permission, let me once again make myself at home here at Big Red Bits and tell you all about this little gem.</p>
<p>The problem is straightforward, and I&#8217;ve essentially already stated it: given an undirected, connected graph <img src='http://s.wordpress.com/latex.php?latex=%7BG%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{G}' title='{G}' class='latex' />, we want an algorithm that outputs uniformly random spanning trees of <img src='http://s.wordpress.com/latex.php?latex=%7BG%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{G}' title='{G}' class='latex' />. In the early &#8217;90s, Aldous and Broder independently discovered an algorithm for accomplishing this task. This algorithm generates a tree <img src='http://s.wordpress.com/latex.php?latex=%7BT%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{T}' title='{T}' class='latex' /> by, roughly speaking, performing a random walk on <img src='http://s.wordpress.com/latex.php?latex=%7BG%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{G}' title='{G}' class='latex' /> and adding the edge <img src='http://s.wordpress.com/latex.php?latex=%7B%28u%2Cv%29%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{(u,v)}' title='{(u,v)}' class='latex' /> to <img src='http://s.wordpress.com/latex.php?latex=%7BT%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{T}' title='{T}' class='latex' /> every time that the walk steps from <img src='http://s.wordpress.com/latex.php?latex=%7Bu%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{u}' title='{u}' class='latex' /> to <img src='http://s.wordpress.com/latex.php?latex=%7Bv%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{v}' title='{v}' class='latex' /> and <img src='http://s.wordpress.com/latex.php?latex=%7Bv%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{v}' title='{v}' class='latex' /> is a vertex that has not been seen before.</p>
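<p>A minimal sketch of the random-walk procedure just described, assuming the graph is given as a dictionary that maps each vertex to the list of its neighbors. The function name, the graph encoding and the optional <code>rng</code> argument are choices made for this illustration and are not taken from the Aldous or Broder papers.</p>

```python
import random


def aldous_broder(adj, start, rng=random):
    """Walk randomly on a connected graph until every vertex has been seen.

    Returns the set of edges (u, v) along which the walk first entered v.
    """
    seen = {start}
    tree = set()
    u = start
    while len(seen) != len(adj):
        v = rng.choice(adj[u])
        if v not in seen:
            # First visit to v: keep the edge the walk arrived on.
            seen.add(v)
            tree.add((u, v))
        u = v
    return tree


# Example: the 4-cycle 0 - 1 - 2 - 3 - 0.
cycle4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(aldous_broder(cycle4, start=0))
```

<p>The walk has to cover the whole graph before it stops, so the running time of this sketch is governed by the cover time of the walk.</p>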
<p>Wilson&#8217;s algorithm (D. B. Wilson, <a class="snap_noshots" href="http://dbwilson.com/ja/tau.ps">&#8220;Generating random spanning trees more quickly than the cover time,&#8221;</a> STOC &#8217;96) takes a slightly different approach. Let us fix a root vertex <img src='http://s.wordpress.com/latex.php?latex=%7Br%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{r}' title='{r}' class='latex' />. Wilson&#8217;s algorithm can be stated as a <em>loop-erased random walk</em> on <img src='http://s.wordpress.com/latex.php?latex=%7BG%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{G}' title='{G}' class='latex' /> as follows.</p>
<blockquote><p><strong>Algorithm 1 (Loop-erased random walk)</strong> Maintain a tree <img src='http://s.wordpress.com/latex.php?latex=%7BT%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{T}' title='{T}' class='latex' />, initialized to consist of <img src='http://s.wordpress.com/latex.php?latex=%7Br%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{r}' title='{r}' class='latex' /> alone. While there remains a vertex <img src='http://s.wordpress.com/latex.php?latex=%7Bv%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{v}' title='{v}' class='latex' /> not in <img src='http://s.wordpress.com/latex.php?latex=%7BT%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{T}' title='{T}' class='latex' />: perform a random walk starting at <img src='http://s.wordpress.com/latex.php?latex=%7Bv%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{v}' title='{v}' class='latex' />, erasing loops as they are created, until the walk encounters a vertex in <img src='http://s.wordpress.com/latex.php?latex=%7BT%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{T}' title='{T}' class='latex' />, then add to <img src='http://s.wordpress.com/latex.php?latex=%7BT%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{T}' title='{T}' class='latex' /> the cycle-erased simple path from <img src='http://s.wordpress.com/latex.php?latex=%7Bv%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{v}' title='{v}' class='latex' /> to <img src='http://s.wordpress.com/latex.php?latex=%7BT%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{T}' title='{T}' class='latex' />.</p></blockquote>
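<p>Algorithm 1 can be sketched in a few lines, again assuming the graph is a dictionary from vertices to neighbor lists. The names below are hypothetical. The sketch relies on a common implementation trick rather than on explicit loop erasure: it remembers only the most recent exit taken from each vertex, and overwriting an older exit is what erases the loop.</p>

```python
import random


def wilson(adj, root, rng=random):
    """Return a dict sending each non-root vertex to its parent in the tree."""
    parent = {}
    in_tree = {root}
    for start in adj:
        # Random walk from `start` until the current tree is hit.
        # Revisiting a vertex overwrites its exit, which erases the loop.
        u = start
        while u not in in_tree:
            parent[u] = rng.choice(adj[u])
            u = parent[u]
        # Retrace the loop-erased path and attach it to the tree.
        u = start
        while u not in in_tree:
            in_tree.add(u)
            u = parent[u]
    return parent


# Example: the 2 x 2 grid graph, rooted at vertex 0.
grid = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
print(wilson(grid, root=0))
```

<p>Following the parent pointers from any vertex leads to the root, so the returned dictionary encodes a spanning tree oriented towards the root.</p>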
<p>We observe that the algorithm halts with probability 1 (its expected running time is actually polynomial, but let&#8217;s not concern ourselves with these issues here), and outputs a random directed spanning tree oriented towards <img src='http://s.wordpress.com/latex.php?latex=%7Br%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{r}' title='{r}' class='latex' />. It is a minor miracle that this tree is in fact sampled <em>uniformly</em> from the set of all such trees. Let us note that this offers a solution to the original problem, as sampling <img src='http://s.wordpress.com/latex.php?latex=%7Br%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{r}' title='{r}' class='latex' /> randomly and then running the algorithm will produce a uniformly generated spanning tree of <img src='http://s.wordpress.com/latex.php?latex=%7BG%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{G}' title='{G}' class='latex' />.</p>
<p>It remains, then, to prove that the algorithm produces uniform spanning trees rooted at <img src='http://s.wordpress.com/latex.php?latex=%7Br%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{r}' title='{r}' class='latex' /> (by which we mean directed spanning trees oriented towards <img src='http://s.wordpress.com/latex.php?latex=%7Br%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{r}' title='{r}' class='latex' />). To this we dedicate the remainder of this post.</p>
<p><strong>1. A &#8220;different&#8221; algorithm </strong></p>
<p>Wilson&#8217;s proof is delightfully sneaky: we begin by stating and analyzing a seemingly different algorithm, the <em>cycle-popping</em> algorithm. We will prove that this algorithm has the desired properties, and then argue that it is equivalent to the loop-erased random walk (henceforth LERW).</p>
<p>The cycle-popping algorithm works as follows. Given <img src='http://s.wordpress.com/latex.php?latex=%7BG%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{G}' title='{G}' class='latex' /> and <img src='http://s.wordpress.com/latex.php?latex=%7Br%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{r}' title='{r}' class='latex' />, associate with each non-root vertex <img src='http://s.wordpress.com/latex.php?latex=%7Bv%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{v}' title='{v}' class='latex' /> an infinite <em>stack</em> of neighbors. More formally, to each <img src='http://s.wordpress.com/latex.php?latex=%7Bv%20%5Cneq%20r%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{v \neq r}' title='{v \neq r}' class='latex' /> we associate</p>
<p align="center"><img src='http://s.wordpress.com/latex.php?latex=%5Cdisplaystyle%20%20%5Cmathcal%20S_v%20%3D%20%5Bu_%7B0%7D%2C%20u_%7B1%7D%2C%20%5Cdots%20%5D%20&#038;bg=T&#038;fg=000000&#038;s=0' alt='\displaystyle  \mathcal S_v = [u_{0}, u_{1}, \dots ] ' title='\displaystyle  \mathcal S_v = [u_{0}, u_{1}, \dots ] ' class='latex' /></p>
<p>where each <img src='http://s.wordpress.com/latex.php?latex=%7Bu_%7Bi%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{u_{i}}' title='{u_{i}}' class='latex' /> is uniformly (and independently) sampled from the set of neighbors of <img src='http://s.wordpress.com/latex.php?latex=%7Bv%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{v}' title='{v}' class='latex' />. Note that each stack is <em>not</em> a random walk, just a list of neighbors. We refer to the left-most element above as the <em>top</em> of <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cmathcal%20S_v%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\mathcal S_v}' title='{\mathcal S_v}' class='latex' />, and by <em>popping</em> the stack <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cmathcal%20S_v%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\mathcal S_v}' title='{\mathcal S_v}' class='latex' /> we mean removing this top vertex from <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cmathcal%20S_v%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\mathcal S_v}' title='{\mathcal S_v}' class='latex' />.</p>
<p>Define the <em>stack graph</em> <img src='http://s.wordpress.com/latex.php?latex=%7BG_%7B%5Cmathcal%20S%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{G_{\mathcal S}}' title='{G_{\mathcal S}}' class='latex' /> to be the directed graph on <img src='http://s.wordpress.com/latex.php?latex=%7BV%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{V}' title='{V}' class='latex' /> that has an edge from <img src='http://s.wordpress.com/latex.php?latex=%7Bv%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{v}' title='{v}' class='latex' /> to <img src='http://s.wordpress.com/latex.php?latex=%7Bu%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{u}' title='{u}' class='latex' /> if <img src='http://s.wordpress.com/latex.php?latex=%7Bu%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{u}' title='{u}' class='latex' /> is at the top of the stack <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cmathcal%20S_v%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\mathcal S_v}' title='{\mathcal S_v}' class='latex' />. Clearly, if <img src='http://s.wordpress.com/latex.php?latex=%7BG%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{G}' title='{G}' class='latex' /> has <img src='http://s.wordpress.com/latex.php?latex=%7Bn%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{n}' title='{n}' class='latex' /> vertices then <img src='http://s.wordpress.com/latex.php?latex=%7BG_%7B%5Cmathcal%20S%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{G_{\mathcal S}}' title='{G_{\mathcal S}}' class='latex' /> is an oriented subgraph of <img src='http://s.wordpress.com/latex.php?latex=%7BG%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{G}' title='{G}' class='latex' /> with <img src='http://s.wordpress.com/latex.php?latex=%7Bn-1%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{n-1}' title='{n-1}' class='latex' /> edges. The following lemma follows immediately.</p>
<blockquote><p><strong>Lemma 1</strong> <em><a name="lemstack_graph"></a> Either <img src='http://s.wordpress.com/latex.php?latex=%7BG_%7B%5Cmathcal%20S%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{G_{\mathcal S}}' title='{G_{\mathcal S}}' class='latex' /> is a directed spanning tree oriented towards <img src='http://s.wordpress.com/latex.php?latex=%7Br%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{r}' title='{r}' class='latex' /> or it contains a directed cycle. </em></p></blockquote>
<p>If there is a directed cycle <img src='http://s.wordpress.com/latex.php?latex=%7BC%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{C}' title='{C}' class='latex' /> in <img src='http://s.wordpress.com/latex.php?latex=%7BG_%7B%5Cmathcal%20S%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{G_{\mathcal S}}' title='{G_{\mathcal S}}' class='latex' /> we may pop it by popping <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cmathcal%20S_v%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\mathcal S_v}' title='{\mathcal S_v}' class='latex' /> for every <img src='http://s.wordpress.com/latex.php?latex=%7Bv%20%5Cin%20C%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{v \in C}' title='{v \in C}' class='latex' />. This eliminates <img src='http://s.wordpress.com/latex.php?latex=%7BC%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{C}' title='{C}' class='latex' />, but of course might create other directed cycles. Without resolving this tension quite yet, let us go ahead and formally state the cycle-popping algorithm.</p>
<blockquote><p><strong>Algorithm 2 (Cycle-popping algorithm)</strong> Create a stack <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cmathcal%20S_v%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\mathcal S_v}' title='{\mathcal S_v}' class='latex' /> for every <img src='http://s.wordpress.com/latex.php?latex=%7Bv%20%5Cneq%20r%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{v \neq r}' title='{v \neq r}' class='latex' />. While <img src='http://s.wordpress.com/latex.php?latex=%7BG_%7B%5Cmathcal%20S%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{G_{\mathcal S}}' title='{G_{\mathcal S}}' class='latex' /> contains any directed cycles, pop a cycle from the stacks. If this process ever terminates, output <img src='http://s.wordpress.com/latex.php?latex=%7BG_%7B%5Cmathcal%20S%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{G_{\mathcal S}}' title='{G_{\mathcal S}}' class='latex' />.</p></blockquote>
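<p>Here is a hedged sketch of Algorithm 2. A program cannot store infinite stacks, so this version assumes it is acceptable to keep only the current top of each stack and to draw the next entry at the moment a stack is popped; since the entries are independent, this does not change their distribution. The function names and the <code>pops</code> counter are inventions of this sketch.</p>

```python
import random


def find_cycle(top, root):
    """Return the vertices of a directed cycle in the stack graph, or None."""
    for start in top:
        path, position = [], {}
        u = start
        while u != root and u not in position:
            position[u] = len(path)
            path.append(u)
            u = top[u]
        if u != root:
            # The walk came back to a vertex on its own path: a cycle.
            return path[position[u]:]
    return None


def cycle_popping(adj, root, rng=random):
    """Pop cycles until none remain; return the stack tops and pop counts."""
    top = {v: rng.choice(adj[v]) for v in adj if v != root}
    pops = {v: 0 for v in top}
    while True:
        cycle = find_cycle(top, root)
        if cycle is None:
            return top, pops
        for v in cycle:
            # Popping the stack of v reveals its next random neighbor.
            top[v] = rng.choice(adj[v])
            pops[v] += 1


# Example: the complete graph on 4 vertices, rooted at vertex 0.
k4 = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2]}
tree, pops = cycle_popping(k4, root=0)
print(tree, pops)
```

<p>The cycle search here is deliberately naive and always pops the first cycle it finds; it is meant to mirror the statement of the algorithm, not to be efficient.</p>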
<p>Note that by the lemma, if the algorithm ever terminates then its output is a spanning tree rooted at <img src='http://s.wordpress.com/latex.php?latex=%7Br%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{r}' title='{r}' class='latex' />. We claim that the algorithm terminates with probability 1, and moreover generates spanning trees rooted at <img src='http://s.wordpress.com/latex.php?latex=%7Br%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{r}' title='{r}' class='latex' /> uniformly.</p>
<p>To this end, some more definitions: let us say that given a stack <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cmathcal%20S_v%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\mathcal S_v}' title='{\mathcal S_v}' class='latex' />, the vertex <img src='http://s.wordpress.com/latex.php?latex=%7Bu_i%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{u_i}' title='{u_i}' class='latex' /> is at <em>level</em> <img src='http://s.wordpress.com/latex.php?latex=%7Bi%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{i}' title='{i}' class='latex' />. The level of a vertex in a stack is static, and is defined when the stack is created. That is, the level of <img src='http://s.wordpress.com/latex.php?latex=%7Bu_i%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{u_i}' title='{u_i}' class='latex' /> does not change even if <img src='http://s.wordpress.com/latex.php?latex=%7Bu_i%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{u_i}' title='{u_i}' class='latex' /> advances to the top of the stack as a result of the stack getting popped.</p>
<p>We regard the sequence of stack graphs produced by the algorithm as <em>leveled</em> stack graphs: each non-root vertex <img src='http://s.wordpress.com/latex.php?latex=%7Bv%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{v}' title='{v}' class='latex' /> is assigned the level of its stack. Observe that the level of <img src='http://s.wordpress.com/latex.php?latex=%7Bv%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{v}' title='{v}' class='latex' /> in <img src='http://s.wordpress.com/latex.php?latex=%7BG_%7B%5Cmathcal%20S%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{G_{\mathcal S}}' title='{G_{\mathcal S}}' class='latex' /> is the number of times that <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cmathcal%20S_v%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\mathcal S_v}' title='{\mathcal S_v}' class='latex' /> has been popped. In the same way, we regard cycles encountered by the algorithm as leveled cycles, and we can regard the tree produced by the algorithm (if indeed one is produced) as a leveled tree.</p>
<p>The analysis of the algorithm relies on the following key lemma (Theorem 4 in Wilson&#8217;s paper), which tells us that the order in which the algorithm pops cycles is irrelevant.</p>
<blockquote><p><strong>Lemma 2</strong> <em><a name="lemcycles_commute"></a> For a given set of stacks, either the cycle-popping algorithm never terminates, or there exists a unique leveled spanning tree <img src='http://s.wordpress.com/latex.php?latex=%7BT%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{T}' title='{T}' class='latex' /> rooted at <img src='http://s.wordpress.com/latex.php?latex=%7Br%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{r}' title='{r}' class='latex' /> such that the algorithm outputs <img src='http://s.wordpress.com/latex.php?latex=%7BT%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{T}' title='{T}' class='latex' /> irrespective of the order in which cycles are popped. </em></p></blockquote>
<p><em>Proof:</em> Fix a set of stacks <img src='http://s.wordpress.com/latex.php?latex=%7B%5C%7B%20%5Cmathcal%20S_v%20%5C%7D_%7Bv%20%5Cneq%20r%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\{ \mathcal S_v \}_{v \neq r}}' title='{\{ \mathcal S_v \}_{v \neq r}}' class='latex' />. Consider a leveled cycle <img src='http://s.wordpress.com/latex.php?latex=%7BC%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{C}' title='{C}' class='latex' /> that is pop-able, i.e., there exist leveled cycles <img src='http://s.wordpress.com/latex.php?latex=%7BC_1%2C%20C_2%2C%20%5Cdots%2C%20C_k%3DC%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{C_1, C_2, \dots, C_k=C}' title='{C_1, C_2, \dots, C_k=C}' class='latex' /> that can be popped in sequence. We claim that if the algorithm pops any cycle not equal to <img src='http://s.wordpress.com/latex.php?latex=%7BC%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{C}' title='{C}' class='latex' />, then there still must exist a series of cycles that ends in <img src='http://s.wordpress.com/latex.php?latex=%7BC%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{C}' title='{C}' class='latex' /> and that can be popped in sequence. In other words, if <img src='http://s.wordpress.com/latex.php?latex=%7BC%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{C}' title='{C}' class='latex' /> is pop-able then it remains pop-able, no matter which cycles are popped, until <img src='http://s.wordpress.com/latex.php?latex=%7BC%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{C}' title='{C}' class='latex' /> itself is actually popped.</p>
<p>Let <img src='http://s.wordpress.com/latex.php?latex=%7BC%27%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{C&#039;}' title='{C&#039;}' class='latex' /> be a cycle popped by the algorithm. If <img src='http://s.wordpress.com/latex.php?latex=%7BC%27%3DC_1%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{C&#039;=C_1}' title='{C&#039;=C_1}' class='latex' /> then the claim is clearly true. Also, if <img src='http://s.wordpress.com/latex.php?latex=%7BC%27%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{C&#039;}' title='{C&#039;}' class='latex' /> shares no vertices with <img src='http://s.wordpress.com/latex.php?latex=%7BC_1%2C%20%5Cdots%2C%20C_k%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{C_1, \dots, C_k}' title='{C_1, \dots, C_k}' class='latex' />, then the claim is true again. So assume otherwise, and let <img src='http://s.wordpress.com/latex.php?latex=%7BC_i%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{C_i}' title='{C_i}' class='latex' /> be the first in the series <img src='http://s.wordpress.com/latex.php?latex=%7BC_1%2C%20%5Cdots%2C%20C_k%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{C_1, \dots, C_k}' title='{C_1, \dots, C_k}' class='latex' /> to share a vertex with <img src='http://s.wordpress.com/latex.php?latex=%7BC%27%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{C&#039;}' title='{C&#039;}' class='latex' />. Let us show that <img src='http://s.wordpress.com/latex.php?latex=%7BC%27%3DC_i%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{C&#039;=C_i}' title='{C&#039;=C_i}' class='latex' /> by contradiction.</p>
<p>If <img src='http://s.wordpress.com/latex.php?latex=%7BC_i%20%5Cneq%20C%27%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{C_i \neq C&#039;}' title='{C_i \neq C&#039;}' class='latex' />, then <img src='http://s.wordpress.com/latex.php?latex=%7BC_i%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{C_i}' title='{C_i}' class='latex' /> and <img src='http://s.wordpress.com/latex.php?latex=%7BC%27%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{C&#039;}' title='{C&#039;}' class='latex' /> must share a vertex <img src='http://s.wordpress.com/latex.php?latex=%7Bw%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{w}' title='{w}' class='latex' /> that has different successors in <img src='http://s.wordpress.com/latex.php?latex=%7BC%27%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{C&#039;}' title='{C&#039;}' class='latex' /> and <img src='http://s.wordpress.com/latex.php?latex=%7BC_i%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{C_i}' title='{C_i}' class='latex' />. But by definition of <img src='http://s.wordpress.com/latex.php?latex=%7BC_i%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{C_i}' title='{C_i}' class='latex' />, none of the <img src='http://s.wordpress.com/latex.php?latex=%7BC_1%2C%20%5Cdots%2C%20C_%7Bi-1%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{C_1, \dots, C_{i-1}}' title='{C_1, \dots, C_{i-1}}' class='latex' /> contain <img src='http://s.wordpress.com/latex.php?latex=%7Bw%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{w}' title='{w}' class='latex' />, and this implies that <img src='http://s.wordpress.com/latex.php?latex=%7Bw%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{w}' title='{w}' class='latex' /> has the same level in <img src='http://s.wordpress.com/latex.php?latex=%7BC%27%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{C&#039;}' title='{C&#039;}' class='latex' /> and <img src='http://s.wordpress.com/latex.php?latex=%7BC_i%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{C_i}' title='{C_i}' class='latex' />. Therefore its successor in both cycles is the same, a contradiction. 
This proves <img src='http://s.wordpress.com/latex.php?latex=%7BC_i%3DC%27%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{C_i=C&#039;}' title='{C_i=C&#039;}' class='latex' />.</p>
<p>Moreover, the argument above proves that <img src='http://s.wordpress.com/latex.php?latex=%7BC_i%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{C_i}' title='{C_i}' class='latex' /> and <img src='http://s.wordpress.com/latex.php?latex=%7BC%27%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{C&#039;}' title='{C&#039;}' class='latex' /> are equal as <em>leveled</em> cycles (i.e., every vertex has the same level in both cycles). Hence</p>
<p align="center"><img src='http://s.wordpress.com/latex.php?latex=%5Cdisplaystyle%20%20C%27%3DC_i%2C%20C_1%2C%20C_2%2C%20%5Cdots%2C%20C_%7Bi-1%7D%2C%20C_%7Bi%2B1%7D%2C%20%5Cdots%2C%20C_k%3DC%20&#038;bg=T&#038;fg=000000&#038;s=0' alt='\displaystyle  C&#039;=C_i, C_1, C_2, \dots, C_{i-1}, C_{i+1}, \dots, C_k=C ' title='\displaystyle  C&#039;=C_i, C_1, C_2, \dots, C_{i-1}, C_{i+1}, \dots, C_k=C ' class='latex' /></p>
<p>is a series of cycles that can be popped in sequence, which proves the original claim about <img src='http://s.wordpress.com/latex.php?latex=%7BC%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{C}' title='{C}' class='latex' />.</p>
<p>We conclude that given a set of stacks, either there is an infinite number of pop-able cycles, in which case there will always be an infinite number and the algorithm will never terminate, or there is a finite number of such cycles. In the latter case, every one of these cycles is eventually popped, and the algorithm produces a spanning tree <img src='http://s.wordpress.com/latex.php?latex=%7BT%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{T}' title='{T}' class='latex' /> rooted at <img src='http://s.wordpress.com/latex.php?latex=%7Br%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{r}' title='{r}' class='latex' />. The level of each non-root vertex <img src='http://s.wordpress.com/latex.php?latex=%7Bv%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{v}' title='{v}' class='latex' /> in <img src='http://s.wordpress.com/latex.php?latex=%7BT%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{T}' title='{T}' class='latex' /> is given by the number of popped cycles that contained <img src='http://s.wordpress.com/latex.php?latex=%7Bv%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{v}' title='{v}' class='latex' />, since levels are counted from zero. <img src='http://s.wordpress.com/latex.php?latex=%5CBox&#038;bg=T&#038;fg=000000&#038;s=0' alt='\Box' title='\Box' class='latex' /></p>
<p>Wilson summarizes the cycle-popping algorithm thusly: &#8220;[T]he stacks uniquely define a tree together with a partially ordered set of cycles layered on top of it. The algorithm peels off these cycles to find the tree.&#8221;</p>
<blockquote><p><strong>Theorem 3</strong> <em> The cycle-popping algorithm terminates with probability 1, and the tree that it outputs is a uniformly sampled spanning tree rooted at <img src='http://s.wordpress.com/latex.php?latex=%7Br%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{r}' title='{r}' class='latex' />. </em></p></blockquote>
<p><em>Proof:</em> The first claim is easy: <img src='http://s.wordpress.com/latex.php?latex=%7BG%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{G}' title='{G}' class='latex' /> has a spanning tree, therefore it has a directed spanning tree oriented towards <img src='http://s.wordpress.com/latex.php?latex=%7Br%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{r}' title='{r}' class='latex' />. The stacks generated in the first step of the algorithm will contain such a tree, and hence the algorithm will terminate, with probability 1.</p>
<p>Now, consider a spanning tree <img src='http://s.wordpress.com/latex.php?latex=%7BT%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{T}' title='{T}' class='latex' /> rooted at <img src='http://s.wordpress.com/latex.php?latex=%7Br%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{r}' title='{r}' class='latex' />. We&#8217;ll abuse notation and let <img src='http://s.wordpress.com/latex.php?latex=%7BT%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{T}' title='{T}' class='latex' /> be the event that <img src='http://s.wordpress.com/latex.php?latex=%7BT%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{T}' title='{T}' class='latex' /> is produced by the algorithm. Similarly, given a collection of leveled cycles <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cmathcal%20C%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\mathcal C}' title='{\mathcal C}' class='latex' />, we will write <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cmathcal%20C%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\mathcal C}' title='{\mathcal C}' class='latex' /> for the event that <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cmathcal%20C%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\mathcal C}' title='{\mathcal C}' class='latex' /> is the set of leveled cycles popped by the algorithm before it terminates. Finally, let <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cmathcal%20C%20%5Cwedge%20T%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\mathcal C \wedge T}' title='{\mathcal C \wedge T}' class='latex' /> be the event that the algorithm popped the leveled cycles in <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cmathcal%20C%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\mathcal C}' title='{\mathcal C}' class='latex' /> and terminated, with the resulting <em>leveled</em> tree being equal to <img src='http://s.wordpress.com/latex.php?latex=%7BT%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{T}' title='{T}' class='latex' />.</p>
<p>By the independence of the stack entries, we have <img src='http://s.wordpress.com/latex.php?latex=%7B%5CPr%5B%5Cmathcal%20C%20%5Cwedge%20T%5D%20%3D%20%5CPr%5B%5Cmathcal%20C%5D%20%5Ccdot%20p%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\Pr[\mathcal C \wedge T] = \Pr[\mathcal C] \cdot p}' title='{\Pr[\mathcal C \wedge T] = \Pr[\mathcal C] \cdot p}' class='latex' />, where <img src='http://s.wordpress.com/latex.php?latex=%7Bp%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{p}' title='{p}' class='latex' /> is the probability that the algorithm&#8217;s output is a leveled version of <img src='http://s.wordpress.com/latex.php?latex=%7BT%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{T}' title='{T}' class='latex' />, a quantity which a moment&#8217;s reflection will reveal is independent of <img src='http://s.wordpress.com/latex.php?latex=%7BT%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{T}' title='{T}' class='latex' />. Now,</p>
<p align="center"><img src='http://s.wordpress.com/latex.php?latex=%5Cdisplaystyle%20%20%5CPr%5BT%5D%20%3D%20%5Csum_%7B%5Cmathcal%20C%7D%20%5CPr%5B%5Cmathcal%20C%20%5Cwedge%20T%5D%20%3D%20p%20%5Csum_%7B%5Cmathcal%20C%7D%20%5CPr%5B%5Cmathcal%20C%5D%20&#038;bg=T&#038;fg=000000&#038;s=0' alt='\displaystyle  \Pr[T] = \sum_{\mathcal C} \Pr[\mathcal C \wedge T] = p \sum_{\mathcal C} \Pr[\mathcal C] ' title='\displaystyle  \Pr[T] = \sum_{\mathcal C} \Pr[\mathcal C \wedge T] = p \sum_{\mathcal C} \Pr[\mathcal C] ' class='latex' /></p>
<p>which, as desired, is independent of <img src='http://s.wordpress.com/latex.php?latex=%7BT%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{T}' title='{T}' class='latex' />. <img src='http://s.wordpress.com/latex.php?latex=%5CBox&#038;bg=T&#038;fg=000000&#038;s=0' alt='\Box' title='\Box' class='latex' /></p>
<p><strong>2. Conclusion </strong></p>
<p>We have shown that the cycle-popping algorithm generates spanning trees rooted at <img src='http://s.wordpress.com/latex.php?latex=%7Br%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{r}' title='{r}' class='latex' /> uniformly. It remains to observe that the LERW algorithm is nothing more than an implementation of the cycle-popping algorithm! Instead of initially generating the (infinitely long) stacks and then looking for cycles to pop, the LERW generates stack elements as necessary via random walk (computer scientists might recognize this as the Principle of Deferred Decisions). If the LERW encounters a loop, then it has found a cycle in the stack graph <img src='http://s.wordpress.com/latex.php?latex=%7BG_%7B%5Cmathcal%20S%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{G_{\mathcal S}}' title='{G_{\mathcal S}}' class='latex' /> induced by the stacks that the LERW has been generating. Erasing the loop is equivalent to popping this cycle. We conclude that the LERW algorithm generates spanning trees rooted at <img src='http://s.wordpress.com/latex.php?latex=%7Br%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{r}' title='{r}' class='latex' /> uniformly.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.bigredbits.com/archives/226/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Entropy</title>
		<link>http://www.bigredbits.com/archives/145</link>
		<comments>http://www.bigredbits.com/archives/145#comments</comments>
		<pubDate>Fri, 28 Aug 2009 01:28:54 +0000</pubDate>
		<dc:creator>renatoppl</dc:creator>
				<category><![CDATA[theory]]></category>
		<category><![CDATA[entropy]]></category>
		<category><![CDATA[information theory]]></category>
		<category><![CDATA[mathematics]]></category>
		<category><![CDATA[probability]]></category>

		<guid isPermaLink="false">http://www.bigredbits.com/?p=145</guid>
		<description><![CDATA[Today was the first day of classes here at Cornell and as usual, I attend to a lot of different classes to try to decide which ones to take. I usually feel like I wanted to take them all, but there is this constant struggle: if I take too many classes I have no time [...]]]></description>
			<content:encoded><![CDATA[<p>Today was the first day of classes here at Cornell and, as usual, I attended a lot of different classes to decide which ones to take. I usually feel like taking them all, but there is a constant struggle: if I take too many classes, I have no time to do research and to read random things that happen to catch my attention, and if I don&#8217;t take many classes, I feel like I am missing a lot of interesting material. The middle-ground solution is to audit a lot of classes and start dropping them as I start needing more time, which usually happens quickly. This fall I decided that I need to build a stronger background in probability, since I keep running into probabilistic arguments and I have nothing more than my undergrad course and things I learned on demand. I attended at least three probability classes with different flavours today, and I decided to blog about a simple yet very impressive result I saw in one of them.</p>
<p>Ever since I took a class on &#8220;Principles of Telecommunications&#8221; in my undergrad, I have been impressed by Shannon&#8217;s <a class="snap_noshots" href="http://en.wikipedia.org/wiki/Information_Theory">Information Theory</a> and the concept of entropy. There was one theorem that I had always heard about but whose proof I had never seen. I thought the proof would be complicated, but it turned out not to be.</p>
<p>Consider an alphabet <img src='http://s.wordpress.com/latex.php?latex=%7B%5COmega%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\Omega}' title='{\Omega}' class='latex' /> and a probability distribution over it. I want to associate to each <img src='http://s.wordpress.com/latex.php?latex=%7B%5Comega%20%5Cin%20%5COmega%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\omega \in \Omega}' title='{\omega \in \Omega}' class='latex' /> a string <img src='http://s.wordpress.com/latex.php?latex=%7Bc%28%5Comega%29%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{c(\omega)}' title='{c(\omega)}' class='latex' /> of <img src='http://s.wordpress.com/latex.php?latex=%7Bk%28%5Comega%29%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{k(\omega)}' title='{k(\omega)}' class='latex' /> <img src='http://s.wordpress.com/latex.php?latex=%7B%5C%7B0%2C1%5C%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\{0,1\}}' title='{\{0,1\}}' class='latex' />-digits to represent that symbol of the alphabet. One way of making the code decodable is to make it a proper code. A proper code is a code such that, given any two distinct symbols <img src='http://s.wordpress.com/latex.php?latex=%7B%5Comega_1%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\omega_1}' title='{\omega_1}' class='latex' /> and <img src='http://s.wordpress.com/latex.php?latex=%7B%5Comega_2%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\omega_2}' title='{\omega_2}' class='latex' />, <img src='http://s.wordpress.com/latex.php?latex=%7Bc%28%5Comega_1%29%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{c(\omega_1)}' title='{c(\omega_1)}' class='latex' /> is not a prefix of <img src='http://s.wordpress.com/latex.php?latex=%7Bc%28%5Comega_2%29%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{c(\omega_2)}' title='{c(\omega_2)}' class='latex' />. There are several codes like this, but some are more efficient than others.
Since the letters have different frequencies, it makes sense to code a frequent letter (say &#8216;e&#8217; in English) with few bits and a letter that doesn&#8217;t appear much (say &#8216;q&#8217;) with more bits. We want to find a proper code that minimizes the expected code length:</p>
<img src='http://s.wordpress.com/latex.php?latex=%5Cdisplaystyle%20%5Cmathop%7B%5Cmathbb%20E%7D%5Bk%28%5Comega%29%5D%20%3D%20%5Csum_%7B%5Comega%20%5Cin%20%5COmega%7D%20k%28%5Comega%29%20p%28%5Comega%29%20&#038;bg=T&#038;fg=000000&#038;s=0' alt='\displaystyle \mathop{\mathbb E}[k(\omega)] = \sum_{\omega \in \Omega} k(\omega) p(\omega) ' title='\displaystyle \mathop{\mathbb E}[k(\omega)] = \sum_{\omega \in \Omega} k(\omega) p(\omega) ' class='latex' />
<p>The celebrated theorem by Shannon shows that for any proper code (actually it holds more generally for any decodable code), we have <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cmathop%7B%5Cmathbb%20E%7D%5Bk%28%5Comega%29%5D%20%5Cgeq%20H%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\mathop{\mathbb E}[k(\omega)] \geq H}' title='{\mathop{\mathbb E}[k(\omega)] \geq H}' class='latex' /> where <img src='http://s.wordpress.com/latex.php?latex=%7BH%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{H}' title='{H}' class='latex' /> is the entropy of the alphabet, defined as:</p>
<img src='http://s.wordpress.com/latex.php?latex=%5Cdisplaystyle%20H%20%3D%20-%20%5Csum_%7B%5Comega%7D%20p%28%5Comega%29%20%5Clog_2%20p%28%5Comega%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='\displaystyle H = - \sum_{\omega} p(\omega) \log_2 p(\omega)' title='\displaystyle H = - \sum_{\omega} p(\omega) \log_2 p(\omega)' class='latex' />
<p>Even more impressive is that we can achieve something very close to this lower bound:</p>
<blockquote><p><strong>Theorem 1</strong> <em> There is a code such that <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cmathop%7B%5Cmathbb%20E%7D%5Bk%28%5Comega%29%5D%20%5Cleq%20H%20%2B%201%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\mathop{\mathbb E}[k(\omega)] \leq H + 1}' title='{\mathop{\mathbb E}[k(\omega)] \leq H + 1}' class='latex' />. </em></p></blockquote>
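<p>To make the two quantities concrete, here is a minimal Python sketch that computes the entropy and the expected code length. The four-letter alphabet, its probabilities and the two codes are hypothetical values chosen for illustration; they are not part of the argument.</p>

```python
import math

def entropy(p):
    """Shannon entropy in bits of a distribution given as {symbol: probability}."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)

def expected_length(p, code):
    """Average codeword length E[k] of a code given as {symbol: bit string}."""
    return sum(p[s] * len(code[s]) for s in p)

# Hypothetical alphabet whose probabilities are all powers of 1/2.
p = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}

# Two proper codes for it: a fixed-length one and a variable-length one.
fixed = {"a": "00", "b": "01", "c": "10", "d": "11"}
variable = {"a": "0", "b": "10", "c": "110", "d": "111"}

H = entropy(p)
print(H)                             # 1.75
print(expected_length(p, fixed))     # 2.0
print(expected_length(p, variable))  # 1.75
```

<p>In this example the variable-length code meets the entropy exactly, which can only happen because every probability is a power of 1/2; in general there is a gap, and Theorem 1 says it can be kept to at most one bit.</p>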
<p>With an additional trick we can get <img src='http://s.wordpress.com/latex.php?latex=%7BH%20%2B%20%5Cepsilon%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{H + \epsilon}' title='{H + \epsilon}' class='latex' /> for any <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cepsilon%20%3E%200%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\epsilon &gt; 0}' title='{\epsilon &gt; 0}' class='latex' />. The lower bound is trickier and I won&#8217;t prove it here (but again, it is not as hard as I thought it would be). To prove that there is a code with average length <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cleq%20H%20%2B%201%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\leq H + 1}' title='{\leq H + 1}' class='latex' />, we use the following lemma:</p>
<blockquote><p><strong>Lemma 2</strong> <em> There is a proper code for <img src='http://s.wordpress.com/latex.php?latex=%7B%5COmega%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\Omega}' title='{\Omega}' class='latex' /> with code-lengths <img src='http://s.wordpress.com/latex.php?latex=%7Bk%28%5Comega%29%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{k(\omega)}' title='{k(\omega)}' class='latex' /> if and only if <img src='http://s.wordpress.com/latex.php?latex=%7B%5Csum_%5Comega%202%5E%7B-k%28%5Comega%29%7D%20%5Cleq%201%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\sum_\omega 2^{-k(\omega)} \leq 1}' title='{\sum_\omega 2^{-k(\omega)} \leq 1}' class='latex' /> </em></p></blockquote>
<p><em>Proof:</em> Let <img src='http://s.wordpress.com/latex.php?latex=%7BN%20%3D%20%5Cmax_%5Comega%20k%28%5Comega%29%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{N = \max_\omega k(\omega)}' title='{N = \max_\omega k(\omega)}' class='latex' /> and imagine all the possible codewords of length <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cleq%20N%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\leq N}' title='{\leq N}' class='latex' /> as a complete binary tree. Since the code is proper, no two codewords <img src='http://s.wordpress.com/latex.php?latex=%7Bc%28%5Comega_1%29%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{c(\omega_1)}' title='{c(\omega_1)}' class='latex' /> and <img src='http://s.wordpress.com/latex.php?latex=%7Bc%28%5Comega_2%29%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{c(\omega_2)}' title='{c(\omega_2)}' class='latex' /> lie on the same path to the root. So, picking one node as a codeword means that we can&#8217;t pick any node in the subtree below it. Also, for each leaf, there is at most one codeword on its path to the root. Therefore we can assign each leaf of the tree to a single codeword or to no codeword at all. It is easy to see that a codeword of length <img src='http://s.wordpress.com/latex.php?latex=%7Bk%28%5Comega%29%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{k(\omega)}' title='{k(\omega)}' class='latex' /> has <img src='http://s.wordpress.com/latex.php?latex=%7B2%5E%7BN%20-%20k%28%5Comega%29%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{2^{N - k(\omega)}}' title='{2^{N - k(\omega)}}' class='latex' /> leaves associated with it. Since there are <img src='http://s.wordpress.com/latex.php?latex=%7B2%5EN%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{2^N}' title='{2^N}' class='latex' /> leaves in total, we have that:</p>
<img src='http://s.wordpress.com/latex.php?latex=%5Cdisplaystyle%20%5Csum_%5Comega%202%5E%7BN-k%28%5Comega%29%7D%20%5Cleq%202%5EN&#038;bg=T&#038;fg=000000&#038;s=0' alt='\displaystyle \sum_\omega 2^{N-k(\omega)} \leq 2^N' title='\displaystyle \sum_\omega 2^{N-k(\omega)} \leq 2^N' class='latex' />
<p>which, after dividing both sides by <img src='http://s.wordpress.com/latex.php?latex=%7B2%5EN%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{2^N}' title='{2^N}' class='latex' />, proves one direction of the result. Now, to prove the converse direction, we can propose a greedy algorithm: given <img src='http://s.wordpress.com/latex.php?latex=%7B%5COmega%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\Omega}' title='{\Omega}' class='latex' /> and <img src='http://s.wordpress.com/latex.php?latex=%7Bk%28%5Comega%29%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{k(\omega)}' title='{k(\omega)}' class='latex' /> such that <img src='http://s.wordpress.com/latex.php?latex=%7B%5Csum_%5Comega%202%5E%7B-k%28%5Comega%29%7D%20%5Cleq%201%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\sum_\omega 2^{-k(\omega)} \leq 1}' title='{\sum_\omega 2^{-k(\omega)} \leq 1}' class='latex' />, let <img src='http://s.wordpress.com/latex.php?latex=%7BN%20%3D%20%5Cmax_%5Comega%20k%28%5Comega%29%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{N = \max_\omega k(\omega)}' title='{N = \max_\omega k(\omega)}' class='latex' />. Now, suppose <img src='http://s.wordpress.com/latex.php?latex=%7Bk%28%5Comega_1%29%20%5Cleq%20k%28%5Comega_2%29%20%5Cleq%20k%28%5Comega_3%29%20%5Cleq%20%5Chdots%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{k(\omega_1) \leq k(\omega_2) \leq k(\omega_3) \leq \hdots}' title='{k(\omega_1) \leq k(\omega_2) \leq k(\omega_3) \leq \hdots}' class='latex' />. Start with the <img src='http://s.wordpress.com/latex.php?latex=%7B2%5EN%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{2^N}' title='{2^N}' class='latex' /> leaves in a single block. Divide them into <img src='http://s.wordpress.com/latex.php?latex=%7B2%5E%7Bk%28%5Comega_1%29%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{2^{k(\omega_1)}}' title='{2^{k(\omega_1)}}' class='latex' /> blocks of equal size and assign one to <img src='http://s.wordpress.com/latex.php?latex=%7B%5Comega_1%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\omega_1}' title='{\omega_1}' class='latex' />.
Now we define the recursive step: when we analyze <img src='http://s.wordpress.com/latex.php?latex=%7B%5Comega_j%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\omega_j}' title='{\omega_j}' class='latex' />, the leaves are divided into <img src='http://s.wordpress.com/latex.php?latex=%7B2%5E%7Bk%28%5Comega_%7Bj-1%7D%29%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{2^{k(\omega_{j-1})}}' title='{2^{k(\omega_{j-1})}}' class='latex' /> blocks, some occupied, some not. Divide each free block into <img src='http://s.wordpress.com/latex.php?latex=%7B2%5E%7Bk%28%5Comega_j%29%20-%20k%28%5Comega_%7Bj-1%7D%29%7D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{2^{k(\omega_j) - k(\omega_{j-1})}}' title='{2^{k(\omega_j) - k(\omega_{j-1})}}' class='latex' /> blocks and assign one of them to <img src='http://s.wordpress.com/latex.php?latex=%7B%5Comega_j%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\omega_j}' title='{\omega_j}' class='latex' />. The inequality guarantees that at least one block is still free at every step. It is not hard to see that each block corresponds to one node in the tree (the common ancestor of all the leaves in that block) and that the resulting assignment is a proper code. <img src='http://s.wordpress.com/latex.php?latex=%5CBox&#038;bg=T&#038;fg=000000&#038;s=0' alt='\Box' title='\Box' class='latex' /></p>
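<p>The greedy construction in the proof can be sketched in a few lines of Python. This is only one possible reading of it: the function names are hypothetical, and always taking the leftmost free block is an assumption of the sketch (the proof only needs some free block). The block index, written with as many binary digits as the code length, is the tree node that becomes the codeword.</p>

```python
from fractions import Fraction

def greedy_proper_code(lengths):
    """Build a proper (prefix-free) code from {symbol: length}, shortest first."""
    if not all(k >= 1 for k in lengths.values()):
        raise ValueError("every code length must be at least 1")
    # Exact check of the inequality in Lemma 2.
    kraft = sum(Fraction(1, 2 ** k) for k in lengths.values())
    if kraft > 1:
        raise ValueError("lengths violate the inequality of Lemma 2")

    code = {}
    next_block = 0  # index of the leftmost free block at the current depth
    depth = 0       # current depth in the tree, i.e. the previous code length
    for symbol, k in sorted(lengths.items(), key=lambda item: item[1]):
        # Split every block at the old depth into 2^(k - depth) sub-blocks.
        next_block *= 2 ** (k - depth)
        depth = k
        # The block index written with k binary digits names the tree node.
        code[symbol] = format(next_block, "b").zfill(k)
        next_block += 1
    return code

def is_proper(code):
    """Check that no codeword is a prefix of a different codeword."""
    words = list(code.values())
    return not any(
        i != j and b.startswith(a)
        for i, a in enumerate(words)
        for j, b in enumerate(words)
    )

lengths = {"a": 1, "b": 2, "c": 3, "d": 3}
code = greedy_proper_code(lengths)
print(code)             # {'a': '0', 'b': '10', 'c': '110', 'd': '111'}
print(is_proper(code))  # True
```

<p>Asking for three codewords of length 1 raises an error in this sketch, which matches the other direction of the lemma: no proper code with those lengths exists.</p>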
<p>Now, using this we show how to find a code with <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cmathop%7B%5Cmathbb%20E%7D%5Bk%28%5Comega%29%5D%20%5Cleq%20H%20%2B%201%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\mathop{\mathbb E}[k(\omega)] \leq H + 1}' title='{\mathop{\mathbb E}[k(\omega)] \leq H + 1}' class='latex' />. For each <img src='http://s.wordpress.com/latex.php?latex=%7B%5Comega%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\omega}' title='{\omega}' class='latex' />, since <img src='http://s.wordpress.com/latex.php?latex=%7Bp%28%5Comega%29%20%5Cin%20%280%2C1%5D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{p(\omega) \in (0,1]}' title='{p(\omega) \in (0,1]}' class='latex' />, we can always find an integer <img src='http://s.wordpress.com/latex.php?latex=%7Bk%28%5Comega%29%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{k(\omega)}' title='{k(\omega)}' class='latex' /> such that <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cfrac%7B1%7D%7B2%7D%20p%28%5Comega%29%20%5Cleq%202%5E%7B-k%28%5Comega%29%7D%20%5Cleq%20p%28%5Comega%29%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\frac{1}{2} p(\omega) \leq 2^{-k(\omega)} \leq p(\omega)}' title='{\frac{1}{2} p(\omega) \leq 2^{-k(\omega)} \leq p(\omega)}' class='latex' />. Now, clearly:</p>
<img src='http://s.wordpress.com/latex.php?latex=%5Cdisplaystyle%20%5Csum_%5Comega%202%5E%7B-k%28%5Comega%29%7D%20%5Cleq%20%5Csum_%5Comega%20p%28%5Comega%29%20%3D%201&#038;bg=T&#038;fg=000000&#038;s=0' alt='\displaystyle \sum_\omega 2^{-k(\omega)} \leq \sum_\omega p(\omega) = 1' title='\displaystyle \sum_\omega 2^{-k(\omega)} \leq \sum_\omega p(\omega) = 1' class='latex' />
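<p>so, by Lemma 2, a proper code with these lengths exists. Moreover, taking base-2 logarithms in the left inequality shows that each length is at most one minus the base-2 logarithm of the corresponding probability, and therefore:</p>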
<img src='http://s.wordpress.com/latex.php?latex=%5Cdisplaystyle%20%5Cmathop%7B%5Cmathbb%20E%7D%5Bk%28%5Comega%29%5D%20%3D%20%5Csum_%5Comega%20k%28%5Comega%29%20p%28%5Comega%29%20%5Cleq%20%5Csum_%5Comega%20%5B1%20-%20%5Clog_2%20p%28%5Comega%29%5D%20p%28%5Comega%29%20%3D%20H%20%2B%201&#038;bg=T&#038;fg=000000&#038;s=0' alt='\displaystyle \mathop{\mathbb E}[k(\omega)] = \sum_\omega k(\omega) p(\omega) \leq \sum_\omega [1 - \log_2 p(\omega)] p(\omega) = H + 1' title='\displaystyle \mathop{\mathbb E}[k(\omega)] = \sum_\omega k(\omega) p(\omega) \leq \sum_\omega [1 - \log_2 p(\omega)] p(\omega) = H + 1' class='latex' />
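<p>As a quick numerical check of the two displays above, the following Python sketch picks, for each symbol, the smallest integer length whose power of 1/2 does not exceed the symbol's probability, and then verifies the inequality of Lemma 2 and the bound on the average length. The distribution is a hypothetical example, and the function names are mine.</p>

```python
import math

def shannon_lengths(p):
    """Smallest integer k with 2^(-k) at most p(w), for each symbol w."""
    return {w: math.ceil(-math.log2(q)) for w, q in p.items()}

def entropy(p):
    """Shannon entropy in bits."""
    return -sum(q * math.log2(q) for q in p.values())

# Hypothetical distribution whose probabilities are not powers of 1/2.
p = {"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}

k = shannon_lengths(p)
kraft = sum(2.0 ** -k[w] for w in p)  # left-hand side of Lemma 2
avg = sum(p[w] * k[w] for w in p)     # expected code length E[k]
H = entropy(p)

print(k)       # {'a': 2, 'b': 2, 'c': 3, 'd': 4}
print(kraft)   # 0.6875
print(H, avg)  # avg lies between H and H + 1
```

<p>The sum in Lemma 2 is strictly below 1 here, so this particular choice of lengths wastes part of the tree; the theorem only claims that the average length stays within one bit of the entropy.</p>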
<p>Cool, but how do we bring it down to <img src='http://s.wordpress.com/latex.php?latex=%7BH%20%2B%20%5Cepsilon%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{H + \epsilon}' title='{H + \epsilon}' class='latex' />? The idea is to code several symbols at a time as one block (even when the symbols are independent, so we are not exploiting any correlation between them). Consider <img src='http://s.wordpress.com/latex.php?latex=%7B%5COmega%5Ek%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\Omega^k}' title='{\Omega^k}' class='latex' /> and the probability function induced on it, i.e.:</p>
<img src='http://s.wordpress.com/latex.php?latex=%5Cdisplaystyle%20p_k%20%28%5Comega_1%2C%20%5Chdots%2C%20%5Comega_k%29%20%3D%20%5Cprod_%7Bi%3D1%7D%5Ek%20p%28%5Comega_i%29&#038;bg=T&#038;fg=000000&#038;s=0' alt='\displaystyle p_k (\omega_1, \hdots, \omega_k) = \prod_{i=1}^k p(\omega_i)' title='\displaystyle p_k (\omega_1, \hdots, \omega_k) = \prod_{i=1}^k p(\omega_i)' class='latex' />
<p>It is not hard to see that <img src='http://s.wordpress.com/latex.php?latex=%7B%5COmega%5Ek%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\Omega^k}' title='{\Omega^k}' class='latex' /> with <img src='http://s.wordpress.com/latex.php?latex=%7Bp_k%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{p_k}' title='{p_k}' class='latex' /> has entropy <img src='http://s.wordpress.com/latex.php?latex=%7BkH%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{kH}' title='{kH}' class='latex' /> because:</p>
<img src='http://s.wordpress.com/latex.php?latex=%5Cdisplaystyle%20%5Cbegin%7Baligned%7D%20-%5Csum_%7B%5Comega_1%2C%20%5Chdots%2C%20%5Comega_k%7D%20p_k%28%5Comega_1%2C%20%5Chdots%2C%20%5Comega_k%29%20%5Clog_2%20p_k%28%5Comega_1%2C%20%5Chdots%2C%20%5Comega_k%29%20%3D%5C%5C%20%3D%20-%5Csum_%7B%5Comega_1%2C%20%5Chdots%2C%20%5Comega_k%7D%20%5Cprod_i%20p%28%5Comega_i%29%20%5Csum_i%20%5Clog_2%20p%28%5Comega_i%29%20%3D%5C%5C%20%3D%20-%5Csum_i%20%5Csum_%5Comega%20p%28%5Comega%29%20%5Clog_2%20p%28%5Comega%29%20%3D%20kH%20%5Cend%7Baligned%7D%20&#038;bg=T&#038;fg=000000&#038;s=0' alt='\displaystyle \begin{aligned} -\sum_{\omega_1, \hdots, \omega_k} p_k(\omega_1, \hdots, \omega_k) \log_2 p_k(\omega_1, \hdots, \omega_k) =\\ = -\sum_{\omega_1, \hdots, \omega_k} \prod_i p(\omega_i) \sum_i \log_2 p(\omega_i) =\\ = -\sum_i \sum_\omega p(\omega) \log_2 p(\omega) = kH \end{aligned} ' title='\displaystyle \begin{aligned} -\sum_{\omega_1, \hdots, \omega_k} p_k(\omega_1, \hdots, \omega_k) \log_2 p_k(\omega_1, \hdots, \omega_k) =\\ = -\sum_{\omega_1, \hdots, \omega_k} \prod_i p(\omega_i) \sum_i \log_2 p(\omega_i) =\\ = -\sum_i \sum_\omega p(\omega) \log_2 p(\omega) = kH \end{aligned} ' class='latex' />
<p>and then we can just apply Theorem 1, together with Shannon&#8217;s lower bound, to this product distribution: we can find a code that encodes a block of <img src='http://s.wordpress.com/latex.php?latex=%7Bk%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{k}' title='{k}' class='latex' /> symbols <img src='http://s.wordpress.com/latex.php?latex=%7B%5Comega%20%3D%20%28%5Comega_1%2C%20%5Chdots%2C%20%5Comega_k%29%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\omega = (\omega_1, \hdots, \omega_k)}' title='{\omega = (\omega_1, \hdots, \omega_k)}' class='latex' /> using <img src='http://s.wordpress.com/latex.php?latex=%7Bl%28%5Comega%29%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{l(\omega)}' title='{l(\omega)}' class='latex' /> bits such that:</p>
<p><img src='http://s.wordpress.com/latex.php?latex=%7BkH%20%5Cleq%20%5Cmathop%7B%5Cmathbb%20E%7D%5Bl%28%5Comega%29%5D%20%5Cleq%20kH%20%2B%201%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{kH \leq \mathop{\mathbb E}[l(\omega)] \leq kH + 1}' title='{kH \leq \mathop{\mathbb E}[l(\omega)] \leq kH + 1}' class='latex' />. Since each codeword of length <img src='http://s.wordpress.com/latex.php?latex=%7Bl%28%5Comega%29%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{l(\omega)}' title='{l(\omega)}' class='latex' /> encodes <img src='http://s.wordpress.com/latex.php?latex=%7Bk%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{k}' title='{k}' class='latex' /> symbols, we are actually interested in the per-symbol length <img src='http://s.wordpress.com/latex.php?latex=%7B%5Cmathop%7B%5Cmathbb%20E%7D%5Bl%28%5Comega%29%2Fk%5D%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='{\mathop{\mathbb E}[l(\omega)/k]}' title='{\mathop{\mathbb E}[l(\omega)/k]}' class='latex' />, and therefore we get:</p>
<img src='http://s.wordpress.com/latex.php?latex=%5Cdisplaystyle%20H%20%5Cleq%20%5Cmathop%7B%5Cmathbb%20E%7D%5Cleft%5B%5Cfrac%7Bl%28%5Comega%29%7D%7Bk%7D%5Cright%5D%20%5Cleq%20H%20%2B%20%5Cfrac%7B1%7D%7Bk%7D&#038;bg=T&#038;fg=000000&#038;s=0' alt='\displaystyle H \leq \mathop{\mathbb E}\left[\frac{l(\omega)}{k}\right] \leq H + \frac{1}{k}' title='\displaystyle H \leq \mathop{\mathbb E}\left[\frac{l(\omega)}{k}\right] \leq H + \frac{1}{k}' class='latex' />
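<p>Choosing the block size to be at least the reciprocal of epsilon therefore gives the promised bound. The effect is easy to observe numerically. The Python sketch below enumerates all blocks of a given size, assigns each block the length from the construction above, and prints the average number of bits per symbol next to the upper bound. The two-letter distribution is a hypothetical example, and the enumeration is exponential in the block size, so this is only meant for small cases.</p>

```python
import itertools
import math

def entropy(p):
    """Shannon entropy in bits."""
    return -sum(q * math.log2(q) for q in p.values())

def per_symbol_length(p, k):
    """E[l(w)] / k when blocks of k independent symbols get Shannon lengths."""
    total = 0.0
    for block in itertools.product(p, repeat=k):
        # Probability of the block under the product distribution p_k.
        prob = math.prod(p[s] for s in block)
        # Code length chosen so that 2^(-l) is at most prob.
        total += prob * math.ceil(-math.log2(prob))
    return total / k

# Hypothetical skewed two-letter alphabet.
p = {"a": 0.9, "b": 0.1}
H = entropy(p)

for k in range(1, 7):
    # Columns: block size, bits per symbol, upper bound H + 1/k.
    print(k, per_symbol_length(p, k), H + 1 / k)
```

<p>For a skewed distribution like this one, coding single symbols costs far more than the entropy, because every codeword needs at least one bit; grouping symbols into blocks is what lets the per-symbol cost drop below one bit.</p>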
]]></content:encoded>
			<wfw:commentRss>http://www.bigredbits.com/archives/145/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
