Erlang or Stackless Python

May 31st, 2010 hussam No comments

I recently wrote a simple, event-driven, simulation framework to allow me to quickly prototype and test distributed protocols. Unfortunately it did not scale in the number of messages simulated. So I am planning to re-implement it.

To keep with the spirit of rapid prototyping, I want to use a dynamically typed language. My final two candidates are Erlang and Stackless Python (or rather PyPy). There certainly is a lot of buzz around Erlang these days: My advisor is very enthusiastic about it, and after reading through the tutorial and dummy protocols, I can see why. On the other hand, Stackless Python has the familiar syntax and the huge library of modules.

I read through many posts comparing the two and I finally decided to stick with Python for now. I will code up the framework (hopefully this week) and report back on my findings. However, I am personally still interested in coding something in Erlang, so who knows :-)

Categories: Systems Tags:

Probability Puzzles

February 16th, 2010 renatoppl No comments

Today in a dinner with Thanh, Hu and Joel I heard about a paradox I haven’t heard so far. Probability is full of cute problems that challenge our understanding of the basic concepts. The most famous of them is the Monty Hall Problem, which asks:

You are on a TV game show and there are {3} doors – one of them contains a prize, say a car and the other two door contain things you don’t care about, say goats. You choose a door. Then the TV host, who knows where the prize is, opens one door you haven’t chosen and that he knows has a goat. Then he asks if you want to stick to the door you have chosen or if you want to change to the other door. What should you do?

Probably you’ve already came across this question in some moment of your life and the answer is that changing doors would double your probability of getting the price. There are several ways of convincing your intuitions:

  • Do the math: when you chose the door, there were three options so the prize is in the door you chose with {\frac{1}{3}} probability and in the other door with probability {\frac{2}{3}} (note that the presenter can always open some door with a goat, so conditioning on that event doesn’t give you any new information).
  • Do the actual experiment (computationally) as done here. One can always ask a friend to help, get some goats and perform the actual experiment.
  • To convince yourself that “it doesn’t matter” is not correct, think {100} doors. You choose one and the TV host open {98} of them and asks if you want to change or stick with your first choice. Wouldn’t you change?

I’ve seen TV shows where this happened and I acknowledge that other things may be involved: there might be behavioral and psychologic issues associated with the Monty Hall problem – and possibly those would interest Dan Ariely, whose book I began reading today – and looks quite fun. But the problem they told me about today in dinner was another: the envelope problem:

There are two envelopes and you are told that in one of them there is twice the amount that there is in the other. You choose one of the envelopes at random and open it: it contains {100} bucks. Now, you don’t know if the other envelope has {50} bucks or {200} bucks. Then someone asks you if you wanted to pay {10} bucks and change to the other envelope. Should you change?

Now, consider two different solutions to this problem: the first is fallacious and the second is correct:

  1. If I don’t change, I get {100} bucks, if I change I pay a penalty of {10} and I get either {50} or {200} with equal probability, so my expected prize if I change is {\frac{200+50}{2}-10 = 115 > 100}, so I should change.
  2. I know there is one envelope with {x} and one with {2x}, then my expected prize if I don’t change is {\frac{x + 2x}{2} = \frac{3}{2}x}. If I change, my expected prize is {\frac{x + 2x}{2} - 10 < \frac{3}{2}x}, so I should not change.

The fallacy in the first argument is perceiving a probability distribution where there is no one. Either the other envelope contains {50} bucks or it contains {200} bucks – we just don’t know, but there is no probability distribution there – it is a deterministic choice by the game designer. Most of those paradoxes are a result of either an ill-defined probability space, as Bertrand’s Paradox or a wrong comprehension of the probability space, as in Monty Hall or in several paradoxes exploring the same idea as: Three Prisioners, Sleeping Beauty, Boy or Girl Paradox, …

73a_humpty-dumpty

There was very recently a thrilling discussion about a variant on the envelope paradox in the xkcd blag – which is the blog accompaning that amazing webcomic. There was a recent blog post with a very intriguing problem. A better idea is to go there and read the discussion, but if you are not doing so, let me summarize it here. The problem is:

There are two envelopes containing each of them a distinct real number. You pick one envelope at random, open it and see the number, then you are asked to guess if the number in the other envelope is larger or smaller then the previous one. Can you guess correctly with more than {\frac{1}{2}} probability?

A related problem is: given that you are playing the envelope game and there are number {A} and {B} (with {A < B}). You pick one envelope at random and then you are able to look at the content of the first envelope you open and then decide to switch or not. Is there a strategy that gives you expected earnings greater than {\frac{A+B}{2}} ?

The very unexpected answers is yes !!! The strategy that Randall presents in the blog and there is a link to the source here is: let {X} be a random variable on {{\mathbb R}} such that for each {a<b} we have {P(a < X < b) > 0}, for example, the normal distribution or the logistic distribution.

Sample {X} then open the envelope and find a number {S} now, if {X < S} say the other number is lower and if {X > S} say the other number is higher. You get it right with probability

\displaystyle  P(\text{picked }A) P(X > A) + P(\text{picked }B) P(X < B) = \frac{1}{2} (1 + P(A < X < B))

which is impressive. If you follow your guess, your expected earning {Y} is:

\displaystyle \begin{aligned} &P(\text{picked }A) \mathop{\mathbb E}[Y \vert \text{picked }A] + P(\text{picked }B) \mathop{\mathbb E}[Y \vert \text{picked }B] = \\ & = \frac{1}{2} [P(X<A) A + P(X>A) B] + \frac{1}{2} [P(X<B) B + P(X>B) A] \\ &= \frac{1}{2}[A [P(X<A) + P(X>B)] + B [P(X>A) + P(X<B)]] > \frac{A+B}{2} \\ \end{aligned}

The xkcd pointed to this cool archive of puzzles and riddles. I was also told that the xkcd puzzle forum is also a source of excellent puzzles, as this:

You are the most eligible bachelor in the kingdom, and as such the King has invited you to his castle so that you may choose one of his three daughters to marry. The eldest princess is honest and always tells the truth. The youngest princess is dishonest and always lies. The middle princess is mischievous and tells the truth sometimes and lies the rest of the time. As you will be forever married to one of the princesses, you want to marry the eldest (truth-teller) or the youngest (liar) because at least you know where you stand with them. The problem is that you cannot tell which sister is which just by their appearance, and the King will only grant you ONE yes or no question which you may only address to ONE of the sisters. What yes or no question can you ask which will ensure you do not marry the middle sister?

copied from here.

Categories: puzzles Tags: ,

Walrasian Equilibrium I

February 15th, 2010 renatoppl No comments

Currently I’ve been trying to understand more about the dynamics of markets and basic concepts of microeconomic theory and, as always, writing a blog post will help me to keep my ideas clear. First, why are markets interesting from a computer scientist/mathematician point of view?

  • Markets are multi-objective optimization problems: one can think of the possible state of a market some point in a space of possible {x = (x_1, \hdots, x_N) \in \Omega}&fg=000000. Each player of the market controls one variable, say {x_i}&fg=000000 and is interested in maximizing one objective function {f_i(x)}&fg=000000. So, player {i}&fg=000000 is trying to set {x_i = \text{argmax}_{x_i} f_i(x_i, x_{-i})}&fg=000000.
  • Markets are a computational model: one can think of a market as a way of performing a certain computation – as extracting some kind of information, as a prediction market, stock exchanges, … If we think of it as a computational device, we are asking the same questions: given those preferences which are implicit functions to each of the agents, calculate “fair” prices of items.
  • Markets are distributed systems where each part of the system has a selfish interest.

A market is composed by a set {L}&fg=000000 of commodities, {I}&fg=000000 of consumers and {J}&fg=000000 of producers. Now, we describe how to characterize each of them:

  • Each consumer is defined by a set {X_i \subseteq {\mathbb R}^L}&fg=000000 of commodities combinations he is interested (typically we take {X_i = {\mathbb R}^L_+}&fg=000000) and an utility function {u_i : X_i \rightarrow {\mathbb R}}&fg=000000 expressing his interest for this bundle of commodities. Consumer {i}&fg=000000 will try to maximize {u_i(\cdot)}&fg=000000 in a further restricted {X_i}&fg=000000.
  • Each producer is define by a set {Y_j \subset {\mathbb R}^L}&fg=000000 it has the capacity to produce.
  • Endowments: Each consumer comes to the market with an initial endowment {\omega_i \in {\mathbb R}^L}&fg=000000, so for {j \in L}&fg=000000, {\omega_{ij}}&fg=000000 is the amount of commodity {j}&fg=000000 that consumer {i}&fg=000000 originally has. The initial total endowment of the market is given by {\overline{\omega} = \sum_{i \in I} \omega_i}&fg=000000, which is a vector indicating how much of each commodity originally exists in the market.
  • Shares: consumers have shares in the companies, so for {i \in I}&fg=000000, {j \in J}&fg=000000, consumer {i}&fg=000000 has {\theta_{ij}}&fg=000000 shares of company {j}&fg=000000, such that {\sum_i \theta_{ij} = 1}&fg=000000.

Something very crucial is missing in this picture: a way to compare commodities and something that makes exchanges possible: the answer to that is to attribute prices to the items. How to attribute prices to the items so that the market works fine? A price vector is a vector {p \in {\mathbb R}^L_+}&fg=000000. Consider the following scenario after prices {p_\ell, \ell \in L}&fg=000000 are established to commodities:

  • by producing {y_j \in Y_j}&fg=000000, company {j \in J}&fg=000000 gets profit {p \cdot y_j}&fg=000000, so each company will try to maximize its profit producing {y_j^* \in \text{argmax}_{y_j \in Y_j} p \cdot y_j}&fg=000000.
  • each consumer sells its initial endowment and gets the profit respective to the companies he owns. So, consumer {i}&fg=000000 gets {w_i = p \cdot \omega_i + \sum_{j \in J} \theta_{ij} p \cdot y_j^*}&fg=000000.
  • now consumer {i}&fg=000000 uses the money he has to buy the best bundle he can afford, which is {x_i^* = \text{argmax} \{u_i(x_i); x_i \in X_i, p \cdot x_i \leq w_i\}}&fg=000000.

The amount of commodities in the market must conserve, so that is possible only if we get:

\displaystyle \sum_{i \in I} x_i^* = \sum_{i \in I} \omega_i + \sum_{j \in J} y_j^*&fg=000000

First, it is not clear if such a price vector exists. If it exists, is it unique? If this is an equilibrium, is it the best thing for the consumers? How those prices can be set in practice without a centralized authority? Can people lie? Below, let’s collect a couple of questions I’ll try to answer (yes, no or unknown) in this and the following posts.

Question 1: Does a price vector {p \geq 0}&fg=000000 always exist that generates an equilibrium?

Question 2: If it exists, is it unique?

Question 3: Can we describe an efficent method to find {p}&fg=000000 ?

Question 4: Is it the best thing for the consumers in the following sense: if {(x^*,y^*,p)}&fg=000000 is an equilibrium, are there feasible {(x,y)}&fg=000000 such that {u_i(x_i) \geq u_i(x_i^*)}&fg=000000 and for at least one consumer {u_i(x_i) > u_i(x_i^*)}&fg=000000? (This is called Pareto improvement)

Question 5: A central authority could use the knowledge about functions {u_i}&fg=000000 and endowments {\omega_i}&fg=000000 to calculate the price vector {p}&fg=000000 using some method. Can consumers {i}&fg=000000 be better off by lieing about their utility and endowments?

Question 6: How prices get defined without a central authority? Is there a dynamic/game-theoretical model to that?

For simplicity, let’s think of Exchange Economies, which are economies with no producers. Let’s define it formally:

Definition 1 An exchange economy is composed by a set of commodities {L}&fg=000000 and a set of consumers {I}&fg=000000 each with an utility {u_i : X_i \rightarrow {\mathbb R}}&fg=000000 and an initial endowment {\omega_i \in {\mathbb R}^L_+}&fg=000000.

Definition 2 A price vector {p \in {\mathbb R}^L_+}&fg=000000 is a Walrasian equilibrium for an exchange economy {(L, I, \{X_i, u_i, \omega_i\}_{i \in I})}&fg=000000 if there is {x_i^* \in X_i}&fg=000000 such that:

  1. {x_i^*}&fg=000000 s.t. {u_i(x_i^*) = \max \{u_i(x_i); x_i \in X_i, p \cdot x_i \leq p \cdot \omega_i\}}&fg=000000
  2. {\sum_i x_i^* \leq \sum_i \omega_i}&fg=000000
  3. {p \cdot (\sum_i x_i^* - \omega_i)= 0}&fg=000000

The first condition says that each consumer is maximizing his utility given his prices, the second says that we can’t buy more commodities than what is available in the market and the third, called Walras’ Law, says that if there is surplus of a certain product, it should have price zero. It is by far the most unnatural of those, but it can be easily justifiable in some circumnstances: suppose we say that utilities are non-satiated if for each {x_i \in X_i}&fg=000000 and {\epsilon > 0}&fg=000000, there is {x'_i \in X_i}&fg=000000, {\Vert x_i - x'_i \Vert < \epsilon}&fg=000000 such that {u_i(x'_i) > u_i(x_i)}&fg=000000. If {u_i}&fg=000000 are differentiable, that would mean {\nabla u_i \neq 0}&fg=000000, for example a linear function {u_i(x_i) = \sum_\ell u_{i\ell} x_{i\ell}}&fg=000000 with some {u_{i\ell} > 0}&fg=000000. In that case, {\sum_i (p \cdot x_i^* - p \cdot \omega_i) < 0}&fg=000000 and some player has money surplus and therefore he could increase his utility.

Now, we define for each price vector {p}&fg=000000 the excess demand function {z_i(p) = x_i(p) - \omega_i}&fg=000000 and {z(p) = \sum_i z_i(p)}&fg=000000. Now, under non-satiated utilities, by the last argument, we have that {p}&fg=000000 is an equilibrium vector iff {z(p) \leq 0}&fg=000000. Actually, if {u_i}&fg=000000 are also strong monotone, i.e., {u_i(x_i + \xi) \geq u_i(x_i)}&fg=000000 for each {\xi \geq 0}&fg=000000, then it becomes: {p}&fg=000000 is an equilibrium iff {z(p) = 0}&fg=000000, which means that the market clears:

\displaystyle \sum_i x_i^* = \sum_i \omega_i&fg=000000

The question that is easier to answer is Question 4 and it is sometimes refered as the First Fundamental Theorem of Welfare Economics:

Theorem 3 Given non-satiated preferences, each equilibrium {(x,p)}&fg=000000 is Pareto, i.e. there is no other feasible allocation {x'}&fg=000000 such that for all {i}&fg=000000, {u_i(x'_i) \geq u_i(x_i)}&fg=000000 with the inequality strict for at least one component.

Proof: Suppose there were, since {u_i(x'_i) \geq u_i(x_i)}&fg=000000 then {p \cdot x'_i \geq p \cdot \omega_i}&fg=000000, because if {p \cdot x'_i < p \cdot \omega_i}&fg=000000 then we could improve the utility of {x'_i}&fg=000000 still within the budget, contradicting the optimality of {u_i(x_i)}&fg=000000 for that budget. And clearly {u_i(x'_i) > u_i(x_i)}&fg=000000 implies {p \cdot x'_i > p \cdot \omega_i}&fg=000000.

Summing over {i}&fg=000000, we get {\sum_i p x'_i > \sum_i p \omega_i}&fg=000000, what is a contradiction, because since {x'}&fg=000000 is feasible, {\sum_i x'_i \leq \sum_i \omega_i}&fg=000000 and therefore {\sum_i p \cdot x'_i \leq \sum_i p \cdot \omega_i}&fg=000000. \Box&fg=000000

Now, let’s tackle Question 1. We assume linearly of utility: {u_i(x_i) = \sum_{\ell \in L} u_{i \ell} x_{i \ell}}&fg=000000 for {u_{i \ell} > 0}&fg=000000. This gives us strong monotonicity and local nonsatiated preferences.

Theorem 4 Under linear utilities, there is always an equilibrium price vector {p}&fg=000000.

Consider the function {z}&fg=000000 defined above: {z(p) = \sum_i (x_i(p) - \omega_i)}&fg=000000 where {x_i(p)}&fg=000000 is the bundle of best possible utility. Now, since we are using linear utilities we can’t guarantee there will be only one such bundle, so instead of considering a function, consider {x_i}&fg=000000 and {z}&fg=000000 as being correspondences: {x_i, z: {\mathbb R}^L_+ \rightarrow \mathcal{P}({\mathbb R}^L)}&fg=000000, i.e., {x_i(p) \subseteq {\mathbb R}^L_+}&fg=000000 is the set of all allocations that maximize {u_i(x_i)}&fg=000000 subject to {p\cdot x_i \leq p \cdot \omega_i}&fg=000000. Since {u_i}&fg=000000 are linear functionals, we can calculate {x_i(p)}&fg=000000 by a Fractional Knapsack algorithm: we sort commodities {\ell \in L}&fg=000000 by {\frac{p_\ell}{u_{i\ell}}}&fg=000000 and start buying in the cost-benefit order (the ones that provide more utility per buck spent). Most of the time there will be just one solution, but in points where {\frac{p_\ell}{u_{i\ell}} = \frac{p_k}{u_{i k}}}&fg=000000, then {x_i(p)}&fg=000000 might be a convex region. This correpondence is upper hemicontinuous, which is the correspondence analogue to continuity for functions. As Wikipedia defines:

Definition 5 A correspondence {\Gamma : A \rightarrow B}&fg=000000 is said to be upper hemicontinuous at the point {a \in A}&fg=000000 if for any open neighbourhood {V}&fg=000000 of {\Gamma(a)}&fg=000000 there exists a neighbourhood {U}&fg=000000 of a such that {\Gamma(x)}&fg=000000 is a subset of {V}&fg=000000 for all {x}&fg=000000 in {U}&fg=000000.

It is not hard to see that {z}&fg=000000 is upper hemicontinuous according to that definition. Our goal is to prove that there is one price vector {p}&fg=000000 for which {\overline{\omega} \in x_i(p)}&fg=000000 or: {0 \in z(p)}&fg=000000. To prove that we use Kakutani’s Fixed Point Theorem. Before we go into that, we’ll explore some other properties of {z}&fg=000000:

  • 0-Homogeneous: {z(\alpha p) = z(p), \forall \alpha > 0}&fg=000000
  • Walras’ Law: {p \cdot z(p) = 0}&fg=000000. For any {\sum_i (x_i - \omega_i) \in z(p)}&fg=000000 we know {\sum_i (p \cdot x_i - p \cdot \omega_i) \leq 0}&fg=000000 by the definition of {x_i}&fg=000000. So, if it not zero, some {i}&fg=000000 has money surplus what is absurd given that preferences are strongly monotone.
  • Bounded: {z(p)}&fg=000000 is bounded from below, i.e., {z_\ell(p) \geq -s, \forall p, \ell}&fg=000000 for some {s > 0}&fg=000000. Simply take {s = \max \omega_i}&fg=000000
  • Boundary behavior: if {p^k \rightarrow p \neq 0}&fg=000000 with {p_\ell = 0}&fg=000000, then {z_\ell(p^k) \rightarrow \infty}&fg=000000. That is clear from the fractional knapsack algorithm when one desirable item gets price zero.

Now, we are in shape for applying Kakutani’s Fixed Point Theorem:

Theorem 6 (Kakutani, 1941) If {f:A \rightarrow \mathcal{P}(A)}&fg=000000 is an upper hemicontinuous correspondence such that {f(a)}&fg=000000 is a convex non-empty set for all {a \in A}&fg=000000 then {f}&fg=000000 has a fixed point, i.e., {\exists x \in A}&fg=000000 s.t. {x \in f(x)}&fg=000000.

Since prices are {0}&fg=000000-homogeneous, consider the simplex {\Delta = \{p \geq 0; \sum_\ell p_\ell = 1\}}&fg=000000, its relative interior {\Delta^0 = \{p > 0; \sum_\ell p_\ell = 1\}}&fg=000000 and the boundary {\partial \Delta = \Delta \setminus \Delta^0}&fg=000000. Now we define the following price correcting correspondence {f:\Delta \rightarrow \mathcal{P}(\Delta)}&fg=000000.

If some price {p \in \Delta^0}&fg=000000 is set, it generates demand {d \in z(p)}&fg=000000. For that demand, the price that would maximize profit would be {q}&fg=000000, i.e. {q\cdot d \geq q' \cdot d}&fg=000000 for all {q' \in \Delta}&fg=000000. It is natural to re-adjust the prices to {q}&fg=000000. So we define for {p \in \Delta^0}&fg=000000:

\displaystyle f(p) = \{q \in \Delta; q \text{ is a best response price to some } d \in z(p)\}&fg=000000

and for {p \in \partial \Delta}&fg=000000:

\displaystyle f(p) = \{q \in \Delta; q \cdot p = 0\}&fg=000000

Now, I claim that this correspondence satisfies the conditions in Kakutani’s Theorem. We skip a formal proof of this fact, but this is intuitive for the interior {\Delta^0}&fg=000000 – let’s give the intuition why this is true as we approach the boundary: if {\Delta^0 \ni p^k \rightarrow p \in \partial \Delta}&fg=000000, then {p^k_\ell \rightarrow 0}&fg=000000, therefore the demans explodes: {z_\ell(p^k) \rightarrow \infty}&fg=000000 and as a result the best thing to do is to set the prices of those commodities much higher than the rest. Therefore, the price of the commodities whose demand explode are positive while the prices of the commodities where the price doesn’t get value zero.

Now, after waiving our hands about the upper continuity of {f}&fg=000000, we have by Kakutani’s Theorem a point {p \in \Delta}&fg=000000 such that {p \in f(p)}&fg=000000. By the definition of {f}&fg=000000 we must have {p \in \Delta^0}&fg=000000 (because for {p \in \partial \Delta}&fg=000000, {p \notin f(p)}&fg=000000. Now, I claim {z(p) = 0}&fg=000000. In fact if {z(p) \neq 0}&fg=000000, still {p z(p) = 0}&fg=000000 by Walras’ Law. So, if {z(p) \neq 0}&fg=000000 then there is {\ell}&fg=000000 with {z_\ell (p) < 0}&fg=000000 and therefore {q_\ell = 0}&fg=000000 for all {q \in f(p)}&fg=000000, and {f(p) \subseteq \partial \Delta}&fg=000000. For this reason {z(p) = 0}&fg=000000.

In the next blog post (or serie of blog posts, let’s see) we discuss issues related to the other questions: uniqueness, dynamics, game-theoretical considerations, …