Puzzle #1: Given $n$ and vectors of in-degrees $d^-_i$ and out-degrees $d^+_i$ for $i = 1, \dots, n$, decide if there is a simple directed graph on $n$ nodes with those in- and out-degrees (the interesting question is doing it faster than by a max-flow computation). By a simple directed graph, I mean at most one edge between each ordered pair $(i, j)$, allowing self-loops $(i, i)$.
Solving this problem in polynomial time is easy using a max-flow computation – simply consider a bipartite graph with left nodes $o_1, \dots, o_n$ and right nodes $t_1, \dots, t_n$, and an edge between each pair $(o_i, t_j)$ with capacity $1$. Add a source connected to each node $o_i$ on the left side with capacity $d^+_i$, add a sink connected from each $t_j$ with capacity $d^-_j$ in the natural way, compute the max-flow and check if it is $\sum_i d^+_i$. But it turns out we can do it in a much more efficient way. The solution I thought of is faster than the flow computation, but maybe there is a linear solution out there. If you know of one, I am curious.
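To make the flow model above concrete, here is a sketch in Python. It is the slow reduction, not the faster algorithm the puzzle asks for; the function name, node layout, and the plain Edmonds-Karp max-flow are my own choices.

```python
from collections import deque

def feasible_degree_sequence(din, dout):
    """Does a simple digraph (self-loops ok, no parallel edges) with these
    in/out-degree sequences exist? Checked via the bipartite flow model."""
    n = len(din)
    if sum(din) != sum(dout):
        return False
    # nodes: 0 = source, 1..n = out-copies, n+1..2n = in-copies, 2n+1 = sink
    V = 2 * n + 2
    S, T = 0, V - 1
    cap = [[0] * V for _ in range(V)]
    for i in range(n):
        cap[S][1 + i] = dout[i]          # source -> out-copy i
        cap[n + 1 + i][T] = din[i]       # in-copy i -> sink
        for j in range(n):
            cap[1 + i][n + 1 + j] = 1    # at most one edge i -> j
    flow = 0
    while True:
        # BFS for a shortest augmenting path (Edmonds-Karp)
        parent = [-1] * V
        parent[S] = S
        q = deque([S])
        while q and parent[T] == -1:
            u = q.popleft()
            for v in range(V):
                if parent[v] == -1 and cap[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if parent[T] == -1:
            break
        # find the bottleneck along the path and push flow
        aug, v = float('inf'), T
        while v != S:
            aug = min(aug, cap[parent[v]][v])
            v = parent[v]
        v = T
        while v != S:
            cap[parent[v]][v] -= aug
            cap[v][parent[v]] += aug
            v = parent[v]
        flow += aug
    return flow == sum(dout)

print(feasible_degree_sequence([1, 1], [1, 1]))  # True: a 2-cycle works
print(feasible_degree_sequence([2, 0], [0, 2]))  # False: would need a double edge
```

The feasibility condition is simply that the flow saturates every source edge.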
The second puzzle was given to me by Hyung-Chan An:
Puzzle #2: There is a grid formed by rigid bars. Some cells of the grid have a rigid bar in the diagonal, making that whole square rigid. The question is to decide, given the grid and the locations of the diagonal bars, if the entire structure is rigid or not. By rigid I mean not being able to be deformed.
We thought right away of a linear algebraic formulation: look at each node and create a variable for each of the 4 angles around it. Now, write linear equations saying that the angles around one node sum to 360, equations saying that some angles must be 90 (because they are in a rigid cell), and, for the angles internal to each square, equations saying that opposite angles must be equal (since all the edges are of equal length). Then you have a linear system $Ax = b$ where $x$ is the vector of variables (angles). Now, we need to check if this system admits more than one solution. We know a trivial solution to it, in which every variable is 90. So, we just need to check if the matrix has full rank.
It turns out this problem has a much more beautiful and elegant solution, and it is totally combinatorial – it is based on verifying that a certain bipartite graph is connected. You can read more about this solution in “Bracing rectangular frameworks. I” by Bolker and Crapo (1979). A cute idea is to use the following more general linear system (which works for rigidity in any number of dimensions). Consider a rigid bar from point $p_i$ to point $p_j$. If the structure is not rigid, then there is a movement it can make: let $v_i$ and $v_j$ be the instantaneous velocities of points $p_i$ and $p_j$. Since the bar is rigid, any movement must preserve $\|p_i - p_j\|^2$, so taking derivatives we have $(p_i - p_j) \cdot (v_i - v_j) = 0$.
This is a linear system in the velocities. Now, our job is to check if there are non-zero velocities beyond the trivial rigid motions, which again is to check whether the matrix of the linear system is full-rank or not. An interesting thing is that if we look at this question for the grid above, this matrix will be the matrix of a combinatorial problem! So we can simply check if it has full rank by solving the combinatorial problem. Look at the paper for more details.
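The rank test itself is easy to try out. Below is a small sketch of the rigidity-matrix check for a single square with and without a diagonal brace; the function names and the exact-arithmetic rank routine are my own, and the rank threshold $2n - 3$ accounts for the three trivial planar motions (two translations plus rotation).

```python
from fractions import Fraction

def matrix_rank(rows):
    # exact Gaussian elimination over the rationals
    m = [[Fraction(x) for x in row] for row in rows]
    rank, ncols = 0, len(rows[0])
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                f = m[r][col] / m[rank][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

def is_rigid(points, bars):
    # One row per bar (a, b) encoding (p_a - p_b) . (v_a - v_b) = 0.
    # In the plane the framework is infinitesimally rigid iff rank = 2n - 3.
    n = len(points)
    rows = []
    for a, b in bars:
        row = [0] * (2 * n)
        dx = points[a][0] - points[b][0]
        dy = points[a][1] - points[b][1]
        row[2 * a], row[2 * a + 1] = dx, dy
        row[2 * b], row[2 * b + 1] = -dx, -dy
        rows.append(row)
    return matrix_rank(rows) == 2 * n - 3

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
sides = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(is_rigid(square, sides))             # False: a bare square shears
print(is_rigid(square, sides + [(0, 2)]))  # True: the braced square is rigid
```

The same routine extends to the full grid by listing all horizontal, vertical, and diagonal bars.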
The third puzzle I found in the amazing website called The Puzzle Toad, which is CMU’s puzzle website:
Puzzle #3: There is a game played between Arthur and Merlin. There is a round table with $n$ lamps arranged in a circle, initially some on and some off. In each timestep, Arthur writes down the positions of the lamps that are off. Then Merlin (in an adversarial way) rotates the table. Then Arthur’s servant goes and flips (on –> off, off –> on) the lamps whose positions Arthur wrote down (notice he won’t be flipping the intended lamps, since Merlin rotated the table: if Arthur wrote lamp 1 and Merlin rotated the table by 3 positions, the servant will actually be flipping lamp 4). The question is: given $n$ and an initial configuration of the table, is there a strategy for Merlin such that Arthur never manages to turn all the lamps on?
See here for a better description and a link to the solution. When $n$ is a power of two, no matter what Merlin does, Arthur always manages to turn on all the lamps eventually, where eventually means within a bounded number of steps. The solution is a very pretty (and simple) algebraic argument. I found this problem really nice.
In fact, the behavioral economics literature is full of examples like this, where the observed data is far from what you would expect to observe if all agents were rational – and those are normally attributed to cognitive biases. I was always a bit suspicious of such arguments: it was never clear if agents were simply not being rational or whether their true objective wasn’t being captured by the model. I always thought the second was a lot more likely.
One of the main problems with the irrationality argument is that it ignores the fact that agents live in a world whose state is not completely observed. In a beautiful paper in Econometrica called “Apparent Overconfidence“, Benoit and Dubra argue that:
“But the simple truism that most people cannot be better than the median does not imply that most people cannot rationally rate themselves above the median.”
The authors show that it is possible to reverse engineer a signaling scheme such that the data is mostly consistent with rational updating. Let me try to give the simple example from their introduction: consider that each driver has one of three skill levels: low, medium or high. However, drivers can’t observe their skill directly; they can only observe some sample of their driving. Let’s say for simplicity that they observe a signal that says whether they caused an accident or not. Assume also that the higher the skill of a driver, the lower his probability of causing an accident, say:
Before observing anything, each driver thinks of himself as having probability $\frac{1}{3}$ of having each skill level. Now, after observing the signal, drivers update their beliefs according to Bayes’ rule, i.e.,
doing the calculations, we have that and for the of the drivers that didn’t suffer an accident, they’ll evaluate , , , so:
and therefore will report high skill. Notice this is totally consistent with rational Bayesian updaters. The main question in the paper is: when is it possible to reverse engineer such a signaling scheme? More formally, let $T$ be a set of types and let $\mu$ be a distribution on the types, which is common knowledge. Now, if we ask agents to report their type, their reports follow some distribution $\nu$. Is there a signaling scheme – a random variable $S$ correlated with the true type – such that $\nu$ is the distribution rational Bayesian updaters would report based on what they observed from $S$? The authors give necessary and sufficient conditions for when this is possible.
—————————–
A note also related to the Lake Wobegon effect: I started reading a very nice book by Duncan Watts called “Everything Is Obvious: *Once You Know the Answer” about the traps of common sense. The discussion is different from the one above, but it also talks about the dangers of applying our usual common sense – which is very useful in our daily life – to scientific results. I highly recommend reading the intro of the book, which is openly available on Amazon. He gives examples of social phenomena where, once you are told about them, you think: “oh yeah, this is obvious”. But if you were told the exact opposite (in fact, he begins the example by telling you the opposite of what is observed in the data), you’d also think “yes, yes, this is obvious” and come up with very natural explanations. His point is that common sense is very useful for explaining data observations, especially observations of social data. On the other hand, it performs very poorly at predicting how the data will look before actually seeing it.
My Favourite Restaurants
Hummus Places
Places to eat/work on shabbat: on Saturday and Friday night most of the things in the city are closed, so it is good to know some places to go:
In the old city
Hotel Bars and Cafes
In Tel Aviv
We can interpret the hazard rate in the following way: think of $T$ as a random variable that indicates the time that a light bulb will take to extinguish. If we are at time $t$ and the light bulb hasn’t extinguished so far, the hazard rate measures the probability it will extinguish in the next instant: $h(t) = \lim_{\epsilon \to 0^+} \frac{1}{\epsilon} P(T \le t + \epsilon \mid T > t) = \frac{f(t)}{1 - F(t)}$.
We say that a distribution has monotone hazard rate (MHR) if $h(t)$ is non-decreasing. This is very natural for light bulbs, for example. Many of the distributions that we are used to are MHR, for example, uniform, exponential and normal. The way that I like to think about MHR distributions is the following: if some distribution has hazard rate $h(t)$, then $\frac{f(t)}{1 - F(t)} = h(t)$. If we define $H(t) = \int_0^t h(s)\,ds$, then $1 - F(t) = e^{-H(t)}$.
From this characterization, it is simple to see that the extremal distributions for this class, i.e. the distributions on the edge between MHR and non-MHR, are the ones with constant hazard rate, which correspond to the exponential distributions, $1 - F(t) = e^{-\lambda t}$. The way I like to think about those distributions is that whenever you are able to prove something about the exponential distribution, you can prove a similar statement about MHR distributions. Consider these three examples:
Example 1: for MHR distributions. This fact is straightforward for the exponential distribution. For the exponential distribution and therefore
but the proof for MHR is equally simple: Let , therefore .
Example 2: Given iid where is MHR and and , then . The proof for the exponential distribution is trivial, and in fact, this is tight for the exponential, the trick is to use the convexity of . We use that in the following way:
Since , we have that . This way, we get:
Example 3: For MHR distributions, there is a simple lemma that relates the virtual value and the real value and this lemma is quite useful in various settings: let , then for , . Again, this is tight for exponential distribution. The proof is quite trivial:
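Before moving on, the hazard-rate picture is easy to check numerically. Here is a quick sketch (function names mine) showing that the exponential has constant hazard while the uniform, which is MHR, has an increasing one:

```python
import math

def hazard(f, F, t):
    # hazard rate h(t) = f(t) / (1 - F(t))
    return f(t) / (1.0 - F(t))

lam = 2.0
exp_f = lambda t: lam * math.exp(-lam * t)   # exponential density
exp_F = lambda t: 1.0 - math.exp(-lam * t)   # exponential cdf

uni_f = lambda t: 1.0   # uniform on [0, 1]
uni_F = lambda t: t

# exponential: constant hazard, the boundary of the MHR class
print([hazard(exp_f, exp_F, t) for t in (0.1, 0.5, 1.0)])
# uniform: hazard 1/(1 - t), strictly increasing
print([hazard(uni_f, uni_F, t) for t in (0.1, 0.5, 0.9)])
```

The uniform hazard blows up near the right endpoint, which matches the intuition that a bulb guaranteed to die by time 1 becomes ever more likely to die soon.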
Now, MHR distributions are a subclass of regular distributions, which are the distributions for which Myerson’s virtual value is a monotone function. I usually find it harder to think about regular distributions than about MHR ones (in fact, I don’t know many examples that are regular but not MHR). Here is one, though, called the equal-revenue distribution: consider $v$ distributed on $[1, \infty)$ with cumulative distribution $F(v) = 1 - \frac{1}{v}$. The interesting thing about this distribution is that posted prices get the same revenue regardless of the price: if we post any price $p \ge 1$, a customer with valuation $v$ buys the item if $v \ge p$, paying $p$, so the expected revenue is $p\,(1 - F(p)) = p \cdot \frac{1}{p} = 1$. I was a bit puzzled by this fact, because of Myerson’s Lemma:
Myerson’s Lemma: If a mechanism sells to a player with value distribution $F$ with probability $x(v)$ when he has value $v$, then the expected revenue is $E[\varphi(v)\,x(v)]$, where $\varphi(v) = v - \frac{1 - F(v)}{f(v)}$ is the virtual value.
And it seemed that the auctioneer was doomed to get zero revenue, since for the equal-revenue distribution $\varphi(v) = v - \frac{1/v}{1/v^2} = v - v = 0$. For example, suppose we fix some price $p$ and sell the item at price $p$ if $v \ge p$. Then it seems that Myerson’s Lemma should go through by a derivation like this (for this special case, although the general proof is quite similar):
but those don’t seem to match, since one side is zero and the other is 1. The mistake we made above is the classic one of computing $\infty - \infty$: we wrote $E[\varphi(v)\,x(v)] = E[v\,x(v)] - E\left[\frac{1 - F(v)}{f(v)}\,x(v)\right]$,
but both terms are infinity! This made me realize that Myerson’s Lemma needs the condition that $E[v] < \infty$, which is quite natural for a distribution over valuations of a good. So, one of the bugs of the equal-revenue distribution is that $E[v] = \infty$. A family that is close to it, but doesn’t suffer from this bug, is $F_\epsilon(v) = 1 - v^{-(1+\epsilon)}$ for $v \ge 1$: for $\epsilon > 0$ we have $E[v] = \frac{1+\epsilon}{\epsilon} < \infty$, and as $\epsilon \to 0$ we recover the equal-revenue distribution.
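The equal-revenue property is fun to verify by simulation. A quick sketch (names and sampling trick mine): if $U$ is uniform on $(0,1)$ then $V = \frac{1}{1-U}$ has exactly the cdf $1 - \frac{1}{v}$, so every posted price should earn expected revenue close to 1.

```python
import random

random.seed(0)

def equal_revenue_sample():
    # If U ~ Uniform(0,1), then V = 1/(1-U) satisfies
    # P(V <= v) = P(U <= 1 - 1/v) = 1 - 1/v for v >= 1.
    return 1.0 / (1.0 - random.random())

def posted_price_revenue(price, trials=200000):
    # sell at `price` whenever the sampled value is at least `price`
    sold = sum(1 for _ in range(trials) if equal_revenue_sample() >= price)
    return price * sold / trials

for p in (1.5, 3.0, 10.0):
    print(posted_price_revenue(p))  # each estimate hovers around 1
```

The empirical revenue is flat in the price, which is exactly the equal-revenue property.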
Bimatrix Game and How to compute equilibrium
Bimatrix games are the simplest class of games studied. Basically, a bimatrix game is a 2-player game with $n$ strategies for player 1 and $m$ strategies for player 2, represented by a pair of $n \times m$ matrices $(A, B)$. Let $x$ be a probability distribution that player 1 assigns to his strategies and $y$ the same for player 2. This way, the players experience utilities $x^t A y$ and $x^t B y$, respectively.
The best understood class of those games is the one where $A + B = 0$, called zero-sum games. For this class, computing a Nash equilibrium is very easy and is given by the famous min-max theorem: player 1 finds $x$ maximizing $\min_j x^t A e_j$, where $e_j$ is the $j$-th unit vector. Similarly, player 2 finds $y$ maximizing $\min_i e_i^t B y$. Then the pair of strategies obtained is a Nash equilibrium – and verifying that is not hard.
When $A + B \neq 0$, the problem gets a lot more complicated. Proving that equilibria exist can be done using various fixed point theorems, such as Brouwer’s or Kakutani’s. There is a very simple exponential time algorithm for finding one, and the key observation is the following: if $(x, y)$ is a Nash equilibrium, then $x_i > 0$ implies $(Ay)_i = \max_k (Ay)_k$, and symmetrically for $y$,
which means that each strategy of player 1 is either outside the support of $x$ or is a best response. Proving that is trivial (if some strategy is in the support and is not a best response, then reducing the probability we play it is an improving deviation). Therefore, if we just guess the support of $x$ and the support of $y$, we just need to find some strategies with those supports satisfying the inequalities above. This can be done using a simple LP. This is clearly not very efficient, since it involves solving exponentially many LPs. A still exponential, but a lot more practical, method is:
Lemke-Howson Algorithm
A good overview of the L-H algorithm can be found in those lecture notes or, in a more detailed version, in the third chapter of the AGT book (a pdf can be found on Tim’s website). Here I’ll present a quick overview. The main idea is to define the best-response polytopes $P$ and $Q$:
The intuition is that a point $(y, v) \in P$ represents the fact that the payoff of player 1 when player 2 plays $y$ is at most $v$ for every pure strategy, i.e., we could rewrite the constraint as $\max_i (Ay)_i \le v$.
Each of the polytopes is cut out by $n + m$ inequalities and one equality. Given a point of $P$, we define its labels as the indices of the tight inequalities in $P$, and similarly for points of $Q$. The theorem in the previous section can be rephrased as:
a pair $(x, y)$ is a Nash equilibrium iff it is fully labeled, i.e., the tight inequalities of $x$ and $y$ together cover all $n + m$ labels.
So we need to look at points in the polytope below that are fully-labeled. And the way that this is done in L-H is quite ingenious – it is a similar idea that is used by the Simplex Method – walking through the vertices of the polytope looking for the desired vertex. In order to do it, let’s define another polytope:
Note that there is a clear correspondence between $P$ and $P'$, given by a projective transformation (rescaling by $1/v$). Notice that the labels are preserved by this transformation and the vertices and edges of the polytopes are mapped almost 1-1 (except for the vertex $\mathbf{0}$). Notice a couple of details:
1. A vertex in $P'$ corresponds to a point with $m$ labels. A vertex in $Q'$ corresponds to a point with $n$ labels.
2. The point $(\mathbf{0}, \mathbf{0})$ corresponds to a fully-labeled point; unfortunately, this is the only fully-labeled point that doesn’t correspond to a Nash equilibrium.
3. By taking an edge of the polytope (which is defined by relaxing one of the labels), we can move between two vertices that are almost fully labeled.
The idea is to consider the set of points that are almost fully-labeled: fix some label $k$ and consider all the vertices whose labels cover everything except possibly $k$. Those points are either $(\mathbf{0}, \mathbf{0})$, a Nash equilibrium, or they have one duplicated label. An almost fully-labeled point that is not a Nash equilibrium or $(\mathbf{0}, \mathbf{0})$ is connected to exactly two other such vertices via edges (which correspond to dropping one of the two copies of the duplicated label and following the corresponding edge).
This gives us a topology on the space of Nash equilibria. It tells us a couple of facts: (i) we can find a Nash equilibrium by starting at $(\mathbf{0}, \mathbf{0})$ and following the L-H path – at the end of the path, we must find a Nash equilibrium; (ii) if the polytope is not degenerate, then there is an odd number of Nash equilibria and they are connected by L-H edges in the following way:
where the blue dots are the Nash equilibria, the white dots are the almost fully-labeled points and the edges are the L-H edges. The number of Nash equilibria is odd by a simple parity argument.
The classic way in which simplex-like methods walk through the vertices of a polytope is by pivoting. The same way we can implement Lemke-Howson. For an explanation on the implementation, please look at the chapter 3 of the AGT book.
One can ask if we could modify the L-H path-following to go through the path faster. Recently, Goldberg, Papadimitriou and Savani proved that finding the Nash equilibrium that L-H outputs is PSPACE-complete. So, in principle, finding this specific equilibrium seems harder than finding any Nash equilibrium, which is PPAD-complete.
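As a small sanity check on the support/best-response characterization used above, here is a toy Nash-equilibrium verifier (all names mine): a pair of mixed strategies is an equilibrium iff every pure strategy played with positive probability is a best response.

```python
def expected_payoffs(A, y):
    # (A y)_i: payoff of each pure strategy of the row player against y
    return [sum(A[i][j] * y[j] for j in range(len(y))) for i in range(len(A))]

def is_nash(A, B, x, y, tol=1e-9):
    # (x, y) is a Nash equilibrium iff every strategy in the support
    # of each player is a best response to the opponent's mix.
    Ay = expected_payoffs(A, y)
    Btx = [sum(B[i][j] * x[i] for i in range(len(x))) for j in range(len(y))]
    best_row, best_col = max(Ay), max(Btx)
    row_ok = all(x[i] < tol or Ay[i] > best_row - tol for i in range(len(x)))
    col_ok = all(y[j] < tol or Btx[j] > best_col - tol for j in range(len(y)))
    return row_ok and col_ok

# matching pennies: zero-sum, unique equilibrium is uniform mixing
A = [[1, -1], [-1, 1]]
B = [[-1, 1], [1, -1]]
print(is_nash(A, B, [0.5, 0.5], [0.5, 0.5]))  # True
print(is_nash(A, B, [1.0, 0.0], [1.0, 0.0]))  # False: pure strategies cycle
```

A brute-force equilibrium finder is then just this check wrapped in a search over supports, which is exactly the exponential algorithm described earlier.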
My current plan is to write some other blog posts on bimatrix games. I want to discuss two things: first, homotopy methods and homotopy fixed-point theorems (which are at the heart of the two papers mentioned above), and second, new techniques for solving zero-sum games where the strategy space is exponential.
So, coming back to our conversation, we were thinking about how to calculate the size of a connected component of the random graph $G(N, p)$. Fix some node $v$ – it doesn’t matter which node, since all nodes are equivalent before we start tossing the random coins. Now, let $X$ be the size of the connected component of node $v$. The question is how to calculate $E[X]$.
Recently I’ve been learning MATLAB (actually, I am learning Octave, but it is mostly the same) and I am very amazed by it, wondering why I haven’t learned it before. It is a programming language that somehow knows exactly how mathematicians think, and the syntax is very intuitive. All the operations you think of performing when doing mathematics are already implemented. Not that you can’t do the same in C++ or Python – in fact, I’ve been doing that all my life – but in Octave things are so simple. So, I thought this was a nice opportunity to play a bit with it.
We can calculate $E[X]$ using a dynamic programming algorithm – well, maybe we can do it more efficiently, but the DP I thought of was the following: let’s calculate $C(n, s)$, the expected size of the connected component of a special node $u$ in a random graph with $n$ nodes, where edges between $u$ and the other nodes appear with probability $1 - (1-p)^s$ and edges between two other nodes appear with probability $p$. What we want to compute is $C(N, 1)$.
What we can do is to use the Principle of Deferred Decisions and toss the coins for the edges between the special node $u$ and the other nodes. With probability $\binom{n-1}{k} q^k (1-q)^{n-1-k}$, where $q = 1 - (1-p)^s$, there are $k$ edges between $u$ and the other nodes, say to nodes $u_1, \dots, u_k$. If we collapse those nodes into $u$, we end up with a graph of $n - k$ nodes, and the answer is $k$ plus the size of the connected component of the collapsed node.
One difference, however, is that the probability that the collapsed node is connected to one of the remaining nodes is the probability that at least one of $u_1, \dots, u_k$ is connected to it, which is $1 - (1-p)^k$ (the edges from $u$ to the non-neighbors have already been revealed as absent). In this way, we can write:

$C(n, s) = \Pr[K = 0] + \sum_{k=1}^{n-1} \Pr[K = k]\,\big(k + C(n - k, k)\big)$

where $K \sim \mathrm{Binomial}(n-1,\, 1-(1-p)^s)$ and the count treats the collapsed node as a single node (so the $k = 0$ term contributes just 1, the node itself). Now, we can calculate $C(N, 1)$ by DP, simply filling an $N \times N$ table. In Octave, we can do it this way:
function component = C(N,p)
  % C_table(n,s): expected component size when n nodes remain and the
  % collapsed super-node stands for s original nodes, so it links to
  % each remaining node with probability 1 - (1-p)^s
  C_table = zeros(N,N);
  for n = 1:N for s = 1:N
    % k = 0 new neighbors: the component is just the super-node itself
    C_table(n,s) = binopdf(0, n-1, 1-((1-p)^s));
    for k = 1:n-1
      % absorb k new neighbors and recurse on the collapsed graph
      C_table(n,s) += binopdf(k, n-1, 1-((1-p)^s)) * (k + C_table(n-k,k));
    end
  end end
  component = C_table(N,1);
endfunction
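For readers without Octave, here is the same DP as a Python sketch (function name mine); for tiny graphs it matches brute-force enumeration over all edge subsets.

```python
from functools import lru_cache
from math import comb

def expected_component_size(N, p):
    """Expected size of the connected component of a fixed node in G(N, p)."""
    @lru_cache(maxsize=None)
    def C(n, s):
        # n nodes remain (counting the collapsed super-node as one of them);
        # the super-node stands for s original nodes, so it connects to each
        # remaining node independently with probability q = 1 - (1-p)^s.
        q = 1.0 - (1.0 - p) ** s

        def pmf(k):  # Binomial(n - 1, q) probability mass
            return comb(n - 1, k) * q**k * (1.0 - q) ** (n - 1 - k)

        total = pmf(0)  # no new neighbors: component is the super-node alone
        for k in range(1, n):
            total += pmf(k) * (k + C(n - k, k))
        return total

    return C(N, 1)

print(expected_component_size(3, 0.5))  # 2.25, matches brute force over all 8 graphs
```

For $N = 3$, $p = \frac{1}{2}$, enumerating the $2^3$ graphs on the three possible edges gives expected component size $\frac{18}{8} = 2.25$, in agreement with the DP.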
And in fact we can call it for a fixed $N$ and varying $p$ and see how the expected component size varies. This allows us, for example, to observe the sharp transition that happens when the giant component forms. The plot we get is:
And then some references:
I first read Aumann’s classic, which is a very beautiful and short paper – but where I got most intuition (and fun) was reading Chapter 2 of Reasoning About Knowledge. So, I’ll show here something in between what Chapter 2 presents and what Geanakoplos’ survey presents (which is also an amazing source).
We want to reason about the world, and the first thing we need is a set $\Omega$ representing all possible states the world can take. Each $\omega \in \Omega$ is called a state of the world and completely describes the world we are trying to reason about. To illustrate, consider the situation where there are $n$ people in a room and each person has a number on his head. Person $i$ can see the number of everyone else, except his own. We want to reason about this situation, so a good way to describe the world is to simply define $\Omega$ as the set of all $n$-tuples of numbers. We define an event to be simply a subset of the possible states of the world, i.e., some set $E \subseteq \Omega$. For example, the event that player $i$ has number $k$ on his head is simply $\{\omega \in \Omega : \omega_i = k\}$. We could also think about the event that the sum of the numbers is odd. Now, we need to define what it means for some person to know some event.
For each person $i$, his knowledge structure is defined by a partition $P_i$ of $\Omega$. The rough intuition is that player $i$ is unable to distinguish two elements in the same cell of partition $P_i$. For each $\omega$, let $P_i(\omega)$ be the cell of partition $P_i$ containing $\omega$. The way I see knowledge representation is that if $\omega$ is the true state of the world, then person $i$ knows that the true state of the world is some element of $P_i(\omega)$.
Definition: We say that person $i$ knows event $E$ at the state of the world $\omega$ if $P_i(\omega) \subseteq E$. Therefore, the event that person $i$ knows $E$ is $K_i(E) = \{\omega : P_i(\omega) \subseteq E\}$.
Above we defined the knowledge operator $K_i$. Below is a picture representing its action:
Now, this allows us to represent the fact that person 1 knows that person 2 knows event $E$ as the event $K_1(K_2(E))$. The fact that person 1 knows that person 2 doesn’t know that person 1 knows event $E$ can be represented as $K_1(\neg K_2(K_1(E)))$, where $\neg E = \Omega \setminus E$.
An equivalent and axiomatic way of defining the knowledge operator is by defining it as an operator such that:
Notice that axioms 1-4 define exactly a topology, and together with 5 it is a topology that is closed under complement. The last two properties are more interesting: they say that if a player knows something, then he knows that he knows it, and if he doesn’t know something, he knows that he doesn’t know it. Aumann goes ahead and defines the notion of common knowledge:
Definition: We say that an event $E$ is common knowledge at $\omega$ if for any $k$ and any sequence $i_1, \dots, i_k$ of players, $\omega \in K_{i_1}(K_{i_2}(\cdots K_{i_k}(E) \cdots))$.
Suppose that $P$ is the finest partition that is a simultaneous coarsening of $P_1, \dots, P_n$ (the meet of the partitions); then $E$ is common knowledge at $\omega$ if and only if $P(\omega) \subseteq E$.
An alternative representation is to take $\Omega$ as the nodes of a graph and add an edge between $\omega$ and $\omega'$ labeled with $i$ if they are in the same cell of $P_i$. Now, given the true state of the world $\omega$, one can easily calculate the smallest event that $i$ knows at $\omega$: this is exactly the set of states reachable from $\omega$ just following edges labeled with $i$, which is easily recognizable as $P_i(\omega)$.
Now, what is the smallest event such that person 1 knows that person 2 knows it? Those are the elements we can arrive at by a path following first an edge labeled 1 and then an edge labeled 2. Extending this reasoning, it is easy to see that the smallest event that is common knowledge at $\omega$ consists of all the elements reachable from $\omega$ by some path in this graph.
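The partition picture is concrete enough to code directly. Here is a small sketch (state space, partitions, and function names all mine) of the knowledge operator and of the reachability computation for common knowledge:

```python
def cell(partition, w):
    # the cell of the partition containing state w
    return next(c for c in partition if w in c)

def knows(partition, event):
    # K_i(E) = { w : P_i(w) is contained in E }
    states = set().union(*partition)
    return frozenset(w for w in states if cell(partition, w) <= event)

def common_knowledge_component(partitions, w):
    # smallest event that is common knowledge at w: all states reachable
    # from w by repeatedly jumping within any player's partition cell
    reached, frontier = {w}, {w}
    while frontier:
        nxt = set()
        for u in frontier:
            for P in partitions:
                nxt |= cell(P, u)
        frontier = nxt - reached
        reached |= frontier
    return frozenset(reached)

# two players over states {1,..,4}: P1 = {1,2},{3,4}; P2 = {2,3},{1},{4}
P1 = [frozenset({1, 2}), frozenset({3, 4})]
P2 = [frozenset({2, 3}), frozenset({1}), frozenset({4})]
E = frozenset({1, 2, 3})
print(sorted(knows(P1, E)))                             # [1, 2]
print(sorted(common_knowledge_component([P1, P2], 1)))  # [1, 2, 3, 4]
```

In this toy example player 1 knows $E$ only in the cell $\{1, 2\}$, and nothing short of the whole state space is common knowledge, since the two partitions chain all states together.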
More about knowledge representation and reasoning about knowledge in future posts. In any case, I can’t recommend enough the references above.
Puzzle #0: There are $n$ people in a line, each with a number on his hat. Each player can see the numbers of the players in front of him; so, if $x_i$ is the number of player $i$, then player $i$ knows $x_{i+1}, \dots, x_n$. Now, from the back of the line to the front, each player will say a number. Is there a protocol such that all players but the first get their own number right? (Notice that they hear what the players before them said.)
Puzzle #1: Consider the same puzzle with an infinite number of players, i.e., there are players $1, 2, 3, \dots$ and player $i$ knows $x_j$ for all $j > i$. Show a protocol for all players, except possibly the first, to get the answer right.
Puzzle #2: Still the same setting, but now players don’t hear what the previous players said. Is there a protocol such that only a finite number of players get it wrong? (Notice that it needs to be finite, not bounded.)
Puzzle #0 is very easy and the answer is simply a parity check. Player 1 simply declares $x_2 \oplus x_3 \oplus \cdots \oplus x_n$, where $\oplus$ stands for XOR. Now, player 2 can reconstruct $x_2$ by XORing what player 1 said with the bits $x_3, \dots, x_n$ that he sees. Then player 3 can do the same computation and figure out $x_3$, and so on… When we move to an infinite number of players, however, we can’t do that anymore, because taking the XOR of an infinite number of bits is not well defined. However, we can still solve Puzzles #1 and #2 if we believe and are willing to accept the Axiom of Choice.
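The finite protocol is a one-liner to simulate. A sketch (names mine) where player 0 sacrifices himself by announcing the XOR of everything he sees, and everyone behind him recovers their own bit:

```python
from functools import reduce
import random

def run_parity_protocol(x):
    # x[i] is player i's bit; player i sees x[i+1:] and hears earlier guesses
    n = len(x)
    said = []
    # player 0 announces the XOR of all bits he sees (he may be wrong himself)
    said.append(reduce(lambda a, b: a ^ b, x[1:], 0))
    for i in range(1, n):
        # player i's own bit = (player 0's announcement)
        #   XOR (bits announced by players 1..i-1, which are correct)
        #   XOR (bits player i still sees ahead of him)
        behind = reduce(lambda a, b: a ^ b, said[1:], 0)
        ahead = reduce(lambda a, b: a ^ b, x[i + 1:], 0)
        said.append(said[0] ^ behind ^ ahead)
    return said

random.seed(1)
x = [random.randint(0, 1) for _ in range(10)]
guesses = run_parity_protocol(x)
print(guesses[1:] == x[1:])  # True: everyone but player 0 is correct
```

By induction, each player's guess folds in only bits that are already known to be correct, which is exactly the argument in the paragraph above.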
Axiom of Choice: Given a family of non-empty disjoint sets, there is a set that contains exactly one representative from each set in the family.
It is used, for example, to show that there is no measure $\mu$ defined on all subsets of $[0, 1)$ that is shift invariant (under addition modulo 1) with $\mu([0,1)) = 1$. The proof goes the following way: define the following equivalence relation on $[0,1)$: $x \sim y$ if $x - y$ is rational. Now, consider the family of all the equivalence classes and invoke the Axiom of Choice. Let $C$ be the set of representatives obtained. Now, we can write the interval as a disjoint union:

$[0,1) = \bigcup_{q \in \mathbb{Q} \cap [0,1)} (C + q)$

where all operations are modulo 1. Since it is a countable union of disjoint sets, if such a measure existed, then $1 = \mu([0,1)) = \sum_q \mu(C + q) = \sum_q \mu(C)$, which is either $0$ if $\mu(C) = 0$ or $\infty$ if $\mu(C) > 0$ – a contradiction.
This is kinda surprising, but more surprising is how we can use the exact same technique to solve the puzzles. First, let’s solve Puzzle #2: consider the set of all infinite 0-1 strings and the equivalence relation where $x \sim y$ if the strings differ in only a finite number of positions. Now, invoke the Axiom of Choice on the equivalence classes and let $C$ be the set of representatives. Every string differs from the representative of its class in only finitely many positions.
Now, a protocol the players can use is to look ahead: since each player sees all but finitely many bits, every player can figure out which equivalence class the entire string belongs to. Each player $i$ then takes the representative $c$ of this class and guesses $c_i$. Notice that $c$ will differ from the real string in at most a finite number of bits, so only finitely many players guess wrong.
Now, to solve Puzzle #1, player 1 simply looks at $x_2, x_3, \dots$, figures out the equivalence class, and lets $c$ be the representative of this class. Since the true string and $c$ differ in a finite number of bits, he can simply announce the XOR of the positions where they differ (since there are finitely many of them, this XOR is well defined). With this trick, it becomes just like Puzzle #0.
Consider a set $M$ of items and $n$ agents. Each agent $i$ has a monotone submodular valuation over the items, i.e., a function $v_i : 2^M \to \mathbb{R}_+$ such that $v_i(S \cup T) + v_i(S \cap T) \le v_i(S) + v_i(T)$ for any subsets $S, T$ of $M$, and $v_i(S) \le v_i(T)$ for $S \subseteq T$. Now, the goal is to partition the items into sets $S_1, \dots, S_n$ in order to maximize $\sum_i v_i(S_i)$.
This problem is clearly NP-hard (for example, we can reduce from Maximum Coverage or a similar problem), but it has a very simple greedy approximation, which goes as follows: start with all sets empty; then, for each item $j$, find the player $i$ with maximum marginal value $v_i(S_i \cup \{j\}) - v_i(S_i)$ and add $j$ to this player’s set. This is a $\frac{1}{2}$-approximation algorithm. The proof is simple:
Let $S_1, \dots, S_n$ be the sets returned by the algorithm and $S_1^*, \dots, S_n^*$ the optimal solution. We can write:
If we added item $j$ to set $S_i$, it means that, by the greedy rule, $j$’s marginal value for player $i$ at that moment was at least its marginal value for any other player. Therefore we can write:
where the first inequality follows from the greedy rule and the second follows from submodularity. Now, we can simply write:
An improved algorithm was given by Dobzinski and Schapira, achieving a better approximation ratio using demand queries – which are used as a separation oracle for a suitable linear program.
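Going back to the greedy rule, here is a toy implementation with coverage valuations (which are monotone submodular); the data, names, and tie-breaking are all mine:

```python
def greedy_welfare(items, valuations):
    # valuations[i] is a set function v_i, assumed monotone submodular;
    # each item goes to the player with the largest marginal value for it
    bundles = [set() for _ in valuations]
    for item in items:
        marginals = [v(b | {item}) - v(b) for v, b in zip(valuations, bundles)]
        winner = max(range(len(valuations)), key=lambda i: marginals[i])
        bundles[winner].add(item)
    return bundles, sum(v(b) for v, b in zip(valuations, bundles))

# coverage valuation: v_i(S) = number of ground elements covered by S
cover = [
    {1: {'a', 'b'}, 2: {'b'}, 3: {'c'}},   # what each item covers for player 0
    {1: {'a'}, 2: {'b', 'c'}, 3: {'c'}},   # ... and for player 1
]
vals = [(lambda S, c=c: len(set().union(set(), *(c[j] for j in S))))
        for c in cover]

bundles, welfare = greedy_welfare([1, 2, 3], vals)
print(welfare)  # 5: player 0 gets items {1, 3}, player 1 gets item {2}
```

On this instance greedy happens to find the optimum; in general the $\frac{1}{2}$ guarantee is what the proof above delivers.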
To keep with the spirit of rapid prototyping, I want to use a dynamically typed language. My final two candidates are Erlang and Stackless Python (or rather PyPy). There is certainly a lot of buzz around Erlang these days: my advisor is very enthusiastic about it, and after reading through the tutorial and dummy protocols, I can see why. On the other hand, Stackless Python has the familiar syntax and a huge library of modules.
I read through many posts comparing the two and I finally decided to stick with Python for now. I will code up the framework (hopefully this week) and report back on my findings. However, I am personally still interested in coding something in Erlang, so who knows
You are on a TV game show and there are 3 doors – one of them contains a prize, say a car, and the other two doors contain things you don’t care about, say goats. You choose a door. Then the TV host, who knows where the prize is, opens one door you haven’t chosen and that he knows has a goat. Then he asks if you want to stick to the door you have chosen or change to the other closed door. What should you do?
Probably you’ve already come across this question at some moment of your life, and the answer is that changing doors doubles your probability of getting the prize. There are several ways of convincing yourself:
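One quick way is brute-force simulation. A sketch (names mine): play the game many times with each strategy and compare the winning frequencies.

```python
import random

def monty_trial(switch, rng):
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    choice = rng.choice(doors)
    # host opens a goat door that is not the contestant's door
    opened = rng.choice([d for d in doors if d != choice and d != prize])
    if switch:
        choice = next(d for d in doors if d != choice and d != opened)
    return choice == prize

rng = random.Random(42)
n = 100000
stay = sum(monty_trial(False, rng) for _ in range(n)) / n
swap = sum(monty_trial(True, rng) for _ in range(n)) / n
print(stay, swap)  # close to 1/3 and 2/3
```

The switcher wins whenever his first pick was a goat, which happens with probability $\frac{2}{3}$, and the simulation reproduces exactly that.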
I’ve seen TV shows where this happened, and I acknowledge that other things may be involved: there might be behavioral and psychological issues associated with the Monty Hall problem – possibly the kind that would interest Dan Ariely, whose book I began reading today and which looks quite fun. But the problem they told me about today at dinner was another one: the envelope problem:
There are two envelopes and you are told that one of them contains twice the amount that is in the other. You choose one of the envelopes at random and open it: it contains $x$ bucks. Now, you don’t know if the other envelope has $x/2$ bucks or $2x$ bucks. Then someone asks if you want to pay a small fee to change to the other envelope. Should you change?
Now, consider two different solutions to this problem: the first is fallacious and the second is correct:
The fallacy in the first argument is perceiving a probability distribution where there is none. Either the other envelope contains $x/2$ bucks or it contains $2x$ bucks – we just don’t know which, but there is no probability distribution there – it is a deterministic choice by the game designer. Most of those paradoxes result from either an ill-defined probability space, as in Bertrand’s Paradox, or a wrong comprehension of the probability space, as in Monty Hall or in several paradoxes exploring the same idea: the Three Prisoners, Sleeping Beauty, the Boy or Girl Paradox, …
Very recently there was a thrilling discussion about a variant of the envelope paradox in the xkcd blag – the blog accompanying that amazing webcomic – in a post with a very intriguing problem. A better idea is to go there and read the discussion, but if you are not doing so, let me summarize it here. The problem is:
There are two envelopes, each containing a distinct real number. You pick one envelope at random, open it and see the number; then you are asked to guess if the number in the other envelope is larger or smaller than the one you saw. Can you guess correctly with probability strictly greater than $\frac{1}{2}$?
A related problem is: you are playing the envelope game with numbers $a$ and $b$ (with $a < b$). You pick one envelope at random, look at its content, and then decide whether or not to switch. Is there a strategy that gives you expected earnings greater than $\frac{a+b}{2}$?
The very unexpected answer is yes!!! The strategy that Randall presents in the blog (and there is a link to the source here) is: let $Z$ be a random variable on $\mathbb{R}$ such that $P(u < Z < v) > 0$ for all $u < v$, for example, the normal distribution or the logistic distribution.
Sample $z$ from $Z$, then open the envelope and find a number $x$. Now, if $z < x$, say the other number is lower, and if $z > x$, say the other number is higher. You get it right with probability $\frac{1}{2} + \frac{1}{2} P(a < Z < b) > \frac{1}{2}$,
which is impressive. If you follow your guess, your expected earnings are:
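The threshold strategy is easy to test empirically. A sketch (names, numbers, and the Gaussian threshold all mine): switch exactly when the observed number falls below the sampled threshold, and compare the average earnings with the blind-guess baseline $\frac{a+b}{2}$.

```python
import random

def play(a, b, rng):
    # envelopes hold a < b; open one uniformly at random, sample a threshold
    # z with full support, and switch iff the observed number is below z
    x, other = (a, b) if rng.random() < 0.5 else (b, a)
    z = rng.gauss(0, 10)  # any distribution with full support works
    return other if x < z else x

rng = random.Random(7)
a, b = 3.0, 8.0
n = 200000
avg = sum(play(a, b, rng) for _ in range(n)) / n
print(avg > (a + b) / 2)  # True: strictly better than blind guessing
```

The gain over $\frac{a+b}{2}$ is exactly $\frac{b-a}{2} P(a < Z < b)$, so a threshold distribution concentrated near the actual numbers helps more.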
The xkcd pointed to this cool archive of puzzles and riddles. I was also told that the xkcd puzzle forum is also a source of excellent puzzles, as this:
You are the most eligible bachelor in the kingdom, and as such the King has invited you to his castle so that you may choose one of his three daughters to marry. The eldest princess is honest and always tells the truth. The youngest princess is dishonest and always lies. The middle princess is mischievous and tells the truth sometimes and lies the rest of the time. As you will be forever married to one of the princesses, you want to marry the eldest (truth-teller) or the youngest (liar) because at least you know where you stand with them. The problem is that you cannot tell which sister is which just by their appearance, and the King will only grant you ONE yes or no question which you may only address to ONE of the sisters. What yes or no question can you ask which will ensure you do not marry the middle sister?
copied from here.
A market is composed of a set of commodities, a set of consumers and a set of producers. Now, we describe how to characterize each of them:
Something very crucial is missing from this picture: a way to compare commodities and something that makes exchanges possible. The answer is to attribute prices to the items. How do we attribute prices to the items so that the market works well? A price vector is a vector $p$ with one non-negative coordinate per commodity. Consider the following scenario after prices are established for the commodities:
The amount of each commodity in the market must be conserved, so that is possible only if we get:
First, it is not clear if such a price vector exists. If it exists, is it unique? If this is an equilibrium, is it the best thing for the consumers? How can those prices be set in practice without a centralized authority? Can people lie? Below, let’s collect a couple of questions I’ll try to answer (yes, no or unknown) in this and the following posts.
Question 1: Does a price vector always exist that generates an equilibrium?
Question 2: If it exists, is it unique?
Question 3: Can we describe an efficient method to find ?
Question 4: Is it the best thing for the consumers in the following sense: if is an equilibrium, are there feasible such that and for at least one consumer ? (This is called a Pareto improvement)
Question 5: A central authority could use the knowledge about utility functions and endowments to calculate the price vector using some method. Can consumers be better off by lying about their utilities and endowments?
Question 6: How do prices get defined without a central authority? Is there a dynamic/game-theoretical model for that?
For simplicity, let’s think of Exchange Economies, which are economies with no producers. Let’s define it formally:
Definition 1 An exchange economy is composed of a set of commodities and a set of consumers, each with a utility and an initial endowment .
Definition 2 A price vector is a Walrasian equilibrium for an exchange economy if there is such that:
- s.t.
The first condition says that each consumer is maximizing his utility given the prices, the second says that we can’t buy more commodities than are available in the market, and the third, called Walras’ Law, says that if there is a surplus of a certain product, it should have price zero. It is by far the most unnatural of the three, but it can be easily justified in some circumstances: suppose we say that utilities are non-satiated if for each and , there is , such that . If are differentiable, that would mean , for example a linear function with some . In that case, if then some player has a money surplus and therefore could increase his utility.
Now, we define for each price vector the excess demand function and . Now, under non-satiated utilities, by the last argument, we have that is an equilibrium vector iff . Actually, if are also strongly monotone, i.e., for each , then it becomes: is an equilibrium iff , which means that the market clears:
The question that is easiest to answer is Question 4, and its answer is sometimes referred to as the First Fundamental Theorem of Welfare Economics:
Theorem 3 Given non-satiated preferences, each equilibrium is Pareto, i.e. there is no other feasible allocation such that for all , with the inequality strict for at least one component.
Proof: Suppose there were such an allocation. Then , because if then we could improve the utility of while staying within the budget, contradicting the optimality of for that budget. And clearly implies .
Summing over , we get , which is a contradiction: since is feasible, , and therefore .
Now, let’s tackle Question 1. We assume linearity of utilities: for . This gives us strong monotonicity and locally non-satiated preferences.
Theorem 4 Under linear utilities, there is always an equilibrium price vector .
Consider the function defined above: where is the bundle of best possible utility. Now, since we are using linear utilities we can’t guarantee there will be only one such bundle, so instead of considering a function, consider and as correspondences: , i.e., is the set of all allocations that maximize subject to . Since are linear functionals, we can calculate by a Fractional Knapsack algorithm: we sort commodities by and start buying in cost-benefit order (the ones that provide more utility per buck spent). Most of the time there will be just one solution, but at points where , might be a convex region. This correspondence is upper hemicontinuous, which is the correspondence analogue of continuity for functions. As Wikipedia defines:
Definition 5 A correspondence is said to be upper hemicontinuous at the point if for any open neighbourhood of there exists a neighbourhood of a such that is a subset of for all in .
It is not hard to see that is upper hemicontinuous according to that definition. Our goal is to prove that there is one price vector for which or: . To prove that we use Kakutani’s Fixed Point Theorem. Before we go into that, we’ll explore some other properties of :
Now, we are in shape for applying Kakutani’s Fixed Point Theorem:
Theorem 6 (Kakutani, 1941) If is an upper hemicontinuous correspondence such that is a convex non-empty set for all then has a fixed point, i.e., s.t. .
Since prices are -homogeneous, consider the simplex , its relative interior and the boundary . Now we define the following price correcting correspondence .
If some price is set, it generates demand . For that demand, the price that would maximize profit would be , i.e. for all . It is natural to re-adjust the prices to . So we define for :
and for :
Now, I claim that this correspondence satisfies the conditions of Kakutani’s Theorem. We skip a formal proof of this fact; it is intuitive in the interior, so let’s give the intuition for why it holds as we approach the boundary: if , then , and therefore the demand explodes: . As a result the best thing to do is to set the prices of those commodities much higher than the rest. Therefore the prices of the commodities whose demand explodes are positive, while the commodities whose demand does not explode get price zero.
Now, after waving our hands about the upper hemicontinuity of , we have by Kakutani’s Theorem a point such that . By the definition of we must have (because for , ). Now, I claim . In fact, if , still by Walras’ Law. So, if then there is with and therefore for all , and . For this reason .
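The Fractional Knapsack computation of the demand under linear utilities, described above, can be sketched as follows (the function interface and the per-commodity supply caps are my own choices; I assume strictly positive prices):

```python
def demand(u, p, s, budget):
    """One maximizer of the linear utility sum(u[j]*x[j]) subject to p.x <= budget
    and 0 <= x[j] <= s[j]: buy greedily in decreasing utility-per-dollar order."""
    x = [0.0] * len(u)
    order = sorted(range(len(u)), key=lambda j: u[j] / p[j], reverse=True)
    money = budget
    for j in order:
        amount = min(s[j], money / p[j])  # buy as much as supply/budget allows
        x[j] = amount
        money -= amount * p[j]
        if money <= 1e-12:
            break
    return x

# two goods, good 0 gives 3 utils per dollar, good 1 gives 1; budget 4
bundle = demand([3, 1], [1, 1], [2, 5], budget=4)
```

When two goods have the same ratio $u_j/p_j$, any mixture between them is optimal, which is exactly why the demand is a correspondence rather than a function.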
In the next blog post (or series of blog posts, let’s see) we discuss issues related to the other questions: uniqueness, dynamics, game-theoretical considerations, …
]]>(click on the picture for more legible fonts)
Coming to think about it… why stop at quintuple blindness? How about automated reviewers, or mixing reviewers from different eras? Ok, I’ll stop here.
]]>The problem of bounded degree spanning tree is as follows: consider a graph with edge weights, and for some nodes a degree bound . We want to find, among the spanning trees in which each such node has degree at most , the one of minimum cost. It is clearly a hard problem, since taking all weights equal to and degree bound for all nodes gives the Hamiltonian Path problem, which is NP-complete. So we will settle for a different kind of approximation. Let OPT be the cost of the optimal solution: we will show an algorithm that gives a spanning tree of cost at most OPT such that each node has degree (this can be improved to with a more sophisticated algorithm, also based on Iterated Rounding).
As always, the first step to design an approximation algorithm is to relax it to an LP. We consider the following LP:
The first constraint expresses that in a spanning tree there are edges, the second prevents the formation of cycles, and the third guarantees the degree bounds. For (no degree bounds) we have the standard Minimum Spanning Tree problem, and for that problem the polytope is integral. With the degree bounds, we lose this nice property. We can still solve this LP using the Ellipsoid Method: the separation oracle for the exponential family of cycle-prevention constraints is done by a flow computation.
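For concreteness, the relaxation as it usually appears in the iterated-rounding literature reads as below; this is my reconstruction, so minor details may differ from the post's original formulas:

```latex
\begin{aligned}
\min \quad & \textstyle\sum_{e \in E} c_e x_e \\
\text{s.t.} \quad & x(E(V)) = |V| - 1 \\
& x(E(S)) \le |S| - 1 && \forall\, \emptyset \neq S \subsetneq V \\
& x(\delta(v)) \le b_v && \forall\, v \in W \\
& x_e \ge 0 && \forall\, e \in E
\end{aligned}
```

where $x(F) = \sum_{e \in F} x_e$, $E(S)$ is the set of edges with both endpoints in $S$, and $\delta(v)$ is the set of edges incident to $v$.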
Iterated Rounding
Now, let’s go ahead and solve the LP. It would be great if we had an integral solution: we would be done. Unfortunately that is not the case, but we can still hope it is almost integral in some sense: for example, some edges are integral and we can take them into the final solution and recurse the algorithm on a smaller graph. This is not far from the truth, and that’s the main idea of iterated rounding. We will show that the support of the optimal solution has some nice structure. Consider the following lemma:
Lemma 1 For any basic solution of the LP, either there is with just one incident edge in the support, or there is one such that at most edges are incident to it.
If we can prove this lemma, we can solve the problem in the following way: we begin with an empty tree: then we solve the LP and look at the support . There are two possibilities according to the lemma:
The algorithm eventually stops, since in each iteration we have fewer edges or fewer nodes in , and the solution is as desired. The main effort is therefore to prove the lemma. But first, let’s look at the lemma: it is of the following kind: “any basic solution of the LP has some nice structure, which involves the support being not too big (at least in some spot)”. So it involves proving that the support is not too large. That is our next task, and once it is done we will have:
Theorem 2 The algorithm described above produces a spanning tree of cost (the LP value, and therefore ) in which each node has degree .
Bounding the size of the support
We would now like to prove a result like the Lemma above: that in the solution of the LP either there is one with degree in or there is a node in with degree . First, suppose the opposite: that all nodes of have degree and all nodes of have degree . This implies that we have a large number of edges in the support. From the degrees, we know that:
We want to prove that the support of the LP can’t be too large. The first question is: how to estimate the size of the support of a basic solution. The constraints look like that:
A basic solution can be represented by picking rows of the matrix and making them tight. So, if we have a general LP , we pick some submatrix of which is and the basic solution is just . The rows of the matrix can be of three types: they can be , corresponding to , corresponding to , or corresponding to . There are vectors in total. The size of the support is at most the number of rows of the form in the basic solution. Therefore the idea for bounding the size of the support is to prove that all basic solutions can be represented by a small number of rows of the form . And this is done using the following:
Lemma 3 Assuming , for any basic solution , there is and a family of sets such that:
- The restrictions correspondent to and are tight for
- is an independent set
- is a laminar family
The first 3 items are straightforward properties of basic solutions. The fourth one means that for any two sets , one of three things happens: , or . Now, based on the previous lemma and on the following result, which can be easily proved by induction, we will prove Lemma 1.
Lemma 4 If is a laminar family over the set where each set contains at least elements, then .
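A proof sketch of a bound of this type, assuming the statement is the standard one (a laminar family $\mathcal{L}$ over a ground set of size $m$ in which every set has at least two elements satisfies $|\mathcal{L}| \le m - 1$):

```latex
\text{For } S \in \mathcal{L}, \text{ let } T_1, \dots, T_r \text{ be the maximal members of }
\mathcal{L} \text{ properly contained in } S \text{ (disjoint, by laminarity).}
\text{ By induction on } |S|, \text{ the number of members of } \mathcal{L} \text{ contained in } S
\text{ (including } S \text{ itself) is at most}
\[
1 + \sum_{j=1}^{r} \big(|T_j| - 1\big) \;\le\; |S| - 1,
\]
\text{checking separately } r = 0 \text{ (use } |S| \ge 2\text{)},\ r = 1
\text{ (use } |T_1| \le |S| - 1\text{)}, \text{ and } r \ge 2.
\text{ Summing this bound over the disjoint maximal members of } \mathcal{L}
\text{ gives } |\mathcal{L}| \le m - 1.
\]
```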
Now the proof of Lemma 1 is easy. Let’s do it and then come back to prove Lemma 3. Simply see that , which contradicts .
Uncrossing argument
And now we arrive at the technical heart of the proof, which is proving Lemma 3. It says that any basic feasible solution can be written as a “structured” one. We start with any basic feasible solution. It already satisfies (1)-(3); we then need to change the representation to satisfy condition (4) as well, i.e., to get rid of crossing elements, which are pairs of sets of the form:
We do that by means of the:
Lemma 5 (Uncrossing Lemma) If and are intersecting and tight (tight in the sense that their respective constraint is tight), then and are also tight and:
Which corresponds to that picture:
Proof: First, we note that is a supermodular function, i.e.:
We can see that by case analysis: every edge appearing on the left side appears on the right side with at least the same multiplicity. Notice also that it holds with strict inequality iff there are edges from to . Now, we have:
where the first relation is trivial, the second is by feasibility, the third is by supermodularity and the last one is by tightness. So, all hold with equality and therefore and are tight. We also proved that:
so there can be no edge from to in and therefore, thinking just of edges in we have:
Uncrossing arguments are found everywhere in combinatorics. Now, we show how the Uncrossing Lemma can be used to prove Lemma 3:
Proof: Let be any basic solution. It can be represented by a pair where and is a family of sets. We will show that the same basic solution can be represented by , where is a laminar family of the same size as .
Let be all sets that are tight under and a maximal laminar family of tight sets in , such that are independent. I claim that .
In fact, suppose ; then there are sets of we could add to without violating independence – the problem is that those sets would cross some set. Pick such a set intersecting the fewest possible sets in . The set intersects some . Since both are tight we can use the Uncrossing Lemma and we get:
since , we can’t have simultaneously and in . Let’s consider two cases:
In either case we have a contradiction, so we proved that . So we can generate the whole space of tight sets with a laminar family.
And this finishes the proof. Let’s go over all that we’ve done: we started with an LP and we wanted to prove that the support of each basic solution was not too large. We wanted that because we wanted to prove that there was one node with degree one in the support or a node in with small () degree. To prove that the support is small, we showed that any basic solution has a representation in terms of a laminar family, using the celebrated Uncrossing Lemma. Then we used the fact that laminar families can’t be very large families of sets.
Note: Most of this is based on my notes from David Williamson’s Approximation Algorithms class. I spent some time thinking about this algorithm and therefore I decided to post it here.
]]>Most ways of looking at probability distributions are associated with multiplicative systems: a multiplicative system is a set of real-valued functions closed under pointwise products, i.e., if then . Such sets are powerful because of the Multiplicative Systems Theorem:
Theorem 1 (Multiplicative Systems Theorem) If is a multiplicative system, is a linear space containing (the constant function ) and is closed under bounded convergence, then implies that contains all bounded -measurable functions.
The theorem might look a bit cryptic if you are not familiar with the definitions, but it boils down to the following translation:
Theorem 2 (Translation of the Multiplicative Systems Theorem) If is a “general” multiplicative system, and and are random variables such that for all , then and have the same distribution.
where “general” excludes some troublesome cases like or all constant functions, for example. In technical terms, we wanted to be the Borel -algebra. But let’s not worry about those technical details and just look at the translated version. We now discuss several kinds of multiplicative systems:
If we know moment generating functions, we can calculate expectations very easily, since . For example, suppose we have a process like this: there is one bacterium at time . In each timestep, this bacterium either dies (with probability ), stays alive without reproducing (with probability ), or has offspring (with probability ). In that case . The same happens at each time, independently, for each of the bacteria alive at that moment. The question is: what is the expected number of bacteria at time ?
It looks like a complicated problem with just elementary tools, but it is a simple problem if we have moment generating functions. Just let be the variable associated with the bacterium of time . It is zero if it dies, if it stays the same and if it has offspring. Let also be the number of bacteria at time . We want to know . First, see that:
Now, let’s write that in terms of moment generating functions:
which is just:
since the variables are all independent and identically distributed. Now, notice that:
by the definition of moment generating function, so we effectively proved that:
We proved that is just iterated times. Now, calculating the expectation is easy, using the facts that and . Just see that: . Then, clearly . Using a similar technique we can prove a lot more about this process, just by analyzing the behavior of the moment generating function.
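The identity "expected population equals the mean offspring count raised to the number of steps" can be sanity-checked by simulation (a sketch; the probabilities 0.2 / 0.5 / 0.3 below are an example of my own):

```python
import random

def mean_population(n_steps, outcomes, weights, trials=20_000, seed=1):
    """Monte-Carlo estimate of E[Z_n] for the branching process where each
    individual independently leaves outcomes[i] descendants with probability weights[i]."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        z = 1  # one bacterium at time 0
        for _ in range(n_steps):
            if z == 0:
                break  # extinct populations stay extinct
            z = sum(rng.choices(outcomes, weights, k=z))
        total += z
    return total / trials

# die w.p. 0.2, survive w.p. 0.5, split in two w.p. 0.3  =>  mean offspring mu = 1.1
m = mean_population(3, outcomes=[0, 1, 2], weights=[0.2, 0.5, 0.3])
# m should be close to mu**3 = 1.331
```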
where we also used Markov’s Inequality: . Passing to the Laplace transform is the main ingredient in the Chernoff bound, and it allows us to sort of “decouple” the random variables in the sum. There are several other cases where the Laplace transform proves itself very useful and turns things that looked very complicated when we saw them in undergrad courses into simple and clear ones. One clear example is the motivation for the Poisson random variable:
If are independent exponentially distributed random variables with mean , then . An elementary calculation shows that its Laplace transform is . Let , i.e., the time of the -th arrival. We want to know the distribution of . How to do that?
Now, we need to find such that . It is just a matter of solving this equation and we get: . Now, the Poisson variable measures the number of arrivals in and therefore:
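Spelled out (my reconstruction of the standard computation, in the notation of exponential waiting times $\sigma_i$ with rate $\lambda$):

```latex
\mathbb{E}\, e^{-s T_n} \;=\; \prod_{i=1}^{n} \mathbb{E}\, e^{-s \sigma_i}
\;=\; \Big(\frac{\lambda}{\lambda + s}\Big)^{n},
\qquad \text{which is the transform of } f_{T_n}(t) = \frac{\lambda^n t^{n-1} e^{-\lambda t}}{(n-1)!},
\]
\text{and then}
\[
\Pr[N(t) = n] \;=\; \Pr[T_n \le t < T_{n+1}] \;=\; e^{-\lambda t}\,\frac{(\lambda t)^n}{n!}.
```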
One fact that always puzzled me was: why is the normal distribution so important? What is so special about it that makes it the limiting distribution in the Central Limit Theorem, i.e., if is a sequence of independent random variables, then under some natural conditions on the variables? The reason the normal is so special is that it is a “fixed point” of the Fourier Transform. We can see that . And there we have something special about it that makes me believe the Central Limit Theorem.
————————-
This blog post was based on lectures by Professor Dynkin at Cornell.
]]>——————————————
Igor again, with another mathematical dispatch from UCLA, where I’m spending the semester eating and breathing combinatorics as part of the 2009 program on combinatorics and its applications at IPAM. In the course of some reading related to a problem with which I’ve been occupying myself, I ran across a neat algorithmic result – Wilson’s algorithm for uniformly generating spanning trees of a graph. With Renato’s kind permission, let me once again make myself at home here at Big Red Bits and tell you all about this little gem.
The problem is straightforward, and I’ve essentially already stated it: given an undirected, connected graph , we want an algorithm that outputs uniformly random spanning trees of . In the early ’90s, Aldous and Broder independently discovered an algorithm for accomplishing this task. This algorithm generates a tree by, roughly speaking, performing a random walk on and adding the edge to every time that the walk steps from to and is a vertex that has not been seen before.
Wilson’s algorithm (D. B. Wilson, “Generating random spanning trees more quickly than the cover time,” STOC ’96) takes a slightly different approach. Let us fix a root vertex . Wilson’s algorithm can be stated as a loop-erased random walk on as follows.
Algorithm 1 (Loop-erased random walk) Maintain a tree , initialized to consist of alone. While there remains a vertex not in : perform a random walk starting at , erasing loops as they are created, until the walk encounters a vertex in , then add to the cycle-erased simple path from to .
We observe that the algorithm halts with probability 1 (its expected running time is actually polynomial, but let’s not concern ourselves with these issues here), and outputs a random directed spanning tree oriented towards . It is a minor miracle that this tree is in fact sampled uniformly from the set of all such trees. Let us note that this offers a solution to the original problem, as sampling randomly and then running the algorithm will produce a uniformly generated spanning tree of .
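Algorithm 1 is short enough to state in code (a sketch; the adjacency-dict interface and the parent-map output format are my own choices):

```python
import random

def wilson(adj, root, rng=random.Random(0)):
    """Uniform random spanning tree of the connected graph adj (dict: vertex ->
    list of neighbours), returned as a parent map oriented towards root."""
    parent = {root: None}
    for start in adj:
        if start in parent:
            continue
        # Random walk from start until it hits the current tree, recording each
        # vertex's most recent successor; overwriting nxt[v] on a revisit is
        # exactly loop erasure.
        nxt = {}
        v = start
        while v not in parent:
            nxt[v] = rng.choice(adj[v])
            v = nxt[v]
        # Add the loop-erased path from start to the tree.
        v = start
        while v not in parent:
            parent[v] = nxt[v]
            v = nxt[v]
    return parent

adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}  # a 4-cycle
tree = wilson(adj, 0)
```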
It remains, then, to prove that the algorithm produces uniform spanning trees rooted at (by which we mean directed spanning trees oriented towards ). To this we dedicate the remainder of this post.
1. A “different” algorithm
Wilson’s proof is delightfully sneaky: we begin by stating and analyzing a seemingly different algorithm, the cycle-popping algorithm. We will prove that this algorithm has the desired properties, and then argue that it is equivalent to the loop-erased random walk (henceforth LERW).
The cycle-popping algorithm works as follows. Given and , associate with each non-root vertex an infinite stack of neighbors. More formally, to each we associate
where each is uniformly (and independently) sampled from the set of neighbors of . Note that each stack is not a random walk, just a list of neighbors. We refer to the left-most element above as the top of , and by popping the stack we mean removing this top vertex from .
Define the stack graph to be the directed graph on that has an edge from to if is at the top of the stack . Clearly, if has vertices then is an oriented subgraph of with edges. The following lemma follows immediately.
Lemma 1 Either is a directed spanning tree oriented towards or it contains a directed cycle.
If there is a directed cycle in we may pop it by popping for every . This eliminates , but of course might create other directed cycles. Without resolving this tension quite yet, let us go ahead and formally state the cycle-popping algorithm.
Algorithm 2 (Cycle-popping algorithm) Create a stack for every . While contains any directed cycles, pop a cycle from the stacks. If this process ever terminates, output .
Note that by the lemma, if the algorithm ever terminates then its output is a spanning tree rooted at . We claim that the algorithm terminates with probability 1, and moreover generates spanning trees rooted at uniformly.
To this end, some more definitions: let us say that given a stack , the vertex is at level . The level of a vertex in a stack is static, and is defined when the stack is created. That is, the level of does not change even if advances to the top of the stack as a result of the stack getting popped.
We regard the sequence of stack graphs produced by the algorithm as leveled stack graphs: each non-root vertex is assigned the level of its stack. Observe that the level of in is the number of times that has been popped. In the same way, we regard cycles encountered by the algorithm as leveled cycles, and we can regard the tree produced by the algorithm (if indeed one is produced) as a leveled tree.
The analysis of the algorithm relies on the following key lemma (Theorem 4 in Wilson’s paper), which tells us that the order in which the algorithm pops cycles is irrelevant.
Lemma 2 For a given set of stacks, either the cycle-popping algorithm never terminates, or there exists a unique leveled spanning tree rooted at such that the algorithm outputs irrespective of the order in which cycles are popped.
Proof: Fix a set of stacks . Consider a leveled cycle that is pop-able, i.e. there exist leveled cycles that can be popped in sequence. We claim that if the algorithm pops any cycle not equal to , then there still must exist a series of cycles that ends in and that can be popped in sequence. In other words, if is pop-able then it remains pop-able, no matter which cycles are popped, until itself is actually popped.
Let be a cycle popped by the algorithm. If then the claim is clearly true. Also, if shares no vertices with , then the claim is true again. So assume otherwise, and let be the first in the series to share a vertex with . Let us show that by contradiction.
If , then and must share a vertex that has different successors in and . But by definition of , none of the contain , and this implies that has the same level in and . Therefore its successor in both cycles is the same, a contradiction. This proves .
Moreover, the argument above proves that and are equal as leveled cycles (i.e. every vertex has the same level in both cycles). Hence
is a series of cycles that can be popped in sequence, which proves the original claim about .
We conclude that given a set of stacks, either there is an infinite number of pop-able cycles, in which case there will always be an infinite number and the algorithm will never terminate, or there is a finite number of such cycles. In the latter case, every one of these cycles is eventually popped, and the algorithm produces a spanning tree rooted at . The level of each non-root vertex in is given by (one plus) the number of popped cycles that contained .
Wilson summarizes the cycle-popping algorithm thusly: “[T]he stacks uniquely define a tree together with a partially ordered set of cycles layered on top of it. The algorithm peels off these cycles to find the tree.”
Theorem 3 The cycle-popping algorithm terminates with probability 1, and the tree that it outputs is a uniformly sampled spanning tree rooted at .
Proof: The first claim is easy: has a spanning tree, therefore it has a directed spanning tree oriented towards . The stacks generated in the first step of the algorithm will contain such a tree, and hence the algorithm will terminate, with probability 1.
Now, consider a spanning tree rooted at . We’ll abuse notation and let be the event that is produced by the algorithm. Similarly, given a collection of leveled cycles , we will write for the event that is the set of leveled cycles popped by the algorithm before it terminates. Finally, let be the event that the algorithm popped the leveled cycles in and terminated, with the resulting leveled tree being equal to .
By the independence of the stack entries, we have , where is the probability that the algorithm’s output is a leveled version of , a quantity which a moment’s reflection will reveal is independent of . Now,
which, as desired, is independent of .
2. Conclusion
We have shown that the cycle-popping algorithm generates spanning trees rooted at uniformly. It remains to observe that the LERW algorithm is nothing more than an implementation of the cycle-popping algorithm! Instead of initially generating the (infinitely long) stacks and then looking for cycles to pop, the LERW generates stack elements as necessary via random walk (computer scientists might recognize this as the Principle of Deferred Decisions). If the LERW encounters a loop, then it has found a cycle in the stack graph induced by the stacks that the LERW has been generating. Erasing the loop is equivalent to popping this cycle. We conclude that the LERW algorithm generates spanning trees rooted at uniformly.
]]>Those people are again in a room, each with a hat which is either black or white (picked at random with probability each), and they can see the colors of the other people’s hats but not their own. Each writes on a piece of paper either “BLACK” or “WHITE”. The whole team wins if all of them get their colors right; the whole team loses if at least one writes the wrong color. Before entering the room and getting the hats, they can strategize. What is a strategy that makes them win with probability ?
If they all choose their colors at random, the probability of winning is very small: . So we should try to correlate them somehow. The solution is again related to error-correcting codes. We can think of the hats as a string of bits. How to recover one bit if it is lost? The simple engineering solution is to add a parity check: we append to the string a bit . So, if bit is lost, we know it is . We can use this idea to solve the puzzle above: if hats are placed with probability , the parity check will be with probability and with probability . They can decide beforehand that everyone will use , and with probability they are right and everyone gets his hat color right. Now, let’s extend this problem in some ways:
The same problem, but there are hat colors, chosen independently with probability , and they win if everyone gets his color right. Find a strategy that wins with probability .
There are again hat colors, chosen independently with probability , and they win if at least a fraction () of the people guess the right color. Find a strategy that wins with probability .
Back to the problem where we just have BLACK and WHITE colors, chosen with probability , and everyone needs to find the right color to win: can you prove that is the best one can do? And what about the two other problems above?
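The parity-check strategy for the original black/white problem takes only a few lines to simulate (a sketch; names are my own, and everyone assumes the XOR of all hats is 0):

```python
import random

def parity_strategy_round(n, rng):
    """One round: every player assumes the total parity is 0 and deduces
    their own hat from the hats they can see. All-correct iff parity is 0."""
    hats = [rng.randrange(2) for _ in range(n)]
    guesses = [(sum(hats) - hats[i]) % 2 for i in range(n)]  # hat making total parity even
    return guesses == hats

rng = random.Random(0)
win_rate = sum(parity_strategy_round(10, rng) for _ in range(100_000)) / 100_000
# the team wins exactly when the true parity is even, i.e. about half the time
```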
The first two use variations of the parity check idea in the solution. For the last case, given any strategy of the players, for each string they have probability . Therefore the total probability of winning is . Let , i.e., the same input but with the bit flipped. Notice that the answer of player is the same (or at least has the same probabilities) on both and , since he can’t distinguish between and . Therefore, . So,
This way, no strategy can win with probability more than .
Another variation of it:
Suppose now we have two colors BLACK and WHITE and the hats are drawn from one distribution , i.e., we have a probability distribution over and we draw the colors from that distribution. Notice that now the hats are not uncorrelated. How to win again with probability (to win, everyone needs the right answer).
I like those hat problems a lot. A friend of mine just pointed out to me that there is a very nice paper by Bobby Kleinberg generalizing several aspects of hat problems, for example, when players have limited visibility of other players’ hats.
I began being interested in this sort of problem after reading the Derandomization of Auctions paper. Hat guessing games are not just a good model for error-correcting codes; they are also a good model for truthful auctions. Consider an auction with a set of single-parameter agents, i.e., an auction where each player gives one bid indicating how much he is willing to pay to win. We have a set of constraints defining the feasible allocations. Based on the bids we choose an allocation and charge payments to the bidders. An example of a problem like this is the Digital Goods Auction, where .
In this blog post, I discussed the concept of a truthful auction. If an auction is randomized, a universally truthful auction is one that is truthful even if all the random bits in the mechanism are revealed to the bidders. Consider the Digital Goods Auction. We can characterize universally truthful digital goods auctions as bid-independent auctions. A bid-independent auction is given by a function , which associates to each a random variable . In that auction, we offer the service to player at price . If , we allocate to and charge him . Otherwise, we don’t allocate and we charge nothing.
It is not hard to see that all universally truthful mechanisms are like that: if is the probability that player gets the item bidding , let be a uniform random variable on and define . Notice that here , but we are inverting with respect to . It is a simple exercise to prove that.
With this characterization, universally truthful auctions suddenly look very much like hat guessing games: we need to design a function that looks at everyone else’s bids but not at our own and, in some sense, “guesses” what we probably have, and with that calculates the price we offer. It would be great to design a function that returns . That is unfortunately impossible. But how to approximate it nicely? Some papers, like Derandomization of Auctions and Competitiveness via Consensus, use this idea.
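A bid-independent digital-goods auction can be sketched as follows (the interface is my own; using `max` of the competing bids as the offer function is just an illustrative choice, and it needs at least two bidders):

```python
def bid_independent_auction(bids, offer):
    """Digital-goods auction: bidder i is offered price offer(bids without i);
    she wins and pays that price iff her bid is at least the offer."""
    results = []
    for i, b in enumerate(bids):
        price = offer(bids[:i] + bids[i + 1:])  # the offer never sees bidder i's own bid
        results.append((b >= price, price if b >= price else 0.0))
    return results

# offer everyone the largest competing bid: only bidder 0 clears her offer of 7
outcome = bid_independent_auction([10, 4, 7], max)
```

Because the offer to bidder $i$ never depends on her own bid, reporting her true value is always optimal, which is exactly the hat-guessing structure described above.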
]]>so we can just see it as a formal polynomial and think of:
which is an matrix. The theorem says it is the zero matrix. We thought for a while and looked at Wikipedia; there were a few proofs there, but not the one-line proof I was looking for. Later, I got this proof, which I sent to Hu Fu:
Write the matrix in the basis of its eigenvectors; then we can write , where is the diagonal matrix with the eigenvalues on the main diagonal.
and since we have . Now, it is simple to see that:
and therefore:
And that was the one-line proof. An even simpler proof is: let be the eigenvectors, then , so must be , since it returns zero for all elements of a basis. Well, I sent that to Hu Fu and he told me the proof had a bug. Not really a bug, but I was proving it only for symmetric matrices. More generally, I was proving it for diagonalizable matrices. He showed me, for example, the matrix:
which has only one eigenvalue, and whose eigenvectors are all scalar multiples of a single vector. So, the dimension of the space spanned by the eigenvectors is 1, less than the dimension of the matrix. This never happens for symmetric matrices, and I guess after some time as a computer scientist, I got used to working only with symmetric matrices for almost everything I use: metrics, quadratic forms, correlation matrices, ... but there is more out there than only symmetric matrices. The good news is that this proof is not hard to fix for the general case.
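As a sanity check before fixing the proof, here is a small NumPy sketch (my own illustration) that evaluates the characteristic polynomial at the matrix itself, using a non-diagonalizable matrix of the kind discussed above:

```python
import numpy as np

def charpoly_at_matrix(A):
    """Evaluate the characteristic polynomial of A at A itself (Horner scheme).
    Cayley-Hamilton says the result is the zero matrix."""
    coeffs = np.poly(A)                  # characteristic polynomial coefficients
    n = A.shape[0]
    result = np.zeros((n, n))
    for c in coeffs:                     # Horner: result = result @ A + c * I
        result = result @ A + c * np.eye(n)
    return result

# A non-diagonalizable matrix: one eigenvalue, one-dimensional eigenspace.
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
print(np.allclose(charpoly_at_matrix(A), 0))   # True
```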
First, it is easy to prove that for each root λ of the characteristic polynomial there is at least one eigenvector associated with it (just see that det(A − λI) = 0, and therefore there must be some v ≠ 0 with (A − λI)v = 0). So, if all the roots are distinct, there is a basis of eigenvectors, and therefore the matrix is diagonalizable (notice that maybe we will need to use complex eigenvalues, but that is ok). The good thing is that a matrix having two identical eigenvalues is a “coincidence”. We can identify n × n matrices with R^{n²}. The matrices with identical eigenvalues form a zero-measure subset of R^{n²}; they are in fact the roots of a polynomial in the matrix entries, namely the resultant Res(p, p′) of the characteristic polynomial and its derivative. Therefore, we proved the Cayley-Hamilton theorem on the complement of a zero-measure set in R^{n²}. Since A ↦ p_A(A) is a continuous function, it extends naturally to all matrices.
We can also interpret that probabilistically: take the matrix A + εB, where B is taken uniformly at random from the unit ball. Then A + εB has, with probability 1, all distinct eigenvalues. So, the characteristic polynomial of A + εB vanishes at A + εB with probability 1. Now, just make ε → 0.
Ok, this proves the theorem for real and complex matrices, but what about a matrix defined over a general field, where we can't use those continuity arguments? A way to get around it is by using the Jordan Canonical Form, which is a generalization of the eigenvector decomposition. Not all matrices have an eigenvector decomposition, but all matrices over an algebraically closed field can be written in Jordan Canonical Form: given any matrix A, there is an invertible matrix P so that:
where the J_i are Jordan blocks of the form:
By the same argument as above, we just need to prove Cayley-Hamilton for each block separately. So we need to prove that (J − λI)^k = 0 for a block J of size k with eigenvalue λ. If the block has size 1, then it is exactly the proof above. If the block is bigger, then we need to look at what the powers of J look like. By inspection:
Typically, for the m-th power J^m we have in each row i, starting in column i, the sequence λ^m, C(m,1) λ^{m−1}, C(m,2) λ^{m−2}, ..., i.e., the binomial expansion of (λI + N)^m, where N is the nilpotent part of the block. So, we have
If the block has size k, then λ has multiplicity k in the characteristic polynomial p, so p(x) contains the factor (x − λ)^k; since (J − λI)^k = 0, we get p(J) = 0, as we wanted to prove.
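For a concrete check of the Jordan-form argument, here is a small SymPy sketch (my own illustration; the example matrix is assumed, not from the post). It computes a Jordan form and verifies that the matrix annihilates its own characteristic polynomial:

```python
from sympy import Matrix, eye, zeros

# A non-diagonalizable example: eigenvalue 2 with a 2x2 Jordan block, and eigenvalue 3.
M = Matrix([[2, 1, 0],
            [0, 2, 0],
            [0, 0, 3]])

P, J = M.jordan_form()                 # M = P * J * P**(-1), J block-diagonal
assert P * J * P.inv() == M

# Characteristic polynomial is (x - 2)**2 * (x - 3); evaluating it at M gives 0.
assert (M - 2 * eye(3))**2 * (M - 3 * eye(3)) == zeros(3, 3)
print("Cayley-Hamilton verified via the Jordan form")
```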
It turned out not to be a very short proof, but it is still short, since it uses mostly elementary machinery, and the proof is really intuitive in some sense. I took some lessons from that: (i) it reinforces my idea that, if I need to say something about a matrix, the first thing to do is to look at its eigenvector decomposition. A lot of linear algebra problems are very simple when we consider things in the right basis, and normally the right basis is the eigenvector basis. (ii) Not all matrices are diagonalizable. But in those cases, the Jordan Canonical Form comes to our aid, and we can do almost the same as we did with the eigenvector decomposition.
Three cloud stories: a threatening cloud, a promising cloud, and a nice cloud
The problem with current clouds is that the user does not know what the cloud service provider is doing with the customer's code and data. Also, from the cloud service provider's perspective, the operator does not know what the code they are running for customers is supposed to do.
Alice is the customer running a service on the cloud owned and operated by Bob.
A solution: what if we had an oracle that Alice and Bob could ask about cloud problems? We want completeness (if something is faulty, we will know), accuracy (no false positives), and verifiability (the oracle can prove its diagnosis is correct).
Idea: make the cloud accountable to Alice and Bob. The cloud records its actions in a tamper-evident log; Alice and Bob can audit it and use the log to construct evidence that a fault does or does not exist.
Discussion: 1) Isn’t this too pessimistic? Bob isn’t malicious. Maybe, but Bob can get hacked, or things can just go wrong. 2) Shouldn’t Bob use fault tolerance instead? Yes, whenever we can, but masking faults is never perfect; we still need to check. 3) Why would a provider want to deploy this? The feature will be attractive to prospective customers and helpful for support. 4) Are these the right guarantees? Completeness (no false negatives) could be relaxed to probabilistic completeness; verifiability could be relaxed to only providing some evidence; accuracy (no false positives) cannot be relaxed, because we need to have confidence when we rule out problems.
A call to action: cloud accountability should deliver provable guarantees, work for most cloud apps, require no changes to application code, cover a wide spectrum of properties, and have low overhead.
Work in progress: Accountable Virtual Machines (AVM). Goal: provide accountability for arbitrary unmodified software. The cloud records enough data to enable deterministic replay; Alice can replay the log with a known-good copy of the software and audit any part of the original execution.
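A minimal sketch (my own illustration, not the AVM implementation) of the tamper-evident log idea above: each entry commits to the previous one through a hash chain, so any retroactive modification is detectable during an audit.

```python
import hashlib
import json

class TamperEvidentLog:
    def __init__(self):
        self.entries = []
        self.prev_hash = "0" * 64

    def append(self, action):
        record = {"action": action, "prev": self.prev_hash}
        digest = hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()
        self.entries.append((record, digest))
        self.prev_hash = digest

    def audit(self):
        """Recompute the hash chain; any edited or reordered entry breaks it."""
        prev = "0" * 64
        for record, digest in self.entries:
            if record["prev"] != prev:
                return False
            expected = hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()
            if expected != digest:
                return False
            prev = digest
        return True

log = TamperEvidentLog()
log.append("read /data/x")
log.append("write /data/y")
print(log.audit())                        # True
log.entries[0][0]["action"] = "tampered"  # retroactive edit...
print(log.audit())                        # False: stored hash no longer matches
```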
Conclusion: current cloud designs carry risks for both customers and providers (mainly because of split administration problem). Proposed solution: accountable cloud. Lots of research opportunities.
Third Talk: Learning from the Past for Resolving Dilemmas of Asynchrony by Paul Ezhilchelvan
In an asynchronous model you cannot bound message delivery time, or even message processing time by a machine. However, in a probabilistic synchronous model, we can bound times within a certain probability via proactive measurements. The central hypothesis of the new model is that most of the time, performance in the past is indicative of performance in the near future (i.e., past delay is indicative of future delay).
Design steps include doing proactive measurements, using them to establish synchrony bounds, assigning time bounds based on those, trying them to see how they work, and enabling exceptions.
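A minimal sketch (my own illustration; the percentile rule and the numbers are assumed) of those design steps: measure delays proactively, then set the time bound to a high percentile of the measured history, so the bound holds with the chosen probability as long as the near future resembles the past.

```python
def synchrony_bound(past_delays_ms, coverage=0.99):
    """Return a delay bound that covered `coverage` of past measurements."""
    ordered = sorted(past_delays_ms)
    index = min(len(ordered) - 1, int(coverage * len(ordered)))
    return ordered[index]

# Ten measured delays with one outlier spike.
delays = [10, 12, 11, 13, 9, 250, 12, 11, 10, 12]
print(synchrony_bound(delays, coverage=0.8))   # 13: covers 80% of past delays
```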
On-going work: developing exceptions (to deal with exceptional cases when mistakes are detected). Open environments are asynchronous; crash signals are used for notification of extreme unexpected behavior.
WebSphere Virtual Enterprise (WVE) is a product for managing resources in a data center. The product is a distributed system whose nodes and controllers need to communicate and share information, and BulletinBoard (BB) is used for that. BB is a platform service for facilitating group-based information sharing in a data center. It is a critical component of WVE; its primary application is monitoring and control, but the designers believe it could be useful for other weakly consistent services.
Motivation & contribution: the prior implementation of group communication, implemented internally, was not designed to grow 10-fold; it was based on virtually synchronous group communication, and robustness and stability suffered, with high runtime overheads, as the system grew beyond several hundreds of processes; its static hierarchy also introduced configuration problems. So the goal was to provide a new implementation resolving the scaling and stability issues of the prior one (and to implement it in a short time, a constraint with important implications for the design decisions).
BB supports a write-sub (write-subscribe) service model. It is a cross between pub-sub systems, shared memory systems, and traditional group communication systems. As in pub-sub, communication is asynchronous and done through topics. As in shared memory, we have overwrite semantics, a single writer per topic and process, and notifications that are snapshots of state.
Consistency semantics (single topic): PRAM consistency, i.e., notified snapshots are consistent with each other process's order of writes. A note was made that the developers who built services on top of BB turned out to understand these consistency semantics.
Liveness semantics (single topic): uses eventual inclusion, meaning that eventually each write by a correct and connected process is included in the notified snapshot. Eventual exclusion means that failed processes will eventually be excluded from updates.
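A minimal sketch (my own illustration, not IBM's code) of the write-sub model just described: overwrite semantics with one slot per (topic, process), and subscribers notified with snapshots of state rather than individual messages.

```python
class BulletinBoardSketch:
    def __init__(self):
        self.topics = {}        # topic -> {process: latest_value}
        self.subscribers = {}   # topic -> [callback]

    def subscribe(self, topic, callback):
        self.subscribers.setdefault(topic, []).append(callback)

    def write(self, topic, process, value):
        board = self.topics.setdefault(topic, {})
        board[process] = value                  # overwrite: only latest value kept
        snapshot = dict(board)
        for cb in self.subscribers.get(topic, []):
            cb(snapshot)                        # notification is a state snapshot

seen = []
bb = BulletinBoardSketch()
bb.subscribe("cpu-load", seen.append)
bb.write("cpu-load", "node1", 0.3)
bb.write("cpu-load", "node1", 0.9)   # overwrites node1's previous value
bb.write("cpu-load", "node2", 0.5)
print(seen[-1])                      # {'node1': 0.9, 'node2': 0.5}
```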
Performance and scalability goals: adequate latency, scalable runtime costs, and low overhead; throughput is less of an issue (the management load is fixed and low). Robustness and scalability in the presence of a large number of processes and topics (2883 topics in a system of 127 processes; note that the initial target was around 1000 processes).
Approach: the decision was to build this on an overlay network called SON, the Service Overlay Network. SON is a semi-structured P2P overlay, already in the product; it is self-* (recovers from changes quickly without problems), resilient, and supports peer membership and broadcast. The research question here was whether BB can be implemented efficiently on top of a P2P overlay like SON.
Architecture: SON, with IAM (interest-aware membership) built on top of it, and BB on top of that (though BB can also interact directly with SON).
Reliable shared-state maintenance in SON for BB is made fully decentralized, and update propagation is optimized for bimodal topic popularity: iterative unicast over direct TCP connections if the number of subscribers of a topic is below a certain threshold, and overlay broadcast otherwise. For reliability, there is a periodic refresh of the latest written value (on a long cycle) if it is not overwritten (this was a bad decision in retrospect), with state transfer to new or reconnected subscribers.
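The bimodal propagation choice can be sketched as follows (my own illustration; the threshold value is an assumed parameter): small topics use direct unicast to each subscriber, popular topics use one overlay broadcast.

```python
def propagate(topic_subscribers, message, threshold=8):
    """Choose a propagation plan for one update, based on topic popularity."""
    if len(topic_subscribers) < threshold:
        # Unpopular topic: direct TCP unicast to each subscriber.
        return [("unicast", s, message) for s in topic_subscribers]
    # Popular topic: a single broadcast over the overlay.
    return [("broadcast", "overlay", message)]

print(propagate(["p1", "p2"], "update"))                   # two unicasts
print(propagate([f"p{i}" for i in range(20)], "update"))   # one overlay broadcast
```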
An experimental study on different topologies showed low CPU overhead and latency, but these numbers increased with the size of the topology. Analysis revealed that this was because the periodic refreshes stacked up and caused increased CPU and latency overheads. An additional problem was broadcast flooding; when that was removed, CPU and latency overheads stayed flat as the topology grew.
Lessons learned: communication cost is the major factor affecting the scalability of overlay-based implementations, and anti-entropy techniques are the best fit for such services.
Second Talk: Optimizing Information Flow in the Gossip Objects Platform by Ymir Vigfusson
In gossip, nodes exchange information with a random peer periodically, in rounds. Gossip has appealing properties such as bounded network traffic, scalability in group size, robustness against failures, and coding simplicity. This is nice when gossip is considered individually per application; in cloud computing, with nodes joining many groups, the traffic is no longer bounded per node (only per topic).
The Gossip Objects (GO) platform is a general platform for running gossip for multiple applications on a single node. It bounds the gossip traffic going out of a particular node. The talk focused on how to select rumors to send out from multiple applications on a single node so as to reduce the number of messages. This is possible because rumor messages are small and short-lived. An observation made is that rumors can be delivered indirectly: uninterested nodes can forward rumors to interested nodes.
The GO heuristic: recipient selection is biased towards higher group traffic. The content is selected by computing a utility for each rumor, defined as the probability that the rumor will add information to a host that didn't already know it.
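A sketch of my reading of that heuristic (the data structures and the default-unknown assumption are mine, not from the talk): bias recipient selection by group traffic, then stack the rumors with the highest utility for that recipient.

```python
import random

def pick_recipient(groups):
    """groups: {group_name: (member_list, traffic_rate)}; bias toward high traffic."""
    names = list(groups)
    weights = [groups[g][1] for g in names]
    chosen = random.choices(names, weights=weights)[0]
    return random.choice(groups[chosen][0])

def stack_rumors(rumors, recipient, capacity=3):
    """rumors: list of (rumor_id, {host: probability the host does NOT know it}).
    Utility of a rumor = probability it adds information to the recipient;
    hosts with no estimate are assumed not to know the rumor (probability 1.0)."""
    scored = sorted(rumors, key=lambda r: r[1].get(recipient, 1.0), reverse=True)
    return [rumor_id for rumor_id, _ in scored[:capacity]]

rumors = [("r1", {"hostA": 0.2}), ("r2", {"hostA": 0.9}), ("r3", {})]
print(stack_rumors(rumors, "hostA", capacity=2))   # ['r3', 'r2']
```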
Simulation: a first simulation of an extreme example with only two nodes joining many groups showed promising results for the GO heuristic. Then a real-world evaluation was conducted based on a 55-minute trace of the IBM WebSphere Virtual Enterprise BulletinBoard layer. The trace had 127 nodes and 1364 groups, and the evaluation showed that GO placed a cap on traffic compared to the random and random-with-stacking heuristics. Additionally, the GO heuristic delivered rumors faster than the other heuristics, and in the number of messages needed to deliver rumors to interested nodes, it achieved multiple orders of magnitude of reduction over the other heuristics and traditional rumor spreading.
Conclusion: GO implements novel ideas such as per-node gossip, rumor stacking (packing rumors up to the MTU size), utility-based rumor dissemination, and adaptation to traffic rates. GO gives per-node guarantees even when the number of groups scales up. Experimental results were compelling.
Questions:
Mike Spreitzer, IBM Research: What would happen if the number of groups increases?
Answer: Study of available real-world traces showed a pattern of overlap. We also conducted simulation with other group membership patterns and the results were similar.
—: What was the normal rumor size? And what would happen if that increased?
Answer: The average rumor size was 100 bytes. If the message size increased, we would stack fewer rumors, but our platform can also reject really large rumors.
—: Have you thought about network-level encoding?
Answer: Not yet, but we plan to in the future.
—: Have you thought of leveraging other dissemination techniques to run under GO?
Answer: Actually, we thought about the opposite direction where we would run other communication protocols and map them under the hood to GO. Results are pending.
In his talk, David shared some stories about his experience of using SQL in a data center environment to provide cloud services. The speaker went a bit fast; I captured most of his message and the important parts, but I had to skip some.
In Windows Live, when building a new service, they prefer to use off-the-shelf products such as SQL. Why SQL? A familiar, tested programming model (real queries, real transactions, good data modeling, excellent at OLTP, and easy to find developers who know it), and solid systems software (used often, fine-tuned many times, and kept updated). Challenges with using SQL: living without the single-image database model (no global transactions or global indexes), administration and maintenance overhead, and breaking things at scale.
The DB is partitioned by user, with many users per DB instance, because it is easy and self-contained. User info is small enough that you can place multiple users in a single location. Front ends send requests to the proper DB. Location is determined by lookup (a Lookup Partition Service, LPS, maps users to partitions). DBs are partitioned by hash to avoid hotspots.
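A sketch (the details are assumed, not from the talk) of that lookup/partition scheme: users hash to a fixed number of partitions to avoid hotspots, and a lookup table maps each partition to the database currently hosting it.

```python
import hashlib

NUM_PARTITIONS = 1024   # assumed; fixed so user placement is stable

def partition_for(user_id):
    """Hash the user id to a partition, spreading users to avoid hotspots."""
    digest = hashlib.md5(user_id.encode()).hexdigest()
    return int(digest, 16) % NUM_PARTITIONS

# Hypothetical partition-to-database mapping (the LPS role in the talk).
partition_to_db = {p: f"db{p % 8}" for p in range(NUM_PARTITIONS)}

def db_for(user_id):
    return partition_to_db[partition_for(user_id)]

print(db_for("alice@example.com"))   # same user always routes to the same DB
```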
Architecture: Three stages of scale out: bigger server, functional division, and data division.
A problem with scaling out: updates that span multiple services and users (e.g., adding a messenger buddy, or uploading a photo that is written to both the file store and the recent-activity store). Two-phase commit is out (the risk of a crash locking your data is too high); instead, use ad hoc methods. For example: write A intent, write B, write A; or: write A and a work item, and let the work item write B; or: write A, then B, and tolerate inconsistency.
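The first ad hoc pattern ("write A intent, write B, write A") can be sketched as follows (my own illustration, with in-memory dicts standing in for the two stores): a crash between the two real writes leaves a visible intent marker that a repair job can later reconcile.

```python
def add_buddy(store_a, store_b, user, buddy):
    store_a.setdefault(user, {})[buddy] = "intent"   # step 1: record intent in A
    store_b.setdefault(buddy, set()).add(user)       # step 2: real write to B
    store_a[user][buddy] = "confirmed"               # step 3: finalize A

a, b = {}, {}
add_buddy(a, b, "alice", "bob")
print(a)   # {'alice': {'bob': 'confirmed'}}
print(b)   # {'bob': {'alice'}}
```

If the process dies after step 2, store A still says "intent", so a scanner can detect and repair the half-done update instead of silently losing it.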
Another problem is how to read data about multiple users, or even all users. Example scenario: a user updates his status and his friends need to know. The old (inefficient) way was to write the change into the profile of every affected user: easy to query, but a heavy write load.
Data availability and reliability. Replication is used for all user data, using SQL replication. Front ends have a library (WebStore) to notice failures and switch to the secondary. The original scheme was one-to-one, which was too slow (parallel transactions vs. a single replication stream). The next try was four DBs replicating to four DBs, which fixed most speed problems but put too much load on secondaries after a failure. The current approach uses 8-host pods, with a 25% load increase for secondaries on failure (an 8×8 matrix, with replication done on the transpose of the matrix). However, it is still not fast enough for key tables (hundreds of write threads vs. 5 replication streams). Manual replication (FEs run SProcs at both primary and secondary) carries a small probability of inconsistent data. Replication runs a few seconds behind (ops are reluctant to auto-promote a secondary due to potential data still in the replication stream); new SQL tech should fix this.
Data loss causes: above the app (external applications and old data); in the app (software bugs, especially migration-logic bugs); below the app (controller failure, disk failure). Mitigation techniques: audit trails and soft deletes for above-app problems; per-user backup for software bugs; tape backup, SQL replication, and RAID for below-app problems (though these are expensive).
Managing replication: a fail-safe set is a set of databases in some sort of replication membership. A typical fail-safe set is two to four DBs (most are two). Fail-safe sets are the true targets of partition schemes.
Upgrade options: upgrade partitions by running DDL in each partition (via WebStore), which is complicated by replication; after all DBs are done, upgrade the FEs (SProcs are compatible; changed APIs get new names). Migrating users can take various forms (between servers, within a server, or even within services), and it can be complex, slow, and error-prone; nobody likes it.
Some Operation stories.
Capacity management: growth is in units of servers. When to buy more? The test team provides one opinion; the ops team aims to find the maximum resource usage and stay below the limit. There are two kinds of limits, graceful and catastrophic. The interesting difference: if you back off from a graceful limit, you can usually go back to your original (good) state; with a catastrophic limit, even if you back off, you can remain in a bad state.
Ops lessons: 1) Never do the same thing to all machines at once; stats queries and re-indexing have both crashed clusters in the past. 2) Smaller DBs are better: we already cope with many DBs, and re-indexing, backups, and upgrades are all faster for small DBs. 3) Read-only mode is powerful (failure handling, maintenance, and migration all use it). 4) Use the live site to try things out (new code, new SQL settings, etc.): "taste vs. test".
Conclusions: SQL can be tamed; it has some real issues, but they are mostly manageable with some infrastructure, and its ops cost is not out of line. It is hard to do better than SQL, and it keeps improving; each time we go to design something, we find that SQL has already designed it, perhaps not in exactly the form we want, but close enough that building our own is probably not worth the effort. However, SQL is not always the best solution.
SQL wish list. Easy ones: partitioned data support; easy migration/placement control; reporting; jobs; support for the aggregated-data pattern; improved manageability. Hard ones: taming DB schema evolution; soft-delete/versioning support of some kind; and A–D transactions (Atomic & Durable).
This seemed like an interesting piece of work. Unfortunately, I came in a bit late from the break, so my writing is sloppy and doesn't do it much justice. However, the paper about CRDTs and TreeDoc has been published at ICDCS.
Problem motivation: TreeDoc is a storage structure that uses a binary tree encoding to address and store data. Inserting data is done by adding leaves to the tree. Reading the document consists of reading the binary tree using an in-order traversal. Deleting portions of the tree involves marking nodes with tombstones. However, trees can grow very badly, so removing deleted nodes and "rebalancing" the tree is needed. But after rebalancing, the tree addresses no longer have the same meaning as before, so incoming updates might be inserted in the wrong location. So how can we agree on current addresses without concurrency control?
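A minimal sketch (my own illustration, not the paper's implementation) of the structure just described: characters live at tree nodes, the document is the in-order traversal, an insert between two positions adds a leaf, and a delete leaves a tombstone.

```python
class Node:
    def __init__(self, ch):
        self.ch, self.left, self.right, self.dead = ch, None, None, False

def read(node, out):
    """In-order traversal; tombstoned nodes are kept in the tree but skipped."""
    if node is None:
        return
    read(node.left, out)
    if not node.dead:
        out.append(node.ch)
    read(node.right, out)

# Build "abc": root 'b', left child 'a', right child 'c'.
root = Node("b")
root.left, root.right = Node("a"), Node("c")

# Insert 'x' between 'a' and 'b': it becomes the right leaf of 'a'.
root.left.right = Node("x")

# Delete 'c' with a tombstone rather than physical removal.
root.right.dead = True

out = []
read(root, out)
print("".join(out))   # "axb"
```

This also shows why rebalancing is painful: every character's identity is its path in the tree, and rebalancing changes those paths.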
The tree is located at two types of sites: the core and the nebula. The core is a small group that runs two-phase commit to manage updates; the nebula is a larger set of remote sites that do not run a consistency protocol. Catch-up protocol: if the core and the nebula are network-partitioned, the core proceeds with updates and buffers operations; say the nebula also gets some updates and buffers them. When the nebula later receives the updates from the core, it replays them and then replays its own operations.
Main point: there is a need for useful data structures whose operations commute. Commutativity gives us convergence between multiple sites without concurrency control, and TreeDoc is an example of such a data structure. The caveat with such data structures is that we should take care of garbage collection, because it becomes a big issue.
Second Talk: Provenance as First Class Cloud Data by Kiran-Kumar Muniswamy-Reddy
This talk motivated why provenance would be useful in cloud computing services. The speaker argued that provenance allows us to reason better about data from cloud services, and that native support for provenance in cloud services will be beneficial.
Provenance tells us where data came from, its dependencies, and its origins. Provenance is essentially a DAG that captures links between objects. Motivating example applications: web search vs. cloud search. Both have tons of resources; however, web search uses hyperlinks to infer dependencies, while nothing similar exists for cloud search. Provenance can provide a solution for that, as argued in a previous paper by Shah at USENIX '07. Another example is pre-fetching: provenance can tell us which documents are related to each other, which allows you to pre-fetch related items for performance. Other examples include ACLs and auditing apps.
Requirements for provenance: consistency, long-term persistence, queryability, security, and coordination between compute and storage facilities.
Third Talk: Cassandra – A Decentralized Structured Storage System by Prashant Malik
Why Cassandra? Lots of data (copies of messages, reverse indices of messages, per-user data, etc.) and random queries, among other things.
Design goals: high availability; eventual consistency (trading off strong consistency in favor of high availability); incremental scalability; optimistic replication; "knobs" to tune trade-offs between consistency, durability, and latency; low total cost of ownership; and minimal administration.
Data model: similar to the BigTable data model. Columns are indexed by key, data is stored in column families, and columns are sorted by value or by timestamp. Super columns allow columns to be added dynamically.
Write operations: a client issues a write request to a random node in the Cassandra cluster, and the "partitioner" determines the nodes responsible for the data. Locally, write operations are logged and then applied to an in-memory version. The commit log is stored on a dedicated disk local to the machine.
Write properties: there are no locks in the critical path, and disk access is sequential. It behaves like a write-back cache, with append support and no read-ahead. There is an atomicity guarantee per key per replica. "Always writable": writes are accepted even during failures; in that case, the write is handed off to some other node and loaded back to the correct place when the node comes back up.
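The local write path just described can be sketched as follows (my own illustration, not Cassandra's code): append to the commit log first for durability, then apply to the in-memory version.

```python
class WritePathSketch:
    def __init__(self):
        self.commit_log = []   # sequential, append-only (a dedicated disk in Cassandra)
        self.memtable = {}     # in-memory version, later flushed to disk

    def write(self, key, column, value):
        self.commit_log.append((key, column, value))        # logged before applied
        self.memtable.setdefault(key, {})[column] = value   # no locks, just overwrite

node = WritePathSketch()
node.write("user42", "name", "alice")
node.write("user42", "email", "a@example.com")
print(node.memtable["user42"])
print(len(node.commit_log))   # 2
```

Both steps are appends or dict updates, which is why the critical path needs no locks and the disk access stays sequential.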
Reads are sent from the client to any node in the Cassandra cluster; then, depending on the knobs, the read returns either the most recent value or a quorum.
Gossip is used between replicas via the Scuttlebutt protocol, which has low overhead. Failure detection assigns a suspicion level to each node that increases with time until you hear from that node again.
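The failure-detection idea can be sketched as follows (my own illustration; real accrual detectors like the one Cassandra uses compute a statistical suspicion value, while this sketch just scales elapsed time):

```python
class SuspicionDetector:
    def __init__(self, expected_interval=1.0):
        self.last_heard = {}
        self.expected = expected_interval   # expected heartbeat period, seconds

    def heartbeat(self, node, now):
        self.last_heard[node] = now

    def suspicion(self, node, now):
        """Grows with silence instead of giving a binary alive/dead verdict."""
        if node not in self.last_heard:
            return float("inf")
        return (now - self.last_heard[node]) / self.expected

d = SuspicionDetector(expected_interval=1.0)
d.heartbeat("n1", now=0.0)
print(d.suspicion("n1", now=0.5))   # 0.5: recently heard, low suspicion
print(d.suspicion("n1", now=5.0))   # 5.0: suspicion grows with silence
```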
Lessons learned: add fancy features only when absolutely necessary. Failures are the norm not the exception. You need system-level monitoring. Value simple designs.
Fourth Talk: Towards Decoupling Storage and Computation in Hadoop with SuperDataNodes by George Porter
Hadoop is growing, gaining adoption, and used in production (Facebook, Last.fm, LinkedIn). E.g., Facebook imports 25/day into a 1k-node Hadoop cluster. A key to that growth and efficiency is coupling compute and storage: the benefits of moving computation to the data, scheduling locality to reduce traffic, and map parallelism ("grep"-type workloads).
So, when should storage be coupled with computation? This is a critical and complicated design decision, and it is not always done right. For example, best practices with dedicated clusters are still emerging, and your data center design may not be based on Hadoop's needs (adding map/reduce to an existing cluster, or a small workgroup that likes the programming model, e.g., Pig, Hive, and Mahout).
The goal is to support late binding between storage and computation and to explore alternative balances between the two (specifically, the extreme point of fully decoupling storage and compute nodes). An observation from the Facebook deployment is that the scheduler is really good at scheduling small tasks on data-local nodes, but bad at achieving even rack locality for large tasks.
The SuperDataNode approach: key features include a stateless worker tier, a storage node with a shared pool of disks under a single OS, and high bisection bandwidth in the worker tier.
There has been a lot of talk about the advantages of coupling storage and computation; what are the advantages of decoupling them? They include: decoupling the amount of storage from the number of worker nodes; more intra-rack bandwidth than inter-rack bandwidth; support for "archival" data (subsets of data with low probability of access); increased uniformity for job scheduling and block placement; ease of management (workers become stateless, and SDN management is similar to that of a regular storage node); and replication only for node failures.
Limitations of the SDN: scarce storage bandwidth between workers and the SDN; the effective throughput with N disks in the SDN (at 100 MB/s each) gives a 1:N ratio of bandwidth between local and remote disks; effects on the fault-tolerance model (disk vs. node vs. link failures); cost; and performance that depends on the workload.
The evaluation compared a baseline Hadoop cluster and an SDN cluster with 10 servers. The results showed that the SDN performed better for grep- and sort-like workloads; a bad case was random writes, where Hadoop performed better (the workload was just each worker writing to disk as fast as possible, i.e., 100% parallelism).
In his talk, Marvin reflected on experiences building and maintaining applications in data centers. He stressed that each of these issues is non-surprising individually, but the very large scale makes all of them possible all at once, and that is the surprising part! I really liked this talk.
A nice analogy he gave for building and running data center and cloud services: evolving a Cessna prop plane into a 747 jumbo jet in-flight.
You start with a Cessna prop plane for cost and timeliness reasons. Four-nines availability means that you get to land for 52 minutes every year (including scheduled maintenance, refueling, and crash landings). Success implies growth, evolution, and rebuilding the plane mid-flight: passenger capacity goes from a 4-person cabin to a 747 jumbo wide-body cabin; supporting "scale out" means you add jet engines and remove the propellers while flying; and testing and safety checks have to happen while flying!
Here are the lessons learned:
The unexpected happens! A fuse blows and darkens a set of racks, chillers die in a datacenter and a fraction of servers go down, an electrical plug bursts into flames, a tornado or lightning hits the datacenter, the datacenter floods from the roof down, a telco's connectivity goes down, the DNS provider creates black holes, simultaneous infant mortality occurs among servers newly deployed in multiple datacenters, power generation doesn't start because the ambient temperature is too high, load issues, etc.
Networking challenges. The IP protocol is so deeply embedded in systems that you de facto have to use it. IP networks can have lost packets, duplicate packets, and corrupted packets; even if you use TCP, your applications still need to worry about all three. Software (and hardware) bugs can result in consistent loss or corruption of some packets. You have to be prepared for message storms. Client software is sometimes written without a notion of backing off on retries. One might expect that CRCs and the design of TCP would catch most of these issues; however, we run at such a large scale that there are enough rare events to give multiple errors. For example, if a switch or some other network hardware erroneously flips the 8th bit of every 64 packets, then at this scale such rare events can happen repeatedly!
Things you should be able to do without causing outages: add new hardware; deploy a new version of software; roll back to a previous version of software; recover from the absence, loss, or corruption of non-critical data; lose a mirror of a DBMS; recover from having lost a mirror of a DBMS; lose a host in the fleet; lose a datacenter; lose network connectivity between datacenters. Can we roll back some parts in the middle of upgrading other parts?
System resources/objects have lives of their own! Resources/objects in a service may live longer than the accounts used to create them, so you have to be able to remap them between accounts. Resources/objects may live longer than versions of the service, so you have to be able to migrate them forward with minimal or no disruption. For example, EC2 instances were designed to run for short periods on demand, but customers started keeping instances up for a long time, and this happens often enough that shutting down long-lived instances will upset the clients. So how can you deal with that?
Downstream dependencies fail. It's a service-oriented architecture. The good news is that your service has the ability to keep going even if other services become unavailable; the challenge is how to keep going and/or degrade gracefully if you depend on the functionality of downstream services at low levels. Suppose all services are four-nines available: if a downstream service fails for 52 minutes, how will you meet your own SLA of failing no more than 52 minutes? Cascading outages happen; if multiple downstream services fail, how will you handle it? For example, if a storage service fails, 2 services depending on it can also fail, then more services depending on them fail, and so on. Services need to defend against that.
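A back-of-the-envelope check of the dependency math above (my own arithmetic): a service that is itself perfect but hard-depends on k independent four-nines services.

```python
def downtime_minutes_per_year(availability):
    return (1 - availability) * 365 * 24 * 60

dep = 0.9999                      # a four-nines downstream service
for k in (1, 3, 10):
    combined = dep ** k           # all k dependencies must be up at once
    print(k, round(downtime_minutes_per_year(combined), 1))
# with k = 1 you already inherit the full ~52.6 minutes/year;
# each additional hard dependency adds roughly another 52.6 minutes
```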
You must be prepared to deal with data corruption. Data corruption happens: flaky hardware, IO subsystems that lie, wrong software, system evolution, and people screwing up. End-to-end integrity checks are a must, along with straightforward data-corruption checking. How do you know your system is operating correctly? Can your design do an fsck in under 52 minutes?
Keep it simple. It’s 4am on a Sunday morning and the service has gone down: can you explain the corner cases of your design to the front-line on-call team over the phone? Can you figure out what’s going on in under 52 minutes? Can you determine whether the crash came from a corner case in how your code is used, and how to fix it if it did? Simple brute force is sometimes preferable to elegant complexity. Examples: eventual consistency is painful (but sometimes necessary); P2P can be harder to debug than centralized approaches (but may be necessary). Is it necessary to build your system to handle situations that arise only after massive growth, when it is likely that the system will have changed or been replaced by the time it is big enough to require handling them?
Scale: will your design envelope scale far enough? Do you understand your components well enough? Cloud computing has global reach, services may grow at an astonishing pace, and the overall scale is HUGE. The scale of cloud computing tends to push systems outside their standard design envelopes. The rule of thumb that you must redesign your system every time it grows by 10x implies you must be prepared to redesign early and often.
CAE trade-off for resources. CAE: cost-efficient, available, elastic. You can only pick two of them!
Do not ignore the business model or your TCO. Do you know all the sources of cost? Can you accurately measure them? Do you know all the "dimensions of cost" that will be used in pricing? Can you meter them? Have you thought about ways the system can be abused? How will you resolve billing disputes? All of these may affect the design of the service in fundamental ways, and cost is important to measure even if you think your revenue will come from ads. For example, some customer figured out that if they store large names in the key part of the key/value store rather than in the value, they can reduce their cost by 1000x, because S3 only charges for the size of the value, not the key! So you have to think about what you are not charging people for and how they can abuse it.
Elastic resources: what boundaries should you expose? High-availability apps require the notion of independent failure zones, which leads to the notion of availability zones (AZs). Concurrent apps want bounded (preferably low) message latency and high bandwidth, which leads to the notion of cluster affinity to an AZ. The challenge of AZ clustering is the clumping effect: everyone wants to be near everyone else (for example, if you ask people to pick an AZ and they don’t care, everyone ends up in AZ1!), which makes elastic scheduling harder. Fine-tuned applications are the enemy of elasticity: customers will try to divine your intra-AZ topology (co-location on the same rack, etc.). Eventual evolution to different network infrastructures and topologies means you don’t want to expose more than you have to.
Summary and conclusions: The unexpected happens; in large systems even extremely rare events occur with non-negligible frequency, so what’s your story for handling them? Keep it simple: it’s 4am and the clock is ticking; can you debug what’s going on in your system? Cloud computing is a business: you have to think about cost-efficiency as well as availability and elasticity.
Questions:
Mike Freedman, Princeton University: Which of these issues are specific to an infrastructure provider (such as Amazon), as opposed to web service providers such as walmart.com or hotmail?
Answer: Many things are common, such as hazards and load. As for things like accounting and billing, these are still useful for service providers: at the least they can minimize your running costs and let you know where you are spending your money.
Ken Birman, Cornell University: What makes you wary of consistency: is it the 4am call, or is it latency, competitiveness, and the added complexity?
Answer: It is the 4am call. When you have systems at large scale, you have to work out all the possible cases in your system and you cannot cheat your way out of it. These corner cases make it hard. Remember that all this has to be developed in a timely manner, and it is developed by junior developers who are still building their knowledge and expertise.
—: How do you test the resilience of your data centers? Are there people who go and turn off part of your datacenter?
Answer: Essentially yes! You test as much as you can, then you roll out.
Doug Terry, MSR-SV: Shouldn’t the analogy be that you start with a fleet of Cessnas and want to evolve them into a fleet of jumbo jets in flight, without losing all of them together?
Answer: The problem is that you cannot parallelize everything. There is some percentage of your code that does not get fixed.
Hakim Weatherspoon, Cornell University: What about embracing failure: running your systems hot and expecting that nodes will fail?
Answer: That solves some of the existing problems, but newer problems we don’t yet know about can arise. For example, we never thought that the boot temperature of backup power generators would ever be an issue, but it was! So you can never enumerate all problems.
This talk focused on how enterprise applications can be moved to cloud computing settings. The issues discussed were: first, deployment, which is more complex than just booting up VMs due to data and functionality dependencies. Second, availability: enterprise apps are heavily engineered to maximize uptime. According to a published study, current cloud services can expect up to 5 hours of downtime per year, whereas enterprise customers really expect 1 hour of downtime per year, so how can this gap be bridged? The third issue is problem resolution.
Bridging the availability gap: ideas include 1) implementing scaling architectures in the cloud, 2) developing APIs that allow multiple clouds to interact so as to enable failover techniques, and 3) live VM migration to mask failures.
As for problem resolution: categorizing the issues raised on the EC2 discussion boards, 10% of topics are feature requests and 56% are user how-to questions. Of the reported problems, 25% are cloud errors, 64% user errors, and 11% unknown errors. One important thing enterprise customers want is to be able to tell, when something is not running correctly, whether the issue is with the cloud platform, the VM, faulty hardware, or something else. Techniques and tools have to be developed in that regard.
Second Talk: Cloudifying Source Code Repositories: How Much Does it Cost? by Michael Siegenthaler
Cloud computing used to be available mainly to large companies with the resources to build and maintain datacenters. Now it is accessible to people outside those companies at low cost.
Why move source control to the cloud? Resilient storage, no physical server to administer, and the ability to scale to large communities. The system uses SVN, which is very popular, stores data on S3 (which raises a problem with eventual consistency), and uses Yahoo’s ZooKeeper (a coordination service) as a lock service. How do you measure costs for SVN on S3? Measure the cost per diff file and per stored file. A back-of-the-envelope cost analysis shows it is inexpensive even for large projects such as Debian and KDE. A trend to notice is that code repositories are getting larger, but the price of storing a GB is decreasing over time.
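A rough version of that back-of-the-envelope analysis can be sketched as follows. The request and storage prices, the diff size, and the `monthly_repo_cost` helper are all hypothetical placeholders (plug in real S3 rates to redo the estimate); only the structure of the calculation comes from the talk.

```python
# Hypothetical pricing, for illustration only (not actual S3 rates).
PUT_PRICE = 0.00001            # $ per PUT request (placeholder)
STORAGE_PRICE_GB_MONTH = 0.15  # $ per GB-month (placeholder)

def monthly_repo_cost(commits_per_minute, avg_diff_kb, repo_size_gb):
    """Estimate one month of hosting cost: PUT requests for each
    committed diff plus storage for the repo and the new diffs."""
    puts_per_month = commits_per_minute * 60 * 24 * 30
    new_data_gb = puts_per_month * avg_diff_kb / 2**20   # KB -> GB
    request_cost = puts_per_month * PUT_PRICE
    storage_cost = (repo_size_gb + new_data_gb) * STORAGE_PRICE_GB_MONTH
    return request_cost + storage_cost

# Debian-scale aggregate from the talk: ~1.12 commits/min; the 10 KB
# average diff and 50 GB repo size are assumptions for the example.
print(round(monthly_repo_cost(1.12, avg_diff_kb=10, repo_size_gb=50), 2))
```

Even with generous placeholder prices the total comes out to a few dollars a month, which is consistent with the talk’s conclusion that even large projects are inexpensive to host this way.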
Architecture: client machines talk to front-end servers on EC2, and storage is on S3. The front-ends need not be on EC2; the cloud is there mainly for storage. A problem with a naive implementation is that eventual consistency in S3 means multiple revision numbers can be issued for conflicting updates, so locking is required. The commit process has a hook that acquires a lock from ZooKeeper and polls for the most recent version number. The most recent version is retrieved from S3 (retrying if it is not found due to eventual consistency); the commit is then made, the lock is released, and ZooKeeper increments the version number.
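The commit path can be sketched as follows. This is a simplified model, not the talk’s actual code: the S3 store and the ZooKeeper lock are stood in for by in-process stubs, and all the names (`EventuallyConsistentStore`, `commit`, etc.) are invented for illustration.

```python
import threading

class EventuallyConsistentStore:
    """Toy stand-in for S3: a freshly written key may fail a few
    reads before it becomes visible (simulated eventual consistency)."""
    def __init__(self):
        self._data, self._lag = {}, {}
    def put(self, key, value, lag=0):
        self._data[key] = value
        self._lag[key] = lag          # reads return None `lag` times first
    def get(self, key):
        if self._lag.get(key, 0) > 0:
            self._lag[key] -= 1
            return None               # write not yet visible
        return self._data.get(key)

def commit(store, lock, latest_key, diff, max_retries=10):
    """Serialize commits: take the lock, read the head revision
    (retrying past eventual consistency), write the new revision,
    then bump the head. The lock stands in for ZooKeeper's."""
    with lock:
        head = store.get(latest_key)
        for _ in range(max_retries):  # retry until the head is visible
            if head is not None:
                break
            head = store.get(latest_key)
        rev = (head or 0) + 1         # no head at all => first revision
        store.put(f"rev-{rev}", diff)
        store.put(latest_key, rev)
        return rev

store, lock = EventuallyConsistentStore(), threading.Lock()
print(commit(store, lock, "latest", "diff-a"))   # -> 1
print(commit(store, lock, "latest", "diff-b"))   # -> 2
```

Holding the lock across read-head/write-revision/bump-head is what prevents two concurrent committers from both reading a stale head and issuing the same revision number.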
Performance evaluation: usage patterns: the Apache foundation has 1 repository for 74 projects, with an average of 1.10 commits per minute and a max of 7 per minute. The Debian community has 506 repositories with 1.12 commits per minute in aggregate and a max of 6. These were used as experiment traces. The results showed that as you add more front-end servers on EC2, performance does not suffer from possible lock contention; this was tried with varying numbers of clients.
Third Talk: Cloud9: A Software Testing Service by Stefan Bucur
There is a need to facilitate automated testing of programs, and cloud computing can make it perform better. Testing frameworks should provide autonomy (no human intervention), usability, and performance. Cloud9 (http://cloud9.epfl.ch/) is a web service for testing cloud applications.
Symbolic execution: when testing a function, instead of feeding it concrete input values, pass it an input abstraction (say, a symbolic variable lambda), and whenever control flow branches (such as at an if statement), fork a subtree of execution. One idea is to send each of these subtrees to a separate machine and test all possible execution paths at once. A naive approach has many problems: for example, trees can expand exponentially, so incrementally acquiring new resources can be problematic; one solution is to pre-allocate all needed machines. There are many challenges in parallel symbolic execution in the cloud, such as dynamically load-balancing trees among workers and state transfers, along with other problems such as picking the right strategy portfolios.
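The path-forking idea can be illustrated with a toy sketch. This is not Cloud9’s implementation: path constraints are collected as plain strings rather than handed to a real constraint solver, and `explore` is an invented helper.

```python
# Toy symbolic execution: each branch condition forks the path in two,
# accumulating the constraints taken along the way.
def explore(branches, path=()):
    """Enumerate all paths through a list of branch condition labels.
    Each condition forks the tree into a 'taken' and 'not taken' subtree;
    in a parallel setting, these subtrees could go to separate workers."""
    if not branches:
        return [path]
    cond, rest = branches[0], branches[1:]
    taken = explore(rest, path + (cond,))               # condition true
    not_taken = explore(rest, path + ("not " + cond,))  # condition false
    return taken + not_taken

# Two nested branches, e.g.  if x > 10: ... if x < 20: ...
for p in explore(["x > 10", "x < 20"]):
    print(" AND ".join(p))
# 2 branch points -> 2**2 = 4 paths; depth d gives up to 2**d paths,
# which is exactly the exponential blow-up mentioned above.
```

A real engine would also prune paths whose accumulated constraints are unsatisfiable; this sketch keeps every combination to make the exponential growth visible.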
Preliminary results show that parallel symbolic execution in the cloud can give a super-linear improvement over conventional methods such as KLEE.