Bayesian updates and the Lake Wobegon effect
We seem to have a good mathematical understanding of Bayesian updates, but somehow a rather poor understanding of their practical implications. There are many situations in practice that we readily perceive as irrational; one of the most famous is the so-called Lake Wobegon effect, named after the fictional town in Minnesota where “all the women are strong, all the men are good looking, and all the children are above average”. It is usually described as a cognitive bias in which individuals tend to overestimate their own capabilities: when drivers are asked to rate their own skill relative to the average, choosing among low-skilled, medium-skilled and high-skilled, most rate themselves above average.
In fact, the behavioral economics literature is full of examples like this, where the observed data is far from what you would expect if all agents were rational, and these are normally attributed to cognitive biases. I was always a bit suspicious of such arguments: it was never clear whether the agents were simply not being rational or whether their true objective wasn’t being captured by the model. I always thought the second explanation was a lot more likely.
One of the main problems with the irrationality argument is that it ignores the fact that agents live in a world whose state they cannot completely observe. In a beautiful paper in Econometrica called “Apparent Overconfidence”, Benoit and Dubra argue that:
“But the simple truism that most people cannot be better than the median does not imply that most people cannot rationally rate themselves above the median.”
The authors show that it is possible to reverse engineer a signaling scheme such that the observed data is consistent with fully rational agents. Let me try to sketch the simple example they give in the introduction: each driver has one of three skill types, low, medium or high, say $\theta \in \{L, M, H\}$. However, drivers can’t observe their own type. They can only observe some sample of their driving. Let’s say for simplicity that each driver observes a single signal telling him whether he caused an accident or not, and assume that the higher the skill of a driver, the lower his probability of causing an accident, i.e., the accident probabilities satisfy $p_L > p_M > p_H$.
Before observing anything, each driver thinks of himself as having probability $\frac{1}{3}$ of having each type of skill. After observing the signal $s$ (accident or no accident), he updates his belief according to Bayes’ rule, i.e.,

$$P(\theta \mid s) = \frac{P(s \mid \theta)\,P(\theta)}{\sum_{\theta'} P(s \mid \theta')\,P(\theta')}.$$

Doing the calculation for the drivers that didn’t suffer an accident: since $p_L > p_M > p_H$, they’ll evaluate

$$P(H \mid \text{no accident}) > P(M \mid \text{no accident}) > P(L \mid \text{no accident}),$$

and therefore will report high skill. Notice this is totally consistent with rational Bayesian updaters.
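To make this concrete, here is a worked instance with illustrative numbers of my own (not necessarily the ones in the paper): take $p_L = 0.6$, $p_M = 0.3$, $p_H = 0.1$. Then for a driver with a clean record,

$$P(H \mid \text{no accident}) = \frac{(1-p_H)\cdot\frac{1}{3}}{(1-p_L)\cdot\frac{1}{3} + (1-p_M)\cdot\frac{1}{3} + (1-p_H)\cdot\frac{1}{3}} = \frac{0.9}{0.4 + 0.7 + 0.9} = 0.45,$$

and similarly $P(M \mid \text{no accident}) = 0.35$ and $P(L \mid \text{no accident}) = 0.2$. Moreover, a fraction $\frac{0.4 + 0.7 + 0.9}{3} = \frac{2}{3}$ of the drivers has a clean record, so two thirds of the population rationally report high skill even though only one third actually has it.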
The main question in the paper is: when is it possible to reverse engineer such a signaling scheme? More formally, let $\Theta$ be a set of types and let $\mu \in \Delta(\Theta)$, i.e., $\mu$ is a distribution on the types which is common knowledge. Now, if we ask agents to report their type, their reports form some distribution $\nu \in \Delta(\Theta)$. Is there a signaling scheme $S$, which can be interpreted as a random variable correlated with the true type, such that $\nu$ is the distribution that rational Bayesian updaters would report based on what they observed from $S$? The authors give necessary and sufficient conditions on the pair $(\mu, \nu)$ for when this is possible.
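As a sketch of the forward direction of this question, here is a minimal Python example (the accident probabilities are the same illustrative numbers as above, my assumption rather than the paper’s): given the uniform prior $\mu$ and the accident/no-accident signaling scheme, it computes the distribution $\nu$ of reports made by rational Bayesian updaters who report the type with the highest posterior probability.

```python
from fractions import Fraction

# Skill types and the common-knowledge prior mu (uniform, as in the example above).
prior = {"L": Fraction(1, 3), "M": Fraction(1, 3), "H": Fraction(1, 3)}

# Illustrative accident probabilities (my assumption, not the paper's numbers):
# higher skill -> lower chance of causing an accident, i.e. p_L > p_M > p_H.
p_accident = {"L": Fraction(6, 10), "M": Fraction(3, 10), "H": Fraction(1, 10)}

SIGNALS = ("accident", "no accident")

def likelihood(signal, t):
    """P(signal | type t) under the 'did I cause an accident?' signaling scheme."""
    return p_accident[t] if signal == "accident" else 1 - p_accident[t]

def posterior(signal):
    """Bayes update of the prior after observing the signal."""
    z = sum(prior[t] * likelihood(signal, t) for t in prior)
    return {t: prior[t] * likelihood(signal, t) / z for t in prior}

# Report distribution nu: each driver reports the type with the highest posterior.
nu = {t: Fraction(0) for t in prior}
for s in SIGNALS:
    p_s = sum(prior[t] * likelihood(s, t) for t in prior)  # marginal prob. of the signal
    post = posterior(s)
    report = max(post, key=post.get)  # posterior mode
    nu[report] += p_s

print(nu)  # nu puts weight 1/3 on "L" and 2/3 on "H"
```

With these numbers, $\nu$ puts weight $\frac{2}{3}$ on high skill and $\frac{1}{3}$ on low skill, reproducing the calculation above: most agents report above the median while being perfectly rational.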
—————————–
A note also related to the Lake Wobegon effect: I started reading a very nice book by Duncan Watts called “Everything Is Obvious: *Once You Know the Answer” about the traps of common sense. The discussion is different from the one above, but it also talks about the dangers of applying our usual common sense, which is very useful in our daily life, to scientific results. I highly recommend reading the intro of the book, which is freely available on Amazon. He gives examples of social phenomena where, once you are told about them, you think: “oh yeah, this is obvious”. But if you were told the exact opposite (in fact, he begins the example by telling you the opposite of what is observed in the data), you’d also think “yes, yes, this is obvious” and come up with very natural explanations. His point is that common sense is very useful for explaining data observations, especially observations of social data. On the other hand, it performs very poorly at predicting how the data will look before we actually see it.