Base rates & statistics: How to be better at guessing

You are probably bad at estimating things. But it’s okay, so is everyone else.

When random people were asked whether they thought they were better than the average person at estimating things in general, almost everyone said yes. This obviously cannot hold true, because by definition roughly half of people fall below the median and roughly half above it. The remarkable thing is that even after this was explained to them and they were given a chance to revise their answer, most of them still insisted they were better than average.

Without going into the psychological biases of self-esteem preservation or the glaring Dunning-Kruger effect playing out, it’s safe to assume that while you have a 50% chance of being a better-than-average guesser, you’re likely a poor estimator in general.

But hey, don’t beat yourself up about it. Our brains are extremely powerful parallel processing engines, designed to solve navigational problems and deal with ambiguous social information, but poorly built for manipulating abstract symbols. There was no need for a prehistoric human to solve logarithms or evaluate symbolic conditional probabilities on the fly. You don’t even need complex calculations to hit this limit: even plain large numbers rapidly become intractable within our heads.

Try this: how much space do 5 cars take up in a parking lot compared with 20 cars, if each group is arranged into a square? Since this still falls within the realm of manageable representation, we can easily visualize the rough sizes in our mind’s eye.

But what about the difference in area between a million cars and a billion? The first comparison differed by a factor of 4 and the second by a factor of a thousand, yet we vaguely picture both as long stretches of cars and have almost no intuition for the difference. In this case, a million cars would fill a square roughly 2 km on a side, while a billion would need one roughly 67 km on a side. The reality is that much of the information we deal with in real life involves large numbers, made more difficult by probabilistic conditions. We have very little native intuition for either.
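To make the comparison concrete, here is a minimal sketch of the arithmetic. The footprint of 4.5 square metres per car is an assumption of mine, chosen because it reproduces the rough figures above; the function name is just illustrative.

```python
import math

# Assumption (not stated in the text): each car occupies about 4.5 m^2.
AREA_PER_CAR_M2 = 4.5


def square_side_km(num_cars, area_per_car_m2=AREA_PER_CAR_M2):
    """Side length, in km, of a square just large enough to hold num_cars."""
    total_area_m2 = num_cars * area_per_car_m2
    return math.sqrt(total_area_m2) / 1000


million_side = square_side_km(1_000_000)
billion_side = square_side_km(1_000_000_000)

print(f"1 million cars: {million_side:.1f} km per side")
print(f"1 billion cars: {billion_side:.1f} km per side")
```

Note that the area grows a thousandfold while the side of the square only grows by the square root of 1000, or about 32 times, which is part of why the two pictures feel so similar in our heads.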

**A short digression into System 1 and System 2 modes of thought.**

Following Daniel Kahneman’s Thinking, Fast and Slow, our mental processes can be broadly grouped into System 1 and System 2.

System 1: It can be viewed as a gut reaction. Fast, intuitive, emotional, and automatic. Everyday activities like recognizing faces and casual conversations. The primary system hardwired by evolution, but susceptible to cognitive biases.

System 2: The critical thinking path. Slow, deliberate, and calculating. Used to solve problems like 63 x 4 or exercising self-control for future rewards. Takes up much more energy and cognitive effort. 

We alternate between these two systems, but System 1 tends to be the default. Errors occur when we use System 1 for probabilistic situations.

Anyone with neuroscience knowledge might find this neat division of the brain’s complex interactions into a dichotomy to be a massive oversimplification. But it’s partly intentional. The irony is that turning System 1 and System 2 into two characters (almost with personality traits) triggers more efficient System 1 processing and makes the concept easier to understand!

There are three parts below: how we prioritize our intuition over simple probability, how we ignore base rates and priors in our estimates, and how we have a poor grasp of exponential growth.

Linda problem: Conjunction fallacy and how we irrationally assign higher likelihoods to improbable things.

In one of Kahneman and Tversky’s experiments, they created this fictional character:

“Linda is thirty-one years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in antinuclear demonstrations.”

Which scenario is more probable?

A. Linda is a bank teller.
B. Linda is a bank teller and is active in the feminist movement.

It turns out that almost 90% of undergraduates chose the second option. Based on Linda’s description, people automatically infer that Linda cares strongly about social issues and would likely be part of a social movement. However, the people who are both bank tellers and active in the feminist movement are a subset of all bank tellers.

Every bank teller who is active in the feminist movement is still a bank teller, so there are at least as many bank tellers as there are feminist bank tellers. That means proposition A must be at least as likely as proposition B.

The problem arises because we’re natural storytellers and our immediate approach is based on the plausibility of the narrative. The picture of Linda being in the feminist movement and being a bank teller fits more coherently with our worldview. This overrides the mental modules that account for mathematical consistency, which take more cognitive effort to engage. Our System 1 process fails us here. The key is to think of conditions as sets and subsets. Which of them falls entirely within the other? Which one really has the higher probability?
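The set-and-subset view can be sketched in a few lines. The population and group sizes below are invented purely for illustration and are not from the original experiment; the point is only that the intersection of two groups can never be larger than either group.

```python
# Hypothetical population of 1,000 people, identified by number.
# All group sizes here are made up for illustration.
population = 1000
bank_tellers = set(range(0, 20))      # 20 people are bank tellers
feminists = set(range(5, 400))        # 395 people are active feminists

# "Bank teller AND feminist" is the intersection of the two sets.
both = bank_tellers & feminists

p_teller = len(bank_tellers) / population   # option A
p_both = len(both) / population             # option B

print(f"P(bank teller)              = {p_teller:.3f}")
print(f"P(bank teller and feminist) = {p_both:.3f}")
print("B is a subset of A:", both <= bank_tellers)
```

However the made-up numbers are changed, `both` stays a subset of `bank_tellers`, so option B can never come out more probable than option A.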

Base rates: Why receiving a test result shouldn’t be the end of the world

Let’s assume that 1% of a country’s population has malaria. There’s a test for malaria that gives an accurate result 90% of the time.

If you take the test and you test positive for malaria, what’s the probability that you actually have malaria?

This is no trick question, and we deal with similar situations all the time in our daily lives. If you guessed 90% because the test is accurate 90% of the time, then you’ve got this completely wrong. You have not considered the base rate, or in Bayesian terms, the prior probability of having malaria. Even some doctors presented with this problem have responded with numbers like 60-70% probability of a true positive. The real answer is closer to 10%.

If this sounds intuitively wrong, we can work it out in a rather straightforward way by using 1,000 people as an example.

  1. In a room of 1,000 people, given that 1% of people have malaria, around 10 people will actually have malaria and around 990 will not.

  2. Given that the test is right 90% of the time, 10% of the 990 people who are not infected will get a false positive. That is 99 people who test positive even though they don’t have malaria.

  3. Out of the 10 people who really have malaria, 9 will correctly test positive.

  4. The total number of people who test positive will be 99 + 9 = 108, but only 9 of them are actually infected. So the probability that you actually have malaria, given that you tested positive, is 9/108 or 8.3%.
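The same steps can be written as a small function. This is a sketch under one assumption: that "accurate 90% of the time" means the test is right 90% of the time for infected and for uninfected people alike. The names `sensitivity` and `specificity` are the standard terms for those two rates and do not appear in the text above.

```python
def probability_infected_given_positive(prior, sensitivity, specificity):
    """Bayes' rule: P(infected | positive test).

    prior        -- base rate of infection in the population
    sensitivity  -- P(positive | infected)
    specificity  -- P(negative | not infected)
    """
    true_positives = prior * sensitivity
    false_positives = (1 - prior) * (1 - specificity)
    return true_positives / (true_positives + false_positives)


# Assumption: the 90% accuracy applies to both infected and uninfected people.
posterior = probability_infected_given_positive(
    prior=0.01, sensitivity=0.90, specificity=0.90
)
print(f"P(malaria | positive) = {posterior:.1%}")  # prints 8.3%
```

Raising `prior` shows how strongly the answer depends on the base rate: with the same test and a 50% base rate, a positive result would mean a 90% chance of infection.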

8.3% versus 90% is around a 10x difference. Not having a grasp on statistics and not considering base rates can lead to wildly different conclusions. On a practical level, when a piece of information or news comes in, always consider the initial probability of an event occurring and factor that into your decision making.

Exponential growth

There’s a famous story about a sage who was to be rewarded by a king (a major chess enthusiast) for winning a chess game. I’m paraphrasing heavily here, but I think I’ll get the relevant parts in. When the king asked what reward he wanted, the sage asked for a single grain of rice to be placed on the first square of the chessboard, 2 grains on the second, 4 on the third, and so forth, doubling with each square. It seemed manageable at first, so the king happily agreed to what looked like a meager few grains of rice. The problem was that by the time he reached the 64th square, he would need to provide a total of more than 18 quintillion grains of rice, supposedly enough to cover India in a layer of rice 1 m deep.
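A quick sketch of the doubling, to show where the 18 quintillion comes from. It assumes the usual version of the story: 1 grain on the first square and a doubling on each of the remaining 63.

```python
# One grain on square 1, doubling on every following square (64 in total).
grains_per_square = [2 ** i for i in range(64)]

last_square = grains_per_square[-1]     # grains on the 64th square alone
total_grains = sum(grains_per_square)   # grains on the whole board

print(f"64th square: {last_square:,}")
print(f"Whole board: {total_grains:,}")
```

The last square alone holds about 9.2 quintillion grains, which is one more than all the previous squares combined. The whole board comes to 2^64 - 1, or about 18.4 quintillion.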

The point of this story is to illustrate that not only are we bad at reasoning with probability, but even plain, concrete numbers escape our intuition once they grow exponentially. It’s important to recalibrate your thinking and activate your System 2 whenever you’re dealing with such phenomena. For example, 5% yearly compounded growth on an investment might seem modest, but it means a doubling roughly every 14 years, and nearly 8x growth in 42 years.
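The compounding figures can be checked with a couple of lines. This is only a sketch that assumes a constant 5% rate compounded once a year and ignores fees, taxes, and inflation.

```python
import math

rate = 0.05  # 5% growth, compounded once per year (assumed constant)

# Years needed to double: solve (1 + rate) ** years == 2 for years.
doubling_years = math.log(2) / math.log(1 + rate)

# Growth multiple after 42 years, i.e. roughly three doublings.
growth_after_42 = (1 + rate) ** 42

print(f"Doubling time: {doubling_years:.1f} years")
print(f"Growth after 42 years: {growth_after_42:.2f}x")
```

The doubling time comes out at a little over 14 years, and the 42-year multiple at a little under 8x, so the round numbers above are close approximations rather than exact values.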