Statistical Fallacy List, Part Two

Base Rate Fallacy

In statistics, the term “base rate” is the general probability of some event occurring. It does not account for specific conditions and simply estimates the probability of that event occurring under general circumstances.

For instance, suppose we have a sample of one hundred students. Seventy of these students are engineering students and thirty are art students. Then the base rate of a randomly selected student being an engineering student is 70%.

Or take your base chance of being hit by lightning, which by some estimates is about 1 in 3,000. That represents the chance you will be hit by lightning under normal circumstances, without accounting for specific conditions, such as standing under a metal pole, that might increase your chances of being struck.

A base rate fallacy is when you look at specific data but ignore this base rate. An example might help illustrate this.

Suppose your child is very smart and he is applying to get into a school for gifted children. However, all the applicants to this school are very bright, and only, say, 5% of applicants are accepted.

But you conclude that your son is brilliant! He is bound to be accepted!

You have, however, failed to account for the base rate. All the applicants to this school are brilliant. Even if your son is brilliant, he still has only a 5% chance of being accepted.

Where is the error here? You have assumed that your son being smart means he has a high chance of being accepted. However, you have ignored the 5% base rate of acceptance among brilliant applicants and assumed that brilliant kids have a high chance of acceptance, when this is not the case.
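One way to see why brilliance does not help here is to run the numbers through Bayes' rule. The sketch below assumes, as the example does, that every applicant is bright, so being bright is certain both among all applicants and among those accepted. The function and variable names are made up for this illustration.

```python
def posterior(prior, likelihood, evidence):
    """Bayes' rule: P(A | B) = P(B | A) * P(A) / P(B)."""
    return likelihood * prior / evidence


# Figures from the example above.
p_accepted = 0.05              # base rate: 5% of applicants are accepted
p_bright_given_accepted = 1.0  # assumption: every accepted applicant is bright
p_bright = 1.0                 # assumption: every applicant is bright

p_accepted_given_bright = posterior(p_accepted, p_bright_given_accepted, p_bright)
print(p_accepted_given_bright)  # 0.05
```

Because brightness does not distinguish your son from the other applicants, knowing he is bright leaves his chance of acceptance at the base rate.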

[Image: boy with an abacus. Caption: Intelligence might not be the only thing gifted schools are considering.]

Or, take faith healing. If you take 1,000 cancer patients and pray for them to get better, then almost certainly at least one of them will recover, at least for a while. You might, therefore, assume that prayer cured their cancer.

However, what you have failed to account for is the base rate of cancer remission. A certain percentage of cancer patients will go into remission regardless: their cancer seems to go away, at least for a while.

In failing to account for this you have assumed that if a patient’s cancer goes away, it must be the result of your prayers.
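To get a feel for how strong this effect is, here is a small sketch of the arithmetic. The 0.5% remission rate is an invented number used only for illustration, not a medical statistic, and the sketch assumes each patient's outcome is independent of the others.

```python
def prob_at_least_one(rate, patients):
    """Chance that at least one of `patients` independent cases occurs,
    given a per-patient probability of `rate`."""
    return 1 - (1 - rate) ** patients


REMISSION_RATE = 0.005  # hypothetical base rate, for illustration only
PATIENTS = 1000         # group size from the example above

p_any = prob_at_least_one(REMISSION_RATE, PATIENTS)
expected = REMISSION_RATE * PATIENTS

print(f"chance of at least one remission: {p_any:.3f}")
print(f"expected number of remissions: {expected:.0f}")
```

Under these assumed numbers, at least one remission is very likely whether or not anyone prays, which is why a recovery on its own tells you nothing about the prayer.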

When looking at the probability of specific events, you should consider the base rate. It is easy to look at specific occurrences that take place and ignore the base rate.

In other words, you ignore the general probability of the events you are talking about and assume that the specific event you are talking about has some special significance.

However, if you looked at the base rate, it would help you more accurately assess the real significance of the event.

Let us take one more example, and assume your chances of being hit by lightning are indeed 1 in 3,000.

Now, suppose that you look at the chances of being hit by lightning if you are wearing rubber-soled shoes and you find this is about 1 in 2,900. Because that number is so small, you assume that rubber-soled shoes help protect you against lightning.

You have failed to account for the base rate of 1 in 3,000. That is, for people in general, whatever shoes they are wearing, the chance of being hit by lightning is already 1 in 3,000.

That is pretty similar to the figure for rubber-soled shoes, which suggests that such shoes make no real difference to whether you will be struck by lightning.
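The comparison can be made concrete by dividing one rate by the other. This is only a sketch using the two illustrative figures from the example. The variable names are my own.

```python
from fractions import Fraction

base_rate = Fraction(1, 3000)  # chance of being struck, people in general
shoe_rate = Fraction(1, 2900)  # chance of being struck, rubber-soled shoes

# A relative risk of 1 would mean the shoes make no difference at all.
relative_risk = shoe_rate / base_rate
difference = shoe_rate - base_rate

print(f"relative risk: {float(relative_risk):.3f}")
print(f"absolute difference: {float(difference):.7f}")
```

The relative risk comes out at 30/29, which is just above 1. So the shoe figure is, if anything, very slightly worse than the base rate, and certainly not evidence of protection.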

Texas Sharpshooter Fallacy

The name comes from a joke about a Texan firing at the side of a barn. The Texan then paints a target centred on the tightest cluster of hits. He then says, “Look at how many bullseyes I got!”

But you see what he did there. He focused on a cluster of bullet holes that happened to be close together and pretended that this cluster was significant, while ignoring all the other bullet holes that contradict his claim to be a sharpshooter.

This gives you some idea of what this fallacy is about: focusing on clusters or patterns in the data and assuming that the cluster is significant, all the while ignoring other data that does not fit.
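A quick simulation shows how easily random shots produce a cluster worth painting a target on. This is a toy sketch, not a model of real marksmanship. The barn wall is a hypothetical 10 by 10 grid, the shots land uniformly at random, and the function name, shot count and seed are arbitrary choices of mine.

```python
import random
from collections import Counter


def densest_cell(num_shots, grid_size, seed):
    """Fire shots uniformly at random at a grid_size x grid_size wall and
    return (cell, hits) for the cell that collected the most hits."""
    rng = random.Random(seed)
    hits = Counter(
        (rng.randrange(grid_size), rng.randrange(grid_size))
        for _ in range(num_shots)
    )
    return hits.most_common(1)[0]


NUM_SHOTS = 150
GRID_SIZE = 10

cell, best = densest_cell(NUM_SHOTS, GRID_SIZE, seed=1)
average = NUM_SHOTS / GRID_SIZE**2

print(f"average hits per cell: {average:.2f}")
print(f"hits in the cell where the target gets painted: {best}")
```

The busiest cell always beats the average, even though every shot was aimed at nothing in particular. Painting the target there after the fact says nothing about the shooter's skill.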

Let us present an example. Suppose you notice that many parts of Earth are well suited to life. You then conclude that Earth is proof that the universe is intelligently designed for life, even though the vast majority of the universe is incompatible with life.

You are looking at one data point, Earth, and ignoring the fact that all the other data points suggest that the universe is not designed for life.

Or take the alleged Saturday Night Live curse. You focus on the cast members who have died within the show's forty-year history while ignoring the fact that the majority of the cast are still alive or died of natural causes.

Focusing on similar data while ignoring dissimilar data can easily lead to the wrong conclusions.

But humans are very prone to focusing on patterns and tend to focus on similar things.

The problem is that when you do this, you may ignore data that is dissimilar. You may, therefore, come to the wrong conclusions about the data as a whole.

One more example might help clarify this. Suppose we look at the quatrains of Nostradamus.

[Image: Nostradamus fallacy. Caption: You can write a book with as much prophetic power as Nostradamus. Just make predictions about things like war and famine that happen all the time.]

If you twist the translations of the archaic French they are written in, several quatrains describe things that can be viewed as vaguely similar to real-world occurrences.

For instance:

Beasts wild with hunger shall cross the rivers:
Most of the fighting shall be close by the Hister [Danube],
It shall result in the great one being dragged in an iron cage,
While the Rhine child of Germany will observe.

You could choose to focus on the things that seem similar to real life.

Perhaps you could choose to believe that “Hister” refers to Hitler, even though it clearly refers to the Danube River.

You could focus on the “child of Germany” part and interpret this as an indication that he is talking about some German person.

Perhaps you could focus on any number of tenuous similarities to real life, draw a bullseye around those, and say, “Look, he got all those things right.”

But you are ignoring all the things he says which have no apparent connection to anything. You are ignoring the fact that the iron cage has no apparent connection to Hitler or World War Two.

You are also ignoring the passage about beasts wild with hunger crossing the rivers, which has no obvious connection to anything. And so forth.

But of course, that would be a fallacy. It would be a fallacy to focus on what you think he got right while ignoring all the things you cannot rationalize.

That is it for this part. Stay tuned for more statistical fallacies in the future.
