Base rate error refers to the tendency to ignore relevant statistical information, namely the base rate or prior probability of an event, in favor of case-specific information. As a result, people often make inaccurate probability judgments in decision-making contexts. Base rate error is also called base rate neglect, base rate bias, the base rate fallacy, or base frequency forgetting.

Examples

Consider a city with one million (1,000,000) people on its territory. Of these, 100 are presumed offenders and appear on a list; the other 999,900 are presumed non-offenders. To detect the presence of a listed offender, the city installs video surveillance cameras with an automatic facial recognition device, which must trigger an alert when the face filmed belongs to one of the 100 offenders on the list. Unfortunately, the facial recognition device is not perfect. Suppose it has an "error rate of 1 %" or, more precisely, that:

its sensitivity is 99 %, i.e. a false negative rate of 1 % among real offenders;
its specificity is 99 %, i.e. a false positive rate of 1 % among non-offenders.

When an alert is triggered, what is the probability that the person filmed is actually an offender on the list?

If we reason by "forgetting the base frequency", that is, by retaining only the "error rate of 1 %", we answer too hastily that there is a 99 % probability that the individual is actually an offender when an alert is triggered. This is wrong. When counting all the alerts, two situations must be taken into account simultaneously:

99 % of offenders trigger the alert, i.e. 99 offenders out of the 100 on the list (according to the definition of sensitivity);
1 % of non-offenders trigger the alert, i.e. 9,999 non-offenders out of 999,900 (according to the definition of specificity).

That gives a total of 99 + 9,999 = 10,098 alerts. When an alert is triggered, the probability that the individual is actually an offender is therefore 99 out of 10,098, or about 0.98 %, not 99 %. The same result follows from Bayes' theorem.
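The calculation above can be reproduced with Bayes' theorem. The following is a minimal sketch in Python; the function name `posterior_given_alert` and its parameter names are illustrative choices, not part of the original example.

```python
def posterior_given_alert(prior, sensitivity, specificity):
    """Return P(offender | alert) using Bayes' theorem.

    prior       -- base rate: P(offender) in the whole population
    sensitivity -- P(alert | offender)
    specificity -- P(no alert | non-offender)
    """
    true_alerts = sensitivity * prior                    # P(alert and offender)
    false_alerts = (1.0 - specificity) * (1.0 - prior)   # P(alert and non-offender)
    return true_alerts / (true_alerts + false_alerts)


# Figures from the city example: 100 listed offenders among 1,000,000 people.
prior = 100 / 1_000_000
p = posterior_given_alert(prior, sensitivity=0.99, specificity=0.99)
print(f"P(offender | alert) = {p:.2%}")  # prints "P(offender | alert) = 0.98%"
```

Raising the prior to 0.5 in the same function returns 99 %, which shows that the intuitive answer is only correct when offenders and non-offenders are equally numerous.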

Here's another example.

During the pandemic, news reports often quoted statistics such as “70 % of hospitalized Covid patients have been vaccinated” or “7 in 10 hospitalized Covid patients are vaccinated”. At the time, many wondered what this implied about the vaccine's effectiveness.

The key to correctly interpreting this information is the baseline vaccination rate (i.e. the percentage of the population vaccinated). If a large proportion of the population is vaccinated and only a small fraction is unvaccinated, we can expect a higher ratio of vaccinated to unvaccinated individuals in the hospital.

For example, suppose that 99 % of a population is vaccinated and that 51 % of infected individuals are vaccinated. The base rate error would lead most people to believe that the vaccine has no preventive effect. However, if the vaccine were ineffective, we would expect approximately 99 % of infected people to be vaccinated, matching their share of the population.
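To make the comparison concrete, here is a rough Python sketch that turns the two percentages into a relative risk of infection. It assumes, for illustration only, that vaccinated and unvaccinated people are equally exposed; the function name `relative_risk` is hypothetical, and the figures are those of the hypothetical example above, not real data.

```python
def relative_risk(share_vaccinated_in_population, share_vaccinated_among_infected):
    """Infection risk of the vaccinated relative to the unvaccinated.

    Each group's share of infections is divided by its share of the
    population; the ratio of the two results is the relative risk.
    A value of 1.0 means the vaccine makes no difference.
    """
    rate_vaccinated = share_vaccinated_among_infected / share_vaccinated_in_population
    rate_unvaccinated = (1 - share_vaccinated_among_infected) / (
        1 - share_vaccinated_in_population
    )
    return rate_vaccinated / rate_unvaccinated


rr = relative_risk(0.99, 0.51)
print(f"relative risk: {rr:.4f}")
print(f"implied risk reduction: {1 - rr:.1%}")

# Sanity check: an ineffective vaccine gives 99 % vaccinated among the infected.
print(relative_risk(0.99, 0.99))
```

Under these assumptions the relative risk is roughly 0.01, meaning a vaccinated person is far less likely to be infected than an unvaccinated one, even though vaccinated people make up about half of the cases.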

The base rate fallacy has created the misconception that vaccines are ineffective because, in highly vaccinated populations, the majority of COVID-19 cases occur among vaccinated people.

How to Avoid Base Rate Mistakes

This is simple: in addition to the percentages, you must know the number of individuals in each population tested. The simplest way to avoid a base rate error is to rely on a confusion matrix, which lays out the counts of true positives, false positives, false negatives, and true negatives.
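As an illustration, the confusion matrix for the facial recognition example can be built from the population counts. This is a minimal Python sketch; the helper `confusion_counts` is a hypothetical name, and the counts are expected values rounded to whole individuals.

```python
def confusion_counts(population, actual_positives, sensitivity, specificity):
    """Expected confusion-matrix counts for a binary detector."""
    actual_negatives = population - actual_positives
    tp = round(sensitivity * actual_positives)   # offenders who trigger an alert
    fn = actual_positives - tp                   # offenders who are missed
    tn = round(specificity * actual_negatives)   # non-offenders with no alert
    fp = actual_negatives - tn                   # non-offenders who trigger an alert
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


counts = confusion_counts(1_000_000, 100, sensitivity=0.99, specificity=0.99)

# Share of alerts that concern a real offender: TP / (TP + FP).
precision = counts["TP"] / (counts["TP"] + counts["FP"])

print(f"{'':>14}{'alert':>10}{'no alert':>10}")
print(f"{'offender':>14}{counts['TP']:>10}{counts['FN']:>10}")
print(f"{'non-offender':>14}{counts['FP']:>10}{counts['TN']:>10}")
print(f"P(offender | alert) = {precision:.2%}")
```

Reading down the "alert" column gives 99 true alerts against 9,999 false ones, so the base rate is visible directly in the table instead of being hidden behind a single "error rate".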
