Sampling bias occurs when some members of a population are systematically more likely to be selected in a sample than others. In medical fields, it is also called ascertainment bias. Sampling bias limits the generalizability of findings because it threatens external validity, specifically population validity. In other words, findings from biased samples can only be generalized to populations that share characteristics with the sample.

Causes

Your choice of research design or data collection method can lead to sampling bias. This type of research bias can occur in both probability and non-probability sampling.

In probability sampling, each member of the population has a known chance of being selected. For example, you can use a random number generator to select a simple random sample from your population.

Although this procedure reduces the risk of sampling bias, it cannot eliminate it. If your sampling frame – the actual list of individuals from which the sample is drawn – does not match the population, this can result in a biased sample.

You want to study the levels of procrastination and social anxiety among undergraduate students at your university using a simple random sample. You assign a number to each student in the research participant database, from 1 to 1,500, and use a random number generator to select 120 numbers.

Even though you used a random sample, not everyone in your target population – undergraduates at your university – had a chance of being selected. Your sample excludes everyone who did not sign up to be contacted about research participation. This may bias your sample toward people who have less social anxiety and are more willing to participate in research.
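
The selection step in this example can be sketched in Python. This is a minimal illustration, assuming the database IDs run from 1 to 1,500 as described; the function name `draw_simple_random_sample` and the seed are hypothetical choices made for reproducibility. Note that the draw is random only within the sampling frame: students who never signed up have no ID and can never be selected.

```python
import random


def draw_simple_random_sample(frame_size, sample_size, seed=None):
    """Draw sample_size distinct IDs from 1..frame_size without replacement."""
    rng = random.Random(seed)  # local generator so the draw is reproducible
    return sorted(rng.sample(range(1, frame_size + 1), sample_size))


# Figures from the example: 1,500 students in the database, 120 selected.
selected_ids = draw_simple_random_sample(frame_size=1500, sample_size=120, seed=42)

# Every ID in the frame has the same inclusion probability: 120 / 1500 = 0.08.
inclusion_probability = 120 / 1500
```

Equal inclusion probabilities hold only for students listed in the database, which is why a mismatch between the frame and the target population can still bias the sample.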

A non-probability sample is selected based on non-random criteria. For example, in a convenience sample, participants are selected based on their accessibility and availability.

Non-probability sampling often results in biased samples because some members of the population are more likely to be included than others.

You want to study the popularity of plant-based foods among undergraduate students at your university. For convenience, you send a survey to everyone enrolled in introductory psychology courses at your university. They all complete it in exchange for course credits.

Because this is a convenience sample, it is unlikely to be representative of your target population. People taking these courses may be more liberal and more drawn to plant-based foods than other students at your university.

Types of Sampling Bias

Self-selection bias

  • People with specific characteristics are more likely than others to agree to participate in a study.
  • People who are more thrill-seeking are more likely to take part in pain research studies. This may skew the data.

Non-response bias

  • People who refuse to participate or drop out of a study differ systematically from those who participate.
  • In a study on stress and workload, employees with high workload were less likely to participate. The resulting sample may not vary significantly in terms of workload.
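
A short calculation shows how unequal response rates can distort a sample. The numbers below are hypothetical, chosen only for illustration: half of the employees have a high workload, and they respond at a lower rate than their colleagues. The function name `share_among_respondents` is also an assumption.

```python
def share_among_respondents(population_share, response_rate, other_response_rate):
    """Return a group's share among respondents, given its population share
    and the response rates of the group and of everyone else."""
    responding = population_share * response_rate
    others_responding = (1 - population_share) * other_response_rate
    return responding / (responding + others_responding)


# Hypothetical inputs: 50% of employees have a high workload;
# 20% of them respond, versus 60% of the remaining employees.
high_workload_share = share_among_respondents(
    population_share=0.5, response_rate=0.2, other_response_rate=0.6
)
print(f"High-workload share among respondents: {high_workload_share:.0%}")
```

Under these assumed rates, high-workload employees make up about 25% of respondents even though they are 50% of the workforce.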

Undercoverage bias

  • Some members of a population are insufficiently represented in the sample.
  • Administering general national surveys online risks overlooking groups with limited Internet access, such as older adults and low-income households.

Survivorship bias

  • Successful observations, people and objects are more likely to be represented in the sample than unsuccessful observations.
  • In scientific journals, there is a strong publication bias towards positive results. Successful research results are published much more often than unsuccessful results.

Pre-screening or advertising bias

  • How participants are pre-screened or where a study is advertised can bias a sample.
  • When looking for volunteers to test a new sleep intervention, you may end up with a sample that is more motivated to improve their sleep habits than the rest of the population. As a result, they could have improved their sleep habits regardless of the effects of your intervention.

Healthy user bias

  • Volunteers for preventive interventions are more likely to engage in health-promoting behaviors and activities than other members of the population.
  • A sample participating in a preventive intervention eats a better diet, is more physically active, and is more likely to abstain from alcohol and avoid smoking than most of the population. Experimental results may reflect the interaction of the treatment with these sample characteristics, rather than the treatment alone.

How to avoid sampling bias

Using careful research design and sampling procedures can help you avoid sampling bias.

Define a target population and a sampling frame (the list of individuals from which the sample will be drawn). Match the sampling frame to the target population as much as possible to reduce the risk of sampling bias.

Make online surveys as short and accessible as possible.

Follow up on non-respondents.

Avoid convenience sampling.

Oversampling can be used to counter sampling bias when members of defined groups are underrepresented (undercoverage). With this method, you deliberately select respondents from certain groups so that they make up a larger share of the sample than they do of the population.

Once all data are collected, responses from oversampled groups are weighted down to their actual share of the population to correct for the disproportionate sampling.

Example of correction

A researcher wants to study the political views of different ethnic groups in the United States and focus in depth on Asian Americans, who make up only 5.6% of the U.S. population. The researcher wants to study each ethnic group separately, but also gather enough data on Asian Americans to be able to draw accurate conclusions.

They gather a nationally representative sample of 1,500 respondents that oversamples Asian Americans. Random digit dialing is used to contact U.S. households, and disproportionately larger samples are taken from areas with more Asian Americans. Of the 1,500 people surveyed, 336 are Asian American. With a subsample of this size, the researcher can be more confident in their findings about Asian Americans.

Weighting is applied so that Asian American responses count for 5.6% of the weighted total, matching their share of the population. This allows for accurate estimates for the sample as a whole.
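
The weighting step can be worked through with the figures from this example. This is a minimal sketch of simple post-stratification weighting with only two groups; real survey weighting usually adjusts for several characteristics at once, and the variable names are illustrative.

```python
# Figures from the example above.
total_respondents = 1500
asian_american_respondents = 336
other_respondents = total_respondents - asian_american_respondents  # 1164

population_share = {"asian_american": 0.056, "other": 0.944}
sample_share = {
    "asian_american": asian_american_respondents / total_respondents,  # 0.224
    "other": other_respondents / total_respondents,                    # 0.776
}

# Weight = population share / sample share.
# Oversampled groups get a weight below 1; the rest get a weight above 1.
weights = {
    group: population_share[group] / sample_share[group]
    for group in population_share
}

# After weighting, Asian American responses count for 5.6% of the total.
weighted_asian_american = weights["asian_american"] * asian_american_respondents
weighted_other = weights["other"] * other_respondents
weighted_share = weighted_asian_american / total_respondents
```

Here each Asian American response receives a weight of 0.25 and every other response a weight of about 1.22, so the weighted total still equals 1,500 respondents.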
