A confounding variable is a variable that influences both the independent variable and dependent variable and leads to a false correlation between them. A confounding variable is also called a confounder, confounding factor, or lurking variable. Because confounding variables often exist in experiments, correlation does not mean causation. In other words, when you see a change in the independent variable and a change in the dependent variable, you can’t be certain the two variables are related.
Here are examples of confounding variables, a look at the difference between a confounder and a mediator, and ways to reduce the risk of confounding variables leading to incorrect conclusions.
Positive and Negative Confounding
Sometimes confounding points to a false cause-and-effect relationship, while other times it masks a true effect.
- Positive Confounding: Positive confounding overestimates the relationship between the independent and dependent variables. It biases results away from the null hypothesis.
- Negative Confounding: Negative confounding underestimates the relationship between the independent and dependent variables. It biases results toward the null hypothesis.
Confounding Variable Examples
- In a study where the independent variable is ice cream sales and the dependent variable is shark attacks, a researcher sees that increased sales go hand-in-hand with shark attacks. The confounding variable is the heat index. When it’s hotter, more people buy ice cream and more people go swimming in (shark-infested) waters. There’s no causal relationship between people buying ice cream and getting attacked by sharks.
- Real Positive Confounding Example: A 1981 Harvard study linked drinking coffee to pancreatic cancer. Smoking was the confounding variable in this study. Many of the coffee drinkers in the study also smoked. When the data was adjusted for smoking, the link between coffee consumption (the independent variable) and pancreatic cancer incidence (the dependent variable) vanished.
- Real Negative Confounding Example: In a 2008 study of the toxicity (dependent variable) of methylmercury in fish and seafood (independent variable), researchers found the beneficial nutrients in the food (confounding variable) counteracted some of the negative effects of mercury toxicity.
Correlation does not imply causation. If you’re unconvinced, check out the spurious correlations compiled by Tyler Vigen.
How to Reduce the Risk of Confounding
The first step to reduce the risk of confounding variables affecting your experiment is to try to identify anything that might affect the study. It’s a good idea to check the literature or at least ask other researchers about confounders. Otherwise, you’re likely to find out about them during peer review!
When you design an experiment, consider these techniques for reducing the effect of confounding variables:
- Introduce control variables. For example, if you think age is a confounder, only test within a certain age group. If temperature is a potential confounder, control it.
- Be consistent about time. Take data at the same time of day. Repeat experiments at the same time of year. Don’t vary the duration of treatments within a single experiment.
- When possible, use double blinding. In a double blind experiment, neither the researcher nor the subject knows whether or not a treatment was applied.
- Randomize. Select control group subjects and test subjects randomly, rather than having the researcher choose the group or (in human experiments) letting the subjects select participation.
- Use case controls or matching. If you suspect confounding variables, match the test subject and control as much as possible. In human experiments, you might select subjects of the same age, sex, ethnicity, education, diet, etc. For animal and plant studies, you’d use pure lines. In chemical studies, use samples from the same supplier and batch.
Confounder vs Mediator or Effect Modifier
A confounder affects both the independent and dependent variables. In contrast, a mediator or effect modifier does not affect the independent variable, but does modify the effect the independent variable has on the dependent variable. For example, in a test of drug effectiveness, the drug may be more effective in children than adults. In this case, age is an effect modifier. Age doesn’t affect the drug itself, so it is not a confounder.
Confounder vs Bias
In a way, a confounding variable results in bias in that it distorts the outcome of an experiment. However, bias usually refers to a type of systematic error from experimental design, data collection, or data analysis. An experiment can contain bias without being affected by a confounding variable.
Confounding Variable: A factor that affects both the independent and dependent variables, leading to a false association between them.
Effect Modifier: A variable that positively or negatively modifies the the effect of the independent variable on the dependent variable.
Bias: A systematic error that masks the true effect of the independent variable on the dependent variable.
- Axelson, O. (1989). “Confounding from smoking in occupational epidemiology”. British Journal of Industrial Medicine. 46 (8): 505–07. doi:10.1136/oem.46.8.505
- Kish, L (1959). “Some statistical problems in research design”. Am Sociol. 26 (3): 328–338. doi:10.2307/2089381
- VanderWeele, T.J.; Shpitser, I. (2013). “On the definition of a confounder”. Annals of Statistics. 41 (1): 196–220. doi:10.1214/12-aos1058
- Yule, G. Udny (1926). “Why do we Sometimes get Nonsense-Correlations between Time-Series? A Study in Sampling and the Nature of Time-Series”. Journal of the Royal Statistical Society. 89 (1): 1–63. doi:10.2307/2341482