The case-control study dates back to the beginning of the 20th century and was probably inspired by the writings of Hippocrates and John Stuart Mill. Hippocrates encouraged doctors to pay attention to what was common and what was particular among their patients. Mill provided the methods of agreement and difference in the search for causation. Much later, Cornfield showed that the exposure odds ratio can be used to estimate the disease odds ratio (OR) in what is now called the case-non-case study. Under the rare disease assumption, this OR is close to the relative risk (RR) when the disease is rare among both the exposed and the unexposed.
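The rare disease assumption can be illustrated with a small numerical sketch. The cohort sizes and case counts below are hypothetical and chosen only so that the RR equals 2 in both scenarios; the function names are likewise illustrative.

```python
def risk_ratio(a, n1, c, n0):
    """RR: risk among the exposed (a of n1) over risk among the unexposed (c of n0)."""
    return (a / n1) / (c / n0)


def disease_odds_ratio(a, n1, c, n0):
    """Disease OR: odds of disease among the exposed over odds among the unexposed."""
    return (a / (n1 - a)) / (c / (n0 - c))


# Hypothetical closed cohorts of 10,000 exposed and 10,000 unexposed people.
rare = dict(a=20, n1=10_000, c=10, n0=10_000)          # risks of 0.2% and 0.1%
common = dict(a=4_000, n1=10_000, c=2_000, n0=10_000)  # risks of 40% and 20%

rr_rare, or_rare = risk_ratio(**rare), disease_odds_ratio(**rare)
rr_common, or_common = risk_ratio(**common), disease_odds_ratio(**common)

print(f"rare disease:   RR = {rr_rare:.3f}, OR = {or_rare:.3f}")
print(f"common disease: RR = {rr_common:.3f}, OR = {or_common:.3f}")
```

With these assumed numbers the OR is almost identical to the RR when the disease is rare, but clearly exceeds it when the disease is common in both groups.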
Since the 1980s, the driving force in developing the case-control study has been to find a design in which the exposure odds ratio estimates the RR or the incidence rate ratio (IRR) without any rare disease assumption. The key is that controls are sampled at random from the source population, or study base, that gave rise to the cases. These ideas have been around for about 30 years and should be well known, at least to the younger generation of epidemiologists. Still, we find these basic principles ignored in research proposals and in manuscripts submitted for publication, and not all of these come from non-epidemiologists. In particular, the idea of healthy controls pops up at regular intervals, maybe as a result of what is recalled from high school philosophy, maybe as a result of our rule for using patient controls as surrogates for a proper random sample from the source population. These patient controls should have a disease that is neither caused nor prevented by the exposures under study, a principle that was violated in Doll and Hill’s famous study on lung cancer from the 1950s. Some of their controls had other lung diseases that we now know are also partly caused by smoking, and they therefore underestimated the relative risk of lung cancer for smokers.
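Why the sampling of controls matters can be sketched with expected counts. The example below assumes a hypothetical closed cohort in which disease is common (true RR = 2), and compares controls drawn at random from the whole source population at baseline with controls drawn only from people who remain non-cases at the end of follow-up. It covers the RR only; estimating the IRR would require sampling controls over follow-up time, which is not shown here. All numbers and names are illustrative assumptions.

```python
def exposure_odds_ratio(cases_exp, cases_unexp, ctrl_exp, ctrl_unexp):
    """Exposure odds among cases divided by exposure odds among controls."""
    return (cases_exp / cases_unexp) / (ctrl_exp / ctrl_unexp)


# Hypothetical source population: 10,000 exposed and 10,000 unexposed people.
n_exp, n_unexp = 10_000, 10_000
cases_exp, cases_unexp = 4_000, 2_000  # risks of 40% and 20%
sampling_fraction = 0.1                # same fraction in both exposure groups

true_rr = (cases_exp / n_exp) / (cases_unexp / n_unexp)

# Design A: controls are a random sample of the whole source population,
# regardless of whether they later become cases.
or_source = exposure_odds_ratio(
    cases_exp, cases_unexp,
    sampling_fraction * n_exp, sampling_fraction * n_unexp,
)

# Design B: controls are a random sample of those who never became cases.
or_noncases = exposure_odds_ratio(
    cases_exp, cases_unexp,
    sampling_fraction * (n_exp - cases_exp),
    sampling_fraction * (n_unexp - cases_unexp),
)

print(f"true RR = {true_rr:.3f}")
print(f"controls from source population: OR = {or_source:.3f}")
print(f"controls from non-cases only:    OR = {or_noncases:.3f}")
```

Under these assumptions the first design reproduces the RR even though the disease is far from rare, because the controls reflect the exposure distribution of the population that gave rise to the cases. The second design returns the disease odds ratio, which overstates the RR here.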
If healthy controls are selected for the study, rather than simply people without the case disease under study, effect estimates may be biased if the exposure under study affects the diseases that are excluded in control selection. Furthermore, the confounder structure may be more complicated in this design, and usually there will be no or only limited data to adjust for this confounding.
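The direction of this bias can be sketched with expected counts under one simple, hypothetical scenario: the exposure raises the prevalence of other diseases, and anyone with those diseases is excluded from the control group. The prevalences, case counts, and names below are assumptions made for illustration only, and people with both the case disease and another disease are ignored for simplicity.

```python
def exposure_odds_ratio(cases_exp, cases_unexp, ctrl_exp, ctrl_unexp):
    """Exposure odds among cases divided by exposure odds among controls."""
    return (cases_exp / cases_unexp) / (ctrl_exp / ctrl_unexp)


# Hypothetical source population with a rare case disease (true RR = 2).
n_exp, n_unexp = 10_000, 10_000
cases_exp, cases_unexp = 20, 10

# Assumed prevalence of other diseases, which the exposure also increases.
other_disease_exp, other_disease_unexp = 0.30, 0.10

# Controls representative of the source population.
or_source = exposure_odds_ratio(cases_exp, cases_unexp, n_exp, n_unexp)

# "Healthy" controls: people with the other diseases are excluded, which
# removes more exposed than unexposed people from the pool of controls.
healthy_exp = n_exp * (1 - other_disease_exp)
healthy_unexp = n_unexp * (1 - other_disease_unexp)
or_healthy = exposure_odds_ratio(cases_exp, cases_unexp, healthy_exp, healthy_unexp)

print(f"controls from source population: OR = {or_source:.2f}")
print(f"healthy controls only:           OR = {or_healthy:.2f}")
```

In this scenario the healthy controls under-represent the exposed, so the odds ratio is biased away from the null. If the exposure instead prevented the excluded diseases, the bias would run in the opposite direction.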
In genetic epidemiology we often see violations of basic design principles because existing, convenient control samples are used to save money. Often these studies are done by clinicians and statisticians without any help from epidemiologists. Unfortunately, what should by now be well-known principles are also violated by epidemiologists who should know better.
Jørn Olsen, Cesar Victora, Shah Ebrahim, Neil Pearce