The understanding of case-control studies as sampling the necessary information from an underlying, often unknown, cohort in order to learn about the relative increase in disease occurrence among the exposed compared with the unexposed goes back to the 1980s. But comparing cases and controls in a table (often table 1) is still what we see. “You should stop this business of comparing cases and controls” was what Miettinen’s students were told more than 30 years ago. Apparently, not many listened. This table 1 is often used to “identify” confounders based on data from that particular study. Determinants of the disease (and thus potential confounders) should instead be identified, hopefully, from the entire body of evidence, not from the data in a single study. Causes of diseases are not time- and population-specific.
It would make more sense to look at determinants of exposure among the controls, who should represent the source population that gave rise to the cases. The link between the exposure and the potential confounder in this particular population is of interest, since this causal link may well be time- and population-specific.
P-values are used to identify potential confounders in these tables, clearly indicating that the authors do not understand what confounding is about. Confounding is about adjusting for causal backdoor pathways between the exposure and the disease of interest, and the magnitude of associations matters more than whether these associations reach statistical significance. A p-value is not a measure of association, since a p-value reflects both the strength of the association and the sample size. Confounding is influenced by the first part but has nothing to do with the second. Furthermore, the strength of the associations of interest is not based on bivariate comparisons but on associations in fully adjusted models. You may even see this practice of comparing the exposed and the unexposed for a set of potential confounders in a randomized trial, indicating that the authors understand neither confounding nor the principles of randomization.
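The point that a p-value mixes association strength with sample size is easy to demonstrate. The sketch below (made-up counts, not from any real study) runs a Pearson chi-square test on two 2x2 tables with the same odds ratio but a tenfold difference in sample size, using only the Python standard library:

```python
import math

def chi2_p(table):
    """Pearson chi-square test for a 2x2 table [[a, b], [c, d]].
    Returns (odds ratio, p-value)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    rows = [a + b, c + d]
    cols = [a + c, b + d]
    obs = [[a, b], [c, d]]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n  # expected count under independence
            chi2 += (obs[i][j] - expected) ** 2 / expected
    # Upper-tail probability for chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p = math.erfc(math.sqrt(chi2 / 2))
    return (a * d) / (b * c), p

# Same odds ratio (2.0) in both tables; only the counts are scaled by ten
small = [[20, 10], [10, 10]]
large = [[200, 100], [100, 100]]
print(chi2_p(small))  # odds ratio 2.0, p ≈ 0.24
print(chi2_p(large))  # odds ratio 2.0, p ≈ 0.0002
```

The association is identical in the two tables, yet the p-value moves from clearly “non-significant” to far below 0.001 purely because the sample grew, which is exactly why a p-value cannot tell you whether a variable confounds.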
It may well be justified to adjust a randomized trial for factors that by chance became unevenly distributed between the exposed and the unexposed, but this uneven distribution is not detected by p-values, and adjustments should take into consideration all other factors that could confound the association of interest.
Many tables present a crude association and then an adjusted association, and both come with a confidence interval. It is unclear what the first confidence interval is there for; perhaps to remind the reader that there is no particular reason to have confidence in most confidence intervals? We know that estimates come with uncertainties, usually much larger than the confidence interval indicates, because there are many other sources of uncertainty, unless we talk about the perfect randomized trial, which hardly exists.
It makes good sense to provide the crude, unadjusted estimate, because if the crude value differs much from the adjusted estimate we need to know why. But why a confidence interval for these crude estimates? Do not put it in the table just because the computer spits it out. You need to have a reason.
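How much a crude estimate can differ from the adjusted one, and why, can be shown with a toy example. The sketch below uses hypothetical counts and a Mantel-Haenszel summary odds ratio (one standard way to adjust across strata of a confounder): within each stratum of the confounder the exposure-disease odds ratio is exactly 1, yet collapsing the strata produces a crude odds ratio near 9, because the confounder is tied to both exposure and disease.

```python
def odds_ratio(table):
    """Odds ratio for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    return a * d / (b * c)

def mantel_haenszel(strata):
    """Mantel-Haenszel summary odds ratio across 2x2 strata."""
    num = sum(a * d / (a + b + c + d) for (a, b), (c, d) in strata)
    den = sum(b * c / (a + b + c + d) for (a, b), (c, d) in strata)
    return num / den

# Hypothetical data: the confounder raises disease risk and is more common
# among the exposed, but within each stratum exposure does nothing (OR = 1).
stratum_hi = [[80, 20], [8, 2]]    # confounder present: high risk, mostly exposed
stratum_lo = [[2, 8], [20, 80]]    # confounder absent: low risk, mostly unexposed
crude = [[82, 28], [28, 82]]       # the two strata collapsed into one table

print(odds_ratio(crude))                          # ≈ 8.6, entirely confounded
print(mantel_haenszel([stratum_hi, stratum_lo]))  # 1.0 after adjustment
```

A crude-versus-adjusted gap of this size is exactly the signal that a backdoor pathway is at work, which is why reporting the crude estimate alongside the adjusted one is informative even when only the adjusted one answers the question.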
We are interested in theoretical parameters such as risks, rates, probabilities of causation, effects, etc., but we are usually not able to measure these, only to estimate them. Do not state in your paper that you measured effects when you measured an association in the hope that it reflects the effects of interest.
And do not write that you confirmed the results of others or of your own earlier work. We are rarely in the business of confirming anything, as our history painfully shows. Popper advocated the term “corroborate” as more accurate.
These sins may not be “deadly” in a literal sense; they are, rather, deadly boring and deadly annoying. Misinterpretation of p-values in decision making is, however, sometimes a deadly sin. Read about how not to interpret p-values further up in the Epiblog.
— Jørn Olsen, Cesar Victora, Neil Pearce, Patricia Buffler, Shah Ebrahim