Statistical significance is one of the most misunderstood concepts in empirical research, and the misunderstandings aren't trivial — they shape which treatments get approved, which policies get adopted, and which scientific claims make headlines.
A p-value of 0.05 means: if the null hypothesis were true, you'd see data this extreme or more extreme 5% of the time by chance. That's it. It does not mean:
This is the one that really matters. If you're testing a hypothesis that's unlikely to be true (say, 10% prior probability), even a p < 0.05 result has a substantial chance of being a false positive: roughly 36%, assuming 80% power and a 0.05 threshold. The threshold that makes sense depends on the prior probability of the hypothesis, which most significance-based frameworks ignore.
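The roughly 36% figure follows from Bayes' rule. Here is a minimal sketch, assuming a 0.05 threshold and 80% power (the power value is an assumption, chosen because it is conventional and reproduces the figure). The function name is illustrative, not from any library.

```python
def false_positive_risk(prior, alpha=0.05, power=0.80):
    """Share of 'significant' results that come from true null hypotheses.

    prior: probability the tested hypothesis is true before seeing data
    alpha: significance threshold (chance of a false alarm under the null)
    power: chance of detecting the effect when it is real (assumed value)
    """
    true_positives = prior * power          # real effects that reach significance
    false_positives = (1 - prior) * alpha   # null effects that reach significance
    return false_positives / (true_positives + false_positives)


risk = false_positive_risk(prior=0.10)
print(f"{risk:.0%}")  # 36%
```

Raising the prior to 50% under the same assumptions drops the risk to about 6%, which is why the same p-value can mean very different things for a long-shot hypothesis and a well-motivated one.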
The 2019 editorial in The American Statistician on moving beyond statistical significance (Wasserstein, Schirm, and Lazar, "Moving to a World Beyond 'p < 0.05'") is worth reading in full. The key point: stop using binary significant/not significant framing entirely. Report effect sizes with confidence intervals. Let readers judge practical significance. The binary threshold creates perverse incentives (p-hacking, HARKing, file drawer effects) that degrade the literature.
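As a rough illustration of reporting an effect size with a confidence interval instead of a significant/not significant verdict, here is a sketch using only the Python standard library. The data are invented for illustration, the function name is hypothetical, and the interval uses a normal approximation, which is an assumption that suits larger samples.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def mean_difference_ci(treatment, control, confidence=0.95):
    """Difference in means with a normal-approximation confidence interval."""
    diff = mean(treatment) - mean(control)
    # Welch-style standard error: does not assume equal variances.
    se = sqrt(variance(treatment) / len(treatment)
              + variance(control) / len(control))
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return diff, diff - z * se, diff + z * se


# Hypothetical measurements, invented purely for illustration.
treatment = [5.1, 4.8, 6.0, 5.5, 5.9, 5.2, 6.1, 5.4]
control = [4.9, 4.7, 5.3, 5.0, 5.6, 4.8, 5.2, 5.1]

diff, low, high = mean_difference_ci(treatment, control)
print(f"difference = {diff:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```

With samples this small, a t-based interval would be somewhat wider than the normal approximation shown here. Either way, the reader sees the estimated size of the effect and how uncertain it is, not just which side of a threshold it fell on.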
Reference: Wasserstein, Schirm, and Lazar (2019), "Moving to a World Beyond 'p < 0.05'", The American Statistician 73(sup1): https://www.tandfonline.com/doi/full/10.1080/00031305.2019.1583913