From time to time I post unstructured thoughts about statistics and epidemiology on this page.
Finding evidence in the context of uncertainty is challenging.
Confusion caused by the p-value
November 17, 2017
Long-term discussion about the use and misuse of p-values
Four recent articles
- Goodman S. A dirty dozen: twelve p-value misconceptions. Semin Hematol. 2008;45(3):135-40.
- Nuzzo R. Statistical errors. Nature. 2014;506(7487):150-2.
- Wasserstein RL, Lazar NA. The ASA's Statement on p-Values: Context, Process, and Purpose. The American Statistician. 2016;70(2):129-33.
- Greenland S, Senn SJ, Rothman KJ, Carlin JB, Poole C, Goodman SN, Altman DG. Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations. Eur J Epidemiol. 2016;31(4):337-50.
The calculation of a p-value can be reproduced step by step, so the p-value appears to be an objective measure for evaluating a study. However, assumptions are made explicitly or implicitly whenever a p-value is calculated. It is assumed that the statistical assumptions of the applied method are met and that data collection and statistical analysis introduce no bias (e.g. confounding, selection bias, measurement error, model misspecification). A violation of these requirements limits the validity of the p-value.
Conclusions for epidemiological observational studies
- Avoid Null Hypothesis Significance Testing (NHST), in particular dichotomization (p < 0.05 significant vs. p ≥ 0.05 not significant) or "trichotomization" (significant, marginally significant, not significant).
- Don't base a decision ("relevant" vs. "not relevant") on the p-value alone.
- The p-value is not a measure of the relevance of a study result. It measures the evidence against the null hypothesis and is calculated under the assumption that the null hypothesis is true. Therefore the p-value should be interpreted, if at all, as a continuous measure without arbitrary cutoffs.
- The p-value is confounded by sample size and precision. (Lang JM, Rothman KJ, Cann CI. That confounded P-value. Epidemiology. 1998;9(1):7-8).
- Transforming the p-value into a more easily comprehensible measure like the surprisal value -log2(p) can help to reduce the misuse of p-values (Greenland S. Invited Commentary: The Need for Cognitive Science in Methodology, AJE. 2017;186(6): 639-645; http://www.umsl.edu/~fraundorfp/egsurpri.html, Fraundorf P. Examples of surprisal. Accessed November 16, 2017).
- Use parameter estimates with confidence intervals (but these should also be interpreted carefully).
- Bayesian statistics provide a way to quantify uncertainty.
- Compare your results with your expectations.
- Conclusions should not be based on a single study. The results of a study should be interpreted taking into account the existing evidence (knowledge).
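Two of the points above, the surprisal value -log2(p) and the dependence of the p-value on sample size, can be illustrated with a small sketch. It is only an illustration under stated assumptions: the function names (`s_value`, `two_sided_p`, `wald_ci`, `summarize`) are made up for this example, the p-value and confidence interval use a simple normal (Wald) approximation for a difference in means between two equal-sized groups, and the numbers are invented rather than taken from any study.

```python
import math
from statistics import NormalDist


def s_value(p):
    """Surprisal (S-value) in bits: -log2(p)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return -math.log2(p)


def two_sided_p(estimate, se):
    """Two-sided p-value for H0: effect = 0 (normal approximation, an assumption)."""
    z = abs(estimate) / se
    return 2.0 * (1.0 - NormalDist().cdf(z))


def wald_ci(estimate, se, level=0.95):
    """Confidence interval under the same normal approximation."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return estimate - z * se, estimate + z * se


def summarize(diff, sd, n_per_group):
    """p-value, S-value and 95% CI for a mean difference of two equal-sized groups."""
    se = sd * math.sqrt(2.0 / n_per_group)
    p = two_sided_p(diff, se)
    return p, s_value(p), wald_ci(diff, se)


if __name__ == "__main__":
    # Invented numbers: the same observed difference (0.2 SD) at three study sizes.
    for n in (50, 200, 800):
        p, s, (lo, hi) = summarize(diff=0.2, sd=1.0, n_per_group=n)
        print(f"n={n:4d}  p={p:.5f}  S={s:5.2f} bits  95% CI=({lo:.2f}, {hi:.2f})")
```

With the same observed difference, the p-value falls from about 0.32 (50 per group) to about 0.05 (200 per group) to well below 0.001 (800 per group), while the point estimate stays the same and only the width of the interval changes. On the surprisal scale, p = 0.05 corresponds to about 4.3 bits, which is roughly as surprising as seeing four heads in a row from a fair coin.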
Can there be an objective decision rule for the evaluation of a study result when so many subjective decisions are made? These decisions start with the evaluation of the existing evidence and continue with the choice of study design, study population, study size, inclusion and exclusion criteria, the statistical methods to be used, the variables to be included, etc.
The rationale behind a decision should be traceable. However, a traceable decision is not necessarily an objective one.