A dirty dozen: twelve p-value misconceptions

Excerpts

Twelve P-Value Misconceptions:

  1. If P = .05, the null hypothesis has only a 5% chance of being true.
  2. A nonsignificant difference (eg, P > .05) means there is no difference between groups.
  3. A statistically significant finding is clinically important.
  4. Studies with P values on opposite sides of .05 are conflicting.
  5. Studies with the same P value provide the same evidence against the null hypothesis.
  6. P = .05 means that we have observed data that would occur only 5% of the time under the null hypothesis.
  7. P = .05 and P ≤ .05 mean the same thing.
  8. P values are properly written as inequalities (eg, “P < .02” when P = .015).
  9. P = .05 means that if you reject the null hypothesis, the probability of a type I error is only 5%.
  10. With a P = .05 threshold for significance, the chance of a type I error will be 5%.
  11. You should use a one-sided P value when you don’t care about a result in one direction, or a difference in that direction is impossible.
  12. A scientific conclusion or treatment policy should be based on whether or not the P value is significant.
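Misconception #1 can be made concrete with a small simulation (a hypothetical sketch, not from the paper; the sample size, effect size, and 50% prior are illustrative assumptions): even when half of all tested hypotheses are truly null, the fraction of true nulls among results landing right at P = .05 comes out far above 5%.

```python
import math
import numpy as np

rng = np.random.default_rng(0)

n = 50              # observations per group (assumed)
delta = 0.5         # assumed effect size (in SD units) when a real effect exists
n_experiments = 200_000

null_true = rng.random(n_experiments) < 0.5   # half the hypotheses are truly null
effect = np.where(null_true, 0.0, delta)

# z statistic for a two-sample mean comparison with known SD = 1:
# the observed difference in means has standard error sqrt(2/n)
se = math.sqrt(2.0 / n)
z = rng.normal(loc=effect / se, scale=1.0)

# two-sided p-value via the complementary error function
p = np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in z])

# Among results that land right at "P = .05" (0.04 < p <= 0.05),
# what fraction actually came from a true null?
at_05 = (p > 0.04) & (p <= 0.05)
frac_null = null_true[at_05].mean()
print(f"Fraction of true nulls among P ~ .05 results: {frac_null:.2f}")
```

Under these assumptions the fraction of true nulls among borderline-significant results is several times larger than 5%, which is the point of misconception #1: the p-value is computed *assuming* the null, so it cannot by itself be the probability that the null is true.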

Abstract

The P value is a measure of statistical evidence that appears in virtually all medical research papers. Its interpretation is made extraordinarily difficult because it is not part of any formal system of statistical inference. As a result, the P value’s inferential meaning is widely and often wildly misconstrued, a fact that has been pointed out in innumerable papers and books appearing since at least the 1940s. This commentary reviews a dozen of these common misinterpretations and explains why each is wrong. It also reviews the possible consequences of these improper understandings or representations of its meaning. Finally, it contrasts the P value with its Bayesian counterpart, the Bayes’ factor, which has virtually all of the desirable properties of an evidential measure that the P value lacks, most notably interpretability. The most serious consequence of this array of P-value misconceptions is the false belief that the probability of a conclusion being in error can be calculated from the data in a single experiment without reference to external evidence or the plausibility of the underlying mechanism.
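The Bayes factor mentioned in the abstract can be bounded from a p-value alone. A minimal sketch using the minimum Bayes factor exp(-z²/2), a bound Goodman uses in related work (treated here as an assumed formula, not as this paper's derivation):

```python
import math
from statistics import NormalDist

def min_bayes_factor(p_value: float) -> float:
    """Minimum Bayes factor bound exp(-z^2 / 2) for a two-sided p-value:
    the strongest evidence against the null that the data can supply
    under any normal alternative."""
    z = NormalDist().inv_cdf(1.0 - p_value / 2.0)  # invert two-sided p to z
    return math.exp(-z * z / 2.0)

bf = min_bayes_factor(0.05)
# With even (50:50) prior odds, the posterior probability of the null:
posterior_null = bf / (1.0 + bf)
print(f"min Bayes factor at P = .05: {bf:.3f}")
print(f"posterior P(null) with even prior odds: {posterior_null:.3f}")
```

At P = .05 the bound is roughly 0.15, i.e. at best about 7:1 odds against the null, and with even prior odds the null retains a posterior probability of roughly 13% — well above the 5% that misconception #1 would suggest.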

Reference

Steven Goodman, “A dirty dozen: twelve p-value misconceptions,” Seminars in Hematology 45(3): 135–140 (2008). DOI: 10.1053/j.seminhematol.2008.04.003

@Article{goodman2008,
  title = {A dirty dozen: twelve {P}-value misconceptions},
  author = {Goodman, Steven},
  abstract = {The P value is a measure of statistical evidence that appears in virtually all medical research papers. Its interpretation is made extraordinarily difficult because it is not part of any formal system of statistical inference. As a result, the P value's inferential meaning is widely and often wildly misconstrued, a fact that has been pointed out in innumerable papers and books appearing since at least the 1940s. This commentary reviews a dozen of these common misinterpretations and explains why each is wrong. It also reviews the possible consequences of these improper understandings or representations of its meaning. Finally, it contrasts the P value with its Bayesian counterpart, the Bayes' factor, which has virtually all of the desirable properties of an evidential measure that the P value lacks, most notably interpretability. The most serious consequence of this array of P-value misconceptions is the false belief that the probability of a conclusion being in error can be calculated from the data in a single experiment without reference to external evidence or the plausibility of the underlying mechanism.},
  journal = {Seminars in Hematology},
  volume = {45},
  number = {3},
  pages = {135--140},
  year = {2008},
  publisher = {Elsevier},
  doi = {10.1053/j.seminhematol.2008.04.003}
}