Library / Statistical power, Sample size, and Their Reporting in Randomized Controlled Trials

Authors	D Moher C Dulberg G Wells
Year	1994
DOI	10.1001/jama.1994.03520020048013
Tags	Mathematics Statistics Statistical Power Underpowered research

Reference

D Moher, C Dulberg, G Wells “Statistical power, sample size, and their reporting in randomized controlled trials” (1994) // JAMA: The Journal of the American Medical Association. Publisher: American Medical Association (AMA). Vol. 272. No 2. Pp. 122. DOI: 10.1001/jama.1994.03520020048013

Bib

@Article{moher1994,
  title = {Statistical power, sample size, and their reporting in randomized controlled trials},
  volume = {272},
  issn = {0098-7484},
  url = {http://dx.doi.org/10.1001/jama.1994.03520020048013},
  doi = {10.1001/jama.1994.03520020048013},
  number = {2},
  journal = {JAMA: The Journal of the American Medical Association},
  publisher = {American Medical Association (AMA)},
  author = {D Moher and C Dulberg and G Wells},
  year = {1994},
  month = {jul},
  pages = {122}
}

Quotes (1)

Low Statistical Power

Objective. To describe the pattern over time in the level of statistical power and the reporting of sample size calculations in published randomized controlled trials (RCTs) with negative results.
Design. Our study was a descriptive survey. Power to detect 25% and 50% relative differences was calculated for the subset of trials with negative results in which a simple two-group parallel design was used. Criteria were developed both to classify trial results as positive or negative and to identify the primary outcomes. Power calculations were based on results from the primary outcomes reported in the trials.
Population. We reviewed all 383 RCTs published in JAMA, Lancet, and the New England Journal of Medicine in 1975, 1980, 1985, and 1990.
Results. Twenty-seven percent of the 383 RCTs (n=102) were classified as having negative results. The number of published RCTs more than doubled from 1975 to 1990, with the proportion of trials with negative results remaining fairly stable. Of the simple two-group parallel design trials having negative results with dichotomous or continuous primary outcomes (n=70), only 16% and 36% had sufficient statistical power (80%) to detect a 25% or 50% relative difference, respectively. These percentages did not consistently increase over time. Overall, only 32% of the trials with negative results reported sample size calculations, but the percentage doing so has improved over time from 0% in 1975 to 43% in 1990. Only 20 of the 102 reports made any statement related to the clinical significance of the observed differences.
Conclusions. Most trials with negative results did not have large enough sample sizes to detect a 25% or a 50% relative difference. This result has not changed over time. Few trials discussed whether the observed differences were clinically important. There are important reasons to change this practice. The reporting of statistical power and sample size also needs to be improved.
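
Note. As an illustration of what "power to detect a 25% or 50% relative difference" means for a two-group trial with a dichotomous outcome, the sketch below uses the standard normal approximation for a two-sided comparison of two proportions. It is not taken from the paper, whose exact calculation method is not given in the abstract. The function name, the 40% control event rate, and the 100 patients per group are hypothetical values chosen only for the example.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p_control, relative_difference, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions.

    Illustrative helper (not from the paper). Uses the normal approximation
    and ignores the negligible probability of rejecting in the wrong direction.
    """
    # Event rate in the treatment arm after the assumed relative reduction.
    p_treat = p_control * (1 - relative_difference)
    delta = abs(p_control - p_treat)

    # Standard error under the null hypothesis (pooled proportion).
    p_bar = (p_control + p_treat) / 2
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_group)

    # Standard error under the alternative (separate proportions).
    se_alt = sqrt(
        (p_control * (1 - p_control) + p_treat * (1 - p_treat)) / n_per_group
    )

    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf((delta - z_crit * se_null) / se_alt)


# Hypothetical trial: 40% control event rate, 100 patients per group.
power_25 = power_two_proportions(0.40, 0.25, 100)
power_50 = power_two_proportions(0.40, 0.50, 100)
print(f"Power for a 25% relative difference: {power_25:.2f}")
print(f"Power for a 50% relative difference: {power_50:.2f}")
```

Under these assumed inputs, the trial falls short of the 80% threshold for the 25% relative difference but exceeds it for the 50% difference. This is the kind of gap the survey measured across published trials.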