Psychological measures aren’t toothbrushes

Excerpts

A jingle-jangle of labels
Some measures actually quantify different things, but share similar labels (or even identical ones: In APA PsycTests, no less than 19 different tests go by “theory of planned behavior questionnaire”, 15 by “job satisfaction scale”, and 11 by “self-efficacy scale”). Other measures quantify the same thing as existing measures but under a different label. Known as the Jingle and Jangle fallacies, these are common and well-documented threats to the replicability and validity of psychological research, e.g. in studies on emotion. They involve a nominal fallacy: that a measure’s name tells you about its contents or what it measures.

Undisclosed flexibility
Even when authors profess using the same measure of the same construct, all is not yet well because disclosed and undisclosed measurement flexibility, i.e. changes to a measure with known or unknown psychometric consequences, is common. Dropping, adding, and altering items in self-report scales, aggregating total scores in various ways in laboratory tasks, or varying stimuli and trial durations all occur while researchers not only refer to the same construct, but actually to the same nominal instrument. Even when all decisions are disclosed, only a methodological literature review will reveal that many studies used, for instance, unique aggregation algorithms, scoring strategies, or items, often with unknown psychometric consequences.

— Page 1
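
Note (not part of the excerpt): a minimal sketch of how scoring flexibility can change results even when the nominal instrument is the same. The responses, the four-item scale, the reverse-keyed item, and the function names are all hypothetical and chosen only for illustration; they do not come from the paper.

```python
from statistics import mean

# Hypothetical 1-5 Likert responses to a 4-item scale.
# Assumption for illustration: the item at index 3 is reverse-keyed.
responses = {
    "p1": [4, 4, 4, 2],
    "p2": [5, 5, 1, 5],
    "p3": [3, 3, 3, 2],
}

def score_full(items, reverse=(3,), scale_max=5):
    """Mean of all items after reverse-keying the flagged items."""
    keyed = [(scale_max + 1 - x) if i in reverse else x
             for i, x in enumerate(items)]
    return mean(keyed)

def score_dropped(items, drop=(3,)):
    """Mean of the remaining items after dropping the flagged items."""
    return mean([x for i, x in enumerate(items) if i not in drop])

def rank(scores):
    """Participant ids ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)

full_scores = {pid: score_full(items) for pid, items in responses.items()}
dropped_scores = {pid: score_dropped(items) for pid, items in responses.items()}

print(rank(full_scores))     # ['p1', 'p3', 'p2']
print(rank(dropped_scores))  # ['p1', 'p2', 'p3']
```

Both scoring rules would plausibly be reported under the same scale name, yet they order the second and third participants differently. That is the kind of consequence that stays invisible unless the exact items and the aggregation rule are reported.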

Scrutiny of the details of previous work’s measures is necessary to both inform how we should interpret existing findings and to increase measures’ future reuse potential. Transparency about the fine grain details of our measures allows others to reuse them with fidelity, and allows for the fidelity of measures to be checked between studies. These aspects of transparency and their scientific benefits have yet to be tapped by our field. If we want to build a cumulative evidence base in psychology, we need to standardise our measures and protocols. Psychologists need to stop remixing and recycling, and start reusing (measures, not toothbrushes).

— Page 2

Hence, (a) the lack of strong empirical or procedural norms in measurement, (b) the lack of transparency in reporting, and (c) the lack of common referents (i.e., test norms) in measurement are an enormous threat to meaningful evidence cumulation and research synthesis.

— Page 2

Abstract

Most psychological measures are used only once or twice. This proliferation and variability threaten the credibility of research. The Standardisation Of BEhavior Research (SOBER) guidelines aim to ensure that psychological measures are standardised and, unlike toothbrushes, reused by others.

Reference

Malte Elson, Ian Hussey, Taym Alsalti, Ruben C. Arslan. “Psychological measures aren’t toothbrushes.” Communications Psychology 1(1), 2023. DOI: 10.1038/s44271-023-00026-9

@Article{elson2023,
  title = {Psychological measures aren’t toothbrushes},
  abstract = {Most psychological measures are used only once or twice. This proliferation and variability threaten the credibility of research. The Standardisation Of BEhavior Research (SOBER) guidelines aim to ensure that psychological measures are standardised and, unlike toothbrushes, reused by others.},
  volume = {1},
  issn = {2731-9121},
  url = {http://dx.doi.org/10.1038/s44271-023-00026-9},
  doi = {10.1038/s44271-023-00026-9},
  number = {1},
  journal = {Communications Psychology},
  publisher = {Springer Science and Business Media LLC},
  author = {Elson, Malte and Hussey, Ian and Alsalti, Taym and Arslan, Ruben C},
  year = {2023},
  month = {oct},
  custom-url-pdf = {https://www.nature.com/articles/s44271-023-00026-9.pdf}
}