Dichotomization Should Be Avoided in Most Cases


That dichotomizing a continuous outcome loses information is nothing new. However, much of the previous literature concerns the optimal choice of cut point, which depends upon the parameters we wish to estimate. If we are lucky, the chosen cut point lies near the optimal one, but the consequences of dichotomizing become more dire as the cut point moves away from the optimum. We focus our study on evaluating the losses caused by dichotomization at a given cut point. While the analysis of dichotomized outcomes may be easier, this approach offers no benefit when the true outcomes can be observed and the ‘working’ model is flexible enough to describe the population at hand. Thus, dichotomization should be avoided in most cases. Only when we wish to estimate a CDF value, our working model poorly approximates reality, and our sample size is large will the bias of model-based estimators outweigh their improvement in variance. In this case, the dichotomized estimator may lead to better results, but further study-specific consideration is needed. We also emphasize that while the analysis should be done using the actual outcomes, some aspects of it can be reported on a dichotomized scale.
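The bias-variance trade-off described above can be illustrated with a small simulation. The sketch below is not the study's own analysis. It assumes a normal working model, a cut point of 0.5, and two hypothetical populations (standard normal and unit-rate exponential) chosen only for illustration; the function name `mse_of_estimators` is likewise an assumption. It compares the mean squared error of the dichotomized estimator of a CDF value (the sample proportion at or below the cut point) with that of the model-based estimator (the fitted normal CDF evaluated at the cut point).

```python
import math
import random
from statistics import NormalDist


def mse_of_estimators(sampler, true_p, cut, n, reps, seed=0):
    """Monte Carlo MSE of two estimators of P(Y <= cut).

    Returns (mse_dichotomized, mse_model_based). The model-based
    estimator assumes a normal working model (an illustrative choice).
    """
    rng = random.Random(seed)
    se_dich = 0.0
    se_model = 0.0
    for _ in range(reps):
        y = [sampler(rng) for _ in range(n)]
        # Dichotomized estimator: proportion of outcomes at or below the cut.
        p_dich = sum(v <= cut for v in y) / n
        # Model-based estimator: fit a normal distribution, evaluate its CDF.
        p_model = NormalDist.from_samples(y).cdf(cut)
        se_dich += (p_dich - true_p) ** 2
        se_model += (p_model - true_p) ** 2
    return se_dich / reps, se_model / reps


CUT = 0.5  # hypothetical cut point

# Case 1: the working model is correct (standard normal data), small sample.
well_specified = mse_of_estimators(
    sampler=lambda rng: rng.gauss(0.0, 1.0),
    true_p=NormalDist().cdf(CUT),
    cut=CUT, n=50, reps=2000,
)

# Case 2: the working model is wrong (exponential data), large sample.
misspecified = mse_of_estimators(
    sampler=lambda rng: rng.expovariate(1.0),
    true_p=1.0 - math.exp(-CUT),
    cut=CUT, n=2000, reps=200,
)

print("correct model, n=50:   dichotomized MSE %.5f, model-based MSE %.5f" % well_specified)
print("wrong model,   n=2000: dichotomized MSE %.5f, model-based MSE %.5f" % misspecified)
```

Under these assumptions, the model-based estimator should show the smaller error when the working model is correct, while in the misspecified, large-sample case its bias does not shrink with the sample size and the dichotomized estimator should do better. The specific populations, cut point, and sample sizes would need to be replaced with study-specific choices.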
