Finite-sample Rousseeuw-Croux scale estimators
The paper is based on a series of my research notes:
- Finite-sample efficiency of the Rousseeuw-Croux estimators (2022-08-09)
- Finite-sample bias correction factors for Rousseeuw-Croux scale estimators (2022-09-06)
- Preprint announcement: 'Finite-sample Rousseeuw-Croux scale estimators' (2022-10-18)
Abstract
The Rousseeuw-Croux $S_n$, $Q_n$ scale estimators and the median absolute deviation MAD_n can be used as consistent estimators for the standard deviation under normality. All of them are highly robust: the breakdown point of all three estimators is $50\%$. However, $S_n$ and $Q_n$ are much more efficient than MAD_n: their asymptotic Gaussian efficiency values are $58\%$ and $82\%$ respectively compared to $37\%$ for MAD_n. Although these values look impressive, they are only asymptotic values. The actual Gaussian efficiency of $S_n$ and $Q_n$ for small sample sizes is noticeably lower than in the asymptotic case. The original work by Rousseeuw and Croux (1993) provides only rough approximations of the finite-sample bias-correction factors for $S_n,\, Q_n$ and brief notes on their finite-sample efficiency values. In this paper, we perform extensive Monte-Carlo simulations in order to obtain refined values of the finite-sample properties of the Rousseeuw-Croux scale estimators. We present accurate values of the bias-correction factors and Gaussian efficiency for small samples ($n \leq 100$) and prediction equations for samples of larger sizes.
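To make the estimators from the abstract concrete, here is a minimal Python sketch. It is not the code used in the paper. It assumes the textbook definitions from Rousseeuw and Croux (1993): $S_n$ as the low median over $i$ of the high median over $j$ of $|x_i - x_j|$, and $Q_n$ as the $k$-th smallest pairwise distance with $k = \binom{h}{2}$, $h = \lfloor n/2 \rfloor + 1$. The constants 1.4826, 1.1926, and 2.2219 are the commonly quoted asymptotic consistency constants. The function names and the helper `estimate_bias_factor` are hypothetical, and the implementations are naive $O(n^2)$ versions chosen for readability rather than speed.

```python
import itertools
import random
import statistics

# Commonly quoted asymptotic consistency constants under normality (assumed here).
MAD_CONST = 1.4826
SN_CONST = 1.1926
QN_CONST = 2.2219


def low_median(values):
    """Order statistic number floor((n + 1) / 2), counting from 1."""
    s = sorted(values)
    return s[(len(s) + 1) // 2 - 1]


def high_median(values):
    """Order statistic number floor(n / 2) + 1, counting from 1."""
    s = sorted(values)
    return s[len(s) // 2]


def mad(x):
    """Median absolute deviation around the median, scaled for normality."""
    m = statistics.median(x)
    return MAD_CONST * statistics.median([abs(v - m) for v in x])


def sn(x):
    """Naive S_n without any finite-sample correction."""
    inner = [high_median([abs(xi - xj) for xj in x]) for xi in x]
    return SN_CONST * low_median(inner)


def qn(x):
    """Naive Q_n without any finite-sample correction (needs n >= 2)."""
    n = len(x)
    h = n // 2 + 1
    k = h * (h - 1) // 2
    diffs = sorted(abs(a - b) for a, b in itertools.combinations(x, 2))
    return QN_CONST * diffs[k - 1]


def estimate_bias_factor(estimator, n, repetitions=10_000, seed=42):
    """Monte-Carlo sketch: factor f such that f * E[estimator] is about 1
    for standard normal samples of size n (hypothetical helper)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(repetitions):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += estimator(sample)
    return repetitions / total
```

For example, `estimate_bias_factor(qn, 10)` gives a rough Monte-Carlo estimate of a multiplicative correction for $Q_{10}$. The result depends on the seed and the number of repetitions, so it should be read as an illustration of the simulation idea, not as a replacement for the refined factors reported in the paper.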
Reference
Andrey Akinshin “Finite-sample Rousseeuw-Croux scale estimators” (2022) arXiv:2209.12268
@Article{akinshin2022frc,
title = {Finite-sample Rousseeuw-Croux scale estimators},
author = {Akinshin, Andrey},
year = {2022},
month = {9},
arxiv = {2209.12268},
abstract = {The Rousseeuw-Croux $S_n$, $Q_n$ scale estimators and the median absolute deviation MAD_n can be used as consistent estimators for the standard deviation under normality. All of them are highly robust: the breakdown point of all three estimators is $50\%$. However, $S_n$ and $Q_n$ are much more efficient than MAD_n: their asymptotic Gaussian efficiency values are $58\%$ and $82\%$ respectively compared to $37\%$ for MAD_n. Although these values look impressive, they are only asymptotic values. The actual Gaussian efficiency of $S_n$ and $Q_n$ for small sample sizes is noticeably lower than in the asymptotic case. The original work by Rousseeuw and Croux (1993) provides only rough approximations of the finite-sample bias-correction factors for $S_n,\, Q_n$ and brief notes on their finite-sample efficiency values. In this paper, we perform extensive Monte-Carlo simulations in order to obtain refined values of the finite-sample properties of the Rousseeuw-Croux scale estimators. We present accurate values of the bias-correction factors and Gaussian efficiency for small samples ($n \leq 100$) and prediction equations for samples of larger sizes.}
}