Finite-sample Rousseeuw-Croux scale estimators
The paper is based on a series of my research notes:
- Finite-sample efficiency of the Rousseeuw-Croux estimators (2022-08-09)
- Finite-sample bias correction factors for Rousseeuw-Croux scale estimators (2022-09-06)
- Preprint announcement: 'Finite-sample Rousseeuw-Croux scale estimators' (2022-10-18)
Abstract
The Rousseeuw-Croux $S_n$, $Q_n$ scale estimators and the median absolute deviation $\operatorname{MAD}_n$ can be used as consistent estimators for the standard deviation under normality. All of them are highly robust: the breakdown point of all three estimators is $50\%$. However, $S_n$ and $Q_n$ are much more efficient than $\operatorname{MAD}_n$: their asymptotic Gaussian efficiency values are $58\%$ and $82\%$ respectively compared to $37\%$ for $\operatorname{MAD}_n$. Although these values look impressive, they are only asymptotic values. The actual Gaussian efficiency of $S_n$ and $Q_n$ for small sample sizes is noticeably lower than in the asymptotic case. The original work by Rousseeuw and Croux (1993) provides only rough approximations of the finite-sample bias-correction factors for $S_n,\, Q_n$ and brief notes on their finite-sample efficiency values. In this paper, we perform extensive Monte-Carlo simulations in order to obtain refined values of the finite-sample properties of the Rousseeuw-Croux scale estimators. We present accurate values of the bias-correction factors and Gaussian efficiency for small samples ($n \leq 100$) and prediction equations for samples of larger sizes.
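To make the idea of a finite-sample bias concrete, here is a minimal Monte Carlo sketch in Python. It is an illustration only, not the simulation procedure from the paper. It uses the median absolute deviation rather than $S_n$ or $Q_n$, because the MAD needs only a median and the asymptotic consistency constant $1/\Phi^{-1}(0.75) \approx 1.4826$. The function names, the replication count, and the seed are all assumptions made for this example.

```python
import random
import statistics

# Asymptotic consistency constant for the MAD under normality: 1 / Phi^{-1}(0.75).
MAD_ASYMPTOTIC_CONSTANT = 1.0 / statistics.NormalDist().inv_cdf(0.75)


def mad(sample, constant=MAD_ASYMPTOTIC_CONSTANT):
    """Median absolute deviation around the median, scaled by `constant`."""
    center = statistics.median(sample)
    deviations = [abs(x - center) for x in sample]
    return constant * statistics.median(deviations)


def simulate_mean_mad(n, replications=20000, seed=42):
    """Average scaled MAD over many standard normal samples of size n.

    For an unbiased estimator of the standard deviation this would be
    close to 1, because the samples are drawn from N(0, 1).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(replications):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += mad(sample)
    return total / replications


def estimate_bias_correction_factor(n, replications=20000, seed=42):
    """Rough multiplier that would make the asymptotically scaled MAD
    unbiased, on average, for samples of size n (hypothetical helper)."""
    return 1.0 / simulate_mean_mad(n, replications, seed)


if __name__ == "__main__":
    for n in (5, 10, 100):
        print(n, simulate_mean_mad(n), estimate_bias_correction_factor(n))
```

For small $n$ the simulated mean falls noticeably below 1, so the asymptotic constant alone underestimates the standard deviation. A size-specific correction factor compensates for exactly this kind of gap.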
Reference
Andrey Akinshin “Finite-sample Rousseeuw-Croux scale estimators” (2022) arXiv:2209.12268
@Article{akinshin2022frc,
title = {Finite-sample Rousseeuw-Croux scale estimators},
author = {Akinshin, Andrey},
year = {2022},
month = {9},
arxiv = {2209.12268},
abstract = {The Rousseeuw-Croux $S_n$, $Q_n$ scale estimators and the median absolute deviation $\operatorname{MAD}_n$ can be used as consistent estimators for the standard deviation under normality. All of them are highly robust: the breakdown point of all three estimators is $50\%$. However, $S_n$ and $Q_n$ are much more efficient than $\operatorname{MAD}_n$: their asymptotic Gaussian efficiency values are $58\%$ and $82\%$ respectively compared to $37\%$ for $\operatorname{MAD}_n$. Although these values look impressive, they are only asymptotic values. The actual Gaussian efficiency of $S_n$ and $Q_n$ for small sample sizes is noticeably lower than in the asymptotic case. The original work by Rousseeuw and Croux (1993) provides only rough approximations of the finite-sample bias-correction factors for $S_n,\, Q_n$ and brief notes on their finite-sample efficiency values. In this paper, we perform extensive Monte-Carlo simulations in order to obtain refined values of the finite-sample properties of the Rousseeuw-Croux scale estimators. We present accurate values of the bias-correction factors and Gaussian efficiency for small samples ($n \leq 100$) and prediction equations for samples of larger sizes.}
}