Finite-sample Rousseeuw-Croux scale estimators

Andrey Akinshin · 2022

PDF · arXiv:2209.12268 · github.com/AndreyAkinshin/paper-frc

The paper is based on a series of my research notes:

Finite-sample efficiency of the Rousseeuw-Croux estimators (2022-08-09)
Finite-sample bias correction factors for Rousseeuw-Croux scale estimators (2022-09-06)
Preprint announcement: 'Finite-sample Rousseeuw-Croux scale estimators' (2022-10-18)

Abstract

The Rousseeuw-Croux $S_n$, $Q_n$ scale estimators and the median absolute deviation $\operatorname{MAD}_n$ can be used as consistent estimators for the standard deviation under normality. All of them are highly robust: the breakdown point of all three estimators is $50\%$. However, $S_n$ and $Q_n$ are much more efficient than $\operatorname{MAD}_n$: their asymptotic Gaussian efficiency values are $58\%$ and $82\%$ respectively compared to $37\%$ for $\operatorname{MAD}_n$. Although these values look impressive, they are only asymptotic values. The actual Gaussian efficiency of $S_n$ and $Q_n$ for small sample sizes is noticeably lower than in the asymptotic case. The original work by Rousseeuw and Croux (1993) provides only rough approximations of the finite-sample bias-correction factors for $S_n,\, Q_n$ and brief notes on their finite-sample efficiency values. In this paper, we perform extensive Monte-Carlo simulations in order to obtain refined values of the finite-sample properties of the Rousseeuw-Croux scale estimators. We present accurate values of the bias-correction factors and Gaussian efficiency for small samples ($n \leq 100$) and prediction equations for samples of larger sizes.
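To make the abstract more concrete, here is a minimal Python sketch of the three estimators and of a small Monte-Carlo experiment of the kind described above. It is not the code from the paper. The definitions follow the usual textbook form of Rousseeuw and Croux (1993) as I understand it (high inner median and low outer median for $S_n$, the $k$-th smallest pairwise distance for $Q_n$), and the constants `MAD_CONST`, `SN_CONST`, `QN_CONST` are the commonly quoted asymptotic consistency constants; all of these, along with the function names and the naive $O(n^2)$ approach, are assumptions of this sketch rather than anything stated on this page.

```python
import random
import statistics

# Asymptotic consistency constants under normality.
# Assumption: commonly quoted values, not taken from this page.
MAD_CONST = 1.4826
SN_CONST = 1.1926
QN_CONST = 2.2219


def mad(x):
    """Median absolute deviation around the median, scaled for normality."""
    m = statistics.median(x)
    return MAD_CONST * statistics.median([abs(v - m) for v in x])


def sn(x):
    """Naive O(n^2 log n) S_n: low median over i of the high median over j of |x_i - x_j|."""
    n = len(x)
    inner = []
    for i in range(n):
        d = sorted(abs(x[i] - x[j]) for j in range(n))
        inner.append(d[n // 2])  # high median: order statistic floor(n/2) + 1
    inner.sort()
    return SN_CONST * inner[(n + 1) // 2 - 1]  # low median: order statistic floor((n+1)/2)


def qn(x):
    """Naive O(n^2 log n) Q_n: k-th smallest pairwise distance, k = C(h, 2), h = floor(n/2) + 1."""
    n = len(x)
    h = n // 2 + 1
    k = h * (h - 1) // 2
    diffs = sorted(abs(x[i] - x[j]) for i in range(n) for j in range(i + 1, n))
    return QN_CONST * diffs[k - 1]


def mean_estimate(estimator, n, repetitions=2000, seed=1):
    """Average value of an estimator over standard normal samples of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(repetitions):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += estimator(sample)
    return total / repetitions


if __name__ == "__main__":
    # The true standard deviation is 1, so a mean far from 1 indicates
    # finite-sample bias of the asymptotically scaled estimator.
    for name, est in (("MAD", mad), ("Sn", sn), ("Qn", qn)):
        print(name, round(mean_estimate(est, n=5), 3))
```

With a true standard deviation of 1, the reciprocal of each printed mean gives a rough multiplicative correction for that sample size under this setup. A few thousand repetitions are only enough to see the effect; obtaining accurate factors needs far more extensive simulations.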

Reference

Andrey Akinshin “Finite-sample Rousseeuw-Croux scale estimators” (2022) arXiv:2209.12268

@Article{akinshin2022frc,
  title = {Finite-sample Rousseeuw-Croux scale estimators},
  author = {Akinshin, Andrey},
  year = {2022},
  month = {9},
  arxiv = {2209.12268},
  abstract = {The Rousseeuw-Croux $S_n$, $Q_n$ scale estimators and the median absolute deviation $\operatorname{MAD}_n$ can be used as consistent estimators for the standard deviation under normality. All of them are highly robust: the breakdown point of all three estimators is $50\%$. However, $S_n$ and $Q_n$ are much more efficient than $\operatorname{MAD}_n$: their asymptotic Gaussian efficiency values are $58\%$ and $82\%$ respectively compared to $37\%$ for $\operatorname{MAD}_n$. Although these values look impressive, they are only asymptotic values. The actual Gaussian efficiency of $S_n$ and $Q_n$ for small sample sizes is noticeably lower than in the asymptotic case. The original work by Rousseeuw and Croux (1993) provides only rough approximations of the finite-sample bias-correction factors for $S_n,\, Q_n$ and brief notes on their finite-sample efficiency values. In this paper, we perform extensive Monte-Carlo simulations in order to obtain refined values of the finite-sample properties of the Rousseeuw-Croux scale estimators. We present accurate values of the bias-correction factors and Gaussian efficiency for small samples ($n \leq 100$) and prediction equations for samples of larger sizes.}
}