Hodges-Lehmann Estimator
The Hodges–Lehmann estimator (pseudo-median) is a robust estimator of location and, in its two-sample form, of location shift. It can be used as a measure of central tendency in place of the median or the mean.
For a single sample $\mathbf{x} = ( x_1, x_2, \ldots, x_n )$, it is defined as the median of the Walsh averages (the pairwise averages $(x_i + x_j)/2$ for $i \leq j$):
$$ \operatorname{HL}(\mathbf{x}) = \underset{1 \leq i \leq j \leq n}{\operatorname{median}} \left(\frac{x_i + x_j}{2} \right). $$
For two samples $\mathbf{x} = ( x_1, x_2, \ldots, x_n )$ and $\mathbf{y} = ( y_1, y_2, \ldots, y_m )$, the Hodges–Lehmann location shift estimator is defined as follows:
$$ \operatorname{HL}(\mathbf{x}, \mathbf{y}) = \underset{1 \leq i \leq n,\,\, 1 \leq j \leq m}{\operatorname{median}} \left(x_i - y_j \right). $$
Asymptotic breakdown point: $\approx 29\%$; asymptotic Gaussian efficiency: $\approx 96\%$.
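A small worked illustration of both definitions may help; the samples below are made up for this example and do not come from the referenced papers. For $\mathbf{x} = (1, 2, 10)$, the six Walsh averages with $i \leq j$ are $1,\ 1.5,\ 5.5,\ 2,\ 6,\ 10$, so
$$ \operatorname{HL}(\mathbf{x}) = \operatorname{median}(1,\ 1.5,\ 2,\ 5.5,\ 6,\ 10) = \frac{2 + 5.5}{2} = 3.75. $$
For comparison, the sample median is $2$ and the sample mean is $13/3 \approx 4.33$. Taking $\mathbf{y} = (0, 3)$ as a second sample, the six pairwise differences $x_i - y_j$ are $1,\ -2,\ 2,\ -1,\ 10,\ 7$, so
$$ \operatorname{HL}(\mathbf{x}, \mathbf{y}) = \operatorname{median}(-2,\ -1,\ 1,\ 2,\ 7,\ 10) = \frac{1 + 2}{2} = 1.5. $$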
References
Original papers:
Estimates of Location Based on Rank Tests · 1963 · J L Hodges et al.
On the Estimation of Relative Potency in Dilution (-Direct) Assays by Distribution-Free Methods · 1963 · Pranab Kumar Sen
Posts:
Understanding the pitfalls of preferring the median over the mean · 2023-06-20
Median vs. Hodges-Lehmann: compare efficiency under heavy-tailedness · 2023-11-14
Hodges-Lehmann Gaussian efficiency: location shift vs. shift of locations · 2023-09-12
Statistical efficiency of the Hodges-Lehmann median estimator, Part 1 · 2022-05-17
Statistical efficiency of the Hodges-Lehmann median estimator, Part 2 · 2022-05-24
Implementations
R implementation (the default one is buggy):
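hl <- function(x, y = NULL) {
  if (is.null(y)) {
    # One-sample case: median of the Walsh averages (x_i + x_j) / 2
    walsh <- outer(x, x, "+") / 2
    # The matrix is symmetric, so the lower triangle plus the diagonal
    # covers every pair exactly once
    median(walsh[lower.tri(walsh, diag = TRUE)])
  } else {
    # Two-sample case: median of all pairwise differences x_i - y_j
    median(outer(x, y, "-"))
  }
}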
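A brief usage sketch, assuming base R only. The samples `x` and `y` are illustrative values chosen here, not data from the posts above, and `hl` is restated so the snippet runs on its own. It shows how a single outlier moves the mean while leaving the Hodges–Lehmann estimate near the bulk of the data.

```r
# Compact restatement of the estimator so this snippet is self-contained
hl <- function(x, y = NULL) {
  if (is.null(y)) {
    walsh <- outer(x, x, "+") / 2
    median(walsh[lower.tri(walsh, diag = TRUE)])
  } else {
    median(outer(x, y, "-"))
  }
}

x <- c(1, 2, 3, 4, 100)  # illustrative sample with one outlier
y <- c(0, 3)             # illustrative second sample

mean(x)    # 22: pulled upward by the outlier
median(x)  # 3
hl(x)      # 3: median of the 15 Walsh averages
hl(x, y)   # 1.5: median of the 10 pairwise differences x_i - y_j
```

Note that this direct approach builds an $n \times n$ (or $n \times m$) matrix, so memory and time grow quadratically with the sample size; it is meant for small and moderate samples.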