Hodges-Lehmann Estimator

The Hodges–Lehmann estimator (pseudo-median) is a robust estimator of location and, in its two-sample form, of location shift. It can serve as an estimator of the median, the mean, or central tendency in general.

For a single sample $\mathbf{x} = ( x_1, x_2, \ldots, x_n )$, it is defined as the median of the Walsh averages:

$$ \operatorname{HL}(\mathbf{x}) = \underset{1 \leq i \leq j \leq n}{\operatorname{median}} \left(\frac{x_i + x_j}{2} \right). $$
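As a small illustration of the one-sample definition, the following R sketch enumerates the Walsh averages explicitly and takes their median. The sample values and the variable names (`pairs`, `walsh`) are made up for this example; for a sample of size $n$ there are $n(n+1)/2$ such averages.

```r
# Made-up sample, used for illustration only.
x <- c(1, 2, 4)

# All index pairs (i, j) with i <= j.
pairs <- expand.grid(i = seq_along(x), j = seq_along(x))
pairs <- pairs[pairs$i <= pairs$j, ]

# Walsh averages (x_i + x_j) / 2 for each pair.
walsh <- (x[pairs$i] + x[pairs$j]) / 2

sort(walsh)    # 1.0 1.5 2.0 2.5 3.0 4.0
median(walsh)  # 2.25
```

Here the estimate (2.25) lies between the sample median (2) and the sample mean (about 2.33).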

For two samples $\mathbf{x} = ( x_1, x_2, \ldots, x_n )$ and $\mathbf{y} = ( y_1, y_2, \ldots, y_m )$, the Hodges-Lehmann location shift estimator is defined as follows:

$$ \operatorname{HL}(\mathbf{x}, \mathbf{y}) = \underset{1 \leq i \leq n,\,\, 1 \leq j \leq m}{\operatorname{median}} \left(x_i - y_j \right). $$
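The two-sample definition can be checked by hand on a tiny example. The sketch below uses made-up samples and an illustrative variable name (`diffs`); it also shows that the shift estimate need not equal the difference of the two sample medians.

```r
# Made-up samples, used for illustration only.
x <- c(5, 7, 9)
y <- c(1, 2, 6)

# All n * m pairwise differences x_i - y_j.
diffs <- as.vector(outer(x, y, "-"))

sort(diffs)            # -1 1 3 3 4 5 6 7 8
median(diffs)          # 4
median(x) - median(y)  # 5
```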

Asymptotic breakdown point: $\approx 29\%$; asymptotic Gaussian efficiency: $\approx 96\%$.
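The following R sketch gives a rough finite-sample feel for the breakdown behavior; it is an illustration on made-up data, not a derivation of the asymptotic value. The helper name `hl_one_sample` is hypothetical. With one corrupted value out of five (20%), the estimate stays put; with two out of five (40%), it is dragged toward the outliers.

```r
# One-sample Hodges-Lehmann estimate: median of the Walsh averages.
hl_one_sample <- function(x) {
  walsh <- outer(x, x, "+") / 2
  median(walsh[lower.tri(walsh, diag = TRUE)])
}

clean        <- c(1, 2, 3, 4, 5)
one_outlier  <- c(1, 2, 3, 4, 1000)     # 20% of the sample corrupted
two_outliers <- c(1, 2, 3, 1000, 1000)  # 40% of the sample corrupted

hl_one_sample(clean)         # 3
hl_one_sample(one_outlier)   # 3
hl_one_sample(two_outliers)  # 500.5
mean(one_outlier)            # 202
```

For comparison, a single outlier is already enough to move the mean from 3 to 202.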

References

Original papers:
Estimates of Location Based on Rank Tests · 1963 · J. L. Hodges, E. L. Lehmann
On the Estimation of Relative Potency in Dilution (-Direct) Assays by Distribution-Free Methods · 1963 · Pranab Kumar Sen

Posts:
Understanding the pitfalls of preferring the median over the mean · 2023-06-20
Median vs. Hodges-Lehmann: compare efficiency under heavy-tailedness · 2023-11-14
Hodges-Lehmann Gaussian efficiency: location shift vs. shift of locations · 2023-09-12
Statistical efficiency of the Hodges-Lehmann median estimator, Part 1 · 2022-05-17
Statistical efficiency of the Hodges-Lehmann median estimator, Part 2 · 2022-05-24

Implementations

R implementation (the default one is buggy):

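# Hodges-Lehmann estimator: one-sample location (y = NULL) or two-sample shift.
hl <- function(x, y = NULL) {
  if (is.null(y)) {
    # Walsh averages (x_i + x_j) / 2; the matrix is symmetric, so the lower
    # triangle with the diagonal covers each pair i <= j exactly once.
    walsh <- outer(x, x, "+") / 2
    median(walsh[lower.tri(walsh, diag = TRUE)])
  } else {
    # Median of all n * m pairwise differences x_i - y_j.
    median(outer(x, y, "-"))
  }
}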