Median of the shifts vs. shift of the medians, Part 2: Gaussian efficiency

by Andrey Akinshin · 2022-12-27

In the previous post, we discussed the difference between the shift of the medians and the Hodges-Lehmann location shift estimator. In this post, we conduct a simple numerical simulation to evaluate the Gaussian efficiency of these two estimators.

Estimators

We consider two samples of equal size $n$: $x = \{ x_1, x_2, \ldots, x_n \}$, $y = \{ y_1, y_2, \ldots, y_n \}$. We define the shift of the medians as

$$ \newcommand{\DSM}{\Delta_{\operatorname{SM}}} \DSM = \operatorname{median}(y) - \operatorname{median}(x), $$

and the Hodges-Lehmann location shift estimator as

$$ \newcommand{\DHL}{\Delta_{\operatorname{HL}}} \DHL = \underset{1 \leq i,\, j \leq n}{\operatorname{median}} (y_j - x_i), $$

where the median is taken over all $n^2$ pairwise differences.

We also consider the classic estimator that estimates the difference of means:

$$ \newcommand{\Dbase}{\Delta_{\operatorname{0}}} \Dbase = \operatorname{mean}(y) - \operatorname{mean}(x). $$
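The three definitions above can be sketched in a few lines of Python. This is only an illustration using the standard library, not the code behind this post; the function names `difference_of_means`, `shift_of_medians`, and `hodges_lehmann_shift` are made up for this example.

```python
from statistics import fmean, median


def difference_of_means(x, y):
    """Baseline estimator: mean(y) - mean(x)."""
    return fmean(y) - fmean(x)


def shift_of_medians(x, y):
    """Shift of the medians: median(y) - median(x)."""
    return median(y) - median(x)


def hodges_lehmann_shift(x, y):
    """Hodges-Lehmann shift: median of all pairwise differences y_j - x_i."""
    return median(yj - xi for xi in x for yj in y)


# A tiny hand-checkable example (hypothetical data).
x = [0.0, 1.0, 5.0]
y = [1.0, 5.0, 6.0]

print(difference_of_means(x, y))   # 2.0
print(shift_of_medians(x, y))      # 4.0
print(hodges_lehmann_shift(x, y))  # 1.0
```

On this toy input, the three estimators give three different answers, which is a reminder that they only agree in expectation under a pure location shift. The brute-force Hodges-Lehmann version materializes all $n^2$ differences, which is fine for the sample sizes considered here.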

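The Gaussian efficiency of $\DSM$ and $\DHL$ can be defined as the ratio of the variance of the baseline $\Dbase$ to the variance of the estimator under the normal distribution: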

$$ e(\DSM) = \frac{\mathbb{V}[\Dbase]}{\mathbb{V}[\DSM]},\quad e(\DHL) = \frac{\mathbb{V}[\Dbase]}{\mathbb{V}[\DHL]}. $$

Numerical simulations

We conduct the following simulation:

  • Enumerate the sample size $n$ from $3$ to $100$.
  • For each $n$, generate $100\,000$ pairs of random samples of size $n$ from $\mathcal{N}(0, 1)$.
  • For each pair of samples, estimate the shift between them using $\Dbase$, $\DSM$, and $\DHL$.
  • Calculate the Gaussian efficiency of $\DSM$ and $\DHL$ using the above equations.
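The steps above can be sketched as follows. This is a minimal standard-library Python version written for illustration, not the original simulation code: the function name `simulate_efficiency`, the seed, and the default of 2,000 repetitions (instead of $100\,000$) are assumptions chosen to keep the run time short, so the resulting estimates are noticeably noisier.

```python
import random
from statistics import fmean, median, variance


def simulate_efficiency(n, repetitions=2000, seed=42):
    """Estimate the Gaussian efficiency of the shift of the medians and
    the Hodges-Lehmann shift estimator for samples of size n."""
    rng = random.Random(seed)
    d0, dsm, dhl = [], [], []
    for _ in range(repetitions):
        # A pair of independent samples from N(0, 1); the true shift is 0.
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [rng.gauss(0.0, 1.0) for _ in range(n)]
        d0.append(fmean(y) - fmean(x))                        # baseline
        dsm.append(median(y) - median(x))                     # shift of the medians
        dhl.append(median(yj - xi for xi in x for yj in y))   # Hodges-Lehmann
    v0 = variance(d0)
    # Efficiency = variance of the baseline / variance of the estimator.
    return v0 / variance(dsm), v0 / variance(dhl)


for n in (5, 10, 20):
    e_sm, e_hl = simulate_efficiency(n)
    print(f"n={n:3d}  e(SM)={e_sm:.3f}  e(HL)={e_hl:.3f}")
```

All three estimators are evaluated on the same pairs of samples, so the variance ratios are less noisy than they would be with independent draws per estimator. To approach the setup described above, loop `n` over `range(3, 101)` and raise `repetitions`; at that scale a vectorized implementation would be a more practical choice.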

Here are the results:

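As we can see, the Hodges-Lehmann location shift estimator is much more efficient than the shift of the medians. For reference, the asymptotic Gaussian efficiency is $2/\pi \approx 63.7\%$ for the shift of the medians and $3/\pi \approx 95.5\%$ for the Hodges-Lehmann estimator.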

References

  • [Hodges1963]
    Hodges, J. L., and E. L. Lehmann. 1963. Estimates of location based on rank tests. The Annals of Mathematical Statistics 34 (2):598–611.
    DOI:10.1214/aoms/1177704172