## Middle non-zero quantile absolute deviation, Part 2

In one of the previous posts, I described the idea of the middle non-zero quantile absolute deviation. It’s defined as follows:

$\operatorname{MNZQAD}(x, p) = \operatorname{QAD}(x, p, q_m),$

$q_m = \frac{q_0 + 1}{2}, \quad q_0 = \frac{\max(k - 1, 0)}{n - 1}, \quad k = \sum_{i=1}^n \mathbf{1}_{Q(x, p)}(x_i),$

where $$\mathbf{1}$$ is the indicator function

$\mathbf{1}_U(u) = \begin{cases} 1 & \textrm{if}\quad u = U,\\ 0 & \textrm{if}\quad u \neq U, \end{cases}$

and $$\operatorname{QAD}$$ is the quantile absolute deviation

$\operatorname{QAD}(x, p, q) = Q(|x - Q(x, p)|, q).$

The $$\operatorname{MNZQAD}$$ approach tries to work around a problem with tied values. While it works well in the generic case, there are some corner cases where the suggested metric behaves poorly. In this post, we discuss this problem and how to solve it.
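The definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the quantile estimator $$Q$$ is taken to be NumPy's default (linear interpolation), ties are detected by exact floating-point equality, and the function names `qad` and `mnzqad` are hypothetical.

```python
import numpy as np

def qad(x, p, q):
    """Quantile absolute deviation: Q(|x - Q(x, p)|, q)."""
    x = np.asarray(x, dtype=float)
    center = np.quantile(x, p)  # assumption: NumPy's default linear quantile
    return float(np.quantile(np.abs(x - center), q))

def mnzqad(x, p=0.5):
    """Middle non-zero quantile absolute deviation (requires n >= 2)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    center = np.quantile(x, p)
    k = int(np.sum(x == center))   # number of elements tied with Q(x, p)
    q0 = max(k - 1, 0) / (n - 1)   # last order of the zero deviations
    qm = (q0 + 1) / 2              # midpoint between q0 and 1
    return qad(x, p, qm)

x = [0, 0, 0, 0, 0, 1, 2]
print(qad(x, 0.5, 0.5))  # the median absolute deviation collapses to zero
print(mnzqad(x, 0.5))    # the middle non-zero deviation does not
```

For a sample without ties at $$Q(x, p)$$, we get $$k \leq 1$$ and $$q_m = 0.5$$, so the sketch reduces to the plain (unscaled) median absolute deviation around $$Q(x, p)$$.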

## The expected number of takes from a discrete distribution before observing the given element

Let’s consider a discrete distribution $$X$$ defined by its probability mass function $$p_X(x)$$. We randomly take elements from $$X$$ until we observe the given element $$x_0$$. What’s the expected number of takes in this process?

This classic statistical problem can be solved in various ways. I would like to share one of my favorite approaches, which involves the derivative of the series $$\sum_{n=0}^\infty x^n$$.
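A brief sketch of how such a derivation may go, assuming independent takes and $$p = p_X(x_0) > 0$$. The symbols $$p$$, $$q = 1 - p$$, and $$N$$ (the number of takes) are introduced here only for illustration, and $$q$$ plays the role of the series variable. Since $$\mathbb{P}(N = n) = q^{n-1} p$$, we have

$\mathbb{E}[N] = \sum_{n=1}^\infty n q^{n-1} p = p \cdot \frac{d}{dq} \sum_{n=0}^\infty q^n = p \cdot \frac{d}{dq} \frac{1}{1 - q} = \frac{p}{(1 - q)^2} = \frac{1}{p}.$

For example, if $$p_X(x_0) = 0.25$$, we need four takes on average.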

## Folded medians

In the previous post, we discussed Gastwirth’s location estimator. In this post, we continue playing with different location estimators. To be more specific, we consider an approach called folded medians. Let $$x = \{ x_1, x_2, \ldots, x_n \}$$ be a random sample with order statistics $$\{ x_{(1)}, x_{(2)}, \ldots, x_{(n)} \}$$. We build a folded sample using the following form:

$\Bigg\{ \frac{x_{(1)}+x_{(n)}}{2}, \frac{x_{(2)}+x_{(n-1)}}{2}, \ldots \Bigg\}.$

If $$n$$ is odd, the middle sample element is folded with itself. The folding operation could be applied several times. Once folding is conducted, the median of the final folded sample is the folded median. A single folding operation gives us the Bickel-Hodges estimator.

In this post, we briefly check how this metric behaves in the case of the Normal and Cauchy distributions.
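The folding procedure can be sketched as follows. This is a minimal illustration rather than a reference implementation: the names `fold` and `folded_median` are hypothetical, and the sketch assumes that the folded sample is re-sorted before each subsequent folding.

```python
import numpy as np

def fold(x):
    """One folding step: average the i-th smallest and i-th largest elements."""
    s = np.sort(np.asarray(x, dtype=float))
    half = (len(s) + 1) // 2  # for odd n, the middle element is folded with itself
    return (s[:half] + s[::-1][:half]) / 2

def folded_median(x, folds=1):
    """Median of the sample after `folds` folding steps."""
    y = np.asarray(x, dtype=float)
    for _ in range(folds):
        y = fold(y)  # assumption: each step re-sorts the current folded sample
    return float(np.median(y))

x = [1, 2, 3, 4, 10]
print(fold(x))                    # pairs: (1, 10), (2, 4), (3, 3)
print(folded_median(x, folds=1))  # single folding, as in the Bickel-Hodges estimator
print(folded_median(x, folds=2))
```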

## Gastwirth's location estimator

Let $$x = \{ x_1, x_2, \ldots, x_n \}$$ be a random sample. Gastwirth’s location estimator is defined as follows:

$0.3 \cdot Q_{1/3}(x) + 0.4 \cdot Q_{1/2}(x) + 0.3 \cdot Q_{2/3}(x),$

where $$Q_p$$ is an estimation of the $$p^{\textrm{th}}$$ quantile (using classic sample quantiles).

This estimator could be quite interesting from a practical point of view. On the one hand, it’s robust (its breakdown point is ⅓) and it has better statistical efficiency than the classic sample median. On the other hand, it has better computational efficiency than other robust and statistically efficient measures of location like the Harrell-Davis median estimator or the Hodges-Lehmann median estimator.

In this post, we conduct a short simulation study that shows its behavior for the standard Normal distribution and the Cauchy distribution.
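The formula above translates almost directly into code. The sketch below is an illustration under one assumption: the "classic sample quantiles" are approximated by NumPy's default quantile method (linear interpolation), which may differ from the exact estimator used in the study. The function name `gastwirth` is hypothetical.

```python
import numpy as np

def gastwirth(x):
    """Weighted sum of the 1/3, 1/2, and 2/3 sample quantiles."""
    x = np.asarray(x, dtype=float)
    # Assumption: NumPy's default (linear interpolation) quantile estimator.
    q1, q2, q3 = np.quantile(x, [1 / 3, 1 / 2, 2 / 3])
    return float(0.3 * q1 + 0.4 * q2 + 0.3 * q3)

print(gastwirth([1, 2, 3, 4, 5, 6, 7]))
print(gastwirth([1, 2, 3, 4, 5, 6, 1000]))  # the extreme value does not affect the result
```

Only three quantiles are needed, so the estimator can be computed without forming pairwise combinations of sample elements.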

## Dynamical System Case Study 1 (symmetric 3d system)

Let’s consider the following dynamical system:

$\begin{cases} \dot{x}_1 = f(x_3) - x_1,\\ \dot{x}_2 = f(x_1) - x_2,\\ \dot{x}_3 = f(x_2) - x_3, \end{cases}$

where $$f(x) = \alpha / (1+x^m)$$ is a Hill function. In this case study, we explore the phase portrait of this system for $$\alpha = 18,\; m = 3$$.
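A trajectory of this system can be obtained numerically. The sketch below uses a hand-written fixed-step fourth-order Runge-Kutta integrator; the step size, the number of steps, the initial point, and the function names are all arbitrary choices made for illustration. Note that $$f(2) = 18 / (1 + 2^3) = 2$$, so the point $$(2, 2, 2)$$ is an equilibrium for these parameters.

```python
import numpy as np

ALPHA, M = 18.0, 3

def hill(x):
    """Hill function f(x) = alpha / (1 + x^m)."""
    return ALPHA / (1.0 + x ** M)

def rhs(x):
    """Right-hand side: np.roll(x, 1) maps (x1, x2, x3) to (x3, x1, x2)."""
    return hill(np.roll(x, 1)) - x

def simulate(x0, dt=0.01, steps=5000):
    """Integrate the system with the classic RK4 scheme (illustrative settings)."""
    x = np.asarray(x0, dtype=float)
    traj = np.empty((steps + 1, 3))
    traj[0] = x
    for i in range(steps):
        k1 = rhs(x)
        k2 = rhs(x + 0.5 * dt * k1)
        k3 = rhs(x + 0.5 * dt * k2)
        k4 = rhs(x + dt * k3)
        x = x + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        traj[i + 1] = x
    return traj

traj = simulate([1.0, 2.0, 3.0])
print(traj[-1])                         # state at t = 50
print(rhs(np.array([2.0, 2.0, 2.0])))   # zero vector at the equilibrium
```

Plotting the columns of `traj` against each other gives projections of the phase portrait.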

## Beeping Busy Beavers and twin prime conjecture

In this post, I use Beeping Busy Beavers to show that the twin prime conjecture could be proven or disproven.

## Hodges-Lehmann-Sen shift and shift confidence interval estimators

In the previous two posts (1, 2), I discussed the Hodges-Lehmann median estimator. The suggested idea of getting median estimations based on a Cartesian product could be adopted to estimate the shift between two samples. In this post, we discuss how to build the Hodges-Lehmann-Sen shift estimator and how to get confidence intervals for the obtained estimations. Also, we perform a simulation study that checks the actual coverage percentage of these intervals.
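The point estimator can be sketched as the median of all pairwise differences between the two samples. The sign convention ($$y_j - x_i$$) and the function name `hl_shift` are assumptions made for this illustration; the confidence interval construction is not covered here.

```python
import numpy as np

def hl_shift(x, y):
    """Median of all pairwise differences y_j - x_i (Cartesian product of samples)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    diffs = np.subtract.outer(y, x).ravel()  # len(x) * len(y) differences
    return float(np.median(diffs))

x = [1, 2, 3]
y = [11, 12, 13]
print(hl_shift(x, y))  # y is x shifted by 10
```

The sketch materializes all $$n \cdot m$$ differences, which is fine for small samples but memory-hungry for large ones.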

## Statistical efficiency of the Hodges-Lehmann median estimator, Part 2

In the previous post, we evaluated the relative statistical efficiency of the Hodges-Lehmann median estimator against the sample median under the normal distribution. In this post, we extend this experiment to a set of various light-tailed and heavy-tailed distributions.

## Statistical efficiency of the Hodges-Lehmann median estimator, Part 1

In this post, we evaluate the relative statistical efficiency of the Hodges-Lehmann median estimator against the sample median under the normal distribution. We also compare it with the efficiency of the Harrell-Davis quantile estimator.
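An experiment of this kind can be sketched as follows. Several details are assumptions made for illustration only: the Hodges-Lehmann estimator is taken as the median of pairwise averages over pairs with $$i \leq j$$ (other variants use $$i < j$$ or all pairs), relative efficiency is measured as a ratio of Monte Carlo variances, and the sample size, repetition count, and function names are arbitrary.

```python
import numpy as np

def hodges_lehmann(x):
    """Median of pairwise averages (x_i + x_j) / 2 over pairs with i <= j."""
    x = np.asarray(x, dtype=float)
    i, j = np.triu_indices(len(x))  # index pairs with i <= j (an assumed variant)
    return float(np.median((x[i] + x[j]) / 2))

def relative_efficiency(n=20, reps=2000, seed=42):
    """Var(sample median) / Var(Hodges-Lehmann) on standard normal samples."""
    rng = np.random.default_rng(seed)
    samples = rng.standard_normal((reps, n))
    medians = np.median(samples, axis=1)
    hl = np.array([hodges_lehmann(s) for s in samples])
    return float(np.var(medians) / np.var(hl))

print(hodges_lehmann([1, 2, 3, 4, 100]))
print(relative_efficiency())  # values above 1 favor the Hodges-Lehmann estimator
```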

Let $$X_1, X_2$$ be i.i.d. random variables that follow the standard normal distribution $$\mathcal{N}(0,1^2)$$. In the previous post, I found the expected value of $$\min(|X_1|, |X_2|)$$. Now it’s time to find the expected value of $$Z = \max(|X_1|, |X_2|)$$.
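Whatever closed-form value the derivation yields, it can be sanity-checked with a quick Monte Carlo simulation. The sketch below is only such a check, with an arbitrary sample count and seed and a hypothetical function name.

```python
import numpy as np

def mc_expected_max_abs(n_samples=200_000, seed=0):
    """Monte Carlo estimate of E[max(|X1|, |X2|)] for i.i.d. standard normal X1, X2."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_samples, 2))  # each row is one (X1, X2) pair
    z = np.max(np.abs(x), axis=1)            # Z = max(|X1|, |X2|) per pair
    return float(z.mean())

print(mc_expected_max_abs())
```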