Untied quantile absolute deviation


In the previous posts, I tried to adapt the concept of the quantile absolute deviation to samples with tied values so that this measure of dispersion never becomes zero for nondegenerate ranges. My previous attempt was the middle non-zero quantile absolute deviation (modification 1, modification 2). However, I’m not completely satisfied with the behavior of this metric. In this post, I want to consider another way to work around the problem with tied values.


Read more


Middle non-zero quantile absolute deviation, Part 2


In one of the previous posts, I described the idea of the middle non-zero quantile absolute deviation. It’s defined as follows:

\[\operatorname{MNZQAD}(x, p) = \operatorname{QAD}(x, p, q_m), \]

\[q_m = \frac{q_0 + 1}{2}, \quad q_0 = \frac{\max(k - 1, 0)}{n - 1}, \quad k = \sum_{i=1}^n \mathbf{1}_{Q(x, p)}(x_i), \]

where \(\mathbf{1}\) is the indicator function

\[\mathbf{1}_U(u) = \begin{cases} 1 & \textrm{if}\quad u = U,\\ 0 & \textrm{if}\quad u \neq U, \end{cases} \]

and \(\operatorname{QAD}\) is the quantile absolute deviation

\[\operatorname{QAD}(x, p, q) = Q(|x - Q(x, p)|, q). \]

The \(\operatorname{MNZQAD}\) approach tries to work around a problem with tied values. While it works well in the generic case, there are some corner cases where the suggested metric behaves poorly. In this post, we discuss this problem and how to solve it.


Read more


The expected number of takes from a discrete distribution before observing the given element

DateTags

Let’s consider a discrete distribution \(X\) defined by its probability mass function \(p_X(x)\). We randomly take elements from \(X\) until we observe the given element \(x_0\). What’s the expected number of takes in this process?

This classic statistical problem could be solved in various ways. I would like to share one of my favorite approaches that involves the derivative of the series \(\sum_{n=0}^\infty x^n\).


Read more


Folded medians

DateTags

In the previous post, we discussed the Gastwirth’s location estimator. In this post, we continue playing with different location estimators. To be more specific, we consider an approach called folded medians. Let \(x = \{ x_1, x_2, \ldots, x_n \}\) be a random sample with order statistics \(\{ x_{(1)}, x_{(2)}, \ldots, x_{(n)} \}\). We build a folded sample using the following form:

\[\Bigg\{ \frac{x_{(1)}+x_{(n)}}{2}, \frac{x_{(2)}+x_{(n-1)}}{2}, \ldots, \Bigg\}. \]

If \(n\) is odd, the middle sample element is folded with itself. The folding operation could be applied several times. Once folding is conducted, the median of the final folded sample is the folded median. A single folding operation gives us the Bickel-Hodges estimator.

In this post, we briefly check how this metric behaves in the case of the Normal and Cauchy distributions.


Read more


Gastwirth's location estimator

DateTags

Let \(x = \{ x_1, x_2, \ldots, x_n \}\) be a random sample. The Gastwirth’s location estimator is defined as follows:

\[0.3 \cdot Q_{⅓}(x) + 0.4 \cdot Q_{½}(x) + 0.3 \cdot Q_{⅔}(x), \]

where \(Q_p\) is an estimation of the \(p^{\textrm{th}}\) quantile (using classic sample quantiles).

This estimator could be quite interesting from a practical point of view. On the one hand, it’s robust (the breakdown point ⅓) and it has better statistical efficiency than the classic sample median. On the other hand, it has better computational efficiency than other robust and statistical efficient measures of location like the Harrell-Davis median estimator or the Hodges-Lehmann median estimator.

In this post, we conduct a short simulation study that shows its behavior for the standard Normal distribution and the Cauchy distribution.


Read more


Dynamical System Case Study 1 (symmetric 3d system)


Let’s consider the following dynamical system:

\[\begin{cases} \dot{x}_1 = f(x_3) - x_1,\\ \dot{x}_2 = f(x_1) - x_2,\\ \dot{x}_3 = f(x_2) - x_3, \end{cases} \]

where \(f(x) = \alpha / (1+x^m)\) is a Hill function. In this case study, we explore the phase portrait of this system for \(\alpha = 18,\; m = 3\).


Read more


Beeping Busy Beavers and twin prime conjecture

DateTags

In this post, I use Beeping Busy Beavers to show that twin prime conjecture could be proven or disproven.


Read more


Hodges-Lehmann-Sen shift and shift confidence interval estimators


In the previous two posts (1, 2), I discussed the Hodges-Lehmann median estimator. The suggested idea of getting median estimations based on a cartesian product could be adopted to estimate the shift between two samples. In this post, we discuss how to build Hodges-Lehmann-Sen shift estimator and how to get confidence intervals for the obtained estimations. Also, we perform a simulation study that checks the actual coverage percentage of these intervals.


Read more


Statistical efficiency of the Hodges-Lehmann median estimator, Part 2


In the previous post, we evaluated the relative statistical efficiency of the Hodges-Lehmann median estimator against the sample median under the normal distribution. In this post, we extended this experiment to a set of various light-tailed and heavy-tailed distributions.


Read more


Statistical efficiency of the Hodges-Lehmann median estimator, Part 1


In this post, we evaluate the relative statistical efficiency of the Hodges-Lehmann median estimator against the sample median under the normal distribution. We also compare it with the efficiency of the Harrell-Davis quantile estimator.


Read more