Untied quantile absolute deviation
In the previous posts, I tried to adapt the concept of the quantile absolute deviation to samples with tied values so that this measure of dispersion never becomes zero for nondegenerate ranges. My previous attempt was the middle non-zero quantile absolute deviation (modification 1, modification 2). However, I’m not completely satisfied with the behavior of this metric. In this post, I want to consider another way to work around the problem with tied values.
Read more
Middle non-zero quantile absolute deviation, Part 2
In one of the previous posts, I described the idea of the middle non-zero quantile absolute deviation. It’s defined as follows:
\[\operatorname{MNZQAD}(x, p) = \operatorname{QAD}(x, p, q_m), \]
\[q_m = \frac{q_0 + 1}{2}, \quad q_0 = \frac{\max(k - 1, 0)}{n - 1}, \quad k = \sum_{i=1}^n \mathbf{1}_{Q(x, p)}(x_i), \]
where \(\mathbf{1}\) is the indicator function
\[\mathbf{1}_U(u) = \begin{cases} 1 & \textrm{if}\quad u = U,\\ 0 & \textrm{if}\quad u \neq U, \end{cases} \]
and \(\operatorname{QAD}\) is the quantile absolute deviation
\[\operatorname{QAD}(x, p, q) = Q(|x - Q(x, p)|, q). \]
The \(\operatorname{MNZQAD}\) approach tries to work around a problem with tied values. While it works well in the generic case, there are some corner cases where the suggested metric behaves poorly. In this post, we discuss this problem and how to solve it.
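The definitions above can be sketched in a few lines of Python. This is a rough illustration that assumes NumPy's default (Type 7) sample quantiles; the actual implementation in these posts may use a different quantile estimator, such as Harrell-Davis.

```python
import numpy as np

def qad(x, p, q):
    # QAD(x, p, q) = Q(|x - Q(x, p)|, q)
    x = np.asarray(x, dtype=float)
    return np.quantile(np.abs(x - np.quantile(x, p)), q)

def mnzqad(x, p):
    # Middle non-zero QAD, following the definition above.
    x = np.asarray(x, dtype=float)
    n = len(x)
    k = np.sum(x == np.quantile(x, p))  # number of ties at Q(x, p)
    q0 = max(k - 1, 0) / (n - 1)
    qm = (q0 + 1) / 2
    return qad(x, p, qm)
```

On a sample like `[0, 0, 0, 0, 1, 2, 3]` with `p = 0.5`, the plain `qad(x, 0.5, 0.5)` collapses to zero because of the ties at the median, while `mnzqad` shifts the outer quantile past the tied block and stays positive.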
Read more
The expected number of takes from a discrete distribution before observing the given element
Let’s consider a discrete distribution \(X\) defined by its probability mass function \(p_X(x)\). We randomly take elements from \(X\) until we observe the given element \(x_0\). What’s the expected number of takes in this process?
This classic statistical problem could be solved in various ways. I would like to share one of my favorite approaches that involves the derivative of the series \(\sum_{n=0}^\infty x^n\).
Read more
Folded medians
In the previous post, we discussed Gastwirth’s location estimator. In this post, we continue playing with different location estimators. To be more specific, we consider an approach called folded medians. Let \(x = \{ x_1, x_2, \ldots, x_n \}\) be a random sample with order statistics \(\{ x_{(1)}, x_{(2)}, \ldots, x_{(n)} \}\). We build a folded sample as follows:
\[\Bigg\{ \frac{x_{(1)}+x_{(n)}}{2}, \frac{x_{(2)}+x_{(n-1)}}{2}, \ldots, \Bigg\}. \]
If \(n\) is odd, the middle sample element is folded with itself. The folding operation could be applied several times. Once folding is conducted, the median of the final folded sample is the folded median. A single folding operation gives us the Bickel-Hodges estimator.
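A minimal NumPy sketch of this scheme (a single fold gives the Bickel-Hodges estimate):

```python
import numpy as np

def fold(x):
    # Average the k-th smallest with the k-th largest order statistic;
    # for odd n, the middle element is averaged with itself.
    s = np.sort(np.asarray(x, dtype=float))
    m = (len(s) + 1) // 2
    return (s[:m] + s[::-1][:m]) / 2

def folded_median(x, folds=1):
    # folds=1 yields the Bickel-Hodges estimator.
    for _ in range(folds):
        x = fold(x)
    return float(np.median(x))
```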
In this post, we briefly check how this metric behaves in the case of the Normal and Cauchy distributions.
Read more
Gastwirth's location estimator
Let \(x = \{ x_1, x_2, \ldots, x_n \}\) be a random sample. Gastwirth’s location estimator is defined as follows:
\[0.3 \cdot Q_{1/3}(x) + 0.4 \cdot Q_{1/2}(x) + 0.3 \cdot Q_{2/3}(x), \]
where \(Q_p\) is an estimate of the \(p^{\textrm{th}}\) quantile (using classic sample quantiles).
This estimator could be quite interesting from a practical point of view. On the one hand, it’s robust (its breakdown point is ⅓), and it has better statistical efficiency than the classic sample median. On the other hand, it has better computational efficiency than other robust and statistically efficient measures of location like the Harrell-Davis median estimator or the Hodges-Lehmann median estimator.
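A direct NumPy translation of the definition (using NumPy's default classic sample quantiles):

```python
import numpy as np

def gastwirth(x):
    # 0.3 * Q(1/3) + 0.4 * Q(1/2) + 0.3 * Q(2/3)
    q13, q12, q23 = np.quantile(np.asarray(x, dtype=float), [1/3, 1/2, 2/3])
    return 0.3 * q13 + 0.4 * q12 + 0.3 * q23
```

Note that, unlike the Harrell-Davis or Hodges-Lehmann estimators, this requires only three quantile lookups on a sorted sample, which is where the computational advantage comes from.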
In this post, we conduct a short simulation study that shows its behavior for the standard Normal distribution and the Cauchy distribution.
Read more
Dynamical System Case Study 1 (symmetric 3d system)
Let’s consider the following dynamical system:
\[\begin{cases} \dot{x}_1 = f(x_3) - x_1,\\ \dot{x}_2 = f(x_1) - x_2,\\ \dot{x}_3 = f(x_2) - x_3, \end{cases} \]
where \(f(x) = \alpha / (1+x^m)\) is a Hill function. In this case study, we explore the phase portrait of this system for \(\alpha = 18,\; m = 3\).
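Trajectories of this system can be explored numerically; below is a forward-Euler sketch (an actual phase-portrait study would use an adaptive ODE solver, and the step size and step count here are illustrative):

```python
import numpy as np

def simulate(x0, alpha=18.0, m=3, dt=1e-3, steps=10_000):
    # Forward-Euler integration of the cyclic 3d system with Hill-function coupling.
    f = lambda u: alpha / (1 + u**m)
    x = np.array(x0, dtype=float)
    traj = [x.copy()]
    for _ in range(steps):
        dx = np.array([f(x[2]) - x[0], f(x[0]) - x[1], f(x[1]) - x[2]])
        x = x + dt * dx
        traj.append(x.copy())
    return np.array(traj)
```

The symmetric equilibrium solves \(f(x^*) = x^*\); for \(\alpha = 18,\; m = 3\) this gives \(x^* = 2\), since \(2 \cdot (1 + 2^3) = 18\).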
Read more
Beeping Busy Beavers and twin prime conjecture
In this post, I use Beeping Busy Beavers to show that the twin prime conjecture could, in principle, be proven or disproven.
Read more
Hodges-Lehmann-Sen shift and shift confidence interval estimators
In the previous two posts (1, 2), I discussed the Hodges-Lehmann median estimator. The suggested idea of getting median estimations based on a Cartesian product can be adapted to estimate the shift between two samples. In this post, we discuss how to build the Hodges-Lehmann-Sen shift estimator and how to get confidence intervals for the obtained estimations. Also, we perform a simulation study that checks the actual coverage percentage of these intervals.
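The point estimate itself is the median of all pairwise differences between the two samples; a minimal NumPy sketch (confidence intervals omitted):

```python
import numpy as np

def hls_shift(x, y):
    # Median of the Cartesian product of differences x_i - y_j.
    diffs = np.subtract.outer(np.asarray(x, dtype=float),
                              np.asarray(y, dtype=float))
    return float(np.median(diffs))
```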
Read more
Statistical efficiency of the Hodges-Lehmann median estimator, Part 2
In the previous post, we evaluated the relative statistical efficiency of the Hodges-Lehmann median estimator against the sample median under the normal distribution. In this post, we extend this experiment to a set of various light-tailed and heavy-tailed distributions.
Read more
Statistical efficiency of the Hodges-Lehmann median estimator, Part 1
In this post, we evaluate the relative statistical efficiency of the Hodges-Lehmann median estimator against the sample median under the normal distribution. We also compare it with the efficiency of the Harrell-Davis quantile estimator.
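This kind of simulation can be sketched as follows: estimate the mean squared error of each estimator over many standard normal samples and take the ratio (the sample size and repetition count below are illustrative, not the ones used in the post):

```python
import numpy as np

def hodges_lehmann(x):
    # Median of the Walsh averages (x_i + x_j) / 2 over all pairs i <= j.
    x = np.asarray(x, dtype=float)
    i, j = np.triu_indices(len(x))
    return float(np.median((x[i] + x[j]) / 2))

rng = np.random.default_rng(1729)
samples = rng.standard_normal((10_000, 30))  # true location is 0
mse_hl = np.mean([hodges_lehmann(s) ** 2 for s in samples])
mse_med = np.mean(np.median(samples, axis=1) ** 2)
rel_eff = mse_med / mse_hl  # > 1 means HL beats the sample median
```

Under normality, this ratio should come out noticeably above 1, consistent with the known asymptotic advantage of the Hodges-Lehmann estimator over the sample median.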
Read more