Quantile estimators based on k order statistics, Part 8: Winsorized Harrell-Davis quantile estimator
In the previous post, we discussed
the trimmed modification of the Harrell-Davis quantile estimator
based on the highest density interval of size $1/\sqrt{n}$.
All posts from this series:
- Quantile estimators based on k order statistics, Part 1: Motivation (2021-08-03)
- Quantile estimators based on k order statistics, Part 2: Extending Hyndman-Fan equations (2021-08-10)
- Quantile estimators based on k order statistics, Part 3: Playing with the Beta function (2021-08-17)
- Quantile estimators based on k order statistics, Part 4: Adopting trimmed Harrell-Davis quantile estimator (2021-08-24)
- Quantile estimators based on k order statistics, Part 5: Improving trimmed Harrell-Davis quantile estimator (2021-08-31)
- Quantile estimators based on k order statistics, Part 6: Continuous trimmed Harrell-Davis quantile estimator (2021-09-07)
- Quantile estimators based on k order statistics, Part 7: Optimal threshold for the trimmed Harrell-Davis quantile estimator (2021-09-14)
- Quantile estimators based on k order statistics, Part 8: Winsorized Harrell-Davis quantile estimator (2021-09-21)
The approach
The general idea is the same as the one used in one of the previous posts.
We express the estimation of the
where
In the case of the winsorized Harrell-Davis quantile estimator,
we use a part of the Beta distribution inside the
In the previous post, we discussed the idea of choosing
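To make the approach more tangible, here is a minimal Python sketch of a winsorized Harrell-Davis estimator. It is not the reference implementation from this series, and it rests on several assumptions: the weights are taken as $W_i = F(i/n) - F((i-1)/n)$; $F$ is the CDF of the $\textrm{Beta}(p(n+1), (1-p)(n+1))$ distribution used by the classic Harrell-Davis estimator; the window is the highest density interval of width $1/\sqrt{n}$; and winsorization is modeled by clamping $F$ to 0 below the window and to 1 above it, so that the tail mass moves to the boundary order statistics. All function names are hypothetical, and the sketch expects $n \geq 2$.

```python
import math


def _beta_cf(a, b, x, max_iter=500, eps=1e-14, tiny=1e-300):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        # Each iteration applies the even and then the odd term of the recurrence.
        for coef in (
            m * (b - m) * x / ((a + 2 * m - 1.0) * (a + 2 * m)),
            -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1.0)),
        ):
            d = 1.0 + coef * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + coef / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def beta_cdf(x, a, b):
    """CDF of Beta(a, b), i.e. the regularized incomplete beta function."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log1p(-x))
    # Pick the branch where the continued fraction converges quickly.
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def beta_hdi(a, b, width):
    """Highest density interval of the given width for Beta(a, b), a + b >= 3."""
    if a <= 1.0:  # density is non-increasing: the interval starts at 0
        return 0.0, width
    if b <= 1.0:  # density is non-decreasing: the interval ends at 1
        return 1.0 - width, 1.0
    mode = (a - 1.0) / (a + b - 2.0)
    lo, hi = max(0.0, mode - width), min(mode, 1.0 - width)

    def log_density_gap(left):
        # log pdf(left) - log pdf(left + width); the normalizing constant cancels.
        right = left + width
        return ((a - 1.0) * (math.log(left) - math.log(right))
                + (b - 1.0) * (math.log1p(-left) - math.log1p(-right)))

    # Bisection: the gap increases in `left` and is zero at the HDI boundary.
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if log_density_gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    left = 0.5 * (lo + hi)
    return left, left + width


def winsorized_hd_weights(n, p):
    """Weights of the order statistics under the assumptions listed above."""
    a, b = p * (n + 1), (1.0 - p) * (n + 1)
    left, right = beta_hdi(a, b, 1.0 / math.sqrt(n))

    def clamped_cdf(u):
        if u < left:
            return 0.0
        if u > right:
            return 1.0
        return beta_cdf(u, a, b)

    return [clamped_cdf(i / n) - clamped_cdf((i - 1) / n) for i in range(1, n + 1)]


def winsorized_hd(sample, p):
    """Weighted sum of the sorted sample using the winsorized weights."""
    x = sorted(sample)
    weights = winsorized_hd_weights(len(x), p)
    return sum(w * v for w, v in zip(weights, x))


sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print([round(w, 3) for w in winsorized_hd_weights(len(sample), 0.5)])
print(winsorized_hd(sample, 0.5))
```

For $n = 10$ and $p = 0.5$, only the four central order statistics receive non-zero weights in this sketch: the window covers roughly $[0.34, 0.66]$, and the mass outside it is added to the fourth and seventh order statistics.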
Numerical simulations
The relative efficiency value depends on five parameters:
- Target quantile estimator
- Baseline quantile estimator
- Estimated quantile
- Sample size
- Distribution
As target quantile estimators, we use:
- HD: Classic Harrell-Davis quantile estimator
- THD-SQRT: The trimmed modification of the Harrell-Davis quantile estimator described in the previous post, based on the highest density interval of size $1/\sqrt{n}$
- WHD-SQRT: The winsorized modification of the Harrell-Davis quantile estimator described above, based on the highest density interval of size $1/\sqrt{n}$
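The conventional baseline quantile estimator in such simulations is the traditional quantile estimator that is defined as a linear combination of two consecutive order statistics. To be more specific, we are going to use the Type 7 quantile estimator from the Hyndman-Fan classification or HF7. It can be expressed as follows (assuming one-based indexing):

$$
Q_{\textrm{HF7}}(p) = x_{(\lfloor h \rfloor)} + (h - \lfloor h \rfloor) \left( x_{(\lfloor h \rfloor + 1)} - x_{(\lfloor h \rfloor)} \right), \quad h = (n - 1) p + 1,
$$

where $x_{(i)}$ is the $i^\textrm{th}$ order statistic of a sample of size $n$.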
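The Type 7 estimator is simple enough to sketch in a few lines of Python. The function name `hf7` is ours and purely illustrative; the computation follows the standard Hyndman-Fan Type 7 definition, which is also the default in R's `quantile` and in NumPy's `quantile` (the `linear` method).

```python
import math


def hf7(sample, p):
    """Hyndman-Fan Type 7 quantile estimate for p in [0, 1]."""
    x = sorted(sample)
    n = len(x)
    h = (n - 1) * p + 1  # one-based fractional position in the sorted sample
    k = math.floor(h)
    if k >= n:  # p == 1: there is no next order statistic to interpolate with
        return x[-1]
    # Linear interpolation between the k-th and (k+1)-th order statistics;
    # the list itself is zero-based, hence the index shift.
    return x[k - 1] + (h - k) * (x[k] - x[k - 1])


print(hf7([1, 2, 3, 4], 0.5))  # 2.5
```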
Thus, we are going to estimate the relative efficiency of
the trimmed and winsorized Harrell-Davis quantile estimators with different percentage values against
the traditional quantile estimator HF7.
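For the $p^\textrm{th}$ quantile, the relative efficiency can be calculated as the ratio of the estimator mean squared errors (MSE):

$$
\textrm{Efficiency}(p) = \frac{\textrm{MSE}(Q_{\textrm{HF7}}, p)}{\textrm{MSE}(Q_{\textrm{Target}}, p)} = \frac{\operatorname{E}[(Q_{\textrm{HF7}}(p) - \theta(p))^2]}{\operatorname{E}[(Q_{\textrm{Target}}(p) - \theta(p))^2]}
$$

where $\theta(p)$ is the true value of the $p^\textrm{th}$ quantile.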
We are also going to use the following distributions:
- Uniform(0,1): Continuous uniform distribution
- Tri(0,1,2): Triangular distribution
- Tri(0,0.2,2): Triangular distribution
- Beta(2,4): Beta distribution
- Beta(2,10): Beta distribution
- Normal(0,1^2): Standard normal distribution
- Weibull(1,2): Weibull distribution
- Student(3): Student distribution
- Gumbel(0,1): Gumbel distribution
- Exp(1): Exponential distribution
- Cauchy(0,1): Standard Cauchy distribution
- Pareto(1,0.5): Pareto distribution
- Pareto(1,2): Pareto distribution
- LogNormal(0,1^2): Log-normal distribution
- LogNormal(0,2^2): Log-normal distribution
- LogNormal(0,3^2): Log-normal distribution
- Weibull(1,0.5): Weibull distribution
- Weibull(1,0.3): Weibull distribution
- Frechet(0,1,1): Frechet distribution
- Frechet(0,1,3): Frechet distribution
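A simulation of this kind can be sketched as a small Monte Carlo loop. The snippet below is an illustration rather than the code used to produce the results in this post. It assumes that relative efficiency is measured as the ratio of mean squared errors (baseline over target), and all names in it (`relative_efficiency`, `mean_as_median`, the iteration count, the seed) are hypothetical. To stay self-contained, it uses the sample mean as a stand-in target for the median of the standard normal distribution; in a real run, the target would be one of the HD, THD-SQRT, or WHD-SQRT estimators.

```python
import math
import random
import statistics


def hf7(sample, p):
    """Hyndman-Fan Type 7 quantile estimate (baseline)."""
    x = sorted(sample)
    n = len(x)
    h = (n - 1) * p + 1
    k = math.floor(h)
    if k >= n:
        return x[-1]
    return x[k - 1] + (h - k) * (x[k] - x[k - 1])


def relative_efficiency(target, baseline, sampler, true_quantile, p, n,
                        iterations=2000, seed=42):
    """Monte Carlo estimate of MSE(baseline) / MSE(target) for the p-th quantile."""
    rng = random.Random(seed)
    se_target = 0.0
    se_baseline = 0.0
    for _ in range(iterations):
        sample = [sampler(rng) for _ in range(n)]
        se_target += (target(sample, p) - true_quantile) ** 2
        se_baseline += (baseline(sample, p) - true_quantile) ** 2
    # Both sums are divided by the same iteration count, so it cancels out.
    return se_baseline / se_target


def mean_as_median(sample, p):
    """Stand-in target (NOT one of the estimators from this post); ignores p."""
    return statistics.fmean(sample)


p = 0.5
theta = statistics.NormalDist(0, 1).inv_cdf(p)  # true median of Normal(0, 1)
eff = relative_efficiency(mean_as_median, hf7, lambda rng: rng.gauss(0, 1),
                          theta, p, n=20)
print(eff)  # values above 1 mean that the target is more efficient than HF7
```

Since the efficiency depends on the estimated quantile, the sample size, and the distribution, a full study would repeat this call over a grid of `p`, `n`, and samplers for each distribution from the list above.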
Simulation Results
Conclusion
One of the biggest drawbacks of the winsorized modification of the Harrell-Davis quantile estimator is
the stair-like pattern that we can observe in the images above.
Such a phenomenon can be easily explained.
If we enumerate all the quantile values from 0 to 1,
the