Ratio function vs. ratio distribution



Let’s say we have two distributions \(X\) and \(Y\). In the previous post, we discussed how to express the “absolute difference” between them using the shift function and the shift distribution. Now let’s discuss how to express the “relative difference” between them. This abstract term can also be expressed in various ways. My favorite approach is to build the ratio function: for each probability \(p\), we calculate \(Q_Y(p)/Q_X(p)\), where \(Q\) is the quantile function. However, some people prefer using the ratio distribution \(Y/X\). While both approaches may provide similar results for narrow positive non-overlapping distributions, they are not equivalent in the general case. In this post, we briefly consider examples of both approaches.

Equal standard normal distributions

Let’s start with a simple case when both \(X\) and \(Y\) are the standard normal distributions: \(X=Y=\mathcal{N}(0,1)\). Since the distributions are equal, they have equal quantile functions: \(Q_X=Q_Y\). Thus, \(Q_Y(p)/Q_X(p)\) is \(1\) for all \(p\) values except \(p=0.5\) (because \(Q_Y(0.5)=Q_X(0.5)=0\), so the ratio is undefined there). Here is the corresponding ratio function:


Such a ratio function tells us that there is no difference between \(X\) and \(Y\). If we want to build the ratio function for samples, we should just estimate the quantiles of both samples and divide them. As a robust and statistically efficient quantile estimator, we can use the trimmed Harrell-Davis quantile estimator (see [Akinshin2021]).
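To make the sample-based ratio function more concrete, here is a minimal Python sketch. It assumes NumPy and uses `np.quantile` (the default linear-interpolation estimator) as a simple stand-in for the trimmed Harrell-Davis estimator. The function name `ratio_function`, the seed, and the sample parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def ratio_function(x, y, probs):
    """Estimate Q_Y(p) / Q_X(p) for each p in probs from two samples.

    Assumption: plain np.quantile is used here instead of the
    trimmed Harrell-Davis estimator mentioned in the text.
    """
    qx = np.quantile(x, probs)
    qy = np.quantile(y, probs)
    return qy / qx


# Hypothetical positive samples: Y has the same distribution as 2 * X.
rng = np.random.default_rng(42)
x = rng.uniform(1.0, 2.0, size=10_000)
y = 2.0 * rng.uniform(1.0, 2.0, size=10_000)

probs = np.linspace(0.05, 0.95, 19)
ratios = ratio_function(x, y, probs)

for p, r in zip(probs, ratios):
    print(f"p = {p:.2f}, ratio = {r:.3f}")
```

For these samples, the estimated ratios should stay close to \(2\) for all \(p\). Note that the estimation becomes unstable near any \(p\) where \(Q_X(p)\) is close to zero.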

We can also build the ratio distribution \(Y/X\). It’s well-known that the ratio of two independent standard normal random variables follows the standard Cauchy distribution:


Having the Cauchy distribution in your experiments can cause trouble. Firstly, this distribution has heavy tails, so we should expect extremely large outliers. Secondly, the mean and the variance of the Cauchy distribution are undefined, so the central limit theorem cannot be applied.
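The Cauchy claim is easy to check numerically. The sketch below (assuming NumPy; the sample size, the seed, and the set of probabilities are arbitrary choices) draws two independent standard normal samples, divides them, and compares the empirical quantiles of the ratio with the standard Cauchy quantile function \(Q(p)=\tan(\pi(p-0.5))\).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Two independent standard normal samples and their element-wise ratio.
x = rng.standard_normal(n)
y = rng.standard_normal(n)
ratio = y / x

probs = np.array([0.1, 0.25, 0.5, 0.75, 0.9])
empirical = np.quantile(ratio, probs)
# Quantile function of the standard Cauchy distribution.
theoretical = np.tan(np.pi * (probs - 0.5))

for p, e, t in zip(probs, empirical, theoretical):
    print(f"p = {p:.2f}, empirical = {e:.3f}, Cauchy = {t:.3f}")

# Heavy tails: the most extreme ratio is typically enormous.
print("max |Y/X| =", np.max(np.abs(ratio)))
```

The central quantiles should agree closely, while the largest absolute value in the sample illustrates the heavy tails.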

Equal standard uniform distributions

Now let’s build both plots for two standard uniform distributions: \(X=Y=\mathcal{U}(0,1)\). Obviously, the ratio function is \(Q_Y(p)/Q_X(p)=1\) for all \(p\) values except \(p=0\), where both quantiles are zero:


The probability density function of the ratio of two independent standard uniform random variables is given by:

\[f(x) = \begin{cases} 1/2, & \text{if } 0 \le x \le 1 \\ 1/(2x^2), & \text{if } x > 1 \\ 0, & \text{otherwise}. \end{cases} \]

Here is the corresponding plot:


This picture describes the actual distribution of \(Y/X\), but it doesn’t provide useful insights about the actual difference between \(X\) and \(Y\).
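The piecewise density above can be sanity-checked with a short simulation. Integrating it gives the CDF \(F(t)=t/2\) for \(0 \le t \le 1\) and \(F(t)=1-1/(2t)\) for \(t>1\). The sketch below assumes NumPy; the helper name `ratio_cdf`, the seed, and the evaluation points are hypothetical choices for this illustration.

```python
import numpy as np


def ratio_cdf(t):
    """CDF of Y/X for independent standard uniform X and Y,
    obtained by integrating the piecewise density above."""
    t = np.asarray(t, dtype=float)
    lower = np.clip(t, 0.0, 1.0) / 2.0               # branch for t <= 1
    upper = 1.0 - 1.0 / (2.0 * np.maximum(t, 1.0))   # branch for t > 1
    return np.where(t <= 1.0, lower, upper)


rng = np.random.default_rng(7)
n = 200_000

# 1 - random() lies in (0, 1], which avoids division by zero.
x = 1.0 - rng.random(n)
y = rng.random(n)
ratio = y / x

points = np.array([0.25, 0.5, 1.0, 2.0, 4.0])
empirical = np.array([np.mean(ratio <= t) for t in points])
theoretical = ratio_cdf(points)

for t, e, f in zip(points, empirical, theoretical):
    print(f"t = {t:.2f}, empirical CDF = {e:.4f}, theoretical CDF = {f:.4f}")
```

In particular, half of the probability mass of \(Y/X\) lies above \(1\) and half below, even though \(X\) and \(Y\) are identically distributed.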

Conclusion

Both the ratio function and the ratio distribution may provide useful insights about the properties of the difference between \(X\) and \(Y\). However, if we want to get a clear picture that shows the actual relative difference between the distributions, the ratio function works much better.

References

  • [Akinshin2021]
    Andrey Akinshin (2021) Trimmed Harrell-Davis quantile estimator based on the highest density interval of the given width,
    arXiv:2111.11776