# Matching quantile sets using likelihood based on the binomial coefficients

Let’s say we have a distribution $$X$$ that is given by its $$s$$-quantile values:

$q_{X_1} = Q_X(p_1),\; q_{X_2} = Q_X(p_2),\; \ldots,\; q_{X_{s-1}} = Q_X(p_{s-1})$

where $$Q_X$$ is the quantile function of $$X$$, $$p_j = j / s$$.

We also have a sample $$y = \{y_1, y_2, \ldots, y_n \}$$ that is given by its $$s$$-quantile estimations:

$q_{y_1} = Q_y(p_1),\; q_{y_2} = Q_y(p_2),\; \ldots,\; q_{y_{s-1}} = Q_y(p_{s-1}),$

where $$Q_y$$ is the quantile estimation function for sample $$y$$. We also assume that $$q_{y_0} = \min(y)$$, $$q_{y_s} = \max(y)$$.

We want to know the likelihood of “$$y$$ is drawn from $$X$$”. In this post, I want to suggest a nice way to do this using the binomial coefficients.

The $$q_{X_j}$$ quantile values split all the real numbers into $$s$$ intervals:

$(-\infty; q_{X_1}],\; (q_{X_1}; q_{X_2}],\; \ldots,\; (q_{X_{s-2}}; q_{X_{s-1}}],\; (q_{X_{s-1}}; \infty)$

If we introduce $$q_{X_0} = -\infty$$, $$q_{X_s}=\infty$$, we can describe the $$j^\textrm{th}$$ interval as $$(q_{X_{j-1}}; q_{X_{j}}]$$. Each such interval contains exactly a $$1/s$$ portion of the whole distribution:

$F_X(q_{X_j}) - F_X(q_{X_{j-1}}) = 1/s,$

where $$F_X$$ is the CDF of $$X$$.
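
This equal-mass property is easy to verify numerically. A minimal sketch, using Python's `statistics.NormalDist` as a stand-in for a hypothetical $$X$$ (with $$s = 4$$, i.e. quartiles):

```python
# Check that the quantile values split the distribution into
# s intervals of equal probability mass (here X is a standard normal).
from statistics import NormalDist

X = NormalDist()
s = 4
q = [X.inv_cdf(j / s) for j in range(1, s)]  # q_{X_1}, ..., q_{X_{s-1}}

# F_X(q_{X_j}) - F_X(q_{X_{j-1}}) with q_{X_0} = -inf, q_{X_s} = +inf
bounds = [float("-inf")] + q + [float("inf")]
masses = [X.cdf(b) - X.cdf(a) for a, b in zip(bounds, bounds[1:])]
print(masses)  # each interval carries 1/s = 0.25 of the total mass
```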

If we knew the elements of sample $$y$$, we would be able to match $$y_i$$ to the corresponding intervals. Let’s consider another sample $$z = \{ z_1, z_2, \ldots, z_n \}$$ where $$z_i$$ is the index of the interval that contains $$y_i$$:

$z_i = \sum_{j=1}^{s} j \cdot \mathbf{1} \{ q_{X_{j-1}} < y_i \leq q_{X_j} \} \quad (i \in \{ 1..n\}).$

Let $$k_j$$ be the number of $$y$$ elements that belong to the $$j^\textrm{th}$$ segment:

$k_j = \sum_{i=1}^n \mathbf{1} \{ z_i = j \} \quad (j \in \{ 1..s\}).$
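
When the raw values of $$y$$ are available, computing the $$k_j$$ counts is a one-pass exercise. A minimal sketch (the helper name `interval_counts` is mine, not from any library):

```python
import bisect

def interval_counts(y, qx):
    """k_j: how many elements of y fall into the interval (q_{X_{j-1}}; q_{X_j}]."""
    # qx = [q_{X_1}, ..., q_{X_{s-1}}], sorted; implicitly q_{X_0} = -inf, q_{X_s} = +inf
    s = len(qx) + 1
    k = [0] * s
    for yi in y:
        z = bisect.bisect_left(qx, yi) + 1  # interval index z_i in {1..s}
        k[z - 1] += 1
    return k

print(interval_counts([0.1, 0.4, 0.6, 0.9], [0.25, 0.5, 0.75]))  # -> [1, 1, 1, 1]
```

Note that `bisect_left` makes the right endpoint inclusive, matching the half-open intervals $$(q_{X_{j-1}}; q_{X_j}]$$ above.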

Unfortunately, we don’t know the exact values of $$y_i$$; we know only the $$s$$-quantile estimations $$q_{y_j}$$. In this case, we can estimate the $$k_j$$ values using linear interpolation between the known quantiles:

$k_j = n \cdot \sum_{l=1}^{s} \frac{1}{s} \frac{\max(0, \min(q_{y_l}, q_{X_j}) - \max(q_{y_{l-1}}, q_{X_{j-1}}) )}{q_{y_l} - q_{y_{l-1}}} \quad (j \in \{ 1..s\}).$

(Here we assume that $$q_{y_l} \neq q_{y_{l-1}}$$; cases with equal consecutive quantiles should be handled separately.)
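
The interpolation formula above can be sketched directly in Python. The helper name `estimate_counts` is hypothetical; the inputs follow the conventions of this post ($$q_{y_0} = \min(y)$$, $$q_{y_s} = \max(y)$$, distinct $$q_{y_l}$$):

```python
def estimate_counts(n, qy, qx):
    """Estimate the interval counts k_j from two quantile sets.

    qy: [q_{y_0}, q_{y_1}, ..., q_{y_s}] -- min(y), the s-quantile
        estimations, and max(y) (s + 1 values in total)
    qx: [q_{X_1}, ..., q_{X_{s-1}}] -- the s-quantile values of X
    """
    s = len(qy) - 1
    qxb = [float("-inf")] + list(qx) + [float("inf")]  # q_{X_0}, ..., q_{X_s}
    k = []
    for j in range(1, s + 1):
        total = 0.0
        for l in range(1, s + 1):
            # overlap of (q_{y_{l-1}}; q_{y_l}] with (q_{X_{j-1}}; q_{X_j}]
            overlap = min(qy[l], qxb[j]) - max(qy[l - 1], qxb[j - 1])
            total += max(0.0, overlap) / (qy[l] - qy[l - 1]) / s
        k.append(n * total)
    return k
```

Since each term distributes a $$1/s$$ portion of the sample across the intervals it overlaps, the estimated counts always sum to $$n$$.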

Now we have transformed the original problem into a new one: what is the likelihood of observing a sample $$\{ z_i \}$$ described by the counts $$\{ k_j \}$$, given that each $$z_i$$ is a uniformly random number from $$\{ 1, 2, \ldots, s \}$$?

This is a simple combinatorial problem. The total number of different $$z$$ samples is $$s^n$$ (we have $$n$$ elements, each element is one of $$s$$ values). The number of ways to choose $$k_1$$ elements that equal $$1$$ is $$C_n^{k_1}$$. Once we remove these elements from consideration, the number of ways to choose $$k_2$$ elements that equal $$2$$ is $$C_{n-k_1}^{k_2}$$. If we continue this process, we will get the following equation for the likelihood:

$\mathcal{L} = \frac{ C_n^{k_1} \cdot C_{n-k_1}^{k_2} \cdot C_{n-k_1-k_2}^{k_3} \cdot \ldots \cdot C_{n-k_1-k_2-\ldots-k_{s-1}}^{k_s}}{s^n}$

Note that $$C_n^k$$ usually assumes that $$n$$ and $$k$$ are integers. However, in our approach with the linear interpolation, the $$k_j$$ are real numbers. To work around this limitation, we can consider a generalization of $$C_n^k$$ to real numbers using the gamma function $$\Gamma(n) = (n-1)!$$:

$C_n^k = \frac{n!}{k!(n-k)!} = \frac{\Gamma(n+1)}{\Gamma(k+1)\Gamma(n-k+1)}$

In practice, it’s pretty hard to “honestly” calculate the likelihood using the above formula (the factorials overflow quickly), so we switch to the log-likelihood notation:

$\log\mathcal{L} = \log C_n^{k_1} + \log C_{n-k_1}^{k_2} + \log C_{n-k_1-k_2}^{k_3} + \ldots + \log C_{n-k_1-k_2-\ldots-k_{s-1}}^{k_s} - n \log s$

The value of $$\log C_n^k$$ is easily expressed via the log-gamma function:

$\log C_n^k = \log \frac{\Gamma(n+1)}{\Gamma(k+1)\Gamma(n-k+1)} = \log\Gamma(n+1) - \log\Gamma(k+1) - \log\Gamma(n-k+1).$
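Putting it all together, the whole log-likelihood reduces to a few calls to `math.lgamma`. A sketch, assuming the $$k_j$$ counts are already computed (the function names are mine):

```python
from math import lgamma, log

def log_binom(n, k):
    """log C_n^k, generalized to real arguments via the log-gamma function."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def log_likelihood(k):
    """Log-likelihood for the interval counts k = [k_1, ..., k_s]."""
    n, s = sum(k), len(k)
    result = -n * log(s)
    remaining = n  # n - k_1 - ... - k_{j-1} in the j-th term
    for kj in k:
        result += log_binom(remaining, kj)
        remaining -= kj
    return result
```

Since `lgamma` is defined for all positive reals, the same code works unchanged for the non-integer $$k_j$$ produced by the interpolation step.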

This log-likelihood can be used in various applications where we need a way to match quantile sets against each other. In one of the future posts, we will learn to apply this approach to change point detection.
