Posts / The expected number of takes from a discrete distribution before observing the given element


Let’s consider a discrete distribution $X$ defined by its probability mass function $p_X(x)$. We randomly take elements from $X$ until we observe the given element $x_0$. What’s the expected number of takes in this process?

This classic statistical problem could be solved in various ways. I would like to share one of my favorite approaches that involves the derivative of the series $\sum_{n=0}^\infty x^n$.

Let’s denote the result (the expected number of takes) as $K$ and the probability $p_X(x_0)$ of observing $x_0$ as $\alpha$. First of all, let’s write down the straightforward equation for $K$. The probability of observing $x_0$ after the first take is just $\alpha$. The probability of observing $x_0$ after the second take is $\alpha(1-\alpha)$. The probability of observing $x_0$ after the third take is $\alpha(1-\alpha)^2$. Continuing this process, we have

$$ K = 1 \cdot \alpha + 2 \cdot \alpha(1-\alpha) + 3 \cdot \alpha(1-\alpha)^2 + 4 \cdot \alpha(1-\alpha)^3 + \ldots = \alpha \sum_{n=0}^\infty n (1-\alpha)^{n-1}. $$

Now consider the following series:

$$ S(y) = \sum_{n=0} y^n. $$

It’s easy to see that

$$ S(y) = \sum_{n=0}^\infty y^n = 1 + \sum_{n=1}^\infty y^n = 1 + y \cdot \sum_{n=1}^\infty y^{n-1} = 1 + y \cdot \sum_{n=0}^\infty y^n = 1 + y \cdot S(y). $$

Solving $S(y) = 1 + y \cdot S(y)$, we get

$$ \sum_{n=0}^\infty y^n = S(y) = \frac{1}{1-y}. $$

By taking the derivative of both sides of this expression, we get

$$ \sum_{n=0}^\infty n \cdot y^{n-1} = \frac{1}{(1-y)^2}. $$

Putting $y = 1-\alpha$, we get

$$ K = \alpha \sum_{n_0}^\infty n \cdot y^{n-1} = \alpha \frac{1}{(1-y)^2} = \alpha \frac{1}{\alpha^2} = \frac{1}{\alpha}. $$

Thus, we get the answer:

$$ K = \frac{1}{p_X(x_0)}. $$