Posts / Misleading skewness

Skewness is a commonly used measure of the asymmetry of the probability distributions. A typical skewness interpretation comes down to an image like this:

It looks extremely simple: using the skewness sign, we get an idea of the distribution form and the arrangement of the mean and the median. Unfortunately, it doesn’t always work as expected. Skewness estimation could be a highly misleading metric (even more misleading than the standard deviation). In this post, I discuss four sources of its misleadingness:

Multiple Definitions

Skewness is an ambiguous term. In general, it should be treated as a measure of the asymmetry, but different researchers use different ways to calculate the actual value. Let’s briefly discuss some of the most popular definitions. Hereinafter, we use the following notation:

One of the most popular ways to express skewness is to use the third-moment equations. However, there are several kinds of corresponding formulas. The classic taxonomy with three common third-moment-based skewness definitions is described in [Joanes1998]:

$$ g_1 = \frac{m_3}{m_2^{3/2}} = \frac{\frac{1}{n} \sum(x_i - \overline{x})^3}{\Big( \frac{1}{n} \sum(x_i - \overline{x})^2 \Big)^{3/2}}, $$ $$ G_1 = \frac{\sqrt{n(n-1)}}{n-2} g_1, $$ $$ b_1 = \frac{m_3}{s^3} = \Big( \frac{n-1}{n} \Big)^{3/2} g_1. $$

However, the third-moment approach is not the only one that defines skewness. In the case of a unimodal distribution, we could also use the Pearson Mode Skewness (or Pearson’s first skewness coefficient).

$$ S_1 = \frac{\overline{x} - x_{\textrm{mode}}}{s} $$

The mode detection in the continuous case could be a challenging problem, so we could also use the Pearson Median Skewness (or Pearson’s second skewness coefficient):

$$ S_2 = \frac{3(\overline{x} - x_m)}{s} $$

Both approaches (the third-moment-based equations from the Joanes-Gill classification and Pearson’s coefficients) are not robust since they involve non-robust moments. As the robust alternative, we can use Bowley’s skewness (also known as quartile skewness coefficient or Yule’s coefficient):

$$ B_1 = \frac{Q_{0.75} - 2Q_{0.5} + Q_{0.25}}{Q_{0.75} - Q_{0.25}} $$

This approach has a generalization by Groeneveld and Meeden:

$$ \gamma(u) = \frac{Q_{u} - 2Q_{0.5} + Q_{1-u}}{Q_{u} - Q_{1-u}} $$

Another popular robust measure of skewness is medcouple which is the median of the following function values:

$$ h(x_i, x_j) = \frac{(x_i - x_m) - (x_m - x_j)}{x_i - x_j} $$

for all pairs $(x_i, x_j)$ such that $x_i \geq x_m \geq x_j$ (if $x_i = x_m = x_j$, we assume $h(x_i, x_j) = \textrm{sign}(i+j-1-m)$).

The list goes on. The most important fact is that skewness has many different definitions. Without a context, it’s hard to tell which approach is used (typically, the best guess is a third-moment based skewness like $b_1$ or $g_1$).


Regardless of the definition, skewness value is meaningful only for unimodal distributions. If the distribution has multiple modes, the raw skewness value will not help you to get any insights about the actual distribution form. Thus, it makes sense to check the distribution for unimodality first (e.g., using the lowland multimodality detector).


The classic third-moment skewness is not robust. This means that a single outlier could unpredictably corrupt the skewness value. Let’s illustrate the problem with an example. Imagine we build a sample with 10000 elements randomly taken from the Beta distribution $\textrm{Beta}(2, 10)$. The underlying distribution is right-skewed (the “true” distribution skewness is positive):

I have generated such a sample and evaluated its skewness. The skewness estimation was 0.9673111. Next, I have added an outlier -1000. The “new” skewness value was -99.95875! Thus, a single extreme outlier could easily corrupt skewness estimation of a sample with thousands of “well-formed” elements.

Here is a short R snippet that reproduces this experiment:

> library(e1071) # skewness from e1071 uses Joanes-Gill Type 3 approach (b1) by default
> set.seed(42)
> x <- rbeta(10000, 2, 10)
> skewness(x)
[1] 0.9673111
> skewness(c(-1000, x))
[1] -99.95875

Breaking the Mean-Median rule

There is a well-known rule of thumb that says that the mean if left of the median for left-skewed distributions, and right of the median for right-skewed distributions. This empirical rule could work in some “simple” cases, but it doesn’t work in the general case.

This problem is well-covered in [Hippel2005] where the author demonstrates multiple examples of the rule violations. In addition to this article, let’s perform our own experiment:

Here are the results:


In this post, we briefly covered some sources of skewness misleadingness. It’s too easy to get invalid insights about a distribution based on a skewness value. If we still want to work with skewness, make sure that:


References (4)