Log-normal Distribution — A simple explanation (2024)

How to calculate μ & σ, the mode, mean, median & variance

Published in Towards Data Science, Feb 16, 2022

We will briefly look at the definition of the log-normal and then go on to calculate the distribution’s parameters μ and σ from simple data. We will then look at how to calculate the mean, mode, median and variance of this probability distribution.

The log-normal distribution is a right-skewed continuous probability distribution, meaning it has a long tail towards the right. It is used for modelling various natural phenomena such as income distributions, the length of chess games or the time to repair a maintainable system.

The probability density function for the log-normal is defined by the two parameters μ and σ, where x > 0:

$$
f(x) = \frac{1}{x \sigma \sqrt{2\pi}} \exp\left(-\frac{(\ln x - \mu)^2}{2\sigma^2}\right), \quad x > 0
$$
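As a minimal sketch, the standard log-normal density can be written as a small Python function. The function name `lognormal_pdf` and the values of `mu` and `sigma` below are illustrative assumptions, not taken from the article. The loop is only a crude numerical check that the area under the curve is close to 1.

```python
import math

def lognormal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of the log-normal distribution at x (zero for x <= 0)."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2 * math.pi))

# Illustrative parameter values (assumed, not from the article).
mu, sigma = 0.0, 0.5

# Midpoint rule on the interval (0, 200]; the tail beyond 200 is negligible here.
step = 0.001
area = sum(lognormal_pdf((k + 0.5) * step, mu, sigma) * step for k in range(200_000))
print(round(area, 4))
```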

μ is the location parameter and σ the scale parameter of the distribution. Caution here! These two parameters should not be mistaken for the more familiar mean and standard deviation of a normal distribution. When our log-normal data is transformed using logarithms, μ can be viewed as the mean (of the transformed data) and σ as the standard deviation (of the transformed data). But without this transformation, μ and σ are simply two parameters that define our log-normal, not its mean or standard deviation! Okay, now we went from “let’s keep it easy” to “a little too much information”. Let’s dial back and look a bit more closely at the just-mentioned relationship between the log-normal and the normal distribution.

The name of the “log-normal” distribution reveals that it relates to logarithms as well as the normal distribution. How? Let’s say your data fits a log-normal distribution. If you then take the logarithm of all your data points, the newly transformed points will now fit a normal distribution. This simply means that when you take the log of your log-normal data you end up with a normal distribution. See figure below.

[Figure: a log-normally distributed variable X and its log-transform Y = ln(X), which is normally distributed]

The data points for our log-normal distribution are given by the X variable. When we log-transform that X variable (Y=ln(X)) we get a Y variable which is normally distributed.

We can reverse this thinking and look at Y instead. If Y has a normal distribution and we take the exponential of Y (X=exp(Y)), then we get back to our X variable, which has a log-normal distribution. This visual is helpful to keep in mind when analysing important properties of the log-normal distribution:

“The most efficient way to analyse log-normally distributed data consists of applying the well-known methods based on the normal distribution to logarithmically transformed data and then to back-transform results if appropriate.” (Lognormal wiki)

We can estimate our log-normal parameters μ and σ using maximum likelihood estimation (MLE). This is a popular approach for approximating distribution parameters as it finds parameters that make our assumed probability distribution ‘most likely’ for our observed data.

If you want to understand how MLE works in more detail, StatQuest explains the approach in a fun intuitive way and also derives the estimators for the normal distribution.

The maximum likelihood estimators for the normal distribution are:

$$
\hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} x_i, \qquad \hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \hat{\mu})^2
$$

We, however, want the maximum likelihood estimators μ and σ for the log-normal distribution, which are:

$$
\hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} \ln x_i, \qquad \hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (\ln x_i - \hat{\mu})^2
$$

These formulas are nearly identical. We can use the same approach as with the normal distribution and just transform our data with a logarithm first. If you are curious about how we get our log-normal estimators, here is a link to the derivation.
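The "log first, then use the normal estimators" idea can be sketched in a few lines of Python. The function name `lognormal_mle` and the sample values are assumptions made for illustration. The function uses the plain 1/n maximum likelihood form of the variance, without any bias correction.

```python
import math

def lognormal_mle(data):
    """Return (mu_hat, sigma_hat), the maximum likelihood estimates.

    The data is log-transformed first; afterwards the usual normal
    estimators are applied (variance divided by n, not n - 1).
    """
    logs = [math.log(x) for x in data]
    n = len(logs)
    mu_hat = sum(logs) / n
    sigma_hat = math.sqrt(sum((v - mu_hat) ** 2 for v in logs) / n)
    return mu_hat, sigma_hat

# Made-up positive values whose logarithms are roughly 0, 1 and 2.
sample = [1.0, math.e, math.e ** 2]
mu_hat, sigma_hat = lognormal_mle(sample)
print(mu_hat, sigma_hat)
```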

Where is the simple example?!

Let’s take a look at 5 values of income that follow a log-normal distribution. Our fictitious person 1 earns 20k, person 2 earns 22k and so on:

[Table: income of five fictitious persons, in thousands]

We can now estimate μ with the logic from above. First, we take the log of each of our income data points and then calculate the average value for the 5 transformed data points, see below:

[Table and calculation: the log-transformed incomes and their average]

This gives us a value of 3.36 for our location parameter μ.
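The calculation can be reproduced with a short Python sketch. The income table itself is not reproduced in this text, so only the first two values (20 and 22) come from the article. The remaining three (25, 30 and 60) are assumptions, chosen because together with the stated values they reproduce the reported estimate.

```python
import math

# Incomes in thousands. 20 and 22 are stated in the text; 25, 30 and 60 are
# ASSUMED values that happen to reproduce the reported estimate of 3.36.
incomes = [20, 22, 25, 30, 60]

# Step 1: take the natural logarithm of every data point.
log_incomes = [math.log(x) for x in incomes]

# Step 2: average the log-transformed values to estimate mu.
mu_hat = sum(log_incomes) / len(log_incomes)

print(round(mu_hat, 2))  # 3.36
```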

We can then use our estimated μ to approximate our σ with the following formula.

$$
\hat{\sigma}^2 = \frac{1}{n-1} \sum_{i=1}^{n} (\ln x_i - \hat{\mu})^2
$$

Rather than calculating σ², we take the square root of the formula above to approximate σ. The formula also uses n-1 instead of just n to get a less biased estimator. If you want to understand more about this change, have a look at the corrected sample variance (also known as Bessel’s correction).

[Table: log-transformed incomes, their deviations from the estimated μ, and the squared deviations]

Similar to above, the first step is to take the logarithm of each individual income data point. We then subtract the estimated μ from each log-transformed data point and then square each result. See table above. These values are then inserted into the formula from above:

$$
\hat{\sigma} = \sqrt{\frac{1}{5-1} \sum_{i=1}^{5} (\ln x_i - 3.36)^2}
$$

This gives us a value of 0.4376 for our scale parameter σ.
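The steps for σ can be sketched in Python in the same way. As the income table is not reproduced in this text, the last three incomes below (25, 30 and 60) are assumptions that reproduce the reported results. Only 20 and 22 are stated in the article.

```python
import math

# Incomes in thousands. 20 and 22 are stated in the text; 25, 30 and 60 are
# ASSUMED values that happen to reproduce the reported estimates.
incomes = [20, 22, 25, 30, 60]

log_incomes = [math.log(x) for x in incomes]
n = len(log_incomes)
mu_hat = sum(log_incomes) / n

# Subtract the estimated mu from each log-transformed value and square the result.
squared_devs = [(v - mu_hat) ** 2 for v in log_incomes]

# Divide by n - 1 (Bessel's correction) and take the square root.
sigma_hat = math.sqrt(sum(squared_devs) / (n - 1))

print(round(sigma_hat, 4))  # 0.4376
```

The standard library call `statistics.stdev(log_incomes)` also divides by n - 1 and should give the same value.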

Note: These calculations are just an example of how these values can be obtained. A sample of five values is far too small to give reliable estimates.

Extracting some of the important properties of the log-normal distribution is straightforward once we have our parameters μ and σ. See key properties, their formula, and the calculation for our example data in the table and figure below.

Key properties of the log-normal distribution:
Median: exp(μ)
Mode: exp(μ − σ²)
Mean: exp(μ + σ²/2)
Variance: (exp(σ²) − 1) · exp(2μ + σ²)

How do we arrive at the different formulas in the table above?

The median is derived by taking the log-normal cumulative distribution function, setting it to 0.5 and then solving this equation for x (see here).
The mode represents the global maximum of the distribution and can therefore be derived by taking the derivative of the log-normal probability density function, setting it to 0 and solving for x (see here).
The mean (also known as the expected value) of the log-normal distribution is the probability-weighted average over all possible values (see here).
The variance of the log-normal distribution is the probability-weighted average of the squared deviation from the mean (see here).
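As a rough sketch, the four properties can be evaluated in Python with the standard closed-form expressions for the log-normal distribution. The inputs are the rounded estimates μ = 3.36 and σ = 0.4376 from the example. The variable names are illustrative.

```python
import math

# Rounded parameter estimates from the worked example.
mu, sigma = 3.36, 0.4376

median = math.exp(mu)
mode = math.exp(mu - sigma ** 2)
mean = math.exp(mu + sigma ** 2 / 2)
variance = (math.exp(sigma ** 2) - 1) * math.exp(2 * mu + sigma ** 2)

for name, value in [("median", median), ("mode", mode),
                    ("mean", mean), ("variance", variance)]:
    print(f"{name}: {value:.2f}")
```

Because the distribution is right-skewed, the mode should come out smaller than the median, and the median smaller than the mean.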

[Figure: key properties of the log-normal distribution for the example data]

FAQs

What is the log of a normal distribution?

A log-normal distribution is a continuous distribution of a random variable y whose natural logarithm is normally distributed. For example, if the random variable y = exp(x) has a log-normal distribution, then x = log(y) has a normal distribution.

How do you explain normal distribution to a layman?

If something is said to follow the normal distribution, it means in the simplest terms that most of the data lies around the average. An easy example is the distribution of test grades in schools. Most people will score around the average, with a few high scores and a few low scores.

What is a real life example of a lognormal distribution?

For example, the following phenomena can all be modeled with a lognormal distribution: milk production by cows, lives of industrial units with failure modes that are characterized by fatigue-stress, and amounts of rainfall.

What is the difference between log normal distribution and normal distribution?

Mainly, a normally distributed variable can take negative values, while a log-normally distributed variable takes only positive values. One of the most common applications of log-normal distributions in finance is the analysis of stock prices.

How do you make a log-normal distribution?

The method is simple: you use the RAND function to generate X ~ N(μ, σ), then compute Y = exp(X). The random variable Y is lognormally distributed with parameters μ and σ. This is the standard definition, but notice that the parameters are specified as the mean and standard deviation of X = log(Y).
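The same recipe can be sketched in Python. Here the standard library's `random.gauss` stands in for the RAND function mentioned above. The helper name `sample_lognormal`, the seed, the sample size and the parameter values are all illustrative assumptions.

```python
import math
import random
import statistics

def sample_lognormal(mu, sigma, size, seed=0):
    """Draw X ~ N(mu, sigma) and return Y = exp(X) for each draw."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [math.exp(rng.gauss(mu, sigma)) for _ in range(size)]

values = sample_lognormal(mu=3.36, sigma=0.4376, size=50_000)

# Taking logs of the simulated values should roughly recover mu and sigma.
logs = [math.log(v) for v in values]
print(statistics.fmean(logs), statistics.stdev(logs))
```

Python's `random.lognormvariate(mu, sigma)` draws from the same distribution directly. The explicit `exp` step is kept here only to mirror the description.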

What is a normal distribution for dummies?

A normal distribution is a type of continuous probability distribution in which most data points cluster toward the middle of the range, while the rest taper off symmetrically toward either extreme. The middle of the range is also known as the mean of the distribution.

How to explain normal distribution to a kid?

A normal distribution follows the empirical rule, which means that approximately 68% of all the data will be within one standard deviation of the mean, 95% within two standard deviations, and 99.7% within three standard deviations.
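The empirical rule can be checked with a short sketch that uses `statistics.NormalDist` from the Python standard library. The variable names are illustrative. A standard normal distribution (mean 0, standard deviation 1) is assumed, which is enough because the rule is stated in units of standard deviations.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Probability mass within k standard deviations of the mean.
shares = {k: standard_normal.cdf(k) - standard_normal.cdf(-k) for k in (1, 2, 3)}

for k, share in shares.items():
    print(f"within {k} standard deviation(s): {share:.1%}")
```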

What is the normal distribution in your own words?

The normal distribution, also known as the Gaussian distribution, is a probability distribution that is symmetric about the mean, showing that data near the mean are more frequent in occurrence than data far from the mean.

Why do we need lognormal distribution?

A random variable that is log-normally distributed takes only positive real values. It is a convenient and useful model for measurements in the natural sciences, engineering, as well as medicine, economics and other fields.

What are the uses of lognormal?

Lognormal distribution plays an important role in probabilistic design because negative values of engineering phenomena are sometimes physically impossible. Typical uses of lognormal distribution are found in descriptions of fatigue failure, failure rates, and other phenomena involving a large range of data.

What distribution is similar to lognormal?

At first glance, the Lognormal, Weibull, and Gamma distributions look quite similar to each other. Selecting between the three models is “quite difficult” (Siswadi & Quesenberry) and the problem of testing which distribution is the best fit for data has been studied by a multitude of researchers.

What is the disadvantage of lognormal distribution?

Log-normal distributions are described by only two parameters (for example, the arithmetic mean and standard deviation), which limits their flexibility. This limitation most commonly shows up as an underprediction of extreme events.

How do you know if a distribution is log normal?

Normally distributed data forms a symmetric bell-shaped graph. In contrast, lognormally distributed data does not form a symmetric shape: it is skewed to the right, with a long right tail.

Can log normal be negative?

All values in a lognormal distribution are positive. Negative values and zeroes are not possible in a lognormal distribution.

What is log-normal distribution of prices?

A lognormal distribution is a distribution that becomes a normal distribution if one converts the values of the variable to the natural logarithms, or ln's, of the values of the variable. For example, consider a stock for which the expected increase in value per year is 10% and the volatility of the stock price is 30%.

What is the log return normal distribution?

The normality of the log-returns of stock prices is one of the most important assumptions in mathematical finance. It is usually assumed that the price dynamics of stocks are driven by geometric Brownian motion and, in that case, the log-returns of the prices are independent and normally distributed.

What is the log-normal transformation?

The log transformation is, arguably, the most popular among the different types of transformations used to transform skewed data to approximately conform to normality. If the original data follows a log-normal distribution or approximately so, then the log-transformed data follows a normal or near normal distribution.
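A small simulation can illustrate the effect of the log transformation on skewness. This is only a sketch. The helper `sample_skewness`, the seed, the sample size and the parameter values are assumptions made for illustration.

```python
import math
import random
import statistics

def sample_skewness(values):
    """Moment-based sample skewness (the third standardized moment)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / sd) ** 3 for v in values])

rng = random.Random(42)  # fixed seed so the run is repeatable

# Simulated log-normal data with illustrative parameters mu = 0, sigma = 0.75.
raw = [rng.lognormvariate(0.0, 0.75) for _ in range(20_000)]
logged = [math.log(v) for v in raw]

print("skewness before log:", sample_skewness(raw))
print("skewness after log: ", sample_skewness(logged))
```

The raw values should show a clearly positive skewness, while the skewness of the log-transformed values should be close to zero.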

How do you fit a log-normal distribution?

To fit the lognormal distribution to data and find the parameter estimates, use the MATLAB functions lognfit, fitdist, or mle. For uncensored data, lognfit and fitdist find the unbiased estimates of the distribution parameters, and mle finds the maximum likelihood estimates.
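For readers without those tools, a comparable fit for uncensored data can be sketched with the Python standard library. This is not the implementation behind the functions named above. The helper `fit_lognormal` and the observations are made up for illustration. `NormalDist.from_samples` estimates the standard deviation with the n - 1 divisor.

```python
import math
from statistics import NormalDist

def fit_lognormal(data):
    """Estimate (mu, sigma) by fitting a normal distribution to ln(data)."""
    fitted = NormalDist.from_samples([math.log(x) for x in data])
    return fitted.mean, fitted.stdev

# Made-up positive observations, used only to exercise the function.
observations = [1.2, 0.8, 2.5, 1.9, 3.1, 0.6, 1.4]
mu_hat, sigma_hat = fit_lognormal(observations)
print(mu_hat, sigma_hat)
```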