Matplotlib histogram and estimated PDF in Python

Key focus: Shown with examples: let’s estimate and plot the probability density function of a random variable using Python’s Matplotlib histogram function.

Generation of random variables with required probability distribution characteristic is of paramount importance in simulating a communication system. Let’s see how we can generate a simple random variable, estimate and plot the probability density function (PDF) from the generated data and then match it with the intended theoretical PDF. Normal random variable is considered here for illustration.

Step 1: Generate random samples

A survey of commonly used fundamental methods to generate a given random variable is given in [1]. For this demonstration, we will consider the normal random variable with the following parameters : μ – mean and σ – standard deviation. First generate a vector of randomly distributed random numbers of sufficient length (say 100000) with some valid values for μ and σ. There are more than one way to generate this. Two of them are given below.

● Method 1: Using the in-built numpy.random.normal() function (requires numpy package to be installed)

import numpy as np

mu=10;sigma=2.5 #mean=10,deviation=2.5
L=100000 #length of the random vector

#Random samples generated using numpy.random.normal()
samples_normal = np.random.normal(loc=mu,scale=sigma,size=(L,1)) #generate normally distributted samples

● Method 2: Box-Muller transformation [2] method produces a pair of normally distributed random numbers (Z1, Z2) by transforming a pair of uniformly distributed independent random samples (U1,U2). The algorithm for transformation is given by

#Samples generated using Box-Muller transformation

U1 = np.random.uniform(low=0,high=1,size=(L,1)) #uniformly distributed random numbers U(0,1)
U2 = np.random.uniform(low=0,high=1,size=(L,1)) #uniformly distributed random numbers U(0,1)

a = np.sqrt(-2*np.log(U1))
b = 2*np.pi*U2

Z = a*np.cos(b) #Standard Normal distributed numbers
samples_box_muller= Z*sigma+mu #Normal distribution with mean and sigma

Step 2: Plot the estimated histogram

Typically, if we have a vector of random numbers that is drawn from a distribution, we can estimate the PDF using the histogram tool. Matplotlib’s hist function can be used to compute and plot histograms. If the density argument is set to ‘True’, the hist function computes the normalized histogram such that the area under the histogram will sum to 1. Estimate and plot the normalized histogram using the hist function.

#For plotting
import matplotlib.pyplot as plt
%matplotlib inline
plt.style.use('ggplot')

fig, ax0 = plt.subplots(ncols=1, nrows=1) #creating plot axes
(values, bins, _) = ax0.hist(samples_normal,bins=100,density=True,label="Histogram of samples") #Compute and plot histogram, return the computed values and bins

Step 3: Theoretical PDF:

And for verification, overlay the theoretical PDF for the intended distribution. The theoretical PDF of normally distributed random samples is given by

Theoretical PDF for normal distribution is readily obtained from stats.norm.pdf() function in the SciPy package.

from scipy import stats
bin_centers = 0.5*(bins[1:] + bins[:-1])
pdf = stats.norm.pdf(x = bin_centers, loc=mu, scale=sigma) #Compute probability density function
ax0.plot(bin_centers, pdf, label="PDF",color='black') #Plot PDF
ax0.legend()#Legend entries
ax0.set_title('PDF of samples from numpy.random.normal()');
PDF of samples using numpy random normal function
Figure 1: Estimated PDF (histogram) and the theoretical PDF for samples generated using numpy.random.normal() function

The histogram and theoretical PDF of random samples generated using Box-Muller transformation, can be plotted in a similar manner.

#Samples generated using Box-Muller transformation
from numpy.random import uniform
U1 = uniform(low=0,high=1,size=(L,1)) #uniformly distributed random numbers U(0,1)
U2 = uniform(low=0,high=1,size=(L,1)) #uniformly distributed random numbers U(0,1)
a = np.sqrt(-2*np.log(U1))
b = 2*np.pi*U2
Z = a*np.cos(b) #Standard Normal distribution
samples_box_muller = Z*sigma+mu #Normal distribution with mean and sigma

#Plotting
fig, ax1 = plt.subplots(ncols=1, nrows=1) #creating plot axes
(values, bins, _) = ax1.hist(samples_box_muller,bins=100,density=True,label="Histogram of samples") #Plot histogram
bin_centers = 0.5*(bins[1:] + bins[:-1])
pdf = stats.norm.pdf(x = bin_centers, loc=mu, scale=sigma) #Compute probability density function
ax1.plot(bin_centers, pdf, label="PDF",color='black') #Plot PDF
ax1.legend()#Legend entries
ax1.set_title('Box-Muller transformation');

References:

[1] John Mount, ‘Six Fundamental Methods to Generate a Random Variable’, January 20, 2012
[2] Thomas, D. B., Luk. W., Leong, P. H. W., and Villasenor, J. D. 2007. Gaussian random number generators. ACM Comput. Surv. 39, 4, Article 11 (October 2007), 38 pages DOI = 10.1145/1287620.1287622 http://doi.acm.org/10.1145/1287620.1287622

Rate this post: Note: There is a rating embedded within this post, please visit this post to rate it.

Books by the author


Wireless Communication Systems in Matlab
Second Edition(PDF)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Python
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Matlab
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart
Hand-picked Best books on Communication Engineering
Best books on Signal Processing

Similar topics

Articles in this series
[1] Fibonacci series in python
[2] Central Limit Theorem – a demonstration
[3] Moving Average Filter in Python and Matlab
[4] How to plot FFT in Python – FFT of basic signals : Sine and Cosine waves
[5] How to plot audio files as time-series using Scipy python
[6] How to design a simple FIR filter to reject unwanted frequencies
[7] Analytic signal, Hilbert Transform and FFT
[8] Non-central Chi-squared Distribution
[9] Simulation of M-PSK modulation techniques in AWGN channel (in Matlab and Python)
[10] QPSK modulation and Demodulation (with Matlab and Python implementation)

Plot histogram and estimated PDF in Matlab

Key focus: With examples, let’s estimate and plot the probability density function of a random variable using Matlab histogram function.

Generation of random variables with required probability distribution characteristic is of paramount importance in simulating a communication system. Let’s see how we can generate a simple random variable, estimate and plot the probability density function (PDF) from the generated data and then match it with the intended theoretical PDF. Normal random variable is considered here for illustration. Other types of random variables like uniform, Bernoulli, binomial, Chi-squared, Nakagami-m are illustrated in the next section.

Note: If you are inclined towards programming in Python, visit this article

Step 1: Create the random variable

A survey of commonly used fundamental methods to generate a given random variable is given in [1]. For this demonstration, we will consider the normal random variable with the following parameters : – mean and – standard deviation. First generate a vector of randomly distributed random numbers of sufficient length (say 100000) with some valid values for and . There are more than one way to generate this. Some of them are given below.

This article is part of the book
Wireless Communication Systems in Matlab (second edition), ISBN: 979-8648350779 available in ebook (PDF) format and Paperback (hardcopy) format.

● Method 1: Using the in-built random function (requires statistics toolbox)

mu=0;sigma=1;%mean=0,deviation=1
L=100000; %length of the random vector
R = random('Normal',mu,sigma,L,1);%method 1

●  Method 2: Using randn function that generates normally distributed random numbers having and = 1

mu=0;sigma=1;%mean=0,deviation=1
L=100000; %length of the random vector
R = randn(L,1)*sigma + mu; %method 2

● Method 3: Box-Muller transformation [2] method using rand function that generates uniformly distributed random numbers

 mu=0;sigma=1;%mean=0,deviation=1
L=100000; %length of the random vector
U1 = rand(L,1); %uniformly distributed random numbers U(0,1)
U2 = rand(L,1); %uniformly distributed random numbers U(0,1)
Z = sqrt(-2log(U1)).cos(2piU2);%Standard Normal distribution
R = Z*sigma+mu;%Normal distribution with mean and sigma

Step 2: Plot the estimated histogram

Typically, if we have a vector of random numbers that is drawn from a distribution, we can estimate the PDF using the histogram tool.  Matlab supports two in-built functions to compute and plot histograms:

● hist – introduced before R2006a
● histogram – introduced in R2014b

Which one to use ? Matlab’s help page points that the hist function is not recommended for several reasons and the issue of inconsistency is one among them. The histogram function is the recommended function to use.

Estimate and plot the normalized histogram using the recommended ‘histogram’ function. And for verification, overlay the theoretical PDF for the intended distribution. When using the histogram function to plot the estimated PDF from the generated random data, use ‘pdf’ option for ‘Normalization’ option. Do not use the ‘probability’ option for ‘Normalization’ option, as it will not match the theoretical PDF curve.

histogram(R,'Normalization','pdf'); %plot estimated pdf from the generated data

X = -4:0.1:4; %range of x to compute the theoretical pdf
fx_theory = pdf('Normal',X,mu,sigma); %theoretical normal probability density
hold on; plot(X,fx_theory,'r'); %plot computed theoretical PDF
title('Probability Density Function'); xlabel('values - x'); ylabel('pdf - f(x)'); axis tight;
legend('simulated','theory');
Estimated PDF (using histogram function) and the theoretical PDF

However, if you do not have Matlab version that was released before R2014b, use the ‘hist’ function and get the histogram frequency counts () and the bin-centers (). Using these data, normalize the frequency counts using the overall area under the histogram. Plot this normalized histogram and overlay the theoretical PDF for the chosen parameters.

%For those who don't have access to 'histogram' function
%get un-normalized values from hist function with same number of bins as histogram function
numBins=50; %choose appropriately
[f,x]=hist(R,numBins); %use hist function and get unnormalized values
figure; plot(x,f/trapz(x,f),'b-*');%plot normalized histogram from the generated data

X = -4:0.1:4; %range of x to compute the theoretical pdf
fx_theory =   pdf('Normal',X,mu,sigma); %theoretical normal probability density
hold on; plot(X,fx_theory,'r'); %plot computed theoretical PDF
title('Probability Density Function'); xlabel('values - x'); ylabel('pdf - f(x)'); axis tight;
legend('simulated','theory');

Step 3: Theoretical PDF:

The given code snippets above,  already include the command to plot the theoretical PDF by using the ‘pdf’ function in Matlab. It you do not have access to this function, you could use the following equation for computing the theoretical PDF

The code snippet for that purpose is given next.

X = -4:0.1:4; %range of x to compute the theoretical pdf
fx_theory = 1/sqrt(2*pi*sigma^2)*exp(-0.5*(X-mu).^2./sigma^2);
plot(X,fx_theory,'k'); %plot computed theoretical PDF

Note:  The functions – ‘random’ and ‘pdf’ , requires statistics toolbox.

Rate this article: Note: There is a rating embedded within this post, please visit this post to rate it.

References:

[1] John Mount, ‘Six Fundamental Methods to Generate a Random Variable’, January 20, 2012.↗
[2] Thomas, D. B., Luk. W., Leong, P. H. W., and Villasenor, J. D. 2007. Gaussian random number generators. ACM Comput. Surv. 39, 4, Article 11 (October 2007), 38 pages DOI = 10.1145/1287620.1287622 http://doi.acm.org/10.1145/1287620.1287622.↗

Topics in this chapter

Random Variables - Simulating Probabilistic Systems
● Introduction
Plotting the estimated PDF
● Univariate random variables
 □ Uniform random variable
 □ Bernoulli random variable
 □ Binomial random variable
 □ Exponential random variable
 □ Poisson process
 □ Gaussian random variable
 □ Chi-squared random variable
 □ Non-central Chi-Squared random variable
 □ Chi distributed random variable
 □ Rayleigh random variable
 □ Ricean random variable
 □ Nakagami-m distributed random variable
Central limit theorem - a demonstration
● Generating correlated random variables
 □ Generating two sequences of correlated random variables
 □ Generating multiple sequences of correlated random variables using Cholesky decomposition
Generating correlated Gaussian sequences
 □ Spectral factorization method
 □ Auto-Regressive (AR) model

Books by the author


Wireless Communication Systems in Matlab
Second Edition(PDF)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Python
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Matlab
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart
Hand-picked Best books on Communication Engineering
Best books on Signal Processing