Bayes’ theorem

Key focus: Bayes’ theorem is a method for revising the prior probability of a specific event, taking into account the evidence available about that event.

Introduction

In statistics, the process of drawing conclusions from data subject to random variation is called “statistical inference”. Usually, in any random experiment, the observations are recorded and conclusions have to be drawn based on the recorded data set. Conclusions about the underlying random process are necessary to establish one or more of the following:

* Estimation of a parameter of interest (for example, carrier frequency estimation in a receiver)
* Confidence and credibility of the estimate
* Rejecting a preconceived hypothesis
* Classification of data set into groups

Several schools of statistical inference have evolved over time. Bayesian inference is one of them.

Bayes’ theorem

Bayes’ theorem is central to scientific discovery and a core tool in machine learning/AI. It has numerous applications in areas such as mathematics, medicine, finance, marketing and engineering.

Bayes’ theorem is used in Bayesian inference, which usually deals with a sequence of events: as new information becomes available from a subsequent event, that new information is used to update the probability of the initial event. In this context, we encounter two flavors of probability: the prior probability and the posterior probability.

Prior probability: This is the initial probability of an event before any information about it is available. In other words, it is the initial belief in a particular hypothesis before any evidence about that hypothesis is available.

Posterior probability: This is the probability value that has been revised using new information obtained from a subsequent event. In other words, it is the updated belief in the hypothesis as new evidence becomes available.

The formula for Bayes’ theorem is

\[P(H/E) = \frac{P(E/H)\, P(H)}{P(E)}\]

where H denotes the hypothesis and E denotes the evidence.

Figure 1: Formula for Bayes’ theorem

A very simple thought experiment

You are asked to conduct a random experiment with a given coin. You are told that the coin is unbiased (the probability of obtaining a head or a tail is equal, exactly 50%). You believe (before conducting the experiment) that the coin is unbiased and that the chance of getting a head or a tail is 0.5.

Assume that you have not looked at both sides of the coin and you simply start to conduct the experiment. You toss the coin repeatedly and record the outcomes (this is the observed new information/evidence). On the first toss, the coin lands with the head facing up. On the second toss, again the head shows up. On subsequent tosses, the coin keeps showing heads. After 100 tosses, you have observed only heads. Now what will you think about the coin? You will seriously start to suspect that both sides of the coin are engraved with “head” (no tail etched on the coin). Based on this new evidence, your belief about the “unbiasedness” of the coin is altered.

This is what Bayes’ theorem, or Bayesian inference, is all about. It is a general principle about learning from experience. It connects beliefs (called prior probabilities) and evidence (observed data). Based on the evidence, the degree of belief is refined. The degree of belief after conducting the experiment is called the posterior probability.

Figure 2: Bayes’ theorem – the process

Real world example

Suppose a person X falls sick and goes to the doctor for diagnosis. The doctor runs a series of tests and the result comes back positive for a rare disease that affects 0.1% of the population. The accuracy of the test is 99%. That is, the test correctly identifies 99% of the people that have the disease and incorrectly reports the disease in only 1% of the people that do not have it. Now, how certain is it that person X actually has the disease?

In this scenario, we can apply the extended form of Bayes’ theorem

\[P(H/E) = \frac{P(E/H)\, P(H)}{P(E/H)\, P(H) + P(E/\bar{H})\, P(\bar{H})}\]

Figure 3: Bayes’ theorem – extended form

The extended form of Bayes’ theorem is applied in scenarios where the hypothesis H is a binary variable, i.e., it can take only two possible states. In the problem above, the hypothesis can take only two states: H – “having the disease” and H̅ – “not having the disease”.

For the given problem, we can come up with the following numbers for the various quantities in the extended form of Bayes’ theorem.

P(H) = prior probability of having the disease before the test results are available. This is often guesswork, but luckily we have the prevalence of the disease in the population (0.1% = 0.001) to use in its place.
P(E/H) = probability of testing positive given that person X has the disease (99% = 0.99)
P(H̅) = probability of NOT having the disease (1-0.001 = 0.999)
P(E/H̅) = probability of testing positive given that person X does NOT have the disease, i.e., the false positive rate (1% = 0.01)
P(H/E) = probability that person X actually has the disease given that the test result is positive

Plugging these numbers into the extended form of Bayes’ theorem, we find that the probability that X actually has the disease is just 9.016% (about 9%).

Figure 4: Calculation using extended form of Bayes’ theorem

Person X doubts the result, goes for a second opinion to another doctor and gets tested at an independent laboratory. The second test also comes back positive. Now what is the probability that person X actually has the disease?

P(H) = the posterior probability from the first test, which now serves as the prior (we are refining the belief obtained from the first test) = 9.016% = 0.09016
P(E/H) = probability of testing positive given that person X has the disease (99% = 0.99)
P(H̅) = probability of NOT having the disease, updated after the first test (1-0.09016 = 0.90984)
P(E/H̅) = probability of the second test falsely reporting positive when the disease is absent (1% = 0.01)
P(H/E) = probability that person X actually has the disease given that the second test result is also positive

Figure 5: Refining the belief about the first test using results from second test

Therefore, the updated probability based on two positive tests is 90.75%. In other words, there is a 90.75% chance that person X has the disease.
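The two updates above can be reproduced with a few lines of Matlab. This is only a sketch of the arithmetic behind Figures 4 and 5; the helper bayesUpdate is a name chosen here for illustration, not a built-in function.

%Sequential Bayesian update for the rare-disease example (sketch)
bayesUpdate = @(pH,pEgH,pEgNotH) (pEgH*pH)/(pEgH*pH + pEgNotH*(1-pH));

pH = 0.001;     %prior: prevalence of the disease (0.1%)
pEgH = 0.99;    %P(positive test | disease)
pEgNotH = 0.01; %P(positive test | no disease)

post1 = bayesUpdate(pH,pEgH,pEgNotH)    %posterior after the first positive test, ~0.0902
post2 = bayesUpdate(post1,pEgH,pEgNotH) %posterior after the second positive test, ~0.9075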

I hope this gives the reader a better understanding of what Bayes’ theorem is, the various terms in its equation, and how to apply it.




Hidden Markov Models (HMM) – Simplified !!!

Markov chains are useful in computing the probability of events that are observable. However, in many real world applications, the events that we are interested in are usually hidden, that is, we do not observe them directly. These hidden events need to be inferred. For example, given a sentence in a natural language, we only observe the words and characters directly. The parts of speech of the sentence are hidden; they have to be inferred. This brings us to the following topic – the hidden Markov models.

Hidden Markov models enable us to visualize both the observations and the associated hidden events. Let’s consider an example to understand the concept.

The cheating casino and the gullible gambler

Consider a dishonest casino that deceives its players by using two types of dice: a fair die (F) and a loaded die (L). For the fair die, each of the faces has the same probability of landing facing up. For the loaded die, the probabilities of the faces are skewed, as given next.

When the gambler throws the die, a number lands facing up. These numbers are our observations at a given time t (denoted as Ot ∈ {1,2,3,4,5,6}). At any given time t, whether a number was rolled from the fair die (state St = F) or the loaded die (St = L) is unknown to the observer, and therefore the die choices are the hidden events.

Emission probabilities

The probabilities associated with the observations are the observation likelihoods, also called emission probabilities (B).

Initial probabilities

The initial probability of starting (at time t = 0) with either the fair die or the loaded die (hidden event) is 50%.

Transition probabilities

The cheating casino switches from the fair die to the loaded die with 10% probability, and switches back from the loaded die to the fair die with 5% probability.

The probabilities of transitioning from one hidden event to another are described by the transition probability matrix (A). The elements of the transition probability matrix are the transition probabilities (pij) of moving from hidden state i to hidden state j.

The transition probabilities from time t-1 to t, for the hidden events, are

\[P(S_t = L \mid S_{t-1} = F) = 0.1, \qquad P(S_t = F \mid S_{t-1} = F) = 0.9\]
\[P(S_t = F \mid S_{t-1} = L) = 0.05, \qquad P(S_t = L \mid S_{t-1} = L) = 0.95\]

Therefore, the transition probability matrix is

\[A = \begin{bmatrix} p_{FF} & p_{FL} \\ p_{LF} & p_{LL} \end{bmatrix} = \begin{bmatrix} 0.9 & 0.1 \\ 0.05 & 0.95 \end{bmatrix}\]

Based on the given information so far, a probability model is constructed. This is the Hidden Markov Model (HMM) for the given problem.

Figure 1: Hidden Markov Model for the cheating Casino problem
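For readers who want to experiment with the model, a minimal Matlab sketch is given below. The transition and initial probabilities follow the numbers quoted above; the loaded-die emission probabilities were shown only in a figure, so the values used here (a die heavily biased towards six) are assumed purely for illustration.

%Sketch: simulating the cheating-casino HMM (loaded-die probabilities assumed)
A = [0.90 0.10;   %from fair   -> {fair, loaded}
     0.05 0.95];  %from loaded -> {fair, loaded}
B = [1/6  1/6  1/6  1/6  1/6  1/6;   %emission probabilities of the fair die
     0.10 0.10 0.10 0.10 0.10 0.50]; %emission probabilities of the loaded die (assumed)
piInit = [0.5 0.5]; %initial probabilities for {fair, loaded}

T = 20; states = zeros(1,T); obs = zeros(1,T);
states(1) = find(rand < cumsum(piInit),1);      %sample the initial hidden state
obs(1) = find(rand < cumsum(B(states(1),:)),1); %sample the first die roll
for t = 2:T
    states(t) = find(rand < cumsum(A(states(t-1),:)),1); %hidden state transition
    obs(t) = find(rand < cumsum(B(states(t),:)),1);      %observed die face
end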

Assumptions

We saw, in the previous article, that Markov models come with assumptions. Similarly, HMMs also have such assumptions.

1. Assumption on probability of hidden states

In the model given here, the probability of a given hidden state depends only on the previous hidden state. This is a typical first order Markov chain assumption.

2. Assumption on Output

The probability of any observation (output) depends only on the hidden state that produced it, and not on any other hidden states or observations.

Problems and Algorithms

Let’s briefly discuss the different problems and the related algorithms for HMMs. The algorithms will be explained in detail in the future articles.

In the dishonest casino, the gambler rolls the following numbers:

Figure 2: Sample Observations

1. Evaluation

Given the model of the dishonest casino, what is the probability of obtaining the above sequence? This is a typical evaluation problem in HMMs. The forward algorithm is applied to such evaluation problems.
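For reference, a bare-bones forward recursion in Matlab could look like the sketch below. It assumes the matrices A and B and the vector piInit from the casino sketch above, plus a vector obs of observed die faces, and it omits the scaling that is normally used to avoid numerical underflow on long sequences.

%Sketch: forward algorithm, likelihood of an observation sequence given the model
T = length(obs); N = size(A,1);
alpha = zeros(T,N);
alpha(1,:) = piInit .* B(:,obs(1))';                 %initialization
for t = 2:T
    alpha(t,:) = (alpha(t-1,:) * A) .* B(:,obs(t))'; %induction step
end
probObs = sum(alpha(T,:))                            %termination: P(obs | model)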

2. Decoding

What is the most likely sequence of dice (hidden states) that produced the above sequence of rolls? Such problems are addressed by Viterbi decoding.

What is the probability that the fourth roll came from the loaded die, given the above sequence? The forward-backward algorithm comes to our rescue.

3. Learning

Learning problems involve the parametrization of the model. In learning problems, we attempt to find the various parameters (transition probabilities, emission probabilities) of the HMM, given the observations. The Baum-Welch algorithm helps us find the unknown parameters of an HMM.

Some real-life examples

Here are some real-life examples of HMM applications:

  1. Speech recognition: HMMs are widely used in speech recognition systems to model the variability of speech sounds. In this application, the observable events are the acoustic features of the speech signal, while the hidden states represent the phonemes or words that generate the speech signal.
  2. Handwriting recognition: HMMs can be used to recognize handwritten characters by modeling the temporal variability of the pen strokes. In this application, the observable events are the coordinates of the pen on the writing surface, while the hidden states represent the letters or symbols that generate the handwriting.
  3. Stock price prediction: HMMs can be used to model the behavior of stock prices and predict future price movements. In this application, the observable events are the daily price movements, while the hidden states represent the different market conditions that generate the price movements.
  4. Gene prediction: HMMs can be used to identify genes in DNA sequences. In this application, the observable events are the nucleotides in the DNA sequence, while the hidden states represent the different regions of the genome that generate the sequence.
  5. Natural language processing: HMMs are used in many natural language processing tasks, such as part-of-speech tagging and named entity recognition. In these applications, the observable events are the words in the text, while the hidden states represent the grammatical structures or semantic categories that generate the text.
  6. Image and video analysis: HMMs can be used to analyze images and videos, such as for object recognition and tracking. In this application, the observable events are the pixel values in the image or video, while the hidden states represent the object or motion that generates the pixel values.
  7. Bio-signal analysis: HMMs can be used to analyze physiological signals, such as electroencephalograms (EEGs) and electrocardiograms (ECGs). In this application, the observable events are the signal measurements, while the hidden states represent the physiological states that generate the signal.
  8. Radar signal processing: HMMs can be used to process radar signals and detect targets in noisy environments. In this application, the observable events are the radar measurements, while the hidden states represent the presence or absence of targets.




Markov Chains – Simplified !!

Key focus: Markov chains are probabilistic models that describe a sequence of observations whose occurrences are statistically dependent only on the previous ones. Typical examples of such sequential data include:

● Time-series data like speech, stock price movements.
● Words in a sentence.
● Base pairs on the rung of a DNA ladder.

States and transitions

Assume that we want to model the behavior of a driver behind the wheel. The possible behaviors are

● accelerate (state 1)
● constant speed (state 2)
● idling (engine running slowly but the vehicle is not moving – (state 3))
● brake (state 4)

Let’s refer to each of these behaviors as a state. In the given example, there are N = 4 states; refer to them as Q = {q1,q2,q3,q4}.

We observe the following pattern in the driver’s behavior (Figure 1). That is, the driver operates the vehicle through a certain sequence of states. In the graph shown in Figure 1, the states are represented as nodes and the transitions as edges.

Figure 1: Driver’s behavior – operating the vehicle through a sequence of states

We see that, sometimes, the driver changes the state of the vehicle from one state to another and sometimes stays in the same state (as indicated by the arrows).

We also note that the vehicle either stays in the same state or changes to the next state. Therefore, from this model, if we want to predict the future state, all that matters is the current state of the vehicle. The past states have no bearing on the future state except through the current state. Take note of this important assumption for now.

Probabilistic model

We know that we cannot be certain about the driver’s behavior at any given point in time. Therefore, to model this uncertainty, the model is turned into a probabilistic model. A probabilistic model allows us to account for the likelihood of the behaviors or change of states.

An example for a probabilistic model for the given problem is given in Figure 2.

Figure 2: Driver’s behavior – a probabilistic model (transition matrix shown)

In this probabilistic model, we have assigned probability values to the transitions. These probabilities are collectively called transition probabilities. For example, considering the state named “idling”, the probability that the car transitions from this state to the “accelerate” state is 0.8. In probability mathematics, this is expressed as a conditional probability conditioned on the previous state.

p(state = “accelerate” | previous state = “idling” ) = 0.8

Usually, the transition probabilities are formulated in the form of a matrix called the transition probability matrix. The transition probability matrix is shown in Figure 2. In a transition matrix, denoted as A, each element aij represents the probability of transitioning from state i to state j. The elements of the transition matrix satisfy the following property:

\[\sum_{j=1}^{N} a_{ij} = 1, \quad \text{for every state } i\]

That is, the sum of the transition probabilities leaving any given state is 1.

As we know, in this example, the driver cannot start the car in just any state (for example, it is impossible to start the car in the “constant speed” state). He can only start the car from rest (i.e., the brake state). To model this, we introduce πi – the probability that the Markov chain starts in a given state i. The set of starting probabilities for all the N states is called the initial probability distribution (π = {π1, π2, …, πN}). In Figure 3, the starting probabilities are denoted by green arrows.

Figure 3: Markov Chain model for driver’s behavior
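A short Matlab sketch of this chain is shown below. Only the idling-to-accelerate probability (0.8) is quoted in the text; the remaining entries of the transition matrix and the initial distribution are assumed values chosen for illustration, such that each row sums to 1 and the chain starts in the brake state.

%Sketch: simulating the driver-behavior Markov chain (most probabilities assumed)
%states: 1-accelerate, 2-constant speed, 3-idling, 4-brake
A = [0.3 0.5 0.0 0.2;  %from accelerate
     0.2 0.5 0.0 0.3;  %from constant speed
     0.8 0.0 0.2 0.0;  %from idling (0.8 to accelerate, as quoted above)
     0.1 0.0 0.5 0.4]; %from brake
piInit = [0 0 0 1];    %the chain always starts in the brake state

T = 15; q = zeros(1,T);
q(1) = find(rand < cumsum(piInit),1);          %sample the initial state
for t = 2:T
    q(t) = find(rand < cumsum(A(q(t-1),:)),1); %sample the next state given the current one
end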

Markov Assumption

As noted in the definition, the Markov chain in this example assumes that the occurrence of each event/observation is statistically dependent only on the previous one. This is a first-order Markov chain (also termed a bigram language model in natural language processing applications). For the states Q = {q1, …, qn}, predicting the probability of a future state depends only on the current observation; all other previous observations do not matter. In probabilistic terms, this first-order Markov chain assumption is denoted as

\[P(q_i \mid q_1, q_2, \cdots, q_{i-1}) = P(q_i \mid q_{i-1})\]

Extending the assumption to an mth-order Markov chain, predicting the probability of a future observation depends only on the previous m observations:

\[P(q_i \mid q_1, q_2, \cdots, q_{i-1}) = P(q_i \mid q_{i-m}, \cdots, q_{i-1})\]

This corresponds to an (m+1)-gram model.

Given a set of n arbitrary random variables/observations Q = {q1, …, qn}, their joint probability distribution is usually computed by applying the following chain rule:

\[P(q_1, q_2, \cdots, q_n) = \prod_{i=1}^{n} P(q_i \mid q_1, \cdots, q_{i-1})\]

However, if the random observations in Q are sequential in nature and follow the generic mth-order Markov chain model, then the computation of the joint probability gets simplified to

\[P(q_1, q_2, \cdots, q_n) = \prod_{i=1}^{n} P(q_i \mid q_{i-m}, \cdots, q_{i-1})\]

where conditioning terms whose indices fall below 1 are simply dropped.

The Markov assumptions for first and second order Markov models are summarized in Figure 4.

Figure 4: Assumptions for 1st order and 2nd order Markov chains
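To make the simplification concrete: under the first-order assumption, the joint probability of a particular state sequence is simply the initial probability of the first state multiplied by the one-step transition probabilities along the sequence. A small Matlab sketch, reusing the assumed A and piInit from the driver example above:

%Sketch: P(q1,q2,...,qn) under the first-order Markov assumption
seq = [4 3 1 2 2 4];    %brake -> idling -> accelerate -> constant -> constant -> brake
pSeq = piInit(seq(1));  %P(q1)
for k = 2:length(seq)
    pSeq = pSeq * A(seq(k-1),seq(k)); %multiply by P(qk | q(k-1))
end
pSeq                    %joint probability of the assumed sequence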

Hidden Markov Model (HMM)

Markov chains are useful in computing the probability of events that are observable. However, in many real world applications, the events that we are interested in are usually hidden, that is, we do not observe them directly. These hidden events need to be inferred. For example, given a sentence in a natural language, we only observe the words and characters directly. The parts of speech of the sentence are hidden; they have to be inferred. This brings us to the next topic of discussion – the hidden Markov models.




Exponential random variable – simulation & application

Introduction

An exponential random variable (RV) is a continuous random variable that has applications in modeling a Poisson process. Poisson processes find extensive applications in tele-traffic modeling and queuing theory. They are used to model random points in time or space, such as the times when call requests arrive at an exchange, the times when shot noise occurs in the photon-counting process of an optical device, the times when file requests arrive at a server, etc.

Figure 1: Estimated PDF from an exponential random variable

Univariate random variables

This section focuses on some of the most frequently encountered univariate random variables in communication systems design. The basic installation of Matlab provides access to two fundamental random number generators: the uniform random number generator (rand) and the standard normal random number generator (randn). They are fundamental in the sense that all other random variables like Bernoulli, binomial, chi, chi-square, Rayleigh, Ricean, Nakagami-m, exponential, etc., can be generated by transforming them.


Exponential RV

An exponential random variable takes values in the interval \([0, \infty)\) and has the following cumulative distribution function (CDF):

\[F_X(x) = P(X \leq x) = 1 - e^{-\lambda x}, \quad x \geq 0\]

The rate parameter \(\lambda\) specifies the mean number of occurrences per unit time, and \(1/\lambda\) is the mean number of time units until the occurrence of the next event in the modeled process. The probability density function of the exponential RV is given by

\[f_X(x) = \lambda e^{-\lambda x}, \quad x \geq 0\]

By applying the inverse transform method [1], a uniform random variable can be transformed into an exponential random variable. This method is coded in the Matlab function shown at the end of this article. Using the function, a sequence of exponentially distributed random numbers can be generated, whose estimated pdf is plotted against the theoretical pdf in Figure 1.

Application to Poisson process

A Poisson process is a continuous-time, discrete-state process that is widely used to model independent events occurring in time or space. It is widely applied to model counting processes in which the events occur at independent random times but appear to happen at a certain rate. In practice, the Poisson process has been used to model counting processes like

  • photons landing on a photo-diode
  • arrivals of phone calls at a telephone exchange
  • request for file downloads at a web server
  • location of users in a wireless network

The Poisson process is closely related to a number of important random variables (RVs), including the uniform RV, the binomial RV, the exponential RV and the Poisson RV. For example, the inter-arrival times (durations between subsequent arrivals of events) in a Poisson process are independent exponential random variables.


Refer to the book Wireless Communication Systems in Matlab for the full Matlab code

function T = expRV(lambda,L)
%Generate random number sequence that is exponentially distributed
%lambda - rate parameter, L - length of the sequence generated
U = rand(1,L); %continuous uniform random numbers in (0,1)
T = -1/lambda*(log(1-U));
end
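A possible usage sketch (not taken from the book) that reproduces the idea behind Figure 1 and the Poisson connection: generate exponential inter-arrival times with expRV, compare the estimated pdf against the theoretical pdf \(\lambda e^{-\lambda x}\), and obtain Poisson arrival instants by cumulative summation. The rate parameter chosen here is arbitrary.

lambda = 1.5; L = 100000;          %assumed rate parameter and sequence length
T = expRV(lambda,L);               %exponentially distributed inter-arrival times
histogram(T,'Normalization','pdf'); hold on; %estimated pdf
x = 0:0.05:6;
plot(x,lambda*exp(-lambda*x),'r'); %theoretical pdf of the exponential RV
legend('estimated','theoretical');
arrivalTimes = cumsum(T(1:20));    %first few arrival instants of the Poisson process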


References

[1] L. Devroye, Non-Uniform Random Variate Generation, Springer-Verlag, New York, 1986.↗



Binomial random variable using Matlab

A binomial random variable is a discrete random variable that models the number of successes in n mutually independent Bernoulli trials, each with success probability p. The term Bernoulli trial implies that each trial is a random experiment with exactly two possible outcomes: success and failure. The binomial random variable can be used to model the total number of bit errors in a received data sequence of length n that was transmitted over a binary symmetric channel with bit-error probability p.

Generating binomial random sequence in Matlab

Let X denote the total number of successes in n mutually independent Bernoulli trials. For ease of understanding, let’s denote success as ‘1’ and failure as ‘0’. If a particular outcome of the experiment contains k ones and n-k zeros (example outcome: 1011101), the probability mass function↗ of X is given by

\[P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}, \quad k = 0, 1, \cdots, n\]

A binomial random variable can be simulated by generating n independent Bernoulli trials and summing up the results.

function X = binomialRV(n,p,L)
%Generate Binomial random number sequence
%n - the number of independent Bernoulli trials
%p - probability of success yielded by each trial
%L - length of sequence to generate
X = zeros(1,L);
for i=1:L,
   X(i) = sum(bernoulliRV(n,p));
end
end

The following program demonstrates how to generate a sequence of binomially distributed random numbers and plot the estimated and theoretical probability mass functions for the chosen parameters (Figure 1).

n=30; p=1/6; %number of trials and success probability
X = binomialRV(n,p,10000);%generate 10000 binomial random numbers
X_pdf = pdf('Binomial',0:n,n,p); %theoretical probability mass function
histogram(X,'Normalization','pdf'); %plot histogram
hold on; plot(0:n,X_pdf,'r'); %plot computed theoretical PMF
Figure 1: PMF generated from binomial random variable for three different cases of n and p

PMF sums to unity

Let’s verify theoretically the fact that the PMF of the binomial distribution sums to unity. Using the result of the binomial theorem↗,

\[\sum_{k=0}^{n} \binom{n}{k} p^k (1-p)^{n-k} = \left[ p + (1-p) \right]^n = 1\]

Mean and variance

The mean number of successes in a binomial distribution is

\[E[X] = np\]

The variance is

\[Var[X] = np(1-p)\]
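As a quick numerical sanity check (a sketch using the binomialRV function above), the sample mean and variance of a generated sequence should be close to np and np(1-p):

n=30; p=1/6;               %same parameters as above
X = binomialRV(n,p,10000); %10000 binomial random numbers
[mean(X) n*p]              %sample mean vs theoretical mean np = 5
[var(X) n*p*(1-p)]         %sample variance vs theoretical variance np(1-p) ~ 4.17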




Bernoulli random variable

A Bernoulli random variable is a discrete random variable with two outcomes – success and failure – occurring with probabilities p and (1-p) respectively. It is a good model for binary data generators and also for modeling the bit error patterns in received binary data when a communication channel introduces random errors.

To generate a Bernoulli random variable X, in which the probability of success P(X=1)=p for some p ϵ (0,1), the discrete inverse transform method [1] can be applied to the continuous uniform random variable U(0,1) using the steps below.

  ● Generate uniform random number U in the interval (0,1)
  ● If U<p, set X=1, else set X=0

#bernoulliRV.m: Generating Bernoulli random number with success probability p
function X = bernoulliRV(L,p)
%Generate Bernoulli random number with success probability p
%L is the length of the sequence generated
U = rand(1,L); %continuous uniform random numbers in (0,1)
X = (U<p);
end

Verifying law of large numbers

In probability theory, the law of large numbers is a theorem that involves repeating an experiment a large number of times. According to this law, as the number of trials becomes large, the average result obtained will be close to the expected value.


Let’s toss a coin with probability of heads p = 0.7. This experiment is repeated a large number of times, say n = 1000 trials, and the average result is calculated cumulatively over the trials.

#lawOfLargeNumbers.m: Law of large numbers illustrated using Bernoulli random variable
n=1000; %number of trials
p=0.7; %probability of success
X=bernoulliRV(n,p); %Bernoulli random variable
y_sum=sum(triu(repmat(X,[prod(size(X)) 1])')); %cumulative sum
avg = y_sum./(1:1:n); %average of results
plot(1:1:n,avg,'.'); hold on;
xlabel('Trial #'); ylabel('Probability of Heads');
plot(p*ones(1,n),'r'); legend('average','expected');

Refer to the book Wireless Communication Systems in Matlab for the full Matlab code

Figure 1: Illustrating law of large numbers using Bernoulli trials

The resulting plot (Figure 1) shows that as the number of trials increases, the average approaches the expected value 0.7.


References

[1] L. Devroye, Non-Uniform Random Variate Generation, Springer-Verlag, New York, 1986.↗



Plot histogram and estimated PDF in Matlab

Key focus: With examples, let’s estimate and plot the probability density function of a random variable using the Matlab histogram function.

Generating random variables with the required probability distribution characteristics is of paramount importance in simulating a communication system. Let’s see how we can generate a simple random variable, estimate and plot its probability density function (PDF) from the generated data, and then match it with the intended theoretical PDF. The normal random variable is considered here for illustration. Other types of random variables like uniform, Bernoulli, binomial, Chi-squared and Nakagami-m are illustrated in the next section.

Note: If you are inclined towards programming in Python, visit this article

Step 1: Create the random variable

A survey of commonly used fundamental methods to generate a given random variable is given in [1]. For this demonstration, we will consider the normal random variable with the following parameters: μ – mean and σ – standard deviation. First, generate a vector of normally distributed random numbers of sufficient length (say 100000) with some valid values for μ and σ. There is more than one way to generate this; some of them are given below.


● Method 1: Using the in-built random function (requires statistics toolbox)

mu=0;sigma=1;%mean=0,deviation=1
L=100000; %length of the random vector
R = random('Normal',mu,sigma,L,1);%method 1

● Method 2: Using the randn function that generates normally distributed random numbers having μ = 0 and σ = 1

mu=0;sigma=1;%mean=0,deviation=1
L=100000; %length of the random vector
R = randn(L,1)*sigma + mu; %method 2

● Method 3: Box-Muller transformation [2] method using rand function that generates uniformly distributed random numbers

mu=0;sigma=1;%mean=0,deviation=1
L=100000; %length of the random vector
U1 = rand(L,1); %uniformly distributed random numbers U(0,1)
U2 = rand(L,1); %uniformly distributed random numbers U(0,1)
Z = sqrt(-2*log(U1)).*cos(2*pi*U2);%Standard Normal distribution
R = Z*sigma+mu;%Normal distribution with mean and sigma

Step 2: Plot the estimated histogram

Typically, if we have a vector of random numbers drawn from a distribution, we can estimate the PDF using the histogram tool. Matlab supports two in-built functions to compute and plot histograms:

● hist – introduced before R2006a
● histogram – introduced in R2014b

Which one to use? Matlab’s help page points out that the hist function is not recommended for several reasons, inconsistency being one among them. The histogram function is the recommended one to use.

Estimate and plot the normalized histogram using the recommended histogram function and, for verification, overlay the theoretical PDF for the intended distribution. When using the histogram function to plot the estimated PDF from the generated random data, use the ‘pdf’ option for ‘Normalization’. Do not use the ‘probability’ option for ‘Normalization’, as it will not match the theoretical PDF curve.

histogram(R,'Normalization','pdf'); %plot estimated pdf from the generated data

X = -4:0.1:4; %range of x to compute the theoretical pdf
fx_theory = pdf('Normal',X,mu,sigma); %theoretical normal probability density
hold on; plot(X,fx_theory,'r'); %plot computed theoretical PDF
title('Probability Density Function'); xlabel('values - x'); ylabel('pdf - f(x)'); axis tight;
legend('simulated','theory');
Estimated PDF (using histogram function) and the theoretical PDF

However, if your Matlab version was released before R2014b, use the hist function and get the histogram frequency counts (f) and the bin centers (x). Using these data, normalize the frequency counts by the overall area under the histogram. Plot this normalized histogram and overlay the theoretical PDF for the chosen parameters.

%For those who don't have access to 'histogram' function
%get un-normalized values from hist function with same number of bins as histogram function
numBins=50; %choose appropriately
[f,x]=hist(R,numBins); %use hist function and get unnormalized values
figure; plot(x,f/trapz(x,f),'b-*');%plot normalized histogram from the generated data

X = -4:0.1:4; %range of x to compute the theoretical pdf
fx_theory =   pdf('Normal',X,mu,sigma); %theoretical normal probability density
hold on; plot(X,fx_theory,'r'); %plot computed theoretical PDF
title('Probability Density Function'); xlabel('values - x'); ylabel('pdf - f(x)'); axis tight;
legend('simulated','theory');

Step 3: Theoretical PDF:

The code snippets given above already include the command to plot the theoretical PDF by using the pdf function in Matlab. If you do not have access to this function, you could use the following equation to compute the theoretical PDF.

\[f_X(x) = \frac{1}{\sigma \sqrt{2 \pi}} \cdot exp \left[ - \frac{\left( x - \mu\right)^2}{2 \sigma^2} \right] \]

The code snippet for that purpose is given next.

X = -4:0.1:4; %range of x to compute the theoretical pdf
fx_theory = 1/sqrt(2*pi*sigma^2)*exp(-0.5*(X-mu).^2./sigma^2);
plot(X,fx_theory,'k'); %plot computed theoretical PDF

Note: The functions ‘random’ and ‘pdf’ require the Statistics Toolbox.


References:

[1] John Mount, ‘Six Fundamental Methods to Generate a Random Variable’, January 20, 2012.↗
[2] Thomas, D. B., Luk. W., Leong, P. H. W., and Villasenor, J. D. 2007. Gaussian random number generators. ACM Comput. Surv. 39, 4, Article 11 (October 2007), 38 pages DOI = 10.1145/1287620.1287622 http://doi.acm.org/10.1145/1287620.1287622.↗



White Noise : Simulation and Analysis using Matlab

Definition

A random process (or signal, for ease of visualization) with a constant power spectral density (PSD) function is a white noise process.

Power Spectral Density

The power spectral density (PSD) function shows how much power is contained in each of the spectral components. For example, for a sine wave of fixed frequency, the PSD plot will contain only one spectral component, at the given frequency. The PSD is an even function, so the frequency components will be mirrored across the Y-axis when plotted. Thus, for a sine wave of fixed frequency, the double-sided PSD plot will have two components – one at the +ve frequency and another at the –ve frequency of the sine wave. (Know how to plot PSD/FFT in Python & in Matlab)

Gaussian and Uniform White Noise:

A white noise signal (process) is constituted by a set of independent and identically distributed (i.i.d) random variables. In the discrete sense, the white noise signal constitutes a series of samples that are independent and generated from the same probability distribution. For example, you can generate a white noise signal using a random number generator in which all the samples follow a given Gaussian distribution. This is called White Gaussian Noise (WGN) or Gaussian White Noise. Similarly, a white noise signal generated from a uniform distribution is called Uniform White Noise.

Gaussian noise and uniform noise are frequently used in system modelling. In modelling/simulation, white noise can be generated using an appropriate random number generator. White Gaussian noise can be generated using the randn function in Matlab, which generates random numbers that follow a Gaussian distribution. Similarly, the rand function can be used to generate uniform white noise in Matlab that follows a uniform distribution. When these random number generators are used, they generate a series of random numbers from the given distribution. Let’s take the example of generating a white Gaussian noise sequence of length 10 using the randn function in Matlab – with zero mean and standard deviation = 1.

>> mu=0;sigma=1;
>> noise= sigma *randn(1,10)+mu
noise =   -1.5121    0.7321   -0.1621    0.4651    1.4284    1.0955   -0.5586    1.4362   -0.8026    0.0949

What is i.i.d ?

This simply generates 10 random numbers from the standard normal distribution. As we know, a white noise process is seen as a random process composed of several random variables following the same probability distribution function (PDF). The 10 random numbers above are generated from the same PDF (standard normal distribution). This condition is called the “identically distributed” condition. The individual samples given above are “independent” of each other. Furthermore, each sample can be viewed as a realization of one random variable. In effect, we have generated a random process that is composed of realizations of 10 random variables. Thus, the process above is constituted from “independent and identically distributed” (i.i.d) random variables.

Strictly and weakly defined white noise:

Since the white noise process is constructed from i.i.d random variables/samples, all the samples follow the same underlying probability distribution function (PDF). Thus, the joint probability distribution function of the process will not change with any shift in time. This is called a stationary process; hence, this noise is a stationary process. Just as a stationary process can be classified as a Strict Sense Stationary (SSS) or Wide Sense Stationary (WSS) process, we can have white noise that is SSS and white noise that is WSS. Correspondingly, they can be called strictly defined white noise signals and weakly defined white noise signals.

What’s with Covariance Function/Matrix ?

A white noise signal, denoted by \(x(t)\), defined in the weak sense is the more practical condition. Here, the samples are statistically uncorrelated and identically distributed with some variance equal to \(\sigma^2\). This condition is specified using the covariance function as

\[COV \left(x_i, x_j \right) = \begin{cases} \sigma^2, & \quad i = j \\ 0, & \quad i \neq j \end{cases}\]

Why do we need a covariance function? Because we are dealing with a random process that is composed of \(n\) random variables (10 variables in the modelling example above). Such a process is viewed as a multivariate random vector or multivariate random variable.

For multivariate random variables, the covariance function specifies how each of the \(n\) variables in the given random process behaves with respect to the others. The covariance function generalizes the notion of variance to multiple dimensions.

The above equation, when represented in matrix form, gives the covariance matrix of the white noise random process. Since the random variables in this process are statistically uncorrelated, the covariance matrix contains non-zero values only along its diagonal.

\[C_{xx} = \begin{bmatrix} \sigma^2 & \cdots & 0 \\ \vdots & \sigma^2 & \vdots \\ 0 & \cdots & \sigma^2\end{bmatrix} = \sigma^2 \mathbf{I} \]

The matrix above indicates that only the auto-correlation function exists for each random variable. The cross-correlation values are zero (the samples/variables are statistically uncorrelated with respect to each other). The diagonal elements are equal to the variance and all other elements in the matrix are zero. The ensemble auto-correlation function of the weakly defined white noise is given by

\[R_{xx}(\tau) = E \left[ x(t) x^*(t-\tau)\right] = \sigma^2 \delta (\tau)\]

This indicates that the auto-correlation function of the weakly defined white noise process is zero everywhere except at lag \(\tau=0\).

Related topic: Constructing the auto-correlation matrix in Matlab

Frequency Domain Characteristics:

The Wiener-Khintchine theorem states that for a Wide Sense Stationary (WSS) process, the power spectral density function \(S_{xx}(f)\) of the random process can be obtained as the Fourier transform of the auto-correlation function of the random process. In the continuous time domain, this is represented as

\[S_{xx}(f) = F \left[R_{xx}(\tau) \right] = \int_{-\infty}^{\infty} R_{xx} (\tau) e ^{- j 2 \pi f \tau} d \tau\]

For the weakly defined white noise process, we find that the mean is a constant and the covariance does not vary with respect to time. This is a sufficient condition for a WSS process. Thus, we can apply the Wiener-Khintchine theorem. Therefore, the power spectral density of the weakly defined white noise process is constant (flat) across the entire frequency spectrum (Figure 1). The value of the constant is equal to the variance or power of the noise signal.

\[S_{xx}(f) = F \left[R_{xx}(\tau) \right] = \int_{-\infty}^{\infty} \sigma^2 \delta (\tau) e ^{- j 2 \pi f \tau} d \tau = \sigma^2 \int_{-\infty}^{\infty} \delta (\tau) e ^{- j 2 \pi f \tau} d \tau = \sigma^2\]
Figure 1: Wiener-Khintchine theorem illustrated

Testing the characteristics of White Gaussian Noise in Matlab:

Generate a Gaussian white noise signal of length \(L=100,000\) using the randn function in Matlab and plot it. Let’s assume that the pdf is a Gaussian pdf with mean \(\mu=0\) and standard deviation \(\sigma=2\). Thus the variance of the Gaussian pdf is \(\sigma^2=4\). The theoretical PDF of Gaussian random variable is given by

\[f_X(x) = \frac{1}{\sigma \sqrt{2 \pi}} \cdot exp \left[ - \frac{\left( x - \mu\right)^2}{2 \sigma^2} \right] \]


clear all; clc; close all;
L=100000; %Sample length for the random signal
mu=0;
sigma=2;
X=sigma*randn(L,1)+mu;

figure();
subplot(2,1,1)
plot(X);
title(['White noise : \mu_x=',num2str(mu),' \sigma^2=',num2str(sigma^2)])
xlabel('Samples')
ylabel('Sample Values')
grid on;
Figure 2: Simulated noise samples

Plot the histogram of the generated noise signal and verify the histogram by plotting against the theoretical pdf of the Gaussian random variable.

If you are inclined towards programming in Python, go here to know about plotting histogram using Matplotlib package.

subplot(2,1,2)
n=100; %number of histogram bins
[f,x]=hist(X,n);
bar(x,f/trapz(x,f)); hold on;
%Theoretical PDF of Gaussian Random Variable
g=(1/(sqrt(2*pi)*sigma))*exp(-((x-mu).^2)/(2*sigma^2));
plot(x,g);hold off; grid on;
title('Theoretical PDF and Simulated Histogram of White Gaussian Noise');
legend('Histogram','Theoretical PDF');
xlabel('Bins');
ylabel('PDF f_x(x)');
Figure 3: Plot of simulated & theoretical PDF for Gaussian RV

Compute the auto-correlation function of the white noise. The computed auto-correlation function has to be scaled properly. If the ‘xcorr’ function (inbuilt in Matlab) is used for computing the auto-correlation function, use the ‘biased’ argument in the function to scale it properly.

figure();
Rxx=1/L*conv(flipud(X),X);
lags=(-L+1):1:(L-1);

%Alternative method
%[Rxx,lags] =xcorr(X,'biased'); 
%The argument 'biased' is used for proper scaling by 1/L
%Normalize auto-correlation with sample length for proper scaling

plot(lags,Rxx); 
title('Auto-correlation Function of white noise');
xlabel('Lags')
ylabel('Correlation')
grid on;
Figure 4: Autocorrelation function of generated noise

Simulating the PSD:

Simulating the power spectral density (PSD) of white noise is a little tricky. There are two issues here: 1) The generated samples are of finite length. This is synonymous with truncating an infinite series of random samples, which implies that the lags are defined over a fixed range. (FFT and spectral leakage – an additional resource on this topic can be found here) 2) The random number generators used in simulations are pseudo-random generators. Due to these two reasons, you will not get a flat PSD when you apply the Fourier transform over the generated auto-correlation values. The wavering effect of the PSD can be minimized by generating a sufficiently long random signal and averaging the PSD over several realizations of the random signal.

Simulating Gaussian White Noise as a Multivariate Gaussian Random Vector:

To verify the power spectral density of the white noise, we will use the approach of envisaging the noise as a composite of \(N\) Gaussian random variables. We want to average the PSD over \(L\) such realizations. Since there are \(N\) Gaussian random variables (\(N\) individual samples) per realization, the covariance matrix \( C_{xx}\) will be of dimension \(N \times N\). The vector of means for this multivariate case will be of dimension \(1 \times N\).

Cholesky decomposition of the covariance matrix gives the equivalent standard deviation for the multivariate case; Cholesky decomposition can be viewed as a square root operation. Matlab’s randn function is used here to generate the multi-dimensional Gaussian random process with the given mean vector and covariance matrix.

%Verifying the constant PSD of White Gaussian Noise Process
%with arbitrary mean and standard deviation sigma

mu=0; %Mean of each realization of Noise Process
sigma=2; %Sigma of each realization of Noise Process

L = 1000; %Number of Random Signal realizations to average
N = 1024; %Sample length for each realization set as power of 2 for FFT

%Generating the Random Process - White Gaussian Noise process
MU=mu*ones(1,N); %Vector of mean for all realizations
Cxx=(sigma^2)*diag(ones(N,1)); %Covariance Matrix for the Random Process
R = chol(Cxx); %Cholesky of Covariance Matrix
%Generating a Multivariate Gaussian Distribution with given mean vector and
%Covariance Matrix Cxx
z = repmat(MU,L,1) + randn(L,N)*R;

Compute PSD of the above generated multi-dimensional process and average it to get a smooth plot.

%By default, FFT is done across each column - Normal command fft(z)
%Finding the FFT of the Multivariate Distribution across each row
%Command - fft(z,[],2)
Z = 1/sqrt(N)*fft(z,[],2); %Scaling by sqrt(N);
Pzavg = mean(Z.*conj(Z));%Computing the mean power from fft

normFreq=[-N/2:N/2-1]/N;
Pzavg=fftshift(Pzavg); %Shift zero-frequency component to center of spectrum
plot(normFreq,10*log10(Pzavg),'r');
axis([-0.5 0.5 0 10]); grid on;
ylabel('Power Spectral Density (dB/Hz)');
xlabel('Normalized Frequency');
title('Power spectral density of white noise');
Figure 5: Power spectral density of generated noise

The PSD plot of the generated noise shows almost constant power across all the frequencies. In other words, for a white noise signal, the PSD is constant (flat) across all frequencies (\(- \infty\) to \(+\infty\)). The y-axis in the above plot is expressed in dB/Hz units. We can see from the plot that the \(constant \; power = 10 log_{10}(\sigma^2) = 10 log_{10}(4) \approx 6\; dB\).

Application

In channel modeling, we often come across additive white Gaussian noise (AWGN) channel. To know more about the channel model and its simulation, continue reading this article: Simulate AWGN channel in Matlab & Python.




Introduction to concepts in probability


What is Probability?

Probability is a branch of mathematics that deals with uncertainty. The term “probability” is used to quantify the degree of belief or confidence that something is true (or false). It gives us the likelihood of occurrence of a given event, expressed as a number that can take any value in the closed interval [0,1].

Consider the following experiment describing a simple communication system. A user transmits data through a noisy medium and another user receives it. Here, the sender utters a single letter over the phone. Due to the noise characteristics of the communication medium, we do not know whether the user at the destination will hear the same letter that was spoken. Before performing the experiment, we would like to know the likelihood that the user at the destination hears that particular letter (given the noise characteristics). This likelihood of the particular event is called the probability of the event.

Experiment:

Any activity that can produce observable results is called an experiment. For example: tossing a coin (observable results: Head/Tail), rolling a die (observable results: numbers on the faces of the die), drawing a card from a deck (observable results: symbols, numbers and alphabets on the cards), sending & receiving bits in a communication system (observable results: bits/alphabets transferred or voltage level at the receiver).

Sample Space:

Given an experiment, the sample space comprises the set of all possible outcomes of the experiment. It plays the role of the universal set when modeling the experiment and is denoted by the letter ‘S’. For example, tossing a coin gives S = {Head, Tail} and rolling a die gives S = {1, 2, 3, 4, 5, 6}.

Event:

An event is also a set of outcomes of an experiment; it is a subset of the sample space. Each time the experiment is run, a particular event either occurs or does not occur. Events are associated with a probability number.

Types of Events:

Events can be classified according to their relationship with one another. The following table shows the classification of events and their definitions.

Computing Probability:

The probability of the occurrence of an event (say ‘A’) is given by the ratio of the number of ways that particular event can happen to the total number of possible outcomes.

For example, consider the experiment of rolling an unbiased die. The sample space is given by S={1,2,3,4,5,6}. Let’s say that an event is defined as getting ‘4’ when you roll the die. Since only one of the six equally likely outcomes is favorable, the probability of getting the face with ‘4’ (event) is

\( P(A) = \frac{number \; of \; favorable \; outcomes}{total \; number \; of \; outcomes} = \frac{1}{6} \)
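As a quick sanity check, the following minimal Python sketch (numpy assumed) estimates this probability by simulating a large number of die rolls; the sample size of 100000 is an arbitrary choice.

#Monte Carlo check of P(rolling a '4') = 1/6 (numpy assumed)
import numpy as np

rolls = np.random.randint(1,7,size=100000) #100000 fair die rolls (values 1..6)
print(np.mean(rolls==4)) #relative frequency of '4', approximately 0.1667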

Axioms of Probability:

Following definitions are assumed for the axioms listed below: ‘S’ denotes the sample space of an experiment, ‘A’ and ‘B’ are events and P(A) denotes the probability of occurrence of event ‘A’. The axioms are:

● \( P(A) \geq 0 \) for any event A
● \( P(S) = 1 \)
● If A and B are mutually exclusive (\( A \cap B = \emptyset \)), then \( P(A \cup B) = P(A) + P(B) \)

Properties of Probability:

The definition of probability leads to the properties listed below.

● \( 0 \leq P(A) \leq 1 \)
● \( P(\emptyset) = 0 \)
● \( P(\bar{A}) = 1 - P(A) \)
● If \( A \subseteq B \), then \( P(A) \leq P(B) \)
● \( P(A \cup B) = P(A) + P(B) - P(A \cap B) \)

Here the symbol \( \emptyset \) indicates the null event and \( \bar{A} \) indicates that the event A is NOT occurring.

Joint probability and Marginal probability:

Joint probability is defined as the probability that two or more events occur simultaneously. For two events A and B, the joint probability is denoted by P(A,B) or P(A∩B).

Given two or more events, the marginal probability is the probability of occurrence of a single event. It is also called a-priori probability.

The following table illustrates the concept of computing the joint and marginal probabilities. Here, four events (P, Q, R, S) are used for illustration and n denotes the total number of observations.

         Q         S         Row total
P        a/n       c/n       (a+c)/n
R        b/n       d/n       (b+d)/n
Total    (a+b)/n   (c+d)/n   1

For example, the table indicates that the probability of occurrence of both events R & Q is given by b/n. This is the joint probability of R and Q. Adding the probabilities either row-wise or column-wise gives the marginal probability of a single event. For example, adding a/n and b/n gives the marginal probability of event Q; similarly, adding a/n and c/n gives the marginal probability of event P.
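A minimal Python sketch of this bookkeeping is given below; the 2×2 array of counts is hypothetical, and numpy is assumed.

#Joint and marginal probabilities from a table of counts (numpy assumed)
import numpy as np

counts = np.array([[30, 20],  #row 0: event P, row 1: event R
                   [10, 40]]) #column 0: event Q, column 1: event S (hypothetical counts)
n = counts.sum()

joint = counts/n #joint probabilities, e.g. P(P and Q) = 30/100
marginal_P_R = joint.sum(axis=1) #row-wise sums give P(P), P(R)
marginal_Q_S = joint.sum(axis=0) #column-wise sums give P(Q), P(S)
print(joint); print(marginal_P_R); print(marginal_Q_S)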

Conditional probability or Posteriori probability:

Conditional probability (also called a posteriori probability) deals with dependent events. It is used to calculate the probability of an event given that some other event has already occurred.

It is denoted as P(B|A), meaning ‘the probability of event B given that event A has already occurred’. It is called “a-posteriori” because it is only available “after” observing A (the first event).

The conditional probability P(B|A) is mathematically computed as

\( P(B|A) = \frac{P(A \cap B)}{P(A)}, \;\; P(A) > 0 \)
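To make the formula concrete, here is a minimal Python sketch (numpy assumed) that estimates P(B|A) for a fair die, using the hypothetical events A = ‘the roll is even’ and B = ‘the roll is greater than 3’.

#Estimating P(B|A) for a fair die (numpy assumed)
#Hypothetical events: A = 'roll is even', B = 'roll is greater than 3'
import numpy as np

rolls = np.random.randint(1,7,size=100000) #fair die rolls
A = (rolls % 2 == 0)
B = (rolls > 3)

p_A = np.mean(A) #P(A), approximately 3/6
p_A_and_B = np.mean(A & B) #P(A and B), approximately 2/6 (outcomes 4 and 6)
print(p_A_and_B/p_A) #P(B|A), approximately 2/3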


Non-central Chi square distribution


If the squares of k independent standard normal random variables are added, the result follows a central Chi-squared distribution with ‘k’ degrees of freedom. If, instead, the squares of k independent normal random variables with non-zero means are added, the result follows a non-central Chi-squared distribution. The non-central Chi-squared distribution is related to the Ricean distribution, whereas the central Chi-squared distribution is related to the Rayleigh distribution.

The non-central Chi-squared distribution is a generalization of the Chi-squared distribution. A non-central Chi-squared distribution is defined by two parameters: 1) degrees of freedom (\(k\)) and 2) non-centrality parameter (\(\lambda\)).

As we know from the previous article, the degrees of freedom specify the number of independent normal random variables that are squared and summed to form the Chi-squared distribution. The non-centrality parameter is the sum of the squares of the means of the underlying independent normal random variables.

The non-centrality parameter is given by

\( \lambda = \sum_{i=1}^{k} \mu_i^2 \quad \quad (1) \)

The PDF of the non-central Chi-squared distribution having \(k\) degrees of freedom and non-centrality parameter \(\lambda\) is given by

\( f_X(x; k, \lambda) = \sum_{i=0}^{\infty} \frac{e^{-\lambda/2} \left( \lambda/2 \right)^i}{i!} f_{Y_{k+2i}}(x) \)

Here, the random variable \(Y_{k+2i}\) is central Chi-squared distributed with \(k+2i\) degrees of freedom and \(f_{Y_{k+2i}}(x)\) is its PDF. The factor \( \frac{e^{-\lambda/2} (\lambda/2)^i}{i!} \) gives the probabilities of a Poisson distribution with mean \(\lambda/2\). Thus, the PDF of the non-central Chi-squared distribution can be viewed as a weighted sum of central Chi-squared PDFs, where the weights are the probabilities of the Poisson distribution.
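This weighted-sum interpretation is easy to check numerically. The following sketch assumes numpy and scipy are available and truncates the infinite sum at an arbitrarily chosen 50 terms.

#Checking the Poisson-weighted sum interpretation of the non-central
#Chi-squared PDF (numpy and scipy assumed; the sum is truncated at 50 terms)
import numpy as np
from scipy.stats import chi2, ncx2, poisson

k, lda = 4, 2 #example degrees of freedom and non-centrality parameter
x = np.linspace(0.01, 20, 200)

pdf_sum = sum(poisson.pmf(i, lda/2)*chi2.pdf(x, k + 2*i) for i in range(50))
print(np.allclose(pdf_sum, ncx2.pdf(x, k, lda))) #True, up to numerical precision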

Method of Generating non-central Chi-squared random variable:

The procedure for generating the samples from a non-central Chi-squared random variable is as follows.

● For a given degrees of freedom \(k\), let the normal random variables be \(X_1, X_2, \cdots, X_k\) with variances \(\sigma_1^2, \sigma_2^2, \cdots, \sigma_k^2\) and means \(\mu_1, \mu_2, \cdots, \mu_k\) respectively.
● The goal is to add the squares of these \(k\) independent normal random variables with their variances set to one and their means satisfying the condition set by equation (1).
● Set \( \sigma_i^2 = 1 \) for all \(i\), \( \mu_1 = \sqrt{\lambda} \) and \( \mu_2 = \mu_3 = \cdots = \mu_k = 0 \).
● Generate \(k-1\) standard normal random variables (zero mean, unit variance) and one normal random variable with mean \( \mu_1 = \sqrt{\lambda} \) and unit variance.
● Squaring and summing-up all the \(k\) random variables gives the non-central Chi-squared random variable.
● The PDF of the generated samples can be plotted using the histogram method described here. A quick Python check of this procedure is sketched below.
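A minimal Python sketch of this procedure (numpy assumed; the values of k, λ and the sample count are arbitrary) is given below; the sample mean should be close to \(k + \lambda\).

#Generating non-central Chi-squared samples by the above procedure (numpy assumed)
import numpy as np

k = 4; lda = 2; nSamp = 1000000 #example degrees of freedom, non-centrality, sample count
X = np.random.randn(nSamp, k) #k unit-variance normal variables per sample
X[:,0] += np.sqrt(lda) #place all the non-centrality in the mean of the first variable
Y = np.sum(X**2, axis=1) #non-central Chi-squared samples

print(Y.mean()) #should be close to k + lambda = 6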

Matlab Code:

Check this book for full Matlab code.
Wireless Communication Systems using Matlab – by Mathuranathan Viswanathan

Python Code:

Python’s numpy package has a noncentral_chisquare() generator, which can be used in a straightforward manner to obtain non-central Chi-squared distributed sequences.

#---------Non-central Chi square distribution gaussianwaves.com-----
import numpy as np
import matplotlib.pyplot as plt
#%matplotlib inline
plt.style.use('ggplot')

ks=np.asarray([2,4]) #degrees of freedoms to simulate
ldas = np.asarray([1,2,3]) #non-centrality parameters to simulate
nSamp=1000000 #number of samples to generate

fig, ax = plt.subplots(ncols=1, nrows=1, constrained_layout=True)

for i,k in enumerate(ks):
    for j,lda in enumerate(ldas):
        #Generate non-central Chi-squared distributed random numbers
        X = np.random.noncentral_chisquare(df=k, nonc = lda, size = nSamp)
        ax.hist(X,bins=500,density=True,label=r'$k$={} $\lambda$={}'.format(k,lda),\
        histtype='step',alpha=0.75, linewidth=3)

ax.set_xlim(left=0,right=30);ax.legend()
ax.set_title('PDFs of non-central Chi square distribution');
plt.show()
Figure 1: Simulated PDFs of non-central Chi-Squared random variables


For further reading

[1] David A. Harville, Linear Models and the Relevant Distributions and Matrix Algebra, Chapman and Hall/CRC, 1st edition, ISBN 978-1138578333, March 2018.

