Spread Spectrum Communications – Introduction

Spread spectrum system, originally developed for military applications, is extremely resistant to unauthorized detection, jamming, interference and noise. It converts a narrowband signal to wideband signal by the means of spreading. Standards like WiFi and bluetooth use spread spectrum technology such as direct sequence spread spectrum and frequency hopping spread spectrum respectively. Spread spectrum techniques rely on the use of one or other form of coding sequences like m-sequences, Gold codes, Kasami codes, etc., The study of properties of these codes is important in the design and implementation of a given spread spectrum technique. In this chapter, generation of spreading sequences and their main properties are covered, followed by simulation models for direct sequence spread spectrum and frequency hopping spread spectrum systems.

Spread spectrum

A spread-spectrum signal is a signal that occupies a bandwidth that is much larger than necessary to the extent that the occupied bandwidth is independent of the bandwidth of the input-data. With this technique, a narrowband signal such as a sequence of zeros and ones is spread across a given frequency spectrum, resulting in a broader or wideband signal. Spread spectrum was originally intended for military application and it offers two main benefits. First, a wideband signal is less susceptible to intentional blocking (jamming) and unintentional blocking (interference or noise) than a narrowband signal. Secondly, a wideband signal can be perceived as part of noise and remains undetected. The two most popular spread spectrum techniques widely used in commercial applications are direct sequence spread spectrum (DSSS) and frequency hopping spread spectrum (FHSS).

Bluetooth, cordless phones and other fixed broadband wireless access techniques use FHSS; WiFi uses DSSS. Given that both the techniques occupy the same frequency band, co-existence of bluetooth and Wifi devices is an interesting issue. Both FSSS and DSSS devices perceive each others as noise, i.e, the WiFi and Bluetooth devices see each other as mutual interferers. All the spread spectrum techniques make use of some form of spreading or code sequences.

This article is part of the book
Wireless Communication Systems in Matlab (second edition), ISBN: 979-8648350779 available in ebook (PDF) format and Paperback (hardcopy) format.

Code sequences for spread spectrum

Be it DSSS or FHSS, the key element in any spread spectrum technique is the use of code sequences. In DSSS, a narrowband signal representing the input data is XOR’ed with a code sequence of much higher bit rate, rendering a wideband signal. In FHSS, the transmitter/receiver agrees to and hops among a given set of frequency channels according to a seemingly random code sequence. The most popular code sequences used in spread spectrum applications are

Maximum-length sequences (m-sequences)
Gold codes
● Walsh-Hadamard sequences
● Kasami sequences
● Orthogonal codes
● Variable length orthogonal codes

These codes are mainly used in the spread spectrum for protection against intentional blocking (jamming) as well as unintentional blocking (noise or interference), to provide a provision for privacy that enables protection of signals against eavesdropping and to enable multiple users share the same resource. The design and selection of a particular code sequence for a given application largely depends on its properties. As an example, we will consider generation of m-sequences and Gold codes, along with the demonstration of their code properties.

This article is part of the book
Wireless Communication Systems in Matlab (second edition), ISBN: 979-8648350779 available in ebook (PDF) format and Paperback (hardcopy) format.

Sequence correlations

Given the choice of numerous spreading codes like m-sequences, Gold codes, Walsh codes etc., the problem of selecting a spreading sequence for a given application reduces to the selection of such codes based on good discrete-time periodic cross-correlation and auto-correlation properties.

The cross-correlation of two discrete sequences x and y, normalized with respect to the sequence length N, is given by

Similarly, the auto-correlation of a discrete sequence refers to the degree of agreement between the given sequence and its phase shifted replica. To avoid problems due to false synchronization, it is important to select a spreading code with excellent auto-correlation properly. The periodic autocorrelation functions of m-sequences approach the ideal case of noise when the sequence length N is made very large. Hence, m-sequences are commonly employed in practice when good autocorrelation functions are required. For applications like code division multiple access (CDMA), the codes with good cross-correlation properties are desired.

Check the book Wireless communication systems in Matlab for Matlab code on computing the sequence correlation between two arbitrary sequences.

Maximum-length sequences (m-sequences)

Maximum-length sequences (also called as m-sequences or pseudo random (PN) sequences) are constructed based on Galois field theory which is an extensive topic in itself. A detailed treatment on the subject of Galois field theory can be found in references [1] and [2].

Maximum length sequences are generated using linear feedback shift registers (LFSR) structures that implement linear recursion. There are two types of LFSR structures available for implementation – 1) Galois LFSR and 2) Fibonacci LFSR. The Galois LFSR structure is a high speed implementation structure, since it has less clock to clock delay path compared to its Fibonacci equivalent. Kindly check out this article where I detailed the generation of m-sequences using Galois LFSR.

Check the book Wireless communication systems in Matlab for Matlab code on generating m-sequences and computing their sequence correlation properties.

Figure 1, depicts the auto-correlation of an m-sequence of period N=31 using the 5th order characteristic polynomial g(x)=x5+x2+1. The normalized auto-correlation of an m-sequence of length N, takes two values [1,-1/N]. We observe that that auto-correlation of m-sequence carries some similarities with that of a random sequence. If the length of the m-sequence is increased, the out-of-peak correlation -1/N reduces further and thereby the peaks become more distinct. This property makes the m-sequences suitable for synchronization and in the detection of information in single-user Direct Sequence Spread Spectrum systems.

Normalized auto-correlation of m-sequence generated using the polynomial g(x) = x5+x2+1
Figure 1: Normalized auto-correlation of m-sequence generated using the polynomial g(x) = x5+x2+1

Figure 2, depicts the cross-correlation of two different m-sequences, consider two different primitive polynomials g1(x)=x5+x4+x2+x+1 and g2(x)=x5+x4+x3+x+1. The cross-correlation plot contains high peaks at certain lags (as high as 35%) and hence the m-sequences causes multiple access interference (MAI), leading to severe performance degradation. Hence, the m-sequences are not suitable for orthogonalization of users in multi-user spread spectrum systems like CDMA.

Check the book Wireless communication systems in Matlab for Matlab code on generating m-sequences and computing their sequence correlation properties.

Figure 2: Normalized cross-correlation of two m-sequences generated using the polynomials g1(x)=x5+x4+x2+x+1 and g2(x)=x5+x4+x3+x+1

Gold codes

In applications like code division multiple access (CDMA) technique and satellite navigation, a large number of codes with good cross-correlation property is a necessity. Code sequences that have bounded small cross-correlations, are useful when multiple devices are broadcasting in the same frequency range. Gold codes, named after Robert Gold, are suited for this application, since a large number of codes with controlled correlation can be generated by a simple time shift of two preferred pair of m-sequences. Kindly check out this article where I detailed the generation of Gold codes using preferred pair m-sequences.

Gold sequences belong to the category of product codes where two preferred pair of m-sequences of same length are XOR’ed (modulo-2 addition) to produce a Gold sequence. The two m-sequences must maintain the same phase relationship till all the additions are performed. A slight change of phase even in one of the m-sequences, produces a different Gold sequence altogether. Gold codes are non-maximal and therefore they have poor auto-correlation property when compared to that of the underlying m-sequences.

The auto-correlation of a Gold code sequence, plotted in Figure 3 and the normalized cross-correlation of preferred pair m-sequences (used for Gold code generation) is plotted in Figure 4.

Figure 3: Auto-correlation of Gold code sequence generated using the preferred pair feedback connections [2,3,4,5] and [2,5]

Check the book Wireless communication systems in Matlab for Matlab code on generating preferred pair m-sequences, Gold code sequences and computing their sequence correlation properties.

The auto-correlation and cross-correlation plots reveal that the Gold code sequence does not possess the excellent auto-correlation property as that of individual m-sequences, but it possess good cross-correlation properties in comparison with the individual m-sequences.

Rate this article: Note: There is a rating embedded within this post, please visit this post to rate it.

References

[1] D. V. Sarwate and M. B. Pursley, Crosscorrelation properties of pseudorandom and related sequences, Proc. IEEE, vol. 68, no. 5, pp. 593–619, May 1980.↗
[2] S. W. Golomb, Shift-register sequences and spread-spectrum communications, Proceedings of IEEE 3rd International Symposium on Spread Spectrum Techniques and Applications (ISSSTA’94), Oulu, Finland, 1994, pp. 14-15 vol.1.↗

Books by the author


Wireless Communication Systems in Matlab
Second Edition(PDF)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Python
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Matlab
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart
Hand-picked Best books on Communication Engineering
Best books on Signal Processing

Topics in this chapter

Spread spectrum techniques
Introduction
Code sequences
 □ Sequence correlations
 □ Maximum-length sequences (m-sequences)
 □ Gold codes
● Direct Sequence Spread Spectrum
 □ Simulation of DSSS system
 □ Performance of Direct Sequence Spread Spectrum over AWGN channel
 □ Performance of Direct Sequence Spread spectrum in the presence of Jammer
● Frequency Hopping Spread Spectrum
 □ Simulation model
 □ Binary Frequency Shift Keying (BFSK)
 □ Allocation of frequency channels
 □ Frequency hopping generator
 □ Fast and slow frequency hopping
 □ Simulation code for BFSK-FHSS

Viterbi Decoding of Convolutional codes

Note: There is a rating embedded within this post, please visit this post to rate it.
Viterbi algorithm is utilized to decode the convolutional codes. Again the decoding can be done in two approaches. One approach is called hard decision decoding which uses Hamming distance as a metric to perform the decoding operation, whereas, the soft decision decoding uses Euclidean distance as a metric.

As stated in one of the previous posts the soft decision decoding improves the performance of the code to a significant extent compared to the hard decision decoding.

Viterbi Algorithm (VA) decoding involves three stages, namely,

1) Branch Metric Calculation
2) Path Metric Calculation
3) Trace back operation

Branch Metric Calculation :

The pair of received bits (for (n=2) , if (n=3) then we call it triplets, etc.,) are compared with the corresponding branches in the trellis and the distance metrics are calculated. For hard decision decoding, Hamming distances are calculated. Suppose if the received pair of bits are ’11’ and the hamming distance to {’00’,’01’,’10’,’11’} outputs of the trellis are 2,1,1,0 respectively.

For soft decision decoding, see previous article

Path Metric Calculation:

Path metrics are calculated using a procedure called ACS (Add-Compare-Select). This procedure is repeated for every encoder state.

1. Add – for a given state, we know two states on the previous step which can move to this state, and the output bit pairs that correspond to these transitions. To calculate new path metrics, we add the previous path metrics with the corresponding branch metrics.

2. Compare, select – we now have two paths, ending in a given state. One of them (with greater metric) is dropped.

Trace back Operation :

Track back operation is needed in hardware that generally has memory limitations and if the transmitted message is of greater length compared to the memory available. It is also required to maintain a constant throughput at the output of the decoder.

Using soft decision decoding is recommended for Viterbi decoders, since it can give a gain of about 2 dB (that is, a system with a soft decision decoder can use 2 dB less transmitting power than a system with a hard decision decoder with the same error probability).

Two Level Coding System:

Convolution codes with Viterbi decoding are not good at burst error correction, but they are good at random error correction.On the contrary, the Reed Solomon coding is good at burst error correction and not so good at random error correction. So the advantages of these two codes are exploited in many systems to provide good error correction capability against both random and burst errors.

In such systems, data are encoded firstly with a Reed-Solomon code, then they are processed by an interleaver (which places symbols from the same Reed-Solomon codeword far from each other), and then encoded with a convolutional code.

At the receiver, data are firstly processed by a Viterbi decoder. The bursts of errors from its output are then deinterleaved (with erroneous symbols from one burst getting into different Reed-Solomon codewords) and decoded with a Reed-Solomon decoder.

Two Level Coding System

Additional Resources:

[1] A graphical illustration of “How Viterbi Decoding works”

Recommended Books

Performance comparison of Digital Modulation techniques

Key focus: Compare Performance and spectral efficiency of bandwidth-efficient digital modulation techniques (BPSK,QPSK and QAM) on their theoretical BER over AWGN.

More detailed analysis of Shannon’s theorem and Channel capacity is available in the following book
Wireless Communication Systems in Matlab (second edition), ISBN: 979-8648350779 available in ebook (PDF) format and Paperback (hardcopy) format.

Simulation of various digital modulation techniques are available in these books
Digital Modulations using Matlab : Build Simulation Models from Scratch, ISBN: 978-1521493885
Digital Modulations using Python ISBN: 978-1712321638

Let’s take up some bandwidth-efficient linear digital modulation techniques (BPSK,QPSK and QAM) and compare its performance based on their theoretical BER over AWGN. (Readers are encouraged to read previous article on Shannon’s theorem and channel capacity).

Table 1 summarizes the theoretical BER (given SNR per bit ration – Eb/N0) for various linear modulations. Note that the Eb/N0 values used in that table are in linear scale [to convert Eb/N0 in dB to linear scale – use Eb/N0(linear) = 10^(Eb/N0(dB)/10) ]. A small script written in Matlab (given below) gives the following output.

Figure 1: Eb/N0 Vs. BER for various digital modulations over AWGN channel
Table 1: Theoretical BER over AWGN for various linear digital modulation techniques

The following table is obtained by extracting the values of Eb/N0 to achieve BER=10-6 from Figure-1. (Table data sorted with increasing values of Eb/N0).

Table 2: Capacity of various modulations their efficiency and channel bandwidth

where,

is the bandwidth efficiency for linear modulation with M point constellation, meaning that ηB bits can be stuffed in one symbol with Rb bits/sec data rate for a given minimum bandwidth.

is the minimum bandwidth needed for information rate of Rb bits/second. If a pulse shaping technique like raised cosine pulse [with roll off factor (a)] is used then Bmin becomes

Next the data in table 2 is plotted with Eb/N0 on the x-axis and η on the y-axis (see figure 2) along with the well known Shannon’s Capacity equation over AWGN given by,

which can be represented as (refer [1])

Figure 2: Spectral efficiency vs Eb/N0 for various modulations at Pb=10-6

Rate this article: Note: There is a rating embedded within this post, please visit this post to rate it.

Matlab Code

EbN0dB=-4:1:24;
EbN0lin=10.^(EbN0dB/10);
colors={'b-*','g-o','r-h','c-s','m-d','y-*','k-p','b-->','g:<','r-.d'};
index=1;

%BPSK
BPSK = 0.5*erfc(sqrt(EbN0lin));
plotHandle=plot(EbN0dB,log10(BPSK),char(colors(index)));
set(plotHandle,'LineWidth',1.5);
hold on;

index=index+1;

%M-PSK
m=2:1:5;
M=2.^m;
for i=M,
    k=log2(i);
    berErr = 1/k*erfc(sqrt(EbN0lin*k)*sin(pi/i));
    plotHandle=plot(EbN0dB,log10(berErr),char(colors(index)));
    set(plotHandle,'LineWidth',1.5);
    index=index+1;
end

%Binary DPSK
Pb = 0.5*exp(-EbN0lin);
plotHandle = plot(EbN0dB,log10(Pb),char(colors(index)));
set(plotHandle,'LineWidth',1.5);
index=index+1;

%Differential QPSK
a=sqrt(2*EbN0lin*(1-sqrt(1/2)));
b=sqrt(2*EbN0lin*(1+sqrt(1/2)));
Pb = marcumq(a,b,1)-1/2.*besseli(0,a.*b).*exp(-1/2*(a.^2+b.^2));
plotHandle = plot(EbN0dB,log10(Pb),char(colors(index)));
set(plotHandle,'LineWidth',1.5);
index=index+1;

%M-QAM
m=2:2:6;
M=2.^m;

for i=M,
    k=log2(i);
    berErr = 2/k*(1-1/sqrt(i))*erfc(sqrt(3*EbN0lin*k/(2*(i-1))));
    plotHandle=plot(EbN0dB,log10(berErr),char(colors(index)));
    set(plotHandle,'LineWidth',1.5);
    index=index+1;
end

legend('BPSK','QPSK','8-PSK','16-PSK','32-PSK','D-BPSK','D-QPSK','4-QAM','16-QAM','64-QAM');
axis([-4 24 -8 0]);
set(gca,'XTick',-4:2:24); %re-name axis accordingly
ylabel('Probability of BER Error - log10(Pb)');
xlabel('Eb/N0 (dB)');
title('Probability of BER Error log10(Pb) Vs Eb/N0');
grid on;

Reference

[1] “Digital Communications” by John G.Proakis ,Chapter 7: Channel Capacity and Coding.↗

Related topics

Digital Modulators and Demodulators - Complex Baseband Equivalent Models
Introduction
Complex baseband representation of modulated signal
Complex baseband representation of channel response
● Modulators for amplitude and phase modulations
 □ Pulse Amplitude Modulation (M-PAM)
 □ Phase Shift Keying Modulation (M-PSK)
 □ Quadrature Amplitude Modulation (M-QAM)
● Demodulators for amplitude and phase modulations
 □ M-PAM detection
 □ M-PSK detection
 □ M-QAM detection
 □ Optimum detector on IQ plane using minimum Euclidean distance
● M-ary FSK modulation and detection
 □ Modulator for M orthogonal signals
 □ M-FSK detection

Books by the author


Wireless Communication Systems in Matlab
Second Edition(PDF)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Python
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Matlab
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart
Hand-picked Best books on Communication Engineering
Best books on Signal Processing

BPSK – Binary Phase Shift Keying

Key focus: BPSK, Binary Phase Shift Keying, bpsk modulation, bpsk demodulation, BPSK matlab, BPSK python implementation, BPSK constellation

BPSK – introduction

BPSK stands for Binary Phase Shift Keying. It is a type of modulation used in digital communication systems to transmit binary data over a communication channel.

In BPSK, the carrier signal is modulated by changing its phase by 180 degrees for each binary symbol. Specifically, a binary 0 is represented by a phase shift of 180 degrees, while a binary 1 is represented by no phase shift.

BPSK is a straightforward and effective modulation method and is frequently utilized in applications where the communication channel is susceptible to noise and interference. It is also utilized in different wireless communication systems like Wi-Fi, Bluetooth, and satellite communication.

Implementation details

Binary Phase Shift Keying (BPSK) is a two phase modulation scheme, where the 0’s and 1’s in a binary message are represented by two different phase states in the carrier signal: \(\theta=0^{\circ}\) for binary 1 and \(\theta=180^{\circ}\) for binary 0.

This article is part of the following books
Digital Modulations using Matlab : Build Simulation Models from Scratch, ISBN: 978-1521493885
Digital Modulations using Python ISBN: 978-1712321638
All books available in ebook (PDF) and Paperback formats

In digital modulation techniques, a set of basis functions are chosen for a particular modulation scheme. Generally, the basis functions are orthogonal to each other. Basis functions can be derived using Gram Schmidt orthogonalization procedure [1]. Once the basis functions are chosen, any vector in the signal space can be represented as a linear combination of them. In BPSK, only one sinusoid is taken as the basis function. Modulation is achieved by varying the phase of the sinusoid depending on the message bits. Therefore, within a bit duration \(T_b\), the two different phase states of the carrier signal are represented as,

\begin{align*} s_1(t) &= A_c\; cos\left(2 \pi f_c t \right), & 0 \leq t \leq T_b \quad \text{for binary 1}\\ s_0(t) &= A_c\; cos\left(2 \pi f_c t + \pi \right), & 0 \leq t \leq T_b \quad \text{for binary 0} \end{align*}

where, \(A_c\) is the amplitude of the sinusoidal signal, \(f_c\) is the carrier frequency \(Hz\), \(t\) being the instantaneous time in seconds, \(T_b\) is the bit period in seconds. The signal \(s_0(t)\) stands for the carrier signal when information bit \(a_k=0\) was transmitted and the signal \(s_1(t)\) denotes the carrier signal when information bit \(a_k=1\) was transmitted.

The constellation diagram for BPSK (Figure 3 below) will show two constellation points, lying entirely on the x axis (inphase). It has no projection on the y axis (quadrature). This means that the BPSK modulated signal will have an in-phase component but no quadrature component. This is because it has only one basis function. It can be noted that the carrier phases are \(180^{\circ}\) apart and it has constant envelope. The carrier’s phase contains all the information that is being transmitted.

BPSK transmitter

A BPSK transmitter, shown in Figure 1, is implemented by coding the message bits using NRZ coding (\(1\) represented by positive voltage and \(0\) represented by negative voltage) and multiplying the output by a reference oscillator running at carrier frequency \(f_c\).

Figure 1: BPSK transmitter

The following function (bpsk_mod) implements a baseband BPSK transmitter according to Figure 1. The output of the function is in baseband and it can optionally be multiplied with the carrier frequency outside the function. In order to get nice continuous curves, the oversampling factor (\(L\)) in the simulation should be appropriately chosen. If a carrier signal is used, it is convenient to choose the oversampling factor as the ratio of sampling frequency (\(f_s\)) and the carrier frequency (\(f_c\)). The chosen sampling frequency must satisfy the Nyquist sampling theorem with respect to carrier frequency. For baseband waveform simulation, the oversampling factor can simply be chosen as the ratio of bit period (\(T_b\)) to the chosen sampling period (\(T_s\)), where the sampling period is sufficiently smaller than the bit period.

Refer Digital Modulations using Matlab : Build Simulation Models from Scratch for full Matlab code.
Refer Digital Modulations using Python for full Python code

File 1: bpsk_mod.m: Baseband BPSK modulator

function [s_bb,t] = bpsk_mod(ak,L)
%Function to modulate an incoming binary stream using BPSK(baseband)
%ak - input binary data stream (0's and 1's) to modulate
%L - oversampling factor (Tb/Ts)
%s_bb - BPSK modulated signal(baseband)
%t - generated time base for the modulated signal
N = length(ak); %number of symbols
a = 2*ak-1; %BPSK modulation
ai=repmat(a,1,L).'; %bit stream at Tb baud with rect pulse shape
ai = ai(:).';%serialize
t=0:N*L-1; %time base
s_bb = ai;%BPSK modulated baseband signal

BPSK receiver

A correlation type coherent detector, shown in Figure 2, is used for receiver implementation. In coherent detection technique, the knowledge of the carrier frequency and phase must be known to the receiver. This can be achieved by using a Costas loop or a Phase Lock Loop (PLL) at the receiver. For simulation purposes, we simply assume that the carrier phase recovery was done and therefore we directly use the generated reference frequency at the receiver – \(cos( 2 \pi f_c t)\).

Figure 2: Coherent detection of BPSK (correlation type)

In the coherent receiver, the received signal is multiplied by a reference frequency signal from the carrier recovery blocks like PLL or Costas loop. Here, it is assumed that the PLL/Costas loop is present and the output is completely synchronized. The multiplied output is integrated over one bit period using an integrator. A threshold detector makes a decision on each integrated bit based on a threshold. Since, NRZ signaling format was used in the transmitter, the threshold for the detector would be set to \(0\). The function bpsk_demod, implements a baseband BPSK receiver according to Figure 2. To use this function in waveform simulation, first, the received waveform has to be downconverted to baseband, and then the function may be called.

Refer Digital Modulations using Matlab : Build Simulation Models from Scratch for full Matlab code.
Refer Digital Modulations using Python for full Python code

File 2: bpsk_demod.m: Baseband BPSK detection (correlation receiver)

function [ak_cap] = bpsk_demod(r_bb,L)
%Function to demodulate an BPSK(baseband) signal
%r_bb - received signal at the receiver front end (baseband)
%N - number of symbols transmitted
%L - oversampling factor (Tsym/Ts)
%ak_cap - detected binary stream
x=real(r_bb); %I arm
x = conv(x,ones(1,L));%integrate for L (Tb) duration
x = x(L:L:end);%I arm - sample at every L
ak_cap = (x > 0).'; %threshold detector

End-to-end simulation

The complete waveform simulation for the end-to-end transmission of information using BPSK modulation is given next. The simulation involves: generating random message bits, modulating them using BPSK modulation, addition of AWGN noise according to the chosen signal-to-noise ratio and demodulating the noisy signal using a coherent receiver. The topic of adding AWGN noise according to the chosen signal-to-noise ratio is discussed in section 4.1 in chapter 4. The resulting waveform plots are shown in the Figure 2.3. The performance simulation for the BPSK transmitter/receiver combination is also coded in the program shown next (see chapter 4 for more details on theoretical error rates).

The resulting performance curves will be same as the ones obtained using the complex baseband equivalent simulation technique in Figure 4.4 of chapter 4.

Refer Digital Modulations using Matlab : Build Simulation Models from Scratch for full Matlab code.
Refer Digital Modulations using Python for full Python code

File 3: bpsk_wfm_sim.m: Waveform simulation for BPSK modulation and demodulation

Figure 3: (a) Baseband BPSK signal,(b) transmitted BPSK signal – with carrier, (c) constellation at transmitter and (d) received signal with AWGN noise

References:

[1] Lloyd N. Trefethen, David Bau III , Numerical linear algebra, Philadelphia: Society for Industrial and Applied Mathematics, ISBN 978-0-89871-361-9, pp.56

Books by the author


Wireless Communication Systems in Matlab
Second Edition(PDF)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Python
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Matlab
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart
Hand-picked Best books on Communication Engineering
Best books on Signal Processing

Topics in this chapter

Digital Modulators and Demodulators - Passband Simulation Models
Introduction
Binary Phase Shift Keying (BPSK)
 □ BPSK transmitter
 □ BPSK receiver
 □ End-to-end simulation
Coherent detection of Differentially Encoded BPSK (DEBPSK)
● Differential BPSK (D-BPSK)
 □ Sub-optimum receiver for DBPSK
 □ Optimum noncoherent receiver for DBPSK
Quadrature Phase Shift Keying (QPSK)
 □ QPSK transmitter
 □ QPSK receiver
 □ Performance simulation over AWGN
● Offset QPSK (O-QPSK)
● π/p=4-DQPSK
● Continuous Phase Modulation (CPM)
 □ Motivation behind CPM
 □ Continuous Phase Frequency Shift Keying (CPFSK) modulation
 □ Minimum Shift Keying (MSK)
Investigating phase transition properties
● Power Spectral Density (PSD) plots
Gaussian Minimum Shift Keying (GMSK)
 □ Pre-modulation Gaussian Low Pass Filter
 □ Quadrature implementation of GMSK modulator
 □ GMSK spectra
 □ GMSK demodulator
 □ Performance
● Frequency Shift Keying (FSK)
 □ Binary-FSK (BFSK)
 □ Orthogonality condition for non-coherent BFSK detection
 □ Orthogonality condition for coherent BFSK
 □ Modulator
 □ Coherent Demodulator
 □ Non-coherent Demodulator
 □ Performance simulation
 □ Power spectral density

Understanding Gibbs Phenomenon in signal processing

Introduction

Gibbs phenomenon is a phenomenon that occurs in signal processing and Fourier analysis when approximating a discontinuous function using a series of Fourier coefficients. Specifically, it is the observation that the overshoots near the discontinuities of the approximated function do not decrease with increasing numbers of Fourier coefficients used in the approximation.

In other words, when a discontinuous function is approximated by its Fourier series, the resulting series will exhibit oscillations near the discontinuities that do not diminish as more terms are added to the series. This can lead to a “ringing” effect in the signal, where there are spurious oscillations near the discontinuity that can persist even when the number of Fourier coefficients used in the approximation is increased.

The Gibbs phenomenon is named after American physicist Josiah Willard Gibbs, who first described it in 1899. It is a fundamental limitation of the Fourier series approximation and can occur in many other areas of signal processing and analysis as well.

Gibbs phenomenon

Fourier transform represents signals in frequency domain as summation of unique combination of sinusoidal waves. Fourier transforms of various signals are shown in the Figure 1. Some of these signals, square wave and impulse, have abrupt discontinuities (sudden changes) in time domain. They also have infinite frequency content in the frequency domain.

Figure 1: Frequency response of various test signals

Therefore, abrupt discontinuities in the signals require infinite frequency content in frequency domain. As we know, in order to represent these signals in computer memory, we cannot dispense infinite memory (or infinite bandwidth when capturing/measurement) to hold those infinite frequency terms. Somewhere, the number of frequency terms has to be truncated. This truncation in frequency domain manifests are ringing artifacts in time domain and vice-versa. This is called Gibbs phenomenon.

Figure 2: Ringing artifact (Gibbs phenomenon) on a square wave when the number of frequency terms is truncated

These ringing artifacts result from trying to describe the given signal with less number of frequency terms than the ideal. In practical applications, the ringing artifacts can result from

● Truncation of frequency terms – For example, to represent a perfect square wave, an infinite number of frequency terms are required. Since we cannot have an instrument with infinite bandwidth, the measurement truncates the number of frequency terms, resulting in the ringing artifact.

● Shape of filters – The ringing artifacts resulting from filtering operation is related to the sharp transitions present in the shape of the filter impulse response.

FIR filters and Gibbs phenomenon

Owing to their many favorable properties, digital Finite Impulse Response (FIR) filters are extremely popular in many signal processing applications. FIR filters can be designed to exhibit linear phase response in passband, so that the filter does not cause delay distortion (or dispersion) where different frequency components undergo different delays.

The simplest FIR design technique is the Impulse Response Truncation, where an ideal impulse response of infinite duration is truncated to finite length and the samples are delayed to make it causal. This method comes with an undesirable effect due to Gibbs Phenomenon.

Ideal brick wall characteristics in frequency domain is desired for most of the filters. For example, a typical ideal low pass filter necessitates sharp transition between passband and stopband. Any discontinuity (abrupt transitions) in one domain requires infinite number of components in the other domain.

For example, a rectangular function with abrupt transition in frequency domain translates to a \(sinc(x)=sin(x)/x\) function of infinite duration in time domain. In practical filter design, the FIR filters are of finite length. Therefore, it is not possible to represent an ideal filter with abrupt discontinuities using finite number of taps and hence the \(sinc\) function in time domain should be truncated appropriately. This truncation of an infinite duration signal in time domain leads to a phenomenon called Gibbs phenomenon in frequency domain. Since some of the samples in time domain (equivalently harmonics in frequency domain) are not used in the reconstruction, it leads to oscillations and ringing effect in the other domain. This effect is called Gibbs phenomenon.

Similar effect can also be observed in the time domain if truncation is done in the frequency domain.

This effect due to abrupt discontinuities will exists no matter how large the number of samples  is made. The situation can be improved by using a smoothly tapering windows like Blackman, Hamming , Hanning , Keiser windows etc.,

Demonstration of Gibbs Phenomenon using Matlab:

In this demonstration, a sinc pulse in time domain is considered. Sinc pulse with infinite duration in time domain, manifests as perfect rectangular shape in frequency domain. In this demo, we truncate the sinc pulse in the time domain at various length and use FFT (Fast Fourier Transform) to visualize it frequency domain. As the duration of time domain samples increases, the ringing artifact become less pronounced and the shape approaches ideal brick wall filter response.

%Gibbs Phenomenon
clearvars; % clear all stored variables

Nsyms = 5:10:60; %filter spans in symbol duration

Tsym=1; %Symbol duration
L=16; %oversampling rate, each symbol contains L samples
Fs=L/Tsym; %sampling frequency

for Nsym=Nsyms,
    [p,t]=sincFunction(L,Nsym); %Sinc Pulse

    subplot(1,2,1);
    plot(t*Tsym,p,'LineWidth',1.5);axis tight;
    ylim([-0.3,1.1]);
    title('Sinc pulse');xlabel('Time (s)');ylabel('Amplitude');
    
    [fftVals,freqVals]=freqDomainView(p,Fs,'double'); %See Chapter 1
    subplot(1,2,2);
    plot(freqVals,abs(fftVals)/abs(fftVals(length(fftVals)/2+1)),'LineWidth',1.5);
    xlim([-2 2]); ylim([0,1.1]);
    title('Frequency response of Sinc (FFT)');
    xlabel('Normalized Frequency (Hz)');ylabel('Magnitude');
    pause;% wait for user input to continue
end

Simulation Results:

Figure 3: Sinc pulse constructed with Nsym = 5 (filter span in symbols) and L =16 (samples/symbol)
Figure 4: Sinc pulse constructed with Nsym = 15 (filter span in symbols) and L =16 (samples/symbol)
Figure 5: Sinc pulse constructed with Nsym = 35 (filter span in symbols) and L =16 (samples/symbol)

I have also discussed with examples, Gibbs phenomenon applied to truncation of Fourier series coefficients. You can read about it here.

Similar articles

[1] Understanding Fourier Series
[2] Introduction to digital filter design
[3] Design FIR filter to reject unwanted frequencies
[4] FIR or IIR ? Understand the design perspective
[5] How to Interpret FFT results – complex DFT, frequency bins and FFTShift
[6] How to interpret FFT results – obtaining magnitude and phase information
[7] Analytic signal, Hilbert Transform and FFT
[8] FFT and spectral leakage
[9] Moving average filter in Python and Matlab

Books by the author


Wireless Communication Systems in Matlab
Second Edition(PDF)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Python
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Matlab
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart
Hand-picked Best books on Communication Engineering
Best books on Signal Processing

Young’s model for Rayleigh fading

Introduction

Young’s fading channel model is a mathematical model used to describe the behavior of a wireless communication channel. It is a type of frequency selective fading channel model that is commonly used to simulate the effects of multipath interference on wireless signals.

The model is based on the assumption that the transmitted signal reaches the receiver through multiple paths, each with a different attenuation and phase shift. The attenuation and phase shift of each path are modeled as independent random variables with specific probability distributions.

The model uses the sum of these attenuated and phase-shifted paths to simulate the received signal. The resulting signal experiences fading due to the constructive and destructive interference of the individual paths.

Young’s fading channel model is useful for simulating the performance of wireless communication systems in a multipath environment. It can help researchers and engineers evaluate the performance of different modulation and coding schemes and develop techniques to mitigate the effects of fading.

Young’s model

In the previous article, the characteristics and types of fading was discussed. Rayleigh Fading channel with Doppler shift is considered in this article.

Consider a channel affected by both Rayleigh Fading phenomena and Doppler Shift. Rayleigh Fading is caused due to multipath reflections of the received signal before it reaches the receiver and the Doppler Shift is caused due to the difference in the relative velocity/motion between the transmitter and the receiver. This scenario is encountered in day to day mobile communications.

A number of simulation algorithms are proposed for generation of correlated Rayleigh random variables. David J.Young and Norman C Beaulieu proposed a method in their paper titled “The Generation of Correlated Rayleigh Random Variates by Inverse Discrete Fourier Transform”[1] based on the inverse discrete Fourier transform (IDFT). It is a modification of the Smith’s algorithm which is normally used for Rayleigh fading simulation. This method requires exactly one-half the number of IDFT operations and roughly two-thirds the computer memory of the original method – as the authors of the paper claims.

Rayleigh Fading can be simulated by adding two Gaussian Random variables as mentioned in my previous post. The effect of Doppler shift is incorporated by modeling the Doppler effect as a frequency domain filter.

The model proposed by Young et.al is shown below.

Rayleigh Fading – Young’s model

The Fading effect + Doppler Shift is simulated by multiplying the Gaussian Random variables and the Doppler Shift’s Frequency domain representation. Then IDFT is performed to bring them into time domain representation. The Doppler Filter used to represent the Doppler Shift effect is derived in Young’s paper.

The equation for the Doppler Filter is :

Matlab Code

Check this book for full Matlab code.
Simulation of Digital Communication Systems Using Matlab – by Mathuranathan Viswanathan

Matlab code Output:

Rayleigh Fading with Doppler Effect

Reference:

[1] D.J. Young and N.C. Beaulieu, “The generation of correlated Rayleigh random variates by inverse discrete fourier transform,” IEEE transactions on Communications, vol. 48, pp. 1114-1127, July 2000.

See also

[1]Eb/N0 Vs BER for BPSK over Rayleigh Channel and AWGN Channel
[2]Simulation of Rayleigh Fading ( Clarke’s Model – sum of sinusoids method)
[3]Performance comparison of Digital Modulation techniques
[4]BER Vs Eb/N0 for BPSK modulation over AWGN
[5]Introduction to Fading Channels

External Resources

[1]Theoretical expressions for BER under various conditions
[2]Capacity of MRC on correlated Rician Fading Channels

Fading channel – complex baseband equivalent models

Keyfocus: Fading channel models for simulation. Learn how fading channels can be modeled as FIR filters for simplified modulation & detection. Rayleigh/Rician fading.

Introduction

A fading channel is a wireless communication channel in which the quality of the signal fluctuates over time due to changes in the transmission environment. These changes can be caused by different factors such as distance, obstacles, and interference, resulting in attenuation and phase shifting. The signal fluctuations can cause errors or loss of information during transmission.

Fading channels are categorized into slow fading and fast fading depending on the rate of channel variation. Slow fading occurs over long periods, while fast fading happens rapidly over short periods, typically due to multipath interference.

To overcome the negative effects of fading, various techniques are used, including diversity techniques, equalization, and channel coding.

Fading channel in frequency domain

With respect to the frequency domain characteristics, the fading channels can be classified into frequency selective and frequency-flat fading.

A frequency flat fading channel is a wireless communication channel where the attenuation and phase shift of the signal are constant across the entire frequency band. This means that the signal experiences the same amount of fading at all frequencies, and there is no frequency-dependent distortion of the signal.

In contrast, a frequency selective fading channel is a wireless communication channel where the attenuation and phase shift of the signal vary with frequency. This means that the signal experiences different levels of fading at different frequencies, resulting in a frequency-dependent distortion of the signal.

Frequency selective fading can occur due to various factors such as multipath interference and the presence of objects that scatter or absorb certain frequencies more than others. To mitigate the effects of frequency selective fading, various techniques can be used, such as equalization and frequency hopping.

The channel fading can be modeled with different statistics like Rayleigh, Rician, Nakagami fading. The fading channel models, in this section, are utilized for obtaining the simulated performance of various modulations over Rayleigh flat fading and Rician flat fading channels. Modeling of frequency selective fading channel is discussed in this article.

Linear time invariant channel model and FIR filters

The most significant feature of a real world channel is that the channel does not immediately respond to the input. Physically, this indicates some sort of inertia built into the channel/medium, that takes some time to respond. As a consequence, it may introduce distortion effects like inter-symbol interference (ISI) at the channel output. Such effects are best studied with the linear time invariant (LTI) channel model, given in Figure 1.

Figure 1: Complex baseband equivalent LTI channel model

In this model, the channel response to any input depends only on the channel impulse response(CIR) function of the channel. The CIR is usually defined for finite length \(L\) as \(\mathbf{h}=[h_0,h_1,h_2, \cdots,h_{L-1}]\) where \(h_0\) is the CIR at symbol sampling instant \(0T_{sym}\) and \(h_{L-1}\) is the CIR at symbol sampling instant \((L-1)T_{sym}\). Such a channel can be modeled as a tapped delay line (TDL) filter, otherwise called finite impulse response (FIR) filter. Here, we only consider the CIR at symbol sampling instances. It is well known that the output of such a channel (\(\mathbf{r}\)) is given as the linear convolution of the input symbols (\(\mathbf{s}\)) and the CIR (\(\mathbf{h}\)) at symbol sampling instances. In addition, channel noise in the form of AWGN can also be included the model. Therefore, the resulting vector of from the entire channel model is given as

\[\mathbf{r} = \mathbf{h} \ast \mathbf{s} +\mathbf{n} \quad\quad (1) \]

This article is part of the following books
Digital Modulations using Matlab : Build Simulation Models from Scratch, ISBN: 978-1521493885
Digital Modulations using Python ISBN: 978-1712321638
Wireless communication systems in Matlab ISBN: 979-8648350779
All books available in ebook (PDF) and Paperback formats

Simulation model for detection in flat fading channel

A flat-fading (also called as frequency-non-selective) channel is modeled with a single tap (\(L=1\)) FIR filter with the tap weights drawn from distributions like Rayleigh, Rician or Nakagami distributions. We will assume block fading, which implies that the fading process is approximately constant for a given transmission interval. For block fading, the random tap coefficient \(h=h[0]\) is a complex random variable (not random processes) and for each channel realization, a new set of complex random values are drawn from Rayleigh or Rician or Nakagami fading according to the type of fading desired.

Figure 2: LTI channel viewed as tapped delay line filter

Simulation models for modulation and detection over a fading channel is shown in Figure 2. For a flat fading channel, the output of the channel can be expressed simply as the product of time varying channel response and the input signal. Thus, equation (1) can be simplified (refer this article for derivation) as follows for the flat fading channel.

\[\mathbf{r} = h\mathbf{s} + \mathbf{n} \quad\quad (2) \]

Since the channel and noise are modeled as a complex vectors, the detection of \(\mathbf{s}\) from the received signal is an estimation problem in the complex vector space.

Assuming perfect channel knowledge at the receiver and coherent detection, the receiver shown in Figure 3(a) performs matched filtering. The impulse response of the matched filter is matched to the impulse response of the flat-fading channel as \( h^{\ast}\). The output of the matched filter is scaled down by a factor of \(||h||^2\) which is the total-energy contained in the impulse response of the flat-fading channel. The resulting decision vector \(\mathbf{y}\) serves as the sufficient statistic for the estimation of \(\mathbf{s}\) from the received signal \(\mathbf{r}\) (refer equation A.77 in reference [1])

\[\tilde{\mathbf{y}} = \frac{h^{\ast}}{||h||^2} \mathbf{r} = \frac{h^{\ast}}{||h||^2} h\mathbf{s} + \frac{h^{\ast}}{||h||^2} \mathbf{n} = \mathbf{s} + \tilde{\mathbf{w}} \quad\quad (3) \]

Since the absolute value \(|h|\) and the Eucliden norm \(||h||\) are related as \(|h|^2= \left\lVert h\right\rVert = hh^{\ast}\), the model can be simplified further as given in Figure 3(b).

To simulate flat fading, the values for the fading variable \(h\) are drawn from complex normal distribution

\[h= X + jY \quad\quad (4) \]

where, \(X,Y\) are statistically independent real valued normal random variables.

● If \(E[h]=0\), then \(|h|\) is Rayleigh distributed, resulting in a Rayleigh flat fading channel
● If \(E[h] \neq 0\), then \(|h|\) is Rician distributed, resulting in a Rician flat fading channel with the factor \(K=[E[h]]^2/\sigma^2_h\)

References

[1] D. Tse and P. Viswanath, Fundamentals of Wireless Communication, Cambridge University Press, 2005.↗

Books by the author


Wireless Communication Systems in Matlab
Second Edition(PDF)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Python
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Matlab
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart
Hand-picked Best books on Communication Engineering
Best books on Signal Processing

Central Limit Theorem – a demonstration

Central Limit Theorem – What is it ?

The central limit theorem (CLT) is a fundamental concept in statistics and probability theory that explains how the sum of independent and identically distributed random variables behaves. The theorem states that as the number of these variables increases, the distribution of their sum tends to become more like a normal distribution, even if the variables themselves are not normally distributed.

CLT states that the sum of independent and identically distributed (i.i.d) random variables (with finite mean and variance) approaches normal distribution as sample size \(N \rightarrow \infty\). In simpler terms, the theorem states that under certain general conditions, the sum of independent observations that follow same underlying distribution approximates to normal distribution. The approximation steadily improves as the number of observations increase. The underlying distribution of the independent observation can be anything – binomial, Poisson, exponential, Chi-Squared etc.

Why CLT ?

CLT is an important concept in statistics because it allows us to make inferences about a population based on a sample, even if we do not know the distribution of the population. It is used in many statistical techniques, such as hypothesis testing and confidence intervals.

Applications of CLT

Central limit theorem (CLT) is applied in a vast range of applications including (but not limited to) signal processing, channel modeling, random process, population statistics, engineering research, predicting the confidence intervals, hypothesis testing, etc. One such application in signal processing is – deriving the response of a cascaded series of low pass filters by applying the CLT. In the article titled ‘the central limit theorem and low-pass filters‘ the author has illustrated how the response of a cascaded series of low pass filters approaches Gaussian shape as the number of filters in the series increase [1].

In digital communication, the effect of noise on a communication channel is modeled as additive Gaussian white noise. This follows from the fact that the noise from many physical channels can be considered approximately Gaussian. For example, the random movement of electrons in the semiconductor devices gives rise to shot noise whose effect can be approximated to Gaussian distribution by applying central limit theorem.

Law of large numbers and CLT

there is a connection between the central limit theorem and the law of large numbers.

The law of large numbers is another important theorem in probability theory, which states that as the number of independent and identically distributed (iid) random variables increases, the average of those variables converges to the expected value of the distribution. In other words, as the sample size increases, the sample mean becomes more and more representative of the true population mean.

The central limit theorem, on the other hand, describes the distribution of the sum of iid random variables, and shows that as the sample size increases, the distribution of the sum approaches a normal distribution.

Both the law of large numbers and the CLT deal with the behavior of the sum or average of iid random variables as the sample size gets larger. The law of large numbers describes the behavior of the sample mean, while the CLT describes the behavior of the sum of the variables.

In essence, the law of large numbers is a precursor to the central limit theorem, as it establishes the fact that the sample mean becomes more and more representative of the true population mean as the sample size increases, and the central limit theorem shows that the distribution of the sum of iid random variables approaches a normal distribution as the sample size gets larger.

Demonstration using Python

For Matlab code, please refer the following book – Wireless communication systems in Matlab – by Mathuranathan Viswanathan

The following Python code illustrate how the theorem comes to play when the number of observations is increased for two separate experiments: rolling \(N\) unbiased dice and tossing \(N\) unbiased coins. The code generates \(N\) i.i.d discrete uniform random variables that generates uniform random numbers from the set \(\left\{1,k\right\}\). In the case of the dice rolling experiment, \(k\) is set to \(6\), thus simulating the random pick from the sample space \(S=\left\{1,2,3,4,5,6\right\}\) with equal probability. For the coin tossing experiment, \(k\) is set to \(2\), thus simulating the sample space of \(S=\left\{1,2\right\}\) representing head or tail events with equal probability. Rest of the code is self explanatory.

Python code

#---------Central limit theorem - Author: Mathuranathan #gaussianwaves.com -----------------------
#
import numpy as np
import matplotlib.pyplot as plt
#%matplotlib inline

numIterations = np.asarray([1,2,5,10,50,100]); #number of i.i.d RVs
experiment = 'dice' #valid values: 'dice', 'coins'
maxNumForExperiment = {'dice':6,'coins':2} #max numbers represented on dice or coins
nSamp=100000

k = maxNumForExperiment[experiment]

fig, fig_axes = plt.subplots(ncols=3, nrows=2, constrained_layout=True)

for i,N in enumerate(numIterations):
    y = np.random.randint(low=1,high=k+1,size=(N,nSamp)).sum(axis=0)
    row = i//3;col=i%3;
    bins=np.arange(start=min(y),stop=max(y)+2,step=1)
    fig_axes[row,col].hist(y,bins=bins,density=True)
    fig_axes[row,col].set_title('N={} {}'.format(N,experiment))
plt.show()
Figure 1: Demonstrating central limit theorem using N numbers of dice
Figure 2: Demonstrating central limit theorem using N numbers of coins

References

[1] Engelberg, “The central limit theorem and low-pass filters”, Proceedings of the 2004 11th IEEE International Conference on Electronics, Circuits and Systems, 13-15 Dec. 2004, pp. 65-68.↗

Similar topics:

Random Variables - Simulating Probabilistic Systems
● Introduction
Plotting the estimated PDF
● Univariate random variables
 □ Uniform random variable
 □ Bernoulli random variable
 □ Binomial random variable
 □ Exponential random variable
 □ Poisson process
 □ Gaussian random variable
 □ Chi-squared random variable
 □ Non-central Chi-Squared random variable
 □ Chi distributed random variable
 □ Rayleigh random variable
 □ Ricean random variable
 □ Nakagami-m distributed random variable
Central limit theorem - a demonstration
● Generating correlated random variables
 □ Generating two sequences of correlated random variables
 □ Generating multiple sequences of correlated random variables using Cholesky decomposition
Generating correlated Gaussian sequences
 □ Spectral factorization method
 □ Auto-Regressive (AR) model

Books by the author


Wireless Communication Systems in Matlab
Second Edition(PDF)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Python
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Matlab
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart
Hand-picked Best books on Communication Engineering
Best books on Signal Processing

Maximum Likelihood estimation

Keywords: maximum likelihood estimation, statistical method, probability distribution, MLE, models, practical applications, finance, economics, natural sciences.

Introduction

Maximum Likelihood Estimation (MLE) is a statistical method used to estimate the parameters of a probability distribution by finding the set of values that maximize the likelihood function of the observed data. In other words, MLE is a method of finding the most likely values of the unknown parameters that would have generated the observed data.

The likelihood function is a function that describes the probability of observing the data given the parameters of the probability distribution. The MLE method seeks to find the set of parameter values that maximizes this likelihood function.

For example, suppose we have a set of data that we believe to be normally distributed, but we do not know the mean or variance of the distribution. We can use MLE to estimate these parameters by finding the mean and variance that maximize the likelihood function of the observed data.

The MLE method is widely used in statistical inference, hypothesis testing, and model fitting in many areas, including economics, finance, engineering, and the natural sciences. MLE is a powerful and flexible method that can be applied to a wide range of statistical models, making it a valuable tool in data analysis and modeling.

Difference between MLE and MLD

Maximum likelihood estimation (MLE) and maximum likelihood decoding (MLD) are two different concepts used in different contexts.

Maximum likelihood estimation is a statistical method used to estimate the parameters of a probability distribution based on a set of observed data. The goal is to find the set of parameter values that maximize the likelihood function of the observed data. MLE is commonly used in statistical inference, hypothesis testing, and model fitting.

On the other hand, maximum likelihood decoding (MLD) is a method used in digital communications and signal processing to decode a received signal that has been transmitted through a noisy channel. The goal is to find the transmitted message that is most likely to have produced the received signal, based on a given probabilistic model of the channel.

In maximum likelihood decoding, the receiver calculates the likelihood of each possible transmitted message, given the received signal and the channel model. The maximum likelihood decoder then selects the transmitted message that has the highest likelihood as the decoded message.

While both MLE and MLD involve the concept of maximum likelihood, they are used in different contexts. MLE is used in statistical estimation, while MLD is used in digital communications and signal processing for decoding.

MLE applied to communication systems

Maximum Likelihood estimation (MLE) is an important tool in determining the actual probabilities of the assumed model of communication.

In reality, a communication channel can be quite complex and a model becomes necessary to simplify calculations at decoder side.The model should closely approximate the complex communication channel. There exist a myriad of standard statistical models that can be employed for this task; Gaussian, Binomial, Exponential, Geometric, Poisson,etc., A standard communication model is chosen based on empirical data.

Each model mentioned above has unique parameters that characterizes them. Determination of these parameters for the chosen model is necessary to make them closely model the communication channel at hand.

Suppose a binomial model is chosen (based on observation of data) for the error events over a particular channel, it is essential to determine the probability of succcess (\(p\)) of the binomial model.

If a Gaussian model (normal distribution!!!) is chosen for a particular channel then estimating mean (\(\mu\)) and variance (\(\sigma^{2}\)) are necessary so that they can be applied while computing the conditional probability of p(y received | x sent)

Similarly estimating the mean number of events within a given interval of time or space (\(\lambda\)) is a necessity for a Poisson distribution model.

Maximum likelihood estimation is a method to determine these unknown parameters associated with the corresponding chosen models of the communication channel.

Python code example for MLE

The following program is an implementation of maximum likelihood estimation (MLE) for the binary symmetric channel (BSC) using the binomial probability mass function (PMF).

The goal of MLE is to estimate the value of an unknown parameter (in this case, the error probability \(p\)) based on observed data. The BSC is a simple channel model where each transmitted bit is flipped (with probability \(p\)) independently of other bits during transmission. The goal of the following program is to estimate the error probability \(p\) of the BSC based on a given binary data sequence.

import numpy as np
from scipy.optimize import minimize
from scipy.special import binom
import matplotlib.pyplot as plt

def BSC_MLE(data):
    """
    Maximum likelihood estimation (MLE) for the Binary Symmetric Channel (BSC).
    This function estimates the error probability p of the BSC based on the observed data.
    """
    
    # Define the binomial probability mass function
    def binom_PMF(p):
        n = len(data)
        k = np.sum(data)
        p = np.clip(p, 1e-10, 1 - 1e-10)  # Regularization to avoid problems due to small estimation errors
        logprob = np.log(binom(n, k)) + k*np.log(p) + (n-k)*np.log(1-p)
        return -logprob
    
    # Use the minimize function from scipy.optimize to find the value of p that maximizes the binomial PMF
    #x0 argument specifies the initial guess for the value of p that maximizes the binomial PMF. For BSC x0=0.5
    #BFGS is Broyden-Fletcher-Goldfarb-Shanno optimization algorithm used for unconstrained nonlinear optimization
    res = minimize(lambda p: binom_PMF(p), x0=0.5, method='BFGS')
    p_est = res.x[0]

    # Plot the observed data as a histogram
    plt.hist(data, bins=2, density=True, alpha=0.5)
    plt.axvline(p_est, color='r', linestyle='--')
    plt.xlabel('Bit value')
    plt.ylabel('Frequency')
    plt.title('Observed data')
    plt.show()
    
    return p_est

data = np.random.randint(2, size=1000)
p_est = BSC_MLE(data)
print('Estimated error probability: {:.4f}'.format(p_est))

The program first defines a function called BSC_MLE that takes a binary data sequence as input and returns the estimated error probability p_est. The BSC_MLE function defines the binomial PMF, which represents the probability of observing a certain number of errors (i.e., bit flips) in the data sequence given a specific error probability p. The binomial PMF is then maximized using the minimize function from the scipy.optimize module to find the value of p that maximizes the likelihood of observing the data.

The program then generates a random binary data sequence of length 100 using the np.random.randint() function and calls the BSC_MLE function to estimate the error probability based on the observed data. Finally, the program prints the estimated error probability. Try increasing the sequence length to 1000 and observe the estimated error probability.

Figure 1: Maximum Likelihood Estimation (MLE) : Plotting the observed data as a histogram

Reference :

[1] – Maximum Likelihood Estimation – a detailed explanation by S.Purcell

Books by the author


Wireless Communication Systems in Matlab
Second Edition(PDF)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Python
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Matlab
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart
Hand-picked Best books on Communication Engineering
Best books on Signal Processing

Related Topics:

[1]An Introduction to Estimation Theory
[2]Bias of an Estimator
[3]Minimum Variance Unbiased Estimators (MVUE)
[4]Maximum Likelihood Estimation
[5]Maximum Likelihood Decoding
[6]Probability and Random Process
[7]Likelihood Function and Maximum Likelihood Estimation (MLE)
[8]Score, Fisher Information and Estimator Sensitivity
[9]Introduction to Cramer Rao Lower Bound (CRLB)
[10]Cramer Rao Lower Bound for Scalar Parameter Estimation
[11]Applying Cramer Rao Lower Bound (CRLB) to find a Minimum Variance Unbiased Estimator (MVUE)
[12]Efficient Estimators and CRLB
[13]Cramer Rao Lower Bound for Phase Estimation
[14]Normalized CRLB - an alternate form of CRLB and its relation to estimator sensitivity
[15]Cramer Rao Lower Bound (CRLB) for Vector Parameter Estimation
[16]The Mean Square Error – Why do we use it for estimation problems
[17]How to estimate unknown parameters using Ordinary Least Squares (OLS)
[18]Essential Preliminary Matrix Algebra for Signal Processing
[19]Why Cholesky Decomposition ? A sample case:
[20]Tests for Positive Definiteness of a Matrix
[21]Solving a Triangular Matrix using Forward & Backward Substitution
[22]Cholesky Factorization - Matlab and Python
[23]LTI system models for random signals – AR, MA and ARMA models
[24]Comparing AR and ARMA model - minimization of squared error
[25]Yule Walker Estimation
[26]AutoCorrelation (Correlogram) and persistence – Time series analysis
[27]Linear Models - Least Squares Estimator (LSE)
[28]Best Linear Unbiased Estimator (BLUE)

Maximum Likelihood Decoding

Keywords: maximum likelihood decoding, digital communication, data storage, noise, interference, wireless communication systems, optical communication systems, digital storage systems, probability, likelihood estimation, python

Introduction

Maximum likelihood decoding is a technique used to determine the most likely transmitted message in a digital communication system, based on the received signal and statistical models of noise and interference. The method uses maximum likelihood estimation to calculate the probability of each possible transmitted message and then selects the one with the highest probability.

To perform maximum likelihood decoding, the receiver uses a set of pre-defined models to estimate the likelihood of each possible transmitted message based on the received signal. The method is commonly used in various digital communication and data storage systems, such as wireless communication and digital storage. However, it can be complex and time-consuming, particularly in systems with large message spaces or complex noise and interference models.

Maximum Likelihood Decoding:

Consider a set of possible codewords (valid codewords – set \(Y\)) generated by an encoder in the transmitter side. We pick one codeword out of this set ( call it \(y\) ) and transmit it via a Binary Symmetric Channel (BSC) with probability of error \(p\) ( To know what is a BSC – click here ). At the receiver side we receive the distorted version of \(y\) ( call this erroneous codeword \(x\)).

Maximum Likelihood Decoding chooses one codeword from \(Y\) (the list of all possible codewords) which maximizes the following probability.

\[\mathbb{P}(y\;sent\mid x\;received )\]

Meaning that the receiver computes \(P(y_1,x) , P(y_2,x) , P(y_3,x),\cdots,P(y_n,x)\). and chooses a codeword (\(y\)) which gives the maximum probability.  In practice we don’t know \(y\) (at the receiver) but we know \(x\). So how to compute the probability ? Maximum Likelihood Estimation (MLE) comes to our rescue. For a detailed explanation on MLE – refer here[1] The aim of maximum likelihood estimation is to find the parameter value(s) that makes the observed data most likely. Understanding the difference between prediction and estimation is important at this point.   Estimation differs from prediction in the following way … In estimation problems, likelihood of the parameters is estimated based on given data/observation vector. In prediction problems, probability is used as a measure to predict the outcome from known parameters of a model.

Examples for “Prediction” and “Estimation” :

1) Probability of getting a “Head” in a single toss of a fair coin is \(0.5\). The coin is tossed 100 times in a row.Prediction helps in predicting the outcome ( head or tail ) of the \(101^{th}\) toss based on the probability.

2) A coin is tossed 100 times and the data ( head or tail information) is recorded. Assuming the event follows Binomial distribution model, estimation helps in determining the probability of the event. The actual probability may or may not be \(0.5\).   Maximum Likelihood Estimation estimates the conditional probability based on the observed data ( received data – \(x\)) and an assumed model.

Example of Maximum Likelihood Decoding:

Let \(y=11001001\) and \(x=10011001\) . Assuming Binomial distribution model for the event with probability of error \(0.1\) (i.e the reliability of the BSC is \(1-p = 0.9\)), the Hamming distance between codewords is \(2\) . For binomial model,

\[\mathbb{P}(y\;received\mid x\;sent ) = (1-p)^{n-d}.p^{d}\]

where \(d\) =the hamming distance between the received and the sent codewords n= number of bit sent
\(p\)= error probability of the BSC.
\(1-p\) = reliability of BSC

Substituting \(d=2, n=8\) and \(p=0.1\) , then \(P(y\;received \mid x\;sent) = 0.005314\).

Note : Here, Hamming distance is used to compute the probability. So the decoding can be called as “minimum distance decoding” (which minimizes the Hamming distance) or “maximum likelihood decoding”. Euclidean distance may also be used to compute the conditional probability.

As mentioned earlier, in practice \(y\) is not known at the receiver. Lets see how to estimate \(P(y \;received \mid x\; sent)\) when \(y\) is unknown based on the binomial model.

Since the receiver is unaware of the particular \(y\) corresponding to the \(x\) received, the receiver computes \(P(y\; received \mid x\; sent)\) for each codeword in \(Y\). The \(y\) which gives the maximum probability is concluded as the codeword that was sent.

Python code implementing Maximum Likelihood Decoding:

The following program for demonstrating the maximum likelihood decoding, involves generating a noisy signal from a transmitted message and then using maximum likelihood decoding to estimate the transmitted message from the noisy signal.

  1. The maximum_likelihood_decoding function takes three arguments: received_signal, noise_variance, and message_space.
  2. The calculate_probabilities function is called to calculate the probability of each possible message given the received signal, using the known noise variance.
  3. The probabilities are normalized so that they sum to 1.
  4. The maximum_likelihood_decoding function finds the index of the most likely message (i.e., the message with the highest probability).
  5. The function returns the most likely message.
  6. An example usage is demonstrated where a binary message space is defined ([0, 1]), along with a noise variance and a transmitted message.
  7. The transmitted message is added to noise to generate a noisy received signal.
  8. The maximum_likelihood_decoding function is called to decode the noisy signal.
  9. The transmitted message, received signal, and decoded message are printed to the console for evaluation.
import numpy as np
import matplotlib.pyplot as plt

# Define a function to calculate the probability of each possible message given the received signal
def calculate_probabilities(received_signal, noise_variance, message_space):
    probabilities = np.zeros(len(message_space))

    for i, message in enumerate(message_space):
        error = received_signal - message
        probabilities[i] = np.exp(-np.sum(error ** 2) / (2 * noise_variance))

    return probabilities / np.sum(probabilities)

# Define a function to perform maximum likelihood decoding
def maximum_likelihood_decoding(received_signal, noise_variance, message_space):
    probabilities = calculate_probabilities(received_signal, noise_variance, message_space)
    most_likely_message_index = np.argmax(probabilities)
    return message_space[most_likely_message_index]

# Example usage
message_space = np.array([0, 1])
noise_variance = 0.4
transmitted_message = 1
received_signal = transmitted_message + np.sqrt(noise_variance) * np.random.randn()
decoded_message = maximum_likelihood_decoding(received_signal, noise_variance, message_space)

print('Transmitted message:', transmitted_message)
print('Received signal:', received_signal)
print('Decoded message:', decoded_message)

# Plot probability distribution
probabilities = calculate_probabilities(received_signal, noise_variance, message_space)
plt.bar(message_space, probabilities)
plt.title('Probability Distribution for Received Signal = {}'.format(received_signal))
plt.xlabel('Transmitted Message')
plt.ylabel('Probability')
plt.ylim([0, 1])
plt.show()

The probability of the received signal given a specific transmitted message is calculated as follows:

  1. Compute the difference between the received signal and the transmitted message.
  2. Compute the sum of squares of this difference vector.
  3. Divide this sum by twice the known noise variance.
  4. Take the negative exponential of this value.

This results in a probability density function (PDF) for the received signal given the transmitted message, assuming that the noise is Gaussian and zero-mean.

The probabilities for each possible transmitted message are then normalized so that they sum to 1. This is done by dividing each individual probability by the sum of all probabilities.

The maximum_likelihood_decoding function determines the most likely transmitted message by selecting the message with the highest probability, which corresponds to the maximum likelihood estimate of the transmitted message given the received signal and the statistical model of the noise.

Sample outputs

Transmitted message: 1
Received signal: 0.21798306949364643
Decoded message: 0

Transmitted message: 1
Received signal: -0.5115453787966966
Decoded message: 0

Transmitted message: 1
Received signal: 0.8343088336355061
Decoded message: 1

Transmitted message: 1
Received signal: -0.5479891887158619
Decoded message: 0

The probability distribution for the last sample output is shown below

Figure: Probability distribution for a sample run of the code

Reference :

[1] – Maximum Likelihood Estimation – a detailed explanation by S.Purcell

Books by the author


Wireless Communication Systems in Matlab
Second Edition(PDF)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Python
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Matlab
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart
Hand-picked Best books on Communication Engineering
Best books on Signal Processing

Related Topics:

[1]An Introduction to Estimation Theory
[2]Bias of an Estimator
[3]Minimum Variance Unbiased Estimators (MVUE)
[4]Maximum Likelihood Estimation
[5]Maximum Likelihood Decoding
[6]Probability and Random Process
[7]Likelihood Function and Maximum Likelihood Estimation (MLE)
[8]Score, Fisher Information and Estimator Sensitivity
[9]Introduction to Cramer Rao Lower Bound (CRLB)
[10]Cramer Rao Lower Bound for Scalar Parameter Estimation
[11]Applying Cramer Rao Lower Bound (CRLB) to find a Minimum Variance Unbiased Estimator (MVUE)
[12]Efficient Estimators and CRLB
[13]Cramer Rao Lower Bound for Phase Estimation
[14]Normalized CRLB - an alternate form of CRLB and its relation to estimator sensitivity
[15]Cramer Rao Lower Bound (CRLB) for Vector Parameter Estimation
[16]The Mean Square Error – Why do we use it for estimation problems
[17]How to estimate unknown parameters using Ordinary Least Squares (OLS)
[18]Essential Preliminary Matrix Algebra for Signal Processing
[19]Why Cholesky Decomposition ? A sample case:
[20]Tests for Positive Definiteness of a Matrix
[21]Solving a Triangular Matrix using Forward & Backward Substitution
[22]Cholesky Factorization - Matlab and Python
[23]LTI system models for random signals – AR, MA and ARMA models
[24]Comparing AR and ARMA model - minimization of squared error
[25]Yule Walker Estimation
[26]AutoCorrelation (Correlogram) and persistence – Time series analysis
[27]Linear Models - Least Squares Estimator (LSE)
[28]Best Linear Unbiased Estimator (BLUE)