Window function – figure of merits

Key focus: Window function smooths the observed signal over the edges. Analysis of some important parameters to help select the window for an application.

Spectral leakage

As we know, the DFT operation can be viewed as processing a signal through a set of filter banks with bandwidth Δf centered on the bin (frequency) of interest (Figure 1). To accurately resolve the frequency component in each bin, we desire the resolution bandwidth Δf to be as small as possible. In effect, each filter filter bank is viewed as filter having a brick-wall response with narrow bandwidth. If the signal under study contains several spectral components and broadband noise, the computed amplitude of the DFT output in each bin (filter bank) is significantly affected by the noise contained in the corresponding bandwidth of the filter bank.

DFT (Discrete Fourier Transform) viewed as processing through a set of filter banks
Figure 1: DFT (Discrete Fourier Transform) viewed as processing through a set of filter banks

In reality, signals are of time-limited nature and nothing can be known about the signal beyond the measured interval. When the time-limited signal slice is subjected to analysis using FFT algorithm (used for computing DFT), the FFT implicitly assumes that the signal essentially repeats itself after the observed interval. This may lead to discontinuities at the edges of each slice, that causes the energy contained in each frequency bin to spill into other bins. This phenomenon is called spectral leakage.

Equivalent Noise Bandwidth:[2]

To suppress the spectral leakage, the signals are multiplied with a window function so as to smooth the discontinuity at the edges of the FFT slices. As a result, the choice of window function affects the amount of signal and noise that goes inside each filter bank. Hence the amount of noise that gets accumulated into each filter bank is affected by the bandwidth of the chosen window function. In effect, the amplitude estimate in each frequency bin is influenced by the amount of accumulated noise contributed by the window function of choice. In other words, the noise floor displayed at the FFT output, varies according to the window of choice.

Equivalent noise bandwidth (ENBW), often specified in terms of FFT bins, is the bandwidth of a fictitious brick-wall filter such that the amount of noise power accumulated inside the brick-wall filter is same as the noise power accumulated when processing white noise through the selected window.

Figure 2: Illustrating equivalent noise bandwidth (ENBW) (figure not to scale)

In discrete domain, the equivalent noise bandwidth Benbw can be computed using time domain samples as

\[B_{enbw} = L \frac{\sum_{n}|w[n]|^2}{ \left| \sum_{n} w[n] \right|^2}\]

where L is the length of the window.

ENBW is a commonly used metric to characterize the variation in the displayed noise floor if a non rectangular window is used. If noise floor measurements are desired, the ENBW correction factor need to be applied to account for the variation in the noise floor [3]. Table 1 below, lists the ENBW (bins) and ENBW correction factor for some well known window functions.

Following figure illustrates the equivalent noise bandwidth illustrated for Hann window.

Figure 3: Equivalent noise bandwidth (ENBW) illustrated for Hann window

Read more on Equivalent noise bandwidth…

Coherent power gain:

In time-domain, windowing a signal reduces the amplitude of the signal at its boundaries (Figure 4). This causes a reduction in the amplitude of the spectral component when FFT used to visualize the signal in frequency domain. This reduction in amplitude in the spectral components is characterized as coherent power gain (which actually is a loss). So, when FFT is computed on a windowed signal, in order to compensate for the loss in the amplitude, we simply add the coherent power gain to the FFT output.

Figure 4: Windowing a 10 Hz sinusoidal signal with Hann window of length L = 151 (sampling frequency Fs=160 Hz)

Coherent power gain of a window w[n] of length L is given by

\[\text{Coherent power gain (dB)} = 20 \; log_{10} \left( \frac{\sum_n w[n]}{L} \right)\]

Following plot depicts the coherent power gain (i.e, the reduction in the FFT magnitude when the input signal is processed with a window) of a windowed sinusoidal signal of frequency 10 Hz. The length of the window is L = 151 points and the simulation assumes an oversampling factor of 16 (i.e, Fs=160 Hz). The FFT length is NFFT =2048.

Table 1 below, lists the coherent power gain for some well known window functions.

Figure 5: Coherent power gain – illustrated for various window functions

Scalloping loss

As discussed above, the FFT (equivalently the DFT) operation can be seen as processing a signal through a set of filter banks with bandwidth Δf centered on the successive bin frequencies. The frequency resolution (Δf) of FFT depends on the size of FFT (NFFT) and the sampling frequency Fs.

\[\Delta f = \frac{F_s}{N_{FFT}}\]

Therefore, the FFT process can measure amplitude of tones that are multiples of the frequency resolution Δf. That is, if the frequency of the observed signal matches a particular bin frequency, the signal amplitude will be maximum at that bin.

However, if the tone of the signal falls between two bins, the signal power is spread between the adjacent bins and hence the displayed signal amplitude/power is reduced.

Scalloping loss quantifies the worst case reduction in the signal level, when the tone of the observed signal falls exactly mid way between two bin frequencies. The worst case scalloping loss is calculated as

\[\text{Scalloping loss} = \frac{\sum_n w[n] e^{\left( -j \frac{\pi}{N_{FFT}}n \right)}}{\sum_n w[n]}\]

In reality, scalloping loss depends on the type of window and the relationship between the signal tones to the bin frequencies. Table 1 below, lists the scalloping loss for some well known window functions.

Figure 6: Scalloping loss illustrated for processing a signal with Boxcar and Bartlett window

Figure of merits for window functions

Table 1: Figure of merits for various window functions

List of window functions

Boxcar (Rectangular)

\[w[n] = \begin{cases} 1, \quad 0 \leq n \leq L-1 \\ 0, \quad \text{otherwise} \end{cases}\]
Figure 1: Boxcarr (rectangular) window (L=51)

Barthann

\[ w[n] = 0.62 – 0.48 \left| \frac{n}{L – 1} – 0.5 \right| + 0.38 cos \left(2 \pi \left| \frac{n}{L – 1} – 0.5 \right| \right), \quad 0 \leq n \leq L-1\]
Figure 2: Barthann window (L=51)

Bartlett

\[w[n] = \frac{2}{L-1} \left(\frac{L-1}{2} – \left|n – \frac{L-1}{2}\right| \right),\quad 0 \leq n \leq L-1\]
Figure 3: Bartlett window (L=51)

Blackman

\[w[n] = 0.42 – 0.5 \cos(2\pi n/L) + 0.08 \cos(4\pi n/L), \quad 0 \leq n \leq L-1\]
Figure 4: Blackman window (L=51)

Blackman-Harris

\[w[n]=a_0 – a_1 \cos \left ( \frac{2 \pi n}{L} \right)+ a_2 \cos \left ( \frac{4 \pi n}{L} \right)- a_3 \cos \left ( \frac{6 \pi n}{L} \right), \quad 0 \leq n \leq L-1\] \[a_0=0.35875;\quad a_1=0.48829;\quad a_2=0.14128;\quad a_3=0.01168\]
Figure 5: Blackman-Harris window (L=51)

Bohman

\[w[n] = \left( 1 – |x|\right) cos \left( \pi |x| \right) + \frac{1}{\pi} sin \left( \pi |x| \right) , \quad -1 \leq x \leq 1 \]
Figure 6: Bohman window (L=51)

Cosine

\[w[n] = sin \left( \frac{\pi}{ L } \left(n + 0.5 \right) \right),\quad 0 \leq n \leq L-1\]
Figure 7: Cosine window (L=51)

Flattop

\[w[n]=a_0 – a_1 \cos \left ( \frac{2 \pi n}{L} \right)+ a_2 \cos \left ( \frac{4 \pi n}{L} \right)- a_3 \cos \left ( \frac{6 \pi n}{L} \right), \quad 0 \leq n \leq L-1 \\ a_0=0.21557895;\; a_1=0.41663158 \\ a_2=0.277263158;\; a_3=0.006947368 \]
Figure 8: Flattop window (L=51)

Hamming

\[w[n] = 0.54 – 0.46 \cos\left(\frac{2\pi{n}}{L-1}\right), \quad 0 \leq n \leq L-1\]
Figure 9: Hamming window (L=51)

Hann

\[w[n] = 0.5 – 0.5 \cos\left(\frac{2\pi{n}}{L-1}\right), \quad 0 \leq n \leq L-1\]
Figure 10: Hann window (L=51)

Nuttall

\[ w[n]=a_0 – a_1 \cos \left ( \frac{2 \pi n}{L} \right)+ a_2 \cos \left ( \frac{4 \pi n}{L} \right)- a_3 \cos \left ( \frac{6 \pi n}{L} \right), \quad 0 \leq n \leq L-1 \\ a_0=0.3635819;\; a_1=0.4891775;\\ a_2=0.1365995;\; a_3=0.0106411\]
Figure 11: Nuttall window (L=51)

Parzen

\[w[n] = \begin{cases} 1- 6 \left( \frac{|n|}{L/2} \right)^2 + 6 \left( \frac{|n|}{L/2} \right)^3, & 0 \leq |n| \leq (L-1)/4 \\ 2 \left( 1 – \frac{|n|}{L/2} \right)^3, & (L-1)/4 < |n| \leq (L-1)/2 \end{cases}\]
Figure 12: Parzen window (L=51)

Triangular

\[w[n] = 1 – \frac{|n|}{L/2}, \quad n = -L/2, \cdots, -1, 0, 1, \cdots, L/2\]
Figure 13: Triangular window (L=51)

Rate this article: Note: There is a rating embedded within this post, please visit this post to rate it.

References:

[1] Harris, “On the Use of Windows for Harmonic Analysis with the Discrete Fourier Transform”, proceedings of IEEE, vol. 66, no. 1, January 1978.↗
[2] Stefan Scholl, “Exact Signal Measurements using FFT Analysis”,Microelectronic Systems Design Research Group, TU Kaiserslautern, Germany.↗

Similar articles

[1] Understanding Fourier Series
[2] Introduction to digital filter design
[3] Design FIR filter to reject unwanted frequencies
[4] FIR or IIR ? Understand the design perspective
[5] How to Interpret FFT results – complex DFT, frequency bins and FFTShift
[6] How to interpret FFT results – obtaining magnitude and phase information
[7] Analytic signal, Hilbert Transform and FFT
[8] FFT and spectral leakage
[9] Moving average filter in Python and Matlab
[10] Window functions – figure of merits
[11] Equivalent noise bandwidth (ENBW) of window functions

Additional Resources:

[1] If you are particularly interested in calculating the Equivalent Noise Bandwidth of Analog Filters refer – Thomas J. Karras,”Equivalent Noise Bandwidth Analysis from Transfer Functions”,NASA Technical Note,NASA TN D-2842.
[2] Strategies for Choosing Windows : Michael Cerna and Audrey F. Harvey,”The Fundamentals of FFT-Based Signal Analysis and Measurement”,National Instruments Application Notes 041.

Books by the author


Wireless Communication Systems in Matlab
Second Edition(PDF)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Python
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Matlab
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart
Hand-picked Best books on Communication Engineering
Best books on Signal Processing

Equivalent noise bandwidth (ENBW) of window functions

Key focus: Equivalent noise bandwidth (ENBW), is the bandwidth of a fictitious brick-wall filter that allows same amount of noise as a window function. Learn how to calculate ENBW in applications involving window functions and FFT operation.

FFT and spectral leakage

As we know, the DFT operation can be viewed as processing a signal through a set of filter banks with bandwidth Δf centered on the bin (frequency) of interest (Figure 1). To accurately resolve the frequency component in each bin, we desire the resolution bandwidth Δf to be as small as possible. In effect, each filter filter bank is viewed as filter having a brick-wall response with narrow bandwidth. If the signal under study contains several spectral components and broadband noise, the computed amplitude of the DFT output in each bin (filter bank) is significantly affected by the noise contained in the corresponding bandwidth of the filter bank.

Figure 1: DFT (Discrete Fourier Transform) viewed as processing through a set of filter banks

In reality, signals are of time-limited nature and nothing can be known about the signal beyond the measured interval. When the time-limited signal slice is subjected to analysis using FFT algorithm (used for computing DFT), the FFT implicitly assumes that the signal essentially repeats itself after the observed interval. This may lead to discontinuities at the edges of each slice, that causes the energy contained in each frequency bin to spill into other bins. This phenomenon is called spectral leakage.

Here is an article with illustrations on – FFT slices and the phenomenon of spectral leakage,

Hence, to suppress the spectral leakage, the signals are multiplied with a window function so as to smooth the discontinuity at the edges of the FFT slices. As a result, the choice of window function affects the amount of signal and noise that goes inside each filter bank. Hence the amount of noise that gets accumulated into each filter bank is affected by the bandwidth of the chosen window function. In effect, the amplitude estimate in each frequency bin is influenced by the amount of accumulated noise contributed by the window function of choice. In other words, the noise floor displayed at the FFT output, varies according to the window of choice.

Equivalent noise bandwidth

Equivalent noise bandwidth (ENBW), often specified in terms of FFT bins, is the bandwidth of a fictitious brick-wall filter such that the amount of noise power accumulated inside the brick-wall filter is same as the noise power accumulated when processing white noise through the chosen window.

In other words, in frequency domain, the power accumulated in the chosen window is given by

\[P_{window} = \int_{-f}^{f} |W(f)|^2 df\]

Then we wish to find the equivalent noise bandwidth Benbw, such that

\[P_{brick-wall} = B_{enbw} |W(f_0)|^2 = P_{window}\]

where f0 is the frequency at which the power peaks and Pbrick-wall is the power accumulated in the fictitious rectangular window of bandwidth Benbw.

Figure 2: Illustrating equivalent noise bandwidth (ENBW) (figure not to scale)

Therefore, the equivalent noise bandwidth Benbw is given by

\[B_{enbw} = \frac{\int_{-f}^{f} |W(f)|^2 df}{|W(f_0)|^2}\]

Translating to discrete domain, the equivalent noise bandwidth can be computed using DFT samples as

\[B_{enbw} =\frac{\sum_{k=0}^{N-1}|W[k]|^2}{|W[k_0]|^2}\]

where, k0 is the index at which the magnitude of FFT output is maximum and N is the window length. Applying Parseval’s theorem and with a window of length L, Benbw can also be computed using time domain samples as

\[B_{enbw} = L \frac{\sum_{n}|w[n]|^2}{ \left| \sum_{n} w[n] \right|^2}\]

ENBW is a commonly used metric to characterize the variation in the displayed noise floor if a non rectangular window is used. If noise floor measurements are desired, the ENBW correction factor need to be applied to account for the variation in the noise floor [1].

Following figure illustrates the equivalent noise bandwidth illustrated for Hann window.

Figure 3: Equivalent noise bandwidth (ENBW) illustrated for Hann window

Python script

ENBW can be calculated from the time domain samples in a straightforward manner. Following Python 3.0 script calculates the ENBW for some well-known window functions provided as part of Scipy module↗.

#Author : Mathuranathan Viswanathan for gaussianwaves.com
#Date: 13 Sept 2020
#Script to calculate equivalent noise bandwidth (ENBW) for some well known window functions

import numpy as np
import pandas as pd
from scipy import signal

def equivalent_noise_bandwidth(window):
    #Returns the Equivalent Noise BandWidth (ENBW)
    return len(window) * np.sum(window**2) / np.sum(window)**2

def get_enbw_windows():
    #Return ENBW for all the following windows as a dataframe
    window_names = ['boxcar','barthann','bartlett','blackman','blackmanharris','bohman','cosine','exponential','flattop','hamming','hann','nuttall','parzen','triang']
    
    df = pd.DataFrame(columns=['Window','ENBW (bins)','ENBW correction (dB)'])
    for window_name in window_names:
        method_name = window_name
        func_to_run = getattr(signal, method_name) #map window names to window functions in scipy package
        L = 16384 #Number of points in the output window
        window = func_to_run(L) #call the functions
        
        enbw = equivalent_noise_bandwidth(window) #compute ENBW
        
        df = df.append({'Window': window_name.title(),'ENBW (bins)':round(enbw,3),'ENBW correction (dB)': round(10*np.log10(enbw),3)},ignore_index=True)
                       
    return df

df = get_enbw_windows() #call the function
df.head(14) #display dataframe

The resulting output is displayed in the following table (dataframe)

Table 1: Table of equivalent noise bandwidth for some well known window functions

As an example, the following plot depicts the difference in the noise floor of FFT output of a noise added sine wave that is processed through Boxcar and Flattop window.

Figure 4: Noise floor and ENBW for flattop & boxcar window (FFT output) for noise added 10 Hz sinewave (oversampling factor = 16, Fs = 160 Hz, window length L =2048, SNR = 30 dB)

Continue reading on Window function and figure of merits…

Rate this article: Note: There is a rating embedded within this post, please visit this post to rate it.

Reference

[1] Stefan Scholl, “Exact Signal Measurements using FFT Analysis”,Microelectronic Systems Design Research Group, TU Kaiserslautern, Germany.↗

Similar articles

[1] Understanding Fourier Series
[2] Introduction to digital filter design
[3] Design FIR filter to reject unwanted frequencies
[4] FIR or IIR ? Understand the design perspective
[5] How to Interpret FFT results – complex DFT, frequency bins and FFTShift
[6] How to interpret FFT results – obtaining magnitude and phase information
[7] Analytic signal, Hilbert Transform and FFT
[8] FFT and spectral leakage
[9] Moving average filter in Python and Matlab
[10] Window function – figure of merits
[11] Equivalent noise bandwidth of window functions

Books by the author


Wireless Communication Systems in Matlab
Second Edition(PDF)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Python
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Matlab
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart
Hand-picked Best books on Communication Engineering
Best books on Signal Processing

BPSK bit error rate simulation in Python & Matlab

Key focus: Simulate bit error rate performance of Binary Phase Shift Keying (BPSK) modulation over AWGN channel using complex baseband equivalent model in Python & Matlab.

Why complex baseband equivalent model

The passband model and equivalent baseband model are fundamental models for simulating a communication system. In the passband model, also called as waveform simulation model, the transmitted signal, channel noise and the received signal are all represented by samples of waveforms. Since every detail of the RF carrier gets simulated, it consumes more memory and time.

In the case of discrete-time equivalent baseband model, only the value of a symbol at the symbol-sampling time instant is considered. Therefore, it consumes less memory and yields results in a very short span of time when compared to the passband models. Such models operate near zero frequency, suppressing the RF carrier and hence the number of samples required for simulation is greatly reduced. They are more suitable for performance analysis simulations. If the behavior of the system is well understood, the model can be simplified further.

Passband model and converting it to equivalent complex baseband model is discussed in this article.

Simulation of bit error rate performance of BPSK using passband simulation model is discussed in this article.

Figure 1: BPSK constellation

BPSK constellation

In binary phase shift keying, all the information gets encoded in the phase of the carrier signal. The BPSK modulator accepts a series of information symbols drawn from the set m {0,1}, modulates them and transmits the modulated symbols over a channel.

The general expression for generating a M-PSK signal set is given by

Here, M denotes the modulation order and it defines the number of constellation points in the reference constellation. The value of M depends on the parameter k – the number of bits we wish to squeeze in a single M-PSK symbol. For example if we wish to squeeze in 3 bits (k=3) in one transmit symbol, then M = 2k = 23 = 8 and this results in 8-PSK configuration. M=2 gives BPSK (Binary Phase Shift Keying) configuration. The parameter A is the amplitude scaling factor, fc is the carrier frequency and g(t) is the pulse shape that satisfies orthonormal properties of basis functions.

Using trigonometric identity, equation (1) can be separated into cosine and sine basis functions as follows

Therefore, the signaling set {si,sq} or the constellation points for M-PSK modulation is given by,

For BPSK (M=2), the constellation points on the I-Q plane (Figure 1) are given by

Simulation methodology

Note: If you are interested in knowing more about BPSK modulation and demodulation, kindly visit this article.

In this simulation methodology, there is no need to simulate each and every sample of the BPSK waveform as per equation (1). Only the value of a symbol at the symbol-sampling time instant is considered. The steps for simulation of performance of BPSK over AWGN channel is as follows (Figure 2)

  1. Generate a sequence of random bits of ones and zeros of certain length (Nsym typically set in the order of 10000)
  2. Using the constellation points, map the bits to modulated symbols (For example, bit ‘0’ is mapped to amplitude value A, and bit ‘1’ is mapped to amplitude value -A)
  3. Compute the total power in the sequence of modulated symbols and add noise for the given EbN0 (SNR) value (read this article on how to do this). The noise added symbols are the received symbols at the receiver.
  4. Use thresholding technique, to detect the bits in the receiver. Based on the constellation diagram above, the detector at the receiver has to decide whether the receiver bit is above or below the threshold 0.
  5. Compare the detected bits against the transmitted bits and compute the bit error rate (BER).
  6. Plot the simulated BER against the SNR values and compare it with the theoretical BER curve for BPSK over AWGN (expressions for theoretical BER is available in this article)
Figure 2: Simulation methodology for performance of BPSK modulation over AWGN channel

Let’s simulate the performance of BPSK over AWGN channel in Python & Matlab.

Simulation using Python

Following standalone code simulates the bit error rate performance of BPSK modulation over AWGN using Python version 3. The results are plotted in Figure 3.

For more such examples refer the book (available as PDF and paperback) Digital Modulations using Python

#Eb/N0 Vs BER for BPSK over AWGN (complex baseband model)
# © Author: Mathuranathan Viswanathan (gaussianwaves.com)
import numpy as np #for numerical computing
import matplotlib.pyplot as plt #for plotting functions
from scipy.special import erfc #erfc/Q function

#---------Input Fields------------------------
nSym = 10**5 # Number of symbols to transmit
EbN0dBs = np.arange(start=-4,stop = 13, step = 2) # Eb/N0 range in dB for simulation
BER_sim = np.zeros(len(EbN0dBs)) # simulated Bit error rates

M=2 #Number of points in BPSK constellation
m = np.arange(0,M) #all possible input symbols
A = 1; #amplitude
constellation = A*np.cos(m/M*2*np.pi)  #reference constellation for BPSK

#------------ Transmitter---------------
inputSyms = np.random.randint(low=0, high = M, size=nSym) #Random 1's and 0's as input to BPSK modulator
s = constellation[inputSyms] #modulated symbols

fig, ax1 = plt.subplots(nrows=1,ncols = 1)
ax1.plot(np.real(constellation),np.imag(constellation),'*')

#----------- Channel --------------
#Compute power in modulatedSyms and add AWGN noise for given SNRs
for j,EbN0dB in enumerate(EbN0dBs):
    gamma = 10**(EbN0dB/10) #SNRs to linear scale
    P=sum(abs(s)**2)/len(s) #Actual power in the vector
    N0=P/gamma # Find the noise spectral density
    n = np.sqrt(N0/2)*np.random.standard_normal(s.shape) # computed noise vector
    r = s + n # received signal
    
    #-------------- Receiver ------------
    detectedSyms = (r <= 0).astype(int) #thresolding at value 0
    BER_sim[j] = np.sum(detectedSyms != inputSyms)/nSym #calculate BER

BER_theory = 0.5*erfc(np.sqrt(10**(EbN0dBs/10)))

fig, ax = plt.subplots(nrows=1,ncols = 1)
ax.semilogy(EbN0dBs,BER_sim,color='r',marker='o',linestyle='',label='BPSK Sim')
ax.semilogy(EbN0dBs,BER_theory,marker='',linestyle='-',label='BPSK Theory')
ax.set_xlabel('$E_b/N_0(dB)$');ax.set_ylabel('BER ($P_b$)')
ax.set_title('Probability of Bit Error for BPSK over AWGN channel')
ax.set_xlim(-5,13);ax.grid(True);
ax.legend();plt.show()

Simulation using Matlab

Following code simulates the bit error rate performance of BPSK modulation over AWGN using basic installation of Matlab. You will need the add_awgn_noise function that was discussed in this article. The results will be same as Figure 3.

For more such examples refer the book (available as PDF and paperback) Digital Modulations using Matlab: build simulation models from scratch

%Eb/N0 Vs BER for BPSK over AWGN (complex baseband model)
% © Author: Mathuranathan Viswanathan (gaussianwaves.com)
clearvars; clc;
%---------Input Fields------------------------
nSym=10^6;%Number of symbols to transmit
EbN0dB = -4:2:14; % Eb/N0 range in dB for simulation

BER_sim = zeros(1,length(EbN0dB));%simulated Symbol error rates
    
M=2; %number of constellation points in BPSK
m = [0,1];%all possible input bits
A = 1; %amplitude
constellation = A*cos(m/M*2*pi);%constellation points

d=floor(M.*rand(1,nSym));%uniform random symbols from 1:M
s=constellation(d+1);%BPSK modulated symbols
    
for i=1:length(EbN0dB)
    r  = add_awgn_noise(s,EbN0dB(i));%add AWGN noise
    dCap = (r<=0);%threshold detector
    BER_sim(i) = sum((d~=dCap))/nSym;%SER computation
end

semilogy(EbN0dB,BER_sim,'-*');
xlabel('Eb/N0(dB)');ylabel('BER (Pb)');
title(['Probability of Bit Error for BPSK over AWGN']);

Rate this article: Note: There is a rating embedded within this post, please visit this post to rate it.

Reference

[1] Andrea Goldsmith, “Wireless Communications”, ISBN: 978-0521837163, Cambridge University Press; 1 edition, August 8, 2005.↗

Books by the author


Wireless Communication Systems in Matlab
Second Edition(PDF)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Python
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Matlab
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart
Hand-picked Best books on Communication Engineering
Best books on Signal Processing

Constellation diagram – investigate phase transitions

The phase transition properties of the different variants of QPSK schemes and MSK, are easily investigated using constellation diagram. Let’s demonstrate how to plot the signal space constellations, for the various modulations used in the transmitter.

Typically, in practical applications, the baseband modulated waveforms are passed through a pulse shaping filter for combating the phenomenon of intersymbol interference (ISI). The goal is to plot the constellation plots of various pulse-shaped baseband waveforms of the QPSK, O-QPSK and π/4-DQPSK schemes. A variety of pulse shaping filters are available and raised cosine filter is specifically chosen for this demo. The raised cosine (RC) pulse comes with an adjustable transition band roll-off parameter α, using which the decay of the transition band can be controlled.

This article is part of the following books
Digital Modulations using Matlab : Build Simulation Models from Scratch, ISBN: 978-1521493885
Digital Modulations using Python ISBN: 978-1712321638
All books available in ebook (PDF) and Paperback formats

The RC pulse shaping function is expressed in frequency domain as

Equivalently, in time domain, the impulse response corresponds to

A simple evaluation of the equation (2) produces singularities (undefined points) at p(t = 0) and p(t = ±Tsym/(2α)). The value of the raised cosine pulse at these singularities can be obtained by applying L’Hospital’s rule [1] and the values are

Using the equations above, the raised cosine filter is implemented as a function (refer the books Digital Modulations using Python and Digital Modulations using Matlab for the code).

The function is then tested. It generates a raised cosine pulse for the given symbol duration Tsym = 1s and plots the time-domain view and the frequency response as shown in Figure 1. From the plot, it can be observed that the RC pulse falls off at the rate of 1/|t|3 as t→∞, which is a significant improvement when compared to the decay rate of a sinc pulse which is 1/|t|. It satisfies Nyquist criterion for zero ISI – the pulse hits zero crossings at desired sampling instants. The transition bands in the frequency domain can be made gradual (by controlling α) when compared to that of a sinc pulse.

Figure 1: Raised-cosine pulse and its manifestation in frequency domain

Plotting constellation diagram

Now that we have constructed a function for raised cosine pulse shaping filter, the next step is to generate modulated waveforms (using QPSK, O-QPSK and π/4-DQPSK schemes), pass them through a raised cosine filter having a roll-off factor, say α = 0.3 and finally plot the constellation. The constellation for MSK modulated waveform is also plotted.

Figure 2: Constellations plots for: (a) a = 0.3 RC-filtered QPSK, (b) α = 0.3 RC-filtered O-QPSK, (c) α = 0.3 RC-filtered π/4-DQPSK and (d) MSK

Conclusions

The resulting simulated plot is shown in the Figure 2. From the resulting constellation diagram, following conclusions can be reached.

  • Conventional QPSK has 180° phase transitions and hence it requires linear amplifiers with high Q factor
  • The phase transitions of Offset-QPSK are limited to 90° (the 180° phase transitions are eliminated)
  • The signaling points for π/4-DQPSK is toggled between two sets of QPSK constellations that are shifted by 45° with respect to each other. Both the 90° and 180° phase transitions are absent in this constellation. Therefore, this scheme produces the lower envelope variations than the rest of the two QPSK schemes.
  • MSK is a continuous phase modulation, therefore no abrupt phase transition occurs when a symbol changes. This is indicated by the smooth circle in the constellation plot. Hence, a band-limited MSK signal will not suffer any envelope variation, whereas, the rest of the QPSK schemes suffer varied levels of envelope variations, when they are band-limited.

References

[1] Clay S. Turner, Raised Cosine and Root Raised Cosine Formulae, Wireless Systems Engineering, Inc, (May 29, 2007) V1.2↗

In this chapter

Digital Modulators and Demodulators - Passband Simulation Models
Introduction
Binary Phase Shift Keying (BPSK)
 □ BPSK transmitter
 □ BPSK receiver
 □ End-to-end simulation
Coherent detection of Differentially Encoded BPSK (DEBPSK)
● Differential BPSK (D-BPSK)
 □ Sub-optimum receiver for DBPSK
 □ Optimum noncoherent receiver for DBPSK
Quadrature Phase Shift Keying (QPSK)
 □ QPSK transmitter
 □ QPSK receiver
 □ Performance simulation over AWGN
● Offset QPSK (O-QPSK)
● π/p=4-DQPSK
● Continuous Phase Modulation (CPM)
 □ Motivation behind CPM
 □ Continuous Phase Frequency Shift Keying (CPFSK) modulation
 □ Minimum Shift Keying (MSK)
Investigating phase transition properties
● Power Spectral Density (PSD) plots
Gaussian Minimum Shift Keying (GMSK)
 □ Pre-modulation Gaussian Low Pass Filter
 □ Quadrature implementation of GMSK modulator
 □ GMSK spectra
 □ GMSK demodulator
 □ Performance
● Frequency Shift Keying (FSK)
 □ Binary-FSK (BFSK)
 □ Orthogonality condition for non-coherent BFSK detection
 □ Orthogonality condition for coherent BFSK
 □ Modulator
 □ Coherent Demodulator
 □ Non-coherent Demodulator
 □ Performance simulation
 □ Power spectral density

Books by the author


Wireless Communication Systems in Matlab
Second Edition(PDF)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Python
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Matlab
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart
Hand-picked Best books on Communication Engineering
Best books on Signal Processing

Matplotlib histogram and estimated PDF in Python

Key focus: Shown with examples: let’s estimate and plot the probability density function of a random variable using Python’s Matplotlib histogram function.

Generation of random variables with required probability distribution characteristic is of paramount importance in simulating a communication system. Let’s see how we can generate a simple random variable, estimate and plot the probability density function (PDF) from the generated data and then match it with the intended theoretical PDF. Normal random variable is considered here for illustration.

Step 1: Generate random samples

A survey of commonly used fundamental methods to generate a given random variable is given in [1]. For this demonstration, we will consider the normal random variable with the following parameters : μ – mean and σ – standard deviation. First generate a vector of randomly distributed random numbers of sufficient length (say 100000) with some valid values for μ and σ. There are more than one way to generate this. Two of them are given below.

● Method 1: Using the in-built numpy.random.normal() function (requires numpy package to be installed)

import numpy as np

mu=10;sigma=2.5 #mean=10,deviation=2.5
L=100000 #length of the random vector

#Random samples generated using numpy.random.normal()
samples_normal = np.random.normal(loc=mu,scale=sigma,size=(L,1)) #generate normally distributted samples

● Method 2: Box-Muller transformation [2] method produces a pair of normally distributed random numbers (Z1, Z2) by transforming a pair of uniformly distributed independent random samples (U1,U2). The algorithm for transformation is given by

#Samples generated using Box-Muller transformation

U1 = np.random.uniform(low=0,high=1,size=(L,1)) #uniformly distributed random numbers U(0,1)
U2 = np.random.uniform(low=0,high=1,size=(L,1)) #uniformly distributed random numbers U(0,1)

a = np.sqrt(-2*np.log(U1))
b = 2*np.pi*U2

Z = a*np.cos(b) #Standard Normal distributed numbers
samples_box_muller= Z*sigma+mu #Normal distribution with mean and sigma

Step 2: Plot the estimated histogram

Typically, if we have a vector of random numbers that is drawn from a distribution, we can estimate the PDF using the histogram tool. Matplotlib’s hist function can be used to compute and plot histograms. If the density argument is set to ‘True’, the hist function computes the normalized histogram such that the area under the histogram will sum to 1. Estimate and plot the normalized histogram using the hist function.

#For plotting
import matplotlib.pyplot as plt
%matplotlib inline
plt.style.use('ggplot')

fig, ax0 = plt.subplots(ncols=1, nrows=1) #creating plot axes
(values, bins, _) = ax0.hist(samples_normal,bins=100,density=True,label="Histogram of samples") #Compute and plot histogram, return the computed values and bins

Step 3: Theoretical PDF:

And for verification, overlay the theoretical PDF for the intended distribution. The theoretical PDF of normally distributed random samples is given by

Theoretical PDF for normal distribution is readily obtained from stats.norm.pdf() function in the SciPy package.

from scipy import stats
bin_centers = 0.5*(bins[1:] + bins[:-1])
pdf = stats.norm.pdf(x = bin_centers, loc=mu, scale=sigma) #Compute probability density function
ax0.plot(bin_centers, pdf, label="PDF",color='black') #Plot PDF
ax0.legend()#Legend entries
ax0.set_title('PDF of samples from numpy.random.normal()');
Figure 1: Estimated PDF (histogram) and the theoretical PDF for samples generated using numpy.random.normal() function

The histogram and theoretical PDF of random samples generated using Box-Muller transformation, can be plotted in a similar manner.

#Samples generated using Box-Muller transformation
from numpy.random import uniform
U1 = uniform(low=0,high=1,size=(L,1)) #uniformly distributed random numbers U(0,1)
U2 = uniform(low=0,high=1,size=(L,1)) #uniformly distributed random numbers U(0,1)
a = np.sqrt(-2*np.log(U1))
b = 2*np.pi*U2
Z = a*np.cos(b) #Standard Normal distribution
samples_box_muller = Z*sigma+mu #Normal distribution with mean and sigma

#Plotting
fig, ax1 = plt.subplots(ncols=1, nrows=1) #creating plot axes
(values, bins, _) = ax1.hist(samples_box_muller,bins=100,density=True,label="Histogram of samples") #Plot histogram
bin_centers = 0.5*(bins[1:] + bins[:-1])
pdf = stats.norm.pdf(x = bin_centers, loc=mu, scale=sigma) #Compute probability density function
ax1.plot(bin_centers, pdf, label="PDF",color='black') #Plot PDF
ax1.legend()#Legend entries
ax1.set_title('Box-Muller transformation');

References:

[1] John Mount, ‘Six Fundamental Methods to Generate a Random Variable’, January 20, 2012
[2] Thomas, D. B., Luk. W., Leong, P. H. W., and Villasenor, J. D. 2007. Gaussian random number generators. ACM Comput. Surv. 39, 4, Article 11 (October 2007), 38 pages DOI = 10.1145/1287620.1287622 http://doi.acm.org/10.1145/1287620.1287622

Rate this post: Note: There is a rating embedded within this post, please visit this post to rate it.

Books by the author


Wireless Communication Systems in Matlab
Second Edition(PDF)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Python
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Matlab
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart
Hand-picked Best books on Communication Engineering
Best books on Signal Processing

Similar topics

Articles in this series
[1] Fibonacci series in python
[2] Central Limit Theorem – a demonstration
[3] Moving Average Filter in Python and Matlab
[4] How to plot FFT in Python – FFT of basic signals : Sine and Cosine waves
[5] How to plot audio files as time-series using Scipy python
[6] How to design a simple FIR filter to reject unwanted frequencies
[7] Analytic signal, Hilbert Transform and FFT
[8] Non-central Chi-squared Distribution
[9] Simulation of M-PSK modulation techniques in AWGN channel (in Matlab and Python)
[10] QPSK modulation and Demodulation (with Matlab and Python implementation)

Design FIR filter to reject unwanted frequencies

Let’s see how to design a simple digital FIR filter to reject unwanted frequencies in an incoming signal. As a per-requisite, I urge you to read through this post: Introduction to digital filter design

Background on transfer function

The transfer function of a system provides the underlying support for ascertaining vital system response characteristics without solving the difference equations of the system. As mentioned earlier, the transfer function of a generic filter in Z-domain is given by ratio of polynomials in z

\[H(z) = \frac{ \displaystyle{\sum_{i=0}^{M} b_k z^{-1}}}{ 1 + \displaystyle{\sum_{i=1}^{N} a_k z^{-1}} } \quad\quad (1)\]

The values of z when H(z) =0 are called zeros of H(z). The values of z when H(z) = ∞ are called poles of H(z).

It is often easy to solve for zeros {zi} and poles {pj}, when the polynomials in the numerator and denominator are expressed as resolvable factors.

\[H(z) = \frac{ \displaystyle{\sum_{i=0}^{M} b_k z^{-1}}}{ 1 + \displaystyle{\sum_{i=1}^{N} a_k z^{-1}} } = \frac{N(z)}{D(z)} = \frac{b_M}{a_N} \frac{(z – z_1)(z – z_2)\cdots (z – z_M)}{(z – p_1)(z – p_2)\cdots (z – p_N)} \quad\quad (2) \]

The zeros {zi} are obtained by finding the roots of the equation

\[N(z) = 0 \quad\quad (3)\]

The poles {pj} are obtained by finding the roots of the equation

\[D(z) = 0 \quad\quad (4) \]

Pole-zero plots are suited for visualizing the relationship between the Z-domain and the frequency response characteristics. As mentioned before, the frequency response of the system H(e) can be computed by evaluating the transfer function H(z) at specific values of z = e. Because the frequency response is periodic with period , it is sufficient to evaluate the frequency response for the range -π <= ω < π (that is one loop around the unit circle on the z-plane starting from z=-1 and ending at the same point).

FIR filter design

FIR filters contain only zeros and no poles in the pole-zero plot (in fact all the poles sit at the origin for a causal FIR). For an FIR filter, the location of zeros of H(z) on the unit-circle nullifies specific frequencies. So, to design an FIR filter to nullify specific frequency ω, we just have to place the zeros at corresponding locations on the unit circle (z=e) where the gain of the filter needs to be 0. Let’s illustrate this using an example.

For this illustration, we would use this audio file as an input to the filtering system. As a first step in the filter design process, we should understand the nature of the input signal. Discrete-time Fourier transform (DTFT) is a tool for analyzing the frequency domain characteristics of the given signal.

The following function is used to compute the DTFT of the sequence read from the audio file.

import numpy as np
import matplotlib.pyplot as plt

from math import ceil,log,pi,cos
from scipy.fftpack import fft,fftfreq,fftshift

def compute_DTFT(x,M=0):
    """
    Compute DTFT of the given sequence x
    M is the desired length for computing DTFT (optional).
    Returns the DTFT X[k] and corresponding frequencies w (omega) arranged as -pi to pi
    """
    N = max(M,len(x))
    N = 2**(ceil(log(N)/log(2)))
    
    X = fftshift(fft(x,N))
    w = 2*pi*fftshift(fftfreq(N))    
    return (X,w)

Let’s read the audio file, load the samples as a signal sequence , and plot the sequence in time-domain/frequency domain (using DTFT).

from scipy.io.wavfile import read

samplerate, x = read('speechWithNoise.wav')
duration = len(x)/samplerate
time = np.arange(0,duration,1/samplerate)

fig1, (ax1,ax2) = plt.subplots(nrows=2,ncols=1)
ax1.plot(time,x)
ax1.set_xlabel('Time (s)')
ax1.set_ylabel('Amplitude')
ax1.set_title('speechWithNoise.wav - x[n]')

(X,w)= compute_DTFT(x)
ax2.plot(w,abs(X))
ax2.set_xlabel('Normalized frequency (radians/sample)')
ax2.set_ylabel('|X[k]|')
ax2.set_title('Magnitude vs. Frequency')
Figure 1: Time-domain and frequency domain characteristics of the given audio sample

The magnitude vs. frequency plot simply shows huge spikes at θ=±1.32344 radians. The location of the spikes are captured by using the numpy.argmax function.

maxIndex = np.argmax(X)
theta = w[maxIndex]
print(theta)

Since a sinusoid can be mathematically represented as

\[x[n] = cos (\theta n) = \frac{1}{2}\left( e^{j \theta n} + e^{-j \theta n }\right) \quad\quad (5)\]

The two spikes at θ=±1.32344 radians in the frequency domain, will manifest as a sinusoidal signal in the time domain.

Zooming in the area between θ= ±0.4 radians, the frequency domain plot reveals a hidden signal.

Figure 2: Hidden signal revealed in frequency domain

Now, our goal is to design an FIR filter that should reject the sinusoid at θ=±1.32344 radians, so that only the hidden signal gets filtered in.

Since the sinusoid that we want to reject is occurring at some θ radians in the frequency domain, for the FIR filter design, we place two zeros at

\[z_1 = e^{j \theta} \quad\quad z_2 = e^{-j \theta}\quad\quad (6)\]

Therefore, the transfer function of the filter system is given by

\[\begin{aligned} H_f(z) &= \left( 1 – z_1 z^{-1}\right)\left(1 – z_2 z^{-1} \right) \\ &= \left( 1 – e^{j \theta} z^{-1}\right)\left(1 -e^{-j \theta}z^{-1} \right) \\ & = 1 – 2\;cos\left(\theta\right)z^{-1} + z^{-2} \end{aligned}\]

This is a second order FIR filter. The coefficients of the filter are

\[b_0 = 1,\; b_1 = – 2 cos (\theta),\; b_2 = 1 \text{ and } a_0=1\]

For the given problem, to should reject the sinusoid at θ=±1.32344 radians, we set θ=1.32344 in the filter coefficients above.

Filter in action

Filter the input audio signal through the designed filter and plot the filtered output in time-domain and frequency domain. The lfilter function from scipy.signal package↗ is utilized for the filtering operation.

from scipy.signal import lfilter
y_signal = lfilter(b, a, x)
fig3, (ax3,ax4) = plt.subplots(nrows=1,ncols=2)
ax3.plot(time,y_signal,'g')
ax3.set(title='Noise removed speech - y[n]',xlabel='Time (s)',ylabel='Amplitude')

(Y,w)= compute_DTFT(y_signal)
ax4.plot(w,abs(Y)/max(abs(Y)),'r')
ax4.set(title='Frequency content of Y',xlabel='Normalized frequency (radians/sample)',ylabel='Magnitude - |Y[k]|')

The filter has effectively removed the sinusoidal noise, as evident from both time-domain and frequency domain plots.

Figure 4: Extracted speech and its frequency content

Save the filtered output signal as .wav file for audio playback

from scipy.io.wavfile import write
output_data = np.asarray(y_signal, dtype=np.int16)#convert y to int16 format
write("noise_removed_output.wav", samplerate, output_data)

Characteristics of the designed filter

Let’s compute the double sided frequency response of the designed FIR filter. The frequency response of the designed FIR digital filter is computed using freqz function from the scipy.signal package↗.

from scipy.signal import freqz
b = [1,-2*cos(theta),1] #filter coefficients
a = [1]
w, h = freqz(b,a,whole=True)#frequency response h[e^(jw)]
#whole = True returns output for whole range 0 to 2*pi
#To plot double sided response, use fftshift
w = w - 2*np.pi*(w>=np.pi) #convert to range -pi to pi
w = fftshift(w)
h = fftshift(h)

Plot the magnitude response, phase response, pole-zero plot and the impulse response of the designed filter.

#Plot Magnitude-frequency response
fig2, (ax) = plt.subplots(nrows=2,ncols=2)
ax[0,0].plot(w,abs(h),'b')
ax[0,0].set(title='Magnitude response',xlabel='Frequency [radians/sample]',ylabel='Magnitude [dB]')
ax[0,0].grid();ax[0,0].axis('tight');

#Plot phase response
angles = np.unwrap(np.angle(h))
ax[0,1].plot(w,angles,'r')
ax[0,1].set(title='Phase response',xlabel='Frequency [radians/sample]',ylabel='Angles [radians]')
ax[0,1].grid();ax[0,1].axis('tight');

#Transfer function to pole-zero representation
from scipy.signal import tf2zpk
z, p, k = tf2zpk(b, a)

#Plot pole-zeros on a z-plane
from  matplotlib import patches
patch = patches.Circle((0,0), radius=1, fill=False,
                    color='black', ls='dashed')
ax[1,0].add_patch(patch)
ax[1,0].plot(np.real(z), np.imag(z), 'xb',label='Zeros')
ax[1,0].plot(np.real(p), np.imag(p), 'or',label='Poles')
ax[1,0].legend(loc=2)
ax[1,0].set(title='Pole-Zero Plot',ylabel='Real',xlabel='Imaginary')
ax[1,0].grid()

#Impulse response
#create an impulse signal
imp = np.zeros(20)
imp[0] = 1

from scipy.signal import lfilter
y_imp = lfilter(b, a, imp) #drive the impulse through the filter
ax[1,1].stem(y_imp,linefmt='-.')
ax[1,1].set(title='Impulse response',xlabel='Sample index [n]',ylabel='Amplitude')
ax[1,1].axis('tight')
Figure 3: Magnitude response, phase response, pole-zero plot and impulse response of the designed second order FIR filter

Questions

Use the comment form below to answer the following questions

1) Is the filter that we designed a lowpass, highpass, bandpass or a bandstop filter ?
2) To reject a single sinusoidal signal, why two zeros are needed in the above filter design ?
3) What do you understand from the phase response plotted above ?

Rate this article: Note: There is a rating embedded within this post, please visit this post to rate it.

Similar topics

Essentials of Signal Processing
● Generating standard test signals
 □ Sinusoidal signals
 □ Square wave
 □ Rectangular pulse
 □ Gaussian pulse
 □ Chirp signal
Interpreting FFT results - complex DFT, frequency bins and FFTShift
 □ Real and complex DFT
 □ Fast Fourier Transform (FFT)
 □ Interpreting the FFT results
 □ FFTShift
 □ IFFTShift
Obtaining magnitude and phase information from FFT
 □ Discrete-time domain representation
 □ Representing the signal in frequency domain using FFT
 □ Reconstructing the time domain signal from the frequency domain samples
● Power spectral density
Power and energy of a signal
 □ Energy of a signal
 □ Power of a signal
 □ Classification of signals
 □ Computation of power of a signal - simulation and verification
Polynomials, convolution and Toeplitz matrices
 □ Polynomial functions
 □ Representing single variable polynomial functions
 □ Multiplication of polynomials and linear convolution
 □ Toeplitz matrix and convolution
Methods to compute convolution
 □ Method 1: Brute-force method
 □ Method 2: Using Toeplitz matrix
 □ Method 3: Using FFT to compute convolution
 □ Miscellaneous methods
Analytic signal and its applications
 □ Analytic signal and Fourier transform
 □ Extracting instantaneous amplitude, phase, frequency
 □ Phase demodulation using Hilbert transform
Choosing a filter : FIR or IIR : understanding the design perspective
 □ Design specification
 □ General considerations in design

Books by the author


Wireless Communication Systems in Matlab
Second Edition(PDF)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Python
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Matlab
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart
Hand-picked Best books on Communication Engineering
Best books on Signal Processing

Linear regression using python – demystified

Key focus: Let’s demonstrate basics of univariate linear regression using Python SciPy functions. Train the model and use it for predictions.

Linear regression model

Regression is a framework for fitting models to data. At a fundamental level, a linear regression model assumes linear relationship between input variables () and the output variable (). The input variables are often referred as independent variables, features or predictors. The output is often referred as dependent variable, target, observed variable or response variable.

If there are only one input variable and one output variable in the given dataset, this is the simplest configuration for coming up with a regression model and the regression is termed as univariate regression. Multivariate regression extends the concept to include more than one independent variables and/or dependent variables.

Univariate regression example

Let us start by considering the following example of a fictitious dataset. To begin we construct the fictitious dataset by our selves and use it to understand the problem of linear regression which is a supervised machine learning technique. Let’s consider linear looking randomly generated data samples.

import numpy as np
import matplotlib.pyplot as plt #for plotting

np.random.seed(0) #to generate predictable random numbers

m = 100 #number of samples
x = np.random.rand(m,1) #uniformly distributed random numbers
theta_0 = 50 #intercept
theta_1 = 35 #coefficient
noise_sigma = 3

noise = noise_sigma*np.random.randn(m,1) #gaussian random noise

y = theta_0 + theta_1*x + noise #noise added target
 
plt.ion() #interactive plot on
fig,ax = plt.subplots(nrows=1,ncols=1)
plt.plot(x,y,'.',label='training data')
plt.xlabel(r'Feature $x_1$');plt.ylabel(r'Target $y$')
plt.title('Feature vs. Target')
Figure 1: Simulated data for linear regression problem

In this example, the data samples represent the feature and the corresponding targets . Given this dataset, how can we predict target as a function of ? This is a typical regression problem.

Linear regression

Let be the pair that forms one training example (one point on the plot above). Assuming there are such sample points as training examples, then the set contains all the pairs .

In the univariate linear regression problem, we seek to approximate the target as a linear function of the input , which implies the equation of a straight line (example in Figure 2) as given by

where, is the intercept, is the slope of the straight line that is sought and is always . The approximated target serves as a guideline for prediction. The approximated target is denoted by

Using all the samples from the training set , we wish to find the parameters that well approximates the relationship between the given target samples and the straight line function .

If we represent the variables s, the input samples for and the target samples as matrices, then, equation (1) can be expressed as a dot product between the two sequences

It may seem that the solution for finding is straight forward

However, matrix inversion is not defined for matrices that are not square. Moore-Penrose pseudo inverse generalizes the concept of matrix inversion to a matrix. Denoting the Moore-Penrose pseudo inverse for as , the solution for finding is

For coding in Python, we utilize the scipy.linalg.pinv function to compute Moore-Penrose pseudo inverse and estimate .

xMat = np.c_[ np.ones([len(x),1]), x ] #form x matrix
from scipy.linalg import pinv
theta_estimate = pinv(xMat).dot(y)
print(f'theta_0 estimate: {theta_estimate[0]}')
print(f'theta_1 estimate: {theta_estimate[1]}')

The code results in the following estimates for , which are very close to the values used to generate the random data points for this problem.

>> theta_0 estimate: [50.66645323]
>> theta_1 estimate: [34.81080506]

Now, we know the parameters of our example system, the target predictions for new values of feature can be done as follows

x_new = np.array([[-0.2],[0.5],[1.2] ]) #new unseen inputs
x_newmat = np.c_[ np.ones([len(x_new),1]), x_new ] #form xNew matrix
y_predict  = np.dot(x_newmat,theta_estimate)
>>> y_predict #predicted y values for new inputs for x_1
array([[43.70429222],
       [68.07185576],
       [92.43941931]])

The approximated target as a linear function of feature, is plotted as a straight line.

plt.plot(x_new,y_predict,'-',label='prediction')
plt.text(0.7, 55, r'Intercept $\theta_0$ = %0.2f'%theta_estimate[0])
plt.text(0.7, 50, r'Coefficient $\theta_1$ = %0.2f'%theta_estimate[1])
plt.text(0.5, 45, r'y= $\theta_0+ \theta_1 x_1$ = %0.2f + %0.2f $x_1$'%(theta_estimate[0],theta_estimate[1]))
plt.legend() #plot legend
Figure 2: Linear Regression – training samples and prediction

Rate this article: Note: There is a rating embedded within this post, please visit this post to rate it.

References

[1] Boyd and Vandenberghe , “Convex Optimization”, ISBN: 978-0521833783, Cambridge University Press, 1 edition, March 2004.↗

Related topics

[1] Introduction to Signal Processing for Machine Learning
[2] Generating simulated dataset for regression problems - sklearn make_regression
[3] Hands-on: Basics of linear regression

Books by the author


Wireless Communication Systems in Matlab
Second Edition(PDF)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Python
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Matlab
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart
Hand-picked Best books on Communication Engineering
Best books on Signal Processing

Generating simulated dataset for regression problems

Key focus: Generating simulated dataset for regression problems using sklearn make_regression function (Python 3) is discussed in this article.

Problem statement

Suppose, a survey is conducted among the employees of a company. In that survey, the salary and the years of experience of the employees are collected. The aim of this data collection is to build a regression model that could predict the salary from the given experience (especially for the values not seen by the model).

If you are developer, you often have no access to survey data. In this scenario, you wish you could simulate the data for building a regression model.

Generating the dataset

To construct a simulated dataset for this scenario, the sklearn.dataset.make_regression↗ function available in the scikit-learn library can be used. The function generates the samples for a random regression problem.

The make_regression↗ function generates samples for inputs (features) and output (target) by applying random linear regression model. The values for generated samples have to be scaled to appropriate range for the given problem.

import numpy as np
from sklearn import datasets
import matplotlib.pyplot as plt #for plotting

x, y, coef = datasets.make_regression(n_samples=100,#number of samples
                                      n_features=1,#number of features
                                      n_informative=1,#number of useful features 
                                      noise=10,#bias and standard deviation of the guassian noise
                                      coef=True,#true coefficient used to generated the data
                                      random_state=0) #set for same data points for each run

# Scale feature x (years of experience) to range 0..20
x = np.interp(x, (x.min(), x.max()), (0, 20))

# Scale target y (salary) to range 20000..150000 
y = np.interp(y, (y.min(), y.max()), (20000, 150000))

plt.ion() #interactive plot on
plt.plot(x,y,'.',label='training data')
plt.xlabel('Years of experience');plt.ylabel('Salary $')
plt.title('Experience Vs. Salary')
Figure 1: Simulated dataset for linear regression problem

If you want the data to be presented in pandas dataframe format:

import pandas as pd
df = pd.DataFrame(data={'experience':x.flatten(),'salary':y})
df.head(10)
Figure 2: Generated dataset presented as pandas dataframe

We have successfully completed generating simulated dataset for regression problems in Python3. Let’s move on to build and train a linear regression model using the generated dataset and use it for predictions.

Rate this article: Note: There is a rating embedded within this post, please visit this post to rate it.

Related topics

[1] Introduction to Signal Processing for Machine Learning
[2] Generating simulated dataset for regression problems - sklearn make_regression
[3] Hands-on: Basics of linear regression

Books by the author


Wireless Communication Systems in Matlab
Second Edition(PDF)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Python
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Matlab
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart
Hand-picked Best books on Communication Engineering
Best books on Signal Processing

Plot audio file as time series using Scipy python

Often the most basic step in signal processing of audio files, one would like to visualize an audio sample file as time-series data.

Audio sounds can be thought of as an one-dimensional vector that stores numerical values corresponding to each sample. The time-series plot is a two dimensional plot of those sample values as a function of time.

Python’s SciPy library comes with a collection of modules for reading from and writing data to a variety of file formats. For example, the scipy.io.wavfile module can be used to read from and write to a .wav format file.

For the following demonstration, sample audio files given in this URL are used for the visualization task.

The read function in the scipy.io.wavefile module can be utilized to open the selected wav file. It returns the sample rate and the data samples.

>>> from scipy.io.wavfile import read #import the required function from the module

>>> samplerate, data = read('CantinaBand3.wav')

>>> samplerate #echo samplerate
22050

>>> data #echo data -> note that the data is a single dimensional array
array([   3,    7,    0, ...,  -12, -427, -227], dtype=int16)

Compute the duration and the time vector of the audio sample from the sample rate

>>> duration = len(data)/samplerate
>>> time = np.arange(0,duration,1/samplerate) #time vector

Plot the time-series data using matplotlib package

>>> import matplotlib.pyplot as plt
>>> import numpy as np
>>> plt.plot(time,data)
>>> plt.xlabel('Time [s]')
>>> plt.ylabel('Amplitude')
>>> plt.title('CantinaBand3.wav')
>>> plt.show()
Figure 1: Time series plot of audio file using Python Scipy

Rate this article: Note: There is a rating embedded within this post, please visit this post to rate it.

Books by the author


Wireless Communication Systems in Matlab
Second Edition(PDF)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Python
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Matlab
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart
Hand-picked Best books on Communication Engineering
Best books on Signal Processing

Plot FFT using Python – FFT of sine wave & cosine wave

Key focus: Learn how to plot FFT of sine wave and cosine wave using Python. Understand FFTshift. Plot one-sided, double-sided and normalized spectrum using FFT.

Introduction

Numerous texts are available to explain the basics of Discrete Fourier Transform and its very efficient implementation – Fast Fourier Transform (FFT).  Often we are confronted with the need to generate simple, standard signals (sine, cosine, Gaussian pulse, squarewave, isolated rectangular pulse, exponential decay, chirp signal) for simulation purpose. I intend to show (in a series of articles) how these basic signals can be generated in Python and how to represent them in frequency domain using FFT. If you are inclined towards Matlab programming, visit here.

This article is part of the book Digital Modulations using Python, ISBN: 978-1712321638 available in ebook (PDF) and Paperback (hardcopy) formats

Sine Wave

In order to generate a sine wave, the first step is to fix the frequency f of the sine wave. For example, we wish to generate a sine wave whose minimum and maximum amplitudes are -1V and +1V respectively. Given the frequency of the sinewave, the next step is to determine the sampling rate.

For baseband signals, the sampling is straight forward. By Nyquist Shannon sampling theorem, for faithful reproduction of a continuous signal in discrete domain, one has to sample the signal at a rate higher than at-least twice the maximum frequency contained in the signal (actually, it is twice the one-sided bandwidth occupied by a real signal. For a baseband signal bandwidth ( to ) and maximum frequency in a given band are equivalent).

For Python implementation, let us write a function to generate a sinusoidal signal using the Python’s Numpy library. Numpy is a fundamental library for scientific computations in Python. In order to use the numpy package, it needs to be imported. Here, we are importing the numpy package and renaming it as a shorter alias np.

import numpy as np

Next, we define a function for generating a sine wave signal with the required parameters.

def sine_wave(f,overSampRate,phase,nCyl):
	"""
	Generate sine wave signal with the following parameters
	Parameters:
		f : frequency of sine wave in Hertz
		overSampRate : oversampling rate (integer)
		phase : desired phase shift in radians
		nCyl : number of cycles of sine wave to generate
	Returns:
		(t,g) : time base (t) and the signal g(t) as tuple
	Example:
		f=10; overSampRate=30;
		phase = 1/3*np.pi;nCyl = 5;
		(t,g) = sine_wave(f,overSampRate,phase,nCyl)
	"""
	fs = overSampRate*f # sampling frequency
	t = np.arange(0,nCyl*1/f-1/fs,1/fs) # time base
	g = np.sin(2*np.pi*f*t+phase) # replace with cos if a cosine wave is desired
	return (t,g) # return time base and signal g(t) as tuple

We note that the function sine wave is defined inside a file named signalgen.py. We will add more such similar functions in the same file. The intent is to hold all the related signal generation functions, in a single file. This approach can be extended to object oriented programming. Now that we have defined the sine wave function in signalgen.py, all we need to do is call it with required parameters and plot the output.

"""
Simulate a sinusoidal signal with given sampling rate
"""
import numpy as np
import matplotlib.pyplot as plt # library for plotting
from signalgen import sine_wave # import the function

f = 10 #frequency = 10 Hz
overSampRate = 30 #oversammpling rate
fs = f*overSampRate #sampling frequency
phase = 1/3*np.pi #phase shift in radians
nCyl = 5 # desired number of cycles of the sine wave

(t,x) = sine_wave(f,overSampRate,phase,nCyl) #function call

plt.plot(t,x) # plot using pyplot library from matplotlib package
plt.title('Sine wave f='+str(f)+' Hz') # plot title
plt.xlabel('Time (s)') # x-axis label
plt.ylabel('Amplitude') # y-axis label
plt.show() # display the figure

Python is an interpreter based software language that processes everything in digital. In order to obtain a smooth sine wave, the sampling rate must be far higher than the prescribed minimum required sampling rate, that is at least twice the frequency – as per Nyquist-Shannon theorem. Hence, we need to sample the input signal at a rate significantly higher than what the Nyquist criterion dictates. Higher oversampling rate requires more memory for signal storage. It is advisable to keep the oversampling factor to an acceptable value.

An oversampling factor of is chosen in the previous function. This is to plot a smooth continuous like sine wave. Thus, the sampling rate becomes . If a phase shift is desired for the sine wave, specify it too.

Figure 1: A 10Hz sinusoidal wave with 5 cycles and phase shift 1/3π radians

Different representations of FFT:

Since FFT is just a numeric computation of -point DFT, there are many ways to plot the result. The FFT, implemented in Scipy.fftpack package, is an algorithm published in 1965 by J.W.Cooley and
J.W.Tuckey for efficiently calculating the DFT.

The SciPy functions that implement the FFT and IFFT can be invoked as follows

from scipy.fftpack import fft, ifft
X = fft(x,N) #compute X[k]
x = ifft(X,N) #compute x[n]

1. Plotting raw values of DFT:

The x-axis runs from to – representing sample values. Since the DFT values are complex, the magnitude of the DFT is plotted on the y-axis. From this plot we cannot identify the frequency of the sinusoid that was generated.

import numpy as np
import matplotlib.pyplot as plt
from scipy.fftpack import fft

NFFT=1024 #NFFT-point DFT      
X=fft(x,NFFT) #compute DFT using FFT    

fig1, ax = plt.subplots(nrows=1, ncols=1) #create figure handle
nVals = np.arange(start = 0,stop = NFFT) # raw index for FFT plot
ax.plot(nVals,np.abs(X))      
ax.set_title('Double Sided FFT - without FFTShift')
ax.set_xlabel('Sample points (N-point DFT)')        
ax.set_ylabel('DFT Values')
fig1.show()
Figure 2: Double sided FFT – without FFTShift

2. FFT plot – plotting raw values against normalized frequency axis:

In the next version of plot, the frequency axis (x-axis) is normalized to unity. Just divide the sample index on the x-axis by the length of the FFT. This normalizes the x-axis with respect to the sampling rate . Still, we cannot figure out the frequency of the sinusoid from the plot.

import numpy as np
import matplotlib.pyplot as plt
from scipy.fftpack import fft

NFFT=1024 #NFFT-point DFT  
X=fft(x,NFFT) #compute DFT using FFT     

fig2, ax = plt.subplots(nrows=1, ncols=1) #create figure handle
   
nVals=np.arange(start = 0,stop = NFFT)/NFFT #Normalized DFT Sample points         
ax.plot(nVals,np.abs(X))     
ax.set_title('Double Sided FFT - without FFTShift')        
ax.set_xlabel('Normalized Frequency')
ax.set_ylabel('DFT Values')
fig2.show()
Figure 3: Double sided FFT with normalized x-axis (0 to 1)

3. FFT plot – plotting raw values against normalized frequency (positive & negative frequencies):

As you know, in the frequency domain, the values take up both positive and negative frequency axis. In order to plot the DFT values on a frequency axis with both positive and negative values, the DFT value at sample index has to be centered at the middle of the array. This is done by using FFTshift function in Scipy Python. The x-axis runs from to where the end points are the normalized ‘folding frequencies’ with respect to the sampling rate .

import numpy as np
import matplotlib.pyplot as plt
from scipy.fftpack import fft,fftshift

NFFT=1024 #NFFT-point DFT      
X=fftshift(fft(x,NFFT)) #compute DFT using FFT  

fig3, ax = plt.subplots(nrows=1, ncols=1) #create figure handle
    
fVals=np.arange(start = -NFFT/2,stop = NFFT/2)/NFFT #DFT Sample points        
ax.plot(fVals,np.abs(X))
ax.set_title('Double Sided FFT - with FFTShift')
ax.set_xlabel('Normalized Frequency')
ax.set_ylabel('DFT Values');
ax.autoscale(enable=True, axis='x', tight=True)
ax.set_xticks(np.arange(-0.5, 0.5+0.1,0.1))
fig.show()
Figure 4: Double sided FFT with normalized x-axis (-0.5 to 0.5)

4. FFT plot – Absolute frequency on the x-axis vs. magnitude on y-axis:

Here, the normalized frequency axis is just multiplied by the sampling rate. From the plot below we can ascertain that the absolute value of FFT peaks at and . Thus the frequency of the generated sinusoid is . The small side-lobes next to the peak values at and are due to spectral leakage.

import numpy as np
import matplotlib.pyplot as plt
from scipy.fftpack import fft,fftshift

NFFT=1024     
X=fftshift(fft(x,NFFT))

fig4, ax = plt.subplots(nrows=1, ncols=1) #create figure handle

fVals=np.arange(start = -NFFT/2,stop = NFFT/2)*fs/NFFT
ax.plot(fVals,np.abs(X),'b')
ax.set_title('Double Sided FFT - with FFTShift')
ax.set_xlabel('Frequency (Hz)')         
ax.set_ylabel('|DFT Values|')
ax.set_xlim(-50,50)
ax.set_xticks(np.arange(-50, 50+10,10))
fig4.show()
Figure 5: Double sided FFT – Absolute frequency on the x-axis vs. magnitude on y-axis

5. Power Spectrum – Absolute frequency on the x-axis vs. power on y-axis:

The following is the most important representation of FFT. It plots the power of each frequency component on the y-axis and the frequency on the x-axis. The power can be plotted in linear scale or in log scale. The power of each frequency component is calculated as

Where is the frequency domain representation of the signal . In Python, the power has to be calculated with proper scaling terms.

Figure 6: Power spectral density using FFT

Plotting the PSD plot with y-axis on log scale, produces the most encountered type of PSD plot in signal processing.

Figure 7: Power spectral density (y-axis on log scale) using FFT

Rate this article: Note: There is a rating embedded within this post, please visit this post to rate it.

Books by the author


Wireless Communication Systems in Matlab
Second Edition(PDF)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Python
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Matlab
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart
Hand-picked Best books on Communication Engineering
Best books on Signal Processing

Topics in this chapter

Essentials of Signal Processing
● Generating standard test signals
 □ Sinusoidal signals
 □ Square wave
 □ Rectangular pulse
 □ Gaussian pulse
 □ Chirp signal
Interpreting FFT results - complex DFT, frequency bins and FFTShift
 □ Real and complex DFT
 □ Fast Fourier Transform (FFT)
 □ Interpreting the FFT results
 □ FFTShift
 □ IFFTShift
Obtaining magnitude and phase information from FFT
 □ Discrete-time domain representation
 □ Representing the signal in frequency domain using FFT
 □ Reconstructing the time domain signal from the frequency domain samples
● Power spectral density
Power and energy of a signal
 □ Energy of a signal
 □ Power of a signal
 □ Classification of signals
 □ Computation of power of a signal - simulation and verification
Polynomials, convolution and Toeplitz matrices
 □ Polynomial functions
 □ Representing single variable polynomial functions
 □ Multiplication of polynomials and linear convolution
 □ Toeplitz matrix and convolution
Methods to compute convolution
 □ Method 1: Brute-force method
 □ Method 2: Using Toeplitz matrix
 □ Method 3: Using FFT to compute convolution
 □ Miscellaneous methods
Analytic signal and its applications
 □ Analytic signal and Fourier transform
 □ Extracting instantaneous amplitude, phase, frequency
 □ Phase demodulation using Hilbert transform
Choosing a filter : FIR or IIR : understanding the design perspective
 □ Design specification
 □ General considerations in design