Statistical measures for stochastic signals

Key focus: Discuss statistical measures for stochastic signals : mean, variance, skewness, kurtosis, histogram, scatterplot, cross-correlation and auto-correlation.

Deterministic and stochastic signals

A deterministic signal is exactly predictable for the given time span of interest. It could be expressed using analytic form (example: x(t) = sin (2 π fc t) ).

Many of the signals that are encountered in real world signal processing applications cannot be expressed or predicted using analytic equations, because they have an element of uncertainty associated with them. As a consequence, such signals are characterized using statistical concepts. Therefore, these signals are outside the realm of deterministic signals and are classified as stochastic signals.

For example, we look at an electrical circuit and monitor the voltage across a resistor for a period of time. Under an applied electric field, atomic particles inside resister tend to randomly move and it manifests as heat. This random thermal motion causes random fluctuation in the voltage measured across the resistor. Therefore, the measured voltage can be characterized by a probability distribution and can be analyzed using statistical methods, but it cannot be predicted with precision. This is an example of signal that is stochastic function of time. Such functions usually evolve according to certain probabilistic laws and are assumed to be generated by an underlying stochastic process (thermal motion in the example above).

Examples for deterministic and stochastic signals
Figure 1: Examples for deterministic and stochastic signals

Given the amplitude of the stochastic voltage signal at time , now we know, we cannot predict the value at t’. However, if we observed the signal for a sufficient amount of time, we can empirically determine its probability distribution based on which we should be able to answer questions like

  • Given the amplitude of the voltage at time t, what is the average (expected or mean) of the voltage at time t’?
  • How much can we expect the voltage at time t’, to fluctuate from the mean ? In other words, we are interested in the variance of the voltage at time t’.
  • What is the probability that the voltage at t’ exceeds or falls below a certain threshold value ?
  • How the voltages at times t and t’ are related ? In other words, we are interested in correlation.

Summary of descriptive statistical measures

Different statistical measures are available to gather, review, analyze and draw conclusions from stochastic signals. Some of the statistical measures are summarized below:

Quantitative measures of shape:

In many statistical analysis, the fundamental task is to get a general idea about the shape of a distribution and it is done by using moments.

Measure of central tendency – mean – the first moment:

Measures of central tendency attempt to identify the central position in the distribution of samples that make up the stochastic signal. Mean, mode and median are different measures of central tendency. In signal processing, we are mostly interested in computing mean which is the average of values of the given samples. Given a discrete random variable X with probability mass function pX(x) the mean is denoted by

\[\mu = E \left[ X\right] = \sum_{x: p_X(x) > 0} x p_X(x) \]

Measure of dispersion – variance – the second moment:

Measures of dispersion describe how the values in the signal samples are spread out around a central location. Variance and standard deviation are some of the measures of dispersion. If the values are widely dispersed, the central location is said to be less representative of the values as a whole. If the values are tightly dispersed, the central location is considered more reliable.

For a discrete random variable X with probability mass function pX(x) , the variance (σX2) is given by the second central moment. The term central moment implies that this is measured relative to mean. For an electrical signal, the second moment is proportional to the average power.

\[\sigma_X^2 = E \left[ \left(X – \mu \right)^2\right] = \sum_{x: p_X(x) > 0} \left(x – \mu \right)^2 p_X(x) \]

The square root of variance is standard deviation

\[\sigma_X = \sqrt{E \left[ \left(X – \mu \right)^2\right]}\]

Figure 2, demonstrates the use of histogram for getting a visual perception of shape of the distribution of samples from two different stochastic signals. Though the signals look similar in nature in time domain, their histogram reveals different picture altogether. The central location (mean) of the first signal is around zero (no DC shift in the time domain view) and for the second signal the mean is at 0.75. The average power of first signal varies widely (histogram is widely spread out) compared to that of the second signal.

Figure 2: Histogram is a visual method that provides a general idea about the shape of a distribution.

Higher order moments – skewness and kurtosis:

Further characterization of the shape of a distribution includes higher order moments : skewness and kurtosis. They identify anomalies and outliers in many signal processing applications.

Skewness provides a measure to quantify the presence of asymmetry in the shape of the distribution. Actually, skewness measures the relative size of the tails in a distribution. Presence of asymmetry manifests as non-zero value for the standardized third central moment. The term “standardized” stands for the normalization of the third central moment by σ3.

\[Skewness = \frac{E \left[ \left( X – \mu \right)^3 \right]}{\sigma^3}\]
Figure 3: A random sequence showing positive skewness (asymmetry in shape) in its distribution

Kurtosis measures the amount of probability in the two tails of a distribution, relative to a normal distribution of same variance. Kurtosis is 3 for normal distribution (perfect bell shaped curve). If the kurtosis of a distribution is greater than 3, it implies that the tails are heavier compared to that of normal distribution. A value less than 3 for kurtosis implies lighter tails compared to that of the normal distribution. Essentially, kurtosis is a measure of outliers. Kurtosis is calculated as the standardized fourth central moment.

\[Kurtosis = \frac{E \left[ \left( X – \mu \right)^4 \right]}{\sigma^4}\]
Figure 4: Histogram showing kurtosis of a random sequence vs. kurtosis of normal distribution

Measures of association

Statistics, such as measures of central tendency, dispersion and higher order moments, describe about a single distribution are called univariate statistics. If we are interested in the relationship between two or more variables, we have to move to at least the realm of bivariate statistics.

Measures of association, summarize the size of association between two variables. Most measures of associates are scaled to a range of values. For example, a measure of association can be construed to have a range 0 to 1, where the value 0 implies no relationship and a value of 1 implies perfect relationship between the two variables under study. In another case, the measure can range from -1 to 1, which can help us determine if the two variables have negative or positive relationship between each other.

Scatter plot

Scatter plot is a great way to visually assess the nature of relationship between two variables. Following figure contains the scatter plots between different stochastic signals, exhibiting different strengths of relationships between the signals.

Figure 5: Scatter plot of stochastic signals exhibiting different strengths of relationship

Correlation

Correlation functions are commonly used in signal processing, for measuring the similarity of two signals. They are especially used in signal detection and pattern recognition.

Cross-correlation

Cross-correlation is commonly used for measuring the similarity between two different signals. In signal processing, cross-correlation measures the similarity of two waveforms as a function of time lags applied to one of the waveform.

For discrete-time waveforms – x[n] and y[n], cross correlation is defined as

\[Corr_{xy}[l] = \sum_{m=-\infty}^{\infty} x[n]^{\ast} y[n+l] = \sum_{m=-\infty}^{\infty} x[n-l]^{\ast} y[n]\]

where, * denotes complex conjugate operation and l is the discrete time lags applied to one of the waveforms. That is, the cross-correlation can be calculated as the dot product of a sequence with each shifted version of another sequence.

Auto-correlation

Auto-correlation is the cross-correlation of a waveform/sequence with itself.

For discrete-time waveform – x[n], auto-correlation is defined as

\[Corr_{xx}[l] = \sum_{m=-\infty}^{\infty} x[n]^{\ast} x[n+l] = \sum_{m=-\infty}^{\infty} x[n-l]^{\ast} x[n] \]

where, * denotes complex conjugate operation and l is the discrete time lags applied to the copy of the same waveform.

Correlation properties are useful for identifying/distinguishing a known bit sequence from a set of other possible known sequences. For example, in GPS technology, each satellite is assigned a unique 10-bit Gold code sequence (210 = 1023 possible combinations). Cross-correlation between different Gold code sequence is very low, since the Gold codes are orthogonal to each other. However, the auto-correlation of a Gold code is maximum at zero lag. The satellites are identified using these correlation properties of Gold codes. (I have described the hardware implementation of Gold codes, it can be accessed here).

Following plots illustrate the application of auto-correlation for audio analysis. The auto-correlation of an audio signal will have a peak at zero lag (i.e, where there is no time shifting when computing the correlation) as shown in Figure 6. Figure 7 contains the same audio file that is synthetically processed to produce reverberation characteristics. By looking at the time series plot in Figure 7, we cannot infer anything. However, the auto-correlation plot reveals the reverberation characteristics embedded in the audio.

Figure 6: Normalized auto-correlation of an original sound

Rate this article: Note: There is a rating embedded within this post, please visit this post to rate it.

Books by the author


Wireless Communication Systems in Matlab
Second Edition(PDF)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Python
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Matlab
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart
Hand-picked Best books on Communication Engineering
Best books on Signal Processing

For further reading

[1] Steven M. Kay, “Fundamentals of Statistical Signal Processing, Volume I: Estimation Theory”, ISBN: 978-0133457117, Prentice Hall, Edition 1, 1993.↗

Gold code generator using LFSRs

Focus of this article is to discuss the details of Gold code generator using preferred pair m-sequences, implemented using linear feedback shift registers (LFSR). Finally we plot and investigate correlation properties of the generated Gold codes.

Introduction

In a multi-user environment (like spread spectrum, CDMA ) large number of codes with good correlation properties, is a necessity. Gold codes are suited for this application, since a large number of codes with controlled correlation can be generated by a simple time shift of two preferred m-sequences.

Gold sequences belong to the category of  product codes where two m-sequences of same length are XOR’ed to produce a Gold sequence. The two m-sequence must maintain the same phase relationship till all the additions are performed. A slight change of phase even in one of the m-sequences, produces a different Gold sequence altogether. Gold codes are non-maximal and therefore they have poor auto-correlation property when compared to that of the underlying m-sequences.

This article is part of the book
Wireless Communication Systems in Matlab (second edition), ISBN: 979-8648350779 available in ebook (PDF) format and Paperback (hardcopy) format.

A typical implementation of Gold code generator is shown in Figure 1. Here, the two linear feedback shift registers (LFSR), each of length , are configured to generate two different m-sequences. As we cannot start the LFSRs with all zero values, there need to be some values in the LFSRs. These initial values are controlled by ‘seed 1‘ and ‘seed 2‘. The phase of the m-sequence loaded into an LFSR can be controlled by shifting the initial seed (initial value of the shift registers in the LFSR) there by providing an option to generate a new Gold sequence.

Figure 1: Generation of Gold Code using LFSR

Note that there exists numerous choice of Gold sequences based on the m-sequence configuration of the LFSRs and what is initially loaded into the two LFSRs. Not all the Gold sequences possess good correlation properties. In order to have a Gold code sequence with three valued peak cross-correlation magnitudes, that is both bounded and uniform, the m-sequences generated by the two LFSRs need to be of preferred type.

Algebraic details on generating two preferred m-sequence for Gold code generation is detailed in Gold’s work [1]. It warrants the need for searching the preferred pairs, instead, the configurations given in Table 1 can be used for generating Gold codes that provide optimal correlation properties (three valued correlation as mentioned in Reference [2]). Following points must be noted when choosing the length of LFSR () when generating the Gold code.

● Three valued cross-correlation values occur when is odd or when .
● The peak cross-correlation magnitude achieves the minimum possible value only when is odd (). Therefore, this is the best choice here.
● When , the m-sequences listed in table are still of preferred type (three valued cross-correlation), but the lower bound is weaker when compared to the lower bound when is odd.

Table 1: Preferred pairs of m-sequences

Hardware implementation

A sample hardware implementation for generating a length 127 Gold code – using the preferred pairs ([7,3,2,1],[7,3]) is shown in Figure 2. Here each register in the LFSR is a D-flip flop,all connected in cascaded fashion and operating at a given synchronous clock. Changing the initial seed values into shift registers produces a different set of Gold codes. Matlab code for implementing a Gold code generator is available in this book.

Figure 2: Length 127 – Gold sequence generator with odd n for the preferred pairs – [7,3,2,1], [7,3]

Correlation properties

Figure 3 on this post shows the periodic cross-correlation of two m-sequences that are not a preferred pair. Let’s take a preferred pair from the Table 1 for N = 31 having the feedback connections – [2,3,4,5] and [2,5]. The correlation of two sequences can be computed using the function given in section 12.2.1 of this book.

Figure 3: Normalized cross-correlation of preferred pair m-sequences using the feedback connections [2,3,4,5] and [2,5]

The normalized cross-correlation of the chosen preferred pair and the auto-correlation of the corresponding Gold code is shown in Figure 3 and Figure 4 respectively. The auto-correlation and cross-correlation plots reveal that the Gold code sequence does not possess the excellent auto-correlation property as that of individual m-sequences, but it possess good cross-correlation properties in comparison with the individual m-sequences.

Rate this article: Note: There is a rating embedded within this post, please visit this post to rate it.

References:

[1] Gold, R. (1967), Optimal binary sequences for spread spectrum  multiplexing, IEEE Transactions on Information Theory, 13 (4), pp. 619-621.↗

[2] Sarwate D.V, Pursley, M.B., “Crosscorrelation Properties of Pseudorandom and Related Sequences”, Proceedings of the IEEE, Volume 68, Issue 5, pp 593 – 619, May 1980.↗

Books by the author


Wireless Communication Systems in Matlab
Second Edition(PDF)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Python
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Matlab
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart
Hand-picked Best books on Communication Engineering
Best books on Signal Processing

Topics in this chapter

Spread spectrum techniques
Introduction
Code sequences
 □ Sequence correlations
 □ Maximum-length sequences (m-sequences)
 □ Gold codes
● Direct Sequence Spread Spectrum
 □ Simulation of DSSS system
 □ Performance of Direct Sequence Spread Spectrum over AWGN channel
 □ Performance of Direct Sequence Spread spectrum in the presence of Jammer
● Frequency Hopping Spread Spectrum
 □ Simulation model
 □ Binary Frequency Shift Keying (BFSK)
 □ Allocation of frequency channels
 □ Frequency hopping generator
 □ Fast and slow frequency hopping
 □ Simulation code for BFSK-FHSS

Construct autocorrelation Matrix in Matlab & Python

Auto-correlation, also called series correlation, is the correlation of a given sequence with itself as a function of time lag. Cross-correlation is a more generic term, which gives the correlation between two different sequences as a function of time lag.

Given two sequences and , the cross-correlation at times separated by lag i is given by ( denotes complex conjugate operation)

Auto-correlation is a special case of cross-correlation, where x=y. One can use a brute force method (using for loops implementing the above equation) to compute the auto-correlation sequence. However, other alternatives are also at your disposal.

Method 1: Auto-correlation using xcorr function

Matlab

For a N-dimensional given vector x, the Matlab command xcorr(x,x) or simply xcorr(x) gives the auto-correlation sequence. For the input sequence x=[1,2,3,4], the command xcorr(x) gives the following result.

>> x=[1 2 3 4]
>> acf=xcorr(x)
acf= 4   11   20   30   20   11  4

Python

In Python, autocorrelation of 1-D sequence can be obtained using numpy.correlate function. Set the parameter mode=’full’ which is useful for calculating the autocorrelation as a function of lag.

import numpy as np
x = np.asarray([1,2,3,4])
np.correlate(x, x,mode='full')
# output: array([ 4, 11, 20, 30, 20, 11,  4])

Method 2: Auto-correlation using Convolution

Auto-correlation sequence can be computed as the convolution between the given sequence and the reversed (flipped) version of the conjugate of the sequence.The conjugate operation is not needed if the input sequence is real.

Matlab

>> x=[1 2 3 4]
>> acf=conv(x,fliplr(conj(x)))
acf= 4   11   20   30   20   11  4

Python

import numpy as np
x = np.asarray([1,2,3,4])
np.convolve(x,np.conj(x)[::-1])
# output: array([ 4, 11, 20, 30, 20, 11,  4])

Method 3: Autocorrelation using Toeplitz matrix

Autocorrelation sequence can be found using Toeplitz matrices. An example for using Toeplitz matrix structure for computing convolution is given here. The same technique is extended here, where one signal is set as input sequence and the other is just the flipped version of its conjugate. The conjugate operation is not needed if the input sequence is real.

Matlab

>> x=[1 2 3 4]
>> acf=toeplitz([x zeros(1,length(x)-1)],[x(1) zeros(1,length(conj(x))-1)])*fliplr(x).'
acf= 4   11   20   30   20   11  4

Python

import numpy as np
from scipy.linalg import toeplitz
x = np.asarray([1,2,3,4])
toeplitz(np.pad(x, (0,len(x)-1),mode='constant'),np.pad([x[0]], (0,len(x)-1),mode='constant'))@x[::-1]
# output: array([ 4, 11, 20, 30, 20, 11, 4])

Method 4: Auto-correlation using FFT/IFFT

Auto-correlation sequence can be found using FFT/IFFT pairs. An example for using FFT/IFFT for computing convolution is given here. The same technique is extended here, where one signal is set as input sequence and the other is just the flipped version of its conjugate.The conjugate operation is not needed if the input sequence is real.

Matlab

>> x=[1 2 3 4]
>> L = 2*length(x)-1
>> acf=ifft(fft(x,L).*fft(fliplr(conj(x)),L))
acf= 4   11   20   30   20   11  4

Python

import numpy as np
from scipy.fftpack import fft,ifft

x = np.asarray([1,2,3,4])

L = 2*len(x) - 1
ifft(fft(x,L)*fft(np.conj(x)[::-1],L))
#output: array([ 4+0j, 11+0j, 20+0j, 30+0j, 20+0j, 11+0j, 4+0j])

Note that in all the above cases, due to the symmetry property of auto-correlation function, the center element represents .

Construction the Auto-correlation Matrix

Auto-correlation matrix is a special form of matrix constructed from auto-correlation sequence. It takes the following form.

The auto-correlation matrix is easily constructed, once the auto-correlation sequence is known. The auto-correlation matrix is a Hermitian matrix as well as a Toeplitz matrix. This property is exploited in the following code for constructing the Auto-Correlation matrix.

Matlab

>> x=[1+j 2+j 3-j] %x is complex
>> acf=conv(x,fliplr(conj(x)))% %using Method 2 to compute Auto-correlation sequence
>>Rxx=acf(3:end); % R_xx(0) is the center element
>>Rx=toeplitz(Rxx,[Rxx(1) conj(Rxx(2:end))])

Python

import numpy as np
x = np.asarray([1+1j,2+1j,3-1j]) #x is complex
acf = np.convolve(x,np.conj(x)[::-1]) # using Method 2 to compute Auto-correlation sequence
Rxx=acf[2:]; # R_xx(0) is the center element
Rx = toeplitz(Rxx,np.hstack((Rxx[0], np.conj(Rxx[1:]))))

Result:

Rate this article: Note: There is a rating embedded within this post, please visit this post to rate it.

References:

[1] Reddi.S.S,”Eigen Vector properties of Toeplitz matrices and their application to spectral analysis of time series, Signal Processing, Vol 7,North-Holland, 1984,pp 46-56.↗
[2] Robert M. Gray,”Toeplitz and circulant matrices – an overview”,Department of Electrical Engineering,Stanford University,Stanford 94305,USA.↗
[3] Matlab documentation help on Toeplitz command.↗

Books by the author


Wireless Communication Systems in Matlab
Second Edition(PDF)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Python
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart

Digital Modulations using Matlab
(PDF ebook)

Note: There is a rating embedded within this post, please visit this post to rate it.
Checkout Added to cart
Hand-picked Best books on Communication Engineering
Best books on Signal Processing

Walsh Hadamard Code – Matlab Simulation

Note: There is a rating embedded within this post, please visit this post to rate it.
The following is a function to generate a Walsh Hadamard Matrix of given codeword size. The codeword size has to be a power of 2.

function [H]=generateHadamardMatrix(codeSize)

%[H]=generateHadamardMatrix(codeSize);
% Function to generate Walsh-Hadamard Matrix where "codeSize" is the code
% length of walsh code. The first matrix gives us two codes; 00, 01. The second
% matrix gives: 0000, 0101, 0011, 0110 and so on
% Author: Mathuranathan for https://www.gaussianwaves.com
% License: Creative Commons: Attribution-NonCommercial-ShareAlike 3.0
% Unported

%codeSize=64; %For testing only
N=2;
H=[0 0 ; 0 1];
if bitand(codeSize,codeSize-1)==0
while(N~=codeSize)
       N=N*2;
       H=repmat(H,[2,2]);
       [m,n]=size(H); 

      %Invert the matrix located at the bottom right hand corner
      for i=m/2+1:m,
          for j=n/2+1:n,
                H(i,j)=~H(i,j);
         end
     end
end
else
disp('INVALID CODE SIZE:The code size must be a power of 2');
end

Example:

To Generate Walsh Codes used in IS-95 (which utilizes 64 Walsh codes of size 64 bits each, use : [H]=generateHadamardMatrix(64). This will generate 64 Walsh Codes of length 64-bits (for each code).

Test Program:

Click Here to download
Also given below is a program to test the cross-correlation and auto-correlation of Walsh code. A set of 8-Walsh codes are used for this purpose.

% Matlab Program to test Walsh Hadamard Codes and to test their orthogonality
% Plots cross-correlation and auto correlation of Walsh Hadamard Codes
% Author: Mathuranathan Viswanathan for https://www.gaussianwaves.com
% License: Creative Commons: Attribution-NonCommercial-ShareAlike 3.0
% Unported

codeSize=8;
[H]=generateHadamardMatrix(codeSize);

%-----------------------------------------------------------
%Cross-Correlation of Walsh Code 1 with rest of Walsh Codes
h = zeros(1, codeSize-1); %For dynamic Legends
s = cell(1, codeSize-1); %For dynamic Legends
for rows=2:codeSize
[crossCorrelation,lags]=crossCorr(H(1,:),H(rows,:));
h(rows-1)=plot(lags,crossCorrelation);
s{rows-1} = sprintf('Walsh Code Sequence #-%d', rows);
hold all;
end

%Dynamic Legends
% Select the plots to include in the legend
index = 1:codeSize-1;

% Create legend for the selected plots
legend(h(index),s{index});
title('Cross Correlation of Walsh Code 1 with the rest of the Walsh Codes');
ylabel('Cross Correlation');
xlabel('Lags');

%-----------------------------------------------------------
%AutoCorrelation of Walsh Code - 1
autoCorr2(H(2,:),8,2,1);

Simulation Results

From the plots below, it can be ascertained that the Walsh codes has excellent cross-correlation property and poor autocorrelation property. Excellent cross-correlation property (zero cross-correlation) implies orthogonality, which makes it suitable for CDMA applications.

Cross Correlation of Walsh Codes

Auto Correlation of Walsh Code

See also:

[1] Codes used in CDMA
[2] Generation of Gold Codes and their cross-correlation
[3] Preferred Pairs m-sequences generation for Gold Codes
[4] Maximum Length Sequences ( m-sequences)
[5] Introduction to Spread Spectrum

Recommended Books: