AutoCorrelation (Correlogram) and persistence – Time series analysis

The agenda for this and the subsequent articles is to introduce the idea of autocorrelation, the AutoCorrelation Function (ACF), the Partial AutoCorrelation Function (PACF), and the use of ACF and PACF in system identification.

Introduction

Given time series data (stock market data, sunspot numbers over a period of years, signal samples received over a communication channel, etc.), successive values in the time series often correlate with each other. This serial correlation is termed persistence or inertia, and it leads to increased power at the lower frequencies of the frequency spectrum. Persistence can drastically reduce the degrees of freedom in time series modeling (AR, MA, ARMA models). In tests for statistical significance, the presence of persistence complicates the test as it reduces the number of independent observations.

Autocorrelation function

Correlation of a time series with its own past and future values is called autocorrelation. It is also referred to as “lagged correlation” or “serial correlation”. Positive autocorrelation is an indication of a specific form of persistence: the tendency of a system to remain in the same state from one observation to the next (example: continuous runs of 0’s or 1’s). If a time series exhibits correlation, the future values of the samples depend probabilistically on the current and past samples. Thus the existence of autocorrelation can be exploited in prediction as well as in modeling time series. Autocorrelation can be assessed using the following tools:

● Time series plot
● Lagged scatterplot
● AutoCorrelation Function (ACF)

Generating a sample time series

For the purpose of illustration, let’s begin by generating two sets of time series data using an Auto-Regressive AR(1) process. An AR(1) process relates the current output sample x[n] of an LTI system to its immediate past output sample x[n-1] and the white noise term w[n].

A generic AR(1) system is given by

x[n] = a0 + a1 x[n-1] + w[n]

Here a0 and a1 are the model parameters which we will tweak to generate different sets of time series data; a0 is a constant which will be set to zero. Thus the model can be equivalently written as

x[n] = a1 x[n-1] + w[n]

Let’s generate two time series data from the above model.

Model 1: a0=0, a1=1

The “filter” function in Matlab will be utilized to generate the output process x[n]. The filter function, in its basic form – X=filter(B,A,W) – takes three inputs. The vectors B and A denote the numerator and denominator coefficients (the model parameters here) of the transfer function of the LTI system in standard difference equation form, W is the white noise input vector to the LTI filter, and X is the output of the filter.

The transfer function of model 1 is therefore

H(z) = 1 / (1 - z^-1)

where B=1 and A=[1 -1], and the input W is white noise, which can be generated using the randn function. Therefore, the above model can be implemented with the command x=filter(1,[1 -1],randn(1000,1)) to generate 1000 samples of x[n].

A=[1 -1];  %model coefficients
% generating the output using the numerator/denominator form with white noise input
x1 = filter(1,A,randn(1000,1));
plot(x1,'b');

Model 2: a0=0, a1=0.5

The transfer function of this model is

H(z) = 1 / (1 - 0.5 z^-1)

where B=1 and A=[1 -0.5], and the input W is white noise, which can be generated using the randn function. Therefore, the above model can be implemented with the command x=filter(1,[1 -0.5],randn(1000,1)) to generate 1000 samples of x[n].

A=[1 -0.5];  %model coefficients
% generating the output using the numerator/denominator form with white noise input
x2 = filter(1,A,randn(1000,1));
plot(x2,'r');
Time-series plot of the two models – one model shows persistence and the other does not
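To draw a side-by-side comparison along the lines of the figure above, the two generated series can be plotted in separate panes. The particular plotting arrangement below is my own choice, not taken from the original figure.

% Sketch: plot both generated series for visual comparison (layout is an illustrative choice)
figure;
subplot(2,1,1); plot(x1,'b'); title('Model 1 output (persistence)');    xlabel('n'); ylabel('x_1[n]');
subplot(2,1,2); plot(x2,'r'); title('Model 2 output (no persistence)'); xlabel('n'); ylabel('x_2[n]');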

In the plot above, the output from model 1 exhibits persistence, or positive correlation – positive deviations from the mean tend to be followed by positive deviations for some duration, and negative deviations from the mean tend to be followed by negative deviations for some time. When positive deviations are followed by negative deviations or vice-versa, it is a characteristic of negative correlation. Positive correlations are strong indications of long runs of several consecutive observations above or below the mean. Negative correlations indicate a low incidence of such runs. The output of model 2 simply jumps around the mean value and there is no consistent departure from the mean – no persistence (no positive correlation). Interpreting time series plots for clues on persistence is a subjective matter and is best left for trained eyes; however, it can serve as a preliminary analysis.

Persistence – an indication of non-stationarity:

For time series analysis, it is imperative to work with a stationary process. Many of the theorems formulated in statistical signal processing assume a series to be stationary (at least in the weak sense). Processes whose probability density functions do not change with time are termed stationary (sub-classifications include strict-sense stationarity (SSS), wide-sense stationarity (WSS), etc.). For analysis, the joint probability distribution must remain unchanged should there be any shift in the time series. Time series with persistence – a mean that changes with time – are non-stationary, and therefore many theorems in signal processing will not apply as such.

Plotting the histograms of the two series (see next figure), we can immediately identify that the data generated by model 1 is non-stationary – the histogram varies between selected portions of the signal. The histogram of the output from model 2, on the other hand, remains pretty much the same across portions – therefore, it is a stationary signal and is suitable for further analysis.

Persistence – non-stationary and stationary signal
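A minimal sketch of the histogram check described above follows; the segment boundaries (first half versus second half of each series) are an arbitrary choice made here for illustration.

% Sketch: compare histograms of two segments of each series (segment boundaries chosen arbitrarily)
figure;
subplot(2,2,1); hist(x1(1:500));    title('x1 : samples 1-500');
subplot(2,2,2); hist(x1(501:1000)); title('x1 : samples 501-1000');
subplot(2,2,3); hist(x2(1:500));    title('x2 : samples 1-500');
subplot(2,2,4); hist(x2(501:1000)); title('x2 : samples 501-1000');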

Lagged Scatter Plots

The autocorrelation trend can also be ascertained from lagged scatter plots. In a lagged scatter plot, the samples of the time series are plotted against one another, one lag at a time. A strong positive autocorrelation will show up as a linear positive slope for the particular lag value. If the scatter plot is random, it indicates no correlation for the particular lag.

figure;
x12 = x1(1:end-1); x21 = x1(2:end); % lag 1 pair
x13 = x1(1:end-2); x31 = x1(3:end); % lag 2 pair
x14 = x1(1:end-3); x41 = x1(4:end); % lag 3 pair
x15 = x1(1:end-4); x51 = x1(5:end); % lag 4 pair
subplot(2,2,1)
plot(x12,x21,'b*');
xlabel('X_1'); ylabel('X_2');
subplot(2,2,2)
plot(x13,x31,'b*');
xlabel('X_1'); ylabel('X_3');
subplot(2,2,3)
plot(x14,x41,'b*');
xlabel('X_1'); ylabel('X_4');
subplot(2,2,4)
plot(x15,x51,'b*');
xlabel('X_1'); ylabel('X_5');

figure;
x12 = x2(1:end-1); x21 = x2(2:end); % lag 1 pair
x13 = x2(1:end-2); x31 = x2(3:end); % lag 2 pair
x14 = x2(1:end-3); x41 = x2(4:end); % lag 3 pair
x15 = x2(1:end-4); x51 = x2(5:end); % lag 4 pair
subplot(2,2,1)
plot(x12,x21,'b*');
xlabel('X_1'); ylabel('X_2');
subplot(2,2,2)
plot(x13,x31,'b*');
xlabel('X_1'); ylabel('X_3');
subplot(2,2,3)
plot(x14,x41,'b*');
xlabel('X_1'); ylabel('X_4');
subplot(2,2,4)
plot(x15,x51,'b*');
xlabel('X_1'); ylabel('X_5');

The scatter plots of model 1 for the first four lags indicate strong positive correlation at all four lag values. The scatter plots of model 2 indicate a slightly positive correlation at lag 1 and no correlation at the remaining lags. This trend can be clearly seen if we plot the Auto Correlation Function (ACF).

Auto Correlation Function (ACF) or Correlogram

The ACF plot summarizes the correlation of a time series at various lags. It plots the correlation coefficient of the series against lag, one delay at a time. The ACF for the output of both models is plotted with the code below.

[x1c,lags] = xcorr(x1,100,'coeff'); % normalized autocorrelation of model 1 output, up to lag 100
%Plotting only positive lag values - autocorrelation is symmetric
figure; stem(lags(101:end),x1c(101:end));
[x2c,lags] = xcorr(x2,100,'coeff'); % normalized autocorrelation of model 2 output
figure; stem(lags(101:end),x2c(101:end));

The ACF plot of model 1 indicates strong persistence across all lags. The ACF plot of model 2 indicates significant correlation only at lag 1 (lag 0 obviously correlates fully), which concurs with the lagged scatter plots.

Auto Correlation Function or correlogram

For a stationary series, the correlogram has very few significant spikes at small lags and dies down quickly. Thus model 2 produces a stationary series, whereas model 1 does not; model 2 is therefore suitable for further time series analysis.
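As a rough guide to what counts as a “significant” spike, a common rule of thumb (not part of the original article) is to compare the ACF values against approximate 95% bounds of ±1.96/√N, where N is the number of samples:

% Sketch: overlay approximate 95% significance bounds (+/- 1.96/sqrt(N)) on the correlogram of model 2
N = length(x2);
bound = 1.96/sqrt(N);                 % rule-of-thumb threshold under a white-noise null hypothesis
[x2c,lags] = xcorr(x2,100,'coeff');
figure; stem(lags(101:end),x2c(101:end)); hold on;
plot([0 100],[bound bound],'r--'); plot([0 100],-[bound bound],'r--'); hold off;
xlabel('Lag'); ylabel('ACF');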

Continue reading on constructing an autocorrelation matrix…




Understand AR, MA and ARMA models

Key focus: AR, MA & ARMA models express the nature of the transfer function of an LTI system. Understand the basic idea behind these models and know their frequency responses.

Introduction

Signal models are used to analyze stationary univariate time series. The goal of signal modeling is to estimate the process from which the desired signal is generated. Though the concept described here is related to the topic of “system identification”, the two are quite different.

A signal model is a unique combination of a filter and a source input, which may fall into any of the following categories:

  • Filter: state-space model, AR, MA, ARMA (see below)
  • Source: pulse, pulse train, white noise, …

Motivation

Let’s say we observe a real-world signal x[n] that has a spectrum X(ω) (the spectrum can be arbitrary – bandpass, baseband, etc.). We would like to describe the long sequence of x[n] using very few parameters (application: Linear Predictive Coding (LPC)). The modeling approach described here tries to answer the following two questions.

  • Is it possible to model the first order (mean/variance) and second order (correlations, spectrum) statistics of the signal just by shaping a white noise spectrum using a transfer function? (see Figure 1)
  • Does this produce the same statistics (spectrum, correlations, mean and variance) for a white noise input ?

If the answer is “yes” to the above two questions, we can simply set the modeled parameters of the system and excite it with white (flat-spectrum) noise to produce the desired real-world signal. This reduces the amount of data we need to transmit in a communication system application.

Figure 1: Shaping a white noise spectrum (flat spectrum) to achieve desired spectrum
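To illustrate the idea in Figure 1, the sketch below (an illustrative construction with arbitrarily chosen filter coefficients B and A, not taken from the article) shapes white noise with an LTI filter and compares the estimated output PSD against the theoretical shape σ²|H(e^jω)|²:

% Sketch: shape white noise with an LTI filter and compare estimated vs. theoretical PSD
sigma2 = 1;                          % variance of the white noise source
w = sqrt(sigma2)*randn(10000,1);     % white Gaussian noise
B = 1; A = [1 -0.7];                 % arbitrarily chosen stable all-pole filter (assumption)
x = filter(B,A,w);                   % shaped output process

[Pxx,f] = pwelch(x,[],[],[],1);      % one-sided PSD estimate of the output (Fs = 1)
[H,fh]  = freqz(B,A,length(f),1);    % filter frequency response on a similar grid
figure;
plot(f,10*log10(Pxx)); hold on;
plot(fh,10*log10(2*sigma2*abs(H).^2),'r--'); hold off; % factor 2: one-sided PSD
xlabel('Normalized frequency'); ylabel('PSD (dB)');
legend('Estimated PSD of x[n]','\sigma^2 |H(e^{j\omega})|^2 (one-sided)');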

LTI system model

In the model given below, the random signal x[n] is observed. Given the observed signal x[n], the goal here is to find a model that best describes the spectral properties of x[n] under the following assumptions:

● x[n] is WSS (Wide Sense Stationary) and ergodic.
● The input signal to the LTI system is white noise following a Gaussian distribution – zero mean and variance σ².
● The LTI system is BIBO (Bounded Input Bounded Output) stable.

Figure 2: Linear Time Invariant (LTI) system – signal model

In the model shown above, the input to the LTI system is white noise following a Gaussian distribution with zero mean and variance σ². The power spectral density (PSD) of the noise w[n] is flat across all frequencies:

S_w(e^jω) = σ²

The noise process drives the LTI system with frequency response H(e^jω), producing the signal of interest x[n]. The PSD of the output process is therefore

S_x(e^jω) = σ² |H(e^jω)|²

Three cases are possible, given the nature of the transfer function of the LTI system under investigation here.

  • Auto Regressive (AR) models: H(e^jω) is an all-pole system
  • Moving Average (MA) models: H(e^jω) is an all-zero system
  • Auto Regressive Moving Average (ARMA) models: H(e^jω) is a pole-zero system

Auto Regressive (AR) models (all-poles model)

In the AR model, the present output sample x[n] and the past N output samples determine the source input w[n]. The difference equation that characterizes this model is given by

w[n] = x[n] + a1 x[n-1] + a2 x[n-2] + … + aN x[n-N]

Here, the LTI system is an Infinite Impulse Response (IIR) filter. This is evident from the fact that the above equation considers past samples of x[n] when determining w[n], thereby creating a feedback loop from the output of the filter.

The frequency response of the IIR filter is well known:

H(e^jω) = 1 / (1 + a1 e^-jω + a2 e^-j2ω + … + aN e^-jNω)

Figure 3: Spectrum of all-pole transfer function (representing AR model)

The transfer function H(e^jω) is an all-pole transfer function (when the denominator goes to zero, the transfer function goes to infinity, creating peaks in the spectrum). Poles are best suited to model resonant peaks in a given spectrum; at the peaks, the poles are close to the unit circle. This model is well suited for modeling peaky spectra.
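A small sketch (illustrative values chosen here, not from the article) showing how the pole radius of an AR(2) system controls the sharpness of the resonant peak; the closer the poles are to the unit circle, the sharper the peak:

% Sketch: AR(2) magnitude response for poles at r*exp(+/-j*theta), for different pole radii
theta = pi/4;                        % pole angle = resonant frequency (arbitrary choice)
figure; hold on;
for r = [0.7 0.9 0.98]
    A = [1 -2*r*cos(theta) r^2];     % denominator for a complex-conjugate pole pair
    [H,w] = freqz(1,A,512);
    plot(w/pi,20*log10(abs(H)));
end
hold off; xlabel('\omega/\pi'); ylabel('|H(e^{j\omega})| (dB)');
legend('r = 0.7','r = 0.9','r = 0.98');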

Read all articles tagged Auto-regressive model.

Moving Average (MA) models (all-zeros model)

In the MA model, the present output sample x[n] is determined by the present source input w[n] and the past N samples of the source input. The difference equation that characterizes this model is given by

x[n] = b0 w[n] + b1 w[n-1] + … + bN w[n-N]

Here, the LTI system is a Finite Impulse Response (FIR) filter. This is evident from the fact that no feedback from output to input is involved in the above equation.

The frequency response of the FIR filter is well known:

H(e^jω) = b0 + b1 e^-jω + … + bN e^-jNω

The transfer function H(e^jω) is an all-zero transfer function (when the numerator goes to zero, the transfer function goes to zero, creating nulls in the spectrum). Zeros are best suited to model sharp nulls in a given spectrum.

Figure 4: Spectrum of all-zeros transfer function (representing MA model)
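Similarly, a quick sketch (again with illustrative values) of an all-zero response whose zeros sit on the unit circle, producing a null at the corresponding frequency:

% Sketch: MA (all-zero) magnitude response with a conjugate zero pair on the unit circle
theta = pi/3;                        % null frequency (arbitrary choice)
B = [1 -2*cos(theta) 1];             % zeros at exp(+/- j*theta)
[H,w] = freqz(B,1,512);
figure; plot(w/pi,20*log10(abs(H)+eps));   % eps avoids log(0) exactly at the null
xlabel('\omega/\pi'); ylabel('|H(e^{j\omega})| (dB)');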

Auto Regressive Moving Average (ARMA) model (pole-zero model)

The ARMA model is a generalized model that is a combination of the AR and MA models. The output of the filter is a linear combination of both weighted inputs (present and past samples) and weighted outputs (past samples). The difference equation that characterizes this model is given by

x[n] + a1 x[n-1] + … + aN x[n-N] = b0 w[n] + b1 w[n-1] + … + bM w[n-M]

The frequency response of this generalized filter is well known:

H(e^jω) = (b0 + b1 e^-jω + … + bM e^-jMω) / (1 + a1 e^-jω + … + aN e^-jNω)

The transfer function H(e^jω) is a pole-zero transfer function. It is best suited for modeling complex spectra having well-defined resonant peaks and nulls.
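Finally, a pole-zero sketch (illustrative coefficients of my own choosing) combining both behaviours – a resonant peak from the poles and a null from the zeros – in one ARMA transfer function:

% Sketch: ARMA (pole-zero) magnitude response with a null at pi/3 and a peak near pi/8
B = [1 -2*cos(pi/3) 1];              % zeros on the unit circle at +/- pi/3
A = [1 -2*0.95*cos(pi/8) 0.95^2];    % poles at 0.95*exp(+/- j*pi/8)
[H,w] = freqz(B,A,512);
figure; plot(w/pi,20*log10(abs(H)+eps));
xlabel('\omega/\pi'); ylabel('|H(e^{j\omega})| (dB)');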

Next post: Comparing AR and ARMA model – minimization of squared error
