Convolutional Codes – sliding window and shift register

Key focus: Convolutional codes are error-correcting codes that are effective on noisy channels and against burst errors. Learn about the sliding window and shift registers, and how they are used in a convolutional encoder.

Linear block codes

Linear block codes, like Hamming codes, are mainly decoded using hard decision decoding, exploiting the inherent algebraic structure of the code based on finite field properties.

In hard decision decoding of these codes, the combination of the binary modulator, the waveform channel, and the optimal binary detector behaves like a binary symmetric channel.

The decoder’s task for these codes is to find the codeword with the smallest Hamming distance to the output of the binary symmetric channel. When designing effective linear block codes, the primary aim is to find the code with the largest minimum distance for the given values of \(n\) and \(k\), where \(n\) is the number of output bits and \(k\) is the number of input bits.

As we saw, linear block codes are easily described in terms of algebraic equations. Convolutional codes are another class of error-correcting codes that are described using graphs and trellises.

Convolutional codes

Convolutional codes are a type of error-correcting code used in digital communication and data storage systems to detect and correct errors that may occur during transmission or storage. They are particularly well-suited for channels that introduce burst errors, such as noisy communication channels or storage media with localized errors.

Unlike linear block codes, where both the message bits and the parity bits are transmitted, in the convolutional coding scheme we do not send the message bits; we send only the parity bits. Which message bits participate in generating the parity bits is determined by a sliding window.

Convolutional codes are defined by a set of shift registers and associated feedback connections. The fundamental principle is to encode the input data through a sliding bit window (Figure 1), using bitwise operations to generate the encoded output. The encoded bits are the parity bits: each is a linear combination of the current input bit and the previous input bits held in the shift registers.

Let’s see some definitions with respect to convolutional codes:

Rate: Convolutional codes are often described by their rate, which is the ratio of the number of input bits to the number of output bits (encoded bits). A common notation for convolutional codes is \((n, k, m)\), where \(n\) is the number of output bits (equivalently, the number of generator polynomials), \(k\) is the number of input bits (equivalently, the number of bits shifted into the shift register at a time), and \(m\) is the number of shift registers. The code rate is given by \(R_c = k/n\).

Constraint Length: The constraint length of a convolutional code is the number of shift registers in the encoder. It determines the memory of the code and affects its error-correction capability. The larger the constraint length, the greater the redundancy, which may provide better error-correction capability.

Generator Polynomials: The feedback connections and the bitwise operations performed on the shift registers are defined by generator polynomials. These polynomials determine the encoding process and the redundancy (parity bits) introduced into the data. The code rate is simply \(R_c = k/n\), where \(n\) is the number of generator polynomials in the encoder structure and \(k\) is the number of input bits processed at a time. While more generator polynomials may improve the error-correction capability of the code, they effectively decrease the code rate (more overhead).

Encoding: Convolutional encoding involves shifting the input bits through the shift registers and applying bitwise operations based on the generator polynomials. The encoded bits are parity bits. In the convolutional encoding scheme, we transmit only the parity bits and not the message itself. If we use multiple generator polynomials, we interleave the output bits of each generator and serialize them when transmitting.

Decoding: At the receiver end, a decoding algorithm (such as the Viterbi algorithm) is used to analyze the received bits and recover the original data. The decoding process tries to find the most likely sequence of input bits given the received encoded bits and the code’s properties. We will come to this process in a later post.

Figure 1 shows an \(n=2, k=1, m=3\) convolutional encoder. The symbol D represents the delay as the input bits are processed through the shift registers. The generator polynomials are specified as \(g_0 = [1,1,1]\) and \(g_1=[1,0,1]\). The generator polynomials connect the appropriate cells in the shift register and perform XOR (modulo-2) operations on the bits stored in those cells. This particular convolutional code generates 2 output bits for every input bit processed. The two output bit streams are designated as \(c_0\) and \(c_1\).

Since there are two generator polynomials, the code rate is \(R_c = 1/2\).

Given an input message \(x[n]\), the equations for generating the parity bit streams are

\[\begin{align} c_0[n] &= x[n] + x[n-1] + x[n-2] &\quad \text{(modulo-2)} \\ c_1[n] &= x[n] + x[n-2] &\quad \text{(modulo-2)} \end{align}\]
Figure 1: \(n=2, k=1, m=3\) Convolutional encoder

Figure 2 shows the sliding window representation of the shift register arrangement shown in Figure 1. It depicts a 3-bit (\(m\)) sliding window shift register that advances 1 bit (\(k\)) at a time. The number of output bits (\(n\)) depends on how these bits are processed using the generator polynomials.

Figure 2: Generating convolutional codes using sliding window operation and shift register

Sample encoding operation

Figure 3 visualizes the encoding process for an \(n=2, k=1, m=3\) convolutional code. The generator polynomials are specified as \(g_0 =[1,1,1]\) and \(g_1 = [1, 0, 1]\). The generator polynomials connect the appropriate cells in the shift register and perform XOR (modulo-2) operations on the bits in the shift register cells.

These two generator polynomials generate the corresponding output bits \(c_0\) and \(c_1\). The input message to encode is \([1,0,1,1,0,0]\). The output generated by this particular convolutional encoder is \([11,10,00,01,01,11]\).

Figure 3: Encoding process for convolutional codes
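
To make the encoding operation concrete, here is a minimal Python sketch of the sliding-window/shift-register encoder of Figure 1. The function name conv_encode and its interface are illustrative choices, not taken from any library; running it on the message \([1,0,1,1,0,0]\) reproduces the output listed above.

```python
import numpy as np

def conv_encode(msg, g_list, m):
    """Rate 1/n convolutional encoder with k = 1 input bit per step.

    msg    : list of message bits
    g_list : generator polynomials, e.g. [[1,1,1], [1,0,1]]
    m      : number of cells spanned by the sliding window
    """
    sr = np.zeros(m, dtype=int)          # shift register, all-zero initial state
    encoded = []
    for bit in msg:
        sr = np.roll(sr, 1)              # advance the window by one position
        sr[0] = bit                      # newest input bit x[n] enters the register
        # each generator polynomial taps the register cells and XORs them (mod-2 sum)
        encoded.append([int(np.sum(sr * np.array(g)) % 2) for g in g_list])
    return encoded

g0, g1 = [1, 1, 1], [1, 0, 1]            # generator polynomials of Figure 1
print(conv_encode([1, 0, 1, 1, 0, 0], [g0, g1], m=3))
# [[1, 1], [1, 0], [0, 0], [0, 1], [0, 1], [1, 1]]  ->  11 10 00 01 01 11
```

Each pair in the output corresponds to the interleaved bits \(c_0[n] c_1[n]\) that would be serialized for transmission.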

Encoder state machine

As illustrated in Figure 2, the constraint length \(m\) determines the number of memory cells in the shift register structure. The input bit and the \(m-1\) bits of the current state determine the encoder state on the next clock cycle.

Because the encoder’s output relies on both its input and current state, a state diagram is a more concise illustration than the trellis. Essentially, the state diagram is a graphical depiction of the encoder’s possible states and the possible transitions between these states.

From the state diagram, we can immediately recognize that the convolutional encoder is indeed a Markov chain.

To know more about Markov chains, refer here. Implementing a Markov chain in Python is discussed here.

Figure 4: Encoder state machine for convolutional code shown in Figure 1

Figure 4 depicts the state diagram for the convolutional encoder described in Figure 1. For a convolutional encoder with constraint length m, there will be \(2^{m-1}\) possible states. For this specific case, there are \(2^{3-1} = 4\) possible states.

The current state is represented by \(x[n-1]\) and \(x[n-2]\) (corresponding to the values of the memory cells \(D_1\) and \(D_2\)). The output of the encoder is determined by the current input bit \(x[n]\) and the current state. Since there are two generator polynomials, \(g_0\) and \(g_1\), they generate two outputs, \(c_0[n]\) and \(c_1[n]\), respectively. For an input bit \(x[n]\), the encoder transitions from the current state to the next state and emits \(c_0[n]\) and \(c_1[n]\) as output. This is depicted as \(x[n]/c_0[n]c_1[n]\).

Figure 5 shows the encoding process when the input message is \([1, 0, 1, 1, 0, 0]\). The encoder transitions through various states and generates the output \([11, 10, 00, 01, 01, 11]\).

Figure 5: Encoder state machine processing the input message [1,0,1,1,0,0]
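
The state-machine view can be captured in a few lines of Python as well. The sketch below (the helper name encode_via_state_machine is hypothetical) keeps only the \(m-1 = 2\) state bits \((x[n-1], x[n-2])\), prints each transition in the \(x[n]/c_0[n]c_1[n]\) notation of Figure 4, and reproduces the output sequence of Figure 5.

```python
def encode_via_state_machine(msg):
    """Walk the 4-state encoder of Figure 4; the state is (x[n-1], x[n-2])."""
    x1, x2 = 0, 0                                  # memory cells D1, D2 start at zero
    trace = []
    for x in msg:
        c0 = x ^ x1 ^ x2                           # g0 = [1,1,1]
        c1 = x ^ x2                                # g1 = [1,0,1]
        trace.append(((x1, x2), f"{x}/{c0}{c1}"))  # (current state, input/output label)
        x1, x2 = x, x1                             # move to the next state
    return trace

for state, label in encode_via_state_machine([1, 0, 1, 1, 0, 0]):
    print(f"state {state} --{label}-->")
# visits states (0,0) (1,0) (0,1) (1,0) (1,1) (0,1) and emits 11 10 00 01 01 11
```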

Hidden Markov Models (HMM) – Simplified !!!

Markov chains are useful in computing the probability of events that are observable. However, in many real-world applications, the events that we are interested in are usually hidden, that is, we don't observe them directly. These hidden events need to be inferred. For example, given a sentence in a natural language, we only observe the words and characters directly. The parts of speech of the sentence are hidden; they have to be inferred. This brings us to the following topic: the hidden Markov models.

Hidden Markov models enable us to model both the observations and the associated hidden events. Let's consider an example to understand the concept.

The cheating casino and the gullible gambler

Consider a dishonest casino that deceives its players by using two types of dice: a fair die (F) and a loaded die (L). For the fair die, each of the faces has the same probability of landing face up. For the loaded die, the probabilities of the faces are skewed as given next.

When the gambler throws the die, a number lands face up. These are our observations at a given time \(t\) (denoted as \(O_t \in \{1,2,3,4,5,6\}\)). At any given time \(t\), whether the number was rolled with the fair die (state \(S_t = F\)) or the loaded die (\(S_t = L\)) is unknown to the observer, and therefore these are the hidden events.

Emission probabilities

The probabilities associated with the observations are the observation likelihoods, also called emission probabilities (B).

Initial probabilities

The initial probability of starting (at time \(t = 0\)) with either the fair die or the loaded die (hidden event) is 50%.

Transition probabilities

The cheating casino switches from the fair die to the loaded die with 10% probability, and switches back from the loaded die to the fair die with 5% probability.

The probability of transitioning from one hidden event to another is described by the transition probability matrix (A). The elements of the transition probability matrix are the transition probabilities \(p_{ij}\) of moving from hidden state \(i\) to hidden state \(j\).

The transition probabilities from time \(t-1\) to \(t\) for the hidden events are \(p(F \rightarrow F) = 0.90\), \(p(F \rightarrow L) = 0.10\), \(p(L \rightarrow F) = 0.05\) and \(p(L \rightarrow L) = 0.95\).

Therefore, with the states ordered as (F, L), the transition probability matrix is

\[A = \begin{bmatrix} 0.90 & 0.10 \\ 0.05 & 0.95 \end{bmatrix}\]

Based on the given information so far, a probability model is constructed. This is the Hidden Markov Model (HMM) for the given problem.

Figure 1: Hidden Markov Model for the cheating Casino problem
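
The casino model can be written down directly as a set of NumPy arrays. In the sketch below, the initial probabilities \(\pi\) and the transition matrix A follow from the numbers given above; the loaded-die row of the emission matrix B is an assumed skew (the post's emission table is not reproduced in this excerpt), used purely for illustration.

```python
import numpy as np

states = ["F", "L"]                  # hidden states: fair die, loaded die
faces = [1, 2, 3, 4, 5, 6]           # observable outcomes

# Initial probabilities (pi): 50% chance of starting with either die
pi = np.array([0.5, 0.5])

# Transition probabilities (A): rows = current state, columns = next state
# F -> L with 10% probability, L -> F with 5% probability
A = np.array([[0.90, 0.10],
              [0.05, 0.95]])

# Emission probabilities (B): rows = state, columns = faces 1..6
# The fair die is uniform; the loaded-die row is an assumed skew favouring a six.
B = np.array([[1/6, 1/6, 1/6, 1/6, 1/6, 1/6],
              [0.1, 0.1, 0.1, 0.1, 0.1, 0.5]])

assert np.allclose(A.sum(axis=1), 1) and np.allclose(B.sum(axis=1), 1)
```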

Assumptions

We saw in the previous article that Markov models come with assumptions. Similarly, HMMs also have such assumptions.

1. Assumption on probability of hidden states

In the model given here, the probability of a given hidden state depends only on the previous hidden state. This is a typical first order Markov chain assumption.

2. Assumption on Output

The probability of any observation (output) depends only on the hidden state that produces it, and not on any other hidden state or on other observations.

Problems and Algorithms

Let’s briefly discuss the different problems and the related algorithms for HMMs. The algorithms will be explained in detail in future articles.

In the dishonest casino, the gambler rolls the following numbers:

Figure 2: Sample Observations

1. Evaluation

Given the model of the dishonest casino, what is the probability of obtaining the above sequence? This is a typical evaluation problem in HMMs. The forward algorithm is applied for such evaluation problems.
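
As a preview of what the forward algorithm computes, here is a minimal sketch. The model arrays repeat the illustrative values from the sketch above (the loaded-die emissions remain an assumption), and the roll sequence passed to forward() is only an example, since the observations of Figure 2 are not reproduced here.

```python
import numpy as np

pi = np.array([0.5, 0.5])                                  # illustrative model
A = np.array([[0.90, 0.10], [0.05, 0.95]])
B = np.array([[1/6] * 6, [0.1, 0.1, 0.1, 0.1, 0.1, 0.5]])  # loaded-die row assumed

def forward(obs, A, B, pi):
    """Return P(obs | model) for a list of die faces (1..6)."""
    T, N = len(obs), len(pi)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0] - 1]                       # initialization
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t] - 1]   # induction step
    return alpha[-1].sum()                                 # termination

print(forward([6, 6, 1, 6, 3, 6], A, B, pi))               # example roll sequence
```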

2. Decoding

What is the most likely sequence of dice (hidden states) given the above sequence? Such problems are addressed by Viterbi decoding.

What is the probability that the fourth die was loaded, given the above sequence? The forward-backward algorithm comes to our rescue.
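
For completeness, here is a compact Viterbi sketch for the same illustrative model; it returns the most likely fair/loaded labelling of an example roll sequence. This is a generic textbook implementation, not code taken from the post, and it reuses the assumed emission values from the earlier sketches.

```python
import numpy as np

states = ["F", "L"]
pi = np.array([0.5, 0.5])
A = np.array([[0.90, 0.10], [0.05, 0.95]])
B = np.array([[1/6] * 6, [0.1, 0.1, 0.1, 0.1, 0.1, 0.5]])  # loaded-die row assumed

def viterbi(obs, A, B, pi):
    """Most likely hidden state sequence for a list of die faces (1..6)."""
    T, N = len(obs), len(pi)
    delta = np.zeros((T, N))                   # best path score ending in each state
    psi = np.zeros((T, N), dtype=int)          # back-pointers
    delta[0] = pi * B[:, obs[0] - 1]
    for t in range(1, T):
        scores = delta[t - 1][:, None] * A     # scores[i, j]: come from i, go to j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) * B[:, obs[t] - 1]
    path = [int(delta[-1].argmax())]           # best final state, then backtrack
    for t in range(T - 1, 0, -1):
        path.insert(0, int(psi[t][path[0]]))
    return [states[i] for i in path]

print(viterbi([6, 6, 1, 6, 3, 6], A, B, pi))
```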

3. Learning

Learning problems involve parametrization of the model. In learning problems, we attempt to find the various parameters of the HMM (transition probabilities, emission probabilities), given the observations. The Baum-Welch algorithm helps us find the unknown parameters of an HMM.

Some real-life examples

Here are some real-life examples of HMM applications:

  1. Speech recognition: HMMs are widely used in speech recognition systems to model the variability of speech sounds. In this application, the observable events are the acoustic features of the speech signal, while the hidden states represent the phonemes or words that generate the speech signal.
  2. Handwriting recognition: HMMs can be used to recognize handwritten characters by modeling the temporal variability of the pen strokes. In this application, the observable events are the coordinates of the pen on the writing surface, while the hidden states represent the letters or symbols that generate the handwriting.
  3. Stock price prediction: HMMs can be used to model the behavior of stock prices and predict future price movements. In this application, the observable events are the daily price movements, while the hidden states represent the different market conditions that generate the price movements.
  4. Gene prediction: HMMs can be used to identify genes in DNA sequences. In this application, the observable events are the nucleotides in the DNA sequence, while the hidden states represent the different regions of the genome that generate the sequence.
  5. Natural language processing: HMMs are used in many natural language processing tasks, such as part-of-speech tagging and named entity recognition. In these applications, the observable events are the words in the text, while the hidden states represent the grammatical structures or semantic categories that generate the text.
  6. Image and video analysis: HMMs can be used to analyze images and videos, such as for object recognition and tracking. In this application, the observable events are the pixel values in the image or video, while the hidden states represent the object or motion that generates the pixel values.
  7. Bio-signal analysis: HMMs can be used to analyze physiological signals, such as electroencephalograms (EEGs) and electrocardiograms (ECGs). In this application, the observable events are the signal measurements, while the hidden states represent the physiological states that generate the signal.
  8. Radar signal processing: HMMs can be used to process radar signals and detect targets in noisy environments. In this application, the observable events are the radar measurements, while the hidden states represent the presence or absence of targets.



Markov Chains – Simplified !!

Key focus: Markov chains are probabilistic models that describe a sequence of observations whose occurrences are statistically dependent only on the previous ones. Examples of such sequences include:

● Time-series data like speech, stock price movements.
● Words in a sentence.
● Base pairs on the rung of a DNA ladder.

States and transitions

Assume that we want to model the behavior of a driver behind the wheel. The possible behaviors are

● accelerate (state 1)
● constant speed (state 2)
● idling (engine running slowly but the vehicle is not moving) (state 3)
● brake (state 4)

Let’s refer to each of these behaviors as a state. In the given example, there are \(N=4\) states; we refer to them as \(Q = \{q_1, q_2, q_3, q_4\}\).

We observe the following pattern in the driver’s behavior (Figure 1). That is, the driver operates the vehicle through a certain sequence of states. In the graph shown in Figure 1, the states are represented as nodes and the transitions as edges.

Figure 1: Driver’s behavior – operating the vehicle through a sequence of states

We see that, sometimes, the driver changes the state of the vehicle from one state to another and sometimes stays in the same state (as indicated by the arrows).

We also note that the vehicle either stays in the same state or changes to the next state. Therefore, from this model, if we want to predict the future state, all that matters is the current state of the vehicle. The past states have no bearing on the future state except through the current state. Take note of this important assumption for now.

Probabilistic model

We know that we cannot be certain about the driver’s behavior at any given point in time. Therefore, to capture this uncertainty, the description is turned into a probabilistic model. A probabilistic model allows us to account for the likelihood of the behaviors or changes of state.

An example for a probabilistic model for the given problem is given in Figure 2.

Figure 2: Driver’s behavior – a probabilistic model (transition matrix shown)

In this probabilistic model, we have assigned probability values to the transitions. These probabilities are collectively called transition probabilities. For example, considering the state named “idling”, the probability of the car transitioning from this state to the “accelerate” state is 0.8. In probability mathematics, this is expressed as a conditional probability conditioned on the previous state.

p(state = “accelerate” | previous state = “idling” ) = 0.8

Usually, the transition probabilities are formulated in the form of a matrix called the transition probability matrix. The transition probability matrix is shown in Figure 2. In a transition matrix, denoted as \(A\), each element \(a_{ij}\) represents the probability of transitioning from state \(i\) to state \(j\). The elements of the transition matrix satisfy the following property.

\[\sum_{j=1}^{N} a_{ij} = 1, \quad i = 1, 2, \ldots, N\]

That is, the sum of transition probabilities leaving any given state is 1.

As we know, in this example, the driver cannot start the car in just any state (for example, it is impossible to start the car in the “constant speed” state). He can only start the car from rest (i.e., the brake state). To model this, we introduce \(\pi_i\), the probability that the Markov chain starts in a given state \(i\). The set of starting probabilities for all the \(N\) states is called the initial probability distribution (\(\pi = \pi_1, \pi_2, \ldots, \pi_N\)). In Figure 3, the starting probabilities are denoted by green arrows.

Figure 3: Markov Chain model for driver’s behavior
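
To see the model in action, here is a small Python sketch that samples a sequence of driving states from the chain. Only the idling-to-accelerate probability (0.8) is quoted in the text, so the remaining entries of the transition matrix below are assumed values chosen purely for illustration; the initial distribution places all of its mass on the brake state, as described above.

```python
import numpy as np

states = ["accelerate", "constant speed", "idling", "brake"]

# Transition probability matrix A (each row sums to 1). Apart from the quoted
# idling -> accelerate value of 0.8, all entries are assumed for illustration.
A = np.array([[0.5, 0.3, 0.0, 0.2],   # from accelerate
              [0.2, 0.5, 0.0, 0.3],   # from constant speed
              [0.8, 0.0, 0.2, 0.0],   # from idling
              [0.3, 0.0, 0.4, 0.3]])  # from brake

# Initial distribution pi: the car can only be started from the brake state
pi = np.array([0.0, 0.0, 0.0, 1.0])

rng = np.random.default_rng(42)

def sample_chain(A, pi, n_steps):
    """Sample a state sequence from a first-order Markov chain."""
    seq = [rng.choice(len(pi), p=pi)]                      # draw the starting state
    for _ in range(n_steps - 1):
        seq.append(rng.choice(len(pi), p=A[seq[-1]]))      # next state given current
    return [states[s] for s in seq]

assert np.allclose(A.sum(axis=1), 1.0)
print(sample_chain(A, pi, n_steps=10))
```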

Markov Assumption

As noted in the definition, the Markov chain in this example assumes that the occurrence of each event/observation is statistically dependent only on the previous one. This is a first-order Markov chain (also termed a bigram language model in natural language processing applications). For the states \(Q = \{q_1, \ldots, q_n\}\), predicting the probability of a future state depends only on the current observation; all other previous observations do not matter. In probabilistic terms, this first-order Markov chain assumption is denoted as

\[P(q_i \mid q_1, q_2, \ldots, q_{i-1}) = P(q_i \mid q_{i-1})\]

Extending the assumption to an mth-order Markov chain, predicting the probability of a future observation depends only on the previous \(m\) observations. This corresponds to an (m+1)-gram model.

Given a set of \(n\) arbitrary random variables/observations \(Q = \{q_1, \ldots, q_n\}\), their joint probability distribution is usually computed by applying the following chain rule.

\[P(q_1, q_2, \ldots, q_n) = P(q_1) \prod_{i=2}^{n} P(q_i \mid q_1, \ldots, q_{i-1})\]

However, if the random observations in \(Q\) are sequential in nature and follow the generic mth-order Markov chain model, then the computation of the joint probability gets simplified:

\[P(q_1, q_2, \ldots, q_n) = \prod_{i=1}^{n} P(q_i \mid q_{i-m}, \ldots, q_{i-1})\]

with the convention that conditioning variables with indices below 1 are omitted.

The Markov assumptions for first and second order Markov models are summarized in Figure 4.

Figure 4: Assumptions for 1st order and 2nd order Markov chains

Hidden Markov Model (HMM)

Markov chains are useful in computing the probability of events that are observable. However, in many real-world applications, the events that we are interested in are usually hidden, that is, we don't observe them directly. These hidden events need to be inferred. For example, given a sentence in a natural language, we only observe the words and characters directly. The parts of speech of the sentence are hidden; they have to be inferred. This brings us to the next topic of discussion: the hidden Markov models.
