Key focus: Euclidean & Hamming distances are used to measure similarity or dissimilarity between two sequences. Used in Soft & Hard decision decoding.
Distance is a measure that indicates either similarity or dissimilarity between two words. Given a pair of words a=(a0,a1, … ,an-1) and b=(b0,b1,…,bn-1) , there are variety of ways one can characterize the distance, d(a,b), between the two words. Euclidean and Hamming distances are familiar ones. Euclidean distance is extensively applied in analysis of convolutional codes and Trellis codes. Hamming distance is frequently encountered in the analysis of block codes.
In contrast to classical hard-decision decoders (see below) which operate on binary values, a soft-decision decoder directly processes the unquantized (or quantized in more than two levels in practice) samples at the output of the matched filter for each bit-period, thereby avoiding the loss of information.
If the outputs of the matched filter at each bit-period are unquantized or quantized in more than two levels, the demodulator is said to make soft-decisions. The process of decoding the soft-decision received sequence is called soft-decision decoding. Since the decoder uses the additional information contained in the multi-level quantized or unquantized received sequence, soft-decision decoding provides better performance compared to hard-decision decoding. For soft-decision decoding, metrics like likelihood function, Euclidean distance and correlation are used.
For illustration purposes, we consider the communication system model shown in Figure 1. A block encoder encodes the information blocks m=(m1,m2,…,mk) and generates the corresponding codeword vector c=(c1,c2,…,cn). The codewords are modulated and sent across an AWGN channel. The received signal is passed through a matched filter and the multi-level quantizer outputs the soft-decision vector r .
The goal of a decoder is to produce an estimate m̂ of the information sequence m based on the received sequence r. Equivalently, the information sequence m and the codeword c has one-to-one correspondence, the decoder can also produce an estimate ĉ of the codeword c. If the codeword c was transmitted, a decoding error occurs if ĉ ≠ c.
For equi-probable codewords, The decoder that selects a codeword that maximizes the conditional probability P(r, c). This is called a maximum lihelihood decoder (MLD).
For an AWGN channel with two-sided power spectral density N0/2, the conditional probability is given by
The sum is the squared Euclidean distance between the received sequence r and the coded signal sequence s. We can note that the term is common for all codewords and n is a constant. This simplifies the MLD decoding rule where we select a codeword from the code dictionary that minimizes the Euclidean distance D(r, s)$.
Hamming distance
Hamming distance between two words a=(a0,a1, … ,an-1) and b=(b0,b1,…,bn-1) in Galois FieldGF(2), is the number of coordinates in which the two blocks differ.
For example, the Hamming distance between (0,0,0,1) and (1,0,1,0) in GF(2) is 3, since they differ in three digits. For an independent and identically distributed (i.i.d) error model with (discrete) uniform error amplitude distribution, the most appropriate measure is Hamming distance.
Minimum distance
The minimum distance of block code C, is the smallest distance between all distance pairs of code words in C. The minimum distance of a block code determines both its error-detecting ability and error-correcting ability. A large minimum distance guarantees reliability against random errors. The general relationship between a block code’s minimum distance and the error-detecting and error correcting capability is as follows.
● If dmin is the minimum Hamming distance of a block code, the code is guaranteed to detect up to e=dmin-1 errors. Consequently, let c1 and c2 be the two closest codewords in the codeword dictionary C. If c1 was transmitted and c2 is received, the error is undetectable.
● If dmin is the minimum Hamming distance of a block code and if the optimal decoding procedure of nearest-neighbor decoding is used at the receiver, the code is guaranteed to correct up to t=(dmin-1 )/2 errors.
Sub-optimal hard decision decoding
In soft-decision decoding, the bit samples to the decoder are eitherunquantized or quantized to multi-levels and the maximum likelihood decoder (MLD) needs to compute M correlation metrics, where M is the number of codewords in the codeword dictionary. Although this provides the best performance, the computational complexity of the decoder increases when the number of codewords M becomes large. To reduce the computational burden, the output of the matched filter at each bit-period can be quantized to only two levels, denoted as 0 and 1, that results in a hard-decision binary sequence. Then, the decoder processes this hard-decision sequence based on a specific decoding algorithm. This type of decoding, illustrated in Figure 1, is called hard-decision decoding.
The hard-decision decoding methods use Hamming distance metric to decode the hard-decision received sequence to the closest codeword. The objective of such decoding methods is to choose a codeword that provides the minimum Hamming distance with respect to the hard-decision received sequence. Since the hard-decision samples are only quantized to two levels, resulting in loss of information, the hard-decision decoding results in performance degradation when compared to soft-decision decoding.
Decoding using standard array and syndrome decoding are popular hard-decision decoding methods encountered in practice.
Rate this article: Note: There is a rating embedded within this post, please visit this post to rate it.
Hard decision decoding and soft decision decoding are two different methods used for decoding error-correcting codes.
With hard decision decoding, the received signal is compared to a set threshold value to determine whether the transmitted bit is a 0 or a 1. This is commonly used in digital communication systems that experience noise or interference, resulting in a low signal-to-noise ratio.
Soft decision decoding, on the other hand, treats the received signal as a probability distribution and calculates the likelihood of each possible transmitted bit based on the characteristics of the received signal. This approach is often used in modern digital communication and data storage systems where the signal-to-noise ratio is relatively high and there is a need for higher accuracy and reliability.
While soft decision decoding can achieve better error correction, it is more complex and computationally expensive than hard decision decoding.
More details
Let’s expatiate on the concepts of hard decision and soft decision decoding. Consider a simple even parity encoder given below.
Input Bit 1
Input Bit 2
Parity bit added by encoder
Codeword Generated
0
0
0
000
0
1
1
011
1
0
1
101
1
1
0
110
The set of all possible codewords generated by the encoder are 000,011,101 and 110.
Lets say we are want to transmit the message “01” through a communication channel.
Hard decision decoding
Case 1 : Assume that our communication model consists of a parity encoder, communication channel (attenuates the data randomly) and a hard decision decoder
The message bits “01” are applied to the parity encoder and we get “011” as the output codeword.
The output codeword “011” is transmitted through the channel. “0” is transmitted as “0 Volt and “1” as “1 Volt”. The channel attenuates the signal that is being transmitted and the receiver sees a distorted waveform ( “Red color waveform”). The hard decision decoder makes a decision based on the threshold voltage. In our case the threshold voltage is chosen as 0.5 Volt ( midway between “0” and “1” Volt ) . At each sampling instant in the receiver (as shown in the figure above) the hard decision detector determines the state of the bit to be “0” if the voltage level falls below the threshold and “1” if the voltage level is above the threshold. Therefore, the output of the hard decision block is “001”. Perhaps this “001” output is not a valid codeword ( compare this with the all possible codewords given in the table above) , which implies that the message bits cannot be recovered properly. The decoder compares the output of the hard decision block with the all possible codewords and computes the minimum Hamming distance for each case (as illustrated in the table below).
All possible Codewords
Hard decision output
Hamming distance
000
001
1
011
001
1
101
001
1
110
001
3
The decoder’s job is to choose a valid codeword which has the minimum Hamming distance. In our case, the minimum Hamming distance is “1” and there are 3 valid codewords with this distance. The decoder may choose any of the three possibility and the probability of getting the correct codeword (“001” – this is what we transmitted) is always 1/3. So when the hard decision decoding is employed the probability of recovering our data ( in this particular case) is 1/3. Lets see what “Soft decision decoding” offers …
Soft Decision Decoding
The difference between hard and soft decision decoder is as follows
In Hard decision decoding, the received codeword is compared with the all possible codewords and the codeword which gives the minimum Hamming distanceis selected
In Soft decision decoding, the received codeword is compared with the all possible codewords and the codeword which gives the minimum Euclidean distance is selected. Thus the soft decision decoding improves the decision making process by supplying additional reliability information ( calculated Euclidean distance or calculated log-likelihood ratio)
For the same encoder and channel combination lets see the effect of replacing the hard decision block with a soft decision block.
Voltage levels of the received signal at each sampling instant are shown in the figure. The soft decision block calculates the Euclidean distance between the received signal and the all possible codewords.
Valid codewords
Voltage levels at each sampling instant of received waveform
Euclidean distance calculation
Euclidean distance
0 0 0
( 0V 0V 0V )
0.2V 0.4V 0.7V
(0-0.2)2+ (0-0.4)2+ (0-0.7)2
0.69
0 1 1
( 0V 1V 1V )
0.2V 0.4V 0.7V
(0-0.2)2+ (1-0.4)2+ (1-0.7)2
0.49
1 0 1
( 1V 0V 1V )
0.2V 0.4V 0.7V
(1-0.2)2+ (0-0.4)2+ (1-0.7)2
0.89
1 1 0
( 1V 1V 0V )
0.2V 0.4V 0.7V
(1-0.2)2+ (1-0.4)2+ (0-0.7)2
1.49
The minimum Euclidean distance is “0.49” corresponding to “0 1 1” codeword (which is what we transmitted). The decoder selects this codeword as the output. Even though the parity encoder cannot correct errors, the soft decision scheme helped in recovering the data in this case. This fact delineates the improvement that will be seen when this soft decision scheme is used in combination with forward error correcting (FEC) schemes like convolution codes , LDPC etc
From this illustration we can understand that the soft decision decoders uses all of the information ( voltage levels in this case) in the process of decision making whereas the hard decision decoders does not fully utilize the information available in the received signal (evident from calculating Hamming distance just by comparing the signal level with the threshold whereby neglecting the actual voltage levels).
Note: This is just to illustrate the concept of Soft decision and Hard decision decoding. Prudent souls will be quick enough to find that the parity code example will fail for other voltage levels (e.g. : 0.2V , 0.4 V and 0.6V) . This is because the parity encoders are not capable of correcting errors but are capable of detecting single bit errors.
Soft decision decoding scheme is often realized using Viterbi decoders. Such decoders utilize Soft Output Viterbi Algorithm (SOVA) which takes into account the apriori probabilities of the input symbols producing a soft output indicating the reliability of the decision.
Rate this article: Note: There is a rating embedded within this post, please visit this post to rate it.
Keywords: Hamming code, error-correction code, digital communication, data storage, reliable transmission, computer memory systems, satellite communication systems, single-bit error, two-bit errors.
What is a Hamming Code
Hamming codes are a class of error-correcting codes that are commonly employed in digital communication and data storage systems to detect and correct errors that may occur during transmission or storage. They were created by Richard Hamming in the 1950s and bear his name.
The central concept of Hamming codes is to introduce additional (redundant) bits to a message in order to enable the identification and correction of errors. By appending parity bits to the original message, Hamming codes can identify and correct single-bit errors.
One notable characteristic of these codes is their ability to correct any single-bit error and detect any two-bit error, which has contributed to their widespread usage in computer memory systems, satellite communication systems, and other domains where reliable data transmission is crucial.
Technical details of Hamming code
Linear binary Hamming code falls under the category of linear block codes that can correct single bit errors. For every integer p ≥ 3 (the number of parity bits), there is a (2p-1, 2p-p-1) Hamming code. Here, 2p-1 is the number of symbols in the encoded codeword and 2p-p-1 is the number of information symbols the encoder can accept at a time. All such Hamming codes have a minimum Hamming distance dmin=3 and thus they can correct any single bit error and detect any two bit errors in the received vector. The characteristics of a generic (n,k) Hamming code is given below.
\[\begin{aligned} \text{Codeword length:} \quad && n &= 2^p-1 \\ \text{Number of information symbols:} \quad && k &= 2^p-p-1 \\ \text{Number of parity symbols:} \quad && n-k &= p \\ \text{Minimum distance:} \quad && d_{min} &= 3 \\ \text{Error correcting capability:} \quad && t &=1 \end{aligned}\]
With the simplest configuration: p=3, we get the most basic (7, 4) binary Hamming code. The (7,4) binary Hamming block encoder accepts blocks of 4-bit of information, adds 3 parity bits to each such block and produces 7-bits wide Hamming coded blocks.
Systematic & Non-systematic encoding
Block codes like Hamming codes are also classified into two categories that differ in terms of structure of the encoder output:
● Systematic encoding ● Non-systematic encoding
In systematic encoding, just by seeing the output of an encoder, we can separate the data and the redundant bits (also called parity bits). In the non-systematic encoding, the redundant bits and data bits are interspersed.
Constructing (7,4) Hamming code
Hamming codes can be implemented in systematic or non-systematic form. A systematic linear block code can be converted to non-systematic form by elementary matrix transformations. A non-systematic Hamming code is described next.
Let a codeword belonging to (7, 4) Hamming code be represented by [D7,D6,D5,P4,D3,P2,P1], where D represents information bits and P represents parity bits at respective bit positions. The subscripts indicate the left to right position taken by the data and the parity bits. We note that the parity bits are located at position that are powers of two (bit positions 1,2,4).
Now, represent the bit positions in binary.
Seeing from left to right, the first parity bit (P1) covers the bits at positions whose binary representation has 1 at the least significant bit. We find that P1 covers the following bit positions
Similarly, the second parity bit (P2) covers the bits at positions whose binary representation has 1 at the second least significant bit. Hence, P2covers the following bit positions.
Finally, the third parity bit (P4) covers the bits at positions whose binary representation has 1 at the most significant bit. Hence, P4 covers the following bit positions
If we follow even parity scheme for parity bits, the number of 1’s covered by the parity bits must add up to an even number. Which implies that the XOR of bits covered by the parity (including the parity bits) must result in 0. Therefore, the following equations hold.
Following table illustrates the concept of constructing the Hamming code as described by R.W Hamming in his groundbreaking paper [1].
We note that the parity bits and data columns are interspersed. This is an example of non-systematic Hamming code structure. We can continue our work on the table above as it is. Or, we can also re-arrange the entries of that table using elemental transformations, such that a systematic Hamming code is rendered.
After the re-arrangement of columns, we see that the parity columns are nicely clubbed together at the end. We can also drop the subscripts given to the parity/data locations and re-index them according to our convenience. This gives the following structure to the (7,4) Hamming code.
We will use the above systematic structure in the following discussion.
Encoding process
Given the structure in Figure 3, the parity bits are calculated from the following linearly independent equations using modulo-2 additions.
At the transmitter side, a Hamming encoder implements a generator matrix – . It is easier to construct the generator matrix from the linear equations listed in equation above. The linear equations show that the information bit D1 influences the calculation of parities at P1 and P3 . Similarly, the information bit D2 influences P1 and P2, D3 influences P1, P2 & P3 and D4 influences P2 & P3.
Represented as matrix operations, the encoder accepts 4 bit message block \(\mathbf{m}\), multiplies it with the generator matrix \(\mathbf{G}\) and generates 7 bit codewords \(\mathbf{c}\). Note that all the operations (addition, multiplication etc.,) are in modulo-2 domain.
\[\mathbf{c}= \mathbf{m} \mathbf{G} \]
Given a generator matrix, the Matlab code snippet for generating a codebook containing all possible codewords (\(\mathbf{c} \in \mathbf{C}\)) is given below. The resulting codebook \(\mathbf{C}\) can be used as a Look-Up-Table (LUT) when implementing the encoder. This implementation will avoid repeated multiplication of the input blocks and the generator matrix. The list of all possible codewords for the generator matrix (\(\mathbf{G}\)) given above are listed in table 2.
Program: Generating all possible codewords from Generator matrix
%program to generate all possible codewords for (7,4) Hamming code
G=[ 1 0 0 0 1 0 1;
0 1 0 0 1 1 0;
0 0 1 0 1 1 1;
0 0 0 1 0 1 1];%generator matrix - (7,4) Hamming code
m=de2bi(0:1:15,'left-msb');%list of all numbers from 0 to 2ˆk
codebook=mod(m*G,2) %list of all possible codewords
Decoding process – Syndrome decoding
The three check equations for the given generator matrix (\(\mathbf{G}\)) for the sample (7,4) Hamming code, can be expressed collectively as a parity check matrix – \(\mathbf{H}\). Parity check matrix finds its usefulness in the receiver side for error-detection and error-correction.
According to parity-check theorem, for every generator matrix G, there exists a parity-check matrix H, that spans the null-space of G. Therefore, if c is a valid codeword, then it will be orthogonal to each row of H.
\[\mathbf{c} \mathbf{H}^T = 0 \]
Therefore, if \(\mathbf{H}\) is the parity-check matrix for a codebook \(\mathbf{C}\), then a vector \(\mathbf{c}\) in the received code space is a valid codeword if and only if it satisfies \(\mathbf{c} \mathbf{H}^T=0\).
Consider a vector of received word \(\mathbf{r}=\mathbf{c}+\mathbf{e}\), where \(\mathbf{c}\) is a valid codeword transmitted and \(\mathbf{e}\) is the error introduced by the channel. The matrix product \(\mathbf{r}\mathbf{H}^T\) is defined as the syndrome for the received vector \(\mathbf{r}\), which can be thought of as a linear transformation whose null space is \(\mathbf{C}\) [2].
Thus, the syndrome is independent of the transmitted codeword \(\mathbf{c}\) and is solely a function of the error pattern \(\mathbf{e}\). It can be determined that if two error vectors \(\mathbf{e}\) and \(\mathbf{e}’\) have the same syndrome, then the error vectors must differ by a nonzero codeword.
It follows from the equation above, that decoding can be performed by computing the syndrome of the received word, finding the corresponding error pattern and subtracting (equivalent to addition in \(GF(2)\) domain) the error pattern from the received word. This obviates the need to store all the vectors as in a standard array decoding and greatly reduces the memory requirements for implementing the decoder.
Following is the syndrome table for the (7,4) Hamming code example, illustrated here.
Some properties of generator and parity-check matrices
The generator matrix \(\mathbf{G}\) and the parity-check matrix \(\mathbf{H}\) satisfy the following property
\[\mathbf{G} \mathbf{H}^T = 0\]
Note that the generator matrix is in standard form where the elements are partitioned as
\[\mathbf{G} = \begin{bmatrix} I_k \mid P \end{bmatrix} \]
whereIk is a k⨉k identity matrix and P is of dimension k ⨉ (n-k). When G is a standard form matrix, the corresponding parity-check matrix H can be easily determined as
\[\mathbf{H} =[−P^T∣I_{n−k}]\]
In Galois Field – GF(2), the negation of a number is simply its absolute value. Hence the H matrix for binary codes can be simply written as
This website uses cookies to improve your experience while you navigate through the website. Out of these, the cookies that are categorized as necessary are stored on your browser as they are essential for the working of basic functionalities of the website. We also use third-party cookies that help us analyze and understand how you use this website. These cookies will be stored in your browser only with your consent. You also have the option to opt-out of these cookies. But opting out of some of these cookies may affect your browsing experience.
Necessary cookies are absolutely essential for the website to function properly. These cookies ensure basic functionalities and security features of the website, anonymously.
Cookie
Duration
Description
cookielawinfo-checbox-analytics
11 months
This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checbox-analytics
11 months
This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checbox-functional
11 months
The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checbox-functional
11 months
The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checbox-others
11 months
This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checbox-others
11 months
This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-necessary
11 months
This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-performance
11 months
This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy
11 months
The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.
Functional cookies help to perform certain functionalities like sharing the content of the website on social media platforms, collect feedbacks, and other third-party features.
Performance cookies are used to understand and analyze the key performance indexes of the website which helps in delivering a better user experience for the visitors.
Analytical cookies are used to understand how visitors interact with the website. These cookies help provide information on metrics the number of visitors, bounce rate, traffic source, etc.