Hamming Code : construction, encoding & decoding

Keywords: Hamming code, error-correction code, digital communication, data storage, reliable transmission, computer memory systems, satellite communication systems, single-bit error, two-bit errors.

What is a Hamming Code

Hamming codes are a class of error-correcting codes that are commonly employed in digital communication and data storage systems to detect and correct errors that may occur during transmission or storage. They were created by Richard Hamming in the 1950s and bear his name.

The central concept of Hamming codes is to introduce additional (redundant) bits to a message in order to enable the identification and correction of errors. By appending parity bits to the original message, Hamming codes can identify and correct single-bit errors.

One notable characteristic of these codes is their ability to correct any single-bit error and detect any two-bit error, which has contributed to their widespread usage in computer memory systems, satellite communication systems, and other domains where reliable data transmission is crucial.

Technical details of Hamming code

Linear binary Hamming code falls under the category of linear block codes that can correct single bit errors. For every integer p ≥ 3 (the number of parity bits), there is a (2^p-1, 2^p-p-1) Hamming code. Here, 2^p-1 is the number of symbols in the encoded codeword and 2^p-p-1 is the number of information symbols the encoder can accept at a time. All such Hamming codes have a minimum Hamming distance d_min=3 and thus they can correct any single bit error and detect any two bit errors in the received vector. The characteristics of a generic (n,k) Hamming code is given below.

\[\begin{aligned} \text{Codeword length:} \quad && n &= 2^p-1 \\ \text{Number of information symbols:} \quad && k &= 2^p-p-1 \\ \text{Number of parity symbols:} \quad && n-k &= p \\ \text{Minimum distance:} \quad && d_{min} &= 3 \\ \text{Error correcting capability:} \quad && t &=1 \end{aligned}\]

With the simplest configuration: p=3, we get the most basic (7, 4) binary Hamming code. The (7,4) binary Hamming block encoder accepts blocks of 4-bit of information, adds 3 parity bits to each such block and produces 7-bits wide Hamming coded blocks.

Systematic & Non-systematic encoding

Block codes like Hamming codes are also classified into two categories that differ in terms of structure of the encoder output:

● Systematic encoding
● Non-systematic encoding

In systematic encoding, just by seeing the output of an encoder, we can separate the data and the redundant
bits (also called parity bits). In the non-systematic encoding, the redundant bits and data bits are interspersed.

Constructing (7,4) Hamming code

Hamming codes can be implemented in systematic or non-systematic form. A systematic linear block code can be converted to non-systematic form by elementary matrix transformations. A non-systematic Hamming code is described next.

[table “36” not found /]

Let a codeword belonging to (7, 4) Hamming code be represented by [D₇,D₆,D₅,P₄,D₃,P₂,P₁], where D represents information bits and P represents parity bits at respective bit positions. The subscripts indicate the left to right position taken by the data and the parity bits. We note that the parity bits are located at position that are powers of two (bit positions 1,2,4).

Now, represent the bit positions in binary.

Seeing from left to right, the first parity bit (P₁) covers the bits at positions whose binary representation has 1 at the least significant bit. We find that P₁ covers the following bit positions

\[\begin{aligned} \text{bit position in decimal} &: \text{in binary} \\ 1 &: 00 \textbf{1} \\ 3 &: 01 \textbf{1}\\ 5 &: 10 \textbf{1} \\ 7 &: 11 \textbf{1} \end{aligned} \]

Similarly, the second parity bit (P₂) covers the bits at positions whose binary representation has 1 at the second least significant bit. Hence, P₂ covers the following bit positions.

\[\begin{aligned} \text{bit position in decimal} &: \text{in binary} \\ 2 &: 0 \textbf{1} 0 \\ 3 &: 0 \textbf{1} 1 \\ 6 &: 1 \textbf{1} 0 \\ 7 &: 1 \textbf{1} 1 \end{aligned} \]

Finally, the third parity bit (P₄) covers the bits at positions whose binary representation has 1 at the most significant bit. Hence, P₄ covers the following bit positions

\[\begin{aligned} \text{bit position in decimal} &: \text{in binary} \\ 4 &: \textbf{1} 00 \\ 5 &: \textbf{1} 01 \\ 6 &: \textbf{1} 10 \\ 7 &: \textbf{1} 11 \end{aligned} \]

If we follow even parity scheme for parity bits, the number of 1’s covered by the parity bits must add up to an even number. Which implies that the XOR of bits covered by the parity (including the parity bits) must result in 0. Therefore, the following equations hold.

\[\begin{aligned} P_1 &= D_3 \oplus D_5 \oplus D_7 \\ P_2 &= D_3 \oplus D_6 \oplus D_7 \\ P_4 &= D_5 \oplus D_6 \oplus D_7 \end{aligned}\]

For clarity, let’s represent the subscripts in binary.

\[\begin{aligned} P_{00 \textbf{1}} &= D_{01 \textbf{1}} \oplus D_{10 \textbf{1}} \oplus D_{11 \textbf{1}} \\ P_{0 \textbf{1} 0} &= D_{0 \textbf{1} 1} \oplus D_{0 \textbf{1} 0} \oplus D_{1 \textbf{1} 1} \\ P_{\textbf{1} 00} &= D_{\textbf{1} 01} \oplus D_{\textbf{1} 10} \oplus D_{\textbf{1} 11} \end{aligned}\]

Following table illustrates the concept of constructing the Hamming code as described by R.W Hamming in his groundbreaking paper [1].

*Table 1: Construction of (7,4) binary Hamming code*

We note that the parity bits and data columns are interspersed. This is an example of non-systematic Hamming code structure. We can continue our work on the table above as it is. Or, we can also re-arrange the entries of that table using elemental transformations, such that a systematic Hamming code is rendered.

*Figure 2: Re-arranging Hamming code using transformation (non-systematic to systematic code)*

After the re-arrangement of columns, we see that the parity columns are nicely clubbed together at the end. We can also drop the subscripts given to the parity/data locations and re-index them according to our convenience. This gives the following structure to the (7,4) Hamming code.

*Figure 3: Example for Systematic Hamming code*

We will use the above systematic structure in the following discussion.

Encoding process

Given the structure in Figure 3, the parity bits are calculated from the following linearly independent equations using modulo-2 additions.

\[\begin{aligned} P_1 &= D_1 \oplus D_2 \oplus D_3 \\ P_2 &= D_2 \oplus D_3 \oplus D_4 \\ P_3 &= D_1 \oplus D_3 \oplus D_4 \end{aligned} \]

*Figure 4: Computing the parity bits for (7,4) Hamming code*

At the transmitter side, a Hamming encoder implements a generator matrix – $latex \mathbf{G}$. It is easier to construct the generator matrix from the linear equations listed in equation above. The linear equations show that the information bit D₁ influences the calculation of parities at P₁ and P₃ . Similarly, the information bit D₂ influences P₁ and P₂, D₃ influences P₁, P₂ & P₃ and D₄ influences P₂ & P₃.

$latex \mathbf{G} = \bordermatrix{ & D_1 & D_2 & D_3 & D_4 & P_1 & P_2 & P_3 \cr & 1 & 0 & 0 & 0 & 1 & 0 & 1 \cr & 0 & 1 & 0 & 0 & 1 & 1 & 0 \cr & 0 & 0 & 1 & 0 & 1 & 1 & 1 \cr & 0 & 0 & 0 & 1 & 0 & 1 & 1 } &s=2$

Represented as matrix operations, the encoder accepts 4 bit message block $\mathbf{m}$, multiplies it with the generator matrix $\mathbf{G}$ and generates 7 bit codewords $\mathbf{c}$. Note that all the operations (addition, multiplication etc.,) are in modulo-2 domain.

\[\mathbf{c}= \mathbf{m} \mathbf{G} \]

Given a generator matrix, the Matlab code snippet for generating a codebook containing all possible codewords ($\mathbf{c} \in \mathbf{C}$) is given below. The resulting codebook $\mathbf{C}$ can be used as a Look-Up-Table (LUT) when implementing the encoder. This implementation will avoid repeated multiplication of the input blocks and the generator matrix. The list of all possible codewords for the generator matrix ($\mathbf{G}$) given above are listed in table 2.

*Table 2: All possible codewords for (7,4) Hamming code*

Program: Generating all possible codewords from Generator matrix

%program to generate all possible codewords for (7,4) Hamming code
G=[ 1 0 0 0 1 0 1;
0 1 0 0 1 1 0;
0 0 1 0 1 1 1;
0 0 0 1 0 1 1];%generator matrix - (7,4) Hamming code
m=de2bi(0:1:15,'left-msb');%list of all numbers from 0 to 2ˆk
codebook=mod(m*G,2) %list of all possible codewords

Decoding process – Syndrome decoding

The three check equations for the given generator matrix ($\mathbf{G}$) for the sample (7,4) Hamming code, can be expressed collectively as a parity check matrix – $\mathbf{H}$. Parity check matrix finds its usefulness in the receiver side for error-detection and error-correction.

$latex \mathbf{H} = \bordermatrix{ & D_1 & D_2 & D_3 & D_4 & P_1 & P_2 & P_3 \cr & 1 & 1 & 1&0&1&0&0 \cr & 0&1&1&1&0&1&0 \cr & 1 & 0&1&1&0&0&1 } &s=2$

According to parity-check theorem, for every generator matrix G, there exists a parity-check matrix H, that spans the null-space of G. Therefore, if c is a valid codeword, then it will be orthogonal to each row of H.

\[\mathbf{c} \mathbf{H}^T = 0 \]

Therefore, if $\mathbf{H}$ is the parity-check matrix for a codebook $\mathbf{C}$, then a vector $\mathbf{c}$ in the received code space is a valid codeword if and only if it satisfies $\mathbf{c} \mathbf{H}^T=0$.

Consider a vector of received word $\mathbf{r}=\mathbf{c}+\mathbf{e}$, where $\mathbf{c}$ is a valid codeword transmitted and $\mathbf{e}$ is the error introduced by the channel. The matrix product $\mathbf{r}\mathbf{H}^T$ is defined as the syndrome for the received vector $\mathbf{r}$, which can be thought of as a linear transformation whose null space is $\mathbf{C}$ [2].

\[\begin{aligned} \mathbf{s} &= \mathbf{r}\mathbf{H}^T \\ &=\left(\mathbf{c}+\mathbf{e}\right)\mathbf{H}^T \\ &=\mathbf{c}\mathbf{H}^T +\mathbf{e}\mathbf{H}^T \\ &=\mathbf{0} +\mathbf{e}\mathbf{H}^T \\ &=\mathbf{e}\mathbf{H}^T \end{aligned}\]

Thus, the syndrome is independent of the transmitted codeword $\mathbf{c}$ and is solely a function of the error pattern $\mathbf{e}$. It can be determined that if two error vectors $\mathbf{e}$ and $\mathbf{e}’$ have the same syndrome, then the error vectors must differ by a nonzero codeword.

\[\begin{aligned} \mathbf{s} &= \mathbf{e}\mathbf{H}^T = \mathbf{e}’\mathbf{H}^T \\ & \Rightarrow \left(\mathbf{e} – \mathbf{e}’\right)\mathbf{H}^T = 0 \\ & \Rightarrow \left(\mathbf{e} – \mathbf{e}’\right) = \mathbf{c} \in \mathbf{C} \end{aligned}\]

It follows from the equation above, that decoding can be performed by computing the syndrome of the received word, finding the corresponding error pattern and subtracting (equivalent to addition in $GF(2)$ domain) the error pattern from the received word. This obviates the need to store all the vectors as in a standard array decoding and greatly reduces the memory requirements for implementing the decoder.

Following is the syndrome table for the (7,4) Hamming code example, illustrated here.

*Table 2: Syndrome table for (7,4) Hamming code*

Some properties of generator and parity-check matrices

The generator matrix $\mathbf{G}$ and the parity-check matrix $\mathbf{H}$ satisfy the following property

\[\mathbf{G} \mathbf{H}^T = 0\]

Note that the generator matrix is in standard form where the elements are partitioned as

\[\mathbf{G} = \begin{bmatrix} I_k \mid P \end{bmatrix} \]

where I_k is a k⨉k identity matrix and P is of dimension k ⨉ (n-k). When G is a standard form matrix, the corresponding parity-check matrix H can be easily determined as

\[\mathbf{H} =[−P^T∣I_{n−k}]\]

In Galois Field – GF(2), the negation of a number is simply its absolute value. Hence the H matrix for binary codes can be simply written as

\[\mathbf{H} = [P^T \; | \; I_{n-k}]\]

References

[1] R.W Hamming, “Error detecting and error correcting codes”, Bell System Technical Journal. 29 (2): 147–160, 1950.↗
[2] Stephen B. Wicker, “Error Control Systems for digital communication storage”, Prentice Hall, ISBN 0132008092, 1995.

Topics in this Chapter

[table “25” not found /]

Books by the author

[table “23” not found /]

3 thoughts on “Hamming Code : construction, encoding & decoding”

Mong Sim

March 15, 2020 at 4:34 am

Hello. I like your books and is planning to purchase all three. But when I worked out the problem by hand and refer to https://en.wikipedia.org/wiki/Hamming_code, there are some discrepancies. My work is in line with https://en.wikipedia.org/wiki/Hamming_code.

My method starts with a truth table to get the booleon equations, and the venn diagram, the generator matrix and the H matrix, etc..

Please comments on your approach.

Thank you

Mong
- Mathuranathan
  
  March 15, 2020 at 5:57 pm
  
  Thanks for commenting. My aim is not to repeat the theoretical approach that you find in authoritative texts on coding theory. Rather, this is an intuitive approach to understand the idea behind code construction.
Gloria

November 3, 2023 at 7:56 pm

Thanks!