Euclidean and Hamming distances

Key focus: Euclidean and Hamming distances are used to measure similarity or dissimilarity between two sequences. They are used in soft-decision and hard-decision decoding respectively.

Distance is a measure that indicates either similarity or dissimilarity between two words. Given a pair of words a=(a0,a1,…,an-1) and b=(b0,b1,…,bn-1), there are a variety of ways one can characterize the distance, d(a,b), between the two words. Euclidean and Hamming distances are the familiar ones. Euclidean distance is extensively applied in the analysis of convolutional codes and Trellis codes, while Hamming distance is frequently encountered in the analysis of block codes.

This article is part of the book
Wireless Communication Systems in Matlab (second edition), ISBN: 979-8648350779 available in ebook (PDF) format and Paperback (hardcopy) format.

Euclidean distance

The Euclidean distance between the two words is defined as

\[d_{Euclidean}(\mathbf{a},\mathbf{b}) = \sqrt{(a_0-b_0)^2+(a_1-b_1)^2 + \cdots + (a_{n-1}-b_{n-1})^2}\]
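The definition translates directly into code. The book's examples are in Matlab; the following is only an equivalent Python sketch of the same computation:

```python
import math

def euclidean_distance(a, b):
    """Euclidean distance between two equal-length words a and b."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

# Example: two 4-dimensional words
d = euclidean_distance((0, 0, 0, 1), (1, 0, 1, 0))  # sqrt(1 + 0 + 1 + 1) = sqrt(3)
```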

Soft decision decoding

In contrast to classical hard-decision decoders (see below), which operate on binary values, a soft-decision decoder directly processes the unquantized samples (or, in practice, samples quantized to more than two levels) at the output of the matched filter for each bit-period, thereby avoiding loss of information.

If the outputs of the matched filter at each bit-period are unquantized or quantized in more than two levels, the demodulator is said to make soft-decisions. The process of decoding the soft-decision received sequence is called soft-decision decoding. Since the decoder uses the additional information contained in the multi-level quantized or unquantized received sequence, soft-decision decoding provides better performance compared to hard-decision decoding. For soft-decision decoding, metrics like likelihood function, Euclidean distance and correlation are used.

For illustration purposes, we consider the communication system model shown in Figure 1. A block encoder encodes the information blocks m=(m1,m2,…,mk) and generates the corresponding codeword vector c=(c1,c2,…,cn). The codewords are modulated and sent across an AWGN channel. The received signal is passed through a matched filter and the multi-level quantizer outputs the soft-decision vector r.

Figure 1: Soft-decision receiver model for decoding linear block codes for AWGN channel

The goal of a decoder is to produce an estimate of the information sequence m based on the received sequence r. Equivalently, since the information sequence m and the codeword c are in one-to-one correspondence, the decoder can instead produce an estimate ĉ of the codeword c. If the codeword c was transmitted, a decoding error occurs if ĉ ≠ c.

For equiprobable codewords, the optimal decoder selects the codeword that maximizes the conditional probability P(r | c). This is called a maximum likelihood decoder (MLD).

For an AWGN channel with two-sided power spectral density N0/2, the conditional probability is given by

\[P\left(\mathbf{r} | \mathbf{c}\right) = \left(\pi N_0\right)^{-n/2} \exp \left\lbrace - \frac{1}{N_0} \sum_{i=1}^{n} \left[r_i - s_i\right]^2 \right\rbrace\]

The sum \(D(\mathbf{r},\mathbf{s}) = \sum_{i=1}^{n} \left[r_i - s_i\right]^2\) is the squared Euclidean distance between the received sequence r and the coded signal sequence s. Note that the term \(\left(\pi N_0\right)^{-n/2}\) is common to all codewords and n is a constant, so maximizing \(P(\mathbf{r}|\mathbf{c})\) is equivalent to minimizing the Euclidean distance. This simplifies the MLD decoding rule: select the codeword from the code dictionary that minimizes \(D(\mathbf{r}, \mathbf{s})\).
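The minimum-Euclidean-distance rule can be sketched as a brute-force search over the code dictionary. This is an illustrative Python analogue (the book's code is in Matlab); the BPSK bit-to-signal mapping 0 → +1.0, 1 → −1.0 is an assumption for the example:

```python
def soft_decision_mld(r, codebook):
    """Maximum likelihood decoding over AWGN: pick the codeword whose
    modulated signal minimizes the squared Euclidean distance to r.
    Assumed BPSK mapping: bit 0 -> +1.0, bit 1 -> -1.0."""
    def signal(c):
        return [1.0 - 2.0 * bit for bit in c]

    def sq_dist(x, s):
        return sum((xi - si) ** 2 for xi, si in zip(x, s))

    return min(codebook, key=lambda c: sq_dist(r, signal(c)))

# Toy (3,1) repetition code; r is the matched filter output vector
best = soft_decision_mld([0.9, -0.2, 0.4], [(0, 0, 0), (1, 1, 1)])  # -> (0, 0, 0)
```

Note that one noisy sample (−0.2) does not flip the decision, because the decoder weighs the full real-valued vector rather than per-bit decisions.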

Hamming distance

The Hamming distance between two words a=(a0,a1,…,an-1) and b=(b0,b1,…,bn-1) in the Galois Field GF(2) is the number of coordinates in which the two words differ.

\[d_{Hamming} = d_H(\mathbf{a},\mathbf{b}) = \#\{j : a_j \neq b_j, j = 0,1,\cdots,n-1\}\]

For example, the Hamming distance between (0,0,0,1) and (1,0,1,0) in GF(2) is 3, since they differ in three digits. For an independent and identically distributed (i.i.d) error model with (discrete) uniform error amplitude distribution, the most appropriate measure is Hamming distance.
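The coordinate-counting definition is a one-liner in code. A Python sketch (the book itself uses Matlab):

```python
def hamming_distance(a, b):
    """Number of coordinate positions in which words a and b differ."""
    return sum(ai != bi for ai, bi in zip(a, b))

# The example from the text: the words differ in three digits
hamming_distance((0, 0, 0, 1), (1, 0, 1, 0))  # 3
```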

Minimum distance

The minimum distance of a block code C is the smallest Hamming distance between any pair of distinct codewords in C. The minimum distance of a block code determines both its error-detecting ability and error-correcting ability. A large minimum distance guarantees reliability against random errors. The general relationship between a block code’s minimum distance and its error-detecting and error-correcting capability is as follows.

● If dmin is the minimum Hamming distance of a block code, the code is guaranteed to detect up to e=dmin-1 errors. To see why, let c1 and c2 be the two closest codewords in the codeword dictionary C. If c1 was transmitted and dmin errors convert it into c2, the received word is itself a valid codeword and the error is undetectable.

● If dmin is the minimum Hamming distance of a block code and the optimal decoding procedure of nearest-neighbor decoding is used at the receiver, the code is guaranteed to correct up to t=⌊(dmin-1)/2⌋ errors.
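The minimum distance can be found by an exhaustive pairwise search over the code dictionary, and e and t follow from the bounds above. A Python sketch using a toy (3,1) repetition code as the example:

```python
from itertools import combinations

def hamming_distance(a, b):
    """Number of coordinate positions in which words a and b differ."""
    return sum(ai != bi for ai, bi in zip(a, b))

def min_distance(code):
    """Smallest pairwise Hamming distance over all distinct codeword pairs."""
    return min(hamming_distance(c1, c2) for c1, c2 in combinations(code, 2))

# Toy (3,1) repetition code: dmin = 3
code = [(0, 0, 0), (1, 1, 1)]
dmin = min_distance(code)
e = dmin - 1          # errors guaranteed detectable: 2
t = (dmin - 1) // 2   # errors guaranteed correctable: 1
```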

Sub-optimal hard-decision decoding

In soft-decision decoding, the bit samples to the decoder are either unquantized or quantized to multiple levels, and the maximum likelihood decoder (MLD) needs to compute M correlation metrics, where M is the number of codewords in the codeword dictionary. Although this provides the best performance, the computational complexity of the decoder grows when the number of codewords M becomes large. To reduce the computational burden, the output of the matched filter at each bit-period can be quantized to only two levels, denoted as 0 and 1, resulting in a hard-decision binary sequence. The decoder then processes this hard-decision sequence using a specific decoding algorithm. This type of decoding, illustrated in Figure 2, is called hard-decision decoding.

Figure 2: Hard-decision receiver model for decoding linear block codes for AWGN channel

Hard-decision decoding methods use the Hamming distance metric to decode the hard-decision received sequence to the closest codeword: the objective is to choose the codeword with the minimum Hamming distance from the hard-decision received sequence. Since the hard-decision samples are quantized to only two levels, information is lost, and hard-decision decoding suffers a performance degradation compared to soft-decision decoding.

Decoding using standard array and syndrome decoding are popular hard-decision decoding methods encountered in practice.
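For small codes, the minimum-Hamming-distance rule can itself be sketched as a brute-force nearest-neighbor search (standard array and syndrome decoding are structured shortcuts to the same answer). An illustrative Python sketch:

```python
def hard_decision_decode(r_hard, codebook):
    """Nearest-neighbor decoding: return the codeword with minimum
    Hamming distance to the hard-decision received sequence r_hard."""
    def hamming(a, b):
        return sum(ai != bi for ai, bi in zip(a, b))

    return min(codebook, key=lambda c: hamming(r_hard, c))

# (3,1) repetition code (t = 1) corrects the single bit error in (1, 0, 1)
hard_decision_decode((1, 0, 1), [(0, 0, 0), (1, 1, 1)])  # -> (1, 1, 1)
```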


