Maximum Likelihood Decoding

1 Star2 Stars3 Stars4 Stars5 Stars (3 votes, average: 5.00 out of 5)
Having covered the techniques of hard and soft decision decoding , its time to illustrate the most important concept of Maximum Likelihood Decoding.

Maximum Likelihood Decoding:

Consider a set of possible codewords (valid codewords – set \(Y\)) generated by an encoder in the transmitter side. We pick one codeword out of this set ( call it \(y\) ) and transmit it via a Binary Symmetric Channel (BSC) with probability of error p ( To know what is a BSC – click here ). At the receiver side we receive the distorted version of \(y\) ( call this erroneous codeword \(x\)).

Maximum Likelihood Decoding chooses one codeword from \(Y\) (the list of all possible codewords) which maximizes the following probability.

$$\mathbb{P}(y\:sent\mid x\:received )$$

Meaning that the receiver computes \(P(y_1,x) , P(y_2,x) , P(y_3,x),\cdots,P(y_n,x)\). and chooses a codeword (\(y\)) which gives the maximum probability. 

In practice we don’t know Y (at the receiver) but we know \(x\). So how to compute the probability ? Maximum Likelihood Estimation (MLE) comes to our rescue. For a detailed explanation on MLE – refer here[1]
The aim of maximum likelihood estimation is to find the parameter value(s) that makes the observed data most likely. Understanding the difference between prediction and estimation is important at this point.
Estimation differs from prediction in the following way … In estimation problems, likelihood of the parameters is estimated based on given data/observation vector. In prediction problems, probability is used as a measure to predict the outcome from known parameters of a model.

Examples for “Prediction” and “Estimation” :

1) Probability of getting a “Head” in a single toss of a fair coin is \(0.5\). The coin is tossed 100 times in a row.Prediction helps in predicting the outcome ( head or tail ) of the \(101^{th}\) toss based on the probability.
2) A coin is tossed 100 times and the data ( head or tail information) is recorded. Assuming the event follows Binomial distribution model, estimation helps in determining the probability of the event. The actual probability may or may not be \(0.5\).
Maximum Likelihood Estimation estimates the conditional probability based on the observed data ( received data – \(x\)) and an assumed model.

Example of Maximum Likelihood Decoding:

Let \(y=11001001\) and \(x=10011001\) . Assuming Binomial distribution model for the event with probability of error \(0.1\) (i.e the reliability of the BSC is \(1-p = 0.9\)), the distance between codewords is \(y-x = 2\) . For binomial model,

$$\mathbb{P}(y\:received\mid x\:sent ) = (1-p)^{n-d}.p^{d}$$

where d=the hamming distance between the received and the sent codewords

n= number of bit sent

\(p\)= error probability of the BSC.
\(1-p\) = reliability of BSC

Substituting \(d=2, n=8\) and \(p=0.1\) , then \( P(y received \mid x sent) = 0.005314\).

Note : Here, Hamming distance is used to compute the probability. So the decoding can be called as “minimum distance decoding” (which minimizes the Hamming distance) or “maximum likelihood decoding”. Euclidean distance may also be used to compute the conditional probability.

As mentioned earlier, in practice \(y\) is not known at the receiver. Lets see how to estimate \(P(y received \mid x sent)\) when \(y\) is unknown based on the binomial model.

Since the receiver is unaware of the particular y corresponding to the x received, the receiver computes \(P(y received \mid x sent)\) for each codeword in \(Y\). The \(y\) which gives the maximum probability is concluded as the codeword that was sent. 

Reference :

See Also

[1]An Introduction to Estimation Theory
[2]Bias of an Estimator
[3]Minimum Variance Unbiased Estimators (MVUE)
[4]Maximum Likelihood Estimation
[5]Maximum Likelihood Decoding
[6]Probability and Random Process
[7]Likelihood Function and Maximum Likelihood Estimation (MLE)
[8]Score, Fisher Information and Estimator Sensitivity
[9]Introduction to Cramer Rao Lower Bound (CRLB)
[10]Cramer Rao Lower Bound for Scalar Parameter Estimation
[11]Applying Cramer Rao Lower Bound (CRLB) to find a Minimum Variance Unbiased Estimator (MVUE)
[12]Efficient Estimators and CRLB
[13]Cramer Rao Lower Bound for Phase Estimation
[14]Normalized CRLB - an alternate form of CRLB and its relation to estimator sensitivity
[15]Cramer Rao Lower Bound (CRLB) for Vector Parameter Estimation
[16]The Mean Square Error – Why do we use it for estimation problems
[17]How to estimate unknown parameters using Ordinary Least Squares (OLS)
[18]Essential Preliminary Matrix Algebra for Signal Processing
[19]Why Cholesky Decomposition ? A sample case:
[20]Tests for Positive Definiteness of a Matrix
[21]Solving a Triangular Matrix using Forward & Backward Substitution
[22]Cholesky Factorization and Matlab code
[23]LTI system models for random signals – AR, MA and ARMA models
[24]Comparing AR and ARMA model - minimization of squared error
[25]Yule Walker Estimation
[26]AutoCorrelation (Correlogram) and persistence – Time series analysis
[27]Linear Models - Least Squares Estimator (LSE)
[28]Best Linear Unbiased Estimator (BLUE)