General Model: Algorithms in the Real World. Applications. Block Codes

15-853: Algorithms in the Real World
Error Correcting Codes I: Overview, Hamming Codes, Linear Codes

General Model

message (m) -> coder -> codeword (c) -> noisy channel -> codeword (c') -> decoder -> message or error

Errors introduced by the noisy channel:
  changed fields in the codeword (e.g. a flipped bit)
  missing fields in the codeword (e.g. a lost byte); these are called erasures

How the decoder deals with errors: error detection vs. error correction.

Applications

  Storage: CDs, DVDs, hard drives
  Wireless: cell phones, wireless links (WiMax)
  Satellite and Space: TV, Mars rover
  Digital Television: DVD, MPEG2 layover
  High Speed Networks: 10Gbase-T

Reed-Solomon codes are the most used in practice, but various graph-based codes have recently become as important. Algorithms for decoding are quite sophisticated.

Block Codes

message (m) -> coder -> codeword (c) -> noisy channel -> codeword (c') -> decoder -> message or error

Each message and codeword is of fixed size.
  $\Sigma$ = codeword alphabet
  $k = |m|$, $n = |c|$, $q = |\Sigma|$
  $C \subseteq \Sigma^n$ (codewords)
  $\Delta(x,y)$ = number of positions $i$ such that $x_i \ne y_i$
  $d = \min\{\Delta(x,y) : x,y \in C,\ x \ne y\}$
  $s = \max\{\Delta(c,c')\}$ that the code can correct

Code described as: $(n,k,d)_q$
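The definitions of $\Delta(x,y)$ and $d$ above translate directly into a few lines of code. The following is a minimal sketch, assuming codewords are given as equal-length strings or sequences; the function names and the example code are illustrative and not part of the lecture.

```python
from itertools import combinations


def hamming_distance(x, y):
    """Delta(x, y): number of positions where x and y differ."""
    if len(x) != len(y):
        raise ValueError("words must have the same length")
    return sum(a != b for a, b in zip(x, y))


def minimum_distance(code):
    """d: smallest Delta(x, y) over all pairs of distinct codewords."""
    return min(hamming_distance(x, y) for x, y in combinations(code, 2))


# Hypothetical example: the 3-bit repetition code, an (n=3, k=1) code.
repetition_code = ["000", "111"]
d = minimum_distance(repetition_code)
```

The brute-force minimum takes time quadratic in the number of codewords, so it is only practical for small codes.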

Binary Codes

Today we will mostly be considering $\Sigma = \{0,1\}$ and will sometimes use $(n,k,d)$ as shorthand for $(n,k,d)_2$.
In binary, $\Delta(x,y)$ is often called the Hamming distance.

Hypercube Interpretation

Consider codewords as vertices on a hypercube.

[Figure: 3-dimensional hypercube with the codewords marked]
  $d = 2$ = min distance
  $n = 3$ = dimensionality
  $2^n = 8$ = number of nodes

The distance between nodes on the hypercube is the Hamming distance $\Delta$. The minimum distance is $d$.
001 is equidistant from 000, 011 and 101.
For s-bit error detection: $d \ge s + 1$
For s-bit error correction: $d \ge 2s + 1$

Error Detection with Parity Bit

A $(k+1, k, 2)_2$ systematic code.
Encoding: $m_1 m_2 \ldots m_k \Rightarrow m_1 m_2 \ldots m_k\, p_{k+1}$ where $p_{k+1} = m_1 \oplus m_2 \oplus \cdots \oplus m_k$
$d = 2$ since the parity is always even (it takes two bit changes to go from one codeword to another).
Detects a one-bit error since this gives odd parity.
Cannot be used to correct a 1-bit error since any odd-parity word is at equal distance $\Delta$ from $k+1$ valid codewords.

Error Correcting One Bit Messages

How many bits do we need to correct a one-bit error on a one-bit message?
  2 bits: 0 -> 00, 1 -> 11, giving $(n=2, k=1, d=2)$
Need $d \ge 3$ to correct one error. Why?
  3 bits: 0 -> 000, 1 -> 111, giving $(n=3, k=1, d=3)$
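The parity-bit code is small enough to sketch in full. The snippet below is an illustration under the assumption that messages are lists of 0/1 integers; the function names are hypothetical.

```python
def parity_encode(message):
    """Append p = m_1 xor ... xor m_k so the codeword has even parity."""
    p = 0
    for bit in message:
        p ^= bit
    return list(message) + [p]


def has_even_parity(word):
    """True if the received word passes the parity check."""
    total = 0
    for bit in word:
        total ^= bit
    return total == 0


codeword = parity_encode([1, 0, 1, 1])

# One flipped bit gives odd parity and is detected.
one_error = list(codeword)
one_error[2] ^= 1

# A second flipped bit restores even parity and goes unnoticed (d = 2).
two_errors = list(one_error)
two_errors[0] ^= 1
```

The check reports only that some odd number of bits changed; it carries no information about which position to fix, which is why the code cannot correct.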

Example of (6,3,3)_2 systematic code

[Table: message -> codeword; the entries were lost in extraction]

Definition: A systematic code is one in which the message appears in the codeword.

Error Correcting Multibit Messages

We will first discuss Hamming Codes.
Detect and correct 1-bit errors.
Codes are of the form $(2^r - 1,\ 2^r - 1 - r,\ 3)$ for any $r > 1$,
e.g. (3,1,3), (7,4,3), (15,11,3), (31,26,3), ...
which correspond to 2, 3, 4, 5, ... parity bits (i.e. $n - k$).
The high-level idea is to localize the error. Any specific ideas?

Hamming Codes: Encoding

Bit layout, position 15 on the left, position 0 on the right:
  m15 m14 m13 m12 m11 m10 m9 p8 m7 m6 m5 p4 m3 p2 p1 p0

Localizing error to top or bottom half (1xxx or 0xxx):
  $p_8 = m_{15} \oplus m_{14} \oplus m_{13} \oplus m_{12} \oplus m_{11} \oplus m_{10} \oplus m_9$
Localizing error to x1xx or x0xx:
  $p_4 = m_{15} \oplus m_{14} \oplus m_{13} \oplus m_{12} \oplus m_7 \oplus m_6 \oplus m_5$
Localizing error to xx1x or xx0x:
  $p_2 = m_{15} \oplus m_{14} \oplus m_{11} \oplus m_{10} \oplus m_7 \oplus m_6 \oplus m_3$
Localizing error to xxx1 or xxx0:
  $p_1 = m_{15} \oplus m_{13} \oplus m_{11} \oplus m_9 \oplus m_7 \oplus m_5 \oplus m_3$

Hamming Codes: Decoding

We don't need $p_0$, so we have a (15,11,?) code.
After transmission, we generate
  $b_8 = p_8 \oplus m_{15} \oplus m_{14} \oplus m_{13} \oplus m_{12} \oplus m_{11} \oplus m_{10} \oplus m_9$
  $b_4 = p_4 \oplus m_{15} \oplus m_{14} \oplus m_{13} \oplus m_{12} \oplus m_7 \oplus m_6 \oplus m_5$
  $b_2 = p_2 \oplus m_{15} \oplus m_{14} \oplus m_{11} \oplus m_{10} \oplus m_7 \oplus m_6 \oplus m_3$
  $b_1 = p_1 \oplus m_{15} \oplus m_{13} \oplus m_{11} \oplus m_9 \oplus m_7 \oplus m_5 \oplus m_3$
With no errors, these will all be zero.
With one error, $b_8 b_4 b_2 b_1$ gives us the error location,
e.g. 0100 would tell us that $p_4$ is wrong, and 1100 would tell us that $m_{12}$ is wrong.
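The encoding and decoding rules for the (15,11) Hamming code can be sketched as follows. This is a minimal illustration, not the lecture's own code: it assumes the word is stored in a list indexed by bit position 1..15 (index 0, where $p_0$ would sit, is unused), and that the 11 message bits fill the non-power-of-two positions in increasing order. Both choices and all names are assumptions made for the example.

```python
PARITY_POSITIONS = (1, 2, 4, 8)
DATA_POSITIONS = tuple(i for i in range(1, 16) if i not in PARITY_POSITIONS)


def hamming15_encode(message):
    """Place 11 message bits and fill in p1, p2, p4, p8. word[0] is unused."""
    if len(message) != 11:
        raise ValueError("expected 11 message bits")
    word = [0] * 16
    for pos, bit in zip(DATA_POSITIONS, message):
        word[pos] = bit
    for p in PARITY_POSITIONS:
        # Parity bit p covers every data position whose index has bit p set.
        parity = 0
        for pos in DATA_POSITIONS:
            if pos & p:
                parity ^= word[pos]
        word[p] = parity
    return word


def hamming15_syndrome(word):
    """Return b8 b4 b2 b1 as an integer: 0 if all checks pass."""
    syndrome = 0
    for p in PARITY_POSITIONS:
        check = 0
        for pos in range(1, 16):
            if pos & p:
                check ^= word[pos]
        if check:
            syndrome |= p
    return syndrome


def hamming15_correct(word):
    """Flip the bit named by the syndrome, assuming at most one error."""
    fixed = list(word)
    location = hamming15_syndrome(fixed)
    if location:
        fixed[location] ^= 1
    return fixed
```

Because each check covers exactly the positions whose index has the corresponding bit set, a single flipped bit at position j makes the syndrome equal to j. With two errors the syndrome is still nonzero but names the wrong position, so this sketch would miscorrect.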

Hamming Codes

Can be generalized to any power of 2:
  $n = 2^r - 1$ (15 in the example)
  $n - k = r$ (4 in the example)
  $d = 3$ (discussed later)
  Can correct one error, but can't tell the difference between one and two!
  Gives a $(2^r - 1,\ 2^r - 1 - r,\ 3)$ code.

Extended Hamming code:
  Add back the parity bit at the end.
  Gives a $(2^r,\ 2^r - 1 - r,\ 4)$ code.
  Can correct one error and detect 2 (not so obvious).

Lower bound on parity bits

How many nodes in the hypercube do we need so that $d = 3$?
Each of the $2^k$ codewords eliminates its $n$ neighbors plus itself, i.e. $n + 1$ nodes:
  $2^n \ge (n+1)\,2^k$
  $n \ge k + \log_2(n+1)$
  $n \ge k + \lceil \log_2(n+1) \rceil$
In the previous Hamming code, $15 \ge 11 + \lceil \log_2(15+1) \rceil = 15$.
Hamming codes are called perfect codes since they match the lower bound exactly.

Lower bound on parity bits (continued)

What about fixing 2 errors (i.e. $d = 5$)?
Each of the $2^k$ codewords eliminates itself, its neighbors and its neighbors' neighbors, giving:
  $2^n \ge 2^k\,(1 + n + n(n-1)/2)$
  $n \ge k + \log_2(1 + n + n(n-1)/2) \ge k + 2\log_2 n - 1$
Generally, to correct s errors:
  $n \ge k + \log_2\left(1 + \binom{n}{1} + \binom{n}{2} + \cdots + \binom{n}{s}\right)$

Lower Bounds: a side note

The lower bounds assume random placement of bit errors.
In practice errors are likely to be less than random, e.g. evenly spaced or clustered.
Can we do better if we assume regular errors?
We will come back to this later when we talk about Reed-Solomon codes. In fact, this is the main reason why Reed-Solomon codes are used much more than Hamming codes.
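The counting argument behind the lower bound is easy to check numerically. The sketch below, with illustrative function names, tests whether binary parameters $(n, k)$ leave enough room to correct $s$ errors and whether they meet the bound with equality.

```python
from math import comb


def sphere_size(n, s):
    """Words within distance s of a codeword: 1 + C(n,1) + ... + C(n,s)."""
    return sum(comb(n, i) for i in range(s + 1))


def satisfies_lower_bound(n, k, s):
    """True if 2^n >= 2^k * sphere_size(n, s)."""
    return 2 ** n >= 2 ** k * sphere_size(n, s)


def is_perfect(n, k, s):
    """True if the bound holds with equality, as for the Hamming codes."""
    return 2 ** n == 2 ** k * sphere_size(n, s)


# The (15,11,3) Hamming code corrects s = 1 error and is perfect.
hamming_15_11 = is_perfect(15, 11, 1)
```

Satisfying the bound is necessary but not sufficient: it shows only that the spheres could fit, not that a code with those parameters exists.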

Linear Codes

If $\Sigma$ is a field, then $\Sigma^n$ is a vector space.
Definition: C is a linear code if it is a linear subspace of $\Sigma^n$ of dimension k.
This means that there is a set of k basis vectors $v_i \in \Sigma^n$ ($1 \le i \le k$) that span the subspace,
i.e. every codeword can be written as:
  $c = a_1 v_1 + \cdots + a_k v_k$, with $a_i \in \Sigma$

Linear Codes

Basis vectors for the (7,4,3)_2 Hamming code, one per message bit (entries follow from the parity equations restricted to 7 bits):

        m7 m6 m5 p4 m3 p2 p1
  v1 =   1  0  0  1  0  1  1
  v2 =   0  1  0  1  0  1  0
  v3 =   0  0  1  1  0  0  1
  v4 =   0  0  0  0  1  1  1

How can we see that $d = 3$?
The sum of two codewords is a codeword, so $\Delta(x,y)$ is the weight of the codeword $x + y$, and d equals the minimum weight of a nonzero codeword.

Generator and Parity Check Matrices

Generator Matrix: a $k \times n$ matrix G such that $C = \{xG \mid x \in \Sigma^k\}$.
  Made from stacking the basis vectors.
Parity Check Matrix: an $(n-k) \times n$ matrix H such that $C = \{y \in \Sigma^n \mid Hy^T = 0\}$.
  Codewords are the nullspace of H.
These always exist for linear codes.

Advantages of Linear Codes

  Encoding is efficient (vector-matrix multiply).
  Error detection is efficient (vector-matrix multiply).
  The syndrome ($Hy^T$) has error information.
  Gives a $q^{n-k}$ sized table for decoding, which is useful if $n-k$ is small.
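Encoding as $xG$ and checking as $Hy^T$ can be sketched with plain lists over GF(2). The matrices below are an assumption derived from the (7,4,3) parity equations with columns ordered m7 m6 m5 p4 m3 p2 p1; they are not copied from the slides, whose entries did not survive, and the function names are illustrative.

```python
# Rows are the basis vectors for message bits m7, m6, m5, m3.
G = [
    [1, 0, 0, 1, 0, 1, 1],
    [0, 1, 0, 1, 0, 1, 0],
    [0, 0, 1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1, 1, 1],
]

# Rows are the checks b4, b2, b1; column j is the binary index of position 7 - j.
H = [
    [1, 1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1, 1, 0],
    [1, 0, 1, 0, 1, 0, 1],
]


def encode(x, G):
    """Codeword c = xG over GF(2)."""
    n = len(G[0])
    return [sum(x[i] * G[i][j] for i in range(len(G))) % 2 for j in range(n)]


def syndrome(y, H):
    """Syndrome H y^T over GF(2); all zeros for a codeword."""
    return [sum(h * b for h, b in zip(row, y)) % 2 for row in H]


codeword = encode([1, 0, 1, 1], G)
received = list(codeword)
received[2] ^= 1  # flip the bit at position 5 (m5)
```

In this column order the syndrome of a single-bit error, read as a binary number, is the position of the flipped bit, matching the $b_4 b_2 b_1$ rule of the Hamming decoder.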

Example and Standard Form

For the Hamming (7,4,3) code, with columns m7 m6 m5 p4 m3 p2 p1:

  G =  1 0 0 1 0 1 1
       0 1 0 1 0 1 0
       0 0 1 1 0 0 1
       0 0 0 0 1 1 1

By swapping columns 4 and 5 it is in the form $[I_k, A]$. A code with a matrix in this form is systematic, and G is in standard form:

  G =  1 0 0 0 1 1 1
       0 1 0 0 1 1 0
       0 0 1 0 1 0 1
       0 0 0 1 0 1 1

Relationship of G and H

If G is in standard form $[I_k, A]$ then $H = [A^T, I_{n-k}]$.
Example of the (7,4,3) Hamming code, where the left block of H is the transpose of the right block of G:

  H =  1 1 1 0 1 0 0
       1 1 0 1 0 1 0
       1 0 1 1 0 0 1

Proof that H is a Parity Check Matrix

Suppose that x is a message. Then
  $H(xG)^T = H(G^T x^T) = (HG^T)x^T = (A^T I_k + I_{n-k} A^T)x^T = (A^T + A^T)x^T = 0$,
where the last step uses arithmetic over GF(2).
Now suppose that $Hy^T = 0$. Then $A^T_{i,*} \cdot y^T_{[1..k]} + y^T_{k+i} = 0$ for $1 \le i \le n-k$ (where $A^T_{i,*}$ is row i of $A^T$ and $y^T_{[1..k]}$ are the first k elements of $y^T$).
Thus $y_{[1..k]} \cdot A_{*,i} = y_{k+i}$, where $A_{*,i}$ is now column i of A and $y_{[1..k]}$ are the first k elements of y, so $y_{[k+1..n]} = y_{[1..k]} A$.
Consider $x = y_{[1..k]}$. Then $xG = [y_{[1..k]} \mid y_{[1..k]} A] = y$.
Hence if $Hy^T = 0$, y is the codeword for $x = y_{[1..k]}$.

The d of linear codes

Theorem: A linear code has distance d if every set of $d-1$ columns of H is linearly independent (i.e., no nonempty subset of them sums to 0), but there is a set of d columns that is linearly dependent.
Proof: If some set of $d-1$ or fewer columns is linearly dependent, then for any codeword y there is another codeword y', in which the bits in the positions corresponding to those columns are inverted, and both have the same syndrome (0). If every set of $d-1$ columns is linearly independent, then changing any $d-1$ or fewer bits in a codeword y must also change the syndrome (since the corresponding columns cannot sum to 0).
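The construction $H = [A^T, I_{n-k}]$ and the minimum distance of a small linear code can both be checked by brute force. The sketch below assumes a binary generator matrix already in standard form $[I_k, A]$; the matrix `G_STD` is the (7,4,3) Hamming code with columns m7 m6 m5 m3 p4 p2 p1, reconstructed from the parity equations, and all names are illustrative.

```python
from itertools import product

G_STD = [
    [1, 0, 0, 0, 1, 1, 1],
    [0, 1, 0, 0, 1, 1, 0],
    [0, 0, 1, 0, 1, 0, 1],
    [0, 0, 0, 1, 0, 1, 1],
]


def parity_check_from_standard_form(G):
    """Build H = [A^T | I_(n-k)] from a binary G = [I_k | A]."""
    k, n = len(G), len(G[0])
    for i in range(k):
        if G[i][:k] != [1 if j == i else 0 for j in range(k)]:
            raise ValueError("G is not in standard form")
    A = [row[k:] for row in G]
    H = []
    for i in range(n - k):
        a_transpose_row = [A[j][i] for j in range(k)]
        identity_row = [1 if t == i else 0 for t in range(n - k)]
        H.append(a_transpose_row + identity_row)
    return H


def min_distance_linear(G):
    """Minimum weight over all nonzero codewords xG (equals d for a linear code)."""
    k, n = len(G), len(G[0])
    best = n
    for x in product([0, 1], repeat=k):
        if any(x):
            c = [sum(x[i] * G[i][j] for i in range(k)) % 2 for j in range(n)]
            best = min(best, sum(c))
    return best


H_STD = parity_check_from_standard_form(G_STD)
```

The enumeration visits all $2^k - 1$ nonzero messages, so it only suits small k; the theorem on the columns of H gives the same answer without listing codewords.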

Dual Codes

For every code with $G = [I_k, A]$ and $H = [A^T, I_{n-k}]$ we have a dual code with $G = [I_{n-k}, A^T]$ and $H = [A, I_k]$.
The duals of the Hamming codes are the binary simplex codes: $(2^r - 1,\ r,\ 2^{r-1})$.
The duals of the extended Hamming codes are the first-order Reed-Muller codes.
Note that these codes are highly redundant and can fix many errors.

NASA Mariner

Deep space probes from 1969-1977. [Figure: Mariner 10]
Used a (32,6,16) Reed-Muller code (r = 5).
Rate = 6/32 = .1875 (only about 1 out of 5 bits is useful).
Can fix up to 7 bit errors per 32-bit word.

How to find the error locations

$Hy^T$ is called the syndrome (no error if 0).
In general we can find the error location by creating a table that maps each syndrome to a set of error locations.
Theorem: assuming $d \ge 2s + 1$, every syndrome value of a word with at most s errors corresponds to a unique set of error locations.
Proof: Exercise.
The table has $q^{n-k}$ entries, each of size at most n (i.e. keep a bit vector of locations).
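A syndrome table of this kind can be sketched for a small binary code. The example below assumes the (7,4,3) Hamming code in standard form with $s = 1$; the matrix is reconstructed from the parity equations rather than copied from the slides, and the function names are illustrative.

```python
from itertools import combinations

# H = [A^T | I_3] for the (7,4,3) Hamming code, columns m7 m6 m5 m3 p4 p2 p1.
H = [
    [1, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0, 1],
]


def syndrome(y, H):
    """Syndrome H y^T over GF(2), as a tuple so it can be a dict key."""
    return tuple(sum(h * b for h, b in zip(row, y)) % 2 for row in H)


def build_syndrome_table(H, s):
    """Map the syndrome of every error pattern of weight <= s to its locations."""
    n = len(H[0])
    table = {}
    for weight in range(s + 1):
        for locations in combinations(range(n), weight):
            error = [0] * n
            for loc in locations:
                error[loc] = 1
            # Lighter patterns are inserted first and are never overwritten.
            table.setdefault(syndrome(error, H), locations)
    return table


def decode(y, H, table):
    """Flip the bits the table names; return None for an unknown syndrome."""
    locations = table.get(syndrome(y, H))
    if locations is None:
        return None
    fixed = list(y)
    for loc in locations:
        fixed[loc] ^= 1
    return fixed


table = build_syndrome_table(H, 1)
```

For this code the table has $2^{n-k} = 8$ entries: the zero syndrome plus one per bit position. Every syndrome is then in use, which is another way of seeing that the Hamming code is perfect and cannot flag a double error.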