Faulty Clock Detection for Crypto Circuits Against Differential Fault Analysis Attack

Pei Luo and Yunsi Fei
Department of Electrical and Computer Engineering
Northeastern University, Boston, MA 02115

Abstract. Differential fault analysis (DFA) attacks are a serious threat to cryptographic devices. Previous protection schemes for crypto devices were not designed specifically against this kind of attack. Moreover, they either incur large resource overhead or require complex design work for each process technology. In this paper, we propose a method, easily implemented in either FPGAs or integrated circuits, to detect glitches in the system clock. Results show that the proposed method detects glitches efficiently while needing very few system resources, and that it does not involve complex design work for different process technologies.

Keywords: AES, differential fault analysis, side-channel attacks

1 Introduction

Cryptographic applications are vulnerable to fault injection attacks. Differential Fault Analysis (DFA) was introduced by Biham and Shamir on the Data Encryption Standard (DES) [1]. The authors in [2] showed that attackers are able to break AES-128 with only 2 faulty ciphertexts, assuming the fault occurs between the antepenultimate and the penultimate MixColumns. In [3], the authors show that inducing a random fault anywhere in one of the four diagonals of the state matrix at the input of the eighth round of the cipher leads to the deduction of the entire AES key. Moreover, even if the fault induction corrupts two or three diagonals, 2 and 4 faulty ciphertexts, respectively, are enough to uniquely identify the correct key.

In [3], the authors show that clock glitches can be used to inject faults into cryptographic devices. They run real-time fault injection using clock glitching via less sophisticated and less costly instruments on a Xilinx FPGA platform, and their results show that clock glitches can also be a means to induce internal faults.
The authors in [4] also demonstrated the effectiveness of frequency injection attacks on a secure microcontroller and a highly secure FPGA chip. The authors in [5] propose a new fault-based attack called the Fault Sensitivity Analysis (FSA) attack. They make use of the values of faulty ciphertexts and show that faulty outputs may exhibit detectable characteristics that can be used to retrieve the secret key. They show that FSA attacks can even be used to break some protected AES implementations. After that, the authors in [6] show that FSA and the Correlation Collision Attack [7] can be combined to create an even stronger attack method, and their results show that the combined method can attack some protected AES implementations efficiently.

To protect cryptographic devices from such DFA attacks and improve the reliability of the system, many different schemes have been implemented. For AES specifically, different kinds of methods have been proposed. For example, some simple linear error correction codes are added to the system and the redundancy is used to detect the injected faults [8,9]. Other works propose adding another copy of AES so that the results of the two copies can be compared to find differences. Because DFA attacks always inject faults only into the last several rounds of AES, some works propose implementing the reverse operations of the last several rounds after encryption/decryption to check the result.

In [10], the authors proposed a new method that detects the glitches themselves, instead of using redundancy, to improve the security level against DFA attacks. In their method, a non-logic buffer-based delay chain is
inserted, and then by monitoring the delay along the delay chain, a possible clock-glitch-based DFA can be detected.

In this paper, we propose a new method to monitor the clock signals and detect glitches in the system. The proposed scheme adds very little redundancy to the existing circuits and can be implemented very conveniently.

The rest of this paper is organized as follows. In Section 2, we revisit the concept of DFA attacks and previous protection methods. In Section 3, we introduce the concept of the proposed detection method and discuss its advantages over the previous schemes. In Section 4, we show the simulation results of the proposed method and compare the proposed scheme with the previous protection schemes based on synthesis results. In Section 5, we conclude the paper.

2 Clock glitch fault injection and the protection schemes

2.1 Clock glitch fault injection

Previous papers demonstrated that faults can be injected between MixColumns in round 8 and SubBytes in round 10; DFA based on these injected faults can recover the last round key and thus the secret key. Previous models all focus on the SubBytes of the last round, so we can simplify the models as shown in Figure 1. For DFA attacks, the attacker first obtains a triple $(P, C, K^{10})$, in which $P$ is a random plaintext and $C$ is the corresponding ciphertext under the last round key $K^{10}$. The attacker then obtains a faulty triple $(P, C', K^{10})$, in which the input of the last round, $P^{10\prime}$, is faulty because of a clock glitch.

[Fig. 1. Differential fault analysis model (signals: $K^{10}$, $P^{10}$, $P^{10\prime}$)]

The attacker makes a guess of the last round key bytes $K_i^{10}$, in which $i$ denotes the bytes of $C'$ that are faulty compared with $C$. For each guess of $K_i^{10}$, the attacker can recompute the last round and obtain the corresponding last round inputs $P^{10}$ and $P^{10\prime}$:

\[
\begin{cases}
P^{10} = \mathrm{AES}_{10}^{-1}(C, K_i^{10}) \\
P^{10\prime} = \mathrm{AES}_{10}^{-1}(C', K_i^{10})
\end{cases}
\tag{1}
\]
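To make Equation (1) concrete, the following is a minimal byte-level sketch in Python, assuming a single-bit fault model (one bit of one last-round input byte is flipped). It is an illustration rather than the attack code of any cited paper: the function names, the test key byte and the fault values are all hypothetical. Because ShiftRows only permutes byte positions, one byte of the last round can be inverted as InvSubBytes(C XOR K).

```python
# Byte-level sketch of Equation (1) under an ASSUMED single-bit fault model.
# All names and example values below are illustrative, not from the paper.

def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def gf_inv(a):
    """Multiplicative inverse in GF(2^8), computed as a^254; 0 maps to 0."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result if a else 0

def rotl8(x, shift):
    """Rotate a byte left by `shift` bits."""
    return ((x << shift) | (x >> (8 - shift))) & 0xFF

def sbox_byte(a):
    """AES S-box: field inversion followed by the affine transformation."""
    b = gf_inv(a)
    return b ^ rotl8(b, 1) ^ rotl8(b, 2) ^ rotl8(b, 3) ^ rotl8(b, 4) ^ 0x63

SBOX = [sbox_byte(x) for x in range(256)]
INV_SBOX = [0] * 256
for x, y in enumerate(SBOX):
    INV_SBOX[y] = x

def last_round_input(c, key_guess):
    """One byte of Equation (1): invert AddRoundKey, then SubBytes."""
    return INV_SBOX[c ^ key_guess]

def key_candidates(c, c_faulty):
    """Key-byte guesses for which P10 and P10' differ in exactly one bit."""
    return {
        k for k in range(256)
        if bin(last_round_input(c, k) ^ last_round_input(c_faulty, k)).count("1") == 1
    }

def recover_key_byte(pairs):
    """Intersect the candidate sets of several (C, C') byte pairs."""
    candidates = set(range(256))
    for c, c_faulty in pairs:
        candidates &= key_candidates(c, c_faulty)
    return candidates

# Simulated device: hypothetical key byte, state bytes and single-bit faults.
TRUE_KEY = 0x3A
pairs = [(SBOX[p] ^ TRUE_KEY, SBOX[p ^ e] ^ TRUE_KEY)
         for p, e in [(0x5C, 0x04), (0xA7, 0x40), (0x19, 0x01)]]
remaining = recover_key_byte(pairs)
print(sorted(hex(k) for k in remaining))
```

Each additional faulty pair shrinks the candidate set, and the correct key byte always survives the intersection because it reproduces the true one-bit difference.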
For some fault injection models, the attacker can flip only one bit, so $P^{10}$ and $P^{10\prime}$ differ in only one bit. For more complex fault models, there are more faulty bytes, and the relationship between these faulty bytes can be used to extract the last round key bytes. Details of more complex DFA models and the attack schemes can be found in [2,3,5].

2.2 Protection method

The general protection methods based on area redundancy are shown in Figure 2. The method proposed in [8] is almost the same as Figure 2; it uses two error detection modules to detect errors in the nonlinear and linear parts separately. The nonlinear protection scheme proposed in [8,11] appends a cubic computation module after the linear compressor and linear predictor, and this scheme can also be simplified as in Figure 2. In [9], the proposed CRC protection method separates the linear protection scheme into separate steps for the MixColumns, ShiftRows and AddRoundKey operations. Protection schemes based on duplicating the AES modules can also be reduced to Figure 2 by treating the predictor as another copy of the AES modules. So, without loss of generality, we use Figure 2 to discuss the protection methods based on area redundancy.

[Fig. 2. Protection schemes against DFA based on redundancy and error detection codes (blocks: KeyExpansion, Sub/InvSub, ShiftRows/InvShiftRows, MixColumns/InvMixColumns, Linear Predictor, Linear Compressor, Nonlinear detector; outputs: key input to next round, input to next round, linear error, nonlinear error)]

This kind of protection scheme detects the injected faults when the results of the predictor and the compressor do not match. It is useful when the faults are injected into only one of the original circuit and the protection circuit, or when the faults injected into both parts result in different errors.
But for faults injected by clock glitches, the faults can happen at the input of both parts, which means they receive the same faulty input and thus generate the same faulty output. It is then very probable that the faults caused by clock glitches will remain hidden.

3 Proposed clock glitch detection method

Unlike previous protection schemes, which either need large resource overhead or need special consideration of the technology in the IC design phase, we propose a clock glitch detection method which is easy to implement and makes good use of existing resources. More importantly, the designer can configure the precision of the system to meet different thresholds.
3.1 Details of the proposed method

Denote the system clock used in the cryptographic system as clock, and assume there is another clock source clk in the system which has a higher frequency. The source clk can be used to measure the width of the target clock. The structure of the proposed scheme is shown in Figure 3.

[Fig. 3. The block diagram of the clock glitch detection circuit (inputs: clock, clk; blocks: register, comparator)]

Denote the counter result for the previous logic 1 as $H_0$, that is, the number of cycles of clk while clock = 1 in the previous cycle. Likewise, define $L_0$ as the number of cycles of clk while clock = 0 in the previous cycle. Use $H_1$ and $L_1$ to denote the widths of the logic 1 and logic 0 of the current cycle, as shown in Figure 4.

[Fig. 4. Counter of the clock glitch detection circuit (number of cycles: $H_0$, $L_0$, $H_1$, $L_1$)]

If the target clock and the reference clk are both stable, then:

\[
\begin{cases}
L_0 = L_1 \\
H_0 = H_1
\end{cases}
\tag{2}
\]

For signals with a 50% duty cycle, we have:

\[
L_0 = L_1 = H_0 = H_1
\tag{3}
\]

If the clock is not constant, then the widths of logic 0 and logic 1 of clock change from one cycle to the next, which means:

\[
\begin{cases}
L_0 \neq L_1 \\
H_0 \neq H_1
\end{cases}
\tag{4}
\]

3.2 Implementation of the proposed scheme

From the above analysis, the key point of this design is to have a stable high-frequency reference clock source clk. For FPGAs and some other embedded systems, such a stable high-frequency clock source can be obtained from an external clock source such as an oscillator or a clock generator device. Instead of using
external clock sources, a clock generator can be designed inside the IC in the design phase. A phase-locked loop (PLL) can also be implemented inside for higher-frequency clock generation.

[Fig. 5. Simulation result of the proposed scheme]

An alternative way to obtain clk is to use existing resources to generate a higher frequency from the target clock. Taking an FPGA as an example, a digital clock manager (DCM) or a PLL can be used to generate a higher frequency with clock as the input clock. With this method, only one clock source is needed.

The Digital Clock Manager (DCM) primitive in Xilinx FPGA parts is used to implement a delay-locked loop, digital frequency synthesizer, digital phase shifter, or digital spread spectrum [12]. The PLL in an FPGA is used to generate multiple clocks with defined phase and frequency relationships to a given input clock [13,14]. Both the PLL and the DCM modules already present in FPGAs can therefore be used to generate a higher-frequency clk from the target clock. For example, there are up to 12 DCM modules in Virtex-5 FPGAs, each can generate at most a 32X higher frequency, and cascaded DCMs can generate even higher output frequencies [15].

Assume one DCM module is used to generate a 32X frequency clk; then for a stable clock with a 50% duty cycle, $L_0 = L_1 = H_0 = H_1 = 16$. If there is a glitch in clock, as shown in Figure 6, then it is clear that:

\[
\begin{cases}
L_0 \neq L_1, \quad L_1 \neq L_2 \\
H_0 \neq H_1, \quad H_1 \neq H_2
\end{cases}
\tag{5}
\]

[Fig. 6. When a glitch happens in clock (number of cycles: $H_0$, $L_0$, $H_1$, $L_1$, $H_2$, $L_2$)]

For the PLL and DCM, if there is a glitch in clock, the glitch will be detected, but the frequency generation module will lose LOCK and some operations are needed to reset the module. The details are explained and simulated in Section 4.

3.3 Advantage of the proposed scheme

From the discussion in Section 3.1, the first advantage of the proposed scheme compared with the previous schemes is its low resource requirement. The proposed scheme can
be easily implemented in FPGAs and the resource overhead is very low. For ICs, only another higher-frequency clock source or a clock management module (PLL, etc.) is needed.

Another advantage of the proposed scheme is that the module is very easy to reconfigure according to the precision requirement of the system. The method proposed in [10] requires inserting a non-logic buffer-based delay chain into the circuit, and the delay cannot be changed after production. Moreover, the delay and the delay chain design are affected by the process technology, which increases the design difficulty and complexity. In contrast, our method can be easily implemented in both FPGAs and ICs, and it is highly reconfigurable. First, it is easy to control the frequency of the reference clock clk, so higher precision can be achieved by using a higher-frequency clk. Second, our scheme has a configurable threshold for clock variation. Assume that the clock has some variation and the width of clock is not constant. If the difference between $L_0$ and $L_1$ and the difference between $H_0$ and $H_1$ are both smaller than $\lambda$, then the variation is acceptable and is not caused by DFA. That is, if the following inequalities hold, the variation is not a DFA-caused glitch:

\[
\begin{cases}
|L_0 - L_1| < \lambda \\
|H_0 - H_1| < \lambda
\end{cases}
\tag{6}
\]

In conclusion, the proposed scheme is easy to implement, and the detection module is easy to reconfigure according to different requirements and implementations. Moreover, the proposed scheme needs far fewer resources than previous methods.

4 Implementation and simulation results

To verify the functionality of the proposed scheme, we use a Virtex-5 FPGA and its internal DCM to implement it. The simulation result is shown in Figure 5. The target clock runs at 5 MHz and its duty cycle is 75%. We use the DCM to generate the higher-frequency clock clk with a frequency of 160 MHz.
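The counting of Section 3.1 and the threshold test of Equation (6) can be sketched as a behavioral model for this configuration. The Python below is not the authors' hardware description: it assumes the target clock is sampled once per clk cycle, and the function names, the injected glitch shape and the choice $\lambda = 2$ are illustrative. With a 5 MHz clock, a 160 MHz clk and a 75% duty cycle, a stable clock gives 32 samples per period, so $H = 24$ and $L = 8$.

```python
# Behavioral sketch of the counter/comparator idea (ASSUMED model, not the
# paper's Verilog): the target clock is sampled once per reference clk cycle.
from itertools import groupby

CLK_HZ = 160_000_000      # reference clk from the DCM (Section 4)
CLOCK_HZ = 5_000_000      # target clock (Section 4)
CYCLES_PER_PERIOD = CLK_HZ // CLOCK_HZ          # 32 clk cycles per clock period
HIGH_CYCLES = CYCLES_PER_PERIOD * 3 // 4        # 75% duty cycle: H = 24
LOW_CYCLES = CYCLES_PER_PERIOD - HIGH_CYCLES    # L = 8

def phase_widths(samples):
    """Run-length encode the sampled clock into (level, width) phases."""
    return [(level, sum(1 for _ in run)) for level, run in groupby(samples)]

def detect_glitches(samples, lam=2):
    """Return indices of phases violating Equation (6).

    A phase raises an alarm when its width differs from the previous phase
    of the same level by lam or more clk cycles. The first and last phases
    are skipped because the observation window may truncate them.
    """
    previous = {}   # level -> width of the last complete phase at that level
    alarms = []
    phases = phase_widths(samples)
    for index, (level, width) in enumerate(phases[1:-1], start=1):
        if level in previous and abs(width - previous[level]) >= lam:
            alarms.append(index)
        previous[level] = width
    return alarms

# Stable clock: six identical periods, no alarm expected.
period = [1] * HIGH_CYCLES + [0] * LOW_CYCLES
clean = period * 6

# Hypothetical glitch: a 2-cycle low pulse splits one high phase (10 + 2 + 12).
glitched = period * 2 + [1] * 10 + [0] * 2 + [1] * 12 + [0] * LOW_CYCLES + period * 2

clean_alarms = detect_glitches(clean)
glitch_alarms = detect_glitches(glitched)
print("clean:", clean_alarms)
print("glitched:", glitch_alarms)
```

In this model the alarm fires both on the shortened phases and on the first normal phases after them, since each is compared with its predecessor, which mirrors the pair of inequalities in Equation (5). Raising `lam` trades sensitivity for tolerance of ordinary clock variation.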
From the simulation result in Figure 5 we can see that:

- The proposed scheme detects glitches efficiently and triggers the alarm when a glitch is detected.
- The proposed scheme resumes monitoring very soon after losing lock on the target clock.

To compare the overheads of the proposed glitch detection scheme and previous AES protection schemes, we implement a glitch detection scheme based on the original AES using a PLL in Verilog. We model the glitch detection scheme and the protection schemes discussed above in Verilog and synthesize them in Cadence Encounter RTL Compiler with the Nangate 45nm OpenCell library, version v2009_07. The designs were placed and routed using Cadence Encounter. The latency and the area overhead of the protection schemes were estimated using the Concurrent Current Source (CCS) model under typical operating conditions, assuming a supply voltage of 1.1 V and a temperature of 25 °C. The synthesis results for the schemes are shown in Table 1 and Figure 7.

Table 1. Comparison between different protection schemes

Protection scheme    Area (µm²)    Power (mW)
Original             13987.9       20.4
Duplication          25893.0       65.18
Linear [8]           18299.5       30.61
Robust [11]          36711.2       136.7
CRC 4:1 [9]          19504.5       52.01
CRC 8:1 [9]          19804.5       52.97
Glitch               14392.5       22.76
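The relative overheads implied by Table 1 can be recomputed directly from its entries. The short Python sketch below copies the table values verbatim; the dictionary layout and function name are illustrative choices.

```python
# Relative area and power of each scheme versus the unprotected AES,
# using the values of Table 1 (area in um^2, power in mW).
TABLE1 = {
    "Original": (13987.9, 20.4),
    "Duplication": (25893.0, 65.18),
    "Linear [8]": (18299.5, 30.61),
    "Robust [11]": (36711.2, 136.7),
    "CRC 4:1 [9]": (19504.5, 52.01),
    "CRC 8:1 [9]": (19804.5, 52.97),
    "Glitch": (14392.5, 22.76),
}

def relative_overheads(table, baseline="Original"):
    """Return {scheme: (area ratio, power ratio)} relative to the baseline."""
    base_area, base_power = table[baseline]
    return {
        name: (area / base_area, power / base_power)
        for name, (area, power) in table.items()
    }

ratios = relative_overheads(TABLE1)
for name, (area_ratio, power_ratio) in ratios.items():
    print(f"{name:<12} area {area_ratio:6.1%}  power {power_ratio:6.1%}")
```

For the glitch detection scheme this gives roughly 103% of the original area and 112% of the original power, against roughly 185% and 320% for duplication.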
[Fig. 7. Overhead comparison of different protection schemes]

From the synthesis results shown in Table 1 and Figure 7, it is clear that the proposed glitch detection scheme needs much less resource overhead than the previously proposed schemes. For example, the proposed scheme needs only about 103% of the area of the original AES implementation and consumes only about 112% of its power. In contrast, a previous protection scheme such as duplication needs about 185% of the area and about 320% of the power of the original AES implementation.

5 Conclusion

In this paper, we propose a simple method to detect clock glitches. Compared with previous schemes, the proposed scheme is designed specifically for clock glitch detection and is highly reconfigurable. It involves no complex design-phase work for different process technologies. The simulation and synthesis results show that the proposed scheme detects clock glitches efficiently and needs far fewer resources than the previous protection schemes.
References

1. E. Biham and A. Shamir, "Differential fault analysis of secret key cryptosystems," in Advances in Cryptology - CRYPTO '97. Springer, 1997, pp. 513-525.
2. G. Piret and J.-J. Quisquater, "A differential fault attack technique against SPN structures, with application to the AES and KHAZAD," in Cryptographic Hardware and Embedded Systems - CHES 2003. Springer, 2003, pp. 77-88.
3. D. Saha, D. Mukhopadhyay, and D. R. Chowdhury, "A diagonal fault attack on the advanced encryption standard," IACR Cryptology ePrint Archive, vol. 2009, p. 581, 2009.
4. S. Skorobogatov, "Synchronization method for SCA and fault attacks," Journal of Cryptographic Engineering, vol. 1, no. 1, pp. 71-77, 2011.
5. Y. Li, K. Sakiyama, S. Gomisawa, T. Fukunaga, J. Takahashi, and K. Ohta, "Fault sensitivity analysis," in Cryptographic Hardware and Embedded Systems, CHES 2010. Springer, 2010, pp. 320-334.
6. A. Moradi, O. Mischke, C. Paar, Y. Li, K. Ohta, and K. Sakiyama, "On the power of fault sensitivity analysis and collision side-channel attacks in a combined setting," in Cryptographic Hardware and Embedded Systems - CHES 2011. Springer, 2011, pp. 292-311.
7. A. Moradi, O. Mischke, and T. Eisenbarth, "Correlation-enhanced power analysis collision attack," in Cryptographic Hardware and Embedded Systems, CHES 2010. Springer, 2010, pp. 125-139.
8. M. Karpovsky, K. J. Kulikowski, and A. Taubin, "Differential fault analysis attack resistant architectures for the advanced encryption standard," in Smart Card Research and Advanced Applications VI. Springer, 2004, pp. 177-192.
9. C.-H. Yen and B.-F. Wu, "Simple error detection methods for hardware implementation of advanced encryption standard," IEEE Transactions on Computers, vol. 55, no. 6, pp. 720-731, 2006.
10. H. Igarashi, Y. Shi, M. Yanagisawa, and N. Togawa, "Concurrent faulty clock detection for crypto circuits against clock glitch based DFA," in 2013 IEEE International Symposium on Circuits and Systems (ISCAS). IEEE, 2013, pp. 1432-1435.
11. M. Karpovsky, K. J. Kulikowski, and A.
Taubin, "Robust protection against fault-injection attacks on smart cards implementing the advanced encryption standard," in 2004 International Conference on Dependable Systems and Networks. IEEE, 2004, pp. 93-101.
12. DS485: Digital Clock Manager (DCM) Module.
13. DS622: Phase Locked Loop (PLL) Module (v2.00a).
14. Altera Phase-Locked Loop (Altera PLL) Megafunction User Guide.
15. UG190: Virtex-5 FPGA User Guide.