An on-chip glitchy-clock generator and its application to safe-error attack

Sho Endo, Takeshi Sugawara, Naofumi Homma, Takafumi Aoki and Akashi Satoh

Graduate School of Information Sciences, Tohoku University
6-6-05, Aramaki Aza Aoba, Aoba-ku, Sendai-shi 980-8579, Japan
E-mail: endo@aoki.ecei.tohoku.ac.jp

National Institute of Advanced Industrial Science and Technology
1-18-13 Sotokanda, Chiyoda-ku, Tokyo, 101-0021 Japan

Abstract. This paper presents a glitchy-clock generator integrated in an FPGA for evaluating fault injection attacks and their countermeasures on cryptographic modules. The proposed generator employs the clock management block widely included in modern FPGAs and outputs a clock signal that contains a glitchy-clock cycle at a designated time. The shape and timing of the glitchy-clock cycle are controlled accurately through parameter settings. The proposed generator can be implemented on a single FPGA board without any external equipment such as a pulse generator or a variable power supply. Such integration makes it easier to generate reproducible glitchy-clock signals that can be verified by third parties. In this paper, we examine the characteristics of the proposed generator implemented on the Side-channel Attack Standard Evaluation Board (SASEBO). The result shows that glitches can be injected into any clock cycle, with their timing adjustable in increments of about 0.17 ns. We also demonstrate its application to the safe-error attack against an RSA processor.

1 Introduction

Fault injection attacks are attracting much attention in the field of cryptographic hardware and embedded systems. The attacker first injects faults into cryptographic operations to obtain faulty ciphertexts, and then estimates the secret key from several faulty ciphertexts. After the first publication, which focused on public-key cryptosystems [1], fault injection attacks were extended to symmetric-key cryptosystems [2].
Since then, many variations of fault attacks and countermeasures have been presented, and newer ones are still being proposed [3][4][5]. With the advance of such attacks and countermeasures, fault injection techniques have also been investigated in order to evaluate the feasibility of the attacks and the effectiveness of the countermeasures.

Two types of fault models can be used in the attacks: permanent and transient faults. A permanent fault, which damages either data or the sequencer in memory, is very powerful, but inducing permanent faults is generally difficult. A transient fault is a temporary fault from which the device recovers after a reset or at the end of the operation, and thus it can be induced more easily.

Fig. 1. Proposed fault injection system on SASEBO-G. (Block labels: PC; USB; USB I/F, FIFO, Bus I/F, and glitchy-clock generator on VirtexII-Pro XC2VP30; X'tal (24 MHz); 16-bit bus; glitchy clock; Bus I/F, controller, and cryptographic circuit (RSA) on VirtexII-Pro XC2VP7.)

For both fault models, various injection techniques have been reported that use glitches on power and clock signals, lowered voltage, raised frequency, laser shots, light illumination on the surface of a depackaged chip, and so on [6] [7]. Among them, a transient fault caused by a glitchy clock (i.e. a clock signal with a glitch) is among the more feasible faults owing to its non-invasiveness and controllability.

This paper presents a glitchy-clock generator integrated on an FPGA. In [8], an experimental environment supplying such glitchy-clock signals was reported. The reported environment employs two clock sources that have the same frequency but different phases, and selects one of the two sources depending on the operation timing. A glitchy-clock cycle occurs when one clock source is switched to the other. The clock signals are generated by an external pulse generator. The ideas of the proposed generator are to integrate this environment into a single FPGA without an external pulse generator, and to add a further function that controls the shape and timing of the glitchy-clock signal in a synchronized manner through parameters sent from a control PC. For this purpose, we employ the clock management block widely included in modern FPGAs. Notably, the proposed generator is implemented in an FPGA on the Side-channel Attack Standard Evaluation Board (SASEBO) [9]. From the standardization viewpoint, such an on-chip fault generator can provide a uniform evaluation environment for fault injection attacks and their countermeasures, since it allows us to generate reproducible glitchy-clock signals if the same FPGA is used.

In this paper, we evaluate the basic characteristics of the proposed generator implemented on SASEBO.
The result shows that a glitch can be injected into any clock cycle (i.e. tick), with its timing adjustable in increments of about 0.17 ns. This paper also demonstrates the effectiveness of the proposed generator through a safe-error attack against RSA hardware implemented in the other FPGA on SASEBO. We observe the success of the glitch injection from the difference between correct and faulty power traces.

2 On-chip glitchy-clock generator

Figure 1 shows a block diagram of the proposed fault injection system implemented on the Side-channel Attack Standard Evaluation Board with two Xilinx
FPGAs (SASEBO-G). The proposed generator is implemented in one FPGA (VirtexII-Pro XC2VP30), while the target cryptographic module is implemented in software or hardware on the other FPGA (VirtexII-Pro XC2VP7). The generator supplies a clock signal continuously and replaces a normal clock cycle with a glitchy-clock cycle at the designated timing. The generation is synchronized with a bus I/F, a FIFO, and a USB I/F. The bus I/F receives output data (i.e. ciphertexts) from the cryptographic module via a 16-bit bus. The FIFO provides the generator with information about the shape and timing of the glitchy-clock cycle. The USB I/F communicates with a host computer through a USB cable.

The proposed system has the following functions:
- Induce a glitchy-clock cycle (i.e. a clock tick with a glitch) at any position in the clock signal.
- Change the delay and width of the glitch within one clock cycle.
- Provide the timing of the target operation, by means of an internal clock counter, so that power/electromagnetic traces can be acquired in a synchronized manner.

Fig. 2. Glitch generator. (a) Block diagram (labels: DLLs with phase detector, delay line, and phase feedback; shifted clock; counter; max; MUX; phase shifts θ_B and θ_C; signals A to F). (b) Timing chart (signals A to F; clock signal; phase shift; phase difference; position of glitchy cycle; switch from A to B; switch from B to A; glitch delay; glitch width).

Figure 2(a) shows a block diagram of the glitch generator, which consists of two Delay Locked Loop (DLL) circuits and a counter; the DLL circuits are implemented by Digital Clock Managers (DCMs) in Xilinx FPGAs. (Note that alternative circuits can be used in other FPGAs, e.g. PLLs in Altera FPGAs.) The DCMs control the delay of the clock signals through the phase-shift parameter. The counter is incremented until it reaches the position of the glitchy-clock cycle. Figure 2(b) illustrates a timing chart of the generator.
Signal A is the clock signal supplied by an on-board clock component, signals B and C are clock signals delayed by the DCMs, signal D is the output of the counter, signal E is activated according to the timing of C and the maximum counter value, and signal F is the resulting output. Note that the delay of B is set to be larger than that of C. The generator usually outputs the clock signal
A, but it switches to signal B at the positive edge of signal C, and then switches back at the negative edge. Figure 3 shows the image of the glitchy-clock cycle, where T_d is the glitch delay and T_w is the glitch width. The glitch delay is determined by the time period between the positive edges of A and C. The glitch width is determined by the time period between the positive edges of C and B. Both time periods are controlled by the phase-shift parameters given to the DCMs. The interval between the two positive edges in the glitchy-clock cycle is fixed once these time periods are selected. These parameters can be changed on-line from the connected PC.

Fig. 3. Image of glitchy-clock cycle. (Labels: 3.3 V, 1.65 V, V_t; T_d: glitch delay; T_w: glitch width.)

We examine the basic characteristics of the proposed glitch generator implemented in the FPGA on SASEBO-G. Figure 4 illustrates examples of generated clock signals with glitchy-clock cycles, where (a), (b), and (c) are the clock signals with glitches at the 1st, 2nd, and 3rd clock cycles, respectively. A glitchy-clock cycle can be induced at any position in the clock signal depending on the maximum counter value. Figure 5(a) shows the waveforms of glitchy-clock cycles for different glitch widths from 0.7 to 13.7 ns, where the glitch delay is 4.2 ns. Figure 5(b) shows a magnified view of the 2nd positive edges in the clock cycle.

Fig. 4. Examples of glitchy-clock signals. (Panels (a), (b), (c); horizontal axis: time [ns], 0 to 1000.)

Fig. 5. Waveforms of glitchy-clock cycles for different glitch widths. (a) Waveforms. (b) Magnified view. (Axes: time [ns], voltage [V]; annotations: T_d, min. T_w, max. T_w; mean = 0.17 ns, std. dev. = 0.036 ns.)
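
The relationship between the phase-shift parameters and the shape of the glitchy-clock cycle can be illustrated with a small behavioral model. The following Python sketch is not the authors' implementation. It assumes ideal 50%-duty clocks, an ideal multiplexer, a 40 ns clock period, a glitch delay shorter than half a period, and that the switch back from B to A happens at the negative edge of C. The names GlitchConfig, clock, output_f, and rising_edges are hypothetical. Times are integers in picoseconds, and one phase-shift step is taken as 170 ps (about 0.17 ns).

```python
from dataclasses import dataclass

PHASE_STEP_PS = 170  # approximate phase-shift increment reported in the text


@dataclass(frozen=True)
class GlitchConfig:
    period_ps: int     # period of clock A (assumed value, not from the paper)
    delay_steps: int   # phase shift of C in steps, sets the glitch delay T_d
    width_steps: int   # extra phase shift of B over C, sets the glitch width T_w
    glitch_cycle: int  # cycle index at which the counter enables the glitch

    @property
    def t_d(self) -> int:
        return self.delay_steps * PHASE_STEP_PS

    @property
    def t_w(self) -> int:
        return self.width_steps * PHASE_STEP_PS


def clock(t: int, period: int, delay: int = 0) -> int:
    """Ideal 50%-duty clock delayed by `delay` ps; 1 = high, 0 = low."""
    return 1 if (t - delay) % period < period // 2 else 0


def output_f(t: int, cfg: GlitchConfig) -> int:
    """Value of the output F at time t (ps)."""
    a = clock(t, cfg.period_ps)                     # A: undelayed clock
    c = clock(t, cfg.period_ps, cfg.t_d)            # C: delayed by T_d
    b = clock(t, cfg.period_ps, cfg.t_d + cfg.t_w)  # B: delayed by T_d + T_w
    d = t // cfg.period_ps                          # D: cycle counter
    e = (d == cfg.glitch_cycle) and c == 1          # E: select B while C is high
    return b if e else a


def rising_edges(cfg: GlitchConfig, t_end: int, dt: int = 10) -> list:
    """Sample F every dt ps and return the times of its 0 -> 1 transitions."""
    edges, prev = [], output_f(-dt, cfg)
    for t in range(0, t_end, dt):
        cur = output_f(t, cfg)
        if cur and not prev:
            edges.append(t)
        prev = cur
    return edges


cfg = GlitchConfig(period_ps=40_000, delay_steps=25, width_steps=28, glitch_cycle=2)
edges = rising_edges(cfg, 4 * cfg.period_ps)
print(edges)  # [0, 40000, 80000, 89010, 120000]
```

Under these assumptions, the selected cycle contains a second positive edge that follows the first one by T_d + T_w (here 4.25 ns + 4.76 ns = 9.01 ns), while the other cycles keep the nominal period.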
The result indicates that the glitch width can be tuned precisely in increments of about 0.17 ns. More precisely, the increment size follows the normal distribution N(µ, σ^2) = N(0.17, 0.0013), i.e. a mean of 0.17 ns and a standard deviation of about 0.036 ns, which corresponds to the minimum amount of phase shift in the DCM.

3 Application to safe-error attack on RSA

3.1 Safe-error attack

The safe-error attack [10] is a fault injection attack on a classical modular exponentiation algorithm called the square-and-multiply always method [11]. This method inserts dummy multiplications into the left-to-right binary method [12]. A dummy multiplication is processed for each zero bit of the exponent, so that both a squaring and a multiplication are performed for every bit. This algorithm prevents an attacker from finding the specific pattern of multiplication and squaring operations that depends on the secret exponent. On the other hand, this typical countermeasure is vulnerable to the safe-error attack, which induces a carefully timed fault during a multiplication. If the returned result is correct, the attacker learns that the multiplication is a dummy and that the secret key bit is zero, since the result of a dummy multiplication is never used in the subsequent processing.

3.2 Parameter setting

An RSA processor with high-radix Montgomery multiplication [13] is used in this experiment. The datapath includes a multiplication block, which repeats multiply-additions in accordance with the bit pattern of the 512-bit key value to perform modular exponentiation. The 32-bit datapath performs multiply-additions using the 32-bit operands stored in the registers. Each multiplication or squaring takes 578 cycles. The appropriate glitch width was examined for this architecture. The error rate was measured for different glitch widths from 1.1 to 8.6 ns, where the glitch delay is 4.9 ns, and 100 fault injection tests were performed for each glitch width. As a result, we obtained an error rate of 1.0 (i.e., 100 % error) from 1.6 to 8.6 ns.
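
The safe-error principle described in Section 3.1 can be reproduced in a purely software model. The sketch below is an illustration under stated assumptions, not the RSA processor used in the experiment: the modulus is a toy product of two Mersenne primes, the base and the key bits are made up, the fault is modeled as a flip of the least significant bit of the product, and the attacker compares final results rather than power traces. The names modexp_always, device, and safe_error_attack are hypothetical.

```python
MODULUS = (2**61 - 1) * (2**89 - 1)  # toy modulus (product of two Mersenne primes)
BASE = 0x123456789ABCDEF             # hypothetical input value
KEY_BITS = [1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1]  # hypothetical key, MSB first


def modexp_always(base, key_bits, modulus, fault_at=None):
    """Left-to-right square-and-multiply always exponentiation.

    If fault_at is given, the multiplication for that bit index is corrupted.
    """
    acc = 1
    for i, bit in enumerate(key_bits):
        acc = (acc * acc) % modulus       # squaring, performed for every bit
        product = (acc * base) % modulus  # multiplication, performed for every bit
        if fault_at == i:
            product ^= 1                  # assumed fault model: flip the LSB
        if bit:
            acc = product                 # real multiplication: result is used
        # bit == 0: dummy multiplication, the product is discarded
    return acc


def device(fault_at=None):
    """Model of the target: exponentiation with an optional injected fault."""
    return modexp_always(BASE, KEY_BITS, MODULUS, fault_at)


def safe_error_attack(run, n_bits):
    """Recover key bits: an unchanged result means the faulted step was a dummy."""
    reference = run(None)
    return [0 if run(i) == reference else 1 for i in range(n_bits)]


recovered = safe_error_attack(device, len(KEY_BITS))
print(recovered[:4])  # [1, 1, 0, 1]
```

A fault in a dummy multiplication never reaches the accumulator, so the result is unchanged and the bit is read as 0. A fault in a real multiplication propagates to the output, and the bit is read as 1.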
Shorter glitch widths did not generate a significant voltage drop before the 2nd positive edge arose, and wider ones did not disturb any operation because of the operation margin. In the following experiment, we employed a glitch width of 4.8 ns in order to inject faults with high probability.

3.3 Experiment

Figure 6 shows the experimental setup, which consists of a SASEBO-G, an oscilloscope, and a PC. The oscilloscope is used to acquire power traces. In this experiment, we observe the success of the safe-error attack from the difference between two power traces, one of which is the original trace with no fault and the other is
the faulty trace. If a fault-injected (multiplication) operation is a dummy, the subsequent operations are not changed by the fault injection. As a result, we can check whether the fault-injected operation is a dummy or not from the difference trace after the operation.

COSADE 2011 - Second International Workshop on Constructive Side-Channel Analysis and Secure Design

Fig. 6. Experimental setup. (a) SASEBO board. (b) Overview.

Figure 7 shows a measured power trace obtained from the RSA processor, where S, M, and DM indicate the squaring, multiplication, and dummy multiplication operations, respectively. We injected faults into the first four multiplication operations, indicated in Fig. 8. Note that only the third operation is a dummy multiplication. The difference signal is greatly reduced when the operations following the injection are the same in both traces. The amplitude of the differential trace following the fault-injected operation in Fig. 8(c) remains close to zero. It is deduced that the target operation is a dummy, and the 3rd key bit is identified as 0. In contrast, the differential traces in Figs. 8(a), (b), and (d) indicate that the original and faulty traces do not match. This means that the target operations are real multiplications, and the 1st, 2nd, and 4th key bits are revealed to be 1. As a result, we obtain the first four key bits E = (1101)_2 from the safe-error attack.

4 Conclusion

This paper presented an on-chip glitchy-clock generator for evaluating fault injection attacks and their related countermeasures. The proposed generator can be implemented in an FPGA on SASEBO without any external equipment, and is thus suitable for the development of a reproducible evaluation environment. The result shows that glitches can be injected into any clock cycle, with their timing adjustable in increments of about 0.17 ns. In this paper, we also demonstrated its application to the safe-error attack against an RSA processor. We confirmed that
the secret key bits were successfully obtained by faults provided by the proposed generator. Further experiments are being conducted to apply it to more sophisticated fault attacks.

Fig. 7. Power trace of RSA processor. (Axes: time [µs], voltage [mV]; marks the positions (a) to (d) of the injected glitchy-clock cycles.)

Fig. 8. Differential power traces. (Operation sequence: S M S M S DM S M S M; panels (a) M, (b) M, (c) DM, (d) M; horizontal axis: time [µs].)

References

1. D. Boneh, R. DeMillo, and R. Lipton, On the importance of checking cryptographic protocols for faults, EUROCRYPT 1997, Lecture Notes in Computer Science, vol. 1233, pp. 37-51, May 1997.
2. E. Biham and A. Shamir, Differential fault analysis of secret key cryptosystems, CRYPTO 1997, vol. 1294, pp. 513-525, Aug. 1997.
3. R. Anderson and M. Kuhn, Low cost attacks on tamper resistant devices, Security Protocols: 5th Int. Workshop, Lecture Notes in Computer Science, vol. 1361, pp. 125-136, Aug. 1997.
4. H. Bar-El, H. Choukri, D. Naccache, M. Tunstall, and C. Whelan, The sorcerer's apprentice guide to fault attacks, IACR ePrint Archive, Report 2004/100, pp. 1-13, May 2004.
5. G. Giraud and H. Thiebeauld, A survey on fault attacks, CARDIS 2004, pp. 159-176, Aug. 2004.
6. C. H. Kim and J.-J. Quisquater, Faults, injection methods, and fault attacks, IEEE Design & Test of Computers, vol. 24, pp. 544-545, 2007.
7. S. Guilley, L. Sauvage, J.-L. Danger, N. Selmane, and R. Pacalet, Silicon-level solutions to counteract passive and active attacks, Proc. 5th Workshop on Fault Diagnosis and Tolerance in Cryptography, pp. 3-17, 2008.
8. T. Fukunaga and J. Takahashi, Practical fault attack on a cryptographic LSI with ISO/IEC 18033-3 block ciphers, Proc. 6th Workshop on Fault Diagnosis and Tolerance in Cryptography, pp. 84-92, Sept. 2009.
9. Side-channel Attack Standard Evaluation Board, http://www.rcis.aist.go.jp/special/sasebo/.
10. S. M. Yen and M. Joye, Checking before output may not be enough against fault-based cryptanalysis, IEEE Trans. Comput., vol. 49, no. 9, pp. 967-970, Sept. 2000.
11. J. S. Coron, Resistance against differential power analysis for elliptic curve cryptosystems, CHES 1999, Lecture Notes in Computer Science, vol. 1717, pp. 292-302, Aug. 1999.
12. A. J. Menezes, P. C. van Oorschot, and S. A. Vanstone, Handbook of Applied Cryptography. CRC Press, 1997.
13. A. Miyamoto, N. Homma, T. Aoki, and A. Satoh, Systematic design of high-radix Montgomery multipliers for RSA processors, Proc. 26th IEEE Int. Conf. Computer Design, pp. 416-422, Oct. 2008.