paioli Power Analysis Immunity by Offsetting Leakage Intensity Sylvain Guilley perso.enst.fr/ guilley Telecom ParisTech


paioli: Power Analysis Immunity by Offsetting Leakage Intensity
Pablo Rauzy (rauzy@enst.fr, pablo.rauzy.name), Sylvain Guilley (guilley@enst.fr, perso.enst.fr/~guilley), Zakaria Najm (znajm@enst.fr)
Telecom ParisTech, CNRS LTCI / COMELEC / SEN
Journée Sécurité Numérique sur les canaux auxiliaires (Digital Security Day on side channels), organized by the GdT Sécurité des Systèmes Embarqués (Embedded Systems Security working group) of the GDR SoC-SiP
4 December 2014, Paris, France
IACR eprint 2013/554
Pablo Rauzy (Telecom ParisTech) paioli 2014-12-04 1 / 37

[Figure: gate-level schematics of four hardware DPL logic styles: WDDL, MDPL, SecLib, and BCDL. Each cell takes the dual-rail inputs (a_False, a_True) and (b_False, b_True), plus a mask pair (m_False, m_True) for MDPL, and produces the dual-rail output (y_False, y_True).]
Software? Automation? Verification? Formally?

Motivation
Our goal is to formally assess the security of a cryptosystem against power analysis attacks.
However, formal methods work with models, not implementations.
Yet side-channel attacks are an implementation-level threat.
We therefore want to apply formal methods to the implementation.

Motivation / Power Analysis
Power analysis is a form of side-channel attack in which the attacker measures the power consumption of a cryptographic device.
Power consumption is modeled by the Hamming weight of values and the Hamming distance of updates.
An unprotected implementation leaks at every step.
Thwarting side-channel analysis is a complicated task.
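The leakage model named on this slide can be written down in a few lines. The following is a minimal sketch, assuming values are plain non-negative integers; the function names `hamming_weight` and `hamming_distance` are illustrative and do not come from the slides.

```python
def hamming_weight(value: int) -> int:
    """Number of bits set to 1 in value (models the leakage of a value)."""
    return bin(value).count("1")


def hamming_distance(old: int, new: int) -> int:
    """Number of bits that flip when a resource is updated from old to new."""
    return hamming_weight(old ^ new)


# Writing an unprotected bit into a cleared register: the modeled activity
# depends on the bit, which is exactly what power analysis exploits.
activities = {bit: hamming_distance(0, bit) for bit in (0, 1)}
print(activities)
```

Under this model, storing a 0 costs no transition while storing a 1 costs one, so the two cases are distinguishable.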

Motivation / Countermeasures
In practice, there are two ways to protect cryptosystems.
Palliative countermeasures attempt to make the attack more difficult, but without a theoretical foundation:
  variable clock, operation shuffling, dummy encryptions, etc.
Curative countermeasures aim at providing a leak-free implementation based on a security rationale:
  decorrelate the leakage from the manipulated data, or
  make the leakage constant, irrespective of the manipulated data.

Motivation / Countermeasures / Masking
Definition (Masking): Mix the computation with random numbers to make the leakage (at least on average) independent of the sensitive data.
Pros:
  independence with respect to the leakage behavior of the hardware,
  existence of provably secure masking schemes.
Cons:
  greedy requirement for randomness,
  randomness is hard to formalize,
  hardware glitches are likely to depend on more than one sensitive value, and are therefore a high-order leakage,
  possibility of high-order attacks.

Motivation / Countermeasures / Balancing
Definition (Balancing): Follow a dual-rail protocol to make the leakage constant, irrespective of the manipulated data.
Definition (DPL, Dual-rail with Precharge Logic): Compute on a redundant representation held in two indistinguishable resources, so that the attacker cannot know which one has been set (which depends on the bit value).
Pros:
  no randomness necessary,
  simple protocol, easily captured formally.
Cons:
  strongly depends on assumptions about the hardware leakage.

Motivation
  Power Analysis
  Countermeasures
Dual-rail with Precharge Logic
  DPL in Software
  DPL Macro
Generation of DPL Protected Assembly Code
  Generic Assembly Language
  Code Transformation
  Correctness Proof of the Transformation
Formally Proving the Absence of Leakage
  Computed Proof of Constant Activity
  Hardware Characterization
Case Study: PRESENT on an AVR Micro-Controller
  Profiling the AVR Micro-Controller
  Generating Balanced AVR Assembly
  Cost of the Countermeasure
  Attacks
Conclusions
Perspectives

Dual-rail with Precharge Logic
The DPL countermeasure consists in computing on a redundant representation: each bit y is implemented as a pair (y_False, y_True).
The bit pair is then used in a protocol made up of two phases:
1. a precharge phase, during which all the bit pairs are zeroized, (y_False, y_True) = (0, 0), so that the computation starts from a known reference state;
2. an evaluation phase, during which the (y_False, y_True) pair is equal to (1, 0) if it carries the logical value 0, or (0, 1) if it carries the logical value 1.
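The two-phase protocol can be illustrated with a small sketch that counts bit toggles. The names `PRECHARGE`, `dpl_encode`, and `toggles` are assumptions made for this illustration; only the encoding itself comes from the slide.

```python
PRECHARGE = (0, 0)  # reference state of every (y_False, y_True) pair


def dpl_encode(bit: int) -> tuple:
    """Evaluation-phase pair: (1, 0) carries logical 0, (0, 1) carries logical 1."""
    return (0, 1) if bit else (1, 0)


def toggles(before: tuple, after: tuple) -> int:
    """Number of rails that change between two states of a pair."""
    return sum(x != y for x, y in zip(before, after))


for bit in (0, 1):
    pair = dpl_encode(bit)
    # Precharge -> evaluation and evaluation -> precharge both flip one rail.
    print(bit, pair, toggles(PRECHARGE, pair), toggles(pair, PRECHARGE))
```

Whatever the logical value, exactly one rail switches in each phase, which is why the modeled activity is constant as long as the two rails leak identically.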

Dual-rail with Precharge Logic / DPL in Software
Historically, DPL has been designed for implementation at the hardware level.
But we want to run DPL on an off-the-shelf processor.
Therefore, we must:
  identify two similar resources that can hold true and false values in a way that is indiscernible to a side-channel attacker;
  play the DPL protocol by ourselves, in software.
Then, to reproduce the DPL protocol in software we have to:
  work at the bit level, and
  duplicate (in positive and negative logic) the bit values.

Dual-rail with Precharge Logic / DPL Macro
Each sensitive instruction should be replaced by a DPL macro.
The DPL macro assumes that the system is in a valid DPL state,
and leaves it in a valid DPL state, which makes the macros chainable.
The basic idea is to concatenate two DPL encoded values,
then use the result as an index in a look-up table.

Dual-rail with Precharge Logic / DPL Macro
Example Using the Two Least Significant Bits
In this example we use the two LSBs. Logical value 1 is encoded as 1 (01). Logical value 0 is encoded as 2 (10).

DPL macro for d = a op b:
  r1 ← r0          precharge
  r1 ← a           evaluation
  r1 ← r1 ∧ 3      mask
  r1 ← r1 ≪ 1      shift
  r1 ← r1 ≪ 1      shift
  r2 ← r0          precharge
  r2 ← b           evaluation
  r2 ← r2 ∧ 3      mask
  r1 ← r1 ∨ r2     concatenation
  r3 ← r0          precharge
  r3 ← op[r1]      look-up
  d ← r0           precharge
  d ← r3           evaluation

Activity of each kind of step:
  Precharge phases: 1 if sensitive
  Evaluation phases: 1
  Masks: normally 0
  Shifts: 2
  Concatenation: 1
  Look-up: 1 + 2

Generation of DPL Protected Assembly Code
We want to automatically insert this countermeasure in assembly code.
To be as universal as possible, we use a generic assembly language which can be mapped to and from virtually any actual assembly language.

Generation of DPL Protected Assembly Code / Generic Assembly Language

Prog    ::= ( Label? Inst? ( ';' <comment> )? '\n' )*
Label   ::= <label-name> ':'
Inst    ::= Opcode0
          | Branch1 Addr
          | Opcode2 Lval Val
          | Opcode3 Lval Val Val
          | Branch3 Val Val Addr
Opcode0 ::= nop
Branch1 ::= jmp
Opcode2 ::= not | mov
Opcode3 ::= and | orr | xor | lsl | lsr | add | mul
Branch3 ::= beq | bne
Val     ::= Lval | '#' <immediate-value>
Lval    ::= 'r' <register-number>
          | '@' <memory-address>
          | '!' Val ( ',' <offset> )?
Addr    ::= '#' <absolute-code-address> | <label-name>

Generation of DPL Protected Assembly Code / Generic Assembly Language
DPL Macro Using the Two Least Significant Bits

assembly          meaning
mov r1 r0         r1 ← r0
mov r1 a          r1 ← a
and r1 r1 #3      r1 ← r1 ∧ 3
lsl r1 r1 #1      r1 ← r1 ≪ 1
lsl r1 r1 #1      r1 ← r1 ≪ 1
mov r2 r0         r2 ← r0
mov r2 b          r2 ← b
and r2 r2 #3      r2 ← r2 ∧ 3
orr r1 r1 r2      r1 ← r1 ∨ r2
mov r3 r0         r3 ← r0
mov r3 !r1,op     r3 ← op[r1]
mov d r0          d ← r0
mov d r3          d ← r3
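To see that the macro computes the intended operation, it can be transliterated instruction by instruction. This is only a sketch under stated assumptions: `r0` is taken to always hold 0 (it is the source of every precharge), invalid table cells are filled with 0, and the names `ENC`, `DEC`, `OPS`, `build_table`, and `dpl_macro` are hypothetical.

```python
# Encoding from the slides: logical 1 is 0b01, logical 0 is 0b10 (two LSBs).
ENC = {0: 0b10, 1: 0b01}
DEC = {code: bit for bit, code in ENC.items()}

OPS = {
    "and": lambda a, b: a & b,
    "orr": lambda a, b: a | b,
    "xor": lambda a, b: a ^ b,
}


def build_table(op):
    """16-entry table indexed by the concatenation of two encoded operands."""
    table = [0] * 16  # assumption: cells at invalid indices hold 0
    for a in (0, 1):
        for b in (0, 1):
            table[(ENC[a] << 2) | ENC[b]] = ENC[op(a, b)]
    return table


def dpl_macro(a, b, table):
    """Transliteration of the macro; a and b are DPL-encoded values."""
    r0 = 0          # assumption: r0 always holds 0
    r1 = r0         # mov r1 r0       precharge
    r1 = a          # mov r1 a        evaluation
    r1 = r1 & 3     # and r1 r1 #3    mask
    r1 = r1 << 1    # lsl r1 r1 #1    shift
    r1 = r1 << 1    # lsl r1 r1 #1    shift
    r2 = r0         # mov r2 r0       precharge
    r2 = b          # mov r2 b        evaluation
    r2 = r2 & 3     # and r2 r2 #3    mask
    r1 = r1 | r2    # orr r1 r1 r2    concatenation
    r3 = r0         # mov r3 r0       precharge
    r3 = table[r1]  # mov r3 !r1,op   look-up
    d = r0          # mov d r0        precharge
    d = r3          # mov d r3        evaluation
    return d


for name, op in OPS.items():
    table = build_table(op)
    for a in (0, 1):
        for b in (0, 1):
            result = DEC[dpl_macro(ENC[a], ENC[b], table)]
            print(f"{a} {name} {b} = {result}")
```

The redundant-looking assignments are kept on purpose: each one mirrors an instruction of the macro, and the precharge writes are what make every later update start from the reference state.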

Generation of DPL Protected Assembly Code / Code Transformation
1. Bitslice code.
2. DPL macro expansion.
3. Look-up tables.

Generation of DPL Protected Assembly Code / Code Transformation
1. Bitslicing Code
Always possible in principle (by the Turing machine equivalence theorem),
but hard to do automatically in practice.
However, many (manually) bitsliced implementations already exist, since bitslicing is a common optimization technique.
We take already bitsliced code as input.

Generation of DPL Protected Assembly Code / Code Transformation
2.1. Sensitive Instructions
Definition (Sensitive value): A value is said to be sensitive if it depends on sensitive data. Sensitive data depend on the secret key or the plaintext.
Definition (Sensitive instruction): An instruction is said to be sensitive if it may modify the Hamming weight of a sensitive value.
All the sensitive instructions must be expanded to a DPL macro.
Thus, all the sensitive data must be transformed too.

Generation of DPL Protected Assembly Code / Code Transformation
2.2. Which Instructions are Sensitive?
Bitsliced code means that only the logical (bit-level) operators, except shifts, are used in sensitive instructions.
The DPL protocol implies that not instructions are replaced by xor.
Only and, or, and xor instructions need to be expanded to DPL macros.
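One plausible reading of the remark about not is that negating a DPL encoded value amounts to swapping its two rails, which is an xor with the constant 11 when the two least significant bits are used. The sketch below illustrates that reading; the constant mask and the name `dpl_not` are assumptions, not taken from the slides.

```python
# Encoding from the slides: logical 1 is 0b01, logical 0 is 0b10.
ENC = {0: 0b10, 1: 0b01}


def dpl_not(encoded: int) -> int:
    """Swap the two rails of an encoded bit by xoring with 0b11."""
    return encoded ^ 0b11


for bit in (0, 1):
    negated = dpl_not(ENC[bit])
    # The Hamming weight stays at 1, so the weight of the value does not
    # depend on the sensitive bit before or after the negation.
    print(bit, bin(ENC[bit]), bin(negated), bin(negated).count("1"))
```

Because the Hamming weight of the value is unchanged, such an xor does not meet the definition of a sensitive instruction given on the previous slide.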

Generation of DPL Protected Assembly Code / Code Transformation
3. Look-Up Tables
Addresses of the look-up tables are sensitive too: their indices are sensitive values.
Thus, the address bits corresponding to the accessed cell must be 0.
In our example, the look-up table addresses must be multiples of 16.

index  0000  0001  0010  0011  0100  0101  0110  0111
and    00    00    00    00    00    01    10    00
or     00    00    00    00    00    01    01    00
xor    00    00    00    00    00    10    01    00

index  1000  1001  1010  1011  1100  1101  1110  1111
and    00    10    10    00    00    00    00    00
or     00    01    10    00    00    00    00    00
xor    00    01    10    00    00    00    00    00
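The table contents and the alignment requirement can both be checked mechanically. In the sketch below the three rows are copied from the slide, while the base addresses `0x120` and `0x128` and the helper names are hypothetical values chosen only for illustration.

```python
# Rows copied from the slide, indexed from 0b0000 to 0b1111.
AND = [0, 0, 0, 0, 0, 0b01, 0b10, 0, 0, 0b10, 0b10, 0, 0, 0, 0, 0]
OR = [0, 0, 0, 0, 0, 0b01, 0b01, 0, 0, 0b01, 0b10, 0, 0, 0, 0, 0]
XOR = [0, 0, 0, 0, 0, 0b10, 0b01, 0, 0, 0b01, 0b10, 0, 0, 0, 0, 0]

ENC = {0: 0b10, 1: 0b01}  # logical 0 is 10, logical 1 is 01


def index_of(a: int, b: int) -> int:
    """Concatenation of the two encoded operands, as built by the macro."""
    return (ENC[a] << 2) | ENC[b]


def check_table(table, op) -> bool:
    """True if every valid cell holds the encoding of op(a, b)."""
    return all(table[index_of(a, b)] == ENC[op(a, b)]
               for a in (0, 1) for b in (0, 1))


def address_weights(base: int) -> set:
    """Hamming weights of the addresses touched for the four valid indices."""
    return {bin(base + index_of(a, b)).count("1")
            for a in (0, 1) for b in (0, 1)}


print(check_table(AND, lambda a, b: a & b))
print(address_weights(0x120))  # base is a multiple of 16
print(address_weights(0x128))  # base is not a multiple of 16
```

With an aligned base the four low address bits come only from the index, whose Hamming weight is always 2, so the address weight is the same for every input. With a misaligned base the addition carries into other bits and the weight starts to depend on the operands.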

Generation of DPL Protected Assembly Code / Correctness Proof of the Transformation
Definition (Correct DPL transformation):
Let $S$ be a valid state of the system (values in registers and memory).
Let $c$ be a sequence of instructions of the system.
Let $\hat{S}$ be the state of the system after the execution of $c$ with state $S$; we denote that by $S \xrightarrow{c} \hat{S}$.
We write $\mathrm{dpl}(S)$ for the DPL state equivalent to the state $S$.
We say that $c'$ is a correct DPL transformation of the code $c$ if $S \xrightarrow{c} \hat{S} \implies \mathrm{dpl}(S) \xrightarrow{c'} \mathrm{dpl}(\hat{S})$.
Proposition (Correctness of our code transformation):
The expansion of the sensitive instructions into DPL macros is a correct DPL transformation.
Proof in the paper.

Formally Proving the Absence of Leakage
Example execution for and, for each of the four possible inputs. The last column gives the sensitive activity of each instruction; "?" marks a value that is not yet known.

a, b = 10, 10
instruction       d    r1    r2   r3   activity
mov r1 r0         ?    0     ?    ?    0
mov r1 a          ?    10    ?    ?    1
and r1 r1 #3      ?    10    ?    ?    0
lsl r1 r1 #1      ?    100   ?    ?    2
lsl r1 r1 #1      ?    1000  ?    ?    2
mov r2 r0         ?    1000  0    ?    0
mov r2 b          ?    1000  10   ?    1
and r2 r2 #3      ?    1000  10   ?    0
orr r1 r1 r2      ?    1010  10   ?    1
mov r3 r0         ?    1010  10   0    0
mov r3 !r1,and    ?    1010  10   10   3
mov d r0          0    1010  10   10   0
mov d r3          10   1010  10   10   1

a, b = 10, 01
instruction       d    r1    r2   r3   activity
mov r1 r0         ?    0     ?    ?    0
mov r1 a          ?    10    ?    ?    1
and r1 r1 #3      ?    10    ?    ?    0
lsl r1 r1 #1      ?    100   ?    ?    2
lsl r1 r1 #1      ?    1000  ?    ?    2
mov r2 r0         ?    1000  0    ?    0
mov r2 b          ?    1000  01   ?    1
and r2 r2 #3      ?    1000  01   ?    0
orr r1 r1 r2      ?    1001  01   ?    1
mov r3 r0         ?    1001  01   0    0
mov r3 !r1,and    ?    1001  01   10   3
mov d r0          0    1001  01   10   0
mov d r3          10   1001  01   10   1

a, b = 01, 10
instruction       d    r1    r2   r3   activity
mov r1 r0         ?    0     ?    ?    0
mov r1 a          ?    01    ?    ?    1
and r1 r1 #3      ?    01    ?    ?    0
lsl r1 r1 #1      ?    010   ?    ?    2
lsl r1 r1 #1      ?    0100  ?    ?    2
mov r2 r0         ?    0100  0    ?    0
mov r2 b          ?    0100  10   ?    1
and r2 r2 #3      ?    0100  10   ?    0
orr r1 r1 r2      ?    0110  10   ?    1
mov r3 r0         ?    0110  10   0    0
mov r3 !r1,and    ?    0110  10   10   3
mov d r0          0    0110  10   10   0
mov d r3          10   0110  10   10   1

a, b = 01, 01
instruction       d    r1    r2   r3   activity
mov r1 r0         ?    0     ?    ?    0
mov r1 a          ?    01    ?    ?    1
and r1 r1 #3      ?    01    ?    ?    0
lsl r1 r1 #1      ?    010   ?    ?    2
lsl r1 r1 #1      ?    0100  ?    ?    2
mov r2 r0         ?    0100  0    ?    0
mov r2 b          ?    0100  01   ?    1
and r2 r2 #3      ?    0100  01   ?    0
orr r1 r1 r2      ?    0101  01   ?    1
mov r3 r0         ?    0101  01   0    0
mov r3 !r1,and    ?    0101  01   01   3
mov d r0          0    0101  01   01   0
mov d r3          01   0101  01   01   1
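The activity columns of these example executions can be recomputed with a short simulation. This is a sketch, not the authors' tool: it assumes that every register starts at 0 (so the precharge of an unknown, non-sensitive value counts as 0), that the activity of an update is the Hamming distance on the destination, and that a look-up additionally costs the Hamming weight of its index. The names `hw`, `AND_TABLE`, and `trace` are hypothetical.

```python
def hw(value: int) -> int:
    """Hamming weight."""
    return bin(value).count("1")


ENC = {0: 0b10, 1: 0b01}  # logical 0 is 10, logical 1 is 01

AND_TABLE = [0] * 16
for _a in (0, 1):
    for _b in (0, 1):
        AND_TABLE[(ENC[_a] << 2) | ENC[_b]] = ENC[_a & _b]


def trace(a: int, b: int, table) -> list:
    """Per-instruction activity of the DPL macro for encoded inputs a and b."""
    regs = {"r0": 0, "r1": 0, "r2": 0, "r3": 0, "d": 0}
    activity = []

    def write(dst, value, address_bits=0):
        # Hamming distance of the update, plus the index weight for look-ups.
        activity.append(hw(regs[dst] ^ value) + hw(address_bits))
        regs[dst] = value

    write("r1", regs["r0"])                                  # mov r1 r0
    write("r1", a)                                           # mov r1 a
    write("r1", regs["r1"] & 3)                              # and r1 r1 #3
    write("r1", regs["r1"] << 1)                             # lsl r1 r1 #1
    write("r1", regs["r1"] << 1)                             # lsl r1 r1 #1
    write("r2", regs["r0"])                                  # mov r2 r0
    write("r2", b)                                           # mov r2 b
    write("r2", regs["r2"] & 3)                              # and r2 r2 #3
    write("r1", regs["r1"] | regs["r2"])                     # orr r1 r1 r2
    write("r3", regs["r0"])                                  # mov r3 r0
    write("r3", table[regs["r1"]], address_bits=regs["r1"])  # mov r3 !r1,and
    write("d", regs["r0"])                                   # mov d r0
    write("d", regs["r3"])                                   # mov d r3
    return activity


traces = {(a, b): trace(ENC[a], ENC[b], AND_TABLE)
          for a in (0, 1) for b in (0, 1)}
for inputs, activity in traces.items():
    print(inputs, activity)
```

All four inputs yield the same activity sequence, which is the property the example executions are meant to show: under this leakage model, an attacker observing the activity learns nothing about a or b.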

Formally Proving the Absence of Leakage / Computed Proof of Constant Activity
Our tool does this verification automatically for the whole program.
It uses symbolic computations to keep track of possible leakages.
The strategy is to simulate a CPU and memory in software, and to compute with sets of values.
Initially, all sensitive data values can be either 0 or 1.
At each cycle and for each possible combination of actual values:
  it looks at the Hamming weight of values and the Hamming distance of updates in registers, memory, and addresses;
  if one of them can take different values, it reports a leak.
This verification is independent of the code transformation.

Formally Proving the Absence of Leakage / Hardware Characterization
The DPL countermeasure heavily relies on the hypothesis that the hardware provides indistinguishable resources.
This property is generally not true on non-specialized hardware.
Using the bits whose leakages are the most similar maximizes the relevance of our leakage model.
Profiling the hardware makes it possible to find them.

Case Study: PRESENT on an AVR Micro-Controller

Case Study: PRESENT on an AVR Micro-Controller / Profiling the AVR Micro-Controller
[Figure: NICV (from 0.0 to 1.0) as a function of time for bit 0 to bit 7; the time axis restarts for each bit.]
Leakage level during unprotected encryption for each bit of the ATmega163.
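The slides do not define the NICV metric plotted here. Assuming the usual definition of Normalized Inter-Class Variance, NICV = Var(E[Y | X]) / Var(Y), where Y is the measured leakage at one time sample and X is the class (for instance the value of one bit), it can be sketched as follows. The function name `nicv` and the toy data are hypothetical.

```python
from collections import defaultdict
from statistics import fmean, pvariance


def nicv(classes, leakages) -> float:
    """Normalized Inter-Class Variance at one time sample.

    classes[i] is the class of trace i and leakages[i] its measured value.
    Raises ZeroDivisionError if all leakages are equal.
    """
    groups = defaultdict(list)
    for x, y in zip(classes, leakages):
        groups[x].append(y)
    class_mean = {x: fmean(ys) for x, ys in groups.items()}
    # Variance of the class means, each weighted by its number of traces.
    explained = pvariance([class_mean[x] for x in classes])
    return explained / pvariance(leakages)


# Toy data: the leakage is fully determined by the bit, then unrelated to it.
print(nicv([0, 0, 1, 1], [1.0, 1.0, 3.0, 3.0]))
print(nicv([0, 0, 1, 1], [1.0, 3.0, 1.0, 3.0]))
```

A value close to 1 means that the class explains almost all of the variance of the measurement, while a value close to 0 means that the measurement carries little information about the class. Comparing such curves bit by bit is one way to pick two bits that leak similarly.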

Case Study: PRESENT on an AVR Micro-Controller / Generating Balanced AVR Assembly
DPL macro for d = a op b on the ATmega163:
  r1 ← r0
  r1 ← a
  r1 ← r1 ∧ 6
  r1 ← r1 ≪ 1
  r1 ← r1 ≪ 1
  r2 ← r0
  r2 ← b
  r2 ← r2 ∧ 6
  r1 ← r1 ∨ r2
  r3 ← r0
  r3 ← op[r1]
  d ← r0
  d ← r3

Case Study: PRESENT on an AVR Micro-Controller / Cost of the Countermeasure

            bitslice   DPL       cost
code (B)    1620       3056      ×1.88
RAM (B)     288        352       +64
#cycles     78,403     235,427   ×3

DPL cost.
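The cost column follows directly from the two measurement columns. As a quick check, with the figures copied from the table and variable names chosen for this sketch:

```python
# Figures copied from the cost table (bytes of code, bytes of RAM, cycles).
bitslice = {"code": 1620, "ram": 288, "cycles": 78403}
dpl = {"code": 3056, "ram": 352, "cycles": 235427}

code_ratio = dpl["code"] / bitslice["code"]        # about 1.886
ram_extra = dpl["ram"] - bitslice["ram"]           # 64 bytes
cycle_ratio = dpl["cycles"] / bitslice["cycles"]   # about 3.003

print(f"code x{code_ratio:.3f}, RAM +{ram_extra} B, cycles x{cycle_ratio:.3f}")
```

The code size ratio is reported truncated to two decimals, and the cycle count ratio is rounded to the nearest integer.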

Case Study: PRESENT on an AVR Micro-Controller / Attacks
We attacked three implementations:
  a bitsliced but unprotected one,
  a DPL protected one using the two least significant bits,
  a DPL protected one taking the hardware characterization into account.
We took 100,000 execution traces.
We computed the success rate of a monobit CPA using the output of the S-Box as a model.

Case Study: PRESENT on an AVR Micro-Controller / Attacks
Results
The unprotected implementation breaks using about 400 traces.
The poorly balanced one is still not broken using 100,000 traces.
But we want to show that the hardware characterization is beneficial!
Let's make the attacker cheat.
We used our knowledge of the key to select a narrow part of the traces where we knew that the attack would work.
We used the NICV to select the point where the signal-to-noise ratio of the CPA attack is the highest.

Case Study: PRESENT on an AVR Micro-Controller / Attacks
Results for the Cheating Attacker
The unprotected implementation breaks using 138 traces.
The poorly balanced one breaks using 1,470 traces.
The better balanced one breaks using 4,810 traces.

Case Study: PRESENT on an AVR Micro-Controller / Attacks
Results for the Cheating Attacker: unprotected
[Figure, two panels. One panel plots the success rate against the traces count; 80% success rate is reached at 138 traces. The other panel, titled "Bitslice, unprotected", plots the CPA correlation against time (# of samples) for all 16 guesses (correct one in black), after 400 traces.]
Monobit CPA attack on unprotected bitslice implementation.

Case Study: PRESENT on an AVR Micro-Controller / Attacks
Results for the Cheating Attacker: poorly balanced
[Figure, two panels. One panel plots the success rate against the traces count; 80% success rate is reached at 1470 traces (optimistic). The other panel, titled "Bitslice DPL, poorly balanced", plots the CPA correlation against time (# of samples) for all 16 guesses (correct one in black), after 9000 traces.]
Monobit CPA attack on poorly balanced DPL implementation (bits 0 and 1).

Case Study: PRESENT on an AVR Micro-Controller / Attacks
Results for the Cheating Attacker: better balanced
[Figure, two panels. One panel plots the success rate against the traces count; 80% success rate is reached at 4810 traces. The other panel, titled "Bitslice DPL, better balanced", plots the CPA correlation against time (# of samples) for all 16 guesses (correct one in black), after 9000 traces.]
Monobit CPA attack on better balanced DPL implementation (bits 1 and 2).

Conclusions
Automatic and proven correct code protection.
Independent formal proof of constant activity according to a leakage model.
Hardware characterization method to increase the relevance of the leakage model.
Provably balanced DPL protected implementation of PRESENT:
  At least 250 times more resistant to power analysis attacks.
  SNR divided by at least 16.
  Only 3 (or 24) times slower.
Software balancing countermeasures are realistic.
http://pablo.rauzy.name/sensi/paioli.html

Perspectives
The pair of bits used for the DPL protocol could change during the execution, or be chosen at random for each execution.
Unused bits could be randomized instead of being zero, in order to add noise on top of balancing.
Randomness could be used to mask the computation.
Also:
  our methods and tools need to be further tested in other experimental settings;
  although the mapping from the internal assembly of our tool to the concrete assembly is straightforward, it would be better to have a formal correctness proof of the mapping;
  our work would also benefit from automated bitslicing.
We believe formal methods have a bright future in the certification of side-channel attack countermeasures for trustworthy cryptosystems.

That was it. Questions?
rauzy@enst.fr
Open access and always up-to-date version of the paper: IACR eprint 2013/554