Implementation and Performance Testing of the SQUASH RFID Authentication Protocol


Philip Koshy, Justin Valentin and Xiaowen Zhang*
Department of Computer Science, College of Staten Island, Staten Island, New York, 10314
E-mail: Xiaowen.Zhang@csi.cuny.edu

Abstract

We implement and test the performance of an RFID hash algorithm recently proposed by Adi Shamir [1] using a C++ simulation. The algorithm, called SQUASH (short for SQUare-hASH), allows for an RFID tag design that is simple enough to be implemented on low-cost RFID tags. Shamir has proved mathematically that his SQUASH algorithm is at least as secure as the Rabin cryptosystem [2], which has been extensively tested and scrutinized for nearly 30 years. The SQUASH algorithm is designed to minimize processing time and cost without sacrificing security.

Shamir's SQUASH algorithm was developed as a lightweight version of a hash function from the Rabin cryptosystem, $c = m^2 \bmod n$. The SQUASH algorithm is theoretically faster because, rather than storing, computing, and transmitting a large ciphertext, it allows a low-cost RFID tag to compute the bits of the ciphertext in real time, bit by bit, transmitting them as they are calculated. Although the SQUASH algorithm is provably as secure as the Rabin cryptosystem, its performance has not been carefully scrutinized. Shamir writes that the performance of the SQUASH algorithm should scale linearly as the size of a tag's register increases; we test this specific claim using a software simulation.

Keywords: RFID; Hash Algorithm; Rabin Cryptosystem

I. INTRODUCTION

Radio Frequency Identification (RFID) tags allow large organizations in the public and private sector to catalog, index and process a large volume of data wirelessly. Because of a growing need for low-cost RFID tags, individual tags are created with only a small amount of working memory, and security is often implemented as an afterthought.
Some consumer rights organizations, like CASPIAN (Consumers Against Supermarket Privacy Invasion and Numbering) and EPIC (Electronic Privacy Information Center), are against the use of RFID because of possible privacy violations. As the security industry searches for a cost-effective, lightweight RFID hash algorithm to deal with these security concerns, Adi Shamir has created the SQUASH algorithm (short for SQUare-hASH) as one possible solution. Unlike other hash functions that have been proposed without a strong mathematical basis, the SQUASH algorithm is provably as secure as the hash function from the Rabin cryptosystem, $c = m^2 \bmod n$, which has been studied and scrutinized for nearly 30 years. The SQUASH algorithm has the benefit of producing bits of the ciphertext on the fly, bit by bit, thus allowing for high data throughput.

We have focused on a specific claim made by Shamir, namely that the performance of the SQUASH algorithm will scale linearly as the tag register size is increased exponentially. We have created a simulation of the algorithm in C++ to verify this claim.

*X. Zhang is the corresponding author; his work is supported in part by a PSC-CUNY Award.
978-1-4244-5550-8/10/$26.00 © 2010 IEEE

II. AUTHENTICATION VIA HASH ALGORITHMS

A. Authenticate using the result of a mathematical operation

If security were not a concern, an RFID reader could authenticate an RFID tag by simply requesting a secret value from the tag. The reader could then match and verify this secret value against its own internal database. While this approach is possible, the tag would send sensitive data that could be retrieved by an eavesdropping attack and then exploited by a man-in-the-middle attack. To avoid this problem, a tag sends something called a ciphertext instead. Rather than share a secret value over a potentially insecure transmission medium, we can perform a mathematical operation on the secret value and send the result. The operation is called a hash function and the result is called the ciphertext.

B. The Authentication Process

When beginning transmission, the RFID reader sends a pseudorandom number $R$ to the RFID tag. At this point, the tag and the reader both know the following three pieces of information: (1) the hash function $H$, (2) a secret piece of information $S$, uniquely identifying the tag, and (3) the pseudorandom number $R$.

After receiving $R$ from the reader, the tag performs an exclusive-or operation on $R$ and $S$ and sends the result through the hash function $H$. This produces the ciphertext $C_1$. In other words, the tag calculates $H(R \oplus S) = C_1$. Simultaneously, the reader uses the same $H$, $S$ and $R$ to calculate its own ciphertext, $C_2$. Therefore, the reader produces $H(R \oplus S) = C_2$.

The tag sends its calculated ciphertext $C_1$ back to the reader, and the reader compares it with its own $C_2$. If $C_2 = C_1$, then the RFID reader has successfully authenticated the RFID tag, since both have computed the same value for the ciphertext.
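The exchange described above can be sketched in a few lines of C++. This is only an illustration under stated assumptions: `toyHash`, `tagResponse` and `readerAccepts` are hypothetical names, and the hash is a toy Rabin-style squaring modulo the small Mersenne number $2^{31} - 1$, standing in for $H$; it is not the SQUASH function itself.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for the hash H (an assumption for illustration only):
// Rabin-style squaring modulo the Mersenne number 2^31 - 1.
std::uint64_t toyHash(std::uint64_t x) {
    const std::uint64_t n = (1ULL << 31) - 1;
    x %= n;
    assert(x < n);          // invariant: x < 2^31, so x * x fits in 64 bits
    return (x * x) % n;
}

// Tag side: combine the reader's challenge R with the secret S, then hash.
std::uint64_t tagResponse(std::uint64_t challengeR, std::uint64_t secretS) {
    return toyHash(challengeR ^ secretS);   // C1 = H(R xor S)
}

// Reader side: recompute C2 from its stored copy of S and compare with C1.
bool readerAccepts(std::uint64_t challengeR, std::uint64_t storedS,
                   std::uint64_t c1) {
    const std::uint64_t c2 = toyHash(challengeR ^ storedS);   // C2 = H(R xor S)
    return c1 == c2;
}
```

A reader would call `readerAccepts(R, S, tagResponse(R, S))` for a genuine tag; a tag holding a different secret would generally produce a different ciphertext and be rejected. The secret $S$ itself never crosses the channel, only the challenge and the ciphertext do.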

III. SQUASH AND THE RABIN CRYPTOSYSTEM

In the one-way function $c = m^2 \bmod n$ from the Rabin cryptosystem, $c$ represents the ciphertext, $m$ represents the message to be hashed, and $n$ is a Mersenne number ($n = 2^k - 1$, where $k$ is the message length) that has not yet been factored. If we attempted to square, reduce and store a large number, the implementation would be too cumbersome and too slow to work on a low-cost RFID tag. The following steps show the derivation of the SQUASH algorithm from the Rabin cryptosystem. The focus is on high speed while maintaining a small design footprint.

A. Modular Reduction

We look at an example to demonstrate how Shamir simplifies the modular square operation ($m^2 \bmod n$) used in the Rabin cryptosystem. Say we have a 4-bit message with a value of 1010 binary. (Since the message is 4 bits, $k = 4$ and $n = 2^4 - 1 = 15$ in this example.) This means:

    m   = 1010 binary      (which is 10 decimal)
    m^2 = 0110 0100 binary (which is 100 decimal)

We can break up $m^2$ into a top half $g_1$ and a bottom half $g_0$ as follows:

    m^2 = 0110 0100
          g1   g0
    g1  = 0110 binary = 6 decimal
    g0  = 0100 binary = 4 decimal

Mathematically, the square of the message is equivalent to

    $m^2 = g_1 \cdot 2^k + g_0$

This is because multiplying $g_1$ by $2^k$ is the same as shifting $g_1$ left by $k$ places. Visually, the operation looks like this for our example, where $g_1 = 0110$ and $k = 4$:

    g1 = 0110         Initial value of g1 from above
    g1 = 0110 0       Shift left (same as multiplying by 2^1)
    g1 = 0110 00      Shift left (same as multiplying by 2^2)
    g1 = 0110 000     Shift left (same as multiplying by 2^3)
    g1 = 0110 0000    Shift left (same as multiplying by 2^4)

If we add $g_1$ (after the four shifts) to $g_0$, we see that we get $m^2$:

      g1  = 0110 0000
    + g0  = 0000 0100
      m^2 = 0110 0100

From our original formula ($c = m^2 \bmod n$), we know that $n$ is of the form $2^k - 1$. Performing some algebra, we can simplify the cipher calculation:

    $2^k - 1 = n$               Original formula
    $2^k = n + 1$               After adding 1 to both sides
    $2^k \equiv 1 \pmod{n}$     After taking mod n on both sides

This means our cipher can be simplified as follows:

    $c = m^2 \bmod n = (g_1 \cdot 2^k + g_0) \bmod n = (g_1 + g_0) \bmod n$

Therefore the cipher is simply equal to $g_1 + g_0$, reduced mod $n$ when the sum reaches $n$. In our example, $c = 6 + 4 = 10$, which agrees with $100 \bmod 15 = 10$.

B. Mathematical Convolution

In order to find $g_1$ and $g_0$, it would seem that we need to calculate and store $m^2$ in its entirety. Luckily, because of a process called mathematical convolution, we can generate the bits of $m^2$ on the fly without storing the square. To better understand convolution, we begin by examining the operation of squaring in a generalized sense. The following example demonstrates how we can square a 4-bit binary number, where the individual bits of the number are represented by $m_3$ to $m_0$.

The final product of the squaring operation can be found by taking the sum of the bits in each column of partial products. For example, the bits of $g_0$ can be calculated as follows:

    Bit 0 of the solution = m_0 m_0
    Bit 1 of the solution = m_1 m_0 + m_0 m_1
    Bit 2 of the solution = m_2 m_0 + m_1 m_1 + m_0 m_2 + carry
    Bit 3 of the solution = m_3 m_0 + m_2 m_1 + m_1 m_2 + m_0 m_3 + carry

In each column, the low-order bit of the sum is the output bit and the rest is carried into the next column. Thus, in order to generate a bit in the lower half $g_0$ (we can call it bit $j$), we use this formula:

    $\text{bit}_j = \sum_{v=0}^{j} m_v \, m_{j-v} + \text{carry}, \quad 0 \le j \le k-1$    (1)

Similarly, the bits of $g_1$ can be calculated as follows:

    Bit 4 of the solution = m_3 m_1 + m_2 m_2 + m_1 m_3 + carry
    Bit 5 of the solution = m_3 m_2 + m_2 m_3 + carry
    Bit 6 of the solution = m_3 m_3 + carry
    Bit 7 of the solution = carry

To generate a bit in the upper half $g_1$ (call it bit $j+k$), we use this formula:

    $\text{bit}_{j+k} = \sum_{v=j+1}^{k-1} m_v \, m_{j+k-v} + \text{carry}, \quad 0 \le j \le k-1$    (2)

Combining equations (1) and (2) gives the final SQUASH algorithm, which is stated in Section IV.
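The two ideas of this section, generating the bits of $m^2$ column by column and folding the top half onto the bottom half, can be illustrated with a small C++ sketch. It is a hedged illustration, not the authors' implementation: the function names `bitOf`, `squareBitsOnTheFly` and `squareModMersenne` are hypothetical, and the sizes are limited so that everything fits in 64-bit integers.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Bit i of m (bit 0 is the least significant bit).
static unsigned bitOf(std::uint64_t m, unsigned i) {
    return static_cast<unsigned>((m >> i) & 1u);
}

// Produce the 2k bits of m^2 one column at a time, least significant first,
// keeping only a running carry instead of storing the full product.
std::vector<unsigned> squareBitsOnTheFly(std::uint64_t m, unsigned k) {
    assert(k >= 1 && k <= 32);
    std::vector<unsigned> bits;
    std::uint64_t carry = 0;
    for (unsigned j = 0; j < 2 * k; ++j) {
        // Column j collects every product m_v * m_(j-v) whose indices both
        // lie in 0..k-1; one loop covers equations (1) and (2).
        std::uint64_t sum = carry;
        for (unsigned v = 0; v < k; ++v) {
            if (j >= v && j - v < k) sum += bitOf(m, v) & bitOf(m, j - v);
        }
        bits.push_back(static_cast<unsigned>(sum & 1u));  // emit bit j
        carry = sum >> 1;                                  // rest moves on
    }
    return bits;
}

// c = m^2 mod (2^k - 1), computed as g1 + g0 with any overflow folded back.
std::uint64_t squareModMersenne(std::uint64_t m, unsigned k) {
    assert(k >= 2 && k <= 31 && m < (1ULL << k));
    const std::uint64_t n = (1ULL << k) - 1;
    const std::uint64_t sq = m * m;                 // fits: m < 2^31
    std::uint64_t c = (sq >> k) + (sq & n);         // g1 + g0
    while (c > n) c = (c >> k) + (c & n);           // 2^k = 1 (mod n)
    return c == n ? 0 : c;
}
```

For the running example, `squareBitsOnTheFly(10, 4)` yields the bits of 0110 0100 (least significant first), and `squareModMersenne(10, 4)` returns 10, matching $g_1 + g_0 = 6 + 4$. Only the carry has to be stored between columns, which is what makes the approach attractive on a tag with very little memory.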

IV. IMPLEMENTATION

Once we were able to mathematically generate bits of the product on the fly, we looked for an approach that would work in our C++ implementation. Shamir suggested generating these bits using non-linear feedback shift registers (NLFSRs). A paper by Gosset, Standaert and Quisquater [3] gave us ideas. The group implemented the SQUASH algorithm on a Field-Programmable Gate Array (FPGA) using NLFSRs. As described in their paper, the final SQUASH algorithm can be simply stated as follows:

    Set c = 0, m = NLFSR(R ⊕ S)
    For j = 600 to j = 647:
        update the carry c with the column of bit products for position j, as in equations (1) and (2), and emit the resulting bit
    Output the 32 ciphertext bits

To understand the implementation, we must first understand linear feedback shift registers.

A. Linear Feedback Shift Register

Imagine we were tasked with enumerating all possible combinations (also known as states) of a 4-bit binary string.

TABLE I.

    State   Binary   Decimal
    0       0000     0
    1       0001     1
    2       0010     2
    3       0011     3
    4       0100     4
    5       0101     5
    6       0110     6
    7       0111     7
    8       1000     8
    9       1001     9
    10      1010     10
    11      1011     11
    12      1100     12
    13      1101     13
    14      1110     14
    15      1111     15

Intuitively, we would count from 0000 to 1111 (in binary) as above. Notice that we can represent 16 possible states with 4 binary bits ($2^4 = 16$). Although counting linearly upward from state to state like this is an intuitive approach, it turns out to be a complex implementation in physical hardware. An alternative design for generating nearly all possibilities is to use a linear feedback shift register (LFSR).

A linear feedback shift register allows us to generate nearly all possible binary combinations for a binary string of length $n$, albeit with states that are not in any particular order. (We say nearly all combinations because the state 0000 is invalid for reasons we will soon see.) A diagram of a simple LFSR follows.

Figure 1. State 0 contains the value 1010

Notice that bits 3 and 4 (counting from left to right) are being fed into an exclusive-or (XOR) gate. These bits are said to be tapped. The number and order of the tapped bits are referred to as a tap sequence. The tap sequence for this register would typically be listed as (4,3).

Figure 2. State 1 (after shifting right once from state 0)

If we were to generate all possible states of this particular shift register (by shifting right and using the XOR to feed in new values), the output states of the LFSR would be as follows:

TABLE II.

    State   Binary   Decimal
    0       1010     10
    1       1101     13
    2       1110     14
    3       1111     15
    4       0111     7
    5       0011     3
    6       0001     1
    7       1000     8
    8       0100     4
    9       0010     2
    10      1001     9
    11      1100     12
    12      0110     6
    13      1011     11
    14      0101     5
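The 4-bit register with tap sequence (4,3) is small enough to simulate directly. The following C++ sketch is an illustration only; the names `lfsrStep` and `lfsrCycle` are hypothetical, and it assumes that bit 4 (the rightmost bit in the figures) is stored as the least significant bit of the state.

```cpp
#include <cassert>
#include <vector>

// One right shift of a 4-bit LFSR with tap sequence (4,3).
// Bits are numbered 1..4 from left to right, so bit 4 is the least
// significant bit of `state` and bit 3 sits directly to its left.
unsigned lfsrStep(unsigned state) {
    const unsigned bit4 = state & 1u;
    const unsigned bit3 = (state >> 1) & 1u;
    const unsigned feedback = bit3 ^ bit4;        // XOR of the tapped bits
    return (state >> 1) | (feedback << 3);        // feedback enters on the left
}

// Collect the states visited until the register returns to its seed.
std::vector<unsigned> lfsrCycle(unsigned seed) {
    assert(seed != 0 && seed < 16);   // 0000 would lock the register
    std::vector<unsigned> states;
    unsigned s = seed;
    do {
        states.push_back(s);
        s = lfsrStep(s);
    } while (s != seed);
    return states;
}
```

Calling `lfsrCycle(10)` starts from 1010 and visits 15 states in the order 10, 13, 14, 15, 7, 3, 1, 8, 4, 2, 9, 12, 6, 11, 5 before returning to 1010, so every non-zero 4-bit value appears exactly once.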

There are two important differences between Table I and Table II. First, the states in Table II are completely out of order, which is why LFSRs are often used in cryptography. Second, if we look carefully, this shift register goes through all possible states except the state 0000. A state of 0000 being fed into the XOR gate (regardless of the tap sequence) will always produce a 0 and, consequently, we would be stuck in the state 0000 indefinitely.

B. Tap Sequences

Keep in mind that in the previous LFSR example, we tapped bits 3 and 4. If we had chosen a different tap sequence, our register might not have gone through all 15 states. In general, a good tap sequence is one that gives us maximal length. This simply means that an LFSR should give us as many different states as possible before it loops back around. Mathematically, a maximal length tap sequence will always yield $2^n - 1$ possible states, where $n$ represents the number of binary bits. Remember that we subtract 1 because of our inability to represent the all-zero state.

In order to find a maximal length tap sequence, we can look up the information in a table. For our purposes, we were able to find maximal length tap sequences from Wikipedia for 4, 8, 16, 32, 64 and 128 bit registers [4]. The maximal length tap sequences we used are shown here:

TABLE III.

    Register Size (in bits)   Maximal Length Tap Sequence
    4                         4,3
    8                         8,6,5,4
    16                        16,15,13,4
    32                        32,22,2,1
    64                        64,63,61,60
    128                       128,126,101,99

Note how the first value of the tap sequence is also the same as the register size. Essentially, this means the rightmost bit is always tapped for maximal length tap sequences.

C. Non-linear Reverse Feedback Shift Register

While we've seen that a normal LFSR can generate bits in the forward direction, we will also need to generate bits in the reverse direction to perform the mathematical convolution we've described previously. An example follows:

TABLE IV.

            Forward Shifter      Reverse Shifter
    State   Binary   Decimal     Binary   Decimal
    0       1010     10          0101     5
    1       1101     13          1011     11
    2       1110     14          0110     6
    3       1111     15          1100     12
    4       0111     7           1001     9
    5       0011     3           0010     2
    6       0001     1           0100     4
    7       1000     8           1000     8
    8       0100     4           0001     1
    9       0010     2           0011     3
    10      1001     9           0111     7
    11      1100     12          1111     15
    12      0110     6           1110     14
    13      1011     11          1101     13
    14      0101     5           1010     10

Notice how the forward and reverse shifters are mirror images of each other: reading the reverse column from bottom to top gives the forward column. To create a reverse shifter, we must change the shift direction from right to left as well as find a new reverse tap sequence. A simple formula can be used to convert the forward tap sequence into a reverse tap sequence: if (n, A, B, C) are the tapped bits in the forward direction, then (n, n-C, n-B, n-A) is the tap sequence in the reverse direction. For example, if the forward tap sequence for a 4-bit register is (4,3), the reverse tap sequence is (4,1), as Fig. 3 illustrates.

Figure 3. Reverse LFSR

V. PERFORMANCE TESTING

To test the performance of our SQUASH implementation in C++, we designed a program that simulates 100 timed executions of the SQUASH algorithm for registers of various sizes. After measuring the average time (in CPU cycles), we determined the frequency of the current CPU and used it to calculate an approximate tags-per-second metric. This was done by dividing the frequency of our CPU by the mean cycles per tag. Based on our results using this metric, our software implementation did indeed perform almost linearly, as Shamir had described.
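The tags-per-second metric described above is a single division once the mean cycle count is known. A minimal C++ sketch of that arithmetic follows; the helper names `meanCycles` and `tagsPerSecond` are hypothetical, and how the cycle counts are collected (for example from a CPU cycle counter) is left out because the paper does not specify it.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Mean of the per-run cycle counts (one entry per timed execution).
double meanCycles(const std::vector<std::uint64_t>& cyclesPerRun) {
    assert(!cyclesPerRun.empty());
    const double total =
        std::accumulate(cyclesPerRun.begin(), cyclesPerRun.end(), 0.0);
    return total / static_cast<double>(cyclesPerRun.size());
}

// Tags per second = CPU frequency (Hz) divided by mean cycles per tag.
double tagsPerSecond(double cpuHz, double meanCyclesPerTag) {
    assert(meanCyclesPerTag > 0.0);
    return cpuHz / meanCyclesPerTag;
}
```

As a made-up illustration of the units, a 2 GHz CPU that needs a mean of 500 million cycles per authentication gives `tagsPerSecond(2.0e9, 5.0e8)`, that is, 4 tags per second.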

Figure 4. Results of performance testing: tags authenticated by reader as a function of register size (y-axis: tags per second; x-axis: register size in bits)

Fig. 4 is slightly skewed because the x-axis grows exponentially. Table VI shows the linear relationship between register size and processed tags per second.

TABLE VI.

    Register Size (in bits)   Tags Per Second
    4                         4.3457
    8                         2.60074
    16                        1.68399
    32                        0.981616
    64                        0.590769
    128                       0.309733

VI. CONCLUSION

While our tests showed that performance did scale linearly, we noted that we should generally be able to process significantly more tags per second. Gosset et al.'s FPGA implementation of the SQUASH algorithm was able to process roughly 3,500 tags per second using an FPGA operating at 222 MHz [3]. Meanwhile, our C++ implementation, running on a dual-core 1.83 GHz Intel processor, was unable to reach 5 tags per second. As we can see, specialized hardware makes an enormous difference. Production-quality RFID readers should not be affected by the performance issues we faced when testing on a general-purpose PC.

REFERENCES

[1] Adi Shamir, "SQUASH - A New MAC With Provable Security Properties for Highly Constrained Devices Such as RFID Tags," Proc. Fast Software Encryption (FSE 2008), Lausanne, Switzerland, February 2008.
[2] Michael Rabin, "Digitalized Signatures and Public-Key Functions as Intractable as Factorization," Massachusetts Institute of Technology, Laboratory for Computer Science, TR-212, January 1979.
[3] F. Gosset, F.-X. Standaert, J.-J. Quisquater, "FPGA Implementation of SQUASH," 29th Symposium on Information Theory in the Benelux, Leuven, Belgium, May 2008, pp. 231-238.
[4] "Linear Feedback Shift Registers," Wikipedia, http://en.wikipedia.org/wiki/linear_feedback_shift_register, Accessed 2010.