Reconfigurable Hardware Implementation and Analysis of Mesh Routing for the Matrix Step of the Number Field Sieve Factorization

1 Reconfigurable Hardware Implementation and Analysis of Mesh Routing for the Matrix Step of the Number Field Sieve Factorization
Sashisu Bajracharya, MS CpE Candidate
Master's Thesis Defense
Advisor: Dr. Kris Gaj
George Mason University

2 Outline
RSA and Factoring with the Number Field Sieve (NFS)
Matrix Step of NFS
Basic Mesh Routing
Improved Mesh Routing
Results and Conclusions: FPGA Array
SRC-6e Reconfigurable Computer
Summary & Conclusions

3 RSA: Major Public Key Cryptosystem
Public key: (e, N). Private key: (d, P, Q).
[Diagram: Alice encrypts with {e, N}, the ciphertext crosses the network, and Bob decrypts with {d, P, Q}.]
N = P · Q, where P and Q are large prime factors.
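The relationship between the factors P, Q and the private exponent d can be illustrated with a toy example. This is a minimal sketch with deliberately tiny primes; the values 61, 53, 17 and the helper name `toy_rsa_private_exponent` are illustrative assumptions, not taken from the slides. It shows that anyone who can factor N can recompute d.

```python
from math import gcd


def toy_rsa_private_exponent(p: int, q: int, e: int) -> tuple[int, int]:
    """Return (N, d) for toy primes p, q and public exponent e."""
    n = p * q
    phi = (p - 1) * (q - 1)          # Euler's totient of N, computable only from P and Q
    if gcd(e, phi) != 1:
        raise ValueError("e must be coprime to (p-1)(q-1)")
    d = pow(e, -1, phi)              # modular inverse (Python 3.8+)
    return n, d


# Hypothetical toy parameters (far too small for real use).
P, Q, E = 61, 53, 17
N, D_PRIV = toy_rsa_private_exponent(P, Q, E)

message = 65
cipher = pow(message, E, N)          # encryption with the public key (e, N)
recovered = pow(cipher, D_PRIV, N)   # decryption with the private exponent d
print(N, D_PRIV, recovered)
```

Real keys use primes of hundreds of digits, which is why the cost of factoring N determines the security of the system.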

4 RSA developed by Ron Rivest, Adi Shamir & Leonard Adleman in 1977

5 Applications of RSA
Secure WWW, SSL: 95% of e-commerce (browser to web server over the network)
S/MIME, PGP (Alice to Bob)

6 How hard is it to break RSA?
Largest number factored: 576 bits, RSA-576 (Dec 2003)
Resources and effort: workstations from 8 different sites around the world, 3 months

7 Recommended key sizes for RSA
Old standard:
Individual users: 512 bits (155 decimal digits). Broken in 1999.
New standard:
Individual users: 768 bits
Organizations (short term): 1024 bits
Organizations (long term): 2048 bits

8 Estimated difficulty of factoring a 1024-bit number, by RSA Security, Inc.
342 million PCs, 500 MHz, 170 GB RAM each, 1 year

9 Our Task
Determine how hard it is to break RSA by factoring large keys using reconfigurable hardware:
Generic array of FPGAs
SRC-6e Reconfigurable Computer

10 Best Algorithm to Factor: NUMBER FIELD SIEVE
Complexity: sub-exponential time and memory
N = number to factor, k = number of bits of N
Exponential function: $e^{k}$
Sub-exponential function: $e^{k^{1/3} (\ln k)^{2/3}}$
Polynomial function: $a \cdot k^{m}$
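To get a feel for the gap between the three growth rates, the sketch below evaluates them for k = 512 and k = 1024. It works with natural logarithms of the function values, because $e^{1024}$ overflows a floating-point number. The polynomial parameters a = 1 and m = 3 are arbitrary assumptions chosen for illustration, and the sub-exponential form is the simplified one on the slide (constants omitted).

```python
import math


def log_exponential(k: int) -> float:
    """ln of e^k."""
    return float(k)


def log_subexponential(k: int) -> float:
    """ln of e^(k^(1/3) * (ln k)^(2/3)), the simplified NFS-style growth."""
    return k ** (1.0 / 3.0) * math.log(k) ** (2.0 / 3.0)


def log_polynomial(k: int, a: float = 1.0, m: int = 3) -> float:
    """ln of a * k^m (a and m are illustrative assumptions)."""
    return math.log(a) + m * math.log(k)


for bits in (512, 1024):
    print(
        f"k={bits}: ln(poly)={log_polynomial(bits):.1f}, "
        f"ln(subexp)={log_subexponential(bits):.1f}, "
        f"ln(exp)={log_exponential(bits):.1f}"
    )
```

For both key sizes the sub-exponential value sits between the polynomial and the exponential one, which is the qualitative point of the slide.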

11 Number Field Sieve (NFS) Steps
Polynomial Selection
Sieving (computationally intensive)
Matrix (Linear Algebra) (computationally intensive)
Square Root

12 Hardware Architectures of NFS Proposed to Date
Daniel Bernstein (Univ. of Illinois, Chicago): mesh approach for matrix and sieving. Mesh Sorting: Matrix, Fall 2001.
Adi Shamir, Eran Tromer (Weizmann Institute, Israel): Mesh Routing: Matrix, AsiaCrypt 2002. TWIRL: Sieving, Crypto 2003, AsiaCrypt 2003.
The mesh method improves the asymptotic complexity of NFS.
Just analytical estimations: no real implementation, no concrete numbers.

13 My Objective
Bring this mesh algorithm to a practical hardware implementation and concrete numbers.
Focus of research: Matrix (Linear Algebra) step.

14 My Objective
Detailed design of the mesh algorithm in RTL code.
Synthesis and implementation results for an array of Virtex FPGAs and the SRC-6e Reconfigurable Computer.

15 Function of Matrix Step
Find the linear dependency in the large sparse matrix obtained after the sieving step.
D = number of matrix columns or rows: about $10^{6}$ for 512-bit, $10^{7}$ for 1024-bit.
A linear dependency is a set of columns with $c_{i_1} + c_{i_2} + \cdots + c_{i_l} = 0$.

16 Block Wiedemann Algorithm for the Matrix Step
1) Uses multiple matrix-vector multiplications of the sparse matrix A with K random vectors: $A v_i, A^2 v_i, \ldots, A^k v_i$, where k = 2D/K.
2) Post-computation leading to the determination of a linear dependence on the columns of matrix A.
Most time-consuming operation: A [D x D] · v [D x 1].
Mesh-based hardware circuits, proposed by Bernstein and Shamir-Tromer, decrease the time and cost of the matrix-vector multiplications.
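The repeated multiplications in step 1 can be sketched in a few lines. This is only an illustration of the sequence $A v, A^2 v, \ldots, A^k v$ over GF(2), not the Block Wiedemann algorithm itself; the sparse-row representation, the function names, and the 4 x 4 example matrix are assumptions made for the example.

```python
def matvec_gf2(rows: list[list[int]], v: list[int]) -> list[int]:
    """Multiply a sparse GF(2) matrix by a bit vector.

    rows[i] lists the column indices j where A[i][j] = 1.
    """
    return [sum(v[j] for j in cols) % 2 for cols in rows]


def krylov_sequence(rows: list[list[int]], v: list[int], k: int) -> list[list[int]]:
    """Return [A v, A^2 v, ..., A^k v], reusing each result as the next input."""
    sequence = []
    current = v
    for _ in range(k):
        current = matvec_gf2(rows, current)
        sequence.append(current)
    return sequence


# Hypothetical 4 x 4 sparse matrix and start vector.
A_ROWS = [[0, 2], [1], [0, 1, 3], [2, 3]]
V0 = [0, 1, 0, 0]
SEQ = krylov_sequence(A_ROWS, V0, 3)
print(SEQ)
```

Each iteration is one D x D by D x 1 product, which is why the cost of a single matrix-vector multiplication dominates the whole matrix step.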

17 Two Architectures for Matrix-Vector Multiplication
Mesh Sorting (Bernstein): based on recursive sorting; does one multiplication at a time; large area.
Mesh Routing (Shamir-Tromer): based on routing; does K multiplications at a time; compact area (handles large matrix size).

18 Mesh Routing

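19 Matrix-Vector Multiplication
[Figure: sparse matrix A multiplied by vector v gives A·v.]

20 Mesh Routing
m x m mesh, where $m = \sqrt{D}$.
d = maximum number of non-zero entries over all columns.
[Figure: the D x D matrix A and the vector v mapped onto the m x m mesh of cells.]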

21 Routing in the Mesh
Each time a packet arrives at its target cell, the packet's vector bit is XORed with the partial result bit stored in the target cell.

22 Mesh Routing
When routing completes, the mesh contains the result of the multiplication.
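The packet idea can be sketched independently of the routing itself. In the hypothetical model below, cell j holds column j of A and the bit v[j]; for every non-zero entry A[i][j] it emits a packet addressed to cell i carrying v[j], and the target cell XORs the bit into its partial result. Packets are delivered instantly here, so this checks only the arithmetic, not the mesh routing or its timing. All names and the example matrix are assumptions.

```python
def mesh_multiply(columns: list[list[int]], v: list[int]) -> list[int]:
    """Compute A·v over GF(2) by 'sending' one packet per non-zero entry.

    columns[j] lists the row indices i where A[i][j] = 1.
    """
    result = [0] * len(v)
    # One packet (target cell, vector bit) per non-zero matrix entry.
    packets = [(i, v[j]) for j, rows in enumerate(columns) for i in rows]
    for target, bit in packets:
        result[target] ^= bit        # XOR into the partial result at the target cell
    return result


def direct_multiply(columns: list[list[int]], v: list[int]) -> list[int]:
    """Reference: row-by-row dot products modulo 2."""
    size = len(v)
    dense = [[1 if i in columns[j] else 0 for j in range(size)] for i in range(size)]
    return [sum(a * b for a, b in zip(row, v)) % 2 for row in dense]


# Hypothetical 4 x 4 matrix given column by column.
A_COLUMNS = [[0, 2], [1, 2], [0, 3], [2, 3]]
V = [0, 1, 0, 0]
print(mesh_multiply(A_COLUMNS, V), direct_multiply(A_COLUMNS, V))
```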

23 Mesh Routing with K Parallel Multiplications
Example for K = 2.
[Figure: two vectors v multiplied by the same matrix A on one mesh.]

24 Clockwise Transposition Routing
At each step a cell holds one packet and receives one packet from a neighbor for compare-exchange.
The exchange is done only if it reduces the distance to target of the farthest-traveling packet.

25 Clockwise Transposition Routing
Four iterations, repeated.
[Figure: cells and the compare-exchange direction used in each of the four iterations.]

26 Types of Packets
1) Valid packet
2) Invalid packet: a packet becomes invalid when it reaches its destination

27 Compare-Exchange Cases
Four cases for a cell:
a) Both packets are valid (may need to exchange)
b) Current packet invalid, incoming new packet valid (may need to exchange)
c) Current packet valid, incoming new packet invalid (may need to annihilate)
d) Current packet invalid, incoming new packet invalid (no action)
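The exchange rule and the four cases can be sketched as a small decision function. This is a simplified one-dimensional model written for illustration: each packet is reduced to its target coordinate along the axis of the current compare-exchange, an invalid packet is represented by `None` and contributes distance 0, and annihilation is not modeled. These modeling choices and all names are assumptions, not the thesis's RTL design.

```python
from typing import Optional

Packet = Optional[int]   # target coordinate along the current axis; None = invalid


def distance(packet: Packet, position: int) -> int:
    """Distance from a cell position to the packet's target (0 for invalid packets)."""
    return 0 if packet is None else abs(packet - position)


def classify(current: Packet, incoming: Packet) -> str:
    """Return which of the four cases a) to d) applies."""
    if current is not None and incoming is not None:
        return "a"
    if current is None and incoming is not None:
        return "b"
    if current is not None and incoming is None:
        return "c"
    return "d"


def should_exchange(pos_cur: int, current: Packet, pos_in: int, incoming: Packet) -> bool:
    """Exchange only if it reduces the distance of the farthest-traveling packet."""
    before = max(distance(current, pos_cur), distance(incoming, pos_in))
    after = max(distance(current, pos_in), distance(incoming, pos_cur))
    return after < before


# Cell 0 holds a packet for cell 3; neighbor cell 1 holds a packet for cell 0.
print(classify(3, 0), should_exchange(0, 3, 1, 0))
```

In this model case d) never triggers an exchange, since both distances are already zero.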

28 Basic and Improved Mesh Routing Designs

29 Basic Mesh Routing Design
Each cell of the mesh handles one column of matrix A.
K = 1 or K ≥ 32, where K = number of vectors multiplied by matrix A concurrently.
Total routing takes d * 4 * m compare-exchange operations.

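30 Basic Loading and Unloading Design
[Figure: the vector and the non-zero matrix entries are loaded into the mesh, and the result vector is unloaded.]

31 Parallel Loading & Unloading Design
[Figure: the vector and the non-zero matrix entries are loaded in parallel, and the result vector is unloaded in parallel.]
Restricted by the number of I/O pins available.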

32 Basic Cell Design for Basic Mesh Routing
[Block diagram: LUT-RAM holding P[i] and R[i] with address decode and enable (en); CU; CR register with en_cur; Check Dest logic; status bits; cell coordinate (r, c); comparator with row/col select and oper input. Signals: exchange, annihilate, equal, en_equal, eq_packet.]

33 Comparator Design
[Block diagram: inputs are the cell's coordinate and the row, col, and status fields of the current packet (s1) and the new packet (s2); a row/col select and en_equal feed a comparator (>, =); control signal logic takes oper, s1, s2 and produces exchange, annihilate, and eq_packet.]

34 Improved Mesh Routing Design
Each cell of the mesh handles p columns of the matrix A.
Compact area => handles a larger matrix size.
Total routing takes p * d * 4 * m compare-exchange steps.
Proposed for cost reduction.
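The two step counts, d * 4 * m for the basic design and p * d * 4 * m for the improved one, can be compared with a short helper. This is a sketch of the arithmetic only; the function names and the sample values of d, m, p, and the clock period are hypothetical and not results from the slides.

```python
def routing_steps(d: int, m: int, p: int = 1) -> int:
    """Compare-exchange steps for one routing pass (p = 1 is the basic design)."""
    return p * d * 4 * m


def columns_per_mesh(m: int, p: int = 1) -> int:
    """Matrix columns handled by one m x m mesh with p columns per cell."""
    return m * m * p


def routing_time_ns(d: int, m: int, clock_ns: float, p: int = 1) -> float:
    """Routing time assuming one compare-exchange step per clock cycle."""
    return routing_steps(d, m, p) * clock_ns


# Hypothetical parameters for illustration only.
d, m, p, clock_ns = 10, 8, 4, 10.0
print(routing_steps(d, m), columns_per_mesh(m))          # basic design
print(routing_steps(d, m, p), columns_per_mesh(m, p))    # improved design
print(routing_time_ns(d, m, clock_ns, p))
```

The improved design spends p times more steps per pass but covers p times more columns with the same mesh, which is the trade-off the slide describes as cost reduction.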

35 Mesh Cell Design for Improved Mesh Routing
[Block diagram: LUT-RAM holding R[i] and P[i] with address decode, enable (en), and addresses addr, addr1, addr2; CU; CR register with en_cur; Check_Dest logic; status bits; cell coordinate (r, c); comparator with row/col select and oper input. Signals: exchange, annihilate, equal, en_equal, eq_packet.]

36 Target FPGA Devices
Xilinx Virtex II
XC2V8000: 46,592 CLB slices; 93,184 LUTs (look-up tables); 93,184 FFs (flip-flops)
XC2V6000: 33,792 CLB slices; 67,584 LUTs; 67,584 FFs
[Figure: device floorplan with I/O blocks, 18 x 18 multipliers, and Block RAMs; each CLB slice contains two LUTs, carry and control logic, and two FFs, driven by CLK.]

37 Results and Analysis

38 Synthesis Results for One Virtex II XC2V8000 Using the Basic Mesh Routing Design
Columns: Matrix Size, K, CLB slices, LUTs, FFs, Clock Period (ns), Time for K mult (ns), Time per mult (ns)
144x144 (Mesh 12x12) 823 (7%) 5,495 (6%) 5,38 (5%)
144x144 (Mesh 12x12) 32 23,949 (5%) 46,944 (5%) 23,49 (25%)
144x144 (Mesh 12x12) 7 43,65 (92%) 84,836 (9%) 45,378 (48%)
K = number of concurrent matrix-vector multiplications
Time for K mult = d * 4 * m * Clock period

39 Speedup vs Software Implementation
Reference optimized SW implementation: PC, Pentium IV, 2.768 GHz, 1 GB RAM
Matrix size: 144x144 (Mesh 12x12)
Columns: K, One Multiplication Time in SW (ns), One Multiplication Time in HW (ns), Speedup

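40 Distributed Computation (Geiselmann, Steinwandt)
The matrix A is partitioned into s x s sub-matrices $A_{i,j}$ and the vector v into s sub-vectors $v_j$ (shown for s = 3):
$A v = \begin{pmatrix} A_{1,1} & A_{1,2} & A_{1,3} \\ A_{2,1} & A_{2,2} & A_{2,3} \\ A_{3,1} & A_{3,2} & A_{3,3} \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix}$
In general:
$A v = \begin{pmatrix} \sum_{j=1}^{s} A_{1,j} v_j \\ \vdots \\ \sum_{j=1}^{s} A_{s,j} v_j \end{pmatrix}$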
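The block decomposition can be checked on a small example. The sketch below splits a dense GF(2) matrix into s x s sub-matrices, multiplies each sub-matrix by the matching sub-vector, and XORs the partial results, one sub-computation at a time, which mirrors reusing a single mesh for each sub-multiplication. The dense representation, function names, and 4 x 4 data are assumptions chosen to keep the example short.

```python
def matvec_gf2(matrix: list[list[int]], v: list[int]) -> list[int]:
    """Dense matrix-vector product modulo 2."""
    return [sum(a * b for a, b in zip(row, v)) % 2 for row in matrix]


def blocked_matvec_gf2(matrix: list[list[int]], v: list[int], s: int) -> list[int]:
    """Compute A·v as s*s sub-matrix by sub-vector products, accumulated with XOR."""
    size = len(v) // s               # rows/columns per block (len(v) must divide evenly)
    result: list[int] = []
    for i in range(s):               # block row of the result
        acc = [0] * size
        for j in range(s):           # one sub-multiplication A_{i,j} * v_j at a time
            sub = [row[j * size:(j + 1) * size] for row in matrix[i * size:(i + 1) * size]]
            sub_v = v[j * size:(j + 1) * size]
            partial = matvec_gf2(sub, sub_v)
            acc = [x ^ y for x, y in zip(acc, partial)]
        result.extend(acc)
    return result


# Hypothetical 4 x 4 matrix split into 2 x 2 blocks.
A = [[1, 0, 1, 0],
     [0, 1, 0, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 1]]
V = [0, 1, 0, 0]
print(blocked_matvec_gf2(A, V, 2), matvec_gf2(A, V))
```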

41 512-bit & 1024-bit Performance with Square Arrays of FPGAs of Different Sizes, Connected in Two Dimensions
1) The FPGA array performs a single sub-matrix by sub-vector multiplication.
2) The FPGA array is reused for the next sub-computation.

42 512-bit Performance with One Chip & Multiple Chips Connected in a Mesh, Basic Mesh Routing
D = number of columns in matrix A
m = mesh dimension
n = number of times to repeat multiplications = D^2 / m^4
T_K = routing time for K multiplications in the mesh = d * 4 * m * Clock Period
T_Load = time for loading and unloading for K multiplications
T_Total = total time for a Matrix step = 3 * D/K * n * (T_K + T_Load)
Columns: Virtex II chips, D, m, n, T_K (ns), T_Load (ns), T_Total (days), Speedup vs 1 chip
67x x 6 2
2 x x x , x ,

43 1024-bit Performance with One Chip & Multiple Chips Connected in a Mesh, Basic Mesh Routing
D = number of columns in matrix A
m = mesh dimension
n = number of times to repeat multiplications = D^2 / m^4
T_K = routing time for K multiplications in the mesh = d * 4 * m * Clock Period
T_Load = time for loading and unloading for K multiplications
T_Total = total time for a Matrix step = 3 * D/K * n * (T_K + T_Load)
Columns: Virtex II chips, D, m, n, T_K (ns), T_Load (ns), T_Total (days), Speedup vs 1 chip
4 x x x 7 92
77 x
4 x 6 77 x x x ,
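The timing model defined on these slides can be written as a small calculator. This is a sketch of the formulas only: the parameter p (columns per cell) generalizes it to the improved design, with p = 1 giving the basic design, and every numeric value in the example call is a hypothetical placeholder rather than a figure from the tables.

```python
NS_PER_DAY = 86_400 * 1e9


def sub_multiplications(D: float, m: int, p: int = 1) -> float:
    """n = D^2 / (m^2 * p)^2; reduces to D^2 / m^4 for the basic design (p = 1)."""
    return D ** 2 / (m ** 2 * p) ** 2


def total_time_days(D: float, K: int, m: int, d: int,
                    clock_ns: float, t_load_ns: float, p: int = 1) -> float:
    """T_Total = 3 * D/K * n * (T_K + T_Load), converted from ns to days."""
    n = sub_multiplications(D, m, p)
    t_k = p * d * 4 * m * clock_ns           # routing time for K multiplications
    return 3 * D / K * n * (t_k + t_load_ns) / NS_PER_DAY


# Hypothetical inputs, chosen only to exercise the formula.
print(total_time_days(D=1e6, K=32, m=12, d=40, clock_ns=10.0, t_load_ns=5_000.0))
```

Because n falls with the fourth power of the mesh dimension while T_K grows only linearly in it, enlarging the mesh is the dominant lever in this model.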

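44 Analysis & Conclusion
Polynomial speedup with the number of FPGAs: speedup is approximately proportional to $(\#FPGA)^{3/2}$.
$T_{Total} = 3 \cdot \frac{D}{K} \cdot \frac{D^2}{\left(m \sqrt{\#chip}\right)^4} \cdot \left( d \cdot 4 \cdot m \sqrt{\#chip} \cdot \text{Clock Period} + T_{Load} \right)$
m = mesh size in one Virtex II chip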
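The 3/2 exponent follows from the timing model in a few lines. The derivation below is a sketch under two assumptions that are not stated on the slide: the c = #chip devices form a $\sqrt{c} \times \sqrt{c}$ array, so the combined mesh dimension is $m\sqrt{c}$, and the loading time $T_{Load}$ is small compared with the routing time. $T_{clk}$ denotes the clock period.

```latex
\begin{align*}
T_{Total}(c) &= 3\,\frac{D}{K}\cdot\frac{D^{2}}{(m\sqrt{c})^{4}}
                \left(4\,d\,m\sqrt{c}\;T_{clk} + T_{Load}\right) \\
             &\approx 3\,\frac{D}{K}\cdot\frac{D^{2}}{m^{4}c^{2}}\cdot 4\,d\,m\sqrt{c}\;T_{clk}
              \qquad (T_{Load} \ll 4\,d\,m\sqrt{c}\;T_{clk}) \\
             &= \frac{12\,d\,D^{3}\,T_{clk}}{K\,m^{3}}\; c^{-3/2}, \\[4pt]
\frac{T_{Total}(1)}{T_{Total}(c)} &\approx c^{3/2}.
\end{align*}
```

The number of sub-multiplications shrinks as $c^{-2}$ while each routing pass lengthens as $c^{1/2}$, which gives the net $c^{3/2}$ speedup.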

45 Speedup vs Number of FPGA Chips
[Plot: speedup over 1 chip versus number of Virtex II chips.]

46 Synthesis Results on One Virtex II XC2V8000 for the Improved Mesh Routing Design
Columns: Matrix Size, K, CLB slices, LUTs, FFs, Clock Period (ns), Time for K mult (ns), Time per mult (ns)
2304x2304 (Mesh 12x12, p=16) 6738 (4%) ,438 (%) 6,279 (7%)
2304x2304 (Mesh 12x12, p=16) 32 29,938 (64%) 5,983 (54%) 9,65 (2%)
2304x2304 (Mesh 12x12, p=16) 5 43,42 (93%) 74,3 (89%) 27,46 (29%)
Time for K mult = p * d * 4 * m * Clock period

47 512-bit Performance with One Chip & Multiple Chips Connected in a Mesh, Improved Mesh Routing
D = number of columns in matrix A
p = number of columns handled in one cell = 16
n = number of times to repeat sub-multiplications = D^2 / (m^2 * p)^2
T_K = routing time for K multiplications in the mesh = p * d * 4 * m * clock period
T_Load = time for loading and unloading for K multiplications
T_Total = total time for a Matrix step = 3 * D/K * n * (T_K + T_Load)
Columns: Virtex II chips, D, m, n, T_K (ns), T_Load (ns), T_Total (days), Speedup vs 1 chip
67x 6 2
84 x x x 5 2 x x x 6 38 x x x 6 9 x

48 1024-bit Performance with One Chip & Multiple Chips Connected in a Mesh, Improved Mesh Routing
D = number of columns in matrix A
p = number of columns handled in one cell = 16
n = number of times to repeat sub-multiplications = D^2 / (m^2 * p)^2
T_K = routing time for K multiplications in the mesh = p * d * 4 * m * clock period
T_Load = time for loading and unloading for K multiplications
T_Total = total time for a Matrix step = 3 * D/K * n * (T_K + T_Load)
Columns: Virtex II chips, D, m, n, T_K (ns), T_Load (ns), T_Total (days), Speedup vs 1 chip
4 x x x x x 4 8 x 5 2 x x 6 38 x x x 6 9 x

49 Comparison of Basic & Improved Mesh Routing Performance with the Number of FPGAs
[Two plots, one for 512-bit and one for 1024-bit: time (days) versus number of Virtex II chips, for Basic Mesh Routing and Improved Mesh Routing.]

50 Speedup of Improved to Basic Mesh Routing vs Number of Virtex II FPGAs
[Two plots, one for 512-bit and one for 1024-bit: speedup ratio versus number of Virtex II chips.]

51 Comparison vs Cray Implementation
512-bit number, Improved Mesh Routing Design
Cray C916: 9.3 days
1024 Virtex II FPGAs: 1.32 days (32 hours)

52 Conclusions for Basic Mesh Routing & Improved Mesh Routing
Best case for 1024-bit: Improved Mesh Routing Design, 1024 Virtex II chips, total execution time 27 days.
Improved Mesh Routing is faster than Basic Mesh Routing on the Virtex II 8000 by a factor of around -5: the larger sub-matrix handled in the same FPGA sharply decreases the number of iterations needed to repeat sub-multiplications.
The influence of reducing K from 7 to 5 is very low.

53 SRC-6e Reconfigurable Computer

54 SRC-6e Reconfigurable Computer Hardware Architecture
[Block diagram: µP board with two P3 processors, L2 caches, MIOC, PCI-X, computer memory, and the SNAP module on the DDR interface; ½ MAP board with the Control FPGA (XC2V6000), On-Board Memory (6 banks of 64 bits), two user FPGAs (XC2V6000), and chain ports. The diagram is annotated with link bandwidths in MB/s and bus widths in bits.]

55 MAP Programming Model of SRC
MAP C sub-routine:
    MAP_Function(a, d, e)
    {
        Macro_1(a, b, c)
        Macro_2(b, d)
        Macro_2(c, e)
    }
FPGA contents:
[Figure: input a feeds Macro_1, which produces b and c; two Macro_2 instances turn b into d and c into e.]

56 SRC Program Partitioning
µP system: C function for µP (HLL)
FPGA system: C function for MAP (HLL), VHDL macro (HDL)

57 SRC-6e Designs

58 SRC-Mesh
Complete mesh in VHDL: state machine and cells.

59 SRC-Cells
Mesh in MAP C: control in C, cell as a VHDL macro.

60 Modified Architecture of the Cell for SRC-Mesh
[Block diagram: LUT-RAM holding P[i] and R[i] with address decode and enable (en); CU; CR register with en_cur; an additional register R; Check Dest logic; status bits; cell coordinate (r, c); comparator with row/col select and oper input. Signals: exchange, annihilate, equal, en_equal, eq_packet.]

61 SRC-Cells Design Entry & Circuit
    for ( ) {
        cell(a1, &b1);
        cell(a2, &b2);
        a1 = b2;
        a2 = b1;
    }
[Circuit: two cell instances; the first takes a1 and produces b1, the second takes a2 and produces b2, and the outputs are cross-connected to the opposite cell's input.]

62 Cell Architecture for SRC-Cells Design
[Block diagram: CR register with en_cur; Check Dest logic; status bits; cell coordinate (r, c); comparator with row/col select and oper input. Signals: exchange, annihilate, equal, en_equal, eq_packet.]

63 Results and Analysis

64 SRC Basic Mesh Routing Results
K = number of parallel sub-matrix by sub-vector multiplications
n = number of times to repeat sub-multiplications = D^2 / m^4
x = clock cycles per exchange
T_Kroute = routing time for K multiplications in the mesh = d * 4 * m * x * period
T_KTot = time for K multiplications including loading, unloading and routing
T_512Compute = computational total time for a 512-bit Matrix step
T_512Total = total time for a 512-bit Matrix step = 3 * D/K * n * T_KTot
Columns: Design Type, Mesh Size, K, CLB slices, LUTs, FFs, Period (ns), x, T_Kroute, T_KTot, T_512Compute (days), T_512Total (days)
SRC-Mesh 2x2 (Matrix 44x44) 42 3,743 (9%) 54,66 (8%) 43,545 (64%) 2 96 ns 87 ns ,52 22,46
SRC-Mesh 2x2 (Matrix 4x4) 3,533 (93%) 54,69 (8%) 28,636 (42%) 2 6 ns 227 ns 4,222 47,865
SRC-Mesh x (Matrix x) 7 3,566 (93%) 55,528 (82%) 46,647 (69%) 2 8 ns 87 ns ,938 27, 898
SRC-Cells x (Matrix 2x2) 32,84 (97%) 29,959 (44%) 47,759 (7%) 3 32 ns 6 ns 939,676 ,46,2

65 Comparison of 512-bit Performance for Different Mesh Sizes & K Values with Equivalent Area
[Bar chart: computational time and total time in days for each mesh type: 2x2 K=, 2x2 K=42, x K=7.]

66 Conclusion for Performance of Different Mesh Sizes & K Values
Comparing performance for different mesh sizes and K with equivalent FPGA resources (about 90%):
mesh of 2x2 with K=42 is better than mesh of 2x2 with K=
mesh of x with K=7 is similar to mesh of 2x2 with K=42
$T_{Total} = 3 \cdot \frac{D}{K} \cdot \frac{D^2}{m^4} \cdot \left( d \cdot 4 \cdot m \cdot x \cdot \text{period} + T_{load} \right)$

67 SRC-Mesh vs SRC-Cells Area for x Mesh with K=
Columns: Design Type, Mesh Size, K, CLB slices, LUTs, FFs, Period (ns), x, T_Kroute
SRC-Cells x (Matrix x) (74%) 2256 (33%) 3642 (53%) 3 2 ns
SRC-Mesh x (Matrix x) 9,347 (27%) 3,427 (9%) 439 (5%) 2 8 ns

68 SRC-Mesh vs SRC-Cells Area for x Mesh
[Bar chart: percentage of CLB slices, LUTs, and FFs used by SRC-Mesh and SRC-Cells.]

69 Conclusions for SRC-Mesh and SRC-Cells
SRC-Cells:
has about 2.7 times larger area than SRC-Mesh for the same mesh parameters (74% versus 27% of CLB slices)
performs worse than SRC-Mesh (only a small mesh can fit, K is small)
Benefit: ease of programming in a high-level language

70 SRC Improved Mesh Routing Results (Area)
Columns: Design Type, Mesh Size m x m with p = 16, K, CLB slices, LUTs, FFs
Improved SRC-Mesh x (matrix 6x6) 32 3,2 (9%) 5,95 (76%) 29,954 (44%)
Improved SRC-Mesh 8x8 (matrix 24x24) 64 3,456 (93%) 53,6 (78%) 3,82 (45%)

71 SRC Improved Mesh Routing Results (Performance)
K = number of simultaneous vectors being multiplied
p = number of columns of A handled in one cell = 16
n = number of times to repeat sub-multiplications = D^2 / (m^2 * p)^2
x = clock cycles per compare-exchange operation
T_Kroute = routing time for K multiplications in the mesh = p * d * 4 * m * x * period
T_KTot = time for K multiplications including loading, unloading and routing
T_512Compute = computational total time for a 512-bit Matrix step
T_512Total = total time for a 512-bit Matrix step = 3 * D/K * n * T_KTot
Columns: Design Type, Mesh Size m x m, K, Period (ns), x, T_Kroute (ns), T_KTot (ns), T_512Compute (days), T_512Total (days)
Improved SRC-Mesh x (Matrix 6x6)
Improved SRC-Mesh 8x8 (matrix 24x24)
,2 3, , ,36 25, ,93

72 Analysis & Conclusion for SRC-6e Improved & Basic Mesh Routing
Improved SRC-Mesh is faster than the Basic SRC-Mesh design by a factor of 57 on the SRC-6e Virtex II: days compared to 22,46 days in the best case.
A larger sub-matrix size significantly decreases the number of sub-multiplications.

73 Standalone FPGA vs SRC Design
Standalone FPGA: Virtex II 8000. SRC: Virtex II 6000.
The Virtex II 8000 designs allow larger K and m.
Latency of routing increases in the SRC-6e: to improve the frequency to 100 MHz, the time of a compare-exchange increased by 2-3 clock cycles.
Limited I/O from the 6 OBM banks in the SRC-6e leads to more loading and unloading time.
Results compare a two-dimensional array of standalone Virtex II FPGAs with one FPGA on the SRC-6e.

74 Summary & Conclusions
First practical hardware implementation of Mesh Routing for the Number Field Sieve, implemented and tested.
Practical concrete numbers obtained for the theoretical Mesh Routing algorithm, to assess the current hardness of the matrix step in reconfigurable hardware.
Two architectures, Basic and Improved, implemented and compared.
All designs compared using two platforms: a generic array of FPGA devices and the SRC-6e Reconfigurable Computer.

75 Summary & Conclusions
Assuming constant area, the Improved Mesh Routing Design is faster than the Basic Mesh Routing Design by a factor of -5 on the Virtex II 8000, because a larger sub-matrix is handled.
A two-dimensional array of Virtex II chips can perform computations faster than a single FPGA by a factor proportional to (number of FPGAs)^{3/2}.
The matrix step for a 1024-bit number can be performed using 1024 Virtex II chips in 27 days.

76 Summary & Conclusions
Two design entry approaches developed for the SRC-6e:
SRC-Mesh is written entirely in VHDL.
SRC-Cells is written mostly in C, with only the cell in VHDL.
SRC-Mesh outperforms SRC-Cells by a factor of 5, at the cost of harder development of the VHDL code.
A manually optimized circuit in VHDL is suitable on the SRC platform for the distributed computation of the mesh.

77 Acknowledgement
Dr. Kris Gaj
SRC Computers, Inc.
Deapesh Misra

78 Questions

Reconfigurable Hardware Implementation and Analysis of Mesh Routing for the Matrix Step of the Number Field Sieve Factorization

Reconfigurable Hardware Implementation and Analysis of Mesh Routing for the Matrix Step of the Number Field Sieve Factorization Reconfigurable Hardware Implementation and Analysis of Mesh Routing for the Matrix Step of the Number Field Sieve Factorization A thesis submitted in partial fulfillment of the requirements for the degree

More information

Reference. Wayne Wolf, FPGA-Based System Design Pearson Education, N Krishna Prakash,, Amrita School of Engineering

Reference. Wayne Wolf, FPGA-Based System Design Pearson Education, N Krishna Prakash,, Amrita School of Engineering FPGA Fabrics Reference Wayne Wolf, FPGA-Based System Design Pearson Education, 2004 CPLD / FPGA CPLD Interconnection of several PLD blocks with Programmable interconnect on a single chip Logic blocks executes

More information

Synthesis and Analysis of 32-Bit RSA Algorithm Using VHDL

Synthesis and Analysis of 32-Bit RSA Algorithm Using VHDL Synthesis and Analysis of 32-Bit RSA Algorithm Using VHDL Sandeep Singh 1,a, Parminder Singh Jassal 2,b 1M.Tech Student, ECE section, Yadavindra collage of engineering, Talwandi Sabo, India 2Assistant

More information

Lecture 3, Handouts Page 1. Introduction. EECE 353: Digital Systems Design Lecture 3: Digital Design Flows, Simulation Techniques.

Lecture 3, Handouts Page 1. Introduction. EECE 353: Digital Systems Design Lecture 3: Digital Design Flows, Simulation Techniques. Introduction EECE 353: Digital Systems Design Lecture 3: Digital Design Flows, Techniques Cristian Grecu grecuc@ece.ubc.ca Course web site: http://courses.ece.ubc.ca/353/ What have you learned so far?

More information

A HIGH PERFORMANCE HARDWARE ARCHITECTURE FOR HALF-PIXEL ACCURATE H.264 MOTION ESTIMATION

A HIGH PERFORMANCE HARDWARE ARCHITECTURE FOR HALF-PIXEL ACCURATE H.264 MOTION ESTIMATION A HIGH PERFORMANCE HARDWARE ARCHITECTURE FOR HALF-PIXEL ACCURATE H.264 MOTION ESTIMATION Sinan Yalcin and Ilker Hamzaoglu Faculty of Engineering and Natural Sciences, Sabanci University, 34956, Tuzla,

More information

CHAPTER III THE FPGA IMPLEMENTATION OF PULSE WIDTH MODULATION

CHAPTER III THE FPGA IMPLEMENTATION OF PULSE WIDTH MODULATION 34 CHAPTER III THE FPGA IMPLEMENTATION OF PULSE WIDTH MODULATION 3.1 Introduction A number of PWM schemes are used to obtain variable voltage and frequency supply. The Pulse width of PWM pulsevaries with

More information

CHAPTER 5 IMPLEMENTATION OF MULTIPLIERS USING VEDIC MATHEMATICS

CHAPTER 5 IMPLEMENTATION OF MULTIPLIERS USING VEDIC MATHEMATICS 49 CHAPTER 5 IMPLEMENTATION OF MULTIPLIERS USING VEDIC MATHEMATICS 5.1 INTRODUCTION TO VHDL VHDL stands for VHSIC (Very High Speed Integrated Circuits) Hardware Description Language. The other widely used

More information

WHAT ARE FIELD PROGRAMMABLE. Audible plays called at the line of scrimmage? Signaling for a squeeze bunt in the ninth inning?

WHAT ARE FIELD PROGRAMMABLE. Audible plays called at the line of scrimmage? Signaling for a squeeze bunt in the ninth inning? WHAT ARE FIELD PROGRAMMABLE Audible plays called at the line of scrimmage? Signaling for a squeeze bunt in the ninth inning? They re none of the above! We re going to take a look at: Field Programmable

More information

Design of Parallel Algorithms. Communication Algorithms

Design of Parallel Algorithms. Communication Algorithms + Design of Parallel Algorithms Communication Algorithms + Topic Overview n One-to-All Broadcast and All-to-One Reduction n All-to-All Broadcast and Reduction n All-Reduce and Prefix-Sum Operations n Scatter

More information

Hardware Implementation of BCH Error-Correcting Codes on a FPGA

Hardware Implementation of BCH Error-Correcting Codes on a FPGA Hardware Implementation of BCH Error-Correcting Codes on a FPGA Laurenţiu Mihai Ionescu Constantin Anton Ion Tutănescu University of Piteşti University of Piteşti University of Piteşti Alin Mazăre University

More information

Design and FPGA Implementation of an Adaptive Demodulator. Design and FPGA Implementation of an Adaptive Demodulator

Design and FPGA Implementation of an Adaptive Demodulator. Design and FPGA Implementation of an Adaptive Demodulator Design and FPGA Implementation of an Adaptive Demodulator Sandeep Mukthavaram August 23, 1999 Thesis Defense for the Degree of Master of Science in Electrical Engineering Department of Electrical Engineering

More information

Hardware-based Image Retrieval and Classifier System

Hardware-based Image Retrieval and Classifier System Hardware-based Image Retrieval and Classifier System Jason Isaacs, Joe Petrone, Geoffrey Wall, Faizal Iqbal, Xiuwen Liu, and Simon Foo Department of Electrical and Computer Engineering Florida A&M - Florida

More information

Performance Enhancement of the RSA Algorithm by Optimize Partial Product of Booth Multiplier

Performance Enhancement of the RSA Algorithm by Optimize Partial Product of Booth Multiplier International Journal of Electronics Engineering Research. ISSN 0975-6450 Volume 9, Number 8 (2017) pp. 1329-1338 Research India Publications http://www.ripublication.com Performance Enhancement of the

More information

Interconnect testing of FPGA

Interconnect testing of FPGA Center for RC eliable omputing Interconnect Testing of FPGA Stanford CRC March 12, 2001 Problem Statement Detecting all faults in FPGA interconnect resources Wire segments Programmable interconnect points

More information

DYNAMICALLY RECONFIGURABLE PWM CONTROLLER FOR THREE PHASE VOLTAGE SOURCE INVERTERS. In this Chapter the SPWM and SVPWM controllers are designed and

DYNAMICALLY RECONFIGURABLE PWM CONTROLLER FOR THREE PHASE VOLTAGE SOURCE INVERTERS. In this Chapter the SPWM and SVPWM controllers are designed and 77 Chapter 5 DYNAMICALLY RECONFIGURABLE PWM CONTROLLER FOR THREE PHASE VOLTAGE SOURCE INVERTERS In this Chapter the SPWM and SVPWM controllers are designed and implemented in Dynamic Partial Reconfigurable

More information

Field Programmable Gate Arrays based Design, Implementation and Delay Study of Braun s Multipliers

Field Programmable Gate Arrays based Design, Implementation and Delay Study of Braun s Multipliers Journal of Computer Science 7 (12): 1894-1899, 2011 ISSN 1549-3636 2011 Science Publications Field Programmable Gate Arrays based Design, Implementation and Delay Study of Braun s Multipliers Muhammad

More information

Artificial Neural Network Engine: Parallel and Parameterized Architecture Implemented in FPGA

Artificial Neural Network Engine: Parallel and Parameterized Architecture Implemented in FPGA Artificial Neural Network Engine: Parallel and Parameterized Architecture Implemented in FPGA Milene Barbosa Carvalho 1, Alexandre Marques Amaral 1, Luiz Eduardo da Silva Ramos 1,2, Carlos Augusto Paiva

More information

Chapter 1 Introduction

Chapter 1 Introduction Chapter 1 Introduction 1.1 Introduction There are many possible facts because of which the power efficiency is becoming important consideration. The most portable systems used in recent era, which are

More information

Expert Systems with Applications

Expert Systems with Applications Expert Systems with Applications 39 (2012) 2203 2210 Contents lists available at SciVerse ScienceDirect Expert Systems with Applications journal homepage: www.elsevier.com/locate/eswa Hardware software

More information

CARRY SAVE COMMON MULTIPLICAND MONTGOMERY FOR RSA CRYPTOSYSTEM

CARRY SAVE COMMON MULTIPLICAND MONTGOMERY FOR RSA CRYPTOSYSTEM American Journal of Applied Sciences 11 (5): 851-856, 2014 ISSN: 1546-9239 2014 Science Publication doi:10.3844/ajassp.2014.851.856 Published Online 11 (5) 2014 (http://www.thescipub.com/ajas.toc) CARRY

More information

PE713 FPGA Based System Design

PE713 FPGA Based System Design PE713 FPGA Based System Design Why VLSI? Dept. of EEE, Amrita School of Engineering Why ICs? Dept. of EEE, Amrita School of Engineering IC Classification ANALOG (OR LINEAR) ICs produce, amplify, or respond

More information

Hybrid QR Factorization Algorithm for High Performance Computing Architectures. Peter Vouras Naval Research Laboratory Radar Division

Hybrid QR Factorization Algorithm for High Performance Computing Architectures. Peter Vouras Naval Research Laboratory Radar Division Hybrid QR Factorization Algorithm for High Performance Computing Architectures Peter Vouras Naval Research Laboratory Radar Division 8/1/21 Professor G.G.L. Meyer Johns Hopkins University Parallel Computing

More information

EECS 427 Lecture 21: Design for Test (DFT) Reminders

EECS 427 Lecture 21: Design for Test (DFT) Reminders EECS 427 Lecture 21: Design for Test (DFT) Readings: Insert H.3, CBF Ch 25 EECS 427 F09 Lecture 21 1 Reminders One more deadline Finish your project by Dec. 14 Schematic, layout, simulations, and final

More information

Lessons Learned from Designing a 65 nm ASIC for Third Round SHA-3 Candidates

Lessons Learned from Designing a 65 nm ASIC for Third Round SHA-3 Candidates Lessons Learned from Designing a 65 nm ASIC for Third Round SHA-3 Candidates Frank K. Gürkaynak, Kris Gaj, Beat Muheim, Ekawat Homsirikamol, Christoph Keller, Marcin Rogawski, Hubert Kaeslin, Jens-Peter

More information

FlexWave: Development of a Wavelet Compression Unit

FlexWave: Development of a Wavelet Compression Unit FlexWave: Development of a Wavelet Compression Unit Jan.Bormans@imec.be Adrian Chirila-Rus Bart Masschelein Bart Vanhoof ESTEC contract 13716/99/NL/FM imec 004 Outline! Scope and motivation! FlexWave image

More information

CS/EE Homework 9 Solutions

CS/EE Homework 9 Solutions S/EE 260 - Homework 9 Solutions ue 4/6/2000 1. onsider the synchronous ripple carry counter on page 5-8 of the notes. Assume that the flip flops have a setup time requirement of 2 ns and that the gates

More information

REALISATION OF AWGN CHANNEL EMULATION MODULES UNDER SISO AND SIMO

REALISATION OF AWGN CHANNEL EMULATION MODULES UNDER SISO AND SIMO REALISATION OF AWGN CHANNEL EMULATION MODULES UNDER SISO AND SIMO ENVIRONMENTS FOR 4G LTE SYSTEMS Dr. R. Shantha Selva Kumari 1 and M. Aarti Meena 2 1 Department of Electronics and Communication Engineering,

More information

Lecture 30. Perspectives. Digital Integrated Circuits Perspectives

Lecture 30. Perspectives. Digital Integrated Circuits Perspectives Lecture 30 Perspectives Administrivia Final on Friday December 15 8 am Location: 251 Hearst Gym Topics all what was covered in class. Precise reading information will be posted on the web-site Review Session

More information

Embedded System Hardware - Reconfigurable Hardware -

Embedded System Hardware - Reconfigurable Hardware - 2 Embedded System Hardware - Reconfigurable Hardware - Peter Marwedel Informatik 2 TU Dortmund Germany GOPs/J Courtesy: Philips Hugo De Man, IMEC, 27 Energy Efficiency of FPGAs 2, 28-2- Reconfigurable

More information

Disseny físic. Disseny en Standard Cells. Enric Pastor Rosa M. Badia Ramon Canal DM Tardor DM, Tardor

Disseny físic. Disseny en Standard Cells. Enric Pastor Rosa M. Badia Ramon Canal DM Tardor DM, Tardor Disseny físic Disseny en Standard Cells Enric Pastor Rosa M. Badia Ramon Canal DM Tardor 2005 DM, Tardor 2005 1 Design domains (Gajski) Structural Processor, memory ALU, registers Cell Device, gate Transistor

More information

Method We follow- How to Get Entry Pass in SEMICODUCTOR Industries for 2 nd year engineering students

Method We follow- How to Get Entry Pass in SEMICODUCTOR Industries for 2 nd year engineering students Method We follow- How to Get Entry Pass in SEMICODUCTOR Industries for 2 nd year engineering students FIG-2 Winter/Summer Training Level 1 (Basic & Mandatory) & Level 1.1 continues. Winter/Summer Training

More information

FINITE IMPULSE RESPONSE (FIR) FILTER
