Superpipelined Control and Data Path Synthesis

Usha Prabhu, Barry M. Pangrle
Department of Computer Science, The Pennsylvania State University, University Park, PA
29th ACM/IEEE Design Automation Conference

Abstract

This paper describes a superpipelined control and data path synthesis system. The system can 1) handle pipelined modules in the data path, 2) perform functional pipelining of the data path, and 3) schedule the data path using a pipelined controller. Three control styles (serial, parallel and pipelined) are implemented. The system automatically picks one depending on the data path, the clock frequency, and the functional unit and control path delays. The results show that using a modifiable clock cycle time and a parameterized control style can significantly improve the throughput of high-performance systems.

1 Introduction

Much of the earlier work in high-level synthesis has concentrated on scheduling, allocation and binding within basic blocks. However, recent efforts have been directed toward global scheduling in an effort to improve throughput under resource constraints. Throughput is usually defined as the number of clock cycles needed to realize the schedule. Two main directions have been taken in global scheduling: transformational techniques and path-based techniques. Transformational techniques such as trace scheduling [7] and percolation scheduling [18] move operations across basic blocks in order to fully utilize the resources available. Path-based scheduling [5] tries to schedule each possible path optimally, and may thus schedule an operation into different control states depending on the path. The parallelism in the data path is thus exploited at the expense of an increase in control path complexity. Control path complexity may also increase with the choice of clock frequency in the synthesis of high-performance systems.

High-throughput designs can sometimes be achieved by choosing the clock cycle time to minimize the total amount of dead time in a control state. Dead time is the time wasted in a clock cycle when every ready operation needs a functional unit that has finished executing but cannot be reused, because that unit has already been scheduled in the current clock cycle. Decreasing the dead time increases the efficiency of the schedule because less time is wasted. Minimizing dead time usually means decreasing the clock cycle time. As the clock cycle time decreases (and the number of states required to realize the schedule increases), the control circuit once again becomes more complex. Thus efforts to improve throughput may result in the control delay occupying a large percentage of the clock cycle or even, in some cases, being longer than the clock cycle time. To achieve high throughput, high-level synthesis systems must look at ways to execute the data and the control paths in parallel or to pipeline the control path.

We propose that a modifiable clock cycle time and a parameterized control style are necessary when synthesizing high-performance systems. The combination of small cycle times and a high degree of pipelining usually improves system throughput. Machines with these characteristics have been called superpipelined [9]. Ideally, both the clock cycle time and the pipeline depth should be chosen by the synthesis system. Currently, in this system, the clock cycle time and the level of pipelining in the functional units are supplied by the user. The system automatically picks the pipeline depth in the controller so as to maximize performance. Results show that in many of the benchmark examples, superpipelining improves performance.

This paper describes the control styles incorporated in SandS. Figure 1 shows a block diagram of the system. Slicer [14] and Splicer [15] perform the scheduling and module allocation tasks respectively. Piper [11] performs functional pipelining. The system can thus 1) handle pipelined modules in the data path, 2) perform functional pipelining of the data path, and 3) schedule the data path on a pipelined controller.
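The notion of dead time from the introduction can be illustrated with a deliberately simplified model. The sketch below is not the scheduler's actual cost function: it assumes that an operation starts on a clock edge, that nothing else is chained behind it on the same functional unit, and that a multicycle operation occupies a whole number of cycles. The function names are hypothetical. The 60 ns and 30 ns delays are the multiplier and adder delays used in the paper's examples.

```python
def dead_time_ns(delay_ns: int, cycle_ns: int) -> int:
    """Idle time left over when one operation occupies whole clock cycles.

    Assumption (not from the paper): the operation starts on a clock edge,
    nothing is chained behind it on the same unit, and a multicycle
    operation takes ceil(delay / cycle) cycles.
    """
    if delay_ns <= 0 or cycle_ns <= 0:
        raise ValueError("delays and cycle times must be positive")
    cycles = -(-delay_ns // cycle_ns)  # integer ceiling division
    return cycles * cycle_ns - delay_ns


def wasted_fraction(delay_ns: int, cycle_ns: int) -> float:
    """Share of the occupied cycles that is dead time under the same model."""
    cycles = -(-delay_ns // cycle_ns)
    return dead_time_ns(delay_ns, cycle_ns) / (cycles * cycle_ns)


if __name__ == "__main__":
    units = (("multiplier", 60), ("adder", 30))
    for cycle in (70, 60, 30):
        for name, delay in units:
            dead = dead_time_ns(delay, cycle)
            print(f"cycle={cycle} ns, {name} ({delay} ns): dead time {dead} ns")
```

Under this model a 60 ns multiplier leaves 10 ns unused in a 70 ns cycle and nothing unused in a 60 ns or 30 ns cycle, which is the effect that motivates treating the cycle time as a design variable.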

Figure 1: Block diagram of the SandS system

2 Previous Work

Most high-level synthesis systems that schedule both the data and the control path use an FSM/PLA-based controller and assume that the clock cycle time is large enough for both control and data path operations. One of the exceptions is Chippe [2]. Chippe calculates the cycle time and picks a control model that minimizes the area required while satisfying the performance criteria. The control styles used are PLA-based control (with or without a condition register), pipelined control (where the control is executed in parallel with the data path), and partitioned pipeline control (where a 2-stage pipelined controller is executed in parallel with the data path). The main difference in the work described here is that it allows an arbitrary degree of pipelining in the control path. It can thus handle the combination of long control delays and small cycle times.

Mlinar concentrates on controller area in his thesis [12]. He uses a PLA-based controller and explores the changes in PLA area with different data path schedules. He does not take controller delay into consideration.

Most systems do not vary the clock cycle time, but take as input functional unit delays in clock cycle time units. An exception is CYOS [6], which tries to generate a schedule using the best possible cycle time. Thus cycle time is one of the variables explored in the design space. It uses simulated annealing to do the scheduling and cycle time optimization. MAHA [19] also picks a cycle time which allows it to meet its area/time constraints. However, the lower bound on the cycle time it chooses is the delay of the slowest functional unit, i.e., it does not allow multicycle functional units.

Figure 2: Serial Control Style

3 Generating the Control Path

An iterative algorithm is used to generate both the data and the control path.
Assuming a control delay of zero, Slicer and Splicer are used to schedule and allocate operations in the data path. A Moore machine based finite state machine is then generated to represent the control path. This is used as input to PEG [8], which generates the control equations. These are then passed to MisII [3], which is used with a standard script and a library of basic gates. MisII generates the control circuit, maps it into the component library and returns the maximum delay. This delay is fed back to Slicer to generate a new schedule taking into account the control delay and the control style used. (The control styles are described in the next section.) This process is iterated until a stable solution is found. The fast response times of Slicer and Splicer allow such an iterative method to be used successfully. Section 3.1 describes the control styles defined in the system. Section 3.2 gives an overview of the system with the help of an example.

3.1 Control styles

Three main control styles are used: serial, parallel, and pipelined, which are depicted in Figures 2-4. The style actually implemented depends on the data path, the clock frequency and the data and control path delays.

Figure 2 depicts a serial control style, where both the data and the control paths are executed in the same clock cycle. If this control style is chosen, the data path scheduling algorithm must take into account the smaller time available to execute data path operations. Obviously, this style cannot be used if the control delay is greater than one clock cycle. This control style is usually chosen if the data path contains a large number of conditionals and loops, and there is not much scope for across-block transformations. The disadvantage of this style is that a significant portion of a clock cycle can be used in generating the control signals.

This problem is overcome in the control style depicted in Figure 3. Here the control path is executed in parallel with the data path, thus allowing operations in the data path to use the complete clock cycle. The penalty paid for this is that registers must be used to store the control signals (marked control word in Figure 3). Also, a NOOP state will have to be inserted in the data path after every conditional. This state may be deleted if the conditional can be scheduled early enough or if code motion can be used to move data-independent code into these states. (This is similar to the delayed branches used in RISC systems [9].) This control style is usually used if the data path does not contain too many conditionals or loops. A parallel control style may also be viewed as a 2-stage control and data path pipeline.

Figure 3: Parallel Control Style

Figure 4: Pipelined Control Style

If the delay of the control path is greater than the clock cycle time, it may be necessary to break the control path into stages, where the delay of each stage is less than the clock cycle time. A 2-stage pipe is shown in Figure 4. Here registers are needed not only to store the final control signals, but also to store intermediate values between stages. Again, it may be necessary to introduce multiple NOOP states every time there is a branch in the control flow.
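The trade-off among the three control styles can be summarized in a small sketch. It is not the selection procedure used by the system, which compares complete schedules. It only tabulates, for a given cycle time and control delay, how much of each cycle is left for the data path and how many NOOP states a branch may cost before any are removed by the scheduler. The names `StyleOption` and `style_options` are hypothetical. The stage count `ceil(control delay / cycle time)` is an idealized lower bound that assumes perfectly balanced stages. The rule that the NOOP count equals the number of control stages follows the paper's description.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StyleOption:
    style: str             # "serial", "parallel" or "pipelined"
    datapath_ns: int       # part of each cycle left for data path operations
    noops_per_branch: int  # NOOP states after a conditional, before removal
    control_stages: int    # stages in the control path


def style_options(cycle_ns: int, control_ns: int) -> List[StyleOption]:
    """List the control styles that are feasible for the given delays.

    Assumptions (illustrative only): control stages are perfectly balanced,
    and register setup and clock skew are ignored.
    """
    if cycle_ns <= 0 or control_ns < 0:
        raise ValueError("cycle time must be positive, control delay non-negative")
    options = []
    if control_ns < cycle_ns:
        # Serial: control and data path share the cycle, no NOOP penalty.
        options.append(StyleOption("serial", cycle_ns - control_ns, 0, 1))
    if control_ns <= cycle_ns:
        # Parallel: data path gets the whole cycle, one NOOP per conditional.
        options.append(StyleOption("parallel", cycle_ns, 1, 1))
    else:
        # Pipelined: control delay exceeds the cycle, so it is split into stages.
        stages = -(-control_ns // cycle_ns)  # integer ceiling division
        options.append(StyleOption("pipelined", cycle_ns, stages, stages))
    return options


if __name__ == "__main__":
    # 70 ns cycle with a 5 ns control delay, as in the paper's diffeq example.
    for option in style_options(70, 5):
        print(option)
    # 15 ns cycle with a hypothetical 20 ns control delay.
    for option in style_options(15, 20):
        print(option)
```

With a 70 ns cycle and a 5 ns control delay both the serial and the parallel style are feasible, so the choice depends on how many conditionals the schedule contains. With a control delay longer than the cycle, only the pipelined style remains.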
The number of NOOP states required is the number of stages in the control path. Note that there is no fixed cycle initialization overhead associated with using parallel or pipelined controllers. All the intermediate registers (control word, pipe registers, condition registers and current state register) can be initialized on reset to values that can be precomputed.

In this system, the serial control style is represented as a Moore machine. PEG and MisII are then used to generate and time the resulting circuit. The parallel control style is represented as a Mealy machine, hence MEG [21] and MisII are used to generate these designs. The pipelined control style is generated from the parallel design using retiming [10]. Retiming is used to insert registers into the network generated by MisII. The minimum number of registers is inserted such that no stage of the resulting circuit has a delay greater than the clock cycle time. This problem is the linear programming dual of a minimum-cost flow problem and can be solved in polynomial time. The linear programming package LINDO [20] is used to solve it.

3.2 System overview

The system is described with the help of an example. The example chosen is from HAL [17], expanded so as to use input and output ports. The available hardware consists of 2 multipliers (each of which takes 60 ns to execute), 1 adder, 1 subtractor, 1 comparator (each of which takes 30 ns to execute), five input ports and two output ports (each of which takes 5 ns to execute). Note that all the execution times given are register-to-register times. For this example, the cycle time is set to 70 ns.

The system initially sets the control delay to zero. Using this, Slicer generates the 6-state schedule shown in Figure 5. Splicer is then used to do the module allocation and connectivity binding. The system generates a VHDL-like description of the finite state machine for the control. The finite state machine description is then translated into PEG format.
The output of PEG is then fed into MisII to map the description into a library of standard gates. The results in this paper were obtained with the library mcnc.genlib. The maximum delay through the network is fed back to the scheduler. The scheduler may then change the schedule based on the new information available. If the control delay is greater than the clock cycle time, retiming [10] is used to introduce registers into the control path until the maximum delay in any stage of this pipelined controller is less than the clock cycle time.

In this example, the maximum delay through the control path is 5 ns. Assuming a serial control style, the schedule remains exactly the same. Since there is a dead time of at least 10 ns in every clock cycle, the control delay can easily be added to each cycle without modifying the circuit.

Figure 5: 6 state schedule for diffeq (with a serial control style).

With a clock cycle time of 70 ns, a parallel control style would have given the schedule shown in Figure 6. The number of states has now increased to 7, since an extra NOOP state has to be inserted after the first comparison. Slicer performs in-block transformations that eliminate NOOP states wherever possible. In this case, the comparison within the loop is scheduled early enough so that other operations can be moved into the state after it. Hence a NOOP state does not have to be inserted within the loop. For this clock cycle time, the serial control style gives better throughput and is accepted by the system. The network description is fed to Artist11 [13], which does the layout.

Figure 6: 7 state schedule for diffeq using 5 input ports and a parallel control style.

With a cycle time of 60 ns, the parallel control style is superior to the serial style and will thus be adopted by the system. The schedule with a serial control style takes 8 states (with the body of the loop taking 6 states), whereas that with the parallel control style takes 7 states (with the body of the loop taking 4 states). Using the parallel controller results in fewer states in spite of the fact that a NOOP state has to be added after the comparison. If the loop is repeated only ten times, the system with the parallel control takes 2580 ns to execute, whereas that with the serial control takes 3720 ns, a savings of over 30%. This saving increases as the number of iterations through the loop increases.

Table 1 shows the results obtained with this example and different clock frequencies. The total time column shows the time taken for ten iterations of the loop. For this example, the best throughput is achieved by setting the clock cycle time to 10 ns and the control style to pipelined. Assuming that the loop iterates ten times, the total execution time is now 2170 ns, a savings of 1550 ns over using the 70 ns clock cycle and a serial control style. Thus significant improvements in throughput can be obtained even with very small systems.

Table 1: Differential Equation Solver

4 Experiments

This section describes the results obtained with the FIR filter [18] and the Mark1 processor [4]. All the functional unit execution times mentioned in this section are register-to-register delays. Experiments run with other benchmarks from [1] showed similar increases in throughput.
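The total-time figures quoted for the differential equation solver in Section 3.2 follow from simple arithmetic on state counts, and the same reasoning underlies the total time columns in the experiments. The sketch below reproduces the 60 ns comparison. It assumes that every state outside the loop executes once and every state in the loop body executes once per iteration. The function names are hypothetical.

```python
def total_time_ns(total_states: int, loop_states: int,
                  iterations: int, cycle_ns: int) -> int:
    """Execution time of a schedule with one loop.

    Assumption: states outside the loop run once, loop-body states run
    once per iteration, and every state lasts exactly one clock cycle.
    """
    if not 0 <= loop_states <= total_states:
        raise ValueError("loop_states must lie between 0 and total_states")
    outside = total_states - loop_states
    return (outside + loop_states * iterations) * cycle_ns


def relative_saving(slow_ns: int, fast_ns: int) -> float:
    """Fraction of the slower execution time that the faster design saves."""
    return (slow_ns - fast_ns) / slow_ns


if __name__ == "__main__":
    # 60 ns cycle, ten loop iterations, state counts as given in Section 3.2.
    serial = total_time_ns(total_states=8, loop_states=6, iterations=10, cycle_ns=60)
    parallel = total_time_ns(total_states=7, loop_states=4, iterations=10, cycle_ns=60)
    print(serial, parallel, f"{relative_saving(serial, parallel):.1%}")
```

For ten iterations this gives 3720 ns for the serial schedule and 2580 ns for the parallel one, a saving of about 30.6%. Because the loop body dominates, the saving approaches 1 - 4/6, roughly one third, as the iteration count grows.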

4.1 Mark1 processor

This example was translated from the ISPS description of the Manchester University Mark-1 Computer in [4]. The functional units used were one adder, one subtractor, one comparator and a memory with a read and a write port. Memory operations take 90 ns; all other operations take 35 ns. Table 2 shows the results obtained with a range of clock frequencies. The numbers in the Number of States column are the number of states in the schedule with the chosen control style, and with the other control style. For example, with a clock cycle time of 100 ns, Mark1 can be scheduled in 13 states with a parallel control style and in 21 states with a serial control style, a savings of 800 ns. When the clock cycle time is reduced to 15 ns, the control delay is greater than the cycle time, and a 2-stage pipelined controller is used.

Table 2: Mark1 Processor

4.2 FIR filter

Experiments were run with the FIR filter to illustrate performance improvement with functional pipelining, pipelined functional units and a parallel control path. The register-to-register delay of the adders is 30 ns and that of the multipliers is 60 ns. Table 3 compares nonpipelined and (functionally) pipelined designs using a non-pipelined multiplier. The second figure in the total time column for the pipelined case shows the elapsed time for the first output from the pipeline. The first figure in this column is the rate at which results are available after that. On both counts, substantial savings are obtained with the pipelined version as compared to the nonpipelined version. Table 4 compares nonpipelined and pipelined designs using a pipelined multiplier. The multiplier can accept new inputs in every clock cycle. The total time column here has the same meaning as in the previous case. The best throughput is obtained using a clock cycle time of 30 ns, functional pipelining, pipelined multipliers and a parallel control style.

5 Conclusions and Future Work
Increasing throughput in high-performance systems can be achieved by completely utilizing every control step. This can be done either by increasing the hardware available or by changing the clock cycle time and modifying the control style. The results in this paper show that varying the clock cycle time and using pipelined or parallel controllers can lead to significant improvements in the throughput of resource-constrained systems. Experiments have shown that the right choice of cycle time can decrease the amount of dead time in a control step. Sometimes the improvement can be achieved only by executing the data and the control paths in parallel. Even if the control delay is only a few nanoseconds, decreasing each clock cycle by this amount leads to a substantial improvement across the whole schedule. These effects will be much more pronounced in large systems scheduled over hundreds or thousands of clock cycles. These methods will also be very effective in control-dominated systems where the control delay is comparable to functional unit delay. Here the necessity for parallel or pipelined control units is much more obvious.

The system will be expanded to incorporate facilities for specifying external timing constraints on control so as to make the system more useful in control-dominated designs. The system should also automatically generate the best cycle time.

Table 3: FIR Filter: non-pipelined multiplier (5+, 4*). Surviving entries for the functionally pipelined designs, given as control style and total time DII/CT: Serial 300/900; Serial 210/630; Parallel 180/720; Serial 160/440; Parallel 120/330.

Table 4: FIR Filter: pipelined multiplier

References

[1] Benchmarks for the 6th International Workshop on High-Level Synthesis.
[2] F. Brewer and D. Gajski. Chippe: A System for Constraint Driven Behavioral Synthesis. IEEE Trans. on CAD, Vol 9, July 1990.
[3] R. Brayton, R. Rudell, A. Sangiovanni-Vincentelli, A. Wang. MIS: A Multiple-Level Logic Optimization System. IEEE Trans. on CAD, Vol 6, No 6, Nov 1987.
[4] M. R. Barbacci and D. P. Siewiorek. Design and Analysis of Instruction Set Processors. McGraw-Hill.
[5] R. Camposano. Path-Based Scheduling for Synthesis. IEEE Trans. on CAD, Vol 10, No 1, January 1991.
[6] J. Cortadella, R. M. Badia, E. Ayguade. Scheduling in a Continuous Area-Time Design Space: A Simulated Annealing Approach. Proc. of the Fifth International Workshop on High-Level Synthesis, March 1991.
[7] J. A. Fisher. Trace Scheduling: A Technique for Global Microcode Compaction. IEEE Trans. on Computers, Vol C-30, No 7, July.
[8] G. Hamachi. Designing Finite State Machines with PEG. UC Berkeley.
[9] J. L. Hennessy and D. A. Patterson. Computer Architecture: A Quantitative Approach. Morgan Kaufmann Publishers, Inc.
[10] C. Leiserson, J. Saxe. Retiming Synchronous Circuitry. Algorithmica, 6(1), 1991.
[11] D. Lobo and B. M. Pangrle. Optimization Techniques for Pipelined Scheduling. Fifth International Conference on VLSI Design, India, Jan.
[12] M. J. Mlinar. Control Path / Data Path Tradeoffs in VLSI Design. CEng Tech. Report 91-16, University of Southern California.
[13] R. M. Owens and M. J. Irwin. A Comparison of Four 2-Dimensional Gate Matrix Layout Tools. Proc. 26th Design Automation Conference.
[14] B. M. Pangrle and D. D. Gajski. Slicer: A State Synthesizer for Intelligent Compilation. Proc. of ICCD, Oct.
[15] B. M. Pangrle. Splicer: A Heuristic Approach to Connectivity Binding. Proc. 25th Design Automation Conference, June.
[16] P. G. Paulin and J. P. Knight. Scheduling and Binding Algorithms for High-Level Synthesis. Proc. 26th Design Automation Conference, June.
[17] P. G. Paulin, J. P. Knight, E. F. Girczyc. HAL: A Multi-Paradigm Approach to Automatic Datapath Synthesis. Proc. 23rd Design Automation Conference, 1986.
[18] R. Potasman, J. Lis, A. Nicolau, D. Gajski. Percolation Based Synthesis. Proc. 27th Design Automation Conference, 1990.
[19] A. Parker, J. Pizarro, M. Mlinar. MAHA: A Program for Datapath Synthesis. Proc. 23rd Design Automation Conference, July 1986.
[20] L. Schrage. Linear, Integer and Quadratic Programming with LINDO. The Scientific Press.
[21] D. Wood. MEG. UC Berkeley.


More information

Logic Rewiring for Delay and Power Minimization *

Logic Rewiring for Delay and Power Minimization * JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 20, 1-XXX (2004) Short Paper Logic Rewiring for Delay and Power Minimization * Department of Electrical and Computer Engineering and Department of Computer

More information

Pipelined Linear Convolution Based On Hierarchical Overlay UT Multiplier

Pipelined Linear Convolution Based On Hierarchical Overlay UT Multiplier Pipelined Linear Convolution Based On Hierarchical Overlay UT Multiplier Pranav K, Pramod P 1 PG scholar (M Tech VLSI Design and Signal Processing) L B S College of Engineering Kasargod, Kerala, India

More information

Automated FSM Error Correction for Single Event Upsets

Automated FSM Error Correction for Single Event Upsets Automated FSM Error Correction for Single Event Upsets Nand Kumar and Darren Zacher Mentor Graphics Corporation nand_kumar{darren_zacher}@mentor.com Abstract This paper presents a technique for automatic

More information

Design of Delay Efficient PASTA by Using Repetition Process

Design of Delay Efficient PASTA by Using Repetition Process Design of Delay Efficient PASTA by Using Repetition Process V.Sai Jaswana Department of ECE, Narayana Engineering College, Nellore. K. Murali HOD, Department of ECE, Narayana Engineering College, Nellore.

More information

(VE2: Verilog HDL) Software Development & Education Center

(VE2: Verilog HDL) Software Development & Education Center Software Development & Education Center (VE2: Verilog HDL) VLSI Designing & Integration Introduction VLSI: With the hardware market booming with the rise demand in chip driven products in consumer electronics,

More information

An area optimized FIR Digital filter using DA Algorithm based on FPGA

An area optimized FIR Digital filter using DA Algorithm based on FPGA An area optimized FIR Digital filter using DA Algorithm based on FPGA B.Chaitanya Student, M.Tech (VLSI DESIGN), Department of Electronics and communication/vlsi Vidya Jyothi Institute of Technology, JNTU

More information

Towards PVT-Tolerant Glitch-Free Operation in FPGAs

Towards PVT-Tolerant Glitch-Free Operation in FPGAs Towards PVT-Tolerant Glitch-Free Operation in FPGAs Safeen Huda and Jason H. Anderson ECE Department, University of Toronto, Canada 24 th ACM/SIGDA International Symposium on FPGAs February 22, 2016 Motivation

More information

A Low Energy Architecture for Fast PN Acquisition

A Low Energy Architecture for Fast PN Acquisition A Low Energy Architecture for Fast PN Acquisition Christopher Deng Electrical Engineering, UCLA 42 Westwood Plaza Los Angeles, CA 966, USA -3-26-6599 deng@ieee.org Charles Chien Rockwell Science Center

More information

CORDIC Algorithm Implementation in FPGA for Computation of Sine & Cosine Signals

CORDIC Algorithm Implementation in FPGA for Computation of Sine & Cosine Signals International Journal of Scientific & Engineering Research, Volume 2, Issue 12, December-2011 1 CORDIC Algorithm Implementation in FPGA for Computation of Sine & Cosine Signals Hunny Pahuja, Lavish Kansal,

More information

EE382V-ICS: System-on-a-Chip (SoC) Design

EE382V-ICS: System-on-a-Chip (SoC) Design EE38V-CS: System-on-a-Chip (SoC) Design Hardware Synthesis and Architectures Source: D. Gajski, S. Abdi, A. Gerstlauer, G. Schirner, Embedded System Design: Modeling, Synthesis, Verification, Chapter 6:

More information

TECHNOLOGY scaling, aided by innovative circuit techniques,

TECHNOLOGY scaling, aided by innovative circuit techniques, 122 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 14, NO. 2, FEBRUARY 2006 Energy Optimization of Pipelined Digital Systems Using Circuit Sizing and Supply Scaling Hoang Q. Dao,

More information

Implementation of Parallel Multiplier-Accumulator using Radix- 2 Modified Booth Algorithm and SPST

Implementation of Parallel Multiplier-Accumulator using Radix- 2 Modified Booth Algorithm and SPST ǁ Volume 02 - Issue 01 ǁ January 2017 ǁ PP. 06-14 Implementation of Parallel Multiplier-Accumulator using Radix- 2 Modified Booth Algorithm and SPST Ms. Deepali P. Sukhdeve Assistant Professor Department

More information

ENHANCING SPEED AND REDUCING POWER OF SHIFT AND ADD MULTIPLIER

ENHANCING SPEED AND REDUCING POWER OF SHIFT AND ADD MULTIPLIER ENHANCING SPEED AND REDUCING POWER OF SHIFT AND ADD MULTIPLIER 1 ZUBER M. PATEL 1 S V National Institute of Technology, Surat, Gujarat, Inida E-mail: zuber_patel@rediffmail.com Abstract- This paper presents

More information

VLSI IMPLEMENTATION OF MODIFIED DISTRIBUTED ARITHMETIC BASED LOW POWER AND HIGH PERFORMANCE DIGITAL FIR FILTER Dr. S.Satheeskumaran 1 K.

VLSI IMPLEMENTATION OF MODIFIED DISTRIBUTED ARITHMETIC BASED LOW POWER AND HIGH PERFORMANCE DIGITAL FIR FILTER Dr. S.Satheeskumaran 1 K. VLSI IMPLEMENTATION OF MODIFIED DISTRIBUTED ARITHMETIC BASED LOW POWER AND HIGH PERFORMANCE DIGITAL FIR FILTER Dr. S.Satheeskumaran 1 K. Sasikala 2 1 Professor, Department of Electronics and Communication

More information

THE INTERNATIONAL JOURNAL OF SCIENCE & TECHNOLEDGE

THE INTERNATIONAL JOURNAL OF SCIENCE & TECHNOLEDGE THE INTERNATIONAL JOURNAL OF SCIENCE & TECHNOLEDGE A Novel Approach of -Insensitive Null Convention Logic Microprocessor Design J. Asha Jenova Student, ECE Department, Arasu Engineering College, Tamilndu,

More information

Sno Projects List IEEE. High - Throughput Finite Field Multipliers Using Redundant Basis For FPGA And ASIC Implementations

Sno Projects List IEEE. High - Throughput Finite Field Multipliers Using Redundant Basis For FPGA And ASIC Implementations Sno Projects List IEEE 1 High - Throughput Finite Field Multipliers Using Redundant Basis For FPGA And ASIC Implementations 2 A Generalized Algorithm And Reconfigurable Architecture For Efficient And Scalable

More information

IMPLEMENTATION OF MULTIRATE SAMPLING ON FPGA WITH LOW COMPLEXITY FIR FILTERS

IMPLEMENTATION OF MULTIRATE SAMPLING ON FPGA WITH LOW COMPLEXITY FIR FILTERS IMPLEMENTATION OF MULTIRATE SAMPLING ON FPGA WITH LOW COMPLEXITY FIR FILTERS Prof. R. V. Babar 1, Pooja Khot 2, Pallavi More 3, Neha Khanzode 4 1, 2, 3, 4 Department of E&TC Engineering, Sinhgad Institute

More information

Design of FIR Filter Using Modified Montgomery Multiplier with Pipelining Technique

Design of FIR Filter Using Modified Montgomery Multiplier with Pipelining Technique International Journal of Engineering Research and Development e-issn: 2278-067X, p-issn: 2278-800X, www.ijerd.com Volume 10, Issue 3 (March 2014), PP.55-63 Design of FIR Filter Using Modified Montgomery

More information

Simultaneous Peak and Average Power Minimization during Datapath Scheduling for DSP Processors

Simultaneous Peak and Average Power Minimization during Datapath Scheduling for DSP Processors Simultaneous Peak and Average Power Minimization during Datapath Scheduling for DSP Processors Saraju P. Mohanty,. Ranganathan and Sunil K. Chappidi Department of Computer Science and Engineering anomaterial

More information

SPIRO SOLUTIONS PVT LTD

SPIRO SOLUTIONS PVT LTD VLSI S.NO PROJECT CODE TITLE YEAR ANALOG AMS(TANNER EDA) 01 ITVL01 20-Mb/s GFSK Modulator Based on 3.6-GHz Hybrid PLL With 3-b DCO Nonlinearity Calibration and Independent Delay Mismatch Control 02 ITVL02

More information

CHAPTER 4 ANALYSIS OF LOW POWER, AREA EFFICIENT AND HIGH SPEED MULTIPLIER TOPOLOGIES

CHAPTER 4 ANALYSIS OF LOW POWER, AREA EFFICIENT AND HIGH SPEED MULTIPLIER TOPOLOGIES 69 CHAPTER 4 ANALYSIS OF LOW POWER, AREA EFFICIENT AND HIGH SPEED MULTIPLIER TOPOLOGIES 4.1 INTRODUCTION Multiplication is one of the basic functions used in digital signal processing. It requires more

More information

AREA AND DELAY EFFICIENT DESIGN FOR PARALLEL PREFIX FINITE FIELD MULTIPLIER

AREA AND DELAY EFFICIENT DESIGN FOR PARALLEL PREFIX FINITE FIELD MULTIPLIER AREA AND DELAY EFFICIENT DESIGN FOR PARALLEL PREFIX FINITE FIELD MULTIPLIER 1 CH.JAYA PRAKASH, 2 P.HAREESH, 3 SK. FARISHMA 1&2 Assistant Professor, Dept. of ECE, 3 M.Tech-Student, Sir CR Reddy College

More information

A HIGH PERFORMANCE HARDWARE ARCHITECTURE FOR HALF-PIXEL ACCURATE H.264 MOTION ESTIMATION

A HIGH PERFORMANCE HARDWARE ARCHITECTURE FOR HALF-PIXEL ACCURATE H.264 MOTION ESTIMATION A HIGH PERFORMANCE HARDWARE ARCHITECTURE FOR HALF-PIXEL ACCURATE H.264 MOTION ESTIMATION Sinan Yalcin and Ilker Hamzaoglu Faculty of Engineering and Natural Sciences, Sabanci University, 34956, Tuzla,

More information

Disseny físic. Disseny en Standard Cells. Enric Pastor Rosa M. Badia Ramon Canal DM Tardor DM, Tardor

Disseny físic. Disseny en Standard Cells. Enric Pastor Rosa M. Badia Ramon Canal DM Tardor DM, Tardor Disseny físic Disseny en Standard Cells Enric Pastor Rosa M. Badia Ramon Canal DM Tardor 2005 DM, Tardor 2005 1 Design domains (Gajski) Structural Processor, memory ALU, registers Cell Device, gate Transistor

More information

A New Architecture for Signed Radix-2 m Pure Array Multipliers

A New Architecture for Signed Radix-2 m Pure Array Multipliers A New Architecture for Signed Radi-2 m Pure Array Multipliers Eduardo Costa Sergio Bampi José Monteiro UCPel, Pelotas, Brazil UFRGS, P. Alegre, Brazil IST/INESC, Lisboa, Portugal ecosta@atlas.ucpel.tche.br

More information

Optimized FIR filter design using Truncated Multiplier Technique

Optimized FIR filter design using Truncated Multiplier Technique International OPEN ACCESS Journal Of Modern Engineering Research (IJMER) Optimized FIR filter design using Truncated Multiplier Technique V. Bindhya 1, R. Guru Deepthi 2, S. Tamilselvi 3, Dr. C. N. Marimuthu

More information

DESIGN OF MULTIPLYING DELAY LOCKED LOOP FOR DIFFERENT MULTIPLYING FACTORS

DESIGN OF MULTIPLYING DELAY LOCKED LOOP FOR DIFFERENT MULTIPLYING FACTORS DESIGN OF MULTIPLYING DELAY LOCKED LOOP FOR DIFFERENT MULTIPLYING FACTORS Aman Chaudhary, Md. Imtiyaz Chowdhary, Rajib Kar Department of Electronics and Communication Engg. National Institute of Technology,

More information

Faster and Low Power Twin Precision Multiplier

Faster and Low Power Twin Precision Multiplier Faster and Low Twin Precision V. Sreedeep, B. Ramkumar and Harish M Kittur Abstract- In this work faster unsigned multiplication has been achieved by using a combination High Performance Multiplication

More information

Design of an optimized multiplier based on approximation logic

Design of an optimized multiplier based on approximation logic ISSN:2348-2079 Volume-6 Issue-1 International Journal of Intellectual Advancements and Research in Engineering Computations Design of an optimized multiplier based on approximation logic Dhivya Bharathi

More information

PV SYSTEM BASED FPGA: ANALYSIS OF POWER CONSUMPTION IN XILINX XPOWER TOOL

PV SYSTEM BASED FPGA: ANALYSIS OF POWER CONSUMPTION IN XILINX XPOWER TOOL 1 PV SYSTEM BASED FPGA: ANALYSIS OF POWER CONSUMPTION IN XILINX XPOWER TOOL Pradeep Patel Instrumentation and Control Department Prof. Deepali Shah Instrumentation and Control Department L. D. College

More information

Hardware Implementation of Automatic Control Systems using FPGAs

Hardware Implementation of Automatic Control Systems using FPGAs Hardware Implementation of Automatic Control Systems using FPGAs Lecturer PhD Eng. Ionel BOSTAN Lecturer PhD Eng. Florin-Marian BÎRLEANU Romania Disclaimer: This presentation tries to show the current

More information

IN SEVERAL wireless hand-held systems, the finite-impulse

IN SEVERAL wireless hand-held systems, the finite-impulse IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 51, NO. 1, JANUARY 2004 21 Power-Efficient FIR Filter Architecture Design for Wireless Embedded System Shyh-Feng Lin, Student Member,

More information

AN EFFICIENT MAC DESIGN IN DIGITAL FILTERS

AN EFFICIENT MAC DESIGN IN DIGITAL FILTERS AN EFFICIENT MAC DESIGN IN DIGITAL FILTERS THIRUMALASETTY SRIKANTH 1*, GUNGI MANGARAO 2* 1. Dept of ECE, Malineni Lakshmaiah Engineering College, Andhra Pradesh, India. Email Id : srikanthmailid07@gmail.com

More information

SIGNED PIPELINED MULTIPLIER USING HIGH SPEED COMPRESSORS

SIGNED PIPELINED MULTIPLIER USING HIGH SPEED COMPRESSORS INTERNATIONAL JOURNAL OF RESEARCH IN COMPUTER APPLICATIONS AND ROBOTICS ISSN 2320-7345 SIGNED PIPELINED MULTIPLIER USING HIGH SPEED COMPRESSORS 1 T.Thomas Leonid, 2 M.Mary Grace Neela, and 3 Jose Anand

More information

A Level-Encoded Transition Signaling Protocol for High-Throughput Asynchronous Global Communication

A Level-Encoded Transition Signaling Protocol for High-Throughput Asynchronous Global Communication A Level-Encoded Transition Signaling Protocol for High-Throughput Asynchronous Global Communication Peggy B. McGee, Melinda Y. Agyekum, Moustafa M. Mohamed and Steven M. Nowick {pmcgee, melinda, mmohamed,

More information

OPTIMIZATION OF PERFORMANCE OF DIFFERENT VEDIC MULTIPLIER

OPTIMIZATION OF PERFORMANCE OF DIFFERENT VEDIC MULTIPLIER OPTIMIZATION OF PERFORMANCE OF DIFFERENT VEDIC MULTIPLIER 1 KRISHAN KUMAR SHARMA, 2 HIMANSHU JOSHI 1 M. Tech. Student, Jagannath University, Jaipur, India 2 Assistant Professor, Department of Electronics

More information

Design of a High Speed FIR Filter on FPGA by Using DA-OBC Algorithm

Design of a High Speed FIR Filter on FPGA by Using DA-OBC Algorithm Design of a High Speed FIR Filter on FPGA by Using DA-OBC Algorithm Vijay Kumar Ch 1, Leelakrishna Muthyala 1, Chitra E 2 1 Research Scholar, VLSI, SRM University, Tamilnadu, India 2 Assistant Professor,

More information

Digital Logic, Algorithms, and Functions for the CEBAF Upgrade LLRF System Hai Dong, Curt Hovater, John Musson, and Tomasz Plawski

Digital Logic, Algorithms, and Functions for the CEBAF Upgrade LLRF System Hai Dong, Curt Hovater, John Musson, and Tomasz Plawski Digital Logic, Algorithms, and Functions for the CEBAF Upgrade LLRF System Hai Dong, Curt Hovater, John Musson, and Tomasz Plawski Introduction: The CEBAF upgrade Low Level Radio Frequency (LLRF) control

More information

Area Efficient and Low Power Reconfiurable Fir Filter

Area Efficient and Low Power Reconfiurable Fir Filter 50 Area Efficient and Low Power Reconfiurable Fir Filter A. UMASANKAR N.VASUDEVAN N.Kirubanandasarathy Research scholar St.peter s university, ECE, Chennai- 600054, INDIA Dean (Engineering and Technology),

More information

Reduction. CSCE 6730 Advanced VLSI Systems. Instructor: Saraju P. Mohanty, Ph. D. NOTE: The figures, text etc included in slides are

Reduction. CSCE 6730 Advanced VLSI Systems. Instructor: Saraju P. Mohanty, Ph. D. NOTE: The figures, text etc included in slides are Lecture e 8: Peak Power Reduction CSCE 6730 Advanced VLSI Systems Instructor: Saraju P. Mohanty, Ph. D. NOTE: The figures, text etc included in slides are borrowed from various books, websites, authors

More information

Vol. 5, No. 6 June 2014 ISSN Journal of Emerging Trends in Computing and Information Sciences CIS Journal. All rights reserved.

Vol. 5, No. 6 June 2014 ISSN Journal of Emerging Trends in Computing and Information Sciences CIS Journal. All rights reserved. Optimal Synthesis of Finite State Machines with Universal Gates using Evolutionary Algorithm 1 Noor Ullah, 2 Khawaja M.Yahya, 3 Irfan Ahmed 1, 2, 3 Department of Electrical Engineering University of Engineering

More information

UNIT-III POWER ESTIMATION AND ANALYSIS

UNIT-III POWER ESTIMATION AND ANALYSIS UNIT-III POWER ESTIMATION AND ANALYSIS In VLSI design implementation simulation software operating at various levels of design abstraction. In general simulation at a lower-level design abstraction offers

More information

Course Outcome of M.Tech (VLSI Design)

Course Outcome of M.Tech (VLSI Design) Course Outcome of M.Tech (VLSI Design) PVL108: Device Physics and Technology The students are able to: 1. Understand the basic physics of semiconductor devices and the basics theory of PN junction. 2.

More information

Low-Power VLSI. Seong-Ook Jung VLSI SYSTEM LAB, YONSEI University School of Electrical & Electronic Engineering

Low-Power VLSI. Seong-Ook Jung VLSI SYSTEM LAB, YONSEI University School of Electrical & Electronic Engineering Low-Power VLSI Seong-Ook Jung 2013. 5. 27. sjung@yonsei.ac.kr VLSI SYSTEM LAB, YONSEI University School of Electrical & Electronic Engineering Contents 1. Introduction 2. Power classification & Power performance

More information

Functional Integration of Parallel Counters Based on Quantum-Effect Devices

Functional Integration of Parallel Counters Based on Quantum-Effect Devices Proceedings of the th IMACS World Congress (ol. ), Berlin, August 997, Special Session on Computer Arithmetic, pp. 7-78 Functional Integration of Parallel Counters Based on Quantum-Effect Devices Christian

More information

DESIGN OF LOW POWER MULTIPLIERS

DESIGN OF LOW POWER MULTIPLIERS DESIGN OF LOW POWER MULTIPLIERS GowthamPavanaskar, RakeshKamath.R, Rashmi, Naveena Guided by: DivyeshDivakar AssistantProfessor EEE department Canaraengineering college, Mangalore Abstract:With advances

More information

High-Performance Pipelined Architecture of Elliptic Curve Scalar Multiplication Over GF(2 m )

High-Performance Pipelined Architecture of Elliptic Curve Scalar Multiplication Over GF(2 m ) High-Performance Pipelined Architecture of Elliptic Curve Scalar Multiplication Over GF(2 m ) Abstract: This paper proposes an efficient pipelined architecture of elliptic curve scalar multiplication (ECSM)

More information

Instantaneous Loop. Ideal Phase Locked Loop. Gain ICs

Instantaneous Loop. Ideal Phase Locked Loop. Gain ICs Instantaneous Loop Ideal Phase Locked Loop Gain ICs PHASE COORDINATING An exciting breakthrough in phase tracking, phase coordinating, has been developed by Instantaneous Technologies. Instantaneous Technologies

More information

MS Project :Trading Accuracy for Power with an Under-designed Multiplier Architecture Parag Kulkarni Adviser : Prof. Puneet Gupta Electrical Eng.

MS Project :Trading Accuracy for Power with an Under-designed Multiplier Architecture Parag Kulkarni Adviser : Prof. Puneet Gupta Electrical Eng. MS Project :Trading Accuracy for Power with an Under-designed Multiplier Architecture Parag Kulkarni Adviser : Prof. Puneet Gupta Electrical Eng., UCLA - http://nanocad.ee.ucla.edu/ 1 Outline Introduction

More information

Methods for Reducing the Activity Switching Factor

Methods for Reducing the Activity Switching Factor International Journal of Engineering Research and Development e-issn: 2278-67X, p-issn: 2278-8X, www.ijerd.com Volume, Issue 3 (March 25), PP.7-25 Antony Johnson Chenginimattom, Don P John M.Tech Student,

More information

Asynchronous vs. Synchronous Design of RSA

Asynchronous vs. Synchronous Design of RSA vs. Synchronous Design of RSA A. Rezaeinia, V. Fatemi, H. Pedram,. Sadeghian, M. Naderi Computer Engineering Department, Amirkabir University of Technology, Tehran, Iran {rezainia,fatemi,pedram,naderi}@ce.aut.ac.ir

More information

Chapter 16 - Instruction-Level Parallelism and Superscalar Processors

Chapter 16 - Instruction-Level Parallelism and Superscalar Processors Chapter 16 - Instruction-Level Parallelism and Superscalar Processors Luis Tarrataca luis.tarrataca@gmail.com CEFET-RJ L. Tarrataca Chapter 16 - Superscalar Processors 1 / 78 Table of Contents I 1 Overview

More information

Design and Implementation of Digit Serial Fir Filter

Design and Implementation of Digit Serial Fir Filter International Journal of Emerging Engineering Research and Technology Volume 3, Issue 11, November 2015, PP 15-22 ISSN 2349-4395 (Print) & ISSN 2349-4409 (Online) Design and Implementation of Digit Serial

More information

Data Flow Graph. Parameterized Module Library. Register Level Design ( Data Path + Controller ) Cell Generators (Soft Macros)

Data Flow Graph. Parameterized Module Library. Register Level Design ( Data Path + Controller ) Cell Generators (Soft Macros) High Level Proling Based Low Power Synthesis Technique Srinivas Katkoori, Nand Kumar and Ranga Vemuri Address for Correspondence: Dr. Ranga Vemuri, Director Laboratory for Digital Design Environments Department

More information

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY A PATH FOR HORIZING YOUR INNOVATIVE WORK PARALLEL ARRAY MULTIPLIER DESIGN TECHNIQUES VIGHNESH KADOLKAR 1, SONIA KUWELKAR

More information

VLSI Implementation of Auto-Correlation Architecture for Synchronization of MIMO-OFDM WLAN Systems

VLSI Implementation of Auto-Correlation Architecture for Synchronization of MIMO-OFDM WLAN Systems JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE, VOL.10, NO.3, SEPTEMBER, 2010 185 VLSI Implementation of Auto-Correlation Architecture for Synchronization of MIMO-OFDM WLAN Systems Jongmin Cho*, Jinsang

More information

UNIT-II LOW POWER VLSI DESIGN APPROACHES

UNIT-II LOW POWER VLSI DESIGN APPROACHES UNIT-II LOW POWER VLSI DESIGN APPROACHES Low power Design through Voltage Scaling: The switching power dissipation in CMOS digital integrated circuits is a strong function of the power supply voltage.

More information

CS250 VLSI Systems Design. Lecture 3: Physical Realities: Beneath the Digital Abstraction, Part 1: Timing

CS250 VLSI Systems Design. Lecture 3: Physical Realities: Beneath the Digital Abstraction, Part 1: Timing CS250 VLSI Systems Design Lecture 3: Physical Realities: Beneath the Digital Abstraction, Part 1: Timing Fall 2010 Krste Asanovic, John Wawrzynek with John Lazzaro and Yunsup Lee (TA) What do Computer

More information

Globally Asynchronous Locally Synchronous (GALS) Microprogrammed Parallel FIR Filter

Globally Asynchronous Locally Synchronous (GALS) Microprogrammed Parallel FIR Filter IOSR Journal of VLSI and Signal Processing (IOSR-JVSP) Volume 6, Issue 5, Ver. II (Sep. - Oct. 2016), PP 15-21 e-issn: 2319 4200, p-issn No. : 2319 4197 www.iosrjournals.org Globally Asynchronous Locally

More information

CS 6135 VLSI Physical Design Automation Fall 2003

CS 6135 VLSI Physical Design Automation Fall 2003 CS 6135 VLSI Physical Design Automation Fall 2003 1 Course Information Class time: R789 Location: EECS 224 Instructor: Ting-Chi Wang ( ) EECS 643, (03) 5742963 tcwang@cs.nthu.edu.tw Office hours: M56R5

More information

Design of High Speed Power Efficient Combinational and Sequential Circuits Using Reversible Logic

Design of High Speed Power Efficient Combinational and Sequential Circuits Using Reversible Logic Design of High Speed Power Efficient Combinational and Sequential Circuits Using Reversible Logic Basthana Kumari PG Scholar, Dept. of Electronics and Communication Engineering, Intell Engineering College,

More information

Co-evolution for Communication: An EHW Approach

Co-evolution for Communication: An EHW Approach Journal of Universal Computer Science, vol. 13, no. 9 (2007), 1300-1308 submitted: 12/6/06, accepted: 24/10/06, appeared: 28/9/07 J.UCS Co-evolution for Communication: An EHW Approach Yasser Baleghi Damavandi,

More information

SYNTHESIS OF CYCLIC ENCODER AND DECODER FOR HIGH SPEED NETWORKS

SYNTHESIS OF CYCLIC ENCODER AND DECODER FOR HIGH SPEED NETWORKS SYNTHESIS OF CYCLIC ENCODER AND DECODER FOR HIGH SPEED NETWORKS MARIA RIZZI, MICHELE MAURANTONIO, BENIAMINO CASTAGNOLO Dipartimento di Elettrotecnica ed Elettronica, Politecnico di Bari v. E. Orabona,

More information

Keywords SEFDM, OFDM, FFT, CORDIC, FPGA.

Keywords SEFDM, OFDM, FFT, CORDIC, FPGA. Volume 4, Issue 11, November 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Future to

More information

An Efficient Design of Parallel Pipelined FFT Architecture

An Efficient Design of Parallel Pipelined FFT Architecture www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume 3, Issue 10 October, 2014 Page No. 8926-8931 An Efficient Design of Parallel Pipelined FFT Architecture Serin

More information