Speedup of Self-Timed Digital Systems Using Early Completion


Scott C. Smith
University of Missouri-Rolla, Department of Electrical and Computer Engineering
3 Emerson Electric Co. Hall, 87 Miner Circle, Rolla, MO 659
Phone: (573) 3-3, Fax: (573) 3-53, smithsco@umr.edu

Abstract

An Early Completion technique is developed to significantly increase the throughput of NULL Convention self-timed digital systems without impacting latency or compromising their self-timed nature. Early Completion performs the completion detection for registration stage i at the input of the register, instead of at the output of the register, as in standard NULL Convention Logic. This method requires that the single-rail completion signal from registration stage i+1, Ko_{i+1}, be used as an additional input to the completion detection circuitry for registration stage i, to maintain self-timed operation. However, Early Completion does necessitate an assumption of equipotential regions, introducing a few easily satisfiable timing assumptions, thus making the design potentially more delay-sensitive. To illustrate the technique, Early Completion is applied to a case study of the optimally pipelined 4-bit by 4-bit unsigned multiplier utilizing full-word completion, presented in [1], where a speedup of. is achieved while self-timed operation is maintained and latency remains unchanged.

1. Introduction

In this paper a new completion strategy for NULL Convention Logic (NCL) [2] is presented, which increases throughput of NCL systems without degrading latency or compromising their self-timed operation. The increased performance is due to a reduction of the impact of the handshaking overhead. The technique is based on anticipation, similar to that of a carry lookahead adder, in which future events are predicted, thus allowing overlapped computation time. Most multi-rail delay-insensitive logic paradigms [3, 4, 5, 6, and 7] consist of combinational logic, latches or registration, and completion detection.
As the standard, completion detection is performed at the output of registration stage i and the completion signal is fed back as an input to registration stage i-1. Assuming that each combinational logic block has the same delay and that each completion detection unit has the same delay, the basic cycle is as follows: NULL flows through combinational logic block i (N_i), NULL flows through completion detection unit i (RFD_i), DATA flows through combinational logic block i (D_i), DATA flows through completion detection unit i (RFN_i). These events are mutually exclusive, such that no two events overlap in time. However, with the application of Early Completion, N_i and RFD_i partially overlap in time and D_i and RFN_i partially overlap in time, thus decreasing the overall cycle time and increasing throughput.

I would like to thank the University of Missouri Research Board for their funding that has made this work possible.

The Early Completion technique presented herein is similar to the Early Done technique presented in the Previous Work section. Both increase throughput by moving the completion detection circuitry forward in the pipeline such that some sort of prediction can be performed, allowing the overlapping of previously mutually exclusive events. The contribution of this paper is the application of the concept of early completion to multi-rail delay-insensitive paradigms, specifically NCL, where the solutions to the specific obstacles of this task are presented, the impact on self-timed operation is analyzed, and a test circuit is simulated to give an analytical measure of the technique's effectiveness.

2. Previous Work

In [8] an Early Done technique was developed to speed up Williams' PS0 pipeline [9], resulting in a 79% increase in throughput when applied to a -bit FIFO. Williams' PS0 pipeline is based on precharge logic and consists of stages composed of a dual-rail function block and a completion detector.
The completion component detects the completion of every functional evaluation and precharge at the output of its corresponding function block. The output of the completion detector for stage i is connected to the precharge/evaluate control input of stage i-1, as shown in Figure 1. The basic cycle for a PS0 pipeline is: T_{PS0} = (3 t_{Eval}) + (2 t_{CD}) + t_{Prech}, assuming that each stage has the same functional evaluation delay (t_{Eval}), precharge delay (t_{Prech}), and completion delay (t_{CD}) [8].

Figure 1. Block diagram of a PS0 pipeline [8]

The Early Done technique modifies the PS0 pipeline by moving the completion detectors in front of their corresponding functional blocks rather than after them. This allows the current stage to signal the previous stage when it is about to evaluate or precharge, instead of after the action has been completed; thus allowing the completion detection signal to be produced in parallel with the precharge or evaluation of its corresponding functional block, instead of after it. This new LP2/2 pipeline requires the completion detectors to be modified such that they require an additional input, the stage's PC control input, as shown in Figure 2. The basic cycle for a LP2/2 pipeline is: T_{LP2/2} = (2 t_{Eval}) + (2 t_{CD}), which is t_{Eval} + t_{Prech} shorter than that of the PS0 pipeline [8]. Furthermore, this throughput optimization does not impact latency, since the forward path is unchanged.

Figure 2. Block diagram of a LP2/2 pipeline [8]

3. Overview of NCL

NCL offers a delay-insensitive logic paradigm where control is inherent with each datum. It follows the so-called weak conditions of Seitz's delay-insensitive signaling scheme [3]. As with other delay-insensitive logic methods, the NCL paradigm assumes that forks in wires are isochronic [10, 11].

3.1 Delay-Insensitivity

NCL uses symbolic completeness of expression [2] to achieve delay-insensitive behavior. A symbolically complete expression is defined as an expression that only depends on the relationships of the symbols present in the expression without a reference to the time of evaluation. In particular, dual-rail signals with three logic states (NULL, DATA0, and DATA1) can be used to rid NCL of the implicit time reference of Boolean circuits and achieve symbolic completeness of expression. A dual-rail signal named Z has two rails denoted Z^0 and Z^1.
The DATA0 state of NCL (Z^0 = 1, Z^1 = 0) corresponds to a Boolean logic 0, the DATA1 state of NCL (Z^0 = 0, Z^1 = 1) corresponds to a Boolean logic 1, and the NULL state of NCL (Z^0 = 0, Z^1 = 0) corresponds to the empty set, meaning that the result is not yet available. The two rails of a dual-rail NCL signal are mutually exclusive, so both rails can never be asserted simultaneously; this state is defined as an illegal state. All NCL systems have at least two register stages, one at both the input and output. These two register stages interact through their request and acknowledge lines, Ki and Ko, respectively, to prevent DATA wavefront i from overwriting DATA wavefront i-1 by ensuring that the two DATA wavefronts are always separated by a NULL wavefront.

3.2 Logic Gates

NCL uses threshold gates for its basic logic gates. The primary type of threshold gate is the THmn gate, where 1 ≤ m ≤ n, as depicted in Figure 3. THmn gates have n inputs. At least m of the n inputs must be asserted before the output will become asserted. Because NCL threshold gates are designed with hysteresis, all asserted inputs must be de-asserted before the output will be de-asserted. Hysteresis ensures a complete transition of inputs back to NULL before asserting the output associated with the next wavefront of input data. In a THmn gate, each of the n inputs is connected to the rounded portion of the gate; the output emanates from the pointed end of the gate; and the gate's threshold value, m, is written inside of the gate. A THnn gate is equivalent to an n-input C-element [12], while a TH1n gate is equivalent to an OR gate.

Figure 3. THmn threshold gate

By employing threshold gates for each logic rail, NCL is able to determine the output status without referencing time. Inputs are partitioned into two separate wavefronts, the NULL wavefront and the DATA wavefront. The NULL wavefront consists of all inputs to a circuit being NULL, while the DATA wavefront refers to all inputs being DATA, some combination of DATA0 and DATA1. Initially all circuit elements are reset to the NULL state.
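The dual-rail encoding and the hysteresis behavior of a THmn gate described above can be illustrated with a small behavioral model. This is only a sketch under stated assumptions: the names `decode`, `THmn`, and `step` are illustrative and do not come from the paper, and the model ignores gate delays entirely.

```python
from dataclasses import dataclass

# Dual-rail states written as (rail0, rail1), following the encoding in the text.
NULL = (0, 0)
DATA0 = (1, 0)
DATA1 = (0, 1)


def decode(rail0: int, rail1: int) -> str:
    """Name the state of a dual-rail signal; (1, 1) is the illegal state."""
    states = {NULL: "NULL", DATA0: "DATA0", DATA1: "DATA1"}
    return states.get((rail0, rail1), "ILLEGAL")


@dataclass
class THmn:
    """Behavioral THmn gate (hypothetical helper, zero-delay model)."""
    m: int        # threshold
    n: int        # number of inputs
    out: int = 0  # gates are assumed to reset to the de-asserted state

    def __post_init__(self):
        if not 1 <= self.m <= self.n:
            raise ValueError("threshold must satisfy 1 <= m <= n")

    def step(self, inputs):
        """Apply one set of input values and return the gate output."""
        if len(inputs) != self.n:
            raise ValueError(f"expected {self.n} inputs")
        asserted = sum(1 for bit in inputs if bit)
        if asserted >= self.m:
            self.out = 1   # at least m inputs asserted
        elif asserted == 0:
            self.out = 0   # all inputs de-asserted
        # otherwise hold the previous output (hysteresis)
        return self.out
```

With `THmn(2, 2)` the model behaves like a 2-input C-element: the output rises only when both inputs are asserted and falls only when both are de-asserted, while `THmn(1, 2)` behaves like an OR gate.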
First, a DATA wavefront is presented to the circuit. Once all of the outputs of the circuit transition to DATA, the NULL wavefront is presented to the circuit. Once all of the outputs of the circuit transition to NULL, the next DATA wavefront is presented to the circuit. This DATA/NULL cycle continues repeatedly. As soon as all outputs of the circuit are DATA, the circuit's result is valid. The NULL wavefront then transitions all of these DATA outputs back to NULL. When they transition back to DATA again, the next output is available. This period is referred to as the DATA-to-DATA cycle time, denoted as T_DD, and has an analogous role to the clock period in a synchronous system.

3.3 Completeness of Input

The completeness of input criterion [2], which NCL combinational circuits must maintain in order to be delay-insensitive, requires that:
1. all the outputs of a combinational circuit may not transition from NULL to DATA until all inputs have transitioned from NULL to DATA, and
2. all the outputs of a combinational circuit may not transition from DATA to NULL until all inputs have transitioned from DATA to NULL.
In circuits with multiple outputs, it is acceptable for some of the outputs to transition without having a complete input set present, as long as all outputs cannot transition before all inputs arrive.

3.4 Observability

There is one more condition that must be met in order for NCL to retain delay-insensitivity. No orphans may propagate through a gate. An orphan is defined as a wire that transitions during the current DATA wavefront, but is not used in the determination of the output. Orphans are caused by wire forks and can be neglected through the isochronic fork assumption, as long as they are not allowed to cross a gate boundary. This observability condition ensures that every gate transition is observable at the output, which means that every gate that transitions is necessary to transition at least one of the outputs.

4. Early Completion Technique

The standard NCL pipeline is shown in Figure 4, where each registration stage consists of multiple single-bit registers, shown in Figure 5, and the gate-level structure of the completion components is shown in Figure 6. The Combinational Circuit is an input-complete, fully observable functional block, designed using Threshold Combinational Reduction, as described in [13]. TD denotes the time when any DATA bits are propagating through the combinational circuit, TN denotes the time when any NULL bits are propagating through the combinational circuit, TRFD is the time associated with the request for DATA generation, and TRFN is the time associated with the request for NULL generation.

As described in [1], the worst-case throughput for an N-stage NCL pipeline is based on the following three equations: TRFD_{i-1} + TD_{i-1} + TD_i + TRFN_i, TRFN_{i-1} + TN_{i-1} + TN_i + TRFD_i, and TRFD_i + TD_i + TRFN_i + TN_i, corresponding to the case of adjacent DATA propagation and request times, the case of adjacent NULL propagation and request times, and the case of NULL and DATA propagation and request times for a single registration stage, respectively. The worst-case cycle time for the entire pipeline is then calculated by finding the maximum of these three equations for every adjacent stage pair in the pipeline, as listed in the following algorithm:

    max_cycle_time = TRFD_1 + TD_1 + TRFN_1 + TN_1
    for (i = 2 to N) loop
        temp_cycle_time = MAX(TRFD_i + TD_i + TRFN_i + TN_i,
                              TRFD_{i-1} + TD_{i-1} + TD_i + TRFN_i,
                              TRFN_{i-1} + TN_{i-1} + TN_i + TRFD_i)
        if (temp_cycle_time > max_cycle_time) then
            max_cycle_time = temp_cycle_time
        end if
    end loop
    worst_case_throughput = 1 / max_cycle_time

Algorithm 1. Calculation of worst-case throughput for an N-stage NCL pipeline [1]

Figure 4. Standard NCL pipeline

Figure 5. Single-bit dual-rail NCL register []

Notice that in the standard NCL pipeline the completion detection is performed at the output of the registration stage.
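The worst-case throughput calculation given above can be sketched as runnable Python. This is a minimal illustration, not the author's implementation: the `StageTimes` container and the function names are hypothetical, stages are assumed to be listed in pipeline order, and all delays are assumed to share one time unit.

```python
from typing import NamedTuple, Sequence


class StageTimes(NamedTuple):
    """Delays for one pipeline stage (illustrative field names)."""
    td: float    # TD: DATA propagation through the combinational circuit
    tn: float    # TN: NULL propagation through the combinational circuit
    trfd: float  # TRFD: request-for-DATA generation
    trfn: float  # TRFN: request-for-NULL generation


def worst_case_cycle_time(stages: Sequence[StageTimes]) -> float:
    """Maximum of the three cycle-time equations over all adjacent stage pairs."""
    if not stages:
        raise ValueError("pipeline must contain at least one stage")
    first = stages[0]
    max_cycle = first.trfd + first.td + first.trfn + first.tn
    # Walk adjacent pairs (stage i-1, stage i), as the loop in the algorithm does.
    for prev, cur in zip(stages, stages[1:]):
        candidate = max(
            cur.trfd + cur.td + cur.trfn + cur.tn,     # single-stage case
            prev.trfd + prev.td + cur.td + cur.trfn,   # adjacent DATA case
            prev.trfn + prev.tn + cur.tn + cur.trfd,   # adjacent NULL case
        )
        max_cycle = max(max_cycle, candidate)
    return max_cycle


def worst_case_throughput(stages: Sequence[StageTimes]) -> float:
    """Reciprocal of the worst-case cycle time."""
    return 1.0 / worst_case_cycle_time(stages)
```

For example, with two stages whose delays are made up purely for illustration, `worst_case_cycle_time([StageTimes(2, 2, 1, 1), StageTimes(4, 3, 1, 2)])` evaluates the first stage alone (6) and then the three pair equations (10, 9, and 7), so the result is 10 time units.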
The inputs to completion component i are the outputs from registration stage i. On the other hand, a NCL pipeline utilizing Early Completion, shown in Figure 7, instead uses the inputs to registration stage i as the inputs to completion component i; however this necessitates the completion component for Early Completion to require an additional input, the completion signal from stage i+1, Ko_{i+1}, in order to maintain self-timed operation. The registration stage is also slightly modified, by removing the inverting TH12 gate for each single-bit register, since its output is no longer required.

Figure 6. N-bit completion component

Figure 7. NCL pipeline with Early Completion

Early Completion allows the completion evaluation for stage i to begin before all bits have propagated through combinational circuit i and have been latched by registration stage i. Therefore, TRFN_E_i overlaps with TD_E_i and TRFD_E_i overlaps with TN_E_i such that TRFN_E_i + TD_E_i < TRFN_i + TD_i and TRFD_E_i + TN_E_i < TRFD_i + TN_i. However, TRFD_{i-1} + TD_{i-1} and TRFN_{i-1} + TN_{i-1} still do not overlap, such that TRFD_E_{i-1} + TD_E_{i-1} ≈ TRFD_{i-1} + TD_{i-1} and TRFN_E_{i-1} + TN_E_{i-1} ≈ TRFN_{i-1} + TN_{i-1}. By examining the equations for determining the worst-case throughput for an N-stage NCL pipeline, TRFD_i + TD_i + TRFN_i + TN_i, TRFD_{i-1} + TD_{i-1} + TD_i + TRFN_i, and TRFN_{i-1} + TN_{i-1} + TN_i + TRFD_i, it can be seen that each equation contains at least one of the previously noted sums: TRFN_i + TD_i and TRFD_i + TN_i. Therefore, the cycle time for a NCL pipeline using Early Completion must be less than one using standard completion since TRFD_E_{i-1} + TD_E_{i-1} ≈ TRFD_{i-1} + TD_{i-1}, TRFN_E_{i-1} + TN_E_{i-1} ≈ TRFN_{i-1} + TN_{i-1}, TRFN_E_i + TD_E_i < TRFN_i + TD_i, and TRFD_E_i + TN_E_i < TRFD_i + TN_i, as previously shown. Furthermore, Early Completion does not impact latency, since the forward path is unchanged.

The completion component for stages 1 through M-1 for an M-stage NCL pipeline utilizing Early Completion is shown in Figure 8, where the THcomp gate has the following functionality: (A + B) • (C + D) [13].
This completion component is for a datapath of N bits, where N is an odd number. For a datapath with an even number of bits, the TH12 gate would not be required, so there would only be N/2 intermediate signals. Also, note that the final gate, the inverting TH22 gate, can be incorporated into the tree structure of TH gates, depending on the width of the datapath, to reduce the number of logic levels by one. This component requests DATA when all inputs to register i are NULL and the request from register i+1, Ko_{i+1}, is requesting NULL (rfn). It requests NULL when all inputs to register i are DATA and Ko_{i+1} is requesting DATA (rfd).

The completion component for the final stage, stage M, is slightly different. The inverting TH22 gate is no longer inverted; instead the final gate of the tree structure of TH gates is inverted. This causes the component to request DATA when the input to register M is NULL and the external request input line, Ki, is rfd; and to request NULL when the input to register M is DATA and Ki is rfn. This variation in the completion component for the last stage is required since Ki changes to rfn as soon as the output is DATA, and Ki changes to rfd as soon as the output is NULL, to simulate an infinitely fast external interface. An alternative is to use the standard completion component, shown in Figure 6, for the last stage. However, this latter approach produces a system with reduced throughput compared to that when the modified Early Completion component is used for stage M.

Figure 8. Completion component for Early Completion
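The request behavior of the Early Completion component for stages 1 through M-1 can be summarized with a short behavioral model. This is a sketch under assumptions that are not taken from the paper: rfd is encoded as 1 and rfn as 0, the class and method names are hypothetical, and the gate-level structure of Figure 8 is abstracted into two conditions plus a hold state.

```python
RFN = 0  # request for NULL (encoding assumed for illustration)
RFD = 1  # request for DATA (encoding assumed for illustration)

NULL = (0, 0)
DATA_STATES = ((1, 0), (0, 1))  # DATA0 and DATA1 as (rail0, rail1)


class EarlyCompletion:
    """Zero-delay model of completion component i with the extra Ko_{i+1} input."""

    def __init__(self, reset_to=RFD):
        # The text resets the component to rfd when register i is reset to NULL.
        self.ko = reset_to

    def step(self, register_inputs, ko_next):
        """Update Ko_i from the inputs of register i and the request Ko_{i+1}."""
        all_null = all(sig == NULL for sig in register_inputs)
        all_data = all(sig in DATA_STATES for sig in register_inputs)
        if all_null and ko_next == RFN:
            self.ko = RFD   # inputs are NULL and the next stage requests NULL
        elif all_data and ko_next == RFD:
            self.ko = RFN   # inputs are DATA and the next stage requests DATA
        # any other combination holds the previous request (hysteresis)
        return self.ko
```

The hold branch is the point of the model: a partially arrived wavefront, or a complete wavefront that the next stage is not yet ready for, leaves the request unchanged.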

5. Evaluation of Self-Timed Operation

Standard NCL systems are self-timed, assuming that wire forks are isochronic. However, the application of Early Completion changes the fundamental structure of the NCL handshaking system, thus necessitating the self-timed issue to be revisited. In the most delay-sensitive case, Ko_{i+1} and Ko_{i+2} are both rfd and all bits at the input of register i change to DATA within a very short period of time. The DATA wavefront at the input of register i would flow through register i, followed by combinational logic block i+1, and finally completion component i+1, in order to transition Ko_{i+1} to rfn. Simultaneously, the DATA wavefront at the input of register i would flow through completion component i in order to transition Ko_i to rfn. Therefore, in order for the system to function incorrectly, the DATA wavefront would have to travel through a set of TH22 gates (register i), combinational logic block i+1, and completion component i+1, before the same signal traveled through only completion component i. These paths are shown in boldface in Figure 9. Since the first path is normally much longer, the delay is well known and the system remains self-timed, through the assumption of equipotential regions [3]. This same argument can be made for the NULL wavefront, by replacing DATA with NULL, rfn with rfd, and rfd with rfn, yielding the same result. For the special case of a FIFO, the combinational logic delay would be zero, but the delay through completion component i and completion component i+1 would be identical, so the above argument would still hold. For the generalized case, completion component i and completion component i+1 normally have about the same delay, within one or two gate delays, such that the above analysis holds true.

The other delay-sensitive scenario introduced by Early Completion is when Ko_{i+1} changes to rfd when all inputs to register i are already DATA and all inputs to register i-1 are NULL.
In this case the rfd has to pass through one gate (at the least an inverting TH22 gate) in order to transition Ko_i to rfn. Once Ko_i is rfn, the NULL wavefront at the input of register i-1 can flow through the register's TH22 gates and overwrite the previous DATA wavefront at the input of register i. Simultaneously, the DATA wavefront at the input of register i has to pass through only one TH22 gate to be latched at the output of register i. Therefore, in order for the system to function incorrectly, a signal would have to travel through both an inverting and non-inverting TH22 gate before the same signal travels through only a single TH22 gate. Since the path through the two gates is obviously longer than the path through a single gate, the delays are well known and the system remains self-timed, through the assumption of equipotential regions [3]. This same argument can be made for the NULL wavefront by replacing DATA with NULL, rfn with rfd, and rfd with rfn, yielding the same result. Note that this example assumes that there is no combinational logic delay, as would be the case in a FIFO. For the generalized case the delay-sensitivity would be even less, since the path through an inverting TH22 gate, a TH22 gate, and combinational logic would have to be faster than the path through a single TH22 gate, as depicted in boldface in Figure 10, in order to adversely affect self-timed operation.

Figure 9. Delay-sensitive scenario #1

Figure 10. Delay-sensitive scenario #2

6. Initialization

In standard NCL, the system is initialized using a global reset to set the output of each register to either DATA or NULL. Since each completion component only uses the outputs of its corresponding register as inputs, its output will become initialized in constant time after the global reset is applied, without requiring any reset circuitry itself. However, if this same initialization procedure was used with Early Completion, the reset time would be O(N), where N is the number of stages in the pipeline, since the request would have to trickle through each completion component in turn: the early completion component for stage i not only uses the inputs of register i as its inputs, but also requires the request output from stage i+1 as an input. To remedy this situation, the reset signal is also applied to the final gate of each early completion component such that completion component i is reset to rfd when register i is reset to NULL, or to rfn when register i is reset to DATA. This revised initialization strategy retains the constant time initialization of standard NCL, and is actually faster than standard NCL initialization. However, more reset circuitry is required, which could also be applied to standard NCL to attain the same reduced initialization time.

7. Results

This paper does not use a FIFO as an example system, since bit-wise completion [1] would obviously outperform any other completion strategy for an NCL FIFO. Instead, the optimally pipelined 4-bit by 4-bit unsigned multiplier utilizing full-word completion, presented in [1], was chosen as the case study. A functional block diagram of the multiplier is shown in Figure 11, where the blocks are incomplete AND functions [], complete AND functions [], half-adders [], full-adders [], completion components (COMP), as shown in Figure 6, and GEN_S7, a specialized component that produces the most significant bit of the result [].
To assess the effectiveness of the Early Completion technique, both the multiplier utilizing standard completion and the multiplier utilizing Early Completion were simulated using Mentor Graphics, a commercial design tool. The Mentor Graphics technology library is based on Spice simulations of static .5 µm CMOS gates, operating at 3.3V. The two systems were exhaustively tested and their average throughput calculated. The throughput for the multiplier using the standard completion technique was determined to be.9 ns - [1], while the application of Early Completion produced a throughput of.56 ns -, resulting in a speedup of..

8. Conclusions

The technique of Early Completion that moves the completion detection for registration stage i from the output of the register to its input can significantly increase throughput of self-timed systems without increasing latency. An NCL 4-bit by 4-bit multiplier case study indicates a speedup of. over the design utilizing standard completion. Furthermore, the technique could be applied to other self-timed paradigms [3, 4, 5, 6, and 7] as well, since they use the same handshaking scheme, with only different combinational logic. However, since these other self-timed paradigms do not support the multitude of gates supported by NCL, each THcomp gate of the early completion component in Figure 8 would have to be replaced by two TH12 gates, causing there to be N intermediate signals instead of only N/2 as for NCL, thus necessitating more gates and logic levels, and therefore reducing throughput for the other self-timed paradigms.

References

[1] S. C. Smith, R. F. DeMara, J. S. Yuan, M. Hagedorn, and D. Ferguson, "Delay-Insensitive Gate-Level Pipelining," Integration, The VLSI Journal, Vol. 3/, pp. 3-3, .
[2] Karl M. Fant and Scott A. Brandt, "NULL Convention Logic: A Complete and Consistent Logic for Asynchronous Digital Circuit Synthesis," International Conference on Application Specific Systems, Architectures, and Processors, pp. 6-73, 996.
[3] C. L. Seitz, "System Timing," in Introduction to VLSI Systems, Addison-Wesley, pp. 8-6, 98.
[4] N. P. Singh, "A Design Methodology for Self-Timed Systems," Master's Thesis, MIT/LCS/TR-58, Laboratory for Computer Science, MIT, 98.
[5] T. S. Anantharaman, "A Delay Insensitive Regular Expression Recognizer," IEEE VLSI Technology Bulletin, Sept
[6] Ilana David, Ran Ginosar, and Michael Yoeli, "An Efficient Implementation of Boolean Functions as Self-Timed Circuits," IEEE Transactions on Computers, Vol., No., pp. -, 99.
[7] J. Sparso, J. Staunstrup, and M. Dantzer-Sorensen, "Design of Delay Insensitive Circuits using Multi-Ring Structures," Proceedings of the European Design Automation Conference, pp. 5-, 99.
[8] M. Singh and S. M. Nowick, "High-Throughput Asynchronous Pipelines for Fine-Grain Dynamic Datapaths," Proceedings of the Sixth International Symposium on Advanced Research in Asynchronous Circuits and Systems, pp. 98-9, .
[9] T. E. Williams, "Self-Timed Rings and Their Application to Division," Ph.D. Thesis, CSL-TR-9-8, Department of Electrical Engineering and Computer Science, Stanford University, 99.
[10] A. J. Martin, "Programming in VLSI," in Development in Concurrency and Communication, Addison-Wesley, pp. -6, 99.
[11] K. Van Berkel, "Beware the Isochronic Fork," Integration, The VLSI Journal, Vol. 3, No., pp. 3-8, 99.
[12] D. E. Muller, "Asynchronous Logics and Application to Information Processing," in Switching Theory in Space Technology, Stanford University Press, pp , 963.
[13] S. C. Smith, "Gate and Throughput Optimizations for NULL Convention Self-Timed Digital Circuits," Ph.D. Dissertation, School of Electrical Engineering and Computer Science, University of Central Florida, .

Figure 11. Optimally pipelined multiplier using full-word completion


More information

SIGNED PIPELINED MULTIPLIER USING HIGH SPEED COMPRESSORS

SIGNED PIPELINED MULTIPLIER USING HIGH SPEED COMPRESSORS INTERNATIONAL JOURNAL OF RESEARCH IN COMPUTER APPLICATIONS AND ROBOTICS ISSN 2320-7345 SIGNED PIPELINED MULTIPLIER USING HIGH SPEED COMPRESSORS 1 T.Thomas Leonid, 2 M.Mary Grace Neela, and 3 Jose Anand

More information

Outline. EECS Components and Design Techniques for Digital Systems. Lec 12 - Timing. General Model of Synchronous Circuit

Outline. EECS Components and Design Techniques for Digital Systems. Lec 12 - Timing. General Model of Synchronous Circuit Outline EES 5 - omponents and esign Techniques for igital Systems Lec 2 - Timing avid uller Electrical Engineering and omputer Sciences University of alifornia, erkeley Performance Limits of Synchronous

More information

Low Power CMOS Re-programmable Pulse Generator for UWB Systems

Low Power CMOS Re-programmable Pulse Generator for UWB Systems Low Power CMOS Re-programmable Pulse Generator for UWB Systems Kevin Marsden 1, Hyung-Jin Lee 1, ong Sam Ha 1, and Hyung-Soo Lee 2 1 VTVT (Virginia Tech VLSI for Telecommunications) Lab epartment of Electrical

More information

Integrated Circuits & Systems

Integrated Circuits & Systems Federal University of Santa atarina enter for Technology omputer Science & Electronics Engineering Integrated ircuits & Systems INE 5442 Lecture 16 MOS ombinational ircuits - 2 guntzel@inf.ufsc.br Pass

More information
