AN ENERGY-EFFICIENT LEAKAGE-TOLERANT DYNAMIC CIRCUIT TECHNIQUE

Lei Wang, Ram K. Krishnamurthy†, K. Soumyanath†, and Naresh R. Shanbhag

Coordinated Science Laboratory, Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801.
† Microprocessor Research Laboratories, Intel Corporation, Hillsboro, OR 97124.

ABSTRACT

Technology scaling reduces device threshold voltages to mitigate speed loss due to scaled supply voltages. This, however, exponentially increases leakage power and adversely affects circuit reliability. In this paper, we investigate the performance degradation in high-leakage digital circuits. It is shown that deep submicron CMOS technologies lead to 60%-70% degradation in noise-immunity due to leakage. Dual-Vt domino designs mitigate the noise-immunity degradation to 30%-40% but inevitably lead to a loss of 20%-30% in circuit speed. To achieve a better noise-immunity vs. performance trade-off, a new dynamic circuit technique - the boosted-source (BS) technique - is proposed. Simulation results of wide fan-in gates designed in the Predictive Berkeley BSIM3v3 0.13 µm technology [1] demonstrate 1.6X-3X improvement in noise-immunity at the expense of marginal energy overhead but no loss in delay, as compared with the existing circuit techniques.

I. INTRODUCTION

Scaling of CMOS technology has made it possible to significantly improve the performance of increasingly complex VLSI systems at an affordable cost. However, with feature sizes being reduced towards the 0.1-0.05 µm generations, noise-immunity will become difficult to achieve due to high-leakage transistors, large threshold variations, low supply voltages, high clock frequencies, and the presence of ground bounce, IR drops, crosstalk and clock jitter [2].
This is compounded further by aggressive design practices such as dynamic, low-power, and high-speed circuit styles, making deep submicron (DSM) noise [3]-[5] the primary cause of a reliability problem that may ultimately determine the performance achievable in future ASICs.

Low-power design techniques are clearly needed at various levels of design abstraction, from process to algorithm [6]-[8]. A widely used low-power technique is supply voltage scaling, which provides a linear reduction in static power dissipation and a quadratic reduction in capacitive power dissipation. With the scaling of supply voltage, the transistor threshold voltage Vt needs to be scaled properly to offset the undesired speed loss [9]. Unfortunately, this design practice not only exponentially increases the leakage power but also deteriorates the noise-immunity. Furthermore, given the trend that leakage power increases by a factor of 5X with each technology generation and will become a significant portion of the total power in future ICs [10], active leakage control becomes critical to deep submicron VLSI systems. Many techniques [11]-[13] have been developed so far to reduce leakage power; however, not much work has been done in addressing leakage reduction in the presence of DSM noise. In other words, energy-efficiency and reliability issues have not been studied together.

This research was supported in part by Intel Corporation, National Science Foundation grant CCR-0000987 and Semiconductor Research Corporation.

In this paper, we investigate the leakage-induced reliability degradation in deep submicron CMOS technologies. A new energy-efficient, noise-tolerant dynamic circuit technique is proposed for designing high-performance VLSI systems. The paper is organized as follows. In Section II, we analyze the reliability degradation due to leakage in two ~0.1 µm CMOS technologies.
Two performance metrics, unity noise gain (UNG) and 4-stage delay, are proposed to quantify the noise-immunity and speed, respectively. In Section III, a new energy-efficient, noise-tolerant dynamic circuit technique - the boosted-source (BS) technique - is proposed. Simulation results on the performance of wide fan-in gates are presented and evaluated in Section IV.

II. CHARACTERIZATION OF LEAKAGE-INDUCED RELIABILITY DEGRADATION

In this section, we investigate the noise-immunity degradation in high-leakage digital circuits designed in two ~0.1 µm CMOS technologies. We also propose the unity noise gain (UNG) and 4-stage delay as metrics to quantitatively describe the noise-immunity and speed, respectively, of different circuit techniques.

0-7803-6598-4/00/$10.00 © 2000 IEEE

Figure 1: Wide fan-in domino gates: (a) d1 domino and (b) d2 domino.

A. Noise Characterization

We are primarily concerned with wide fan-in domino gates, which are prone to leakage-induced noise. Fig. 1 depicts two domino topologies of wide fan-in OR gates, where d1 domino denotes the conventional domino gate with a foot-switch NMOS transistor and d2 domino denotes the gate without the foot-switch NMOS transistor [10]. A d2 domino gate is faster than a d1 domino gate of the same design; however, the input signals of a d2 domino gate must remain at 0 during the precharge phase to prevent DC conduction between power supply and ground.

To compare the circuit robustness under DSM disturbances, we inject identical noise pulses into all the gate inputs A1-An during the evaluate phase and measure the resulting voltage waveforms at the dynamic node VD and the output Vout. The input noise stimulus (see Fig. 2(a)) consists of a DC offset VDC (to account for possible IR drops) and a scalable pulse Vpulse, i.e.,

$V_{noise} = V_{DC} + V_{pulse}$, (1)

where the shape of Vpulse closely mimics real noise pulses due to glitches, crosstalk, ground bounce, etc. Fig. 2(b)-(c) illustrate typical waveforms of VD and Vout with the input noise present.

To quantify the noise-immunity, we propose the metric of unity noise gain (UNG), which is defined as the amplitude of input noise Vnoise that causes an equal-amplitude noise pulse at Vout, i.e.,

$\mathrm{UNG} = \{V_{noise} : V_{noise} = V_{out}\}$. (2)

UNG captures the critical input noise strength, as any noise pulse larger than UNG will be amplified due to the nonlinear transfer function of the transistor. While the UNG measure is easy to obtain, real DSM scenarios are more complicated, as the duration of DSM noise also needs to be accounted for. In such cases a more comprehensive noise-immunity metric such as the one proposed in [14] can be adopted. In this paper, however, we only consider the noise amplitude for the sake of simplicity.
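The UNG definition in (2) amounts to finding the input noise amplitude at which the gate's noise transfer curve crosses the unity-gain line. The following Python fragment is a minimal sketch of that search, not the procedure used in the paper: the function `output_noise_amplitude` is a hypothetical, logistic-shaped stand-in for a circuit simulation, and its parameters (`v_switch`, `gain`, voltages normalized to a supply of 1.0) are illustrative assumptions only.

```python
import math


def output_noise_amplitude(v_in, v_switch=0.4, gain=10.0):
    """Hypothetical noise transfer curve (assumption, not from the paper).

    Maps an input noise amplitude to an output noise amplitude, both
    normalized to the supply voltage. Small inputs are attenuated and
    larger inputs are amplified, as for a nonlinear gate.
    """
    raw = 1.0 / (1.0 + math.exp(-gain * (v_in - v_switch)))
    offset = 1.0 / (1.0 + math.exp(gain * v_switch))
    return raw - offset  # zero output noise for zero input noise


def unity_noise_gain(transfer, v_max=1.0, steps=1000, tol=1e-9):
    """Smallest positive amplitude v with transfer(v) == v.

    Scans upward for the first point where the output amplitude reaches
    the input amplitude, then refines the crossing by bisection.
    """
    prev = v_max / steps
    if transfer(prev) >= prev:
        raise ValueError("gate amplifies even the smallest scanned amplitude")
    for i in range(2, steps + 1):
        v = v_max * i / steps
        if transfer(v) >= v:
            lo, hi = prev, v  # transfer(lo) < lo and transfer(hi) >= hi
            while hi - lo > tol:
                mid = 0.5 * (lo + hi)
                if transfer(mid) >= mid:
                    hi = mid
                else:
                    lo = mid
            return 0.5 * (lo + hi)
        prev = v
    raise ValueError("no unity-gain crossing found below v_max")


ung = unity_noise_gain(output_noise_amplitude)
print(f"UNG of the hypothetical gate: {ung:.4f} (normalized to Vdd)")
```

In practice the transfer curve would come from transient simulations of the gate with the stimulus of (1); the search itself is unchanged.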
In addition to noise-immunity, we are also interested in the delay reduction achievable in deep submicron technologies. For this purpose, we simulate five serially connected identical OR gates and measure the worst-case 50%-delay of the first four gates, termed the 4-stage delay (see Fig. 3). This accounts for the fan-in (input) capacitance associated with the circuit style being employed.

Figure 2: Noise characterization: (a) input noise waveforms, (b) dynamic node waveforms and (c) output waveforms.

Figure 3: 4-stage delay.

B. Performance Comparison and Problem Statement

We have designed representative 4-wide, 8-wide and 16-wide OR gates in two ~0.1 µm technologies, termed T-1 and T-2, where T-1 is a single-threshold technology and T-2 is a scaled dual-threshold technology with smaller threshold voltages. Because of this, the T-2 technology induces a higher leakage current, e.g., the worst-case leakage currents (measured at room temperature) of the low-Vt and high-Vt transistors are 25X and 6X larger, respectively, than that of the transistors in T-1 technology of the same design. To investigate the degradation in noise-immunity, two design schemes have been applied to the gates in T-2 technology: 1) single-Vt implementation, where all the transistors are low-Vt, and 2) dual-Vt implementation, where the pull-down NMOS transistors are replaced by high-Vt devices for the purpose of reducing leakage current. All the pull-down NMOS transistors in these OR gates have the same width, which is determined by the specification on fan-in (input) capacitance.

Fig. 4 shows the results of UNG vs. 4-stage delay, both normalized by the corresponding baseline T-1 technology values. As indicated, single-Vt d2 domino gates in T-2 technology achieve about 2X delay reduction over those in T-1 technology. However, the leakage problem becomes severe as the scaled Vt makes transistors more susceptible to DSM noise, resulting in 60%-70% degradation in UNG. Dual-Vt d2 domino gates mitigate the UNG degradation to 30%-40% as compared with the T-1 technology; however, they also lead to a 20% speed loss over the single-Vt d2 domino gates. Within the same technology, 16-wide gates are found to be slower and less robust than 4-wide gates due to the larger parasitic capacitance and stronger leakage path. Moreover, the 16-wide d1 domino and d2 domino gates in T-2 technology with single (low) Vt are non-functional, meaning that even a small DC offset VDC (around 100 mV) at the inputs causes the final output to switch erroneously.

Figure 4: Noise-immunity vs. speed for two ~0.1 µm technologies.

A possible means to further improve noise-immunity is to use d1 domino instead of d2 domino, as the stacked foot-switch NMOS transistor can reduce leakage current. This approach, however, incurs a speed penalty because of the reduced pull-down strength. For example, dual-Vt d1 domino gates lead to a further 10% UNG improvement but with a 30% speed loss as compared with dual-Vt d2 domino gates. Therefore, design techniques that have a better noise-immunity vs. speed trade-off than that of dual-Vt domino are needed.

III. THE BOOSTED-SOURCE TECHNIQUE

Noise-immunity degradation due to high leakage makes robust performance difficult for low-power digital circuits, especially wide fan-in domino gates. In this section, we present a new noise-tolerant dynamic circuit technique - the boosted-source (BS) technique - which achieves significant improvement in reliability without incurring large design overheads. Fig. 5 shows the circuit schematic of a d1-compatible wide fan-in gate employing the proposed BS technique.

Figure 5: Circuit diagram of the boosted-source technique (output inverters are not shown).
A sense amplifier (SA) is utilized to generate two full-swing, complementary outputs. The gate works as follows. During the precharge phase when CLK = 0, dynamic node A and both outputs (Vout and its complement) are charged up to Vdd, whereas node C is discharged. The voltage level of node B depends upon the inputs.

In case 1 (see Fig. 6(a)), some of the inputs A1-An are low. Thus, node B is also charged up to Vdd. During the evaluate phase when CLK = 1, nodes A and B will be pulled down due to charge redistribution with the dummy capacitor at node C. Meanwhile, both outputs will be momentarily discharged. However, by properly skewing the pull-down strengths of Path1 and Path2, one output will be fully discharged while its complement returns to Vdd. Nodes A, B and C will converge to an intermediate voltage level due to charge-sharing. Note that this is the highest voltage level that node B can achieve at the end of each evaluate phase.

In case 2 (see Fig. 6(b)), all of the inputs A1-An are high. Thus, nodes A and B will be at Vdd and an intermediate voltage level, respectively. This voltage difference makes Path1 slower than Path2. After CLK turns to 1, one output will be discharged while its complement stays at Vdd. Node B will converge to a lower voltage level due to charge-sharing with node C. Note that in both cases the small glitch at the non-switching output can be reduced by the output inverter.

In comparison with the existing circuit techniques [14], [15], the proposed BS technique has the following features:

The BS technique significantly improves the noise-immunity. Clearly, noise pulses may impair the outputs of a BS gate when all the inputs are high during the precharge phase and at the beginning of the evaluate phase when the SA starts latching. However, the noise impact is greatly reduced due to the body-effect and low mobility of the pull-up PMOS transistors.
In addition, during most of the evaluate phase, noise will only cause charge-sharing between nodes A, B and C, but will not affect the outputs due to the latching nature of the SA. Note that conventional domino gates are not noise-tolerant, even if they are followed by a latch, as the latch will capture a wrong value at the end of the evaluate phase if an error occurs.

The delay of a BS gate is determined by the speed of the SA. For wide fan-in gates this implies a speed benefit due to the relief of discharging large drain capacitance and parasitic capacitance at dynamic nodes. Moreover, the BS technique does not increase the fan-in (input) capacitance. The "pull-up" PMOS transistors can be designed with the same fan-in (input) capacitance as that of the pull-down NMOS transistors in conventional domino without affecting the gate delay. This allows easier interface to other circuits.

Due to partial voltage swing at nodes A, B and C, dynamic power dissipation is reduced and the extra power dissipation due to the SA can be offset. As the fan-in increases, drain capacitance and parasitic capacitance at dynamic nodes also increase, and therefore the power reduction due to partial voltage swing will become significant.

Figure 6: Operating waveforms of a BS gate when the inputs are (a) not all high and (b) all high.

A number of design issues regarding the BS technique need to be addressed. First, it is necessary to determine the value of the capacitance at node C. A small capacitance reduces the voltage drop at node B and therefore may not be able to skew the discharging speed when all the inputs are high. On the other hand, a large capacitance wastes power. From the simulations we found that this capacitance should be around 30%-50% of the total capacitance at nodes A and B. Thus, a dummy capacitor might be needed, and this will consume additional layout area.

Also, the BS gate shown in Fig. 5 is d1-compatible and allows high-to-low input switching during the precharge phase. Note that d1-compatible gates are desired for some applications such as wide fan-in address decoders in memory design, as d2 domino gates waste power in predischarging large input (bit-line) loads. It is possible to change the circuit configuration in Fig. 5 for designing d2-compatible gates. In this case the foot-switch NMOS transistor N1 and the dummy capacitor at node C are no longer needed. This leads to further energy savings. However, the clock signal of the SA must be delayed properly with respect to CLK to wait for stable inputs. This delayed clock signal can be generated locally from CLK, but it may increase the gate delay.

Finally, we need to point out that the BS technique increases the clock load and thus an upsized (local) clock driver is needed. While this leads to extra power dissipation, the simulation results in the next section demonstrate that the power reduction due to low voltage swing is dominant for wide fan-in gates. It must be mentioned that although in this paper we are primarily concerned with wide fan-in gates, the proposed BS technique is equally applicable to narrow fan-in gates and other logic gates which will become leakage-prone in future deep submicron technologies.

IV. IMPLEMENTATION AND RESULTS

Simulation results of 8-wide, 16-wide and 32-wide gates designed in the Predictive Berkeley BSIM3v3 0.13 µm CMOS technology [1] are presented in this section. Performance in terms of delay, power dissipation and noise-immunity is compared with the conventional domino gates (shown in Fig. 1(a)). All the gates are designed with the same speed specification at a given output load. The "pull-up" PMOS transistors in BS gates are designed with the same fan-in (input) capacitance as that of the pull-down NMOS transistors in domino gates.

Fig. 7(a) shows the energy dissipation of 8-wide, 16-wide and 32-wide BS gates, normalized by the corresponding measures of the domino gates. Since we are only concerned with the performance of the gate, and the energy consumed by the output inverter and the load is almost the same for different techniques, these are not included in the comparison. Simulation results indicate that the energy dissipation of the 32-wide BS gate is comparable to that of the 32-wide domino gate. This is because the power reduction due to the low-swing scheme of the BS technique becomes dominant as the fan-in goes up. Therefore, the BS technique is a better choice for wide fan-in gates, which as shown in Fig. 4 are very prone to leakage-induced noise.
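The partial voltage swing discussed in Section III follows from charge redistribution between nodes A, B and C. The Python sketch below illustrates, under idealized assumptions that are not taken from the paper (lossless charge conservation among lumped capacitors, no charge supplied or removed by the SA during sharing, voltages normalized to Vdd), the level the nodes would settle to in case 1 when the capacitance at node C is 30%-50% of the total capacitance at nodes A and B. The function names are hypothetical.

```python
def shared_voltage(nodes):
    """Common voltage after ideal charge sharing.

    nodes: sequence of (capacitance, initial_voltage) pairs. Total charge
    is conserved and all nodes end at the same voltage.
    """
    nodes = list(nodes)
    total_charge = sum(c * v for c, v in nodes)
    total_cap = sum(c for c, _ in nodes)
    return total_charge / total_cap


def case1_final_voltage(vdd, c_ab, ratio):
    """Case 1 (inputs not all high): nodes A and B start at vdd, node C at 0.

    c_ab is the total capacitance at nodes A and B; ratio is C_C / c_ab.
    """
    return shared_voltage([(c_ab, vdd), (ratio * c_ab, 0.0)])


for ratio in (0.3, 0.4, 0.5):
    level = case1_final_voltage(1.0, 1.0, ratio)
    print(f"C_C = {ratio:.0%} of C_A+C_B -> nodes settle near {level:.2f} * Vdd")
```

Under these assumptions the settled level is Vdd / (1 + ratio), so a larger dummy capacitor lowers the level the nodes converge to. A real gate would deviate from this because of the SA currents and leakage.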
As mentioned before, noise pulses may impair the outputs of a BS gate when all the inputs are high during the precharge phase and at the beginning of the evaluate phase when the SA starts latching. We denote this period as the noise effective time. In the simulations we observed that if noise pulses appear after the PMOS transistor P1 (see Fig. 5) has been turned on, they will no longer affect the operation of the SA, as the SA already has enough strength to converge towards the correct direction (i.e., one output at "1" and its complement at "0"). This is about 30% of the total evaluate phase. As the UNG metric defined in (2) cannot be applied directly to BS gates, we compare the noise-immunity in terms of the amplitude of noise pulses that will make the output erroneous, normalized by the corresponding effective time.

Fig. 7(b) shows the noise-immunity of 8-wide, 16-wide and 32-wide BS gates, normalized by the corresponding measures of the domino gates. The results indicate that the BS technique achieves 1.6X-3X improvement in noise-immunity, and the improvement is significant for wide fan-in gates. This is mainly due to the body-effect and low mobility of the "pull-up" PMOS transistors. Also shown in Fig. 7(b) is that the noise-immunity of conventional domino gates degrades at a higher rate with increase in fan-in than that of the BS gates. Note that in order to get a more accurate noise-immunity measure, we need a complete noise model, which is currently an active research topic for DSM technologies.

Figure 7: Performance of wide fan-in BS gates: (a) energy dissipation and (b) noise-immunity.

V. CONCLUSIONS

We have investigated the noise-immunity degradation due to high leakage in deep submicron CMOS technologies. A new energy-efficient, noise-tolerant dynamic circuit technique has been proposed. Simulation results demonstrate a significant improvement in reliability without incurring large design overheads. Future work is being directed towards applying the proposed technique in general circuit design.

VI. REFERENCES

[1] Predictive Technology Model, URL: http://www-device.eecs.berkeley.edu/~ptm/.
[2] The International Technology Roadmap for Semiconductors: 1999 Edition, URL: http://www.itrs.net/1999_SIA_Roadmap/Home.htm.
[3] K. L. Shepard and V. Narayanan, "Noise in deep submicron digital design," ICCAD '96, pp. 524-531, 1996.
[4] P. Larsson and C. Svensson, "Noise in digital dynamic CMOS circuits," IEEE J. Solid-State Circuits, vol. 29, pp. 655-662, June 1994.
[5] K. Soumyanath et al., "Accurate on-chip interconnect evaluation: a time-domain approach," IEEE J. Solid-State Circuits, vol. 34, pp. 623-631, May 1999.
[6] A. P. Chandrakasan and R. W. Brodersen, "Minimizing power consumption in digital CMOS circuits," Proceedings of the IEEE, vol. 83, pp. 498-523, April 1995.
[7] R. X. Gu and M. I. Elmasry, "Power dissipation analysis and optimization of deep submicron CMOS digital circuits," IEEE J. Solid-State Circuits, vol. 31, pp. 707-713, May 1996.
[8] N. R. Shanbhag, "A mathematical basis for power-reduction in digital VLSI systems," IEEE Trans. Circuits Syst. II, vol. 44, pp. 935-951, Nov. 1997.
[9] R. Gonzalez, B. M. Gordon, and M. A. Horowitz, "Supply and threshold voltage scaling for low power CMOS," IEEE J. Solid-State Circuits, vol. 32, pp. 1210-1216, August 1997.
[10] V. De and S. Borkar, "Technology and design challenges for low power and high performance," Proc. of Intl. Symp. on Low-Power Electronics and Design, pp. 163-168, San Diego, CA, August 1999.
[11] S. Mutoh et al., "1-V power supply high-speed digital circuit technology with multithreshold-voltage CMOS," IEEE J. Solid-State Circuits, vol. 30, pp. 847-854, August 1995.
[12] J. P. Halter and F. A. Najm, "A gate-level leakage power reduction method for ultra-low-power CMOS circuits," CICC '97, pp. 475-478, 1997.
[13] Y. Ye, S. Borkar, and V. De, "A new technique for standby leakage reduction in high-performance circuits," Symp. VLSI Circuits, pp. 40-41, 1998.
[14] L. Wang and N. R. Shanbhag, "An energy-efficient noise-tolerant dynamic circuit technique," IEEE Trans. Circuits Syst. II, to be published.
[15] R. H. Krambeck, C. M. Lee, and H.-F. S. Law, "High-speed compact circuits with CMOS," IEEE J. Solid-State Circuits, vol. 17, pp. 614-619, June 1982.