Sizing and Placement of Charge Recycling Transistors in MTCMOS Circuits

Ehsan Pakbaznia, Dept. of Electrical Engineering, University of Southern California, Los Angeles, U.S.A., pakbazni@usc.edu
Farzan Fallah, Fujitsu Labs of America, Sunnyvale, U.S.A., farzan@fla.fujitsu.com
Massoud Pedram, Dept. of Electrical Engineering, University of Southern California, Los Angeles, U.S.A., pedram@usc.edu

ABSTRACT

A downside of using the Multi-Threshold CMOS (MTCMOS) technique for leakage reduction is the energy consumption during transitions between sleep and active modes. Previously, a charge recycling (CR) MTCMOS architecture was proposed to reduce the large amount of energy consumed during the mode transitions in power-gated circuits. Considering the RC parasitics of the virtual ground and VDD lines, proper sizing and placement of charge-recycling transistors is key to achieving the maximum power saving. In this paper, we show that the sizing and placement problems of charge-recycling transistors in CR-MTCMOS can be formulated as a linear programming problem, and hence can be efficiently solved using standard mathematical programming packages. The proposed sizing and placement techniques allow us to employ the CR-MTCMOS solution in large row-based standard cell layouts while achieving nearly the full potential of this power-gating architecture, i.e., we achieve 44% saving in switching energy due to the mode transition in CR-MTCMOS compared to standard MTCMOS.

I. INTRODUCTION

Reducing the threshold voltage of transistors in sub-micron CMOS technology compensates for the performance degradation that results from lowering the supply voltage. Threshold voltage reduction, however, increases the sub-threshold leakage current exponentially [1]. The power gating technique provides low-leakage and high-performance operation by using low-$V_t$ transistors for logic cells and high-$V_t$ devices as sleep transistors for disconnecting logic cells from the power supply and/or ground [2]. This Multi-Threshold CMOS (MTCMOS) technology reduces the leakage in the sleep mode.
One of the key concerns in MTCMOS is the wake up time latency of the circuit, which is defined as the time required to turn on the circuit after receiving the wake up signal. Reducing the wake up time latency is an important issue since it can affect the overall performance of the VLSI circuit. Another important issue in power gating is minimizing the energy wasted during mode transition, i.e., while switching from active to sleep mode and vice versa. As we will discuss in Section II, both virtual ground and virtual VDD nodes experience a voltage change during mode transition. Since a considerable number of cells are connected to the virtual ground and virtual supply nodes, the total switching capacitance at these nodes is large, and as a result the switching power consumption during mode transition can be significant.

Sleep transistor sizing is an important issue in designing MTCMOS circuits. References [2][4][5][6] discuss different techniques to size sleep transistor(s) for an arbitrary circuit to meet a performance constraint. None of these techniques proposes a method to minimize the power consumption during the sleep-to-active and active-to-sleep mode transitions. The charge recycling technique has been recently proposed in order to reduce the energy consumption during mode transition of MTCMOS circuits [7]. It has been shown that by applying this technique, up to 46% of the switching energy due to mode transition can be saved [7].

In this paper, we apply the charge recycling technique between consecutive rows of a standard cell design. We propose algorithms to do placement and sizing for charge recycling transistors. The remainder of this paper is organized as follows. In Section II, we revisit the concept of using the charge recycling technique in MTCMOS circuits. In Section III, the overall idea of applying the charge recycling method to a standard cell design is introduced. In Section IV we propose an algorithm to concurrently do placement and sizing for the charge-recycling transistors. Section V presents the simulation results, and finally Section VI concludes the paper.

II. CHARGE RECYCLING TECHNIQUE

Figure 1 shows the charge recycling configuration. The charge recycling transmission gate is turned on right before going from sleep to active and right after going from active to sleep. In [7] it has been discussed that during the active mode the voltage values for nodes G and P are close to 0 and VDD, respectively. In the sleep mode, however, the reverse situation holds, and the voltage values for nodes G and P are close to VDD and 0, respectively.

Figure 1. The proposed charge recycling configuration in power gating structures [7].

The charge recycling technique has been proposed to reduce this mode transition switching energy consumption. At the sleep-to-active transition edge and right before turning on the sleep transistors, we put the circuit into a half-wakeup state by turning the charge recycling circuitry on. After charge recycling is complete, the charge recycling circuitry is turned off and the sleep transistors are turned on to completely wake up the circuit. A similar strategy is used at the active-to-sleep transition edge. After the sleep transistors are completely turned off, the charge-recycling circuitry is turned on to help charge the virtual ground and discharge the virtual supply nodes.

Next we show that the assumption that node G is charged to VDD in the sleep mode is valid. Consider sub-circuit $C_1$ in Figure 1. We show that the only case where this assumption is invalid is when the outputs of all logic cells in $C_1$ are set to logic 1 (i.e., the pull-down sections of these cells are OFF) immediately before the active-to-sleep transition occurs. However, this case hardly happens in practice, because if there is at least one cell in $C_1$ with its output value set to logic 0 (i.e., its pull-down section is ON) before the active-to-sleep transition and if the sleep period is sufficiently long, then the steady-state value for the virtual ground voltage after entering the sleep mode will be nearly VDD. Clearly, considering that a sub-circuit will typically contain tens of logic cells, the probability of at least one of them having a logic 0 at its output (before entering the sleep mode) is nearly 1, i.e., the virtual ground of sub-circuit $C_1$ will indeed rise and reach nearly VDD after sufficient time is spent in the sleep mode.

To empirically confirm this, we show in Figure 2 the voltage waveforms of the virtual ground node for four different cases. In each case we have used an NMOS sleep transistor (the case with a PMOS sleep transistor is similar except that the corresponding output states are reversed). The first case is that of having a single inverter cell in sub-circuit $C_1$.
We force the output of this inverter to logic 1 before entering the sleep mode. As the figure shows, after entering the sleep mode, the virtual ground voltage of the inverter cell rises to about 200 mV, which is much less than the VDD of 1.2 V (see the green waveform). The next case corresponds to the same sub-circuit $C_1$, this time with the output of the inverter forced to logic 0. Here, the virtual ground voltage rises to 0.95 V, which is close to VDD and a suitable level for the charge-recycling purpose (see the blue waveform). The next two cases correspond to $C_1$ comprising 4 inverter cells, each driven by an input to $C_1$. In one case, three of the inverter outputs are 1 and only one inverter output is 0. In this case, the virtual ground voltage rises to an even higher level than in case 2, resulting in a final steady-state voltage level of 1 V (see the red waveform), which is again suitable for the charge-recycling purpose. In the last case, two inverter outputs are set to logic 1 while the others are set to logic 0. Clearly in this case, after entering the sleep mode, the virtual ground node is expected to rise and achieve a level even closer to VDD than before. This is confirmed by the black waveform in the figure, which shows that a level of nearly 1.2 V is achieved by the virtual ground of sub-circuit $C_1$.

In summary, as long as there is a rather large number of logic cells in a sub-circuit that uses an NMOS sleep transistor, the probability that one of these cells will have a logic 0 output value before entering the sleep mode is quite high (in fact it is nearly one), so the virtual ground voltage of such a sub-circuit will gradually rise and stabilize at a level near VDD. This stabilization occurs after only a relatively short period of sleep time (on the order of μsec), which then provides us with the opportunity for charge recycling between this sub-circuit and another one that uses a PMOS sleep transistor.

Figure 2. Virtual ground voltage during the sleep mode, VDD = 1.2 V.

III. CHARGE RECYCLING FOR STANDARD CELL DESIGNS

Figure 3 shows a sample cell row.
There is a cavity for charge-recycling transistors for each cell row, and all the corresponding charge-recycling transistors for that row are placed in this cavity. The configuration is similar to what was adopted in [5][6]. The virtual GND rail is not shown in this figure. Note that for each cell row we only use a certain type of sleep transistor (NMOS or PMOS, but not both). Furthermore, cell rows alternate between NMOS and PMOS sleep transistor types, i.e., row 1 cells are connected to virtual GND through an NMOS sleep transistor, whereas row 2 cells are connected to virtual VDD through a PMOS sleep transistor, and so on.

Figure 3. A cell row in standard cell layout. This row uses NMOS sleep transistors placed in the sleep transistor cavity.

Figure 4 depicts the virtual GND line model of a single row of Figure 3. Here, $G_i$ denotes the connection node of the $i$-th cell in the virtual GND line. $r_{w\text{-}G_i}$ denotes the wiring resistance between $G_i$ and $G_{i+1}$, while $c_{int\text{-}G_i}$ represents the interconnect capacitance at $G_i$. In the presence of RC parasitics of the virtual GND and virtual VDD lines, the charge recycling time, which is defined as the minimum time necessary for the charge recycling transistors to remain ON in order to have at least $(1-\delta)\times 100$ percent of the full charge recycling completed, is determined by the sizes of the logic cells connected to the virtual GND and virtual VDD, the sizes of the charge-recycling transistors, and the connection points of the charge recycling transistors to the virtual GND and virtual VDD lines. In the remainder of the discussion we assume that charge recycling between each pair of nodes in virtual GND and virtual VDD is performed using an NMOS pass transistor instead of a transmission gate. In practice, this is sufficient, although one can use a transmission gate as well.

Figure 4. Virtual GND modeled using an RC network.

IV. SIZING AND PLACEMENT OF THE CHARGE-RECYCLING TRANSISTORS

We consider charge recycling between two rows with $M$ cells per row ($M$ is set to the smaller of the two cell counts if the rows have different numbers of cells). Figure 5 shows how charge recycling is applied between two consecutive rows by placing charge-recycling transistors between the two rows. In this figure each charge-recycling transistor, $CRT_i$, connects the virtual GND node of a cell in the upper row to the virtual VDD node of a cell in the lower row. For example, $CRT_1$ connects the virtual GND node of cell 1 to the virtual VDD node of cell 5. To simplify the optimization problem and to reduce the routing complexity, the only allowed connections are of the form $G_i$-$P_i$ (a connection of the form $G_i$-$P_j$, where $i \neq j$, is not allowed). The connections between the charge-recycling transistors and the virtual VDD line are not shown in Figure 5 for the sake of space.

Figure 5. Charge-recycling between two consecutive rows.

During charge recycling, i.e., when the charge recycling transistors are ON, each charge-recycling transistor, $CRT_i$, can be replaced by its resistive model, $R_i$, which connects node $G_i$ in the virtual GND line to its corresponding node, $P_i$, in the virtual VDD line as shown in Figure 6. In this figure we have replaced the virtual GND and virtual VDD lines by their equivalent RC interconnect models in the same way that we did for rows in Figure 4. Note that $r_{w\text{-}P_i}$ and $c_{int\text{-}P_i}$ in the virtual VDD line are defined in the same manner as $r_{w\text{-}G_i}$ and $c_{int\text{-}G_i}$ in the virtual GND line. $C_{G_i}$ and $C_{P_i}$ in Figure 6 are defined as follows:

$$C_{G_i} = c_{int\text{-}G_i} + C_{d\text{-}G_i}, \qquad C_{P_i} = c_{int\text{-}P_i} + C_{d\text{-}P_i} \tag{1}$$

where $C_{d\text{-}G_i}$ and $C_{d\text{-}P_i}$ are the total diffusion capacitances of nodes $G_i$ and $P_i$, respectively. Note that for nodes that are directly connected to a sleep transistor, the diffusion term also includes the diffusion capacitance of the sleep transistor. As stated before, in the sleep mode, all $C_{G_i}$ capacitances are charged to VDD, whereas all $C_{P_i}$ capacitances are discharged to zero. In the active mode, all $C_{P_i}$ capacitances will be charged to VDD, while all $C_{G_i}$ capacitances will be completely discharged. Before going from the sleep to the active mode, we allow a portion of the charge of the virtual GND capacitances to migrate to the virtual VDD capacitances to reduce the overall energy consumption during the mode transition.
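As a rough illustration of the charge migration described above, the sketch below applies ideal charge conservation to the rail capacitances of (1). It assumes no leakage during recycling and neglects the capacitance of the charge-recycling transistors themselves; the function names and the example capacitance values are hypothetical and are not taken from the paper.

```python
def node_capacitance(c_int, c_diff):
    """Eq. (1): total capacitance at a rail node (interconnect + diffusion)."""
    return c_int + c_diff


def shared_voltage(c_g_nodes, c_p_nodes, vdd):
    """Common voltage of both rails after ideal charge sharing.

    In the sleep mode the virtual GND nodes sit at vdd and the virtual VDD
    nodes sit at 0, so the total stored charge is sum(c_g_nodes) * vdd.
    Charge conservation then fixes the level both rails settle to.
    """
    c_g = sum(c_g_nodes)
    c_p = sum(c_p_nodes)
    return vdd * c_g / (c_g + c_p)


# Hypothetical values in farads: three nodes per rail, equal rail totals.
c_g_nodes = [node_capacitance(2e-15, 3e-15) for _ in range(3)]
c_p_nodes = [node_capacitance(2e-15, 3e-15) for _ in range(3)]
v_shared = shared_voltage(c_g_nodes, c_p_nodes, vdd=1.2)
```

With equal total capacitance on the two rails, the shared level comes out at half of VDD; unequal totals shift it toward the rail with the larger capacitance.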
We must thus decide on the number, the connection points to the virtual rails, and the sizes of the CRTs. To answer these questions, we formulate an optimization problem in which we maximize the total energy saving ratio for charge recycling between two rows subject to a $\gamma$ percent violation in the wake up delay of the original circuit (i.e., the wake up delay when no charge recycling is applied). The wake up time in each case is defined as the time needed for the slowest node in the virtual GND to reach within $100\,\delta$ percent of its final value, zero, during the sleep-to-active transition.

Figure 6. Equivalent circuit model during the charge recycling.

With this definition for wake up time, we can write the set of constraints as follows:

$$t_{w,i}^{(CR)} \le (1+\gamma)\, t_w, \qquad i = 1, \ldots, M \tag{2}$$

where $t_w$ is the wake up time of this row in the original circuit and $t_{w,i}^{(CR)}$, which is defined for the circuit with the charge recycling technique, is the wake up time of the $i$-th cell in the same row, i.e., the cell connected to node $G_i$ in the virtual ground line. $t_{w,i}^{(CR)}$ may be written as:

$$t_{w,i}^{(CR)} = d_i^{CR} + t_{rem,i} \tag{3}$$

where $d_i^{CR}$ is the charge recycling delay for node $G_i$, defined as the time it takes the voltage of node $G_i$ to drop from $V_{DD}$ to within $\delta$ percent of its final value, $\alpha V_{DD}$, and $t_{rem,i}$ is the remaining time needed for $G_i$ to drop from $\alpha V_{DD}$ to 0 by turning on the sleep transistor(s) after the completion of the charge recycling. From the discussion presented in [7], $\alpha$ depends on the ratio of the total capacitances in the virtual GND and virtual VDD rails. For the case of equal total capacitance on the virtual rails, we have $\alpha = 0.5$. Using (3), the constraint set in (2) may be rewritten as:

$$d_i^{CR} \le (1+\gamma)\, t_w - t_{rem,i} \tag{4}$$

By definition, $t_w$ is independent of the location and size of the charge-recycling transistors, and if we ignore the diffusion capacitances of the charge-recycling transistors, $t_{rem,i}$ is also independent of the location and size of the charge-recycling transistors. For an already placed design with known sleep transistor sizing and placement information, $t_w$ and the $t_{rem,i}$ values for each row can be calculated using the Elmore delay model [9].
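The Elmore estimate mentioned above can be sketched for a single row as follows. This is a minimal illustration, assuming one sleep transistor at the left end of the virtual GND line (as in the experimental setup of Section V) modeled as a resistor `r_sleep`; the function name, argument names, and numeric values are hypothetical. The result is the first-order Elmore time constant per node, not the exact $100\,\delta$-percent settling time.

```python
def elmore_delays(r_sleep, r_wire, c_node):
    """Elmore delay from each virtual GND node to ground.

    r_sleep : on-resistance of the sleep transistor attached to node 0
    r_wire  : r_wire[k] is the wire resistance between node k and node k + 1
    c_node  : c_node[k] is the total capacitance at node k
    """
    n = len(c_node)
    # downstream[k] = capacitance at node k plus every node farther from
    # the sleep transistor; each resistor "sees" the capacitance behind it.
    downstream = [0.0] * n
    total = 0.0
    for k in reversed(range(n)):
        total += c_node[k]
        downstream[k] = total

    delays = []
    t = r_sleep * downstream[0]
    delays.append(t)
    for k in range(1, n):
        t += r_wire[k - 1] * downstream[k]
        delays.append(t)
    return delays


# Hypothetical normalized example: two nodes, unit resistances and capacitances.
delays = elmore_delays(r_sleep=1.0, r_wire=[1.0], c_node=[1.0, 1.0])
slowest = max(delays)  # the node farthest from the sleep transistor
```

Under these assumptions the node farthest from the sleep transistor always has the largest delay, which is why the wake up time is set by the slowest node in the row.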
We use this set of constraints to solve the problem of maximizing the total energy saving ratio for adjacent standard cell rows, $ESR_{row}$:

$$ESR_{row} = \frac{(E_{conv.} - E_{cr}) - E_{cr\text{-}overhead}}{E_{conv.}} = ESR - \frac{E_{cr\text{-}overhead}}{E_{conv.}} \tag{5}$$

where $E_{cr\text{-}overhead}$ is the total dynamic and leakage energy consumption in the charge recycling transistors for one complete sleep-active cycle. From [7] we know that the first term in (5), $ESR$, depends only on the total capacitance ratio in the virtual ground and virtual VDD lines and does not depend on the charge recycling circuitry. Therefore, the problem of maximizing $ESR_{row}$ is equivalent to the problem of minimizing $E_{cr\text{-}overhead}$, or equivalently minimizing the power overhead due to the charge-recycling transistors. The total power overhead in each row can be written as the summation of the dynamic and leakage power consumptions due to each of the charge recycling transistors:

$$P_{cr\text{-}overhead} = \sum_{i=1}^{M} C_{g_i}\, f\, V_{DD}^{2} + \sum_{i=1}^{M} I_{leak,i}\, V_{DD} \tag{6}$$

where the first and second summation terms are the total dynamic and leakage power consumptions due to the CR transistors in the row under consideration, $f$ is the mode transition frequency, $C_{g_i}$ is the input gate capacitance of the $i$-th charge-recycling transistor in the row, and $I_{leak,i}$ is the sub-threshold leakage current of the $i$-th charge-recycling transistor. For the purpose of this paper, the gate capacitance of the $i$-th charge-recycling transistor, $C_{g_i}$, can be estimated as:

$$C_{g_i} = C_{ox}\, W_i\, L \tag{7}$$

where $W_i$ is the width of the $i$-th charge-recycling transistor.

TABLE 1. TECHNOLOGY PARAMETERS USED FOR SIMULATIONS

Technology parameter   VDD (V)   V_tln (V)   V_tlp (V)   V_thn (V)   V_thp (V)   c_int (fF/μm)   r_int (Ω/μm)
Value                  1.2       0.39        -0.34       0.54        -0.49       0.66            0.6

The sub-threshold leakage current of the $i$-th charge-recycling transistor, $I_{leak,i}$, can be written as [10]:

$$I_{leak,i} = \mu_0\, \frac{\varepsilon_{ox}}{T_{ox}}\, \frac{W_i}{L}\, v_T^{2}\, e^{1.8} \exp\!\left(\frac{V_{gs} - V_{th}}{S\, v_T}\right) \left(1 - \exp\!\left(\frac{-V_{ds}}{v_T}\right)\right) \tag{8}$$

where $V_{gs}$ and $V_{ds}$ are the gate-source and drain-source voltages of the charge-recycling transistor. The leakage current is important in the sleep mode, when the charge-recycling transistor is OFF and $V_{gs} = 0$. Here, $V_{ds}$ for each charge-recycling transistor is the absolute voltage difference between virtual GND and virtual VDD at the connection nodes of that charge-recycling transistor. From (8), we can ignore the dependency of the sub-threshold leakage current of the transistor on $V_{ds}$ for $V_{ds} > 75$ mV, since the last factor is then close to 1. For a typical MTCMOS circuit this usually happens soon after the mode transition. Hence, for the purpose of our analysis, we can ignore the dependency of the leakage current of a charge-recycling transistor on its drain-source voltage. We thus conclude that the total leakage current of a charge-recycling transistor is proportional to its width.
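The overhead model of (6) to (8) can be evaluated numerically as in the sketch below, which also makes the proportionality to width easy to check. It assumes that (8) has the standard sub-threshold form with $V_{gs} = 0$ and the $V_{ds}$ factor taken as 1, and that the per-area oxide capacitance `c_ox` equals $\varepsilon_{ox}/T_{ox}$. Apart from the 0.54 V high-$V_t$ NMOS threshold from Table 1 and the 1.2 V supply, every parameter value is a hypothetical placeholder.

```python
import math


def gate_capacitance(c_ox, width, length):
    """Eq. (7): C_g = C_ox * W * L."""
    return c_ox * width * length


def leakage_current(width, length, mu0, c_ox, v_t, v_th, s):
    """Eq. (8) at V_gs = 0 with the V_ds factor taken as 1 (V_ds >> v_T)."""
    return (mu0 * c_ox * (width / length) * v_t ** 2
            * math.exp(1.8) * math.exp(-v_th / (s * v_t)))


def power_overhead(widths, length, f, vdd, mu0, c_ox, v_t, v_th, s):
    """Eq. (6): dynamic plus leakage power summed over all CR transistors."""
    dynamic = sum(gate_capacitance(c_ox, w, length) * f * vdd ** 2
                  for w in widths)
    leakage = sum(leakage_current(w, length, mu0, c_ox, v_t, v_th, s) * vdd
                  for w in widths)
    return dynamic + leakage


# Hypothetical SI-unit parameters (not from the paper, except vdd and v_th).
params = dict(length=0.1e-6, f=1e6, vdd=1.2, mu0=0.03, c_ox=0.015,
              v_t=0.026, v_th=0.54, s=1.5)
p_base = power_overhead([1e-6, 2e-6], **params)
p_doubled = power_overhead([2e-6, 4e-6], **params)  # every width doubled
```

Doubling every width doubles the overhead, which is the linearity the formulation relies on when it reduces the objective to the total transistor width.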
From (7) and (8) the total power overhead in (6) can be written as a linear function of the widths of the charge-recycling transistors:

$$P_{cr\text{-}overhead} = A \sum_{i=1}^{M} W_i \tag{9}$$

where $A$ is defined as:

$$A = L\, C_{ox}\, f\, V_{DD}^{2} + V_{DD}\, \frac{\mu_0\, \varepsilon_{ox}}{L\, T_{ox}}\, v_T^{2}\, e^{1.8} \exp\!\left(\frac{-V_{th}}{S\, v_T}\right) \tag{10}$$

Therefore, minimizing the power overhead is equivalent to minimizing the total charge-recycling transistor width.

Next we formulate the timing constraints in (2). There are $M$ separate timing constraints in (2), one for each $G_i$ node in the virtual ground. All nodes in the virtual GND are charged to VDD in the sleep mode. They remain charged to VDD until the end of the sleep mode, right before the beginning of the charge-recycling operation. Satisfying the constraints in (2) ensures that the maximum increase in the discharging time for all the nodes in the virtual ground is less than $\gamma$ percent of the wake up time of the original circuit.

Consider the discharging of node $G_i$ in Figure 6. In this figure each charge-recycling transistor is replaced by its equivalent resistive model in the linear region. The value of the equivalent resistance can be calculated as follows:

$$R_i = \frac{\eta}{W_i} \tag{11}$$

where $\eta$ is defined as:

$$\eta = \frac{L}{\mu\, C_{ox}\, (V_{DD} - V_{th})} \tag{12}$$

where $L$ is the length of the charge-recycling transistor. There are $M$ different resistors contributing to the charge-recycling operation in Figure 6. These resistors provide discharging paths between virtual GND and virtual VDD. In order to simplify the discharging scenario for each node $G_i$ in the virtual GND, we replace all $R_1$-$R_M$ resistors with a single equivalent resistor, $R_{eq,i}$, between $G_i$ and $P_i$. Since there are $M$ nodes in the row, there will be $M$ equivalent resistors, $R_{eq,1}$-$R_{eq,M}$, one for each node, each representing a discharging scenario. $R_{eq,i}$ is defined as follows:

$$R_{eq,i} = \frac{\eta}{W_{eq,i}} = \frac{\eta}{\sum_{j=1}^{M} (\alpha_{ij})^{|x_i - x_j|}\, W_j} \tag{13}$$

where $W_{eq,i}$ is the equivalent NMOS transistor width with linear-region resistance $R_{eq,i}$, $x_i$ and $x_j$ are the x coordinates of nodes $G_i$ and $G_j$ in the virtual GND line, and $\alpha_{ij}$ is a coefficient defined by the designer which depends on the total capacitance at nodes $G_i$ and $P_i$, and also on the interconnect resistance per unit length for the virtual GND line.
$W_{eq,i}$ in (13) is defined as a weighted sum of the widths of all charge-recycling transistors, where the weights for the different charge-recycling transistors are defined based on their distances from the cell under consideration. Note that $R_{eq,i}$ and $W_{eq,i}$ are related through (11). From (13), $W_{eq,i}$ can be written as:

$$W_{eq,i} = \sum_{j=1}^{M} b_{ij}\, W_j \tag{14}$$

where the $b_{ij}$ coefficients are defined as follows:

$$b_{ij} = (\alpha_{ij})^{|x_i - x_j|}, \qquad \forall\, i, j \tag{15}$$

Equation (14) gives the value of each $W_{eq,i}$ as a linear function of all $W_j$ values. The circuit can be further simplified by replacing the RC interconnect networks in the virtual GND and virtual VDD by their equivalent RC-lumped models seen at nodes $G_i$ and $P_i$, respectively. This simplified model is shown in Figure 7. The RC-lumped model elements for virtual GND, $R^{(G_i)}$ and $C^{(G_i)}$, can be calculated as [11]:

$$C^{(G_i)} = Y_{G_i,1}, \qquad R^{(G_i)} = -\frac{Y_{G_i,2}}{Y_{G_i,1}^{2}} \tag{16}$$

where $Y_{G_i,1}$ and $Y_{G_i,2}$ are the first and second moments of the total admittance at node $G_i$ in the virtual GND RC tree, and can be calculated from the Taylor series expansion of the total admittance at node $G_i$, $Y_{G_i}(s)$, that is:

$$Y_{G_i}(s) = Y_{G_i,1}\, s + Y_{G_i,2}\, s^{2} + \cdots + Y_{G_i,k}\, s^{k} + \cdots \tag{17}$$

Figure 7. The simplified circuit model during the charge-recycling.

The elements of the RC-lumped model of the virtual VDD can be calculated in a similar fashion. The first and second moments of the total admittance at any node in the virtual GND or virtual VDD RC tree can be calculated recursively [12]. The details of this approach are omitted for brevity.

The charge recycling delay in the circuit given in Figure 7 is defined as the time that it takes for the voltage of node $G_i$ to drop from VDD to within $\delta$ percent of its final value. We can show that the charge recycling delay for node $G_i$ can be calculated as:

$$d_i^{CR} = -\ln(\delta)\, \left(R^{(G_i)} + R_{eq,i} + R^{(P_i)}\right) \frac{C^{(G_i)}\, C^{(P_i)}}{C^{(G_i)} + C^{(P_i)}} \tag{18}$$

We can show that using (13), (14) and (18), the set of constraints in (4) can be written as:

$$\sum_{j=1}^{M} b_{ij}\, W_j \ge W_{i\text{-}min}, \qquad i = 1, \ldots, M \tag{19}$$

where $W_{i\text{-}min}$ is a lower bound on $W_{eq,i}$ and can be calculated as:

$$W_{i\text{-}min} = \frac{\eta}{\dfrac{\left[(1+\gamma)\, t_w - t_{rem,i}\right]\left(C^{(G_i)} + C^{(P_i)}\right)}{-\ln(\delta)\; C^{(G_i)}\, C^{(P_i)}} - R^{(G_i)} - R^{(P_i)}} \tag{20}$$

Now, having defined the set of linear constraints in (19) and with the objective of minimizing the total power overhead in (9), the optimization problem can be formulated and solved by standard mathematical programming packages as follows:

$$\text{Minimize} \; \sum_{i=1}^{M} W_i \qquad \text{s.t.:} \quad \sum_{j=1}^{M} b_{ij}\, W_j \ge W_{i\text{-}min}, \quad W_i \ge 0, \quad i = 1, \ldots, M \tag{21}$$

The optimization problem defined in (21) is a linear programming (LP) problem, and thus it is solvable in polynomial time.

V. SIMULATION RESULTS

ISCAS-85 benchmark circuits have been used in this paper. We use SIS to generate optimized gate level netlists. All the benchmark circuits are first optimized using script.rugged in SIS. We use a 90nm technology library to perform timing-driven technology mapping. Only one sleep transistor is used per cell row. The placement of the sleep transistors is fixed, and the leftmost corner of each cell row is reserved for sleep transistor placement. Then the sleep transistor for each row is sized for a maximum 10% delay penalty.
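The structure of the LP in (21) can be illustrated on a toy instance. The sketch below is not the authors' flow: it builds the $b_{ij}$ weights with one shared $\alpha$ for all pairs (an assumption), reads the lower bound of (20) as a series-RC discharge budget (also an assumption), and replaces a real LP package with brute-force vertex enumeration, which is only practical for a handful of transistors. All function names and numbers are hypothetical.

```python
import math
from itertools import combinations


def weights(x, alpha):
    """b_ij = alpha ** |x_i - x_j| with one shared alpha (assumption)."""
    return [[alpha ** abs(xi - xj) for xj in x] for xi in x]


def width_lower_bound(eta, gamma, t_w, t_rem, delta, c_g, c_p, r_g, r_p):
    """Lower bound on W_eq from the delay budget, as (20) is read here."""
    c_series = c_g * c_p / (c_g + c_p)
    r_total_max = ((1.0 + gamma) * t_w - t_rem) / (-math.log(delta) * c_series)
    return eta / (r_total_max - r_g - r_p)


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination; None if singular."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return None
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [u - factor * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def min_total_width(b, w_min):
    """Minimize sum(W) s.t. b W >= w_min and W >= 0 by checking every vertex."""
    n = len(w_min)
    # Each constraint is stored as (row, rhs), meaning row . W >= rhs.
    cons = [(b[i], w_min[i]) for i in range(n)]
    cons += [([1.0 if j == i else 0.0 for j in range(n)], 0.0)
             for i in range(n)]
    best = None
    for active in combinations(cons, n):
        w = solve_linear([row for row, _ in active],
                         [rhs for _, rhs in active])
        if w is None:
            continue
        feasible = all(sum(r * v for r, v in zip(row, w)) >= rhs - 1e-9
                       for row, rhs in cons)
        if feasible and (best is None or sum(w) < sum(best)):
            best = w
    return best


# Toy instance: three cells at x = 0, 1, 2, identical normalized lower bounds.
b = weights([0.0, 1.0, 2.0], alpha=0.5)
w_opt = min_total_width(b, [1.0, 1.0, 1.0])
w_bound = width_lower_bound(1.0, 0.1, 1.0, 0.5, math.exp(-1.0),
                            1.0, 1.0, 0.0, 0.0)
```

In this toy case the optimum spreads width over all three positions, with a total of 5/3 in normalized units, whereas a single transistor at either end would need a width of 2 to satisfy the same constraints. For realistic row sizes a proper LP solver would be used instead of enumeration.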
After sleep transistor sizing and placement, we extract the resulting gate level netlist as well as the virtual ground and virtual VDD interconnect values into a file, which is the output of SIS. We use this information to calculate the $b_{ij}$ values in (15) and the $W_{i\text{-}min}$ values in (20). Table 1 shows the technology parameters that we have used for our simulations in this paper. After calculating the $b_{ij}$ and $W_{i\text{-}min}$ values, we pass them to an LP solver to solve the optimization problem in (21). MATLAB is used to solve the LP problem in (21) in this paper. Finally, knowing the total virtual rail capacitance value for each row and the total required charge recycling transistor width for every pair of rows, we can calculate the total energy overhead in (5). Here we only consider the dynamic energy overhead.

Table 2 shows the results for the ISCAS-85 benchmark circuits. Since there is no known method for sizing and placement of charge-recycling transistors, we compare the proposed technique with two other schemes. In the first scheme, which we call single charge recycling MTCMOS (single CRMTCMOS), we only use one charge recycling transistor between two cell rows, connecting the virtual GND and virtual VDD lines at x=0, i.e., the leftmost corners of both lines. The second scheme, which is called uniform CRMTCMOS, uses three charge-recycling transistors per cell row. The three charge-recycling transistors are uniformly distributed along the virtual GND or virtual VDD lines. We find the minimum size for the charge-recycling transistors in the single and uniform CRMTCMOS schemes such that the wakeup time violation is at most $\gamma$ percent compared to the wakeup time of the original MTCMOS circuit. Then we compare the energy saving ratio for these cases. According to Table 2, the ESR value of the proposed approach is, on average, 18.5% and 8.5% greater than that for the single CRMTCMOS and uniform CRMTCMOS schemes, respectively.

Next we discuss the effect of sleep and active durations on the total energy saving ratio that is achieved using CRMTCMOS.
For charge recycling to provide the maximum ESR, the sleep period of the circuit must be long enough that the virtual GND and virtual VDD lines finish their full voltage transitions before the edge of the charge-recycling operation in the sleep period. On the other hand, if the sleep period is too long, the overhead of the charge-recycling approach increases because of the additional leakage path through the charge-recycling transistors [7]. This leads us to look for a range of appropriate values for the active and sleep durations. Fortunately, our simulations show that the charge-recycling approach works well over an acceptable range of active/sleep durations. To find appropriate ranges, we fixed the active mode duration and measured the saving achieved for different sleep mode durations. Figure 8 shows the results of HSPICE simulations for a chain of inverters in 90nm technology. Each curve represents a fixed active duration. Figure 8 indicates that, for a given active duration, there is an optimum sleep duration which results in the maximum ESR. Figure 8 also shows that the total ESR decreases as the sleep duration increases. That is because the total saving is fixed while the total leakage overhead keeps increasing. However, since the charge-recycling transistors are high-Vt, the leakage overhead is very low, which results in high ESR values, 20%, even for large sleep durations.

Figure 8. Energy saving ratio (%) versus sleep period for 3 different fixed active periods for a chain of inverters in 90nm technology working at 4 GHz clock frequency.

VI. CONCLUSIONS

There is no known prior work addressing the placement and sizing problems for charge-recycling MTCMOS (CRMTCMOS). In this paper, for the first time, we addressed and solved the placement and sizing problems for CRMTCMOS in the presence of RC interconnects. We showed that these problems can be formulated as an LP problem and, hence, can be efficiently solved using standard mathematical programming packages. The technique can save up to 44% of the switching energy due to mode transition.

REFERENCES

[1] J. Kao, S. Narendra, and A. Chandrakasan, "Subthreshold leakage modeling and reduction techniques," in Proc. Int'l Conf. on Computer-Aided Design, pp. 141-148, Nov. 2002.
[2] S. Mutoh et al., "1-V power supply high-speed digital circuit technology with multithreshold-voltage CMOS," IEEE JSSC, vol. 30, pp. 847-854, Aug. 1995.
[3] J. Kao, A. Chandrakasan, and D. Antoniadis, "Transistor Sizing Issues and Tool for Multi-Threshold CMOS Technology," in Proc. Design Automation Conference, pp. 409-414, 1997.
[4] J. Kao, S. Narendra, and A. Chandrakasan, "MTCMOS hierarchical sizing based on mutual exclusive discharge patterns," in Proc. Design Automation Conference, pp. 495-500, 1998.
[5] M. Anis, S. Areibi, and M. Elmasry, "Design and Optimization of Multithreshold CMOS (MTCMOS) Circuits," IEEE Transactions on CAD of Integrated Circuits and Systems, October 2003.
[6] V. Khandelwal and A. Srivastava, "Leakage Control through Fine-Grained Placement and Sizing of Sleep Transistors," Proc. Int'l Conference on Computer Aided Design, pp. 533-536, 2004.
[7] E. Pakbaznia, F. Fallah, and M. Pedram, "Charge recycling in MTCMOS circuits: concept and analysis," in Proc. Design Automation Conference, pp. 97-102, 2006.
[8] A. Abdollahi, F. Fallah, and M. Pedram, "An effective power mode transition technique in MTCMOS circuits," in Proc. Design Automation Conference, pp. 37-42, 2005.
[9] W. C. Elmore, "The Transient Response of Damped Linear Network with Particular Regard to Wideband Amplifier," J. Appl. Phys., vol. 19, no. 1, pp. 55-63, 1948.
[10] S. Mukhopadhyay and K. Roy, "Modeling and Estimation of Total Leakage Current in Nano-scaled CMOS Devices Considering the Effect of Parameter Variation," Proc. Int'l Symp. on Low Power Electronics and Design, pp. 172-175, 2003.
[11] P. R. O'Brien and T. L.
Savarino, "Modeling the Driving-Point Characteristics of Resistive Interconnect for Accurate Delay Estimation," Proc. of IEEE Int'l Conf. on Computer Aided Design, pp. 512-515, 1989.
[12] A. B. Kahng and S. Muddu, "Improved effective capacitance computations for use in logic and layout optimization," Proc. of VLSI Design, pp. 578-582, 1999.

TABLE 2. COMPARING ENERGY CONSUMPTION OF THE PROPOSED SCHEME WITH SINGLE CRMTCMOS AND UNIFORM CRMTCMOS SCHEMES (γ=0%).

The four charge columns give the total charge sourced from VDD in one complete active-sleep cycle, in pico Coulombs (pC).

| Circuit | # of cells | # of rows | Total sleep tx width | MTCMOS (pC) | Single CRMTCMOS (pC) | Uniform CRMTCMOS (pC) | Proposed CRMTCMOS (pC) | ESR (proposed) (%) | Proposed vs. single (%) | Proposed vs. uniform (%) |
| 9Sym | 276 | 4 | 752 | 12 | 10 | 8 | 7 | 42 | 25 | 8 |
| C432 | 204 | 2 | 4600 | 8 | 5.5 | 4.9 | 4.5 | 44 | 13 | 5 |
| C880 | 432 | 6 | 9936 | 17 | 14.1 | 12 | 10 | 41 | 24 | 12 |
| C1355 | 526 | 6 | 320 | 21 | 15.6 | 14.3 | 12 | 43 | 17 | 11 |
| C3540 | 295 | 10 | 30656 | 75 | 53 | 49 | 42 | 44 | 15 | 9 |
| C5315 | 727 | 10 | 38992 | 123 | 88 | 77 | 67 | 46 | 17 | 8 |
| average | - | - | - | 42.6 | 31 | 27.5 | 23.7 | 44.4 | 18.5 | 8.8 |
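The ESR and comparison columns of Table 2 can be cross-checked from the charge columns. The sketch below assumes that ESR is the charge saved relative to conventional MTCMOS (for a fixed supply voltage, the charge ratio equals the energy ratio) and that each comparison column is the difference between the proposed scheme's ESR and the baseline's. Both definitions are inferred from the tabulated numbers rather than quoted from the paper, and the function and variable names are illustrative.

```python
def esr(q_mtcmos, q_scheme):
    """Energy saving ratio (%) of a scheme relative to plain MTCMOS."""
    return 100.0 * (q_mtcmos - q_scheme) / q_mtcmos


# C3540 row of Table 2: charge drawn per active-sleep cycle, in pC.
q_mtcmos, q_single, q_uniform, q_proposed = 75.0, 53.0, 49.0, 42.0

esr_proposed = esr(q_mtcmos, q_proposed)
gain_vs_single = esr_proposed - esr(q_mtcmos, q_single)
gain_vs_uniform = esr_proposed - esr(q_mtcmos, q_uniform)

# Rounded to whole percentages, as in the table.
print(round(esr_proposed), round(gain_vs_single), round(gain_vs_uniform))
```

For the C3540 row this gives 44, 15, and 9, which agrees with the last three columns of that row.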