PERFORMANCE EVALUATION OF SELECTED QUASI-ADIABATIC LOGIC STYLES


Chapter 4
PERFORMANCE EVALUATION OF SELECTED QUASI-ADIABATIC LOGIC STYLES

4.1 Introduction

The previous chapter identified the need to compare quasi-adiabatic logic styles as a step towards improving the performance of quasi-adiabatic logic. The performance of an adiabatic logic style can be gauged by measuring either its energy dissipation or its delay. Generally, the Energy-Delay Product (EDP) is used to decide whether one logic style is superior to another. Energy-recovery adiabatic logic is a circuit-level low-power technique for reducing dynamic power dissipation. Since dynamic power dissipation depends on supply voltage, load capacitance, switching frequency and switching activity rate (1.1), the performance of an adiabatic logic style must be evaluated against these parameters. However, the fourth factor, the switching activity rate α, need not be considered when evaluating an adiabatic logic style, because an adiabatic circuit is designed to recover (ideally) the load energy stored in every input cycle. Assuming α = 1, we evaluated the selected quasi-adiabatic logic styles against variation of the first three parameters for the worst-case dynamic power dissipation.

Secondly, the literature survey shows that the various quasi-adiabatic logic styles were each developed using the best transistor technology available at the time. The PAL and CAL logic styles were implemented in 1.2 µm technology in 1996, and TSEL was implemented in 0.5 µm standard CMOS technology in 1997. The PFAL adder was implemented in 0.13 µm technology in 2003. The technology used for the IPGL style is not stated in the corresponding research paper. To evaluate the performance of quasi-adiabatic logic styles, the circuits must therefore be implemented in a common technology. As mentioned in the third chapter, 180 nm technology has been used for this research work, and all the quasi-adiabatic circuits have been implemented using the 180 nm gpdk from Cadence in the Cadence ICFB design environment. This puts the different logic styles on par with each other and makes the comparison fair. With a common technology selected for the comparative study, the criteria for choosing particular quasi-adiabatic logic styles and a particular benchmark circuit are presented in the next two subsections.

4.1.1 Selecting Quasi-adiabatic Logic Styles for Comparative Study

Sixteen quasi-adiabatic logic styles (2N2P, 2N2N2P, 2N2N2D, ADL, APDL, PFAL, ECRL, PAL, PAL2N, CAL, QATCMOS, QSERL, IPGL, DTGAL, ADSL and TSEL) have been reported in the literature in the last fourteen years. The following inferences can be drawn from the published literature:

1. Taking the CMOS logic style as the benchmark for energy dissipation, IPGL is superior to the 2N2P, PFAL, ECRL and 2N2N2D adiabatic logic styles, and all adiabatic logic styles dissipate much less energy than CMOS. Further, each logic style in this list is better than its successor in the list, with a reported reduction factor relative to standard CMOS of 10 for PFAL and 2N2P and of 5.1 for ECRL [51], [11], [64].

2. The comparative study on adders by V S Kanchana in 2006 [68] shows that IPGL circuits perform well at higher frequencies compared with ADSL- and PFAL-based adders.
The transistor count for the 2N2P 8-bit CLA is 700, compared with 936 transistors for PFAL and 1440 transistors for ADSL. The curves of energy dissipation against load capacitance and frequency show that IPGL has the lowest energy dissipation compared with 2N2P, 2N2N2P, PFAL and ADSL. This supports the above observation by other researchers.

3. A study on an 8-bit CLA by S Kim et al. shows that PAL dissipates less energy than TSEL and 2N2P at 10 MHz [34]. PAL requires a simple power-clock scheme and has a less complex logic structure.

4. QATCMOS [32] by D Mateo et al. requires three voltage levels to represent three (ternary) logic levels, together with a complex power-clock supply. The use of diodes or diode-like structures in QSERL [10], ADL [4] and APDL [5] increases the non-adiabatic losses considerably and may increase leakage loss as well. Hence QATCMOS, ADL and APDL are not considered.

5. DTGAL [56] is designed for pipelined architectures and needs a complex power-clocking scheme, so it is also eliminated.

6. PAL2N [12] is a refined version of the PAL logic style. Although it is excluded from the present comparative study, it is included in the next phase of the research.

7. The CAL [8] logic style has a simple logic structure, can be operated from a dc supply in a non-energy-recovery mode (an advantage for a hybrid design approach) and uses single-phase power-clocking.

It appears that CAL, PAL and IPGL are the first three choices for the performance evaluation of quasi-adiabatic logic styles. It must be admitted that these choices are based on experimental results reported in the literature. Those experiments were performed on different technologies, in different design environments and with different objectives. It is quite possible that the ranking given above would change with a change in design environment or research objective. Thus some researchers may feel that QSERL is better than TSEL, while others may argue that TSEL is better than QSERL. The main objective of our research was to design or hypothesize a new quasi-adiabatic logic style with lower energy dissipation than the others. Since this objective can be achieved by analyzing the top three quasi-adiabatic logic styles further, we finally selected CAL, PAL and IPGL for the experimental work on performance evaluation and compared the performance of each with the CMOS logic style.
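
For reference, the two quantities named in Section 4.1 can be written out as below. This is only a sketch: equation (1.1) is not reproduced in this chapter, so the standard textbook form of the dynamic power expression is assumed, and the symbols $E_{DISS}$ and $t_d$ for energy dissipation and delay are labels chosen here for illustration.

```latex
% Assumed standard form of (1.1): alpha is the switching activity rate,
% C_L the load capacitance, V_DD the supply voltage, f the switching frequency.
P_{dyn} = \alpha \, C_L \, V_{DD}^{2} \, f

% Energy-delay product used to rank logic styles (a lower value is better).
EDP = E_{DISS} \times t_{d}
```

With $\alpha = 1$, as assumed in this chapter, the first expression reduces to $C_L V_{DD}^{2} f$, which is why only supply voltage, load capacitance and frequency are varied in the experiments.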

4.1.2 Selection of 2:1 MUX as a Benchmark Circuit

Many researchers have used adder circuits [10, 53], [60, 61, 62, 63, 64, 65, 66, 67, 68, 71, 72] and a few have used multipliers [36, 10, 54, 56, 79] for benchmarking or testing adiabatic logic. Inverter chains have been used in the performance evaluation of the bRERL [9], nRERL [35], TSEL [53] and CAL [8] logic styles. The selection of the benchmarking circuit depends on the objective of the research. A 4-bit RISC processor, or a complex circuit schematic designed to meet certain application-specific goals, may be used as the benchmarking circuit to select the most energy-efficient adiabatic logic style for that application. It appears that most of the researchers who used adders and multipliers for testing adopted this methodology. As mentioned above, some researchers have used chains of 736 and 1100 inverters for testing CAL [8], 512 inverters for bRERL [9] and 1600 shift registers for testing PAL [7]. With such large chains it was easier to show that these logic styles are energy efficient, because the ratio of the energy dissipated in the power-clock generation circuit to the energy dissipated in the chain is very small. The selection of large chains as the benchmarking circuit is avoided in this study.

Since the objective of our experiment was to compare the three leading adiabatic logic styles identified above, it was not necessary to choose a complex circuit as the benchmark. Multiplexers are commonly used in digital system design for selection, as well as for building the barrel shifters required in floating-point ALU blocks. If a simple circuit such as a 2:1 MUX shows distinctly different energy dissipation when implemented in the different logic styles under consideration, then the same can be expected to hold for larger, more complex circuits. We have not come across any report in the literature on the performance evaluation of adiabatic logic styles using multiplexers as the benchmark circuit. Hence we selected the 2:1 MUX as the benchmark circuit for our comparison experiment.

4.2 Implementation and Testing of 2:1 MUX Quasi-adiabatic Circuits

4.2.1 Important Design and Test Considerations

The quasi-adiabatic design of a 2:1 MUX requires the implementation of the Boolean equation of the MUX and of its complementary function. The Boolean equation for the 2:1 MUX to be implemented in the logic block is

$$F = A \cdot S + B \cdot \overline{S} \qquad (4.1)$$

The complement of this is implemented by the other logic block:

$$\overline{F} = \overline{A \cdot S + B \cdot \overline{S}} = (\overline{A \cdot S}) \cdot (\overline{B \cdot \overline{S}}) = \overline{A} \cdot \overline{B} + \overline{B} \cdot \overline{S} + \overline{A} \cdot S + S \cdot \overline{S} \qquad (4.2)$$

We implemented these two equations using NMOS transistors. It is industry practice to toggle one input while holding the other inputs at fixed levels when characterizing a circuit cell. Throughout this experimental phase, the inputs A (= V_DD)
and B (= 0 V) were kept constant and the input S was toggled at a predetermined frequency. The energy dissipation is calculated by integrating the power dissipation waveform over the transient analysis time. Thus,

$$E_{DISS} = \int_{0}^{160\,\mathrm{ns}} \mathit{power} \; dt \qquad (4.3)$$

Delay was measured between the input S and the output signal. Typically, delays are measured between the 50% level of the input voltage and the 50% level of the output voltage. In this experiment, however, the delays were measured between the 90% voltage levels of the input and the output, to ensure that the output has risen to its final value of V_DD. This is shown in Figure 4.1 below. The transient analysis time was kept at 160 ns, corresponding to the time period of the lowest input frequency of S.

Figure 4.1: Delay Measurement

The Spectre tool of Virtuoso assumes a typical capacitive load and simulates the circuit. This is clear from the fact that the simulation results show current entering the power-clock supply, i.e. energy recovery, which would not have been possible without a capacitor connected to the output node. The value of this typical load capacitance was found to be approximately 2.5 fF by extrapolating one of the energy dissipation (versus V_DD) curves.

The effect of load capacitance on energy dissipation and delay was studied by varying the load capacitance in two steps. In the first step, the load capacitance was varied from 2.88 fF to 11.52 fF (by a factor of 4), where 2.88 fF is the gate capacitance of one NMOS transistor having W = 2 µm and L = 180 nm. In the second step, the load capacitance was varied according to fan-outs of 1, 2, 3 and 4. Worst-case testing with a fan-out of four was selected based on the standard industry practice followed by KARMIC (Karnataka Microelectronics group at Manipal). In fact, adiabatic circuits are known to have poor fan-outs. The Cadence tool selects the width of the transistor based on the capacitance specified by the user. The load to be driven was taken as a CMOS 2:1 MUX with an input capacitance of 21.6 fF, which was measured by observing the rise time of the output signal when the output node was connected to a CMOS 2:1 MUX and dividing it by the effective resistance in the path.

The timings of the input stimuli during the functional simulation phase and the testing phase were different. However, it was ensured that the timings of all the stimuli were consistent during the test phase. In particular, when the effect of input frequency on energy dissipation and delay was studied, all the timings were maintained.

CMOS inverters were used in the implementation phase to generate the dual-rail input signals. Later, during the testing phase, these inverters were removed because they increased the energy dissipation.

The following section presents the details of the implementation and testing of the 2:1 MUX using the CAL, PAL and IPGL logic styles. It may be noted that this dissertation presents the comparison qualitatively, and the values quoted should not be taken in absolute terms.

4.2.2 Implementation and Testing

1. CAL 2:1 MUX: The objective of this experiment was to implement and test the CAL 2:1 MUX.

Circuit description: CAL works on a single-phase power-clock supply and can also be operated from a dc power supply as standard CMOS logic. The power-clock PC applied in our experiment was a ramp-type voltage supply.
The working of CAL is explained in Section 3.2.3. The cross-coupled inverters formed by transistors P0, P1, N10 and N11, shown in Figure 4.2, act as the memory elements. An auxiliary timing clock CX connects the logic blocks $F$ and $\overline{F}$ to the outputs F1 and F1bar respectively. The logic blocks $F$ and $\overline{F}$ were designed using NMOS trees to satisfy the Boolean expressions given by (4.1) and (4.2) respectively. Transistors N0, N1, N4 and N5 form the logic block $F$, whereas transistors N2, N3, N6 and N7 form the logic block $\overline{F}$. The circuit requires 5 PMOS and 15 NMOS transistors. The operation of the CAL adiabatic circuit is divided into three phases, Evaluation, Hold and Recovery, as described below.

When CX = 1, one of the output nodes is adiabatically charged to PC by the power clock during the Evaluation phase. During the Hold phase, the power clock appears at F1 or F1bar, indicating the adiabatic logic 1 level, while the other output remains at the adiabatic logic 0 level. During this phase, the next cascaded logic device derives its power clock from one of these outputs. When the power clock ramps down in the Recovery phase, the load capacitor that was adiabatically charged during the Evaluation phase discharges into PC, thereby recovering the energy.

Figure 4.2: Circuit Schematic of CAL 2:1 MUX in Virtuoso

Variables and Analysis:

Variables                Analysis
Name    Initial value    Type                      Duration   Purpose
V_DD    1.8 V            Transient, conservative   100 ns     Output simulation, energy recovery
T_ON    8 ns             Transient, conservative   160 ns     Testing
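
Because the two NMOS logic blocks of a dual-rail gate must realise complementary functions, the expressions (4.1) and (4.2) can be checked exhaustively over all eight input combinations. The sketch below assumes the overbars lost in transcription are placed so that F = A.S + B.S' (consistent with the output being high when both S and A are high); the function names are hypothetical and are not part of the original design flow.

```python
from itertools import product


def mux_f(a: bool, b: bool, s: bool) -> bool:
    """Assumed form of (4.1): F = A.S + B.S' (S' is the complement of S)."""
    return (a and s) or (b and not s)


def mux_f_bar(a: bool, b: bool, s: bool) -> bool:
    """Assumed sum-of-products form of (4.2): A'.B' + B'.S' + A'.S + S.S'."""
    na, nb, ns = (not a), (not b), (not s)
    return (na and nb) or (nb and ns) or (na and s) or (s and ns)


def blocks_are_complementary() -> bool:
    """True if exactly one of the two blocks conducts for every input vector."""
    return all(
        mux_f(a, b, s) != mux_f_bar(a, b, s)
        for a, b, s in product([False, True], repeat=3)
    )


if __name__ == "__main__":
    print("complementary:", blocks_are_complementary())
    # Test condition used in this chapter: A held high, B held low, S toggled.
    for s in (False, True):
        print("S =", int(s), "-> F =", int(mux_f(True, False, s)))
```

Under the test condition of this chapter (A held at V_DD, B at 0 V), the assumed form reduces to F = S, so toggling S alone exercises both output rails.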

Input stimuli:

DC signals: CX = V_DD, V_DD = V_DD, V_SS = 0 V, A = V_DD, B = 0 V

Pulse signals:

Signal     Initial   Final   t_d    t_r    t_f    t_on     Period
Pulse S    0         V_DD    1 ns   1 ps   1 ps   3 T_ON   4 T_ON
Pulse PC   0         V_DD    1 ns   T_ON   T_ON   T_ON     4 T_ON

Input and output waveforms: Figure 4.3 shows the simulated input and output waveforms of the CAL 2:1 MUX. The first waveform, /PM0/S, is the current drawn by the circuit. The positive current indicates that the load capacitor is adiabatically charged by the power-clock supply, and the negative current during the recovery phase (while the power-clock is ramping down) indicates that the load capacitor is adiabatically discharging into the power-clock supply. The second waveform is the /F1bar output, which goes high when the logic block F is evaluating the 2:1 MUX Boolean equation. The output voltage rises to 1.8 V (V_DD). The lower three waveforms are /A, /PC and /S. The timings of /S and /PC are critical: the /S input must be toggled when the /PC signal is at zero potential. The other input signal, /B, which is maintained at logic low, is not shown in this figure.

Figure 4.3: Simulated Input and Output Waveforms of CAL 2:1 MUX
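
The power-clock row of the stimulus table can be visualised with a small piecewise-linear model. This is a sketch only: the helper names are hypothetical, times are in nanoseconds, and it is assumed that t_on denotes the flat-top duration (excluding rise and fall), which may differ from the convention of the simulator's pulse source.

```python
def pulse_value(t, v0, v1, td, tr, tf, ton, period):
    """Value at time t of a repeating trapezoidal pulse.

    Assumed shape: v0 until td, then a linear rise over tr, a flat top at v1
    for ton, a linear fall over tf, and v0 for the rest of each period.
    """
    if t < td:
        return v0
    tau = (t - td) % period  # position inside the current period
    if tau < tr:
        return v0 + (v1 - v0) * tau / tr
    if tau < tr + ton:
        return v1
    if tau < tr + ton + tf:
        return v1 - (v1 - v0) * (tau - tr - ton) / tf
    return v0


# Values from the stimulus table for PC: t_d = 1 ns, t_r = t_f = t_on = T_ON,
# period = 4 T_ON, with T_ON = 8 ns and V_DD = 1.8 V.
T_ON = 8.0
VDD = 1.8


def cal_power_clock(t_ns):
    """Hypothetical helper: the CAL power-clock voltage at t_ns nanoseconds."""
    return pulse_value(t_ns, 0.0, VDD, 1.0, T_ON, T_ON, T_ON, 4 * T_ON)


if __name__ == "__main__":
    for t in (0.5, 5.0, 13.0, 21.0, 29.0):
        print(f"t = {t:5.1f} ns  PC = {cal_power_clock(t):.2f} V")
```

With these values each period splits into four equal intervals of T_ON: a rising ramp, a flat top, a falling ramp and an interval at zero, the last being where /S would be toggled.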

The energy dissipation waveform is shown separately in Figure 4.4 and depicts a typical energy recovery waveform. The current waveform /PM0/S and the power-clock voltage /PC are multiplied to obtain the third waveform, which represents the instantaneous power dissipation. This power dissipation waveform is integrated using the iinteg function, which performs indefinite integration (in this case over the transient analysis time). The energy dissipation curve is thus obtained.

Figure 4.4: Energy Dissipation in CAL 2:1 MUX

Results: Experiments were performed to find the trends of energy dissipation and delay of the CAL logic style by varying V_DD, power-clock frequency and load capacitance. Only one parameter was varied at a time, with the other two held constant. The transistor dimensions were set to length = 180 nm and width = 2 µm. The typical values of the other parameters chosen for this experiment are given in Table 4.1, whose last column lists the table to be referred to for the results of the corresponding experiment. It may be noted that the following tables present results as figures that were either generated directly by the Cadence tools or derived by the authors from the values of the measurable parameters. The experiments were performed in the same fashion for all three logic styles. The trends of the various parameters and their comparison with CMOS are presented in graphical form in Section 4.3.
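
The post-processing described above (multiply supply current by power-clock voltage, then integrate over time) can be sketched numerically as follows. The function name is hypothetical, trapezoidal integration is a choice made here rather than the simulator's documented method, and the sample values are synthetic, not simulation data.

```python
def cumulative_energy(times, currents, voltages):
    """Running integral of p(t) = i(t) * v(t) using the trapezoidal rule.

    Returns one energy value per time sample, starting at 0. A final value
    close to zero means the energy delivered while the current was positive
    was returned to the supply while the current was negative.
    """
    if not (len(times) == len(currents) == len(voltages)):
        raise ValueError("times, currents and voltages must have equal length")
    power = [i * v for i, v in zip(currents, voltages)]
    energy = [0.0]
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]
        energy.append(energy[-1] + 0.5 * (power[k] + power[k - 1]) * dt)
    return energy


if __name__ == "__main__":
    # Synthetic example: current flows out of the supply, then back into it.
    t = [0.0, 1.0, 2.0, 3.0]
    i = [1.0, 1.0, -1.0, -1.0]
    v = [2.0, 2.0, 2.0, 2.0]
    print(cumulative_energy(t, i, v))
```

The last element of the returned list corresponds to E_DISS in (4.3) when the samples span the whole transient analysis window.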

Table 4.1: Experiment Conditions and Result Tables

Experiment by varying   Constant parameters                             Range of variation                           Refer Table No. for results
V_DD                    Switching frequency = 62.5 MHz, C_L = 2.5 fF    1.2 V to 4.8 V                               Table 4.2
Switching frequency     V_DD = 1.8 V, C_L = 2.5 fF                      6.25 MHz to 625 MHz                          Table 4.3
Load capacitor          V_DD = 1.8 V, switching frequency = 62.5 MHz    2.88 fF to 11.52 fF and 21.6 fF to 86.4 fF   Table 4.4

Table 4.2: Effect of V_DD on CAL 2:1 MUX

V_DD (V)   Energy Dissipation (pJ)
1.20       0.78
1.50       1.12
1.80       1.45
2.10       1.79
2.40       2.13
2.70       2.48
3.00       2.84
3.30       3.20
3.60       3.57
3.90       3.95
4.20       4.33
4.50       4.72
4.80       5.11
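
One way to examine how close the V_DD dependence in Table 4.2 is to a straight line is to compute the slope between successive rows. The sketch below uses only the tabulated values; the function name is hypothetical.

```python
# Values copied from Table 4.2 (CAL 2:1 MUX).
VDD_V = [1.20, 1.50, 1.80, 2.10, 2.40, 2.70, 3.00,
         3.30, 3.60, 3.90, 4.20, 4.50, 4.80]
ENERGY_PJ = [0.78, 1.12, 1.45, 1.79, 2.13, 2.48, 2.84,
             3.20, 3.57, 3.95, 4.33, 4.72, 5.11]


def successive_slopes(x, y):
    """Finite-difference slope between each pair of neighbouring points."""
    return [(y[k + 1] - y[k]) / (x[k + 1] - x[k]) for k in range(len(x) - 1)]


if __name__ == "__main__":
    for v, s in zip(VDD_V, successive_slopes(VDD_V, ENERGY_PJ)):
        print(f"from {v:.2f} V: {s:.2f} pJ/V")
```

The slopes stay between about 1.1 pJ/V and 1.3 pJ/V over the whole range and drift upward only gradually, which is what a nearly linear dependence at the lower V_DD values would look like.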

Table 4.3: Effect of Frequency on CAL 2:1 MUX

T_ON of /S (ns)   Frequency (MHz)   Energy Dissipation (pJ)   Delay (ns)
80                6.25              0.08                      18.07
72                6.94              0.13                      16.27
64                7.81              0.13                      14.47
56                8.93              0.13                      12.67
48                10.42             0.19                      10.87
40                12.50             0.19                      9.07
32                15.63             0.25                      7.27
24                20.83             0.38                      5.47
16                31.25             0.60                      3.67
8                 62.50             1.45                      1.94
7.2               69.44             1.66                      1.77
6.4               78.13             1.93                      1.61
5.6               89.29             2.32                      1.44
4.8               104.17            2.78                      1.29
4                 125.00            3.54                      1.13
3.2               156.25            4.69                      0.97
2.4               208.33            6.65                      0.82
1.6               312.50            10.62                     0.66
0.8               625.00            20.98*                    1.46*

Table 4.4: Effect of Load Capacitance on CAL 2:1 MUX

Load Capacitance (fF)   Energy Dissipation (pJ)   Delay (ns)
2.88                    1.49                      1.92
5.76                    1.66                      1.94
8.64                    1.77                      1.97
11.52                   1.87                      1.99
21.60                   2.27                      2.06
43.20                   3.20                      2.25
64.80                   4.21                      2.44
86.40                   5.28                      2.63

*Note: The output does not rise to 1.62 V, i.e. 90% of the final voltage level. That is the reason the delays are measured at 90% of the final voltage levels. The select signal /S is chosen such that its T_ON and T_OFF are equal. The timings of /PC are adjusted accordingly.

Observations and Comments: The energy recovery waveform in Figure 4.4 shows that the energy recovery is incomplete and that an energy of approximately $0.5\,C V_T^{2}$ is not recovered. The dependence of energy dissipation on V_DD is almost linear at lower values of V_DD. The maximum frequency of operation for this circuit is 312.5 MHz. The CAL 2:1 MUX circuit was successfully tested for a fan-out of four. As the channel length of the transistor is increased, the energy dissipation in the channel increases because of the increase in channel resistance (2.12).

2. PAL 2:1 MUX: The objective of this experiment was to implement and test the PAL 2:1 MUX.

Circuit description: PAL has a low gate complexity and is a dual-rail logic, as shown in Figure 4.5. It uses a sinusoidal voltage PC as the power-clock supply. A pair of cross-coupled PMOS devices, P1 and P2, is connected between the outputs and PC. The output node is adiabatically charged by PC when the logic block evaluates the function, as the sinusoidal signal increases gradually from zero to its maximum. The node discharges adiabatically when PC ramps down to zero. The output node F1 remains tri-stated and its voltage is close to 0 V. A PMOS transistor P3 is included between the circuit and the power-clock supply to monitor the direction of current flow. The circuit is designed using 5 PMOS and 11 NMOS transistors.

The logic operation has two phases: Evaluate (E), when the power-clock is ramping up, and Discharge (D), when the power-clock is ramping down. Initially, when PC = 0, both output nodes F1 and F1bar are discharged. During the E phase, when PC starts increasing, there is a path through one of the functional blocks, say logic block F. The output node F1 therefore begins to ramp up, following the power-clock. Once the voltage difference between F1 and F1bar exceeds V_T of the PMOS, P1 turns on and the F1 node capacitance is charged through P1 to the peak of PC. The charging current initially flows through logic block F and then through P1. P2 stays off during this clock period because F1 is always greater than F1bar. During the D phase, the power-clock ramps down. Initially the discharge is through P1, until the node voltage of F1 has dropped to V_T. The final portion of the discharge takes place through the conducting functional block.

Variables and Analysis:

Variables                Analysis
Name    Initial value    Type                      Duration   Purpose
V_DD    1.8 V            Transient, conservative   300 ns     Output simulation
T_ON    8 ns             Transient, conservative   160 ns     Energy recovery and testing

Input stimuli:

DC signals: V_DD = V_DD, V_SS = 0 V, A = V_DD, B = 0 V

Pulse signal:

Signal     Initial   Final   t_d    t_r    t_f    t_on   Period
Pulse S    0         V_DD    1 ns   1 ps   1 ps   T_ON   2 T_ON

Sine signal:

Signal     V_dc       V_P        f        t_d
Sine PC    V_DD/2     V_DD/2     1/T_ON   1 ns

Input and output waveforms: Figure 4.6 shows the simulated input and output waveforms of the PAL 2:1 MUX. The first waveform, /PC, is the sinusoidal power-clock supply. The next two waveforms are the outputs of the PAL 2:1 MUX, i.e. /F1bar and /F1. The output F1 is high when both the signals /S and /A are high. The output voltage follows the power-clock signal variations during the high logic state of the output. During the same time the other output, /F1bar, is tri-stated, as seen in the second waveform (time period from 0 to 80 ns). Similarly, when the output /F1bar is active high, the output /F1 is tri-stated. The energy dissipation waveform is shown separately in Figure 4.7. The energy recovery is almost complete.
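
The sinusoidal power-clock listed in the PAL stimulus table, and the two phases derived from it, can be sketched as below. The helper names are hypothetical, times are in nanoseconds, and two assumptions are made: the sine starts at zero phase after the delay t_d, and the source sits at its dc offset before that.

```python
import math

VDD = 1.8    # volts, from the Variables table
T_ON = 8.0   # nanoseconds, from the Variables table
T_D = 1.0    # nanoseconds, delay of the sine source


def pal_power_clock(t_ns):
    """PC(t) = V_dc + V_P * sin(2*pi*f*(t - t_d)) with V_dc = V_P = VDD/2, f = 1/T_ON."""
    if t_ns < T_D:
        return VDD / 2  # assumption: source holds its dc offset before t_d
    return VDD / 2 + (VDD / 2) * math.sin(2 * math.pi * (t_ns - T_D) / T_ON)


def pal_phase(t_ns):
    """'E' (Evaluate) while PC is rising, 'D' (Discharge) while PC is falling."""
    slope_sign = math.cos(2 * math.pi * (t_ns - T_D) / T_ON)
    return "E" if slope_sign > 0 else "D"


if __name__ == "__main__":
    for t in (2.0, 3.0, 5.0, 7.0, 8.0):
        print(f"t = {t:4.1f} ns  PC = {pal_power_clock(t):.3f} V  phase {pal_phase(t)}")
```

Because f = 1/T_ON while /S has a period of 2 T_ON, this model gives two full power-clock cycles for every cycle of the select input.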

Figure 4.5: Circuit Schematic of PAL 2:1 MUX in Virtuoso

Figure 4.6: Simulated Input and Output Waveforms of PAL 2:1 MUX

Figure 4.7: Energy Dissipation in PAL 2:1 MUX

Results: Experiments were performed to find the trends of energy dissipation and delay of the PAL logic style by varying V_DD, power-clock frequency and load capacitance. Only one parameter was varied at a time, with the other two held constant. The transistor dimensions were set to length = 180 nm and width = 2 µm. The typical values of the other parameters chosen for this experiment are given in Table 4.13. The last column of Table 4.5 lists the table to be referred to for the results of the corresponding experiment.

Table 4.5: Experiment Conditions and Result Tables

Experiment by varying   Constant parameters                             Range of variation                           Refer Table No. for results
V_DD                    Switching frequency = 62.5 MHz, C_L = 2.5 fF    1.2 V to 4.8 V                               Table 4.6
Switching frequency     V_DD = 1.8 V, C_L = 2.5 fF                      6.25 MHz to 625 MHz                          Table 4.7
Load capacitor          V_DD = 1.8 V, switching frequency = 62.5 MHz    2.88 fF to 11.52 fF and 21.6 fF to 86.4 fF   Table 4.8

Table 4.6: Effect of V_DD on PAL 2:1 MUX

V_DD (V)   Energy Dissipation (pJ)
1.20       0.03
1.50       0.06
1.80       0.08
2.10       0.10
2.40       0.12
2.70       0.15
3.00       0.17
3.30       0.20
3.60       0.22
3.90       0.25
4.20       0.28
4.50       0.31
4.80       0.34

Table 4.7: Effect of Frequency on PAL 2:1 MUX

T_ON of /S (ns)   Frequency (MHz)   Energy Dissipation (pJ)   Delay (ns)
80                6.25              0.01                      23.77
72                6.94              0.01                      21.42
64                7.81              0.01                      19.06
56                8.93              0.01                      16.69
48                10.42             0.01                      14.33
40                12.50             0.01                      11.97
32                15.63             0.02                      9.61
24                20.83             0.02                      7.25
16                31.25             0.04                      4.93
8                 62.50             0.08                      2.64
7.2               69.44             0.09                      2.40
6.4               78.13             0.11                      2.17
5.6               89.29             0.13                      1.94
4.8               104.17            0.15                      1.71
4                 125.00            0.19                      1.47
3.2               156.25            0.25                      1.22
2.4               208.33            0.36                      0.98
1.6               312.50            0.58                      0.72
0.8               625.00            1.29                      0.47

The select waveform /S is chosen such that its T_ON and T_OFF are equal. The timings of the power-clock waveform /PC are adjusted accordingly.

Table 4.8: Effect of Load Capacitance on PAL 2:1 MUX

Load Capacitance (fF)   Energy Dissipation (pJ)   Delay (ns)
2.88                    0.09                      2.67
5.76                    0.10                      2.83
8.64                    0.11                      2.90
11.52                   0.12                      2.98
21.60                   0.15                      3.21
43.20                   0.21                      3.61
64.80                   0.27                      3.98
86.40                   0.33                      4.36

Observations and Comments: The output waveforms of the PAL 2:1 MUX are clean and undistorted. The correct logic levels can be sampled at the peak of the power-clock to feed the next stage. The output remains tri-stated when the circuit is not evaluating. This could be a problem in many applications, since such an output is susceptible to parasitic charge coupling. The PAL gate complexity is relatively low. The main advantage of this circuit is that it can be operated from a single power-clock supply. The poorly defined output is the major drawback of PAL.

3. IPGL 2:1 MUX: The objective of this experiment was to implement and test the IPGL 2:1 MUX.

Circuit description: The IPGL gate is based on the 2N2P gate design. The generalised circuit schematic of the IPGL style is shown in Figure 2.14 and its working is explained in Section 2.4. It requires differential control in the Evaluate phase of the clock and gives a differential output voltage. The schematic is shown in Figure 4.8. The gate has two paths: charging and discharging. The charging paths consist of the logic block $F$ and the inverted logic block $\overline{F}$ in parallel with a pair of cross-coupled PMOS transistors, P1 and P2. The logic blocks are implemented using NMOS transistors and are complementary. The circuit consists of 5 PMOS and 21 NMOS transistors.

In the Evaluate phase the power-clock supply PC rises from zero to V_DD, and the output $out$ follows the power-clock through the logic block $F$ and the PMOS transistor in parallel with it. The other output, $\overline{out}$, remains at zero. The outputs are valid when the power-clock reaches V_DD. The inputs of the IPGL gate must be stable during the charge period, and the logic gate maintains a valid output during the hold period. The cross-coupled PMOS transistors maintain the information at the outputs during the hold period. The discharge period is used to recover the load energy: the delivered charge returns to the power-clock supply as the supply ramps down. The charge recovery path is through the PMOS transistor (in the absence of N1 and N2). In the discharge phase, as the power-clock supply ramps down, the output node retains a voltage of V_T once the cross-coupled PMOS transistors shut off. In the wait period, this energy, $0.5\,C V_T^{2}$, is drained to ground as the new inputs become valid. A new charge path is created by introducing two NMOS transistors, N1 and N2, for each output. The gate of the NMOS transistor is driven by the logic gate in the next phase. If the logic gate is in the recovery phase then the next gate has to be in its hold period, and therefore the charge can be completely recovered through the NMOS transistor.

Figure 4.8: Circuit Schematic of IPGL 2:1 MUX in Virtuoso

Variables and Analysis:

Name   Initial value   Analysis type             Duration   Purpose
V_DD   1.8V            Transient, conservative   160ns      Output simulation and testing
T_ON   8ns             Transient, conservative   20ns       Energy recovery

Input stimuli:

DC signals: V_DD = V_DD, V_SS = 0V, A = V_DD, B = 0V

Pulse signal   Initial   Final   t_d   t_r      t_f      t_on     period
Pulse S        0         V_DD    1ns   1ps      1ps      T_ON     2 T_ON
Pulse PC       0         V_DD    1ns   T_ON/4   T_ON/4   T_ON/4   T_ON

Input and output waveforms: Figure 4.9 shows the simulated input and output waveforms of the IPGL 2:1 MUX. The power-clock signal /PC and the select signal /S are shown in the first and third panes respectively. The second waveform is the /out signal. The energy dissipation is shown as the last waveform.

Figure 4.9: Simulated Input and Output Waveforms of IPGL 2:1 MUX

The energy dissipation waveform is shown separately in Figure 4.10 and depicts a typical energy recovery waveform. It is plotted by applying the integration function (iing) to the product of /PC and the current through the circuit, i.e. /PM2/S.

Figure 4.10: Energy Dissipation in IPGL 2:1 MUX

Results: Experiments were performed to find the trends of energy dissipation and delay of the IPGL logic style by varying V_DD, power-clock frequency and load capacitance. Only one parameter was varied at a time, with the other two held constant. The transistor dimensions were set to length = 180nm and width = 2µm. The typical values of the other parameters chosen for this experiment were as follows. The last column in the following table lists the number of the table that should be referred to for the results of the corresponding experiment. The energy dissipation after one input cycle is 0.27pJ (for the parameters mentioned above during the implementation phase).

Observations and Comments: Complete energy recovery is possible with the two NMOS transistors N1 and N2 only if the output of the next gate depends only on the current gate. The logic block F and its complementary block F-bar are doubled in this logic style in order to drive the un-driven output to ground. Thus, the outputs are clearly defined and are not tri-stated. However, this makes the IPGL logic structure highly complex, and it consumes a larger silicon area than the other logic styles. This would also increase the energy dissipation.
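The energy waveform of Figure 4.10 is described as the running integral of the product of the power-clock voltage and the supply current. As a minimal sketch of that calculation outside the simulator, the function below applies the trapezoidal rule to sampled waveforms. The function name and the sample values are hypothetical illustrations, not data exported from the simulation.

```python
def cumulative_energy(times, voltages, currents):
    """Running integral of v(t) * i(t) using the trapezoidal rule.

    times are in seconds, voltages in volts, currents in amperes.
    Returns a list of energies in joules, one per sample.
    """
    energy = [0.0]
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]
        p_prev = voltages[k - 1] * currents[k - 1]   # instantaneous power
        p_curr = voltages[k] * currents[k]
        energy.append(energy[-1] + 0.5 * (p_prev + p_curr) * dt)
    return energy


# Hypothetical samples: current flows out of the supply, then back into it.
t = [0.0, 1e-9, 2e-9, 3e-9]
v = [1.8, 1.8, 1.8, 1.8]
i = [1e-6, 1e-6, -1e-6, -1e-6]
e = cumulative_energy(t, v, i)
print(e)
```

In this idealized example the energy rises while charge is delivered and returns to zero once the same charge flows back, which is the shape expected of an energy recovery waveform. The final value of the running integral is the net energy dissipated over the interval.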

Table 4.9: Experiment Conditions and Result Tables

Experiment by Varying   Constant Parameters                          Range of Variation                        Refer Table No. for Results
V_DD                    Switching Frequency = 62.5MHz, C_L = 2.5fF   1.2V to 4.8V                              Table 4.10
Switching Frequency     V_DD = 1.8V, C_L = 2.5fF                     6.25MHz to 625MHz                         Table 4.11
Load Capacitor          V_DD = 1.8V, Switching Frequency = 62.5MHz   2.88fF to 11.52fF and 21.6fF to 86.4fF    Table 4.12

Table 4.10: Effect of V_DD on IPGL 2:1 MUX

V_DD (V)   Energy Dissipation (pJ)
1.20       0.66
1.50       1.06
1.80       2.32
2.10       5.69
2.40       11.49
2.70       18.90
3.00       28.61
3.30       40.82
3.60       55.74
3.90       73.72
4.20       95.04
4.50       120.01
4.80       148.83

Table 4.11: Effect of Frequency on IPGL 2:1 MUX

T_ON of /S (ns)   Frequency (MHz)   Energy Dissipation (pJ)   Delay (ns)
80                6.25              1.32                      18.18
72                6.94              1.21                      16.37
64                7.81              1.53                      14.56
56                8.93              1.41                      12.75
48                10.42             1.36                      10.94
40                12.50             1.39                      9.13
32                15.63             1.42                      7.32
24                20.83             1.57                      5.51
16                31.25             1.67                      3.70
8                 62.50             2.32                      1.90
7.2               69.44             2.47                      1.72
6.4               78.13             2.69                      1.54
5.6               89.29             3.00                      1.37
4.8               104.17            3.37                      1.20
4                 125.00            4.02                      1.03
3.2               156.25            5.00                      0.87
2.4               208.33            6.74                      0.72
1.6               312.50            10.42                     0.57
0.8               625.00            21.06*                    1.99*

The select waveform /S is chosen such that its T_ON and T_OFF are equal. The timings of the power-clock waveform /PC are adjusted accordingly.

*Note: The output does not rise to 1.62V, i.e. 90% of the final voltage level.

Table 4.12: Effect of Load Capacitance on IPGL 2:1 MUX

Load Capacitance (fF)   Energy Dissipation (pJ)   Delay (ns)
2.88                    2.35                      1.90
5.76                    2.47                      1.92
8.64                    2.55                      1.94
11.52                   2.63                      1.95
21.60                   2.92                      2.00
43.20                   3.63                      2.12
64.80                   4.42                      2.27
86.40                   5.27                      2.43
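One way to read the delay column of Table 4.11 is to normalize each delay by the corresponding T_ON. The sketch below does this for the rows from T_ON = 80ns down to 8ns; the variable names are illustrative assumptions, and the numbers are copied from the table.

```python
# (T_ON in ns, delay in ns) pairs copied from Table 4.11 (IPGL 2:1 MUX).
IPGL_DELAY = [
    (80, 18.18), (72, 16.37), (64, 14.56), (56, 12.75), (48, 10.94),
    (40, 9.13), (32, 7.32), (24, 5.51), (16, 3.70), (8, 1.90),
]

# Delay expressed as a fraction of the select on-time.
RATIOS = [delay / t_on for t_on, delay in IPGL_DELAY]

for (t_on, delay), ratio in zip(IPGL_DELAY, RATIOS):
    print(f"T_ON = {t_on:>2} ns  delay = {delay:>5.2f} ns  delay/T_ON = {ratio:.3f}")
```

For these rows the ratio stays a little below one quarter, which is close to the T_ON/4 rise time specified for the power-clock pulse in the input stimuli. The rows below T_ON = 8ns are left out of the sketch because the ratio there drifts upward as the circuit approaches its maximum frequency.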

The maximum frequency of the input signal that can be applied to this IPGL 2:1 MUX circuit is 312.5MHz (with V_DD = 1.8V, W = 2µm, L = 180nm, C_L = 2.5fF). An IPGL gate can be pipelined by using phase-shifted power-clock supplies for adjacent gates. Thus, when the previous stage is in the Hold phase, the present stage must evaluate the logic.

4. CMOS 2:1 MUX: The objective of this experiment was to implement and test a CMOS 2:1 MUX.

Circuit description: The NMOS pull-down network is designed to satisfy (4.1), whereas the PMOS pull-up network is designed to satisfy (4.2), as shown in Figure 4.11. The circuit requires 6 PMOS and 6 NMOS transistors.

Variables and Analysis:

Name   Initial value   Analysis type             Duration   Purpose
V_DD   1.8V            Transient, conservative   160ns      Output simulation and testing
T_ON   8ns

Input stimuli:

DC signals: V_DD = V_DD, V_SS = 0V, A = V_DD, B = 0V

Pulse signal   Initial   Final   t_d   t_r   t_f   t_on   period
Pulse S        0         V_DD    1ns   1ps   1ps   T_ON   2 T_ON

Input and output waveforms: Figure 4.12 shows the simulated input and output waveforms of the CMOS 2:1 MUX. The output signal /F is high when the inputs are /A = high, /B = 0 and /S = high. The output /Fbar is the complement of /F.

Results: Experiments were performed to find the trends of energy dissipation and delay of the CMOS logic style by varying V_DD, switching frequency and load capacitance. Only one parameter was varied at a time, with the other two held constant. The transistor dimensions were set to length = 180nm and width = 2µm. The typical values of the other parameters chosen for this experiment were as follows. The last column in the following table lists the number of the table that should be referred to for the results of the corresponding experiment.

Figure 4.11: Circuit Schematic of CMOS 2:1 MUX

Table 4.13: Experiment Conditions and Result Tables

Experiment by Varying   Constant Parameters                          Range of Variation                        Refer Table No. for Results
V_DD                    Switching Frequency = 62.5MHz, C_L = 2.5fF   1.2V to 4.8V                              Table 4.14
Switching Frequency     V_DD = 1.8V, C_L = 2.5fF                     6.25MHz to 625MHz                         Table 4.15
Load Capacitor          V_DD = 1.8V, Switching Frequency = 62.5MHz   2.88fF to 11.52fF and 21.6fF to 86.4fF    Table 4.16

Figure 4.12: Simulated Input and Output Waveforms of CMOS 2:1 MUX

Table 4.14: Effect of V_DD on CMOS 2:1 MUX

V_DD (V)   Energy Dissipation (pJ)
1.20       0.59
1.50       0.98
1.80       1.48
2.10       2.12
2.40       2.88
2.70       3.78
3.00       4.81
3.30       6.00
3.60       7.35
3.90       8.88
4.20       10.62
4.50       12.60
4.80       14.85
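To see how closely the tabulated energies follow a power law in V_DD, the data of Tables 4.6, 4.10 and 4.14 can be fitted on a log-log scale. The sketch below is an illustrative check rather than part of the original analysis (which was done in MATLAB); the function and variable names are assumptions, and the data are copied from the three tables.

```python
import math

V_DD = [1.20, 1.50, 1.80, 2.10, 2.40, 2.70, 3.00,
        3.30, 3.60, 3.90, 4.20, 4.50, 4.80]

# Energy dissipation in pJ from Tables 4.6 (PAL), 4.10 (IPGL) and 4.14 (CMOS).
ENERGY_PJ = {
    "PAL": [0.03, 0.06, 0.08, 0.10, 0.12, 0.15, 0.17,
            0.20, 0.22, 0.25, 0.28, 0.31, 0.34],
    "IPGL": [0.66, 1.06, 2.32, 5.69, 11.49, 18.90, 28.61,
             40.82, 55.74, 73.72, 95.04, 120.01, 148.83],
    "CMOS": [0.59, 0.98, 1.48, 2.12, 2.88, 3.78, 4.81,
             6.00, 7.35, 8.88, 10.62, 12.60, 14.85],
}


def loglog_slope(xs, ys):
    """Least-squares slope of ln(y) against ln(x), i.e. the fitted exponent n in y ~ x**n."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    sxx = sum((a - mx) ** 2 for a in lx)
    return sxy / sxx


SLOPES = {name: loglog_slope(V_DD, e) for name, e in ENERGY_PJ.items()}
for name, slope in SLOPES.items():
    print(f"{name}: fitted exponent = {slope:.2f}")
```

A fitted exponent of exactly 2 would indicate a purely quadratic dependence on the supply voltage. With the tabulated data, the PAL exponent comes out below 2, the CMOS exponent between 2 and 3, and the IPGL exponent above 3. A single exponent is only a rough summary, since the local slope changes across the voltage range.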

Table 4.15: Effect of Frequency on CMOS 2:1 MUX

T_ON of /S (ns)   Frequency (MHz)   Energy Dissipation (pJ)   Delay (ns)
80                6.25              0.16                      0.09
72                6.94              0.22                      0.09
64                7.81              0.22                      0.09
56                8.93              0.22                      0.09
48                10.42             0.30                      0.09
40                12.50             0.30                      0.09
32                15.63             0.37                      0.09
24                20.83             0.52                      0.09
16                31.25             0.75                      0.09
8                 62.50             1.48                      0.09
7.2               69.44             1.70                      0.09
6.4               78.13             1.85                      0.09
5.6               89.29             2.14                      0.09
4.8               104.17            2.51                      0.09
4                 125.00            2.95                      0.09
3.2               156.25            3.68                      0.09
2.4               208.33            4.92                      0.09
1.6               312.50            7.32                      0.09
0.8               625.00            14.47                     0.09

Table 4.16: Effect of Load Capacitance on CMOS 2:1 MUX

Load Capacitance (fF)   Energy Dissipation (pJ)   Delay (ns)
2.88                    1.55                      0.10
5.76                    1.86                      0.14
8.64                    2.04                      0.16
11.52                   2.22                      0.18
21.60                   2.84                      0.24
43.20                   4.16                      0.38
64.80                   5.46                      0.52
86.40                   6.77                      0.65
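With Tables 4.7, 4.11 and 4.15 available, the energy of PAL and IPGL relative to CMOS at a given frequency is a simple ratio. The sketch below tabulates it at three of the measured frequencies; the names are illustrative assumptions and the values are copied from the tables.

```python
# Energy dissipation in pJ at selected frequencies (MHz), copied from
# Table 4.7 (PAL), Table 4.11 (IPGL) and Table 4.15 (CMOS).
ENERGY_AT_FREQ = {
    62.5: {"PAL": 0.08, "IPGL": 2.32, "CMOS": 1.48},
    312.5: {"PAL": 0.58, "IPGL": 10.42, "CMOS": 7.32},
    625.0: {"PAL": 1.29, "IPGL": 21.06, "CMOS": 14.47},
}


def relative_to_cmos(row):
    """Return each adiabatic style's energy as a fraction of the CMOS energy."""
    return {name: value / row["CMOS"]
            for name, value in row.items() if name != "CMOS"}


RELATIVE = {freq: relative_to_cmos(row) for freq, row in ENERGY_AT_FREQ.items()}
for freq, ratios in RELATIVE.items():
    summary = ", ".join(f"{name} = {100 * r:.1f}% of CMOS" for name, r in ratios.items())
    print(f"{freq} MHz: {summary}")
```

Note that the IPGL entry at 625MHz is flagged in Table 4.11 as an operating point where the output does not reach 90% of its final level, so that ratio should be read with caution.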

Observations and Comments: The CMOS circuit works well under all parametric analyses. Its delays are the lowest, but its energy dissipation is greater than that of the quasi-adiabatic logic styles. The number of transistors required is also smaller.

4.3 Performance Evaluation of CAL, PAL, IPGL and CMOS

In the next two sections, detailed discussions of the performance of the four logic styles are presented. The simulation results from the test phase are used to compare the performances. The results are divided into two categories so as to analyze the effects on energy dissipation and on delay. The tabular data obtained from the tests is analyzed in MATLAB version 7.5.0. MATLAB offers greater flexibility in plotting the data and annotating the resultant graphs. The text information on the graphs is converted into LaTeX. The figures obtained from MATLAB are then converted into PNG (Portable Network Graphics) format and imported into LaTeX.

4.3.1 Comparison of Energy Dissipation

The energy dissipation of all four logic styles is compared in this section. The effects of supply voltage, frequency and load capacitance on the energy dissipation are discussed.

1. Effect of V_DD Variation: The energy dissipation versus V_DD curve is shown in Figure 4.13. IPGL has the highest energy dissipation, and it exceeds that of CMOS for V_DD > 1.5V. Every adiabatic logic style is energy efficient when it is optimized by selecting an appropriate adiabatic charging time T, V_DD, fan-out, etc. IPGL has a very complex logic structure and its optimized V_DD is approximately 2.5 V_T. Note that the typical threshold voltage is approximately 0.6V for the gpdk 180nm technology. It has been observed that the energy recovery of IPGL is very poor (Figure 4.10). The energy dissipation per input cycle, even after energy recovery, is higher than that of CMOS.
This is due to the higher complexity of the logic style and the non-adiabatic losses in its large number of transistors.

Figure 4.13: Energy Dissipation Versus Supply Voltage Curve

A zoomed-in version of the above curve is shown in Figure 4.14, with V_DD limited to the range 1.2V to 2.7V. It shows that PAL has the lowest energy dissipation and that CAL dissipates less energy than CMOS when V_DD is greater than 1.8V. Thus, it is found in practice that CAL has a lower bound of 3 V_T (about 1.8V for a V_T of approximately 0.6V) for low energy dissipation. The dependence of the energy dissipation on the supply voltage in an adiabatic circuit is quadratic, as per the equation given below:

E_{diss} = \left( \frac{R C_L}{T} \right) C_L (V_T)^2    (4.4)

The above equation assumes a constant-current power-clock supply, which is not used in practice. The ramp-type or sinusoidal power-clock supply used in practical quasi-adiabatic circuits provides only a fairly good approximation of a constant-current supply. So, in practice, it is very difficult to obtain a quadratic dependence of energy dissipation on the supply voltage. Finally, CAL shows an energy gain (reduction) of 40%, whereas PAL shows an energy gain of 90%, over CMOS.

Figure 4.14: Zoomed in Energy Dissipation Versus Supply Voltage Curve

2. Effect of Variation in Switching Frequency: The effect of input frequency on the energy dissipation is depicted in Figure 4.15. It is clear from (4.4) that the energy dissipation increases linearly with the frequency of the input signal (here the frequency of the select input is varied). The time T in the above equation is the time required for either adiabatic charging or discharging; it is not the time period of a periodic waveform. The time periods of the power-clock waveforms (PC) and the input signals (select) in the testing phase of the logic styles are the same. These time periods are integral multiples of T, so the frequency of the input signal is inversely proportional to T. Since (4.4) gives an energy dissipation proportional to 1/T, the energy dissipation is linearly dependent on the input frequency, as seen in Figure 4.15.

The energy dissipation of CAL is higher than that of CMOS for frequencies above 62.5MHz. Thus, CAL is not suited for high-speed applications. The zoomed-in version of the same curve is shown in Figure 4.16. Except for the aberrations in the lower frequency range, the energy dissipation is linearly dependent on the frequency. The CAL energy dissipation is 0.7 to 0.9 times that of CMOS in the frequency range of 10MHz to 60MHz. PAL again turns out to be the best quasi-adiabatic logic style and can work up to 625MHz. The energy consumption of PAL is only about 9% that of CMOS at 625MHz (1.29pJ against 14.47pJ in Tables 4.7 and 4.15). This analysis shows that there is an upper bound on the frequency of operation for every quasi-adiabatic circuit, and the design engineer needs to simulate the test circuit to find this maximum frequency of operation.

3. Effect of Variation in Load Capacitance: Figure 4.17 shows the effect of variations in load capacitance on the energy dissipation. As mentioned in the previous section, the load capacitance is varied in two stages: 2.88fF to 11.52fF and 21.6fF to 86.4fF. There are two distinct regions in the graph. All three quasi-adiabatic logic styles are energy efficient in driving the larger node capacitance. They work satisfactorily for a fan-out of up to four.
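The argument that (4.4) implies a linear dependence of energy on frequency rests on the 1/T factor: halving the charging time T doubles the dissipated energy. The following minimal sketch evaluates the expression for two values of T. The resistance, capacitance, voltage and ramp times used here are hypothetical placeholders chosen only to show the scaling, not values from the experiments.

```python
def adiabatic_energy(r_ohm, c_load_f, t_ramp_s, v_volt):
    """Evaluate (R * C_L / T) * C_L * V**2 in joules, following the form of (4.4)."""
    return (r_ohm * c_load_f / t_ramp_s) * c_load_f * v_volt ** 2


# Hypothetical values, for illustration only.
R_ON = 1e3        # effective path resistance in ohms (assumed)
C_LOAD = 10e-15   # load capacitance in farads (assumed)
V = 1.8           # voltage term of (4.4) in volts (assumed)

e_slow = adiabatic_energy(R_ON, C_LOAD, 4e-9, V)   # ramp time T = 4 ns
e_fast = adiabatic_energy(R_ON, C_LOAD, 2e-9, V)   # ramp time T = 2 ns

print(f"E(T = 4 ns) = {e_slow:.3e} J")
print(f"E(T = 2 ns) = {e_fast:.3e} J")
print(f"ratio = {e_fast / e_slow:.2f}")
```

Because the waveform periods in these tests are fixed multiples of T, doubling the input frequency halves T, which is why the ideal expression predicts energy proportional to frequency.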

Figure 4.15: Energy Dissipation Versus Frequency Curve (up to 625MHz)

Figure 4.16: Energy Dissipation Versus Frequency Curve (up to 62.5MHz)

It appears that IPGL can outperform CAL for capacitive loads greater than 86.4fF or a fan-out of more than four. PAL once again remains the best quasi-adiabatic logic style, consuming only 10% of the energy consumed by CAL and IPGL at C_L = 86.4fF.

Figure 4.17: Energy Dissipation Versus Load Capacitance Curve

4.3.2 Comparison of Delay

The delays measured between the output signal and the select signal for all four logic styles are compared in this section. The effects of frequency and load capacitance on the delay are discussed.

1. Effect of Variation in Switching Frequency: The nature of the quasi-adiabatic output signal is quite different from that of a conventional CMOS signal. In this experiment the select signal is a pulse (square) signal and the output signal is an adiabatic signal. Thus, the delays in the CAL, PAL and IPGL 2:1 MUX circuits are directly proportional to the rise times of the respective power-clock signals. The time periods of the select signal have to be chosen so that they are equal to the time periods of the power-clock signal, and the time periods of the power-clock signals are integral multiples of the rise times. Thus, the delays are directly proportional to the time periods of the input signal and inversely proportional to its frequency. Figure 4.18 shows this inverse relationship between the delay and the frequency for CAL, PAL and IPGL. The CMOS delays are constant for frequencies up to 625MHz for the 2:1 MUX circuit. The delays of CAL and IPGL above 312.5MHz do not follow this inverse relationship, and their outputs are not valid above this frequency. This means that the maximum frequency of operation for these two logic styles is 312.5MHz, as shown in Figure 4.19. The delays in PAL are the highest. In adiabatic theory, higher delays mean better energy recovery, and hence PAL has the lowest energy dissipation.

Figure 4.18: Delay Versus Frequency Curve

Figure 4.19: Zoomed in Delay Versus Frequency Curve

2. Effect of Variation in Load Capacitance: The effect of load capacitance on the delay is shown in Figure 4.20. The directly proportional relationship between the delay and the load capacitance reflects the fact that a larger load capacitance takes more time to charge to the final voltage. Once again, PAL has the highest delay and CMOS has the lowest delay.

Figure 4.20: Delay Versus Load Capacitance Curve

4.4 Conclusions and Remarks

The analysis of CAL, PAL, IPGL and CMOS leads to the following conclusions:

1. PAL is the best quasi-adiabatic logic style among the three. The energy dissipation of PAL is merely 8-15% that of CAL and IPGL under different test conditions.

2. Every quasi-adiabatic logic style outperforms CMOS when it is optimized for energy dissipation. But when these are compared under similar test conditions, CMOS outperforms IPGL whereas CAL and PAL dissipate less energy in comparison with IPGL.

3. Adiabatic logic also confirms the usual inverse relationship between the delay and the energy dissipation. Thus, PAL has the highest delay. The delays are directly proportional to the load capacitance and inversely proportional to the W/L ratio, again as expected.

4. PAL logic performs better even at high frequencies (above 312.5MHz), whereas CAL and IPGL perform only up to 312.5MHz.

5. IPGL has poor energy recovery, a complex logic structure and higher energy dissipation.

6. CAL also has incomplete energy recovery, and some energy remains stranded on the output node.

Finally, it is concluded, based on the literature survey and the above experimentation, that the PAL logic structure is the best among the previously published quasi-adiabatic logic styles. It has only one major drawback, its tri-stated output, but it offers a platform for researchers to apply circuit techniques that further reduce the energy dissipation in quasi-adiabatic logic styles. These results and their analysis encouraged us to explore the PAL logic style further. In the next chapter, the results of this further research are presented.