Area-Delay-Power Efficient Carry Select Adder

Badi Lavanya #1, Y. Sathish Kumar *2
#1 M.Tech (VLSI & Embedded Systems), Swamy Vivekananda Engineering College (SVEB), Kalavarai (Vi), Bobbili (M), Vizianagaram (Dist.), Andhra Pradesh, India
*2 Assistant Professor, Swamy Vivekananda Engineering College (SVEB), Kalavarai (Vi), Bobbili (M), Vizianagaram (Dist.), Andhra Pradesh, India

Abstract
The main objective of this thesis is to provide a high-speed, low-area and power-efficient carry select adder (CSLA) by using RCA-RCA (Ripple Carry Adder) configurations. Researchers have so far applied various techniques at different levels of the design process to reduce power dissipation at the circuit, architectural and system levels. The dynamic power requirement of CMOS circuits is a major concern in the design of personal information systems and large computers. This work adopts a new CMOS logic style called gate diffusion input (GDI). The CSLA uses two individual RCAs with different anticipated carry input values (Cin = 0 and Cin = 1). After the calculation, the appropriate sum and carry-out are selected using a multiplexer, depending on the logic state of the carry input. In recent times, several CSLA architectures have been proposed. The Boolean to Excess-1 converter (BEC) is one of them: the BEC-1 converter is used instead of the RCA with Cin = 1, so that the carry propagation delay (CPD) can be reduced. An alternative approach uses Common Boolean Logic (CBL), which also succeeds in reducing the CPD. Compared with these two CSLA architectures, the architecture proposed in this article shows a significant reduction in the CPD of the binary adder. The proposed CSLA employs a single-stage scheme so that the logic burden can be reduced. In this single-stage architecture, the partial sum is generated for the given input data.
The carry selection is then performed according to the input carry, followed by the full sum generation. Thus, the adder has a single-stage carry selection process. In this article, the sum generator uses a GDI-based XOR gate. The Full Adder (FA) used to perform the sum generation is replaced by a GDI Full Adder, which is an efficient low-power adder. Thus, the proposed adder is a high-speed, low-area and power-efficient adder. The comparison results are discussed in the results section.

Keywords: carry select adder, RCA-RCA (Ripple Carry Adder), Boolean to Excess-1 converter (BEC), low-power design.

I. INTRODUCTION
The design of area- and power-efficient high-speed data path logic systems is one of the most substantial areas of research in VLSI system design. In digital adders, the speed of addition is limited by the time required to propagate a carry through the adder. The sum for each bit position in an elementary adder is generated sequentially, only after the previous bit position has been summed and a carry propagated into the next position. The CSLA is used in many computational systems to alleviate the problem of carry propagation delay by independently generating multiple carries and then selecting a carry to generate the sum [1]. However, the CSLA is not area efficient, because it uses multiple pairs of Ripple Carry Adders (RCA) to generate the partial sum and carry for carry inputs Cin = 0 and Cin = 1; the final sum and carry are then selected by multiplexers (mux). The basic idea of this work is to use a simple combinational circuit instead of the RCA with Cin = 1 and the multiplexer in the regular CSLA, to achieve lower area and power. The main advantage of this logic comes from its lower power compared with the n-bit Full Adder (FA) structure. This brief is structured as follows. The SQRT CSLA has been developed using a simple combinational circuit and compared with the regular SQRT CSLA and ref. [4].

II. REVIEW OF LITERATURE
Fast binary adders with high-speed, low-power and area-efficient designs are in great demand in the IC design industry, because these adders are involved not only in performing binary addition but also play a vital role in other important elements inside digital circuits. To achieve the high-performance goal at the system level, these binary adders must be designed at optimal cost.

Consider the basic Ripple Carry Adder (RCA), which has low circuit complexity and is easy to design and implement. Its major drawback is a large carry propagation delay (CPD). Several designs have been proposed to reduce the CPD so that fast addition can be achieved. The architectures proposed to reduce the CPD are known as fast binary adders. The Carry Select Adder, Carry Look-Ahead Adder, Carry Save Adder, Parallel Prefix Adders, etc. are classified as fast adders.

Addition is the most widely used arithmetic operation in digital computation. It is performed in all digital signal processing applications, digital multiplications, signal transformations and various other applications. The performance of the adders has a great impact on the performance and efficiency of the digital system. There is great demand for low-delay, low-area adders for Digital Signal Processing (DSP), digital communication, filtering, transformation, mobile computing and other applications.

The disadvantage of the ripple carry adder is that the addition for every bit starts only after the arrival of its carry input. The key idea behind the carry-select adder is to use two RCAs, in
which one is connected to a constant-0 carry-in while the other is connected to a constant-1 carry-in. The CSLA chooses the actual output from the pre-computed values using the multiplexer. The pre-computed values are obtained by performing the addition for the two alternative values Cin = 0 and Cin = 1. The multiplexer chooses the appropriate pre-computed values once the external Cin arrives at the select line. With this technique, both the time spent waiting for the carry information and the time spent performing the addition are reduced. This increases the speed of the CSLA. On the other hand, it fails to conserve area, and it is therefore often considered not area efficient. Hence, designing an area-efficient CSLA is a challenging task for VLSI engineers. This paper attempts to modify the conventional CSLA to meet these challenges. In digital electronics, an adder is a digital circuit that performs addition of numbers. Adders can be classified into 1-bit adders and multi-bit adders, and 1-bit adders are further categorized into half adders and full adders. These are not only used in the ALU but also in memory for

III. PROPOSED GDI BASED ADDER
The GDI method is based on the use of a simple cell, as shown in Fig. 1. At first glance, the basic cell reminds one of the standard CMOS inverter, but there are some important differences. The GDI cell contains three inputs: G (common gate input of the NMOS and PMOS), P (input to the source/drain of the PMOS), and N (input to the source/drain of the NMOS). The bulks of both the NMOS and PMOS are connected to N or P (respectively), so the cell can be arbitrarily biased, in contrast with a CMOS inverter. Most of the designed circuits were based on the F1 and F2 functions. The reasons for this are as follows. Both F1 and F2 are complete logic families (they allow realization of any possible two-input logic function).
F1 is the only GDI function that can be realized in a standard p-well CMOS process, because the bulk of any NMOS is constantly and equally biased. When the N input is driven at a high logic level and the P input is at a low logic level, the diodes between the NMOS and PMOS bulks to Out are directly polarized and there is a short between N and P, resulting in static power dissipation. It must be remarked that not all of the functions are possible in a standard p-well CMOS process, but they can be successfully implemented in twin-well CMOS or silicon-on-insulator (SOI) technologies.

Table 1: GDI Based Boolean Functions

Analysis of GDI Circuit:
In this section, we analyze GDI circuits. First we explain their operation and analyze their transient behavior. Then we consider swing restoration issues and switching characteristics.

Operational Analysis of GDI Cell
As mentioned in Section I, one of the common problems of pass-transistor logic (PTL) design methods is the low swing of output signals caused by the threshold drop across the single-channel pass transistors. In existing PTL techniques, additional buffering circuitry is used to overcome this problem. To understand the effects of the low-swing problem in a GDI cell, we suggest the following analysis, which is based on the example of the F1 function and can easily be extended to other GDI functions. Table II presents a full set of logic states and related functionality modes of F1. The fact that demands special emphasis is that in about 50% of the cases the GDI cell operates as a regular CMOS inverter, which is widely used as a digital buffer for logic-level restoration.

In this section, the analysis of a basic GDI cell was presented. The operational and transient analysis was performed, as well as a comparison of the switching characteristics of CMOS and GDI, showing the advantages of GDI in terms of delay, number of transistors, and power consumption. Several drawbacks, mostly related to the connection of inputs to MOSFET wells, have to be mentioned: the threshold drop and, in some cases, an increased diffusion input capacitance (both also exist in PTL techniques and were considered in simulations and analysis); and the relative increase of circuit area because of separated MOSFET wells (comparisons based on real layouts will be presented). However, those drawbacks are mostly compensated by the advantages of GDI circuits.
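The logic behavior of the GDI cell described above can be sketched in a few lines of Python. This is an idealized model only: it treats the cell as a switch (the PMOS passes P when G is low, the NMOS passes N when G is high) and ignores the threshold drop, bulk biasing and static power issues discussed in this section. The input assignments for each function follow the commonly published GDI function table, which is not reproduced in this text, so they should be read as an assumption; all function names are illustrative.

```python
def gdi_cell(g: int, p: int, n: int) -> int:
    """Idealized GDI cell: output follows P when G = 0 and N when G = 1."""
    return n if g else p


# Two-input functions obtained by tying P or N to a constant (assumed table).
def f1(a: int, b: int) -> int:
    return gdi_cell(g=a, p=b, n=0)        # F1 = (not A) and B


def f2(a: int, b: int) -> int:
    return gdi_cell(g=a, p=1, n=b)        # F2 = (not A) or B


def gdi_or(a: int, b: int) -> int:
    return gdi_cell(g=a, p=b, n=1)        # A or B


def gdi_and(a: int, b: int) -> int:
    return gdi_cell(g=a, p=0, n=b)        # A and B


def gdi_not(a: int) -> int:
    return gdi_cell(g=a, p=1, n=0)        # behaves as a CMOS inverter


def gdi_mux(a: int, b: int, c: int) -> int:
    return gdi_cell(g=a, p=b, n=c)        # B when A = 0, C when A = 1


def gdi_xor(a: int, b: int) -> int:
    # Two cells: an inverter for B, then a cell that selects B or not-B by A.
    return gdi_cell(g=a, p=b, n=gdi_not(b))


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, f1(a, b), f2(a, b), gdi_xor(a, b))
```

Each function above uses a single two-transistor cell, except the XOR, which uses two cells (one of them configured as an inverter).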
Fig. 1: CMOS and GDI structure

IV. BLOCK DIAGRAM OF CONVENTIONAL CSLA AND BEC-BASED CSLA
The main objective is to identify redundant logic operations and data dependence. The CSLA has two units: 1) the sum and carry generator unit (SCG) and 2) the sum and carry selection unit. Various logic designs have been suggested for efficient implementation.

Fig. 2: (a) Conventional CSLA (b) Logical operations of RCA

V. DESIGNING LEAF CELLS IN GDI
The examples of GDI functions given in Table I refer only to the extension of a single-input CMOS inverter structure to a triple-input GDI cell, in order to implement complicated logic functions with a minimal number of transistors. Actually, this approach can be defined in a more general form: any n-input CMOS structure can be extended to a GDI cell by introducing an input P instead of the supply voltage in the PMOS block of the CMOS structure and an input N instead of ground in the NMOS block.

Fig. 4: GDI comparisons with other logic styles

Leaf Cells Comparisons
The GDI cell structure implies that the only power consumed is through the inputs, as GDI cells are fed only by the previous circuits. A similar phenomenon is partially observed in most PTL circuits, but there the power consumption from the source is caused by CMOS buffers, which are included in every regular PTL. Yet, in real circuits and simulations, current flow from the sources can be measured in GDI. It is caused by buffers that are connected between cascaded cells. Hence, a fair comparison between the techniques must be based on measurements carried out on a series of cells with buffers and not on a single cell. GDI and TG test circuits contain two basic cells with one output buffer. N-PG contains two buffers, one after each cell. CMOS has no buffers in the test circuits.

Comparisons and Results: For each technique, average power, maximal delay, and number of transistors were measured. The results are given in Table IV.
Number of transistors comparison: Among all the design techniques, GDI proves to have the minimal number of transistors. Each GDI gate was implemented using only two transistors. The worst case with respect to transistor count is the CMOS MUX gate. In this sense, the PTL techniques proved to be inferior compared to GDI.
Fig. 5: Transistor Comparison

GDI Adder:
The Gate Diffusion Input technique is a novel low-power technique used to build logic circuits. It requires fewer transistors than other CMOS techniques. The extra circuitry required is the pre-charge circuitry. The adder circuit using this technique is implemented in Figure 3.10. As mentioned, one important requirement of full adder cells, especially at low voltage, is to provide enough driving power to the following circuits. The drivability is ensured by the full signal swing and the decoupling of inputs and outputs (at least one inverter per cell), so that the adder cell can be cascaded arbitrarily and work reliably in any circuit configuration. The second stage of the full adder cells, which generates Sum and Carry, must therefore have enough drivability. In addition, there are several choices of circuits to generate the Sum signal. We use a circuit similar to that of the TFA, but fully exploit the available XOR and XNOR outputs from stage I to allow a single inverter to be attached at the last stage. The output inverter guarantees sufficient drive to the cascaded cells.

Fig. 6: GDI full adder

Conventional SQRT CSLA Architecture:
The ripple carry adder is very easy to implement, but it has a large CPD, which reduces the speed of the adder. As an alternative to this approach, several techniques to limit the CPD have been proposed. Along these lines, the carry save adder and parallel prefix architectures have also attracted major attention by reducing the CPD. A new architecture, referred to as the square root CSLA (SQRT CSLA), was proposed in view of the above limitation. The SQRT CSLA typically has an inherent ability to handle large-bandwidth addition with minimum delay [4-8]. The circuit diagram of the conventional SQRT CSLA is shown in Fig. 7.

Fig. 7: Conventional SQRT CSLA Adder
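The conventional carry-select principle described above (two ripple carry adders per group, one assuming Cin = 0 and one assuming Cin = 1, followed by a multiplexer) can be illustrated with a small behavioral model. This is a minimal sketch rather than the circuit of Fig. 7: the function names are hypothetical, and the group widths [2, 2, 3, 4, 5] used in the example are an assumed 16-bit square-root style grouping, since the text does not state the group sizes.

```python
from typing import List, Tuple


def ripple_carry_add(a: int, b: int, cin: int, width: int) -> Tuple[int, int]:
    """Add two width-bit operands one bit at a time, rippling the carry."""
    total, carry = 0, cin
    for i in range(width):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        total |= (ai ^ bi ^ carry) << i
        carry = (ai & bi) | (carry & (ai ^ bi))
    return total, carry


def carry_select_add(a: int, b: int, cin: int,
                     group_widths: List[int]) -> Tuple[int, int]:
    """Behavioral carry-select adder; group_widths lists bits per group, LSB first."""
    result, carry, shift = 0, cin, 0
    for w in group_widths:
        mask = (1 << w) - 1
        ga, gb = (a >> shift) & mask, (b >> shift) & mask
        # Both candidates are computed without waiting for the incoming carry.
        sum0, cout0 = ripple_carry_add(ga, gb, 0, w)
        sum1, cout1 = ripple_carry_add(ga, gb, 1, w)
        # Multiplexer: the incoming carry only selects between the two results.
        group_sum, carry = (sum1, cout1) if carry else (sum0, cout0)
        result |= group_sum << shift
        shift += w
    return result, carry


if __name__ == "__main__":
    groups = [2, 2, 3, 4, 5]  # assumed grouping, 16 bits in total
    print(carry_select_add(0xFFFF, 0x0001, 0, groups))  # (0, 1)
    print(carry_select_add(1234, 4321, 1, groups))      # (5556, 0)
```

The model makes the area cost visible: every group evaluates `ripple_carry_add` twice, which is the duplication the BEC, CBL and GDI-based variants aim to remove.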
Logical Analysis
The CSLA consists of two elements: the sum and carry generator (SCG) unit and the sum and carry selection unit [9]. The critical path of the SCG unit is large, as it contains the larger share of the design elements in the CSLA design.

Fig. 8: Logic formulation structure

Fig. 9: input XOR gate using GDI and CMOS layout diagram of 2-input XOR-GDI at 90 nm technology

The logical circuit of the adder consists of the following blocks: the Half Sum Generator (HSG) unit, the Full Sum Generator (FSG) unit, the Carry Generator (CG) unit and the Carry Selector (CS) unit. The CG unit has carry generators for the two possible values of the carry input: CG0 for input carry 0 and CG1 for input carry 1. The inputs of the HSG are the input operands of the adder (A and B); it produces the half-sum and half-carry information without considering the external carry input. CG0 and CG1 obtain the half sum (S0) and half carry (C0) from the HSG unit and generate two n-bit full-carry words, for input carry 0 and 1 respectively. The carry selector block is controlled by Cin: using Cin, the final carry is selected from the two carry words produced by the CG unit. If Cin = 0, the CS block selects the carry word from CG0; otherwise it selects the carry word from CG1. The CS unit consists of an n-bit 2-to-1 MUX. This helps to optimize the CS block. The CS block is described in figure 2 (g), which consists of n AND-OR gates [...] by 60%. The CSLA architecture has one stage of HSG and FSG. The key components of the HSG and FSG are XOR gates. In this article, the HSG and FSG were implemented using XOR gates built with Gate Diffusion Input (GDI).

Fig. 10: (c) CMOS layout for Carry Generator (CG) at 90 nm (d) CMOS layout for 2-bit CSLA adder

Proposed XOR-GDI
The performance comparison between various CSLA architectures and the proposed CSLA architecture with the GDI-XOR gate is shown below in Table V.

Design      Area      Delay (Sum)   Delay (Cout)   ADP (μm²·μs)
Conv        1438.12   3.45          3.35           4.96
CSLA [6]    1282.45   4.08          4.08           5.23
CSLA [7]    906.56    3.78          3.78           3.42
CSLA [8]    1654.65   7.3           7.3            12.08
CSLA [1]    951.09    3.87          3.42           3.68
Proposed    683.5     3.41          3.25           3.21

Table 2: Comparison between Proposed and Existing CSLA Architecture

VI. RESULTS AND DISCUSSION
The Gate Diffusion Input (GDI) circuits are much smaller than the CMOS circuit elements, 30% faster, and consume 85% less power. The combination of CMOS and GDI circuits provides the optimal solution. The GDI circuit for the Half Adder is shown in the figure. The CMOS layout for the XOR-GDI gate designed at 90 nm technology is shown in figure 3.
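The HSG, CG, CS and FSG structure described in the Logical Analysis can be checked with a short bit-level model. This is a sketch under stated assumptions, not the authors' circuit: the text describes the blocks only in words, so the carry recurrence used here (carry out = half carry OR (half sum AND carry in)) is the standard full-adder carry equation, and the AND-OR form of the selector relies on the fact that a carry produced for input carry 0 is also produced for input carry 1. All function names are hypothetical, and Python's `^` operator stands in for the GDI XOR gates.

```python
from typing import List, Tuple


def hsg(a: int, b: int, n: int) -> Tuple[List[int], List[int]]:
    """Half Sum Generator: per-bit half sum (XOR) and half carry (AND)."""
    s0 = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(n)]
    c0 = [((a >> i) & 1) & ((b >> i) & 1) for i in range(n)]
    return s0, c0


def carry_generator(s0: List[int], c0: List[int], assumed_cin: int) -> List[int]:
    """CG0 (assumed_cin = 0) or CG1 (assumed_cin = 1): full-carry word."""
    carries, carry = [], assumed_cin
    for half_sum, half_carry in zip(s0, c0):
        carry = half_carry | (half_sum & carry)
        carries.append(carry)
    return carries


def carry_selector(carry_if0: List[int], carry_if1: List[int], cin: int) -> List[int]:
    """CS unit as n AND-OR gates: equivalent to a 2-to-1 MUX on each bit."""
    return [x | (cin & y) for x, y in zip(carry_if0, carry_if1)]


def fsg(s0: List[int], carry: List[int], cin: int) -> List[int]:
    """Full Sum Generator: XOR each half sum with the carry into that bit."""
    carry_into_bit = [cin] + carry[:-1]
    return [s ^ c for s, c in zip(s0, carry_into_bit)]


def csla_add(a: int, b: int, cin: int, n: int) -> Tuple[int, int]:
    """Single-stage carry-select addition of two n-bit operands."""
    s0, c0 = hsg(a, b, n)
    carry = carry_selector(carry_generator(s0, c0, 0),
                           carry_generator(s0, c0, 1), cin)
    total = sum(bit << i for i, bit in enumerate(fsg(s0, carry, cin)))
    return total, carry[-1]


if __name__ == "__main__":
    print(csla_add(0b1011, 0b0110, 1, 4))  # (2, 1): 11 + 6 + 1 = 18 = 16 + 2
```

In this formulation the carry is selected before the final sum is formed, so only one set of sum XOR gates is needed; this is where replacing the HSG and FSG XOR gates with GDI XOR gates has its effect.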
Fig. 11: GDI based FA layout design

VII. CONCLUSIONS
In this article, the logic operations in the conventional CSLA, CSLA-BEC, CSLA-CBL and the logic formulation of the CSLA architecture were analyzed. The XOR gate in the HSG and FSG was then replaced with a GDI-based XOR gate and its performance was compared. The GDI technique uses far fewer transistors and consumes much less power than CMOS logic gates. By using this GDI-based XOR gate, the CSLA adopts these features and its performance improves, making it well suited to high-speed, low-power and low-area requirements.

REFERENCES
[1] Siva Rongali, Rajanbabu Mallavarapu: High Speed Carry Select Adder using GDI Technique. 2nd International Conference on Micro-Electronics, Electromagnetics and Telecommunications (ICMEET 2016), vol. 67, Jan. 2017.
[2] Basant Kumar Mohanty, Sujit Kumar Patel: Area-Delay-Power Efficient Carry Select Adder. IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 61, no. 6, pp. 418-422, June 2014.
[3] Rohit Tripati, Paresh Rawat: An Efficient Low-Power High Speed Digital Circuit by using 1-bit GDI Full Adder Circuit. International Journal of Engineering Trends and Technology (IJETT), vol. 36, no. 3, pp. 155-160, June 2016.
[4] I-chyn W., Cheng-chen H., Yi-Sheng L., Chien Chang P.: An Area-Efficient Carry Select Adder Design by Sharing Common Boolean Logic Term. In: IEEE International Symposium on Circuits and Systems (2005).
[5] Kore, S. D., K.B.V.S.: Modified Carry Select Adder Using Binary Adder as a BEC-1. European Journal of Scientific Research, vol. 103, no. 1, pp. 156-164 (2013).
[6] Bedrij, O. J.: Carry-Select Adder. IRE Transactions on Electronic Computers, pp. 340-344 (1962).
[7] He, Y., Chang, C. H., Gu, J.: An area-efficient 64-bit square root carry-select adder for low power applications. In: IEEE International Symposium on Circuits and Systems (2005).
[8] M. S., S. V.: An Efficient SQRT Architecture of Carry Select Adder Design by Common Boolean Logic (2013).
[9] Parhami, B.: Computer Arithmetic: Algorithms and Hardware Designs, 2nd ed. New York, NY, USA: Oxford University Press (2010).