VLSI DESIGN OF DIGIT-SERIAL FPGA ARCHITECTURE
Journal of Circuits, Systems, and Computers
Vol. 13, No. 1 (2004) 17-52
© World Scientific Publishing Company

HANHO LEE
School of Information and Communication Engineering, Inha University, Incheon 402-751, Korea

GERALD E. SOBELMAN
Department of Electrical and Computer Engineering, University of Minnesota, 200 Union Street SE, Minneapolis, MN 55455, USA

Received May 2
Revised 28 October 2002

This paper presents a novel application-specific field-programmable gate array (FPGA) architecture that supports efficient implementation of digit-serial DSP architectures on a digit-wide basis. Digit-serial DSP designs have been an effective implementation method for FPGAs. To efficiently realize a digit-serial DSP design on FPGAs, one must create an FPGA architecture optimized for those types of systems. We examine the various circuits used in digit-serial DSP designs to extract the key features that should be reflected in the new FPGA architecture. We explain the design methodology, layout and implementation of the new digit-serial FPGA architecture. Digit-serial DSP designs using the digit-serial FPGA (DS-FPGA) are compared to those implemented on Xilinx FPGAs. We have estimated that the DS-FPGA are about times more efficient in area and faster than the equivalent digit-serial DSP architectures implemented using Xilinx FPGAs.

Keywords: VLSI; FPGA; DSP; digit-serial; architecture.

1. Introduction
Field-programmable gate arrays (FPGAs) are of interest for use in digital signal processing (DSP) systems because they can implement custom hardware solutions while still maintaining flexibility through device reprogramming. FPGAs provide a configurable structure through an array of adjustable logic modules interconnected by programmable routing resources and surrounded by programmable input/output (I/O) blocks. The main constraints on FPGA architectures are limited
routing resources, limited I/O resources and large routing delays. Under these circumstances, a digit-serial approach has been shown to be an effective implementation style for FPGAs. 5 In practical DSP applications, it may be desirable to combine the area efficiency of a bit-serial architecture with the time efficiency of a corresponding bit-parallel architecture into a single area/time-efficient digit-serial architecture. 6 9 The implementation methods of digit-serial architectures have been proposed in Refs. Moreover, digit-level pipelined and bit-level pipelined digit-serial multipliers that can be used to further increase the performance of digit-serial architectures have been proposed in Refs. 7. It was demonstrated in Refs. 3 that the area-time efficiency and performance of digit-serial architectures are considerably better than those of bit-serial and bit-parallel architectures on FPGAs. Several digit-serial arithmetic circuits and DSP architectures using FPGAs were presented in Refs. 5.

This paper shows that by focusing on a specific class of digit-serial DSP applications, we may increase the area and speed efficiency of FPGAs significantly. However, general-purpose FPGAs, such as those described in Refs. 4 and 5, do not offer area-efficient realization for a certain class of digit-serial DSP applications, and are better suited to state machines and a wide range of logic functions. To efficiently realize digit-serial DSP designs on FPGAs, one must create an FPGA architecture targeted to digit-serial DSP architectures that can compensate for the weaknesses of general-purpose FPGAs and accelerate the performance substantially. 6 We examine the various circuits used in digit-serial DSP designs to extract the key features that should be reflected in the new FPGA architecture. Key to the suitability of the FPGA for these applications is the fact that each of its basic blocks is capable of processing a digit size of up to 4 bits.
The targeted digit-serial DSP systems may contain several digit-level and bit-level pipelined digit-serial datapaths of various digit sizes. They may also have irregular control logic in some portions. Thus, our digit-serial FPGA (DS-FPGA) architecture must contain some bit-level programmability, yet take advantage of the high degree of regularity that exists in digit-serial datapaths. The DS-FPGA architecture makes possible a more efficient realization of those digit-serial architectures, and static RAM programming technology is used to provide in-circuit reprogrammability.

This paper is organized as follows. Section 2 gives an overview of the digit-serial approach. The DS-FPGA architecture and its major components are presented in Sec. 3. In Sec. 4, we discuss the various circuit design issues that must be considered. Routing architecture design issues are addressed in Sec. 5. The layout style, performance and fabrication results for the DS-FPGA architecture are presented in Sec. 6. Example digit-serial DSP implementations using the proposed DS-FPGA are described in Sec. 7. Section 8 presents a performance comparison between the DS-FPGA and Xilinx FPGAs. Our conclusions are summarized in Sec. 9.
2. Digit-serial Approach
Previous architectures have primarily focused on two approaches: bit-serial and bit-parallel implementations. Bit-serial designs process one input bit of a word (or sample) at a time. The advantages of these systems include fewer interconnections, fewer pin-outs, less internal hardware, faster clock speed, and lower power consumption. Their main disadvantage is that they are slow: for a word length of W bits, a bit-serial architecture requires W clock cycles to compute one word or sample. Therefore they are primarily suited for low to medium speed applications. Bit-parallel systems process all input bits of a word in one clock cycle and are the most common implementation style. Their main advantage is that they can compute one word in one clock cycle and therefore can provide high performance and are ideal for high-speed applications. Their disadvantages include larger chip area, more interconnections and pin-outs, and higher power consumption.

To avoid the disadvantages of bit-serial and bit-parallel computation, the concept of digit-serial implementation has been proposed in recent years. 6 3,7 The digit-serial approach offers a flexible trade-off between the bit-serial and bit-parallel approaches, and between data throughput and the size of arithmetic operators. A system based on this approach can combine the high throughput of parallel computation with the small operator size of serial computation. Bit-serial systems, which process one bit of the input sample in one clock cycle, have very localized routing and are area-efficient in FPGAs. 8,9 On the other hand, bit-parallel systems, which process one whole word of the input sample in one clock cycle, require many modules, so the routing resources are often insufficient and result in large routing delays.
However, in applications which require moderate sample rates, both of these systems may be ineffective: the bit-serial systems may be too slow, while the bit-parallel systems may be faster than necessary and occupy a considerable amount of area. Digit-serial systems are therefore best suited for implementing digital signal processing systems which require moderate sampling rates. In a digit-serial arithmetic implementation, the W bits of a data word are processed in units of the digit size, N bits, over W/N clock cycles, serially one digit at a time with the least significant digit first. This leads to arithmetic operators that have smaller area than equivalent bit-parallel arithmetic designs and larger throughput than equivalent bit-serial arithmetic designs. Architectures based on the digit-serial approach may offer the best overall trade-off between speed, efficient area utilization, throughput, I/O pin limitations and power consumption. By considering a range of values for the digit size, one can search the design space to find the optimum implementation for a given application.

The implementation methods of digit-serial architectures have been proposed. 6 3,7 The first approach is to start with a bit-parallel structure and then use folding to obtain the digit-serial architecture. 6 8 The second approach is to start with a bit-serial architecture and then use unfolding to obtain the digit-serial
architecture. 7 The major drawback of the architectures based on these approaches is that they cannot be pipelined at the bit level, which has severely limited their throughput. This could be a major obstacle for high-speed applications. The main reason why these structures cannot be pipelined is the existence of carry feedback loops, which are impossible to pipeline. Recently, digit-serial architectures that can be pipelined at the bit level have been reported. 2 The use of carry feed-forward has removed the major bottleneck caused by the carry feedback loops of conventional digit-serial designs.

Fig. 1. (a) Digit-serial adder, (b) digit-serial subtractor, (c) digit-serial adder/subtractor, and (d) digit-serial comparator with N = 4 bits.

The possibility of a high degree of pipelining offered
by the structure in Refs. 2 increases the throughput rate of the digit-serial architectures.

Fig. 2. (a) Bit-level pipelined digit-serial adder and (b) bit-level pipelined digit-serial subtractor with N = 4 bits.

A basic element in a digit-serial DSP implementation is the digit-serial adder shown in Fig. 1(a). A digit-serial adder with a digit size (N) of 4 bits is a circuit that adds four pairs of bits along with a previous carry bit and produces a sum digit and a new carry bit. The two operands, A and B, are fed one digit at a time into the digit-serial adder. The addition is done N bits at a time, with the carry rippling from one full adder to the next. The carry-out from the digit-serial adder is fed back into the first full adder during the next clock cycle, when the next pair of input digits has arrived. Several examples of digit-serial arithmetic circuits are shown in Figs. 1-3.
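The digit-serial addition described above can be illustrated with a small behavioural model. The following Python sketch is not taken from the paper: the function names `to_digits` and `digit_serial_add` are hypothetical, operands are assumed to be unsigned, and the word length W is assumed to be a multiple of the digit size N. Each loop iteration stands for one clock cycle, and the variable `carry` plays the role of the carry register that feeds the carry-out back into the first full adder.

```python
def to_digits(value, word_bits, digit_bits):
    """Split an unsigned word into N-bit digits, least significant digit first."""
    if word_bits % digit_bits != 0:
        raise ValueError("word length must be a multiple of the digit size")
    mask = (1 << digit_bits) - 1
    return [(value >> shift) & mask for shift in range(0, word_bits, digit_bits)]


def digit_serial_add(a, b, word_bits=8, digit_bits=4):
    """Behavioural model of a digit-serial adder (illustrative assumption).

    Returns (sum modulo 2**word_bits, final carry, clock cycles used).
    """
    mask = (1 << digit_bits) - 1
    carry = 0    # carry register, assumed cleared by reset before the first digit
    result = 0
    cycles = 0
    a_digits = to_digits(a, word_bits, digit_bits)
    b_digits = to_digits(b, word_bits, digit_bits)
    for position, (da, db) in enumerate(zip(a_digits, b_digits)):
        total = da + db + carry                 # N full adders plus carry-in
        result |= (total & mask) << (position * digit_bits)
        carry = total >> digit_bits             # fed back for the next cycle
        cycles += 1
    return result, carry, cycles


if __name__ == "__main__":
    # W = 8: N = 1 is bit-serial, N = 4 is digit-serial, N = 8 is bit-parallel.
    for n in (1, 4, 8):
        print(n, digit_serial_add(0xA7, 0x5C, word_bits=8, digit_bits=n))
```

Sweeping `digit_bits` in this model shows the trade-off stated in the text: the number of cycles per word is W/N, while the adder width per cycle is N.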
Fig. 3. (a) Unsigned digit-level pipelined N = 2 digit-serial multiplier and (b) unsigned digit-level pipelined N = 4 digit-serial multiplier.
A digit-level pipelined digit-serial multiplier, shown in Fig. 3, can be implemented by unfolding the structure of a bit-serial multiplier. 7 One input of this multiplier is parallel while the other is digit-serial with the least significant digit presented first. The output is also digit-serial with the least significant digit first. Chang has presented a digit-serial multiplier design that can be pipelined at the bit level, which results in higher processing speeds. The bit-level pipelined digit-serial multiplier contains digit-cells, a digit-serial 3:2 compressor and a digit-serial adder. Each digit-cell consists of a partial product generator module and a carry-save adder (CSA) module. A simple digit-serial 3:2 compressor is used to reduce the carry-save adder array outputs to two digits. A digit-serial adder is then used to add these two digits to generate the final digit-serial outputs.

3. Digit-serial FPGA Architecture
A digit-serial FPGA (DS-FPGA) architecture is composed of digit-serial logic blocks (LBs), a programmable interconnect architecture and I/O blocks, as shown in Fig. 4. The overall structure of the LB and the routing architecture are explained in this section.

Fig. 4. DS-FPGA architecture.

3.1. Digit-serial logic block
The LB is a logic-based core cell which is simple and straightforward. Figure 5(a) shows a simplified schematic of the LB, consisting of four main parts; digit-serial
Fig. 5. (a) Digit-serial logic block (LB) diagram and (b) detail of LB of DS-FPGA.
logic array, fast-carry logic, carry-type select logic, and register array. The detailed LB structure is shown in Fig. 5(b). The LB is formed in a digit-serial structure and operates on N = 4 bit operands. This leads to direct implementation of basic mathematical functions, such as digit-serial adders, subtractors and multipliers, in the LB. The LBs can be connected together through the interconnection resources to implement digit-serial adders, subtractors, and multipliers of any digit size. The LB can also realize several bit-wise logical operations, including AND, OR, XOR, etc. It has 26 inputs and 9 outputs plus clock (CK), set/reset (SR) and clock enable (CE) inputs.

Table 1. The configuration settings for the LB.
Bit            Meaning
SIR            High if D(3:0) is the direct input to the D-FFs
SRi            High if Ri(3:0) is the direct input to the D-FFs
SRA            High if the D-FFs with outputs SO(3:0) are used
SRA            High if the D-FFs with outputs CO(2:0) are used
SCO3           High if the D-FF with output CO3 is used
SbitM3(1:0)    Low-Low if N = 2 digit-serial circuits are mapped onto the LB;
               High-Low if N = 3 digit-serial circuits are mapped onto the LB

The programming configuration bits in Table 1 are shared across identically programmed datapath slices in an LB. This sharing of programming bits reduces the total number of SRAM cells, resulting in higher density and faster reconfiguration. The low number of SRAM bits also simplifies the programming of the LB, which is desirable for reconfigurable computing. The LB supports efficient implementation of N = 2, 3 and 4 digit-serial DSP applications as well as random control circuits. The LB architecture uniquely combines both fine and coarse logic granularity for optimum logic utilization and high performance. High logic utilization is provided by the fine-grained logic modules, which can implement random logic functions without wasting device resources.
The coarse-grained structure of the four fully interconnected logic modules provides fast operation and efficient routing with minimal signal skew for digit-serial DSP architectures.

Digit-serial logic array
The digit-serial logic array is composed of four small logic modules (LMs), which are the smallest units of logic in the LB; each LM is a logic-based core cell. The logic-based core cell method uses fewer transistors than the look-up table (LUT) method. Figure 6 depicts the structure of the LM, which has five data inputs, four data outputs and a configuration bit. The propagate (P) and generate (G) outputs are used as inputs to the fast-carry logic in the LB. A large number of logic functions can be implemented by using an appropriate subset of the inputs and tying the remaining inputs of an LM high or low, as shown in Table 2. Each logic module can implement arithmetic functions such as a full-adder, subtractor and
some combinational functions. Table 3 shows the programming table of configurable SRAM cells for mapping the arithmetic functions. A single logic array can implement N = 2, 3 and 4 digit-serial adders/subtractors, unsigned digit-serial multiplier modules with partial product, or two's complement digit-serial multiplier modules. Based on our observation, these operations are widely used in digit-serial DSP applications. The LM can also realize several bit-wise logical operations, including AND, OR, XOR, etc.

Fig. 6. The structure of logic module (LM).

Table 2. Logic functions which can be implemented by a logic module. Columns: function name; function at SO; function at CO; input pattern (A B C D E); configuration bit (Sbit). Functions listed: full-adder, subtractor, half-adder, multiplier cell, INV, AND2, NAND2, OR2, NOR2, XOR2, XNOR2, MUX 2:1, MUXB 2:1 and AND-XOR.

Fast-carry and carry-type select logic
The LB provides fast-carry logic that bypasses the ripple-carry interconnect structure for N = 4 digit-serial circuits using a carry look-ahead method. Because the LMs already produce the propagate (P) and generate (G) signals, the fast-carry logic in the LB is very simple. The fast-carry logic greatly increases the efficiency and performance of digit-serial adders, subtractors and multiplier building blocks with N = 4.
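The idea of obtaining simple gates by tying some inputs of a logic module high or low, and of reusing the propagate and generate signals for the fast-carry logic, can be sketched in Python. This is only an illustration built on a plain full-adder cell: the names `full_adder_cell` and `derived_gates` are hypothetical, and the sketch does not reproduce the actual input patterns or configuration bits of Table 2.

```python
from itertools import product


def full_adder_cell(x, y, z):
    """One-bit full-adder cell with propagate/generate outputs (illustrative).

    Returns (SO, CO, P, G) with SO = x XOR y XOR z and CO = x.y + z.(x + y).
    """
    p = x ^ y            # propagate: a carry-in would be passed on
    g = x & y            # generate: a carry is produced regardless of carry-in
    so = p ^ z           # sum output
    co = g | (p & z)     # carry output, equal to x.y + z.(x + y)
    return so, co, p, g


def derived_gates(x, y):
    """Two-input gates obtained by tying the third input low or high."""
    so_low, co_low, _, _ = full_adder_cell(x, y, 0)    # z tied low
    so_high, co_high, _, _ = full_adder_cell(x, y, 1)  # z tied high
    return {"XOR2": so_low, "AND2": co_low, "XNOR2": so_high, "OR2": co_high}


if __name__ == "__main__":
    for x, y in product((0, 1), repeat=2):
        print(x, y, derived_gates(x, y))
```

With the third input tied low the cell behaves as a half-adder (XOR at SO, AND at CO); with it tied high the same cell yields XNOR and OR, which is the kind of reuse the text describes.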
Table 3. Programming of configurable SRAM cells (DS = digit-serial). Columns: arithmetic function; configuration bits Sbit(:), SbitM3(1:0), SIR, SRI, SRA, SRA, SCO3. Functions listed: INV, AND2, NAND2, OR2, NOR2, XOR2, XNOR2, MUX 2:1, MUXB 2:1, AND-XOR, full-adder, D flip-flop, N = 4 DS adder, N = 4 DS subtractor and N = 4 DS multiplier cell.

As shown in Fig. 5(b), three 3:1 multiplexers placed after the LMs select one of three possible carry options via two configurable bits. This structure constitutes the carry-type select logic, which selects either the ripple-carry chains, the fast-carry logic or the carry-save array. The ripple-carry chains sequentially connect all the LMs in an LB, supporting N = 2 digit-serial circuits. The carry from a lower-order bit moves forward to the higher-order bit via the carry chain.

Register array
Our DS-FPGA is a register-rich architecture which makes possible a high degree of pipelining, leading to increased performance. The register array contains a multiplexer to select the output, an edge-triggered D flip-flop and output drivers. In total, there are nine D flip-flops in each LB that can be combined to form an 8-bit register. The eight multiplexers in front of the D flip-flops select either carry-save operation or direct inputs via configurable bits. The D flip-flops share a common clock (CK), clock enable (CE), and set/reset (SR). Alternatively, the inputs (Ri(3:0)) can be used as direct inputs to the registers, which is frequently done to implement shift registers. The final outputs SO(3:0) and CO(3:0) can be either the direct outputs from the multiplexers or the outputs from the D flip-flops.

Routing architecture
The routing framework contains predefined segmented wires in the vertical and horizontal directions, as shown in Fig. 7. The DS-FPGA has two groups of routing resources. One is internal routing resources and the other is the external
Fig. 7. DS-FPGA routing architecture. Labels: LB with internal input routing and internal output routing, connection blocks, switch blocks with buffered switches, and single-length, double-length, quad-length and long-length lines.
routing resources. Internal routing is the routing between logic block pins; it provides the rich, direct routing resources needed in digit-serial circuits. These lines provide very fast signal transmission with short delay. The connections between adjacent logic blocks, which occur frequently in digit-serial circuits, are implemented via direct lines without passing through any slow switch block. The feedback interconnect provides feedback paths from the LB's outputs (CO(3:0)) to its inputs (Ci) without consuming any external routing resources. A signal routed to one of the LB pins is buffered at the input or output. The buffers at the LB pins effectively isolate the capacitive load of the drain capacitance of the pass-transistors from the routing segments, so the routing delay through the routing segments is greatly reduced.

External routing employs connection blocks and switch blocks to permit the interconnection of individual LBs. Switch blocks are connected to single-length, double-length, quad-length and long-length line segments in four directions. Switch blocks provide connectivity with the routing segments using 2 programmable switches. Connection blocks provide connectivity between LBs and routing segments using programmable switches. Pass-transistor switches add series resistance to DS-FPGA routing paths, resulting in long delays for long paths. Longer segmentation of interconnect lines has been used to address this issue. At very large device sizes, lines are heavily loaded, and the wire resistance slows down signals on the line. Signal propagation delay depends on the number of switches that the signal passes through.

4. Circuit-Level Design Issues for LB
In this section we discuss the circuit-level design of the LB. Throughout the discussion, various trade-offs among supply voltage, logic style and performance are evaluated.
The most important parameter controlling power consumption is the supply voltage, due to the squared term in the power consumption equation. 2 Thus, supply voltage reduction is the most effective way to reduce the power consumption. However, this method presents a tough challenge in the design of FPGAs, since most of these structures make extensive use of pass-transistor logic. The problem of supply voltage reduction is further exacerbated in processes that do not have low-threshold devices. In these processes, lowering the supply voltage below 2.5 V results in a dramatic loss of performance and even causes some circuits to malfunction. Therefore, such a supply voltage reduction requires new design methods for low-voltage and low-power integrated circuits. Circuit topologies that help to reduce the supply voltage are discussed.

4.1. Impact of logic style
The logic style used in logic gates influences the speed, size, power dissipation, and wiring complexity of a circuit. The circuit delay is determined by the number of transistors in series, the transistor sizes (i.e., channel widths), and the intra- and inter-cell wiring capacitances. Circuit size depends on the number of transistors and their sizes and on the wiring complexity. Power dissipation is determined
by the switching activity and the node capacitances. All these characteristics may vary considerably from one logic style to another and thus make the proper choice of logic style crucial for circuit performance. Various investigations of logic styles with respect to low-power dissipation have recently been carried out and reported in the literature. 2,2 In these publications, CPL and related pass-transistor logic styles are promoted as low-power logic styles, because CPL gates have fewer transistors, smaller transistors and smaller capacitances, and are faster than gates in complementary CMOS. However, these circuits have limited drive capability at a low supply voltage. Although the degraded signal level can still drive other circuits correctly at a high supply voltage, it cannot guarantee proper operation at a low supply voltage. Therefore, the threshold voltage loss must be alleviated and a full voltage swing is needed to get a correct signal level at a low supply voltage.

Logic module circuit
XOR and MUX constitute the critical part of the logic module (LM) in the LB. A 7-transistor XOR circuit 22 has been used to implement the LM circuit. Performance comparisons of several XOR circuits are presented in Ref. 23. The investigation results presented there show that the 7-transistor XOR performs much better than CPL and complementary static CMOS XOR. A double pass-transistor logic (DPL) MUX is used to improve circuit performance at reduced supply voltages. Because of the presence of both NMOS and PMOS devices, the output of the DPL MUX circuit has a full voltage swing and there is no static short-circuit current problem. The investigation results presented in Ref.
2 show that for all simple and complex logic gates such as two-input NAND (NAND2), two-input NOR (NOR2) and three-input and-or-invert (AOI), complementary static CMOS outperforms CPL and other pass-transistor logic styles with respect to circuit delay, power dissipation, power-delay product, and layout size. CMOS also shows the highest robustness and smallest sensitivity to transistor and voltage scaling. This makes complementary CMOS the logic style of choice for low-power, low-voltage implementation of the LM circuit. However, other logic styles, such as pass-transistor XOR and MUX, are still viable candidates for low-power, high-speed implementation of the LM circuit. A transistor-level schematic diagram of the proposed LM is depicted in Fig. 8. Addition of the complementary transistor allows the circuit to operate at V_dd = 3.3 V without the threshold loss that plagued the NMOS-only pass-transistor design. The LM circuit has 25% less delay and 37% less power consumption than an LM circuit using static CMOS circuits. The proposed LM has good signal output levels and low power consumption at a low supply voltage (V_dd = 2 V). Table 4 shows the circuit delays of the LM obtained by HSPICE simulation.

Fast-carry logic circuit
We have designed the fast-carry logic circuit using compact static CMOS carry look-ahead (CLA) circuits. This fast-carry look-ahead circuit is based on transistor
sharing in multiple-output static CMOS complex gates to reduce the transistor count and improve the operation speed of the whole circuit. 24

Fig. 8. Transistor-level schematic of logic module.

Table 4. Path delays of LM circuit.
Voltage    A -> P     A -> G     A -> SO    A -> Carry
2.0 V      .7 ns      .53 ns     .58 ns     2.39 ns
3.3 V      .54 ns     .7 ns      .67 ns     .8 ns

We define C_i as the carry of the ith stage, and A_i and B_i as the ith bits of the input data; then C_i is expressed as

$$C_i = G_i + P_i C_{i-1}, \qquad G_i = A_i B_i, \qquad P_i = A_i \oplus B_i.$$

Expanding this yields

$$C_i = G_i + P_i G_{i-1} + P_i P_{i-1} G_{i-2} + \cdots + P_i P_{i-1} \cdots P_0 C_{in}.$$
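The carry recurrence and its expanded form can be checked numerically. The Python sketch below reads the recurrence as C_i = G_i + P_i C_{i-1} with G_i = A_i B_i, P_i = A_i XOR B_i and the carry into stage 0 equal to C_in (this indexing is an assumption made here for illustration). The function names are hypothetical, and the code models only the Boolean behaviour, not the transistor-sharing CMOS circuit.

```python
def lookahead_carries(a, b, c_in, n=4):
    """Carries C_0..C_{n-1} from the expanded look-ahead expression."""
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(n)]  # propagate
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(n)]  # generate
    carries = []
    for i in range(n):
        c = 0
        prefix = 1                      # running product P_i P_{i-1} ... P_{j+1}
        for j in range(i, -1, -1):
            c |= prefix & g[j]          # term P_i ... P_{j+1} G_j
            prefix &= p[j]
        c |= prefix & c_in              # term P_i ... P_0 C_in
        carries.append(c)
    return carries


def ripple_carries(a, b, c_in, n=4):
    """Carries C_0..C_{n-1} from the recurrence C_i = G_i + P_i C_{i-1}."""
    carries = []
    c = c_in
    for i in range(n):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        c = (ai & bi) | ((ai ^ bi) & c)
        carries.append(c)
    return carries


if __name__ == "__main__":
    print(lookahead_carries(0b1011, 0b0110, 1))
    print(ripple_carries(0b1011, 0b0110, 1))
```

Both functions return the same carries for every 4-bit operand pair; the look-ahead form computes each carry directly from the P and G signals instead of waiting for the previous carry.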
Practically, the number of look-ahead stages is limited to four and the term C_3 is expressed as

$$C_3 = G_3 + P_3\{G_2 + P_2[G_1 + P_1(G_0 + P_0 C_{in})]\}.$$

Figure 9(a) shows the carry chain of the 4-bit CLA block, which yields C_0, C_1 and C_2 as well as C_3. By inserting a PMOS transistor with gate input P_i into the pull-up part, we can isolate the pull-up part of carry C_i from the pull-up part of every carry C_j (j > i) according to the earlier logic redundancy method. In the pull-down part, however, P_i and G_i cannot both be high because P_i is the XOR and G_i is the AND function of input operands A_i and B_i for i = 1, 2 and 3; therefore, there is no other discharging path to ground except the original pull-down part of carry C_i, which makes the addition of redundant transistors unnecessary. The fast-carry logic circuit is composed of chains of MOSFETs serially connected between a power supply rail and the output of the subcircuit. These serially connected MOSFETs are a major source of delay and power dissipation; therefore, optimal sizing of these transistors is important in reducing the delay and power dissipation of these circuit structures. The channel width tapering method 25 has been used to reduce the delay and power dissipation of the serially connected MOSFET chains in the fast-carry logic circuit.

Glitch-free TSPC D flip-flop
The D flip-flop (D-FF) is used heavily throughout the DS-FPGA in order to make possible a high degree of pipelining, leading to increased performance. The D-FFs share a common clock (CK), clock enable (CE), and set/reset (SR), and they can be set/reset locally or globally. Enhancing D-FF speed can lead to a higher clock rate. Power dissipated in the clock distribution network has usually been a substantial part of the total system power consumption. Therefore, it is important to minimize the number of global clocks, as well as the gate capacitance associated with the clock nets.
To accomplish this, the true single-phase clocking (TSPC) methodology has been proposed, with the basic register shown in Ref. 26. The TSPC scheme has the inherent advantage that clock skew problems are restricted to the proper distribution of only one clock phase. It is shown in Ref. 27 that, although the proportion of power consumption due to glitching varies significantly with the particular circuit (from 9% to 38%), the hazard/glitch power consumption cannot be neglected in static CMOS circuits. Therefore, the glitch-free TSPC D-FF presented in Ref. 28 is used to implement the D-FF with clock, set/reset and clock enable in the LB, as shown in Fig. 9(b). To ensure that the output does not discharge when y is high at evaluation, an NMOS transistor MN4, controlled by y_b, is inserted into the output stage of the conventional TSPC D-FF. Transistor size optimization also improves circuit speed by a factor of 1.5 to 1.8.
Fig. 9. (a) Fast-carry logic circuit and (b) glitch-free TSPC D flip-flop with clock, set/reset and clock enable.
5. Routing Architecture Design Issues
A key aspect in the design of an FPGA is its routing architecture, which comprises the resources that are used to interconnect the device's logic blocks. A large number of different routing architecture issues were investigated in Refs. 29 and 3. Several architectural and circuit design choices affect the amount of capacitance that is charged and discharged within an FPGA design. The electrical design of FPGA interconnect circuits was investigated in Ref. 3. The number of metal layers available in the selected process technology influences the final area and power of an FPGA array. In addition, the sizing and number of switches within the interconnect fabric can also seriously affect power. Lastly, capacitance is affected by the connection of the basic logic cell to the surrounding interconnect and to the neighboring cells.

Given that interconnect wiring is a crucial resource in an FPGA, reducing the intrinsic wiring capacitance by using upper layers of metal can help lower net capacitances. However, the necessity of reaching the active layers to insert switches leads to the addition of several contacts and vias to lower levels. Another important design consideration that affects the resulting FPGA capacitance is the size and number of switches present on each interconnect segment. The interconnect network consists of programmable switches that are organized in connection blocks and switch blocks. The performance of FPGAs is mainly limited by the delay through the interconnect's programmable switches. This delay increases quadratically with the number of series switches and linearly with the number of switches loading each node, and is especially a problem when the programmable switches are implemented using MOS transistors, since these have an appreciable resistance and capacitance.
The FPGA has higher signal delay because of (a) the channel resistance of the pass-transistors connecting segments of wire, (b) the parasitic capacitance of the off transistors, and (c) branches of extra wire that are not on the source-to-sink path. An approximate discrete analysis of a line modeled as n cascaded RC sections (of equal R and C) yields

$$t_n = RC\,\frac{n(n+1)}{2}. \qquad (1)$$

As can be seen from Eq. (1), the total delay depends quadratically on the number of sections and linearly on the resistance and the capacitance of each section. The accumulation of quadratic delay can be limited by inserting repeaters that consist of pairs of unidirectional tristate buffers.

A tradeoff for the switch size must be reached in order to obtain a good result. Given the projected delay and the prospective load to be driven, a tradeoff can be found between the buffer strength and the CMOS switch size in terms of required speed and area cost. This tradeoff tries to minimize not only the total delay from the beginning of the interconnection to the end of the multi-switch line but also the local delays from the beginning of the line to each intermediate point after
an interconnection element. As a consequence, the average delay is reduced and a significant area reduction is achieved.

This section discusses the tradeoff of the NMOS pass-transistor switch size used for programmable interconnections as a function of the buffer strength and the desired delay for a given load. The repeater interval and the optimum buffer size in the interconnection network are then determined for maximum speed.

5.1. NMOS pass-transistor switch sizing

SRAM-based FPGAs normally use NMOS pass-transistors to implement routing switches, and this kind of switch has significant series resistance and parasitic capacitance. The sizing and implementation of switches throughout the FPGA interconnect has a large impact on the resulting power consumption and speed. As the switch size increases, delay decreases slowly while energy increases nearly linearly. To account for the wiring capacitance, a worst-case estimate of the wiring parasitics was made and added to each node. A series of simulations was performed to assess more accurately the tradeoffs in switch size and implementation, using the interconnection network shown in Fig. 10. Figure 11 shows the measured delay and energy-delay product from the input port to the node point after 2 switching sections. Figure 11(a) shows that increasing the NMOS pass-transistor switch width from .2 to 3.3 µm significantly reduces delay, but there are diminishing returns beyond that point. The delay flattens out because, although resistance drops as the transistor size is increased, parasitic capacitance increases.
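The flattening of delay with switch width, together with the nearly linear growth in energy, can be reproduced with a toy first-order model. This is a sketch under assumptions that are not from the paper: the on-resistance is taken as inversely proportional to width, the switch parasitic capacitance as proportional to width, and every constant (`r_unit`, `c_wire`, `c_par_per_um`, the section count) is a hypothetical placeholder rather than an HSPICE-derived value.

```python
def chain_delay(width_um, n_sections=8, r_unit=10e3,
                c_wire=60e-15, c_par_per_um=2e-15):
    """First-order delay (s) through n pass-transistor sections of width W."""
    r_switch = r_unit / width_um                # on-resistance falls as 1/W
    c_node = c_wire + c_par_per_um * width_um   # parasitic capacitance grows with W
    return r_switch * c_node * n_sections * (n_sections + 1) / 2


def chain_energy(width_um, n_sections=8, vdd=3.3,
                 c_wire=60e-15, c_par_per_um=2e-15):
    """Energy (J) to charge every node once: n * C_node * Vdd^2."""
    c_node = c_wire + c_par_per_um * width_um
    return n_sections * c_node * vdd ** 2


if __name__ == "__main__":
    for w in (1.0, 2.0, 4.0, 8.0):
        d, e = chain_delay(w), chain_energy(w)
        print(f"W = {w:4.1f} um  delay = {d * 1e9:6.2f} ns  "
              f"energy-delay = {d * e:.3e} J*s")
```

In this model the delay is proportional to `c_wire / W + c_par_per_um`, so widening the switch only shrinks the first term and the delay approaches a floor set by the switch's own parasitics, while the energy keeps rising with width.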
Assuming a wiring capacitance of 6 fF, a switch size of approximately 3.3 µm is optimal for pass-transistor interconnect with a high fanout.

Fig. 10. Interconnect network using pairs of unidirectional tristate buffers.

Fig. 11. (a) Delay versus switch size and (b) energy-delay product versus switch size through 2 switches (repeater interval = 6).

5.2. Repeater interval and buffer optimum size

It is well known that an NMOS pass-transistor can transmit a logic 0 completely, but performs poorly when transmitting a logic 1. In the latter case, one incurs a voltage drop of $V_{tn}$, where $V_{tn}$ is the threshold voltage of the NMOS transistor. The NMOS pass-transistor is effective at pulling low, so the inverter's PMOS transistor is fully turned on, giving a solid low-to-high transition. The maximum voltage that can be passed through the NMOS transistor is $V_{dd} - V_{tn}$. Since this degraded $V_{dd} - V_{tn}$ level cannot fully turn on the NMOS transistor in the inverter, the fall time is longer than the rise time. To achieve the same rise time and fall time at the buffer output, we need to increase the size of the NMOS transistor in the first stage of the buffer. Figure 12(a) shows the tristate buffer optimized for equal rise time and fall time.

Figure 10 shows an interconnection network with tristate buffer repeaters. The repeater interval k is defined as the number of switches between two nodes that contain repeaters. Figure 12(b) compares the simulated propagation delay through 36 switch sections for varying repeater interval k. The curve shows the average propagation delay for a chain that uses a tristate buffer repeater optimized for minimum delay with equal rising and falling delays. Increasing the repeater interval from 2 to 6 significantly reduces delay, but there are diminishing returns beyond that point. Above a sufficiently large repeater interval, the signal is no longer transmitted from the input to the node point after 36 switch sections. Therefore, a repeater interval of 6 gives a fast interconnect network.

Fig. 12. (a) Tristate buffer for repeater and (b) delay through 36 loaded switch sections versus repeater interval.

Figure 13 shows the results of an experiment measuring the effect of the channel widths of the switch and the buffer on the speed of the interconnection network. Increasing the switch width significantly reduces delay up to a width of 3.3 µm, whereas increasing the buffer width yields no large delay reduction. Figure 14 shows the delay models of the DS-FPGA routing structure using the optimum-size pass-transistor switch and buffers.

Fig. 13. Delay as a function of the switch width and buffer width (length = 0.6 µm).

We have identified the tradeoff of the NMOS pass-transistor switch size used for programmable interconnections as a function of the buffer strength and the desired delay for a given load. A switch size of approximately 3.3 µm is optimal for pass-transistor interconnect with a high fanout. The repeater interval in the interconnection network was determined for maximum speed. A tradeoff can be reached that minimizes the area penalty and the average delay without appreciably increasing the overall delay.
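The repeater-interval tradeoff can be sketched by combining the quadratic RC-chain delay of each unbuffered segment with a fixed delay per repeater. This is a toy model, not the paper's HSPICE experiment: the per-section time constant `rc_ns` and buffer delay `t_buf_ns` are hypothetical placeholders, the function names are illustrative, and the model ignores the $V_{tn}$ signal degradation that eventually stops propagation at large intervals, so it is not expected to reproduce the reported optimum.

```python
def segment_delay(k: int, rc_ns: float) -> float:
    """Delay (ns) of k unbuffered series sections: RC * k * (k + 1) / 2."""
    return rc_ns * k * (k + 1) / 2


def chain_delay_with_repeaters(total_sections: int, k: int,
                               rc_ns: float, t_buf_ns: float) -> float:
    """Delay (ns) of a chain with one repeater after every k sections.

    A shorter final segment, if any, is also assumed to end in a buffer.
    """
    if k < 1:
        raise ValueError("repeater interval must be at least 1")
    full, rest = divmod(total_sections, k)
    delay = full * (segment_delay(k, rc_ns) + t_buf_ns)
    if rest:
        delay += segment_delay(rest, rc_ns) + t_buf_ns
    return delay


def best_interval(total_sections, rc_ns, t_buf_ns, candidates):
    """Candidate interval with the smallest modeled delay."""
    return min(candidates, key=lambda k: chain_delay_with_repeaters(
        total_sections, k, rc_ns, t_buf_ns))


if __name__ == "__main__":
    RC_NS, T_BUF_NS = 0.02, 0.16   # hypothetical values
    for k in (1, 2, 3, 4, 6, 9, 12, 36):
        d = chain_delay_with_repeaters(36, k, RC_NS, T_BUF_NS)
        print(f"k = {k:2d}: {d:.2f} ns")
```

Small intervals pay for many buffers and large intervals pay the quadratic wire penalty, so the modeled optimum sits near the square root of `2 * t_buf_ns / rc_ns`; with the placeholder values above that is k = 4.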
Fig. 14. Delay model of DS-FPGA routing structure. Annotated path delays: 2.76 ns, 2.57 ns and 2.38 ns.
6. Physical Design and Fabrication

6.1. Area and speed

The proposed DS-FPGA cell has been implemented as a full-custom VLSI design in order to extract its physical characteristics. The custom layout of the DS-FPGA cell was done in a 0.5 µm Hewlett-Packard (HP) CMOS process with three metal layers. Figure 15(a) shows the floorplan of the DS-FPGA tile and its major building blocks.

Fig. 15. (a) Floorplan of DS-FPGA tile and (b) layout of DS-FPGA prototype chip. Floorplan labels: switch block, C block, LB, input internal routing, output internal routing, vertical routing track, driver.
Table 5. Circuit delays of DS-FPGA cell measured from HSPICE simulation.

Path                                                                        Delay (ns)
Combinational logic
  LM inputs → LB outputs (SO(3:0)) (without D-FF)                           2.9
  LM inputs → LB outputs (SO(3:0)) (with D-FF)                              3.86
  LM inputs → LM output (cm)                                                .85
  LM inputs → LM output (P)                                                 .54
  LM inputs → 2nd LM outputs (cm) (ripple-carry mode)                       3.53
LB fast-carry logic
  P() and Ci → fast-carry logic output (C)                                  .42
  P() and Ci → fast-carry logic output (C2)                                 .44
  P() and Ci → fast-carry logic output (C3)                                 .47
Sequential delays
  Fast-carry logic output → D-FF output → LB output                         2.57
Routing track delays
  Through two connection blocks and four switch blocks, including buffers
  (spanning 4 LBs)                                                          2.76
  Through four switch blocks, including buffered switch (spanning 4 LBs)    2.38
Path delays
  N = 4 path delay (A → CO3)                                                4.33
  N = 2 critical path delay (using ripple-carry logic) (A → SO1)            6.
  N = 3 critical path delay (using fast-carry logic) (A → SO2)              6.6
  N = 4 critical path delay (using fast-carry logic) (A → SO3)              6.

The major components include the LB, connection block and switch block. The area of the LB core is 230 µm × 210 µm and the area of each tile (which contains the LB, connection block and switch block) is 600 µm × 420 µm = 252,000 µm². The LB core makes up 19% of the total area, while the routing resources make up the remaining 81%. In particular, the routing resources such as the connection block and switch block take up significant area.

We used HSPICE to measure the circuit and routing track delays for the proposed DS-FPGA cell and to verify the functionality of our layout. Table 5 shows the circuit and routing track delays for the proposed DS-FPGA cell at $V_{dd}$ = 3.3 V. The routing track delays have been estimated from HSPICE simulation. These results are used to determine the speed of digit-serial arithmetic circuits implemented using the proposed DS-FPGA cells. The critical path delay between the input and output pins of one LB, including direct connection, is 6.
ns.

6.2. Fabrication results

A prototype chip of the DS-FPGA has been fabricated using the 0.5 µm HP CMOS process with three metal layers; the total layout is shown in Fig. 15(b). It contains only four tiles, which were enough to build the digit-serial adder and digit-serial multiplier. This chip has 40 pads, including four power and ground pads, 28 signal pins into the DS-FPGA core, and eight programming pins. From this chip, we were able
to make delay measurements that included one part of the logic block. Based on these measurements, the pad-to-pad delay through part of the critical path of the LB and two switch blocks was 9 ns, giving a delay of about 7.5 ns when the pad delay is eliminated. Consideration of the propagation delays for the DS-FPGA suggests that digit-level pipelined digit-serial multipliers with throughput as fast as 5 MHz may be achieved.

7. Digit-Serial Datapath Circuit Implementations on DS-FPGA

In this section, we give an overview of the methodology used for technology mapping of digit-serial circuits onto the DS-FPGA. The DS-FPGA provides substantial support for implementing digit-serial arithmetic building blocks for digital systems such as FIR filters, DCT circuits and similar computationally intensive structures. Figure 16 shows how the LB can be used in various ways.

Digit-serial arithmetic modules can be implemented using LBs in the DS-FPGA. Each N = 4 digit-serial arithmetic module is typically implemented using 1 to 2 LBs with a logic depth of 1 or 2 LBs, which leads to high clock frequency operation. Each digit-serial arithmetic module consists of a single cell, as in adders and subtractors, or of multiple cells proportional to the word length, as in digit-serial multipliers and registers.

A single LB can be used to implement an unsigned N = 4 digit-serial multiplier module or a two's complement N = 4 digit-serial multiplier module, as shown in Figs. 16(a) and 16(b). The outputs can be registered for pipelining; otherwise the D-FFs are available for independent usage, bypassing the logic modules. X(3:0) and Ybit(1:0) are the 4-bit and 2-bit data for the digit-serial multiplier modules. PI(3:0) and PO(3:0) are the input and output partial products. As shown in Figs. 16(a) and 16(b), the four AND gates and the 4-bit adder required for an N = 4 digit-serial multiplier module are contained in one LB. The fast-carry logic can be used for N = 4 digit-serial circuits to increase speed.
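The "four AND gates plus a 4-bit adder" structure of the N = 4 multiplier module can be captured in a small behavioral model. This is a sketch under assumptions, not the paper's netlist: the serial input is treated as a single bit `y_bit`, the registers and the two's complement sign handling are omitted, and the function names are hypothetical. A second helper chains two modules through the carry to show one way a wider (N = 8) module could be formed.

```python
def multiplier_module(x: int, y_bit: int, p_in: int, c_in: int = 0):
    """Behavioral model of one N = 4 module: four AND gates and a 4-bit adder.

    Returns (p_out, c_out): the 4-bit partial-product sum and the carry out.
    """
    if not (0 <= x < 16 and 0 <= p_in < 16):
        raise ValueError("x and p_in must be 4-bit values")
    if y_bit not in (0, 1) or c_in not in (0, 1):
        raise ValueError("y_bit and c_in must be single bits")
    partial = x if y_bit else 0      # each bit of X is AND-ed with the Y bit
    total = partial + p_in + c_in    # 4-bit adder with carry input
    return total & 0xF, total >> 4


def stacked_module_8(x: int, y_bit: int, p_in: int):
    """Two modules chained carry-out to carry-in, giving an 8-bit-wide module."""
    low, carry = multiplier_module(x & 0xF, y_bit, p_in & 0xF)
    high, c_out = multiplier_module(x >> 4, y_bit, p_in >> 4, carry)
    return (high << 4) | low, c_out


if __name__ == "__main__":
    print(multiplier_module(0b1011, 1, 0b0110))
    print(stacked_module_8(0xAB, 1, 0x7C))
```

Keeping the carry as an explicit output mirrors the way a carry chain lets the same building block serve both narrow and wide operands.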
The digit-serial multiplier modules can easily be stacked together to form deeper and/or wider multipliers. For example, to implement an N = 8 digit-serial multiplier module, the carry output (CO3) of one LB is connected to the carry input (Ci) of the next LB, thus requiring two LBs. A single LB can be used to implement two N = 2 digit-serial multiplier modules using the ripple-carry chain of the LB, as shown in Fig. 16(c), or four independent full adders/subtractors, as in a row of a carry-save array, as shown in Fig. 16(d).

Each logic module within the LB can also be configured to implement random logic without wasting device resources. Each logic module can be configured to implement the following random logic gates: AND2, NAND2, OR2, NOR2, XOR2, XNOR2, MUX2, MUX, etc., as shown in Table 2. If the LB is used to implement bit-serial circuits, one LM can implement the bit-serial circuit and the remaining LMs can be used for random logic gates.

Fig. 16. (a) Unsigned digit-level pipelined N = 4, (b) two's complement N = 4, (c) unsigned N = 2 and (d) bit-level pipelined N = 4 digit-serial multiplier module implementations onto LB.

We have designed some digit-serial multipliers using LBs in the DS-FPGA, as shown in Fig. 17. The digit-serial multipliers have been designed so that routing is kept regular and well organized. An unsigned digit-level pipelined N = digit-serial multiplier implementation using LBs is shown in Fig. 17(a). Each block is replaced by the digit-serial multiplier module shown in Fig. 16(a). In order to increase the throughput of the digit-serial multiplier, the architecture is pipelined at the digit level. In the example shown, the pipelining limits the carry propagation to a 4-bit adder in the N = 4 digit-serial multiplier. The partial products presented to the shifting accumulator are generated by the logical AND of the input serial bit with each bit of the parallel input. However, the critical path delay of the unsigned digit-level pipelined N = 4 digit-serial multiplier is (AND + 4 full adders + D-FF) delay. Reducing the critical path delay below this value is not possible because of the presence of feedback loops.

Fig. 17. (a) Implementation of unsigned digit-level pipelined N = digit-serial multiplier and (b) digit-cell for unsigned bit-level pipelined N = 4 digit-serial multipliers using LB.

The critical path of this N = 4 digit-serial multiplier on the DS-FPGA is found to be $T_{LM} + T_{Fast} + T_{FF} + T_r$. Here, $T_{LM}$ is the propagation delay of the LM within the LBs, and $T_{Fast}$ and $T_{FF}$ are, respectively, the propagation delays of the fast-carry logic and the flip-flop within the LBs. $T_r$ is the delay incurred in the routing between LBs. Since the proposed DS-FPGA can implement significant digit-serial arithmetic functions within a single LB, routing is only required to support the implementation of wide operand structures. Finally, the critical path of the overall digit-serial multiplier is determined by the worst critical path of its constituent multiplier modules. Therefore, the maximum possible sampling frequency of the N = 4 digit-serial multiplier is

$$f = \frac{4}{W\,(T_{LM} + T_{Fast} + T_{FF} + T_r)}.$$

The unsigned bit-level pipelined digit-serial multiplier contains digit-cells, a digit-serial 3:2 compressor adder and a digit-serial adder.
A digit-cell can be configured using six LBs, as shown in Fig. 17(b), and a simple digit-serial 3:2 compressor adder can first be used to reduce the digit-cell output digits to two digits. A digit-serial adder is then used to add these two digits to generate the final digit-serial outputs. However, when the design is mapped onto the DS-FPGA, the critical path cannot be reduced below that of an N = 4 digit-serial adder because of the feedback loop in the final digit-serial adder. The resulting critical path of bit-level pipelined digit-serial multipliers is $2T_{LM} + T_{FF} + T_r$.

8. Results

To evaluate the advantages of the DS-FPGA, we need to compare the area and speed efficiency of the DS-FPGA architecture with those of general-purpose FPGAs. To determine the number of FPGA logic blocks needed to implement a circuit, we have mapped several digit-serial DSP architectures onto the DS-FPGA using direct hand-mapping, which ensures the most efficient logic usage, and then estimated the silicon area in each case. The area cost of an FPGA implementation is estimated by the number of logic blocks required to implement a digit-serial DSP architecture.
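The evaluation approach described here, in which area is counted in logic blocks and speed follows from the critical path, can be sketched in a few lines. This is an illustrative calculation only: the function names are hypothetical, W is assumed to denote the word length and N the digit size, and the delay and tile-area numbers used in the example are placeholders rather than the values of Table 5.

```python
import math


def digit_level_critical_path(t_lm, t_fast, t_ff, t_r):
    """Critical path (ns) of the digit-level pipelined multiplier."""
    return t_lm + t_fast + t_ff + t_r


def bit_level_critical_path(t_lm, t_ff, t_r):
    """Critical path (ns) of the bit-level pipelined multiplier."""
    return 2 * t_lm + t_ff + t_r


def max_sampling_frequency(word_length, digit_size, critical_path_ns):
    """Samples per second when a W-bit word is processed N bits per clock."""
    cycles_per_word = math.ceil(word_length / digit_size)
    clock_hz = 1e9 / critical_path_ns
    return clock_hz / cycles_per_word


def area_cost_um2(num_logic_blocks, tile_area_um2):
    """Area estimate: number of logic-block tiles times the area of one tile."""
    return num_logic_blocks * tile_area_um2


if __name__ == "__main__":
    # Placeholder delays in ns, chosen only to exercise the formulas.
    t_path = digit_level_critical_path(t_lm=1.0, t_fast=0.5, t_ff=2.5, t_r=1.0)
    f = max_sampling_frequency(word_length=16, digit_size=4,
                               critical_path_ns=t_path)
    print(f"critical path = {t_path} ns, sampling frequency = {f / 1e6} MHz")
    print(f"area = {area_cost_um2(6, 1000.0)} um^2 for 6 tiles")
```

Separating the critical-path expressions from the frequency and area helpers makes it easy to compare the digit-level and bit-level pipelining styles for the same word length.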
More informationUNIT-III GATE LEVEL DESIGN
UNIT-III GATE LEVEL DESIGN LOGIC GATES AND OTHER COMPLEX GATES: Invert(nmos, cmos, Bicmos) NAND Gate(nmos, cmos, Bicmos) NOR Gate(nmos, cmos, Bicmos) The module (integrated circuit) is implemented in terms
More informationDesign & Analysis of Low Power Full Adder
1174 Design & Analysis of Low Power Full Adder Sana Fazal 1, Mohd Ahmer 2 1 Electronics & communication Engineering Integral University, Lucknow 2 Electronics & communication Engineering Integral University,