REDUCING POWER DISSIPATION IN NETWORK ON CHIP BY USING DATA ENCODING SCHEMES


1 B. HEMALATHA, 2 G. MAMATHA
1,2 Department of Electronics and Communication, J.N.T.U., Ananthapuram
E-mail: 1 hemabandi7@gmail.com, 2 mamathasashi@gmail.com

Abstract: Nowadays, the power dissipated by the links of a network-on-chip (NoC) is higher than the power dissipated by the other components of the communication subsystem, such as the routers and network interfaces (NIs). In this paper we present a set of data encoding schemes aimed at reducing the power dissipated by the links of an NoC without any modification of the routers or the architecture. In these schemes, data is encoded in packets before it enters the network links, so no modification of the routers or links is required and, apart from the encoder and decoder blocks of the proposed schemes, area is not affected.

Keywords: On-chip interconnections, data encoding, network-on-chip (NoC), network interface (NI).

I. INTRODUCTION

Moving from one silicon technology node to the next results in faster and more power-efficient gates but slower and more power-hungry wires [1]. More than half of the power is dissipated in interconnects in current processors, and this is expected to rise to 65%-80% over the next several years [2]. Global interconnect length does not scale with smaller transistors and local wires. Chip size remains relatively constant because the chip function continues to increase, and RC delay increases exponentially. At 32/28 nm, for instance, the RC delay in a 1-mm global wire at the minimum pitch is 25× higher than the intrinsic delay of a two-input NAND gate with a fan-out of 5 [1]. While the raw computation capacity keeps growing through instancing more and more cores in a single silicon die, scalability issues, due to the need for efficient and reliable communication between the increasing number of cores, become a problem [3].
The network-on-chip (NoC) communication paradigm [4] is recognized as the most viable way to tackle the scalability and reliability issues that characterize the ultra-deep-submicron era. Nowadays, on-chip communication issues are as relevant as, and in some cases more relevant than, computation-related issues [4]. Indeed, the communication subsystem increasingly affects the traditional design objectives, including area (i.e., silicon area), performance, power dissipation, energy consumption, and reliability. As technology shrinks, the communication subsystem accounts for a more significant fraction of the total power budget of a complex system-on-chip.

In this paper, we focus on techniques aimed at reducing the power dissipated by the network links. In fact, the power dissipated by the network links is as relevant as that dissipated by routers and network interfaces (NIs), and its contribution is expected to increase as technology scales [5]. In particular, we present data encoding schemes operating at flit level and on an end-to-end basis, which allow us to minimize both the switching activity and the coupling switching activity on the links of the routing paths traversed by the packets. The proposed encoding schemes, which are transparent with respect to the router implementation, are presented and discussed at both the algorithmic level and the architectural level, and evaluated by means of simulation. The analysis considers several aspects and metrics of the design, including silicon area, power dissipation, and energy consumption. The results show that, by using the proposed encoding schemes, up to 51% of power can be saved without any degradation in performance.

II. RELATED WORK

In the next several years, the availability of chips with 1000 cores is foreseen [6]. In these chips, a significant fraction of the total system power budget is dissipated by interconnection networks.
Therefore, the design of power-efficient interconnection networks has been the focus of many works published in the NoC literature. These works concentrate on different components of the interconnection networks, such as routers, NIs, and links. Since the focus of this paper is on reducing the power dissipated by the links, in this section we briefly review some of the works in the area of link power reduction. These include techniques that make use of shielding [7], [8], increasing line-to-line spacing [9], [10], and repeater insertion [11]. They all increase the chip area. Data encoding is another technique that has been used to reduce the link power dissipation. The data encoding techniques may be classified into two categories. In the first category, encoding techniques concentrate on lowering the power due to the self-switching activity of individual bus lines while ignoring the power dissipation owing to their coupling switching activity. This category includes bus invert (BI) [12] and INC-XOR [13]. Techniques in the second category, such as those in [14] and [15], target the coupling switching activity. Let us now discuss in more detail the works with which we compare our proposed schemes.

In [12], the number of transitions from 0 to 1 for two consecutive flits (the flit that has just traversed the link and the one that is about to traverse it) is counted. If the number is larger than half of the link width, inversion is performed to reduce the number of 0-to-1 transitions when the flit is transferred via the link. This technique is concerned only with self-switching and does not address coupling switching. Note that the coupling capacitance in state-of-the-art silicon technology is considerably larger (e.g., four times) than the self-capacitance and, hence, should be considered in any scheme proposed for link power reduction. In addition, the scheme is based on the hop-by-hop technique and, therefore, encoding is performed in each node.

The scheme presented in [14] dealt with reducing the coupling switching. In this technique, a complex encoder counts the number of Type I transitions (Table I) with a weighting coefficient of one and the number of Type II transitions with a weighting coefficient of two. If the number is larger than half of the link width, inversion is performed. In addition to requiring a complex encoder, the technique only works on the patterns whose full inversion leads to a link power reduction, while not considering the patterns whose full inversion may lead to higher link power consumption. Therefore, the link power reduction achieved through this technique is not as large as it could be. This scheme was also based on the hop-by-hop technique.

TABLE I: Effect of Odd Inversion on Change of Transition Types

In another coding technique presented in [15], groups of four bits are encoded with five bits. The encoded bits were isolated using shielding wires such that the occurrence of the patterns "101" and "010" was prevented.
This way, no simultaneous Type II transitions in two adjacent pairs of bits are induced. This technique effectively reduces the coupling switching activity. Although the technique reduces the power consumption considerably, it increases the data transfer time and, consequently, the link energy consumption. This is due to the fact that, for every four bits, six bits are transmitted, which increases the communication traffic.

III. PROPOSED ENCODING SCHEME

In this section, we present the proposed encoding schemes, whose goal is to reduce power dissipation by minimizing the coupling transition activity on the links of the interconnection network. One can classify four types of coupling transitions. A Type I transition occurs when one of the lines switches while the other remains unchanged. In a Type II transition, one line switches from low to high while the other makes a transition from high to low. A Type III transition corresponds to the case where both lines switch simultaneously in the same direction. Finally, in a Type IV transition neither line changes. The effective switched capacitance varies from type to type and, hence, the coupling transition activity is a weighted sum of the contributions of the different types of coupling transitions.

Here, we calculate the occurrence probability of the different types of transitions. Let flit (t - 1) and flit (t) refer to the previous flit, which was transferred through the link, and the flit that is about to pass through the link, respectively. We consider only two adjacent bits of the physical channel. Sixteen different combinations of these four bits can occur (Table I).
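The classification of the sixteen combinations can be checked by brute force. The following is a minimal Python sketch, not part of the original work: the function names are hypothetical, and Type III is taken to mean both lines switching in the same direction, which is what distinguishes it from Type II.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def transition_type(prev_pair, next_pair):
    """Classify the transition of two adjacent lines between flit(t-1) and flit(t).

    prev_pair and next_pair are (line i, line i+1) values of the two flits.
    """
    a_switches = prev_pair[0] != next_pair[0]
    b_switches = prev_pair[1] != next_pair[1]
    if not a_switches and not b_switches:
        return "IV"   # neither line changes
    if a_switches != b_switches:
        return "I"    # exactly one line switches
    # Both lines switch: same direction if they started at the same value.
    return "III" if prev_pair[0] == prev_pair[1] else "II"


def count_types():
    """Count each transition type over all 16 combinations of four bits."""
    counts = Counter()
    for a0, b0, a1, b1 in product((0, 1), repeat=4):
        counts[transition_type((a0, b0), (a1, b1))] += 1
    return counts


counts = count_types()
# Uniformly random data: each of the 16 combinations is equally likely.
probabilities = {t: Fraction(n, 16) for t, n in counts.items()}
for t in ("I", "II", "III", "IV"):
    print(t, counts[t], probabilities[t])
```

Under the uniform-data assumption, dividing each count by 16 gives the occurrence probability of that type.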
In Table I, the first bit is the value of the generic ith line of the link, whereas the second bit represents the value of its (i+1)th line. The numbers of transitions of Types I, II, III, and IV are 8, 2, 2, and 4, respectively. For a random set of data, each of these sixteen transitions has the same probability. Therefore, the occurrence probabilities of Types I, II, III, and IV are 1/2, 1/8, 1/8, and 1/4, respectively. In the rest of this section, we present three data encoding schemes designed for reducing the dynamic power dissipation of the network links, along with a possible hardware implementation of the decoder.

SCHEME I:

In Scheme I, we focus on reducing the number of Type I transitions (by converting them to Type III and Type IV transitions) and Type II transitions (by converting them to Type I transitions). The scheme compares the current data with the previous one to decide whether odd inversion or no inversion of the current data leads to a link power reduction. In Scheme II, both Type I and Type II transitions are taken into account for deciding between half and full inversion, depending on the amount of switching reduction. Finally, in Scheme III, we consider the fact that Type I transitions show different behaviors in the case of odd and even inversion, and we make the inversion that leads to the higher power saving.

Fig. 1. (a) Block diagram of the encoder. (b) Internal view of the encoder block for Scheme I.

The general block diagram in Fig. 1(a) is the same for Schemes I, II, and III. The w-1 data bits are applied to one input of the encoder block E, which converts the original binary input into the encoded output. The two inputs of the encoder are the original binary data and the previously encoded output; the encoder performs one of the inversions, based on the transition types, to reduce the link power. Block E changes from scheme to scheme according to the requirements. The TY block takes two adjacent bits of the given inputs and checks what type of transition occurs. Its output is set to 1 when it detects one of the Type I or Type II transitions of interest; otherwise it is set to 0. Odd inversion is performed when these transitions predominate. The last stage uses XOR circuits to perform the odd-bit inversion. The decoding is performed by simply inverting the received flit again when the inversion bit is set to 1.

SCHEME II:

In Scheme II, our main goal is to reduce the number of Type II transitions. Type II transitions are converted into Type IV transitions. This scheme compares the current and previous data to decide whether full inversion, odd inversion, or no inversion gives the larger link power reduction.

Fig. 2. Encoder architecture for Scheme II.

This encoding architecture, based on full and odd inversion, uses w-1 bits of the link for data and one bit as the inversion bit, which indicates whether the flit traveling through the link is inverted or not. The full link width of w bits is available for data when no encoding is applied. Here the TY block from Scheme I is reused in Scheme II. It takes two adjacent bits from the given inputs and checks what type of transition occurs. The T2 and T4** blocks determine whether any of the transition types T2 and T4** occur. The next stage consists of number-of-ones blocks, which receive the outputs of the TY, T2, and T4** blocks. The output of each ones block is log2 w bits wide. The first ones block determines the number of transitions relevant to odd inversion. The second ones block determines the number of transitions relevant to full inversion, and a third ones block is likewise used for the full-inversion decision. Based on the outputs of these ones blocks, Module A decides which inversion should be performed to reduce the link power. When one of the conditions of this module is satisfied, the corresponding output is set to 1; none of the outputs is set to 1 if no inversion takes place. Module A is implemented using full-adder and comparator circuits.
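The odd-inversion decision of Scheme I and its decoding can be illustrated with a small software model. This is a minimal sketch under stated assumptions rather than the hardware described above: the cost weights (Type I = 1, Type II = 2, Types III and IV = 0) are borrowed from the description of [14] and are an assumption here, an exhaustive cost comparison stands in for the TY block and its counting logic, the inversion-bit line itself is left out of the cost, and all function names are hypothetical.

```python
def coupling_cost(prev_bits, next_bits):
    """Weighted coupling-transition count over all adjacent line pairs.

    Assumed weights: Type I = 1, Type II = 2, Types III and IV = 0.
    """
    cost = 0
    for i in range(len(prev_bits) - 1):
        a_switches = prev_bits[i] != next_bits[i]
        b_switches = prev_bits[i + 1] != next_bits[i + 1]
        if a_switches != b_switches:
            cost += 1                      # Type I: one line switches
        elif a_switches and next_bits[i] != next_bits[i + 1]:
            cost += 2                      # Type II: opposite directions
    return cost


def odd_invert(bits):
    """Invert the bits at odd indices (1, 3, 5, ...); the indexing is an assumption."""
    return [b ^ 1 if i % 2 == 1 else b for i, b in enumerate(bits)]


def encode(data, prev_link):
    """Return (link_bits, inv): odd-invert only if that lowers the assumed cost.

    prev_link is the previously encoded flit, i.e. what is currently on the link.
    """
    inverted = odd_invert(data)
    if coupling_cost(prev_link, inverted) < coupling_cost(prev_link, data):
        return inverted, 1
    return list(data), 0


def decode(link_bits, inv):
    """Undo the odd inversion when the inversion bit is set."""
    return odd_invert(link_bits) if inv else list(link_bits)


prev_link = [0, 1, 0, 1]
data = [1, 0, 1, 0]            # every adjacent pair would be a Type II transition
link_bits, inv = encode(data, prev_link)
print(link_bits, inv)          # [1, 1, 1, 1] 1
print(decode(link_bits, inv))  # [1, 0, 1, 0]
```

In this example the three Type II transitions (assumed cost 6) become three Type I transitions (assumed cost 3) after odd inversion, which matches the Type II to Type I conversion that Scheme I relies on.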

Fig. 3. Decoder block diagram.
Fig. 4. Internal view of the decoder block for Scheme II.

The block diagram of the decoder is shown in Fig. 3. The w-1 input bits are applied to the decoder circuit, and the other input of the decoder is the previously decoded output. The decoder block compares the two inputs, performs the inversion operation, and produces a (w-1)-bit output. The remaining bit indicates whether inversion was performed or not. The decoder circuit (Fig. 4) consists of a TY block, a majority voter, and XOR circuits. The TY block determines the transitions according to the action taken by the encoder. Based on the transition types, the majority voter checks the validity. The output of the majority voter is given to the XOR circuits. Half inversion, full inversion, or no inversion is then performed by the logic gates.

SCHEME III:

In Scheme III, we add even inversion to Scheme II, because odd inversion converts some Type I transitions into Type II transitions. As shown in Table II, T1**/T1*** transitions are converted into Type IV/Type III transitions when the flit is even inverted. The link power reduction with even inversion is larger than with odd inversion.

Table II: Effect of Even Inversion on Change of Transition Types

Fig. 5. Encoder block diagram for Scheme III.

The encoder architecture of Scheme III (Fig. 5) is similar to the encoder architectures of Schemes I and II, with a Te block added to Scheme II. It is based on the even-invert, full-invert, and odd-invert conditions. It uses w-1 bits of the link width for data input, and the wth bit is used as the inversion bit. When full, half, or even inversion is performed, the inversion bit is set to 1; otherwise it is set to 0. The TY, T2, Te, and T4** blocks determine the corresponding transition types, and their outputs are sent to the number-of-ones blocks. The Te block determines whether any of the transition types T2, T1**, and T1*** is detected.
The ones blocks determine the number of ones in the corresponding outputs of the TY, T2, Te, and T4** blocks. These numbers of ones are given to the Module C block, which checks whether odd, even, full, or no inversion has to be performed. The decoder architectures of Scheme II and Scheme III are the same.

CONCLUSIONS

In this work, an encoding technique is implemented for reducing the transition activity in the NoC. The encoding schemes are aimed at reducing the power dissipated by the links of an NoC. The proposed encoding schemes are agnostic with respect to the underlying NoC architecture, in the sense that their application does not require any modification in either the routers or the links. The proposed architecture is coded in Verilog and is simulated and synthesized using ModelSim and Xilinx software. Overall, the application of the scheme allows a 40% power saving. In the future, NoC implementations using different types of routers and network interface techniques will be analyzed, comparing area, delay, and power with previous techniques.

ACKNOWLEDGMENTS

I extend my grateful thanks to the authorities for their support and encouragement to write this paper.

REFERENCES

[1] International Technology Roadmap for Semiconductors. (2011) [Online]. Available: http://www.itrs.net
[2] M. S. Rahaman and M. H. Chowdhury, "Crosstalk avoidance and error correction coding for coupled RLC interconnects," in Proc. IEEE Int. Symp. Circuits Syst., May 2009, pp. 141-144.
[3] W. Wolf, A. A. Jerraya, and G. Martin, "Multiprocessor system-on-chip MPSoC technology," IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 27, no. 10, pp. 1701-1713, Oct. 2008.
[4] L. Benini and G. De Micheli, "Networks on chips: A new SoC paradigm," Computer, vol. 35, no. 1, pp. 70-78, Jan. 2002.
[5] S. E. Lee and N. Bagherzadeh, "A variable frequency link for a power-aware network-on-chip (NoC)," Integr. VLSI J., vol. 42, no. 4, pp. 479-485, Sep. 2009.
[6] D. Yeh, L. S. Peh, S. Borkar, J. Darringer, A. Agarwal, and W. M. Hwu, "Thousand-core chips roundtable," IEEE Design Test Comput., vol. 25, no. 3, pp. 272-278, May-Jun. 2008.
[7] A. Vittal and M. Marek-Sadowska, "Crosstalk reduction for VLSI," IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 16, no. 3, pp. 290-298, Mar. 1997.
[8] M. Ghoneima, Y. I. Ismail, M. M. Khellah, J. W. Tschanz, and V. De, "Formal derivation of optimal active shielding for low-power on-chip buses," IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 25, no. 5, pp. 821-836, May 2006.
[9] L. Macchiarulo, E. Macii, and M. Poncino, "Wire placement for crosstalk energy minimization in address buses," in Proc. Design Autom. Test Eur. Conf. Exhibit., Mar. 2002, pp. 158-162.
[10] R. Ayoub and A. Orailoglu, "A unified transformational approach for reductions in fault vulnerability, power, and crosstalk noise and delay on processor buses," in Proc. Design Autom. Conf. Asia South Pacific, vol. 2, Jan. 2005, pp. 729-734.
[11] K. Banerjee and A. Mehrotra, "A power-optimal repeater insertion methodology for global interconnects in nanometer designs," IEEE Trans. Electron Devices, vol. 49, no. 11, pp. 20
[12] M. R. Stan and W. P. Burleson, "Bus-invert coding for low-power I/O," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 3, no. 1, pp. 49-58, Mar. 1995.
[13] S. Ramprasad, N. R. Shanbhag, and I. N. Hajj, "A coding framework for low-power address and data busses," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 7
[14] K. W. Ki, B. Kwang Hyun, N. Shanbhag, C. L. Liu, and K. M. Sung, "Coupling-driven signal encoding scheme for low-power interface design," in Proc. IEEE/ACM Int. Conf. Comput.-Aided Design, Nov. 2000, pp. 318-321.
[15] P. P. Pande, H. Zhu, A. Ganguly, and C. Grecu, "Energy reduction through crosstalk avoidance coding in NoC paradigm," in Proc. 9th EUROMICRO Conf. Digit. Syst. Design Archit. Methods Tools, Sep. 2006, pp. 689-695.