Hybrid System Level Power Consumption Estimation for FPGA-Based MPSoC

Santhosh Kumar RETHINAGIRI, Rabie BEN ATITALLAH, Smail NIAR, Eric SENN, and Jean-Luc DEKEYSER
INRIA Lille Nord Europe, Université de Lille1, Lille, France; LAMIH, Université de Valenciennes et du Hainaut Cambrésis, Valenciennes, France; and LAB-STICC, Université de Bretagne Sud, Lorient, France

Abstract

This paper proposes an efficient Hybrid System Level (HSL) power estimation methodology for FPGA-based MPSoC. Within this methodology, the Functional Level Power Analysis (FLPA) is extended to set up generic power models for the different parts of the system. Then, a simulation framework is developed at the transactional level to accurately evaluate the activities used in the related power models. The combination of these two parts leads to a hybrid power estimation that gives a better trade-off between accuracy and speed. The proposed methodology has several benefits: it considers the power consumption of the embedded system in its entirety and leads to accurate estimates without costly and complex equipment. The proposed methodology is also scalable for exploring complex embedded architectures. The usefulness and effectiveness of our HSL methodology are validated through typical mono-processor and multiprocessor embedded systems designed around the Xilinx Virtex II Pro FPGA board. Our experiments performed on an explicit embedded platform show that the obtained power estimation results have less than 1.2% of error when compared to the real board measurements and are obtained faster than with other power estimation tools.

I. INTRODUCTION

Due to the ongoing nano-miniaturisation in chip production, estimation of power consumption is becoming a critical pre-design metric in complex embedded systems such as Multi-Processor System-on-Chip (MPSoC) in FPGAs.
In such architectures, processors come with high computation rates while the reconfigurable logic offers high performance per watt and adaptability to the application constraints. In current industrial and academic practices, power estimation using low level CAD tools is still widely adopted, which is clearly not suited to managing the complexity of modern embedded systems. Recently, the ITRS [8] and HiPEAC roadmaps have promoted the views that "power defines performance" and "power is the wall". Facing this issue, designers should calculate the power consumption as early as possible in the design flow to reduce the time-to-market and the development cost. Today, system level power estimation is considered a vital premise to cope with the critical design constraints. However, the development of tools for power estimation at the system level faces extremely challenging requirements such as an efficient power modeling methodology, rapid system prototyping, and accurate power estimates.

At the system level, the power estimation process is centered around two correlated aspects: the power model granularity and the system abstraction level. The first aspect concerns the granularity of the relevant activities on which the power model relies. It covers a large spectrum that starts from the fine-grain level, such as logic gate switching, and stretches out to the coarse-grain level, such as hardware component events. In general, fine-grain power estimation yields a model that is more strongly correlated with data and requires handling technological parameters, which is tedious for system level designers. On the other hand, coarse-grain power models depend on micro-architectural activities that cannot be determined easily. The second aspect involves the abstraction level at which the system is described. It starts from the usual Register Transfer Level (RTL) and extends up to the algorithmic level.
In general, going from a low to a high design level corresponds to a more abstract description and hence a coarser activity granularity. The power evaluation time increases as we go down through the design flow, and the accuracy depends on the extraction of each relevant activity and on the characterization methodology used to evaluate the related power cost. In order to have an efficient power estimation methodology, we should find a better trade-off between these two aspects.

To answer the above challenges, we propose a new Hybrid System Level (HSL) power consumption estimation methodology for complex embedded systems. A key word in our contribution is hybridization between abstraction levels. Almost all the previous studies focus on power estimation for a given abstraction level without overcoming the wall of the speed/accuracy trade-off. The idea here is to build a hybrid power estimation tool that combines Functional Level Power Analysis (FLPA) for hardware power modeling with the Transactional Level Modeling (TLM) simulation technique for rapid system prototyping and fast power estimation. Basically, the FLPA is used for processor power modeling. In the frame of this work, it is extended to cover the other hardware components used in the MPSoC such as the memory and the reconfigurable logic. After that, we go further in terms of scalability to target heterogeneous multiprocessor architectures. The functional power estimation part is coupled with a fast SystemC [13] simulator in order to obtain the micro-architectural activities needed by the power models, which allows us to reach a better trade-off between accuracy and speed.

This paper is organized as follows. Section II presents the related works and Section III exposes the proposed HSL power estimation methodology. In Section IV, the power modeling methodology is applied to a typical MPSoC designed around the Virtex II Pro FPGA board. To evaluate our approach in terms of accuracy, speed and scalability, experimental results are presented in Section V.

Fig. 1. Hybrid System Level (HSL) power estimation methodology

II. RELATED WORKS

Among the tools developed for power consumption estimation at the system level, we quote tools based on micro-architectural cycle-level simulation such as Wattch [5] and SimplePower [16]. They define fine-grain power models by characterizing component features such as a set of instructions or functional blocks using analytic power laws. The contributions of the internal unit activities are calculated and added together during the execution of the program on the micro-architectural simulator. This approach needs low-level descriptions of the architecture that are often difficult to obtain, and it requires a significant amount of simulation time. In an attempt to reduce simulation time, recent efforts have been made to build fast simulators using Transaction Level Modeling (TLM) [4]. SystemC [13] and its TLM 2.0 kit have become a de facto standard for the system-level description of Systems-on-Chip (SoC) by offering different coding styles. Nevertheless, power estimation at the TLM level is still under research and is not well established. In [11] and [12], a methodology is presented to generate consumption models for peripheral devices at the TLM level. Relevant activities are identified at different levels and granularities.
The characterization phase is however done at the gate level, from which they deduce the activity and power consumption for the higher level. Using this approach for recent processors and systems is not realistic. Dhanwada et al. [6] proposed a power estimation methodology for a monoprocessor PowerPC and CoreConnect-based system at the TLM level. Their power modeling methodology is based on a fine-grain activity characterization at the gate level, which needs a huge amount of development time. Due to a high correlation with data, a power estimation inaccuracy of 11% is reported. Like these previous works, our proposed methodology for power estimation partially relies on SystemC/TLM simulation, but it uses coarse-grain power models. In addition, our methodology is applied to heterogeneous multiprocessor architectures.

For the functional level, Tiwari et al. [15] introduced the concept of Instruction Level Power Analysis (ILPA). They associate a power consumption model with instructions or instruction pairs. The power consumed by a program running on the processor can be estimated by using an Instruction Set Simulator (ISS) to extract instruction traces, and then adding up the total cost of the instructions. This approach suffers from the high number of experiments required to obtain the model. In addition, it is applicable only to processors. To overcome this drawback, the Functional Level Power Analysis (FLPA) was proposed [10], [9]. It relies on the identification of a set of functional blocks that influence the power consumption of the target component. The model is represented by a set of analytical functions or a table of consumption values which depend on functional and architectural parameters. Once the model is built, the estimation process consists of extracting the appropriate parameter values from the design, which are then injected into the model to compute the power consumption.
Based on this methodology, the tool SoftExplorer [7] has been developed and included in the recent toolbox CAT [14]. It includes a library of power models for simple to complex processors. Only a static analysis of the code, or a rapid profiling, is necessary to determine the input parameters for the power models. However, when complex hardware or software is involved, some parameters may be difficult to determine with precision. This lack of precision may have a non-negligible impact on the final estimation accuracy. In order to refine the value of sensitive parameters within a reasonable delay, we propose to couple SystemC/TLM simulation with the functional power modeling technique. In this way, a reasonable trade-off between estimation speed and accuracy is reached.

III. HYBRID SYSTEM LEVEL POWER ESTIMATION METHODOLOGY

This section exposes our proposed HSL power estimation methodology, which is divided into two parts as shown in Fig. 1. The first part concerns the power model elaboration for the system hardware components. In our framework, the FLPA methodology is used to develop generic power models for different target platforms. The main advantage of this methodology is that it produces power models which rely on the functional parameters of the system, with a reduced number of experiments. As explained in the previous section, FLPA comes with few consumption laws, which are associated with the consumption activity values of the main functional blocks of the system. The generated power models have been adapted to system level design, as the required activities can be obtained from a system level environment. For a given platform, the generation of the power model is done only once.

Fig. 2. HSL power estimator tool functioning

To estimate the power consumption of an MPSoC system, the first step is to divide the architecture into different functional blocks and then to cluster the components that are concurrently activated when the code is running. There are two types of parameters: algorithmic parameters that depend on the executed algorithm (typically the cache miss or instruction per cycle rates for a processor and the area utilization for a hardware accelerator) and architectural parameters that depend on the component configuration set by the designer (typically the clock frequency). For instance, Table I presents the common set of parameters of our generic power model. These sets of parameters are defined for a general class of MPSoC. Additional parameters can be identified for architectures based on complex processors such as superscalar or VLIW (Very Long Instruction Word) processors.

The second step is the characterization of the embedded system power consumption when the parameters vary. These variations are obtained by using some elementary assembly programs (called scenarios) or built-in test vectors elaborated to stimulate each block separately. Characterization can be performed by measurements on real boards. Finally, a curve fitting of the graphical representation allows us to determine the power consumption models by regression. The obtained power models are expressed either in analytical form or as a table of values. This power modeling approach was proven to be fast and precise.
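The regression step of the characterization can be illustrated with a minimal sketch. It assumes a single architectural parameter (a frequency) and a linear power law, and the measurement points are hypothetical values chosen for illustration, not data from the XupV2Pro board. The function names `fit_linear` and `power_model` are likewise illustrative.

```python
def fit_linear(xs, ys):
    """Ordinary least-squares fit of y = a * x + b for one parameter."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


# Hypothetical characterization points (NOT measured on a real board):
# frequency in MHz and the power in mW observed for one scenario.
freqs_mhz = [100.0, 150.0, 200.0, 250.0, 300.0]
power_mw = [950.0, 975.0, 1000.0, 1025.0, 1050.0]

slope, intercept = fit_linear(freqs_mhz, power_mw)


def power_model(f_mhz):
    """Analytical power law obtained by regression: P(mW) = slope * F + intercept."""
    return slope * f_mhz + intercept
```

In this sketch the intercept plays the role of the frequency-independent (static) contribution and the slope that of the dynamic contribution. A model with several parameters would need a multi-variable regression or, when no simple law fits, a table of values.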
In our work, this approach has been applied to model power consumption for the processor, the memory, the reconfigurable hardware, and the I/O peripherals.

TABLE I. GENERIC POWER MODEL PARAMETERS

Type            Name          Description
Algorithmic     τ             External memory access rate
Algorithmic     γ             Cache miss rate for a processor
Algorithmic     β             Instruction per cycle rate
Algorithmic     α             Area utilization for a HW accelerator
Architectural   F_processor   Frequency of the processor
Architectural   F_bus         Frequency of the bus
Architectural   N             Number of processors

The second part of the methodology defines the architecture of our HSL power estimator, which includes the functional power estimator and a fast SystemC simulator as shown in Fig. 1. The functional power estimator evaluates the consumption of the target system with the help of the power models elaborated in the first part. It takes into account the architectural parameters (e.g. the frequency, the number of processors, the processor cache configuration, etc.) and the application mapping. It also requires the different activity values on which the power models rely. In order to collect the needed activity values accurately, the functional power estimator communicates with a fast SystemC simulator at the TLM level. The combination of these two components described at different abstraction levels (functional and TLM) leads to a hybrid power estimation that gives a better trade-off between accuracy and speed.

The vital function of system level power estimation is to offer a detailed power analysis by means of a complete simulation of the application. This process is initiated by the functional power estimator through the data and task interface (Fig. 2). In this way, the mapping information is transmitted to the fast TLM SystemC simulator. Our simulator consists of several hardware components which are instantiated from the SoCLib [2] library in order to build a virtual prototype of the target system. We highlight that processors are described using an Instruction Set Simulator (ISS) that sequentially executes the instructions and has no notion of micro-architectural concurrency. In our previous framework [3], we presented an accurate TLM simulation technique that makes it possible to evaluate the performance of heterogeneous MPSoC. In the power estimation step, the simulator collects the activities that are influenced by the application and the input data. At the end of the simulation, the values of the activities are transmitted to the power consumption models (the power estimator kernel) through the activity counter interface in order to calculate the global power consumption, as illustrated in Fig. 2. The following section discusses the first part of the methodology, i.e. the elaboration of the power model for the Xilinx Virtex II Pro FPGA platform using the FLPA methodology.

IV. POWER MODEL ELABORATION

In order to prove the usefulness and the effectiveness of the proposed power estimation methodology, we used PowerPC 405-based architectures implemented on the Xilinx Virtex II Pro FPGA (XupV2Pro) platform. The Virtex II Pro FPGA contains two PowerPC 405 processors, each with 16 KB, 2-way set associative instruction and data caches. In addition, this FPGA has a large number of configurable logic blocks (CLB) for implementing hardware accelerators. Each processor has access to the on-chip memory (BRAM) and the off-chip memory (SDRAM) via the Processor Local Bus (PLB). We used the JPEG (Joint Photographic Experts Group) application as a benchmark. The JPEG application consists of 4 main tasks: conversion from RGB (Red, Green and Blue) to YUV (luminance, blue chrominance, and red chrominance components), Discrete Cosine Transform (DCT), Quantization, and Huffman coding.
We also used the H.264/AVC baseline profile decoder that supports intra and inter-coding, and entropy coding with Context-Adaptive Variable-Length Coding (CAVLC). This section presents the power model elaboration which is the first part of our methodology. As explained above, we used the FLPA methodology to generate generic power models for the target system. As a first step, we divided the architecture into different functional blocks such as the processor, the memory system, the reconfigurable logic, etc. Then, we started the characterization of each component in order to extract the related power consumption models. Processor power model: Table II shows the power consumption models for the PowerPC405 processor and its memory system. These models predict consumption of the processor kernel and the I/Os parts separately, since distinct supplier devices power them with constant voltage: 1.5V for the processor and 2.5V for the SDRAM and I/Os respectively. The obtained power models shown in the Table II depend also on the memory mapping. For this reason, there are different power models for on-chip memory (BRAM) and for external memory (SDRAM). The input parameters on which the power models rely are the frequency of the processor (F processor (MHz)), bus frequency (F bus (MHz)), and the cache miss rate (0 < γ < 100 (%)). The system designer chooses the frequency of the processor and bus while cache miss rate is considered as an activity of the processor, which could be extracted from the simulation environment. According to these power models, the static consumption is dominant which is a drawback of the FPGA technology. For this reason, the latest FPGA circuits come with an optimized static power factory setting. 
TABLE II. CONSUMPTION MODELS FOR THE POWERPC 405 PLATFORM

Mapping   Voltage   Power law
BRAM      1.5V      P(mW) = 0.40 F_processor F_bus
BRAM      2.5V      P(mW) = 5.37 F_bus
SDRAM     1.5V      P(mW) = 0.38 F_processor F_bus
SDRAM     2.5V      P(mW) = 4.1γ + 6.3 F_bus

The elaborated power model for the PowerPC405 is simple because of its scalar architecture. However, new generation FPGAs such as the Xilinx Virtex 7 come with higher-performance integrated processors such as the ARM processor family. To demonstrate the usefulness of our power modeling approach in the context of complex processors, we elaborated the power model for the ARM Cortex A8 (P_A8), which has a superscalar architecture. The obtained power model (equation 1) includes the IPC (Instruction Per Cycle) parameter and separates the cache miss rates of the Level-1 (γ1) and Level-2 (γ2) caches.

P_A8 = 0.79 f IPC (γ1 + γ2)   (1)

FPGA power model: A power model has been built for the reconfigurable part of the FPGA component on the XupV2Pro board. For a given FPGA, the parameters that can be extracted from the high-level specification are the frequency F, the switching rate β, and the utilized area α of the targeted FPGA. Using a high-level architecture synthesis tool such as GAUT [1], these parameters can be predicted with good accuracy. According to the experimental results, the model cannot be expressed as a multi-linear equation of the above-mentioned parameters. For this reason, a 3-entry table of consumption values is used. The power is estimated by interpolation over these 3 input parameters. For instance, Fig. 3 illustrates the variation of the FPGA power consumption according to the area utilization and the switching activity with the operating frequency set to 100 MHz.

Extrapolation for heterogeneous multiprocessor architectures: The power models developed above will be used in the framework of system level estimation of heterogeneous multiprocessor architectures that may contain several processors and hardware accelerators.
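The table-based FPGA power model described in this section can be sketched as a lookup with interpolation over the three inputs (F, β, α). The paper does not state which interpolation scheme is used, so trilinear interpolation is an assumption here, and the axes and table values below are hypothetical placeholders rather than measured data.

```python
from bisect import bisect_right


def _bracket(axis, x):
    """Return the lower grid index and the fractional position of x on axis."""
    if not axis[0] <= x <= axis[-1]:
        raise ValueError("value outside the characterized range")
    i = min(bisect_right(axis, x) - 1, len(axis) - 2)
    t = (x - axis[i]) / (axis[i + 1] - axis[i])
    return i, t


def interpolate_power(table, f_axis, beta_axis, alpha_axis, f, beta, alpha):
    """Trilinear interpolation in table[f_index][beta_index][alpha_index]."""
    i, tf = _bracket(f_axis, f)
    j, tb = _bracket(beta_axis, beta)
    k, ta = _bracket(alpha_axis, alpha)
    total = 0.0
    # Weighted sum over the 8 corners of the enclosing grid cell.
    for di, wf in ((0, 1 - tf), (1, tf)):
        for dj, wb in ((0, 1 - tb), (1, tb)):
            for dk, wa in ((0, 1 - ta), (1, ta)):
                total += wf * wb * wa * table[i + di][j + dj][k + dk]
    return total


# Hypothetical characterization grid (NOT measured values), power in mW.
F_AXIS = [50.0, 100.0]       # frequency in MHz
BETA_AXIS = [0.0, 0.5]       # switching rate
ALPHA_AXIS = [0.0, 1.0]      # fraction of utilized area
TABLE = [
    [[100.0, 200.0], [150.0, 300.0]],   # F = 50 MHz
    [[120.0, 260.0], [200.0, 420.0]],   # F = 100 MHz
]

corner = interpolate_power(TABLE, F_AXIS, BETA_AXIS, ALPHA_AXIS, 100.0, 0.5, 1.0)
center = interpolate_power(TABLE, F_AXIS, BETA_AXIS, ALPHA_AXIS, 75.0, 0.25, 0.5)
```

At a grid point the function returns the tabulated value, and at the center of a cell it returns the mean of the eight surrounding entries. Queries outside the characterized range are rejected rather than extrapolated, which is a design choice of this sketch.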
This approach is mandatory in the design flow for two reasons. First, system level estimation can be achieved with acceptable accuracy of x faster than the physical level. Second, it allows exploring architectures that cannot be implemented because of hardware resource limitations or the unavailability of the target component. For instance, we cannot go beyond a two-PowerPC architecture using our XupV2Pro platform. Thus, it is important to have a scalable approach to address the complex system power/energy estimation issue. Equation 2 is considered for the estimation of the total system power (P_total). In addition to the processors (P_pi) and the hardware accelerators (P_HwAcc), the equation involves the power consumption of the synchronization part (P_sync) required to access the shared memory (P_mem) and the shared I/O resources (P_I/O). We mention here that it is necessary to compute the total energy (E_total) of the system (equation 3) before deducing the total power consumption, in order to compare the energy estimated by the proposed HSL methodology with the real board measurements.

P_{total} = \sum_{i} P_{p_i} + P_{mem} + P_{sync} + P_{I/O} + \sum_{j} P_{HwAcc_j}   (2)

E_{total} = \sum_{i} E_{p_i} + E_{mem} + E_{sync} + E_{I/O} + \sum_{j} E_{HwAcc_j}   (3)

Fig. 3. FPGA power consumption with 100 MHz frequency

In our XupV2Pro platform, synchronization between parallel tasks running on different processors or hardware accelerators is performed by a call to a hardware mutex. Several experiments have been conducted to evaluate the additional power cost of this hardware component. This study includes three parameters: the number of masters, the processor frequency, and the bus frequency. Experimental results show that the mutex power consumption depends mainly on the PLB frequency.

V. SYSTEM LEVEL POWER ESTIMATION RESULTS

A. Monoprocessor architecture

For the second part of our HSL power estimation methodology, a system level prototype of a PowerPC-based architecture has been developed. This prototype uses different SystemC models, in particular the ISS for the target processor. Furthermore, the cache parameters and the bus latencies are set to emulate the real platform behaviour. A set of counters is injected into the simulator to determine the occurrences of the main activities.
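The aggregation expressed by equations 2 and 3 can be sketched as follows: component energies are summed first, and the average total power is then deduced from the total energy and the execution time. The function names, the units (mJ and ms) and all numeric values are assumptions made for illustration; they are not results from the paper.

```python
def total_energy_mj(processor_energies, accelerator_energies, e_mem, e_sync, e_io):
    """Equation 3: sum of the energies of all components, in mJ."""
    return (sum(processor_energies) + e_mem + e_sync + e_io
            + sum(accelerator_energies))


def average_power_mw(energy_mj, exec_time_ms):
    """Average power deduced from energy: mJ / ms gives W, hence the factor 1000."""
    return 1000.0 * energy_mj / exec_time_ms


# Hypothetical values for a system with two processors and one accelerator.
e_total = total_energy_mj(
    processor_energies=[30.0, 28.0],
    accelerator_energies=[5.0],
    e_mem=10.0,
    e_sync=2.0,
    e_io=5.0,
)
p_total = average_power_mw(e_total, exec_time_ms=40.0)
```

Working with energies makes the comparison with board measurements independent of differences in execution time between components; the power figure is only derived at the end.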
For the PowerPC processor, the following counters are used for the different cache miss rates: read data miss, write data miss and read instruction miss. For the ARM Cortex A8 processor, additional counters should be defined to compute the IPC and the Level-2 cache miss rate. Fig. 4 shows the detailed results of the activities collected by the fast SystemC simulator for each task of the JPEG application. Several remarks can be drawn from these results. First, we notice that instruction cache miss rates and read data miss rates are very low compared with write data miss rates. This is because the task kernel and data pattern sizes are very small compared to the cache size (16 KB), which decreases the accesses to the external memory and thus has a minimal effect on the dynamic power consumption. However, with the new submicron technologies the effect of the static power consumption cannot be neglected. For this reason, a softcore processor such as the Microblaze comes with a reconfigurable cache size to fit the application requirements. Second, the data write miss rates have a high impact on the total power consumption of the system. This is because of the algorithm structure, which does not favour the reuse of data output arrays, and because of the write-through cache policy. Therefore, the statistics collected in Fig. 4 could help in tuning the application structure to better optimize the system power consumption.

In the next step, we estimated the total power consumption of each task using the power models shown in Table II (SDRAM mapping). Fig. 5 illustrates the results and shows the comparison between the proposed HSL methodology, the SoftExplorer tool introduced in Section II, and the real board measurements. Our power estimator has a negligible average error equal to 1.59%, which is more accurate than SoftExplorer with its average error of 4.32%.
The average error obtained here is negligible due to the dominance of static power. For this reason, we calculated the average error again without taking the static power into account. Our methodology then produces an average error of 4.3%, while SoftExplorer gives around 9.4% in comparison with the real board measurements. This study offers a detailed power analysis for each task in order to help designers detect consumption peaks and thus propose efficient mapping or optimization techniques.

In order to evaluate the accuracy of our tool, we carried out power estimation on several image and signal processing benchmarks. Fig. 6 illustrates the power estimation results by showing the comparison between the proposed HSL power estimation methodology, SoftExplorer and the real board measurements. Our proposed methodology has a negligible average error of 1.24%, which is more accurate than SoftExplorer with its average error of 6.34% when compared to the real board measurements. This is due to the better accuracy of the activities captured in the simulator, compared with the static analysis or rapid profiling of the C or assembly code in SoftExplorer.

Fig. 4. JPEG application cache miss rates

Fig. 5. Power estimation accuracy for the JPEG application

Fig. 6. Comparison of power estimation accuracy of HSL tool vs SoftExplorer (monoprocessor architecture)

B. Homogeneous multiprocessor architecture

The second study involves a homogeneous architecture with identical processors running the JPEG application. To evaluate the impact of the number of processors on the execution time and on the total energy/power consumption, we executed the JPEG application on systems with 1 to 8 processors. The PowerPC frequency was set to 300 MHz and the PLB frequency to 100 MHz. All the processors execute the same workload but on different image macroblocks. Fig. 7 reports the execution time in ms and the total energy consumption in mJ, and Fig. 8 shows the power estimation of multimedia benchmarks for a homogeneous multiprocessor (two PowerPC) architecture. Compared to real board energy measurements, our HSL estimator achieved an error of 0.79% and 3.49% for one and two processors respectively. This accuracy is obtained for three main reasons. First, the power models are extracted from real board measurements. Second, our methodology considers the synchronization part when a multiprocessor system is used. Finally, additional activities that are intrinsic to parallel processing, such as shared data communication overheads, are accurately evaluated by our SystemC simulator. These reasons encourage us to consider architectures with a higher number of processors in the context of exploring new complex MPSoC.

Fig. 7. Execution time and energy variation according to the number of processors

Fig. 8. Power estimation of homogeneous multiprocessor (two PowerPC) architecture

Fig. 7 shows that, for the implemented parallel JPEG application, adding processors to the system decreases the execution time, which improves the system performance. This variation is not linear because the processors share resources, which generates conflicts at times and reduces the speed-up as waiting cycles are added to the processors' execution. In terms of energy consumption, we observe that up to a certain number of processors, the total system energy consumption decreases as the number of execution cycles is reduced, and then it tends to stabilize as the system performance improves. Increasing the number of processors beyond a certain limit tends to be ineffective, as it just adds new conflicts at the bus level, leading to more waiting cycles. From Fig. 8, we conclude that the proposed HSL methodology is accurate and scales well with the applications and the number of processors, with a negligible average error of 0.89% when compared to the real board measurements.

C. Heterogeneous multiprocessor architecture

In this part, we emphasize the benefit of our estimation methodology in the context of heterogeneous architectures. In general, the choice of a hardware accelerator is driven principally by the performance requirements of the application and the processor usage of each task. For the JPEG application, the DCT is the most time-consuming task. Thus, it is selected to be implemented as a hardware accelerator.
Various trade-offs can be made between the amount of consumed hardware resources, the execution time, and the power consumption. The DCT task is highly regular and has large repetition spaces in its multiple hierarchical levels. Such large repetition spaces allow us to fully exploit the existing partitioning in VHDL. We selected a configuration which is about 200 times faster than a software execution on a PowerPC processor running at 100 MHz. The synthesized hardware occupies 18% of the XupV2Pro. According to the FPGA power model, the power consumption of the chosen hardware DCT is around 300 mW, offering a 40% power saving compared to the software execution and a 25% reduction in execution time.

D. Estimation speed comparison

In this section, we compare the estimation speed of the proposed methodology with that of the SoftExplorer and SimplePower tools introduced in Section II. The comparison positions our methodology against the state-of-the-art power estimation tools used in current industrial and academic practice. SoftExplorer and our proposed methodology are executed on a PC (Intel, 1.8 GHz, 2 GB of RAM), whereas SimplePower is executed on a workstation (UltraSPARC T2+, 1.6 GHz, 2 GB of RAM). To make the results comparable, both machines were benchmarked to confirm that the workstation is consistently faster than the PC for all kinds of applications, so the host machine does not penalize SimplePower. Power estimation has been carried out with a set of image and signal processing applications and also with SPEC benchmarks.

Fig. 9. Comparison of estimation time for HSL, SoftExplorer, and SimplePower tools

From Fig. 9, we can see that SoftExplorer has an average estimation time of 5 seconds, which is faster than SimplePower's average estimation time of 20 seconds. Our proposed HSL methodology has an average estimation time of 4 seconds, which is faster than both tools (about 1.25 times faster than SoftExplorer and 5 times faster than SimplePower). Our HSL methodology works by running the application on the fast SystemC TLM simulator, thereby collecting the dynamic activities. SimplePower uses cycle-accurate specifications to collect the necessary power data, whereas SoftExplorer performs a static profiling of the code, which keeps its power estimation time low. However, static profiling of the C code is not sufficient to determine the average execution time and the global energy consumption. For this reason, we need to run the application on the fast SystemC simulator in order to collect the activities accurately and efficiently. The experimental results show that our proposed methodology is efficient and accurate.

VI. CONCLUSION

This paper presents a hybrid system level estimation methodology for MPSoC power-aware design. Power/energy constraints are a major challenge when the system runs on batteries, so designers must take these constraints into account as early as possible in the design flow. First, a power modeling methodology has been defined to address the global system consumption, which includes processors, memory, reconfigurable hardware, etc. Second, the functional power modeling part is coupled with a fast SystemC simulation technique to obtain the micro-architectural activities needed by the power models, which allows us to reach accurate estimates. With the proposed methodology, the designer can explore several implementation choices: monoprocessor, homogeneous multiprocessor, and heterogeneous multiprocessor.

Future work on this project will focus on more complex heterogeneous platforms. Furthermore, in order to obtain more accurate power estimates, some power model refinements are needed. This is the case for the data exchanges between hardware and software tasks, executed respectively on hardware resources and on the processor, which are currently estimated at a high level of abstraction.

REFERENCES

[1] The Gaut Website.
[2] The Soclib Website.
[3] R. B. Atitallah, S. Niar, and J.-L. Dekeyser. MPSoC power estimation framework at transaction level modeling. In The 19th International Conference on Microelectronics (ICM 2007).
[4] G. Beltrame, L. Fossati, and D. Sciuto. ReSP: A Nonintrusive Transaction-Level Reflective MPSoC Simulation Platform for Design Space Exploration. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 28(12), Dec.
[5] D. Brooks, V. Tiwari, and M. Martonosi. Wattch: a framework for architectural-level power analysis and optimizations. In Proceedings of the 27th Annual International Symposium on Computer Architecture, pages 83-94.
[6] N. Dhanwada, R. A. Bergamaschi, W. W. Dungan, I. Nair, P. Gramann, W. E. Dougherty, and I.-C. Lin. Transaction-level modeling for architectural and power analysis of PowerPC and CoreConnect-based systems.
[7] S. Dhouib, J.-P. Diguet, D. Blouin, and J. Laurent. Energy and power consumption estimation for embedded applications and operating systems. Journal of Low Power Electronics (JOLPE), 5(3).
[8] ITRS. Design, 2010 edition.
[9] J. Laurent, N. Julien, and E. Martin. High level energy estimation for DSP systems. In Proc. Int. Workshop on Power And Timing Modeling, Optimization and Simulation (PATMOS 01).
[10] J. Laurent, N. Julien, and E. Martin. Functional level power analysis: An efficient approach for modeling the power consumption of complex processors. In Proceedings of the Design, Automation and Test in Europe Conference, Munich.
[11] I. Lee, H. Kim, P. Yang, S. Yoo, E. Chung, K. Choi, J. Kong, and S. Eo. PowerViP: SoC power estimation framework at transaction level. In Proc. ASP-DAC.
[12] N. Dhanwada, I. Lin, and V. Narayanan. A power estimation methodology for SystemC transaction level models. In International Conference on Hardware/Software Codesign and System Synthesis.
[13] Open SystemC Initiative. SystemC. World Wide Web document.
[14] J. D. S. Douhib. Model driven high-level power estimation of embedded operating systems communication and synchronization services. In Proceedings of the 6th IEEE International Conference on Embedded Software and Systems, China, May.
[15] V. Tiwari, S. Malik, and A. Wolfe. Power analysis of embedded software: A first step towards software power minimization. In Transactions on VLSI Systems.
[16] W. Ye, N. Vijaykrishnan, M. Kandemir, and M. Irwin. The Design and Use of SimplePower: A Cycle Accurate Energy Estimation Tool. In Design Automation Conf., June.
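
As an illustration of the hybrid flow summarized in the conclusion, the following minimal sketch shows how activities gathered during a simulation run could feed a functional power model to yield a power and energy estimate. It is not the authors' implementation: the counter names, the linear form of the model, and all coefficient values are hypothetical placeholders chosen only to make the example runnable.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    """Counters a transaction-level simulation might report (hypothetical names)."""
    instructions: int      # instructions executed by the processor model
    cache_misses: int      # cache misses observed during the run
    exec_time_s: float     # simulated execution time in seconds


def cache_miss_rate(act: Activity) -> float:
    """Misses per instruction; 0.0 when nothing was executed."""
    if act.instructions == 0:
        return 0.0
    return act.cache_misses / act.instructions


def processor_power_mw(freq_mhz: float, miss_rate: float,
                       a: float = 1.0, b: float = 100.0, c: float = 50.0) -> float:
    """Placeholder functional power model: P = a*F + b*miss_rate + c (in mW).

    The coefficients a, b, c are invented for this sketch. In a real flow they
    would come from characterizing the target board with measurements.
    """
    return a * freq_mhz + b * miss_rate + c


def estimate_energy_mj(act: Activity, freq_mhz: float) -> float:
    """Combine the model with the simulated activities: E = P * t (mW * s = mJ)."""
    power = processor_power_mw(freq_mhz, cache_miss_rate(act))
    return power * act.exec_time_s


if __name__ == "__main__":
    # Hypothetical activity record, as if produced by one simulation run.
    run = Activity(instructions=1_000_000, cache_misses=20_000, exec_time_s=0.5)
    power = processor_power_mw(100.0, cache_miss_rate(run))
    print(f"power  = {power:.1f} mW")
    print(f"energy = {estimate_energy_mj(run, 100.0):.1f} mJ")
```

The split mirrors the two halves of the methodology: the model function stands for the characterization step performed once per platform, while the `Activity` record stands for what each simulation run contributes. Other components such as memory or reconfigurable hardware would, under the same assumption, each add their own model and counters.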

Digital Systems Design

Digital Systems Design Digital Systems Design Digital Systems Design and Test Dr. D. J. Jackson Lecture 1-1 Introduction Traditional digital design Manual process of designing and capturing circuits Schematic entry System-level

More information

Multi-core Platforms for

Multi-core Platforms for 20 JUNE 2011 Multi-core Platforms for Immersive-Audio Applications Course: Advanced Computer Architectures Teacher: Prof. Cristina Silvano Student: Silvio La Blasca 771338 Introduction on Immersive-Audio

More information

Auto-tuning Fault Tolerance Technique for DSP-Based Circuits in Transportation Systems

Auto-tuning Fault Tolerance Technique for DSP-Based Circuits in Transportation Systems Auto-tuning Fault Tolerance Technique for DSP-Based Circuits in Transportation Systems Ihsen Alouani, Smail Niar, Yassin El-Hillali, and Atika Rivenq 1 I. Alouani and S. Niar LAMIH lab University of Valenciennes

More information

Accelerating embedded software processing in an FPGA with PowerPC and Microblaze

Accelerating embedded software processing in an FPGA with PowerPC and Microblaze Accelerating embedded software processing in an FPGA with PowerPC and Microblaze Luis Pantaleone and Elias Todorovich INTIA Institute Universidad Nacional del Centro de la Pcia. de Bs. As. Paraje Arrollo

More information

A High Definition Motion JPEG Encoder Based on Epuma Platform

A High Definition Motion JPEG Encoder Based on Epuma Platform Available online at www.sciencedirect.com Procedia Engineering 29 (2012) 2371 2375 2012 International Workshop on Information and Electronics Engineering (IWIEE) A High Definition Motion JPEG Encoder Based

More information

Tutorial: Using the UML profile for MARTE to MPSoC co-design dedicated to signal processing

Tutorial: Using the UML profile for MARTE to MPSoC co-design dedicated to signal processing Tutorial: Using the UML profile for MARTE to MPSoC co-design dedicated to signal processing Imran Rafiq Quadri, Abdoulaye Gamatié, Jean-Luc Dekeyser To cite this version: Imran Rafiq Quadri, Abdoulaye

More information

INTRODUCTION. In the industrial applications, many three-phase loads require a. supply of Variable Voltage Variable Frequency (VVVF) using fast and

INTRODUCTION. In the industrial applications, many three-phase loads require a. supply of Variable Voltage Variable Frequency (VVVF) using fast and 1 Chapter 1 INTRODUCTION 1.1. Introduction In the industrial applications, many three-phase loads require a supply of Variable Voltage Variable Frequency (VVVF) using fast and high-efficient electronic

More information

CS 6135 VLSI Physical Design Automation Fall 2003

CS 6135 VLSI Physical Design Automation Fall 2003 CS 6135 VLSI Physical Design Automation Fall 2003 1 Course Information Class time: R789 Location: EECS 224 Instructor: Ting-Chi Wang ( ) EECS 643, (03) 5742963 tcwang@cs.nthu.edu.tw Office hours: M56R5

More information

Data Word Length Reduction for Low-Power DSP Software

Data Word Length Reduction for Low-Power DSP Software EE382C: LITERATURE SURVEY, APRIL 2, 2004 1 Data Word Length Reduction for Low-Power DSP Software Kyungtae Han Abstract The increasing demand for portable computing accelerates the study of minimizing power

More information

Hardware-Software Co-Design Cosynthesis and Partitioning

Hardware-Software Co-Design Cosynthesis and Partitioning Hardware-Software Co-Design Cosynthesis and Partitioning EE8205: Embedded Computer Systems http://www.ee.ryerson.ca/~courses/ee8205/ Dr. Gul N. Khan http://www.ee.ryerson.ca/~gnkhan Electrical and Computer

More information

Evaluation of CPU Frequency Transition Latency

Evaluation of CPU Frequency Transition Latency Noname manuscript No. (will be inserted by the editor) Evaluation of CPU Frequency Transition Latency Abdelhafid Mazouz Alexandre Laurent Benoît Pradelle William Jalby Abstract Dynamic Voltage and Frequency

More information

Course Outcome of M.Tech (VLSI Design)

Course Outcome of M.Tech (VLSI Design) Course Outcome of M.Tech (VLSI Design) PVL108: Device Physics and Technology The students are able to: 1. Understand the basic physics of semiconductor devices and the basics theory of PN junction. 2.

More information

Estimation of Real Dynamic Power on Field Programmable Gate Array

Estimation of Real Dynamic Power on Field Programmable Gate Array Estimation of Real Dynamic Power on Field Programmable Gate Array CHALBI Najoua, BOUBAKER Mohamed, BEDOUI Mohamed Hedi ---------------------------------------------------------------------***---------------------------------------------------------------------

More information

A Self-Reconfigurable Implementation of the JPEG Encoder

A Self-Reconfigurable Implementation of the JPEG Encoder A Self-Reconfigurable Implementation of the JPEG Encoder Antonino Tumeo, Matteo Monchiero, Gianluca Palermo, Fabrizio Ferrandi, Donatella Sciuto Politecnico di Milano, Dipartimento di Elettronica e Informazione

More information

EE382V: Embedded System Design and Modeling

EE382V: Embedded System Design and Modeling EE382V: Embedded System Design and - Introduction Andreas Gerstlauer Electrical and Computer Engineering University of Texas at Austin gerstl@ece.utexas.edu : Outline Introduction Embedded systems System-level

More information

DATA ENCODING TECHNIQUES FOR LOW POWER CONSUMPTION IN NETWORK-ON-CHIP

DATA ENCODING TECHNIQUES FOR LOW POWER CONSUMPTION IN NETWORK-ON-CHIP DATA ENCODING TECHNIQUES FOR LOW POWER CONSUMPTION IN NETWORK-ON-CHIP S. Narendra, G. Munirathnam Abstract In this project, a low-power data encoding scheme is proposed. In general, system-on-chip (soc)

More information

EDA for IC System Design, Verification, and Testing

EDA for IC System Design, Verification, and Testing EDA for IC System Design, Verification, and Testing Edited by Louis Scheffer Cadence Design Systems San Jose, California, U.S.A. Luciano Lavagno Cadence Berkeley Laboratories Berkeley, California, U.S.A.

More information

VLSI System Testing. Outline

VLSI System Testing. Outline ECE 538 VLSI System Testing Krish Chakrabarty System-on-Chip (SOC) Testing ECE 538 Krish Chakrabarty 1 Outline Motivation for modular testing of SOCs Wrapper design IEEE 1500 Standard Optimization Test

More information

FPGA Implementation of Wallace Tree Multiplier using CSLA / CLA

FPGA Implementation of Wallace Tree Multiplier using CSLA / CLA FPGA Implementation of Wallace Tree Multiplier using CSLA / CLA Shruti Dixit 1, Praveen Kumar Pandey 2 1 Suresh Gyan Vihar University, Mahaljagtapura, Jaipur, Rajasthan, India 2 Suresh Gyan Vihar University,

More information

IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 17, NO. 3, MARCH

IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 17, NO. 3, MARCH IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 17, NO. 3, MARCH 2009 427 Power Management of Voltage/Frequency Island-Based Systems Using Hardware-Based Methods Puru Choudhary,

More information

Instruction Scheduling for Low Power Dissipation in High Performance Microprocessors

Instruction Scheduling for Low Power Dissipation in High Performance Microprocessors Instruction Scheduling for Low Power Dissipation in High Performance Microprocessors Abstract Mark C. Toburen Thomas M. Conte Department of Electrical and Computer Engineering North Carolina State University

More information

Final Report: DBmbench

Final Report: DBmbench 18-741 Final Report: DBmbench Yan Ke (yke@cs.cmu.edu) Justin Weisz (jweisz@cs.cmu.edu) Dec. 8, 2006 1 Introduction Conventional database benchmarks, such as the TPC-C and TPC-H, are extremely computationally

More information

Globally Asynchronous Locally Synchronous (GALS) Microprogrammed Parallel FIR Filter

Globally Asynchronous Locally Synchronous (GALS) Microprogrammed Parallel FIR Filter IOSR Journal of VLSI and Signal Processing (IOSR-JVSP) Volume 6, Issue 5, Ver. II (Sep. - Oct. 2016), PP 15-21 e-issn: 2319 4200, p-issn No. : 2319 4197 www.iosrjournals.org Globally Asynchronous Locally

More information

VLSI Implementation of Digital Down Converter (DDC)

VLSI Implementation of Digital Down Converter (DDC) Volume-7, Issue-1, January-February 2017 International Journal of Engineering and Management Research Page Number: 218-222 VLSI Implementation of Digital Down Converter (DDC) Shaik Afrojanasima 1, K Vijaya

More information

Methods for Reducing the Activity Switching Factor

Methods for Reducing the Activity Switching Factor International Journal of Engineering Research and Development e-issn: 2278-67X, p-issn: 2278-8X, www.ijerd.com Volume, Issue 3 (March 25), PP.7-25 Antony Johnson Chenginimattom, Don P John M.Tech Student,

More information

PRFloor: An Automatic Floorplanner for Partially Reconfigurable FPGA Systems

PRFloor: An Automatic Floorplanner for Partially Reconfigurable FPGA Systems PRFloor: An Automatic Floorplanner for Partially Reconfigurable FPGA Systems Tuan D. A. Nguyen (1) & Akash Kumar (2) (1) ECE Department, National University of Singapore, Singapore (2) Chair of Processor

More information

SIGNED PIPELINED MULTIPLIER USING HIGH SPEED COMPRESSORS

SIGNED PIPELINED MULTIPLIER USING HIGH SPEED COMPRESSORS INTERNATIONAL JOURNAL OF RESEARCH IN COMPUTER APPLICATIONS AND ROBOTICS ISSN 2320-7345 SIGNED PIPELINED MULTIPLIER USING HIGH SPEED COMPRESSORS 1 T.Thomas Leonid, 2 M.Mary Grace Neela, and 3 Jose Anand

More information

Simulation Performance Optimization of Virtual Prototypes Sammidi Mounika, B S Renuka

Simulation Performance Optimization of Virtual Prototypes Sammidi Mounika, B S Renuka Simulation Performance Optimization of Virtual Prototypes Sammidi Mounika, B S Renuka Abstract Virtual prototyping is becoming increasingly important to embedded software developers, engineers, managers

More information

Low-Power CMOS VLSI Design

Low-Power CMOS VLSI Design Low-Power CMOS VLSI Design ( 范倫達 ), Ph. D. Department of Computer Science, National Chiao Tung University, Taiwan, R.O.C. Fall, 2017 ldvan@cs.nctu.edu.tw http://www.cs.nctu.tw/~ldvan/ Outline Introduction

More information

CLIPPER: Counter-based Low Impact Processor Power Estimation at Run-time

CLIPPER: Counter-based Low Impact Processor Power Estimation at Run-time CLIPPER: Counter-based Low Impact Processor Power Estimation at Run-time Jorgen Peddersen, Sri Parameswaran School of Computer Science and Engineering The University of New South Wales & National ICT Australia

More information

Design A Redundant Binary Multiplier Using Dual Logic Level Technique

Design A Redundant Binary Multiplier Using Dual Logic Level Technique Design A Redundant Binary Multiplier Using Dual Logic Level Technique Sreenivasa Rao Assistant Professor, Department of ECE, Santhiram Engineering College, Nandyala, A.P. Jayanthi M.Tech Scholar in VLSI,

More information

Datorstödd Elektronikkonstruktion

Datorstödd Elektronikkonstruktion Datorstödd Elektronikkonstruktion [Computer Aided Design of Electronics] Zebo Peng, Petru Eles and Gert Jervan Embedded Systems Laboratory IDA, Linköping University http://www.ida.liu.se/~tdts80/~tdts80

More information

Chapter 9 Image Compression Standards

Chapter 9 Image Compression Standards Chapter 9 Image Compression Standards 9.1 The JPEG Standard 9.2 The JPEG2000 Standard 9.3 The JPEG-LS Standard 1IT342 Image Compression Standards The image standard specifies the codec, which defines how

More information

Design of a High Speed FIR Filter on FPGA by Using DA-OBC Algorithm

Design of a High Speed FIR Filter on FPGA by Using DA-OBC Algorithm Design of a High Speed FIR Filter on FPGA by Using DA-OBC Algorithm Vijay Kumar Ch 1, Leelakrishna Muthyala 1, Chitra E 2 1 Research Scholar, VLSI, SRM University, Tamilnadu, India 2 Assistant Professor,

More information

ASIP Solution for Implementation of H.264 Multi Resolution Motion Estimation

ASIP Solution for Implementation of H.264 Multi Resolution Motion Estimation Int. J. Communications, Network and System Sciences, 2010, 3, 453-461 doi:10.4236/ijcns.2010.35060 Published Online May 2010 (http://www.scirp.org/journal/ijcns/) ASIP Solution for Implementation of H.264

More information

Figures from Embedded System Design: A Unified Hardware/Software Introduction, Frank Vahid and Tony Givargis, New York, John Wiley, 2002

Figures from Embedded System Design: A Unified Hardware/Software Introduction, Frank Vahid and Tony Givargis, New York, John Wiley, 2002 Figures from Embedded System Design: A Unified Hardware/Software Introduction, Frank Vahid and Tony Givargis, New York, John Wiley, 2002 Data processing flow to implement basic JPEG coding in a simple

More information

Overview. 1 Trends in Microprocessor Architecture. Computer architecture. Computer architecture

Overview. 1 Trends in Microprocessor Architecture. Computer architecture. Computer architecture Overview 1 Trends in Microprocessor Architecture R05 Robert Mullins Computer architecture Scaling performance and CMOS Where have performance gains come from? Modern superscalar processors The limits of

More information

FPGA-BASED PULSED-RF PHASE AND AMPLITUDE DETECTOR AT SLRI

FPGA-BASED PULSED-RF PHASE AND AMPLITUDE DETECTOR AT SLRI doi:10.18429/jacow-icalepcs2017- FPGA-BASED PULSED-RF PHASE AND AMPLITUDE DETECTOR AT SLRI R. Rujanakraikarn, Synchrotron Light Research Institute, Nakhon Ratchasima, Thailand Abstract In this paper, the

More information

Chapter 1 Introduction

Chapter 1 Introduction Chapter 1 Introduction 1.1 Introduction There are many possible facts because of which the power efficiency is becoming important consideration. The most portable systems used in recent era, which are

More information

Dynamic MIPS Rate Stabilization in Out-of-Order Processors

Dynamic MIPS Rate Stabilization in Out-of-Order Processors Dynamic Rate Stabilization in Out-of-Order Processors Jinho Suh and Michel Dubois Ming Hsieh Dept of EE University of Southern California Outline Motivation Performance Variability of an Out-of-Order Processor

More information

Hybrid Coding (JPEG) Image Color Transform Preparation

Hybrid Coding (JPEG) Image Color Transform Preparation Hybrid Coding (JPEG) 5/31/2007 Kompressionsverfahren: JPEG 1 Image Color Transform Preparation Example 4: 2: 2 YUV, 4: 1: 1 YUV, and YUV9 Coding Luminance (Y): brightness sampling frequency 13.5 MHz Chrominance

More information

Design of an optimized multiplier based on approximation logic

Design of an optimized multiplier based on approximation logic ISSN:2348-2079 Volume-6 Issue-1 International Journal of Intellectual Advancements and Research in Engineering Computations Design of an optimized multiplier based on approximation logic Dhivya Bharathi

More information

A Framework for Fast Hardware-Software Co-simulation

A Framework for Fast Hardware-Software Co-simulation A Framework for Fast Hardware-Software Co-simulation Andreas Hoffmann, Tim Kogel, Heinrich Meyr Integrated Signal Processing Systems (ISS), RWTH Aachen Templergraben 55, 52056 Aachen, Germany hoffmann[kogel,meyr]@iss.rwth-aachen.de

More information

Efficient Multi-Operand Adders in VLSI Technology

Efficient Multi-Operand Adders in VLSI Technology Efficient Multi-Operand Adders in VLSI Technology K.Priyanka M.Tech-VLSI, D.Chandra Mohan Assistant Professor, Dr.S.Balaji, M.E, Ph.D Dean, Department of ECE, Abstract: This paper presents different approaches

More information

Serial and Parallel Processing Architecture for Signal Synchronization

Serial and Parallel Processing Architecture for Signal Synchronization Serial and Parallel Processing Architecture for Signal Synchronization Franklin Rafael COCHACHIN HENOSTROZA Emmanuel BOUTILLON July 2015 Université de Bretagne Sud Lab-STICC, UMR 6285 Centre de Recherche

More information

Introduction to co-simulation. What is HW-SW co-simulation?

Introduction to co-simulation. What is HW-SW co-simulation? Introduction to co-simulation CPSC489-501 Hardware-Software Codesign of Embedded Systems Mahapatra-TexasA&M-Fall 00 1 What is HW-SW co-simulation? A basic definition: Manipulating simulated hardware with

More information

Validation of Frequency- and Time-domain Fidelity of an Ultra-low Latency Hardware-in-the-Loop (HIL) Emulator

Validation of Frequency- and Time-domain Fidelity of an Ultra-low Latency Hardware-in-the-Loop (HIL) Emulator Validation of Frequency- and Time-domain Fidelity of an Ultra-low Latency Hardware-in-the-Loop (HIL) Emulator Elaina Chai, Ivan Celanovic Institute for Soldier Nanotechnologies Massachusetts Institute

More information

Ramon Canal NCD Master MIRI. NCD Master MIRI 1

Ramon Canal NCD Master MIRI. NCD Master MIRI 1 Wattch, Hotspot, Hotleakage, McPAT http://www.eecs.harvard.edu/~dbrooks/wattch-form.html http://lava.cs.virginia.edu/hotspot http://lava.cs.virginia.edu/hotleakage http://www.hpl.hp.com/research/mcpat/

More information

Abstract of PhD Thesis

Abstract of PhD Thesis FACULTY OF ELECTRONICS, TELECOMMUNICATION AND INFORMATION TECHNOLOGY Irina DORNEAN, Eng. Abstract of PhD Thesis Contribution to the Design and Implementation of Adaptive Algorithms Using Multirate Signal

More information

OQPSK COGNITIVE MODULATOR FULLY FPGA-IMPLEMENTED VIA DYNAMIC PARTIAL RECONFIGURATION AND RAPID PROTOTYPING TOOLS

OQPSK COGNITIVE MODULATOR FULLY FPGA-IMPLEMENTED VIA DYNAMIC PARTIAL RECONFIGURATION AND RAPID PROTOTYPING TOOLS Proceedings of SDR'11-WInnComm-Europe, 22-24 Jun 2011 OQPSK COGNITIVE MODULATOR FULLY FPGA-IMPLEMENTED VIA DYNAMIC PARTIAL RECONFIGURATION AND RAPID PROTOTYPING TOOLS Raúl Torrego (Communications department:

More information

An Efficient Method for Implementation of Convolution

An Efficient Method for Implementation of Convolution IAAST ONLINE ISSN 2277-1565 PRINT ISSN 0976-4828 CODEN: IAASCA International Archive of Applied Sciences and Technology IAAST; Vol 4 [2] June 2013: 62-69 2013 Society of Education, India [ISO9001: 2008

More information

Computer Aided Design of Electronics

Computer Aided Design of Electronics Computer Aided Design of Electronics [Datorstödd Elektronikkonstruktion] Zebo Peng, Petru Eles, and Nima Aghaee Embedded Systems Laboratory IDA, Linköping University www.ida.liu.se/~tdts01 Electronic Systems

More information

An Optimized Wallace Tree Multiplier using Parallel Prefix Han-Carlson Adder for DSP Processors

An Optimized Wallace Tree Multiplier using Parallel Prefix Han-Carlson Adder for DSP Processors An Optimized Wallace Tree Multiplier using Parallel Prefix Han-Carlson Adder for DSP Processors T.N.Priyatharshne Prof. L. Raja, M.E, (Ph.D) A. Vinodhini ME VLSI DESIGN Professor, ECE DEPT ME VLSI DESIGN

More information

EE 382C EMBEDDED SOFTWARE SYSTEMS. Literature Survey Report. Characterization of Embedded Workloads. Ajay Joshi. March 30, 2004

EE 382C EMBEDDED SOFTWARE SYSTEMS. Literature Survey Report. Characterization of Embedded Workloads. Ajay Joshi. March 30, 2004 EE 382C EMBEDDED SOFTWARE SYSTEMS Literature Survey Report Characterization of Embedded Workloads Ajay Joshi March 30, 2004 ABSTRACT Security applications are a class of emerging workloads that will play

More information

Bus-Switch Encoding for Power Optimization of Address Bus

Bus-Switch Encoding for Power Optimization of Address Bus May 2006, Volume 3, No.5 (Serial No.18) Journal of Communication and Computer, ISSN1548-7709, USA Haijun Sun 1, Zhibiao Shao 2 (1,2 School of Electronics and Information Engineering, Xi an Jiaotong University,

More information

Processors Processing Processors. The meta-lecture

Processors Processing Processors. The meta-lecture Simulators 5SIA0 Processors Processing Processors The meta-lecture Why Simulators? Your Friend Harm Why Simulators? Harm Loves Tractors Harm Why Simulators? The outside world Unfortunately for Harm you

More information

DESIGN OF INTELLIGENT PID CONTROLLER BASED ON PARTICLE SWARM OPTIMIZATION IN FPGA

DESIGN OF INTELLIGENT PID CONTROLLER BASED ON PARTICLE SWARM OPTIMIZATION IN FPGA DESIGN OF INTELLIGENT PID CONTROLLER BASED ON PARTICLE SWARM OPTIMIZATION IN FPGA S.Karthikeyan 1 Dr.P.Rameshbabu 2,Dr.B.Justus Robi 3 1 S.Karthikeyan, Research scholar JNTUK., Department of ECE, KVCET,Chennai

More information

Self-Aware Adaptation in FPGAbased

Self-Aware Adaptation in FPGAbased DIPARTIMENTO DI ELETTRONICA E INFORMAZIONE Self-Aware Adaptation in FPGAbased Systems IEEE FPL 2010 Filippo Siorni: filippo.sironi@dresd.org Marco Triverio: marco.triverio@dresd.org Martina Maggio: mmaggio@mit.edu

More information

AREA EFFICIENT DISTRIBUTED ARITHMETIC DISCRETE COSINE TRANSFORM USING MODIFIED WALLACE TREE MULTIPLIER

AREA EFFICIENT DISTRIBUTED ARITHMETIC DISCRETE COSINE TRANSFORM USING MODIFIED WALLACE TREE MULTIPLIER American Journal of Applied Sciences 11 (2): 180-188, 2014 ISSN: 1546-9239 2014 Science Publication doi:10.3844/ajassp.2014.180.188 Published Online 11 (2) 2014 (http://www.thescipub.com/ajas.toc) AREA

More information

SDR Applications using VLSI Design of Reconfigurable Devices

SDR Applications using VLSI Design of Reconfigurable Devices 2018 IJSRST Volume 4 Issue 2 Print ISSN: 2395-6011 Online ISSN: 2395-602X Themed Section: Science and Technology SDR Applications using VLSI Design of Reconfigurable Devices P. A. Lovina 1, K. Aruna Manjusha

More information

EE382V: Embedded System Design and Modeling

EE382V: Embedded System Design and Modeling EE382V: Embedded System Design and System-Level Design Tools Andreas Gerstlauer Electrical and Computer Engineering University of Texas at Austin gerstl@ece.utexas.edu : Outline Overview System-level design

More information

NGP-N ASIC. Microelectronics Presentation Days March 2010

NGP-N ASIC. Microelectronics Presentation Days March 2010 NGP-N ASIC Microelectronics Presentation Days 2010 ESA contract: Next Generation Processor - Phase 2 (18428/06/N1/US) - Started: Dec 2006 ESA Technical officer: Simon Weinberg Mark Childerhouse Processor

More information

ON CHIP COMMUNICATION ARCHITECTURE POWER ESTIMATION IN HIGH FREQUENCY HIGH POWER MODEL

ON CHIP COMMUNICATION ARCHITECTURE POWER ESTIMATION IN HIGH FREQUENCY HIGH POWER MODEL ON CHIP COMMUNICATION ARCHITECTURE POWER ESTIMATION IN HIGH FREQUENCY HIGH POWER MODEL Khalid B. Suliman 1, Rashid A. Saeed and Raed A. Alsaqour 3 1 Department of Electrical and Electronic Engineering,

More information

Mixed Synchronous/Asynchronous State Memory for Low Power FSM Design

Mixed Synchronous/Asynchronous State Memory for Low Power FSM Design Mixed Synchronous/Asynchronous State Memory for Low Power FSM Design Cao Cao and Bengt Oelmann Department of Information Technology and Media, Mid-Sweden University S-851 70 Sundsvall, Sweden {cao.cao@mh.se}

More information

Automated FSM Error Correction for Single Event Upsets

Automated FSM Error Correction for Single Event Upsets Automated FSM Error Correction for Single Event Upsets Nand Kumar and Darren Zacher Mentor Graphics Corporation nand_kumar{darren_zacher}@mentor.com Abstract This paper presents a technique for automatic

More information

CHAPTER 4 FIELD PROGRAMMABLE GATE ARRAY IMPLEMENTATION OF FIVE LEVEL CASCADED MULTILEVEL INVERTER

CHAPTER 4 FIELD PROGRAMMABLE GATE ARRAY IMPLEMENTATION OF FIVE LEVEL CASCADED MULTILEVEL INVERTER 87 CHAPTER 4 FIELD PROGRAMMABLE GATE ARRAY IMPLEMENTATION OF FIVE LEVEL CASCADED MULTILEVEL INVERTER 4.1 INTRODUCTION The Field Programmable Gate Array (FPGA) is a high performance data processing general

More information

A HARDWARE DC MOTOR EMULATOR VAGNER S. ROSA 1, VITOR I. GERVINI 2, SEBASTIÃO C. P. GOMES 3, SERGIO BAMPI 4

A HARDWARE DC MOTOR EMULATOR VAGNER S. ROSA 1, VITOR I. GERVINI 2, SEBASTIÃO C. P. GOMES 3, SERGIO BAMPI 4 A HARDWARE DC MOTOR EMULATOR VAGNER S. ROSA 1, VITOR I. GERVINI 2, SEBASTIÃO C. P. GOMES 3, SERGIO BAMPI 4 Abstract Much work have been done lately to develop complex motor control systems. However they

More information

International Journal of Scientific & Engineering Research, Volume 7, Issue 3, March-2016 ISSN

International Journal of Scientific & Engineering Research, Volume 7, Issue 3, March-2016 ISSN ISSN 2229-5518 159 EFFICIENT AND ENHANCED CARRY SELECT ADDER FOR MULTIPURPOSE APPLICATIONS A.RAMESH Asst. Professor, E.C.E Department, PSCMRCET, Kothapet, Vijayawada, A.P, India. rameshavula99@gmail.com

More information

A Modular Approach to the Design of the Soft Output Viterbi Algorithm (SOVA) Decoder

A Modular Approach to the Design of the Soft Output Viterbi Algorithm (SOVA) Decoder A Modular Approach to the Design of the Soft Output Viterbi Algorithm (SOVA) Decoder Jacques Martinet and Paul Fortier Département de génie électrique et de génie informatique Université Laval, Sainte-Foy

More information

Design of Multiplier Less 32 Tap FIR Filter using VHDL

Design of Multiplier Less 32 Tap FIR Filter using VHDL International OPEN ACCESS Journal Of Modern Engineering Research (IJMER) Design of Multiplier Less 32 Tap FIR Filter using VHDL Abul Fazal Reyas Sarwar 1, Saifur Rahman 2 1 (ECE, Integral University, India)

More information

An Energy Conservation DVFS Algorithm for the Android Operating System

An Energy Conservation DVFS Algorithm for the Android Operating System Volume 1, Number 1, December 2010 Journal of Convergence An Energy Conservation DVFS Algorithm for the Android Operating System Wen-Yew Liang* and Po-Ting Lai Department of Computer Science and Information

More information

White Paper Stratix III Programmable Power

White Paper Stratix III Programmable Power Introduction White Paper Stratix III Programmable Power Traditionally, digital logic has not consumed significant static power, but this has changed with very small process nodes. Leakage current in digital

More information

Design and Implementation of High Speed Carry Select Adder

Design and Implementation of High Speed Carry Select Adder Design and Implementation of High Speed Carry Select Adder P.Prashanti Digital Systems Engineering (M.E) ECE Department University College of Engineering Osmania University, Hyderabad, Andhra Pradesh -500

More information

ABSTRACT 1. INTRODUCTION

ABSTRACT 1. INTRODUCTION THE APPLICATION OF SOFTWARE DEFINED RADIO IN A COOPERATIVE WIRELESS NETWORK Jesper M. Kristensen (Aalborg University, Center for Teleinfrastructure, Aalborg, Denmark; jmk@kom.aau.dk); Frank H.P. Fitzek

More information

Single Chip FPGA Based Realization of Arbitrary Waveform Generator using Rademacher and Walsh Functions

Single Chip FPGA Based Realization of Arbitrary Waveform Generator using Rademacher and Walsh Functions IEEE ICET 26 2 nd International Conference on Emerging Technologies Peshawar, Pakistan 3-4 November 26 Single Chip FPGA Based Realization of Arbitrary Waveform Generator using Rademacher and Walsh Functions

More information

SW simulation and Performance Analysis

SW simulation and Performance Analysis SW simulation and Performance Analysis In Multi-Processing Embedded Systems Eugenio Villar University of Cantabria Context HW/SW Embedded Systems Design Flow HW/SW Simulation Performance Analysis Design

More information

An area optimized FIR Digital filter using DA Algorithm based on FPGA

An area optimized FIR Digital filter using DA Algorithm based on FPGA An area optimized FIR Digital filter using DA Algorithm based on FPGA B.Chaitanya Student, M.Tech (VLSI DESIGN), Department of Electronics and communication/vlsi Vidya Jyothi Institute of Technology, JNTU

More information

An Optimized Design of High-Speed and Energy- Efficient Carry Skip Adder with Variable Latency Extension

An Optimized Design of High-Speed and Energy- Efficient Carry Skip Adder with Variable Latency Extension An Optimized Design of High-Speed and Energy- Efficient Carry Skip Adder with Variable Latency Extension Monisha.T.S 1, Senthil Prakash.K 2 1 PG Student, ECE, Velalar College of Engineering and Technology

More information

Sno Projects List IEEE. High - Throughput Finite Field Multipliers Using Redundant Basis For FPGA And ASIC Implementations

Sno Projects List IEEE. High - Throughput Finite Field Multipliers Using Redundant Basis For FPGA And ASIC Implementations Sno Projects List IEEE 1 High - Throughput Finite Field Multipliers Using Redundant Basis For FPGA And ASIC Implementations 2 A Generalized Algorithm And Reconfigurable Architecture For Efficient And Scalable

More information

PV SYSTEM BASED FPGA: ANALYSIS OF POWER CONSUMPTION IN XILINX XPOWER TOOL

PV SYSTEM BASED FPGA: ANALYSIS OF POWER CONSUMPTION IN XILINX XPOWER TOOL 1 PV SYSTEM BASED FPGA: ANALYSIS OF POWER CONSUMPTION IN XILINX XPOWER TOOL Pradeep Patel Instrumentation and Control Department Prof. Deepali Shah Instrumentation and Control Department L. D. College

More information

Statement of Research Weiwei Chen

Statement of Research Weiwei Chen Statement of Research Weiwei Chen Embedded computer systems are ubiquitous and pervasive in our modern society with a wide application domain, such as automotive and avionic systems, electronic medical

More information

Power Consumption Model for Partial and Dynamic Reconfiguration

Power Consumption Model for Partial and Dynamic Reconfiguration Power Consumption Model for Partial and Dynamic Re Robin Bonamy CAIRN-IRISA Université de Rennes1 Lannion robin.bonamy@irisa.fr Daniel Chillet CAIRN-IRISA Université de Rennes1 Lannion daniel.chillet@irisa.fr

More information

Heterogeneous Concurrent Error Detection (hced) Based on Output Anticipation

Heterogeneous Concurrent Error Detection (hced) Based on Output Anticipation International Conference on ReConFigurable Computing and FPGAs (ReConFig 2011) 30 th Nov- 2 nd Dec 2011, Cancun, Mexico Heterogeneous Concurrent Error Detection (hced) Based on Output Anticipation Naveed

More information

Merging Propagation Physics, Theory and Hardware in Wireless. Ada Poon

Merging Propagation Physics, Theory and Hardware in Wireless. Ada Poon HKUST January 3, 2007 Merging Propagation Physics, Theory and Hardware in Wireless Ada Poon University of Illinois at Urbana-Champaign Outline Multiple-antenna (MIMO) channels Human body wireless channels

More information
