Embedded System Hardware - Reconfigurable Hardware -
Peter Marwedel, Informatik 2, TU Dortmund, Germany
Energy Efficiency of FPGAs
[Figure: energy efficiency in GOPs/J. Courtesy: Philips; Hugo De Man, IMEC, 27]
Reconfigurable Logic
Full custom chips may be too expensive, software too slow.
Combine the speed of HW with the flexibility of SW:
- HW with programmable functions and interconnect.
- Use of configurable hardware; common form: field programmable gate arrays (FPGAs).
Applications: bit-oriented algorithms like
- encryption,
- fast object recognition (medical and military),
- adapting mobile phones to different standards.
Very popular devices from
- XILINX (XILINX Virtex II are recent devices),
- Actel, Altera and others.
Floor-plan of VIRTEX II FPGAs
Virtex II Configurable Logic Block (CLB)
Virtex II Slice (simplified)
Look-up tables LUT F and G can each be used to compute any Boolean function of up to 4 variables.
Example: [truth table over the inputs a, b, c, d with output G]
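To make the look-up table idea concrete, here is a minimal Python sketch of a 4-input LUT modelled as a 16-entry truth table. It is an illustration only, not Xilinx's implementation: the helper names `make_lut` and `lut_read` and the example function G are assumptions chosen for this sketch.

```python
def make_lut(func, n_inputs=4):
    """Precompute the truth table of `func` as a list of 2**n_inputs bits.

    This mirrors configuring a LUT: the function is evaluated once per
    input combination and only the resulting bits are stored.
    """
    table = []
    for index in range(2 ** n_inputs):
        # bit 0 of the index is input a, bit 1 is b, and so on
        bits = [(index >> k) & 1 for k in range(n_inputs)]
        table.append(func(*bits) & 1)
    return table


def lut_read(table, a, b, c, d):
    """Evaluate the configured function: the inputs only form an address."""
    index = a | (b << 1) | (c << 2) | (d << 3)
    return table[index]


# Hypothetical example function: G = (a AND b) XOR (c OR d)
g_table = make_lut(lambda a, b, c, d: (a & b) ^ (c | d))

if __name__ == "__main__":
    print(g_table)
    print(lut_read(g_table, 1, 1, 0, 0))
```

Because the table has one entry per input combination, any of the 2^16 possible functions of four variables can be stored; changing the function means rewriting the table, not the wiring.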
Virtex II (Pro) Slice
[Source: Xilinx Inc.: Virtex-II Pro Platform FPGAs: Functional Description, Sept. 22, www.xilinx.com]
Number of resources available in Virtex II Pro devices
[Source: Xilinx Inc.: Virtex-II Pro Platform FPGAs: Functional Description, Sept. 22, www.xilinx.com]
Hierarchical Routing Resources Interconnect
Virtex II Pro Devices include up to 4 PowerPC processor cores
[Source: Xilinx Inc.: Virtex-II Pro Platform FPGAs: Functional Description, Sept. 22, www.xilinx.com]
Memory
Memory
For the memory, efficiency is again a concern:
- speed (latency and throughput); predictable timing
- energy efficiency
- size
- cost
- other attributes (volatile vs. persistent, etc.)
Access times and energy consumption increase with the size of the memory
Example (CACTI model):
"Currently, the size of some applications is doubling every months" [STMicroelectronics, Medea+ Workshop, Stuttgart, Nov. 23]
Access times and energy consumption for multi-ported register files
[Figure: three plots of cycle time (ns), area and power (W) versus register file size. Rixner's et al. model [HPCA ], technology of .8 µm. Source: H. Valero, 2]
How much of the energy consumption of a system is memory-related?
[Figure: two pie charts, "Mobile PC Thermal Design (TDP) System Power" (caption: CPU Dominates Thermal Design Power) and "Mobile PC Average System Power" (caption: Multiple Platform Components Comprise Average Power). Segments: processor, Memory+Graphics, LCD, HDD, Power Supply, Other. Note: Based on actual measurements.]
[Courtesy: N. Dutt; Source: V. Tiwari]
Energy consumption in mobile devices
[O. Vargas (Infineon Technologies): Minimum power consumption in mobile-phone memory subsystems; Pennwell Portable Design - September 25]
Thanks to Thorsten Koch (Nokia/Univ. Dortmund) for providing this source.
Access times will be a problem
Speed gap between processor and main DRAM increases.
- early 6ties (Atlas): page fault ~ 25 instructions
- 22 (2 GHz µp): access to DRAM ~ 5 instructions
- penalty for cache miss about same as for page fault in Atlas
[Figure: speed versus years, with separate growth curves (rates per annum) for CPU and DRAM; annotation "2x every 2 years"]
[P. Machanik: Approaches to Addressing the Memory Wall, TR Nov. 22, U. Brisbane]
Hierarchical memories using scratch pad memories (SPM)
SPM is a small, physically separate memory mapped into the address space.
Selection is by an appropriate address decoder (simple!); no tag memory is needed.
Example: ARM7TDMI cores, well-known for low power consumption.
[Figure: address space up to FFF.. containing the scratch pad memory and main memory; hierarchy of processor, SPM and main memory, with a "select SPM" signal]
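The address-decoder selection can be sketched in a few lines of Python. This is a toy model under stated assumptions: the base address `SPM_BASE`, the size `SPM_SIZE` and the names `select_memory` and `MemorySystem` are hypothetical and do not describe any particular ARM7TDMI board.

```python
SPM_BASE = 0x0000  # assumed start address of the scratch pad
SPM_SIZE = 0x1000  # assumed size: 4 KiB


def select_memory(address):
    """Address decoder: a simple range check, no tag lookup involved."""
    if SPM_BASE <= address < SPM_BASE + SPM_SIZE:
        return "spm"
    return "main"


class MemorySystem:
    """Byte-addressed toy model that counts accesses per memory."""

    def __init__(self):
        self.spm = bytearray(SPM_SIZE)
        self.main = {}  # sparse model of main memory
        self.accesses = {"spm": 0, "main": 0}

    def write(self, address, value):
        target = select_memory(address)
        self.accesses[target] += 1
        if target == "spm":
            self.spm[address - SPM_BASE] = value & 0xFF
        else:
            self.main[address] = value & 0xFF

    def read(self, address):
        target = select_memory(address)
        self.accesses[target] += 1
        if target == "spm":
            return self.spm[address - SPM_BASE]
        return self.main.get(address, 0)


if __name__ == "__main__":
    mem = MemorySystem()
    mem.write(0x0010, 0xAB)   # lands in the scratch pad
    mem.write(0x8000, 0xCD)   # lands in main memory
    print(mem.read(0x0010), mem.read(0x8000), mem.accesses)
```

Which memory serves an access is fixed by the address alone, so the placement of code and data decides how often the scratch pad is used.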
Comparison of currents using measurements
E.g.: ATMEL board with ARM7TDMI and ext. SRAM
[Figure: bar chart of the current (mA) for a 32-bit load instruction (Thumb), split into Core+SPM current and main memory current, for four placements: Prog Main/Data Main, Prog Main/Data SPM, Prog SPM/Data Main, Prog SPM/Data SPM]
Why not just use a cache? ()
Energy for parallel access of sets, in comparators, muxes.
[Figure: energy per access [nJ] versus memory size for a scratch pad and for 2-way caches covering address spaces of different sizes]
[R. Banakar, S. Steinke, B.-S. Lee, 2]
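The extra work a cache performs per access can be illustrated with a small Python sketch of a 2-way set-associative lookup. It only counts tag comparisons and does not model energy; the class name `TwoWayCache`, the parameters `num_sets` and `line_size`, and the replacement policy are assumptions made for this illustration.

```python
class TwoWayCache:
    """Toy 2-way set-associative cache that only tracks tags."""

    def __init__(self, num_sets=64, line_size=16):
        self.num_sets = num_sets
        self.line_size = line_size
        self.tags = [[None, None] for _ in range(num_sets)]
        self.next_victim = [0] * num_sets  # way to replace on the next miss
        self.tag_compares = 0

    def access(self, address):
        """Return True on a hit, False on a miss (the line is then filled)."""
        line = address // self.line_size
        index = line % self.num_sets
        tag = line // self.num_sets
        ways = self.tags[index]
        # Hardware reads and compares the tags of both ways in parallel,
        # so every access costs two comparisons, hit or miss.
        self.tag_compares += len(ways)
        for way, stored in enumerate(ways):
            if stored == tag:
                self.next_victim[index] = 1 - way  # keep the way just used
                return True
        victim = self.next_victim[index]
        ways[victim] = tag
        self.next_victim[index] = 1 - victim
        return False


if __name__ == "__main__":
    cache = TwoWayCache()
    print(cache.access(0x0000), cache.access(0x0000), cache.tag_compares)
```

A scratch pad access needs none of these comparisons, since the address decoder alone selects the memory.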
Influence of the associativity
Parameters different from previous slides
[P. Marwedel et al., ASPDAC, 24]
D/A-Converters
Embedded System Hardware
Embedded system hardware is frequently used in a loop ("hardware in a loop"):
[Figure: hardware-in-a-loop diagram, including actuators]
For impressive display technology see http://www.date-conference.com/conference/ 23/keynotes/index.htm
Kirchhoff's junction rule
Kirchhoff's Current Law, Kirchhoff's first rule
The principle of conservation of electric charge implies that: At any point in an electrical circuit where charge density is not changing in time, the sum of currents flowing towards that point is equal to the sum of currents flowing away from that point.
Example [www.wikipedia.org]: $i_1 + i_4 = i_2 + i_3$, i.e. $-i_1 + i_2 + i_3 - i_4 = 0$
Formally, for any node in a circuit: $\sum_k i_k = 0$
Count current flowing away from node as negative.
Kirchhoff's loop rule
Kirchhoff's Voltage Law, Kirchhoff's second rule
The principle of conservation of energy implies that: The directed sum of the electrical potential differences around a closed circuit must be zero. Otherwise, it would be possible to build a perpetual motion machine that passed a current in a circle around the circuit.
Example [www.wikipedia.org]: $V_1 + V_3 - V_2 + V_4 = 0$
Formally, for any loop in a circuit: $\sum_k V_k = 0$
Count voltages traversed against arrow direction as negative.
$V_3 = R_3 \cdot I$ if current counted in the same direction as $V_3$
$V_3 = -R_3 \cdot I$ if current counted in the opposite direction as $V_3$
Operational Amplifiers (Op-Amps)
Operational amplifiers (op-amps) are devices amplifying the voltage difference between two input terminals by a large gain factor $g$:
$V_{out} = (V_+ - V_-) \cdot g$
High-impedance input terminals: currents into inputs $\approx 0$
For an ideal op-amp: $g \to \infty$ (in practice: $g$ may be around $10^4 \ldots 10^6$)
[Figure: op-amp symbol with inputs V+ and V-, output Vout, supply voltage and ground; photo of an op-amp in a separate package (TO-5) [wikipedia]]
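Op-Amps with feedback
In circuits, negative feedback is used to define the actual gain.
[Figure: op-amp with V+ tied to ground and feedback resistor $R_1$ between the output and the inverting input; current $I$ through $R_1$]
Due to the feedback to the inverted input, $R_1$ reduces voltage $V_-$. To which level?
$V_{out} = -g \cdot V_-$ (op-amp feature)
$I \cdot R_1 + V_{out} - V_- = 0$ (loop rule)
$\Rightarrow I \cdot R_1 + (-g) \cdot V_- - V_- = 0$
$\Rightarrow (1 + g) \cdot V_- = I \cdot R_1$
$\Rightarrow V_- = \frac{I \cdot R_1}{1 + g}$
$V_{-,ideal} = \lim_{g \to \infty} \frac{I \cdot R_1}{1 + g} = 0$
$V_-$ is called virtual ground: the voltage is 0, but the terminal may not be connected to ground.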
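The virtual-ground result can be checked numerically. The short Python sketch below evaluates V- = I * R_feedback / (1 + g) for increasing gain; the function name `inverting_input_voltage` and the sample values (1 mA, 1 kOhm) are assumptions chosen for illustration.

```python
def inverting_input_voltage(current, r_feedback, gain):
    """Voltage at the inverting input of an op-amp with negative feedback.

    Follows from V_out = -g * V_- and the loop rule
    I * R_feedback + V_out - V_- = 0.
    """
    return current * r_feedback / (1.0 + gain)


if __name__ == "__main__":
    current = 1e-3       # assumed current through the feedback resistor: 1 mA
    r_feedback = 1e3     # assumed feedback resistance: 1 kOhm
    for gain in (1e2, 1e4, 1e6):
        v_minus = inverting_input_voltage(current, r_feedback, gain)
        print(f"g = {gain:.0e}: V- = {v_minus:.3e} V")
```

The larger the gain, the closer V- gets to 0 V, which is why the inverting input behaves like ground without being connected to it.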
Digital-to-Analog (D/A) Converters
Various types, can be quite simple, e.g.:
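Output voltage ~ no. represented by x
Loop rule: $I_i = x_i \cdot \frac{V_{ref}}{2^{3-i} \cdot R}$
Junction rule: $I = \sum_{i=0}^{3} I_i$
Hence: $I = x_3 \frac{V_{ref}}{R} + x_2 \frac{V_{ref}}{2R} + x_1 \frac{V_{ref}}{4R} + x_0 \frac{V_{ref}}{8R} = \frac{V_{ref}}{R} \sum_{i=0}^{3} x_i \cdot 2^{i-3}$
$I \sim nat(x)$, where $nat(x)$: natural number represented by $x$.
Op-amp turns this current into a voltage ~ $nat(x)$:
Junction rule: $I = I'$
Loop rule: $V + R_1 \cdot I' = 0$
Hence: $V + R_1 \cdot I = 0$
Finally: $V = -V_{ref} \cdot \frac{R_1}{R} \sum_{i=0}^{3} x_i \cdot 2^{i-3} = -V_{ref} \cdot \frac{R_1}{8R} \cdot nat(x)$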
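As a numerical check of this derivation, the following Python sketch computes the output of a 4-bit binary-weighted D/A converter with an ideal op-amp. The names `nat` and `dac_output`, the bit ordering (least significant bit first) and the sample component values are assumptions made for this example.

```python
def nat(x):
    """Natural number represented by the bits x = [x0, x1, x2, x3]."""
    return sum(bit << i for i, bit in enumerate(x))


def dac_output(x, v_ref, r, r1):
    """Output voltage of a 4-bit weighted-resistor DAC (ideal op-amp assumed).

    Bit x_i switches V_ref onto a resistor of 2**(3 - i) * r (loop rule);
    the currents add up at the virtual ground (junction rule) and the
    feedback resistor r1 turns the sum into a voltage.
    """
    current = sum(bit * v_ref / (2 ** (3 - i) * r) for i, bit in enumerate(x))
    return -r1 * current


if __name__ == "__main__":
    v_ref, r, r1 = 8.0, 1000.0, 1000.0   # assumed sample values
    for value in (0, 5, 15):
        bits = [(value >> i) & 1 for i in range(4)]
        print(value, dac_output(bits, v_ref, r, r1))
```

With these sample values the factor V_ref * R_1 / (8 R) equals 1 V, so the magnitude of the output in volts matches the number encoded by x.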
Output
Embedded System Hardware
Embedded system hardware is frequently used in a loop ("hardware in a loop"):
[Figure: hardware-in-a-loop diagram, including actuators]
Actuators and output
Huge variety of actuators and output devices, impossible to present all of them.
Microsystems motors as examples:
[Figures: two microsystem motors (MCNC)]
Actuators and output (2)
[Figure. Courtesy: E. Obermeier, MAT, TU Berlin]
Summary
Hardware in a loop
- Sensors
- Discretization
- Information processing
  - Importance of energy efficiency
  - Special purpose HW very expensive
  - Energy efficiency of processors
  - Code size efficiency
  - Run-time efficiency
  - Reconfigurable Hardware
- D/A converters
- Actuators