TKT-1212 Digitaalijärjestelmien toteutus Lecture 15 - System design trends & challenges Erno Salminen, TUT, 2012 Sidenote regarding exercises: Always mark the initial state of the FSM!
Outline and acknowledgements. Outline: challenges in digital systems; introduction to system design. Acknowledgements: most slides were made by Ari Kulmala; The International Technology Roadmap for Semiconductors; M. Keating and P. Bricaud, Reuse Methodology Manual for System-on-a-Chip Designs, 3rd Edition.
System drivers: There isn't any one-size-fits-all solution. Digital products fall into e.g. 6 different segments, each emphasizing different aspects, e.g. security vs. cost. Source: http://www.itrs.net/links/2009ITRS/2009Chapters_2009Tables/2009_SysDrivers.pdf
Implementation style trend: Moore's law and more. Moore's law states roughly: the number of components [transistors] on an integrated circuit doubles every 1.5-2 years. SiP: many ICs in a single package (system-in-package).
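As a rough illustration of what a doubling period of 1.5-2 years means over a longer horizon, the sketch below converts it into a growth factor per decade. The function name and the ten-year horizon are choices made for this example only, not part of the original slides.

```python
def growth_factor(years: float, doubling_period_years: float) -> float:
    """Factor by which the component count grows over `years`,
    assuming one doubling every `doubling_period_years`."""
    return 2.0 ** (years / doubling_period_years)


# The two ends of the 1.5-2 year range quoted for Moore's law.
for period in (1.5, 2.0):
    factor = growth_factor(10, period)
    print(f"doubling every {period} years: about {factor:.0f}x per decade")
```

Under these assumptions a decade corresponds to roughly 32x (2-year doubling) to roughly 100x (1.5-year doubling) more components.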
System-on-chip (SoC): an integrated circuit (IC) that integrates most (all?) components of a computer or other electronic system into a single chip: processors, memories, hardware accelerators, I/Os, and interfaces to external memories, analog devices, etc. In 1990-1992, mobile phones included 15 ICs and 800 other discrete components; in 2002, 3-4 ICs and 200 discrete components [Neuvo, 2004]. Using an SoC reduces system cost ($) and power and offers faster operation. Two main SoC types: 1. power-efficient (PE) for mobile; 2. high-performance (HP, later a.k.a. CS, consumer stationary). Texas Instruments OMAP chips have been used e.g. in many smartphones. [Y. Neuvo, Cellular phones as embedded systems, IEEE Int'l Solid-State Circuits Conference, Digest of Technical Papers, 2004, pp. 32-37.]
System-on-chip types. 1. Consumer stationary: e.g. a high-end game machine (like PS3). Performance is the most important differentiator. Wide application domain. Many functions implemented with software. General-purpose processing engines. Has fewer processing engines than mobile SoCs. Relatively long life cycle. Easy to add or modify functions. Die size about 220 mm2. 2. Power-efficient: e.g. mobile phones. Power consumption is strictly limited by the battery (lifetime). High performance required nevertheless. Specialized application domain with rather predictable processing (e.g. channel coding, video). Some general-purpose and many specialized HW units. Rapid progress and hence a short life cycle. Die size about 64 mm2.
Challenges in digital system design: high-level challenges, not taking into account physical and manufacturing issues. 1. Managing design complexity and verification. 2. Minimizing power consumption. 3. Economics. 4. (Optimizing chip area & performance.)
Design challenge #1: Complexity. Good news: "Over the past ten years, reuse leverage more than doubled, and more reuse tends to translate into less project effort, shorter cycle times as well as fewer spins and less schedule slip." Bad news: 85%-89% of IC projects miss their original schedules. Schedule slip is 5-30% [Accenture's report] or even 44% [Numetrics report]. One reason is that reusable components are not, after all, easy to integrate [Eetimes]. Sources: http://eetimes.eu/semi/showarticle.jhtml?articleid=204702114 and http://www.eetasia.com/art_8800520301_480200_NT_68f71562.HTM. Fig. SoC size as function of time (CAGR = compound annual growth rate).
Design complexity: Logic size increases over 10x per decade! There will be lots of people, lots of files, lots of requirements. All things must be at least double-checked, hence all steps must be repeatable. Reuse and automation are crucial. Be extra careful with versioning and verification.
Example: IBM/Sony/Toshiba CELL BE, 2005-. Commercial chip with multiple heterogeneous processors: one 64-b PowerPC (32 KB L1) + eight 128-b SPEs (256 KB scratch-pad mem), each with DMA; 512 KB L2. 3.2 GHz, orig. 90 nm, 221 mm2, 240M transistors, theor. max 230 GFLOPS. Acronyms: SPE = synergistic processor element; PPE = dual-threaded power processor element; EIB = element interconnect bus (actually a ring).
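The theoretical maximum of about 230 GFLOPS can be sanity-checked with simple arithmetic. The sketch below assumes that each of the nine cores (8 SPEs plus the PowerPC core's vector unit) can issue one 4-wide single-precision fused multiply-add per clock cycle; that per-core issue rate is an assumption of this example and is not stated on the slide. Only the core counts and the 3.2 GHz clock come from the slide.

```python
CLOCK_GHZ = 3.2      # from the slide
NUM_SPE = 8          # from the slide
NUM_PPE = 1          # from the slide
SIMD_LANES = 4       # assumption: 128-bit vector of 32-bit floats
FLOPS_PER_LANE = 2   # assumption: fused multiply-add = 2 operations per cycle


def peak_gflops(cores: int) -> float:
    """Peak single-precision GFLOPS for `cores` identical vector units."""
    return cores * SIMD_LANES * FLOPS_PER_LANE * CLOCK_GHZ


spe_peak = peak_gflops(NUM_SPE)
total_peak = peak_gflops(NUM_SPE + NUM_PPE)
print(f"SPEs only: {spe_peak:.1f} GFLOPS, with PPE: {total_peak:.1f} GFLOPS")
```

Under these assumptions the total lands close to the 230 GFLOPS quoted above; sustained performance on real workloads is of course lower than such a peak figure.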
New architectures: Intel Terascale/Polaris, 2007-. Research processor with 80 homogeneous cores (small processors), interconnected with a 2-D mesh network-on-chip. Stacked chip: to solve local memory problems. Parallel programming becomes mainstream. How to do it efficiently? Fig. 1. Die stacking improves performance. Fig. 2. Intel Polaris.
High-performance SoC trend (figure). DPE = data processing engine. Source: ITRS 2006 update, http://www.itrs.net/links/2006update/finaltopost/01_sysdrivers_2006update.pdf
SoC-PE design complexity trends (figure). #DPEs is 3-4x compared to SoC-HP. Memory dominates the area.
Additional constraints (power-efficient SoC) (figure; annotations: Problem #1, Problem #2, "More performance, very well"). Source: ITRS 2005, http://www.itrs.net/links/2005itrs/sysdrivers2005.pdf
Challenge #2: Power consumption. Chip power consumption can be defined as $P_{avg} = P_{dynamic} + P_{leakage}$. Traditionally, $P_{dynamic}$ dominated in CMOS and $P_{leakage}$ was very small. However, at 90 nm and below, leakage becomes an increasingly important factor (see fig.). Other factors, such as short-circuit power when a gate switches state, are often ignored. Power optimization is done both at design time and at runtime. Fig. Leakage vs. dynamic power [http://asicsoc.blogspot.com/2008/03/leakage-powertrends.html]
Dynamic power consumption: $P_{dynamic} = K \cdot C_{out} \cdot V_{dd}^2 \cdot f$, where K = average number of transitions of the output node every cycle divided by two (e.g. ½ means that there is a single transition each cycle; includes glitches etc.), $V_{dd}$ = supply voltage, f = clock frequency, $C_{out}$ = output capacitance. Note the square-law dependence on $V_{dd}$. Typically, the higher the f, the higher the $V_{dd}$ required.
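To make the square-law dependence concrete, the sketch below evaluates the dynamic power formula at two operating points. All numeric values (activity factor, capacitance, voltages, frequencies) are hypothetical and chosen only for illustration.

```python
def dynamic_power(k: float, c_out: float, v_dd: float, f: float) -> float:
    """Dynamic power in watts: P = K * C_out * V_dd^2 * f.

    k     : activity factor (transitions per cycle divided by two)
    c_out : switched output capacitance in farads
    v_dd  : supply voltage in volts
    f     : clock frequency in hertz
    """
    return k * c_out * v_dd ** 2 * f


# Hypothetical operating points: nominal vs. reduced voltage and frequency.
nominal = dynamic_power(k=0.1, c_out=1e-9, v_dd=1.2, f=500e6)
scaled = dynamic_power(k=0.1, c_out=1e-9, v_dd=0.9, f=250e6)

print(f"nominal: {nominal * 1e3:.2f} mW")
print(f"scaled:  {scaled * 1e3:.2f} mW ({scaled / nominal:.0%} of nominal)")
```

Halving the frequency alone would halve the dynamic power; because the lower frequency also permits a lower supply voltage in this example, the combined saving is considerably larger.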
Power breakdown in high-performance SoC (figure: static and dynamic power). Variability between devices and temperature effects will increase leakage notably. Power consumption of one DPE reduces but their number increases. Battery life is not an issue, but chip packaging and cooling are. Source: ITRS 2006 update, http://www.itrs.net/links/2006update/finaltopost/01_sysdrivers_2006update.pdf
Power breakdown in mobile SoC (figure: dynamic and static power). A larger fraction of power is static since these devices are always on. Note also that the memory's static (leakage) power is larger than its dynamic power. Note the difference in y-axis scales: 14 W vs. 600 W (previous slide).
Challenge #3: Economics. Rapid technology change shortens product life cycles and makes time-to-market a critical issue. Margins are very small in many business areas. Very high design productivity is needed. "Sure we can build that, but will anyone want to buy it at such a price?" Maximizing volume reduces unit costs. Speed binning: chips are priced according to their frequency. Note that many customers in consumer markets appreciate slightly different things than engineers do.
Simplified Electronic Product Development Cost Model (figure). Source: http://www.itrs.net/links/2005itrs/design2005.pdf
Major steps in design productivity: Reuse, higher abstraction level and design automation are the intertwined cornerstones. Lots of research needed in general. In many cases, adopting the existing good methods is already a step forward for many companies.
Design development costs: Manufacturing non-recurring engineering (NRE) costs are on the order of millions of dollars (mask set + probe card) for high-end chips. High NRE costs necessitate high volume. Design errors can cause silicon re-spins that multiply manufacturing NRE. ASIC manufacturing cycle times are measured in weeks, with low uncertainty. Design and verification cycle times are measured in months or years, with high uncertainty. Test cost has grown faster than manufacturing costs. Software can account for 80% of embedded-systems development cost. Verification engineers outnumber design engineers on microprocessor project teams. Verification is top priority from day one, not the last phase just before delivery. Source: http://www.itrs.net/links/2005itrs/design2005.pdf
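The link between NRE, volume and re-spins can be illustrated with a very simple amortization model. This is only a sketch: the cost figures below are hypothetical, and the model assumes that each re-spin repeats the full manufacturing NRE and ignores design, verification and software costs.

```python
def unit_cost(nre: float, volume: int, recurring_cost: float, respins: int = 0) -> float:
    """Cost per chip when manufacturing NRE is spread over the whole volume.

    Assumption: every re-spin costs one additional full manufacturing NRE.
    """
    total_nre = nre * (1 + respins)
    return total_nre / volume + recurring_cost


NRE_USD = 2_000_000      # hypothetical mask set + probe card cost
RECURRING_USD = 5.0      # hypothetical per-chip manufacturing cost

for volume in (10_000, 100_000, 1_000_000, 10_000_000):
    first_pass = unit_cost(NRE_USD, volume, RECURRING_USD)
    one_respin = unit_cost(NRE_USD, volume, RECURRING_USD, respins=1)
    print(f"{volume:>10} chips: {first_pass:8.2f} USD, with one re-spin {one_respin:8.2f} USD")
```

At low volumes the NRE share dominates the unit cost and a single re-spin nearly doubles it, while at high volumes the unit cost approaches the recurring cost.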
System design process
System design phases: Blocks are preferably reusable IPs. Blocks implemented as in earlier lectures with reusable macros.
System Design Process. 1. System specification: Identify the system requirements (engineering, marketing). Formulate the preliminary specification. 2. Develop a behavioural model and test environment: Basic algorithms and their usability (e.g. good enough video encoding quality). Executable specification, golden reference, e.g. C++/SystemC. Develop an environment for verifying the functionality and performance of the design. 3. Model refinement: Do not introduce too many details at first. Add them gradually. Try to discard definitely bad choices early. E.g. floating-point model -> fixed-point model -> cycle-accurate and bit-accurate model.
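The refinement step from a floating-point model to a fixed-point model, checked against a golden reference, can be sketched as follows. The 3-tap filter, the test samples and the word lengths are hypothetical, and Python is used here only for brevity (the slide mentions C++/SystemC for executable specifications). The fixed-point model does not model overflow or saturation, so it is not yet a bit-accurate model.

```python
def fir_float(samples, coeffs):
    """Golden reference: direct-form FIR filter in floating point."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * samples[n - k]
        out.append(acc)
    return out


def quantize(x, frac_bits):
    """Round x to a signed integer with `frac_bits` fractional bits."""
    return int(round(x * (1 << frac_bits)))


def fir_fixed(samples, coeffs, frac_bits):
    """Refined model: same filter using integer arithmetic only."""
    q_samples = [quantize(s, frac_bits) for s in samples]
    q_coeffs = [quantize(c, frac_bits) for c in coeffs]
    out = []
    for n in range(len(q_samples)):
        acc = 0  # integer accumulator with 2 * frac_bits fractional bits
        for k, c in enumerate(q_coeffs):
            if n - k >= 0:
                acc += c * q_samples[n - k]
        # Convert back to a real number only for comparison with the reference.
        out.append(acc / float(1 << (2 * frac_bits)))
    return out


def max_error(reference, candidate):
    """Largest absolute deviation from the golden reference."""
    return max(abs(r - c) for r, c in zip(reference, candidate))


samples = [0.0, 0.5, -0.25, 0.75, 0.1, -0.6]   # hypothetical test stimulus
coeffs = [0.25, 0.5, 0.25]                      # hypothetical filter taps

golden = fir_float(samples, coeffs)
for bits in (4, 8, 12):
    err = max_error(golden, fir_fixed(samples, coeffs, bits))
    print(f"{bits:2d} fractional bits: max error {err:.6f}")
```

Running the same stimulus through both models and comparing the error makes it possible to discard clearly insufficient word lengths early, before any cycle-accurate modelling is done.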
System Design Process (2). 4. HW/SW partitioning (decomposition): Largely a manual process guided by experience and understanding of tradeoffs (area vs. performance). Define the interfaces between HW and SW, and the communication protocols. 5. Specify and develop a hardware architectural model: IPs, memory architecture, interconnection structure. Bandwidth, latency. Start from high-level models, transaction-level modeling. Refine the architecture until it meets the requirements. 6. Refine and test the architectural model (co-simulation): A behavioural model of the HW. A prototype version of the SW. System-level test SW. Key to success: efficient HW/SW co-design.
System Design Process (3). 7. Specify and implement blocks: Reuse if possible. E.g. HW specification: basic functions; timing, area, and power requirements; physical and SW interfaces; descriptions of the I/O pins and register map. Separate testbenches for all HW components! 8. Integration: First small blocks together, then slightly larger ones, then... Check the assumptions and estimations made in earlier phases.
System design timeline: Spiral flow instead of waterfall. Maximum parallelism: HW development (prototypes, emulation), SW development, verification (verification environment), HW/SW integration. Iteration after iteration: inevitable; gradual refinement. Physical issues taken into account early.
Figure source: ITRS 2006 update, http://www.itrs.net/links/2006update/finaltopost/02_design_2006update.pdf
Teaching in DCS
Simplified course prerequisites 11/12, compiled by ES (flattened chart; arrows marked each prerequisite as mandatory or recommended). Prerequisites / degree-programme specific. Bachelor's degree 25 cr; DI (M.Sc.) degree 30 cr. Courses shown: TKT-1101 DigTeknPer. 4 cr (s1); TKT-1202 DigSuunn 5 cr (s1); TKT-1212 DigJärjTot 8 cr (k3); TKT-2431 SoC-Suunn 5 cr (s1); TKT-3541 Soc-Alustat 5 cr (k3); TKT-1400 ASIC I 5 cr (s1) or ELE1010; TKT-1220 Aritmetiikka 4 cr (k3); TKT-1230 Laboratorio 3 cr (k4); TKT-1410 SuunnVarm 5 cr (k3); TKT-1527 DigSysDesIss. 5 cr (k3); TKT-2526 Project work 5-8 cr; TKT-1110 Mikroprosess. 5 cr (k3) or ELE-2300; TKT-3200 Tietokonetekn. I 5 cr (s1); TKT-3400 Tietokonetekn. II 5 cr (k3); TKT-3500 Mikrokontroll. 5 cr (s1); TKT-3526 Proc. Design 5 cr (k3); TKT-9626/9636 Seminar 3-6 cr; TKT-1540/1550 DI-työ semin. 1+0 cr; TKT-1570 Kandityösemin. 8 cr; TKT-2301 Lang. sens.v sov. 5 cr (s1); TKT-2456 Wireless.sens. 5 cr (k3); TKT-9646 Colloqium 3 cr; TKT-2530 SatellPaikann 5 cr (s1); TKT-9617 ScientificPubl 6 cr (s1); TKT-2566 GNSS. 5 cr (k3); TKT-2556 Inertial nav. 5 cr (k4). Either TKT-1202 or TKT-1212 is accepted as the prerequisite. Check the exact prerequisite requirements in the study guide.
Summary: Increasingly complex systems need new methodologies: hierarchical, gradual refinement. Reuse everything you can. Make sure that your own work is reusable: accessible, easy to start with, well commented, tool-independent. Invest in executable specifications. Divergence into two types of SoCs: high-performance & low-power. Several advances and active research are required in order to keep pushing the technology to its limits. Parallel processing seems the best way to increase performance. New methodologies for SW programmers need to be adopted. Currently, tool support for parallelization is weak.
Course summary. Most important things you must know: (V)HDL basics: entity, architecture, process, signal, data types. What happens in simulation and synthesis. Testbench and reuse concepts. Clocking, synchronization. Supplementary knowledge: FPGA technology and project management. You made great progress during the exercises! From a simulated 1-bit adder to an audio synthesizer on FPGA. Please fill in the official course feedback in Kaiku. You can have 1 A4 sheet of your own notes in the exam. See you on the other TKT courses! Thank you for your interest.