ARM: 1176 IEM Reference Methodology


Philip Watson, Implementation Environment Program Manager, ARM

Introduction

ARM and Cadence have been collaborating on low-power methodology development for a number of years to serve their common customers across all market segments, including wireless, consumer, computing, and networking.

In 2005, as fellow members of the Silicon Design Chain, ARM and Cadence developed a low-power test chip that demonstrated 40% power savings compared to a standard timing closure flow. The design was based on ARM's ARM1136JF-S test chip used in their RealView Integrator development boards. It was implemented using Artisan low-power technology libraries and was manufactured on a 90nm TSMC process. Through a combination of automated voltage scaling and clock gating techniques, the chip taped out with a 38% decrease in dynamic power consumption compared to the same design implemented with a standard timing closure flow. Using multi-supply voltage (MSV) RTL synthesis, a 46% saving in leakage power was achieved. Overall, the methodology reduced total power consumption by 40%.

The design was further enhanced in 2006 to add power-gating modes to support power shut-off (PSO) of the core logic region. Using a structured ring methodology and low-power formal verification technology, switch-cell placement and power stitching were automated, and power-switch voltage drop and turn-on analysis were performed. In the low-power regions, analysis showed a leakage power reduction of 98%.

ARM joined both PFI and the Si2 LPC in 2006. In November 2007, ARM and Cadence delivered a PFI silicon proof point based on the ARM1176 processor test chip, also on 90nm. The design demonstrated use of a CPF-enabled flow to deliver an ARM-based SoC implementing advanced power management techniques, including dynamic voltage and frequency scaling (DVFS) and advanced PSO. The design contained three voltage domains (see Figure 165). Silicon measurements showed excellent correlation with power analysis results. Leakage reduction of over 96% was measured in the PSO regions of the design.
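As a rough cross-check of the 2005 test chip figures, the sketch below asks what split between dynamic and leakage power would make a 38% dynamic saving and a 46% leakage saving add up to a 40% total saving. It assumes the total saving is simply the share-weighted average of the two; that model and the function name are illustrative assumptions, not something stated in the text.

```python
# Assumption: total = f * dyn + (1 - f) * leak, where f is the fraction of
# baseline power that is dynamic. The source does not state this model.
def dynamic_share(dyn_saving: float, leak_saving: float, total_saving: float) -> float:
    """Solve total = f*dyn + (1 - f)*leak for f, the dynamic share of baseline power."""
    return (leak_saving - total_saving) / (leak_saving - dyn_saving)

# Savings quoted for the 2005 ARM1136JF-S low-power test chip.
share = dynamic_share(dyn_saving=0.38, leak_saving=0.46, total_saving=0.40)
print(f"implied dynamic share of baseline power: {share:.0%}")
```

Under that assumption the three quoted percentages are mutually consistent if roughly three quarters of the baseline power was dynamic.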

Figure 165. ARM 1176 test chip

ARM-Cadence Implementation Reference Methodologies

For the past five years, ARM and Cadence have been collaborating on Implementation Reference Methodologies (iRMs) for the benefit of their mutual customers. These reference flows enable ARM licensees to customize, implement, verify, and characterize soft IP ARM processors for their chosen process technologies. They provide a predictable route to silicon and a solid basis for custom methodology development.

An iRM comprises a set of flow scripts which, when combined with processor RTL, timing constraints, libraries (front-end standard cell libraries and precompiled memories as appropriate), and Cadence technologies, provide a complete out-of-the-box implementation of the target ARM processor.

Figure 166. iRM concept (ARM soft IP combined with Cadence tools and methodology yields hard IP; targets shown: performance, effort, quality, and time to volume. Tagline: "ARM Implementation Reference Methodology: A proven path for the deployment of ARM IP with best-in-class EDA tools")

ARM and Cadence are now applying the iRM concept to deliver advanced low-power reference flows. The first example of this is the CPF-based low-power iRM for the ARM1176JZF-S core, which was released in December 2007.

ARM1176 Processor

The ARM1176JZ-S and ARM1176JZF-S synthesizable processors are designed for use as applications processors in consumer and wireless products. These are also the first processors to integrate support for ARM Intelligent Energy Manager (IEM) technology, making them ideal for cell phones, PDAs, hand-held games, and other portable consumer devices.

Intelligent Energy Manager (IEM) Technology

IEM is a combined software and hardware technology that dynamically monitors and predicts the performance requirements of multiple applications, and tunes the processor's operating voltage and frequency to match those requirements. IEM reduces the processor's energy consumption by an additional 25% to 50%, extending battery life for portable systems. The IEM-enabled ARM1176JZ-S and ARM1176JZF-S processors include support for the voltage and clock domains required to implement an IEM-enabled system. The IEM technology uses an advanced power management technique called dynamic voltage and frequency scaling (DVFS) to implement the power and energy savings.

Figure 167. ARM IEM technology

ARM1176JZ-S Power Management

ARM1176JZ-S power management incorporates two complementary techniques:

- ARM1176 dormant mode
- IEM compatible core

The ARM1176 dormant mode supports two key capabilities:

- Complete power-off of the core, which reduces leakage in the core to zero
- Retention of the system state in the cache/TCM (tightly coupled memories) at low voltage, which requires isolation cells to clamp the RAM inputs

The key benefits are that dormant mode minimizes energy loss due to leakage power during standby modes of operation, substantially reduces energy consumption, and extends battery life.

The IEM compatible core and design flow enable dynamic voltage and frequency scaling, so that performance can be tuned dynamically, or on-the-fly, to the current performance demand in each mode of operation.

ARM1176JZF-S RTL

To implement IEM, it is necessary to partition the CPU into separate power domains whose supply voltages can be safely scaled independently. The hierarchy of the ARM1176JZF-S RTL is partitioned to reflect the voltage domains:

- VRAM
- VCORE
- VSOC

Placeholder modules for level shifters and clamp cells have been engineered into the RTL hierarchy for the following interfaces between voltage domains:

- VCORE and VRAM
- VSOC and VCORE

Level shifters and clamps are incorporated on the I/Os that operate within the VSOC domain but are sampled in the VCORE domain. The AXI bus interface for the ARM core also supports an asynchronous mode of operation that allows the core voltage (Vcore) to vary with respect to the system-on-chip voltage (Vsoc). When the two voltage levels are the same (Vcore = Vsoc), it is possible to dynamically switch the AXI interface into synchronous mode, without any latency from synchronizing logic.

Figure 168. ARM1176JZF-S RTL (block labels: hardened core, ARM core, TCM and cache RAMs, level-shift/clamp cells, clamps, IEM sync/async interfaces, ACLK; voltage domains: VRAM, VCORE, VSOC)
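The AXI interface rule described above (synchronous mode is only possible when Vcore equals Vsoc) can be sketched as a simple predicate. This is a minimal illustration, not the processor's control logic; the function name and the comparison tolerance are assumptions introduced here.

```python
def sync_mode_allowed(vcore: float, vsoc: float, tol: float = 1e-3) -> bool:
    """Return True when the AXI interface may be switched to synchronous mode.

    The text only says the two voltage levels must be the same; the tolerance
    `tol` (in volts) is a hypothetical parameter for comparing measured values.
    """
    return abs(vcore - vsoc) <= tol

# VSOC stays at 1.08 V in every power mode listed later in the document.
for vcore in (1.08, 0.90, 0.72):
    mode = "synchronous possible" if sync_mode_allowed(vcore, 1.08) else "asynchronous only"
    print(f"Vcore={vcore:.2f} V, Vsoc=1.08 V -> {mode}")
```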

ARM Power Management Kit

The ARM Power Management Kit (PMK) provides components to actively manage dynamic and leakage power in SoC designs. PMK enables a fast and effective implementation of designs with multiple core voltages, power gating and retention flip-flops, back-biasing, and other power saving techniques. The ARM PMK includes up, down, and bi-directional level shifters, on-chip power gating, retention flip-flops, and back-bias compatible well-tap cells.

The ARM Power Management Kit includes:

- Power gates (MT-CMOS): power control of voltage islands via switchable voltage rails, using either header or footer cells
- Level shifter and isolation cells: up- and down-shifting with optional enable signal
- Retention flip-flops: maintaining the flip-flop states after power-down for leakage reduction
- Back-bias support: reducing leakage current via well-biasing with special fill_tie cells
- Always-on buffers: buffering of signals in powered-down areas

Figure 169. ARM Power Management Kit (supply labels: VDD1, VDD2, global VBS, global VDD)

ARM1176JZF-S Low-Power Reference Methodology

As explained earlier, an iRM provides a complete out-of-the-box reference flow for ARM licensees wanting to implement soft IP ARM processors. ARM and Cadence have worked together to apply this concept to a low-power implementation of the ARM1176JZF-S processor.

Figure 170. ARM and Cadence iRM (key components: ARM-Cadence Reference Methodology; ARM-Artisan Metro standard cells; ARM-Artisan Power Management Kit; compiled views for Fire & Ice, VoltageStorm, and CeltIC NDC; timing ECSM extensions; Encounter RTL Compiler; SoC Encounter System; Fire & Ice QXC; CeltIC NDC; VoltageStorm power verification; Encounter Conformal Low Power)

Flow Features

The ARM1176-IEM iRM features:

- A CPF-based flow, in which the CPF file is used to describe the low-power intent and drive the entire design, verification, and implementation flow
- An automated RTL-to-GDS multi-supply voltage (MSV) implementation flow
- Multi-mode, multi-corner (MMMC) analysis and optimization, which ensures that the design is optimized across the complete range of voltages and frequencies
- A variable VDD flow (also called Tri-lib flow), which provides accurate interpolation for DVFS and IR drop analysis, and features Effective Current Source Model (ECSM) extensions to the .lib library files

Library Support

The iRM uses the level shifters and isolation cells available in the Power Management Kit. In addition, the libraries are characterised at multiple voltages to support DVFS implementation.

Level Shifter Usage

Level shifters provide shift-up and shift-down functionality to support the proper interface between islands of different voltage levels. These are available with and without enable/isolation signals, and with different drive sizes. Dual-voltage characterization for all level shifters includes characterization of cells with separate voltage values for input and output voltage.

Figure 171. Level shifters (a down shifter and an up shifter between Island A, the low voltage domain, and Island B, the high voltage domain)

Isolation Cell Usage

Isolation cells are used to isolate switchable power islands with identical voltage levels, and are similar to high-to-low level shifters with an added enable. The fail-safe shifter and isolation cell design allows flexible power-on and power-off sequences without manual intervention in a CPF-enabled flow.
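The cell choices described above can be summarized as a small decision function. This is a simplified sketch for fixed domain voltages; the function name, its return strings, and the idea of reducing the choice to two inputs are assumptions made for illustration, and a real flow takes this from the CPF rules.

```python
def interface_cell(src_v: float, dst_v: float, src_switchable: bool) -> str:
    """Pick the kind of interface cell for a signal going from one island to another.

    src_v / dst_v are the supply voltages of the driving and receiving islands;
    src_switchable says whether the driving island can be powered off.
    """
    if src_v < dst_v:
        kind = "up level shifter"
    elif src_v > dst_v:
        kind = "down level shifter"
    else:
        kind = ""  # identical voltage levels: no shifting needed

    if src_switchable:
        # A switchable source needs its outputs clamped while it is off.
        return f"{kind} with enable" if kind else "isolation cell"
    return kind or "no special cell"

print(interface_cell(0.72, 1.08, src_switchable=False))
print(interface_cell(1.08, 1.08, src_switchable=True))
```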

Figure 172. Isolation cells (an isolation cell with an enable input between Island A, the low voltage domain, and Island B, the high voltage domain)

DVFS Support

The standard cell libraries are characterised at three voltage levels. ECSM library models allow accurate interpolation at the intermediate voltage levels required for DVFS. Regarding library and process support for DVFS, ARM Metro libraries are available for the TSMC CL013G process. Available voltage corners include:

Corner   Voltages
WC       0.72V, 0.9V, 1.08V
TC       0.8V, 1.0V, 1.2V
BC       0.88V, 1.1V, 1.32V

Accurate low-power design is enabled by the required views:

- Timing models (.lib) characterized at available voltage corners, optionally with ECSM extensions for better accuracy
- Noise models (.cdb) characterized at available voltage corners
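The corner voltages listed above follow a simple pattern: each WC and BC value is the TC value minus or plus 10%. The sketch below reproduces the table from the TC voltages. The 10% figure is inferred from the numbers rather than stated in the text, and the function name is hypothetical.

```python
TYPICAL_VOLTAGES = [0.8, 1.0, 1.2]  # TC corner voltages from the text

def corner_voltages(nominal: float, tolerance: float = 0.10) -> tuple[float, float]:
    """Return (worst-case, best-case) supply voltages for a nominal voltage.

    Assumes a symmetric supply tolerance; 10% is inferred from the table.
    """
    worst = round(nominal * (1 - tolerance), 2)
    best = round(nominal * (1 + tolerance), 2)
    return worst, best

for tc in TYPICAL_VOLTAGES:
    wc, bc = corner_voltages(tc)
    print(f"TC {tc:.2f} V -> WC {wc:.2f} V, BC {bc:.2f} V")
```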

CPF Setup

Figure 173. ARM MSV setup

Power domains include all instances that share a power supply. Isolation rules define the location and type of isolation logic to be added and the conditions defining when to enable the isolation logic. Level shifter rules define the location and type of level shifter to be added. An illustration of the CPF setup for the ARM 1176-IEM MSV design follows:

    create_power_domain -name VCORE \
        -instances $VCORE_moduleInst_list \
        -boundary_ports $VCORE_pins \
        -shutoff_condition {SWITCH_VCORE}
    update_power_domain -name VCORE \
        -internal_power_net VDDCORE
    create_global_connection -net VDDCORE -domain VCORE -pins VDD
    create_global_connection -net VSS -domain VCORE -pins VSS
    create_power_domain -name VRAM
    create_power_domain -name VSOC -default
    create_isolation_rule -name rule_vcore2vram \
        -from VCORE -to VRAM \
        -isolation_output low \
        -isolation_condition {!RAMCLAMP}
    update_isolation_rules -names rule_vcore2vram \
        -location to -combine_level_shifting \
        -cells {LVLLHEHX8M}
    create_level_shifter_rule -name rule_vram2vcore \
        -from VRAM -to VCORE
    update_level_shifter_rules -names rule_vram2vcore \
        -cells {LVLHLX8M} -location
    create_level_shifter_rule -name rule_vsoc2vcore
    create_isolation_rule -name rule_vcore2vsoc_low
    create_isolation_rule -name rule_vcore2vsoc_high
    create_power_nets -nets VDDRAM
    create_power_nets -nets VDDCORE \
        -external_shutoff_condition {SWITCH_VCORE}
    create_power_nets -nets VDD

ARM 1176-IEM CPF Power Modes

A power mode is a static state of a design in which each power domain operates under a specific nominal condition. Once a power mode is defined, timing constraints can be associated with it. Power modes enable different voltage conditions to be assigned to the voltage domains defined earlier. With CPF, the same power modes are used by both logic synthesis (for example, Cadence Encounter RTL Compiler) and place-and-route (Cadence SoC Encounter System).

Dynamic Voltage and Frequency Scaling

Mode            VSOC    VCORE   VRAM    Target frequency
PM_highV        1.08V   1.08V   1.08V   250MHz
PM_medV         1.08V   0.90V   0.90V   166MHz
PM_lowV         1.08V   0.72V   0.72V   90MHz
VCORE_Dormant   1.08V   Off     0.72V   250MHz

Figure 174. ARM1176-IEM power modes

Sample CPF is shown below:

    create_nominal_condition -name highV -voltage 1.08
    update_nominal_condition -name highV -library_set libs-worst-1.08v
    create_nominal_condition -name medV -voltage 0.90
    create_nominal_condition -name lowV -voltage 0.72
    create_nominal_condition -name OFF -voltage 0
    create_power_mode -name PM_highV -default \
        -domain_conditions {VCORE@highV VRAM@highV VSOC@highV}
    update_power_mode -name PM_highV -sdc_files ARM1176JZFS.constraints_PM_highV.sdc
    create_power_mode -name VCORE_dormant \
        -domain_conditions {VCORE@OFF VRAM@highV VSOC@highV}
    update_power_mode -name VCORE_dormant -sdc_files ARM1176JZFS.constraints_PM_highV.sdc

ARM 1176-IEM CPF Corners and Analysis Views

An operating corner is a specific set of process, voltage, and temperature (PVT) values under which the design must perform. An analysis view associates an operating corner with a power mode for which certain timing constraints were specified. Operating corners and analysis views are only used for physical implementation (SoC Encounter System). Timing analysis and physical optimization work concurrently on active analysis views.
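To give a feel for what the DVFS power modes in Figure 174 buy, the sketch below estimates the dynamic power of the scaled domains (VCORE and VRAM) in each mode relative to PM_highV. It assumes the textbook approximation that dynamic power scales with V squared times f, ignores leakage, and ignores VSOC, which stays at 1.08V. The function name is hypothetical and the results are estimates, not measurements from the test chip.

```python
# (supply voltage in volts, target frequency in MHz) for the scaled domains,
# taken from the power mode table in the text.
MODES = {
    "PM_highV": (1.08, 250),
    "PM_medV": (0.90, 166),
    "PM_lowV": (0.72, 90),
}

def relative_dynamic_power(v: float, f_mhz: float,
                           v_ref: float = 1.08, f_ref_mhz: float = 250) -> float:
    """Dynamic power relative to the reference point, assuming P ~ C * V^2 * f."""
    return (v / v_ref) ** 2 * (f_mhz / f_ref_mhz)

for name, (v, f) in MODES.items():
    power = relative_dynamic_power(v, f)
    energy_per_cycle = (v / 1.08) ** 2  # energy per clock cycle scales with V^2 only
    print(f"{name}: power x{power:.2f}, energy per cycle x{energy_per_cycle:.2f}")
```

The energy-per-cycle column is the more relevant number for battery life: lowering the frequency alone stretches the same work over more time, while lowering the voltage reduces the energy spent on each cycle.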

Sample CPF follows for four voltage corners:

    create_operating_corner -name WCORNER_1.08 \
        -voltage 1.08 -temperature 125 -process 1 -library_set libs-worst-1.08v
    create_operating_corner -name WCORNER_0.72 \
        -voltage 0.72 -temperature 125 -process 1 -library_set libs-worst-0.72v
    create_operating_corner -name WCORNER_0.90 \
        -voltage 0.90 -temperature 125 -process 1 -library_set libs-worst-0.90v
    create_operating_corner -name BCORNER \
        -voltage 1.32 -temperature -40 -process 1 -library_set libs-best-1.32
    create_analysis_view -name WCVIEW_1.08 \
        -mode PM_highV \
        -domain_corners {VCORE@WCORNER_1.08 VRAM@WCORNER_1.08 VSOC@WCORNER_1.08}
    create_analysis_view -name WCVIEW_0.90 \
        -mode PM_medV \
        -domain_corners {VCORE@WCORNER_0.90 VRAM@WCORNER_0.90 VSOC@WCORNER_1.08}
    create_analysis_view -name WCVIEW_0.72 \
        -mode PM_lowV \
        -domain_corners {VCORE@WCORNER_0.72 VRAM@WCORNER_0.72 VSOC@WCORNER_1.08}
    create_analysis_view -name BCVIEW_1.32 \
        -mode PM_highV \
        -domain_corners {VCORE@BCORNER VRAM@BCORNER VSOC@BCORNER}

Automated CPF-Driven MSV Flow

CPF is used to drive synthesis and physical implementation for an MSV design as follows:

- CPF defines the MSV/power domain partition (power domains with assigned instances, top-level I/O pins, and power/ground connections)
- CPF sets up the MMMC environment (power modes, delay corners, and analysis views)
- Isolation rules and level shifter rules automate the usage of low-power elements (shifter and isolation cells): definition, identification from RTL, insertion, placement, and optimization
- Synthesis, placement, optimization, and routing are based on CPF-defined power domains
- Clock tree synthesis is power domain-aware
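One useful sanity check on the worst-case analysis views above is that each domain's operating corner carries the same voltage as that domain has in the view's power mode. The sketch below transcribes the relevant values from the power mode table and the sample CPF into Python dictionaries and checks them. The data structures and the function name are illustrative assumptions; this is not how the Cadence tools represent views internally.

```python
MODE_VOLTAGES = {  # per-domain voltages from the power mode table
    "PM_highV": {"VCORE": 1.08, "VRAM": 1.08, "VSOC": 1.08},
    "PM_medV": {"VCORE": 0.90, "VRAM": 0.90, "VSOC": 1.08},
    "PM_lowV": {"VCORE": 0.72, "VRAM": 0.72, "VSOC": 1.08},
}
CORNER_VOLTAGES = {"WCORNER_1.08": 1.08, "WCORNER_0.90": 0.90, "WCORNER_0.72": 0.72}
WORST_CASE_VIEWS = {  # view -> (mode, {domain: corner}) from the sample CPF
    "WCVIEW_1.08": ("PM_highV", {"VCORE": "WCORNER_1.08", "VRAM": "WCORNER_1.08",
                                 "VSOC": "WCORNER_1.08"}),
    "WCVIEW_0.90": ("PM_medV", {"VCORE": "WCORNER_0.90", "VRAM": "WCORNER_0.90",
                                "VSOC": "WCORNER_1.08"}),
    "WCVIEW_0.72": ("PM_lowV", {"VCORE": "WCORNER_0.72", "VRAM": "WCORNER_0.72",
                                "VSOC": "WCORNER_1.08"}),
}

def find_mismatches(views, mode_voltages, corner_voltages):
    """List (view, domain, mode voltage, corner voltage) where the two disagree."""
    problems = []
    for view, (mode, domain_corners) in views.items():
        for domain, corner in domain_corners.items():
            expected = mode_voltages[mode][domain]
            actual = corner_voltages[corner]
            if abs(expected - actual) > 1e-9:
                problems.append((view, domain, expected, actual))
    return problems

print(find_mismatches(WORST_CASE_VIEWS, MODE_VOLTAGES, CORNER_VOLTAGES))
```

The best-case view BCVIEW_1.32 is left out on purpose: it pairs PM_highV with a 1.32V corner, so a strict equality check like this one would not apply to it.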

Figure 175. Automated CPF-driven MSV flow (inputs: constraints, CPF, netlist. Stages: top-down MSV/multi-mode single-pass synthesis; floorplanning / silicon virtual prototyping; power grid synthesis; isolation/level shifter cell check and insertion; placement including SRPG, level shifter, and ISO cells; power routing; low-power clock tree synthesis; domain-aware post-CTS optimization; domain-aware NanoRoute routing; IR drop-aware timing/SI optimization; timing/power optimization; sign-off. Running alongside: low-power functional verification, SI and static timing analysis, static/dynamic power analysis, and the MSV/MMMC infrastructure. Output: GDSII)

MSV Logic Synthesis Flow with CPF

Figure 176. Logic synthesis flow

Inputs: multi-Vth *.lib, multi-voltage *.lib, RTL_files.v, CPF, SDC. Outputs: Gate.v, SDC.

Flow steps:

- Import RTL
- Set up MSV, multi-mode, and power constraints
- Import isolation/level shifter cells
- Define DFT constraints and clock gating
- Top-down synthesis (multi-voltage/multi-mode)
- Connect scan chains and incremental synthesis
- Analysis / output

RTL Compiler commands shown alongside the steps:

    set_attribute library_domain...
    read_hdl $HDL_files
    elaborate
    read_cpf CPF_file
    set_attribute max_leakage_power 0
    set_attribute lp_multi_vt_optimization_effort
    level_shifter import
    isolation_cell import
    synthesize -to_mapped -effort high
    connect_scan_chains
    synthesize -to_mapped -effort high -incr
    foreach Mode { report timing; report power; write_sdc }
    write_mapped

The logic synthesis flow provides:

- Top-down single-pass synthesis with power domain definition
- Identification of level shifters and isolation cells already instantiated in RTL
- Multi-mode synthesis to consider frequency and voltage scaling
- Power domain-aware scan-chain implementation
- Leakage power optimization using high-Vt cells

Figure 177. Multi-voltage, multi-mode logic synthesis (the CPF commands create_power_domain, create_nominal_condition, update_nominal_condition, create_power_mode, and update_power_mode populate the RTL Compiler internal database structure: power_domains VCORE, VRAM, and VSOC, each with library_domain_by_mode, plus shutoff_signal for VCORE; modes PM_highV, PM_medV, and PM_lowV, with clock_domains, exceptions, and external_delays)

Physical Implementation of MSV with CPF

The additional complexity of the MSV flow is managed through CPF and confined to the floorplanning phase. The rest of the MSV flow is similar or identical to a standard flow.

Figure 178. Physical implementation flow (inputs: multi-mode SDC, abstracts *.lef, multi-Vth *.lib, multi-voltage *.lib, gate.v, CPF, captable, noise models, tech file, power models. Stages: LoadCPF/CommitCPF, floorplan, place and route, timing/SI closure, power/rail analysis, signoff. Outputs: sdf, final.v, GDS)

MMMC Physical Implementation Flow

DVFS relies on an MMMC implementation flow, which allows the place-and-route software to analyze and optimize the design for all the foreseen working combinations of voltage and frequency. Analysis views reflect these possible combinations by binding a different operating corner to each power domain. A different set of timing constraints is also linked to each analysis view through power modes to account for the different working frequencies.
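Because each power mode has its own target frequency, the timing constraints linked to each mode differ first of all in their clock period. The sketch below derives the period for each DVFS mode from the target frequencies in Figure 174. It is only an arithmetic illustration; the dictionary and function names are assumptions, and the actual SDC files for the design are not shown in the text.

```python
# Target frequencies in MHz, from the power mode table in the text.
TARGET_MHZ = {"PM_highV": 250, "PM_medV": 166, "PM_lowV": 90}

def clock_period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz

for mode, mhz in TARGET_MHZ.items():
    print(f"{mode}: {mhz} MHz -> clock period {clock_period_ns(mhz):.3f} ns")
```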

    create_operating_corner -name WCORNER_1.08 \
        -voltage 1.08 -temperature 125 -process 1 -library_set libs-worst-1.08v
    ...
    create_analysis_view -name WCVIEW_1.08 \
        -mode PM_highV \
        -domain_corners {VCORE@WCORNER_1.08 VRAM@WCORNER_1.08 VSOC@WCORNER_1.08}

- Active analysis views may be changed anywhere in the flow
- Analysis and optimization work concurrently only on ACTIVE analysis views

    set_analysis_view -setup {WCVIEW_1.08 WCVIEW_0.90 WCVIEW_0.72} -hold {BCVIEW_1.32}

Figure 179. MMMC physical implementation flow (VRAM and VCORE at 1.08V, 0.90V, or 0.72V; VSOC at 1.08V)

Floorplanning with MSV and CPF

Power domains are only logically created after reading the CPF. Physical information has to be added to them during the floorplanning phase. Each power domain has a fence constraint with some additional parameters necessary to automatically build its own dedicated row structure, which is created depending on the associated cell libraries. Hard block placement is scripted and easily customizable through relative floorplan commands. The power grid structure is automatically built via a .tcl script by use of power domain-aware commands.

Figure 180. Floorplan with power domains

Sample commands to floorplan a power domain (VRAM):

    modifypowerdomainattr VRAM -mingaps 5.74 5.74 5.74 5.74 -rsexts 0.0 0.0 0.0 0.0 \
        -rowspacetype 2 -rowspacing 0.0 -rowflip first \
        -core2top $VRAM_margin -core2bot $VRAM_margin \
        -core2left $VRAM_margin -core2right $VRAM_margin
    # Define PD box
    setobjfplanbox Group VRAM $vram_box_llx $vram_box_lly $vram_box_urx $vram_box_ury

The floorplan power structure with three power domains (VRAM, VCORE, and VSOC), with the rings and stripes of the grid dedicated to each of them, is shown below:

Figure 181. Floorplan power structure

Sample CPF follows for the creation of power and ground nets:

    create_ground_nets -nets VSS
    create_power_nets -nets VDDRAM
    create_power_nets -nets VDDCORE -external_shutoff_condition {SWITCH_VCORE}
    create_power_nets -nets VDD

MSV Design Placement

Standard cells, level shifters (single and multi-height), and isolation cells are automatically placed in a single pass. The shifters and isolation cells are automatically placed on the edges of a power domain.

The verifyPowerDomain native SoC Encounter System command checks:

- Level shifter and isolation rules for nets crossing the boundaries
- The correct placement of instances in a power domain
- Assignment of I/O pins to the correct power domain

Figure 182. MSV design element placement

Level Shifters/Clamps: VCORE-VRAM

Secondary power pins of shifter and clamp cells are routed using sroute (an SoC Encounter special router).

Figure 183. Level shifters and clamps in the VCORE-VRAM
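The first of the verifyPowerDomain checks listed above, that every net crossing a domain boundary is covered by a level shifter or isolation rule, can be illustrated with a toy model. This is only a conceptual sketch and not the tool's algorithm: the net names are hypothetical, and the direction assigned to each rule is inferred from the rule names in the CPF setup.

```python
from typing import NamedTuple

class Crossing(NamedTuple):
    net: str  # hypothetical net name
    src: str  # driving power domain
    dst: str  # receiving power domain

# (from-domain, to-domain) -> CPF rule name; directions inferred from the names.
RULES = {
    ("VCORE", "VRAM"): "rule_vcore2vram",
    ("VRAM", "VCORE"): "rule_vram2vcore",
    ("VSOC", "VCORE"): "rule_vsoc2vcore",
    ("VCORE", "VSOC"): "rule_vcore2vsoc_low / rule_vcore2vsoc_high",
}

def uncovered_crossings(crossings, rules):
    """Return the domain-crossing nets that no rule covers."""
    return [c for c in crossings
            if c.src != c.dst and (c.src, c.dst) not in rules]

nets = [
    Crossing("ram_addr[0]", "VCORE", "VRAM"),
    Crossing("ram_rdata[0]", "VRAM", "VCORE"),
    Crossing("debug_tap", "VRAM", "VSOC"),    # no rule for this direction
    Crossing("alu_carry", "VCORE", "VCORE"),  # stays inside one domain
]
for c in uncovered_crossings(nets, RULES):
    print(f"{c.net}: {c.src} -> {c.dst} has no level shifter or isolation rule")
```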

CPF for creating isolation and level shifter rules follows:

    create_isolation_rule -name rule_vcore2vram \
        -from VCORE -to VRAM -isolation_output low \
        -isolation_condition {!RAMCLAMP}
    update_isolation_rules -names rule_vcore2vram \
        -location to -cells {LVLLHEHX8M} \
        -combine_level_shifting
    create_level_shifter_rule -name rule_vram2vcore \
        -from VRAM -to VCORE
    update_level_shifter_rules -names {rule_vram2vcore} \
        -cells {LVLHLX8M} -location to

Clock Tree Synthesis and CPF

With CPF, clock tree synthesis is made power domain-aware. A single-pass, top-down clock tree is created by clock tree synthesis. Clock tree synthesis does the following:

- Establishes a single entry/exit point for each domain
- Selects buffers from appropriate libraries and places them within domain boundaries
- Balances clock skew through all domains and for all active corners and views

Figure 184. Clock tree synthesis

Power Domain-Aware Routing with CPF

The SoC Encounter System honors power domains in both TrialRoute and NanoRoute routing modes. This behavior is optionally enabled using the following settings:

    settrialroutemode handlepd -handlepdcomplex
    setnanoroutemode routehonorpowerdomain true

Figure 185. Power domain-aware routing (correct buffering method across a default domain, an on/off domain PD1, and domains PD2 and PD3)

Cross-Domain Timing Optimization

Timing optimization transparently considers both level shifter/clamp placement and signal direction:

- Parts of nets are defined as don't touch
- Buffers are inserted from the correct library and into the correct module
- Buffer location is timing driven

The SoC Encounter System optimizes timing and design rules concurrently for all active corners and views. Active views may be controlled using the set_analysis_view command.

Figure 186. Cross-domain timing optimization (Power Domain A at 0.8V using Libraries A and Power Domain B at 1.0V using Libraries B, with don't-touch net segments at the domain boundary)

Signoff with CPF

Domain-Aware Signal Integrity Analysis and Optimization

Concurrent multi-domain signal integrity (SI) analysis and optimization includes SI glitch propagation across domains, voltages, and level shifters. It provides accurate analysis across domains, despite the fact that super aggressors can make SI problems worse. Noise models of cells are characterized for different voltages to maintain accuracy through different mode-related operating voltages. Optimization works concurrently on all active views.

Figure 187. Domain-aware SI analysis and optimization (a super aggressor in a 1.08V VDD island and a victim on a critical path in a 0.7V VDD island: low Vt strengthens the aggressor and increases noise propagation, while high Vt weakens the victim)

Variable Vdd Analysis

Variable Vdd analysis enables accurate analysis and optimization at non-characterized voltages. It uses ECSMs that accurately model delay variations with Vdd, and requires models characterized at three voltages (tri-library).

In addition to modes based on pre-characterized voltages, the flow supports power modes that use non-characterized voltages and allows these to be run through full timing analysis and optimization. This is required for full DVFS implementations and also for modeling the impact of voltage variations when using fixed voltage levels. The approach uses ECSMs that allow accurate interpolation of intermediate voltage levels.

In this example, a power mode has been defined where VCORE is set at 1 volt, a voltage for which libraries have not been characterized. For the variable Vdd flow, the designer assigns a (non-characterized) Vdd value as the operating voltage and runs through timing optimization closure, using ECSM and tri-library technology for better accuracy. It requires a minimum of two libraries characterized for different voltages for good accuracy across the full range, and is supported by analysis and optimization.
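The idea of estimating timing at a non-characterized voltage from three characterized libraries can be illustrated with a plain quadratic interpolation. This is a deliberately simplified stand-in: ECSM models currents rather than fitting a curve through delays, and the delay values below are invented for illustration only.

```python
def interpolate_delay(v: float, points: list[tuple[float, float]]) -> float:
    """Lagrange interpolation of delay versus supply voltage.

    `points` holds (voltage, delay) pairs, one per characterized library.
    """
    total = 0.0
    for i, (vi, di) in enumerate(points):
        term = di
        for j, (vj, _) in enumerate(points):
            if j != i:
                term *= (v - vj) / (vi - vj)
        total += term
    return total

# Hypothetical cell delays in ns at the three worst-case library voltages.
CHARACTERIZED = [(0.72, 1.80), (0.90, 1.25), (1.08, 1.00)]

# VCORE = 1.00 V is the non-characterized voltage used in the text's example.
delay = interpolate_delay(1.00, CHARACTERIZED)
print(f"estimated delay at 1.00 V: {delay:.3f} ns")
```

With these made-up numbers the estimate lands between the 0.90V and 1.08V delays, as expected for a delay that falls as the supply rises.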

Delay calculator inputs: process/temperature corner SS, 125 C; characterized voltages 1.08V, 0.90V, 0.72V

Dynamic Voltage and Frequency Scaling

Mode            VSOC    VCORE   VRAM    Target frequency
PM_highV        1.08V   1.08V   1.08V   250MHz
PM_medV         1.08V   0.90V   0.90V   166MHz
PM_triV         1.08V   1.00V   1.00V   200MHz
VCORE_Dormant   1.08V   Off     0.72V   250MHz

Figure 188. Variable Vdd flow

Low-Power Verification

The CPF-enabled verification tool, Cadence Encounter Conformal Low Power, is used throughout the implementation flow for:
- Equivalence checking of low-power designs
- Power domain structural and functional checks
- Transistor electrical verification

Equivalence checking for low-power design ensures that:
- No logic bugs are introduced by implementation tools
- Low-power optimizations do not introduce logical errors
- Low-power optimizations are verified
- State retention mapping from RTL to gate is checked
- Gated clocks, gated signals, de-cloning, and re-cloning of gated clocks are verified
- The presence of the corresponding isolation and level shifter cells during implementation is checked

Power domain structural and functional checks include:
- MSV and power-gating functional and structural checks
- Checks for both logical and physical (power-aware) netlists
- Proper insertion of low-power cells
- Proper connectivity of low-power cells
- Formal validation of the isolation function
- Formal validation of the state retention function
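The structural checks listed above can be sketched as a small rule check over the power modes of Figure 188. This is a simplified illustration, not how Encounter Conformal Low Power works internally: the rule set, the function names, and the net names are hypothetical, and the supply columns of Figure 188 are treated here as power domain names. The assumed rules are that a crossing needs isolation if its driver can be off while its receiver is on, and a level shifter if both ends can be on at different voltages in some mode.

```python
OFF = None  # marks a domain that is shut off in a given mode

# Power modes from Figure 188: voltage (V) per domain, OFF when powered down.
POWER_MODES = {
    "PM_highV":      {"VSOC": 1.08, "VCORE": 1.08, "VRAM": 1.08},
    "PM_medV":       {"VSOC": 1.08, "VCORE": 0.90, "VRAM": 0.90},
    "PM_triV":       {"VSOC": 1.08, "VCORE": 1.00, "VRAM": 1.00},
    "VCORE_Dormant": {"VSOC": 1.08, "VCORE": OFF,  "VRAM": 0.72},
}


def required_cells(driver, receiver, modes=POWER_MODES):
    """Return the low-power cell types a driver-to-receiver crossing needs."""
    needed = set()
    for volts in modes.values():
        vd, vr = volts[driver], volts[receiver]
        if vd is OFF and vr is not OFF:
            needed.add("isolation")       # live receiver, floating driver
        elif vd is not OFF and vr is not OFF and vd != vr:
            needed.add("level_shifter")   # both on, different voltages
    return needed


def check_crossings(crossings, modes=POWER_MODES):
    """Report (net, missing_cell) for each crossing lacking a required cell.

    crossings: iterable of (net, driver_domain, receiver_domain, cells_present).
    """
    violations = []
    for net, driver, receiver, present in crossings:
        missing = required_cells(driver, receiver, modes) - set(present)
        violations.extend((net, cell) for cell in sorted(missing))
    return violations


# Hypothetical cross-domain nets and the low-power cells found on each.
crossings = [
    ("core_irq_out", "VCORE", "VSOC", {"level_shifter"}),
    ("ram_addr",     "VCORE", "VRAM", {"isolation"}),
    ("soc_req",      "VSOC",  "VCORE", set()),
]
violations = check_crossings(crossings)
```

In this sketch `core_irq_out` is flagged for missing isolation, because VCORE is off in VCORE_Dormant while VSOC stays on, and `soc_req` is flagged for a missing level shifter. A production check would also account for shifter direction, voltage thresholds, and the physical netlist.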

Transistor electrical verification:
- Performs transistor checks to reduce leakage in a design
- Detects sneak (leakage) paths across power domain boundaries

Figure 189. Formal verification (Encounter Conformal Low Power applied across the flow stages: Top-Down MSV/MultiMode Single-Pass Synthesis; Floorplanning / Silicon Virtual Prototyping; Power Grid Synthesis; Isolation/Level Shifter Cells Check and Insertion; Placement Including SRPG/Level Shifters/ISO Cells; Power Routing; Low-Power Clock Tree Synthesis; Domain-Aware Post-CTS Optimization; Domain-Aware NanoRoute Routing; IRDrop-Aware Timing/SI Optimization; Sign-Off; all built on the MSV/MMMC Infrastructure)

Conclusion

The ARM-Cadence low-power Reference Methodology and PMK for the ARM1176 is the result of a long collaboration on low-power implementation methodologies. It provides comprehensive support for advanced low-power SoCs across the design flow, from RTL to GDS. The Cadence CPF-based low-power design flow, combined with IEM-enabled ARM processor IP, technology libraries, PMK, and Portable Reference Methodology Scripts, provides an advanced low-power reference flow. This reference flow leverages state-of-the-art power management features, including DVFS and variable Vdd techniques, and has been optimized and tested for use with the latest Cadence technology releases. As shown through actual silicon measurements, these features can deliver up to 40% overall power reduction and over 96% leakage reduction in regions where PSO is applied.

Design intent is clearly specified and controlled throughout the whole flow by CPF, ensuring readability, information sharing between different teams, and risk prevention (the same intent is never described differently for different tools). DVFS is a very complex flow that is now automated, with the complexity hidden inside the tools. The ARM Reference Methodology streamlines rapid deployment of ARM processor products, accelerates time to market for ARM customers developing low-power products, and is available to ARM partners and Cadence customers.

An overlapping customer base coupled with the popularity of ARM processor-based designs makes the Cadence/ARM partnership an intuitive process. Our ongoing collaboration on low-power methodologies and our consistent progress therewith have made it possible for mutual customers to adopt new process geometries and incorporate advanced power-reduction techniques. Now with an integrated, fully automated low-power design flow, companies can achieve both functional and structural verification before incurring manufacturing costs. No longer constrained by the risk of low yield or costly re-spins, ARM-based design teams can focus their time and resources on what matters most: innovation.
Philip Watson, Implementation Environment Program Manager, ARM