
Design of the Power Switching Network Ruixing Yang 15.01.2009

Outline
- Power gating implementation styles
- Sleep transistor power network synthesis
- Wakeup in-rush current control
- Wakeup and sleep latency reduction
The presentation is based on chapter 14 of the reference book (M. Keating, et al., Low Power Methodology Manual for System-on-Chip Design, Springer, 2007). All content and figures used here are taken from that chapter.

Power Gating challenges
Power gating is effective for reducing leakage power in standby or sleep mode. However:
I) Overhead
- Silicon area taken by the sleep transistors.
- Routing resources for the permanent and virtual power networks.
- Complex power-gating design and implementation processes.
II) Power integrity issues
- IR drop on the sleep transistors.
- Ground bounce caused by in-rush wakeup current.
III) Wakeup latency

Ring vs. Grid Style
Coarse-grain power gating can be implemented in either a ring-style or a grid-style power network.
- Ring-based switching places the switches outside the power-gated block, effectively enclosing the block in a ring of switches.
- Grid-based switching distributes the sleep transistors throughout the power-gated region.
[Figures: Ring Style Sleep Transistor Implementation; Grid Style Sleep Transistor Implementation]

Ring vs. Grid Style cont.
Ring style implementation
Advantages:
- Has a less complex power plan than the grid, because the permanent power network and the virtual power network are separated.
- The sleep transistors are not mixed with other logic cells.
- Has little negative impact on placement and routing in the standard cell area.
- Good option for small blocks of logic where the voltage drop across the switch transistors and the VVDD mesh can be managed.
Disadvantages:
- Doesn't support retention registers.
- Adds significant extra area cost compared to a grid approach.
Grid style implementation
Advantages:
- The switches in a grid network drive the virtual supply over shorter distances than in the ring-style implementation.
- Requires fewer sleep transistors than the ring-style implementation to achieve the same IR drop target.
- The permanent power supply is available across the power-down domain areas.
- Provides somewhat better trickle charge distribution for management of in-rush current.
- Has less impact on the area of a power-gated block.
Disadvantages:
- Impacts standard cell routing and physical synthesis.
- Adds complexity to power routing.

Ring vs. Grid Style cont.
More grid style implementations: Row and Column Grids
1. Column-based switching (fig. upper right) employs columns of switch cells spaced evenly across the switched design.
- Advantage: Each power switch only has to provide power to a small segment of the standard cell row, thereby minimizing any potential voltage drop.
- Disadvantage: Impacts placement optimization, limiting the flexibility of the standard cell placer.
2. Row-based switching (fig. bottom right).
- Advantage: Optimal solution for distributed switching, since the potential impact on the placement engine is limited.
- Disadvantage: Impacts routing resources in the lower metal layers, which can be avoided with the column-based approach.

Ring vs. Grid Style cont.
Selection of the implementation style
The best choice of implementation depends on:
- The design being implemented.
- The library being used and the type of switches available.
- The technology being targeted and its specific leakage characteristics.
- The performance and power goals for the design.
- The use of legacy or highly optimized IP.
Hybrid Style Implementation
The grid style is implemented at the top level, and the ring style is applied to certain power-gated hard macros and/or power domain blocks which have no retention cells.
- Advantage: Makes use of the advantages of both implementation styles.
- Disadvantage: More complex power planning.

Ring vs. Grid Style cont.
Recommendations: Ring vs. Grid Style
1. For a design which implements retention cells, select grid style.
2. If there are no retention cells, check the area budget and the need for a permanent power supply in the power-down areas for always-on buffers.
3. For a design which has power-gated hard macros, or blocks without retention logic, select hybrid style.
4. For grid style, use wide straps in the permanent power network to reduce IR drop.
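
The recommendations above can be read as a small decision procedure. The following Python sketch is one possible encoding, not something taken from the book: the function name, the flag names, and in particular the order in which the rules are checked are assumptions made for illustration. It also simplifies rule 3 to the hard-macro case only.

```python
def choose_gating_style(has_retention_cells: bool,
                        has_gated_hard_macros: bool,
                        area_budget_tight: bool = False,
                        needs_always_on_supply: bool = False) -> str:
    """Return 'hybrid', 'grid', or 'ring' for a power-gated design.

    Hypothetical helper: the rule ordering and the two tie-breaking
    flags are assumptions, not part of the source recommendations.
    """
    if has_gated_hard_macros:
        # Rule 3: grid at the top level, ring around the gated hard macros.
        return "hybrid"
    if has_retention_cells:
        # Rule 1: the ring style does not support retention registers.
        return "grid"
    if area_budget_tight or needs_always_on_supply:
        # Rule 2: ring costs extra area, and only the grid keeps the
        # permanent supply available inside the power-down area.
        return "grid"
    # No retention cells, no macros, no area or always-on pressure.
    return "ring"


if __name__ == "__main__":
    print(choose_gating_style(has_retention_cells=True,
                              has_gated_hard_macros=False))
```

Rule 4 (wide straps in the permanent network) is a physical design guideline for the grid case and is not modeled here.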

Header vs. Footer Switch
- Header switch: uses a high-VT pmos transistor to control VDD.
- Footer switch: uses a high-VT nmos transistor to control VSS.
The selection decision is based on area cost, IR drop constraints, and system architectural issues.
1. Switch Efficiency Consideration
Definition: Switch efficiency = ratio of drain current in the ON and OFF states (Ion/Ioff).
Total leakage in the switch fabric is mainly determined by the switch efficiency.
[Figures: 90nm high VT pmos Switch Efficiency at Normal Body Bias; 90nm high VT nmos Switch Efficiency at Normal Body Bias]

Header vs. Footer Switch cont.
2. Area Efficiency Consideration and L/W Choice
- The area efficiency depends on the size (L*W) and layout implementation of the sleep transistors.
- The optimal L is determined by the switch efficiency and can be obtained from the switch efficiency curve.
- The switch efficiency decreases as W increases in pmos transistors, so a small W is preferred.
The figure shows:
- Ion increases linearly with W.
- Ion/W becomes constant at a given L and Vbb -> the area efficiency is determined by the layout implementation of the sleep transistors.
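
As a minimal sketch of how the switch efficiency definition (Ion/Ioff) can be used to pick a gate length, the Python snippet below computes the ratio for a few candidate values of L and selects the best one. The sweep values are invented for illustration only; they are not read from the book's 90nm curves, and the names `switch_efficiency` and `sweep` are hypothetical.

```python
def switch_efficiency(i_on: float, i_off: float) -> float:
    """Switch efficiency as defined on the slide: Ion / Ioff."""
    if i_off <= 0:
        raise ValueError("i_off must be positive")
    return i_on / i_off


# Hypothetical sweep, NOT measured data:
# gate length L (um) -> (Ion, Ioff) in amps per um of transistor width.
sweep = {
    0.10: (400e-6, 2.0e-9),
    0.13: (350e-6, 0.5e-9),
    0.18: (250e-6, 0.5e-9),
}

# The optimal L is the one with the highest point on the efficiency curve.
best_l = max(sweep, key=lambda length: switch_efficiency(*sweep[length]))

if __name__ == "__main__":
    for length, (i_on, i_off) in sorted(sweep.items()):
        print(f"L = {length:.2f} um: Ion/Ioff = {switch_efficiency(i_on, i_off):.3g}")
    print(f"best L in this sweep: {best_l:.2f} um")
```

In practice the (Ion, Ioff) pairs would come from device simulation or library characterization at the chosen body bias.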

Header vs. Footer Switch cont.
3. Body Bias Considerations
Applying reverse body bias to the sleep transistor can increase the switch efficiency and reduce leakage significantly. The cost of reverse body bias in the header switch is significantly smaller than in the footer switch.
Reason:
- The N-well of the pmos transistor is readily available for bias tapping in the standard CMOS process. It can be tapped to its own body bias supply as long as the N-well of the sleep transistor is spaced far enough from the N-wells of the surrounding standard cells.
- The nmos transistor does not have a well in the standard CMOS process. Wells must be created for nmos sleep transistors to allow separate body bias -> higher chip fabrication cost, more design complexity, and more process variation.
Conclusion: The pmos header is preferable in reverse body bias applications.

Header vs. Footer Switch cont.
4. System Level Design Consideration
- In SoC designs, blocks usually communicate using active-high interface protocols that reference common ground (VSS) as logic 0.
- In a header switch implementation, all signal nets in power-gated blocks settle at VSS, which is convenient from a system design perspective.
- The header switch avoids potential signal integrity issues and allows a simple pull-down transistor design to isolate power-gated blocks and clamp output signals at logic 0.
5. Recommendations: Header vs. Footer
- If area efficiency is the main concern: nmos, which gives higher switch efficiency and smaller transistor size. W should be chosen as large as possible for a given cell height.
- If system level design and IP integration are the main concern: header. Header is currently more commonly used than footer in power-gating designs.
- The choice of sleep transistor can be limited by the availability of a low-leakage transistor in a given technology.
- If minimum standby leakage is the main concern: W should be chosen for high switch efficiency and hence low leakage. W is obtained by investigating the area and leakage trade-off.

Rail vs. Strap VDD Supply
Sleep transistors get their power supply from the permanent power network (VDD) and deliver it to the virtual power network (VVDD). There are two ways to distribute VDD to the sleep transistors: rail and strap VDD supply.
1. Parallel Rail VDD Distribution
A VDD rail is added to a cell row in parallel with the VVDD rail. The sleep transistor gets its permanent power supply by connecting to the VDD rails.
Advantages:
- The permanent power supply rail is reachable throughout the design.
- No restriction on the placement of cells which require connections to the permanent power supply.
Disadvantages:
- The implementation takes at least one track of routing resources in every row in the VDD rail layer.
- Incurs a layer conflict with conventional standard library cells, which use the metal 1 layer for cell-internal routing.

Rail vs. Strap VDD Supply
2. Power Strap VDD Distribution
The permanent power network is built in one or two top metal layers. The sleep transistors are placed under the straps of the coarse-grain network and get their VDD supply through via pillars.
Advantages:
- Allows the use of a normal standard cell library in a power-gating design.
Disadvantages:
- The permanent power network no longer covers the design area. This requires:
- Placing the cells which need a permanent power supply (PPS) under the PPS network (placement constraint).
- Power-routing the cells which need PPS (complicates the power-routing nets).

Rail vs. Strap VDD Supply
3. Recommendations for Supply Distribution
- If no standard cell library that provides an extra VDD rail is available, select power strap VDD.
- If the impact on routing resources is the main concern, select power strap VDD.
- If there are a significant number of retention registers in a design and power integrity in power routing is the main concern, select parallel rail distribution.

A Sleep Transistor Example
- Double-row 90nm header switch cell.
- 60 small pmos transistors of 0.55um width.
- 6-row transistor array.
- Normal body bias.
- VSS is in the middle of the two rows.
- A pair of inverters that drive the sleep transistors is implemented in the cell for area efficiency.
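
To make the numbers in this example concrete, the Python sketch below computes the total switch width of one such cell (60 transistors of 0.55um) and then estimates how many cells would be needed to meet an IR drop target. The sizing model is an assumption and does not come from the book: it treats each switch as a linear on-resistance inversely proportional to its width, with all cells sharing the current equally. The parameter `r_on_ohm_um` (on-resistance normalized to width) and the example current, target, and resistance values are hypothetical.

```python
import math


def cell_width_um(n_transistors: int = 60, width_um: float = 0.55) -> float:
    """Total pmos width of one switch cell, using the slide's example values."""
    return n_transistors * width_um


def cells_for_ir_drop(i_peak_a: float, target_drop_v: float,
                      r_on_ohm_um: float, cell_w_um: float) -> int:
    """Smallest number of parallel switch cells that meets the IR drop target.

    Assumed model: R_cell = r_on_ohm_um / cell_w_um, and n cells in
    parallel give R_cell / n, so the drop is i_peak_a * R_cell / n.
    """
    if min(i_peak_a, target_drop_v, r_on_ohm_um, cell_w_um) <= 0:
        raise ValueError("all arguments must be positive")
    r_cell = r_on_ohm_um / cell_w_um
    return math.ceil(i_peak_a * r_cell / target_drop_v)


if __name__ == "__main__":
    width = cell_width_um()
    # Hypothetical operating point: 100 mA peak current, 47 mV drop budget,
    # 3300 ohm*um normalized on-resistance.
    n_cells = cells_for_ir_drop(0.1, 0.047, 3300.0, width)
    print(f"width per cell: {width:.2f} um, cells needed: {n_cells}")
```

A real flow would take the on-resistance from library characterization and would also account for the resistance of the VDD and VVDD meshes.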

Wakeup Current and Latency Control Methods
In a power-gating design, thousands of sleep transistors waking up simultaneously -> a very large current charging the design to a full power-on state -> IR drop -> functional errors / short-term VDD collapse -> state in retention registers and memories corrupted.
Possible solution: control the in-rush current by separating the chip power supply into many rows and turning the power on row by row. Disadvantage: crowbar currents -> IR drop. Not practical in industrial power-gating design.
1. Single Daisy Chain Sleep Transistor Distribution
Turn on the sleep transistors gradually by configuring them in a daisy chain.
- Advantage: simple design.
- Disadvantage: the short delay of the buffers in the chain usually turns on the sleep transistors too quickly -> larger than acceptable in-rush current during wakeup.
2. Dual Daisy Chain Sleep Transistor Distribution
Use weak transistors to trickle charge the design to prevent a large in-rush current. When the design has been trickle charged close to VDD, large transistors of the optimal drive strength are turned on.

Wakeup Current and Latency Control Methods
The transistors are split into two chains: a weak transistor chain and a main transistor chain.
- The size of the weak trickle transistors is determined by the user-defined in-rush current limit and the maximum permissible turn-on delay.
- The size of the sleep transistors in the main chain is optimized, by the methods described, for the performance and leakage goals.
- The trickle sleep transistors serve to control the wakeup in-rush current and reduce wakeup latency.
- The main chain transistor design is based on meeting the IR drop target and reducing sleep transistor area.
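
The effect of the two-chain scheme can be illustrated with a very rough lumped model. The Python sketch below is an assumption-laden illustration, not the book's method: it treats the gated block as a single capacitor, each chain as a single conductance (ignoring the buffer-by-buffer staggering inside a chain), and integrates the virtual rail voltage with explicit Euler steps. All names and component values are hypothetical.

```python
def simulate_wakeup(vdd: float, c_load: float, g_weak: float, g_main: float,
                    threshold: float, dt: float = 1e-10, t_max: float = 1e-5):
    """Lumped RC model of a trickle-then-main wakeup.

    g_weak, g_main: total on-conductance (siemens) of the weak and main chains.
    threshold: fraction of VDD on the virtual rail at which the main chain is
               enabled (0.0 means everything turns on at once).
    Returns (peak in-rush current in A, time in s to reach 99% of VDD).
    """
    v, t, peak = 0.0, 0.0, 0.0
    main_on = False
    while t < t_max:
        if not main_on and v >= threshold * vdd:
            main_on = True  # virtual rail is high enough: enable main chain
        g = g_weak + (g_main if main_on else 0.0)
        i = g * (vdd - v)          # current drawn from the permanent supply
        peak = max(peak, i)
        v += i * dt / c_load       # explicit Euler step of C * dV/dt = i
        t += dt
        if v >= 0.99 * vdd:
            return peak, t
    raise RuntimeError("rail did not reach 99% of VDD within t_max")


if __name__ == "__main__":
    # Hypothetical block: 1.0 V supply, 10 nF load, 20 ohm weak chain,
    # 0.2 ohm main chain.
    for thr in (0.0, 0.5, 0.9):
        peak, t_wake = simulate_wakeup(1.0, 10e-9, 0.05, 5.0, thr)
        print(f"threshold {thr:.1f}: peak {peak:.3f} A, wakeup {t_wake * 1e9:.1f} ns")
```

In this model, raising the threshold lowers the peak current at the cost of a longer wakeup, which is the trade-off that the sizing of the weak chain has to balance.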

Wakeup Current and Latency Control Methods
3. Parallel Short Chain Distribution of the Main Sleep Transistors
Wakeup latency = trickle charge time + turn-on time of the main chain.
Reduce the main chain turn-on time to reduce wakeup latency.
- Single daisy chain -> longest time to charge up & small peak charge current.
- Parallel array -> smallest delay & largest peak current.
- Compromise: parallel short chain. The sleep transistors are connected as a number of short daisy chains arranged in parallel. The short daisy chains are turned on simultaneously when the main chain is turned on. -> The delay is shortened and the peak current is controlled.
4. Main Chain Turn-on Control
Once the weak and main chain designs are fixed, the threshold at which to turn on the main chain must be determined. Lower threshold -> turn on early & higher peak current.
5. Buffer Delay Based Main Chain Turn-on Control
Control the time taken to trickle charge the design to the required threshold. In a real power-gating design, trickle charge is controlled by the buffer chain which turns on the weak transistors in sequence.
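
The latency relation on this slide can be sketched in a few lines of Python. The model below is a simplification assumed for illustration: every buffer in a chain contributes the same fixed delay, the main switches are split evenly across the parallel chains, and one switch per chain turns on per buffer delay. The function names and the example numbers are hypothetical.

```python
import math


def main_chain_turn_on_time(n_switches: int, n_chains: int,
                            buffer_delay_s: float) -> float:
    """Turn-on time of the main chain split into n_chains parallel daisy chains.

    n_chains = 1 is a single daisy chain; n_chains = n_switches is a fully
    parallel array. Roughly n_chains switches turn on per buffer delay,
    which is what bounds the peak current in this simplified view.
    """
    if n_switches < 1 or n_chains < 1:
        raise ValueError("n_switches and n_chains must be at least 1")
    stages = math.ceil(n_switches / n_chains)  # buffers in the longest chain
    return stages * buffer_delay_s


def wakeup_latency(trickle_time_s: float, n_switches: int, n_chains: int,
                   buffer_delay_s: float) -> float:
    """Wakeup latency = trickle charge time + main chain turn-on time."""
    return trickle_time_s + main_chain_turn_on_time(
        n_switches, n_chains, buffer_delay_s)


if __name__ == "__main__":
    # Hypothetical design: 1000 main switches, 50 ps per buffer,
    # 500 ns of trickle charging.
    for chains in (1, 10, 1000):
        latency = wakeup_latency(500e-9, 1000, chains, 50e-12)
        print(f"{chains:4d} chain(s): wakeup latency {latency * 1e9:.2f} ns")
```

With these example values the single chain needs about 50 ns to turn on, ten parallel short chains about 5 ns, and the fully parallel array a single buffer delay.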

Summary
- Power gating design style, Ring vs. Grid: implementation of ring and grid; row vs. column grid; hybrid style.
- Header vs. Footer Switch: switch efficiency; area efficiency; body bias; system level design.
- Rail vs. Strap VDD supply: parallel rail vs. power strap.
- Wakeup Current and Latency Control Methods: single daisy chain; dual daisy chain; parallel short chain distribution of the main sleep transistors; main chain turn-on control.