White Paper

Kilopass X2Bit bitcell: OTP Dynamic Power Cut by Factor of 10

November 2015

Of the challenges being addressed by Internet of Things (IoT) designers around the globe, none is more pressing than the need to reduce edge-node power. While eyes often turn to the radio as the primary consumer of energy, memory, including non-volatile memory (NVM), also contributes a substantial portion of the energy consumed by an edge node. Power reductions in all memories will be essential for meeting this challenge.

The Kilopass X2Bit bitcell has achieved a breakthrough that reduces its dynamic current by a factor of 10, from 100 µA/MHz to 10 µA/MHz. This paper first explains in more detail why this is important, then illustrates where the change was made and what that change implies.

Anatomy of the IoT

The IoT has as many different looks as it has champions gunning for primacy. But with enough abstraction, we can sketch out a general IoT architecture that captures the most important features for the wide majority of applications.

The IoT concept revolves around sensors measuring the world in some way relevant to their application and sending that measurement data (or a filtered version of it) on to one or more platforms for further analysis and possible integration with other data streams. That computation may happen relatively locally (in what's sometimes referred to as "the fog") within gateways or local servers, or the data may traverse the Internet to be handled in the Cloud.

Because of the cost of communication, it's always beneficial to do as much work locally as practical. So an edge node may want to do some early filtering and accumulation before sending sensor data on (we'll look at an example shortly). Fog computing may be able to handle local decisions from multiple local edge nodes, while the Cloud can leverage other Internet-based data like social media feeds.
The computation done in the Cloud may be an end in itself, in the form of analytics or other data for storing or for sale. Alternatively, the result of the computation may drive commands that are fed back to the same or to other edge nodes to make changes via actuators. In other words, sometimes the sensor information is useful simply as information; other times it serves to help decisions that result in automated action. www.kilopass.com Page 1/8
[Diagram labels: The Cloud; Gateway/Hub; Wireless edge node; Wired edge node; Sensor data; Internet data; Actuator commands]

Figure 1. The Internet of Things comprises a cloud computing capability with communication back to edge nodes on the far reaches of the network. Edge nodes can be wired or wireless.

There are, broadly speaking, two important classes of edge node. Some edge nodes have the benefit of being installed where line power is readily available. That power might come via a standard power feed, or it could be delivered by power over Ethernet. For such edge nodes, power is a critical issue only to the extent that it's always a good idea to minimize power consumption wherever readily possible. Examples are:

- Sensors in lighting
- Parking sensors
- Sensors in home appliances
- Sensors within industrial equipment
But many more sensors are likely to operate away from a power source, meaning that they will be powered by batteries or, when technology permits, energy harvesting. A good number of these sensors will be located far afield, making it difficult to change batteries. Others may be closer in, even near power, but not in a way that admits easy connection to power without an extensive wiring project. Examples of these include:

- Infrastructure sensors (buildings, bridges, etc.)
- Medical sensors for use on or inside the body
- Oil drilling and mining sensors
- Industrial sensors placed outside equipment (like vibration sensors on a pipe)
- Agricultural sensors
- Weather sensors

These wireless edge nodes must do their work under severe power budget constraints while still providing performance and reliability suitable to their application.

Anatomy of a Wireless Edge Node

A wireless edge node consists of (at a minimum) the following key components:

- One or more sensors, combined with ASICs for handling signal conditioning, linearization, and digitization (the last step being typical but optional);
- A facility for computing, typically provided by a microcontroller with a processor, working memory, firmware memory, and possibly other peripherals like analog-to-digital conversion;
- A radio for communicating, most often duplex, although sometimes send-only (reducing the power needed to keep a receiver listening); and
- Power management, including sleep control logic.

[Diagram labels: Power Management; Battery; Radio; Computing Platform; Sensor]

Figure 2. An abstract view of an IoT edge node, comprising a sensor, a means of computing the output of the sensor, and a radio for transmitting data. A power management block must manage delivery of power so as to maximize battery life. An edge node may also have actuators; those are ignored in this analysis.
Each of these components, including power management, requires power. The dominant elements are the radio, the sensor, and the microcontroller. Some typical numbers will illustrate why these are the primary consumers of energy. We'll use as normalizing assumptions 30-MHz operation with a 10% duty cycle. The reason for the duty cycle will be discussed shortly; 10% is actually a very conservative number, since many sensors may have a far lower duty cycle.

Sensor current

The power consumed by the sensor will depend on the sensor type. Some sensors, like accelerometers, are passive and require no power to operate. Others, like gyroscopes and some magnetic sensors, consume power when making a measurement. However, all sensors, passive or active, have accompanying circuitry to clean up and (usually) to digitize the data for delivery to the microcontroller. That circuitry consumes energy. In other words, all sensors consume energy, although some consume more than others. That said, we can assign a typical current of about 10 µA plus about 10 µA/MHz. At our assumed conditions of 30 MHz and a 10% duty cycle, this gives us roughly 40 µA.

Processor current

We'll divide the microcontroller current into a number for the processor and a separate number for the memory (ignoring other peripherals). A typical ARM Cortex-M0, which is the class of processor common in edge-node applications, will run at around 10 µA/MHz, yielding 30 µA.

Memory current

There are likely to be a variety of memories for different purposes:

- One-time programmable (OTP) memory for firmware (1 Mb)
- Embedded flash (eflash) memory for rewritable persistent storage (1 Mb)
- SRAM for high-speed working memory (1 Mb)

Today's OTP and eflash each consume roughly 100 µA/MHz, or 300 µA apiece under our assumptions, giving a total of 600 µA combined. SRAM draws roughly twice the current of either, contributing another 600 µA.
Radio current

Radio current will vary according to the type of radio selected, but, for the sake of example, we'll use Bluetooth Low Energy (BLE), with a minimum transmit and receive current of 10 mA. Using a 10% duty cycle, this makes the radio contribution 1 mA.

The contributions from other circuits, including power management, have been ignored as being far less than those of these dominant components.
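The per-component figures above can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the function and variable names are ours, and it assumes that the sensor's fixed 10 µA is drawn continuously while only the frequency-dependent part is duty-cycled, which is the reading that yields the 40 µA quoted in the text.

```python
# Normalizing assumptions stated in the text.
FREQ_MHZ = 30
DUTY_PERCENT = 10


def duty_cycled_ua(per_mhz_ua):
    """Average current (µA) of a block drawing per_mhz_ua per MHz while active."""
    return per_mhz_ua * FREQ_MHZ * DUTY_PERCENT / 100


budget_ua = {
    # Assumption: the fixed 10 µA is always on; only the per-MHz part is
    # duty-cycled. This reproduces the 40 µA figure in the text.
    "Sensor": 10 + duty_cycled_ua(10),
    "Processor": duty_cycled_ua(10),
    "OTP": duty_cycled_ua(100),
    "eflash": duty_cycled_ua(100),
    "SRAM": duty_cycled_ua(200),            # roughly twice the OTP/eflash figure
    "Radio": 10_000 * DUTY_PERCENT / 100,   # 10 mA BLE current at 10% duty cycle
}

total_ua = sum(budget_ua.values())
for name, current in budget_ua.items():
    print(f"{name:<10}{current:>7.0f} µA  {current / total_ua:6.1%}")
print(f"{'Total':<10}{total_ua:>7.0f} µA")
```

Changing `FREQ_MHZ` or `DUTY_PERCENT` shows how sensitive the budget is to the normalizing assumptions.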
These results are summarized in the following table and the left pie chart below. Note, however, that due to the recognized dominance of the radio power, much research and development is going into reducing the energy consumed by the radio and into using the radio more sparingly, with the net expected effect of reducing the radio's contribution by as much as an order of magnitude. The right-hand pie chart shows the share of energy consumption if that happens; memory becomes the dominant factor, with OTP being a significant contributor.

Component    Current (µA)
Sensor       40
Processor    30
OTP          300
eflash       300
SRAM         600
Radio        1000

[Two pie charts, each with slices for Sensor, Processor, OTP, eflash, SRAM, and Radio]

Figure 3. The radio and memory dominate energy consumption in an IoT edge node. Improvements in radio design and utilization are reducing its impact, leaving memory as the dominant component.

Keeping some level of computation within the edge node can help reduce radio usage. That computation can sort data measurements, ensuring that only useful data is sent. For example, suppose a temperature sensor is intended to monitor warmer temperatures, communicating readings and sounding an alarm above a threshold. Rather than sending every temperature measurement, the node can save power by sending only readings in the warm regime, so that whoever is monitoring can be ready in case the temperature continues to increase. Given a sample reading every 30 seconds, a transmit-all policy would result, over the course of a year, in just over a million transmissions. Assuming that the higher temperatures occur only in the four hottest months and during the five hottest hours of the day, transmitting only warm temperatures would reduce that number to about 72,000, a savings of 93%.

Further transmission savings can be had if the edge node buffers multiple readings and batches them out together.
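The transmission counts in the temperature example can be checked with the short sketch below. It assumes that "the four hottest months" amounts to about 120 days; the function name is ours and purely illustrative.

```python
SAMPLE_PERIOD_S = 30  # one reading every 30 seconds, as in the text


def transmissions(days, hours_per_day, period_s=SAMPLE_PERIOD_S):
    """Number of readings taken in the given window at one reading per period."""
    return days * hours_per_day * 3600 // period_s


all_readings = transmissions(days=365, hours_per_day=24)
# Assumption: four hottest months ~ 120 days, five hottest hours per day.
warm_readings = transmissions(days=120, hours_per_day=5)
savings = 1 - warm_readings / all_readings

print(f"transmit-all: {all_readings:,} per year")
print(f"warm-only:    {warm_readings:,} per year")
print(f"savings:      {savings:.0%}")
```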
Batching would increase the data transmission time somewhat when the data is actually sent, but the total transmission energy would still be less than that required for multiple separate shorter transmissions, because of the power consumed during setup and teardown.

The most effective power-saving strategy will therefore involve optimizing the sleep schedules of the various components, ensuring that nothing is on when not being used and that higher-energy events occur as rarely as possible. Even a slow transmission lasting 10 seconds, done once per hour, gives a radio duty cycle of only about 0.3%, which is why the 10% number used above can be considered conservative. It is through efforts like these that radio power is being reduced overall, leaving memory behind as the largest contributor.

Anatomy of an OTP Bitcell

Since OTP may be a significant contributor to power, we should investigate how to reduce OTP memory energy consumption further. A typical OTP cell consists of a CMOS gate oxide that is ruptured under a controlled mechanism to turn what would otherwise be an open circuit into a resistive short, along with the circuitry needed to program the cell and to select the cell out of an array of cells.

The rupture occurs when the electric field across the dielectric exceeds the dielectric's breakdown limit. While any high-enough voltage can break down the oxide, it takes a carefully controlled approach to create a well-characterized short consistently and reliably, meaning that the short will not somehow degrade over time, becoming more resistive and ultimately opening up entirely.

A given cell is programmed if selected by the bit and word lines. The schematic for a cell is shown below, with three transistors:

- A program transistor; this is the transistor whose gate oxide will be programmed (or not). If the gate has been programmed, then a current can flow through it from the gate signal V_WP. If it's not programmed, then there will be no current. The threshold for deciding that the oxide has been shorted is the ability to conduct several µA as a read current.
- A select transistor, which selects the program transistor during program and read operations.
- A bit-line control transistor, which selects and controls the bit-line program biasing during program operations.

The three transistors are connected to a bit line that is routed through the memory array and connected to the input of the sense amplifier.
[Schematic labels: V_WP; V_WR; V_BG; (Float); V_BL (Bit line)]

Figure 4. A basic OTP bitcell. The gray gate indicates the gate that will be shorted when programmed. The blue line indicates the direction of current if the cell is programmed. If the cell is not programmed, there should be no current.

At a high level, operation of the cell is very straightforward. For both write and read operations, the V_WR and V_BG lines are raised so that their transistors are turned on; we need not consider them further in the following discussion.

In order to program the cell, the bit line is grounded and V_WP is raised to a high voltage on the order of 5 V. That's enough to rupture the gate oxide. However, that voltage is also higher than what is used for the other logic, meaning that a charge pump is required to generate it from the available core or I/O voltage. So high a voltage is also delicate in an advanced process, so it must be carefully controlled to within ±100 mV, meaning that a regulator is required.

To read the cell, the bit line is again grounded and the V_WP line is again raised, although this time not to such a high voltage. If the resulting current through the bit line is low (ideally, zero), then the cell has not been programmed. If the current exceeds a few µA, then the cell is considered to be programmed.

The actual programming algorithm is somewhat more complex than this, and it involves four important parameters (effectively, knobs that can be tuned):

- The programming voltage: this is the voltage placed on V_WP.
- The programming current: this is the current that flows when the oxide ruptures, and it's a function both of the programming voltage and of the biasing of the remaining two transistors.
- The programming time: this is the length of time that V_WP is kept at a high voltage when attempting to program the cell.
- The number of programming attempts: due to natural variation between cells, most cells program easily, but a few cells require additional retry attempts.
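To make the role of the four knobs concrete, here is a minimal sketch of a generic program-and-verify loop. It is not the Kilopass programming algorithm, which the paper does not disclose. `Recipe`, `ToyCell`, `program_cell`, the 3 µA read threshold, and all recipe values other than the roughly 5 V programming voltage are hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional

READ_THRESHOLD_UA = 3.0  # placeholder for "a few µA"; not a published spec


@dataclass
class Recipe:
    """The four knobs described in the text (values are illustrative only)."""
    voltage_v: float          # programming voltage on V_WP
    current_limit_ua: float   # programming current set by transistor biasing
    pulse_time_us: float      # how long V_WP is held high per attempt
    max_attempts: int         # number of programming attempts allowed


class ToyCell:
    """Stand-in for real hardware: 'ruptures' after a fixed number of pulses."""

    def __init__(self, pulses_needed: int) -> None:
        self.pulses_needed = pulses_needed
        self.pulses_seen = 0

    def apply_pulse(self, recipe: Recipe) -> None:
        # A real cell would respond to voltage, current, and time; the toy
        # model only counts pulses.
        self.pulses_seen += 1

    def read_current_ua(self) -> float:
        return 5.0 if self.pulses_seen >= self.pulses_needed else 0.0


def program_cell(cell: ToyCell, recipe: Recipe) -> Optional[int]:
    """Pulse, then read back; retry until programmed or attempts run out.

    Returns the number of attempts used, or None if the cell never verified.
    """
    for attempt in range(1, recipe.max_attempts + 1):
        cell.apply_pulse(recipe)
        if cell.read_current_ua() >= READ_THRESHOLD_UA:
            return attempt
    return None


recipe = Recipe(voltage_v=5.0, current_limit_ua=100.0,
                pulse_time_us=100.0, max_attempts=3)
print(program_cell(ToyCell(pulses_needed=1), recipe))  # typical cell
print(program_cell(ToyCell(pulses_needed=2), recipe))  # needs one retry
print(program_cell(ToyCell(pulses_needed=5), recipe))  # does not verify
```

The loop stops at the first successful read-back, so cells that program easily never see the extra attempts that a stubborn cell needs.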
Programming is a violent action, and, in general, it's best to use the gentlest approach that will be reliable. That means turning each of the above knobs down as low as possible. Fortunately, that also helps reduce energy consumption:

- The lower the programming voltage, the lower the current and the less charge pumping is required.
- The lower the programming current (for a given programming voltage), the fewer electrons are pulled from the battery through this path.
- The shorter the programming time, the less time current flows, and the lower the impact on the battery.
- The fewer programming cycles are required, the fewer times the programming circuits are energized.

In other words, by dialing these parameters down as low as possible, power can be saved. But they can only be turned down so much. If taken too far, then the oxide may be incompletely ruptured, creating a partially programmed cell. That cannot be allowed to happen, since it will adversely affect circuit performance and reliability; there is a limit to how low these parameters can be set.

Programming is typically accomplished using an optimal recipe that may dial some parameters up slightly so that others can be dialed down. Such recipes take significant effort to design, characterize, and qualify, but even so, there's a limit to what can be done with these four knobs.

Anatomy of a Breakthrough

As a result of the limitations imposed by the four well-known knobs, dynamic read currents have been stuck at the 100-µA/MHz level, and designs have maintained the use of charge pumps and regulators in the power circuit.

It turns out, however, that there's a fifth knob that can be turned. The details of this knob will be disclosed in the future, but it has a significant impact. The internal voltage during read operation can be reduced from around 2.2 V to about 0.75 V. It can actually go lower, but 0.75 V is the level that Kilopass has settled on for reliable operation.

This significant reduction in voltage results in much lower power consumption, enabling a specification of 10 µA/MHz instead of 100 µA/MHz. With this new, lower internal voltage, the circuit complexity can be reduced, as charge pumps are no longer needed; this contributes to a smaller area and lower current.
With the lower voltage, it is also easier to cover wide supply voltage ranges and variations, reducing the need for regulators. This results in further area reduction and lower current.

The impact of this breakthrough is that the OTP contribution to overall power drops by roughly the same order of magnitude as the improvements in radio power are expected to provide. The 300 µA contributed by OTP memory in the example above drops to 30 µA. This change can be applied to both bulk and SOI processes, giving us a continued roadmap to lower OTP power.

Summary

Kilopass' new X2Bit bitcell consumes an order of magnitude less power than its predecessor cell and its competitors. At 10 µA/MHz, it fades into the background as a drain on an IoT edge-node battery. In addition, reduced circuitry means a smaller footprint, saving silicon cost. We see this as an enormous step forward towards achieving the overall lower power that's required for a successful rollout of the IoT.
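As a rough check of how far OTP "fades into the background", the sketch below recomputes the OTP share of the example budget from the table earlier in the paper under three scenarios: the baseline, the radio reduced tenfold, and both the radio and OTP reduced tenfold. The scenario names and the helper function are ours; the tenfold radio reduction is the paper's "as much as an order of magnitude" expectation taken at face value.

```python
# Example budget from the paper's table, in µA.
baseline_ua = {"Sensor": 40, "Processor": 30, "OTP": 300,
               "eflash": 300, "SRAM": 600, "Radio": 1000}


def otp_share(budget):
    """Fraction of the total average current drawn by the OTP memory."""
    return budget["OTP"] / sum(budget.values())


# Radio contribution cut by an order of magnitude (expected, per the paper).
radio_reduced_ua = dict(baseline_ua, Radio=baseline_ua["Radio"] // 10)
# OTP additionally cut from 100 µA/MHz to 10 µA/MHz (300 µA -> 30 µA).
both_reduced_ua = dict(radio_reduced_ua, OTP=radio_reduced_ua["OTP"] // 10)

for label, budget in [("baseline", baseline_ua),
                      ("radio / 10", radio_reduced_ua),
                      ("radio / 10 and OTP / 10", both_reduced_ua)]:
    print(f"{label:<24} total {sum(budget.values()):>5} µA, "
          f"OTP share {otp_share(budget):.1%}")
```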