A Spin-Torque Transfer MRAM in 90nm CMOS. Hui William Song


A Spin-Torque Transfer MRAM in 90nm CMOS

by

Hui William Song

A thesis submitted in conformity with the requirements for the degree of Master of Applied Science, Graduate Department of Electrical and Computer Engineering, University of Toronto

Copyright by Hui William Song 2011

A Spin-Torque Transfer MRAM in 90nm CMOS
Hui William Song
Master of Applied Science, 2011
Graduate Department of Electrical and Computer Engineering
University of Toronto

Abstract

This thesis presents the design and implementation of a high-speed read-access STT MRAM. The proposed design includes a 2T1MTJ cell topology, along with two different read schemes: current-based and voltage-based. Compared to the conventional read scheme with 1T1MTJ cells, the proposed design reduces the loading on the read circuit to minimize the read access time. A complete STT MRAM test chip including the proposed and the conventional schemes was fabricated in 90nm CMOS technology. The 16kb test chip's measurement results confirm a read access time of 6ns and a write access time of 10ns. The read time is 25% faster than other works of similar array size published thus far, while the write time matches the fastest result.

In loving memory of my mother. Everything I am today, I am because of you.

Acknowledgements

I wish to express my sincere gratitude to my supervisor, Prof. Ali Sheikholeslami, for his guidance and support throughout my study here at the University of Toronto. His extensive insight and devotion to research have kept me motivated. I thank Prof. Glenn Gulak, Prof. Wai Tung Ng, and Prof. Bruce Francis, members of my thesis examination committee, for their valuable feedback.

I would like to thank all my colleagues for their help and friendship. I would especially like to acknowledge David Halupka for his guidance on the research projects, as well as the rest of Prof. Ali's graduate group for their support. I am fortunate to have been surrounded by so many bright minds in the electronics research group. My genuine appreciation goes to Jeetendar Narsinghani for support in the lab. The measurements would not be complete without his help.

I also thank our research partners at Fujitsu Labs, especially Koji Tsunoda, Chikako Yoshida, and Aoki Masaki. Their work ethic and utmost professionalism have made our collaboration a very enjoyable experience.

I am deeply grateful to my parents for their unconditional love and support throughout my life. I would not be where I am today if not for their sacrifices. I dedicate this thesis to my late mom.

Contents

List of Tables
List of Figures
List of Acronyms

1 Introduction
  1.1 Motivation
  1.2 Thesis Objectives
  1.3 Thesis Outline

2 Background
  2.1 MTJ Technology
  2.2 MTJ Electrical Characteristics
  2.3 MRAM
    2.3.1 FIMS
    2.3.2 STT
  2.4 Current State-of-the-Art
  2.5 Challenges
    2.5.1 Critical Current / Density
    2.5.2 MTJ Model
    2.5.3 MTJ Variation
    2.5.4 Read Speed
    2.5.5 Reference Generation
  2.6 Summary

3 2T1MTJ-based STT MRAM
  3.1 Motivation
  3.2 Proposed 2T1MTJ Cell
  3.3 Proposed Current-based Read Scheme
  3.4 Proposed Voltage-based Read Scheme
  3.5 Conventional Read Scheme
  3.6 Test Chip Architecture
    Shift Registers
    WL Driver
    Adjustable Delay
    Column Circuitry
    Read Decoder
    Write Circuitry
    Probe-out Scan
  Summary

4 Measurement Results
  4.1 Test Chip
  4.2 Measurement Setup
  4.3 Read Performance
    4.3.1 Current-based Read Array
    4.3.2 Voltage-based Read Array
    4.3.3 Conventional Read Array
  4.4 Write Performance
    4.4.1 Proposed Array
    4.4.2 Conventional Array
  Comparison Against Conventional
  Summary

5 Conclusion and Future Directions
  5.1 Contributions
  5.2 Future Directions

A Test Setup
  A.1 Testboard
  A.2 Equipments

Bibliography

List of Tables

1.1 Comparison of key attributes of different memory technologies
The delay times resulting from each control input
How the SL and BL should be biased during read/write modes with given input data, for the proposed array
How the SL and BL should be biased during read/write modes with given input data, for the conventional array
Summary of all dies, with each lot's MTJ dimension and quantity
Comparison against the most recently published works in STT MRAM

List of Figures

1.1 The memory hierarchy, highlighting the target applications of MRAM
1.2 The write endurance and write access time of various memories. (Red: volatile, blue: non-volatile)
2.1 The basic structure of an MTJ
2.2 The magnetic switching of the free layer is achieved by two orthogonal currents running directly over and underneath the MTJ
2.3 Illustration of the (a) parallelizing and (b) anti-parallelizing mechanism of STT
2.4 An MTJ can be represented as a variable resistor
2.5 A standard MTJ resistance hysteresis curve. Also shown are the critical current/voltage points
2.6 MTJ critical current as a function of pulse width [1]
2.7 Structural diagram of a FIMS MRAM cell
2.8 Schematic of a 1T1MTJ STT MRAM cell
2.9 Structural diagram of an STT MRAM cell
2.10 Schematic of a conventional write scheme
2.11 Schematic of a conventional read scheme
2.12 A reference generation scheme which produces a voltage exactly midway between the two states
3.1 (a) The conventional 1T1MTJ cell and (b) the proposed 2T1MTJ cell
3.2 Illustration of how the change in current capacity affects switching pulse
3.3 The proposed current-based read scheme
3.4 The current-based sense amplifier
3.5 Simulation of the current-based read scheme
3.6 The proposed voltage-based read scheme
3.7 The voltage-based sense amplifier
3.8 Simulation of the voltage-based read scheme
3.9 A simplified diagram of the conventional read scheme
3.10 Implementation of the conventional read scheme
3.11 Simulation of the conventional read scheme
3.12 Top level overall architecture of the test chip
3.13 The proposed array with 2T1MTJ cells with current/voltage-based read schemes
3.14 Array with 1T1MTJ cells and conventional read scheme
3.15 A single row of the WL Driver
3.16 Schematic of the level shifter
3.17 Adjustable delay element is able to produce 4 delay settings during runtime
3.18 A column circuitry which contains the read and write components. This block is replicated per every column
3.19 The read decoder will (1) shift the data along in shift mode and (2) capture the correct output polarity in read mode
3.20 A push-pull write driver
3.21 The write driver control circuit. Signals to the write driver (Fig. 3.20) are produced as specified in Table
3.22 The write driver control circuit for the conventional array. Signals follow the specification of Table
3.23 The probe out circuit allows direct access to the SL and BL lines externally
4.1 Die photograph of the 16kb array
4.2 A cross-sectional TEM image of the MTJ layers in between Metal-6 and Metal
4.3 Photograph of the packaged test chip, with the lid open
4.4 The V93K test flow procedure
4.5 MTJ yield map of the current-based read array. (White: pass, black: fail.)
4.6 Read access time of the current-based read scheme
4.7 Measurement sequence for the read operation
4.8 Shmoo plot of the current-based read yield as a function of V_DD variation
4.9 A 3D graph of the shmoo plot illustrating yield in terms of V_DD and read access time
4.10 Read access yield data from multiple temperatures, appended on one graph
4.11 Maximum read yield of the current-based scheme as a function of temperature
4.12 MTJ yield map of the voltage-based read array. (White: pass, black: fail.)
4.13 Read access time of the voltage-based read scheme
4.14 Shmoo plot of the voltage-based read yield as a function of V_DD variation
4.15 A 3D graph of the shmoo plot illustrating yield in terms of V_DD and read access time
4.16 Read access yield data from multiple temperatures, appended on one graph
4.17 Maximum read yield of the voltage-based scheme as a function of temperature
4.18 Yield map of the conventional read array. (White: pass, black: fail.)
4.19 The conventional read scheme, with critical matching pairs highlighted
4.20 Measurement sequence for the write operation
4.21 Write access times of the proposed array
4.22 Shmoo plot of write yield of the proposed array as a function of V_DD variation
4.23 Write access times of the conventional array
4.24 Shmoo plot of write yield of the conventional array as a function of V_DD variation
4.25 Comparison of the write access times of the proposed array (circle) against the conventional array (square)
A.1 Photograph of the testboard
A.2 Photograph of the underside of the testboard where additional decoupling caps are soldered
A.3 Measurement setup for temperature testing
A.4 Photograph of the testing setup using the TFU

List of Acronyms

1T1MTJ  1-Transistor-1-MTJ
2T1MTJ  2-Transistor-1-MTJ
AP      Anti-parallel
BL      Bitline
CMOS    Complementary Metal Oxide Semiconductor
CQFP    Ceramic Quad Flat Pack
DFT     Design For Test
DRAM    Dynamic Random Access Memory
DUT     Design Under Test
ECC     Error Correction Code
FeRAM   Ferroelectric Random Access Memory
FIMS    Field-Induced Magnetic Switching
GMR     Giant Magnetoresistance
LLG     Landau-Lifshitz-Gilbert
LS      Level Shifter
MR      Magnetoresistance Ratio
MRAM    Magnetoresistive Random Access Memory
MTJ     Magnetic Tunnel Junction
P       Parallel
PCB     Printed Circuit Board
PRAM    Phase-change Random Access Memory
PVT     Process Voltage Temperature
RRAM    Resistive Random Access Memory
RSL     Read Select Line
RTL     Register Transfer Level
SA      Sense Amplifier
SL      Select Line
SoC     System on Chip
SRAM    Static Random Access Memory
STT     Spin-Torque Transfer
TAS     Thermally Assisted Switching
TEM     Transmission Electron Microscope
TFU     Thermal Forcing Unit
TMR     Tunneling Magnetoresistance
V93K    Verigy
WL      Word Line
WWL     Write Word Line

Chapter 1

Introduction

1.1 Motivation

Through four decades of technology scaling, solid-state memories have become faster, larger, cheaper, and more power efficient with every generation. Most memory technologies are well established and fill a specific niche. Recently, a new memory technology in the form of Spin-Torque Transfer (STT) Magnetoresistive Random Access Memory (MRAM) has emerged. STT MRAM has garnered a lot of research attention due to its potential to become a universal memory [1-3]. STT MRAM combines the advantages of the existing memory technologies in a single device. Its features include the fast access time of Static Random Access Memory (SRAM), the high density of Dynamic Random Access Memory (DRAM), and the non-volatility of Flash, but with low write power and almost unlimited endurance.

Table 1.1 [3] compares the features of the current memory technologies. It lists the characteristics of the existing products alongside the potential of STT MRAM. To the best of the author's knowledge, there are no commercial STT MRAM products available at the time of writing. This is one of the main driving forces behind

research which is intensely focused on achieving these attributes as soon as possible.

Table 1.1: Comparison of key attributes of different memory technologies.

                           SRAM             DRAM             Flash (NOR)  Flash (NAND)  FeRAM  PRAM  MRAM FIMS  MRAM STT
Non-volatile               No               No               Yes          Yes           Yes    Yes   Yes        Yes
Cell size (F^2)
Read time (ns)
Write/erase time (ns)      /50 1µs / 10ms 1m / 0.1ms 50/50 50/
Endurance                  >
Write power                Low              Low              Very high    Very high     Low    Low   High       Low
High voltage required (V)  No               No
Other power consumption    Current leakage  Refresh current  None         None          None   None  None       None

The memory hierarchy, as shown in Fig. 1.1, is organized so that the fastest and smallest memories are closest to the CPU, while the slower and larger memories are further down the chain. The target MRAM application is to replace the top of the hierarchy with a single-memory solution that is non-volatile [4]. Fig. 1.2 shows different types of memory technologies, including emerging memories such as PRAM and RRAM, in terms of their write times and endurance. It confirms that the best candidate to achieve non-volatile, fast, and reliable embedded memory is STT MRAM [4].

[Figure 1.1 labels: Speed; Capacity; Reg (SRAM); Cache (SRAM); Main memory (DRAM); Secondary storage (HDD/SSD); MRAM target; Non-volatile]

Figure 1.1: The memory hierarchy, highlighting the target applications of MRAM.

Figure 1.2: The write endurance and write access time of various memories. (Red: volatile, blue: non-volatile)

1.2 Thesis Objectives

A survey of the current state-of-the-art in published works reveals that STT MRAM is still lagging behind in terms of read access time. Most published designs have read access times on the order of a few tens of nanoseconds. However, to truly be an embedded solution, STT MRAM must operate in the sub-10ns regime to compete directly with SRAM. This thesis aims to explore design methodologies to achieve high-speed read access in STT MRAM. We accomplish this objective by proposing a novel cell topology, along with two accompanying read schemes, both designed for speed and reliability. Additionally, we verify the proposed design through measurement of a fabricated test chip.

1.3 Thesis Outline

This thesis is organized as follows. Chapter 2 provides background information on the basics of Magnetoresistive RAM. Chapter 3 describes the proposed designs of a new cell and the accompanying read schemes. Chapter 4 presents the measurement results of the test chip, covering access times and performance variations. The results are then compared against the state-of-the-art works. Finally, Chapter 5 summarizes the merits of this work and provides suggestions for future directions.

Chapter 2

Background

MRAM is widely considered to be the next universal memory technology [1-3]. The emergence of MRAM can be attributed to the invention of its storage device, the Magnetic Tunnel Junction (MTJ). Section 2.1 reviews the MTJ technologies available today, the most promising of which is the STT-MTJ. The STT device's electrical characteristics are described in section 2.2. This chapter also reviews the basic operation of MRAM (section 2.3) and presents the current state-of-the-art in this area of research (section 2.4). Finally, a discussion of the challenges being faced by STT MRAM is presented in section 2.5.

2.1 MTJ Technology

We first begin with a study of the MTJ. This section briefly reviews how MTJs have progressed to their current state. Also presented are the methods of operation for each MTJ technology.

The modern MTJ is an evolution from many generations of earlier works. One of the earliest is the discovery of the Giant Magnetoresistance (GMR) effect in the 1980s [5, 6]. It was observed that two ferromagnetic layers in parallel orientation displayed a lower electrical resistance compared to the anti-parallel magnetization of the

same layers. The theoretical basis behind GMR comes from the spin-dependent scattering effect of electrons inside ferromagnetic materials [7]. As electrical current enters the ferromagnet, the spins of the electrons are quantized to either direction (parallel / anti-parallel) of the field. The anti-parallel-spin electrons are strongly scattered, while the parallel-spin electrons can pass through with less scattering. Hence when current encounters two ferromagnetic layers with the same orientation, a low resistance is observed, whereas two layers with opposite magnetization will strongly scatter electrons of both spins, resulting in higher resistance.

While the GMR effect formed the fundamental basis of magneto-resistive circuits, the next step in the evolution came with Tunneling Magnetoresistance (TMR)-based devices [2]. This new technology exploits the quantum tunneling phenomenon. A TMR-based MTJ structure is composed of two ferromagnetic layers separated by a thin insulating barrier, as shown in Fig. 2.1. The top layer is free to switch between two possible magnetic orientations (hence called the "free layer"), but the bottom layer is fixed in its orientation (hence called the "fixed layer").

[Figure 2.1 labels: Ferromagnetic free layer (CoFeB); Insulating barrier layer (MgO); Ferromagnetic fixed layer (CoFeB)]

Figure 2.1: The basic structure of an MTJ.

The pinned magnetic field in the fixed layer is achieved by a larger, antiferromagnetic layer attached directly underneath. The free layer's magnetic orientation can be changed by locally generated magnetic fields. The required field is achieved by two orthogonal

running currents, as illustrated in Fig. 2.2. One current (i_hard) runs along the length of the MTJ to produce a magnetic field along the hard axis. The other current (i_easy) flows along the width of the MTJ to produce a magnetic field along the easy axis. Depending on the direction of i_easy, the resulting orientation of the free layer is either Parallel (P) or Anti-parallel (AP) to that of the fixed layer.

[Figure 2.2 labels: i_easy; B_easy; free layer; fixed layer; B_hard; i_hard]

Figure 2.2: The magnetic switching of the free layer is achieved by two orthogonal currents running directly over and underneath the MTJ.

When sensing the state of the MTJ, the same scattering effect as in GMR is utilized along with the electron tunneling effect through the barrier. As electrons enter the device, their spin becomes aligned with that of the magnetic layer. If the other layer has the same spin orientation, the tunneling current flows through with relative ease, whereas opposite orientations inhibit the current flow. Thus the same high / low effective resistance is observed. However, the addition of the insulating barrier adds significantly to the effective resistance. A figure of merit for MTJs is the Magnetoresistance Ratio (MR), defined as (R_AP - R_P)/R_P. Thus the larger TMR resistances result in a higher MR, on the order of 100%, compared to the GMR-based

devices which exhibit only a few percent. Larger MR means larger sense margins, which translates to more feasible circuit requirements.

The MTJ described above is the basic component of Field-Induced Magnetic Switching (FIMS) based MRAM. Although it was much better than the old GMR-based devices, it had issues of its own. As the MTJ cell geometries shrink further, exponentially stronger magnetic fields are required to successfully switch the state of the free layer. Not only does this pose scaling challenges, but such high magnetic fields also subject neighbouring cells to disturbance.

To overcome the scaling and power consumption issues, a new MTJ technology was introduced in 2005 [1]. In this technique, appropriately named Spin-Torque Transfer (STT) switching, spin-aligned electrons apply force on the free layer, resulting in torque that rotates its magnetization to the opposite direction of the easy axis. The STT effect was first theoretically predicted in 1996 [8, 9] and experimentally observed in 1998 [10, 11]. The difference between STT and FIMS MTJs is only in the switching mechanism; the sensing operation works the same way.

To better visualize how STT switching is achieved, Fig. 2.3 illustrates the parallelizing and anti-parallelizing mechanisms. In a parallelizing operation (a), assume the MTJ is first in the AP state. As electrons enter the fixed layer, the spins of almost all electrons become aligned in the same direction. Once they tunnel through to the free layer, the spin-aligned electrons create a torque against the opposite magnetization. If the torque is large enough, the magnetic moment of the free layer will switch over to become aligned with the incoming electrons.

For the anti-parallelizing operation (b), assume the MTJ is initially in the P state. As electrons enter through the free layer, they are quantized into two directions of spin alignment. Unlike the fixed layer with its strong magnetization moment, the free layer is weaker and thus a minority of electrons will be of opposite spin direction. This minority group will bounce back from the fixed layer due to scattering, and apply a torque onto the free layer to flip its magnetization to the anti-parallel direction. Because only a minority of

[Figure 2.3 labels: (a) initially AP state; applied electrons; spin-aligned electrons; spin-torque applied; switch of mag. moment. (b) initially P state; applied electrons; spin-torque applied; switch of mag. moment]

Figure 2.3: Illustration of the (a) parallelizing and (b) anti-parallelizing mechanism of STT.

electrons is doing work, the magnitude of current necessary to anti-parallelize is larger than in the parallelizing case.

The write procedure imposes stress on the MTJ device. A breakdown of the oxide barrier layer can occur once the number of write cycles exceeds the MTJ's lifetime endurance. The read operation, however, does not stress the device, because only a small amount of current is applied to sense the magnetization.

With STT, since the current flows directly through the MTJ, the cell architecture becomes simpler compared to the field switching method (Fig. 2.7 vs. Fig. 2.9). Also, FIMS requires a current whose magnitude is inversely proportional to MTJ volume, whereas the STT switching current is directly proportional to volume [12, 13]. This lends itself well to scalability in future smaller nodes.

2.2 MTJ Electrical Characteristics

An MTJ can be represented simply as a variable resistor, as shown in Fig. 2.4. The convention we follow in this work is that the free layer side of the MTJ is the positive terminal and the fixed layer side is the negative terminal. Thus a positive current is defined as flowing from A to B.

[Figure 2.4 labels: A; free layer; barrier layer; fixed layer; B; R_MTJ; i_MTJ]

Figure 2.4: An MTJ can be represented as a variable resistor.
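The magnetoresistance ratio defined in Section 2.1, MR = (R_AP - R_P)/R_P, can be made concrete with a short calculation. The sketch below is only illustrative: the function name and the resistance values are assumptions chosen for the example, not measured data from this work.

```python
def magnetoresistance_ratio(r_p: float, r_ap: float) -> float:
    """Return MR = (R_AP - R_P) / R_P as a fraction (1.0 means 100%)."""
    if r_p <= 0:
        raise ValueError("R_P must be positive")
    return (r_ap - r_p) / r_p


# Hypothetical resistances (ohms), chosen only to illustrate the formula.
R_P = 1000.0    # parallel (low-resistance) state
R_AP = 2000.0   # anti-parallel (high-resistance) state

mr = magnetoresistance_ratio(R_P, R_AP)
print(f"MR = {mr:.0%}")  # prints "MR = 100%"
```

With these assumed values the anti-parallel resistance is twice the parallel resistance, which corresponds to the "order of 100%" MR quoted for TMR devices; a GMR-like pair such as 1000 and 1030 ohms would give only 3%.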

[Figure 2.5 labels: R_MTJ; R_AP; R_P; I_C- / V_C-; I_C+ / V_C+; i_MTJ / V_MTJ]

Figure 2.5: A standard MTJ resistance hysteresis curve. Also shown are the critical current/voltage points.

The MTJ can assume one of two states (high-resistance R_AP or low-resistance R_P) and displays hysteresis behaviour. A typical resistance hysteresis curve is plotted in Fig. 2.5, which shows the change in resistance as a function of applied voltage or current (V_MTJ / i_MTJ). A positive applied V_MTJ / i_MTJ switches the MTJ from the high-resistance to the low-resistance state. A negative V_MTJ / i_MTJ causes low-to-high switching. If the MTJ is already in the P state when a positive stimulus is applied, or in the AP state when a negative stimulus is applied, no change in state occurs. Note that, due to the TMR effect, the resistance varies more as a function of the applied stimulus while in the high-resistance state [1].

The current/voltage values at which switching occurs are called the critical current (I_C) and critical voltage (V_C). We can further specify the parallelizing and anti-parallelizing critical points as I_C+ / I_C- and V_C+ / V_C-. The critical current/voltage varies as a function of the width of the pulse with which I_C / V_C is applied. As the pulse width becomes shorter, the magnitude of I_C / V_C becomes larger. Fig. 2.6 shows a typical I_C-t relationship. Note that down to 10ns, I_C is well-behaved with respect to time, but pulse

width below 10ns causes an exponential rise in the magnitude of the critical current. In the >10ns regime, thermal agitation can help greatly in reducing the switching current [12]. However, in the sub-10ns regime, the thermal contribution is limited and precessional switching becomes the only mechanism. For this work, the Fujitsu MTJs that we designed for had a critical current of 1mA. This is a few times larger than the average of those that have been reported thus far [13-15].

Figure 2.6: MTJ critical current as a function of pulse width [1].

The electrical characteristics described in this section pertain to the basic MTJ device. However, more advanced MTJ structures will exhibit slightly different characteristics. For the scope of this thesis, all the necessary background information has been summarized in this section. Readers interested in MTJ innovation are encouraged to look into such areas as perpendicular magnetic orientation structures [16-18], multi-value MTJs [19-21], top-pinned fixed layers [22], high thermal stability [23, 24], modeling [25-28], and the study of feasibility down to 22nm node devices [29, 30].
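The hysteresis behaviour described in this section can be sketched as a two-state model. This is a minimal illustration under stated assumptions, not the MTJ model used in this thesis: the class name, field names, and numeric values are hypothetical, and the sketch ignores both the bias dependence of the resistance and the pulse-width dependence of the critical current.

```python
from dataclasses import dataclass


@dataclass
class TwoStateMTJ:
    """Idealized MTJ: a resistor that is either R_P ("P") or R_AP ("AP")."""
    r_p: float          # low-resistance (parallel) value, ohms
    r_ap: float         # high-resistance (anti-parallel) value, ohms
    i_c_plus: float     # magnitude of positive critical current (AP -> P), amps
    i_c_minus: float    # magnitude of negative critical current (P -> AP), amps
    state: str = "AP"

    def resistance(self) -> float:
        return self.r_p if self.state == "P" else self.r_ap

    def apply_current(self, i_mtj: float) -> str:
        """Apply a current (positive = free layer to fixed layer) and update state."""
        if i_mtj >= self.i_c_plus:
            self.state = "P"        # positive stimulus parallelizes
        elif i_mtj <= -self.i_c_minus:
            self.state = "AP"       # negative stimulus anti-parallelizes
        # Anything between the two critical points leaves the state unchanged.
        return self.state


# Hypothetical device values; I_C- is larger in magnitude than I_C+.
mtj = TwoStateMTJ(r_p=1e3, r_ap=2e3, i_c_plus=0.8e-3, i_c_minus=1.0e-3)
for current in (0.5e-3, 0.9e-3, -0.9e-3, -1.1e-3):
    print(f"{current * 1e3:+.1f} mA -> {mtj.apply_current(current)}")
```

The sequence of currents walks around the loop of Fig. 2.5: a sub-critical positive current leaves the AP state intact, a larger one parallelizes, a negative current below the (larger) anti-parallelizing threshold does nothing, and only a sufficiently negative current restores the AP state.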

[Figure 2.7 labels: i_WR; i_RD; BL; SL; i_WWL; WWL; D; WL; S]

Figure 2.7: Structural diagram of a FIMS MRAM cell.

2.3 MRAM

In this section, the basic MRAM operations are described, at the cell level as well as for the read and write procedures. The focus of this thesis is on STT MRAM, but a brief discussion of FIMS is also given to better understand the contrast between them.

2.3.1 FIMS

The standard FIMS MRAM cell is illustrated in Fig. 2.7. It consists of the MTJ connected in series with the access transistor. The Bitline (BL) and Select Line (SL) are shared along each column, while the Word Line (WL) and Write Word Line (WWL) are shared among cells of the same row. During a write operation, the unidirectional current (i_WWL) through the WWL is activated, as well as the bidirectional (data-dependent) programming current (i_WR) through the BL. Since the magnetic field is sufficient only at the orthogonal cross-point, only the selected cell's state should switch. For a read operation, a small sense current (i_RD) runs through the MTJ to develop a voltage across the cell (i_RD · R_MTJ + V_Tr). This voltage is evaluated by a Sense Amplifier (SA).

[Figure 2.8 labels: SL; WL; BL; R_MTJ]

Figure 2.8: Schematic of a 1T1MTJ STT MRAM cell.

[Figure 2.9 labels: i_RD / i_WR; BL; SL; WL; D; S]

Figure 2.9: Structural diagram of an STT MRAM cell.

2.3.2 STT

The 1-Transistor-1-MTJ (1T1MTJ) topology (Fig. 2.8) is the standard cell most widely used in STT MRAM. The MTJ is connected in series with an NMOS access transistor. The positive end of the MTJ is connected to the vertical line BL, while the access transistor is connected to the other vertical line SL. The gate is attached to the horizontal WL. A structural view of the cell is illustrated in Fig. 2.9. Note the absence of the WWL.

The STT write operation is simpler than that of FIMS. Depending on the data to be stored, I_C in either direction is fed through the cell. We follow the convention that the low-resistance MTJ state corresponds to logic 0 and the high-resistance state corresponds to logic

1. A block diagram of the write circuit is shown in Fig. 2.10. Note that since the MTJ switching requirements are asymmetrical for parallelizing and anti-parallelizing, I_C- needs to be larger in magnitude than I_C+.

[Figure 2.10 labels: I_C-; I_C+; WR1; WR0; WL; SL; BL]

Figure 2.10: Schematic of a conventional write scheme.

Fig. 2.11 shows the basic schematic of a simple read circuit. During a read operation, a small current i_READ, carefully chosen to avoid read disturbance, is fed through the cell. The voltage that develops across the cell is compared against a reference voltage that is ideally midway between the two possible input voltages V_H and V_L:

$$ V_{ref} = \frac{V_H + V_L}{2} \tag{2.1} $$

However, with smaller read currents, the voltage margin (ΔV = V_H - V_L) becomes smaller and is harder for the SA to detect. Another issue is the creation of a reference resistance

(R_ref) that is exactly halfway between the two states,

$$ R_{ref} = \frac{R_P + R_{AP}}{2} \tag{2.2} $$

[Figure 2.11 labels: i_READ; V_x; V_ref; SA; R_MTJ; R_ref = (R_P + R_AP)/2]

Figure 2.11: Schematic of a conventional read scheme.

Due to the high levels of PVT-related variation that exist among MTJs, it is a good idea to use MTJs themselves as references, so that the variation is tracked. It is also possible to create an adaptive reference-resistance analog circuit, but such a scheme would come at the cost of higher complexity, as well as larger area and power consumption. Employing the two possible MTJ resistance values available, the structure in Fig. 2.12 shows one method of producing the ideal V_ref value. However, it faces many practical issues, such as the fabrication of two MTJs back-to-back, the circuits needed to control each MTJ state, and overall size. We tackle the issue of reference generation in our design in Chapter 3.
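The relationship between the read current, the midway reference of Eq. (2.1), and the sense margin can be checked numerically. The sketch below makes several simplifying assumptions that are not from the source: it ignores the voltage drop across the access transistor, it reads the scheme of Fig. 2.12 as half the read current flowing through R_P and R_AP in series, and all function names and numeric values are hypothetical.

```python
import math


def read_voltages(i_read: float, r_p: float, r_ap: float):
    """Return (V_L, V_H, V_ref, delta_V) for an ideal cell with no transistor drop."""
    v_l = i_read * r_p            # cell voltage in the low-resistance (P) state
    v_h = i_read * r_ap           # cell voltage in the high-resistance (AP) state
    v_ref = (v_h + v_l) / 2       # Eq. (2.1): midway reference
    delta_v = v_h - v_l           # total voltage margin seen by the sense amplifier
    return v_l, v_h, v_ref, delta_v


def series_reference(i_read: float, r_p: float, r_ap: float) -> float:
    """Reference from i_READ/2 through R_P and R_AP in series (assumed topology)."""
    return (i_read / 2) * (r_p + r_ap)


# Hypothetical values: 1k / 2k ohm MTJ states, two candidate read currents.
for i_read in (20e-6, 10e-6):
    v_l, v_h, v_ref, dv = read_voltages(i_read, 1e3, 2e3)
    assert math.isclose(series_reference(i_read, 1e3, 2e3), v_ref)
    print(f"i_READ = {i_read * 1e6:.0f} uA: V_ref = {v_ref * 1e3:.1f} mV, "
          f"margin = {dv * 1e3:.1f} mV")
```

Halving the read current halves the margin, which illustrates the trade-off noted above: a smaller i_READ reduces the risk of read disturbance but leaves the sense amplifier with a smaller voltage difference to resolve.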

[Figure 2.12 labels: i_READ/2; R_P; R_AP; V_EQ = i_READ (R_P + R_AP)/2]

Figure 2.12: A reference generation scheme which produces a voltage exactly midway between the two states.

2.4 Current State-of-the-Art

This section summarizes a survey of recent prominent literature in the area of MRAM. These papers give us an idea of the specifications to target for our own system. They are compared directly against our work in Chapter 4.

The leading published work in FIMS MRAM came in 2009 [31]. A 32Mb array built in 90nm CMOS was shown to have 12ns access times. The contributions of that work include boosted wordlines and reduction of parasitic resistance. Toggle-based MRAM is a variation of FIMS which is more power-efficient in writing, but takes longer to complete. A toggle-based 180kb embeddable module with a 25ns read access time was demonstrated in [32].

One of the first STT MRAM systems to be successfully demonstrated was in 2007 [13]. The 2Mb test chip had 100ns write access and 40ns read access times. In 2009, the same team improved upon their design by achieving a 32Mb array with 40ns write access and 32ns read access times [14]. More recently, in 2010, a 64Mb array was published with 30ns write and 11ns read access times [15]. The primary contribution of that work is the 50µA switching perpendicular MTJ. This switching current is much lower than any reported

before, and thus allowed for high cell density in the array. At the same time, our team published a paper [33] that introduced the use of negative resistance in STT MRAM, to guarantee non-destructive reads and to achieve power savings in certain cases of writes.

2.5 Challenges

STT MRAM is a relatively young technology and will need more work to mature. Some of the challenges being faced today are discussed in this section.

2.5.1 Critical Current / Density

The critical current of MTJs is still quite high, on the order of a few hundred micro-amps. The access transistor must be sized to carry such an I_C. Unfortunately, the resulting access transistor is constrained to be much larger than the ideally desired minimum size, such as in DRAM. Hence, at the system level, it translates to a density issue. This obstacle is primarily in the domain of process-level research to resolve. However, [34] has tried to get around this issue by purposely sizing the access transistor smaller than required, and using an Error Correction Code (ECC) technique to rectify faults. This, of course, comes with the penalty of ECC area and increased system-level complexity.

2.5.2 MTJ Model

There are MTJ models which have the physical electro-magnetic properties well defined, using the Landau-Lifshitz-Gilbert (LLG) equation [35, 36], but they are not compatible with circuit simulators such as Hspice [26]. These physical models are mainly targeted towards physics and device-level research. There are also MTJ behavioural models which do characterize the electrical properties and can be used by circuit designers. However, these are not based on physical parameters, but rather curve-fit from measured

data [27, 28]. The ideal solution would be a model whose data is based on numerical analysis (i.e. using LLG), converted into high-level electrical characteristics compatible with circuit-level simulators. Because such a model is not available, for the work in this thesis we used a fairly simple curve-fit-based model to account for the MTJ's static behaviour.

2.5.3 MTJ Variation

Due to process variation, deviations in the MTJ layers' thickness and lateral dimensions can occur. This translates to MTJ resistance variation and critical current/voltage variation across different locations of an array. These variations will inevitably exist, much like CMOS transistor variation, and therefore must be taken into consideration during design.

MTJ characteristics vary across temperature as well. At higher temperatures, the thermal energy within the MTJ assists with the switching process, a phenomenon appropriately named Thermally Assisted Switching (TAS). With TAS, the required critical current/voltage values decrease, which is helpful to the write process. However, a smaller switching current/voltage is harmful to the read operation, because it can lead to an increased probability of read-disturb failure.

2.5.4 Read Speed

So far, STT MRAM chips have not penetrated the sub-10ns cycle time regime. In order to compete as an embedded memory, such as SRAM, high-speed random access operation will be required. Unlike write time, which is heavily dependent on the MTJ process, read time is mostly limited by circuit design. Therefore, innovative circuit design needs to be used to push the speed of the read operation. This is one of the main objectives of our work. The proposed designs presented in the next chapter set out to achieve fast read access times.
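The effect of the MTJ resistance variation discussed in Section 2.5.3 on a read against a fixed reference can be illustrated with a small Monte Carlo experiment. This is only a sketch under assumptions that are not from the source: resistances are drawn from independent Gaussian distributions, the reference is a fixed ideal midpoint that does not track the cells, CMOS variation is ignored, and the function name and numeric values are hypothetical.

```python
import random


def read_failure_fraction(n_cells: int, r_p_nom: float, r_ap_nom: float,
                          sigma_rel: float, seed: int = 0) -> float:
    """Fraction of cells whose P or AP resistance lands on the wrong side of a
    fixed midpoint reference, under an assumed Gaussian resistance spread."""
    rng = random.Random(seed)               # seeded for repeatable runs
    r_ref = (r_p_nom + r_ap_nom) / 2        # fixed, non-tracking reference
    failures = 0
    for _ in range(n_cells):
        r_p = rng.gauss(r_p_nom, sigma_rel * r_p_nom)
        r_ap = rng.gauss(r_ap_nom, sigma_rel * r_ap_nom)
        # A cell fails if either of its states cannot be told apart from R_ref.
        if r_p >= r_ref or r_ap <= r_ref:
            failures += 1
    return failures / n_cells


# Hypothetical nominal resistances (ohms) and relative spreads.
for sigma in (0.01, 0.10, 0.30):
    frac = read_failure_fraction(10_000, 1e3, 2e3, sigma)
    print(f"sigma = {sigma:.0%}: failing fraction = {frac:.4f}")
```

As the assumed spread grows, more cells cross the fixed midpoint, which is one way to see why a reference built from MTJs that track the same variation is attractive.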

Reference Generation

A reference, whether voltage or current, that lies exactly midway between the two input cases is beneficial. It ensures reliable operation of the sense amplifier and helps relax matching constraints. However, as mentioned earlier in Section 2.3.2, a perfectly midway reference is hard to achieve in practice. Most current works try to emulate it and come close to the ideal point, but when other factors such as CMOS and MTJ resistance variation are considered, the margin can reduce significantly. The next chapter will put forth ideas on generating stable midway references.

2.6 Summary

In this chapter, the MTJ technology advancements were outlined, from the first GMR-based to the modern STT-based devices. Next, electrical characteristics of the MTJ from the circuit design point of view were considered. A description of the top-level MRAM operations was given, followed by a summary of the most significant state-of-the-art published works. Finally, the chapter concluded with the current technical issues facing STT MRAM. We will try to tackle some of these issues in our proposed design presented in the next chapter.

Chapter 3

2T1MTJ-based STT MRAM

3.1 Motivation

In order for STT MRAM to achieve full-scale integration as the next embedded memory, a high-speed random access operation rivaling that of SRAM is necessary [31, 37]. Targeting read-mostly memory applications, this work proposes an STT MRAM system aimed at obtaining high-speed read access times. We accomplish this by utilizing a novel 2-Transistor-1-MTJ (2T1MTJ) cell topology along with new read schemes designed specifically for the 2T1MTJ cell. Two read schemes are proposed in this work: current-based and voltage-based. The current-based read scheme directly compares current levels, while the voltage-based read scheme compares node voltages. For comparison purposes, a conventional array was also implemented on the test chip to measure the performance differences.

In terms of write operation, the target specification is a 10ns write access time. This specification dictates the minimum size of the access transistor, and hence the area of the cell.

This chapter is organized as follows: Section 3.2 introduces the 2T1MTJ cell. Sections 3.3 and 3.4 explain the functionality of the proposed current-based read scheme and voltage-based read scheme, respectively. Section 3.5 describes the conventional array that

was implemented. Section 3.6 shows the overall architecture of the test chip, including all the peripheral components. Finally, Section 3.7 concludes the chapter.

3.2 Proposed 2T1MTJ Cell

The conventional 1T1MTJ cell (Fig. 3.1(a)) uses a single transistor for both read and write purposes. However, the two operations place contradicting requirements on the access transistor. The write operation demands that it be as large as possible, but a large size is unfavourable for the read operation, because the cell becomes more susceptible to read disturbance [38, 39] and read times become longer. To decouple these conflicting requirements, a second transistor is added to be used exclusively for the read operation. Connecting the gate of the second device to the internal node of the cell adds the flexibility to utilize that internal node. Thus a 2T1MTJ cell is proposed, as shown in Fig. 3.1(b).

Figure 3.1: (a) The conventional 1T1MTJ cell and (b) the proposed 2T1MTJ cell.

In order for this cell to maintain the same area as that of the conventional cell, we reduce the size of the access transistor M1 to accommodate the significantly smaller M2. While M1 would be sized large enough to support the critical switching current, M2 can be as small as the minimum-size transistor of the technology, because it is only used

during the read operation. The drain of M2 is connected to a third column line named the Read Select Line (RSL). RSL does not increase the area of the cell in any way because it lies on a different metal layer than SL and BL. As its name suggests, the RSL is only utilized during the read operation. For writes, it is grounded, rendering both M2 and RSL irrelevant.

The reduced size of M1 can be justified by MTJ behaviour, as shown in Fig. The critical switching current increases exponentially in the sub-10ns pulse width region. Therefore, a decrease in the current capacity of the access transistor will yield only a slight increase in the write pulse width, as illustrated in Fig. However, we will show that this small increase in write time is justified by the ample gain in read speed.

Figure 3.2: Illustration of how the change in current capacity affects switching pulse.

3.3 Proposed Current-based Read Scheme

As with all memories, there exists a read/write input signal (r/w) to the memory system to indicate the mode of operation. In this design, during read mode (r/w=1), all cells are biased such that SL is tied to V_DD and BL is tied to ground. This configuration is maintained throughout the read mode, regardless of active read cycles. The read circuit setup is shown in Fig. To select a row, a voltage of approximately 2/3 V_DD is applied

on the WL. As a result, the access transistor M1 effectively acts like a small current source inside the cell, allowing a voltage to quickly develop across the MTJ. This voltage, which can be of two different values depending on the MTJ resistance, drives the gate of M2, generating the current i_cell. The current-based sense amplifier directly compares the cell current (i_cell) against the reference current (i_ref) produced by the reference cells. The reference cell is composed of two regular cells, each with an MTJ of the opposite state. The top branch of the reference cell produces i_0/2 while the bottom branch produces i_1/2, which together generate a total constant reference current of

$$ i_{ref} = \frac{i_0 + i_1}{2} \tag{3.1} $$

The M2s in the reference cells have exactly half the drive strength of those in the regular storage cells. Hence, M2 in the reference cells can be chosen as minimum-size w_o and those in the storage cells as 2w_o. Another option is to size M2 in the storage cells as minimum-size w_o, and M2 in the reference cell as w_o/2L_o (twice minimum length). Both options achieve the same effect; that is, they produce exactly half the current in the reference with respect to the storage cell. The first option has the advantage that the width ratio is better controlled through the fabrication process than the length ratio, but it suffers from area overhead. We chose the first option in our implementation, since accuracy was more important than area in the case of a test chip. The loading on the two RSL lines is identical. This ensures that the development times of i_cell and i_ref are the same.

The current-based sense amplifier is illustrated in Fig. The input nodes are attached to the two RSL lines. The currents from the inputs (in+ and in−) are mirrored and multiplied, then fed directly into the two branches of the cross-coupled inverters. The active-low enable signal (en) is activated once the input currents reach steady-state.
When the enable signal is not triggered (en=1), the two NMOS transistors are ON and allow the currents to flow through them, stabilizing the outputs to the same voltage. Once the enable is triggered (en=0), the two currents can push the cross-coupled inverters to latch into one of the two possible directions. The outputs are fully digital logic values.
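The reference generation of Eq. (3.1) can be illustrated with a short numerical sketch. The current values below are hypothetical (chosen only so that the two cell currents differ by 30µA), and the assumption that i_1 is the larger of the two currents is ours, not a statement from the measurements.

```python
# Minimal sketch of the midway reference current of Eq. (3.1).
# I0 and I1 are hypothetical cell currents, not measured values.
I0 = 100e-6  # assumed cell current for a stored 0 (A)
I1 = 130e-6  # assumed cell current for a stored 1 (A)


def reference_current(i0, i1):
    """Sum of the two half-strength reference branches: i0/2 + i1/2."""
    return i0 / 2 + i1 / 2


def sense(i_cell, i_ref):
    """Idealized comparison: 1 if the cell current exceeds the reference."""
    return 1 if i_cell > i_ref else 0


i_ref = reference_current(I0, I1)
margin_read0 = i_ref - I0  # distance of the reference from the 0 level
margin_read1 = I1 - i_ref  # distance of the reference from the 1 level
print(f"i_ref = {i_ref * 1e6:.1f} uA")
print(f"margins: {margin_read0 * 1e6:.1f} uA / {margin_read1 * 1e6:.1f} uA")
```

Because the reference is the average of the two cell currents, the two margins are equal by construction, which is the property the half-strength M2 sizing is meant to provide.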

Figure 3.3: The proposed current-based read scheme.

Figure 3.4: The current-based sense amplifier.

Note that since in+ is connected to the top RSL and in− to the bottom RSL, the inputs are switched when the accessed cell is in the bottom half of the array. This switches the output polarity and needs to be accounted for.

Fig. 3.5 shows the simulation waveforms of the read signals. This example shows two consecutive reads: a read of 0, followed by a read of 1. The read operation starts with the activation of the external I/O signal WL_en. When this signal is enabled, the WL Driver turns on the selected row WL, as well as the reference row WL, to the set voltage V_b. Once the WL is activated, i_cell and i_ref start to develop. When the currents reach steady-state, the sense-amp enable (en) can be triggered. The difference between the reference current and the cell current is about 15µA. The output of the SA then feeds to the read decoder, whose job is to invert the polarity of the signal if the accessed cell is in the bottom half of the array. The final output is then captured by a data register, triggered by the positive edge of the clock (CLK). The read access time is defined as the time from the WL_en activation to the rising edge of the register clock (shown in Fig. 3.5).

Figure 3.5: Simulation of the current-based read scheme.

Compared to the conventional scheme, which grounds the BL and charges the capacitive load on SL to develop the cell voltage V_x, the proposed read scheme bypasses the C_SL and C_BL loading. This is possible because SL and BL are always tied to set voltage levels, and do not need to be precharged or discharged in every read cycle. The most significant load is now on the RSL. However, this loading is smaller than that of SL/BL, because it is shunted by the much smaller M2 transistors and is only connected to half the array. Thus the factor of reduction in capacitance is approximately w_M2/(2 w_M1), which in the case of our test chip is 1/6. This contributes to shorter read access times. Here w_M1 is the width of the access transistor of the conventional cell, as defined in Section 3.5.
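As a small worked example of the capacitance reduction factor w_M2/(2 w_M1), the sketch below evaluates it with exact fractions. The two widths are hypothetical values chosen only so that their ratio reproduces the 1/6 quoted above; they are not taken from the test chip layout.

```python
from fractions import Fraction


def rsl_load_ratio(w_m2, w_m1_conventional):
    """Approximate RSL-to-SL load ratio: w_M2 / (2 * w_M1).

    The factor of 2 reflects that each RSL connects to only half the array.
    """
    return Fraction(w_m2) / (2 * Fraction(w_m1_conventional))


# Hypothetical widths in micrometres (only their ratio matters here).
W_M2 = 2
W_M1_CONVENTIONAL = 6

ratio = rsl_load_ratio(W_M2, W_M1_CONVENTIONAL)
print(ratio)
```

Under this first-order view, any pair of widths with w_M2 equal to one third of w_M1 gives the same ratio, since wire capacitance and other parasitics are ignored.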

3.4 Proposed Voltage-based Read Scheme

The voltage-based read scheme is similar in structure to the current-based read scheme, as shown in Fig. Again, all cells are biased with SL tied to V_DD and BL tied to ground. This bias condition has the additional benefit of being the anti-parallelizing configuration for the MTJ. From the MTJ characteristics, we know that its asymmetrical hysteresis allows for a larger disturbance margin in the anti-parallel region during the read operation (Fig. 2.5). The parallelizing current/voltage is about half of the anti-parallelizing critical value, thus the read margin would be significantly smaller if biased in the opposite direction. The reference is also identical to that used in the current-based read scheme.

The read cycle starts by first precharging the RSLs to V_DD. A standard precharge circuit is used, and is not shown in Fig. Once the selected WL is turned on to the set bias voltage V_b, approximately 2/3 V_DD, M2 will start discharging the RSL node at the rate of

$$ \frac{dv}{dt} = \frac{i}{C_{RSL}} \tag{3.2} $$

The storage cell will produce either i_0 or i_1, depending on the MTJ state, and the reference cell will produce a constant current (i_0 + i_1)/2. The C_RSL on the top and the bottom are identical. Therefore, the rate of change of RSL during discharge will be such that the reference RSL is exactly halfway between the high and low cases.

The voltage-based sense amplifier is presented in Fig. The enable is an active-low signal. Before it is triggered (en=1), the output nodes are both grounded due to M1 being OFF and the M2/M3 branch being ON. When the enable is triggered (en=0), M1 connects V_DD to the cross-coupled inverters. The en_d (enable delayed) signal is still high initially. During this time, the input voltage allows the output nodes to develop in a manner leaning towards the final value.
Once the delayed enable also goes low, the M2/M3 side branches are no longer active, and the cross-coupled inverters will fully latch to the final value.
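The discharge relation of Eq. (3.2) can be sketched numerically to show why the reference RSL stays halfway between the two cell cases. This is a simplified model, assuming M2 behaves as an ideal constant-current sink until the line reaches ground; V_DD = 1.2V follows the text, while the capacitance and current values are hypothetical.

```python
# Idealized RSL discharge: v(t) = V_DD - i * t / C_RSL, clamped at ground.
V_DD = 1.2          # supply voltage (V)
C_RSL = 100e-15     # hypothetical RSL capacitance (F)
I0 = 40e-6          # hypothetical cell current for one MTJ state (A)
I1 = 80e-6          # hypothetical cell current for the other state (A)
I_REF = (I0 + I1) / 2


def v_rsl(current, t):
    """RSL voltage after discharging for time t with a constant current."""
    return max(0.0, V_DD - current * t / C_RSL)


def margins(t):
    """Voltage margins on either side of the reference at time t."""
    v0, v1, v_ref = v_rsl(I0, t), v_rsl(I1, t), v_rsl(I_REF, t)
    return v0 - v_ref, v_ref - v1


for t_ns in (0.5, 1.0, 2.0, 10.0):
    upper, lower = margins(t_ns * 1e-9)
    print(f"t = {t_ns:4.1f} ns: margins {upper * 1e3:6.1f} mV / {lower * 1e3:6.1f} mV")
```

While all three lines are still discharging, the two margins are equal and grow linearly with time. Once the lines start reaching ground the margins shrink again, which is why the sense amplifier has to be enabled within a limited time window.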

Figure 3.6: The proposed voltage-based read scheme.

Figure 3.7: The voltage-based sense amplifier.

The outputs are fully digital logic values. The amount of delay between en and en_d is a programmable value that ranges from 200ps to 1ns. The delay circuit will be explained in detail in a later section. A longer delay helps ensure that the SA latches to the correct value, while a shorter delay reduces the time during which current flows in the circuit, thus saving power.

Fig. 3.8 shows the simulation of the read signals of the voltage-based read scheme. Again, we show two consecutive reads: a read of 0 followed by a read of 1. The read operation starts with the activation of the external I/O signal WL_en. When this signal is enabled, the WL Driver pulls up the selected row WL, as well as the reference row WL. At the same time, the top and bottom RSLs are precharged to V_DD = 1.2V. After the precharge completes, M2 starts discharging the RSL. Once enough margin develops between the reference voltage and the cell voltage, the sense-amp enable (en) is triggered. The optimal margin is around 250mV. If we wait too long to trigger en, both input nodes will have fully discharged to ground. Therefore, there is an optimal time at which to enable the SA to ensure maximum margin. The output of the

SA then feeds to the read decoder, whose job is to invert the polarity of the signal if the accessed cell is in the bottom half of the array. The final output is then captured by a data register, triggered by the positive edge of the clock (CLK). The read access time is defined as the time from the WL_en activation to the rising edge of the data register clock.

Figure 3.8: Simulation of the voltage-based read scheme.

This scheme has several benefits over the conventional one. First, it guarantees that the reference level is exactly midway between the two cases of input levels. Second, the voltage margin between V_x and V_ref is larger than that of the conventional scheme by as much as 2.5× (as will be seen later). Third, this circuit replaces the large SL capacitance (C_SL) with the smaller load of C_RSL, which contributes to the reduced read access time.

3.5 Conventional Read Scheme

The conventional array was also implemented on the test chip for the purpose of comparison against the proposed arrays. The 1T1MTJ cell is used (Fig. 3.1(a)), with the size of the access transistor M1 (w_M1) equal to the 2T1MTJ's M1 and M2 added together (w_M1 + w_M2).

A simplified block diagram of the conventional read scheme is shown in Fig. The basic idea is to feed a read current to the storage and the reference MTJs to develop a voltage across the cell (MTJ + transistor) and a reference voltage across the reference cell. The cell voltage (V_x) is compared against the reference voltage (V_ref) by the sense amplifier once the input levels reach their final values. This assumes that the resistance of the reference MTJ (R_ref) is midway between R_P and R_AP. There are many different ways to implement this conventional scheme, and many variations have been published to maximize the performance of such systems [13–15, 33].

The conventional system implemented in our design is shown in Fig. The reference is composed of two cells in parallel, each with an MTJ of the opposite state. The read current is provided by the current mirror structure, which feeds twice as much current through the reference MTJs as through the storage cell. This yields an effective reference resistance of

$$ R_{ref} = 2\,(R_P \parallel R_{AP}) \tag{3.3} $$

$$ R_{ref} = \frac{2\,R_P R_{AP}}{R_P + R_{AP}} \tag{3.4} $$

Now assuming nominal resistance values of R_P = 1000Ω and R_AP = 1500Ω,

$$ R_{ref} = \frac{2 \times 1000 \times 1500}{1000 + 1500} \tag{3.5} $$

$$ R_{ref} = 1200\ \Omega \tag{3.6} $$

Figure 3.9: A simplified diagram of the conventional read scheme.

Figure 3.10: Implementation of the conventional read scheme.

This is not exactly midway, but is a close approximation. The only way to guarantee a reference voltage exactly midway is to use the configuration shown in Fig. However, such a configuration is difficult to achieve practically, as explained in section. Considering the simplicity and the relative closeness of V_ref to the midway point, this structure is a good compromise. Also note that V_ref is actually closer to the lower input voltage (V_0), thus the margin is slightly smaller when reading a parallel-state MTJ. The sense amplifier used is exactly the same design as that in the proposed voltage-based read scheme in Fig.

Figure 3.11 shows the simulation results of the conventional read. We first access a cell to read 0, followed by a read of 1. Similar to the proposed designs, the read cycles start with the activation of WL_en, which in turn allows the WL Driver to turn on the selected row WL and the reference WL to full V_DD. This lets the read current i_read flow and V_x and V_ref develop. The margin between V_x and V_ref is about 100mV under nominal conditions. The functionality of the SA has been described previously in Section 3.4.

Three issues in the conventional scheme leave room for improvement. First, it must charge C_SL, which is quite large due to the parasitic capacitance of the large access transistors. Second, the reference voltage is not exactly midway between the two input cases. Finally, the nominal voltage margin is only 100mV, which can quickly shrink in the face of CMOS and MTJ variation. The proposed read schemes (described in the previous sections) were designed to address these issues.
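The off-centre reference of Eq. (3.3) to (3.6) can be checked with a few lines of arithmetic. This sketch uses the nominal R_P and R_AP quoted above and, as a simplification of ours, ignores the access transistor resistance that is in series with each MTJ.

```python
# Effective reference resistance of the conventional scheme, Eq. (3.3)-(3.6).
R_P = 1000.0   # nominal parallel-state resistance (ohms)
R_AP = 1500.0  # nominal anti-parallel-state resistance (ohms)


def parallel(r_a, r_b):
    """Equivalent resistance of two resistors in parallel."""
    return r_a * r_b / (r_a + r_b)


r_ref = 2 * parallel(R_P, R_AP)  # twice the read current through R_P || R_AP
r_mid = (R_P + R_AP) / 2         # the ideal midway reference

print(f"R_ref = {r_ref:.0f} ohm, ideal midpoint = {r_mid:.0f} ohm")
print(f"margin to R_P  = {r_ref - R_P:.0f} ohm")
print(f"margin to R_AP = {R_AP - r_ref:.0f} ohm")
```

The reference sits below the midpoint, so the margin on the parallel-state side is the smaller of the two, consistent with the observation that V_ref lies closer to the lower input voltage.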

Figure 3.11: Simulation of the conventional read scheme.

3.6 Test Chip Architecture

A simplified block diagram of the overall architecture of the test chip is illustrated in Fig. As the figure shows, the complete array is composed of three sub-arrays. The proposed arrays share a WL Driver, while another version of it is dedicated to the conventional array. Fig and 3.14 further illustrate the proposed and conventional arrays, with cell-level connection details.

The reference cells are placed in the middle row of the array to minimize MTJ variations, in terms of resistance and critical switching current, between the reference and the storage cells. The references are activated in every cycle in which a bit in the same column is accessed. This leads to more frequent usage of reference cells than of regular cells. However, reference cell fatigue is not an issue, because these are read accesses. Only the write operation stresses the MTJ device, and the reference cells only need to be written once, at system start-up. Therefore, this architecture does not compromise the overall lifetime of the system. In fact, the reference cells' slower aging rate could lead to discrepancies in MTJ characteristics between the storage and reference cells. This can be resolved by periodically re-writing the same state into the reference cells, much like the refresh mechanism of DRAM. Functionally, this is unnecessary due to the non-volatile nature of MRAM, but it serves to ensure an equal rate of aging between the reference and storage cells.

The cell and the sense amplifier have already been described in the previous sections. We will now outline the design of the row and column circuitry outside of the array core.

Figure 3.12: Top level overall architecture of the test chip.

Figure 3.13: The proposed array with 2T1MTJ cells with current/voltage-based read schemes.

Figure 3.14: Array with 1T1MTJ cells and conventional read scheme.

Shift Registers

Due to pin restrictions, address and data bits are serially shifted in and out using shift registers. There is one flip-flop for each row in the address register and one flip-flop for each column in the data register. The address bit stream uses one-hot encoding. Data to be written is shifted in before a write operation, and the data read from memory is shifted out after a read operation.

WL Driver

The Wordline Driver is composed of individual slices, one per row. A single slice of the WL Driver is displayed in Fig. The inputs are the row select signal, which comes from the output of the shift register, and the external WL_en signal that gates it. The resulting signal is level shifted from 1.2V to 3.3V due to the need to drive high-voltage cell transistors. The Level Shifter (LS) is illustrated in Fig. After the signal has been boosted, it is passed through a series of buffers to drive the large capacitive load of the WL. The read operation requires an analog voltage of V_b on the WL. For easy adjustability during testing, we chose to simply provide V_b as an external voltage instead of using an on-chip voltage generation scheme. In the final stage of the buffer, the signal is branched out to be at either the V_DD or the V_b level. Transmission gates ensure that the correct output level is sent through depending on the mode of operation (r/w). The WL Driver for the conventional array is simpler, because the WL is always pulled to full V_DD regardless of read or write mode. Hence it does not have the V_b buffer and the transmission gates.
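The one-hot row addressing and WL_en gating described above can be modelled behaviourally as follows. This is a sketch under our own assumptions: the row count, the shift direction (new bits enter at index 0), and all function names are illustrative and are not taken from the test chip.

```python
N_ROWS = 128  # illustrative row count, not taken from the thesis


def one_hot(row, n_rows=N_ROWS):
    """One-hot pattern with a single 1 at index `row`."""
    if not 0 <= row < n_rows:
        raise ValueError("row out of range")
    return [1 if i == row else 0 for i in range(n_rows)]


def shift_in(register, bit):
    """One clock edge: the new bit enters at index 0, the last bit drops out."""
    return [bit] + register[:-1]


def load_address(row, n_rows=N_ROWS):
    """Serially load the register so that register[row] is the only 1."""
    register = [0] * n_rows
    # The first bit shifted in travels furthest, so send the last row first.
    for bit in reversed(one_hot(row, n_rows)):
        register = shift_in(register, bit)
    return register


def wordlines(register, wl_en):
    """Each WL Driver slice gates its row-select bit with the shared WL_en."""
    return [bit & wl_en for bit in register]


address_register = load_address(5)
active = wordlines(address_register, wl_en=1)
print(active.index(1), sum(active))
```

With WL_en low, no wordline is driven regardless of the register contents, which mirrors the role of WL_en as the signal that starts a read cycle.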

Figure 3.15: A single row of the WL Driver.

Figure 3.16: Schematic of the level shifter.

Table 3.1: The delay times resulting from each control input. Columns: ctrl[1:0], delay (ps).

Figure 3.17: Adjustable delay element is able to produce 4 delay settings during runtime.

Adjustable Delay

This block provides the necessary enable signals to the voltage-based sense amplifier. The circuit is shown in Fig. Depending on the 2-bit input control data (ctrl[1:0]), the output enable delayed can be varied. The transistors are sized to provide the amount of delay as listed in Table 3.1.

Figure 3.18: A column circuitry which contains the read and write components. This block is replicated for every column.

Column Circuitry

The column circuitry is replicated for every column in the array. The two proposed arrays utilize the same column circuitry, while the conventional array has its own. Fig shows the column circuitry of the proposed arrays. It contains the write driver, read decoder, and the probe-out scan. The column circuitry of the conventional array is very similar, but has a simpler read decoder because the SA outputs do not need to be adjusted. Each sub-block is explained in more detail below.

Figure 3.19: The read decoder (1) shifts the data along in shift mode and (2) captures the correct output polarity in read mode.

Read Decoder

The purpose of the read decoder is two-fold: (1) to shift the data along the register in shift mode and (2) to detect whether the accessed cell is in the bottom half of the array, and if so, reverse the polarity of the SA output. The circuit is shown in Fig. The signal bottom array access comes from the output of a dynamic OR gate attached to the bottom half of the address shift register. Since the read decoder of the conventional array does not require the check for location, SA_out is connected directly to the input of the 2:1 mux.
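The behaviour of the read decoder can be summarized in a short behavioural sketch. It assumes, for illustration only, that the bottom half of the array corresponds to the upper half of the register indices and that the polarity reversal is equivalent to an XOR with the bottom-array-access flag; the function names are ours.

```python
def bottom_array_access(address_register):
    """OR of the address bits that belong to the bottom half of the array.

    Assumption: the bottom half maps to the upper half of the list indices.
    """
    half = len(address_register) // 2
    return int(any(address_register[half:]))


def read_decoder(sa_out, data_prev, address_register, shift):
    """Next value of Data[n] for one column.

    shift=1: pass the neighbouring register bit along (shift mode).
    shift=0: capture the SA output, inverted for bottom-half accesses.
    """
    if shift:
        return data_prev
    return sa_out ^ bottom_array_access(address_register)


# Small 8-row example with one-hot addresses in each half.
top_row = [0, 0, 1, 0, 0, 0, 0, 0]
bottom_row = [0, 0, 0, 0, 0, 0, 1, 0]

print(read_decoder(1, 0, top_row, shift=0))
print(read_decoder(1, 0, bottom_row, shift=0))
```

For the conventional array the flag would simply be tied low, so the captured value always equals the SA output.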

Figure 3.20: A push-pull write driver. Device sizes (W/L) labelled in the figure: P1 20µm/0.36µm, P2 4µm/0.36µm, N1 2µm/0.36µm, N2 10µm/0.36µm.

Write Circuitry

The write driver used is a simple push-pull driver, as shown in Fig. Since anti-parallelizing writing requires a higher current to complete, the two transistors used to perform AP switching are larger than the two transistors used for P writing. The control circuit for the driver should only turn on P1 and N2 during read mode, while during write mode, either the P1/N2 or the P2/N1 pair should be activated depending on the data presented. Table 3.2 summarizes the specifications of the desired output signal (SL, BL) levels, given the input (r/w, Data) conditions. Fig shows the circuit implementation.

For the conventional array, the control circuit needs to be modified. During read, the BL is tied to ground, but the SL should be left floating by the write driver, because the voltage on SL (read voltage V_x) is controlled by the read circuit. Table 3.3 summarizes this new requirement and Fig shows the circuit implementation to achieve it.
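The control rule for the proposed array's write driver can be sketched as a small truth-table function. The read-mode behaviour (P1 and N2 on, with r/w=1 meaning read) follows the text; the mapping of Data=1 to the P1/N2 pair is an assumption made for illustration, as is the simplified description of the resulting line levels.

```python
def write_driver_control(rw, data):
    """Return the set of driver transistors that are ON.

    rw=1 is read mode (P1 and N2 on, per the text). In write mode (rw=0)
    the pair depends on the data; Data=1 -> P1/N2 is an assumed polarity.
    """
    if rw == 1 or data == 1:
        return {"P1", "N2"}
    return {"P2", "N1"}


def line_levels(on_devices):
    """Idealized SL/BL levels: P1 pulls SL high, P2 pulls BL high."""
    sl = "high" if "P1" in on_devices else "low"
    bl = "high" if "P2" in on_devices else "low"
    return sl, bl


for rw in (0, 1):
    for data in (0, 1):
        on = write_driver_control(rw, data)
        print(f"r/w={rw} Data={data}: ON={sorted(on)} (SL, BL)={line_levels(on)}")
```

Exactly one pull-up and one pull-down device are on in every case, so SL and BL are always driven to opposite levels and the two pairs never conduct at the same time.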

Table 3.2: How the SL and BL should be biased during read/write modes with given input data, for the proposed array. Columns: r/w, Data, SL, BL.

Figure 3.21: The write driver control circuit. Signals to the write driver (Fig. 3.20) are produced as specified in Table 3.2.

Table 3.3: How the SL and BL should be biased during read/write modes with given input data, for the conventional array. Columns: r/w, Data, SL, BL.

Figure 3.22: The write driver control circuit for the conventional array. Signals follow the specification of Table 3.3.

Figure 3.23: The probe out circuit allows direct access to the SL and BL lines externally.

Probe-out Scan

This block is inserted as a Design For Test (DFT) technique to assist in probing internal node voltages during the testing stage. This small additional block is attached to every column's BL and SL so that the voltages of BL and SL can be sensed externally during read and write operations. Measuring the correct voltages on BL and SL in a write cycle confirms the correct functionality of the write driver and its control circuit. During a read cycle, the block allows us to observe the behaviour of BL and SL as the read voltages develop. Fig shows the implementation. Much like the address register, there is a shift register to select the single column of interest. Only one bit in the register should be high at any one time. The selected column will have its transmission gates active to connect the BL and SL to BL_out and SL_out, respectively. BL_out and SL_out are lines connected to the I/O pads for external probing. Of course, due to the extra loading on BL_out and SL_out, we cannot accurately measure the transient behaviour of the internal circuitry as in normal operation. Rather, the block allows us to observe the final steady-state values on the BL and SL lines, which nonetheless provides insightful information.

3.7 Summary

This chapter presented the proposed design of a new 2T1MTJ cell, and the associated current-based and voltage-based read schemes for an STT-MRAM system. We explained how the cell and read schemes were designed to provide fast read access time, with the reference generated to be constantly halfway between the two possible cell levels to ensure reliable operation. While keeping the same cell size as the conventional cell design, and sacrificing slightly in write access time, the proposed cell offers better read speed and reliability. The conventional implementation was also described in this chapter, as well as all the peripheral components which make up the test chip.

Chapter 4

Measurement Results

4.1 Test Chip

The proposed design was fabricated in a test chip in a 1-Poly-7-Metal 90nm CMOS process technology by Fujitsu Laboratories. The MTJ layers are placed between Metal-6 and Metal-7 (the topmost metal layer). The total area of our 16kb array was 1.3mm × 0.9mm. There are two 16kb arrays implemented on the test chip, on opposite corners of the die, each composed of 4µm and 6µm access transistor cells. Fig. 4.1 is a detailed die photograph of the 16kb array. A cross-sectional TEM image of the MTJ layers, placed in between the standard CMOS layers, is pictured in Fig.

In this shuttle run, the Fujitsu MRAM Team provided us with a total of 18 dies, from 6 different lots. The die composition is listed in Table 4.1. Each lot has a different MTJ size, which exhibits varying electrical characteristics. The MTJ's average I_C switching values are measured to be 370µA for parallelizing and 950µA for anti-parallelizing. The nominal cell resistances are 960Ω for R_P and 1550Ω for R_AP.

Figure 4.1: Die photograph of the 16kb array.

Figure 4.2: A cross-sectional TEM image of the MTJ layers in between Metal-6 and Metal-7.

Table 4.1: Summary of all dies, with each lot's MTJ dimension and quantity. Columns: Lot #, Wafer #, MTJ size (nm × nm), # of dies.

Measurement Setup

The dies were packaged in a 120-pin Ceramic Quad Flat Pack (CQFP) package. Fig. 4.3 is a photograph of the open-chip package showing the die and bonding connections. The packaged chips are then inserted into a clamshell socket on our PCB testboard. The testboard is designed to interface with our main testing equipment, the production-level Verigy SoC Tester.

The following steps are taken to set up tests, as illustrated in Fig. First, we create the test vectors in RTL (Verilog) code. This file specifies the input signal waveforms and the expected output levels. Using an in-house script (created by D. Patel [40]), the Verilog behavioural code is translated into a binary vector form (.binl) that the Verigy (V93K) Tester can recognize. The Tester reads the binary vector file and generates the input electrical signals sent to the DUT (i.e. the test chip). The chip outputs the corresponding signals according to the input stimulus. These signals are sampled by the V93K and saved in memory. We may then view the output signals as transient waveforms.

The process described above is ideal for checking chip functionality. After it has

67 Chapter 4. Measurement Results 53 Figure 4.3: Photograph of the packaged test chip, with the lid open. Test.v Test.binl waveform initial { } module { } test chip verilog binary vector input signals output signals sampled bits Figure 4.4: The V93K test flow procedure. been confirmed that the chip is functional, we move onto performance-based testing. In this phase, it becomes tedious to visually check individual results. Therefore, using the capability of the V93K Tester, we create scripts to automate the entire flow outlined in Fig These scripts are written in C++ and can incorporate multiple tests in one run. A more detailed account of the measurement setup, including the testboard and the testing equipments, are outlined in Appendix A.

4.3 Read Performance

4.3.1 Current-based Read Array

The very first measurement to be made is the process yield of the MTJs. The MTJ yield across the die is not perfect: a small percentage of faulty MTJs, randomly spread throughout the array, are stuck in either the parallel or the anti-parallel state. Fig. 4.5 shows the MTJ yield map of the current-based array. Each box represents one cell. We define an individual MTJ as passing if, given sufficient access time for both operations, the write and read of both 0 and 1 are successful. An MTJ is failing if the write and read of either 0 or 1 is unsuccessful, given sufficient time for both operations. In this particular example (a die from lot-2), the absolute MTJ yield of the array is 97%.

Figure 4.5: MTJ yield map of the current-based read array. (White: pass, black: fail.)

We now measure the read yield by sweeping the length of the read access time, which was defined in an earlier figure. The read yield as a function of read access time is plotted in Fig. 4.6. The read yield is a measure of relative yield, discounting the bits that were determined to have faulty MTJs (as explained above). The read access time sweep is done by varying the time at which the CLK signal is triggered. In stand-alone memory systems or embedded memory modules, the clock is usually provided by the external system utilizing the memory. This is why CLK is the signal swept to determine access time. The minimum read access time is the point at which the read yield reaches its maximum. The measured results thus show that a minimum read access time of 6ns can be achieved. This result was consistently obtained from several different lots.

Figure 4.6: Read access time of the current-based read scheme. (Axes: read yield (%) versus read access time (ns).)

It is important to ensure that the read operation itself is not destructive. Read disturbance occurs if the act of reading alters the stored bit (such as in DRAM). We follow the measurement sequence outlined in Fig. 4.7. First, a long-duration write is executed to ensure the correct data is stored. Then, we execute the variable-length read operation twice, so that the second read can detect whether the first read was destructive. The read operation is considered a success only after the second read returns the correct data for both 0 and 1.

Figure 4.7: Measurement sequence for the read operation. (Read 0: Write 0, Read, Read. Read 1: Write 1, Read, Read. Each sequence is a long-duration write followed by two variable-duration reads.)
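The read-yield procedure described in this section (screen out faulty MTJs, sweep the read access time, judge each cell on the second of two reads, and take the minimum access time as the first point of maximum yield) can be sketched in a few lines of Python. This is only an illustrative model, not the C++ scripts run on the V93K: the `ToyCell` class, its timing behaviour, the 100ns "long" duration, and the example cell values are all assumptions made for the sketch. In particular, the toy cell never changes state when read, so the double read here only mirrors the sequence of Fig. 4.7 and does not model an actual disturbance.

```python
from typing import Dict, List, Optional

LONG_NS = 100.0  # assumed "sufficient" access time for the MTJ screening pass


class ToyCell:
    """Hypothetical cell: a read returns the stored bit only if it is long enough."""

    def __init__(self, t_read_min_ns: float, stuck: Optional[int] = None):
        self.t_read_min_ns = t_read_min_ns
        self.stuck = stuck                      # 0 or 1 for a faulty (stuck) MTJ
        self.bit = 0 if stuck is None else stuck

    def write(self, bit: int) -> None:
        if self.stuck is None:                  # a stuck MTJ ignores writes
            self.bit = bit

    def read(self, t_ns: float) -> int:
        # Toy assumption: a too-short read resolves to the wrong value.
        return self.bit if t_ns >= self.t_read_min_ns else 1 - self.bit


def read_passes(cell: ToyCell, t_ns: float) -> bool:
    """Write, read, read again for both 0 and 1; judge on the second read."""
    for bit in (0, 1):
        cell.write(bit)                         # long-duration write
        cell.read(t_ns)                         # first variable-length read
        if cell.read(t_ns) != bit:              # second read is the one checked
            return False
    return True


def read_yield_curve(cells: List[ToyCell], times_ns: List[float]) -> Dict[float, float]:
    """Relative read yield (%) per access time, discounting faulty MTJs."""
    good = [c for c in cells if read_passes(c, LONG_NS)]
    return {t: 100.0 * sum(read_passes(c, t) for c in good) / len(good)
            for t in times_ns}


def min_access_time(curve: Dict[float, float]) -> float:
    """Shortest access time at which the yield reaches its maximum."""
    best = max(curve.values())
    return min(t for t, y in curve.items() if y == best)


# Three working cells with different (made-up) read times and one stuck cell.
cells = [ToyCell(4), ToyCell(5), ToyCell(6), ToyCell(5, stuck=0)]
curve = read_yield_curve(cells, [3, 4, 5, 6, 7, 8])
t_min = min_access_time(curve)
print(curve, t_min)
```

Because the stuck cell is removed before the sweep, the curve saturates at 100% relative yield even though the absolute MTJ yield of this toy array is only 75%, which is the distinction the text draws between absolute and relative yield.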

Figure 4.8: Shmoo plot of the current-based read yield as a function of V_DD variation. (Axes: Logic V_DD (V) versus read access time (ns). Legend: max yield, >95% max yield, >90% max yield, <90% max yield.)

Figure 4.9: A 3D graph of the shmoo plot illustrating yield in terms of V_DD and read access time. (Axes: read yield (%), Logic V_DD (V), read access time (ns).)

The next test is to find the yield in the presence of V_DD variation. The Logic V_DD is varied in steps of 50mV, and the read yield as a function of access time is measured at each step. We sweep Logic V_DD because the read circuit primarily uses the 1.2V supply. Fig. 4.8 shows the resulting shmoo plot, obtained from chips in lot-6. This plot confirms a relatively consistent read access time across V_DD values. We also present a 3D plot of the same result in Fig. 4.9, with yield on the z-axis, to show the sharp roll-off near the point of minimum read access time.

The final test was with respect to temperature. The range tested was from 0°C to 85°C, which is the standard temperature rating for commercial products. Figure 4.10, obtained from lot-5, plots multiple read yield curves at various temperature settings on the same graph. If we cut the graph at the point of the maximum yield plateau, we obtain a plot of normalized yield (with respect to room temperature) as a function of the chip package temperature. It is shown in Fig. 4.11. At higher temperatures, the MTJ's thermal stability decreases, and thus the critical current becomes smaller [24]. Although this may help with the write operation, it exacerbates read disturbance failure, due to the decreased read margin. The overall observed result is that the yield of the array decreases at higher temperatures.

Strictly speaking, the temperature-based test results presented above are a combination of both write and read performance. During testing, the chip is heated or cooled to the specified temperature and both write and read are executed, one after the other. It is impractical to separate those operations in thermal testing, and there is little value in doing so, since the die temperature will almost always be the same between read and write operations in a high-speed memory system.
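The two post-processing steps used above, binning yield into the four shmoo legend classes and normalizing the maximum-yield plateau to room temperature, can be illustrated with a short Python sketch. The yield numbers are invented for the example, the function names are hypothetical, and the choice of the grid-wide maximum as the "max yield" reference and of 27°C as room temperature are assumptions, not details confirmed by the measurement scripts.

```python
from typing import Dict


def shmoo_bin(yield_pct: float, max_pct: float) -> str:
    """Map a yield value to the four legend bins used in the shmoo plots."""
    if yield_pct >= max_pct:
        return "max yield"
    if yield_pct > 0.95 * max_pct:
        return ">95% max yield"
    if yield_pct > 0.90 * max_pct:
        return ">90% max yield"
    return "<90% max yield"  # also catches a yield of exactly 90% of the maximum


def shmoo(grid: Dict[int, Dict[int, float]]) -> Dict[int, Dict[int, str]]:
    """Bin every (V_DD in mV, access time in ns) point against the grid maximum."""
    max_pct = max(y for row in grid.values() for y in row.values())
    return {v: {t: shmoo_bin(y, max_pct) for t, y in row.items()}
            for v, row in grid.items()}


def normalize_to_room(max_yield: Dict[int, float], room_c: int = 27) -> Dict[int, float]:
    """Express each temperature's plateau yield as a percentage of the room value."""
    ref = max_yield[room_c]
    return {temp: 100.0 * y / ref for temp, y in max_yield.items()}


# Made-up read yields (%) for three Logic V_DD settings in 50mV steps.
measured = {
    1150: {4: 40.0, 5: 92.0, 6: 100.0, 7: 100.0},
    1200: {4: 55.0, 5: 97.0, 6: 100.0, 7: 100.0},
    1250: {4: 60.0, 5: 98.0, 6: 100.0, 7: 100.0},
}
bins = shmoo(measured)

# Made-up plateau yields (%) at three package temperatures in degrees C.
relative = normalize_to_room({0: 97.0, 27: 97.0, 85: 48.5})
print(bins[1200], relative)
```

Storing V_DD in integer millivolts avoids floating-point keys such as 1.15 or 1.25, which makes the 50mV sweep steps exact.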

Figure 4.10: Read access yield data from multiple temperatures, appended on one graph. (Axes: yield (%) versus read access time (ns). Curves: 0°C, 5°C, 10°C, 15°C, 20°C, 27°C, 35°C, 40°C, 45°C, 50°C, 55°C, 60°C, 65°C, 70°C, 75°C, 80°C, 85°C.)

Figure 4.11: Maximum read yield of the current-based scheme as a function of temperature. (Axes: relative yield (%) versus DUT package temperature (°C).)

4.3.2 Voltage-based Read Array

Following the same order of presentation and using the same chips as tested for the current-based array, we first present the MTJ yield map of the voltage-based array in Fig. 4.12. It also exhibits an absolute yield of 97%. The legend is the same as described in the previous section.

Figure 4.12: MTJ yield map of the voltage-based read array. (White: pass, black: fail.)

The read yield as a function of read access time is plotted in Fig. 4.13. It is a measure of relative yield, discounting the bits that were determined to have faulty MTJs. The read access time sweep is done by varying the time at which the CLK signal is triggered. The measured results again indicate that the minimum read access time is 6ns.

Figure 4.13: Read access time of the voltage-based read scheme. (Axes: read yield (%) versus read access time (ns).)

The effect of V_DD fluctuation is tested and the results are shown in Fig. 4.14 in the form of a shmoo plot. The Logic V_DD (1.2V) is swept in steps of 50mV, and the read yield as a function of access time is measured in each case. This plot confirms a broad range of V_DD operating conditions (±0.1V) over which the 6ns read access time is preserved. To better visualize the results, a 3D plot is presented in Fig. 4.15.

Figure 4.14: Shmoo plot of the voltage-based read yield as a function of V_DD variation. (Axes: Logic V_DD (V) versus read access time (ns). Legend: max yield, >95% max yield, >90% max yield, <90% max yield.)

Figure 4.15: A 3D graph of the shmoo plot illustrating yield in terms of V_DD and read access time.

Finally, we show measurement results of yield with respect to chip package temperature. The same temperature range was used as before. Figure 4.16 plots multiple yield curves, at various temperature settings, on the same graph. By cutting the graph at the point of maximum yield, we obtain the plot of normalized yield as a function of temperature in Fig. 4.17. Similar to the current-based array, the yield is consistent up to 40°C but deteriorates in the high-end temperature range.

Figure 4.16: Read access yield data from multiple temperatures, appended on one graph. (Axes: yield (%) versus read access time (ns). Curves: 0°C, 5°C, 20°C, 27°C, 35°C, 40°C, 45°C, 50°C, 55°C, 60°C, 65°C, 70°C, 75°C, 80°C, 85°C.)

Figure 4.17: Maximum read yield of the voltage-based scheme as a function of temperature. (Axes: relative yield (%) versus DUT package temperature (°C).)

4.3.3 Conventional Read Array

We first present the yield map of the conventional array in Fig. 4.18, using the same lot as presented for the proposed arrays. What we immediately notice is the large number of failures. The high error rate cannot be attributed to MTJ yield, because the conventional array is located right next to the proposed arrays and we have seen that the MTJ yield is quite high, around 97%. Therefore, other factors must contribute to the failures.

Figure 4.18: Yield map of the conventional read array. (White: pass, black: fail.)

One apparent pattern in Fig. 4.18 is that the yield is primarily per-column based. Every column contains its own write and read circuitry. The write circuitry is identical in all three sub-arrays and has already been proven to work by the previous measurement results. This leads us to believe that the issue originates in the read circuitry. After further probing, the problem was pinpointed to two issues within the read circuit (repeated for convenience in Fig. 4.19).

Figure 4.19: The conventional read scheme, with critical matching pairs highlighted. (Schematic blocks: voltage generation and sense amp. Labelled devices and signals include M1 to M4, i_read, 2i_read, V_ref, V_x, R_CELL, R_P, R_AP, i+, i−, en and SA_out.)

The main issue comes from the input levels to the sense amplifier. The nominal reference voltage (V_ref) is 600mV, with AP and P cell voltages (V_x) of 700mV and 500mV, respectively. We suspect the voltage margin is too small for reliable operation under the influence of MTJ resistance variation. In the nominal case (i.e. R_P = 1kΩ, R_AP = 1.5kΩ), the input currents (i+ and i−) that steer the cross-coupled inverters differ by 12µA. However, simulations show that varying the MTJ resistance, for example R_P by ±100Ω, reduces the (i+ − i−) difference to a mere 4µA. This is not sufficient margin to ensure that the latch is steered in the correct direction.

Another issue comes from the implemented current-mirror structure, highlighted in Fig. 4.19. Any transistor mismatch within the two pairs can lead to a drastic loss in the V_x to V_ref voltage margin. For example, simulations confirm that a V_th mismatch of 10mV between M3 and M4 reduces the voltage margin from the nominal 100mV down to 53mV. Due to a tight pitch requirement, the layout of the four transistors did not use matching techniques such as interdigitation.

This particular design of the conventional read scheme was ported directly from a previous project within the group. At the time of tape-out, the previous test chip had not yet been tested, so we were unaware of any issues with the design. The problems related to this implementation were only discovered during chip testing.

Due to the issues reported above, the conventional array read scheme cannot be measured effectively. The observed yield varied from 0% to 48% from lot to lot. With sub-optimal operating conditions for the read circuit, the results would not be indicative of its full performance potential. Thus, no reliable conventional-scheme read data could be obtained from this test chip.

4.4 Write Performance

4.4.1 Proposed Array

For write measurements, the sequence of operations is outlined in Fig. 4.20. To ensure that previously stored data does not give false pass results, the first step is to execute a long-duration write of the opposite data. Then, the main test of a variable-length write is performed, followed by a long-duration read to confirm the second write.

Figure 4.20: Measurement sequence for the write operation. (Write 0: Write 1, Write 0, Read. Write 1: Write 0, Write 1, Read. Each sequence is a long-duration write, a variable-duration write, and a long-duration read.)

The write driver used is a standard push-pull driver, shown in an earlier figure. The write access time is simply defined as the pulse width of the WL_en signal. When WL is turned on, maximum current flows through the cell in either direction, depending on the input data.

Fig. 4.21(a) and 4.21(b) show the write yield as a function of write access time. We show two separate graphs for the write of 0 and 1, due to the asymmetrical switching requirements of the MTJ. Write yield is a measure of relative yield, discounting the cells with faulty MTJs due to process error. Note that all measurement results in this section are obtained from lot-3. The minimum write access time is the point at which the write yield reaches its maximum. Thus, in the measurement plot, the write-0 minimum access time is 6ns, while the write-1 minimum access time is 10ns.

Similar to the V_DD variation test run in the read measurements, the write operation was also tested against such fluctuations. For write, the only power supply of interest is the Core V_DD (3.3V), which is used by the write circuit. Fig. 4.22(a) and 4.22(b) show the shmoo plots of the write yield as a function of Core V_DD variation.

4.4.2 Conventional Array

The write circuitry in the conventional array is the same as that used in the proposed arrays. Since it has been proven to work in the proposed arrays, we can safely assume functionality in this array as well. The only difference here is that the access transistor is larger in the conventional 1T1MTJ cells. Therefore, we expect faster write access times, as hypothesized earlier in our design stage.

Fig. 4.23 shows the write access times of the conventional array from the same chips as tested in the previous section. Again, two separate graphs are presented, Fig. 4.23(a) for the write of 0 and Fig. 4.23(b) for the write of 1. The write yield is a measure of relative yield. Knowing there is a major issue with the conventional read circuit, we must discount the problematic columns, as well as the faulty MTJ cells, in determining the yield due strictly to the write operation. However, whatever the absolute value of the write yield may be, the knee point at which the yield reaches its maximum is the same. As Fig. 4.23 illustrates, the minimum write access time is 6ns for a parallel (0) write and 10ns for an anti-parallel (1) write.

Voltage variation is the final step in testing the conventional write performance. Fig. 4.24(a) and 4.24(b) show the effect of V_DD variation on the write-0 and write-1 yield. The results are similar to those from the proposed array using 2T1MTJ cells (Fig. 4.22).
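The write measurement sequence of Fig. 4.20 and the knee-point definition of minimum write access time can be sketched as follows. As with any toy model, the `ToyMTJ` class, the per-cell switching times, and the 100ns "long" pulse are assumptions chosen only so that the example reproduces the 6ns and 10ns knees reported above; the sketch also omits the step of discounting faulty cells and columns.

```python
from typing import Dict, Iterable, List

LONG_NS = 100.0  # assumed "long duration" pulse width


class ToyMTJ:
    """Hypothetical cell: a write pulse flips the state only if it is long enough."""

    def __init__(self, t_write0_ns: float, t_write1_ns: float):
        self.t_min = {0: t_write0_ns, 1: t_write1_ns}  # asymmetric switching times
        self.bit = 0

    def write(self, bit: int, pulse_ns: float) -> None:
        if pulse_ns >= self.t_min[bit]:
            self.bit = bit

    def read(self) -> int:
        return self.bit


def write_passes(cell: ToyMTJ, bit: int, pulse_ns: float) -> bool:
    cell.write(1 - bit, LONG_NS)   # step 1: long write of the opposite data
    cell.write(bit, pulse_ns)      # step 2: variable-length write under test
    return cell.read() == bit      # step 3: long read to confirm the second write


def write_yield_curve(cells: List[ToyMTJ], bit: int,
                      pulses_ns: Iterable[float]) -> Dict[float, float]:
    """Write yield (%) for one data polarity as a function of pulse width."""
    return {p: 100.0 * sum(write_passes(c, bit, p) for c in cells) / len(cells)
            for p in pulses_ns}


def knee(curve: Dict[float, float]) -> float:
    """Shortest pulse width at which the yield reaches its maximum."""
    best = max(curve.values())
    return min(p for p, y in curve.items() if y == best)


# Made-up cells: write-0 (parallelizing) is faster than write-1 (anti-parallelizing).
cells = [ToyMTJ(4, 8), ToyMTJ(5, 9), ToyMTJ(6, 10)]
pulses = list(range(2, 13))
knee0 = knee(write_yield_curve(cells, 0, pulses))
knee1 = knee(write_yield_curve(cells, 1, pulses))
write_access_ns = max(knee0, knee1)  # the slower polarity sets the write time
print(knee0, knee1, write_access_ns)
```

Taking the maximum of the two knees reflects why the chip's write access time is quoted as 10ns: a write cycle must be long enough for the slower, anti-parallelizing transition.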

Figure 4.21: Write access times of the proposed array. (a) Write 0. (b) Write 1. (Axes: write yield (%) versus write access time (ns).)

Figure 4.22: Shmoo plot of write yield of the proposed array as a function of V_DD variation. (a) Write 0. (b) Write 1. (Axes: Core V_DD (V) versus write access time (ns). Legend: max yield, >95% max yield, >90% max yield, <90% max yield.)

Figure 4.23: Write access times of the conventional array. (a) Write 0. (b) Write 1. (Axes: write yield (%) versus write access time (ns).)

Figure 4.24: Shmoo plot of write yield of the conventional array as a function of V_DD variation. (a) Write 0. (b) Write 1. (Axes: Core V_DD (V) versus write access time (ns). Legend: max yield, >95% max yield, >90% max yield, <90% max yield.)

4.5 Comparison Against Conventional

The proposed work is first compared against the conventional scheme implemented on chip. Then, we compare the performance of the proposed design against the most recently published works in this area.

First, we compare the write access times of the proposed and conventional arrays, to determine the impact of the smaller access transistor in the 2T1MTJ cell. Fig. 4.25 plots the write access times of the proposed array together with those of the conventional array. It reveals that both arrays reach maximum yield at the same point. This means the difference in minimum write access time between the proposed design and the conventional design is negligible. Below the minimum write access time, there is a noticeable difference in yield in favour of the conventional array. However, the performance in this region is irrelevant, because the memory array should not be operated there. The write time used by the memory system should be at least the minimum access time, to guarantee successful writes.

Figure 4.25: Comparison of the write access times of the proposed array (circle) against the conventional array (square). (a) Write 0. (b) Write 1. (Axes: write yield (%) versus write access time (ns).)

Section 4.3.3 explained in detail why the conventional read failed to perform. Therefore, we cannot make any comparisons to the on-chip conventional read scheme.

We now compare our proposed work against other published works. Numerous papers have been published on the topic of STT MRAM. However, many are proposed ideas with only simulation results. We are interested in works in which the designs have been fabricated and proven through measurement results. The recently published papers on STT MRAM with fabricated results are listed in Table 4.2, which states the critical factors of each work. This work demonstrates a 6ns read access time, which is a 25% improvement over the previous benchmark. The 10ns write access time also matches the fastest reported to date. The cell size is relatively large, but this is due to the MTJ switching current required to achieve the write time specification. We expect continued advancements in MTJ technology to help improve the density in the near future. They will also allow us to migrate to lower voltages for the array core.

Although the overall chip capacity in other works is larger, the sub-array size is the more important parameter for comparison purposes, since the total array is a direct tiling of the sub-arrays. For our test chip, we simply did not have the area to implement a 256kb array, and high capacity was not the goal of this thesis.

Due to the different CMOS and MTJ technologies used in each work, it is difficult to make direct comparisons. In an effort to normalize the reported values of this work against others such as [15], simulations were run for the proposed design in a 256kb array configuration with a 1.2V power supply. The simulated read access time was 50% longer than that of the implemented array, increasing from 5ns to 7.5ns. Therefore, the measured read access time of the proposed design, when scaled to a 256kb, 1.2V system, is projected to increase by the same factor, from 6ns to 9ns. This is still faster than [13-15]. The result stems from the fact that the 2T1MTJ cell design always presents a smaller load during the read operation than the conventional scheme.

The proposed design can be summarized as having achieved a faster read access time without loss of write access time, in comparison to the conventional scheme.

4.6 Summary

In this chapter, we presented the full measurement results of the test chip. The read access time and the yield under V_DD and temperature variations were tested for the proposed current-based and voltage-based read schemes. The write access times were also measured for the proposed arrays and the conventional array. The faulty components of the test chip were explained in detail. The results of the proposed design were compared against the on-chip conventional array as well as recently published works. The highlight of this work is that it achieves the fastest read access time published to date, while maintaining the previously benchmarked write access time.

Table 4.2: Comparison against the most recently published works in STT MRAM.

                       ISSCC '07 [13]   VLSI '09 [14]   ISSCC '10 [15]   ISSCC '10 [33]           This work
MTJ type               i-STT            i-STT           p-STT            i-STT                    i-STT
Critical current       200µA            300µA           49µA             870µA                    950µA
Read time              40ns             32ns            11ns             8ns                      6ns
Write time             100ns            40ns            30ns             10ns                     10ns
Cell size (µm²)
Power supply           1.8V             1.8V            1.2V             3.3V core, 1.2V logic    3.3V core, 1.2V logic
Capacity (sub-array)   2Mb (256kb)      32Mb (256kb)    64Mb (256kb)     16kb                     16kb
CMOS technology        0.2µm            0.15µm          65nm             0.13µm                   90nm

i-STT = in-plane STT, p-STT = perpendicular STT
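The headline figures quoted in the comparison follow from simple arithmetic on the Table 4.2 entries and the 256kb simulation result. The short Python check below restates that arithmetic; the variable names are illustrative, and the only inputs are the read times listed in the table and the 5ns to 7.5ns simulated scaling described in Section 4.5.

```python
# Read access times (ns) as listed in Table 4.2.
read_ns = {"[13]": 40.0, "[14]": 32.0, "[15]": 11.0, "[33]": 8.0, "this work": 6.0}

# Improvement over the fastest previously published read time.
previous_best = min(t for ref, t in read_ns.items() if ref != "this work")
improvement_pct = 100.0 * (previous_best - read_ns["this work"]) / previous_best

# Scaling factor from the 256kb, 1.2V simulation (5ns grows to 7.5ns),
# applied to the measured 6ns read access time.
scale = 7.5 / 5.0
projected_ns = read_ns["this work"] * scale

# The projection is compared against the 256kb sub-array works [13] to [15].
faster_than_256kb_works = all(projected_ns < read_ns[ref] for ref in ("[13]", "[14]", "[15]"))

print(improvement_pct, scale, projected_ns, faster_than_256kb_works)
```

The projection assumes that the measured access time scales by the same 1.5 factor as the simulated one, which is the assumption stated in the text rather than a separately measured result.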

Chapter 5

Conclusion and Future Directions

This thesis presented new designs to help achieve fast read access times in STT MRAM. The proposed ideas included a novel 2T1MTJ cell, along with current-based and voltage-based read schemes. Together, the designs are capable of realizing a shorter and more reliable read operation than the conventional read scheme. An entire memory system, including the peripheral circuits, was designed and implemented in Fujitsu's 90nm CMOS process and MTJ technology. The measured results from the test chip show that a read access time of 6ns is possible without sacrificing the write access time. The testing incorporated voltage and temperature variations to determine the proposed scheme's reliability under different operating conditions. When evaluated against other state-of-the-art published works, the cell sizes are larger due to technology limitations, but the circuit design techniques allow the read and write access times to compare very well.

5.1 Contributions

The contributions made in this thesis to the STT MRAM area of research include:

• Proposal of a 2-Transistor-1-MTJ (2T1MTJ) cell. The cell is comparable in size to the conventional 1T1MTJ cell: the main access transistor is made slightly smaller to accommodate the additional read transistor. As the theoretical prediction and measured results show, this sacrifice has a negligible effect on write access time.

• The current-based and voltage-based read schemes, designed specifically for the 2T1MTJ cell, with speed and reliability as the objectives. The read schemes have been shown to provide an access time of 6ns. In both cases, the reference generation method ensures that the reference current or voltage stays midway between the two states.

• The 6ns read access time measured in this work is faster than that of any STT MRAM system published thus far. The 10ns write access time also matches the fastest reported to date.

5.2 Future Directions

There are several avenues of improvement for future research. The most important is continued advancement in MTJ technology. A reduction in the MTJ critical current directly translates to smaller cell sizes and higher density. Furthermore, it allows migration to low-voltage transistors and thus to more power savings. At the very least, for the particular design in this thesis, using all 1.2V transistors would mean a significantly smaller footprint for the second transistor (relative to the main access transistor), because the minimum-sized 3.3V device is 4.5× larger than the 1.2V device in this technology. The read access time will also improve with smaller cells, due to smaller capacitive loads on the column lines.

The addition of the second transistor opens the door to many new architectures. In this thesis, we have focused exclusively on designs to achieve lower read access time. However, the 2T1MTJ cell topology offers the flexibility to pursue innovative designs for other merits, such as write schemes and power reduction.

Finally, we note that the MTJ model used in this work was quite primitive. It simply modeled the DC resistance and the static critical current values for a 10ns pulse. This leads to a pessimistic design methodology that accounts for worst-case scenarios. With the help of an accurate, dynamic model, we may better design the system to account for the dynamic characteristics of the MTJ. This will lead to smaller, more flexible, and smarter design solutions.

Appendix A

Test Setup

A.1 Testboard

Fig. A.1 is a photograph of our PCB testboard. To ensure constant power supplies without peaks or dips, V_DD-3.3, V_DD-1.2, and V_WL have been connected with decoupling capacitors of 100µF, 220µF, and 100µF, respectively. In addition, under the socket (Fig. A.2), which is the point on the testboard closest to the chip itself, every power supply pin has been shunted with a smaller 10nF decoupling capacitor. We can also observe the voltage on every pin with a real-time oscilloscope by probing the break-out points on the testboard. This is especially helpful for observing analog outputs, as well as for confirming that the correct input signals are fed into the chip.

A.2 Equipment

The Verigy (formerly Agilent) V93K SoC Tester is the main piece of equipment used for testing. The V93K is an industrial, production-level tester provided by Canadian Microelectronics Corporation, capable of running fully automated tests. It has 480 digital data channels, with a serial data rate of up to 1.25Gbits/s. There are also 3 RF channels, capable of handling signal frequencies from 10MHz to 7.5GHz.

Figure A.1: Photograph of the testboard.

Figure A.2: Photograph of the underside of the testboard where additional decoupling caps are soldered.

Figure A.3: Measurement setup for temperature testing. (Labelled parts: thermal forcing unit, sensor, package, socket, die.)

Another critical piece of equipment used during the debugging stage was a real-time oscilloscope. The Tektronix MSO4104 has 4 input channels with 1GHz bandwidth. We probe the pins of the chip to obtain more accurate signal levels than the V93K Tester is able to capture.

Temperature-based testing was performed using the Temptronic X-Stream. It is capable of producing ambient air temperatures ranging from -50°C to +250°C. Fig. A.3 illustrates how the temperature measurement test is set up. The Thermal Forcing Unit (TFU) blows dry compressed air into its chamber, which encloses the chip socket. A small metal tip attached to the top of the chip feeds the chip temperature back to the TFU, which maintains the DUT at the desired temperature. Because the chamber completely encloses the socket, very little air escapes outside the testboard. Fig. A.4 is a photograph of the physical setup, in which the TFU is docked on top of the testboard and the V93K tester.

Figure A.4: Photograph of the testing setup using the TFU.


More information

CHAPTER 6 DIGITAL CIRCUIT DESIGN USING SINGLE ELECTRON TRANSISTOR LOGIC

CHAPTER 6 DIGITAL CIRCUIT DESIGN USING SINGLE ELECTRON TRANSISTOR LOGIC 94 CHAPTER 6 DIGITAL CIRCUIT DESIGN USING SINGLE ELECTRON TRANSISTOR LOGIC 6.1 INTRODUCTION The semiconductor digital circuits began with the Resistor Diode Logic (RDL) which was smaller in size, faster

More information

On Chip Active Decoupling Capacitors for Supply Noise Reduction for Power Gating and Dynamic Dual Vdd Circuits in Digital VLSI

On Chip Active Decoupling Capacitors for Supply Noise Reduction for Power Gating and Dynamic Dual Vdd Circuits in Digital VLSI ELEN 689 606 Techniques for Layout Synthesis and Simulation in EDA Project Report On Chip Active Decoupling Capacitors for Supply Noise Reduction for Power Gating and Dynamic Dual Vdd Circuits in Digital

More information

A Read-Decoupled Gated-Ground SRAM Architecture for Low-Power Embedded Memories

A Read-Decoupled Gated-Ground SRAM Architecture for Low-Power Embedded Memories A Read-Decoupled Gated-Ground SRAM Architecture for Low-Power Embedded Memories Wasim Hussain A Thesis In The Department of Electrical and Computer Engineering Presented in Partial Fulfillment of the Requirements

More information

In pursuit of high-density storage class memory

In pursuit of high-density storage class memory Edition October 2017 Semiconductor technology & processing In pursuit of high-density storage class memory A novel thermally stable GeSe-based selector paves the way to storage class memory applications.

More information

64 Kb logic RRAM chip resisting physical and side-channel attacks for encryption keys storage

64 Kb logic RRAM chip resisting physical and side-channel attacks for encryption keys storage 64 Kb logic RRAM chip resisting physical and side-channel attacks for encryption keys storage Yufeng Xie a), Wenxiang Jian, Xiaoyong Xue, Gang Jin, and Yinyin Lin b) ASIC&System State Key Lab, Dept. of

More information

Highly Reliable Memory-based Physical Unclonable Function Using Spin-Transfer Torque MRAM

Highly Reliable Memory-based Physical Unclonable Function Using Spin-Transfer Torque MRAM Highly Reliable Memory-based Physical Unclonable Function Using Spin-Transfer Torque MRAM Le Zhang 1, Xuanyao Fong 2, Chip-Hong Chang 1, Zhi Hui Kong 1, Kaushik Roy 2 1 School of EEE, Nanyang Technological

More information

Lecture #29. Moore s Law

Lecture #29. Moore s Law Lecture #29 ANNOUNCEMENTS HW#15 will be for extra credit Quiz #6 (Thursday 5/8) will include MOSFET C-V No late Projects will be accepted after Thursday 5/8 The last Coffee Hour will be held this Thursday

More information

SRAM Read-Assist Scheme for Low Power High Performance Applications

SRAM Read-Assist Scheme for Low Power High Performance Applications SRAM Read-Assist Scheme for Low Power High Performance Applications Ali Valaee A Thesis In the Department of Electrical and Computer Engineering Presented in Partial Fulfillment of the Requirements for

More information

Non-Volatile Memory Characterization and Measurement Techniques

Non-Volatile Memory Characterization and Measurement Techniques Non-Volatile Memory Characterization and Measurement Techniques Alex Pronin Keithley Instruments, Inc. 1 2012-5-21 Why do more characterization? NVM: Floating gate Flash memory Very successful; lead to

More information

UNIT-II LOW POWER VLSI DESIGN APPROACHES

UNIT-II LOW POWER VLSI DESIGN APPROACHES UNIT-II LOW POWER VLSI DESIGN APPROACHES Low power Design through Voltage Scaling: The switching power dissipation in CMOS digital integrated circuits is a strong function of the power supply voltage.

More information

Semiconductor Memory: DRAM and SRAM. Department of Electrical and Computer Engineering, National University of Singapore

Semiconductor Memory: DRAM and SRAM. Department of Electrical and Computer Engineering, National University of Singapore Semiconductor Memory: DRAM and SRAM Outline Introduction Random Access Memory (RAM) DRAM SRAM Non-volatile memory UV EPROM EEPROM Flash memory SONOS memory QD memory Introduction Slow memories Magnetic

More information

Journal of Electron Devices, Vol. 20, 2014, pp

Journal of Electron Devices, Vol. 20, 2014, pp Journal of Electron Devices, Vol. 20, 2014, pp. 1786-1791 JED [ISSN: 1682-3427 ] ANALYSIS OF GIDL AND IMPACT IONIZATION WRITING METHODS IN 100nm SOI Z-DRAM Bhuwan Chandra Joshi, S. Intekhab Amin and R.

More information

Design and Implement of Low Power Consumption SRAM Based on Single Port Sense Amplifier in 65 nm

Design and Implement of Low Power Consumption SRAM Based on Single Port Sense Amplifier in 65 nm Journal of Computer and Communications, 2015, 3, 164-168 Published Online November 2015 in SciRes. http://www.scirp.org/journal/jcc http://dx.doi.org/10.4236/jcc.2015.311026 Design and Implement of Low

More information

Low Power, Area Efficient FinFET Circuit Design

Low Power, Area Efficient FinFET Circuit Design Low Power, Area Efficient FinFET Circuit Design Michael C. Wang, Princeton University Abstract FinFET, which is a double-gate field effect transistor (DGFET), is more versatile than traditional single-gate

More information

Low Power Design of Successive Approximation Registers

Low Power Design of Successive Approximation Registers Low Power Design of Successive Approximation Registers Rabeeh Majidi ECE Department, Worcester Polytechnic Institute, Worcester MA USA rabeehm@ece.wpi.edu Abstract: This paper presents low power design

More information

Design and Evaluation of two MTJ-Based Content Addressable Non-Volatile Memory Cells

Design and Evaluation of two MTJ-Based Content Addressable Non-Volatile Memory Cells Design and Evaluation of two MTJ-Based Content Addressable Non-Volatile Memory Cells Ke Chen, Jie Han and Fabrizio Lombardi Abstract This paper proposes two non-volatile content addressable memory (CAM)

More information

DESIGN OF A NOVEL CURRENT MIRROR BASED DIFFERENTIAL AMPLIFIER DESIGN WITH LATCH NETWORK. Thota Keerthi* 1, Ch. Anil Kumar 2

DESIGN OF A NOVEL CURRENT MIRROR BASED DIFFERENTIAL AMPLIFIER DESIGN WITH LATCH NETWORK. Thota Keerthi* 1, Ch. Anil Kumar 2 ISSN 2277-2685 IJESR/October 2014/ Vol-4/Issue-10/682-687 Thota Keerthi et al./ International Journal of Engineering & Science Research DESIGN OF A NOVEL CURRENT MIRROR BASED DIFFERENTIAL AMPLIFIER DESIGN

More information

Design of Sub-10-Picoseconds On-Chip Time Measurement Circuit

Design of Sub-10-Picoseconds On-Chip Time Measurement Circuit Design of Sub-0-Picoseconds On-Chip Time Measurement Circuit M.A.Abas, G.Russell, D.J.Kinniment Dept. of Electrical and Electronic Eng., University of Newcastle Upon Tyne, UK Abstract The rapid pace of

More information

A Comparative Simulation Study of Four Multilevel DRAMs

A Comparative Simulation Study of Four Multilevel DRAMs A Comparative Simulation Study of Four Multilevel DRAMs Gershom Birk, Duncan Elliott, Bruce Cockburn Department of Electrical & Computer Engineering University of Alberta Edmonton, Alberta, Canada Outline

More information

Domino Static Gates Final Design Report

Domino Static Gates Final Design Report Domino Static Gates Final Design Report Krishna Santhanam bstract Static circuit gates are the standard circuit devices used to build the major parts of digital circuits. Dynamic gates, such as domino

More information

INTEGRATED CIRCUITS. AN109 Microprocessor-compatible DACs Dec

INTEGRATED CIRCUITS. AN109 Microprocessor-compatible DACs Dec INTEGRATED CIRCUITS 1988 Dec DAC products are designed to convert a digital code to an analog signal. Since a common source of digital signals is the data bus of a microprocessor, DAC circuits that are

More information

Static Power and the Importance of Realistic Junction Temperature Analysis

Static Power and the Importance of Realistic Junction Temperature Analysis White Paper: Virtex-4 Family R WP221 (v1.0) March 23, 2005 Static Power and the Importance of Realistic Junction Temperature Analysis By: Matt Klein Total power consumption of a board or system is important;

More information

A Novel Low-Power Scan Design Technique Using Supply Gating

A Novel Low-Power Scan Design Technique Using Supply Gating A Novel Low-Power Scan Design Technique Using Supply Gating S. Bhunia, H. Mahmoodi, S. Mukhopadhyay, D. Ghosh, and K. Roy School of Electrical and Computer Engineering, Purdue University, West Lafayette,

More information

[Vivekanand*, 4.(12): December, 2015] ISSN: (I2OR), Publication Impact Factor: 3.785

[Vivekanand*, 4.(12): December, 2015] ISSN: (I2OR), Publication Impact Factor: 3.785 IJESRT INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY DESIGN AND IMPLEMENTATION OF HIGH RELIABLE 6T SRAM CELL V.Vivekanand*, P.Aditya, P.Pavan Kumar * Electronics and Communication

More information

PMOS-based Integrated Charge Pumps with Extended Voltage Range in Standard CMOS Technology

PMOS-based Integrated Charge Pumps with Extended Voltage Range in Standard CMOS Technology PMOS-based Integrated Charge Pumps with Extended Voltage Range in Standard CMOS Technology by Jingqi Liu A Thesis presented to The University of Guelph In partial fulfillment of requirements for the degree

More information

A REVIEW ON MAGNETIC TUNNEL JUNCTION TECHNOLOGY

A REVIEW ON MAGNETIC TUNNEL JUNCTION TECHNOLOGY A REVIEW ON MAGNETIC TUNNEL JUNCTION TECHNOLOGY Pawan Choudhary 1, Dr. Kanika Sharma 2, Sagar Balecha 3, Bhaskar Mishra 4 1 M.E Scholar, Electronics & Communication Engineering, National Institute of Technical

More information

A Novel Radiation Tolerant SRAM Design Based on Synergetic Functional Component Separation for Nanoscale CMOS.

A Novel Radiation Tolerant SRAM Design Based on Synergetic Functional Component Separation for Nanoscale CMOS. A Novel Radiation Tolerant SRAM Design Based on Synergetic Functional Component Separation for Nanoscale CMOS. Abstract This paper presents a novel SRAM design for nanoscale CMOS. The new design addresses

More information

UMAINE ECE Morse Code ROM and Transmitter at ISM Band Frequency

UMAINE ECE Morse Code ROM and Transmitter at ISM Band Frequency UMAINE ECE Morse Code ROM and Transmitter at ISM Band Frequency Jamie E. Reinhold December 15, 2011 Abstract The design, simulation and layout of a UMAINE ECE Morse code Read Only Memory and transmitter

More information

Supplementary Figure 1 High-resolution transmission electron micrograph of the

Supplementary Figure 1 High-resolution transmission electron micrograph of the Supplementary Figure 1 High-resolution transmission electron micrograph of the LAO/STO structure. LAO/STO interface indicated by the dotted line was atomically sharp and dislocation-free. Supplementary

More information

Reading. Lecture 17: MOS transistors digital. Context. Digital techniques:

Reading. Lecture 17: MOS transistors digital. Context. Digital techniques: Reading Lecture 17: MOS transistors digital Today we are going to look at the analog characteristics of simple digital devices, 5. 5.4 And following the midterm, we will cover PN diodes again in forward

More information

NOVEMBER 28, 2016 COURSE PROJECT: CMOS SWITCHING POWER SUPPLY EE 421 DIGITAL ELECTRONICS ERIC MONAHAN

NOVEMBER 28, 2016 COURSE PROJECT: CMOS SWITCHING POWER SUPPLY EE 421 DIGITAL ELECTRONICS ERIC MONAHAN NOVEMBER 28, 2016 COURSE PROJECT: CMOS SWITCHING POWER SUPPLY EE 421 DIGITAL ELECTRONICS ERIC MONAHAN 1.Introduction: CMOS Switching Power Supply The course design project for EE 421 Digital Engineering

More information

A Novel Technique to Reduce Write Delay of SRAM Architectures

A Novel Technique to Reduce Write Delay of SRAM Architectures A Novel Technique to Reduce Write Delay of SRAM Architectures SWAPNIL VATS AND R.K. CHAUHAN * Department of Electronics and Communication Engineering M.M.M. Engineering College, Gorahpur-73 010, U.P. INDIA

More information

Chapter 6 Combinational CMOS Circuit and Logic Design. Jin-Fu Li Department of Electrical Engineering National Central University Jungli, Taiwan

Chapter 6 Combinational CMOS Circuit and Logic Design. Jin-Fu Li Department of Electrical Engineering National Central University Jungli, Taiwan Chapter 6 Combinational CMOS Circuit and Logic Design Jin-Fu Li Department of Electrical Engineering National Central University Jungli, Taiwan Outline Advanced Reliable Systems (ARES) Lab. Jin-Fu Li,

More information

電子電路. Memory and Advanced Digital Circuits

電子電路. Memory and Advanced Digital Circuits 電子電路 Memory and Advanced Digital Circuits Hsun-Hsiang Chen ( 陳勛祥 ) Department of Electronic Engineering National Changhua University of Education Email: chenhh@cc.ncue.edu.tw Spring 2010 2 Reference Microelectronic

More information

CHAPTER 3 NEW SLEEPY- PASS GATE

CHAPTER 3 NEW SLEEPY- PASS GATE 56 CHAPTER 3 NEW SLEEPY- PASS GATE 3.1 INTRODUCTION A circuit level design technique is presented in this chapter to reduce the overall leakage power in conventional CMOS cells. The new leakage po leepy-

More information

Design and analysis of 6T SRAM cell using FINFET at Nanometer Regime Monali S. Mhaske 1, Prof. S. A. Shaikh 2

Design and analysis of 6T SRAM cell using FINFET at Nanometer Regime Monali S. Mhaske 1, Prof. S. A. Shaikh 2 Design and analysis of 6T SRAM cell using FINFET at Nanometer Regime Monali S. Mhaske 1, Prof. S. A. Shaikh 2 1 ME, Dept. Of Electronics And Telecommunication,PREC, Maharashtra, India 2 Associate Professor,

More information

Variation-tolerant Non-volatile Ternary Content Addressable Memory with Magnetic Tunnel Junction

Variation-tolerant Non-volatile Ternary Content Addressable Memory with Magnetic Tunnel Junction JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE, VOL.17, NO.3, JUNE, 2017 ISSN(Print) 1598-1657 https://doi.org/10.5573/jsts.2017.17.3.458 ISSN(Online) 2233-4866 Variation-tolerant Non-volatile Ternary

More information

Performance Comparison of CMOS and Finfet Based Circuits At 45nm Technology Using SPICE

Performance Comparison of CMOS and Finfet Based Circuits At 45nm Technology Using SPICE RESEARCH ARTICLE OPEN ACCESS Performance Comparison of CMOS and Finfet Based Circuits At 45nm Technology Using SPICE Mugdha Sathe*, Dr. Nisha Sarwade** *(Department of Electrical Engineering, VJTI, Mumbai-19)

More information

LSI and Circuit Technologies for the SX-8 Supercomputer

LSI and Circuit Technologies for the SX-8 Supercomputer LSI and Circuit Technologies for the SX-8 Supercomputer By Jun INASAKA,* Toshio TANAHASHI,* Hideaki KOBAYASHI,* Toshihiro KATOH,* Mikihiro KAJITA* and Naoya NAKAYAMA This paper describes the LSI and circuit

More information

Low-Voltage Wide Linear Range Tunable Operational Transconductance Amplifier

Low-Voltage Wide Linear Range Tunable Operational Transconductance Amplifier Low-Voltage Wide Linear Range Tunable Operational Transconductance Amplifier A dissertation submitted in partial fulfillment of the requirement for the award of degree of Master of Technology in VLSI Design

More information

UNIT-III POWER ESTIMATION AND ANALYSIS

UNIT-III POWER ESTIMATION AND ANALYSIS UNIT-III POWER ESTIMATION AND ANALYSIS In VLSI design implementation simulation software operating at various levels of design abstraction. In general simulation at a lower-level design abstraction offers

More information

PROCESS-VOLTAGE-TEMPERATURE (PVT) VARIATIONS AND STATIC TIMING ANALYSIS

PROCESS-VOLTAGE-TEMPERATURE (PVT) VARIATIONS AND STATIC TIMING ANALYSIS PROCESS-VOLTAGE-TEMPERATURE (PVT) VARIATIONS AND STATIC TIMING ANALYSIS The major design challenges of ASIC design consist of microscopic issues and macroscopic issues [1]. The microscopic issues are ultra-high

More information

Advanced Operational Amplifiers

Advanced Operational Amplifiers IsLab Analog Integrated Circuit Design OPA2-47 Advanced Operational Amplifiers כ Kyungpook National University IsLab Analog Integrated Circuit Design OPA2-1 Advanced Current Mirrors and Opamps Two-stage

More information

The Design and Characterization of an 8-bit ADC for 250 o C Operation

The Design and Characterization of an 8-bit ADC for 250 o C Operation The Design and Characterization of an 8-bit ADC for 25 o C Operation By Lynn Reed, John Hoenig and Vema Reddy Tekmos, Inc. 791 E. Riverside Drive, Bldg. 2, Suite 15, Austin, TX 78744 Abstract Many high

More information

PHYSICAL STRUCTURE OF CMOS INTEGRATED CIRCUITS. Dr. Mohammed M. Farag

PHYSICAL STRUCTURE OF CMOS INTEGRATED CIRCUITS. Dr. Mohammed M. Farag PHYSICAL STRUCTURE OF CMOS INTEGRATED CIRCUITS Dr. Mohammed M. Farag Outline Integrated Circuit Layers MOSFETs CMOS Layers Designing FET Arrays EE 432 VLSI Modeling and Design 2 Integrated Circuit Layers

More information

A Robust Low Power Static Random Access Memory Cell Design

A Robust Low Power Static Random Access Memory Cell Design Wright State University CORE Scholar Browse all Theses and Dissertations Theses and Dissertations 2018 A Robust Low Power Static Random Access Memory Cell Design A. V. Rama Raju Pusapati Wright State University

More information

Difference between BJTs and FETs. Junction Field Effect Transistors (JFET)

Difference between BJTs and FETs. Junction Field Effect Transistors (JFET) Difference between BJTs and FETs Transistors can be categorized according to their structure, and two of the more commonly known transistor structures, are the BJT and FET. The comparison between BJTs

More information

Design of Low Power High Speed Fully Dynamic CMOS Latched Comparator

Design of Low Power High Speed Fully Dynamic CMOS Latched Comparator International Journal of Engineering Research and Development e-issn: 2278-067X, p-issn: 2278-800X, www.ijerd.com Volume 10, Issue 4 (April 2014), PP.01-06 Design of Low Power High Speed Fully Dynamic

More information

EE 330 Lecture 12. Devices in Semiconductor Processes. Diodes

EE 330 Lecture 12. Devices in Semiconductor Processes. Diodes EE 330 Lecture 12 Devices in Semiconductor Processes Diodes Guest Lecture: Joshua Abbott Non Volatile Product Engineer Micron Technology NAND Memory: Operation, Testing and Challenges Intro to Flash Memory

More information

Highly Efficient Ultra-Compact Isolated DC-DC Converter with Fully Integrated Active Clamping H-Bridge and Synchronous Rectifier

Highly Efficient Ultra-Compact Isolated DC-DC Converter with Fully Integrated Active Clamping H-Bridge and Synchronous Rectifier Highly Efficient Ultra-Compact Isolated DC-DC Converter with Fully Integrated Active Clamping H-Bridge and Synchronous Rectifier JAN DOUTRELOIGNE Center for Microsystems Technology (CMST) Ghent University

More information

Digital Design and System Implementation. Overview of Physical Implementations

Digital Design and System Implementation. Overview of Physical Implementations Digital Design and System Implementation Overview of Physical Implementations CMOS devices CMOS transistor circuit functional behavior Basic logic gates Transmission gates Tri-state buffers Flip-flops

More information

A New Capacitive Sensing Circuit using Modified Charge Transfer Scheme

A New Capacitive Sensing Circuit using Modified Charge Transfer Scheme 78 Hyeopgoo eo : A NEW CAPACITIVE CIRCUIT USING MODIFIED CHARGE TRANSFER SCHEME A New Capacitive Sensing Circuit using Modified Charge Transfer Scheme Hyeopgoo eo, Member, KIMICS Abstract This paper proposes

More information

Magnetic Spin Devices: 7 Years From Lab To Product. Jim Daughton, NVE Corporation. Symposium X, MRS 2004 Fall Meeting

Magnetic Spin Devices: 7 Years From Lab To Product. Jim Daughton, NVE Corporation. Symposium X, MRS 2004 Fall Meeting Magnetic Spin Devices: 7 Years From Lab To Product Jim Daughton, NVE Corporation Symposium X, MRS 2004 Fall Meeting Boston, MA December 1, 2004 Outline of Presentation Early Discoveries - 1988 to 1995

More information

Fully Parallel 6T-2MTJ Nonvolatile TCAM with Single-Transistor-Based Self Match-Line Discharge Control

Fully Parallel 6T-2MTJ Nonvolatile TCAM with Single-Transistor-Based Self Match-Line Discharge Control Fully Parallel 6T-2MTJ Nonvolatile TCAM with Single-Transistor-Based Self Match-Line Discharge Control Shoun Matsunaga 1,2, Akira Katsumata 2, Masanori Natsui 1,2, Shunsuke Fukami 1,3, Tetsuo Endoh 1,2,4,

More information

DESIGNING OF SRAM USING LECTOR TECHNIQUE TO REDUCE LEAKAGE POWER

DESIGNING OF SRAM USING LECTOR TECHNIQUE TO REDUCE LEAKAGE POWER DESIGNING OF SRAM USING LECTOR TECHNIQUE TO REDUCE LEAKAGE POWER Ashwini Khadke 1, Paurnima Chaudhari 2, Mayur More 3, Prof. D.S. Patil 4 1Pursuing M.Tech, Dept. of Electronics and Engineering, NMU, Maharashtra,

More information

Short Channel Bandgap Voltage Reference

Short Channel Bandgap Voltage Reference Short Channel Bandgap Voltage Reference EE-584 Final Report Authors: Thymour Legba Yugu Yang Chris Magruder Steve Dominick Table of Contents Table of Figures... 3 Abstract... 4 Introduction... 5 Theory

More information

Current Mirrors. Current Source and Sink, Small Signal and Large Signal Analysis of MOS. Knowledge of Various kinds of Current Mirrors

Current Mirrors. Current Source and Sink, Small Signal and Large Signal Analysis of MOS. Knowledge of Various kinds of Current Mirrors Motivation Current Mirrors Current sources have many important applications in analog design. For example, some digital-to-analog converters employ an array of current sources to produce an analog output

More information

Low Power High Performance 10T Full Adder for Low Voltage CMOS Technology Using Dual Threshold Voltage

Low Power High Performance 10T Full Adder for Low Voltage CMOS Technology Using Dual Threshold Voltage Low Power High Performance 10T Full Adder for Low Voltage CMOS Technology Using Dual Threshold Voltage Surbhi Kushwah 1, Shipra Mishra 2 1 M.Tech. VLSI Design, NITM College Gwalior M.P. India 474001 2

More information

An introduction to Depletion-mode MOSFETs By Linden Harrison

An introduction to Depletion-mode MOSFETs By Linden Harrison An introduction to Depletion-mode MOSFETs By Linden Harrison Since the mid-nineteen seventies the enhancement-mode MOSFET has been the subject of almost continuous global research, development, and refinement

More information

BICMOS Technology and Fabrication

BICMOS Technology and Fabrication 12-1 BICMOS Technology and Fabrication 12-2 Combines Bipolar and CMOS transistors in a single integrated circuit By retaining benefits of bipolar and CMOS, BiCMOS is able to achieve VLSI circuits with

More information

DFT for Testing High-Performance Pipelined Circuits with Slow-Speed Testers

DFT for Testing High-Performance Pipelined Circuits with Slow-Speed Testers DFT for Testing High-Performance Pipelined Circuits with Slow-Speed Testers Muhammad Nummer and Manoj Sachdev University of Waterloo, Ontario, Canada mnummer@vlsi.uwaterloo.ca, msachdev@ece.uwaterloo.ca

More information

Reduction of Peak Input Currents during Charge Pump Boosting in Monolithically Integrated High-Voltage Generators

Reduction of Peak Input Currents during Charge Pump Boosting in Monolithically Integrated High-Voltage Generators Reduction of Peak Input Currents during Charge Pump Boosting in Monolithically Integrated High-Voltage Generators Jan Doutreloigne Abstract This paper describes two methods for the reduction of the peak

More information

Ultra Low Power High Speed Comparator for Analog to Digital Converters

Ultra Low Power High Speed Comparator for Analog to Digital Converters Ultra Low Power High Speed Comparator for Analog to Digital Converters Suman Biswas Department Of Electronics Kiit University Bhubaneswar,Odisha Dr. J. K DAS Rajendra Prasad Abstract --Dynamic comparators

More information

HIGH-VOLTAGE PROGRAMMABLE DELTA-SIGMA MODULATION VOLTAGE-CONTROL CIRCUIT. Lucien Jan Bissey. A thesis. submitted in partial fulfillment

HIGH-VOLTAGE PROGRAMMABLE DELTA-SIGMA MODULATION VOLTAGE-CONTROL CIRCUIT. Lucien Jan Bissey. A thesis. submitted in partial fulfillment HIGH-VOLTAGE PROGRAMMABLE DELTA-SIGMA MODULATION VOLTAGE-CONTROL CIRCUIT by Lucien Jan Bissey A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Electrical

More information

An Analog Phase-Locked Loop

An Analog Phase-Locked Loop 1 An Analog Phase-Locked Loop Greg Flewelling ABSTRACT This report discusses the design, simulation, and layout of an Analog Phase-Locked Loop (APLL). The circuit consists of five major parts: A differential

More information

SRAM Read Performance Degradation under Asymmetric NBTI and PBTI Stress: Characterization Vehicle and Statistical Aging

SRAM Read Performance Degradation under Asymmetric NBTI and PBTI Stress: Characterization Vehicle and Statistical Aging SRAM Read Performance Degradation under Asymmetric NBTI and PBTI Stress: Characterization Vehicle and Statistical Aging Xiaofei Wang,2 Weichao Xu 2 and Chris H. Kim 2 Intel Corporation, Hillsboro 2 University

More information

A Differential 2R Crosspoint RRAM Array with Zero Standby Current

A Differential 2R Crosspoint RRAM Array with Zero Standby Current 1 A Differential 2R Crosspoint RRAM Array with Zero Standby Current Pi-Feng Chiu, Student Member, IEEE, and Borivoje Nikolić, Senior Member, IEEE Department of Electrical Engineering and Computer Sciences,

More information

EE 42/100 Lecture 23: CMOS Transistors and Logic Gates. Rev A 4/15/2012 (10:39 AM) Prof. Ali M. Niknejad

EE 42/100 Lecture 23: CMOS Transistors and Logic Gates. Rev A 4/15/2012 (10:39 AM) Prof. Ali M. Niknejad A. M. Niknejad University of California, Berkeley EE 100 / 42 Lecture 23 p. 1/16 EE 42/100 Lecture 23: CMOS Transistors and Logic Gates ELECTRONICS Rev A 4/15/2012 (10:39 AM) Prof. Ali M. Niknejad University

More information

An 8-bit Analog-to-Digital Converter based on the Voltage-Dependent Switching Probability of a Magnetic Tunnel Junction

An 8-bit Analog-to-Digital Converter based on the Voltage-Dependent Switching Probability of a Magnetic Tunnel Junction An 8-bit Analog-to-Digital Converter based on the Voltage-Dependent Switching Probability of a Magnetic Tunnel Junction Won Ho Choi*, Yang Lv*, Hoonki Kim, Jian-Ping Wang, and Chris H. Kim *equal contribution

More information

8-Bit, high-speed, µp-compatible A/D converter with track/hold function ADC0820

8-Bit, high-speed, µp-compatible A/D converter with track/hold function ADC0820 8-Bit, high-speed, µp-compatible A/D converter with DESCRIPTION By using a half-flash conversion technique, the 8-bit CMOS A/D offers a 1.5µs conversion time while dissipating a maximum 75mW of power.

More information

Testing Power Sources for Stability

Testing Power Sources for Stability Keywords Venable, frequency response analyzer, oscillator, power source, stability testing, feedback loop, error amplifier compensation, impedance, output voltage, transfer function, gain crossover, bode

More information

Chapter 3 DESIGN OF ADIABATIC CIRCUIT. 3.1 Introduction

Chapter 3 DESIGN OF ADIABATIC CIRCUIT. 3.1 Introduction Chapter 3 DESIGN OF ADIABATIC CIRCUIT 3.1 Introduction The details of the initial experimental work carried out to understand the energy recovery adiabatic principle are presented in this section. This

More information

Analysis of Low Power-High Speed Sense Amplifier in Submicron Technology

Analysis of Low Power-High Speed Sense Amplifier in Submicron Technology Voltage IJSRD - International Journal for Scientific Research & Development Vol. 2, Issue 02, 2014 ISSN (online): 2321-0613 Analysis of Low Power-High Speed Sense Amplifier in Submicron Technology Sunil

More information

CMOS Digital Integrated Circuits Analysis and Design

CMOS Digital Integrated Circuits Analysis and Design CMOS Digital Integrated Circuits Analysis and Design Chapter 8 Sequential MOS Logic Circuits 1 Introduction Combinational logic circuit Lack the capability of storing any previous events Non-regenerative

More information

Low Transistor Variability The Key to Energy Efficient ICs

Low Transistor Variability The Key to Energy Efficient ICs Low Transistor Variability The Key to Energy Efficient ICs 2 nd Berkeley Symposium on Energy Efficient Electronic Systems 11/3/11 Robert Rogenmoser, PhD 1 BEES_roro_G_111103 Copyright 2011 SuVolta, Inc.

More information

Mayank Chakraverty and Harish M Kittur. VIT University, Vellore, India,

Mayank Chakraverty and Harish M Kittur. VIT University, Vellore, India, International Journal of Micro and Nano Systems, 2(1), 2011, pp. 1-6 FIRST PRINCIPLE SIMULATIONS OF FE/MGO/FE MAGNETIC TUNNEL JUNCTIONS FOR APPLICATIONS IN MAGNETORESISTIVE RANDOM ACCESS MEMORY BASED CELL

More information

CHAPTER 7 A BICS DESIGN TO DETECT SOFT ERROR IN CMOS SRAM

CHAPTER 7 A BICS DESIGN TO DETECT SOFT ERROR IN CMOS SRAM 131 CHAPTER 7 A BICS DESIGN TO DETECT SOFT ERROR IN CMOS SRAM 7.1 INTRODUCTION Semiconductor memories are moving towards higher levels of integration. This increase in integration is achieved through reduction

More information

S1. Current-induced switching in the magnetic tunnel junction.

S1. Current-induced switching in the magnetic tunnel junction. S1. Current-induced switching in the magnetic tunnel junction. Current-induced switching was observed at room temperature at various external fields. The sample is prepared on the same chip as that used

More information

VARIATION MONITOR-ASSISTED ADAPTIVE MRAM WRITE

VARIATION MONITOR-ASSISTED ADAPTIVE MRAM WRITE Shaodi Wang, Hochul Lee, Pedram Khalili, Cecile Grezes, Kang L. Wang and Puneet Gupta University of California, Los Angeles VARIATION MONITOR-ASSISTED ADAPTIVE MRAM WRITE NanoCAD Lab shaodiwang@g.ucla.edu

More information

Stepwise Pad Driver in Deep-Submicron Technology. Master of Science Thesis SAMUEL KARLSSON

Stepwise Pad Driver in Deep-Submicron Technology. Master of Science Thesis SAMUEL KARLSSON Stepwise Pad Driver in Deep-Submicron Technology Master of Science Thesis SAMUEL KARLSSON Chalmers University of Technology University of Gothenburg Department of Computer Science and Engineering Göteborg,

More information

Rail to Rail Input Amplifier with constant G M and High Unity Gain Frequency. Arun Ramamurthy, Amit M. Jain, Anuj Gupta

Rail to Rail Input Amplifier with constant G M and High Unity Gain Frequency. Arun Ramamurthy, Amit M. Jain, Anuj Gupta 1 Rail to Rail Input Amplifier with constant G M and High Frequency Arun Ramamurthy, Amit M. Jain, Anuj Gupta Abstract A rail to rail input, 2.5V CMOS input amplifier is designed that amplifies uniformly

More information