IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS

A Low-Power Robust Easily Cascaded PentaMTJ-Based Combinational and Sequential Circuits

Mohit Kumar Gupta and Mohd Hasan, Senior Member, IEEE

Abstract

Advanced computing systems embed spintronic devices to improve on the leakage performance of conventional CMOS systems. High speed, low power, and infinite endurance are important properties of the magnetic tunnel junction (MTJ), a spintronic device, and make it suitable for memories and logic circuits. This paper presents a PentaMTJ-based logic gate that provides easy cascading, self-referencing, a reduced voltage headroom problem in the precharge sense amplifier, and low area overhead, in contrast to existing MTJ-based gates. PentaMTJ is used here because it provides guaranteed disturbance-free reading and increased tolerance to process variations, along with compatibility with the CMOS process. The logic gate is validated by simulation at the 45-nm technology node using a VerilogA model of the PentaMTJ.

Index Terms

Counter, magnetic logic gate, magnetic tunnel junction (MTJ), magnetoresistance, nonvolatile logic devices, PentaMTJ, precharge sense amplifier (PCSA), spintronics.

I. INTRODUCTION

SPINTRONICS has been under extensive research because of its nonvolatility, infinite endurance, and low power [1]. The electron spin is used for storing information and the charge for processing it. Spintronics has the potential to replace CMOS logic and memory [2]. In the deep-submicrometer regime, CMOS scaling causes the leakage power to dominate over all other power components [3]. In conventional CMOS logic, digital signals are represented by the presence or absence of electrical charge, i.e., by a voltage of VDD or ground. In spintronics, however, digital signals are represented by the up and down spin of the electron. In recent years, researchers have developed spintronic devices, such as magnetic tunnel junctions (MTJs), which operate on the principle of tunnel magnetoresistance (TMR) [4].
An MTJ is composed of two ferromagnetic layers separated by an oxide layer and can improve the performance of CMOS logic circuits in terms of power dissipation, area, and interconnection delay [5]. It can also be easily fabricated using a 3-D back-end integration process, which is compatible with the CMOS process, without any area overhead [6].

Manuscript received May 26, 2014; revised September 22, 2014, November 30, 2014, and January 16, 2015; accepted January 17, 2015. This work was supported by the Departmental Research Support Grant through the University Grants Commission, Government of India. The authors are with the Department of Electronics Engineering, Aligarh Muslim University, Aligarh 202002, India (e-mail: engg.mkg@gmail.com; mohdhasan097@gmail.com). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TVLSI.2015.2398117

Fig. 1. Structure of PentaMTJ with two pinned layers (TPL and BPL) and one free layer.

Several types of logic gates using MTJs are reported in the literature. The dual properties of the MTJ, namely, processing and storage, help to reduce the memory and interconnect delay/power [7] needed to store the processed data back into memory. Although the reported magnetic logic gates help reduce power and delay, they have several drawbacks. In [8], a magnetic XOR gate comprising six MTJs and transistors is presented. Its area requirement is small, but as the number of MTJs increases, the writing energy also rises, which is a serious limitation of hybrid MTJ/CMOS circuits. In [9], the logic gate requires additional circuitry to convert the voltage signals into a current signal of sufficient magnitude for writing the MTJ of the subsequent stage, leading to an increase in delay, power consumption, and area. Lyle et al.
[10] designed a logic gate with only one output MTJ that can realize a logic operation by selecting a proper preset, i.e., the initial state and the operating voltage. Moreover, the authors implemented only linear logic such as NAND, NOR, and the majority function. If a nonlinear logic function such as two-input XOR/XNOR were to be implemented using NAND/NOR, respectively, the output would be obtained in three stages. Friedman et al. [11] and Horowitz and Hill [12] proposed a spin-diode logic family and a CMOS logic gate, respectively, in which the static power dissipation was more than the writing power dissipation. This is due to the requirement of a constant VDD supply for the nodes of the spin diode and to leakage power dissipation in CMOS at the nanoscale, respectively.

A PentaMTJ is composed of two pinned ferromagnetic layers and one free layer. MgO (an insulating oxide) is used between each pinned layer and the free layer, as shown in Fig. 1. It has two resistance states, as in a conventional MTJ, namely, a parallel state and an antiparallel state. It is also effectively used in the realization of memory. Huda and Sheikholeslami [13] proposed a novel PentaMTJ-based spin-transfer torque magnetic random access memory for disturbance-free reading. We have also presented a PentaMTJ-based ternary content addressable memory with lower delay and search power [14].

1063-8210 © 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

PentaMTJ-based realization of digital circuits has many advantages. First, PentaMTJ-based magnetic logic gates do not require a referencing circuit, in contrast to the MTJ, because of the presence of two pinned layers with opposite spin orientations (self-referencing). Second, no extra hardware is needed for complementary outputs because a precharge sense amplifier (PCSA) is used for sensing. Third, the output of the spintronic device is directly sensed by the PCSA, so there is no need to initialize the state of the output MTJ for sensing. The sensing power consumption is reduced and the speed is enhanced by the use of the PCSA as the sense amplifier because it turns ON only for a short duration during the transitions. Finally, ease of cascading is the greatest contribution of the proposed magnetic logic gate because the output of the gate and the programming signal of the PentaMTJ are both voltage signals.

This paper is organized into six sections. Section II describes the PentaMTJ and the logic-in-memory architecture. Section III covers the design of three basic logic gates and the implementation of XOR/XNOR, along with simulation results that validate its functionality. Section IV discusses the cascading of logic gates using a 3-bit Gray counter as an example, together with its simulation results. Section V computes the energy and delay in writing and sensing, followed by the conclusion in Section VI.

II. STATE OF THE ART

A. PentaMTJ

Fig.
1 shows the structure of the PentaMTJ, which comprises two pinned layers: 1) a top pinned layer (TPL) and 2) a bottom pinned layer (BPL). The magnetizations of the two pinned layers are fixed and opposite in direction. In this paper, the 1 state is assigned when the TPL (pinned 1) is parallel to the free layer and the 0 state when the BPL (pinned 2) is parallel to the free layer. The PentaMTJ structure presented in [13] needs less writing current than the conventional MTJ. It requires current only to switch one stack from the antiparallel to the parallel state; the other stack automatically goes into the antiparallel state. Moreover, PentaMTJ provides guaranteed disturbance-free reading and increases the tolerance to process variation, as per the only reference available in the literature on PentaMTJ [13]. In a PentaMTJ, the effect of process variation in one stack is nullified by the other stack, in contrast to two separate MTJs, whose process variations degrade the performance [15]. No experimental data are available for the double barrier; therefore, we have assumed that the single-barrier model of the TMR ratio is also valid for a double barrier. However, the structure of PentaMTJ, with dual pinned layers and a single free layer, has already been verified by micromagnetic simulation in [16]. PentaMTJ has a lower resistance than the conventional MTJ because it works well at a smaller oxide thickness than the MTJ [13]. Spin-transfer torque switching with perpendicular magnetic anisotropy (PMA) greatly reduces the required switching voltage because it lacks the easy-plane anisotropy term found in in-plane devices, which increases the switching voltage without contributing to the activation energy [17].

Fig. 2. (a) Block diagram of logic gates using PentaMTJ. (b) Writing, state detection, and amplification using PCSA of PentaMTJ cell.

B. Logic in Memory

The logic-in-memory architecture, shown in Fig.
2, is composed of three parts: 1) a PCSA for sensing the difference between the two resistance states; 2) the PentaMTJ logic; and 3) the PentaMTJ writing cell. The PCSA (as shown in Fig. 2) is a dynamic logic circuit with two phases, namely, a precharge phase and an evaluation phase. The discharging of the two branches of the PCSA depends upon their relative resistances: the low-resistance branch discharges its output node capacitance more rapidly, which cuts off the other branch because of the cross-coupled PCSA structure. The low-resistance branch is pulled down toward ground and the high-resistance branch is pulled up toward VDD. The importance of the PCSA is described in [14] and [18]. It has low read disturbance and dynamic sensing capabilities that decrease delay.

In the proposed logic gates, simultaneous precharging of the PCSA and writing of the PentaMTJ are possible. The writing path of the PentaMTJ and the precharging path of the PCSA are separated by the nMOS transistors MN2 and MN3. During precharging, CLK is low, which disconnects the upper half from the lower half; precharging the PCSA while writing therefore reduces the overall delay. Transistor stacking and high process-voltage-temperature (PVT) variations of the conventional MTJ in the deep-submicrometer regime [19] cause severe resistance mismatch, which also leads to PCSA failure in MTJ/CMOS hybrid logic circuits. Owing to the single stacking of transistors in the proposed PCSA-based gate and the low PVT variation of the PentaMTJ [13], the proposed logic gates are immune to these two limitations of the PCSA.

The writing of the PentaMTJ is done in only one direction (from the antiparallel to the parallel state). The writing phase begins by connecting the free layer to VDD by enabling CLK. D and D_bar determine in which stack the parallel writing is done. The nMOS transistor MN6 ensures efficient writing of the PentaMTJ and discharging of the PCSA because it turns OFF while the PCSA discharges. Hence, the PCSA discharges only through the PentaMTJ and not through the writing transistors.

III. LOGIC GATES USING PENTAMTJ

Logic gates act as basic building blocks for both combinational and sequential circuits. The basic structure of the PentaMTJ-based logic gate is divided into three parts, as shown in Fig. 2 and described in Section II. Fig. 3 shows the PentaMTJ-based XOR/XNOR logic gates. Different logic gates require different writing circuitry, but the sensing portion remains identical. Therefore, the information is stored in the pinned layers using series or parallel combinations of transistors as per the logic. The storing logic is designed such that, for storing 1, all input combinations with a high output are combined and the resulting expression is minimized using a K-map; for storing 0, the complement of that expression is used. Fig. 4 shows the simulation results of the logic gates with both normal and complementary outputs. A and B are the two inputs; for the normal output, 0 corresponds to the discharging of the PCSA, whereas 1 means no discharging. The evaluation phase begins after the outputs of the PCSA are precharged to VDD using the clock CLK.

Fig. 3. XOR/XNOR gates using PentaMTJ.

Fig. 4. Simulation result of XOR/XNOR gate.

Fig. 5. Circuit diagram of 3-bit Gray counter using PentaMTJ.

IV. 3-BIT GRAY COUNTER

Sequential logic circuits differ from combinational logic circuits in that the output of a sequential logic circuit depends upon both the previous output (present state) and the present input. A 3-bit Gray counter is a sequential circuit whose successive states differ in only one digit [12]. The present state of a sequential circuit such as a Gray counter is normally stored in flip-flops, which consume considerable power under standby conditions.
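The one-digit-change property of the Gray counter can be checked with a short script. The following is a minimal sketch that assumes the standard reflected binary Gray code, which is one common choice and is not stated explicitly in the text; the helper names `gray_sequence` and `hamming` are illustrative only.

```python
def gray_sequence(bits):
    """Return the reflected binary Gray code for the given width (assumed ordering)."""
    return [i ^ (i >> 1) for i in range(1 << bits)]


def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


seq = gray_sequence(3)

# Every state, including the wrap-around from the last state to the first,
# must differ from its successor in exactly one bit.
for current, following in zip(seq, seq[1:] + seq[:1]):
    assert hamming(current, following) == 1

print([format(s, "03b") for s in seq])
# ['000', '001', '011', '010', '110', '111', '101', '100']
```

Because only one bit changes per count, at most one storage element has to be rewritten on each step of such a counter.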
Use of an MTJ/PentaMTJ in a sequential circuit is beneficial because, in case of an unintentional shutdown, the counter can be restored from its previous state instead of its initial state. The previous state is restored from the PentaMTJ within a few hundred picoseconds. In the Gray counter, the PCSA is used for sensing to generate the next state, the PentaMTJ for present-state storage, and the writing circuitry to assign the next state to the present state. Fig. 5 shows the circuit diagram of a 3-bit Gray counter comprising three PentaMTJs for storage, three PCSAs for sensing, and a writing circuit designed according to the characteristic equation (1). $A_n$, $B_n$, and $C_n$ are the stored outputs (present state), whereas $A_{n+1}$, $B_{n+1}$, and $C_{n+1}$ signify the next state, which is to be stored in the PentaMTJs.

The counter starts operating by writing the PentaMTJs using the clock CLKW. To accomplish the writing in 1.5 ns, CLKW is generated with a 2-ns time period and a 1.5-ns pulsewidth. The precharging and then the sensing are performed when CLKW is high (500 ps) using a short-duration low CLK pulse (300 ps) followed by a short-duration high CLKR pulse (200 ps). The same process is repeated until the counter stops. It may be noted that the nMOS transistors MN2 and MN3, shown in Fig. 2, are not used in the Gray counter: because the writing depends on the previous state, precharging and writing are not done at the same time. Fig. 6 shows the simulation results, with $A_{n+1}$ being the least significant bit and $C_{n+1}$ the most significant bit

$$
\begin{aligned}
A_{n+1} &= B_n \odot C_n \\
B_{n+1} &= A_n \overline{C_n} + \overline{A_n} B_n \\
C_{n+1} &= A_n C_n + \overline{A_n} B_n.
\end{aligned}
\tag{1}
$$

V. RESULT AND DISCUSSION

The self-referencing property of the PentaMTJ is useful in decreasing the area overhead because of its differential nature. The switching current density in PMA is directly proportional to the magnetization, the anisotropy field, and the thickness of the free layer.
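The store, sense, and write-back loop of the Gray counter in Section IV can be illustrated with a purely behavioural model. The sketch below makes several assumptions that are not taken from the paper: the class and function names are hypothetical, device physics and timing are ignored, and the placement of complements in the next-state equations is one assignment that is consistent with the terms of (1) and with an up-counting reflected Gray sequence.

```python
class PentaMTJCell:
    """Behavioural stand-in for one PentaMTJ with its PCSA (no device physics)."""

    def __init__(self, bit=0):
        self.bit = bit  # stored state; nonvolatile in this model

    def write(self, d):
        # D and D_bar select which stack is driven parallel; only the
        # resulting stored bit is modelled here.
        self.bit = 1 if d else 0

    def sense(self):
        # The PCSA provides complementary outputs (out, out_bar).
        return self.bit, 1 - self.bit


def next_state(a, b, c):
    """Assumed next-state logic: a is the LSB, c is the MSB."""
    na, nb, nc = 1 - a, 1 - b, 1 - c
    a_next = (b & c) | (nb & nc)   # B XNOR C
    b_next = (a & nc) | (na & b)
    c_next = (a & c) | (na & b)
    return a_next, b_next, c_next


class GrayCounter:
    def __init__(self):
        # Three storage cells: A (LSB), B, C (MSB)
        self.cells = [PentaMTJCell() for _ in range(3)]

    def state(self):
        a, b, c = (cell.sense()[0] for cell in self.cells)
        return (c << 2) | (b << 1) | a

    def step(self):
        # Sense the present state, then write the next state back.
        a, b, c = (cell.sense()[0] for cell in self.cells)
        for cell, bit in zip(self.cells, next_state(a, b, c)):
            cell.write(bit)
        return self.state()


counter = GrayCounter()
visited = [counter.state()] + [counter.step() for _ in range(8)]
print([format(s, "03b") for s in visited])
# ['000', '001', '011', '010', '110', '111', '101', '100', '000']
```

Under these assumptions the model visits all eight states and returns to 000, changing exactly one stored bit per step.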
The thermal stability factor Δ of the MTJ/PentaMTJ governs the data retention capability of the digital logic. In the case of PentaMTJ, for Δ = 43, the retention time is 10 years [13]. It is possible to decrease Δ further because the required retention time is shorter for logic circuits [20]. The switching time delay is inversely proportional to the writing current [21], [22]. Zhang et al. [20] showed that the switching delay can be 0.5 ns for a particular voltage or current, and in [23] it was experimentally found to be 1 ns for a digital logic system. In the proposed PentaMTJ-based Gray counter, the switching delay is assumed to be 1.5 ns and the writing energy is computed for this delay. The time period of the clock for the Gray counter is taken as 2 ns, which corresponds to a frequency of 500 MHz. The sensing time of the PCSA is assumed to be 500 ps, during which both precharging and sensing take place. The operating frequency of the Gray counter can be increased further by reducing the switching time through a larger writing current, at the cost of a higher write energy.

Fig. 6. Simulation result of Gray counter.

TABLE I. PARAMETERS FOR POWER MEASUREMENT

TABLE II. COMPARISON OF PENTAMTJ AND CMOS-BASED LOGIC GATE IN TERMS OF NUMBER OF TRANSISTORS

TABLE III. ENERGY AND DELAY ANALYSIS OF XOR LOGIC GATE

TABLE IV. ENERGY AND DELAY ANALYSIS OF GRAY COUNTER

Table I gives the parameters used for the computation of the writing and sensing energies. It can also be inferred from Table II that the PentaMTJ-based Gray counter requires fewer transistors than the MTJ-based Gray counter because of the additional writing circuitry required in the case of MTJ [9]. Table III gives the values of energy dissipation and delay during writing and sensing for the XOR logic gate realized using PentaMTJ, MTJ [9], and MTJ [8], respectively. Writing in the proposed PentaMTJ-based logic gate requires less energy than in MTJ [10] because only one PentaMTJ is to be programmed. By comparison, six MTJs must be programmed in the MTJ-based gate of [8].
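The clock budget quoted for the Gray counter (a 2-ns period, a 1.5-ns write, and a 500-ps window split into a 300-ps low CLK pulse and a 200-ps high CLKR pulse) can be checked with simple arithmetic. The sketch below is illustrative only: the constant and function names are hypothetical, and the two faster operating points simply reuse the 0.5-ns and 1-ns switching delays cited from [20] and [23] under the assumption that the 500-ps sensing window stays unchanged.

```python
# All times in picoseconds, taken from the values quoted in the text.
T_PERIOD_PS = 2000     # CLKW period
T_WRITE_PS = 1500      # assumed switching (write) delay
T_PRECHARGE_PS = 300   # low CLK pulse
T_SENSE_PS = 200       # high CLKR pulse


def frequency_mhz(period_ps):
    """Clock frequency in MHz for a period given in picoseconds."""
    return 1e6 / period_ps


def period_for_write_delay(write_ps, window_ps=T_PRECHARGE_PS + T_SENSE_PS):
    """Clock period if the write is followed by an unchanged sensing window."""
    return write_ps + window_ps


# The window left after writing must hold precharge plus sensing.
assert T_PERIOD_PS - T_WRITE_PS == T_PRECHARGE_PS + T_SENSE_PS

print(frequency_mhz(T_PERIOD_PS))  # 500.0

# Hypothetical what-if: shorter switching delays with the same 500-ps window.
for write_ps in (1000, 500):
    period = period_for_write_delay(write_ps)
    print(write_ps, period, round(frequency_mhz(period), 1))
```

The what-if rows show only how the clock rate scales with the write delay; the higher writing current needed to reach those delays, and its energy cost, are not modelled.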
The sensing delay is higher in the proposed logic gate because of the presence of the extra MN3 and MN4 transistors in the sensing path. However, these extra transistors enable simultaneous precharging and efficient writing, which reduces errors. It is clear from Table IV that the proposed magnetic Gray counter consumes less writing power than [9] because only three PentaMTJs are required and no intermediate circuitry is needed for conversion. It is also evident from Table IV that the overall energy consumption (sensing and writing) of the MTJ-based Gray counter is much higher than that of the proposed Gray counter. Compared with CMOS logic, the proposed magnetic logic gates have a higher writing power and delay, but they consume little static power, which, along with the interconnect power, is a major power contributor at the nanoscale.

VI. CONCLUSION

The attractive features of MTJ/PentaMTJ-based CMOS logic are low static power, short interconnect delay, and effective power gating because of nonvolatility. PentaMTJ-based logic decreases the area overhead by removing the intermediate circuitry needed for voltage-to-current or current-to-voltage conversion. Moreover, no initial condition is required for performing the logic operation, and the self-referencing property removes the extra MTJs used for referencing. PentaMTJ also provides guaranteed disturbance-free reading and increased tolerance to process variations due to its differential nature.

REFERENCES

[1] S. Tehrani et al., "Recent developments in magnetic tunnel junction MRAM," IEEE Trans. Magn., vol. 36, no. 5, pp. 2752–2757, Sep. 2000.
[2] G. A. Prinz, "Magnetoelectronics," Science, vol. 282, pp. 1660–1663, Nov. 1998.
[3] ERD. (2011). International Roadmap for Semiconductor (ITRS). [Online]. Available: http://www.itrs.net/links/2011itrs/home2011.htm
[4] S. Parkin, X. Jiang, C. Kaiser, A. Panchula, K. Roche, and M. Samant, "Magnetically engineered spintronic sensors and memory," Proc. IEEE, vol. 91, no. 5, pp. 661–680, May 2003.
[5] S. A. Wolf et al., "Spintronics: A spin-based electronics vision for the future," Science, vol. 294, no. 5546, pp. 1488–1495, 2001.
[6] C. Chappert, A. Fert, and F. N. Van Dau, "The emergence of spin electronics in data storage," Nature Mater., vol. 6, no. 11, pp. 813–823, Nov. 2007.
[7] S. D. Pable and M. Hasan, "Interconnect design for subthreshold circuits," IEEE Trans. Nanotechnol., vol. 11, no. 3, pp. 633–639, May 2012.
[8] H.-P. Trinh, W. Zhao, J.-O. Klein, Y. Zhang, D. Ravelsona, and C. Chappert, "Magnetic adder based on racetrack memory," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 60, no. 6, pp. 1469–1477, Jun. 2013.
[9] S. Lee, N. Kim, H. Yang, G. Lee, S. Lee, and H. Shin, "The 3-bit gray counter based on magnetic-tunnel-junction elements," IEEE Trans. Magn., vol. 43, no. 6, pp. 2677–2679, Jun. 2007.
[10] A. Lyle et al., "Magnetic tunnel junction logic architecture for realization of simultaneous computation and communication," IEEE Trans. Magn., vol. 47, no. 10, pp. 2970–2973, Oct. 2011.
[11] J. S. Friedman, N. Rangaraju, Y. I. Ismail, and B. W. Wessels, "A spin-diode logic family," IEEE Trans. Nanotechnol., vol. 11, no. 5, pp. 1026–1032, Sep. 2012.
[12] P. Horowitz and W. Hill, The Art of Electronics. Cambridge, U.K.: Cambridge Univ. Press, 1989.
[13] S. Huda and A. Sheikholeslami, "A novel STT-MRAM cell with disturbance-free read operation," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 60, no. 6, pp. 1534–1547, Jun. 2013.
[14] M. K. Gupta and M. Hasan, "Design of high speed energy efficient masking error immune PentaMTJ based TCAM," IEEE Trans. Magn., no. 99.
[15] W. Xu, T. Zhang, and Y. Chen, "Design of spin-torque transfer magnetoresistive RAM and CAM/TCAM with high sensing and search speed," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 18, no. 1, pp. 66–74, Jan. 2010.
[16] A. Makarov, V. Sverdlov, D. Osintsev, and S. Selberherr, "Fast switching in magnetic tunnel junctions with two pinned layers: Micromagnetic modeling," IEEE Trans. Magn., vol. 48, no. 4, pp. 1289–1292, Apr. 2012.
[17] J. Z. Sun, "Spin-current interaction with a monodomain magnetic body: A model study," Phys. Rev. B, vol. 62, no. 1, pp. 570–578, 2000.
[18] E. Deng, Y. Zhang, J.-O. Klein, D. Ravelsona, C. Chappert, and W. Zhao, "Low power magnetic full-adder based on spin transfer torque MRAM," IEEE Trans. Magn., vol. 49, no. 9, pp. 4982–4987, Sep. 2013.
[19] W. S. Zhao et al., "Failure and reliability analysis of STT-MRAM," Microelectron. Rel., vol. 52, nos. 9–10, pp. 1848–1852, Sep./Oct. 2012.
[20] Y. Zhang et al., "Compact modeling of perpendicular-anisotropy CoFeB/MgO magnetic tunnel junctions," IEEE Trans. Electron Devices, vol. 59, no. 3, pp. 819–826, Mar. 2012.
[21] M. Marins de Castro et al., "Precessional spin-transfer switching in a magnetic tunnel junction with a synthetic antiferromagnetic perpendicular polarizer," J. Appl. Phys., vol. 111, no. 7, pp. 07C912-1–07C912-3, Apr. 2012.
[22] D. C. Worledge et al., "Spin torque switching of perpendicular Ta|CoFeB|MgO-based magnetic tunnel junctions," Appl. Phys. Lett., vol. 98, no. 2, pp. 022501-1–022501-3, Jan. 2011.
[23] S. Patil, A. Lyle, J. Harms, D. J. Lilja, and J.-P. Wang, "Spintronic logic gates for spintronic data using magnetic tunnel junctions," in Proc. IEEE Int. Conf. Comput. Design, Oct. 2010, pp. 125–131.

Mohit Kumar Gupta received the B.Tech. degree in electronics and communication engineering from Uttar Pradesh Technical University, Lucknow, India, in 2012, and the M.Tech. degree in electronics circuit and system design from the Department of Electronics Engineering, Aligarh Muslim University, Aligarh, India, in 2014. He is involved in the design of magnetoresistive random access memory and the implementation of magnetic tunnel junctions in digital circuits. He has authored three papers in IEEE TRANSACTIONS. His current research interests include digital circuit and system design for memory. He was a recipient of the Junior Research Fellowship from the Council of Scientific and Industrial Research, India, in 2013, and the University Grants Commission, India, in 2014.

Mohd. Hasan (M'10–SM'13) received the B.Tech. degree in electronics engineering from Aligarh Muslim University (AMU), Aligarh, India, the M.Tech. degree in integrated electronics and circuits from IIT Delhi, Delhi, India, and the Ph.D. degree in low-power architectures for multicarrier systems from the University of Edinburgh, Edinburgh, U.K. He has been a Full Professor at AMU since 2005. He was a Visiting Post-Doctoral Researcher on a project funded by the prestigious Royal Academy of Engineering, U.K., on low-power field programmable gate array architecture with the School of Engineering, University of Edinburgh.
He has authored over 132 research papers in reputed journals and conference proceedings with 470 citations. His current research interests include low-power VLSI design, nanoelectronics, spintronics, and batteryless electronics. He received the Best International Journal Paper Award and International Conference Paper Award.