JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE, VOL.17, NO.3, JUNE, 2017
ISSN(Print) 1598-1657, ISSN(Online) 2233-4866
https://doi.org/10.5573/jsts.2017.17.3.458

Variation-tolerant Non-volatile Ternary Content Addressable Memory with Magnetic Tunnel Junction

Dooho Cho 1, Kyungmin Kim 2, and Changsik Yoo 2,*

Abstract: A magnetic tunnel junction (MTJ) based ternary content addressable memory (TCAM) is proposed which provides non-volatility. A unit cell of the TCAM has two MTJs and 4.875 transistors, which allows the TCAM to be realized in a small area. The equivalent resistance of multiple parallel-connected unit cells is compared with the equivalent resistance of multiple parallel-connected reference resistances, which averages out the variations of the device characteristics. This averaging effect makes the proposed TCAM variation-tolerant. Using 65-nm CMOS model parameters, the operation of the proposed TCAM has been evaluated including the Monte-Carlo simulated variations of the device characteristics, the supply voltage variation, and the temperature variation. With a tunneling magnetoresistance ratio (TMR) of 1.5 and all the variations included, the error probability of the search operation is found to be smaller than 0.033%.

Index Terms: Ternary content addressable memory (TCAM), content addressable memory (CAM), magnetic tunnel junction (MTJ)

Manuscript received Mar. 6, 2017; accepted Jun. 12, 2017
This work was supported by the Ministry of Trade, Industry, and Energy (MOTIE) of Korea and the Korean Semiconductor Research Consortium through the Future Semiconductor Device Technology Development Program under Grant 10044608.
1 Department of Electronics and Computer Engineering, Hanyang University, Seoul 04763, Korea, and also with the Memory Division of Samsung Electronics, Korea
2 Department of Electronics and Computer Engineering, Hanyang University, Seoul 04763, Korea
E-mail: csyoo@hanyang.ac.kr

I.
INTRODUCTION

Ternary content addressable memory (TCAM) and content addressable memory (CAM) compare input data with stored values and return the address at which the matching data are stored [1-3]. TCAM can be regarded as a superset of CAM because it can also process don't-care inputs. For high-speed data searching, data pattern processing, and address control in network routers, the operating speed of TCAM/CAM should be as high as possible. Static random access memory (SRAM) based TCAM/CAM, which requires at least 12 transistors, is widely used [4]. Although tolerant to the variations of process, supply voltage, and temperature (PVT), the SRAM-based TCAM/CAM loses its content when the power is off. Therefore, either standby power has to be dissipated to keep the data from being lost or an additional non-volatile memory is required.

To avoid this problem, a magnetic tunnel junction (MTJ) can be used to make the TCAM/CAM non-volatile. In [5, 6], MTJs are employed to realize an SRAM-based TCAM/CAM. While an SRAM-based TCAM/CAM requires a large number of transistors and MTJs, fewer transistors and MTJs are required if both the data storage and the search operation of the TCAM/CAM are performed by MTJs, as proposed in [7-11]. Although the TCAM proposed in [8] requires the smallest number of devices per cell (four transistors and two MTJs), its sensing margin is very small. In these non-SRAM-based TCAM/CAMs, the critical concern is how to ensure tolerance to the variation of the MTJ resistance with the minimum number of transistors and MTJs.
Fig. 1. Proposed TCAM cell with two MTJs during the search operation when the stored cell state is (a) equal (R_CELL > R_REF) to the input, (b) not equal (R_CELL < R_REF) to the input, and (c) when the input is don't care.

Fig. 2. Unit cell of the proposed TCAM with two MTJs and three transistors.

In this paper, a non-volatile TCAM with MTJ is proposed which provides excellent tolerance to the variation of the MTJ resistance. Three transistors and two MTJs are required per cell, and therefore the silicon area can also be minimized. The architecture and circuit realization of the proposed non-volatile TCAM with MTJ are explained in Section II, and the simulation results are given in Section III. Finally, the paper is concluded in Section IV.

II. NON-VOLATILE TCAM WITH MTJ

The basic operating principle of the proposed non-volatile TCAM with MTJ can be explained with Fig. 1. The two MTJs are connected in series and are always in different states; that is, if one of the two MTJs is in the low-resistance state (R_P; parallel magnetization), the other MTJ is in the high-resistance state (R_AP; anti-parallel magnetization). These two MTJs divide the supply voltage V_DD to generate the cell voltage V_CELL, which is either higher or lower than the reference voltage. The on-resistance R_CELL of the cell transistor M_CELL, which decreases as the cell voltage V_CELL rises, is compared with that of the reference transistor M_REF, which is set in the same way by the reference voltage V_DD(R_R2 + R_R3)/(R_R1 + R_R2 + R_R3).
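The divider relationships above can be checked numerically. The following is a minimal sketch, not the authors' simulation: it uses the nominal values quoted in Section III (V_DD = 1.2 V, R_P = 3.97 kΩ, TMR = 1.5, and a 7:10 ratio between R_R1 and R_R2 + R_R3) and assumes the common definition TMR = (R_AP - R_P)/R_P. The names `divider`, `v_ref`, `v_cell_match`, and `v_cell_mismatch` are illustrative only.

```python
# Nominal values quoted in the simulation section of the paper.
V_DD = 1.2      # supply voltage [V]
R_P = 3.97e3    # low (parallel) MTJ resistance [ohm]
TMR = 1.5       # tunneling magnetoresistance ratio

# Assumption: TMR is defined as (R_AP - R_P) / R_P.
R_AP = R_P * (1.0 + TMR)

# Only the ratio R_R1 : (R_R2 + R_R3) = 7 : 10 matters for the reference voltage.
R_R1, R_R23 = 7.0, 10.0


def divider(v_supply, r_top, r_bottom):
    """Output voltage of a two-resistor divider, taken across r_bottom."""
    return v_supply * r_bottom / (r_top + r_bottom)


v_ref = divider(V_DD, R_R1, R_R23)          # reference voltage
v_cell_match = divider(V_DD, R_AP, R_P)     # R_AP on the V_DD side (match)
v_cell_mismatch = divider(V_DD, R_P, R_AP)  # R_P on the V_DD side (mismatch)

print(f"V_CELL (match)    = {v_cell_match:.3f} V")
print(f"reference voltage = {v_ref:.3f} V")
print(f"V_CELL (mismatch) = {v_cell_mismatch:.3f} V")
```

Under these assumptions the reference voltage falls between the two possible cell voltages, which is what allows the cell transistor to be distinguished from the reference transistor in both states.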
If the MTJ connected to V_DD is in the R_AP state and the one connected to ground is in the R_P state, the cell voltage V_CELL = V_DD R_P/(R_AP + R_P) is lower than the reference voltage and the cell resistance R_CELL is larger than the reference resistance R_REF, as shown in Fig. 1(a). This indicates that the stored cell state matches the input. If the MTJ connected to V_DD is in the R_P state and the one connected to ground is in the R_AP state, the cell voltage V_CELL = V_DD R_AP/(R_AP + R_P) is higher than the reference voltage and R_CELL is smaller than R_REF, as shown in Fig. 1(b). This indicates that the stored cell state is not equal to the input. When the input is don't care (x), the cell resistance R_CELL has to be larger than the reference resistance R_REF regardless of the MTJ state of the cell. For this, the MTJs are both driven to the same voltage V_DD R_R3/(R_R1 + R_R2 + R_R3), as shown in Fig. 1(c), which is always lower than the reference voltage.

This simple structure can be applied to implement the proposed TCAM unit cell with two MTJs and three transistors, as shown in Fig. 2. As explained before, the two MTJs R_MTJ1 and R_MTJ2 are always in opposite states. The selection transistor M_SEL is turned on only for the write operation. The bit-line-bar is driven to the desired write voltage for the write operation and to the reference voltage for the search operation.

1. Write Operation

During the write operation, the word-line is driven
to V_DD to turn on the selection transistor M_SEL in the selected row. To write "0" ("1"), the bit-line-bar is driven to V_DD (0 V) and the source-line SL and source-line-bar SLB are both driven to 0 V (V_DD). Because the fixed layer of the MTJ R_MTJ1 and the free layer of the MTJ R_MTJ2 are shorted together, the currents through the two MTJs R_MTJ1 and R_MTJ2 flow in opposite directions, and therefore the two MTJs are simultaneously programmed to opposite states. For the state of "1" ("0"), the MTJs R_MTJ1 and R_MTJ2 are programmed to have the resistance R_AP (R_P) and R_P (R_AP), respectively. The ground-line is left floating to prevent any unwanted current flow through the transistors M_CELL and M_REF.

Fig. 3. Match operation illustrated for a row with N unit cells when the input for the match operation is "1...0x". If the stored states match the input, the output RC_OUTB of the resistance comparator is 0.

2. Search Operation

Fig. 3 illustrates the search operation for a row with N unit cells where the input for the search operation is "1...0x". During the search operation, the word-line and the ground-line are both driven to 0 V and the bit-line-bar is connected to the reference voltage, which is generated by a simple resistive voltage divider. When the input for a cell is "1", as for CELL[N-1] in Fig. 3, the source-line SL and source-line-bar SLB are driven to V_DD and 0 V, respectively. If the stored value is "1" (match), the resistances of the MTJs R_MTJ1 and R_MTJ2 are R_AP and R_P, respectively, and the cell resistance R_CELL is R_HIGH, which is larger than the reference resistance R_REF. If the stored value is "0" (mismatch), R_CELL is R_LOW, which is smaller than R_REF. When the input for a cell is "0", as for CELL[1] in Fig. 3, SL and SLB are driven to 0 V and V_DD, respectively. If the stored value is "0" (match), R_MTJ1 and R_MTJ2 are R_P and R_AP, respectively, and R_CELL is R_HIGH > R_REF.
If the stored value is "1" (mismatch), R_CELL is R_LOW < R_REF. When the input for a cell is don't care (x), as for CELL[0] in Fig. 3, both SL and SLB are driven to the same voltage, which is lower than the reference voltage. Then, R_CELL of CELL[0] is R_HIGH, which is greater than R_REF, and therefore the don't-care (x) cell is recognized as a match regardless of the stored state.

The resistance seen at the local-match-line input of the resistance comparator, denoted here as R_LML, is the parallel equivalent resistance of all the R_CELL values of the N unit cells. The resistance seen at the reference-line input of the resistance comparator, denoted here as R_RL, is the parallel equivalent resistance of all the R_REF values of the N unit cells. If the input matches the stored state in the row, R_LML and R_RL are R_HIGH/N and R_REF/N, respectively, and therefore R_LML is larger than R_RL. When the evaluation signal becomes 1, the output RC_OUTB of the resistance comparator becomes 0, indicating that the input matches the stored state in the row. The cells in the same column are all connected to the same SL and SLB, and therefore all the rows can be searched for a matching entry at the same time.

To see how a mismatch between the stored state and the input is detected, assume that only one input bit does not match the stored state in the corresponding cell, which is the worst case for mismatch detection. Then R_LML is the parallel combination of N-1 cells with resistance R_HIGH and one cell with resistance R_LOW:

$$R_{\mathrm{LML}} = \left(\frac{N-1}{R_{\mathrm{HIGH}}} + \frac{1}{R_{\mathrm{LOW}}}\right)^{-1} = \frac{R_{\mathrm{HIGH}}\,R_{\mathrm{LOW}}}{R_{\mathrm{HIGH}} + (N-1)\,R_{\mathrm{LOW}}} \tag{1}$$

For this mismatch case to be distinguishable from the match case, the resistance given in (1) should be smaller than R_REF/N; that is, the following relationship has to be satisfied:

$$\frac{R_{\mathrm{HIGH}}\,R_{\mathrm{LOW}}}{R_{\mathrm{HIGH}} + (N-1)\,R_{\mathrm{LOW}}} < \frac{R_{\mathrm{REF}}}{N} \tag{2}$$
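The worst-case condition described above (a single mismatching cell among N parallel cells must pull the local-match-line resistance below R_REF/N) can be explored numerically. The sketch below is illustrative only: the resistance values `R_HIGH`, `R_LOW`, and `R_REF` are hypothetical round numbers, not values from the paper, and the function names are made up for this example.

```python
def parallel(resistances):
    """Equivalent resistance of resistors connected in parallel."""
    return 1.0 / sum(1.0 / r for r in resistances)


def worst_case_mismatch_resistance(n, r_high, r_low):
    """Line resistance when exactly one of the n cells mismatches."""
    return parallel([r_high] * (n - 1) + [r_low])


def mismatch_detectable(n, r_high, r_low, r_ref):
    """True if the single-mismatch line resistance is below r_ref / n."""
    return worst_case_mismatch_resistance(n, r_high, r_low) < r_ref / n


def relative_margin(n, r_high, r_low, r_ref):
    """Gap between the reference-line and mismatch resistances, normalized."""
    r_line_ref = r_ref / n
    r_mismatch = worst_case_mismatch_resistance(n, r_high, r_low)
    return (r_line_ref - r_mismatch) / r_line_ref


# Hypothetical values chosen only so that R_LOW < R_REF < R_HIGH.
R_HIGH, R_LOW, R_REF = 20e3, 5e3, 16e3

for n in (2, 4, 8, 16):
    ok = mismatch_detectable(n, R_HIGH, R_LOW, R_REF)
    margin = relative_margin(n, R_HIGH, R_LOW, R_REF)
    print(f"N={n:2d}  detectable={ok}  relative margin={margin:+.3f}")
```

With these assumed values the relative margin shrinks steadily as N grows and becomes negative at N = 16, which illustrates why the number of cells sharing one comparison has to be bounded.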
Fig. 4. Simplified architecture of a TCAM array with 64 rows and 32 columns where each row has four sub-rows with eight cells.

Fig. 5. Operation timing and the voltage levels of the control signals for the write and search operations.

If the relationship in Eq. (2) is satisfied, the local-match-line resistance is smaller than the reference-line resistance, and therefore the pull-down strength of the transistor M_1 of the resistance comparator is stronger than that of the transistor M_2 of the resistance comparator when the input does not match the stored state of the row. Then, the output RC_OUTB of the resistance comparator becomes 1 when the evaluation signal becomes 1.

3. TCAM Array

Because the parallel-connected cell resistance of all the cells in a row is compared with the reference resistance, the sensing margin may become very small as the number of cells in a row increases, as is clear from Eq. (2). According to the simulation results, the maximum number of cells in a row is limited to eight to maintain a sufficient sensing margin. If more cells have to be placed in a row, the row has to be divided into multiple sub-rows.

Fig. 4 shows an example TCAM array with 64 rows and 32 columns. Each row has four sub-rows, and each sub-row has eight TCAM cells to maintain a sufficient sensing margin, as noted above. In a sub-row, the output RC_OUTB of the resistance comparator is applied to the gate of the transistor M_1. By short-circuiting the drain nodes of the transistors M_1 of all the sub-rows, a wired-OR operation of the resistance comparator outputs is performed.
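The sub-row partitioning and wired-OR combination described above can be modeled behaviorally. The sketch below is a purely logical abstraction, not a circuit model: each resistance comparator is idealized as a perfect mismatch detector for its eight cells, the match-line starts at 1 (pre-charged), and any sub-row reporting RC_OUTB = 1 discharges it. The function names and the string encoding of rows and search keys are assumptions made for illustration.

```python
SUB_ROW_CELLS = 8  # cells sharing one resistance comparator


def cell_mismatch(stored_bit, key_bit):
    """A cell mismatches only if the key bit is not 'x' and differs."""
    return key_bit != "x" and stored_bit != key_bit


def sub_row_rc_outb(stored_bits, key_bits):
    """Idealized comparator output: 1 if any cell in the sub-row mismatches."""
    return int(any(cell_mismatch(s, k) for s, k in zip(stored_bits, key_bits)))


def match_line(stored_row, key):
    """Match-line level after evaluation: 1 = match, 0 = discharged."""
    if len(stored_row) != len(key) or len(key) % SUB_ROW_CELLS != 0:
        raise ValueError("row and key must have equal length, a multiple of 8")
    ml = 1  # pre-charged to V_DD
    for start in range(0, len(key), SUB_ROW_CELLS):
        stop = start + SUB_ROW_CELLS
        if sub_row_rc_outb(stored_row[start:stop], key[start:stop]):
            ml = 0  # wired-OR: any mismatching sub-row discharges the line
    return ml


# Two 32-bit rows and three search keys, written most significant bit first.
ROW_A = "0" + "1" * 31
ROW_B = "1" * 32
KEYS = ["1" * 32, "x" + "1" * 31, "1" + "x" * 31]

for key in KEYS:
    print(key, match_line(ROW_A, key), match_line(ROW_B, key))
```

In this abstraction the first row matches only the key whose leading bit is don't care, while the all-ones row matches all three keys.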
The operation timings and the voltage levels of the control signals for the write and search operations are summarized in Fig. 5. During the search operation, the match-line ML of a row is first pre-charged to V_DD by the transistor M_PRE. After the pre-charging of the match-line ML is completed,
the evaluation signal is pulled up to V_DD to enable the resistance comparators of all the sub-rows. After the evaluations of all the resistance comparators are completed, the match-line ML of the corresponding row is discharged to 0 V by the transistor M_1 if any one of the resistance comparator outputs RC_OUTB is 1, which indicates a mismatch between the stored state and the input. When the stored state matches the input, the outputs RC_OUTB of the resistance comparators of all the sub-rows are 0 and the match-line ML stays at V_DD.

Fig. 6. Distribution of the local-match-line resistance and the reference-line resistance when (a) V_DD = 1.26 V, temperature = -40 °C, (b) V_DD = 1.2 V, temperature = 25 °C, (c) V_DD = 1.14 V, temperature = 90 °C.

Fig. 7. Distribution of R_CELL and R_REF when the input and the stored state (a) match, (b) do not match.

III. SIMULATION RESULTS AND DISCUSSION

The TCAM array with 64 rows and 32 columns shown in Fig. 4 has been simulated to verify the operation of the proposed non-volatile TCAM with MTJs. The MTJ is modeled in Verilog-A with the model described in [12], and the nominal value of R_P is 3.97 kΩ while the tunneling magnetoresistance ratio (TMR) is 1.5. The MTJ resistance is allowed to vary from its nominal value with a standard deviation of 4%. The resistors R_R1 and R_R2 + R_R3 generating the reference voltage have a 7:10 ratio in their resistance values. A 65-nm CMOS process is assumed for the transistors, and the supply voltage V_DD can vary from 1.14 V to 1.26 V with a nominal value of 1.2 V. The junction temperature also varies from -40 °C to 90 °C.

Under the above conditions, a Monte-Carlo simulation has been performed to obtain the distributions of the local-match-line resistance and the reference-line resistance. For the case when all the stored bits match the input and the case when only one bit does not match the input, the distributions of these two resistances are shown in Fig.
6 for the three cases, that is, V_DD = 1.26 V, temperature = -40 °C (HV/LT); V_DD = 1.2 V, temperature = 25 °C (NV/RT); and V_DD = 1.14 V, temperature = 90 °C (LV/HT). From the distributions, the error probability is found to be 0.001%, 0.016%, and 0.033% for the HV/LT, NV/RT, and LV/HT cases, respectively.

For a single TCAM cell, the distributions of R_CELL and R_REF are plotted in Fig. 7(a) and (b) for the cases when the input and the stored state match and do not match, respectively. As can be seen in the figure, the distributions of R_CELL and R_REF have a large overlap when the input and the stored
state match, meaning a large probability of error. Despite this, the proposed TCAM shows excellent immunity to the variations owing to the averaging effect. As explained in Section II, the local-match-line resistance and the reference-line resistance are the equivalent resistances of the parallel-connected R_CELL and R_REF of the eight cells in a sub-row, respectively. Therefore, the variations of the individual R_CELL and R_REF are averaged out when they appear in these two line resistances.

Including the Monte-Carlo simulated variations of the device parameters, the search operation of the TCAM array shown in Fig. 4 has been simulated. The simulated waveforms for three consecutive search operations with the inputs 11111..11111, x1111..11111, and 1xxxx..xxxxx are shown in Fig. 8. The stored states of the 63rd and 62nd rows are 01111..11111 and 11111..11111, respectively. Therefore, the match-line output ML[63] is 1 only for the second search operation, while the match-line output ML[62] is 1 for all three search operations.

Fig. 8. Simulated waveforms of the search operation of the TCAM array in Fig. 4.

The proposed TCAM is compared with other MTJ-based non-volatile TCAMs in Table 1. While the proposed TCAM has the smallest area per cell, its power consumption is relatively large because of the DC current path formed by the two MTJs of each TCAM cell during the search operation.

Table 1. Comparison with other MTJ-based TCAMs

                     [7]        [8]        [11]        This work
Cell arch.           6T-2MTJ    4T-2MTJ    10T-4MTJ    4.875T-2MTJ
Process [nm]         90         90         45          65
Supply [V]           1.2        1.2        1           1.2
Cell area [µm^2]     10.35      3.14       2.78        1.56
t_search [ns]        0.29       2.5        1           0.9
E_search/bit [fJ]    1.04       -          40.5        65.9

IV. CONCLUSIONS

An MTJ-based non-volatile TCAM is proposed which provides a compact unit cell architecture. For the search operation, the parallel-connected cell resistances are compared with the reference resistance, which averages out the variation, and therefore the TCAM array shows excellent robustness to the variations of the device characteristics.

ACKNOWLEDGMENTS

The CAD tools were provided by the IC Design Education Center (IDEC), Korea.

REFERENCES

[1] M. Meribout, T. Ogura, and M. Nakanishi, "On using the CAM concept for parametric curve extraction," Image Processing, IEEE Transactions on, Vol.9, No.12, pp.2126-2130, Dec., 2000.
[2] M. Nakanishi and T. Ogura, "Real-time CAM-based Hough transform and its performance evaluation," Pattern Recognition, Proceedings, 13th IEEE International Conference on, Vol.2, pp.516-521, 1996.
[3] V. C. Ravikumar, R. N. Mahapatra, and L. N. Bhuyan, "EaseCAM: An Energy and Storage Efficient TCAM-based Router Architecture for IP Lookup," Computers, IEEE Transactions on, Vol.54, No.5, pp.521-533, May, 2005.
[4] K. Pagiamtzis and A. Sheikholeslami, "Content-Addressable Memory (CAM) Circuits and Architectures: A Tutorial and Survey," Solid-State Circuits, IEEE Journal of, Vol.41, No.3, pp.712-727, Mar., 2006.
[5] Y. Zhang, W. Zhao, J. O. Klein, D. Ravelsona, and C. Chappert, "Ultra-High Density Content Addressable Memory Based on Current Induced Domain Wall Motion in Magnetic Track," Magnetics, IEEE Transactions on, Vol.48, No.11, pp.3219-3222, Nov., 2012.
[6] M. K. Gupta and M. Hasan, "Design of High-Speed Energy-Efficient Masking Error Immune PentaMTJ-Based TCAM," Magnetics, IEEE Transactions on, Vol.51, No.2, Feb., 2015.
[7] S. Matsunaga, et al., "Fully Parallel 6T-2MTJ nonvolatile TCAM with single-transistor-based self match-line discharge control," VLSI Circuits, IEEE Symposium 2011, Digest of Technical Papers, pp.289-290, 2011.
[8] S. Matsunaga, et al., "A 3.14 um2 4T-2MTJ-cell fully parallel TCAM based on nonvolatile logic-in-memory architecture," VLSI Circuits, IEEE Symposium 2012, Digest of Technical Papers, pp.44-45, 2012.
[9] T. Hanyu, et al., "Spintronics-Based Nonvolatile Logic-in-Memory Architecture Towards an Ultra-Low-Power and Highly Reliable VLSI Computing Paradigm," Design, Automation & Test, IEEE 2015 Europe Conference & Exhibition, pp.1006-1011, 2015.
[10] M. K. Gupta and M. Hasan, "Robust High Speed Ternary Magnetic Content Addressable Memory," Electron Devices, IEEE Transactions on, Vol.62, No.4, pp.1163-1169, Apr., 2015.
[11] B. Song, T. Na, J. P. Kim, S. H. Kang, and S. O. Jung, "A 10T-4MTJ Nonvolatile Ternary CAM Cell for Reliable Search Operation and Compact Area," Circuits and Systems II: Express Briefs, IEEE Transactions on, 2016.
[12] K. Kim and C. Yoo, "Macro-model of magnetic tunnel junction for STT-MRAM including dynamic behavior," Semiconductor Technology and Science, IEIE Journal of, Vol.14, No.6, pp.728-732, Dec., 2014.
[13] I. Arsovski and R. Wistort, "Self-referenced sense amplifier for across-chip-variation immune sensing in high-performance content-addressable memories," Proceedings, IEEE 2006 Custom Integrated Circuits Conference, pp.453-456, 2006.
[14] I. Hayashi, et al., "A 250-MHz 18-Mb Full Ternary CAM With Low-Voltage Matchline Sensing Scheme in 65-nm CMOS," Solid-State Circuits, IEEE Journal of, Vol.48, No.11, pp.2671-2680, Nov., 2013.

Dooho Cho was born in Seoul, Korea, in 1981.
He received the B.S. degree from the Department of Electronics and Computer Engineering, Hanyang University, Korea, in 2006. He joined the Memory Division of Samsung Electronics in 2006. He is currently pursuing the M.S. degree at Hanyang University, Korea. His interests include memory devices and mixed-mode CMOS circuits.

Kyungmin Kim received the B.S. degree in electrical and computer engineering from Hanyang University, Seoul, Korea, in 2010. He is currently working towards the Ph.D. degree at the same university. His research interests include STT-MRAM circuit design and mixed-mode CMOS circuit design.

Changsik Yoo received the B.S. (Honors), M.S., and Ph.D. degrees from Seoul National University, Seoul, Korea, in 1992, 1994, and 1998, respectively, all in electronic engineering. From 1998 to 1999, he was with the Integrated Systems Laboratory (IIS), Swiss Federal Institute of Technology (ETH), Zurich, Switzerland, as a Research Staff member. From 1998 to 2002, he was with Samsung Electronics, Hwasung, Korea, as a Senior DRAM Design Engineer. Since 2002, he has been a Professor at Hanyang University, Seoul, Korea. His main research interest is mixed-mode CMOS integrated circuit design.