Binary Neural Network and Its Implementation with 16 Mb RRAM Macro Chip
- Leona Carroll
1 Binary Neural Network and Its Implementation with 16 Mb RRAM Macro Chip
Assistant Professor of Electrical Engineering and Computer Engineering, School of Electrical, Computer, and Energy Engineering (ECEE)
3/22/2017
2 Outline
- Challenges of Analog Synapses and Why We Need to Binarize the Neural Network
- Binary Neural Network and Its Implementation on Tsinghua's 16 Mb RRAM Macro Chip
- Benchmark of Binary and Analog Synapses
- Summary
3 Demands for Neuromorphic Hardware
- Deep learning in the cloud: huge labeled training datasets, high-precision training, power-hungry hardware, etc. Examples: Google Cat (16,000 CPU cores), MS Residual-CNN (8 GPUs).
- Edge (IoT) computing needs novel hardware and algorithms: local to the sensor, real-time inference, small area, and low power; adaptive online learning with continuous (possibly unlabeled) data.
- Hardware options shown: GPU, FPGA, neuromorphic ASIC (figure annotation: 30 frames/s).
4 A Shift in Computing Paradigm towards Neuro-inspired
- Resistive synaptic device
- Long-term vision: brain-like computer
5 Current Status of eNVM-based Neuromorphic Research
Mostly focused on device-level engineering. Performance metrics and desired targets:
- Device dimension: < 10 nm
- Number of multilevel states: > 100 *, with a linear, symmetric update
- Energy consumption: < 10 fJ per programming pulse
- Dynamic range: > 100 *
- Retention: > 10 years *
- Endurance: > 10^9 updates *
Note: * these numbers are application-dependent.
A few array-level demos with simple pattern classification:
- UCSB's 12×12 crossbar array with memristors (Nature 2015)
- IBM's 256×256 1T1R array with PCM (IEDM 2015)
- ASU's 12×12 crossbar array with multilevel RRAM (EDL 2016)
- ASU-Tsinghua's 400×400 1T1R array with binary RRAM (IEDM 2016)
6 Cross-point Architecture for Accelerating Weighted Sum and Weight Update
- Weighted sum (inference): all cells are activated in parallel, and summing the column currents performs a vector-matrix multiplication.
- Weight update (training): each cell's conductance can be updated by applying programming voltages from the row and the column at the same time.
Task and operation:
- W·X (weighted sum): $I_i = \sum_j G_{ij} V_j$
- W update: $\Delta G_{ij} = \eta V_i V_j$
The computation is analog inside the array and may need an ADC at the edge of the array.
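The two operations on this slide can be sketched in a few lines of plain Python. This is only an illustrative model, not the chip's circuit behavior: the function names, the list-of-lists layout of `G`, and the clipping bounds `g_min`/`g_max` are assumptions added here.

```python
def weighted_sum(G, V):
    """Weighted sum I_i = sum_j G[i][j] * V[j] (vector-matrix multiplication)."""
    return [sum(g * v for g, v in zip(row, V)) for row in G]


def outer_product_update(G, v_i, v_j, eta, g_min=0.0, g_max=10.0):
    """Weight update G_ij += eta * v_i[i] * v_j[j].

    Clipping to [g_min, g_max] is an assumption of this sketch; it stands in
    for the finite conductance range of a real device.
    """
    return [
        [min(g_max, max(g_min, g + eta * vi * vj)) for g, vj in zip(row, v_j)]
        for row, vi in zip(G, v_i)
    ]


G = [[1.0, 2.0], [3.0, 4.0]]
currents = weighted_sum(G, [0.5, 0.25])
G_new = outer_product_update(G, v_i=[1.0, 0.0], v_j=[0.5, 0.25], eta=0.5)
```

In the array both loops happen in parallel: every cell contributes its current at once, and every cell sees its own row/column voltage pair during programming.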
7 Binary RRAM and Analog RRAM Synaptic Devices
Figures: current (A) vs. voltage (V) curves for a Pt/HfO2/TiN device (abrupt set, gradual reset) and a Ta/TaOx/TiO2/Ti device (gradual set, gradual reset; T.-H. Hou's group, NCTU, Taiwan).
- Binary synapses: conventional filamentary switching RRAM with abrupt set and gradual reset. Multilevel states are achievable in the reset, so it could be used for offline training.
- Analog synapses: special interfacial switching RRAM with both smooth set and smooth reset, attractive for online training.
8 Realistic Analog Device's Weight Update Behaviors
- Nonlinearity in weight update
- Device variations
- Non-zero off-state conductance
How would these non-ideal effects impact learning accuracy?
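To make "nonlinearity in weight update" and "non-zero off-state conductance" concrete, here is a minimal sketch using a saturating-exponential conductance curve. The slides do not give a formula, so this functional form, the parameter `a`, and the function names are assumptions chosen for illustration only.

```python
import math


def conductance_after_pulses(p, p_max, g_min, g_max, a):
    """Conductance after p identical potentiation pulses (assumed model).

    a is a hypothetical nonlinearity parameter in units of pulses:
    the smaller it is, the earlier the curve saturates.
    g_min > 0 models a non-zero off-state conductance.
    """
    b = (g_max - g_min) / (1.0 - math.exp(-p_max / a))
    return g_min + b * (1.0 - math.exp(-p / a))


def linear_conductance(p, p_max, g_min, g_max):
    """Ideal linear update, for comparison."""
    return g_min + (g_max - g_min) * p / p_max


# Halfway through the pulse train the nonlinear device is already
# well above the linear target, so equal pulses give unequal steps.
nonlinear_mid = conductance_after_pulses(32, 64, 1.0, 10.0, 16.0)
linear_mid = linear_conductance(32, 64, 1.0, 10.0)
```

With such a curve the same programming pulse changes the weight by a different amount depending on the current state, which is one way the gradient information can be distorted during online learning.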
9 NeuroSim: A Simulator from Device to Algorithm
Algorithm level:
- Parameters: network size, learning rate, thresholding value, etc.
- Input data: MNIST
- Network: input layer, hidden layer, output layer
- Key operations: feed forward (weighted sum) and back propagation (weight update)
Circuit level:
- Hidden layer: synapse array, read peripheral, thresholding circuit and buffer
- Output layer: synapse array, read peripheral, output buffer
- Array types: true crossbar array, pseudo-crossbar array, and 6T SRAM array with n SRAM cells as a synapse (figure labels: WL, BL, SL, BLB, synapse, interconnects)
Device level:
- NVM device model (digital RRAM, analog RRAM). Device parameters: cell height and width, maximum and minimum conductance, read/write voltage and pulse width. Non-ideal properties: nonlinear weight update with a finite number of states (conductance vs. number of pulses), and variations (device-to-device and cycle-to-cycle weight update variation, and read noise).
- SRAM device model. Device parameters: cell height and width, transistor width, sensing voltage, read/write latency and energy.
Input: network structure, training/testing traces, array type and technology node, device type and non-ideal factors.
Output: area, latency, energy, accuracy.
10 Model Calibration (Latency, Energy, Leakage)
Benchmark at 45 nm with the PTM model.
11 Model Calibration (Area)
Layout using the 45 nm NanGate PDK. Blocks shown in the layout: pseudo-crossbar array (256×256) with eNVM cells, SL switch matrix, BL switch matrix, WL decoder, ADC, mux with decoder, adder, shift register (figure labels include 0.18 µm dimensions).
Layout area: …E+04 µm^2. Model area: …E+04 µm^2.
12 Impact of Weight Precision and Weight Update Nonlinearity in Analog Synapses
- A multilayer perceptron (MLP) network is used for benchmarking.
- At least 6 bits are required for online learning on the MNIST dataset, while 1 or 2 bits may work for offline classification.
- Nonlinearity significantly degrades the accuracy of online learning with analog synapses.
13 Benchmark of Reported Analog Resistive Synapses
Columns: reported analog eNVMs for learning (PCMO, Ag:a-Si, TaOx/TiO2, AlOx/HfO2) and desired analog eNVMs for learning (targeted eNVM, ideal eNVM). Values are listed in that column order.
- # of bits: …
- Nonlinearity (weight increase/decrease): 3.25/…, …/…, …/0.72, 3/1, 1/1, 0/0
- R_ON: 23 MΩ, 26 MΩ, 5 MΩ, 16.9 kΩ, 200 kΩ, 200 kΩ
- ON/OFF ratio: …
- Weight update cycle-to-cycle variation (σ): <1%, 3.5%, <1%, 5%, 2%, 0%
- Accuracy for online learning: 10%, ~75%, ~10%, ~10%, ~90%, ~94.8%
- Accuracy for offline classification: ~13%, ~51%, ~10%, ~10%, ~94.5%, ~94.5%
Green: good attributes. Red: major cause of learning failure.
14 Outline
- Challenges of Analog Synapses and Why We Need to Binarize the Neural Network
- Binary Neural Network and Its Implementation on Tsinghua's 16 Mb RRAM Macro Chip
- Benchmark of Binary and Analog Synapses
- Summary
15 Binary Neural Network (BNN)
Precision reduction to ternary weights (+1, 0, -1) and binary neurons for propagation:
- Ternary weights are used for both feedforward inference and backpropagation of errors.
- Higher precision (e.g. 8-bit) is kept for the weight update only (because ΔW is small): n-bit gradient descent for the weight update.
- Dataset: MNIST. Network: input layer, hidden layer, output layer.
Figure: accuracy [%] vs. training epoch, panels (a), (b), (c), with individual and averaged (Avg) curves for: all floating point; 8-bit weight and neuron; 8-bit weight and 1-bit neuron; ternary weight and 1-bit neuron.
This follows recent trends in machine/deep learning, e.g. BinaryNet and XNOR-Net.
S. Yu, et al., IEDM
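A minimal sketch of the scheme on this slide: ternary weights and 1-bit neurons for propagation, with a higher-precision copy of each weight kept only for accumulating small updates. The ternarization threshold of 0.5, the 0/1 neuron coding, the clipping range, and all function names are assumptions of this sketch, not values from the talk.

```python
def ternarize(w, threshold=0.5):
    """Map a high-precision weight to +1, 0, or -1 (threshold is assumed)."""
    if w > threshold:
        return 1
    if w < -threshold:
        return -1
    return 0


def binary_neuron(x):
    """1-bit activation; the 0/1 coding is an assumption of this sketch."""
    return 1 if x > 0 else 0


def forward(w_hp, inputs):
    """Feedforward using ternarized weights; w_hp[k] holds neuron k's weights."""
    return [
        binary_neuron(sum(ternarize(w) * x for w, x in zip(row, inputs)))
        for row in w_hp
    ]


def update(w_hp, grads, lr):
    """Gradient step on the high-precision copy, clipped to [-1, 1]."""
    return [
        [max(-1.0, min(1.0, w - lr * g)) for w, g in zip(w_row, g_row)]
        for w_row, g_row in zip(w_hp, grads)
    ]


w_hp = [[0.25]]
before = ternarize(w_hp[0][0])          # below threshold
w_hp = update(w_hp, [[-1.0]], lr=0.5)   # the small step accumulates in w_hp
after = ternarize(w_hp[0][0])           # now crosses the threshold
```

The last three lines show why the extra precision matters: a single update is too small to change the ternary value, but it is not lost, and a later update can push the weight across the threshold.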
16 16 Mb Macro Chip (Tsinghua)
Chip designed and fabricated by Huaqiang Wu's group at Tsinghua University.
Floorplan (figure labels): blocks Block0 to Block15 (512×1024 where labeled), 3-stage gates, 8-bit data output buses (Dobus01, Dobus23, Dobus45, Dobus67, Dobus89, Dobus1011, Dobus1213, Dobus1415, each <0:7>), analog and digital sections, data buffer, Dout <0:7>, I/O.
Chip parameters:
- Capacity: 16 Mb
- Tech node: 130 nm
- V_DD_Digital: 1.8 V
- V_DD_Analog: 5 V
- V_WL_SET: 2-5 V / 50 ns
- V_BL_SET: 2-3 V / 50 ns
- V_WL_RESET: … V / 50 ns
- V_SL_RESET: 2-3 V / 50 ns
- I/O width: 8
17 RRAM Stack and Endurance of RRAM
- HfOx-based RRAM cell integrated between M4 and M5 on top of CMOS (figure labels: 54.3 nm, 9.1 nm)
- Measured endurance: ~1E6 cycles
Courtesy of Huaqiang Wu (Tsinghua University)
18 Implementation of BNN on 16 Mb RRAM Chip for Offline Classification
Network topology:
- W1-2 (400×400): row inputs from the input images, column weighted sums for the hidden layer, followed by subtraction and activation.
- W2-3 (200×20): 200 row inputs, 20 column weighted sums for the output.
Experiment data: programmed weight matrix pattern on 1 block of the 16 Mb chip (with a zoom-in of W1-2). Errors (in red) occur; bit yield is ~99%.
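One plausible reading of the topology above is that each signed weight occupies a pair of binary columns whose weighted sums are subtracted before the activation; this would be consistent with 400 columns in W1-2 feeding 200 row inputs of W2-3. The sketch below illustrates that reading only. The pair encoding (+1 as (1, 0), -1 as (0, 1), 0 as (0, 0)), the adjacent-column placement, and the function names are assumptions, not details confirmed by the slides.

```python
def encode_ternary(W):
    """Encode ternary weights as pairs of binary cells (assumed mapping).

    W[j][k] is the weight from input row j to neuron k.
    Returns a cell array with 2 columns per neuron: (positive, negative).
    """
    pairs = {1: (1, 0), -1: (0, 1), 0: (0, 0)}
    return [[bit for w in row for bit in pairs[w]] for row in W]


def column_sums(cells, x):
    """Binary weighted sum of every column for a binary input vector x."""
    n_cols = len(cells[0])
    return [sum(cells[j][c] * x[j] for j in range(len(cells))) for c in range(n_cols)]


def subtract_and_activate(sums):
    """Subtract each negative column from its positive partner, then threshold."""
    diffs = [sums[c] - sums[c + 1] for c in range(0, len(sums), 2)]
    return diffs, [1 if d > 0 else 0 for d in diffs]


W = [[1, -1], [0, 1], [-1, -1]]  # 3 inputs, 2 neurons
cells = encode_ternary(W)        # 3 rows, 4 binary columns
diffs, hidden = subtract_and_activate(column_sums(cells, [1, 1, 0]))
```

The subtraction recovers exactly the signed weighted sum that the ternary weights would give, while every stored cell remains a single binary bit.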
19 Impact of RRAM Finite Bit Yield for Classification
Figures: classification accuracy [%] vs. training epoch (ideal software, BNN with 1-bit classification) and classification accuracy [%] vs. RRAM bit yield [%] (BNN with 1-bit classification).
- The software baseline with high-precision classification has an accuracy of ~97%.
- BNN with 1-bit classification (with sign) has an accuracy of ~96.3%.
- For the MNIST dataset, a 99% bit yield is sufficient to maintain ~96.3%.
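A bit-yield experiment of this kind can be emulated in software by corrupting a programmed binary weight pattern before running classification. The sketch below assumes that a failed bit reads back as the opposite value and that failures are independent; the slides do not state the failure model, and the function names are hypothetical.

```python
import random


def apply_bit_yield(cells, bit_yield, rng):
    """Return a copy of a binary cell array where each cell fails with
    probability 1 - bit_yield. A failed cell is flipped (assumed model)."""
    return [
        [(1 - c) if rng.random() >= bit_yield else c for c in row]
        for row in cells
    ]


def failed_fraction(original, corrupted):
    """Fraction of cells whose value differs from the programmed value."""
    total = sum(len(row) for row in original)
    wrong = sum(
        a != b
        for row_a, row_b in zip(original, corrupted)
        for a, b in zip(row_a, row_b)
    )
    return wrong / total


rng = random.Random(0)                      # fixed seed for repeatability
programmed = [[0] * 100 for _ in range(100)]
readback = apply_bit_yield(programmed, 0.99, rng)
fraction = failed_fraction(programmed, readback)
```

Sweeping `bit_yield` and re-running inference on the corrupted weights would produce an accuracy vs. bit yield curve of the type shown on the slide.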
20 Precision Reduction for Training
Online training needs higher precision than offline classification, because small errors must be accumulated during backpropagation.
Figure: accuracy [%] vs. precision [bit] for BNN with online training (annotation: >96% accuracy).
Figure (b), binary synapses: 1T1R eNVM array (WL, BL, SL) with WL decoder, column decoder, decoder driver, mux and mux decoder, VSAs, adders, registers, and shift registers.
6-bit precision is needed for the MNIST dataset, so 6 binary RRAM cells are grouped to implement one synapse.
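Grouping 6 binary cells into one synapse means the weighted sum is computed one bit plane at a time and recombined by shifting and adding, which is presumably the role of the adders and shift registers in the figure. The following sketch shows the arithmetic under simplifying assumptions: unsigned 6-bit weight magnitudes (a sign bit would be handled separately), binary inputs, and hypothetical function names.

```python
def to_bits(w, n_bits=6):
    """Split an unsigned integer weight into n_bits binary cells, LSB first."""
    if not 0 <= w < (1 << n_bits):
        raise ValueError("weight does not fit in n_bits")
    return [(w >> b) & 1 for b in range(n_bits)]


def bit_sliced_weighted_sum(weights, inputs, n_bits=6):
    """Weighted sum computed per bit plane, then combined by shift-and-add."""
    planes = [to_bits(w, n_bits) for w in weights]
    total = 0
    for b in range(n_bits):
        # Column sum over the cells that store bit b of every weight.
        column_sum = sum(plane[b] * x for plane, x in zip(planes, inputs))
        # Shift by the bit position and accumulate.
        total += column_sum << b
    return total


weights = [5, 63, 12]
inputs = [1, 0, 1]
result = bit_sliced_weighted_sum(weights, inputs)
direct = sum(w * x for w, x in zip(weights, inputs))
```

The bit-sliced result equals the direct dot product, so precision is preserved even though every physical cell stores only one bit. The cost is 6 columns per synapse, which matches the later set-up table where the binary arrays have 6 times as many columns as the analog ones (400×600 vs. 400×100).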
21 Distribution of RRAM Updates During Training
Figures: CDF [%] vs. number of switching cycles during online training, for W1-2 and W2-3, with curves for the sign bit and the individual bits from LSB to MSB, and the endurance limit marked.
- Most cells are updated fewer times than the endurance limit (10^4 cycles).
- The LSB is updated more often than the MSB, and W2-3 is updated more often than W1-2.
22 Impact of RRAM Finite Endurance on Training
Figures: accuracy [%] vs. training epoch for endurance limits of 1e3, 3e3, 5e3, 8e3, and 1e4 cycles.
- Lower endurance results in a lower peak accuracy.
- With 10^4 cycles, ~96.9% accuracy is achievable for online training.
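Finite endurance can be emulated in a training simulation by counting how often each cell switches and refusing further changes once the limit is reached. The class below is a minimal sketch under two assumptions not stated on the slides: a worn-out cell stays stuck at its last value, and only actual state changes consume endurance.

```python
class EnduranceLimitedCell:
    """Binary cell that stops switching after `endurance` state changes."""

    def __init__(self, endurance, state=0):
        self.endurance = endurance
        self.state = state
        self.cycles = 0  # number of switching events so far

    def write(self, bit):
        """Try to store `bit`; return True if the cell now holds it."""
        if bit == self.state:
            return True           # no switching needed, no wear
        if self.cycles >= self.endurance:
            return False          # worn out: stuck at the last state
        self.state = bit
        self.cycles += 1
        return True


cell = EnduranceLimitedCell(endurance=3)
results = [cell.write(b) for b in (1, 0, 1, 0)]
```

After the third switch the cell no longer follows the requested value, so later weight updates that target it are silently lost, which is one way the peak accuracy could drop as the endurance limit is lowered.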
23 Outline
- Challenges of Analog Synapses and Why We Need to Binarize the Neural Network
- Binary Neural Network and Its Implementation on Tsinghua's 16 Mb RRAM Macro Chip
- Benchmark of Binary and Analog Synapses
- Summary
24 NeuroSim Simulation Set-up for Analog and Binary Synapses
Parameter: analog synapse | binary synapse
- # bits: 6 | 6
- Nonlinearity (weight increase/weight decrease): 0.72/… (analog synapse)
- R_ON: 200 kΩ | 200 kΩ
- ON/OFF ratio: …
- Read voltage: 0.5 V | 0.5 V
- Write voltage: 2 V (for both weight increase and decrease) | 2 V
- Write pulse width: 100 ns per pulse | 100 ns
- Resistance of access transistor in 1T1R: 10 kΩ | 10 kΩ
- Read noise: 2.89% | --
- Array type: pseudo-crossbar | traditional 1T1R
- Array size: 400×100 and 100×10 | 400×600 and 100×60
- Tech node: 14 nm | 14 nm
- Wire width: 40 nm | 40 nm
25 Benchmark Results of Analog and Binary Synapses
Metric: analog synapse | binary synapse
- Accuracy: 82.17% | 94.03%
- Area: … µm^2 | … µm^2
- Total feed-forward latency: …e-01 s | …e+00 s
- Total weight update latency: …e+05 s | …e+03 s
- Total feed-forward energy: …e-04 J | …e-03 J
- Total weight update energy: …e+00 J | …e+00 J
- Leakage: … µW | … µW
Binary synapses could be a near-term solution, while perfect analog synapses could bring many benefits in the long run.
26 Summary
- Today's RRAM technology (even binary) can support offline classification with low-power, fast, and accurate recognition.
- For online training, analog synapses with continuous weights need to overcome grand challenges such as nonlinear weight update and slow programming speed (as multiple pulses are needed to tune the weights).
- Binarizing the neural network with low-precision weights allows today's binary RRAM to be used for online training with high accuracy, and it shows good resilience to limited yield and endurance, as shown in our demonstration on the 16 Mb RRAM chip.
- Trade-offs exist between binary and analog synapse implementations: binary synapses are good for high accuracy and fast training speed, but come with overhead in chip area and dynamic energy.
27 Acknowledgement
- Students: Pai-Yu Chen, Zhiwei Li
- Collaborator: Huaqiang Wu, Tsinghua University
- NSF-CCF, CAREER: Scaling-up Resistive Synaptic Arrays for Neuro-inspired Computing
More informationDESIGN OF PARALLEL MULTIPLIERS USING HIGH SPEED ADDER
DESIGN OF PARALLEL MULTIPLIERS USING HIGH SPEED ADDER Mr. M. Prakash Mr. S. Karthick Ms. C Suba PG Scholar, Department of ECE, BannariAmman Institute of Technology, Sathyamangalam, T.N, India 1, 3 Assistant
More informationA Survey on A High Performance Approximate Adder And Two High Performance Approximate Multipliers
IOSR Journal of Business and Management (IOSR-JBM) e-issn: 2278-487X, p-issn: 2319-7668 PP 43-50 www.iosrjournals.org A Survey on A High Performance Approximate Adder And Two High Performance Approximate
More informationJohn Lazzaro and John Wawrzynek Computer Science Division UC Berkeley Berkeley, CA, 94720
LOW-POWER SILICON NEURONS, AXONS, AND SYNAPSES John Lazzaro and John Wawrzynek Computer Science Division UC Berkeley Berkeley, CA, 94720 Power consumption is the dominant design issue for battery-powered
More informationJDT EFFECTIVE METHOD FOR IMPLEMENTATION OF WALLACE TREE MULTIPLIER USING FAST ADDERS
JDT-002-2013 EFFECTIVE METHOD FOR IMPLEMENTATION OF WALLACE TREE MULTIPLIER USING FAST ADDERS E. Prakash 1, R. Raju 2, Dr.R. Varatharajan 3 1 PG Student, Department of Electronics and Communication Engineeering
More informationHfO 2 Based Resistive Switching Non-Volatile Memory (RRAM) and Its Potential for Embedded Applications
2012 International Conference on Solid-State and Integrated Circuit (ICSIC 2012) IPCSIT vol. 32 (2012) (2012) IACSIT Press, Singapore HfO 2 Based Resistive Switching Non-Volatile Memory (RRAM) and Its
More informationI DDQ Current Testing
I DDQ Current Testing Motivation Early 99 s Fabrication Line had 5 to defects per million (dpm) chips IBM wanted to get 3.4 defects per million (dpm) chips Conventional way to reduce defects: Increasing
More informationAssoc. Prof. Dr. Burak Kelleci
DEPARTMENT OF ELECTRICAL &ELECTRONICS ENGINEERING ANALOG-TO-DIGITAL AND DIGITAL- TO-ANALOG CONVERTERS Assoc. Prof. Dr. Burak Kelleci Fall 2018 OUTLINE Nyquist-Rate DAC Thermometer-Code Converter Hybrid
More informationA Foveated Visual Tracking Chip
TP 2.1: A Foveated Visual Tracking Chip Ralph Etienne-Cummings¹, ², Jan Van der Spiegel¹, ³, Paul Mueller¹, Mao-zhu Zhang¹ ¹Corticon Inc., Philadelphia, PA ²Department of Electrical Engineering, Southern
More informationTeam VeryLargeScaleEngineers Robert Costanzo Michael Recachinas Hector Soto. High Speed 64kb SRAM. ECE 4332 Fall 2013
Team VeryLargeScaleEngineers Robert Costanzo Michael Recachinas Hector Soto High Speed 64kb SRAM ECE 4332 Fall 2013 Outline Problem Design Approach & Choices Circuit Block Architecture Novelties Layout
More informationAnalog Axon Hillock Neuron Design for Memristive Neuromorphic Systems
University of Tennessee, Knoxville Trace: Tennessee Research and Creative Exchange Masters Theses Graduate School 12-2017 Analog Axon Hillock Neuron Design for Memristive Neuromorphic Systems Ryan John
More information/14/$ IEEE 63
Reduction and IR-drop Compensations Techniques for Reliable Neuromorphic Computing Systems Beiye Liu 1, Hai Li 6 Yiran Chen 7 Xin Li 2 Tingwen Huang 3 Qing Wu 4, Mark Barnell 5 Department of Electrical
More informationColumn-Parallel Architecture for Line-of-Sight Detection Image Sensor Based on Centroid Calculation
ITE Trans. on MTA Vol. 2, No. 2, pp. 161-166 (2014) Copyright 2014 by ITE Transactions on Media Technology and Applications (MTA) Column-Parallel Architecture for Line-of-Sight Detection Image Sensor Based
More informationCMOL CrossNets as Pattern Classifiers
CMOL CrossNets as Pattern Classifiers Jung Hoon Lee and Konstantin K. Likharev Stony Brook University, Stony Brook, NY 11794-3800, U.S.A {jlee@grad.physics, klikharev@notes.cc}sunysb.edu Abstract. This
More informationA new 6-T multiplexer based full-adder for low power and leakage current optimization
A new 6-T multiplexer based full-adder for low power and leakage current optimization G. Ramana Murthy a), C. Senthilpari, P. Velrajkumar, and T. S. Lim Faculty of Engineering and Technology, Multimedia
More informationWhite Paper Kilopass X2Bit bitcell: OTP Dynamic Power Cut by Factor of 10
White Paper Kilopass X2Bit bitcell: OTP Dynamic Power Cut by Factor of 10 November 2015 Of the challenges being addressed by Internet of Things (IoT) designers around the globe, none is more pressing than
More informationThe High-Voltage Monolithic Active Pixel Sensor for the Mu3e Experiment
The High-Voltage Monolithic Active Pixel Sensor for the Mu3e Experiment Shruti Shrestha On Behalf of the Mu3e Collaboration International Conference on Technology and Instrumentation in Particle Physics
More informationReference. Wayne Wolf, FPGA-Based System Design Pearson Education, N Krishna Prakash,, Amrita School of Engineering
FPGA Fabrics Reference Wayne Wolf, FPGA-Based System Design Pearson Education, 2004 CPLD / FPGA CPLD Interconnection of several PLD blocks with Programmable interconnect on a single chip Logic blocks executes
More informationChapter 3 Novel Digital-to-Analog Converter with Gamma Correction for On-Panel Data Driver
Chapter 3 Novel Digital-to-Analog Converter with Gamma Correction for On-Panel Data Driver 3.1 INTRODUCTION As last chapter description, we know that there is a nonlinearity relationship between luminance
More informationPower Optimization of FPGA Interconnect Via Circuit and CAD Techniques
Power Optimization of FPGA Interconnect Via Circuit and CAD Techniques Safeen Huda and Jason Anderson International Symposium on Physical Design Santa Rosa, CA, April 6, 2016 1 Motivation FPGA power increasingly
More informationFigure 1. Artificial Neural Network structure. B. Spiking Neural Networks Spiking Neural networks (SNNs) fall into the third generation of neural netw
Review Analysis of Pattern Recognition by Neural Network Soni Chaturvedi A.A.Khurshid Meftah Boudjelal Electronics & Comm Engg Electronics & Comm Engg Dept. of Computer Science P.I.E.T, Nagpur RCOEM, Nagpur
More informationBy Dayadi Lakshmaiah, Dr. M. V. Subramanyam & Dr. K. Satya Prasad Jawaharlal Nehru Technological University, India
Global Journal of Researches in Engineering: F Electrical and Electronics Engineering Volume 14 Issue 9 Version 1.0 Type: Double Blind Peer Reviewed International Research Journal Publisher: Global Journals
More informationTHE content-addressable memory (CAM) is one of the most
254 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 40, NO. 1, JANUARY 2005 A 0.7-fJ/Bit/Search 2.2-ns Search Time Hybrid-Type TCAM Architecture Sungdae Choi, Kyomin Sohn, and Hoi-Jun Yoo Abstract This paper
More informationRRAM for Future Memory and Computing Applications
RRAM for Future Memory and Computing Applications Ming Liu Key Lab. of Microelectronic Devices &Integrated Technology, (CAS) Institute of Microelectronics, CAS Macao University, July7.2018 Outline 2 Computing
More informationA 32 Gbps 2048-bit 10GBASE-T Ethernet Energy Efficient LDPC Decoder with Split-Row Threshold Decoding Method
A 32 Gbps 248-bit GBASE-T Ethernet Energy Efficient LDPC Decoder with Split-Row Threshold Decoding Method Tinoosh Mohsenin and Bevan M. Baas VLSI Computation Lab, ECE Department University of California,
More information