A 1Mjot 1040fps 0.22e-rms Stacked BSI Quanta Image Sensor with Cluster-Parallel Readout
IISW 2017, Hiroshima, Japan
Saleh Masoodian, Jiaju Ma, Dakota Starkey, Yuichiro Yamashita, Eric R. Fossum
May 2017

Challenges to Realize Quanta Image Sensor (QIS)
1. Realizing jots (tiny pixels)
   - Small pitch (< 1um)
   - Low read noise (< 0.25e- r.m.s.)
   - Photon counting at room temperature
   - High fill factor and QE
   - Low dark count (< 1Hz)
   - Compatible with commercial CMOS fabrication line process (low-cost, system on a chip)
2. Readout circuits (focus of this work)
   - High speed (> 1000 frames/s)
   - Low power (< 1pJ/b)
   - Low noise (< 0.1e- r.m.s.)
3. On-chip image processing to reduce the output data rate

Measured Results of Tapered-Reset Pump-Gate Jots
[Figures]
- Photon counting histogram (PCH) of a pixel with 0.18e- r.m.s. read noise
- Conversion gain (CG) variation of 16k pixels
- Read noise variation of 16 pixels

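How read noise shapes a photon counting histogram can be illustrated with a small model. This is a minimal sketch, not the authors' analysis: it assumes Poisson-distributed photoelectrons and Gaussian read noise, and the helper names, the mean exposure of 1 e-, and the half-electron threshold margin are all assumptions made for illustration.

```python
import math


def poisson_pmf(k, mean):
    """Probability of collecting exactly k photoelectrons for a given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def gaussian_pdf(x, mu, sigma):
    """Gaussian density used here to model read noise (an assumption)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def pch_density(u, exposure, read_noise, k_max=30):
    """Model PCH: Poisson-weighted sum of Gaussian peaks at integer electrons.

    u          -- readout signal in electrons
    exposure   -- mean photoelectrons per jot per read (hypothetical value)
    read_noise -- read noise in e- r.m.s.
    """
    return sum(
        poisson_pmf(k, exposure) * gaussian_pdf(u, k, read_noise)
        for k in range(k_max + 1)
    )


def threshold_error_rate(read_noise, margin=0.5):
    """Probability that Gaussian noise pushes a signal across a threshold
    that sits `margin` electrons away (0.5 e- is an assumed margin)."""
    return 0.5 * math.erfc(margin / (read_noise * math.sqrt(2.0)))


# Peak at 1 e- versus the valley at 0.5 e- for 0.18 e- r.m.s. read noise.
peak = pch_density(1.0, exposure=1.0, read_noise=0.18)
valley = pch_density(0.5, exposure=1.0, read_noise=0.18)
print(f"peak/valley ratio: {peak / valley:.1f}")

for sigma in (0.18, 0.22):
    print(f"read noise {sigma} e-: error rate {threshold_error_rate(sigma):.4f}")
```

Under these assumptions the peaks at integer electron counts are clearly separated at 0.18e- r.m.s., and the modeled error rate at 0.22e- r.m.s. comes out on the order of 1%, in line with the BER quoted in the summary of measured results.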
High-Speed and Low-Power Readout Circuits
- Challenge #1 has been addressed by the invention of the photon-counting tapered-reset-gate pump-gate jot
- The ultimate goal is to implement these pixels in an imager, designing and implementing a QIS with:
   - 0.1 to 1 billion pixels
   - 1000 frames/s

Challenge: High-Speed and Low-Power Readout Circuits
- Jot array (16:9 ratio): 24,000 (V) x 42,000 (H)
- Power consumption budget:
   - Total: 5W
   - Bias for in-jot source-follower amplifiers: 2.5W
   - Sense amplifiers & ADCs: 1W
   - Primary image processing: 1W
   - Fast digital output pads: 0.5W

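The budget above can be turned into an energy-per-bit figure with a few lines of arithmetic. This is a rough sketch that assumes one bit per jot per frame (single-bit readout) and the 1000 frames/s target; the dictionary keys and function names are illustrative only.

```python
ROWS = 24_000        # jot array height (V)
COLS = 42_000        # jot array width (H)
FRAME_RATE = 1000    # frames/s, the stated target

# Power budget from the slide, in watts.
BUDGET_W = {
    "in-jot source-follower bias": 2.5,
    "sense amplifiers and ADCs": 1.0,
    "primary image processing": 1.0,
    "fast digital output pads": 0.5,
}


def bit_rate(rows, cols, fps):
    """Bits per second, assuming one bit per jot per frame."""
    return rows * cols * fps


def energy_per_bit_pj(power_w, rows, cols, fps):
    """Energy per bit in picojoules for a given power draw."""
    return power_w / bit_rate(rows, cols, fps) * 1e12


total_w = sum(BUDGET_W.values())
print(f"jots: {ROWS * COLS:,}")
print(f"total budget: {total_w} W")
for block, power in BUDGET_W.items():
    pj = energy_per_bit_pj(power, ROWS, COLS, FRAME_RATE)
    print(f"{block}: {pj:.2f} pJ/b")
```

With these numbers the 1W allotted to sense amplifiers and ADCs works out to just under 1pJ/b, which matches the low-power readout target listed among the challenges.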
Bias In-Jot Source-Follower Amplifiers
- Column-parallel readout structure: 42,000 columns, 24,000 rows of jots
- Bandwidth is limited (huge parasitic capacitance and resistance on each column)
- To increase bandwidth, the bias current must be increased, which increases power significantly
- Solution: use a stacked (3D) CMOS process to reduce the length of the columns and the parasitic capacitance on each column
[Schematic: one column of jots from the 1st jot to the last jot (index L), each with RST, TG, FD, SF and RS, supplied from VDD; column parasitics R_P and C_P; current-source transistor M_CS biased by V_bias; column output to the readout circuits]

Stacked Process to Increase Bandwidth and Reduce Power
- More than one stacked wafer
- A group of jots forms a cluster
- The readout circuits of each jot cluster are located underneath that cluster
- Clusters operate in parallel
- Column line length is reduced, so parasitics are reduced

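A first-order feel for the benefit can be had from simple scaling. The sketch below is an illustration only and rests on two assumptions that are not stated on the slides: the column parasitic capacitance grows linearly with the number of rows on the column, and the bias current needed for a fixed settling time grows linearly with that capacitance. The cluster height of 64 rows is a hypothetical value chosen for the example.

```python
def column_shortening(rows_full, rows_cluster):
    """Factor by which a column shortens when it spans one cluster
    instead of the full array height."""
    return rows_full / rows_cluster


FULL_COLUMN_ROWS = 24_000  # rows on a column in the column-parallel array
CLUSTER_ROWS = 64          # hypothetical cluster height, illustration only

shortening = column_shortening(FULL_COLUMN_ROWS, CLUSTER_ROWS)

# Assumption: C_P scales with column length, and the bias current needed
# for a fixed settling time scales with C_P.
relative_bias_current = 1.0 / shortening

print(f"column shortened by {shortening:.0f}x")
print(f"relative bias current per column: {relative_bias_current:.4f}")
```

Real columns also carry resistance and fixed capacitance at the readout input, so the actual saving would differ from this linear estimate.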
Charge-Transfer Amplifier and D-Latch (From 2015 IISW Work)
- Gain stage
   - Used in PGAs
   - Used in ADCs to reduce the ADC offset
   - Continuous-time amplifiers consume too much static power
   - Solution: charge-transfer amplification (CTA) technique
- Comparator
   - Used in the ADC
   - D-latched comparator
[Schematic labels: VREF, VDD, inputs in1 and in2, outputs out1 and out2, transistors M1 to M4, switches S1 to S3]
[Timing diagram labels: S1, S2, S3; Reset, Sample and Latch phases; period T]

Prototype of a 1Mjot and 1040fps QIS
- 20 1Mjot arrays on a chip
- Pump-gate jots with tapered reset-gate
- Stacked (3D) process
- Charge-transfer amplification technique
- 45nm detector substrate
- 65nm ASIC substrate
- 16x16 clusters
- A cluster has 4096 jots
- CDS units, a CTA and one 1-bit ADC per readout cluster
[Diagram labels: detector substrate, 1126.4um x 1126.4um, 16x16 = 256 clusters with 4096 jots in each cluster; addressing; ASIC substrate, 16x16 = 256 readout clusters with 8 CDS units and a 1b-ADC in each readout cluster; high-speed digital pads]

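The cluster figures are mutually consistent, which a short check makes visible. This sketch uses the 1.1um jot pitch reported in the summary of measured results and assumes a square array in which each 1-bit ADC converts every jot of its cluster once per field; the function and key names are illustrative.

```python
import math

CLUSTERS_PER_SIDE = 16
JOTS_PER_CLUSTER = 4096
JOT_PITCH_UM = 1.1       # from the summary of measured results
FIELD_RATE_FPS = 1040


def prototype_geometry():
    """Derive array-level numbers from the per-cluster figures."""
    clusters = CLUSTERS_PER_SIDE ** 2
    total_jots = clusters * JOTS_PER_CLUSTER
    jots_per_side = math.isqrt(total_jots)  # assumes a square array
    return {
        "clusters": clusters,
        "total_jots": total_jots,
        "jots_per_side": jots_per_side,
        "array_side_um": jots_per_side * JOT_PITCH_UM,
        # Assumption: one ADC serves all jots of a cluster every field.
        "adc_samples_per_s": JOTS_PER_CLUSTER * FIELD_RATE_FPS,
    }


geometry = prototype_geometry()
for name, value in geometry.items():
    print(f"{name}: {value}")
```

The derived side length agrees with the 1126.4um label on the diagram, and the per-ADC rate of roughly 4.26 million samples per second is in line with the 4MSa/s ADC sampling rate quoted in the summary.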
Schematic of a Jot and Readout Cluster
[Schematic figure]

1Mjot Prototype QIS Experimental Results
[Images: target scene; 1Mpixel QIS photon-counting binary image operating at 1040fps]

Summary of Measured Results
Process                       45nm (jot layer), 65nm (ASIC layer)
VDD                           1.8V & 2.5V (analog, digital and array), 3V & 2.2V (I/O pads)
Jot type                      BSI tapered pump gate, 2-way shared RO
Jot pitch                     1.1µm
BSI fill factor               ~100%
Quantum efficiency            79% @ 550nm
Conversion gain on column     345µV/e-
Input-referred noise          0.22e- r.m.s.
Corresponding BER             ~1%
Avg. dark current (RT)        0.16e-/s
Equiv. dark count rate (RT)   0.16Hz/jot
Equiv. PD dead time           <0.1%
Array                         1024 (H) x 1024 (V)
Field rate                    1040fps
ADC sampling rate             4MSa/s
ADC resolution                1 bit
Output data rate              32 (output pins) x 34Mb/s = 1090Mb/s
Package                       PGA with 224 pins
Power: array                  2.3mW
Power: 256 ADCs               7.5mW
Power: addressing             4.1mW
Power: I/O pads               3.7mW
Power: total                  17.6mW
FOM (ADC)                     6.9pJ/b

$$\mathrm{FOM} = \frac{\text{Power Consumption}}{(\text{\# of pixels}) \times (\text{frame rate})} \quad [\mathrm{pJ/b}]$$

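The figure of merit and the output data rate in the table follow directly from the other entries. The sketch below recomputes them from the reported power breakdown, array size and field rate; the names are illustrative, and the only assumption is one output bit per jot per field.

```python
PIXELS = 1024 * 1024     # array size from the table
FIELD_RATE_FPS = 1040
OUTPUT_PINS = 32

# Measured power breakdown from the table, in milliwatts.
POWER_MW = {
    "array": 2.3,
    "256 ADCs": 7.5,
    "addressing": 4.1,
    "I/O pads": 3.7,
}


def fom_pj_per_bit(power_mw, pixels, fps):
    """FOM = power / (number of pixels x frame rate), in pJ/b."""
    return (power_mw * 1e-3) / (pixels * fps) * 1e12


total_mw = sum(POWER_MW.values())
adc_fom = fom_pj_per_bit(POWER_MW["256 ADCs"], PIXELS, FIELD_RATE_FPS)
total_fom = fom_pj_per_bit(total_mw, PIXELS, FIELD_RATE_FPS)

# One bit per jot per field (single-bit ADC).
data_rate_mbps = PIXELS * FIELD_RATE_FPS / 1e6
per_pin_mbps = data_rate_mbps / OUTPUT_PINS

print(f"total power: {total_mw:.1f} mW")
print(f"ADC FOM: {adc_fom:.2f} pJ/b, total FOM: {total_fom:.1f} pJ/b")
print(f"data rate: {data_rate_mbps:.0f} Mb/s, per pin: {per_pin_mbps:.1f} Mb/s")
```

The recomputed values round to the table entries: 17.6mW total, about 6.9pJ/b for the ADCs, and roughly 1090Mb/s spread over 32 pins at about 34Mb/s each.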
QIS Power Consumption Projection
                   IISW 2015 (past)         IISW 2017 (present)      IISW 2019 (future)
Imager type        Single-bit               Single-bit stacked QIS   Single-bit stacked QIS
Min gate length    180nm                    250nm                    65nm
Pixel/jot pitch    3.6um                    1.1um                    0.8um
Resolution         1MP (768 x 1376)         1MP (1024 x 1024)        100MP (10240 x 10240)
Frame rate         1000fps                  1040fps                  1040fps
In-pixel bias      8.6mW, 8.1pJ/b           2.3mW, 2.1pJ/b           230mW, 2.1pJ/b
Gain+ADCs          2.6mW, 2.5pJ/b           7.5mW, 6.9pJ/b           65mW, 0.6pJ/b
                   (180nm transistors)      (250nm transistors)      (65nm transistors)
Digital & I/O      8.8mW, 8.3pJ/b           7.8mW, 7.2pJ/b           780mW, 7.2pJ/b
Total power        20mW, 19pJ/b             17.6mW, 16pJ/b           1075mW, 9.9pJ/b

$$\mathrm{FOM} = \frac{\text{Power Consumption}}{(\text{\# of pixels}) \times (\text{frame rate})} \quad [\mathrm{pJ/b}]$$

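The 2019 column can be approximately reproduced by running the FOM in reverse: multiply each block's energy per bit by the projected bit rate. This is a sketch of that arithmetic, not the authors' projection method; it assumes one bit per jot per frame, and because the table's pJ/b values are rounded, the results land within about 1% of the listed milliwatt figures rather than matching them exactly.

```python
PROJECTED_PIXELS = 10240 * 10240   # 100MP target from the table
PROJECTED_FPS = 1040

# Energy per bit for each block in the 2019 column, in pJ/b.
ENERGY_PJ_PER_BIT = {
    "in-pixel bias": 2.1,
    "gain + ADCs": 0.6,
    "digital & I/O": 7.2,
}


def projected_power_mw(energy_pj_per_bit, pixels, fps):
    """Power = energy per bit x bit rate, assuming one bit per jot per frame."""
    return energy_pj_per_bit * 1e-12 * pixels * fps * 1e3


projection_mw = {
    block: projected_power_mw(pj, PROJECTED_PIXELS, PROJECTED_FPS)
    for block, pj in ENERGY_PJ_PER_BIT.items()
}
projected_total_mw = sum(projection_mw.values())

for block, mw in projection_mw.items():
    print(f"{block}: {mw:.0f} mW")
print(f"total: {projected_total_mw:.0f} mW")
```

The comparison shows where the projected saving comes from: the in-pixel and digital energy per bit are carried over unchanged from 2017, while the gain and ADC stage is assumed to drop from 6.9pJ/b to 0.6pJ/b with 65nm transistors.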
Acknowledgment
- Design work supported by Rambus, Inc.
- Characterization work supported by DARPA