Low-Power Digital Image Sensor for Still Picture Image Acquisition

Steve Tanner (a), Stefan Lauxtermann (b), Martin Waeny (b), Michel Willemin (b), Nicolas Blanc (b), Joachim Grupp (c), Rudolf Dinger (c), Elko Doering (d), Michael Ansorge (a), Peter Seitz (b) and Fausto Pellandini (a)

(a) Institute of Microtechnology, University of Neuchâtel, CH-2000 Neuchâtel, Switzerland. Tel: +41.32.718.34.30, web site: www-imt.unine.ch, e-mail: steve.tanner@unine.ch
(b) Centre Suisse d'Electronique et de Microtechnique SA, CH-8048 Zurich, Switzerland
(c) Asulab SA, and (d) EM Microelectronic-Marin SA, CH-2074 Marin, Switzerland

ABSTRACT

This article presents the design and realization of a CMOS digital image sensor optimized for button-battery powered applications (limited available peak current and limited stored energy). First, a pixel with local analog memory was designed, allowing efficient global shutter operation. The exposure time becomes independent of the readout speed, so a lower readout frequency (reduced peak supply current) can be used without causing image distortion. Second, a multipath readout architecture was developed, allowing efficient use of power in sub-sampling modes. These techniques were integrated in a 0.5 µm CMOS digital image sensor with a resolution of 648 x 488 pixels. The peak supply current is 7 mA for a readout frequency of 4 Mpixel/s at Vdd = 3 V. Die size is 55 mm² and overall SNR is 55 dB. The global shutter performance was demonstrated by acquiring pictures of fast-moving objects without observing any distortion, even at a low readout frequency of 4 MHz.

Keywords: CMOS, image sensor, low power, ADC, global shutter, digital photography.

1 INTRODUCTION

Digital still cameras are gaining more and more importance in the photography market. However, most devices available today typically consume 1 W or more of electric power. This implies a high supply current and has a negative impact on the size and weight of the batteries.
They represent, together with the optical and display systems, the limiting element in terms of weight and size of today's electronic still cameras.

Digital still cameras have changed the way we perceive photography. However, this change could be far more significant if the electronic approach also brought a drastic reduction in camera size. New, very small still cameras the size of objects such as pens or credit cards would allow the next step of the electronic revolution: easy image acquisition in any situation, without bulky and power-hungry equipment. Such devices would use button-sized batteries and would rely on CMOS single-chip digital image sensors for their low-power characteristics.

1.1 Power consumption budget of a still image acquisition system

A typical still image acquisition system is schematically represented in Figure 1. It is made of a digital image sensor (including optics), an image display, a frame buffer (static or dynamic RAM), an image processing unit (featuring color interpolation, compression, etc.), and a permanent storage memory (usually a Flash memory). This system is represented three times, corresponding to the three phases of an image acquisition procedure. During the first phase (preview mode), the image sensor continuously acquires images, which are sent to the display, allowing the user to frame the image. The second phase is the acquisition itself, during which an image is transferred from the sensor to the frame buffer. The acquisition procedure ends with the third phase, when the contents of the frame buffer are processed and the result is written into the permanent storage memory (the "electronic film").

The power consumption profile as a function of time for the three phases is shown in Figure 2. The preview phase may represent more than 90% of the duration of the overall procedure, so it dominates the consumed energy.
The acquisition phase itself is very short, but it is characterized by a high peak power consumption, because both the image sensor and the frame buffer operate at full resolution and full speed. Finally, the processing phase involves the frame buffer and the processing unit, while the image sensor and display can be switched off.
[Figure 1 block diagram: digital image sensor, frame buffer (SRAM), display (LCD), storage (FLASH) and processing unit, drawn once for each of the phases 1, 2 and 3.]

Figure 1: Still image acquisition system with its three phases: (1) preview, (2) acquisition, and (3) processing.

[Figure 2 plot: power consumption [mW] versus time [s] over the preview, acquisition and processing phases.]

Figure 2: Power consumption profile as a function of time for the three phases of an image acquisition (indicative values).

Such a power consumption profile presents two difficulties for miniature image acquisition systems employing small, button-sized batteries. Such batteries have a typical capacity of 100 to 200 mAh, and can provide a maximal current of about 20 to 30 mA during short periods. The first difficulty is the high peak current during acquisition, while the second is the low capacity of the battery compared to the energy required for the preview phase.

1.2 Limiting peak current during the acquisition phase

The first measure for limiting the peak current during acquisition is to switch off the display and the processing unit, since they are not involved in the image transfer from the sensor to the frame buffer. Secondly, the readout frequency can be lowered as much as possible, but the readout time then becomes too long, leading to image distortion. Indeed, since most CMOS image sensors use a rolling-snap electronic shutter, a slow readout frequency leads to a long delay between the exposure of the first and the last rows. If the scene or the camera has moved during readout, the motion produces a distorted image. Usually, a readout time of 1/25 s and an exposure time of 1/50 s are considered as maximal values for acceptable picture quality. This corresponds, for a sensor with VGA resolution (640 x 480 pixels), to a readout frequency of about 8 to 10 million pixels per second (Mpix/s).
Considering that today's lowest-power SRAMs and digital CMOS image sensors consume about 3 mA/MHz and 2 mA/MHz, respectively, at 2.7 V, the peak current of an image transfer will be about 50 mA at 10 MHz, which is too high for a button-sized battery. The approach proposed in this article for overcoming this problem is to implement a global shutter, so called because all the pixels are exposed simultaneously. This shutter relies on an on-pixel analog memory (patent pending), able to store a complete image inside the sensor for a few tens of milliseconds. During this time the pixels can be read out at a lower frequency, resulting in a lower peak current. For example, if the pixel memory allows a retention time of 80 ms, the readout frequency of the same VGA sensor can be lowered from 10 MHz to 4 MHz, resulting in a peak current of 20 mA, which is acceptable for a button-sized battery.

1.3 Limiting power consumption during the preview phase

During preview, the lowest possible power consumption must be ensured because this phase may last a long time. Therefore, the display must consume very little power and can have a low spatial resolution (the full image resolution is in most cases not required for a viewfinder function). It is then advisable to operate the image sensor in a sub-sampled mode to substantially reduce the power consumption, while matching the image resolution to the display.
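The peak-current figures of Section 1.2 can be reproduced with a few lines of Python. This is only an illustrative sketch: the per-MHz currents, the VGA format and the 80 ms retention time are taken from the text, while the function names and the assumption that current scales linearly with readout frequency are ours.

```python
# Illustrative check of the peak-current arithmetic in Section 1.2.
# The per-MHz currents and the VGA format come from the text; the function
# names and the linear current model are assumptions of this sketch.

VGA_PIXELS = 640 * 480        # pixels per frame
SRAM_MA_PER_MHZ = 3.0         # lowest-power SRAM, quoted at 2.7 V
SENSOR_MA_PER_MHZ = 2.0       # lowest-power digital CMOS sensor, quoted at 2.7 V


def readout_frequency_mhz(pixels: int, readout_time_s: float) -> float:
    """Pixel rate (in MHz) needed to read `pixels` within `readout_time_s`."""
    return pixels / readout_time_s / 1e6


def peak_current_ma(frequency_mhz: float) -> float:
    """Peak current of the sensor-to-SRAM transfer, assumed linear in frequency."""
    return (SRAM_MA_PER_MHZ + SENSOR_MA_PER_MHZ) * frequency_mhz


# Minimum pixel rate for a 1/25 s readout and for an 80 ms memory retention time.
for label, t_read in (("1/25 s readout", 1 / 25), ("80 ms retention", 0.080)):
    print(f"{label}: {readout_frequency_mhz(VGA_PIXELS, t_read):.2f} Mpix/s minimum")

# Peak current at the two readout frequencies discussed in the text.
for f_mhz in (10.0, 4.0):
    print(f"{f_mhz:.0f} MHz -> {peak_current_ma(f_mhz):.0f} mA peak")
```

Under these assumptions a 1/25 s readout needs a little under 8 Mpix/s, in line with the "8 to 10 Mpix/s" quoted above, and an 80 ms retention time brings the requirement just below 4 Mpix/s.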
1.4 Overview of the article

Section 2 presents the developed global shutter with its memory pixel. Section 3 describes the architecture allowing efficient power reduction in sub-sampling mode, and also addresses the aliasing caused by sub-sampling. Section 4 presents the general architecture of the CMOS image sensor, while Section 5 gives a complete set of experimental results: power consumption, noise and global shutter performance. Finally, Section 6 concludes the paper.

2 GLOBAL ELECTRONIC SHUTTER

2.1 Pixel structure

The proposed pixel schematic diagram is represented in Figure 3. Compared to the basic APS structure [1], with reset (Rs), read (Rd) and source follower transistors, it includes two more transistors. The first is the shutter (Sh) transistor, whose function is to disconnect the photodiode from the source follower transistor gate (memory node) at the end of the exposure time. The second is the "timer" (Ti) transistor, which allows the photodiode to be reset independently of the memory node. The reset (Rs) transistor is hence used for resetting the memory node. Previous pixels with analog memory [2] could only reset the memory node through the shutter transistor.

[Figure 3 schematic: photodiode, timer (Ti), shutter (Sh), reset (Rs) and read (Rd) transistors, memory node, Vdd connections and output.]

Figure 3: Pixel schematic diagram.

[Figure 4 timing diagram: Ti (global), Sh (global), Rs and Rd (row-wise) versus phase index 0, 1, 2, 3, 0, with the exposure time marked.]

Figure 4: Timing operation of the global electronic shutter mode.

The pixel can be operated in line shutter or global shutter mode, depending on the timing of the control signals. For the global shutter mode, the corresponding signals are given in Figure 4, which shows whether each transistor is switched on (high level) or off (low level). In phase 0, the photodiode and memory nodes of all the pixels are reset (connected to Vdd). The exposure phase (1) begins when the photodiode and memory nodes are released.
Then, at the end of phase 1, a pulse is generated on the shutter transistor (phase 2), allowing charge transfer from the photodiode to the memory node. The exposure time stops with the end of phase 2. At this moment, the pixel signal has been stored in the memory node. This node is then read out and reset (phase 3, operated row by row) as in line shutter mode. The successive read and reset allow a double sampling operation to be performed in the output stage for Fixed Pattern Noise (FPN) suppression. For the line shutter mode, the shutter and timer transistors are constantly switched on and off respectively, and the reset transistor is used row-wise. No global signals are used.

2.2 Sensor addressing logic

The logic for addressing the pixels is schematically represented in Figure 5, which shows a basic block for the selection of four rows, each of them having a reset (rs), a read (rd) and a shutter (sh) selection output signal, indexed from 1 to 4. This logic is based on a shift register approach with supplementary gates supporting both global and row-wise addressing modes for the reset (rs) and shutter (sh) transistors. Supplementary signals enable the selection of those two transistors during a read cycle (ena_rs_rd and ena_sh_rd respectively). This logic also allows various addressing modes within a group of four rows, but not random or window (region of interest) addressing.

2.3 Global shutter

For the optimization of the global electronic shutter, the following possible limitations must be taken into account:

1/ Storage quality of the in-pixel analog memory, degraded by:
   - Leakage current of the memory node (caused by leakage current in the source of the Rs and Sh transistors).
   - Photo-generated charges integrated by the memory node due to insufficient light shielding.
   - Charge injection when the Rs or Sh transistors are switched off, causing noise and offset.
   - Collection by the memory node of photo-generated charges coming from the substrate (diffusion).
2/ Degraded frame rate in video operation (continuous image acquisition), because an overlap between two consecutive frames (read and reset operations) is not possible as it is with the rolling-snap (line) shutter.

However, a careful pixel design can prevent light and charge diffusion from having too strong an influence on the memory node.

[Figure 5 schematic: D flip-flops driving the rs, rd and sh outputs of rows 1 to 4; signals rs-ck, rs-in, rs-row[1:4], ena-rs-rd, global-rs, rd-ck, rd-in, rd-row[1:4], ena-sh-rd, global-sh.]

Figure 5: Sensor addressing logic in the row direction (only four rows are represented).

3 IMAGE SUB-SAMPLING IMPLEMENTATION

3.1 Description

Sub-sampling consists of reading a subset of the pixels, usually one pixel out of two or four in both directions. The resulting image contains less information but keeps the same field of view. However, aliasing appears because the Nyquist spatial frequency of the sub-sampled grid is lower than the spatial frequency of the signal. The latter is usually deliberately limited by the optics to the spatial frequency given by the pixel pitch. For sub-sampling mode, a low-pass filter operation is thus required to ensure good image quality. This operation is usually performed after acquisition, on the full-resolution image, in the digital domain, and the filtered image is then sub-sampled (decimation). Unfortunately, this requires the sensor to operate in full-resolution mode, which is not acceptable for power consumption reasons. The approach proposed here is to perform the filtering and sub-sampling operations inside the sensor, so that the overall consumption can be lowered.

The proposed low-power sub-sampling implementation relies on two distinctive features. In the column direction, a parallel readout circuitry with programmable path selection allows the required energy to be scaled to the number of columns being read out by switching off the unused paths. The column(s) being read out can be selected within a pattern of four.
In the row direction, a programmable row selection allows one or several rows to be addressed at the same time within a pattern of two or four rows. This allows the implementation of analog mean functions between adjacent rows, by blocks of either two or four rows [5].

3.2 Configurations allowed

Depending on the selected rows and columns, several configurations can be used for reading the pixels. Figure 6 represents four examples with sub-sampling factors of 2 and 4 in each direction. If the pixel configuration and column amplifier allow an analog mean to be implemented between two adjacent rows (by current summing), useful mean operations between four (or more) pixels can be performed, leading to a simple low-pass filter. In the horizontal direction, a simple low-pass filter can also be implemented by performing a digital mean between two consecutive columns. Table 1 gives, for the examples of Figure 6, the ratio of necessary energy and the expected image quality compared to a full-resolution acquisition that is subsequently filtered and decimated. The proposed filtering and sub-sampling implementation allows a reduction of the energy necessary for acquiring an image.
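The energy ratios of Table 1 can be reproduced with a very simple model, sketched below in Python. The model is our assumption, not a statement from the paper: energy is taken to be proportional to the number of pixel values converted, that is, the number of row readout cycles times the number of active column paths, both counted within a pattern of four. The row-cycle counts are read from the case descriptions of Figure 6 (a block of rows averaged in the analog domain costs a single readout cycle).

```python
# Relative acquisition energy of the four sub-sampling configurations.
# Assumption of this sketch: energy is proportional to
# (row readout cycles) x (active column paths), counted per pattern of four.
from fractions import Fraction

PATTERN = 4  # rows and columns are both configured within a pattern of four

# case number -> (row readout cycles per 4 rows, active column paths per 4 columns)
CASES = {
    1: (2, 2),  # 1 row out of 2 addressed, 1 column out of 2 read out
    2: (2, 4),  # rows averaged by blocks of 2, all columns read out
    3: (1, 1),  # 1 row out of 4 addressed, 1 column out of 4 read out
    4: (1, 2),  # rows averaged by blocks of 4, 1 column out of 2 read out
}


def relative_energy(row_cycles: int, active_columns: int) -> Fraction:
    """Energy relative to a full-resolution readout (4 row cycles, 4 columns)."""
    return Fraction(row_cycles, PATTERN) * Fraction(active_columns, PATTERN)


for case, (rows, cols) in CASES.items():
    energy = relative_energy(rows, cols)
    print(f"case {case}: {float(energy) * 100:g}% of the full-resolution energy")
```

Under this assumption the four cases come out at 25%, 50%, 6.25% and 12.5%, which are the values listed in Table 1. Static consumption that does not scale with the number of conversions is ignored here.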
Case 1: sub-sampling 2 x 2; 1 row out of 2 addressed; 1 column out of 2 read out.
Case 2: sub-sampling 2 x 2; rows addressed by blocks of 2, analog mean between rows; all columns read out, digital mean between columns.
Case 3: sub-sampling 4 x 4; 1 row out of 4 addressed; 1 column out of 4 read out.
Case 4: sub-sampling 4 x 4; rows addressed by blocks of 4, analog mean between rows; 1 column out of 2 read out, digital mean between columns.

Figure 6: Four sub-sampling configurations with and without mean operation between pixels.

Case  Sub-sampling  Addressed rows  Active columns  Energy  Expected quality
1     2 x 2         2 of 4          2 of 4          25%     acceptable
2     2 x 2         4 of 4          4 of 4          50%     good
3     4 x 4         1 of 4          1 of 4          6.25%   bad
4     4 x 4         2 of 4          2 of 4          12.5%   acceptable

Table 1: Energy per image and image quality compared to the case of a full-resolution image that is filtered and decimated.

4 CHIP ARCHITECTURE

The global shutter and sub-sampling architecture are implemented in a digital image sensor with a spatial resolution of 648 x 488 pixels (VGA) and a digital resolution of 10 bits.

4.1 Sensor architecture

The sensor is made of a pixel array, a row addressing logic, a column addressing logic and an output stage. The pixel schematic diagram and the pixel addressing logic were presented in Figure 3 and Figure 5 respectively. The readout electronics are represented in Figure 7. The column signals are read out through FPN capacitors [3] (one for each column) that store the signal value for the FPN suppression operation [4]. In a first phase, the signal sw_col_all selects all the columns at the same time, and the pixel output voltages are stored on the FPN capacitors, whose common plates are grounded (signal sw_agnd). Then, in a second phase, the pixels are reset and read out eight at a time, by two blocks of four columns.
For this, the column addressing logic includes a shift register with half of its flip-flops triggered by a first clock signal (col_ck_a) and the other half triggered by a second clock signal (col_ck_b) delayed by half a cycle. This delay is required by the ADC architecture described in the next section. The selected capacitors are connected to eight switched-capacitor (S-C) amplifiers via eight analog output lines. Their charge, corresponding to the difference between the read and reset pixel voltages, is transferred into the feedback capacitors Cfb, and the corresponding voltage signals are available on the eight amplifier outputs Vout.

4.2 Analog-to-Digital Converters

An eight-channel, 10-bit parallel-pipelined ADC with active element sharing was chosen for its low-power operation and relatively small die area [6]. The eight channels can be switched on or off in four blocks of two. In a block, the twin ADCs operate with a phase shift of 180°. This explains the readout architecture presented above.

4.3 Other on-chip units

The circuit also includes on-chip voltage references for the active elements, and a voltage multiplier providing 4.5 V to the pixel transistor gates for a higher output dynamic range and speed. It also includes a digital controller (state machine) performing statistical calculations (image mean, number of saturated pixels), exposure time control, digital offset
compensation of the eight signal paths (comprising both output amplifier and ADC offset correction). The digital image sensor can be entirely controlled with a set of 32 8-bit registers accessed through a serial interface.

[Figure 7 schematic: FPN capacitors, output lines of blocks a and b, analog output bus, feedback capacitors Cfb, amplifier outputs vout 1-a to vout 4-b, D flip-flops of the column shift register; signals col-in, col-ck-a, col-ck-b, sw-col-all, sw-agnd, sw-bus-on, sw-fb-a, sw-fb-b, vrefbuf; agnd = 1.5 V, vrefbuf = 2.0 V.]

Figure 7: Schematic diagram of the readout electronics with FPN stage.

4.4 Chip floor-plan and circuit realization

The chip floor-plan was optimized in order to have the smallest vertical pitch. For this, only the two vertical pad rows were used, as shown in the schematic floor-plan in the right part of Figure 8. The circuit was integrated in a 0.5 µm CMOS technology (double poly, triple metal). Overall chip size is 9.28 x 5.94 mm, giving an area of 55.1 mm². The sensor block occupies the left part of the chip, while the ADCs and the on-chip controller are located at the top right and bottom right respectively. The areas occupied by the sensor, ADC, logic and pads represent 76%, 3%, 3% and 11% respectively. Routing channels occupy the remaining area (7%). A photomicrograph of the chip is shown in the left part of Figure 8. The pixel pitch is limited by the on-pixel electronics to 10.5 µm.

[Figure 8 floor-plan labels: pads, 648 x 488 pixel array, row addressing logic, voltage multiplier, output amplifiers (8x), column addressing & FPN stage, 10-bit ADC (8x), digital controller.]

Figure 8: Chip photomicrograph (left) and schematic floor-plan (right).
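The double sampling used for FPN suppression in Section 4.1 (signal level stored first, reset level read second, difference transferred to Cfb) can be illustrated numerically. The sketch below is a behavioral model only: every voltage in it is a made-up illustration value, and the function names are ours. It shows why a per-pixel offset that is present in both reads disappears in the difference.

```python
# Behavioral sketch of double sampling for FPN suppression (Section 4.1).
# All voltages are made-up illustration values, not measurements.

RESET_LEVEL_V = 2.0  # hypothetical pixel output right after reset


def pixel_output(photo_signal_v: float, offset_v: float, is_reset: bool) -> float:
    """Column voltage for one pixel; the fixed offset appears in both reads."""
    level = RESET_LEVEL_V if is_reset else RESET_LEVEL_V - photo_signal_v
    return level + offset_v


def double_sample(photo_signal_v: float, offset_v: float) -> float:
    """Difference between the reset read and the signal read of the same pixel."""
    # Phase 1: the signal level is stored on the column FPN capacitor.
    v_signal = pixel_output(photo_signal_v, offset_v, is_reset=False)
    # Phase 2: the pixel is reset and read again through the same path.
    v_reset = pixel_output(photo_signal_v, offset_v, is_reset=True)
    return v_reset - v_signal


offsets_v = [0.010, -0.025, 0.003, 0.040]  # hypothetical per-pixel offsets
signal_v = 0.500                            # identical illumination on every pixel

raw = [pixel_output(signal_v, off, is_reset=False) for off in offsets_v]
corrected = [double_sample(signal_v, off) for off in offsets_v]

print("single read :", [round(v, 3) for v in raw])
print("double read :", [round(v, 3) for v in corrected])
```

With a single read, the four pixels differ by their offsets although they receive the same light; after subtraction they all return the same signal. Noise that is not common to both reads, such as the reset noise of the memory node, is not removed by this operation.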
5 EXPERIMENTAL RESULTS

The circuit was fully characterized and the resulting parameters are given in Table 2. The measurements were made at a pixel clock of 4 MHz, a supply voltage of Vdd = 3 V and a temperature of 30 °C.

Sensor parameter                    Value
Saturated output signal swing       1.86 V
Computed full well capacity         266,000 electrons
Conversion gain                     7 µV/e [23 fF]
Fill factor                         32%
Fixed Pattern Noise                 10 mV rms
Noise floor                         < 1.6 mV rms
Dynamic range                       > 60 dB
Sensitivity @ 626 nm                450 V/(W m^-2 s)
Dark current per pixel              2-3 fA

ADC parameter                       Value
ADC DNL                             ± 0.5 LSB
ADC INL                             ± 0.6 LSB
ADC SNDR @ 4 MHz                    55.5 dB
ADC inter-channel gain error        0.02%

Power consumption parameter         Value
Sensor power consumption            12 mW
ADC power consumption               5 mW
Overall circuit peak current        7 mA

Table 2: General parameters for the sensor and ADC block, and for power consumption.

5.1 Global shutter operation

The global shutter operation proved to be efficient for the acquisition of moving objects. An example is shown in Figure 9, where a rotating fan was acquired. The decisive advantage of the global shutter over the usual rolling-snap shutter is demonstrated. The pixel clock was 4 MHz, resulting in a peak current of only 7 mA. The global shutter does not induce any visible difference in image quality compared to the line shutter.

Figure 9: Acquisition of a fan with a rotation speed of about 500 RPM. The left image is taken in rolling-snap (line) shutter mode and the right image is taken in global shutter mode. In both cases the exposure time is 0.5 ms.

The performance of the on-pixel memory was tested by acquiring an image with uniform illumination, and by observing the variations inside the image. The following elements were observed. First, a gradient in the readout direction due to the time delay between the readout of the first and last lines. Its value is about 10 mV for a readout speed of 4 MHz, which is not visible in an image. Second, a variance of the memory voltage due to different dark current values between pixels.
The corresponding noise induced in the image, although not negligible, is not visually perceptible. Finally, some pixels showed a very high leakage current and could consequently be identified as defective pixels. Therefore, the number of defective pixels is higher in global shutter mode than in line shutter mode.
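Several figures of Table 2, and the order of magnitude of the readout gradient reported above, can be cross-checked with elementary relations. The Python sketch below is a rough plausibility check under stated assumptions, not the authors' characterization method: it assumes that the 23 fF of Table 2 is the capacitance of the memory node, that the memory-node leakage is of the same order as the quoted dark current (2 to 3 fA), and that the source follower gain is close to one.

```python
# Plausibility check of Table 2 and of the readout gradient of Section 5.1.
# Assumptions of this sketch: the memory node has the 23 fF quoted with the
# conversion gain, its leakage equals the quoted dark current, and the
# source follower gain is taken as 1.
import math

Q_E = 1.602176634e-19        # elementary charge [C]

C_NODE_F = 23e-15            # capacitance quoted with the conversion gain [F]
SWING_V = 1.86               # saturated output signal swing [V]
NOISE_FLOOR_V = 1.6e-3       # upper bound of the noise floor [V rms]
DARK_CURRENT_A = (2e-15, 3e-15)
PIXELS = 648 * 488
PIXEL_CLOCK_HZ = 4e6


def conversion_gain_uv_per_e(capacitance_f: float) -> float:
    """Voltage step per electron on a capacitance, in microvolts."""
    return Q_E / capacitance_f * 1e6


def full_well_electrons(swing_v: float, gain_uv_per_e: float) -> float:
    """Number of electrons corresponding to the full output swing."""
    return swing_v / (gain_uv_per_e * 1e-6)


def dynamic_range_db(swing_v: float, noise_v: float) -> float:
    """Ratio of the output swing to the noise floor, in decibels."""
    return 20 * math.log10(swing_v / noise_v)


def memory_droop_mv(leakage_a: float, hold_time_s: float, capacitance_f: float) -> float:
    """Voltage lost by the memory node while it waits to be read, in millivolts."""
    return leakage_a * hold_time_s / capacitance_f * 1e3


gain = conversion_gain_uv_per_e(C_NODE_F)
readout_time_s = PIXELS / PIXEL_CLOCK_HZ   # delay between first and last pixel

print(f"conversion gain : {gain:.2f} uV/e")
print(f"full well       : {full_well_electrons(SWING_V, gain):.0f} electrons")
print(f"dynamic range   : {dynamic_range_db(SWING_V, NOISE_FLOOR_V):.1f} dB")
print(f"readout time    : {readout_time_s * 1e3:.1f} ms")
for i_leak in DARK_CURRENT_A:
    droop = memory_droop_mv(i_leak, readout_time_s, C_NODE_F)
    print(f"droop at {i_leak * 1e15:.0f} fA : {droop:.1f} mV")
```

Under these assumptions the conversion gain comes out close to 7 µV/e, the full well close to 266,000 electrons and the dynamic range slightly above 60 dB, in agreement with Table 2. The estimated droop of the last line relative to the first is of the order of 7 to 10 mV, which is compatible with the observed gradient of about 10 mV, although other mechanisms listed in Section 2.3 may also contribute.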
Figure 10 shows a typical pixel output response as a function of exposure time for a constant illumination. The sensitivity of the pixel is about 10% higher in global shutter mode. This can partially be explained by the fact that the memory node is still collecting photo-generated charges during its hold phase. The figure also shows that the non-linearity (vertical difference between ideal and real response) is higher by a factor of three in global shutter mode than in line shutter mode. This difference is due to the dependency of charge injection (shutter transistor) on signal amplitude.

[Figure 10 plot: output [LSB] versus integration time t_int [ms], 0 to 500 ms, for line shutter and global shutter; λ = 626 nm, P_light = 8E-7 W/cm²; annotated non-linearity values 0.8%, 2%, 3.1% and 9.4%.]

Figure 10: Typical pixel responses as a function of exposure time for both line and global shutter modes.

5.2 Power consumption reduction in sub-sampling operation

The anti-aliasing filtering with analog mean between two consecutive rows was not implemented in this circuit, and will be tested in a future version. However, the present circuit already offers a scaling of power consumption with resolution. In sub-sampling mode with a factor of 2, the power consumption was measured to be 6 mW at a frame rate of 12 frames/s. This value is to be compared to the 21 mW of the full-resolution mode at the same frame rate.

6 CONCLUSION

The design of a digital CMOS image sensor was optimized for low-power still picture acquisition. First, a minimal current consumption during acquisition was ensured through a reduction of the readout frequency. For this, a pixel with efficient analog memory was designed, allowing effective global shutter operation. Second, minimal energy consumption in sub-sampled modes was made possible by a parallel readout architecture with programmable active paths. A simple mixed-mode anti-aliasing filtering operation was proposed and will be tested in a future design.
These techniques were implemented in a digital CMOS image sensor with a resolution of 648 x 488 pixels, an overall SNR of 55 dB and a peak current consumption of 7 mA at 3 V for a readout frequency of 4 MHz.

REFERENCES

1. E. R. Fossum, "Low Power Camera-on-a-Chip using CMOS Active Pixel Sensor Technology", Proc. SLPE'95, pp. 74-77, October 1995.
2. C. H. Aw and B. A. Wooley, "A 128 x 128 Pixel Standard CMOS Image Sensor with Electronic Shutter", IEEE JSSC, Vol. 31, No. 12, pp. 1922-1930, December 1996.
3. B. Dierickx, G. Meynants, D. Scheffer, "Offset-Free Offset Correction for Active Pixel Sensors", Proc. of IEEE Workshop on CCD and Advanced Image Sensors, Bruges, Belgium, June 5-7, 1997.
4. R. H. Nixon, E. R. Fossum, "128 x 128 CMOS Photodiode-type Active Pixel Sensor with On-Chip Timing, Control and Signal Chain Electronic", Proc. SPIE, Vol. 2415, pp. 117-123, 1995.
5. J. Coulombe, C. Wang, "Variable Resolution CMOS Current Mode Active Pixel Sensor", Proc. ISCAS2000, Vol. II, pp. 293-296, Geneva, Switzerland, May 28-31, 2000.
6. S. Tanner, A. Heubi, M. Ansorge, F. Pellandini, "A 10-bit Low-Power Dual-Channel ADC with Active Element Sharing", Proc. ISIC99, pp. 529-532, Singapore, September 8-10, 1999.