Demonstration of a Frequency-Demodulation CMOS Image Sensor
Koji Yamamoto, Keiichiro Kagawa, Jun Ohta, Masahiro Nunoshita
Graduate School of Materials Science, Nara Institute of Science and Technology, 8916-5 Takayama, Ikoma 630-0101, JAPAN
Yasushi Yamasaki, Kunihiro Watanabe
Microsignal Co. Ltd., Seika-cho, Kyoto 619-0237, JAPAN

Keywords: CMOS image sensor, frequency demodulation, photogate, active pixel sensor, floating diffusion

ABSTRACT
A frequency-demodulation CMOS image sensor that captures images formed only by modulated light is proposed and demonstrated. The pixel circuit has two floating diffusions (FDs) for accumulating signal charges and one photogate for detecting the modulated and background light. By operating the image sensor synchronously with the frequency and phase of the modulated light, signal charges generated by both the modulated and background light are accumulated in one FD, while charges generated only by the background light are accumulated in the other. By subtracting the outputs of the two FDs with off-chip subtraction circuits, images produced only by the modulated light are obtained. Based on the proposed circuit, an image sensor with 64 x 64 pixels was fabricated using a 0.6 µm CMOS technology. We captured images with this sensor and demonstrate that it can image only the modulated light. When an object is partially illuminated by modulated light under constant background illumination, the sensor successfully captures the illuminated portion while removing the static background light. We also demonstrate marker detection: when a marker is attached to an object under several background illumination levels, the sensor extracts the marker without being affected by the background intensity. Motion capture is also successfully demonstrated with this sensor.
INTRODUCTION
Modulation techniques have been widely used in several types of sensors. They are effective for detecting signals with a high signal-to-noise ratio, because they enable a sensor to detect only the modulated signal and thus reject static background noise. If this technique were implemented in an image sensor paired with a modulated light source, it could be applied to tracking a target tagged with that source, because such a sensor could acquire images while being hardly affected by the background light conditions. Motion capture could thus be easily realized with this sensor under a variety of illumination conditions. It is, however, difficult to implement such a demodulation function in a conventional image sensor, because a conventional sensor operates in an integration mode to increase the signal-to-noise ratio, and the integration of the photo-carriers averages out the modulated component of the input light. To overcome this problem, we have proposed a novel type of frequency-demodulation image sensor. Compared with other frequency-demodulation image sensors [1], [2], [3], this sensor has the advantage of a simple structure with high sensitivity. In this paper, we describe the fundamental properties of the sensor, propose its application to motion capture, and demonstrate experimental results obtained with the image sensor.

SENSOR STRUCTURE AND OPERATION PRINCIPLE
Figure 1 shows the pixel structure of the proposed image sensor. The circuit consists of a pair of readout circuits like those of a conventional active pixel sensor (APS) [4]; that is, two transfer gates (TX1 and TX2) and two floating diffusions (FD1
and FD2) are implemented. One photogate (PG) is used as the photodetector instead of a photodiode, and is connected to both FD1 and FD2 through TX1 and TX2, respectively. The reset transistor (RST) is common to the two readout circuits. The two outputs OUT1 and OUT2 are subtracted from each other, so that only the modulated signal is obtained. The proposed structure is similar to that reported in Ref. [5], but that sensor is dedicated to range finding, which differs from our aim of detecting modulated light signals. Another similar structure has been proposed and demonstrated in Ref. [6], although it was studied mainly for capturing inter-frame background-difference images. In this structure, a parasitic n-type diffusion region is inevitably created around the PG, because the self-aligned gate process that forms the source/drain areas around a gate, which is commonly used in standard CMOS technology, automatically creates an n-type diffusion region around the PG. We refer to this parasitic n-type diffusion region around the PG as the parasitic n-diff in this paper.

The timing diagram of the sensor is shown in Fig. 2. First, the reset operation is performed by turning RST on while the modulated light is OFF. Note that the two reset transistors are connected to each other, so FD1 and FD2 are reset simultaneously by this operation. When the modulated light is turned on, PG is biased to accumulate photo-carriers. Then the modulated light is turned off, and PG is turned off to transfer the accumulated charges to FD1 by opening TX1. This is the ON state of the modulated light, and in this state both the modulated and static light components are stored in FD1. Next, PG is biased again and starts to accumulate charges in the OFF state of the modulated light. At the end of the OFF period, the accumulated charges are transferred to FD2.
Thus only the static light component is stored in FD2. By repeating this process, the charges in the ON and OFF states are accumulated in FD1 and FD2, respectively. According to the amount of accumulated charge, the voltages on FD1 and FD2 decrease in a stepwise manner. By measuring the voltage drops of FD1 and FD2 at a certain time and subtracting them from each other, one can extract only the modulated signal component.

We designed and fabricated an image sensor with 64 x 64 pixels based on the proposed circuit using a standard 0.6 µm 2-poly 3-metal CMOS technology. The third metal layer is used to cover the FDs and the rest of the pixel circuitry. The layout of the pixel is shown in Fig. 3. In this layout, the parasitic n-diff region is designed to be common to both FD1 and FD2. If the region were separated, a difference could arise between the residual charges in the two parasitic n-diff regions, which would result in subtraction errors. Table 1 summarizes the chip specifications. Figure 4 shows a microphotograph of the fabricated chip and the experimental setup. The packaged chip is attached to a camera lens and mounted on a test board to capture images. The subtraction is executed in a personal computer after the two outputs from the chip are fed into I-V converters and an A/D converter.

Figure 1: Pixel structure of the frequency-demodulation image sensor (a) and illustration of charge transfer under both background and modulated light (b) and under background light only (c).
Figure 2: Timing diagram of the operation.
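The accumulate-and-subtract principle can be sketched numerically. The sketch below is illustrative only: the light levels reuse the measured values from the next section in arbitrary units, while the cycle count and noise level are hypothetical.

```python
import random

random.seed(0)

N_CYCLES = 1000     # modulation cycles accumulated in one frame (hypothetical)
BACKGROUND = 0.48   # static background light per half cycle (arb. units)
MODULATION = 0.56   # extra light while the modulated LED is ON (arb. units)
NOISE = 0.01        # per-half-cycle noise, standard deviation (hypothetical)

fd1 = 0.0  # charge collected during LED-ON half cycles (via TX1)
fd2 = 0.0  # charge collected during LED-OFF half cycles (via TX2)

for _ in range(N_CYCLES):
    # ON half cycle: PG integrates background + modulated light -> FD1
    fd1 += BACKGROUND + MODULATION + random.gauss(0.0, NOISE)
    # OFF half cycle: PG integrates background only -> FD2
    fd2 += BACKGROUND + random.gauss(0.0, NOISE)

# Off-chip subtraction cancels the common background term and leaves
# only the modulated component (close to MODULATION per cycle).
demodulated = (fd1 - fd2) / N_CYCLES
print(f"recovered modulated intensity: {demodulated:.3f}")
```

Because the background term is common to both accumulators, it cancels exactly in the subtraction, while the per-cycle noise averages down over the many modulation cycles of one frame.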
FUNDAMENTAL CHARACTERISTICS
We measured the chip characteristics to confirm the effectiveness of the proposed architecture. An LED with an emission wavelength of 660 nm is used as the light source, and the measurement is performed in a dark room. Figure 5(a) shows the oscilloscope traces of OUT1 and OUT2 when the LED is not modulated, that is, under static illumination. The two outputs coincide with each other, so their difference is zero under static illumination. Figure 5(b) shows the traces when the LED is modulated at 2.5 kHz on top of a static background bias light, i.e., modulated light under static illumination. The modulated and bias light powers are 0.56 nW and 0.48 nW, respectively. The slopes of the two outputs differ: OUT1 decreases faster than OUT2, because the total charge transferred to FD1 is always larger than that transferred to FD2. The subtraction of the two outputs, obtained by the off-chip subtraction circuits, is shown in Fig. 5(c). The subtracted value increases in proportion to the total amount of the modulated light intensity. This figure shows that the detection of modulated light under static background illumination is successfully demonstrated with this sensor.

Table 1: Chip specifications
Process technology: 0.6 µm CMOS, 2-poly 3-metal
Chip size: 4.2 x 4.2 mm²
Pixel size: 42 x 42 µm²
PG size: 10 x 24 µm²
Power supply voltage: 5 V
Input dynamic range: 25 dB
Conversion gain: 6.7 µV/e-
Fixed pattern noise: 2.7%

Figure 3: Pixel layout of the frequency-demodulation image sensor.
Figure 4: Chip microphotograph and experimental setup.

Figure 6 shows the subtracted output voltage as a function of the power density of the modulated light under a constant background illumination of 0.34 nW.
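Two of the Table 1 entries can be sanity-checked numerically. This is a minimal sketch, assuming the 25 dB input dynamic range is 10·log10 of the intensity (power) ratio (an assumption, as the text does not state the convention), with hypothetical endpoint intensities; the second helper simply divides an FD voltage drop by the 6.7 µV/e- conversion gain from Table 1.

```python
import math

def dynamic_range_db(i_max, i_min):
    """Input dynamic range: ratio of the maximum to the minimum detectable
    light intensity. Intensity is a power-like quantity, so 10*log10 is
    assumed here."""
    return 10.0 * math.log10(i_max / i_min)

def signal_electrons(delta_v_uv, conversion_gain_uv_per_e=6.7):
    """Convert a measured FD voltage drop (in microvolts) into signal
    electrons using the Table 1 conversion gain."""
    return delta_v_uv / conversion_gain_uv_per_e

# Hypothetical detectable range spanning roughly the 25 dB specification:
print(round(dynamic_range_db(1.0, 0.00316), 1))  # about 25.0 dB
# A 670 uV drop on the FD corresponds to about 100 electrons:
print(round(signal_electrons(670.0)))            # 100
```

Under the 10·log10 convention, 25 dB corresponds to roughly a 316:1 intensity ratio, which is consistent with the roughly three-decade span of the response curve in Fig. 6.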
The light source is the same LED as used in Fig. 5. The modulation frequency is 2.5 kHz and the accumulation time per frame is 1.28 msec. The subtracted output is confirmed to be almost proportional to the intensity of the modulated light, which demonstrates the effectiveness of the proposed method. The input dynamic range of the chip is around 25 dB, and the gamma value is almost one except in the saturated region. The input dynamic range is defined here as the ratio of the maximum detectable light intensity to the minimum detectable light intensity.

Figure 5: Oscilloscope traces of OUT1 and OUT2 (a) under static illumination, (b) under modulated illumination, and (c) their subtracted output.

The output saturates in the high-input-intensity
region due to the saturation of the accumulated charges in FD1.

Next, we capture images with this sensor. Figure 7 shows images taken by the sensor. In Fig. 7(a), an object, a human face, is partially illuminated by modulated white LED light of 1300 lux under a constant background illumination of 3700 lux from a fluorescent lamp. The modulated light is shone around the right eye of the face. The output from FD2, which images only the constant light, is shown on the left side of Fig. 7(a). The right side of Fig. 7(a) shows the difference between the OUT1 and OUT2 images, and successfully demonstrates that the portion illuminated by the modulated light is extracted. Figure 7(b) shows LED light spots both with and without modulation; only the modulated LED spots are detected.

Figure 6: Subtracted output voltage as a function of modulated light intensity under constant background illumination.
Figure 7: Output images of (a) a human face and (b) static and modulated light spots.

MOTION CAPTURING
Some applications such as motion capture require the extraction of a specific marker from complicated background images under various illumination conditions. A demonstration of capturing a modulated image is shown in Fig. 8. In this case, a marker, a modulated white LED, is attached to an object, a stuffed dog. The marker is attached to the collar of the dog, which is located at the bottom of the image. The object is captured by this sensor under three different illumination intensities of a fluorescent lamp. The averaged power of the LED is kept constant at 70 µW throughout this experiment. In the weakest light condition of 3600 lux in Fig. 8(a), the dog can barely be seen, which means this captured image is underexposed.
The medium illumination level in Fig. 8(b) is 11000 lux. In the strongest light condition of 20000 lux in Fig. 8(c), the captured image is overexposed, and in some places the sensor output appears to be saturated. Note that the LED spot is not the brightest region in the whole image of Fig. 8(c), so the spot could not be extracted simply by searching for the pixels with the maximum intensity. The experimental results in Fig. 8 demonstrate that this sensor can selectively extract the marker with little effect from the background illumination condition, and is thus suitable for motion capture.

We applied this sensor to motion capturing. Figure 9 shows the captured images. The moving object, the stuffed dog, has a modulated LED attached at its neck. The LED is modulated at a frequency of 2.5 kHz with an averaged power of 70 µW. The background illumination level is around 1000 lux, and the accumulation time is 1.28 msec. The middle and right-side images in Fig. 9 show the images from OUT1 and OUT2, respectively. The left-side images show the subtracted image between OUT1 and OUT2, which corresponds to the demodulated image. Only the modulated LED is clearly extracted in each captured image. The bottom figure shows the result of the motion capturing; for convenience, the moving direction, or track, of the LED is
superimposed. These results demonstrate that the sensor is well suited to motion capture. In a future application, we plan to demonstrate three-dimensional motion capture by using two sensors to form a binocular image.

Figure 8: Marker extraction under several background illumination conditions: (a) OUT1 (background + modulation) and (b) OUT2 - OUT1 (modulation), at background light levels of 3600 lx, 11000 lx, and 20000 lx.
Figure 9: Experimental results of motion capturing.

DISCUSSION
In this section, we discuss the effect of the parasitic n-diff region on the device characteristics. As mentioned above, the parasitic n-diff region is inevitably created as long as a standard CMOS technology is used. Figure 10 demonstrates an example of its effect: the subtracted output voltage is plotted as a function of the modulated light intensity under four background illumination levels of 0, 0.20, 0.41, and 0.65 nW. When the background level is zero, the output does not increase even when the modulated light increases. This is presumably due to the parasitic n-diff region, where initial charges are trapped and cannot contribute to the output voltage. Once the charges created by the background light completely fill the parasitic n-diff region, the subsequent charges can flow into the FD and an output appears. By simulating the potential profile with a device simulator, we have confirmed that a potential valley exists between the PG and the FD and acts as a trap for the transferred charges. Note that the threshold mismatch between the two transfer gates TX1 and TX2 has little effect on the sensitivity before carriers fill the region; once the region is filled, the mismatch increases the subtraction error and thus reduces the sensitivity. To alleviate the effect of the parasitic n-diff region, we propose a reset operation of the parasitic n-diff region.
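The dead zone at zero background can be modeled as a fixed amount of charge that the potential valley must capture before anything reaches the FD. The sketch below uses a hypothetical trap capacity and arbitrary charge units; it is an illustration of the mechanism, not the device simulation mentioned above.

```python
def transfer_through_trap(generated, trap_remaining):
    """Charge reaching the FD after the parasitic n-diff valley captures
    up to `trap_remaining` units of the generated charge first."""
    trapped = min(generated, trap_remaining)
    return generated - trapped, trap_remaining - trapped

TRAP_CAPACITY = 5.0  # hypothetical valley capacity (arb. charge units)
MODULATED = 3.0      # charge generated by the modulated light per frame

for background in (0.0, 2.0, 10.0):
    remaining = TRAP_CAPACITY
    # Background-generated charge fills the valley first ...
    _, remaining = transfer_through_trap(background, remaining)
    # ... and only then does the modulated charge reach the FD.
    output, _ = transfer_through_trap(MODULATED, remaining)
    print(f"background={background:4.1f}  demodulated output={output:.1f}")
```

In this toy model the demodulated output stays at zero until the background charge has filled the valley, reproducing the flat zero-background curve of Fig. 10.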
By filling the region with charges through this reset operation, the difference between the two outputs is expected to be reduced. We fabricated a test circuit to confirm this idea: in the circuit, we added an NMOSFET between the parasitic n-diff region
and GND. We have confirmed that the output characteristics in the low-light-intensity region are improved by introducing this mechanism.

CONCLUSION
We have proposed an image sensor that detects modulated light. The device has a simple structure and the potential for high sensitivity compared with other methods. An image sensor chip with 64 x 64 pixels has been fabricated using a 0.6 µm CMOS technology and used to demonstrate the acquisition of modulated images. It has also been demonstrated that this sensor can be applied to motion capture.

Figure 10: Differential output as a function of modulated light intensity. The parameter is the background light intensity (0, 0.20, 0.41, and 0.65 nW).

REFERENCES
1. S. Ando and A. Kimachi, "Time-Domain Correlation Image Sensor: First CMOS Realization of Demodulator Pixel Array," IEEE Workshop on CCD & APS, pp. 33-36, Karuizawa, Japan, June 1999.
2. T. Spirig, P. Seitz, O. Vietze, and F. Heitger, "The Lock-In CCD - Two-Dimensional Synchronous Detection of Light," IEEE J. Quantum Electron., 31, pp. 1705-1708, 1995.
3. B. Buxbaum, R. Schwarte, and T. Ringbeck, "PMD-PLL: Receiver Structure for Incoherent Communication and Ranging Systems," Proc. SPIE, 3850, pp. 116-127, 1999.
4. E. R. Fossum, "CMOS Image Sensors: Electronic Camera-On-A-Chip," IEEE Trans. Electron Devices, 44, pp. 1689-1698, 1997.
5. R. Miyagawa and T. Kanade, "CCD-Based Range-Finding Sensor," IEEE Trans. Electron Devices, 44, pp. 1648-1652, 1997.
6. S.-Y. Ma and L.-G. Chen, "A Single-Chip CMOS APS Camera with Direct Frame Difference Output," IEEE J. Solid-State Circuits, 34, pp. 1415-1418, 1999.