Short-course: Compressive Sensing of Videos. Venue: CVPR 2012, Providence, RI, USA. June 16, 2012. Richard G. Baraniuk, Mohit Gupta, Aswin C. Sankaranarayanan, Ashok Veeraraghavan
Tutorial Outline
Time | Presenter | Topic
1:30-2:00 | Mohit Gupta (Columbia University) | Introduction and Motivation
2:00-3:00 | Aswin Sankaranarayanan (Rice University) | Compressive Sensing Theory and Sparse Representations
3:30-4:30 | Ashok Veeraraghavan (Rice University) | Compressive Video Sensing Systems
4:30-5:00 | Mohit Gupta (Columbia University) | Discussion of CS in Other Domains and Related Problems
Space Shuttle Discovery Flight Deck Gigapan: 2.74 Gigapixels http://www.gigapan.com/gigapans/102753
Still Life Gigapan: 0.88 Gigapixels http://www.gigapan.com/gigapans/105851
Playing Drums Frame Rate: 50 fps
Playing Drums Frame Rate: 500 fps
Splashing Marbles Frame Rate: 50 fps
Splashing Marbles Frame Rate: 500 fps
Detail Fascinates
Automotive Testing Frame Rate: 2000 fps
Biomechanical Analysis Frame Rate: 2000 fps
Military Testing Frame Rate: 2000 fps
Selling Insurance Frame Rate: 4000 fps
Promoting HDTV Frame Rate: 1000 fps
Microscopy Frame Rate: 500 fps
Golf Swing Test Frame Rate: 10000 fps
Nature Frame Rate: 1000 fps Images captured for the BBC production "Life"
Nature Frame Rate: 1000 fps
High-Speed Schlieren Imaging Frame Rate: 500 fps
Capturing Photons Frame Rate: Trillion fps http://cameraculture.media.mit.edu/femtotransientimaging
Fun Frame Rate: 2000 fps
Cost: High-Performance Video Cameras
Product Name | Cost for Demo Unit | Cost for New Unit
SA5 775K M1 (MONO 8 GIGS) | $68,500 | $90,000
SA5 775K M2 w/ mech. Shutter | $77,000 | $103,120
SA5 1000K C2 RV COLOR - 16 GIGS | $80,000 | $113,120
BC2 HD with Keypad | $90,000 | $132,400
SA2 M2 (MONO 16 GIG HIGH DEF) | $55,500 | $100,000
No consumer high-performance sensors (>1MP, >1000fps).
Photron cameras. Quotation source: Email from techimaging.com representative.
Why are these sensors so expensive? 1. Light Limitation. [Figure: incident illumination reflects off the scene toward the sensor; the scene is drawn as a space-time volume with axes x, y, t.]
Light Limitation: Signal level (electrons) is determined by exposure time, incident illumination, pixel size, F-number, scene reflectivity, and quantum efficiency. [Cossairt et al., 2012]
Light Limitation
I_src (lux) | 2 x 10^-3 | 1 x 10^-2 | 2 | 10 | 10^2 | 10^3 | 10^4
Number of electrons | 6.2 x 10^-4 | 3.1 x 10^-3 | 0.62 | 3.08 | 30.4 | 304 | 3040
1000 FPS, 10MP camera: exposure time t = 1/1000 seconds, pixel size = 4 µm. [Cossairt et al., 2012]
Light Limitation: Highly sensitive sensors required. [Cossairt et al., 2012]
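As a rough sketch of how the light budget changes with frame rate, the snippet below rescales one entry of the table (3.08 electrons at 10 lux, 1/1000 s exposure, 4 µm pixels). It assumes the signal is linear in illumination, exposure time, and pixel area. The linear model and the helper name `electrons_per_pixel` are assumptions for illustration, not the formula from [Cossairt et al., 2012]; the table's own entries deviate slightly from exact linearity.

```python
# Reference operating point read off the slide's table.
REF_LUX = 10.0
REF_ELECTRONS = 3.08
REF_EXPOSURE_S = 1.0 / 1000.0   # 1000 fps
REF_PIXEL_UM = 4.0


def electrons_per_pixel(lux, exposure_s=REF_EXPOSURE_S, pixel_um=REF_PIXEL_UM):
    """Rescale the reference value (hypothetical helper).

    Assumes signal is proportional to illumination, exposure time,
    and pixel area (pixel size squared).
    """
    scale = (
        (lux / REF_LUX)
        * (exposure_s / REF_EXPOSURE_S)
        * (pixel_um / REF_PIXEL_UM) ** 2
    )
    return REF_ELECTRONS * scale


if __name__ == "__main__":
    # Same 100 lux scene captured at different frame rates.
    for fps in (30, 1000, 10000):
        print(fps, "fps:", electrons_per_pixel(100.0, exposure_s=1.0 / fps))
```

Under these assumptions, every tenfold increase in frame rate costs a tenfold drop in collected electrons, which is why high-speed capture pushes sensors toward larger pixels or higher sensitivity.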
Why are these sensors so expensive? 2. Noise. [Figure labels: signal, photon noise, dark noise, read noise.] Slide: Courtesy Marc Levoy
SNR Over The Years. http://www.dxomark.com/index.php/publications/DxOMark-Insights/SNR-evolution-over-time Sensor technology has improved significantly over the years, but the total number of voxels per unit volume has risen to offset these improvements, so SNR has remained static. For higher-performance cameras of the future, sensor technology has to keep up with the rising number of voxels. Slide: Courtesy Marc Levoy
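The noise sources named on the noise slide (photon noise, dark noise, read noise) are often combined in a simple per-pixel SNR model. The sketch below assumes photon and dark noise are Poisson (variance equal to the mean) and read noise is additive with a fixed standard deviation. The function name and the example read-noise value are assumptions, not figures from the slides.

```python
import math


def pixel_snr(signal_e, dark_e=0.0, read_noise_e=0.0):
    """SNR of one pixel under a common noise model (illustrative).

    signal_e:     mean photo-electrons collected
    dark_e:       mean dark-current electrons
    read_noise_e: standard deviation of read noise, in electrons
    """
    variance = signal_e + dark_e + read_noise_e ** 2
    if variance == 0.0:
        return 0.0
    return signal_e / math.sqrt(variance)


if __name__ == "__main__":
    # Electron counts taken from the light-limitation table;
    # read noise of 5 electrons is an assumed value.
    for electrons in (3.08, 30.4, 304.0, 3040.0):
        print(electrons, pixel_snr(electrons, read_noise_e=5.0))
```

With few electrons per pixel, read noise dominates and SNR falls roughly linearly with signal; with many electrons, photon noise dominates and SNR grows only as the square root of the signal.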
Why are these sensors so expensive? 3. Bandwidth. Frame rate is limited by the sensor readout rate: analog-to-digital conversion; time required to clear charge from the parallel register; shutter opening delay in CCDs employing mechanical shutters. 1MP x 1000fps x 16-bit pixels = 16 Gbit/s = 2 GB/s. Expensive!
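The readout-rate arithmetic can be checked directly. This is a minimal sketch; the function name is an assumption, and it counts raw pixel data only (no protocol or blanking overhead). With 8 bits per byte, 1 MP x 1000 fps x 16 bits works out to 16 Gbit/s, i.e. 2 GB/s.

```python
def readout_rate_bytes_per_s(pixels, fps, bits_per_pixel):
    """Raw sensor data rate in bytes per second (illustrative helper)."""
    return pixels * fps * bits_per_pixel / 8


if __name__ == "__main__":
    rate = readout_rate_bytes_per_s(pixels=1_000_000, fps=1000, bits_per_pixel=16)
    print(rate / 1e9, "GB/s")
```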
Spatio-Temporal Resolution Tradeoff. Single image: Spatial Resolution = 1X, Temporal Resolution = 1X
Spatio-Temporal Resolution Tradeoff. Captured thin-out movie (row-wise sub-sampling) and interpolated movie: Spatial Resolution = 1/4X, Temporal Resolution = 4X
Spatio-Temporal Resolution Tradeoff. Captured thin-out movie (row-wise sub-sampling) and interpolated movie: Spatial Resolution = 1/36X, Temporal Resolution = 36X
Spatio-Temporal Resolution Tradeoff High-speed, High-res Video
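The tradeoff follows from a fixed pixel readout budget: reading only every k-th row leaves 1/k of the spatial samples per frame and allows k times as many frames per second. The sketch below assumes readout time is proportional to the number of pixels read; the sensor size and base frame rate are hypothetical values.

```python
def thin_out_tradeoff(rows, cols, base_fps, k):
    """Rows read per frame and achievable frame rate when only every
    k-th row is read out, at a fixed pixel readout rate (illustrative)."""
    pixel_rate = rows * cols * base_fps   # pixels per second the sensor can read
    rows_read = rows // k                 # row-wise sub-sampling
    fps = pixel_rate / (rows_read * cols)
    return rows_read, fps


if __name__ == "__main__":
    # Hypothetical 1280 x 720 sensor that reads full frames at 30 fps.
    for k in (1, 4, 36):
        print(k, thin_out_tradeoff(rows=720, cols=1280, base_fps=30, k=k))
```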
Why are these sensors so expensive? 4. Non-visible wavelength sensors Infrared image Infrared camera (FLIR T620) Resolution: 640x480. Cost: $26,000. Expensive!
Do we need to capture all this data?
Redundancy in Visual Data Raw Captured Image (1.2MB) JPEG Compressed Image (40 KB) 30X Compression without significant loss of visual quality
Redundancy in Visual Data Raw Captured Video (270MB) H.264 Compressed Video (1.8 MB) 150X Compression without significant loss of visual quality
Redundancy in Visual Data Raw Captured Video (270MB) H.264 Compressed Video (1.8 MB) Massive data acquisition. Most of the data is redundant and can be thrown away.
The Sparseland Model for Images (Videos) Each image (video) patch = Sparse linear combination of dictionary atoms Slide courtesy: Guillermo Sapiro
The Sparseland Model for Images (Videos) Examples of dictionary: Wavelets, DCT, learned dictionaries Slide courtesy: Guillermo Sapiro
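To make the sparse-combination idea concrete, the sketch below approximates a signal with its k largest coefficients in an orthonormal DCT dictionary. It uses a 1D signal rather than a 2D image patch for brevity, and the test signal and function names are assumptions for illustration.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II basis; each row is one dictionary atom."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    D = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    D[0, :] /= np.sqrt(2.0)
    return D


def sparse_approx(x, D, k):
    """Keep the k largest-magnitude coefficients of x in the basis D.

    Returns the approximation and the sparse coefficient vector.
    """
    coeffs = D @ x
    keep = np.argsort(np.abs(coeffs))[::-1][:k]
    sparse = np.zeros_like(coeffs)
    sparse[keep] = coeffs[keep]
    return D.T @ sparse, sparse


if __name__ == "__main__":
    n = 64
    t = np.linspace(0.0, 1.0, n)
    x = np.cos(2 * np.pi * t) + 0.5 * t      # smooth, hypothetical signal
    D = dct_matrix(n)
    for k in (2, 8, 64):
        approx, _ = sparse_approx(x, D, k)
        rel_err = np.linalg.norm(x - approx) / np.linalg.norm(x)
        print(k, "atoms, relative error:", rel_err)
```

Because the dictionary is orthonormal, keeping more atoms can never increase the error. Learned or overcomplete dictionaries need a sparse-coding solver instead of simple thresholding.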
Capturing Relevant Data. "One can regard the possibility of digital compression as a failure of sensor design. If it is possible to compress measured data, one might argue that too many measurements were taken." (David Brady)
Capturing Relevant Data Can we design sensing systems that capture only the relevant data?
What is Compressive Sensing? Compressive sensing refers to data acquisition protocols that directly acquire just the important information. It is about acquiring and recovering a signal in the most efficient way possible.
What is Compressive Sensing? Acquisition distinguishes compressive sensing from image processing.
Compressive Sensing Acquisition: Time-domain measurements. Frequency-domain measurements. Important to take "good" measurements.
Compressive Sensing: Challenges What are good measurements? How to take measurements in sparse domain?
Compressive Sensing: Enablers. Incoherent Measurements for Sparse Recovery. [Figure labels: Signal, Measurements.] Signal is local, measurements are global. Each measurement picks up a little information about each component. See papers by Candes, Romberg, Tao, Donoho for details. Slide courtesy: Emmanuel Candes
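A small numerical sketch of this idea: a sparse ("local") signal is measured with dense random ("global") projections and recovered from far fewer measurements than samples. Recovery here uses Orthogonal Matching Pursuit, a greedy method chosen for brevity. It is not necessarily the algorithm in the cited papers, which largely study l1 minimization. The sizes, seed, and function names are assumptions.

```python
import numpy as np


def omp(A, y, k):
    """Orthogonal Matching Pursuit: greedily select up to k columns of A."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        # Column most correlated with what is still unexplained.
        j = int(np.argmax(np.abs(A.T @ residual)))
        if j not in support:
            support.append(j)
        # Least-squares fit on the chosen columns, then update the residual.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x_hat = np.zeros(A.shape[1])
    x_hat[support] = coef
    return x_hat


def demo(n=256, m=100, k=4, seed=0):
    """Measure a k-sparse length-n signal with m random projections
    and return the relative recovery error."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    x[rng.choice(n, size=k, replace=False)] = rng.choice([-1.0, 1.0], size=k)
    A = rng.standard_normal((m, n)) / np.sqrt(m)   # incoherent measurement matrix
    y = A @ x                                      # each entry mixes all components
    x_hat = omp(A, y, k)
    return np.linalg.norm(x - x_hat) / np.linalg.norm(x)


if __name__ == "__main__":
    print("relative error:", demo())
```

Every row of `A` touches every entry of `x`, so each measurement carries a little information about each component. That is what lets m = 100 measurements stand in for n = 256 samples in this toy setting.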
Compressive Sensing: Challenges What are good measurements? How to take coded measurements?
Compressive Sensing: Enablers. Computational Imaging and Optical Devices. [Figure labels: Conventional Camera, Computational Camera.] A computational camera uses novel optics to map rays to pixels in some unconventional fashion. The captured image is optically coded and may not be meaningful in its raw form. The computational module decodes the captured image. See papers by Nayar, Levoy, Raskar, Freeman, Durand.
Novel Optical Devices. Digital Micro-mirror Device (DMD) [10 kHz]: Single Pixel Camera [Rice]. Liquid Crystal on Silicon (LCoS) [5 kHz]: Compressive Video Acquisition System [MERL, Rice, Columbia].
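The sketch below simulates a single-pixel camera, assuming each DMD pattern produces one photodetector reading equal to the sum of the scene pixels whose mirrors face the detector. It also shows how the modulator's switching rate bounds the frame rate when each frame needs many patterns. The scene size, measurement count, and function names are assumptions for illustration.

```python
import numpy as np


def spc_measure(scene, patterns):
    """One reading per binary pattern: the inner product of pattern and scene.

    scene:    (H, W) array of intensities
    patterns: (M, H, W) array of 0/1 mirror states
    """
    flat_patterns = patterns.reshape(patterns.shape[0], -1).astype(float)
    return flat_patterns @ scene.ravel()


def spc_frame_rate(pattern_rate_hz, measurements_per_frame):
    """Frames per second if patterns are displayed one at a time."""
    return pattern_rate_hz / measurements_per_frame


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    height, width = 64, 64
    m = (height * width) // 4                  # assumed 25% measurement rate
    scene = rng.random((height, width))
    patterns = rng.integers(0, 2, size=(m, height, width))
    y = spc_measure(scene, patterns)
    print("measurements:", y.shape[0])
    print("frame rate at 10 kHz:", spc_frame_rate(10_000, m), "fps")
```

Even at a 10 kHz pattern rate, a modest-resolution frame at this measurement rate takes a noticeable fraction of a second to acquire, so the scene can move while measurements are still being collected.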
Why is Video Compressive Sensing Hard?
Specifications of the Human Eye Spatial resolution: Approximately 500 MegaPixels Temporal resolution: Approximately 15-20 Frames Per Second http://www.clarkvision.com/articles/eye-resolution.html
Eye as a Jitter Camera. [Figure labels: Eye FOV; "single-view" resolution.]
Eye as a Jitter Camera Eye FOV Higher Resolution Jitter Camera [Ben Ezra and Nayar]
Jittering in Time? Time is ephemeral in nature, so it is hard to take multiple measurements of the same duration. Multiple cameras can be used, but this is an expensive solution and raises other practical issues such as registration. High-speed video using a camera array [Levoy et al.]
Tutorial Outline
Time | Presenter | Topic
1:30-2:00 | Mohit Gupta | Introduction and Motivation
2:00-3:00 | Aswin Sankaranarayanan | Compressive Sensing Theory and Sparse Representations
3:30-4:30 | Ashok Veeraraghavan | Compressive Video Sensing Systems
4:30-5:00 | Mohit Gupta | Discussion of CS in Other Domains and Related Problems