Spring 15
CIS 5543 Computer Vision
Visual data acquisition devices
Introduction
Haibin Ling
http://www.dabi.temple.edu/~hbling/teaching/15s_5543/index.html
Revised from S. Lazebnik

The goal of computer vision
  To perceive the story behind visual data.
  What exactly does this mean?
  Vision as a source of metric 3D information.
  Vision as a source of semantic information.

What we see / What a computer sees
  Source: S. Narasimhan

Vision as measurement device
  Real-time stereo (NASA Mars Rover)
  Structure from motion (Pollefeys et al.)
  Multi-view stereo for community photo collections (Goesele et al.)

Vision as a source of semantic information
Object categorization
  Image labels: sky, building, flag, banner, bus, face, street lamp, bus, wall, cars

Scene and context categorization
  Image labels: outdoor, city, traffic

Qualitative spatial information
  Image labels: slanted, vertical, horizontal, rigid moving object, non-rigid moving object

Why study computer vision?
  Vision is useful: Images and video are everywhere!
  Personal photo albums
  Movies, news, sports
  Surveillance and security
  Medical and scientific images

Why study computer vision?
  Vision is interesting: visual illusion
  http://interactive.usc.edu/members/yuechuan/archives/2004/09/
Why study computer vision?
  Vision is useful
  Vision is interesting
  Vision is difficult
    Half of primate cerebral cortex is devoted to visual processing
    Achieving human-level visual perception is probably AI-complete
  Chi et al. 2010

Why is computer vision difficult?

Challenges: viewpoint variation
  Michelangelo 1475-1564

Challenges: illumination
  Image credit: J. Koenderink

Challenges: scale
Challenges: deformation
  Xu, Beihong 1943

Challenges: occlusion
  Magritte, 1957

Challenges: background clutter

Challenges: motion

Challenges: object intra-class variation

Challenges: object inter-class similarity
  Camouflage: Katydid emulating a leaf.
  Phyllium giganteum & leaf
Challenges: shape ambiguity
  (a) Lonicera_maackii  (b) Prunus_serrotina  (c) Lonicera_maackii
  Ling & Jacobs 2007

Challenges: local ambiguity
  Revised from: Fei-Fei, Fergus & Torralba

Challenges or opportunities?
  Images are confusing, but they also reveal the structure of the world through numerous cues
  Our job is to interpret the cues!
  Image source: J. Koenderink

Depth cues: Linear perspective

Depth cues: Aerial perspective

Depth ordering cues: Occlusion
  Source: J. Koenderink
Shape cues: Texture gradient

Lighting cues: Shading

Position and lighting cues: Cast shadows
  Source: J. Koenderink

Grouping cues: Similarity (color, texture, proximity)

Grouping cues: Common fate
  Image credit: Arthus-Bertrand (via F. Durand)
Bottom line
  Perception is an inherently ambiguous problem
    Many different 3D scenes could have given rise to a particular 2D picture
  Possible solutions
    Bring in more constraints (more images)
    Use prior knowledge about the structure of the world
  Need a combination of different methods
  Image source: F. Durand

Connections to other disciplines
  Computer Vision is connected to: Artificial Intelligence, Robotics, Machine Learning, Computer Graphics, Image Processing, Psychology, Neuroscience
  Source: S. Lazebnik

Origins of computer vision
  L. G. Roberts, Machine Perception of Three Dimensional Solids, Ph.D. thesis, MIT Department of Electrical Engineering, 1963.

Progress to date
  The next slides show some examples of what current vision systems can do

Earth viewers (3D modeling)
  Image from Microsoft's Virtual Earth (see also: Google Earth)

Photosynth
  http://labs.live.com/photosynth/
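The "Bottom line" slide above says that many different 3D scenes could have given rise to the same 2D picture. The following is a minimal sketch of one such case, assuming an ideal pinhole camera at the origin looking down the Z axis (the slides do not name a camera model, so this is an assumption). The names project and scale_scene and the triangle coordinates are hypothetical, chosen only for illustration.

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a 3D point (X, Y, Z), Z > 0, onto the image plane."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (focal_length * x / z, focal_length * y / z)


def scale_scene(points, s):
    """Scale every 3D point by factor s about the camera center."""
    return [(s * x, s * y, s * z) for (x, y, z) in points]


# Hypothetical scene: three corners of a triangle, 4 units from the camera.
near_scene = [(0.0, 0.0, 4.0), (1.0, 0.0, 4.0), (0.0, 2.0, 4.0)]
# The same triangle made twice as large and moved twice as far away.
far_scene = scale_scene(near_scene, 2.0)

near_image = [project(p) for p in near_scene]
far_image = [project(p) for p in far_scene]

# Both scenes land on identical image coordinates, so a single picture
# cannot tell the small, near triangle from the large, far one.
print(near_image == far_image)
```

Under this model, a second image from a different viewpoint or prior knowledge of the triangle's true size would separate the two scenes, which matches the "possible solutions" listed on the slide.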
Optical character recognition (OCR)
  Recognize text
  Digit recognition, AT&T labs
    http://www.research.att.com/~yann/
  License plate readers
    http://en.wikipedia.org/wiki/automatic_number_plate_recognition

Face detection
  Many new digital cameras now detect faces
    Canon, Sony, Fuji, ...

Smile detection?
  Sony Cyber-shot T70 Digital Still Camera

Object recognition (in supermarkets)
  LaneHawk by EvolutionRobotics
  "A smart camera is flush-mounted in the checkout lane, continuously watching for items. When an item is detected and recognized, the cashier verifies the quantity of items that were found under the basket, and continues to close the transaction. The item can remain under the basket, and with LaneHawk, you are assured to get paid for it"

Face recognition
  Who is she?

Vision-based biometrics
  "How the Afghan Girl was Identified by Her Iris Patterns" (read the story)
Login without a password
  Fingerprint scanners on many new laptops, other devices
  Face recognition systems now beginning to appear more widely
    http://www.sensiblevision.com/

Object recognition (in mobile phones)
  This is becoming real:
    Microsoft Research
    Point & Find
    iPhone Apps: (www.kooaba.com)
    iPhone Apps: (www.snaptell.com)

Special effects: shape capture
  The Matrix movies, ESC Entertainment, XYZRGB, NRC

Special effects: motion capture
  Pirates of the Caribbean, Industrial Light and Magic
Sports
  Sportvision first down line
  Nice explanation on www.howstuffworks.com

Smart cars
  Slide content courtesy of Amnon Shashua
  Mobileye
    Vision systems currently in high-end BMW, GM, Volvo models
    By 2010: 70% of car manufacturers.

Vision-based interaction (and games)
  Kinect
  Sony EyeToy
  Nintendo Wii has camera-based IR tracking built in. See Lee's work at CMU on clever tricks on using it to create a multi-touch display!
  Assistive technologies
    http://www.youtube.com/watch?v=0_sayyxgo3u&feature=player_embedded

Vision in space
  NASA's Mars Exploration Rover Spirit captured this westward view from atop a low plateau where Spirit spent the closing months of 2007.
  Vision systems (JPL) used for several tasks
    Panorama stitching
    3D terrain modeling
    Obstacle detection, position tracking
  For more, read "Computer Vision on Mars" by Matthies et al.

Robotics
  NASA's Mars Spirit Rover
    http://en.wikipedia.org/wiki/spirit_rover
  http://www.robocup.org/
The computer vision industry
  A list of companies:
    http://www.cs.ubc.ca/spider/lowe/vision.html