COMP 9517 Computer Vision Introduction 1
What is Computer Vision? 2
Every picture tells a story. The goal of computer vision is to write computer programs that can interpret images. 3
Can computers match (or beat) human vision? Yes and no (but mostly no!): humans are much better at hard things; computers can be better at easy things. 4
Human perception has its shortcomings. Sinha and Poggio, Nature, 1996 5
Copyright A.Kitaoka 2003 6
Current State of the Art Here are some examples 7
Earth viewers (3D modeling) Image from Microsoft's Virtual Earth (see also: Google Earth) 25/07/2016 COMP 9517 S2, 2016 8
Optical character recognition (OCR): technology to convert scanned docs to text. If you have a scanner, it probably came with OCR software. Digit recognition, AT&T labs http://www.research.att.com/~yann/ License plate readers http://en.wikipedia.org/wiki/automatic_number_plate_recognition 9
Face detection: many new digital cameras now detect faces (Canon, Sony, Fuji, ...) 10
Smile detection? Sony Cyber-shot T70 Digital Still Camera 11
Object recognition (in supermarkets) LaneHawk by Evolution Robotics: "A smart camera is flush-mounted in the checkout lane, continuously watching for items. When an item is detected and recognized, the cashier verifies the quantity of items that were found under the basket, and continues to close the transaction. The item can remain under the basket, and with LaneHawk, you are assured to get paid for it" 12
Face recognition: Who is she? 13
Vision-based biometrics: "How the Afghan Girl was Identified by Her Iris Patterns". Read the story at http://www.cl.cam.ac.uk/~jgd1000/afghan.html 14
Login without a password. Fingerprint scanners on many new laptops, other devices. Face recognition systems now beginning to appear more widely: http://www.sensiblevision.com/ 15
Object recognition (in mobile phones) This is becoming real: Microsoft Research; Point & Find, Nokia 16
Special effects: shape capture The Matrix movies, ESC Entertainment, XYZRGB, NRC 17
Special effects: motion capture. Pirates of the Caribbean, Industrial Light and Magic 18
Sports: Sportvision first-down line. How do they superimpose the first-down line on the field on televised football games? Nice explanation on www.howstuffworks.com 19
Smart cars (slide content courtesy of Amnon Shashua). Mobileye: vision systems currently in high-end BMW, GM, Volvo models. By 2010: 70% of car manufacturers. 20
Vision-based interaction (and games). Digimask: put your face on a 3D avatar. Nintendo Wii has camera-based IR tracking built in. See Lee's work at CMU on clever tricks on using it to create a multi-touch display! "Game turns moviegoers into Human Joysticks", CNET. Camera tracking a crowd, based on this work. 21
Vision in space. NASA's Mars Exploration Rover Spirit captured this westward view from atop a low plateau where Spirit spent the closing months of 2007. Vision systems (JPL) used for several tasks: panorama stitching; 3D terrain modeling; obstacle detection, position tracking. For more, read "Computer Vision on Mars" by Matthies et al. 22
Robotics: NASA's Mars Spirit Rover http://en.wikipedia.org/wiki/spirit_rover http://www.robocup.org/ 23
Medical imaging. 3D imaging: MRI, CT. Image guided surgery: Grimson et al., MIT 24
More CV Applications. Vision-based HCI. EyeMouse: a vision-based eye control system. How can the human head and eyes be used to control computers? Computer vision and a webcam track the eyes and head; shakes and winks control a mouse pointer on the screen. Facial expression recognition. Challenge: clutter and real time. Game controller: Cam-Trax 25
Geographical: GIS. Interpreting satellite images. Road detection for creating maps: edge detection, road edge classification and linking. Challenge: complex and wide scene, occlusion, low resolution or large data size. 26
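The edge detection step named on this slide can be illustrated with a Sobel operator. The following is a minimal sketch in pure Python, assuming a grayscale image stored as a list of rows; the function names (`sobel_magnitude`, `edge_map`), the threshold value and the toy image are hypothetical and not taken from the lecture. A real road detector would add the classification and linking stages and would normally use an image library.

```python
import math

def sobel_magnitude(image):
    """Sobel gradient magnitude of a 2D grayscale image (list of rows).

    Border pixels are left at 0.0 to keep the sketch short.
    """
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient kernel
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient kernel
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    p = image[y + dy - 1][x + dx - 1]
                    gx += kx[dy][dx] * p
                    gy += ky[dy][dx] * p
            out[y][x] = math.hypot(gx, gy)
    return out

def edge_map(image, threshold):
    """Binary edge map: 1 where the gradient magnitude reaches the threshold."""
    return [[1 if m >= threshold else 0 for m in row]
            for row in sobel_magnitude(image)]

# Toy image (assumed data): dark left half, bright right half.
toy = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
edges = edge_map(toy, threshold=100.0)
for row in edges:
    print(row)
```

On the toy image the interior rows are marked as edges only in the two columns that straddle the dark-to-bright boundary, which is the kind of response a road-edge classifier would take as input.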
Medical Imaging. Enhance imagery, or identify important phenomena or events, or visualise information obtained by imaging. Parenchymal bands: linear structures touching the lung boundary. Segment and classify candidate regions into positive (parenchymal bands) and negative (others) classes. Challenge: often attached to other structures, in this case a nodular mass; similar appearance to blood vessels. 27
Video Surveillance. Traffic monitoring: object tracking; action recognition (driving, stopping, etc.); vehicle speed; counting. Challenge: occlusion, illumination changes and non-linear speed. 28
Image/video retrieval. Content-based retrieval. Search engine. Challenge: big data volume, semantic 29
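One common way to illustrate content-based retrieval is to rank database images by how similar their intensity histograms are to the query's. The sketch below assumes images are flat lists of 8-bit grey values and uses histogram intersection as the similarity measure; the function names, the bin count and the tiny database are hypothetical, and the slide does not say which features or measure the course has in mind.

```python
def histogram(pixels, bins=4, max_value=256):
    """Normalised grey-level histogram of a flat list of pixel values."""
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // max_value] += 1
    total = len(pixels)
    return [c / total for c in counts]

def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def retrieve(query, database, bins=4):
    """Return database image names ordered from most to least similar to the query."""
    qh = histogram(query, bins)
    scored = [(intersection(qh, histogram(px, bins)), name)
              for name, px in database.items()]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [name for _, name in scored]

# Assumed toy database: three tiny "images" given as flat pixel lists.
database = {
    "dark": [10, 20, 30, 40],
    "bright": [200, 220, 240, 250],
    "mixed": [10, 100, 150, 250],
}
ranking = retrieve([5, 15, 35, 60], database)
print(ranking)
```

A histogram ignores where pixels are and what they depict, so two unrelated images can score as similar; that limitation is one face of the semantic challenge the slide mentions.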
Text Recognition. Converting information from paper documents into digital form. Challenge: semantic interpretation. "I looked as hard as I could see, beyond 100 plus infinity an object of bright intensity - it was the back of me!" 30
Application Videos 31
Goals of Computer Vision. Extract useful information from images. Complexity of visual data is a challenge. Recent progress due to higher processing power, memory, storage capacity. Image -> measurements -> model -> algorithms for learning and inference 32
Computer Vision Topics. Requires a solid understanding of the camera and of the physical process of imaging to: - obtain simple inferences from individual pixel values - combine the information available in multiple images into a coherent whole - enforce some order on groups of pixels to separate them from each other or infer shape information - recognise objects using geometric information or probabilistic techniques. 33
Critical Issues. Sensing: how do sensors obtain images of the world? Encoded information: how do images yield information about the scene, such as color, texture, shape, motion, etc.? Representations: what representations are appropriate to describe objects? Algorithms: what algorithms process image information and construct scene descriptions? 34
Computer Vision Processes. Low-level processes: use little knowledge of image content; include image compression, noise filtering, edge extraction, ...; use data which resemble the input image, e.g. a matrix of picture elements. High-level processes: based on knowledge, goals, plans; use Artificial Intelligence methods; simulate human cognition and decision making based on information in the image; cognitive processes, geometric models, goals, plans, ... 35
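Noise filtering, one of the low-level processes listed on this slide, can be illustrated with a 3x3 median filter. This is a minimal sketch in pure Python, assuming a grayscale image stored as a list of rows; the function name `median_filter` and the toy image are hypothetical, and the median filter is only one of several filters that could stand in here. Note that both input and output are matrices of picture elements, as the slide says of low-level data.

```python
from statistics import median

def median_filter(image):
    """Replace each interior pixel by the median of its 3x3 neighbourhood.

    Border pixels are copied unchanged to keep the sketch short.
    """
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [image[y + dy][x + dx]
                      for dy in (-1, 0, 1)
                      for dx in (-1, 0, 1)]
            out[y][x] = median(window)
    return out

# Assumed toy image: uniform grey with a single "salt" noise pixel.
noisy = [[10] * 5 for _ in range(5)]
noisy[2][2] = 255
clean = median_filter(noisy)
for row in clean:
    print(row)
```

The filter needs no knowledge of what the image shows, which is what distinguishes it from the high-level processes on the same slide.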
Low Level Vision: almost entirely digital image processing. Sensing: image capture and digitisation. Pre-processing: improve image quality (suppress noise, enhance object features, edge extraction). Image segmentation: separate objects from background, partition image into objects of interest. Description: compute features which differentiate objects, also called feature extraction. Classification: assign labels to image segments (regions). 36
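The segmentation, description and classification stages listed on this slide can be chained into a toy pipeline. The sketch below assumes a grayscale image stored as a list of rows and uses deliberately simple stand-ins: a global threshold for segmentation, 4-connected region labelling, area and bounding-box size as features, and a minimum-area rule as the classifier. All function names, thresholds and the toy image are hypothetical choices for illustration, not methods prescribed by the lecture.

```python
from collections import deque

def segment(image, threshold):
    """Segmentation: binary mask separating bright objects from dark background."""
    return [[1 if p > threshold else 0 for p in row] for row in image]

def label_regions(mask):
    """Group foreground pixels into 4-connected regions (lists of (y, x))."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                seen[y][x] = True
                queue, pixels = deque([(y, x)]), []
                while queue:
                    cy, cx = queue.popleft()
                    pixels.append((cy, cx))
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                regions.append(pixels)
    return regions

def describe(pixels):
    """Description: simple features of one region."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    return {"area": len(pixels),
            "height": max(ys) - min(ys) + 1,
            "width": max(xs) - min(xs) + 1}

def classify(features, min_area=3):
    """Classification: label a region from its features (assumed area rule)."""
    return "object" if features["area"] >= min_area else "noise"

# Assumed toy image: one 2x2 bright blob and one isolated bright pixel.
toy = [
    [0,   0,   0, 0,   0, 0],
    [0, 200, 210, 0,   0, 0],
    [0, 205, 220, 0, 180, 0],
    [0,   0,   0, 0,   0, 0],
]
features = [describe(r) for r in label_regions(segment(toy, threshold=128))]
labels = [classify(f) for f in features]
print(list(zip(labels, features)))
```

Each stage consumes the previous stage's output, so the data moves from a pixel matrix to a mask, to regions, to feature records, to labels.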
High Level Vision. About knowledge construction, representation and inference. Recognition: identification of objects. Interpretation: assign meaning to groups of recognized objects. Scene analysis. 37
For Reading: Chapter 1, Szeliski; Chapter 1, Shapiro and Stockman 38
Acknowledgement: Some images on applications taken from the textbook resources for the text by Szeliski 2010, with original sources credited where possible. Videos credited. 39