CENG 595 Selected Topics in Computer Engineering Computer Vision Zafer ARICAN, PhD
Today Administrivia What is Computer Vision? Why is it a difficult problem? State of the art Brief course syllabus
Instructor Zafer ARICAN, PhD zafer.arican@turktelekom.com.tr PhD, Ecole Polytechnique Federale de Lausanne, Switzerland Team Leader at Turk Telekom R&D Thesis: Analysis and Processing of Omnidirectional Images in a Spherical Framework
My Research at TT R&D Augmented Reality SDK
Course Prerequisites Data structures Familiarity with linear algebra Familiarity with probability A good working knowledge of C++ programming (or willingness and time to pick it up quickly) Knowledge of machine learning and image processing is a plus
Textbook Primary textbook Richard Szeliski, Computer Vision: Algorithms and Applications Online at: http://www.szeliski.org/book Secondary textbooks Forsyth, David A., and Ponce, J. Computer Vision: A Modern Approach, Prentice Hall, 2003. Hartley, R. and Zisserman, A. Multiple View Geometry in Computer Vision, Cambridge University Press, 2002.
Grading Programming Problem Sets (30%) Will be announced during each related lecture 4-5 homeworks Final Project (70%) Programming 15 Min. Presentation A technical report Strong class participation Not mandatory but can offset negative performance
OpenCV Strong Computer Vision Library Mainly in C++ Interfaces for C, C++, Python, Java Supports Windows, Linux, Mac OS, iOS and Android Will be used for programming homeworks and final project Web site: http://opencv.org/ Reference version: 2.4.3
What is Computer Vision? What do you see? Which city? What is there in the photo? Which one in the front? Which one behind? People standing or walking?
What we see What a computer sees
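To make the contrast concrete, here is a minimal sketch of what a computer receives: a grid of intensity values with no notion of objects. The 5x5 pixel values and the `render` helper are hypothetical, chosen only for illustration; real images are far larger and usually have three color channels.

```python
# A tiny 5x5 grayscale "image" (hypothetical values): 0 = black, 255 = white.
image = [
    [0,   0, 255,   0, 0],
    [0, 255, 255,   0, 0],
    [0,   0, 255,   0, 0],
    [0,   0, 255,   0, 0],
    [0, 255, 255, 255, 0],
]


def render(img, threshold=128):
    """Draw bright pixels as '#' and dark pixels as '.' so a person can see the shape."""
    return "\n".join(
        "".join("#" if value >= threshold else "." for value in row)
        for row in img
    )


# What a computer sees: rows of numbers.
for row in image:
    print(" ".join(f"{value:3d}" for value in row))

# What we see: a shape (here, the digit 1).
print(render(image))
```

Nothing in the numbers says "this is a digit"; recovering that meaning is the job of a vision algorithm.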
What is Computer Vision? Computer vision is the automatic understanding of what is present, where things are, and what actions happened, from a picture, a series of pictures, or videos. Mainly: measurement (3D shape and pose); perception and interpretation (recognition of people, objects, scenes, and activities).
What is Computer Vision NOT? Image processing: Images to Images. Image enhancement, image restoration, image compression. Takes an image and processes it to produce a new image that is, in some way, more desirable. Computational photography: Images to Images. Extends the capabilities of digital cameras through computation to enable the capture of enhanced or entirely novel images of the world. Computer graphics: Models to Images. Renders 2D images from models and scenes in a 3D virtual world.
Related Disciplines
Why is computer vision difficult? Viewpoint variation Illumination Scale
Why is computer vision difficult? Intra-class variation Motion (Source: S. Lazebnik) Background clutter Occlusion
Why is computer vision difficult? An image is a projection of the world An under-constrained problem
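A minimal sketch of why projection makes the problem under-constrained, assuming an idealized pinhole camera (an assumption for illustration; the slide does not name a camera model). The function name `project` and the sample points are hypothetical. Every 3D point along the same ray through the camera center lands on the same image location, so depth cannot be recovered from a single image point.

```python
def project(point, focal_length=1.0):
    """Project a 3D point (X, Y, Z) in camera coordinates onto the image plane.

    Idealized pinhole model: x = f * X / Z, y = f * Y / Z.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (focal_length * x / z, focal_length * y / z)


near = (1.0, 2.0, 4.0)
far = tuple(3.0 * c for c in near)  # same viewing ray, three times farther away

print(project(near))  # (0.25, 0.5)
print(project(far))   # (0.25, 0.5)
```

Two different scene points give the same image point, which is why additional views, motion, or prior knowledge are needed to recover 3D structure.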
Why is computer vision difficult? Vision is an amazing feat of natural intelligence The visual cortex occupies about 50% of the macaque brain More of the human brain is devoted to vision than to anything else
Why study computer vision? Millions of images being captured all the time Lots of useful applications The next slides show the current state of the art Source: S. Lazebnik
[Chart: Flickr photo count, vertical axis 0 to 6 billion, horizontal axis 12.15.2003 to 12.15.2009]
[Chart: Other photo sharing sites, vertical axis 10 billion to 50 billion]
and growing Flickr: > 1.7 million photos / day Facebook: > 100 million photos / day (as of February 2010) YouTube: > 35 hours of video every minute (as of November 2010) ~ 57 billion photos will be taken (US) in 2010 http://windowsteamblog.com/windows_live/b/windowslive/archive/2010/04/09/what-to-do-with-57-billion-photos.aspx (compare with ~17 billion negatives exposed in 1996)
Optical character recognition (OCR) If you have a scanner, it probably came with OCR software Digit recognition, AT&T labs http://www.research.att.com/~yann/ License plate readers http://en.wikipedia.org/wiki/automatic_number_plate_reco Sudoku grabber http://sudokugrab.blogspot.com Automatic check processing Source: S. Seitz
Face detection Many new digital cameras now detect faces Canon, Sony, Fuji, Source: S. Seitz
Smile detection? Sony Cyber-shot T70 Digital Still Camera Source: S. Seitz
Face recognition
Face recognition Who is she? Source: S. Seitz
Vision-based biometrics How the Afghan Girl was Identified by Her Iris Patterns Read the story Source: S. Seitz
Login without a password Fingerprint scanners on many new laptops, other devices Face recognition systems now beginning to appear more widely http://www.sensiblevision.com/ Source: S. Seitz
Object recognition (in supermarkets) LaneHawk by EvolutionRobotics A smart camera is flush-mounted in the checkout lane, continuously watching for items. When an item is detected and recognized, the cashier verifies the quantity of items that were found under the basket, and continues to close the transaction. The item can remain under the basket, and with LaneHawk, you are assured to get paid for it. Source: S. Seitz
Object recognition (in mobile phones) This is becoming real: Microsoft Research Point & Find Source: S. Seitz
iPhone apps: (www.kooaba.com) Source: S. Lazebnik
Google Goggles
Special effects: shape capture The Matrix movies, ESC Entertainment, XYZRGB, NRC Source: S. Seitz
Special effects: motion capture Pirates of the Caribbean, Industrial Light and Magic Source: S. Seitz
Special effects: camera tracking Boujou, 2d3
Sports Sportvision first down line Nice explanation on www.howstuffworks.com Source: S. Seitz
Smart cars Mobileye Vision systems currently in high-end BMW, GM, Volvo models By 2010: 70% of car manufacturers. Sources: A. Shashua, S. Sei
Vision-based interaction (and games) Sony EyeToy Assistive technologies Nintendo Wii has camera-based IR tracking built in. See Lee's work at CMU on clever tricks for using it to create a multi-touch display! Xbox Kinect
Vision in space NASA's Mars Exploration Rover Spirit captured this westward view from atop a low plateau where Spirit spent the closing months of 2007. Vision systems (JPL) used for several tasks Panorama stitching 3D terrain modeling Obstacle detection, position tracking For more, read Computer Vision on Mars by Matthies et al. Source: S. Seitz
Robotics NASA's Mars Spirit Rover http://en.wikipedia.org/wiki/spirit_rover Autonomous RC Car http://www.cs.cornell.edu/~asaxena/rccar/
Medical imaging 3D imaging MRI, CT Image guided surgery Grimson et al., MIT Source: S. Seitz
Augmented Reality Google AR Glasses Lego AR Booth artt SDK
Syllabus (Tentative)
2. Image formation: Camera & Optics; Light & Color
3. Image Filtering: Linear Filters; Image Sampling
4. Grouping and fitting: Segmentation; Hough transform; Contour detection; Image transformations
5. Multiple view geometry: Camera model; Homography and image warping; Epipolar geometry; Camera calibration
6. 3D reconstruction: Stereo; Sparse depth estimation; Dense depth estimation
7. Feature Detection & Matching: Corners & Blobs; Descriptors; Matching
8. Recognition: Introduction to recognition; Face detection; Bag-of-words models
9. Motion: Optical Flow; Tracking; Video processing