CSE 527: Introduction to Computer Vision

Week 2 - Class 2: Vision, Physics, Cameras. September 7th, 2017

Today: Physics, Human Vision (Eye, Brain), Perspective Projection, Camera Models, Image Formation, Digital Cameras

Very Briefly: Physics of Light

(Visible) Light EM Radiation

Why this exact band of the EM spectrum?

Photons: A particle? A wave? Both? Have a frequency of oscillation. Travel at the speed of light (very fast), usually in a straight line. Interact with matter (atoms) in different ways, which also depend on their frequency.

Light-Matter Interaction. Light (photons) has a few choices when hitting a material. Transmission: go through it. Reflection: go back. Absorption: get stuck in it. Other things may happen. Refraction: bend through it. Diffusion/Scatter: go in many ways. Subsurface: hit things inside and more. We're mostly interested in absorbed, reflected and refracted light.

What is color? Reflected light. Some light is absorbed by the material, and some is reflected. We see the reflected light. Materials have different properties: which frequencies are absorbed, which frequencies are reflected, which frequencies are transmitted...

The Bidirectional Reflectance Distribution Function (BRDF). An approximate numerical representation of the material-light interaction properties. Depends on light direction and wavelength. Components: Diffuse, Specular. Phong model: (introduce Ambient Lighting)
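The slide's formulas did not survive extraction, so the following is only a minimal sketch of the textbook Phong model (ambient + diffuse + specular), not necessarily the exact form shown in class. The coefficient names `ka`, `kd`, `ks` and `shininess`, and their default values, are illustrative assumptions.

```python
import math

def normalize(v):
    """Scale a 3-vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def phong_intensity(normal, light_dir, view_dir,
                    ka=0.1, kd=0.6, ks=0.3, shininess=16):
    """Textbook Phong shading for one light of unit intensity.

    light_dir and view_dir point away from the surface point.
    ka, kd, ks and shininess are assumed example values.
    """
    n, l, v = normalize(normal), normalize(light_dir), normalize(view_dir)
    n_dot_l = dot(n, l)
    # Diffuse term: proportional to the cosine of the incidence angle.
    diffuse = max(n_dot_l, 0.0)
    # Mirror reflection of the light direction about the normal.
    r = tuple(2.0 * n_dot_l * nc - lc for nc, lc in zip(n, l))
    # Specular term: only when the light is in front of the surface.
    specular = max(dot(r, v), 0.0) ** shininess if n_dot_l > 0 else 0.0
    return ka + kd * diffuse + ks * specular
```

With the light and the viewer both along the normal, all three terms are at their maximum; with the light behind the surface only the ambient term remains.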

Lambertian Surfaces. Ideal surfaces: diffuse reflection occurs in a known pattern (Lambert's cosine law). We often make this assumption in vision applications.
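As a small illustration of Lambert's cosine law, the sketch below computes the reflected intensity of an ideal diffuse surface from the angle between the surface normal and the light. The function name and the `albedo` parameter are assumptions made for this example.

```python
import math

def lambertian_intensity(albedo, incidence_deg, light_intensity=1.0):
    """Reflected intensity of an ideal diffuse (Lambertian) surface.

    incidence_deg is the angle between the surface normal and the
    direction to the light. The result does not depend on where the
    viewer is, which is what makes the assumption convenient.
    """
    cos_theta = math.cos(math.radians(incidence_deg))
    # Light arriving from behind the surface contributes nothing.
    return albedo * light_intensity * max(cos_theta, 0.0)
```

For example, tilting the light 60 degrees away from the normal halves the reflected intensity compared with head-on illumination.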

Human Vision And Visual Processing

Why do we care about human vision? Bionics, Bio-inspired Engineering, Biomimicry,... The human visual sense is very central (large swaths of the brain are devoted to visual processing). We know that we can extract so much information from flat (stereo) images. -> Inspired the creation of cameras. -> Inspires (even today) computational methods for visual processing. And there is still much to learn... (Da Vinci's flying machine, inspired by bird wings)

Human Eye Parts. Is very much like a camera: light shines through a hole (pupil, iris) and is focused (cornea, lens) on a sensitive substrate (retina). But much more complicated. Camera counterparts: Aperture, Lens, Film / CCD.

Retina Layers, Rods and Cones

Rods and Cones. Humans have four photoreceptor types (rods plus three kinds of cones), so they may be considered tetrachromats. However, mostly the cones contribute to color vision, making us trichromats.

Color Perception

This Tuesday (9/5/17). Human vision is affected by proteins too!

Retina Cone Mosaic Digital Sensor Grid

Retina Cell Arrangement, Fovea

Lateral Inhibition Receptor AND Gates

Lateral Inhibition

Visual Processing System Hierarchy

Brightness Constancy Principle

Brightness Constancy

Size Constancy

Pinhole Camera And Perspective Projection Perspectograph ca. 1600

Pinhole Camera C'est moi! Kepler, Descartes, ca. 1600

Camera Obscura. Earliest evidence: ~400 BCE. Aristotle wrote about this! Later advancements (17th century): lenses, drawing aid, projectors (magic lantern).

Pinhole Camera Model

Normalized Camera

Focal Length Matters. We'll get back to this shortly...

Perspective in Early Art: Gentile da Fabriano (1423), Masolino (1425)

It's All a Matter of Perspective

Vanishing Points. Parallel lines in the scene converge at a vanishing point in the image. Vanishing points and lines are cues to the visual system (e.g. in size constancy) to determine size and distance.
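To see why parallel lines converge, here is a minimal sketch under an assumed pinhole camera at the origin looking down +Z with focal length `f`. Points far along any line with direction (dx, dy, dz) project closer and closer to (f·dx/dz, f·dy/dz), regardless of where the line starts. All function names are illustrative.

```python
def project(point, f=1.0):
    """Pinhole projection of a 3D point (camera at origin, looking down +Z)."""
    X, Y, Z = point
    return (f * X / Z, f * Y / Z)

def point_on_line(origin, direction, t):
    """Point origin + t * direction on a 3D line."""
    return tuple(o + t * d for o, d in zip(origin, direction))

def vanishing_point(direction, f=1.0):
    """Image point that all lines with this 3D direction converge to.

    Returns None when the direction is parallel to the image plane
    (dz == 0), in which case there is no finite vanishing point.
    """
    dx, dy, dz = direction
    if dz == 0:
        return None
    return (f * dx / dz, f * dy / dz)

# Two parallel "rails" with different starting points.
direction = (1.0, 0.0, 2.0)
left = project(point_on_line((-1.0, -1.0, 4.0), direction, 1e6))
right = project(point_on_line((1.0, -1.0, 4.0), direction, 1e6))
vp = vanishing_point(direction)
```

Both `left` and `right` land very close to `vp`, even though the two lines never meet in 3D.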

More Than a Single Vanishing Point: One Point, Two Points, Three Points

Why Vanishing Points? Can be used to reconstruct the geometry in the scene, from a single image. [Wilczkowiak 99]

Back to Projection: Intrinsics Matrix. Our goal: go from world coordinates (in mm) to image coordinates (in pixels). We need to introduce a scaling factor from mm to pixels. This depends on the physical focal length and the size of a single pixel on the sensor in mm. W is sensor size (in mm) and O is field of view. Also, let's shift everything so the image is centered about the camera optical axis:
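The slide's equations are missing from this transcript, so the following is a sketch of the usual construction rather than the exact notation used in class. It assumes square pixels, no skew, and a principal point at the image center; the function and parameter names are illustrative.

```python
def intrinsics_matrix(focal_mm, sensor_width_mm, image_width_px, image_height_px):
    """Build a 3x3 intrinsics matrix K (as nested lists).

    Assumes square pixels and a principal point at the image center.
    """
    # Scaling factor: focal length expressed in pixels instead of mm.
    f_px = focal_mm * image_width_px / sensor_width_mm
    # Shift so the optical axis hits the middle of the image.
    cx, cy = image_width_px / 2.0, image_height_px / 2.0
    return [[f_px, 0.0, cx],
            [0.0, f_px, cy],
            [0.0, 0.0, 1.0]]

def project_to_pixels(K, point_mm):
    """Project a camera-frame 3D point (in mm) to pixel coordinates."""
    X, Y, Z = point_mm
    u = K[0][0] * X / Z + K[0][2]
    v = K[1][1] * Y / Z + K[1][2]
    return (u, v)

# Example: 18 mm lens, 36 mm wide sensor, 640x480 image.
K = intrinsics_matrix(18.0, 36.0, 640, 480)
u, v = project_to_pixels(K, (100.0, 50.0, 1000.0))
```

Here the focal length comes out as 320 pixels, and a point 100 mm right and 50 mm down at a depth of 1 m lands slightly right of and below the image center.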

Homogeneous Coordinates. Our projection equation is nonlinear: (it has division in it). Let's take a different representation for the Cartesian 2D and 3D points: Now our projection equation is linear: To go back to Cartesian we divide by the added component.
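A minimal sketch of the idea, assuming the common convention of appending a 1 as the added component and a 3x4 projection matrix for a pinhole camera with focal length `f`. The helper names are made up for this example.

```python
def to_homogeneous(p):
    """Append the extra component (1) to a Cartesian point."""
    return list(p) + [1.0]

def from_homogeneous(p):
    """Divide by the added component to return to Cartesian coordinates."""
    w = p[-1]
    return [c / w for c in p[:-1]]

def matvec(M, v):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def project_linear(point_3d, f=1.0):
    """Pinhole projection written as one linear matrix operation."""
    P = [[f, 0.0, 0.0, 0.0],
         [0.0, f, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0]]
    # The division by depth happens only at the very end.
    return from_homogeneous(matvec(P, to_homogeneous(point_3d)))
```

For instance, `project_linear((1.0, 2.0, 4.0), f=2.0)` multiplies to the homogeneous point `[2, 4, 4]` and then divides by the last entry.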

Homogeneous Coordinates. Why are they so useful? - Adding translation is trivial and turns our transformation into a linear matrix operation. - In fact, some transformations can only work with homogeneous coordinates. - Can be used to represent points at infinity, like vanishing points (by setting the added component to 0), or directions. - They can do other tricks like find line intersections easily, or fetch a ball.
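The line-intersection trick can be sketched with cross products: in homogeneous coordinates the line through two 2D points is their cross product, and the intersection of two lines is again a cross product. Parallel lines intersect at a point whose added component is 0, i.e. a point at infinity. The function names below are illustrative.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def line_through(p, q):
    """Homogeneous line through two Cartesian 2D points."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))

def intersect(l1, l2):
    """Homogeneous intersection point of two homogeneous lines."""
    return cross(l1, l2)

def to_cartesian(p):
    """Return the Cartesian point, or None for a point at infinity."""
    if p[2] == 0:
        return None
    return (p[0] / p[2], p[1] / p[2])

# Two diagonals that cross at (1, 1).
meet = intersect(line_through((0.0, 0.0), (1.0, 1.0)),
                 line_through((0.0, 2.0), (2.0, 0.0)))

# Two horizontal (parallel) lines: they meet at infinity.
far = intersect(line_through((0.0, 0.0), (1.0, 0.0)),
                line_through((0.0, 1.0), (1.0, 1.0)))
```

No special case is needed for the parallel lines: `far` simply comes out with a zero last component and a direction along the x axis.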

Camera Extrinsic Parameters. So far we assumed the camera is at the origin. But that's often not the case. The extrinsics are a rotation and a translation. A more common notation:
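A small sketch of applying extrinsics before projection. It assumes the convention x_cam = R · x_world + t (other texts use the inverse), and uses a rotation about the Y axis purely as an example; the function names are illustrative.

```python
import math

def rotation_about_y(angle_rad):
    """3x3 rotation matrix about the Y axis (example rotation only)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]

def world_to_camera(R, t, point_world):
    """Apply extrinsics, assuming the convention x_cam = R @ x_world + t."""
    return [sum(r * x for r, x in zip(row, point_world)) + ti
            for row, ti in zip(R, t)]

def project_normalized(point_cam):
    """Normalized pinhole projection (focal length 1)."""
    X, Y, Z = point_cam
    return (X / Z, Y / Z)

# A world point seen by a camera translated 5 units along the optical axis.
p_cam = world_to_camera(rotation_about_y(0.0), [0.0, 0.0, 5.0], [1.0, 1.0, 0.0])
x, y = project_normalized(p_cam)
```

Once the point is in the camera frame, the projection from the earlier slides applies unchanged.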

Beyond Pinhole: Radial Distortions. Wholly nonlinear, but the model can be simple enough: Can be undone by de-warping the image.
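The slide's formula is not in the transcript. One commonly used simple model is a polynomial in the squared radius, sketched below on normalized image coordinates; whether this matches the slide exactly is an assumption. De-warping a single point is done here by fixed-point iteration, which works for mild distortion. The coefficients `k1`, `k2` are example parameters.

```python
def distort(x, y, k1, k2=0.0):
    """Apply a polynomial radial distortion to normalized coordinates."""
    r2 = x * x + y * y
    scale = 1.0 + k1 * r2 + k2 * r2 * r2
    return (x * scale, y * scale)

def undistort(xd, yd, k1, k2=0.0, iterations=20):
    """Invert the distortion by fixed-point iteration.

    Starts from the distorted point and repeatedly divides by the
    scale factor evaluated at the current estimate. Converges for
    mild distortion; strong distortion may need a proper solver.
    """
    x, y = xd, yd
    for _ in range(iterations):
        r2 = x * x + y * y
        scale = 1.0 + k1 * r2 + k2 * r2 * r2
        x, y = xd / scale, yd / scale
    return (x, y)

# Round trip: distort a point, then recover it.
xd, yd = distort(0.3, 0.2, k1=-0.2, k2=0.05)
xu, yu = undistort(xd, yd, k1=-0.2, k2=0.05)
```

A negative `k1` pulls points toward the center (barrel distortion); a positive one pushes them outward (pincushion).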

Image Formation: Lenses

Do We Need a Lens? The eye has a lens, so... that's a good indicator that the answer is yes. Lenses focus light, and allow us to image objects that are far away (or very close). The pinhole model still holds!

Thin Lens Circle of confusion (blur), relates to Depth of Field
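A sketch of the standard thin lens equation, 1/f = 1/z_object + 1/z_image, and the blur circle it implies when the sensor is placed to focus at a different distance. The blur formula follows from similar triangles on the cone of light behind the lens. All names are illustrative, and distances are assumed to be in the same unit and larger than the focal length.

```python
def image_distance(focal_length, object_distance):
    """Distance behind a thin lens at which the object comes into focus."""
    return 1.0 / (1.0 / focal_length - 1.0 / object_distance)

def blur_circle_diameter(focal_length, aperture_diameter,
                         focus_distance, object_distance):
    """Diameter of the circle of confusion on the sensor.

    The sensor sits where objects at focus_distance are sharp. Light
    from an object at another distance converges in front of or behind
    the sensor, so it spreads into a disc.
    """
    sensor = image_distance(focal_length, focus_distance)
    converge = image_distance(focal_length, object_distance)
    return aperture_diameter * abs(sensor - converge) / converge

# 50 mm lens with a 25 mm aperture, focused at 2 m.
sharp = blur_circle_diameter(50.0, 25.0, 2000.0, 2000.0)
near = blur_circle_diameter(50.0, 25.0, 2000.0, 1000.0)
near_small_aperture = blur_circle_diameter(50.0, 5.0, 2000.0, 1000.0)
```

Shrinking the aperture shrinks the blur circle proportionally, which is why small apertures give a larger depth of field.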

Focal Length Field Of View
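The relation between focal length and field of view for a pinhole-style camera can be sketched as FOV = 2 · atan(W / 2f), where W is the sensor width. The function names are illustrative, and both lengths are assumed to be in mm.

```python
import math

def field_of_view_deg(focal_mm, sensor_width_mm):
    """Horizontal field of view, in degrees."""
    return math.degrees(2.0 * math.atan(sensor_width_mm / (2.0 * focal_mm)))

def focal_length_mm(fov_deg, sensor_width_mm):
    """Inverse relation: focal length that gives the requested field of view."""
    return sensor_width_mm / (2.0 * math.tan(math.radians(fov_deg) / 2.0))

wide = field_of_view_deg(18.0, 36.0)    # short focal length
tele = field_of_view_deg(200.0, 36.0)   # long focal length
```

On a 36 mm wide sensor an 18 mm lens sees 90 degrees, while a 200 mm lens sees only a narrow slice of the scene.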

Lenses Add Distortions: Chromatic Aberration, Spherical Aberration, Vignetting

Wrap Up