Chapter 10: Depth perception

Monocular cues (pp. 272–281)


What you need to know

  1. Retinal Image Size (p. 272)
    • Size-distance function
    • Area-distance function
    • Limitation as a depth cue
  2. Height in the Visual Field (pp. 272–273)
    • HVF-distance function
    • Visual and non-visual cues
    • Observer's height
    • Limitation as a depth cue
  3. Texture Gradients (pp. 273–275)
    • Perspective gradient
    • Compression gradient
    • Density gradient
    • Power function
    • Limitation as a depth cue
  4. Image Blur (pp. 275–276)
    • Depth of field
    • Blur-distance function
    • Limitations as a depth cue
  5. Atmospheric Perspective (pp. 276–277)
    • Contrast-distance function
    • Attenuation coefficient
    • Limitations as a depth cue
  6. Accommodation (pp. 277–278)
    • Near and far accommodation
    • Limitations as a depth cue
  7. Motion Parallax (pp. 278–279)
    • Optic flow
    • Motion parallax
    • Relation to compression and perspective gradients
  8. Shadows (pp. 279–280)
    • Cast shadows
    • Attached shadows
    • Limitations as depth cues
  9. Interposition (p. 281)
    • "T" intersections

Retinal Image Size

A non-linear function relates the size of an object's retinal image to its distance from the observer: as distance doubles, retinal size halves.

The decrease in the area of an object's retinal image with increasing distance follows a similar function, although little variation in retinal area occurs beyond 100 cm, as opposed to 200 cm for retinal size.

Retinal image size ∝ real size of object / distance from observer

Retinal image area ∝ real area of object / (distance from observer)²

If an object is familiar, its real size will be known, and this relationship can be used to estimate its absolute distance.
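The size-distance relationship and its use with a familiar object can be sketched in a few lines. This is a minimal illustration rather than a formula taken from the text: it assumes the standard visual-angle geometry, and the function names and the 0.3 m example object are hypothetical.

```python
import math

def retinal_angle_deg(object_size_m, distance_m):
    """Visual angle (degrees) subtended by an object of the given real size."""
    return math.degrees(2 * math.atan(object_size_m / (2 * distance_m)))

def distance_from_familiar_size(known_size_m, angle_deg):
    """Invert the relation: absolute distance of an object whose real size is known."""
    return known_size_m / (2 * math.tan(math.radians(angle_deg) / 2))

# Hypothetical example: a 0.3 m object viewed from 2 m and then from 4 m.
near = retinal_angle_deg(0.3, 2.0)
far = retinal_angle_deg(0.3, 4.0)

size_ratio = far / near        # close to 0.5: doubling distance roughly halves size
area_ratio = size_ratio ** 2   # close to 0.25, assuming width and height scale together
recovered = distance_from_familiar_size(0.3, far)  # recovers the 4 m viewing distance
```

The area ratio falls faster than the size ratio, which is consistent with retinal area levelling off at a shorter distance than retinal size.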

Height in the Visual Field

For an observer standing on level ground and fixating the distant horizon (0°), distant objects are higher in the visual field than objects at the observer's feet (90°).

Height in the visual field can be used as a cue to depth because it varies tangentially with distance:

tan(HVF) = observer's height / distance from observer
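As a worked illustration of this relationship, the sketch below computes HVF as an angle below the horizon and inverts it to recover distance. The function names and the 1.6 m eye height are assumptions made for the example, and the formula is undefined at a distance of zero (the observer's feet).

```python
import math

def hvf_deg(eye_height_m, distance_m):
    """Angle below the horizon (degrees) of a ground point at the given distance."""
    return math.degrees(math.atan(eye_height_m / distance_m))

def distance_from_hvf(eye_height_m, hvf_angle_deg):
    """Invert tan(HVF) = height / distance to recover ground distance."""
    return eye_height_m / math.tan(math.radians(hvf_angle_deg))

EYE_HEIGHT_M = 1.6  # assumed eye height, for illustration only

angle_close = hvf_deg(EYE_HEIGHT_M, 1.6)    # a point one eye-height away
angle_far = hvf_deg(EYE_HEIGHT_M, 50.0)     # a distant point, near the horizon
recovered = distance_from_hvf(EYE_HEIGHT_M, angle_far)
```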

The HVF of a point in the scene can be derived from:

  1. Visual information in the retinal image, and
  2. Non-visual information from changes in eye position as fixation moves between objects at different distances.

HVF is also a function of the observer's height. At distances less than 100 cm, HVF varies more rapidly for children than for adults. The HVF of a scene point at any distance is closer to a child's line of sight than to an adult's.
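The effect of observer height can be checked numerically with the same tangent relationship. The eye heights below (1.0 m for a child, 1.7 m for an adult) are assumed values chosen for illustration, so the exact distance at which the two observers swap over will differ for other heights.

```python
import math

def hvf_deg(eye_height_m, distance_m):
    """Angle below the horizon (degrees) of a ground point at the given distance."""
    return math.degrees(math.atan(eye_height_m / distance_m))

def hvf_change_deg(eye_height_m, d_near_m, d_far_m):
    """How much HVF changes between two ground distances."""
    return hvf_deg(eye_height_m, d_near_m) - hvf_deg(eye_height_m, d_far_m)

CHILD_EYE_HEIGHT_M = 1.0  # assumed
ADULT_EYE_HEIGHT_M = 1.7  # assumed

# Within 100 cm: the same step in distance changes HVF more for the child.
child_change = hvf_change_deg(CHILD_EYE_HEIGHT_M, 0.4, 0.8)
adult_change = hvf_change_deg(ADULT_EYE_HEIGHT_M, 0.4, 0.8)

# At any distance the point lies closer to the child's horizontal line of sight.
child_hvf_5m = hvf_deg(CHILD_EYE_HEIGHT_M, 5.0)
adult_hvf_5m = hvf_deg(ADULT_EYE_HEIGHT_M, 5.0)
```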

Ooi et al. (2001; see FP p. 273) used vertically displacing prisms to manipulate HVF and demonstrate its value as a cue to depth.

The utility of HVF is limited to objects in contact with a level, horizontal, ground plane.

Texture Gradients

Figure: a photograph of a lily pond, in which variation in texture element size and shape offers a cue to surface slant.

Uniform texture on a surface slanted away from the observer (texture gradient) has three image qualities that vary systematically with depth, and can be used to estimate distance:

  1. Width, or separation of elements perpendicular to the surface slant, decreases with increasing distance; this is known as the perspective gradient. Linear perspective is a special case of this type of gradient, in which the "elements" are lines that converge on the vanishing point.
  2. Height, or separation of elements in the direction of surface slant, decreases with increasing distance; this is known as the compression gradient.
  3. Density, or number of elements per unit area, increases with increasing distance; this is known as the density gradient.

All three texture cues vary with distance according to a power law. The steepness of the power functions that define the variation of texture cues with distance is dependent on:

Texture gradients are only reliable depth cues when elements of similar size, shape, and spacing repeat in the scene.
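The power-law behaviour of the three gradients can be illustrated with a simple projection model. The sketch below is not taken from the text: it assumes a pinhole eye with unit focal length, a level ground plane tiled with identical elements, and a line of sight on the horizon. The element size, eye height, and function names are hypothetical.

```python
import math

EYE_HEIGHT_M = 1.6     # assumed
ELEMENT_WIDTH_M = 0.1  # assumed size of each texture element
ELEMENT_DEPTH_M = 0.1

def image_width(distance_m):
    """Projected width of a ground element (perspective gradient)."""
    return ELEMENT_WIDTH_M / distance_m

def image_height(distance_m):
    """Projected height of a ground element (compression gradient)."""
    near_edge = EYE_HEIGHT_M / distance_m
    far_edge = EYE_HEIGHT_M / (distance_m + ELEMENT_DEPTH_M)
    return near_edge - far_edge

def image_density(distance_m):
    """Elements per unit image area for a tiled surface (density gradient)."""
    return 1.0 / (image_width(distance_m) * image_height(distance_m))

def log_log_slope(func, d1_m, d2_m):
    """Exponent of the power function between two distances."""
    return (math.log(func(d2_m)) - math.log(func(d1_m))) / (math.log(d2_m) - math.log(d1_m))

perspective_exponent = log_log_slope(image_width, 10.0, 20.0)   # about -1
compression_exponent = log_log_slope(image_height, 10.0, 20.0)  # about -2
density_exponent = log_log_slope(image_density, 10.0, 20.0)     # about +3
```

Under these assumptions, width falls roughly in proportion to distance, height falls with roughly the square of distance, and density rises with roughly the cube of distance.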

Image Blur

The depth of field of an optical system is the range of distances around the point of focus over which the image remains sharply focused.

Beyond the depth of field, the amount of blur in the image is lawfully related to its distance from fixation, and can provide a cue to depth.

As the distance from fixation increases, areas closer than fixation blur more rapidly than those farther away.
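The asymmetry between near and far blur can be illustrated with a common geometric-optics approximation, in which the angular size of the blur circle is the pupil diameter multiplied by the defocus in dioptres. That approximation, the 4 mm pupil, and the function name are assumptions made for this sketch; it ignores depth of field, diffraction, and aberrations.

```python
def blur_angle_rad(pupil_diameter_m, fixation_m, object_m):
    """Approximate angular blur (radians) of an object when focused at fixation_m."""
    defocus_dioptres = abs(1.0 / object_m - 1.0 / fixation_m)
    return pupil_diameter_m * defocus_dioptres

PUPIL_DIAMETER_M = 0.004  # assumed 4 mm pupil

# Fixate at 1 m and compare points 0.5 m nearer and 0.5 m farther than fixation.
nearer_blur = blur_angle_rad(PUPIL_DIAMETER_M, 1.0, 0.5)
farther_blur = blur_angle_rad(PUPIL_DIAMETER_M, 1.0, 1.5)
in_focus = blur_angle_rad(PUPIL_DIAMETER_M, 1.0, 1.0)
```

For the same physical offset from fixation, the nearer point is blurred about three times as much as the farther one in this example.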

Perceptual evidence from Marshall et al. (1996; see FP p. 276) and Mather (1996; see FP p. 276) demonstrates that image blur does influence depth perception.

Image blur is limited to coarse ordinal depth judgements because:

Demo: An animation demonstrating the effect of blur

Blur and depth. Notice that the dark rectangle appears to stand in front of the background texture when the texture is blurred. Contrast drops as blur increases, contributing to the effect.

Atmospheric Perspective

In dull, wet weather, image contrast declines rapidly with distance.

When large distances are considered, the contrast of an object can be used as a cue to depth because it is lawfully related to the object's distance from the observer.

Contrast is attenuated with increasing distance because atmospheric particles scatter light. The extent of scattering depends on the atmospheric attenuation coefficient for the given conditions.
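One common way to model this relationship is an exponential decay of contrast with distance, governed by the attenuation coefficient. The text does not give the exact form of the function, so the exponential model, the coefficient values, and the function names below are assumptions made for illustration.

```python
import math

def contrast_at_distance(intrinsic_contrast, attenuation_per_km, distance_km):
    """Apparent contrast after scattering over the given distance (assumed exponential law)."""
    return intrinsic_contrast * math.exp(-attenuation_per_km * distance_km)

def distance_from_contrast(observed_contrast, intrinsic_contrast, attenuation_per_km):
    """Invert the assumed law to estimate distance from observed contrast."""
    return -math.log(observed_contrast / intrinsic_contrast) / attenuation_per_km

CLEAR_DAY_PER_KM = 0.1  # hypothetical attenuation coefficients
DULL_WET_PER_KM = 1.0

clear = contrast_at_distance(0.9, CLEAR_DAY_PER_KM, 2.0)
dull = contrast_at_distance(0.9, DULL_WET_PER_KM, 2.0)
recovered_km = distance_from_contrast(clear, 0.9, CLEAR_DAY_PER_KM)
```

Recovering distance in this way requires knowing both the object's intrinsic contrast and the current attenuation coefficient.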

Perceptual evidence from O'Shea et al. (1994; see FP p. 276) confirms that contrast can influence depth perception.

Atmospheric perspective is limited to coarse ordinal depth judgements because:

Accommodation

The ciliary muscles, which circle the lens and control its accommodation, provide non-visual information about the absolute distance to fixation.

For distant focus, the ciliary muscles relax into a wide ring, allowing the tension of the suspensory ligaments to thin the lens. For near focus, the ciliary muscles contract into a small ring, releasing the tension of the suspensory ligaments, and allowing the lens to become thicker.

Perceptual evidence from Fisher and Ciuffreda (1988; see FP p. 278) suggests that accommodation can be a coarse ordinal depth cue, but only at distances in the range of 15 to 100 cm from the observer.
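The link between accommodative state and distance can be sketched with the standard optical definition of accommodative demand, the reciprocal of fixation distance in metres. This is an illustrative assumption rather than a model from the text, and the function names are hypothetical. It may help to show why the cue is informative only at short range: the demand changes very little once fixation moves beyond about 1 m.

```python
def accommodative_demand_dioptres(distance_m):
    """Lens power (dioptres) that must be added to focus at the given distance."""
    return 1.0 / distance_m

def distance_from_demand_m(demand_dioptres):
    """Invert the relation to recover fixation distance."""
    return 1.0 / demand_dioptres

near_limit = accommodative_demand_dioptres(0.15)  # about 6.7 D
far_limit = accommodative_demand_dioptres(1.0)    # 1.0 D

# Doubling the distance changes demand far more at close range.
change_near = accommodative_demand_dioptres(0.15) - accommodative_demand_dioptres(0.30)
change_far = accommodative_demand_dioptres(1.0) - accommodative_demand_dioptres(2.0)
```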

Motion Parallax

For an observer moving through the world, the velocity of an image point across the retina is lawfully related to its depth in the scene.

Optic flow refers to the retinal velocity gradient created by an observer moving through the scene. When fixation is on the horizon, the function that describes optic flow for the ground plane is identical to the compression gradient function for textured surfaces.

Motion parallax refers to the retinal velocity gradient created by sideways movements of the observer. When fixation is on the horizon, the function that describes motion parallax for the ground plane is identical to the perspective gradient function for textured surfaces.
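The correspondence between these velocity gradients and the texture gradients can be illustrated with a simple projection model. The sketch assumes a pinhole eye with unit focal length, a level ground plane, and fixation on the horizon; image-plane velocity stands in for retinal velocity, and the speed, eye height, and function names are hypothetical.

```python
import math

EYE_HEIGHT_M = 1.6   # assumed
SPEED_M_PER_S = 1.5  # assumed walking speed

def forward_flow(distance_m):
    """Vertical image velocity of a ground point ahead during forward movement."""
    # Image position is EYE_HEIGHT_M / distance; distance shrinks at SPEED_M_PER_S.
    return SPEED_M_PER_S * EYE_HEIGHT_M / distance_m ** 2

def sideways_parallax(distance_m):
    """Horizontal image velocity of a ground point ahead during sideways movement."""
    # Image position is lateral offset / distance; offset changes at SPEED_M_PER_S.
    return SPEED_M_PER_S / distance_m

def log_log_slope(func, d1_m, d2_m):
    """Exponent of the power function between two distances."""
    return (math.log(func(d2_m)) - math.log(func(d1_m))) / (math.log(d2_m) - math.log(d1_m))

flow_exponent = log_log_slope(forward_flow, 5.0, 10.0)           # -2
parallax_exponent = log_log_slope(sideways_parallax, 5.0, 10.0)  # -1
```

In this model forward flow falls with the square of distance, like projected element height (compression), while sideways parallax falls in proportion to distance, like projected element width (perspective).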

If the line of sight is not directed at the horizon, the fixation point becomes the stationary reference point in the image. For both optic flow and motion parallax, retinal velocity then increases systematically for scene points that are increasingly distant from fixation, rather than from the horizon.
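A minimal sketch of this fixation-relative pattern, for sideways movement only: it assumes the eye rotates to hold fixation on a point straight ahead, so that a point at another depth along the same line of sight moves across the retina at a rate proportional to the difference in reciprocal distances. The model, the speed, and the function name are assumptions made for illustration.

```python
def relative_retinal_velocity(speed_m_per_s, fixation_m, point_m):
    """Approximate retinal angular velocity (rad/s) of a point relative to fixation."""
    return speed_m_per_s * (1.0 / point_m - 1.0 / fixation_m)

SPEED_M_PER_S = 1.0  # assumed sideways speed
FIXATION_M = 2.0     # assumed fixation distance

at_fixation = relative_retinal_velocity(SPEED_M_PER_S, FIXATION_M, 2.0)
nearer = relative_retinal_velocity(SPEED_M_PER_S, FIXATION_M, 1.0)
much_nearer = relative_retinal_velocity(SPEED_M_PER_S, FIXATION_M, 0.5)
farther = relative_retinal_velocity(SPEED_M_PER_S, FIXATION_M, 4.0)
much_farther = relative_retinal_velocity(SPEED_M_PER_S, FIXATION_M, 8.0)
```

Points nearer and farther than fixation move in opposite directions (opposite signs), and speed grows with separation in depth from the fixation point.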

Any movements of the observer, the scene, or both, that create retinal velocity gradients are powerful, independent cues to depth (Rogers & Graham, 1979; see FP p. 279).

Demo: An animation of motion parallax

Optic flow. During movement through a 3-D space, image elements appear at the focus of expansion and accelerate towards the periphery of the image to create a characteristic flow pattern.

Shadows

Cast shadows originate from one object and fall on another. Kersten et al. (1996; see FP p. 279) demonstrated their influence on perceptual interpretations of depth.

Attached shadows fall on the object that created them. Koenderink et al. (1996; see FP p. 279) demonstrated their value in judging 3-D shape.

When utilising shadows, the visual system makes the following assumptions, which need to be satisfied for accurate 3-D interpretations:

And for attached shadows:

Interposition

When a near object partially occludes a more distant one (interposition), "T"-shaped intersections are created, which provide ordinal depth information.

So What Does This Mean?

Monocular cues to depth are derived from a single eye, either from visual cues in its retinal image or from non-visual information about its position (for HVF) or accommodative state. The cues are diverse and cover various distance ranges. Some provide precise metric measurements of absolute or relative depth (metric depth cues); others provide only coarse ordinal estimates (ordinal depth cues). All monocular cues have limitations that restrict their use to certain conditions. Information from height in the visual field, texture gradients, and the dynamic cues of motion parallax and optic flow varies more gradually for taller observers.

Demo: An illustration of pictorial depth cues

This image contains three playing cards, shown with a set of pictorial depth cues that can be added or removed.
