February 21st, 2012 @ 11:00am Geoffrey Hinton (U Toronto)

NOTE: time is 11:00am, not 11:30am.

A new theory of how the visual cortex deals with viewpoint variation during object recognition

Recognizing a familiar shape from the pattern of light intensities on the retina is difficult because changes in viewpoint can dramatically change the pattern of light intensities. The general belief among both neuroscientists and neural network modelers is that the infero-temporal pathway copes with viewpoint variation by using multiple levels of representation in which each level is slightly more viewpoint-invariant than the level below. Unfortunately, this approach cannot explain how we can be acutely sensitive to the precise spatial relationships between high-level parts such as a nose and a mouth.

From an engineering perspective, the natural way to deal with spatial relationships is to associate a vector of pose parameters with each recognized part and to make use of the fact that spatial relationships can then be modeled very efficiently using linear operations. This is what is done in computer graphics and it is the reason why computer graphics can deal with changes in viewpoint so easily. Despite the long history of generative models of perception, computational neuroscientists have not taken this aspect of computer graphics seriously, possibly because they do not believe that the brain can do linear algebra. I shall show how a neural net can learn to extract parts with explicit pose parameters from an image and how this makes it very easy to recognize spatial configurations of parts under a very wide range of viewpoints. I shall then sketch a way in which the brain could use spike timing to implement the required linear algebra very efficiently.
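The graphics-style idea in the abstract can be made concrete with a small sketch. This is an illustration of the general principle, not the speaker's actual model: each part carries a pose matrix, a fixed matrix relates a part's frame to the whole's frame, and the consistency check between parts is pure linear algebra that survives any change of viewpoint. The part names and relation matrices below are hypothetical.

```python
import numpy as np

def pose(theta, tx, ty):
    """2-D pose as a 3x3 homogeneous matrix: rotation theta, translation (tx, ty)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, tx],
                     [s,  c, ty],
                     [0.0, 0.0, 1.0]])

# Hypothetical part-to-whole relations for a "face": fixed matrices giving
# where the nose and mouth sit in the face's own coordinate frame.
NOSE_TO_FACE  = pose(0.0, 0.0, -0.1)
MOUTH_TO_FACE = pose(0.0, 0.0, -0.4)

def predict_face(part_pose, part_to_face):
    # If face pose F maps face coordinates to the image and R maps part
    # coordinates to face coordinates, then the part's pose is P = F @ R,
    # so the face pose predicted from the part is F = P @ inv(R).
    return part_pose @ np.linalg.inv(part_to_face)

# A face seen from some particular viewpoint.
face  = pose(0.3, 2.0, 1.0)
nose  = face @ NOSE_TO_FACE
mouth = face @ MOUTH_TO_FACE

# Both parts predict the same face pose: the configuration is consistent.
assert np.allclose(predict_face(nose, NOSE_TO_FACE),
                   predict_face(mouth, MOUTH_TO_FACE))

# A change of viewpoint V multiplies every pose on the left: V @ P = (V @ F) @ R.
# The agreement between the parts' predictions is therefore preserved for
# any viewpoint, which is the sense in which the relation is viewpoint-invariant.
V = pose(-1.2, -5.0, 3.0)
assert np.allclose(predict_face(V @ nose, NOSE_TO_FACE),
                   predict_face(V @ mouth, MOUTH_TO_FACE))
```

The point of the sketch is that recognizing the spatial configuration reduces to checking whether the parts' linear predictions of the whole's pose agree, with no special machinery needed per viewpoint.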
