Reference
S. S. Intille, "Visual Recognition of Multi-Agent Action," Ph.D. thesis,
Massachusetts Institute of Technology, Cambridge, MA, 1999.
Abstract
Developing computer vision sensing systems that work robustly in everyday environments
will require that the systems can recognize structured interaction between
people and objects in the world. This document presents a new theory for the
representation and recognition of coordinated multi-agent action from noisy
perceptual data.
The thesis of this work is as follows: highly structured, multi-agent action can be
recognized from noisy perceptual data using visually grounded goal-based primitives and
low-order temporal relationships that are integrated in a probabilistic framework. The
theory is developed and evaluated by examining general characteristics of multi-agent
action, analyzing tradeoffs involved when selecting a representation for multi-agent
action recognition, and constructing a system to recognize multi-agent action for a real
task from noisy data.
The representation, which is motivated by work in model-based object recognition and
probabilistic plan recognition, makes four principal assumptions: (1) the goals of
individual agents are temporal relationships between agents engaged in group activities,
(2) a high-level description of the temporal structure of the action using a small set of
low-order temporal and logical constraints is adequate for representing the relationships
between the agent goals for highly structured, multi-agent action recognition, (3)
Bayesian networks provide a suitable mechanism for integrating multiple sources of
uncertain visual perceptual feature evidence, and (4) an automatically generated Bayesian
network can be used to combine uncertain temporal information and compute the likelihood
that a set of object trajectory data is a particular multi-agent action.
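To make assumption (3) concrete, the following is a minimal sketch of how a two-layer Bayesian network could fuse several noisy visual detectors into a belief about one agent goal. It is an illustration under stated assumptions, not the network structure used in the thesis: the function name, the conditional-independence (naive Bayes) structure, and all probability values are hypothetical.

```python
def goal_posterior(prior, likelihoods, observations):
    """Return P(goal | evidence) for a binary goal node.

    Assumption (not from the thesis): each evidence node is binary and
    conditionally independent of the others given the goal.

    prior        -- P(goal is active) before seeing any evidence
    likelihoods  -- list of (P(e=1 | goal), P(e=1 | not goal)) per detector
    observations -- list of booleans, one per detector
    """
    p_goal = prior
    p_not_goal = 1.0 - prior
    for (p_e_given_goal, p_e_given_not), seen in zip(likelihoods, observations):
        # Multiply in the likelihood of what each detector reported.
        p_goal *= p_e_given_goal if seen else 1.0 - p_e_given_goal
        p_not_goal *= p_e_given_not if seen else 1.0 - p_e_given_not
    # Normalize so the two hypotheses sum to one.
    return p_goal / (p_goal + p_not_goal)


# Hypothetical detectors, e.g. "agent moving toward target" and
# "agent facing target", each with made-up reliabilities.
detectors = [(0.8, 0.2), (0.7, 0.4)]
belief = goal_posterior(0.5, detectors, [True, True])
print(round(belief, 3))  # 0.875
```

With both detectors firing, the belief in the goal rises from the prior of 0.5 to 0.875; if the second detector does not fire, it drops to about 0.667. Graded beliefs of this kind, rather than hard decisions, are what a probabilistic framework can pass on to later temporal reasoning.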
The recognition algorithm is tested using a database of American football play
descriptions. A system is described that can recognize single-agent and multi-agent
actions in this domain given noisy trajectories of object movements. The strengths and
limitations of the recognition system are discussed and compared with other multi-agent
recognition algorithms.
Keywords
Action recognition, motion understanding, knowledge representation, plan recognition,
multiple agents