Persistent Link:
http://hdl.handle.net/10150/312653
Title:
Bayesian Data Association for Temporal Scene Understanding
Author:
Brau Avila, Ernesto
Issue Date:
2013
Publisher:
The University of Arizona.
Rights:
Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract:
Understanding the content of a video sequence is not a particularly difficult problem for humans. We can easily identify objects, such as people, and track their position and pose within the 3D world. A computer system that could understand the world through videos would be extremely beneficial in applications such as surveillance, robotics, biology. Despite significant advances in areas like tracking and, more recently, 3D static scene understanding, such a vision system does not yet exist. In this work, I present progress on this problem, restricted to videos of objects that move in smoothly and which are relatively easily detected, such as people. Our goal is to identify all the moving objects in the scene and track their physical state (e.g., their 3D position or pose) in the world throughout the video. We develop a Bayesian generative model of a temporal scene, where we separately model data association, the 3D scene and imaging system, and the likelihood function. Under this model, the video data is the result of capturing the scene with the imaging system, and noisily detecting video features. This formulation is very general, and can be used to model a wide variety of scenarios, including videos of people walking, and time-lapse images of pollen tubes growing in vitro. Importantly, we model the scene in world coordinates and units, as opposed to pixels, allowing us to reason about the world in a natural way, e.g., explaining occlusion and perspective distortion. We use Gaussian processes to model motion, and propose that it is a general and effective way to characterize smooth, but otherwise arbitrary, trajectories. We perform inference using MCMC sampling, where we fit our model of the temporal scene to data extracted from the videos. We address the problem of variable dimensionality by estimating data association and integrating out all scene variables. Our experiments show our approach is competitive, producing results which are comparable to state-of-the-art methods.
Type:
text; Electronic Dissertation
Keywords:
Data association; MCMC; Multiple target tracking; Scene understanding; Computer Science; Bayesian inference
Degree Name:
Ph.D.
Degree Level:
doctoral
Degree Program:
Graduate College; Computer Science
Degree Grantor:
University of Arizona
Advisor:
Barnard, Kobus J.

Full metadata record

DC FieldValue Language
dc.language.isoen_USen
dc.titleBayesian Data Association for Temporal Scene Understandingen_US
dc.creatorBrau Avila, Ernestoen_US
dc.contributor.authorBrau Avila, Ernestoen_US
dc.date.issued2013-
dc.publisherThe University of Arizona.en_US
dc.rightsCopyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.en_US
dc.description.abstractUnderstanding the content of a video sequence is not a particularly difficult problem for humans. We can easily identify objects, such as people, and track their position and pose within the 3D world. A computer system that could understand the world through videos would be extremely beneficial in applications such as surveillance, robotics, biology. Despite significant advances in areas like tracking and, more recently, 3D static scene understanding, such a vision system does not yet exist. In this work, I present progress on this problem, restricted to videos of objects that move in smoothly and which are relatively easily detected, such as people. Our goal is to identify all the moving objects in the scene and track their physical state (e.g., their 3D position or pose) in the world throughout the video. We develop a Bayesian generative model of a temporal scene, where we separately model data association, the 3D scene and imaging system, and the likelihood function. Under this model, the video data is the result of capturing the scene with the imaging system, and noisily detecting video features. This formulation is very general, and can be used to model a wide variety of scenarios, including videos of people walking, and time-lapse images of pollen tubes growing in vitro. Importantly, we model the scene in world coordinates and units, as opposed to pixels, allowing us to reason about the world in a natural way, e.g., explaining occlusion and perspective distortion. We use Gaussian processes to model motion, and propose that it is a general and effective way to characterize smooth, but otherwise arbitrary, trajectories. We perform inference using MCMC sampling, where we fit our model of the temporal scene to data extracted from the videos. We address the problem of variable dimensionality by estimating data association and integrating out all scene variables. Our experiments show our approach is competitive, producing results which are comparable to state-of-the-art methods.en_US
dc.typetexten
dc.typeElectronic Dissertationen
dc.subjectData associationen_US
dc.subjectMCMCen_US
dc.subjectMultiple target trackingen_US
dc.subjectScene understandingen_US
dc.subjectComputer Scienceen_US
dc.subjectBayesian inferenceen_US
thesis.degree.namePh.D.en_US
thesis.degree.leveldoctoralen_US
thesis.degree.disciplineGraduate Collegeen_US
thesis.degree.disciplineComputer Scienceen_US
thesis.degree.grantorUniversity of Arizonaen_US
dc.contributor.advisorBarnard, Kobus J.en_US
dc.contributor.committeememberBarnard, Kobus J.en_US
dc.contributor.committeememberMorrison, Clayton T.en_US
dc.contributor.committeememberEfrat, Alonen_US
All Items in UA Campus Repository are protected by copyright, with all rights reserved, unless otherwise indicated.