Persistent Link:
http://hdl.handle.net/10150/265361
Title:
Human Action Recognition on Videos: Different Approaches
Author:
Mejia, Maria Helena
Issue Date:
2012
Publisher:
The University of Arizona.
Rights:
Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract:
The goal of human action recognition on videos is to determine automatically what is happening in a video. This work addresses the following question: given consecutive frames from a video in which one or more people are performing an action, can an automatic system recognize the action each person is performing? Seven approaches are presented, most of them based on an alignment process that yields a distance or similarity measure used for classification. Some are based on fluents that are converted to qualitative sequences of Allen relations, so that the distance between a pair of sequences can be measured by aligning them. The fluents are generated in several ways: from propositions about human pose extracted as features from a single image or a short sequence of images, from changes in the time series based mainly on the angle of the slope, from changes in the time series based on the direction of the slope, and from propositions based on symbolic sequences generated by SAX. Another alignment-based approach applies Dynamic Time Warping to subsets of highly dependent body parts. A further approach is based on SAX symbolic sequences and their pairwise alignment. The last approach is based on discretization of the multivariate time series but, instead of alignment, uses a spectrum kernel and an SVM, as is done for classifying protein sequences in biology. Finally, a sliding window method is used to recognize actions along the video. These approaches were tested on three datasets derived from RGB-D cameras (e.g., Microsoft Kinect) as well as ordinary video, and a selection of the approaches was compared to the results of other researchers.
Type:
text; Electronic Dissertation
Keywords:
Gesture Recognition; Machine learning; Computer Science; Activity recognition; Computer vision
Degree Name:
Ph.D.
Degree Level:
doctoral
Degree Program:
Graduate College; Computer Science
Degree Grantor:
University of Arizona
Advisor:
Cohen, Paul

Full metadata record

DC Field | Value | Language
dc.language.iso | en | en_US
dc.title | Human Action Recognition on Videos: Different Approaches | en_US
dc.creator | Mejia, Maria Helena | en_US
dc.contributor.author | Mejia, Maria Helena | en_US
dc.date.issued | 2012 | -
dc.publisher | The University of Arizona. | en_US
dc.rights | Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author. | en_US
dc.description.abstract | The goal of human action recognition on videos is to determine automatically what is happening in a video. This work addresses the following question: given consecutive frames from a video in which one or more people are performing an action, can an automatic system recognize the action each person is performing? Seven approaches are presented, most of them based on an alignment process that yields a distance or similarity measure used for classification. Some are based on fluents that are converted to qualitative sequences of Allen relations, so that the distance between a pair of sequences can be measured by aligning them. The fluents are generated in several ways: from propositions about human pose extracted as features from a single image or a short sequence of images, from changes in the time series based mainly on the angle of the slope, from changes in the time series based on the direction of the slope, and from propositions based on symbolic sequences generated by SAX. Another alignment-based approach applies Dynamic Time Warping to subsets of highly dependent body parts. A further approach is based on SAX symbolic sequences and their pairwise alignment. The last approach is based on discretization of the multivariate time series but, instead of alignment, uses a spectrum kernel and an SVM, as is done for classifying protein sequences in biology. Finally, a sliding window method is used to recognize actions along the video. These approaches were tested on three datasets derived from RGB-D cameras (e.g., Microsoft Kinect) as well as ordinary video, and a selection of the approaches was compared to the results of other researchers. | en_US
dc.type | text | en_US
dc.type | Electronic Dissertation | en_US
dc.subject | Gesture Recognition | en_US
dc.subject | Machine learning | en_US
dc.subject | Computer Science | en_US
dc.subject | Activity recognition | en_US
dc.subject | Computer vision | en_US
thesis.degree.name | Ph.D. | en_US
thesis.degree.level | doctoral | en_US
thesis.degree.discipline | Graduate College | en_US
thesis.degree.discipline | Computer Science | en_US
thesis.degree.grantor | University of Arizona | en_US
dc.contributor.advisor | Cohen, Paul | en_US
dc.contributor.committeemember | Downey, Peter | en_US
dc.contributor.committeemember | Barnard, Jacobus | en_US
dc.contributor.committeemember | Morrison, Clayton | en_US
dc.contributor.committeemember | Cohen, Paul | en_US