Who Moved My Slide? Recognizing Entities In A Lecture Video And Its Applications

Persistent Link:
http://hdl.handle.net/10150/347183
Title:
Who Moved My Slide? Recognizing Entities In A Lecture Video And Its Applications
Author:
Tung, Qiyam Junn
Issue Date:
2014
Publisher:
The University of Arizona.
Rights:
Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract:
Lecture videos have proliferated in recent years thanks to the increasing bandwidth of Internet connections and the availability of video cameras. Despite the massive volume of videos available, very few systems parse useful information from them. Extracting meaningful data can help with searching and indexing lecture videos and can improve understanding and usability for viewers. While video tags and user preferences are good indicators of relevant videos, they depend entirely on human-generated data. Furthermore, many lecture videos are technical by nature, and sparse video tags are too coarse-grained to relate parts of a video by a specific topic. Extracting the text from the presentation slides ameliorates this issue, but a lecture video still contains significantly more information than the slides alone; the actions and words of the speaker contribute to a richer and more nuanced understanding of the lecture material. The goal of the Semantically Linked Instructional Content (SLIC) project is to relate videos using more specific and relevant features, such as slide text and other entities. In this work, we present the algorithms used to recognize the entities of a lecture. Specifically, these entities are laser and hand-pointing gestures and the locations of the slide, its text, and its images in the video. Our algorithms work under the assumption that the slide location (homography) is known for each frame and extend that knowledge of the scene: gestures indicate when and where on a slide notable events occur. We also show how recognizing these entities can improve lecture comprehension and save energy on mobile devices. We conducted a user study showing that magnifying text based on laser gestures on a slide helps direct a viewer's attention to the relevant text. We also performed empirical measurements on real cellphones to confirm that selectively dimming less relevant regions of the video frame reduces energy consumption significantly.
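
The abstract assumes the slide-to-frame homography is known for every frame. As a rough, hypothetical sketch (not code from the dissertation), the following Python snippet shows how a laser-dot detection in frame coordinates could be mapped back onto the slide using the inverse of such a homography; the matrix values, coordinates, and function name are illustrative assumptions only.

    # Minimal sketch, assuming a known 3x3 homography H that maps
    # slide coordinates to video-frame coordinates. A point detected
    # in the frame (e.g., a laser dot) is back-projected onto the
    # slide to tell where on the slide an event occurred.
    # All numeric values below are placeholders, not measured data.
    import numpy as np

    def frame_to_slide(point_xy, H):
        """Map a frame-coordinate point to slide coordinates via H^-1."""
        x, y = point_xy
        p = np.array([x, y, 1.0])        # homogeneous frame coordinates
        q = np.linalg.inv(H) @ p         # back-project into slide space
        return q[0] / q[2], q[1] / q[2]  # de-homogenize

    # Example: a hypothetical homography and a detected laser-dot location.
    H = np.array([[1.20, 0.05, 310.0],
                  [0.02, 1.15, 140.0],
                  [1e-5, 2e-5, 1.0]])
    laser_in_frame = (742.0, 388.0)
    print(frame_to_slide(laser_in_frame, H))  # slide (x, y) of the laser dot

The same mapping, applied in the other direction with H itself, would locate a slide region (for instance, a line of text to magnify or a region to keep bright while dimming the rest) inside the video frame.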
Type:
text; Electronic Dissertation
Keywords:
gesture recognition; lecture videos; Computer Science; computer vision
Degree Name:
Ph.D.
Degree Level:
doctoral
Degree Program:
Graduate College; Computer Science
Degree Grantor:
University of Arizona
Advisor:
Efrat, Alon

Full metadata record

DC Field: Value [Language]
dc.language.iso: en_US [en]
dc.title: Who Moved My Slide? Recognizing Entities In A Lecture Video And Its Applications [en_US]
dc.creator: Tung, Qiyam Junn [en_US]
dc.contributor.author: Tung, Qiyam Junn [en_US]
dc.date.issued: 2014
dc.publisher: The University of Arizona. [en_US]
dc.rights: Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author. [en_US]
dc.description.abstract: Lecture videos have proliferated in recent years thanks to the increasing bandwidth of Internet connections and the availability of video cameras. Despite the massive volume of videos available, very few systems parse useful information from them. Extracting meaningful data can help with searching and indexing lecture videos and can improve understanding and usability for viewers. While video tags and user preferences are good indicators of relevant videos, they depend entirely on human-generated data. Furthermore, many lecture videos are technical by nature, and sparse video tags are too coarse-grained to relate parts of a video by a specific topic. Extracting the text from the presentation slides ameliorates this issue, but a lecture video still contains significantly more information than the slides alone; the actions and words of the speaker contribute to a richer and more nuanced understanding of the lecture material. The goal of the Semantically Linked Instructional Content (SLIC) project is to relate videos using more specific and relevant features, such as slide text and other entities. In this work, we present the algorithms used to recognize the entities of a lecture. Specifically, these entities are laser and hand-pointing gestures and the locations of the slide, its text, and its images in the video. Our algorithms work under the assumption that the slide location (homography) is known for each frame and extend that knowledge of the scene: gestures indicate when and where on a slide notable events occur. We also show how recognizing these entities can improve lecture comprehension and save energy on mobile devices. We conducted a user study showing that magnifying text based on laser gestures on a slide helps direct a viewer's attention to the relevant text. We also performed empirical measurements on real cellphones to confirm that selectively dimming less relevant regions of the video frame reduces energy consumption significantly. [en_US]
dc.type: text [en]
dc.type: Electronic Dissertation [en]
dc.subject: gesture recognition [en_US]
dc.subject: lecture videos [en_US]
dc.subject: Computer Science [en_US]
dc.subject: computer vision [en_US]
thesis.degree.name: Ph.D. [en_US]
thesis.degree.level: doctoral [en_US]
thesis.degree.discipline: Graduate College [en_US]
thesis.degree.discipline: Computer Science [en_US]
thesis.degree.grantor: University of Arizona [en_US]
dc.contributor.advisor: Efrat, Alon [en_US]
dc.contributor.committeemember: Efrat, Alon [en_US]
dc.contributor.committeemember: Barnard, Kobus [en_US]
dc.contributor.committeemember: Gniady, Chris [en_US]
dc.contributor.committeemember: Lega, Joceline [en_US]