Geometry of Presentation Videos and Slides, and the Semantic Linking of Instructional Content (SLIC) System

Persistent Link:
http://hdl.handle.net/10150/612447
Title:
Geometry of Presentation Videos and Slides, and the Semantic Linking of Instructional Content (SLIC) System
Author:
Kharitonova, Yekaterina
Issue Date:
2016
Publisher:
The University of Arizona.
Rights:
Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Embargo:
Release after 31-Dec-2018
Abstract:
Presentation slides are now a de facto standard in most classroom lectures, business meetings, and conference talks. Until recently, electronic presentation materials have been disconnected from each other: the video file and the corresponding slides are typically available separately for viewing or download. In this work, we exploit the fact that video frames of a presentation and the corresponding slides are mapped into one another by a geometric transformation called a homography. This mapping allows us to synchronize a video with the slides shown in it, enabling users to interactively view presentation materials and to search within and across presentations. We show how we can approximate homographies with affine transformations. Like the original homographies, such transformations allow us to project slides back into the video (i.e., perform backprojection), which improves their resulting appearance. The advantage of our method is that we use homographies to compress the original video, reducing the bandwidth used to transmit the video file, and then carry out backprojection using affine transformations on the client side. Additionally, we introduce a novel approach to slide appearance approximation, which improves SIFT-based matching for videos with out-of-plane rotation of the projection screen. This method also allows us to split each slide into three overlapping panels and generate rotated versions of each panel. Using these panels during matching, we detect slide content that is projected onto a speaker (what we call "slide tattoos"). We treat these "tattoos" as implicit structured light, which provides hints about the scene geometry. We then use the homography obtained from detecting "slide tattoos" to compute a fundamental matrix. The main significance of this contribution is that it allows us to infer 3-D information from 2-D presentation materials.
Finally, we present the Semantically Linked Instructional Content (SLIC) Portal, an online system for accessing presentations that exploits our slide-video matching. Aspects of the SLIC system fully developed or significantly improved as part of this work include:
* a publicly open web collection of video presentations indexed by slides
* a unified, clear interface displaying a video player along with slide images synchronized with their appearance in the video
* a categorization tree that allows browsing for presentations by topic/category
* an ability to query slide words within and across presentations; querying is integrated with the "browsing" mode, where the search results can be narrowed to only the selected categories
* an easy integration with the audio transcript: the ability to preview and search within speech words
* cross-platform and mobile support
We conducted user studies at the University of Arizona to measure the effect of synchronized presentation materials on learners, and discuss students' favorable response to the SLIC Portal, which they used during the experiments.
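The abstract does not say how the affine approximation of a homography is obtained. As an illustration only, the sketch below linearizes a homography around a chosen point (a first-order Taylor expansion), which is one common way to get such an approximation and is not necessarily the method used in the dissertation. The matrix `H`, the point `slide_center`, and all function names are hypothetical.

```python
def apply_homography(H, point):
    """Map a 2-D point through a 3x3 homography given as nested lists."""
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    u = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    v = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return (u, v)


def affine_approximation(H, center):
    """Return a 2x3 affine matrix that linearizes H around `center`.

    Assumption: a local first-order expansion is acceptable; the error
    grows with distance from `center` and with the strength of the
    perspective terms H[2][0] and H[2][1].
    """
    cx, cy = center
    w = H[2][0] * cx + H[2][1] * cy + H[2][2]
    u0, v0 = apply_homography(H, center)
    # Jacobian of the projective map evaluated at the center point.
    j00 = (H[0][0] - u0 * H[2][0]) / w
    j01 = (H[0][1] - u0 * H[2][1]) / w
    j10 = (H[1][0] - v0 * H[2][0]) / w
    j11 = (H[1][1] - v0 * H[2][1]) / w
    # Translation chosen so that the affine map agrees with H at the center.
    tx = u0 - j00 * cx - j01 * cy
    ty = v0 - j10 * cx - j11 * cy
    return [[j00, j01, tx], [j10, j11, ty]]


def apply_affine(A, point):
    """Map a 2-D point through a 2x3 affine matrix."""
    x, y = point
    return (A[0][0] * x + A[0][1] * y + A[0][2],
            A[1][0] * x + A[1][1] * y + A[1][2])


# Hypothetical slide-to-frame homography with mild perspective distortion.
H = [[0.5, 0.02, 100.0],
     [0.01, 0.5, 50.0],
     [1e-5, 2e-5, 1.0]]
slide_center = (512.0, 384.0)  # center of a hypothetical 1024x768 slide
A = affine_approximation(H, slide_center)
```

Under these assumptions, `apply_affine(A, p)` matches `apply_homography(H, p)` exactly at `slide_center` and drifts as `p` moves toward the slide corners. An affine map has six parameters and needs no per-point division, which is what would make it cheap to apply on the client side.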
Type:
text; Electronic Dissertation
Keywords:
Computer Science
Degree Name:
Ph.D.
Degree Level:
doctoral
Degree Program:
Graduate College; Computer Science
Degree Grantor:
University of Arizona
Advisor:
Barnard, Kobus

Full metadata record

DC Field | Value | Language
dc.language.iso | en_US | en
dc.title | Geometry of Presentation Videos and Slides, and the Semantic Linking of Instructional Content (SLIC) System | en_US
dc.creator | Kharitonova, Yekaterina | en
dc.contributor.author | Kharitonova, Yekaterina | en
dc.date.issued | 2016 | -
dc.publisher | The University of Arizona. | en
dc.rights | Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author. | en
dc.description.release | Release after 31-Dec-2018 | en
dc.type | text | en
dc.type | Electronic Dissertation | en
dc.subject | Computer Science | en
thesis.degree.name | Ph.D. | en
thesis.degree.level | doctoral | en
thesis.degree.discipline | Graduate College | en
thesis.degree.discipline | Computer Science | en
thesis.degree.grantor | University of Arizona | en
dc.contributor.advisor | Barnard, Kobus | en
dc.contributor.committeemember | Efrat, Alon | en
dc.contributor.committeemember | Morrison, Clayton T. | en
dc.contributor.committeemember | Surdeanu, Mihai | en
dc.contributor.committeemember | Barnard, Kobus | en