SURVEILLANCE IN THE INFORMATION AGE: TEXT QUANTIFICATION, ANOMALY DETECTION, AND EMPIRICAL EVALUATION

Persistent Link:
http://hdl.handle.net/10150/193893
Title:
SURVEILLANCE IN THE INFORMATION AGE: TEXT QUANTIFICATION, ANOMALY DETECTION, AND EMPIRICAL EVALUATION
Author:
Lu, Hsin-Min
Issue Date:
2010
Publisher:
The University of Arizona.
Rights:
Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract:
Deep penetration of personal computers, data communication networks, and the Internet has created a massive platform for data collection, dissemination, storage, and retrieval. Large amounts of textual data are now available at a very low cost. Valuable information, such as consumer preferences, new product developments, trends, and opportunities, can be found in this large collection of textual data. Growing worldwide competition, new technology development, and the Internet contribute to an increasingly turbulent business environment. Conducting surveillance on this growing collection of textual data could help a business avoid surprises, identify threats and opportunities, and gain competitive advantages.
Current text mining approaches, nonetheless, provide limited support for conducting surveillance using textual data. In this dissertation, I develop novel text quantification approaches to identify useful information in textual data, effective anomaly detection approaches to monitor time series data aggregated based on the text quantification approaches, and empirical evaluation approaches that verify the effectiveness of text mining approaches using external numerical data sources.
In Chapter 2, I present free-text chief complaint classification studies that aim to classify incoming emergency department free-text chief complaints into syndromic categories, a higher level of representation that facilitates syndromic surveillance. Chapter 3 presents a novel detection algorithm based on Markov switching with jumps models. This surveillance model aims to detect different types of disease outbreaks based on the time series generated from the chief complaint classification system.
In Chapters 4 and 5, I study the surveillance issue in the context of business decision making. Chapter 4 presents a novel text-based risk recognition design framework that can be used to monitor the changing business environment. Chapter 5 presents an empirical evaluation study that looks at the interaction between news sentiment and numerical accounting earnings information. Chapter 6 concludes this dissertation by highlighting major research contributions and the relevance to MIS research.
Type:
text; Electronic Dissertation
Keywords:
ambiguous information; chief complaint classification; Markov switching with jumps; news sentiment; syndromic surveillance; text-based risk recognition
Degree Name:
Ph.D.
Degree Level:
doctoral
Degree Program:
Management Information Systems; Graduate College
Degree Grantor:
University of Arizona
Advisor:
Chen, Hsinchun
Committee Chair:
Chen, Hsinchun

Full metadata record

DC Field | Value | Language
dc.language.iso | EN | en_US
dc.title | SURVEILLANCE IN THE INFORMATION AGE: TEXT QUANTIFICATION, ANOMALY DETECTION, AND EMPIRICAL EVALUATION | en_US
dc.creator | Lu, Hsin-Min | en_US
dc.contributor.author | Lu, Hsin-Min | en_US
dc.date.issued | 2010 | en_US
dc.publisher | The University of Arizona. | en_US
dc.rights | Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author. | en_US
dc.description.abstract | Deep penetration of personal computers, data communication networks, and the Internet has created a massive platform for data collection, dissemination, storage, and retrieval. Large amounts of textual data are now available at a very low cost. Valuable information, such as consumer preferences, new product developments, trends, and opportunities, can be found in this large collection of textual data. Growing worldwide competition, new technology development, and the Internet contribute to an increasingly turbulent business environment. Conducting surveillance on this growing collection of textual data could help a business avoid surprises, identify threats and opportunities, and gain competitive advantages. Current text mining approaches, nonetheless, provide limited support for conducting surveillance using textual data. In this dissertation, I develop novel text quantification approaches to identify useful information in textual data, effective anomaly detection approaches to monitor time series data aggregated based on the text quantification approaches, and empirical evaluation approaches that verify the effectiveness of text mining approaches using external numerical data sources. In Chapter 2, I present free-text chief complaint classification studies that aim to classify incoming emergency department free-text chief complaints into syndromic categories, a higher level of representation that facilitates syndromic surveillance. Chapter 3 presents a novel detection algorithm based on Markov switching with jumps models. This surveillance model aims to detect different types of disease outbreaks based on the time series generated from the chief complaint classification system. In Chapters 4 and 5, I study the surveillance issue in the context of business decision making. Chapter 4 presents a novel text-based risk recognition design framework that can be used to monitor the changing business environment. Chapter 5 presents an empirical evaluation study that looks at the interaction between news sentiment and numerical accounting earnings information. Chapter 6 concludes this dissertation by highlighting major research contributions and the relevance to MIS research. | en_US
dc.type | text | en_US
dc.type | Electronic Dissertation | en_US
dc.subject | ambiguous information | en_US
dc.subject | chief complaint classification | en_US
dc.subject | Markov switching with jumps | en_US
dc.subject | news sentiment | en_US
dc.subject | syndromic surveillance | en_US
dc.subject | text-based risk recognition | en_US
thesis.degree.name | Ph.D. | en_US
thesis.degree.level | doctoral | en_US
thesis.degree.discipline | Management Information Systems | en_US
thesis.degree.discipline | Graduate College | en_US
thesis.degree.grantor | University of Arizona | en_US
dc.contributor.advisor | Chen, Hsinchun | en_US
dc.contributor.chair | Chen, Hsinchun | en_US
dc.contributor.committeemember | Goes, Paulo | en_US
dc.contributor.committeemember | Zeng, Daniel | en_US
dc.identifier.proquest | 10922 | en_US
dc.identifier.oclc | 659754819 | en_US