Persistent Link:
http://hdl.handle.net/10150/289906
Title:
Modeling evolution of protein coding DNA sequences
Author:
Pond, Sergei L.
Issue Date:
2003
Publisher:
The University of Arizona.
Rights:
Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract:
We develop a new class of computationally feasible stochastic models for the statistical analysis of genetic sequence evolution and for inference of properties of the underlying substitution processes within a maximum likelihood framework. Existing models for the evolution of protein coding sequences allow site-to-site variation in non-synonymous substitution rates, but assume that the rate of synonymous substitutions is constant across all sites. The new models provide a rigorous statistical framework for testing the hypothesis of synonymous rate constancy, and enable a host of data exploration and analysis tools. For several indicative data sets, the constancy assumption is shown to be violated, and some possible explanations are given. We also present an algorithm for improving the efficiency of maximum likelihood evaluations, and discuss HyPhy, a user-friendly and publicly distributed software implementation of our methods.
Type:
text; Dissertation-Reproduction (electronic)
Keywords:
Biology, Biostatistics.; Biology, Genetics.; Statistics.
Degree Name:
Ph.D.
Degree Level:
doctoral
Degree Program:
Graduate College; Applied Mathematics
Degree Grantor:
University of Arizona
Advisor:
Watkins, Joseph C.

Full metadata record

DC Field | Value | Language
dc.language.iso | en_US | en_US
dc.title | Modeling evolution of protein coding DNA sequences | en_US
dc.creator | Pond, Sergei L. | en_US
dc.contributor.author | Pond, Sergei L. | en_US
dc.date.issued | 2003 | en_US
dc.publisher | The University of Arizona. | en_US
dc.rights | Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author. | en_US
dc.description.abstract | We develop a new class of computationally feasible stochastic models for the statistical analysis of genetic sequence evolution and for inference of properties of the underlying substitution processes within a maximum likelihood framework. Existing models for the evolution of protein coding sequences allow site-to-site variation in non-synonymous substitution rates, but assume that the rate of synonymous substitutions is constant across all sites. The new models provide a rigorous statistical framework for testing the hypothesis of synonymous rate constancy, and enable a host of data exploration and analysis tools. For several indicative data sets, the constancy assumption is shown to be violated, and some possible explanations are given. We also present an algorithm for improving the efficiency of maximum likelihood evaluations, and discuss HyPhy, a user-friendly and publicly distributed software implementation of our methods. | en_US
dc.type | text | en_US
dc.type | Dissertation-Reproduction (electronic) | en_US
dc.subject | Biology, Biostatistics. | en_US
dc.subject | Biology, Genetics. | en_US
dc.subject | Statistics. | en_US
thesis.degree.name | Ph.D. | en_US
thesis.degree.level | doctoral | en_US
thesis.degree.discipline | Graduate College | en_US
thesis.degree.discipline | Applied Mathematics | en_US
thesis.degree.grantor | University of Arizona | en_US
dc.contributor.advisor | Watkins, Joseph C. | en_US
dc.identifier.proquest | 3090006 | en_US
dc.identifier.bibrecord | .b44425697 | en_US