Evolutionary rates at codon sites may be used to align sequences and infer protein domain function

Persistent Link:
http://hdl.handle.net/10150/610187
Title:
Evolutionary rates at codon sites may be used to align sequences and infer protein domain function
Author:
Durand, Pierre; Hazelhurst, Scott; Coetzer, Theresa
Affiliation:
Evolutionary Medicine Unit, University of the Witwatersrand and National Health Laboratory Service, Johannesburg, South Africa; Plasmodium Molecular Research Unit, Department of Molecular Medicine and Haematology, University of the Witwatersrand and National Health Laboratory Service, Johannesburg, South Africa; Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, USA; School of Electrical and Information Engineering, University of the Witwatersrand, Johannesburg, South Africa
Issue Date:
2010
Publisher:
BioMed Central
Citation:
Durand et al. BMC Bioinformatics 2010, 11:151 http://www.biomedcentral.com/1471-2105/11/151
Journal:
BMC Bioinformatics
Rights:
© 2010 Durand et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0)
Collection Information:
This item is part of the UA Faculty Publications collection. For more information about this item or other items in the UA Campus Repository, contact the University of Arizona Libraries at repository@u.library.arizona.edu.
Abstract:
BACKGROUND: Sequence alignments form part of many investigations in molecular biology, including the determination of phylogenetic relationships, the prediction of protein structure and function, and the measurement of evolutionary rates. However, to obtain meaningful results, a significant degree of sequence similarity is required to ensure that the alignments are accurate and the inferences correct. Limitations arise when sequence similarity is low, which is particularly problematic when working with fast-evolving genes, evolutionarily distant taxa, genomes with nucleotide biases, and cases of convergent evolution.
RESULTS: A novel approach was conceptualized to address the "low sequence similarity" alignment problem. We developed an alignment algorithm termed FIRE (Functional Inference using the Rates of Evolution), which aligns sequences using the evolutionary rate at codon sites, as measured by the dN/dS ratio, rather than nucleotide or amino acid residues. FIRE was used to test the hypotheses that evolutionary rates can be used to align sequences and that the alignments may be used to infer protein domain function. Using a range of test data, we found that aligning domains based on evolutionary rates was possible even when sequence similarity was very low (for example, antibody variable regions). Furthermore, the alignment has the potential to infer protein domain function, indicating that domains with similar functions are subject to similar evolutionary constraints. These data suggest that an evolutionary rate-based approach to sequence analysis (particularly when combined with structural data) may be used to study cases of convergent evolution or when sequences have very low similarity. However, when aligning homologous gene sets with sequence similarity, FIRE did not perform as well as the best traditional alignment algorithms, indicating that the conventional approach of aligning residues as opposed to evolutionary rates remains the method of choice in these cases.
CONCLUSIONS: FIRE provides proof of concept that it is possible to align sequences and infer domain function by using evolutionary rates rather than residue similarity. This represents a new approach to sequence analysis with a wide range of potential applications in molecular biology.
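The abstract does not describe FIRE's scoring scheme or alignment procedure, so the following is only an illustrative sketch of the general idea of aligning sequences by per-codon evolutionary rate instead of by residue. It assumes each sequence has already been reduced to a list of dN/dS estimates (one per codon site) and applies a standard Needleman-Wunsch global alignment to those numbers. The function names, the similarity score `1 - |x - y|`, and the gap penalty are all hypothetical choices made for illustration, not taken from the paper.

```python
from typing import List, Optional, Tuple

Pair = Tuple[Optional[int], Optional[int]]


def site_score(x: float, y: float) -> float:
    """Hypothetical similarity: high when two dN/dS values are close."""
    return 1.0 - abs(x - y)


def align_rate_profiles(
    a: List[float], b: List[float], gap: float = -1.0
) -> Tuple[float, List[Pair]]:
    """Globally align two dN/dS profiles (Needleman-Wunsch).

    Returns the alignment score and a list of (index_in_a, index_in_b)
    pairs, where None marks a gap in that profile.
    """
    n, m = len(a), len(b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    move = [[""] * (m + 1) for _ in range(n + 1)]

    # Leading gaps along the first column and first row.
    for i in range(1, n + 1):
        score[i][0], move[i][0] = i * gap, "up"
    for j in range(1, m + 1):
        score[0][j], move[0][j] = j * gap, "left"

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            candidates = [
                (score[i - 1][j - 1] + site_score(a[i - 1], b[j - 1]), "diag"),
                (score[i - 1][j] + gap, "up"),    # gap in b
                (score[i][j - 1] + gap, "left"),  # gap in a
            ]
            # max() keeps the first candidate on ties, so "diag" is preferred.
            score[i][j], move[i][j] = max(candidates, key=lambda c: c[0])

    # Trace back from the bottom-right corner to recover the alignment.
    pairs: List[Pair] = []
    i, j = n, m
    while i > 0 or j > 0:
        step = move[i][j]
        if step == "diag":
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif step == "up":
            pairs.append((i - 1, None))
            i -= 1
        else:
            pairs.append((None, j - 1))
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


if __name__ == "__main__":
    # Toy profiles: mostly conserved sites with one fast-evolving site (1.5).
    total, aligned = align_rate_profiles([0.1, 0.2, 1.5, 0.1], [0.1, 1.5, 0.1])
    print(total, aligned)
```

In this toy case the fast-evolving site in each profile is paired with its counterpart and the extra conserved site in the longer profile is placed opposite a gap, even though no residue information is used at all.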
EISSN:
1471-2105
DOI:
10.1186/1471-2105-11-151
Version:
Final published version
Additional Links:
http://www.biomedcentral.com/1471-2105/11/151

Full metadata record

DC Field | Value | Language
dc.contributor.author | Durand, Pierre | en
dc.contributor.author | Hazelhurst, Scott | en
dc.contributor.author | Coetzer, Theresa | en
dc.date.accessioned | 2016-05-20T09:00:35Z | -
dc.date.available | 2016-05-20T09:00:35Z | -
dc.date.issued | 2010 | en
dc.identifier.citation | Durand et al. BMC Bioinformatics 2010, 11:151 http://www.biomedcentral.com/1471-2105/11/151 | en
dc.identifier.doi | 10.1186/1471-2105-11-151 | en
dc.identifier.uri | http://hdl.handle.net/10150/610187 | -
dc.description.abstract | BACKGROUND: Sequence alignments form part of many investigations in molecular biology, including the determination of phylogenetic relationships, the prediction of protein structure and function, and the measurement of evolutionary rates. However, to obtain meaningful results, a significant degree of sequence similarity is required to ensure that the alignments are accurate and the inferences correct. Limitations arise when sequence similarity is low, which is particularly problematic when working with fast-evolving genes, evolutionarily distant taxa, genomes with nucleotide biases, and cases of convergent evolution. RESULTS: A novel approach was conceptualized to address the "low sequence similarity" alignment problem. We developed an alignment algorithm termed FIRE (Functional Inference using the Rates of Evolution), which aligns sequences using the evolutionary rate at codon sites, as measured by the dN/dS ratio, rather than nucleotide or amino acid residues. FIRE was used to test the hypotheses that evolutionary rates can be used to align sequences and that the alignments may be used to infer protein domain function. Using a range of test data, we found that aligning domains based on evolutionary rates was possible even when sequence similarity was very low (for example, antibody variable regions). Furthermore, the alignment has the potential to infer protein domain function, indicating that domains with similar functions are subject to similar evolutionary constraints. These data suggest that an evolutionary rate-based approach to sequence analysis (particularly when combined with structural data) may be used to study cases of convergent evolution or when sequences have very low similarity. However, when aligning homologous gene sets with sequence similarity, FIRE did not perform as well as the best traditional alignment algorithms, indicating that the conventional approach of aligning residues as opposed to evolutionary rates remains the method of choice in these cases. CONCLUSIONS: FIRE provides proof of concept that it is possible to align sequences and infer domain function by using evolutionary rates rather than residue similarity. This represents a new approach to sequence analysis with a wide range of potential applications in molecular biology. | en
dc.language.iso | en | en
dc.publisher | BioMed Central | en
dc.relation.url | http://www.biomedcentral.com/1471-2105/11/151 | en
dc.rights | © 2010 Durand et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0) | en
dc.title | Evolutionary rates at codon sites may be used to align sequences and infer protein domain function | en
dc.type | Article | en
dc.identifier.eissn | 1471-2105 | en
dc.contributor.department | Evolutionary Medicine Unit, University of the Witwatersrand and National Health Laboratory Service, Johannesburg, South Africa | en
dc.contributor.department | Plasmodium Molecular Research Unit, Department of Molecular Medicine and Haematology, University of the Witwatersrand and National Health Laboratory Service, Johannesburg, South Africa | en
dc.contributor.department | Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, USA | en
dc.contributor.department | School of Electrical and Information Engineering, University of the Witwatersrand, Johannesburg, South Africa | en
dc.identifier.journal | BMC Bioinformatics | en
dc.description.collectioninformation | This item is part of the UA Faculty Publications collection. For more information about this item or other items in the UA Campus Repository, contact the University of Arizona Libraries at repository@u.library.arizona.edu. | en
dc.eprint.version | Final published version | en