2nd International ICST Conference on Bio-Inspired Models of Network, Information, and Computing Systems

Research Article

Improved Covariance Model Parameter Estimation Using RNA Thermodynamic Properties

  • @INPROCEEDINGS{10.4108/ICST.BIONETICS2007.2431,
        author={Scott F. Smith and Kay C. Wiese},
        title={Improved Covariance Model Parameter Estimation Using RNA Thermodynamic Properties},
        proceedings={2nd International ICST Conference on Bio-Inspired Models of Network, Information, and Computing Systems},
        proceedings_a={BIONETICS},
        year={2008},
        month={8},
        keywords={Bioinformatics  Covariance models  Database search  Non-coding RNA gene search  RNA secondary structure},
        doi={10.4108/ICST.BIONETICS2007.2431}
    }
    
  • Scott F. Smith
    Kay C. Wiese
    Year: 2008
    Improved Covariance Model Parameter Estimation Using RNA Thermodynamic Properties
    BIONETICS
    ICST
    DOI: 10.4108/ICST.BIONETICS2007.2431
Scott F. Smith1,*, Kay C. Wiese2,*
  • 1: Boise State University ECE Department Boise, Idaho, 83725-2075, USA +1-208-426-5743
  • 2: Simon Fraser University School of Computing Science Surrey, BC, Canada, V3T 0A3 +1-778-782-7436
*Contact email: sfsmith@boisestate.edu, kwiese@cs.sfu.ca

Abstract

Covariance models are a powerful description of non-coding RNA (ncRNA) families that can be used to search nucleotide databases for new members of these ncRNA families. Currently, estimation of the parameters of a covariance model (state transition and emission scores) is based only on the observed frequencies of mutations, insertions, and deletions in known ncRNA sequences. For families with very few known members, this can result in rather uninformative models where the consensus sequence has a good score and most deviations from consensus have a fairly uniform poor score. It is proposed here to combine the traditional observed-frequency information with known information about free energy changes in RNA helix formation and loop length changes. More thermodynamically probable deviations from the consensus sequence will then be favored in database search. The thermodynamic information may be incorporated into the models as informative priors that depend on neighboring consensus nucleotides and on loop lengths.
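
The idea of an informative prior can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the prior takes the form of Dirichlet pseudocounts obtained by Boltzmann-weighting a free energy for each base pair, and the function names, the prior strength, and every free-energy value below are hypothetical placeholders chosen only to show the mechanics.

```python
import math

GAS_CONSTANT = 0.0019872  # kcal/(mol*K)


def boltzmann_pseudocounts(delta_g, temperature_k=310.15, strength=2.0):
    """Convert free energies (kcal/mol) into pseudocounts that sum to `strength`.

    Assumption: lower (more negative) free energy means a more probable pair,
    weighted by exp(-dG / RT).
    """
    rt = GAS_CONSTANT * temperature_k
    weights = {pair: math.exp(-g / rt) for pair, g in delta_g.items()}
    total = sum(weights.values())
    return {pair: strength * w / total for pair, w in weights.items()}


def posterior_mean(counts, pseudocounts):
    """Combine observed counts with pseudocounts (Dirichlet posterior mean)."""
    total = sum(counts.get(pair, 0) + a for pair, a in pseudocounts.items())
    return {pair: (counts.get(pair, 0) + a) / total
            for pair, a in pseudocounts.items()}


# Placeholder free energies for one base-pair position; NOT measured values.
illustrative_delta_g = {"GC": -3.0, "AU": -1.0, "GU": -0.5, "AA": 1.0}

# A family with only three known members, all showing G-C at this position.
observed_counts = {"GC": 3}

prior = boltzmann_pseudocounts(illustrative_delta_g)
probs = posterior_mean(observed_counts, prior)

for pair in sorted(probs, key=probs.get, reverse=True):
    print(pair, round(probs[pair], 4))
```

With observed frequencies alone, every pair other than G-C would receive the same estimate at this position. Under the assumed prior, the unobserved pairs are instead ranked by their assumed stability, which is the behavior the abstract describes for thermodynamically probable deviations from consensus.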