Nature of Computation and Communication. International Conference, ICTCC 2014, Ho Chi Minh City, Vietnam, November 24-25, 2014, Revised Selected Papers

Research Article

MetaAB - A Novel Abundance-Based Binning Approach for Metagenomic Sequences

  • @INPROCEEDINGS{10.1007/978-3-319-15392-6_13,
        author={Van-Vinh Le and Tran Lang and Tran Hoai},
        title={MetaAB - A Novel Abundance-Based Binning Approach for Metagenomic Sequences},
        proceedings={Nature of Computation and Communication. International Conference, ICTCC 2014, Ho Chi Minh City, Vietnam, November 24-25, 2014, Revised Selected Papers},
        proceedings_a={ICTCC},
        year={2015},
        month={2},
        keywords={Metagenomics, Binning, Next-generation sequencing, Bayesian information criterion, Genome abundance},
        doi={10.1007/978-3-319-15392-6_13}
    }
    
  • Van-Vinh Le
    Tran Lang
    Tran Hoai
    Year: 2015
    MetaAB - A Novel Abundance-Based Binning Approach for Metagenomic Sequences
    ICTCC
    ICST
    DOI: 10.1007/978-3-319-15392-6_13
Van-Vinh Le*, Tran Lang¹, Tran Hoai²
  • 1: Vietnam Academy of Science and Technology
  • 2: HCMC University of Technology
*Contact email: vinhlv@fit.hcmute.edu.vn

Abstract

Metagenomics is the study of microbial communities directly from genetic material obtained from environmental samples, without isolating and culturing individual organisms in the laboratory. One of the crucial tasks in metagenomic projects is the identification and taxonomic characterization of the DNA sequences in the samples. In this paper, we present MetaAB, an unsupervised method for binning metagenomic reads that identifies and classifies reads into groups of genomes using information about genome abundances. The method is based on a proposed reduced-dimension model that is theoretically proven to require less computational time. In addition, MetaAB automatically detects the number of genome abundances in the data by using the Bayesian Information Criterion. Experimental results show that the proposed method achieves higher accuracy and runs faster than a recent abundance-based binning approach. The software implementing the algorithm can be downloaded at
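
Illustrative note: the abstract states that the number of abundance groups is chosen with the Bayesian Information Criterion (BIC), defined as k·ln(n) − 2·ln(L) for k free parameters, n observations, and maximized likelihood L, where a lower score is better. The sketch below shows that selection step only, and it is not the authors' implementation. It assumes, purely for illustration, that occurrence counts are modelled as a mixture of Poisson distributions; the function names, the toy counts, and the hand-picked parameters (set close to their maximum-likelihood values instead of being fitted) are all hypothetical.

```python
import math


def poisson_logpmf(x, lam):
    """Log-probability of observing count x under a Poisson with rate lam."""
    return x * math.log(lam) - lam - math.lgamma(x + 1)


def mixture_loglik(counts, weights, rates):
    """Log-likelihood of the counts under a Poisson mixture with fixed parameters."""
    total = 0.0
    for x in counts:
        terms = [math.log(w) + poisson_logpmf(x, lam)
                 for w, lam in zip(weights, rates)]
        peak = max(terms)  # log-sum-exp for numerical stability
        total += peak + math.log(sum(math.exp(t - peak) for t in terms))
    return total


def bic(loglik, n_params, n_obs):
    """Bayesian Information Criterion; lower values indicate a better model."""
    return n_params * math.log(n_obs) - 2.0 * loglik


# Toy data (assumption): two abundance levels, around 5 and around 50.
counts = [5] * 10 + [50] * 10

# Candidate models: number of components -> (mixing weights, Poisson rates).
# Parameters are set by hand near their maximum-likelihood values; a real
# pipeline would fit them, for example with expectation-maximization.
candidates = {
    1: ([1.0], [27.5]),
    2: ([0.5, 0.5], [5.0, 50.0]),
}

scores = {}
for k, (weights, rates) in candidates.items():
    n_params = 2 * k - 1  # k rates plus (k - 1) free mixing weights
    scores[k] = bic(mixture_loglik(counts, weights, rates), n_params, len(counts))

# Select the number of components with the lowest BIC.
best_k = min(scores, key=scores.get)
print(best_k, scores)
```

On this toy input the two-component model is selected, because its gain in likelihood far outweighs the penalty for its two extra parameters.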