The First International Workshop on Bioinformatics

Research Article

Metabolic Network Construction Using Ensemble Algorithms

  • @INPROCEEDINGS{10.4108/eai.3-12-2015.2262392,
        author={Seongho Kim and Joohyoung Lee and Hyejeong Jang and Xiang Zhang},
        title={Metabolic Network Construction Using Ensemble Algorithms},
        booktitle={The First International Workshop on Bioinformatics},
        publisher={ACM},
        proceedings_a={BIOINFORMATICS},
        year={2016},
        month={5},
        keywords={ensemble averaging metabolomics network construction},
        doi={10.4108/eai.3-12-2015.2262392}
    }
    
Seongho Kim1,*, Joohyoung Lee2, Hyejeong Jang1, Xiang Zhang3
  • 1: Wayne State University/Karmanos Cancer Institute
  • 2: Wayne State University
  • 3: University of Louisville
*Contact email: kimse@karmanos.org

Abstract

One of the most important and challenging "knowledge extraction" tasks in bioinformatics is the reverse engineering of gene, protein, and metabolite networks from biological data. Gaussian graphical models (GGMs) have proven to be a very powerful formalism for inferring biological networks. Unfortunately, standard GGM selection techniques cannot be used in the "small N, large P" data setting. Various methods have been developed to overcome this issue, based on regularized estimation, the partial least squares method, and limited-order partial correlation graphs. Several studies have compared the performance of network construction algorithms, including PLSR, SCE, ES, ICR, PCR, ridge regression, Lasso, and adaptive Lasso, to determine which method is best for biological network construction. Each comparison concluded that every construction method has its own advantages and disadvantages depending on the circumstances, such as the network complexity. However, it is almost impossible to know the complexity of the network before estimation. We therefore develop an ensemble method, based on model averaging, to construct a metabolic network. Our simulation studies show that ensemble-averaging-based network construction achieves a larger F1 score than all of the other methods except adaptive Lasso, reflecting its ability to account for uncertainty in network complexity.
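
The ensemble idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes, purely for illustration, that the base estimators are ridge-regularized partial correlation matrices at three penalty levels, that they are combined by an unweighted average, and that edges are selected with a hypothetical threshold of 0.1. All function names and parameter values are assumptions.

```python
import numpy as np


def partial_correlations(X, ridge):
    """Partial correlations from a ridge-regularized covariance estimate.

    Adding ridge * I keeps the covariance invertible when N < P.
    """
    S = np.cov(X, rowvar=False)
    P = np.linalg.inv(S + ridge * np.eye(S.shape[0]))  # precision matrix
    P = (P + P.T) / 2.0                                # enforce exact symmetry
    d = np.sqrt(np.diag(P))
    R = -P / np.outer(d, d)                            # rho_ij = -p_ij / sqrt(p_ii p_jj)
    np.fill_diagonal(R, 1.0)
    return R


def ensemble_network(X, ridges=(0.01, 0.1, 1.0), threshold=0.1):
    """Average the base estimates (assumed equal weights), then threshold."""
    avg = np.mean([partial_correlations(X, r) for r in ridges], axis=0)
    adj = np.abs(avg) > threshold
    np.fill_diagonal(adj, False)                       # no self-loops
    return adj


def f1_score(true_adj, est_adj):
    """F1 score over undirected edges (upper triangle of boolean matrices)."""
    iu = np.triu_indices_from(true_adj, k=1)
    t, e = true_adj[iu], est_adj[iu]
    tp = np.sum(t & e)
    fp = np.sum(~t & e)
    fn = np.sum(t & ~e)
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom else 0.0


if __name__ == "__main__":
    # Simulated "small N, large-ish P" data from a chain-structured GGM.
    rng = np.random.default_rng(0)
    p, n = 10, 20
    omega = np.eye(p) + 0.4 * (np.eye(p, k=1) + np.eye(p, k=-1))
    X = rng.multivariate_normal(np.zeros(p), np.linalg.inv(omega), size=n)
    true_adj = (omega != 0) & ~np.eye(p, dtype=bool)
    print("F1 of ensemble network:", f1_score(true_adj, ensemble_network(X)))
```

Averaging before thresholding lets no single penalty level dictate the network's sparsity, which is the sense in which an ensemble can hedge against unknown network complexity. A weighted average or a different set of base estimators could be substituted without changing the structure of the sketch.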