Research Article
Pattern Recognition of Big Nutritional Data in RCT
@INPROCEEDINGS{10.4108/icst.bodynets.2013.253690,
  author={Jin Wang and Hua Fang and Honggang Wang and Gin-Fei Olendzki and Chonggang Wang and Yunsheng Ma},
  title={Pattern Recognition of Big Nutritional Data in RCT},
  proceedings={8th International Conference on Body Area Networks},
  publisher={ICST},
  proceedings_a={BODYNETS},
  year={2013},
  month={10},
  keywords={big data random controlled trial pattern recognition heterogeneity simulation dietary quality nutritional datasets gaussian mixture model (gmm) hidden markov random fields (hmrfs) self-organizing map-based neural networks (som) k-means agglomerative hierarchical clustering},
  doi={10.4108/icst.bodynets.2013.253690}
}
Jin Wang
Hua Fang
Honggang Wang
Gin-Fei Olendzki
Chonggang Wang
Yunsheng Ma
Year: 2013
BODYNETS
ACM
DOI: 10.4108/icst.bodynets.2013.253690
Abstract
As technology develops and research environments improve, large volumes of data are collected for analysis. Unfortunately, many of these data are collected but never fully used, or are left untouched altogether. In particular, big data from health and medical studies pose significant methodological challenges. This paper presents a new multi-clustering approach, with multi-validation criteria, for pattern recognition of big data in a randomized controlled trial (RCT). To demonstrate the approach, we used a nutritional dataset generated from an NIH-funded RCT for patients with metabolic syndrome. The proposed approach includes a suite of emerging and popular clustering methods: the probability-based Gaussian Mixture Model (GMM), Hidden Markov Random Fields (HMRFs), Self-Organizing Map (SOM)-based neural networks, K-means, and the agglomerative hierarchical method. Using our RCT data and the multi-validation criteria, our approach identified the most sufficient set of nutritional variables and detected distinct dietary change patterns on which all of the proposed methods agreed. The trajectory patterns were then generated using the method with the highest clustering accuracy, which was cross-validated via simulation. These patterns yielded new and finer-grained results for the outcomes of the RCT. Although our approach was demonstrated only on big nutritional data from an RCT, where it produced more accurate and comprehensive clustering, it can be generalized to big data in other research fields.
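The idea of checking agreement among several clustering methods can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only two of the methods named in the abstract (K-means and agglomerative hierarchical clustering), the data are synthetic, the Rand index is an assumed stand-in for the paper's multi-validation criteria, and all function names (`kmeans`, `agglomerative`, `rand_index`) are hypothetical.

```python
import random
from itertools import combinations


def sqdist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=100):
    """Plain Lloyd's k-means with deterministic farthest-first initialization."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(sqdist(p, c) for c in centers)))
    labels = []
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [min(range(k), key=lambda c: sqdist(p, centers[c])) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Keep the old center if a cluster ends up empty.
            updated.append(
                tuple(sum(x) / len(members) for x in zip(*members)) if members else centers[c]
            )
        if updated == centers:
            break
        centers = updated
    return labels


def agglomerative(points, k):
    """Naive average-linkage agglomerative clustering, stopped at k clusters."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        def linkage(pair):
            a, b = pair
            total = sum(sqdist(points[i], points[j]) for i in clusters[a] for j in clusters[b])
            return total / (len(clusters[a]) * len(clusters[b]))

        # Merge the two clusters with the smallest average squared distance.
        a, b = min(combinations(range(len(clusters)), 2), key=linkage)
        clusters[a].extend(clusters[b])
        del clusters[b]
    labels = [0] * len(points)
    for lab, members in enumerate(clusters):
        for i in members:
            labels[i] = lab
    return labels


def rand_index(a, b):
    """Fraction of point pairs on which two labelings agree (same vs. different cluster)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


# Synthetic stand-in for two groups of participants (not real trial data):
# 20 points near (0, 0) and 20 points near (10, 10).
rng = random.Random(0)
points = [
    (cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1))
    for cx, cy in [(0, 0), (10, 10)]
    for _ in range(20)
]

km_labels = kmeans(points, 2)
hc_labels = agglomerative(points, 2)
print(rand_index(km_labels, hc_labels))  # 1.0: both methods recover the same two groups
```

On well-separated synthetic groups like these, the two methods agree completely. On real data, agreement would typically be partial, which is where a set of validation criteria such as the one the abstract describes becomes necessary.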