
Research Article
An Interactive Web Solution for Electronic Health Records Segmentation and Prediction
@INPROCEEDINGS{10.1007/978-3-031-35078-8_8,
    author={Sudeep Mathew and Mithun Dolthody Jayaprakash and Rashmi Agarwal},
    title={An Interactive Web Solution for Electronic Health Records Segmentation and Prediction},
    proceedings={Intelligent Systems and Machine Learning. First EAI International Conference, ICISML 2022, Hyderabad, India, December 16-17, 2022, Proceedings, Part I},
    proceedings_a={ICISML},
    year={2023},
    month={7},
    keywords={Natural Language Processing EHR Segmentation Serious Adverse Event Prediction},
    doi={10.1007/978-3-031-35078-8_8}
}
Sudeep Mathew
Mithun Dolthody Jayaprakash
Rashmi Agarwal
Year: 2023
ICISML
Springer
DOI: 10.1007/978-3-031-35078-8_8
Abstract
Electronic Health Records (EHR) systems collect and monitor a wide variety of patient data through various healthcare tools. This paper begins with data acquisition and data understanding, then builds a web interface for data exploration, segmentation, and classification. The data modeling phase aims to create machine learning models for segmentation and classification. The first step is data acquisition from the MIMIC-III v1.4 (clinical database) data mart. In the data understanding phase, the relationships among multiple tables are evaluated. After data wrangling, k-means clustering is applied to the combined dataset to obtain clusters of congestive heart failure patients. In the next phase, the diagnosis text is used for modeling: various text features are created, multiple classification techniques are applied to predict the occurrence of death, and the best model is selected for deployment. In the model evaluation phase, six clusters were found to be optimal during training, and the clustering model is incorporated into the application to predict patient segments based on risk level. Several machine learning models were trained on patients' historical diagnosis text; the logistic regression model achieved an AUC of 89% on the test data and is deployed in the application for prediction.
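The two techniques named in the abstract, k-means segmentation and AUC-based model evaluation, can be illustrated with a minimal standard-library Python sketch. It is not the authors' implementation. The toy points, the labels and scores passed to `auc`, the farthest-point initialisation, and the function names `init_centroids`, `kmeans`, `inertia`, and `auc` are all assumptions made for illustration; the paper reports six clusters on MIMIC-III features, whereas this example uses two clusters on made-up 2-D data.

```python
from math import dist


def init_centroids(points, k):
    """Deterministic farthest-point initialisation (an assumption, not from the paper)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point whose nearest existing centroid is farthest away.
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    return centroids


def kmeans(points, k, iters=100):
    """Lloyd's algorithm on a list of equal-length numeric tuples."""
    centroids = init_centroids(points, k)
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # converged: assignments no longer change
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


def inertia(points, labels, centroids):
    """Within-cluster sum of squared distances; comparing it across k is the elbow heuristic."""
    return sum(dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


def auc(y_true, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
labels, centroids = kmeans(points, k=2)

# Hypothetical true outcomes and predicted probabilities for four patients.
score = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

To choose a number of clusters, one could run `kmeans` over a range of `k` values and compare `inertia` for each. The abstract does not say which selection criterion produced the reported optimum of six. The `auc` function only scores predictions that already exist; the logistic regression model and text features that produce those predictions are outside this sketch.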