
Research Article
Demystifying Predictive Analytics with Data Mining to Optimize Fraud Detection in the Insurance Industry
@INPROCEEDINGS{10.1007/978-3-030-80621-7_31,
    author={Betelhem Zewdu and Gebeyehu Belay},
    title={Demystifying Predictive Analytics with Data Mining to Optimize Fraud Detection in the Insurance Industry},
    proceedings={Advances of Science and Technology. 8th EAI International Conference, ICAST 2020, Bahir Dar, Ethiopia, October 2-4, 2020, Proceedings, Part I},
    proceedings_a={ICAST},
    year={2021},
    month={7},
    keywords={Fraud Detection, Data mining, Optimization, Predictive analytics, Determinant factor},
    doi={10.1007/978-3-030-80621-7_31}
}
- Betelhem Zewdu
- Gebeyehu Belay
Year: 2021
ICAST
Springer
DOI: 10.1007/978-3-030-80621-7_31
Abstract
The insurance industry provides risk management through contracts that insure finances, people, and other assets. Fraud, committed for personal benefit or interest, is one such risk. In workmen’s compensation, insurance fraud is intentional deception to gain some benefit in the form of health expenditures, and it is challenging to handle manually. In this study, we propose a novel predictive analytics approach using data mining techniques. The model can detect and predict fraud-suspicious insurance claims, with a particular emphasis on the Insurance Corporation in the case of workmen’s compensation. We use ensemble clustering followed by classification techniques to develop the predictive model. Predictive analytics builds an analytical model from variables whose values are known in order to predict the value of a variable whose value is unknown. The K-Means clustering algorithm is employed to find the natural grouping of the insurance claims into fraud and non-fraud. The resulting clusters are then used to develop the classification model. Classification was performed using the J48 and JRip algorithms to build a model that classifies fraud-suspicious insurance claims. The AdaBoost method with JRip as the base classifier scored an accuracy of 98.26% on an 80% split. CLAIMREPORTLENGTH_DATE is the determinant factor for predicting fraud-suspicious claims.
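The cluster-then-classify pipeline described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the claims are synthetic, the two features (reporting delay and claim amount) are hypothetical stand-ins, decision stumps replace JRip as the AdaBoost base learner, and treating the cluster with the longer mean reporting delay as "suspicious" is an assumption made only for illustration.

```python
import math
import random

def kmeans(points, k=2, iters=100, seed=0):
    """Plain Lloyd's k-means; returns a cluster index per point and the centroids."""
    centroids = random.Random(seed).sample(points, k)
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
                  for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            updated.append(tuple(sum(col) / len(members) for col in zip(*members))
                           if members else centroids[c])
        if updated == centroids:  # assignments are stable, so stop
            break
        centroids = updated
    return labels, centroids

def stump_predict(stump, x):
    feature, threshold, sign = stump
    return sign if x[feature] > threshold else -sign

def fit_stump(X, y, w):
    """Weighted-error-minimising threshold rule (stand-in for JRip, not JRip itself)."""
    best_err, best = float("inf"), None
    for j in range(len(X[0])):
        vals = sorted({x[j] for x in X})
        for lo, hi in zip(vals, vals[1:]):
            for sign in (1, -1):
                stump = (j, (lo + hi) / 2, sign)
                err = sum(wi for xi, yi, wi in zip(X, y, w) if stump_predict(stump, xi) != yi)
                if err < best_err:
                    best_err, best = err, stump
    return best_err, best

def adaboost(X, y, rounds=10):
    """Discrete AdaBoost over stumps; y must contain -1/+1 labels."""
    w = [1 / len(X)] * len(X)
    ensemble = []
    for _ in range(rounds):
        err, stump = fit_stump(X, y, w)
        if err >= 0.5:  # no better than chance, stop boosting
            break
        perfect = err <= 1e-10
        err = max(err, 1e-10)  # avoid division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        if perfect:  # training set already separated
            break
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi)) for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

def run_demo(seed=42):
    rng = random.Random(seed)
    # Synthetic claims: (reporting delay in days, claim amount in thousands). Hypothetical data.
    claims = [(rng.gauss(5, 2), rng.gauss(1, 0.5)) for _ in range(60)]
    claims += [(rng.gauss(60, 2), rng.gauss(8, 0.5)) for _ in range(40)]
    labels, centroids = kmeans(claims, k=2)
    # Assumption: the cluster with the longer mean delay is labelled suspicious (+1).
    suspicious = max(range(2), key=lambda c: centroids[c][0])
    y = [1 if lab == suspicious else -1 for lab in labels]
    order = list(range(len(claims)))
    rng.shuffle(order)
    cut = int(0.8 * len(order))  # 80% train, 20% held out
    train, test = order[:cut], order[cut:]
    model = adaboost([claims[i] for i in train], [y[i] for i in train])
    return sum(predict(model, claims[i]) == y[i] for i in test) / len(test)

accuracy = run_demo()
print(f"held-out accuracy: {accuracy:.2f}")
```

Because the class labels come from the clustering step itself, the held-out accuracy measures how well the classifier reproduces the cluster assignment, not agreement with independently verified fraud cases.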