Advanced Hybrid Information Processing. First International Conference, ADHIP 2017, Harbin, China, July 17–18, 2017, Proceedings

Research Article

Accurate Decision Tree with Cost Constraints

  @INPROCEEDINGS{10.1007/978-3-319-73317-3_19,
      author={Wang, Nan and Li, Jinbao and Liu, Yong and Zhu, Jinghua and Su, Jiaxuan and Peng, Cheng},
      title={Accurate Decision Tree with Cost Constraints},
      booktitle={Advanced Hybrid Information Processing. First International Conference, ADHIP 2017, Harbin, China, July 17--18, 2017, Proceedings},
      year={2018},
      doi={10.1007/978-3-319-73317-3_19},
      keywords={Decision tree, Cost constraint, Machine learning, Algorithm of classification}
  }
Nan Wang1, Jinbao Li1,*, Yong Liu1, Jinghua Zhu1, Jiaxuan Su2, Cheng Peng2
  • 1: Heilongjiang University
  • 2: Harbin Institute of Technology


A decision tree is a basic classification and regression method that uses a tree-structured model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility. It is an effective approach to classification and, at the same time, a way to display an algorithm. As a classical classification algorithm, the decision tree has many optimization variants. Although these approaches achieve high performance, they usually ignore the cost of acquiring attribute values. In some cases, however, acquisition costs differ greatly and matter, so the attribute-acquisition cost of a decision tree cannot be ignored. Existing construction approaches for cost-sensitive decision trees fail to generate the decision tree dynamically according to a given data object and cost constraint. In this paper, we attempt to solve this problem. We propose a global decision tree as the model, from which the proper decision tree is derived dynamically according to the data object and the cost constraint. To generate these dynamic decision trees, we propose a cost-constraint-based pruning algorithm. Experimental results demonstrate that even though the attribute-acquisition cost of our approach is much smaller than that of C4.5, the accuracy gap between the two is also small. Additionally, for large data sets, our approach outperforms C4.5 in both cost and accuracy.