Proceedings of the 1st International Conference on Statistics and Analytics, ICSA 2019, 2-3 August 2019, Bogor, Indonesia

Research Article

Comparing Decision Tree, Random Forest and Boosting in Identifying Weather Index for Rice Yield Prediction

  • @INPROCEEDINGS{10.4108/eai.2-8-2019.2290475,
        author={Mohammad  Masjkur and Ken Seng  Tan},
        title={Comparing Decision Tree, Random Forest and Boosting in Identifying Weather Index for Rice Yield Prediction},
        proceedings={Proceedings of the 1st International Conference on Statistics and Analytics, ICSA 2019, 2-3 August 2019, Bogor, Indonesia},
        publisher={EAI},
        proceedings_a={ICSA},
        year={2020},
        month={1},
        keywords={boosting decision tree random forest rice insurance weather index},
        doi={10.4108/eai.2-8-2019.2290475}
    }
    
  • Mohammad Masjkur
    Ken Seng Tan
    Year: 2020
    Comparing Decision Tree, Random Forest and Boosting in Identifying Weather Index for Rice Yield Prediction
    ICSA
    EAI
    DOI: 10.4108/eai.2-8-2019.2290475
Mohammad Masjkur1,*, Ken Seng Tan2
  • 1: Department of Statistics, Bogor Agricultural University, Bogor, Indonesia
  • 2: Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Canada
*Contact email: masjkur@apps.ipb.ac.id

Abstract

Modeling the relationship between a weather index and yield losses is the basis for developing weather-based index crop insurance. Data mining approaches may overcome some limitations of traditional regression approaches in identifying a weather index for predicting crop yield. The purpose of this study is to evaluate the performance of Decision Tree, Random Forest, and Boosting in identifying the most important weather index for rice yield prediction. The study uses annual district-level rice yield data from 8 locations in the Java region for the period 1991–2014. The corresponding weather data consist of 48 weather variables, including the timescale-based Standardized Precipitation Index (SPI), Growing Degree Days (GDD), and Vapor Pressure Deficit (VPD) for each growing season. Results show that the Boosted Regression Tree is the best model for rice yield prediction, compared with the Regression Tree and Random Forest. The most important weather indices are Growing Degree Days in growing season I (GDD I) and Growing Degree Days in growing season III (GDD III). Threshold values of GDD I > 2100 °C and GDD III > 2150 °C would trigger rice yield losses.
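
As a rough illustration of how a GDD-based trigger of this kind could work, the sketch below accumulates Growing Degree Days over a season and compares the total with a threshold. The abstract does not say how GDD was computed, so the simple daily-average method, the base temperature of 10 °C, the function names, and the sample temperatures are all assumptions made for illustration; they are not the authors' procedure.

```python
from typing import Iterable, Tuple


def growing_degree_days(
    daily_temps: Iterable[Tuple[float, float]],
    base_temp: float = 10.0,  # assumed base temperature, not from the paper
) -> float:
    """Sum daily GDD from (t_min, t_max) pairs given in degrees Celsius.

    Uses the simple averaging method: each day contributes
    max(0, (t_min + t_max) / 2 - base_temp).
    """
    total = 0.0
    for t_min, t_max in daily_temps:
        mean_temp = (t_min + t_max) / 2.0
        total += max(0.0, mean_temp - base_temp)
    return total


def loss_triggered(gdd: float, threshold: float) -> bool:
    """Return True when accumulated GDD exceeds the trigger threshold."""
    return gdd > threshold


# Hypothetical 120-day season with constant daily minimum and maximum.
season = [(24.0, 32.0)] * 120
gdd_season = growing_degree_days(season)

# The threshold is a parameter; 2100.0 is used here only as an example value.
triggered = loss_triggered(gdd_season, threshold=2100.0)
```

In an index insurance setting the threshold would come from the fitted model rather than being fixed by hand, and a separate threshold would apply to each growing season.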