Proceedings of the 2nd International Conference on Mathematical Statistics and Economic Analysis, MSEA 2023, May 26–28, 2023, Nanjing, China

Research Article

An Estimation of the Pricing of Second-Hand Sailboats Based on the Random Forest Algorithm

@INPROCEEDINGS{10.4108/eai.26-5-2023.2334481,
    author={Chengyuan Yang and Sinan Tang and JiaHao Chen},
    title={An Estimation of the Pricing of Second-Hand Sailboats Based on the Random Forest Algorithm},
    proceedings={Proceedings of the 2nd International Conference on Mathematical Statistics and Economic Analysis, MSEA 2023, May 26--28, 2023, Nanjing, China},
    publisher={EAI},
    proceedings_a={MSEA},
    year={2023},
    month={7},
    keywords={random forest, decision tree, lagrange interpolation, local outlier factor},
    doi={10.4108/eai.26-5-2023.2334481}
}
    
Chengyuan Yang (1, *), Sinan Tang (1), JiaHao Chen (1)
1: South China University of Technology
*Contact email: 1770573981@qq.com

Abstract

The sailboat market is complex, with prices affected by multiple factors. Buyers who understand how common factors influence the pricing of used sailboats can better evaluate a boat's actual value and make better purchasing decisions, so accurate price prediction is important both for developing pricing strategies and for making buying and selling decisions. To address the first problem, a web crawler and a collector are used to expand the dataset to 11 dimensions. The data is then checked for missing values, which are filled by interpolation. Next, outliers are detected with the Local Outlier Factor (LOF) algorithm, and the data is cleaned by using a high-dimensional mapping to identify outlying records. Text data, such as the brand, is encoded, and logical data, such as the number of sails, is represented by Boolean values. Regional factors are replaced directly by four economic indicators for the region.
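
The two cleaning steps named above (interpolating missing values, then flagging outliers with LOF) can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only the Python standard library, and the function names (`lagrange_interpolate`, `fill_missing`, `lof_scores`) and the parameters `window` and `k` are hypothetical. The choice of Lagrange interpolation follows the paper's keywords. The sketch assumes that the rows are ordered along a meaningful axis, so that the row index can serve as the interpolation variable, and that no two points are identical.

```python
import math


def lagrange_interpolate(xs, ys, x):
    """Evaluate the Lagrange polynomial through the points (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def fill_missing(values, window=2):
    """Fill None entries using up to `window` known neighbours on each side.

    Assumption: the list index is a meaningful interpolation variable.
    """
    filled = list(values)
    for idx, value in enumerate(values):
        if value is None:
            left = [i for i in range(idx - 1, -1, -1) if values[i] is not None][:window]
            right = [i for i in range(idx + 1, len(values)) if values[i] is not None][:window]
            known = sorted(left + right)
            filled[idx] = lagrange_interpolate(known, [values[i] for i in known], idx)
    return filled


def lof_scores(points, k=2):
    """Return the Local Outlier Factor of each point (assumes no duplicate points)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Indices of the k nearest neighbours of each point, excluding the point itself.
    neighbours = [
        sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
        for i in range(n)
    ]
    # k-distance: distance from each point to its k-th nearest neighbour.
    k_dist = [dist[i][neighbours[i][-1]] for i in range(n)]
    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], dist[i][j]) for j in neighbours[i]]
        lrd.append(len(reach) / sum(reach))
    # LOF: mean density of the neighbours relative to the point's own density.
    return [
        sum(lrd[j] for j in neighbours[i]) / (len(neighbours[i]) * lrd[i])
        for i in range(n)
    ]


# Hypothetical toy data: one missing value, and four clustered points plus one far away.
prices = fill_missing([1.0, 4.0, None, 16.0, 25.0])
scores = lof_scores([(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)], k=2)
```

Scores close to 1 indicate a point whose local density matches that of its neighbours, while scores well above 1 mark candidates for removal. The cut-off is a modelling choice that the abstract does not specify.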