
Research Article
Applied Analysis of Probability Theory and Mathematical Statistics in Data Mining
@INPROCEEDINGS{10.1007/978-3-031-63130-6_12,
  author={Jidong Zhao and Xiaoxuan Gong and Cheng Zhenhua},
  title={Applied Analysis of Probability Theory and Mathematical Statistics in Data Mining},
  proceedings={Application of Big Data, Blockchain, and Internet of Things for Education Informatization. Third EAI International Conference, BigIoT-EDU 2023, August 29-31, 2023, Liuzhou, China, Proceedings, Part I},
  proceedings_a={BIGIOT-EDU},
  year={2024},
  month={7},
  keywords={data mining probability theory mathematical statistics application analysis},
  doi={10.1007/978-3-031-63130-6_12}
}
Jidong Zhao
Xiaoxuan Gong
Cheng Zhenhua
Year: 2024
BIGIOT-EDU
Springer
DOI: 10.1007/978-3-031-63130-6_12
Abstract
With the improvement of the national scientific and technological level, China has entered the era of big data. To develop well in this era and to raise the domestic socio-economic level rapidly, we need to find general patterns in massive, complex, and weakly correlated data, which requires data mining together with probability theory and mathematical statistics. Probability theory is the branch of mathematics that studies probability, random variables, and random functions; it has been applied in many fields, including engineering, computer science, and economics. Data mining summarizes and analyzes large volumes of data, while probability theory and mathematical statistics provide a more detailed analysis of the data on that basis. The two complement and reinforce each other. A probability distribution is a function that assigns a probability to each outcome or event of an experiment or observation. Applying probability theory and mathematical statistics to data mining can improve its accuracy and efficiency, so improving the quality of data mining requires applying statistical methods effectively. On this basis, this paper examines the characteristics of data mining, probability theory, and mathematical statistics and the connections among them, and discusses the specific application of statistics in data mining in combination with specific algorithms.
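The abstract's definition of a probability distribution can be illustrated with a small sketch. The Python snippet below is not taken from the paper; it is only a minimal example, under the assumption that outcomes are discrete labels, of estimating an empirical distribution from observed data. The function name `empirical_distribution` and the sample values are hypothetical.

```python
from collections import Counter
from fractions import Fraction


def empirical_distribution(observations):
    """Map each observed outcome to its relative frequency.

    Illustrative helper (not from the paper): the relative frequency of an
    outcome is used as an estimate of its probability.
    """
    counts = Counter(observations)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one observation")
    # Fractions keep the probabilities exact, so they sum to exactly 1.
    return {outcome: Fraction(n, total) for outcome, n in counts.items()}


# Hypothetical sample: one item recorded from each of six transactions.
sample = ["milk", "bread", "milk", "eggs", "milk", "bread"]
dist = empirical_distribution(sample)

for outcome, p in sorted(dist.items()):
    print(f"P({outcome}) = {p}")
```

Relative frequencies of this kind are a common starting point for frequency-based mining steps such as computing the support of an item, although the paper's own algorithms may differ.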