
Research Article
Study of Dimensionality Reduction and Clustering Machine Learning Algorithms for the Analysis of Ship Engine Data
@INPROCEEDINGS{10.1007/978-3-031-58053-6_6,
    author={Theodoros Dimitriou and Emmanouil Skondras and Christos Hitiris and Cleopatra Gkola and Ioannis S. Papapanagiotou and Dimitrios J. Vergados and Georgia Fasoula and Stratos Koumantakis and Angelos Michalas and Dimitrios D. Vergados},
    title={Study of Dimensionality Reduction and Clustering Machine Learning Algorithms for the Analysis of Ship Engine Data},
    proceedings={Wireless Internet. 16th EAI International Conference, WiCON 2023, Athens, Greece, December 15-16, 2023, Proceedings},
    proceedings_a={WICON},
    year={2024},
    month={5},
    keywords={Unsupervised Machine Learning, Dimensionality Reduction, Data Clustering, Ship Engine Data},
    doi={10.1007/978-3-031-58053-6_6}
}
Theodoros Dimitriou
Emmanouil Skondras
Christos Hitiris
Cleopatra Gkola
Ioannis S. Papapanagiotou
Dimitrios J. Vergados
Georgia Fasoula
Stratos Koumantakis
Angelos Michalas
Dimitrios D. Vergados
Year: 2024
WICON
Springer
DOI: 10.1007/978-3-031-58053-6_6
Abstract
Machine Learning (ML) is being successfully applied to ship engine management, with proven economic and environmental benefits through engine performance optimization, timely fault detection, and appropriate service planning. Preparing the data before it is used in ML algorithms provides further advantages, including faster training and improved algorithm performance, improved visualization of the dataset, noise reduction, dataset simplification, avoidance of the curse of dimensionality, and improved resource utilization. In this paper, two key ML techniques that can be applied to the preparation and organization of ship engine data are studied, namely dimensionality reduction and data clustering. Dimensionality reduction reduces the number of input variables, or features, in a dataset while retaining as much valuable information as possible. Clustering techniques, in turn, help to uncover insights and reduce data complexity by organizing the data into clusters. Evaluation results demonstrate the usefulness of both techniques.
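As an illustration of the two techniques named in the abstract, the following minimal sketch reduces a dataset to two dimensions and then clusters the result. The abstract does not say which algorithms the paper evaluates, so PCA (computed via SVD) and k-means (Lloyd's algorithm) are assumed here only as common representatives of each family. The data are synthetic, and the helper `make_synthetic_engine_data`, the eight "sensor channels", and the two "operating regimes" are hypothetical stand-ins, not the authors' dataset or method.

```python
import numpy as np


def make_synthetic_engine_data(n_per_regime=100, n_features=8, seed=0):
    """Hypothetical data: two operating regimes observed through 8 channels."""
    rng = np.random.default_rng(seed)
    # Two regimes, well separated in an assumed 2-D latent space.
    latent_a = rng.normal(loc=[-5.0, 0.0], scale=0.5, size=(n_per_regime, 2))
    latent_b = rng.normal(loc=[5.0, 0.0], scale=0.5, size=(n_per_regime, 2))
    latent = np.vstack([latent_a, latent_b])
    # Random linear mixing into the observed channels, plus small noise.
    mixing = rng.normal(size=(2, n_features))
    noise = rng.normal(scale=0.05, size=(2 * n_per_regime, n_features))
    return latent @ mixing + noise


def pca_reduce(X, n_components):
    """Project X onto its leading principal components (PCA via SVD)."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    explained = (S ** 2) / np.sum(S ** 2)
    return X_centered @ Vt[:n_components].T, explained[:n_components]


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every point to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids; keep the old one if a cluster is empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids


X = make_synthetic_engine_data()
X_reduced, explained = pca_reduce(X, n_components=2)
labels, centroids = kmeans(X_reduced, k=2)

print("reduced shape:", X_reduced.shape)
print("fraction of variance kept:", round(float(explained.sum()), 4))
print("cluster sizes:", np.bincount(labels, minlength=2))
```

With real engine measurements, the channels would usually be standardized before PCA, and the number of components and clusters would be chosen from the data (for example from the explained-variance curve) rather than fixed in advance as they are here.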