
Research Article
Enhanced Sound Recognition and Classification Through Spectrogram Analysis, MEMS Sensors, and PyTorch: A Comprehensive Approach
@INPROCEEDINGS{10.1007/978-3-031-54521-4_1,
  author={Alexandros Spournias and Nikolaos Nanos and Evanthia Faliagka and Christos Antonopoulos and Nikolaos Voros and Giorgos Keramidas},
  title={Enhanced Sound Recognition and Classification Through Spectrogram Analysis, MEMS Sensors, and PyTorch: A Comprehensive Approach},
  proceedings={Collaborative Computing: Networking, Applications and Worksharing. 19th EAI International Conference, CollaborateCom 2023, Corfu Island, Greece, October 4-6, 2023, Proceedings, Part I},
  proceedings_a={COLLABORATECOM},
  year={2024},
  month={2},
  keywords={Sound recognition; Machine Learning; PyTorch; Environmental Monitoring; Spectrogram; Sound Analysis},
  doi={10.1007/978-3-031-54521-4_1}
}
Alexandros Spournias
Nikolaos Nanos
Evanthia Faliagka
Christos Antonopoulos
Nikolaos Voros
Giorgos Keramidas
Year: 2024
COLLABORATECOM
Springer
DOI: 10.1007/978-3-031-54521-4_1
Abstract
The growing importance of sound recognition and classification systems across many fields has led researchers to seek innovative methods for building them. In this paper, the authors propose a concise yet effective approach to sound recognition and classification that combines spectrogram analysis, Micro-Electro-Mechanical Systems (MEMS) sensors, and the PyTorch deep learning framework. The method exploits the rich information contained in audio signals to build a robust and accurate sound recognition and classification system.
The authors outline a three-stage process: data acquisition, feature extraction, and classification. MEMS sensors are employed for data acquisition, offering reduced noise, low power consumption, and higher sensitivity than traditional microphones. The acquired audio signals are then preprocessed and converted into spectrograms, which visually represent the frequency, amplitude, and temporal attributes of the audio data.
During feature extraction, the spectrograms are analyzed to extract features conducive to sound recognition and classification. Classification is performed by a custom deep learning model built in PyTorch, leveraging the pattern-recognition capabilities of modern neural networks. The model is trained and validated on a diverse dataset of audio samples, ensuring its proficiency in recognizing and classifying a range of sound types.
The experimental results demonstrate the effectiveness of the proposed method, which surpasses existing techniques in sound recognition and classification performance. By integrating spectrogram analysis, MEMS sensors, and PyTorch, the authors present a compact yet powerful sound recognition system with potential applications in domains such as predictive maintenance, environmental monitoring, and personalized voice-controlled devices.