
Research Article
Cross-Modal Transformer Framework for Emotion-Aligned Music Therapy using Indian Classical Raagas for Individuals with Autism Spectrum Disorder
@INPROCEEDINGS{10.4108/eai.28-4-2025.2357995,
  author={Sreeja Poduri and Lalit Kovvuri and Vamsi Uppalapati and Lavanya Addepalli and Vidya Sagar S D and Jaime Lloret},
  title={Cross-Modal Transformer Framework for Emotion-Aligned Music Therapy using Indian Classical Raagas for Individuals with Autism Spectrum Disorder},
  proceedings={Proceedings of the 4th International Conference on Information Technology, Civil Innovation, Science, and Management, ICITSM 2025, 28-29 April 2025, Tiruchengode, Tamil Nadu, India, Part II},
  publisher={EAI},
  proceedings_a={ICITSM PART II},
  year={2025},
  month={10},
  keywords={autism spectrum disorder (asd), music therapy, cross-modal learning, indian classical raaga, transformer models, emotion-aware recommendation},
  doi={10.4108/eai.28-4-2025.2357995}
}
Sreeja Poduri
Lalit Kovvuri
Vamsi Uppalapati
Lavanya Addepalli
Vidya Sagar S D
Jaime Lloret
Year: 2025
ICITSM PART II
EAI
DOI: 10.4108/eai.28-4-2025.2357995
Abstract
Autism Spectrum Disorder (ASD) is a complex condition that usually calls for non-verbal, sensory-sensitive, and personalised therapeutic interventions. In this paper, we introduce a novel AI-based framework, the NeuroMusical Cross-Modal Transformer (NM-XMT), which translates an autistic person's self-expression through music listening into AI-based recommendations of emotionally and contextually tailored Indian classical Raagas. The proposed system uses cross-modal deep learning to embed clinical behavioral features and musical semantics into a common affective space, so that therapy can be aligned precisely using arousal-valence modeling, affective sensory preferences, and contextual parameters such as time of day and age. The model was evaluated on simulated datasets against traditional and emotion-aware baselines. The results indicate that NM-XMT outperforms conventional recommender systems, achieving a therapeutic effectiveness score of 0.89, the highest among the compared models. Reinforcement-driven personalization and feedback loops also give the system high explainability and adaptability. These findings suggest that the model can serve as a culturally grounded, non-invasive digital therapy solution for the ASD community.
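
As a rough illustration of what matching in a common affective space could look like, the following Python sketch ranks placeholder items by their distance to a target state in a two-dimensional valence-arousal plane. It is not the NM-XMT model: the item names, their coordinates, the target state, and the function `rank_by_affective_distance` are all hypothetical, and a plain Euclidean distance stands in for the learned cross-modal embedding described in the abstract.

```python
import math

# Hypothetical placeholder items with made-up (valence, arousal)
# coordinates in [-1, 1]. These are NOT taken from the paper.
CANDIDATES = {
    "raaga_a": (0.6, -0.4),
    "raaga_b": (-0.2, 0.7),
    "raaga_c": (0.5, 0.5),
}


def rank_by_affective_distance(target, candidates):
    """Return candidate names sorted by Euclidean distance to target, closest first."""
    return sorted(candidates, key=lambda name: math.dist(target, candidates[name]))


# Hypothetical target: a calm, mildly positive state (valence 0.5, arousal -0.3).
target_state = (0.5, -0.3)
ranking = rank_by_affective_distance(target_state, CANDIDATES)
print(ranking)
```

In a learned system the two coordinates would be replaced by embeddings produced from behavioral and musical features, and contextual parameters such as time of day and age could enter as additional inputs to the ranking; the nearest-neighbour idea stays the same.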