6th International ICST Conference on Body Area Networks

Research Article

How’s my Mood and Stress? An Efficient Speech Analysis Library for Unobtrusive Monitoring on Mobile Phones

@INPROCEEDINGS{10.4108/icst.bodynets.2011.247079,
    author={Keng-hao Chang and Drew Fisher and John Canny and Bjoern Hartmann},
    title={How’s my Mood and Stress? An Efficient Speech Analysis Library for Unobtrusive Monitoring on Mobile Phones},
    proceedings={6th International ICST Conference on Body Area Networks},
    publisher={ICST},
    proceedings_a={BODYNETS},
    year={2012},
    month={6},
    keywords={health care, mental health monitor, mobile phones, voice analysis, toolkit},
    doi={10.4108/icst.bodynets.2011.247079}
}
    
Keng-hao Chang¹,*, Drew Fisher¹, John Canny¹, Bjoern Hartmann¹
¹ Computer Science Division, University of California at Berkeley
* Contact email: kenghao@cs.berkeley.edu

Abstract

The human voice encodes a wealth of information about emotion, mood, stress, and mental state. With mobile phones, one of the most widely used devices in body area networks, this information is potentially available to a host of applications and can enable richer, more appropriate, and more satisfying human-computer interaction. In this paper we describe the AMMON (Affective and Mental health MONitor) library, a low-footprint C library designed to run on widely available phones and enable such applications. The library incorporates both core features for emotion recognition (from the Interspeech 2009 Emotion Recognition Challenge) and the most important features for mental health analysis (glottal timing features). To run comfortably on feature phones (the most widely used class of phones today), we implemented the routines in fixed-point arithmetic and minimized the computational and memory footprint. On identical test data, emotion and stress classification accuracy was indistinguishable from that of a state-of-the-art reference system running on a PC, achieving 75% accuracy on two-class emotion classification tasks and 84% accuracy on binary classification of stressed and neutral speech. The library uses 30% of real time on a 1 GHz processor during emotion recognition and 70% during stress and mental health analysis.
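
The abstract notes that the routines were implemented in fixed-point arithmetic so that they run comfortably on feature phones, which typically lack floating-point hardware. As a minimal sketch of that general technique (not AMMON’s actual API; the names q15_t, q15_mul, and frame_energy below are hypothetical), a Q15 saturating multiply and a simple frame-energy accumulator of the kind a speech front end needs might look like this in C:

    #include <stdint.h>
    #include <stddef.h>
    #include <stdio.h>

    /* Q15 fixed point: values in [-1, 1) stored in an int16_t,
       scaled by 2^15. Arithmetic right shift is assumed. */
    typedef int16_t q15_t;

    /* Multiply two Q15 values with rounding and saturation. */
    static q15_t q15_mul(q15_t a, q15_t b)
    {
        int32_t p = (int32_t)a * (int32_t)b;  /* Q30 product */
        p = (p + (1 << 14)) >> 15;            /* round, rescale to Q15 */
        if (p >  32767) p =  32767;           /* saturate on overflow  */
        if (p < -32768) p = -32768;
        return (q15_t)p;
    }

    /* Mean energy of one audio frame, accumulated in 64 bits to
       avoid overflow; a building block for features such as RMS
       or log-energy in a speech analysis front end. */
    static int32_t frame_energy(const q15_t *x, size_t n)
    {
        int64_t acc = 0;
        if (n == 0)
            return 0;
        for (size_t i = 0; i < n; i++)
            acc += (int32_t)x[i] * (int32_t)x[i];  /* each term is Q30 */
        return (int32_t)(acc / (int64_t)n);        /* mean energy, Q30 */
    }

    int main(void)
    {
        q15_t half = 1 << 14;                 /* 0.5 in Q15           */
        q15_t quarter = q15_mul(half, half);  /* expect 8192 = 0.25   */
        q15_t frame[4] = { half, half, half, half };
        printf("0.5 * 0.5 = %d (Q15)\n", quarter);
        printf("energy    = %ld (Q30)\n", (long)frame_energy(frame, 4));
        return 0;
    }

Keeping every intermediate product in 32- or 64-bit integer accumulators, as above, trades a small amount of precision for integer-only execution, which is consistent with the modest computational footprint the abstract reports on a 1 GHz processor.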