Fusion of Sound Source Localization and Face Detection for Supporting Human Behavior Analysis

Markus Niiranen; Janne Vehkaperä; Satu-Marja Mäkelä; Johannes Peltola; Tomi Räty

4th International ICST Mobile Multimedia Communications Conference

Research Article


Cite (BibTeX):

@INPROCEEDINGS{10.4108/ICST.MOBIMEDIA2008.4071,
    author={Markus Niiranen and Janne Vehkaper\"{a} and Satu-Marja M\"{a}kel\"{a} and Johannes Peltola and Tomi R\"{a}ty},
    title={Fusion of Sound Source Localization and Face Detection for Supporting Human Behavior Analysis},
    proceedings={4th International ICST Mobile Multimedia Communications Conference},
    publisher={ICST},
    proceedings_a={MOBIMEDIA},
    year={2010},
    month={5},
    keywords={Audio localization, audio detection, microphone arrays, face detection},
    doi={10.4108/ICST.MOBIMEDIA2008.4071}
}


Markus Niiranen¹,*, Janne Vehkaperä¹,*, Satu-Marja Mäkelä¹,*, Johannes Peltola¹,*, Tomi Räty¹,*

1: VTT Technical Research Centre of Finland, Kaitoväylä 1, 90571, Oulu, Finland.

*Contact email: markus.niiranen@vtt.fi, janne.vehkapera@vtt.fi, satu-marja.makela@vtt.fi, johannes.peltola@vtt.fi, tomi.raty@vtt.fi

Abstract

This paper describes a demonstrated concept implementation that combines sound source localization with face detection from a video stream to support human behavior analysis. The system monitors a space containing multiple persons using a microphone array and a video camera. The aim is to detect which person in the scene is producing the sound received by the microphones. For this task, the microphone array localizes the sound in the environment. Simultaneously, face detection is performed on the video signal produced by the monitoring camera. If a face is detected at the bearing of the sound, the system may decide that the sound is produced by the person whose face is detected. Preliminary results indicate that the fusion may give useful information for human behavior analysis in spaces containing multiple persons.
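
The fusion step outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the pinhole-camera mapping from pixel column to bearing, the 60 degree horizontal field of view, and the 10 degree tolerance are all assumptions made for illustration. It further assumes that the microphone array and the camera are co-located and share the same zero bearing, and that face bounding boxes and a sound azimuth are already available from separate detectors.

```python
import math
from typing import Optional, Sequence, Tuple

# Hypothetical face box format: (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]


def face_azimuth_deg(box: Box, image_width: int, hfov_deg: float) -> float:
    """Horizontal bearing of a face centre relative to the optical axis.

    Assumes an ideal pinhole camera with horizontal field of view hfov_deg.
    Negative values are left of the image centre, positive values right.
    """
    x, _, w, _ = box
    center_x = x + w / 2.0
    focal_px = (image_width / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)
    return math.degrees(math.atan((center_x - image_width / 2.0) / focal_px))


def match_speaker(
    sound_azimuth_deg: float,
    faces: Sequence[Box],
    image_width: int,
    hfov_deg: float = 60.0,       # assumed camera field of view
    tolerance_deg: float = 10.0,  # assumed maximum bearing mismatch
) -> Optional[int]:
    """Return the index of the face closest to the sound bearing.

    Returns None when no face lies within tolerance_deg of the sound.
    """
    best_index = None
    best_error = tolerance_deg
    for i, box in enumerate(faces):
        error = abs(face_azimuth_deg(box, image_width, hfov_deg) - sound_azimuth_deg)
        if error <= best_error:
            best_index, best_error = i, error
    return best_index


# Two hypothetical faces in a 640-pixel-wide frame.
faces = [(100, 50, 40, 40), (500, 60, 40, 40)]
speaker = match_speaker(18.0, faces, image_width=640)
```

In this sketch a sound arriving from a bearing with no face nearby yields `None`, which corresponds to the system declining to attribute the sound to any detected person.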

Keywords: Audio localization, audio detection, microphone arrays, face detection

Published: 2010-05-16
Publisher: ICST

DOI: http://dx.doi.org/10.4108/ICST.MOBIMEDIA2008.4071