Research Article
Accessing Speech Documents on Smartphones
@INPROCEEDINGS{10.4108/ICST.MOBIQUITOUS2008.3635,
  author={Marcel-Cătălin Rosu},
  title={Accessing Speech Documents on Smartphones},
  proceedings={5th International ICST Conference on Mobile and Ubiquitous Systems: Computing, Networking and Services},
  publisher={ICST},
  proceedings_a={MOBIQUITOUS},
  year={2010},
  month={5},
  keywords={Speech archive search smartphone Lucene},
  doi={10.4108/ICST.MOBIQUITOUS2008.3635}
}
- Marcel-Cătălin Rosu
Year: 2010
MOBIQUITOUS
ICST
DOI: 10.4108/ICST.MOBIQUITOUS2008.3635
Abstract
This paper introduces BBSearch, an experimental system for exploring the challenges of ubiquitous access to recorded speech data. BBSearch applies information retrieval (IR) techniques to transcripts obtained by automatic speech recognition and aims to provide a uniform user experience across platforms. To provide identical search functionality and document ranking, all BBSearch applications use the same IR library, Apache Lucene, for indexing and retrieval. For Java-enabled mobile platforms, BBSearch uses our J2ME port of Lucene, called LuceneME. This paper explores the resource requirements of LuceneME when used for Boolean searches and for supporting the podcast navigation GUI. On a BlackBerry smartphone, a diverse set of queries against a 70-hour corpus complete in less than 3 seconds and use less than 2 MB of memory. The evaluation results validate our design and warrant extending BBSearch to less capable cellphones, larger corpora, or more complex search capabilities.