Research Article
Using a Distributed Search Engine to Identify Optimal Product Sets for Use in an Outbreak Detection System
@INPROCEEDINGS{10.4108/icst.collaboratecom.2012.250728,
  author={Ruhsary Rexit and Fuchiang (Rich) Tsui and Jeremy Espino and Sahawut Wesaratchakit and Ye Ye and Panos Chrysanthis},
  title={Using a Distributed Search Engine to Identify Optimal Product Sets for Use in an Outbreak Detection System},
  proceedings={International Workshop on Collaborative Big Data},
  publisher={IEEE},
  proceedings_a={C-BIG},
  year={2012},
  month={12},
  keywords={distributed search syndromic surveillance outbreak detection time series analysis},
  doi={10.4108/icst.collaboratecom.2012.250728}
}
Ruhsary Rexit
Fuchiang (Rich) Tsui
Jeremy Espino
Sahawut Wesaratchakit
Ye Ye
Panos Chrysanthis
Year: 2012
C-BIG
ICST
DOI: 10.4108/icst.collaboratecom.2012.250728
Abstract
This study tests an approach for identifying sets of over-the-counter (OTC) thermometer products whose aggregate sales correlate optimally with aggregate counts of emergency department (ED) visits in which patients have symptoms consistent with constitutional syndrome, such as fever and chills. We show that by using a distributed search engine alongside a search algorithm (Brute-force), we can quickly identify a minimum set of OTC thermometer products whose sales are optimally correlated with the ED data. We used the Pearson correlation coefficient to measure the degree of correlation between the OTC and ED time series. The optimal OTC product set, comprising 9 thermometer products found by the Brute-force algorithm, has a correlation coefficient of 0.96. We believe this approach can be used to efficiently identify different optimal OTC sets for detecting different types of disease outbreaks.
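The brute-force idea described in the abstract can be illustrated with a minimal in-memory sketch. It is not the authors' implementation: it leaves out the distributed search engine entirely, and the names `pearson`, `best_product_set`, `sales`, and `ed_counts` are hypothetical, as is the toy data. It assumes each product's sales and the ED counts are equal-length time series, sums the sales of every candidate subset, and keeps the smallest subset with the highest Pearson correlation against the ED series.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A constant series has no defined correlation; treat it as 0 here.
        return 0.0
    return cov / sqrt(var_x * var_y)


def best_product_set(sales, ed_counts):
    """Exhaustively search product subsets (hypothetical helper).

    sales: dict mapping a product name to its sales time series.
    ed_counts: ED visit counts over the same time steps.
    Returns (subset, r) for the best-correlated subset. Subsets are
    visited in order of increasing size and replaced only on a strict
    improvement, so ties resolve to the smallest set.
    """
    products = sorted(sales)
    best_set, best_r = (), float("-inf")
    for size in range(1, len(products) + 1):
        for subset in combinations(products, size):
            # Aggregate sales across the subset at each time step.
            aggregate = [sum(vals) for vals in zip(*(sales[p] for p in subset))]
            r = pearson(aggregate, ed_counts)
            if r > best_r + 1e-12:
                best_set, best_r = subset, r
    return best_set, best_r


# Toy data, invented purely for illustration.
sales = {"A": [1, 2, 3, 4], "B": [4, 3, 2, 1], "C": [0, 1, 0, 1]}
ed_counts = [2, 4, 6, 8]
best_set, best_r = best_product_set(sales, ed_counts)
print(best_set, round(best_r, 2))
```

The search visits all 2^n - 1 non-empty subsets of n products, so the cost doubles with each added product. That growth is presumably why the study pairs the search with a distributed search engine to keep it fast.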