
Research Article
Large-Scale Video Analytics Through Object-Level Consolidation
@INPROCEEDINGS{10.1007/978-3-031-06371-8_11,
  author={Daniel Rivas and Francesc Guim and Jord\`{a} Polo and Josep Ll. Berral and David Carrera},
  title={Large-Scale Video Analytics Through Object-Level Consolidation},
  booktitle={Science and Technologies for Smart Cities. 7th EAI International Conference, SmartCity360°, Virtual Event, December 2-4, 2021, Proceedings},
  series={SMARTCITY},
  publisher={Springer},
  year={2022},
  month={6},
  keywords={Video analytics; DNN inference; Background subtraction; Motion detection},
  doi={10.1007/978-3-031-06371-8_11}
}
Authors: Daniel Rivas, Francesc Guim, Jordà Polo, Josep Ll. Berral, David Carrera
Year: 2022
Venue: SMARTCITY
Publisher: Springer
DOI: 10.1007/978-3-031-06371-8_11
Abstract
As the number of installed cameras grows, so do the compute resources required to process and analyze the images they capture. Video analytics enables new use cases, such as smart cities or autonomous driving. At the same time, it urges service providers to install additional compute resources to cope with the demand, while strict latency requirements push compute towards the end of the network, forming a geographically distributed and heterogeneous set of compute locations that are shared and resource-constrained. Such a landscape (shared and distributed locations) forces us to design new techniques that can optimize and distribute work among all available locations and, ideally, make compute requirements grow sublinearly with respect to the number of cameras installed. In this paper, we present FoMO (Focus on Moving Objects). This method effectively optimizes multi-camera deployments by preprocessing images for scenes, filtering the empty regions out, and composing regions of interest from multiple cameras into a single image that serves as input for a pre-trained object detection model. Results show that overall system performance can be increased by 8x while accuracy improves by 40% as a by-product of the methodology, all using an off-the-shelf pre-trained model with no additional training or fine-tuning.
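The abstract's pipeline (detect motion per camera, discard empty scenes, and pack the remaining regions of interest into one composite inference input) can be sketched roughly as follows. This is a minimal NumPy-only illustration, not the paper's implementation: it assumes simple frame differencing against a static background for motion detection, a single bounding box per camera, and fixed-size tiles packed horizontally; the function names `motion_bbox` and `compose` are hypothetical.

```python
import numpy as np

def motion_bbox(frame, background, thresh=25):
    """Return the bounding box (x0, y0, x1, y1) around pixels that differ
    from the background by more than `thresh`, or None if the scene is
    empty (simplified stand-in for background subtraction)."""
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16)) > thresh
    ys, xs = np.nonzero(diff)
    if xs.size == 0:
        return None  # empty scene: this camera contributes nothing
    return xs.min(), ys.min(), xs.max() + 1, ys.max() + 1

def compose(frames, backgrounds, tile=64):
    """Crop the moving region of each camera, place it into a fixed-size
    tile, and pack the tiles side by side into a single image that a
    pre-trained detector can consume in one pass."""
    tiles = []
    for frame, bg in zip(frames, backgrounds):
        box = motion_bbox(frame, bg)
        if box is None:
            continue  # filter out cameras with no motion
        x0, y0, x1, y1 = box
        crop = frame[y0:y1, x0:x1]
        pad = np.zeros((tile, tile), dtype=frame.dtype)
        h, w = min(crop.shape[0], tile), min(crop.shape[1], tile)
        pad[:h, :w] = crop[:h, :w]
        tiles.append(pad)
    if not tiles:
        return None  # nothing moved anywhere: skip inference entirely
    return np.hstack(tiles)

# Two grayscale cameras sharing a static (all-zero) background:
bg = np.zeros((100, 100), dtype=np.uint8)
busy = bg.copy()
busy[10:20, 30:40] = 200   # a bright moving object
idle = bg.copy()           # an empty scene

composite = compose([busy, idle], [bg, bg])
```

Only the camera with motion contributes a tile, so the composite here is a single 64x64 tile; the detector then runs once on the composite instead of once per full camera frame, which is the source of the consolidation gains the abstract describes.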