ew 19(22): e1

Research Article

Development of the video stream object detection algorithm (VSODA) with tracking

Journal: EAI Endorsed Transactions on Energy Web
Year: 2019
DOI: 10.4108/eai.22-1-2019.156385
Keywords: Computer vision, deep learning, machine learning, pattern recognition, mobile robotics, object tracking, video analysis
A.Y. Zarnitsyn1,*, A.S. Volkov1, A.A. Voycehovsky1, B.I. Pyakillya1
  • 1: Tomsk Polytechnic University
*Contact email: ayz10@tpu.ru


Object tracking is one of the most important tasks in video analysis. Many methods have been proposed, such as TLD (Tracking, Learning, Detection), Meanshift and MIL, but they show good accuracy in laboratory cases rather than in real ones, where accuracy is defined as the numerical difference between the computed object coordinates and the real ones. One of the reasons is a lack of information about the tracked object and about environment changes. If a method has prior information about the tracked object, it can perform with higher accuracy. Some of the newest object tracking methods, such as GOTURN, use a trained CNN (convolutional neural network) and achieve better accuracy because they know what the tracked object looks like in different situations, such as changes in light intensity and rotations of the tracked object. If only a classification algorithm (classifier) is used, it can find, with high probability, an object that was in the training set. But if the object's appearance changes, the object will be lost once the deviation exceeds the trust limit. It is therefore important to have both prior and posterior information about the tracked object. The prior information is given by the detector (CNN) and the posterior information by the tracking algorithm (TLD). One of the biggest problems with the detector is its high computational complexity in terms of the number of operations, and one solution is to use the classifier in parallel with the tracker. In future work we are going to use different sensors, not only an RGB camera but also an RGBD camera, which may improve accuracy due to the larger amount of information.
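
The division of labour between detector and tracker can be illustrated with a minimal sketch. It is not the authors' implementation: the names `detect`, `track_step`, `detect_every` and `center_error` are hypothetical, the detector and tracker are passed in as plain callables rather than a real CNN or TLD, and the expensive detector is simply called every few frames in sequence as a stand-in for running it in parallel with the tracker. The error measure is one possible reading of "numerical difference between computed and real coordinates", namely the distance between box centres.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple

# A bounding box as (x, y, width, height); this layout is an assumption.
Box = Tuple[float, float, float, float]


def center_error(predicted: Box, truth: Box) -> float:
    """Euclidean distance between the centres of two boxes."""
    px, py = predicted[0] + predicted[2] / 2, predicted[1] + predicted[3] / 2
    tx, ty = truth[0] + truth[2] / 2, truth[1] + truth[3] / 2
    return math.hypot(px - tx, py - ty)


def track_with_detector(
    frames: Sequence[object],
    detect: Callable[[object], Optional[Box]],             # hypothetical detector (prior information)
    track_step: Callable[[object, Box], Optional[Box]],    # hypothetical tracker (posterior information)
    detect_every: int = 10,                                # assumed re-detection period
) -> List[Optional[Box]]:
    """Return one box (or None) per frame.

    The detector is called when there is no current box (start-up or
    tracking loss) and on every `detect_every`-th frame; the cheaper
    tracker handles all other frames.
    """
    box: Optional[Box] = None
    results: List[Optional[Box]] = []
    for index, frame in enumerate(frames):
        if box is None or index % detect_every == 0:
            detected = detect(frame)
            if detected is not None:
                box = detected                   # trust the detector when it fires
            elif box is not None:
                box = track_step(frame, box)     # detector missed: fall back to tracker
        else:
            box = track_step(frame, box)         # may return None if the object is lost
        results.append(box)
    return results
```

With a real detector and tracker plugged in, `center_error` could be averaged over the frames for which ground-truth boxes exist; the re-detection period then trades computational cost against how quickly a lost object is recovered.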