EAI Endorsed Transactions on Context-aware Systems and Applications

A Combination of Off-line and On-line Learning to Classifier Grids for Object Detection

We propose a new method for object detection in intelligent surveillance systems that combines off-line and on-line boosting for classifier grids, based on visual information and without human intervention. It allows labeled and unlabeled information from different contexts, which is not available at off-line training time, to be combined to update the system. The main goal is to develop an adaptive but robust system that combines prior knowledge with new information in an unsupervised learning framework that learns 24 hours a day, 7 days a week. We use a co-training strategy that combines off-line and on-line learning for the classifier grids. The proposed method is practical because it meets the requirements of real-time performance, accuracy and robustness. It works well with a reasonable number of training samples and is computationally efficient. Experiments on object detection in challenging data sets show that our approach outperforms the compared methods.


Introduction
The number of surveillance cameras is increasing, and these cameras call for automated monitoring systems. Research on detecting objects from static cameras emphasizes real-world applications in which context information plays an important role. A robust machine learning system for object detection can be applied in many fields, for example intelligent surveillance systems, human-computer interaction, information safety, machine vision and data mining. One of the main challenges is how a learning system can incorporate unlabeled information from different contexts to update its classifiers while preserving detection accuracy over long periods. This is an important requirement for a system that detects objects in real time: it helps select better information for updating the system, limits changes and reduces the size of the training set. In many applications the object detection problem is much simpler, for example a monitoring system with a still camera that observes one type of object in the same scene 24/7. Our approach is a machine learning system that has access to a large data stream and uses different contexts and constantly updated information to build scene-specific classifiers that detect objects of one class in any context [22]. The visual data stream continuously supplies a large amount of unlabeled data, which must be exploited to improve both the detection results and the detection speed. The unlabeled data stream is analyzed, and scene-specific positive and negative samples are collected to continuously update the classifiers. One simple way to benefit from a static camera is to combine new unlabeled information about the specific scene with labeled data to improve the current classifier. However, such information only helps to reduce the number of false detections [10]. To increase the detection rate, adaptive on-line learning methods are applied [18]. Such methods focus on solving the object detection task in a specific scene and are well suited to continuous data streams. In practice, this approach uses context in both the training process and the post-processing. Thus, on-line unsupervised learning methods are often used to continuously adapt a model. The main problem is how to update with new unlabeled data while limiting wrong updates to the classifier, in other words, how to reduce drifting.

Several prominent works apply the sliding window technique [6,7,8,16]. The classifier performs an exhaustive search from top to bottom and from left to right, testing each image region at each scale. For each region, the classifier estimates whether it contains the object to be detected, and matching regions are marked in the result image. Typically, the goal of this approach is a classifier that is trained for a specific object class and can be applied to detect that object in different environments [7,8,12]. However, a classifier trained on a large number of samples for the general object detection problem often fails in specific situations, and the detector then has low accuracy. Using fixed cameras, a reasonable constraint for most applications, restricts the problem to a particular scene and may help reduce the number of false detections [10]. In fact, a classifier trained with just a few training samples for a specific problem often yields efficient and accurate object detection. Traditional object detection methods based on the sliding window search use a single classifier to find objects over the whole image [1,2,3,4,5]. Training such a classifier with off-line methods encounters the following problems.

First, to build the classifier with traditional off-line methods (such as support vector machines, neural networks or boosting), a set of training samples must be collected and prepared. This data set can be very large, up to several thousand samples depending on the specific application. The training set is usually prepared by hand, which takes much time and effort. Moreover, a classifier trained on a prepared sample set may not perform well when applied in a new context. For the system to perform effectively, additional models must be trained to adapt to new objects in the scene. Hence a new approach to training the classifier is needed, namely on-line learning. Second, after training is complete, object detection must search exhaustively over all positions and object sizes. Since detection is performed not only on the current image but also on consecutive frames, the complexity of the detection problem increases. Third, to extract features for learning, a system normally applies one specific method across the whole sample set. A given feature extraction method may suit some samples but not others, which means that the best features (such as geometry, texture, color or shape features) are not obtained for each sample. In addition, for the system to be usable in real-time applications, fast computation of the selected features is also required. Research can therefore draw on a variety of feature extraction methods. How to choose the most suitable features for each sample during training is one of the concerns of our research.
To overcome these limitations, we use several feature types in this research: Haar wavelets, local binary patterns and histograms of oriented gradients. To reduce the complexity of the exhaustive search for object detection, we use a classifier grid model in which each classifier has only a few parameters and searches for the object within a smaller region. The classifiers are updated and applied for detection simultaneously, which allows them to adapt to the current context. Additionally, we compute features quickly by means of the integral image. We propose to use a co-training approach in combination with a novel robust on-line learner. The robust on-line learner keeps two separate models for the positive class as well as two separate models for the negative class. For both the positive and the negative class, one model is trained off-line and kept fixed during run-time, while the other is adapted over time. This combination of an off-line model with an on-line adapted model within a single learner allows scene-specific positive information to be incorporated (i.e., the recall can be increased) while still preserving accuracy. Our approach applies classifier grids to object detection from one or multiple cameras, combining unlabeled and labeled information from different contexts to collect samples of the background and of the object class over time and to update the system.

The rest of the paper is structured as follows. Section 2 reviews related work. Section 3 presents the combination of off-line and on-line learning for classifier grids. Section 4 gives the experimental evaluation and results. Finally, Section 5 concludes with an outline of research problems in this direction.
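The fast feature computation on the integral image mentioned in the introduction can be illustrated with a minimal sketch. It is not the implementation used in the paper; the function names and the two-rectangle Haar-like feature are assumptions chosen only to show why any rectangle sum costs four table lookups.

```python
def integral_image(img):
    """Summed-area table of a 2-D list, padded with a zero first row and column."""
    rows, cols = len(img), len(img[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += img[r][c]
            # Sum of everything above plus the running sum of the current row.
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def box_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle, using four lookups in the table."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def haar_two_rect_horizontal(ii, top, left, height, width):
    """Hypothetical two-rectangle Haar-like feature: left half minus right half.

    Assumes an even width; the feature layout is illustrative only.
    """
    half = width // 2
    left_sum = box_sum(ii, top, left, height, half)
    right_sum = box_sum(ii, top, left + half, height, half)
    return left_sum - right_sum
```

Because the table is built once per frame, the cost of evaluating a feature does not depend on the size of its rectangles, which is what makes many small grid classifiers affordable at run-time.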

Related Works
Recently, on-line learning algorithms have attracted much interest as a way to improve classifier performance and reduce the number of training samples [11,16,18]. These algorithms allow the classifier to be updated with the information needed for each specific context. Adapting to each new sample can significantly improve classifier performance, and it reduces the number of training samples required while still supporting incremental learning. Within this line of work, algorithms often use self-training [14,17], learn from labeled and unlabeled samples combined in a co-training framework [4,13], or generate their own training samples during learning in an unsupervised manner [16]. Semi-supervised learning methods typically combine the given (labeled) information with the exploration of unknown (unlabeled) information to build the classifier. Self-training commonly has difficulty determining the correct samples for updates, because it relies only on the feedback of the current classifier. This is critical because wrong updates degrade the performance of the classifier and may cause it to drift during training.

Object detection using classifier grids overcomes these disadvantages while still ensuring effective classification. In contrast to the sliding window technique, the main idea is that each classifier is responsible for only one region of limited size on the image grid. Hence the complexity of each classifier is reduced, and each classifier can be designed with fewer parameters and fewer weak classifiers. Each classifier is only responsible for detecting objects in its region by discriminating the object from the background. An on-line classifier can respond to changing environmental conditions, which further reduces the complexity required of the classifier. In general, the approach adapts very effectively, but it can suffer from lost or omitted information: a classifier whose performance has been degraded by wrong updates may only recover by adapting to new data and updating correctly. To avoid this problem, classifier grids [20] apply a fixed update strategy that uses a small set of positive samples together with negative samples that are generated automatically during training. Each image region has its own classifier, which is responsible for deciding whether or not the object appears in that region. This update strategy ensures long-term stability, because the classifiers do not drift, while giving good results. In addition, this approach requires no time for preparing data in advance. In fact, a classifier degraded by wrong samples during on-line updates can be restored with a certain number of samples in a short time. However, an object that stays at the same grid position for a long period may be learned as a negative example by the updates, so that the detection of this object is lost. In this research, we address the problem of missed objects in sequences of consecutive frames by combining current information with temporal information from previous frames, and we replace the purely fixed update strategy with an adaptive combination in which a classifier trained on carefully selected data serves as prior knowledge. The experimental results show the benefits of the proposed approach: it handles objects that do not move for a while and gives better results in terms of both accuracy and performance of the classifier grids.

Combination of off-line and on-line learning for classifier grids
In this section we describe how to combine off-line boosting for feature selection with on-line boosting for feature selection, so that prior information from labeled data can be combined with new information that is not available at off-line training time. The main purpose of learning based on classifier grids is to simplify the object detection problem, reduce the complexity of detection and avoid drifting. Moreover, classifier grid learning incorporates unlabeled information, which lets the classifiers adapt to a new environment. We present the key issues as well as the main idea of our approach to classifier grid learning.

Classifier grids
The main idea of classifier grids follows the divide-and-conquer principle. The image is divided into highly overlapping grid elements (regions), where each grid element corresponds to one simple classifier [9]. Using a separate classifier for each position within the image significantly simplifies the problem. A classifier grid is thus a set of simple classifiers, each of which only has to discriminate the object of interest from the background at its position. Classifier grids reduce the complexity of the task by training a separate classifier for each image position, and on-line learning increases the adaptivity of the classifiers. The classifiers are updated using fixed rules. We use on-line boosting for feature selection [15].

Algorithm 1: Modified off-line boosting for feature selection
Output: the final strong classifier
Method:
1. Initialize the weights.
2. for t = 1, 2, …, T do
3. For each feature j, train one weak classifier h_j : X → Y and measure its error with respect to D_t.
4. Select the best J weak classifiers to initialize selector t with proper features.
5. Choose the voting weight for iteration t.
6. Update the weight distribution.
7. end for
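The division of the image into highly overlapping grid elements, described at the start of this subsection, can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and it assumes that the overlap is measured along each axis between neighboring patches, which the text does not specify.

```python
def grid_positions(length, patch, overlap):
    """Start offsets of patches of size `patch` along one axis of size `length`.

    ASSUMPTION: `overlap` is the fraction of a patch shared with its
    neighbor along this axis (e.g. 0.9 for 90%).
    """
    step = max(1, round(patch * (1.0 - overlap)))
    return list(range(0, length - patch + 1, step))


def grid_elements(img_w, img_h, patch_w, patch_h, overlap):
    """Return one (x, y, w, h) region per grid element, row by row.

    Each region would be handled by its own compact classifier.
    """
    xs = grid_positions(img_w, patch_w, overlap)
    ys = grid_positions(img_h, patch_h, overlap)
    return [(x, y, patch_w, patch_h) for y in ys for x in xs]
```

With a high overlap the step between neighboring elements becomes small, so the number of grid classifiers grows quickly; this is why each individual classifier has to stay compact.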
The fixed update strategy is based on a fixed set of hand-labeled samples, used for the positive updates to describe the appearance of the object of interest, and on samples extracted on the fly from the scene to model the background. The positive updates are correct by definition, since they are taken from a finite set X+ of hand-labeled positive samples. Thus, for each feature in the full feature space, a generative model can be estimated. Because the positive updates are drawn from the fixed set X+, the positive distributions do not change over time and can be calculated off-line, since all information is given in advance. This allows these updates to be omitted during on-line scene adaptation and results in a fixed distribution for the object class (positive class). If this step is performed by our modified off-line boosting for feature selection algorithm (Algorithm 1), we obtain a classifier that consists only of features suitable for the task of interest, i.e., the features selected during the off-line training stage are well suited to describe the object class. Additionally, the number of updates is halved, since positive updates are no longer required once the distribution of the positive information has been calculated during off-line training. In order to adapt to changing environmental conditions, the negative distributions have to be updated continuously. Thus, the input images are used directly to perform updates of the corresponding grid elements. We assume that these updates are correct most of the time. Finally, in the particular case of classifier grids, the discriminative classifier can be estimated by combining the two generative models at feature level: the positive model, which can be calculated off-line, and the negative model, which has to be calculated on-line. This combination can be realized efficiently by using on-line boosting for feature selection. The overall idea is illustrated in Figure 1.

Since only negative updates are performed during the on-line stage of our classifier grids approach, the error on the positive samples stemming from off-line training is kept fixed, while the error on the negative samples is adapted continuously. Using solely the error calculated during on-line learning, only the error for negative samples, i.e., the false positive rate, can be estimated. However, the fixed distributions of the object class were estimated off-line, so we can use the combined error to select the best weak classifier within each selector. This linkage of off-line and on-line learning is ideally suited to the classifier grid approach, since prior information can be exploited in the off-line stage while information available at run-time can be considered for on-line adaptation. It allows solely negative updates to be performed for a classifier, whereas the positive representation is trained off-line in advance and kept fixed. These update strategies ensure long-term stability.
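The selection of the best weak classifier from a fixed positive error and an adapted negative error can be sketched as below. The class and field names are hypothetical, and the equal weighting of the two error terms and the smoothed starting counts are assumptions made for illustration; the paper does not state how the combined error is formed.

```python
from dataclasses import dataclass


@dataclass
class WeakClassifier:
    name: str
    fn_rate_offline: float   # error on positives, fixed after off-line training
    neg_wrong: float = 1.0   # weighted count of misclassified negative updates
    neg_total: float = 2.0   # weighted count of all negative updates (smoothed)

    def update_negative(self, predicted_positive, weight=1.0):
        """On-line negative update: only the negative statistics change."""
        self.neg_total += weight
        if predicted_positive:
            self.neg_wrong += weight

    @property
    def fp_rate_online(self):
        return self.neg_wrong / self.neg_total

    def combined_error(self):
        # ASSUMPTION: both classes are weighted equally.
        return 0.5 * (self.fn_rate_offline + self.fp_rate_online)


def select_best(selector):
    """Return the weak classifier with the lowest combined error."""
    return min(selector, key=lambda wc: wc.combined_error())
```

A weak classifier that looked best after off-line training can thus lose its place in the selector once it starts firing on the scene background, without any positive sample being touched at run-time.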

Combination of off-line and on-line learning to classifier grids
Initialization stage: An input image X is divided into highly overlapping grid elements, i.e., regions X_i (i = 1, ..., N), where each grid element i corresponds to one compact classifier G_i operating on image patch X_i. We thus have classifier grids G_i operating in regions X_i.

Detection stage: First, the classifier C is used as an oracle to generate new positive as well as negative sample updates for the classifiers within the classifier grids, based on the background-subtracted image. Positive sample updates are spread to all classifiers in the grid, whereas negative sample updates are performed only for a particular classifier in the grid. In combination with our robust on-line learning, this oracle can be replaced by the fixed update rules. In addition, the learning framework performs negative sample updates to a grid classifier G_i only if the scene is changing. A confident positive result of classifier C at position i in the grid generates an update for all grid classifiers G_i (i = 1, ..., N). Negative sample updates are performed for the grid classifier G_i if there is no corresponding detection result of classifier C at this position. For a new specific scene, positive samples are thus updated over the whole classifier grid. The update algorithm during the detection phase for a specific grid element i is presented in Algorithm 4 and illustrated in Figure 4.

Algorithm 4: Classifier Grids Update
Input: classifier grids G_i^{t-1}; classifier C^{t-1}; patch corresponding to grid element X_i; background-subtracted patch B_i
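The update rules described for the detection stage can be sketched as a small planning function. This is a hedged reading of the prose, not the listing of Algorithm 4: the function name, the confidence threshold and the per-element `scene_changed` flags are assumptions introduced for illustration.

```python
def plan_updates(oracle_conf, scene_changed, pos_threshold=0.5):
    """Decide which updates each grid classifier receives for one frame.

    oracle_conf   -- confidence of the oracle classifier C at each grid position
    scene_changed -- one boolean per grid element (hypothetical change test)
    pos_threshold -- ASSUMED confidence above which C counts as a detection

    Returns a dict: grid index -> list of ("pos", source_index) or ("neg", index).
    """
    n = len(oracle_conf)
    detections = [i for i, conf in enumerate(oracle_conf) if conf >= pos_threshold]
    updates = {i: [] for i in range(n)}

    # A confident detection of C is spread as a positive update to all grid classifiers.
    for src in detections:
        for i in range(n):
            updates[i].append(("pos", src))

    # A negative update is local: no detection at this position and a changing scene.
    for i in range(n):
        if i not in detections and scene_changed[i]:
            updates[i].append(("neg", i))
    return updates
```

The asymmetry is the point of the design: positive evidence is scarce and therefore shared across the whole grid, while background evidence is specific to one location.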

Experimental evaluation and results
We evaluate the classifier grids approach on different scenarios to demonstrate its benefits compared to generic state-of-the-art approaches. The aim of our experiments is to demonstrate the efficiency of linked off-line and on-line boosting for classifier grids and the robustness of our framework for object detection in surveillance systems. In the following, we first describe the data sets used, then present the training process, and finally report the performance of our system. To test the proposed system on person detection, we used two challenging, publicly available data sets. The first is the public PETS 2006 dataset, showing pedestrians at a train station. The second is the CAVIAR dataset. Besides the evaluations on two pedestrian data sets and one car data set, we perform a long-term experiment to demonstrate stability over time. For the detection experiments, we start with a randomly initialized classifier comprising 50 selectors and 20 weak classifiers. We use an off-line trained boosted classifier with a size of 50 x 20 weak classifiers, i.e., 50 selectors, each containing 20 weak classifiers for the feature selection process. To make the negative sample updates more reliable, we collected four background images covering overlapping activities in different time periods. These experiments show the benefits of the proposed approach. We use an overlap between the grid elements of 90%. For the classifier grids approach, we first perform updates with every frame during the first 30 frames. After this initial adaptation stage, we reduce the number of updates by using only every 10th frame, which improves run-time performance.
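The update schedule just described (every frame during the first 30 frames, then every 10th frame) can be written as a small predicate. The function name and the 0-based frame index are assumptions, as is the choice to count the stride from the end of the initial stage.

```python
def should_update(frame_index, warmup=30, stride=10):
    """Return True if the grid classifiers are updated on this frame.

    frame_index -- 0-based index of the frame in the sequence
    warmup      -- number of initial frames that are all used for updates
    stride      -- afterwards, only every `stride`-th frame is used
    """
    if frame_index < warmup:
        return True
    # ASSUMPTION: the stride is counted from the first frame after the warm-up.
    return (frame_index - warmup) % stride == 0
```

Over a long sequence this reduces the update cost to roughly one tenth of the per-frame cost after the initial stage.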

PETS Dataset
In this experiment, we compare the different approaches on a publicly available dataset for pedestrian detection, namely the PETS 2006 dataset. It contains left-luggage scenarios at a train station. We evaluate on a scenario in which one person leaves behind ski equipment. The sequence consists of 308 frames with a resolution of 720 x 576 pixels and contains 1265 pedestrians. We compare the classifier grids with other state-of-the-art methods, namely the deformable part model of Felzenszwalb et al. [7] and the histograms of oriented gradients approach of Dalal and Triggs [5]. Both methods use a fixed classifier trained off-line and are based on the sliding window technique. In addition, we compare our classifier grids approach with the approach of Roth et al. [20]. The results on the PETS dataset are shown in Figure 5, and the recall and precision at the best F-measure value are given in Table 1. Illustrative detection results are shown in Figure 6. For all classifier grids approaches, we set the overlap between the grid elements to 87%.
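For reference, the recall, precision and F-measure used in this comparison follow the standard definitions, and the operating point reported in the tables is the one with the best F-measure. A minimal sketch, with hypothetical function names and with detection counts per threshold assumed to be already available:

```python
def precision_recall_f(tp, fp, fn):
    """Precision, recall and F-measure from true/false positives and false negatives."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2.0 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


def best_operating_point(counts_by_threshold):
    """Pick the detection threshold with the highest F-measure.

    counts_by_threshold -- dict: threshold -> (tp, fp, fn)
    Returns (threshold, precision, recall, f_measure).
    """
    best = None
    for threshold, (tp, fp, fn) in counts_by_threshold.items():
        p, r, f = precision_recall_f(tp, fp, fn)
        if best is None or f > best[3]:
            best = (threshold, p, r, f)
    return best
```

Sweeping the threshold and plotting all (recall, precision) pairs gives the recall-precision curve; the function above only extracts the single point that the tables report.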

Long-term sequences
In the following, we demonstrate the long-term behavior of the proposed method. The main goal of classifier grids is to develop an adaptive but still robust system that learns 24 hours a day, 7 days a week. We captured a sequence of a corridor in our building at 1 fps over 7 days, resulting in 580,000 frames. To show that the performance is unchanged over time, we annotated 2,500 frames at four different points in time (which corresponds to approx. 45 minutes of video data). The sequences were selected on the first day, the third day, the sixth day and the last day. Every single frame was used to perform an update of the classifiers within the classifier grids.
The number of pedestrians visible and the number of updates performed before each sequence starts are summarized in Table 3 for all four sequences. The results are shown in Figure 9 and Table 4. It can be seen that the method is stable over time: the F-measure is unchanged, and the slight variations in the curves can be explained by the different levels of complexity of the four sequences (i.e., number of persons, density of persons, etc.). Finally, Figure 10 illustrates the significantly changing conditions (i.e., natural light, artificial lighting, inadequate lighting, etc.) that we had to deal with during these 7 days. These drastically changing conditions create the need for adaptive approaches such as the classifier grids approach.

Car detection
We evaluate the off-line/on-line linkage approach on a sequence showing one direction of a public highway. We refer to this dataset as the "highway dataset". The whole sequence consists of 1000 frames and contains a total of 1952 cars seen from the rear. The overlap between the grid elements was set to 92%. We compare our method with established methods, namely the Implicit Shape Model (ISM) of Leibe et al. [23] and the deformable part model detector (DPM-FS). Illustrative detection results obtained by the proposed approach are shown in Figure 12. The detection characteristics are summarized in Table 5. From the results shown in Figure 12, it can be seen that the proposed method clearly outperforms the generic car detectors (ISM and DPM-FS).

Conclusion
We have shown successful object detection in surveillance systems based on the idea of linking off-line and on-line learning for classifier grids, and we have demonstrated the efficiency of classifier grid detection within the proposed framework. The robustness of the detector against large variations among pedestrians has also been shown. The advantage of our method is its capability to incorporate unlabeled information for object detection from stationary cameras while preserving the robustness of the classifier grids. We use semi-supervised learning with a co-training approach for the classifier grids. The idea of classifier grids can be used to combine scene-specific information with background information. This combination allows large unlabeled data streams to be incorporated while preserving the reliable labeled information. Experiments demonstrated the performance of our system for object detection on several challenging data sets and on a real-world surveillance scenario in which a corridor in a public building is monitored 24 hours a day, 7 days a week. Although our approach is updated without supervision, the classifier grids preserve long-term stability and robustness. Through the update strategies and the combination of off-line and on-line learning, the classifier grids are modified over time to ensure scene adaptation of our object detection framework. Even though the updates are completely unsupervised, the long-term evaluations demonstrated that this approach is robust over time.

Figure 1. Combination of two generative models in one discriminative model at feature level by linking off-line and on-line learning in context.

Figure 2. Linking off-line and on-line learning: the off-line boosting for feature selection algorithm has to be adapted to select a set of J weak classifiers within each boosting iteration.
on Context-aware Systems and Applications 03 -05 2016 | Volume 3 | Issue 9 | e1 A Combination of Off-line and On-line Learning to Classifier Grids for Object Detection Nguyen Dang Binh 6

The classifier grids G_i operate in regions X_i, and one classifier C operates in a sliding-window manner on the background-subtracted image B. In our system, the classifier grid G (where G_i denotes classifier i in grid G) is trained with a co-training strategy. At the beginning, the grid classifiers G_i and the classifier C are initialized with the same off-line trained classifier. This off-line trained prior classifier has to capture generic knowledge from a small number of samples, which is then used to update the classifiers within a new scene. The grid classifiers G_i are updated using labels generated by a second, independent co-trained classifier C evaluated on the background-subtracted image. In other words, the grid classifiers G_i and the classifier C co-train each other: a confident classification of G_i is used to update classifier C at position i, and conversely, a confident classification of classifier C at position i generates an update for G_i. The classifier grid model is a combination of two generative models, one describing the background and one describing the object of interest, which are combined into a discriminative model at feature level by combining off-line and on-line boosting, as illustrated in Figure 3. The relations between classifier grids G and classifier C during updates on the background-subtracted image are summarized in Algorithm 3.

Algorithm 3: Classifier Grids Co-training
Input: classifier grids G_i^{t-1}; classifier C^{t-1}; patch corresponding to grid element X_i; background-subtracted patch B_i
Output: classifier grids G_i^t and classifier C^t
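The mutual labeling between a grid classifier G_i and the classifier C can be sketched as follows. The method steps of Algorithm 3 are not reproduced here; this is only an illustration of the prose description, and the function names, the signed confidence range and the threshold `theta` are assumptions.

```python
def confident_label(confidence, theta):
    """Map a signed confidence to +1, -1, or None when it is not confident enough."""
    if confidence >= theta:
        return 1
    if confidence <= -theta:
        return -1
    return None


def cotrain_labels(grid_conf, oracle_conf, theta=0.5):
    """One co-training step for grid position i.

    grid_conf   -- confidence of G_i on patch X_i (ASSUMED to lie in [-1, 1])
    oracle_conf -- confidence of C on the background-subtracted patch B_i
    theta       -- ASSUMED confidence threshold for issuing a label

    Returns (label_for_grid, label_for_oracle); None means "no update".
    """
    label_for_oracle = confident_label(grid_conf, theta)   # G_i teaches C
    label_for_grid = confident_label(oracle_conf, theta)   # C teaches G_i
    return label_for_grid, label_for_oracle
```

Each classifier is only updated with labels it did not produce itself, which is what limits the self-reinforcing errors of plain self-training. How conflicting confident labels are resolved is left open in this sketch.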

Figure 3. Classifier grids co-training: the classifier grids on the left side are co-trained with an independent classifier operating on the background-subtracted image on the right side.

Figure 4. The classifier C is used as an oracle to perform positive as well as negative updates on the classifiers within the classifier grids. Positive updates are spread to all classifiers in the grid, whereas negative updates are performed for one particular grid classifier.

Figure 6. Illustrative person detection results of the classifier grids on the PETS dataset.

Figure 9. Recall-precision curves (RPC) for the long-term experiment.

Figure 10. Illustrative detection results of the classifier grids obtained during the long-term experiment. Each row corresponds to one time of day: morning, noon, afternoon and night, respectively.

Figure 11. RPC for the Highway sequence. Detection characteristics of the Highway sequence for different methods, sorted by F-measure.

Figure 12. Illustrative detection results of the generic car detector for the Highway sequence.
On the estimated classifier grids, a pool of positive samples is used for updating the classifiers, and the negative samples are generated directly from the image patches corresponding to each grid element. A classifier within the classifier grids considers a simple problem: discriminating between the object of interest and the background at one specific location within the current image. This is illustrated in Figure 4. In contrast, with the sliding window technique a classifier has to evaluate a confidence value for different window scales at every position in the image. A classifier in the grid is responsible for one specific region within the image. Hence, the search space for each grid classifier is reduced, which can significantly reduce the number of classifiers.

Feature selection means choosing the features that are most suitable for the actual task. In general, on-line classifiers such as on-line boosting for feature selection are initialized randomly from the set of all possible weak classifiers. However, if the problem is known in advance, it is possible to use a suitable representation describing the actual problem, i.e., features that are suitable for the particular task. Using this prior information, which is often available, can improve the results. Originally, on-line boosting for feature selection initializes the selectors with random features. In order to exploit the prior information that is often available, we propose to link off-line and on-line boosting for feature selection. Off-line boosting for feature selection allows the classifier to be initialized with features suitable for the actual task. To this end, off-line boosting for feature selection needs to be modified. Originally, in each iteration, off-line boosting for feature selection selects one weak classifier, i.e., one particular feature. To allow subsequent on-line boosting for feature selection, a set of J weak classifiers has to be selected in each boosting iteration, where J is the number of weak classifiers within one selector. This allows prior information about the object class to be incorporated. The huge pool of randomly initialized weak classifiers may contain very similar features. In general, similar features give a similar training error, which is the criterion for selecting the weak classifiers. This does not influence standard off-line boosting for feature selection, since only the best weak classifier is selected in each iteration. However, since we select the best J weak classifiers in each boosting iteration instead of a single best one, an additional selection criterion measuring the similarity between features has to be introduced to avoid having too similar features within one selector. Too similar features within one selector would hinder the adaptivity of the classifier during on-line updates. Hence, we introduce a similarity criterion based on the overlap between the features and on the feature types. The overlap criterion considers the spatial position and extent of the features within the patch. Features with an overlap larger than a specified threshold are only allowed if they have distinct feature types, i.e., horizontal vs. vertical or diagonal feature types. Avoiding too similar features within a selector ensures the adaptivity of the strong classifier, which is required for on-line adaptation. Applying off-line boosting for feature selection to select features appropriate to a specific task allows less complex classifiers to be used to solve the same problem, since the features within a classifier are well suited to the particular problem. The modified off-line boosting for feature selection algorithm is described in Algorithm 1.
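The selection of J mutually dissimilar weak classifiers per boosting iteration, as described in the text, can be sketched with a greedy pass over the candidates sorted by training error. This is an assumption-laden illustration: the function names, the dictionary layout of a candidate, the use of intersection-over-union as the overlap measure and the threshold value are all hypothetical.

```python
def overlap_ratio(a, b):
    """Intersection over union of two rectangles given as (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0


def select_j_best(candidates, j, max_overlap=0.5):
    """Greedily pick up to j low-error candidates that are not too similar.

    candidates  -- list of dicts with keys "error", "rect" and "type"
    max_overlap -- ASSUMED threshold; a larger overlap is only allowed
                   between features of different types
    """
    chosen = []
    for cand in sorted(candidates, key=lambda c: c["error"]):
        too_similar = any(
            cand["type"] == kept["type"]
            and overlap_ratio(cand["rect"], kept["rect"]) > max_overlap
            for kept in chosen
        )
        if not too_similar:
            chosen.append(cand)
        if len(chosen) == j:
            break
    return chosen
```

Without the similarity test, the J slots of a selector would tend to fill with near-duplicates of the single best feature, leaving the on-line stage nothing meaningful to switch to.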

Table 1. The Recall and Precision Comparison

Table 2. The Recall and Precision Comparison

Table 3. Sequences of the long-term experiment

Table 4. The Recall and Precision Comparison

Table 5. The Recall and Precision Comparison