Hybrid Machine Learning Techniques to detect Real Time Human Activity using UCI Dataset

Smartphones play a crucial role in modern life. They support services and applications such as location tracking, medical applications, and human activity analysis. All Android smartphones have motion sensors, i.e. an accelerometer and a gyroscope, that detect the motion of a user precisely. Earlier, dedicated sensors were used for activity recognition. Different techniques have been developed for distinguishing normal human activities in crowd scenes by processing video or images. A novel KNN-SVM human activity detection method is proposed to detect human activities in the UCI dataset for complex multi-process physical activities. The model is trained with machine learning algorithms to capture temporal dependency; high-dimensional normal sequences are used uniformly to train the model to discriminate each activity. In the classification process, two classifiers, Support Vector Machine and K-Nearest Neighbour, are applied to identify the types of human activities in the UCI dataset. The accuracy of each classifier is about 85% to 87%. After applying a majority decision over these classifiers, the classification accuracy is comparable with the existing literature.


Introduction
Different techniques have been developed for distinguishing normal human activities in crowd scenes by processing video or images. Human Activity Recognition (HAR) has emerged as an area of great significance, proving helpful in various applications. It is the problem of correctly classifying smartphone sensor observations into distinct activities. The readings recorded by the sensors are taken in three dimensions (x, y, and z). A HAR system can keep track of daily human activities, from simple to complex, and thus serve applications such as disease prevention, elderly monitoring, security, fall detection and many more.
It is a challenging task because it has to deal with a large volume of data produced at high speed by smartphone sensors. The lack of an obvious, straightforward method for relating these observations to well-defined movements makes the task more complex still.
This work addresses sensor-based Human Activity Recognition for six physical activities using smartphones. Smartphones are ubiquitous and embedded with various sensors, which makes them a feasible means of collecting data with inherently sequential characteristics. These sensors can be environmental sensors, inertial sensors and video sensors. This work focuses on real-time classification of activities using machine learning algorithms. Smartphone-based sensors can be used efficiently to identify human activities in real time because smartphones possess great computing ability and increasing networking capacity. The devised solution is evaluated first on an already collected dataset and then on a new dataset obtained in a real-time scenario. This work presents a comprehensive approach to attaining good results by combining smartphone sensors with efficient machine learning algorithms for activity recognition, while addressing the design issues of the architecture used.
Smartphones play a crucial role in modern life. They support services and applications such as health monitoring, early-stage disease detection, human analysis, fitness tracking, and behaviour analysis [1]. Human Activity Recognition has become an important research area over the last few years. Besides their embedded sensors, smartphones are portable and possess great computational power. These features, along with strong communication capability, make them well suited to HAR in real-time situations. HAR offers considerable benefits for many novel applications such as surveillance, health and wellness monitoring, home rehabilitation and early detection of disorders. HAR is linked with multiple research areas such as computer vision, machine learning, artificial intelligence, ubiquitous computing and machine perception. Thus, it has become an interesting area of research for researchers.
HAR identifies activities by continuously observing the output of embedded sensors in order to understand the characteristics of the environment and respond accordingly. The system aims to discover human movements in real-time environments and thus support in-time learning environments by giving an accurate description of activities at the right time.
Various authors have used the accelerometer to identify activities. The accelerometer is the best-suited sensor for detecting motion in all three directions. In our research we also use the smartphone's built-in accelerometer to identify activities: the motion sensed in all three directions is sent to the machine learning model for prediction.
The key contributions of this study, which also address the limitations of previous work, can be summarized as follows:
• To determine the accuracy of the accelerometer using an Android application.
• To determine the accuracy of machine learning models.
• To propose a hybrid approach in support of our methodology.
• To use the UCI Human Activity Recognition dataset to implement ML models and identify the best one on the basis of metrics (accuracy, precision, recall and F-score).
• To apply hybrid techniques to achieve better accuracy when classifying activities in a real-time environment.
The detection of human activities in video streams or images is an interesting and challenging task. Different techniques have been developed for distinguishing normal human activities in crowd scenes by processing video or images. Video processing techniques are used to examine video segments for human activities. The detection of human activities is very difficult because of the variation in moving objects and in bodies of different sizes [1-5].
The first step of human activity detection is preprocessing, which includes noise removal, image resizing, background subtraction, filtering, etc. The second step is feature and pattern extraction from the filtered data. Feature modelling is used to extract the texture patterns in the videos [2]. In the third step, the extracted features and patterns are classified using different classifiers such as SVM, KNN, histogram and template matching, etc. [6,7]. Machine learning techniques can be used efficiently for human activity detection in video.

Literature Review
The literature has advanced to the development of the first collaborative solutions. The detection of human activities in video streams or images is an interesting and challenging task. Different techniques have been developed for distinguishing normal human activities in crowd scenes by processing video or images. Video processing techniques are used to examine video segments for human activities. The detection of human activities is very difficult because of the variation in moving objects and in bodies of different sizes [1].
The first step of human activity detection is preprocessing, which includes noise removal, image resizing, background subtraction, filtering, etc. The second step is feature and pattern extraction from the filtered data. Feature modelling is used to extract the texture patterns in the videos [2]. In the third step, the extracted features and patterns are classified using different classifiers such as SVM, KNN, histogram and template matching, etc. [3]. Machine learning techniques can be used efficiently for human activity detection in video.
In general, such information has been used to improve the accuracy of the classification models of machine learning algorithms. Several published works on human activity recognition (HAR) exist; some concentrate on real-time processing and some use online training. Nandy et al. describe the use of smartphone sensors and how to differentiate between dynamic and static activities, recognizing static and dynamic activities with an accuracy of 94% [4]. To achieve the best trade-off between the system's computational complexity and recognition accuracy, several evaluations were carried out to determine which classification algorithm and features to use. For this purpose, a dataset was collected from different individuals that includes typical daily activities and some fitness exercises. Chin et al. identify the flaws of, and compare, techniques for activity recognition and motion detection of a human being [5]. The studies examined so far implement activity recognition systems on mobile phones and use only their on-board sensors. Various aspects of these studies are discussed, together with their limitations and recommendations for future research. The work reviewed covers online physical activity recognition on mobile devices, considering studies that use only mobile phone sensors and perform the classification locally on the device in real time.
Human activity monitoring is an active area of research, and many commercial developments have been reported. It is expected that many more lightweight, high-performance wearable devices will become available for monitoring a wide range of activities; Gaussian blob modelling has been used in [6].
In [7], the authors studied human activity recognition using clustering to identify the activities of each person. The challenges faced by current designs will likewise be addressed in future devices. The development of lightweight physiological sensors will lead to comfortable wearable devices for monitoring different ranges of activities of inhabitants. Formal and informal surveys predict growing interest in, and subsequent use of, wearable devices in the near future; the cost of the devices is also expected to fall, resulting in wide application among the general population.
The model proposed in [8] is based on a modified Support Vector Machine that works with fixed-point arithmetic. It was supported in the same way as the Magnetic Induction recognition model used, on the principle that simpler models are always preferred if they have (almost) the same recognition ability as more complex techniques. The scope of that work is to apply the approach to ambient intelligence applications such as remote patient monitoring and smart environments. Its advantages include faster processing time and the use of fewer system resources, which in turn saves energy while maintaining practically the same recognition performance as other conventional approaches [9]-[11].
To deal with the scarcity of training data for abnormal activities, a two-stage variant of the standard recognition algorithm has been proposed. In the first stage, KNN is built on normal activities, which helps filter out the vast majority of normal activities [12]. The suspicious traces are then passed to a collection of abnormal activity models adapted via KNLR for further detection. A major advantage of this approach is that it can achieve a better trade-off between detection rate and false alarm rate. Its effectiveness was demonstrated using real data collected from sensors attached to a human body during transitional activities [13]-[15]. A potential limitation of multi-view human recognition is that a particular behaviour may be repeated after a particular time point. Research shows that, with a substantially reduced number of features, the proposed method performs well in comparison with other classification algorithms, achieving an overall accuracy of up to 90% [16]. In past research, activity recognition models have been classified into data-driven and knowledge-driven approaches. Data-driven approaches are capable of handling uncertainty and temporal information but require large datasets for training and learning. Unfortunately, the availability of large real-world datasets is a major challenge in the field of ambient assisted living.

Figure 1. Typical Methodology of Support Vector Classification
Figure 1 shows the typical SVM model. Two activities, or classes, are separated by a hyperplane placed between the support vectors; with such decision boundaries we can classify activities, of which there can be more than two.

Support Vector Machine Techniques
In machine learning, support vector machines are models associated with learning algorithms for analysing and classifying data. An SVM algorithm builds a model that assigns examples to one of several classes.
This model maps the examples so that they separate into the various classes. SVMs can also perform non-linear classification. The operation of an SVM relies on the hyperplane that gives the largest minimum distance to the training examples. The values are taken after applying a dimensionality reduction technique [17], [18].
The proposed framework focuses on recognizing unusual activities with less computational time. The unusual activities are expressed using a state transition table, which holds every possible state. The framework is trained to classify the activities performed by individuals and to report anomalies. The framework recognizes 6 distinct activities of an individual using a multi-class SVM. The overall architecture of the framework is shown in the figure. The raw data obtained from the sensors of an individual performing different activities is large; it cannot all be used at once, so it must be split into pieces and processed.
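The splitting of raw sensor data into fixed-size pieces can be sketched with a sliding window. This is a minimal illustration, not the authors' implementation: the function name `sliding_windows`, the window size, the step and the synthetic 3-axis stream are all assumptions made for the example.

```python
import numpy as np

def sliding_windows(signal, window_size, step):
    """Split a (n_samples, n_axes) signal into windows of `window_size` rows.

    Consecutive windows start `step` samples apart, so they overlap when
    step < window_size. Returns an array of shape (n_windows, window_size, n_axes).
    """
    signal = np.asarray(signal)
    if len(signal) < window_size:
        # Not enough samples for a single window: the caller has to wait
        # for more data or choose a smaller window.
        return np.empty((0, window_size) + signal.shape[1:])
    starts = range(0, len(signal) - window_size + 1, step)
    return np.stack([signal[s:s + window_size] for s in starts])

# Hypothetical accelerometer stream: 10 samples of (x, y, z)
stream = np.arange(30, dtype=float).reshape(10, 3)
windows = sliding_windows(stream, window_size=4, step=2)
print(windows.shape)
```

When the stream is shorter than the window, the function returns an empty array, which corresponds to the case where the system does not yet have enough data to recognize an activity.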
The sliding window is used to split the data into windows of N samples so that the system can recognize the activity. The sliding window reduces the data rate and sends less data to the system for recognizing the activity performed by the person. Because the sliding window splits the data into windows of size N, there is a real possibility that the system will not have sufficient data to recognize the activity. In such cases, the window boundary must be adjusted and the process repeated with the new data.
A further limitation of K-NN is that (c) there is no weight difference between samples. The accuracy of the K-NN algorithm can be degraded by the presence of noise or irrelevant features. In pattern recognition, K-NN is a method for classifying objects based on the closest training examples in the feature space [8], [13], [19].

Classification and Regression Tree
The Classification and Regression Tree algorithm classifies a sample according to groups of other samples with similar properties. During training, the training data is repeatedly split into smaller subsets.

K-Nearest Neighbour
Once the splits are complete, the samples are grouped according to their properties. Test samples are then evaluated against specific conditions at each node and propagated through the tree. When a sample reaches a leaf node, it is assigned the class to which the samples in that node belong. In this work, a binary tree with logical conditions was used. CARTs are still under extensive research and can be used as a standalone classifier or as part of larger algorithmic frameworks [12].
As our data collection was performed in a controlled environment, our results for complex activity recognition may be optimistic compared with a real-world setting. Nevertheless, it is a first step towards recognizing such activities in real-life situations. These activities can be performed in different ways. For example, eating can involve eating a sandwich while walking, standing or sitting, eating soup while sitting, or eating something with a knife and fork in various postures.

Methodology
The UCI HAR detection dataset is publicly and freely available. It is a large-scale dataset, which we use to evaluate our method. This dataset was collected by cameras at a specific angle for different real-world activities.
This dataset consists of long, raw surveillance videos that contain 6 different types of real-world human activities; these are the activities detected. The dataset was divided into four subsets, and each video contains different clips [20]. The figure above shows that input data coming from the sensors of an Android smartphone is pre-processed and filtered in order to segment it into a readable, normalized form. This data is then compared against the trained SVM, KNN and Hybrid models to predict the activity with each classifier, and the results of each test sample on each classifier are compared to show the performance parameters.
The UCI human activity dataset is used for testing and evaluation of the proposed methodology. The human activity dataset contains 6 different human activities. Since video data is a sequence of frames, we converted the videos into frames for preprocessing and feature extraction.
The proposed technique is based on a rule for when an irregular event occurs. Most frames differ from the activity frames. We trained a model that contains SVC and KNN classifiers. The image descriptor captures the visual features of each video frame. The model is trained on video blocks that contain only activities. We minimize the error between the input and output volumes of the frames. When the model is trained correctly on normal video frames, it shows a low reconstruction error. Each test input video volume produces a reconstruction error, which depends on a custom loss. We set a threshold on this value for each activity. If the value crosses the threshold, it indicates one activity; if it stays below the threshold, it indicates the other. In this way our system is able to recognize when these activity events occur.

Model Parameters
The model is trained to reduce the reconstruction error of the input volume. The learning rate is adapted automatically according to the update history of the model's weights. The mini-batch size is 64, and each training video volume is trained for a maximum of 50 epochs, or until the reconstruction loss on the validation data stops decreasing for 10 consecutive epochs.
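The stopping rule above (a cap of 50 passes, or stopping once the validation loss has not improved for 10 consecutive passes) can be sketched as follows. This is an illustrative reading of the rule, not the authors' training code; the function name and the synthetic loss sequence are assumptions.

```python
def train_with_early_stopping(epoch_losses, max_epochs=50, patience=10):
    """Apply the stopping rule to a sequence of per-epoch validation losses.

    Returns (epochs_run, best_loss). In real training the losses would be
    produced one at a time by evaluating the model after each epoch.
    """
    best_loss = float("inf")
    epochs_without_improvement = 0
    epochs_run = 0
    for loss in epoch_losses[:max_epochs]:
        epochs_run += 1
        if loss < best_loss:
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # loss has stopped decreasing for `patience` epochs
    return epochs_run, best_loss

# Hypothetical losses: five improving epochs, then a plateau
losses = [1.0, 0.8, 0.6, 0.5, 0.4] + [0.45] * 60
epochs_run, best = train_with_early_stopping(losses)
print(epochs_run, best)
```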

Hybrid Model
A novel KNN-SVM human activity detection method is proposed to detect human activities in the UCI dataset for complex multi-process physical activities. The model is trained with multiple input parameters to capture temporal dependency; high-dimensional normal sequences are used uniformly to train the model to discriminate each activity. The main strategy for obtaining better results is to combine the two classifiers: we apply SVM to the training samples to obtain the support vectors that classify each activity in the UCI dataset. After taking the testing samples, we compute the average distance between the support vectors and apply KNN with the Euclidean distance to classify each testing sample by minimum average distance, and then plot the results.
In our proposed methodology we apply SVM to find the support vectors that define the decision boundary and hyperplane among the classified data points. We then compute the average distance between the support vectors to cluster the different activities, setting k = 3 in the KNN classifier.
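One possible reading of this hybrid scheme is sketched below with scikit-learn: an SVM is fitted first, its support vectors are kept as a reduced reference set, and a 3-nearest-neighbour classifier with Euclidean distance is then fitted on those support vectors. The synthetic data from `make_classification`, the RBF kernel and the variable names are assumptions for illustration; the exact way the paper combines the two classifiers may differ.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic stand-in for six-class HAR feature vectors (assumption)
X, y = make_classification(n_samples=600, n_features=20, n_informative=10,
                           n_classes=6, n_clusters_per_class=1, random_state=0)
X_train, y_train = X[:480], y[:480]   # 80% for training
X_test, y_test = X[480:], y[480:]     # 20% for testing

# Step 1: fit the SVM and keep only its support vectors
svm = SVC(kernel="rbf").fit(X_train, y_train)
support_vectors = svm.support_vectors_
support_labels = y_train[svm.support_]   # labels of the support vectors

# Step 2: KNN (k = 3, Euclidean distance) on the support vectors only
knn = KNeighborsClassifier(n_neighbors=3, metric="euclidean")
knn.fit(support_vectors, support_labels)

pred = knn.predict(X_test)
accuracy = float(np.mean(pred == y_test))
print(f"{len(support_vectors)} support vectors, hybrid accuracy = {accuracy:.3f}")
```

Restricting KNN to the support vectors reduces the number of distance computations per test sample compared with KNN over the full training set.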

Classification Models
In the classification process, two classifiers, Support Vector Machine and K-Nearest Neighbour, are applied to identify the types of human activities in the UCI dataset. Each classifier produces its own result for the same case, and the final decision about the type of activity is taken using a metrics scheme: majority voting counts the classifier outputs over the six activity classes, and the class with the most votes is taken as the final answer. Classification starts with the most discriminating features and gradually adds less discriminating ones. The classification of activity types compares well with the existing literature. Features are extracted from the input image and compared with the parameters of the standard classifier to finalize the detection. The features used for classification include homogeneity, contrast, correlation, mean, probability, etc. The accuracy of each classifier is about 85% to 87%. After applying the majority decision over these classification techniques, the classification accuracy is comparable with the existing literature.
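The majority decision can be sketched as a simple vote over the labels returned by the individual classifiers. The function name `majority_vote`, the tie-breaking rule (the earliest classifier in the list wins) and the example labels are assumptions; the paper does not specify how ties are resolved.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted by most classifiers.

    `predictions` holds one label per classifier for a single sample.
    Ties are broken in favour of the classifier listed first.
    """
    counts = Counter(predictions)
    top = max(counts.values())
    for label in predictions:   # iterate in classifier order for tie-breaking
        if counts[label] == top:
            return label

# Hypothetical outputs of three classifiers for one test sample
votes = {"svm": "WALKING", "knn": "WALKING", "hybrid": "SITTING"}
final = majority_vote(list(votes.values()))
print(final)
```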

Support Vector Machine (SVM)
In SVM classification, the original UCI Human Activity inputs are mapped into a higher-dimensional feature space in which the classifier determines the type of human activity. Because of this property, SVM classifiers tend to be well suited to detecting the type of human activity. An SVM classifier gives good results for the type of human activity, although computing some of its parameters is a lengthy, critical and time-consuming task. Classification starts with the most discriminating features and gradually adds less discriminating ones. The features used for classification of the human activities are homogeneity, contrast, correlation, mean and probability. The support vector machine, whose introduction was a milestone in machine learning, belongs to the class of supervised classifiers. The main advantages of SVM are mathematical tractability, high precision and a direct geometric interpretation of the decision. There are many SVM variants, of which we consider the kernel SVM the most suitable. In the proposed model, we use the kernel SVM with two different kernels: linear and RBF. A traditional SVM uses a hyperplane to classify data. In the kernel SVM the algorithm is almost the same, but each dot product between vectors is replaced by a nonlinear kernel function [17]. Input taken from the sensors is filtered and segmented so that features can be extracted; the SVM is trained on these features and then tested in order to classify each activity with the help of the support vectors.
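A comparison of the linear and RBF kernels can be sketched with scikit-learn's `SVC`. The synthetic six-class data, the standardization step and the default hyperparameters are assumptions made for the example; they are not the settings used in the experiments.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for six-class HAR feature vectors (assumption)
X, y = make_classification(n_samples=600, n_features=20, n_informative=10,
                           n_classes=6, n_clusters_per_class=1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

scores = {}
for kernel in ("linear", "rbf"):
    # Standardize features, then fit an SVM with the chosen kernel
    model = make_pipeline(StandardScaler(), SVC(kernel=kernel))
    model.fit(X_train, y_train)
    scores[kernel] = model.score(X_test, y_test)   # test-set accuracy

for kernel, acc in scores.items():
    print(f"{kernel}: {acc:.3f}")
```

With the RBF kernel, the dot product between two vectors x and x' is replaced by exp(-gamma * ||x - x'||^2), which is what lets the classifier draw non-linear decision boundaries.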

EAI Endorsed Transactions on
Internet of Things Online First

K-Nearest Neighbours (KNN)
In K-NN classification, the classifier works on the basis of the neighbouring training samples in feature space. Some parameters capture the information present in the dataset frames for the types of human activities. K-NN is one of the reliable classification algorithms and belongs to the class of supervised classifiers [19].

Figure: Proposed model for KNN in activity recognition
The figure shows the basic model we adopted. Input taken from the sensors is filtered and segmented so that features can be extracted; the KNN model is trained on these features and then tested in order to classify each activity.
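The neighbour-based decision can be sketched from scratch with NumPy: compute the Euclidean distance from each query sample to every training sample, take the k closest, and return the most common label among them. The function name `knn_predict` and the toy two-feature data are assumptions for illustration only.

```python
from collections import Counter

import numpy as np

def knn_predict(X_train, y_train, X_query, k=3):
    """Classify each query row by majority label of its k nearest training rows."""
    X_train = np.asarray(X_train, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    # Pairwise Euclidean distances, shape (n_query, n_train)
    dists = np.linalg.norm(X_query[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]   # indices of the k closest rows
    return [Counter(y_train[i] for i in row).most_common(1)[0][0]
            for row in nearest]

# Toy feature vectors: two well-separated groups (hypothetical values)
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
y_train = ["SITTING"] * 3 + ["WALKING"] * 3
labels = knn_predict(X_train, y_train, [[0.05, 0.05], [5.0, 5.1]], k=3)
print(labels)
```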

Results
This section presents the results obtained from our experiments. Following our methodology, we loaded the dataset (UCI_HAR) and visualized it to find any missing or duplicate values. All six activities are plotted in order to visualize and understand the data better. Variables within a dataset can be related in many ways and for many reasons: they may depend on the values of other variables, they may be associated with each other, or they may both depend on a third variable. In this project, we use the pandas method .corr() to calculate the correlation between dataframe columns.
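The data checks described here can be sketched with pandas. The column names and the synthetic values are assumptions standing in for the real feature columns; `DataFrame.corr()` computes pairwise Pearson correlation by default.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for three accelerometer columns (assumption)
rng = np.random.default_rng(0)
acc_x = rng.normal(size=200)
df = pd.DataFrame({
    "acc_x": acc_x,
    "acc_y": 2.0 * acc_x + rng.normal(scale=0.1, size=200),  # depends on acc_x
    "acc_z": rng.normal(size=200),                           # independent
})

missing = int(df.isna().sum().sum())     # total number of missing values
duplicates = int(df.duplicated().sum())  # number of fully duplicated rows
corr = df.corr()                         # pairwise Pearson correlation matrix

print(f"missing={missing}, duplicates={duplicates}")
print(corr.round(3))
```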

Data Splitting
Data splitting is necessary so that the algorithm can be trained and then validated on testing data. In our case we split the data into 80% training and 20% testing sets. The confusion matrices show correctly classified versus misclassified activities for each classifier. The experiment shows the result of our proposed methodology.
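The 80/20 split and the confusion matrix can be sketched with NumPy alone. The helper names, the random seed and the toy labels are assumptions; in the confusion matrix, rows are true classes and columns are predicted classes.

```python
import numpy as np

def split_80_20(X, y, seed=0):
    """Shuffle the samples and return X_train, X_test, y_train, y_test (80/20)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    cut = int(0.8 * len(X))
    return X[idx[:cut]], X[idx[cut:]], y[idx[:cut]], y[idx[cut:]]

def confusion_matrix(y_true, y_pred, n_classes):
    """Count samples with true class i (row) predicted as class j (column)."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

# Toy data: 50 samples, 2 features, labels 0..5 (hypothetical)
X = np.arange(100, dtype=float).reshape(50, 2)
y = np.arange(50) % 6
X_train, X_test, y_train, y_test = split_80_20(X, y)

# Hypothetical true and predicted labels for three classes
cm = confusion_matrix([0, 0, 1, 2, 2, 2], [0, 1, 1, 2, 2, 0], n_classes=3)
print(len(X_train), len(X_test))
print(cm)
```

The diagonal of the matrix holds the correctly classified samples; every off-diagonal entry is a misclassification.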

Analysis of Hybrid Classifier
As the hybrid classifier showed the best training and testing accuracy in the previous section, we have plotted the accuracies of these classifiers, where the Hybrid Model shows the best accuracy. We also test our classifier on normalized and non-normalized data.

Comparison with previous techniques
Besides our approach, the authors in [2] used principal component analysis (PCA) for feature selection on fall detection datasets. That dataset contains the 3 accelerometer axes as features. According to the authors, PCA selects optimal features from the feature matrix. PCA measures the importance of a feature based on variance: features with high variance are treated as principal components and features with low variance are considered noise. In this study, we focus on six daily-life activities. Applying PCA to our dataset yields the X-axis and Y-axis of the accelerometer as principal components but gives an overall accuracy of 85.4%, which is lower than that of our approach. The rationale is that activities such as standing, sitting, walking and jogging can be recognized efficiently using the x and y axes, but more complex activities such as going upstairs and downstairs cannot. With our approach these activities are recognized with an accuracy of 87% on the testing dataset and 100% on the training dataset, at a ratio of 20% and 80% respectively. The first step of human activity detection in this research is preprocessing, which includes noise removal, image resizing, background subtraction, filtering, etc. The second step is feature and pattern extraction from the filtered data; feature modelling is used to extract the features. In the third step, the extracted features and patterns are classified using three classifiers: SVM, KNN and the Hybrid Classifier. Machine learning techniques have proved efficient for human activity detection.
The figure above shows the accuracies of the three classifiers, SVM, KNN and Hybrid. Ten tests were initially conducted on the testing dataset to validate the results. We conclude that the Hybrid classifier is slightly better than SVM and KNN at identifying the activities accurately, and that it could be improved further by using optimization techniques to train the classifiers.
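The variance-based ranking that PCA performs can be sketched with a singular value decomposition. The synthetic three-axis data, with two high-variance axes and one low-variance axis, is an assumption chosen to mirror the x/y/z discussion above; it is not the data used in the comparison.

```python
import numpy as np

def pca_explained_variance(X):
    """Return (components, explained_variance_ratio) for the rows of X.

    Components are ordered from highest to lowest variance.
    """
    Xc = X - X.mean(axis=0)                        # centre each feature
    _, s, vt = np.linalg.svd(Xc, full_matrices=False)
    variance = s ** 2 / (len(X) - 1)               # variance along each component
    return vt, variance / variance.sum()

# Synthetic 3-axis data: x and y vary strongly, z barely varies (assumption)
rng = np.random.default_rng(0)
X = np.column_stack([
    rng.normal(scale=3.0, size=500),   # x axis
    rng.normal(scale=2.0, size=500),   # y axis
    rng.normal(scale=0.1, size=500),   # z axis
])
components, ratio = pca_explained_variance(X)
print(ratio.round(4))
```

Here the first two components carry almost all of the variance, so a variance threshold would discard the third axis even if it were the one that separates two activities.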

Conclusions
The detection of human activities is very difficult because of the variation in moving objects and in bodies of different sizes. A novel KNN-SVM human activity detection method is proposed to detect human activities in the UCI dataset for complex multi-process physical activities. The model is trained with multiple input parameters to capture temporal dependency; high-dimensional normal sequences are used uniformly to train the model to discriminate each activity. In the classification process, two classifiers, Support Vector Machine and K-Nearest Neighbour, are applied to identify the types of human activities in the UCI dataset. Each classifier produces its own result for the same case, and the final decision about the type of activity is taken using a metrics scheme: majority voting counts the classifier outputs over the six activity classes, and the class with the most votes is taken as the final answer. Classification starts with the most discriminating features and gradually adds less discriminating ones. The classification of activity types compares well with the existing literature. Features are extracted from the input image and compared with the parameters of the standard classifier to finalize the detection. The features used for classification include homogeneity, contrast, correlation, mean, probability, etc. The accuracy of each classifier is about 85% to 87%. After applying the majority decision over these classification techniques, the classification accuracy is comparable with the existing literature.
These methods can be extended with deep learning approaches, and with optimization techniques for tuning hyperparameters to achieve better accuracy. An alternative use for this strategy is with locomotion sensors. Further directions include deep learning approaches with optimized or stochastic techniques to predict the probability of each activity, and machine learning classifiers with optimization techniques, i.e. Particle Swarm Optimization and Convex Optimization.
K-Nearest Neighbour is a supervised learning algorithm and one of the most popular algorithms for pattern recognition. The K-Nearest Neighbour algorithm uses neighbourhood classification as its prediction. The traditional K-NN text classification algorithm has three limitations: (a) high computational complexity, because all the training samples are used for classification; (b) performance that depends solely on the training set.
The video segments are collected from different scenes, and each video is split into roughly 2000 to 2500 frames. The UCI HAR dataset contains 1900 clips, the average number of frames is 7247, and the total length of the dataset is 120 hours. Further details of the UCI HAR dataset are given per activity. Performance parameters such as accuracy, sensitivity, specificity, and AUC are calculated to validate the proposed technique. The visual and parametric outcomes of the proposed technique are compared with the existing literature. The workflow of the proposed method using KNN, SVM and the Hybrid Model is shown in Figure 2. The details of the block diagram are as follows: the input sample is taken from the UCI human activity dataset; these inputs are pre-processed and used for training with machine learning techniques; and the different classifiers (SVM, KNN and Hybrid Model) are compared to find the best classifier for HAR detection.
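The performance parameters named above can be computed from a multi-class confusion matrix using the standard one-vs-rest definitions. This sketch covers accuracy, sensitivity, specificity and precision; AUC is omitted because it needs classifier scores rather than hard labels. The function name and the 2x2 example matrix are assumptions, and rows are taken to be true classes.

```python
import numpy as np

def per_class_metrics(cm):
    """Compute overall accuracy and per-class one-vs-rest metrics.

    `cm[i, j]` is the number of samples of true class i predicted as class j.
    """
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)                  # correctly classified, per class
    fn = cm.sum(axis=1) - tp          # true class i predicted as something else
    fp = cm.sum(axis=0) - tp          # other classes predicted as class i
    tn = cm.sum() - tp - fn - fp
    return {
        "accuracy": tp.sum() / cm.sum(),
        "sensitivity": tp / (tp + fn),   # also called recall
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
    }

# Hypothetical confusion matrix for two classes
metrics = per_class_metrics([[8, 2], [1, 9]])
for name, value in metrics.items():
    print(name, np.round(value, 3))
```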

Figure 3. Proposed Methodology for Android Application

Figure 5. Basic model for SVM in order to classify each activity in dataset

Figure 7. Histogram of the number of occurrences of each activity (Walking, Sitting, Standing, Lying, Downstairs and Upstairs)

Figure 9. Predictive model testing has been conducted w.r.t. each classifier at ratio of 80% to 20%

Figure 10. Graph of KNN at different values of k having different distances

Figure 11. Metrics (Accuracy, Precision, Recall and F-score) of each classifier to identify the best between them

Figure 12. Confusion matrix of activity recognition using the Hybrid classifier on normalized values with an 80:20 split.

Figure 13. Of 100 experiments conducted, our classifier showed the best results in 70; accuracy is plotted for each activity

Table 1. Comparison Results of different approaches