Ascending and Descending Walking Recognition using Smartphone

In recent decades, activity recognition for human behavior monitoring has been a major research subject. This paper investigates the fractal dimension (FD) as an effective feature for human activity recognition. The target activities are descending and ascending stairs, performed by six male participants. To test the validity of the feature, k-nearest neighbors (kNN) and support vector machine (SVM) classifiers are used. The best result in this study is obtained with the SVM classifier: 95.30% for walking downstairs and 88.56% for walking upstairs, with a 4 s window and equalized data. Another aim of this paper is to examine the impact of the data set composition on the classification results. The findings provide evidence that the classification accuracy is not significantly affected by the ratio of descending to ascending data.


INTRODUCTION
Human activity recognition is an important topic in the field of mobile health (mHealth) [1]. It enables many applications such as daily life monitoring, personal biometric signatures, and elderly and youth care [2]. Mobile devices are becoming increasingly sophisticated and embed many diverse and powerful sensors, including accelerometers, gyroscopes, GPS receivers, cameras and microphones. Smartphones are therefore an ideal platform for the development of mHealth because of their embedded sensors, computational power, communication capabilities and ubiquity.

LITERATURE REVIEW
A comprehensive survey of recent advances in human activity recognition with sensors embedded in smartphones was presented in [2]. The survey reviewed the basic concepts of the sensors and their functionality. It described how a smartphone application processes the features extracted from raw sensor readings and predicts the user's activities. The common features used in activity recognition were reported in both the time and frequency domains. The activities were categorized into types such as simple activities, complex activities, living activities and health activities.
Another study attempted to recognize six human activities using a multiclass support vector machine (MC-SVM) [3]. The experiment was carried out by a group of 30 subjects wearing the smartphone on their wrist. The recognition process was based on fixed-width sliding windows of 2.56 seconds with 50% overlap. From each window, a vector of 17 features in the time and frequency domains was extracted from the accelerometer readings. The MC-SVM classification obtained an average accuracy of 89.3%.
With the same objective, numerical experiments were conducted on a system that identifies the physical activities of a user [4]. The data were collected from 29 users as they performed daily activities. Informative features were then generated from segments of 200 raw accelerometer readings (10 seconds). Although there were six basic feature types, a total of forty-three features were generated from the three axes of the accelerometer. The classification methods in this experiment were J48, logistic regression, multilayer perceptron and straw man. The highest average accuracy was 91.7%, obtained with the multilayer perceptron. However, walking downstairs and walking upstairs reached only 44.3% and 61.5% accuracy because they were confused with each other. As a result, the authors suggested treating both activities as one class, which gave a classification accuracy of 77.6%.
At the time of writing, many commercial smartphone applications have been developed to recognize human activities in daily life. The first example is from Google and is compatible with Android smartphones [5]. Indexes of daily activity such as the number of steps, traveled distance, jogging and biking are recorded automatically, and the consumed energy is calculated at the same time. The second example is the Lifelog application developed by Sony Corporation [6], which allows users to build an overview of the activities carried out in daily life. One of its benefits is recording the total number of steps a user walks in a day, with the consumed energy estimated simultaneously from the step count. However, estimating energy from the step count alone is inappropriate for walking-like activities; for example, walking downstairs requires less energy than walking upstairs. The Apple Health app offers functions similar to those of the two applications above for collecting the user's exercise data and keeping track of health issues [7], but it additionally reports flights of stairs, where one flight is counted as about 3 meters (16 steps) of elevation gain.
In this research, ascending and descending stairs are considered because they are similar activities that many people perform regularly in daily life. Additionally, the fractal dimension is investigated as a new feature that is useful and effective in human activity recognition. It is beneficial not only for classification but also for mitigating the curse of dimensionality.
Another key point of this research is the trade-off between using balanced data to preserve the mathematical soundness of the classification and using unbalanced data to preserve the physical meaning of the activities.

METHODOLOGY
Data Acquisition
In the experiment, six male subjects were each asked to perform 10 trials, each consisting of walking down and then walking up five floors of a building. The smartphone was placed in portrait orientation in the left pants pocket of the subject. A Sony Xperia Z3 smartphone was used because it contains an accelerometer and a gyroscope, which measure three-dimensional linear acceleration and angular velocity respectively at a constant rate of 25 Hz, a rate sufficient for capturing daily human activities. To obtain realistic data, each subject was allowed to walk freely. However, this freedom leads to an imbalance between the walking-downstairs and walking-upstairs data of each subject and among different subjects. Consequently, there is a trade-off in constructing the training and testing sets. Although equalizing the descending and ascending data in each set avoids bias in the classification mathematically, it disregards the physical meaning of going up and down five floors. Both cases were analyzed in this research to find a suitable data set for online classification.
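The equalization of descending and ascending data can be sketched as follows. This is a minimal illustration only: the paper does not state how the data were equalized, so random down-sampling of the larger class is an assumption, and the function name `equalize_classes` and the toy samples are hypothetical.

```python
import random
from collections import defaultdict

def equalize_classes(samples, seed=0):
    """Randomly down-sample every class to the size of the smallest class.

    samples: list of (features, label) pairs.
    Down-sampling is an assumed strategy, not the paper's documented one.
    """
    by_label = defaultdict(list)
    for features, label in samples:
        by_label[label].append((features, label))
    target = min(len(group) for group in by_label.values())
    rng = random.Random(seed)          # fixed seed keeps the sketch repeatable
    balanced = []
    for label in sorted(by_label):
        balanced.extend(rng.sample(by_label[label], target))
    return balanced

# Toy data: 7 descending ("D") and 4 ascending ("A") samples.
toy = [([i], "D") for i in range(7)] + [([i], "A") for i in range(4)]
balanced = equalize_classes(toy)
counts = {label: sum(1 for _, l in balanced if l == label) for label in ("D", "A")}
print(counts)
```

Skipping this step and using the samples as recorded corresponds to the unequal case analyzed in the paper.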

Fractal Dimension (FD)
The FD characterizes the irregularity of a time series based on a power-law index [8]. Although the FD can be considered similar to the Euclidean dimension, an important difference is that the FD may be a non-integer. The FD describes how similar a set at the micro-scale is to its counterpart at the macro-scale. The integer closest to an FD indicates its Euclidean equivalent. For example, an FD close to zero represents a point (0-dimensional in the Euclidean sense), an FD approaching one describes a line (1-dimensional), an FD approaching two describes a surface (2-dimensional), and so on.

Higuchi's Algorithm
Higuchi's algorithm is an effective method for calculating the FD in the time domain [9] and has been reported as the most accurate algorithm for estimating the FD [10]. It is based on measuring the length of the curve of the time series.
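From a given time series $X(1), X(2), \ldots, X(N)$, the algorithm constructs $k$ new time series:

$$X_k^m:\; X(m),\, X(m+k),\, X(m+2k),\, \ldots,\, X\!\left(m + \left\lfloor \frac{N-m}{k} \right\rfloor k\right), \qquad m = 1, 2, \ldots, k,$$

where $\lfloor \cdot \rfloor$ is the integer part of a real number. The length $L_m(k)$ of each curve $X_k^m$ is calculated by the following formula:

$$L_m(k) = \frac{1}{k} \left[ \left( \sum_{i=1}^{\lfloor (N-m)/k \rfloor} \left| X(m+ik) - X(m+(i-1)k) \right| \right) \frac{N-1}{\lfloor (N-m)/k \rfloor \, k} \right].$$

If the average length $L(k)$ behaves approximately as

$$L(k) \propto k^{-D},$$

the curve is said to have fractal dimension $D$, which is calculated by a least-squares linear best-fitting procedure.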
Based on practical experiments, k is set to 10 in this study, which means that the data in the investigated window are divided into 10 segments.
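Higuchi's procedure can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes that the paper's "k = 10" is the largest interval used in the fit (named `k_max` here), that the slope is fitted on ln L(k) against ln(1/k), and that a 4 s window at 25 Hz holds 100 samples. The function name `higuchi_fd` and the test signals are hypothetical.

```python
import math
import random

def higuchi_fd(x, k_max=10):
    """Estimate the fractal dimension of sequence x with Higuchi's method.

    k_max is an assumed reading of the paper's "k = 10".
    """
    n = len(x)
    log_inv_k, log_len = [], []
    for k in range(1, k_max + 1):
        lengths = []
        for m in range(1, k + 1):            # initial point m = 1..k (1-based)
            n_steps = (n - m) // k           # integer part of (N - m) / k
            if n_steps < 1:
                continue
            # Sum of |X(m + i*k) - X(m + (i-1)*k)| for i = 1..n_steps.
            total = sum(abs(x[m + i * k - 1] - x[m + (i - 1) * k - 1])
                        for i in range(1, n_steps + 1))
            lengths.append(total * (n - 1) / (n_steps * k) / k)   # L_m(k)
        if lengths:
            mean_len = sum(lengths) / len(lengths)                # L(k)
            if mean_len > 0:                 # skip flat signals, log undefined
                log_inv_k.append(math.log(1.0 / k))
                log_len.append(math.log(mean_len))
    if len(log_len) < 2:
        raise ValueError("not enough valid k values to fit a slope")
    # The least-squares slope of ln L(k) against ln(1/k) estimates D.
    mx = sum(log_inv_k) / len(log_inv_k)
    my = sum(log_len) / len(log_len)
    num = sum((a - mx) * (b - my) for a, b in zip(log_inv_k, log_len))
    den = sum((a - mx) ** 2 for a in log_inv_k)
    return num / den

# 100 samples correspond to a 4 s window at 25 Hz.
line = [float(i) for i in range(100)]
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(100)]
fd_line = higuchi_fd(line)     # a straight line should give a value close to 1
fd_noise = higuchi_fd(noise)   # white noise should give a value close to 2
print(round(fd_line, 3), round(fd_noise, 3))
```

The two synthetic signals illustrate the range of the feature: a smooth line sits near dimension one, while an irregular signal moves toward two.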

Signal Processing
In these experiments, the horizontal, vertical and forward/backward movements of the user's leg are captured on the x-axis, y-axis and z-axis respectively. The activities are analyzed in windows of 2 and 4 seconds, which are the expected durations for classification in an online implementation. From each window, a vector of 6 features is obtained by calculating the FD of the three accelerometer channels and the three gyroscope channels.
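The windowing step can be sketched as below. The sketch assumes consecutive, non-overlapping windows (the paper does not state whether windows overlap) and uses a simple stand-in feature so that the block is self-contained; in the paper, the feature computed on each channel is the FD. All function names and the synthetic recording are hypothetical.

```python
import math

def split_windows(samples, fs_hz, win_sec):
    """Cut a sequence into consecutive, non-overlapping windows (assumed)."""
    size = int(round(fs_hz * win_sec))
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]

def feature_vectors(channels, fs_hz, win_sec, feature_fn):
    """Build one feature vector per window, with one feature per channel.

    channels: dict mapping a channel name (e.g. "Ax") to its sample list.
    """
    names = list(channels)
    per_channel = [split_windows(channels[name], fs_hz, win_sec) for name in names]
    n_windows = min(len(windows) for windows in per_channel)
    return [[feature_fn(per_channel[c][w]) for c in range(len(names))]
            for w in range(n_windows)]

def mean_abs_diff(window):
    """Stand-in feature; the paper computes the FD of each window instead."""
    return sum(abs(b - a) for a, b in zip(window, window[1:])) / (len(window) - 1)

# Toy recording: 10 s of six synthetic channels sampled at 25 Hz.
FS_HZ = 25
names = ["Ax", "Ay", "Az", "Gx", "Gy", "Gz"]
recording = {name: [math.sin(0.1 * (j + 1) * t) for t in range(10 * FS_HZ)]
             for j, name in enumerate(names)}
vectors = feature_vectors(recording, FS_HZ, 4, mean_abs_diff)
print(len(vectors), len(vectors[0]))
```

With a 4 s window the 10 s toy recording yields two complete windows of six features each; the trailing 2 s are discarded because they do not fill a window.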

k-nearest neighbor (kNN)
kNN is a type of instance-based learning and is among the simplest machine learning algorithms. In this research, the neighbors are taken from two classes (ascending stairs and descending stairs), and the Euclidean distance is used as the distance measure. The distance from each sample in the testing set to every sample in the training set is calculated, and the results are arranged in ascending order. A sample is then classified by a majority vote of its neighbors: it is assigned to the class most common among its k nearest neighbors (k is a positive integer). The kNN classification in this paper is carried out with k = 5 and k = 15, and the value of k that yields the better accuracy is reported.
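The kNN procedure described above can be sketched as follows. The training points are made-up two-dimensional values chosen only to show the mechanics; they are not FD values from the experiment, and the function name `knn_predict` is hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=5):
    """Classify query by majority vote among its k nearest training samples."""
    # Euclidean distance to every training sample, sorted in ascending order.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Made-up feature vectors for descending ("D") and ascending ("A") windows.
train_x = [[1.80, 1.60], [1.85, 1.62], [1.78, 1.58],
           [1.50, 1.30], [1.52, 1.33], [1.48, 1.28]]
train_y = ["D", "D", "D", "A", "A", "A"]

pred_d = knn_predict(train_x, train_y, [1.79, 1.59], k=3)
pred_a = knn_predict(train_x, train_y, [1.51, 1.31], k=3)
print(pred_d, pred_a)
```

An odd k, as in the paper's choices of 5 and 15, avoids tied votes when there are only two classes.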

Support Vector Machine (SVM)
In SVM classification, the inputs are mapped to a high-dimensional feature space in which they become more separable than in the original input space. First, a hyperplane is constructed to optimally separate the training data into two classes. Then, two parallel hyperplanes are constructed, one on each side of the separating hyperplane, passing through the training samples closest to it (the support vectors). The separating hyperplane is chosen to maximize the distance between the two parallel hyperplanes, and new data are classified according to the side on which they fall.
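The margin-maximization idea can be illustrated with a small linear SVM. This sketch is a deliberate simplification: it omits the mapping to a high-dimensional feature space and trains a linear soft-margin classifier by subgradient descent on the regularized hinge loss, whereas the paper does not state which kernel, solver or library was used. The function names, hyperparameters and toy points are all hypothetical.

```python
def train_linear_svm(xs, ys, lam=0.01, lr=0.1, epochs=2000):
    """Fit w, b by minimizing lam/2 * |w|^2 + mean hinge loss (labels +1/-1)."""
    dim = len(xs[0])
    n = len(xs)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]      # gradient of the regularizer
        grad_b = 0.0
        for x, y in zip(xs, ys):
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            if margin < 1:                   # sample lies inside the margin
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def svm_predict(w, b, x):
    """Return +1 or -1 according to the side of the hyperplane w.x + b = 0."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1

# Toy, linearly separable data: +1 in the upper right, -1 in the lower left.
xs = [[2, 2], [3, 3], [2, 3], [3, 2], [-2, -2], [-3, -3], [-2, -3], [-3, -2]]
ys = [1, 1, 1, 1, -1, -1, -1, -1]
w, b = train_linear_svm(xs, ys)
print(svm_predict(w, b, [2.5, 2.5]), svm_predict(w, b, [-2.5, -2.5]))
```

Only samples with a margin below one contribute to the update, which mirrors the role of the support vectors: points far from the boundary do not influence the final hyperplane.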

RESULTS AND DISCUSSION
Fractal Dimension
The FD of the accelerometer (Ax, Ay, Az) and gyroscope (Gx, Gy, Gz) channels for the different window lengths is shown in Figure 1 and Figure 2. Several observations can be drawn from the two figures. First, although descending and ascending walking are visually distinguishable in channels Az, Gx and Gy, the 4-second window shows a clearer difference than the 2-second window. Second, among all channels, Gx shows the difference between the two activities most clearly.
The reason is that when a subject moves his left leg forward while ascending the stairs, there is a moment when the smartphone is held fixed in the subject's pants pocket. When he moves his right leg forward, the smartphone is released in the left pocket and vibrates around the x-axis. This phenomenon does not occur in descending walking, which leads to the difference in FD between the two activities.
When the FD of a curve approaches 2, the curve tends to fill the whole plane, which indicates its complexity. According to the two figures, the accelerometer signal is more complex than the gyroscope signal because its FD is closer to 2. The figures also reveal that descending walking has a higher degree of complexity than ascending walking.

Statistical Results
The differences between descending and ascending walking are highlighted in Table 1. Although only a slight difference between the activities appears in the Ax channel, there are significant differences in channels Az, Gx and Gy, the most noticeable being in Gx. Therefore, in the following experiments, these channels are combined to form feature vectors for classification.

k-nearest neighbors (kNNs)
Table 2 shows the accuracy of kNN classification for different feature vectors. For example, the column Axyz presents the classification results when the inputs are the three axes of the accelerometer. Similarly, the column AzGxy gives the classification accuracy when the inputs are the accelerometer z-axis, gyroscope x-axis and gyroscope y-axis. It is apparent from Table 2 that there is no significant difference between equal (X_eq) and unequal (X_uneq) data for descending (D) and ascending (A) walking. However, the table reveals that the long window (4 s) provides higher classification accuracy than the short window (2 s). It can also be seen that the accuracy using accelerometer data is lower than that using gyroscope data. Among the channel selections, the AzGxy option yields the highest accuracy for a given window length.
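The per-activity accuracies reported for descending (D) and ascending (A) walking can be computed as sketched below. The sketch assumes that each figure is the percentage of windows of a given true class that were labelled correctly; the paper does not define the metric explicitly. The function name and the label sequences are hypothetical.

```python
def per_class_accuracy(true_labels, predicted_labels):
    """Return, per true class, the percentage of samples labelled correctly."""
    totals, correct = {}, {}
    for truth, predicted in zip(true_labels, predicted_labels):
        totals[truth] = totals.get(truth, 0) + 1
        correct[truth] = correct.get(truth, 0) + (1 if truth == predicted else 0)
    return {label: 100.0 * correct[label] / totals[label] for label in totals}

# Made-up labels for eight test windows.
truth = ["D", "D", "D", "D", "A", "A", "A", "A"]
pred = ["D", "D", "D", "A", "A", "A", "D", "D"]
acc = per_class_accuracy(truth, pred)
print(acc)
```

Reporting the two classes separately makes confusion between the activities visible even when the overall accuracy looks acceptable.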

Support Vector Machine (SVM)
According to Table 3, the results of the SVM classifier lead to findings similar to those of the kNN classifier. Firstly, although classification with equal data results in higher accuracy than with unequal data, the difference is too small to conclude that one is better than the other. Secondly, the 4-second window yields better recognition results than the 2-second window. Additionally, gyroscope data provide better results than accelerometer data in both kNN and SVM classification. Finally, the AzGxy option also gives the best classification with the SVM classifier. The highest accuracy of SVM classification is 95.30% for walking downstairs and 88.56% for walking upstairs, with a 4-second window and equalized data.
In comparison with other works, this research has several advantages. Firstly, only one feature (the FD) is used to recognize similar human activities such as descending and ascending stairs, which reduces the cost of the curse of dimensionality in comparison with other studies. For example, the study in [4] needs 43 features for activity classification, yet its accuracy is low for descending stairs (44.3%) and ascending stairs (61.5%), which led its authors to consider combining the two activities into one. In another study, the classification precisions for walking downstairs and walking upstairs are 87.2% and 72.6% respectively [3], but a vector of 17 features is required to obtain that result. It is important to note that the feature in this research is calculated directly from the raw data in the time domain, which reduces the computational complexity. Another advantage is that an activity can be recognized within a short interval of 2 or 4 seconds. The study in [3] applies a window of 2.56 seconds, while the study in [4] divides the data into 10-second segments. A short window allows flexibility in recognition because subjects may sometimes perform an activity for less than the duration required for recognition.
This paper also recommends that the data of the two activities be equalized in the preprocessing stage. Although there is no significant difference between equalized and unequalized data, equalization facilitates the signal processing and reduces the computational complexity.

CONCLUSION AND FUTURE WORK
In this research, the FD is introduced as a valuable feature for human activity recognition. To evaluate its validity, it is used to classify similar activities, namely descending and ascending stairs, with kNN and SVM classifiers. According to the results, equalized raw data are recommended for classification when the difference between the equalized and unequalized cases is insignificant, since this simplifies application programming and data processing. This study focuses on classifying the activities of subjects carrying the smartphone in their left pants pocket. The final goal of this research is to develop a complete human activity recognition application for smartphones. More broadly, further research is needed to cover different smartphone locations and more target activities. Equally important is the implementation of a complete smartphone application that can classify human activities online.
$L_m(k)$ is not the length of the curve in the Euclidean sense; it stands for the normalized sum of the absolute values of the differences in ordinates of pairs of points a distance $k$ apart (with initial point $m$). The length of the curve for the time interval $k$, $L(k)$, is calculated as the mean of the $k$ values $L_m(k)$ for $m = 1, 2, \ldots, k$:

$$L(k) = \frac{1}{k} \sum_{m=1}^{k} L_m(k).$$