Review of Transportation Mode Detection techniques

This paper reviews the literature on Transportation Mode Detection (TMD), a subfield of Activity Recognition that aims at identifying (i.e. classifying) the means of transportation a person is using. The solutions found in the literature differ according to the device for which they were tailored (smartphones or other systems such as GPS loggers) and to the algorithm used for the classification task. The algorithm, in turn, may vary considerably according to the number and type of inputs used (e.g. accelerations, GPS, maps or Geographical Information System (GIS) information) and to the classes of transportation mode identified. These two aspects are the most relevant to consider when evaluating and comparing the accuracies claimed by each work. A comparison of the works is proposed taking into account the characteristics discussed above. In general the accelerometer is the most widely used sensor for TMD applications, as it limits battery consumption and captures relevant features for detecting motion. A key challenge in TMD is to distinguish among motorized classes such as bus, car, train and metro, because they share common characteristics (e.g. average speed and accelerations) that make it hard to identify suitable features for the classification algorithm. Identifying the “walk” and “stationary” transportation modes is a simpler task because they are characterized by distinct features.


Introduction
Transportation Mode Detection (TMD) is a subfield of Activity Recognition that aims at automatically identifying the means of transportation a person is using. TMD requires personal mobile sensors that capture the most relevant features characterizing a specific mode of transportation. These sensors must travel along with the person under observation, which implies that they must be either integrated in portable devices carried by the person (e.g. a smartphone) or worn by the person (e.g. a watch). In the literature, many different sensing techniques are considered: in the earliest research, custom mobile devices (e.g. GPS loggers and accelerometers) were used to collect data, while in more recent works smartphone-based solutions have been developed, with the purpose of exploiting the sensing capability of these widely used devices. Indeed, the evolution of technology on one side and the worldwide spread of smartphones on the other have made the latter ideal candidates for continuous sensing applications such as TMD. Using these devices avoids the need for dedicated ones that would have to be carried in addition to those people already carry.
The algorithms found in the literature for the detection of the mode of transportation differ mainly in: a) the assumption on the position of the mobile sensor(s), i.e. fixed and known or variable and unknown a priori; b) the number and type of input signals collected; c) the kind and number of features extracted from the collected data; d) the type of classifier used to recognize the mode of transportation on the basis of the extracted features.
In the following sections, the most recent approaches to TMD are discussed and compared, differentiating between solutions based on smartphones (section 2) and those based on other mobile devices (section 3). Finally, section 4 compares the works and discusses which are, for the authors of this review, the most important and promising contributions.

Solutions based on smartphones
A variety of smartphone-based solutions have been studied in the literature, differing mainly in the information used for classification and in the implementation of the classification algorithm, i.e. as server-based online classification, server-based offline classification or smartphone-based online classification.
Table 1 compares the PROs and CONs of the server-based and smartphone-based approaches in the implementation of the classification algorithm.

Smartphone-based
PRO: No data exchange with the server is required, thus the algorithm relies only on the device's internal resources
CON: The classification algorithm must be simple to comply with smartphone resources

The server-based approach has the main advantage that the algorithm implementation is independent of the device hardware. Nevertheless, it is limited by the rate at which data can be exchanged between the smartphone and the server.
On the other hand, the smartphone-based approach does not require any data exchange with the server, since the classification algorithm relies only on the device's internal resources; this in turn requires the algorithm to be simple enough to run on smartphone platforms.
However, technology evolution is making both communication infrastructures and mobile platforms ever more powerful, so the limitations of both approaches may become less relevant in the future. Finally, it is worth highlighting that a smartphone-based solution avoids or limits sending personal data (such as GPS information) to a server, which may be viewed favourably by users who are particularly concerned about their privacy.
In recent works, the online implementation of TMD algorithms on smartphones has been studied and enhanced to cope with the limited energy available on smartphones. In general, this is achieved through a trade-off between the number of sensors used and the complexity of the classification algorithms.
As regards sensors, the accelerometer is the most widely used sensor for TMD applications [1], as it limits battery consumption and captures relevant features for detecting motion.
The first approaches to TMD focused mainly on the use of GPS information, possibly combined with Geographical Information System (GIS) information, and later also on GSM and WiFi signals, which identified variations of the user environment.
The solutions based on GSM and/or WiFi signals, despite being more energy-efficient than GPS-based solutions, suffer from the variability of GSM cell sizes and WiFi access point density, which makes them unreliable outside urban areas [1].
Moreover, these two classes of approaches (GPS and GSM/WiFi) were typically implemented as server-based applications for classification, due to smartphone battery limitations.
The use of GPS data for TMD is being limited in recent applications, in order to avoid relying on uncertain information (GPS data may not be continuously available in certain environments) and using a very energy-demanding sensor. The main reasons found in the literature for the wider adoption of the accelerometer in TMD applications are [1]: 1. low power consumption; 2. independence from external signal sources; 3. the possibility of extracting highly detailed information about user motion. These advantages also allow TMD algorithms to be developed as smartphone apps running online on the user's phone. In this case, as already highlighted, the main issue is to design a simple classification algorithm that can differentiate among a variety of transportation modes. However, in some cases the absence of additional information leads to very complex classification strategies. Furthermore, the phone orientation is in general a critical aspect to take into account when analysing acceleration measurements.
Three main approaches to this issue have been identified in the literature: in the first, accelerometers are fixed in a prescribed position on the user's body so that the gravity components and the motion direction can be directly deduced; in the second, the three acceleration components are processed to calculate a total acceleration vector or to filter out the gravity components; in the third, the three acceleration components are not combined into a single variable but are used as independent input variables for classification. In this last case, test subjects are normally required to label records with the phone orientation used during data collection, so that this information can be directly used for training the classification model. Examples of these approaches are discussed in the following.
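The second approach can be illustrated with a minimal Python sketch. It is not taken from any of the reviewed works: the function names and the smoothing factor `alpha` are assumptions, and the exponential low-pass filter is only a generic stand-in for the more elaborate gravity estimation algorithms found in the literature.

```python
import math


def total_acceleration(samples):
    """Return the orientation-independent magnitude of each (ax, ay, az) sample."""
    return [math.sqrt(ax * ax + ay * ay + az * az) for ax, ay, az in samples]


def remove_gravity(samples, alpha=0.9):
    """Subtract a slowly varying gravity estimate from each sample.

    alpha is a hypothetical smoothing factor of an exponential low-pass
    filter; values closer to 1 make the gravity estimate change more slowly.
    """
    gravity = list(samples[0])  # initialise the estimate with the first sample
    residual = []
    for sample in samples:
        gravity = [alpha * g + (1.0 - alpha) * s for g, s in zip(gravity, sample)]
        residual.append(tuple(s - g for s, g in zip(sample, gravity)))
    return residual
```

The magnitude discards orientation entirely, whereas the filtered residual keeps the three components and therefore still depends on how the phone is held.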
A very recent work on TMD applications is that of Hemminki et Al. [1], in which an accelerometer-based solution for TMD was implemented. This approach consists of the recognition of the gravity acceleration components, the extraction of a large set of features (78 in total) which capture time and frequency characteristics of the accelerometer measurements, and the construction of a three-stage hierarchical classification framework (Figure 1).
At the root of the hierarchy is a kinematic motion classifier which performs a coarse-grained distinction between pedestrian and other modalities. When the kinematic motion classifier fails to detect substantial physical movement (e.g. walking), the process progresses to a stationary classifier, which determines whether the user is stationary or in a motorized transport. When motorized transportation is detected, the classification proceeds to a motorized classifier, which is responsible for classifying the current activity into one of five modalities: bus, train, metro, tram or car [1].
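The control flow of such a three-stage hierarchy can be sketched as follows. This is only an illustration of the cascade described above: the three classifiers are passed in as callables, and the toy stand-ins with their feature names and thresholds are hypothetical, not the boosted classifiers of [1].

```python
def classify_hierarchical(features, kinematic, stationary, motorized):
    """Run the three classifiers as a cascade and return a single mode label."""
    if kinematic(features) == "pedestrian":
        return "walk"
    if stationary(features) == "stationary":
        return "stationary"
    # Only reached when motorized transportation has been detected.
    return motorized(features)


# Hypothetical stand-ins, used only to make the sketch executable.
def toy_kinematic(features):
    return "pedestrian" if features["accel_std"] > 1.0 else "other"


def toy_stationary(features):
    return "stationary" if features["accel_std"] < 0.05 else "motorized"


def toy_motorized(features):
    return "tram" if features["low_freq_energy"] > 0.5 else "car"
```

For example, `classify_hierarchical({"accel_std": 0.3, "low_freq_energy": 0.2}, toy_kinematic, toy_stationary, toy_motorized)` falls through the first two stages and is resolved by the motorized stage.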
Each classifier uses a variant of Adaptive Boosting [2] for the learning phase. Boosting is a general technique for improving the accuracy of a learning algorithm. In more detail, the idea is to iteratively train a set of classifiers that focus on different subsets of the training data and to combine these classifiers into one stronger classifier. In addition, adaptive boosting introduces an iterative strategy for reducing the classification error by assigning to each sample of the training data a weight that represents the importance of the sample, so that higher priority is given to samples that are misclassified.
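The re-weighting idea can be made concrete with a small sketch of the textbook discrete AdaBoost procedure, using one-dimensional threshold "stumps" as weak classifiers and labels in {-1, +1}. This is an assumption-laden illustration, not the specific boosting variant used in [1]; all function names are hypothetical.

```python
import math


def stump_predict(x, threshold, polarity):
    """Weak classifier: predict `polarity` at or above the threshold, else the opposite."""
    return polarity if x >= threshold else -polarity


def train_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best stump."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            error = sum(w for w, x, y in zip(weights, xs, ys)
                        if stump_predict(x, threshold, polarity) != y)
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best


def adaboost(xs, ys, rounds=5):
    """Train an ensemble of weighted stumps."""
    weights = [1.0 / len(xs)] * len(xs)
    ensemble = []
    for _ in range(rounds):
        error, threshold, polarity = train_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1.0 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1.0 - error) / error)  # vote of this weak classifier
        ensemble.append((alpha, threshold, polarity))
        # Misclassified samples get larger weights, correctly classified ones smaller.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Combine the weak classifiers by a weighted vote."""
    score = sum(alpha * stump_predict(x, threshold, polarity)
                for alpha, threshold, polarity in ensemble)
    return 1 if score >= 0 else -1
```

In a TMD setting the scalar `x` would be replaced by a feature vector and the stump by a stronger weak learner; the weight update is the part that corresponds to the description above.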
The algorithm was tested in different use scenarios and mission profiles, resulting in very accurate estimates of the transportation modes (see also Table 5 in sec. 4). Nevertheless, the complexity of the classification framework and the number of parameters involved in the feature extraction make the results much more difficult to interpret and make enhancements to the classification logic hard to demonstrate and test.
However, the most interesting aspects of the approach proposed by Hemminki et Al. are the gravity component estimation algorithm and the variety of features used in the classification process. Other accelerometer-based solutions found in the literature normally consider only subsets of the features used by Hemminki et Al. and avoid the process of estimating gravity components, which may be hard to set up properly.
An example of this different approach is the work of Manzoni et Al. [3], in which a decision tree classifies a set of features including 32 FFT coefficients, computed on the total acceleration vector, and the signal variance. The overall classification accuracy is slightly lower than that obtained by Hemminki et Al., but the number of transportation modes classified by Manzoni et Al. is higher (eight vs. seven, see also Table 5 in sec. 4). However, their paper does not clearly state whether the real-time classification was implemented on the smartphone as an app, or whether a prototype application was developed running on other hardware devices.
Another example of an accelerometer-based TMD application is that of Nham et Al. [4], in which offline classification of transportation modes is performed by training a Support Vector Machine (SVM) with 253 features (250 FFT coefficients, signal energy, mean and variance), obtaining an accuracy over 90%. Similarly to Manzoni et Al., they calculated the magnitude of the total acceleration vector in order to eliminate the dependency of the signal processing on the phone orientation. It should be remarked that the authors did not test the algorithm as extensively as Hemminki et Al., since they trained the model with each subject's data independently, rather than training it on one set of subjects and then testing it on an unrelated subject.
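A feature vector of this kind (spectral coefficients plus energy, mean and variance of the total acceleration) can be sketched in a few lines of Python. The sketch uses a plain discrete Fourier transform from the standard library rather than an optimized FFT, and the window length, the number of coefficients and the normalisation are assumptions, not the settings used in [3] or [4].

```python
import cmath
import math


def dft_magnitudes(window, n_coeffs):
    """Return the normalised magnitudes of the first n_coeffs DFT coefficients."""
    n = len(window)
    magnitudes = []
    for k in range(n_coeffs):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(window))
        magnitudes.append(abs(coeff) / n)
    return magnitudes


def window_features(window, n_coeffs=8):
    """Build [spectral magnitudes..., energy, mean, variance] for one window
    of total-acceleration samples."""
    n = len(window)
    mean = sum(window) / n
    variance = sum((x - mean) ** 2 for x in window) / n
    energy = sum(x * x for x in window) / n
    return dft_magnitudes(window, n_coeffs) + [energy, mean, variance]
```

Each window of samples yields one such vector, which is then passed to the classifier (a decision tree in [3], an SVM in [4]).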
Brezmes et Al. [5] propose a solution in which frequency-based features are classified online by a server-based application for the more general purpose of activity recognition, in which standing, walking, running, climbing up stairs, climbing down stairs, etc. are identified. Although the activities classified in this approach are different from those of interest for TMD, some activity recognition solutions have been considered relevant for this review, as they share algorithm characteristics and implementation issues with the TMD problem. In their approach, the K-nearest neighbours algorithm is used assuming no a priori knowledge of the phone orientation.
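Yan et Al. [6] provide an insight into the problem of online implementation of activity recognition algorithms on smartphones, by analysing in detail the energy efficiency issue. Their work provides an analysis of the effects on energy consumption of the sensor sampling frequency (see for example Figure 2) and of the choice of the set of features used in activity classification. They found that the trade-off between energy overhead and classification accuracy is activity-dependent. For this reason, they propose an activity-adaptive approach (called A3R), in which the sampling frequency and the feature set are selected depending on the activity that is being performed by the user. A J48 adaptive decision tree was trained using the Weka toolkit.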

Figure 2. Accuracy at different sampling frequencies and classification features [6].

The strategy was tested on Android phones, achieving overall energy savings of 20-25% with respect to the continuous use of the highest sampling frequency and the larger set of features (non-adaptive approach). Figure 3 and Figure 4 show the result of the analysis of the energy savings obtained with the A3R algorithm and of the time evolution of battery consumption, with respect to the non-adaptive approach and the normal use of the smartphones.
In contrast to the above solutions, Ravi et Al. [7] and Kwapisz et Al. [8] propose activity recognition approaches that use fixed phone positions.
With a purpose similar to that of Yan et Al., the approach proposed by Rosenberg Randleff et Al. for TMD [9] focuses on simplifying the TMD estimation process, in order to keep the smartphone workload as low as possible. Their algorithm mainly exploits the accelerometer data and introduces the GPS data and other information (such as train tracks) only when necessary for the classification. The strategy is to use accelerometer data only, as long as these data provide enough evidence of the transportation mode used. In this way, energy-demanding sensors and data processing are activated only when needed, and the algorithm complexity is kept to a minimum. It should be highlighted that this algorithm was tested by the authors only on a limited set of data and that the training phase requires considerable effort from the users, who are required to label their trips indicating the device orientation in addition to the transportation mode used. However, the idea of conditionally activating additional sensors to complement the accelerometer information is interesting.
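The conditional activation strategy can be sketched as follows. The sketch is purely illustrative and is not the algorithm of [9]: the callables, the confidence score returned by the accelerometer classifier and the threshold `min_confidence` are all assumptions.

```python
def classify_with_escalation(accel_window, accel_classifier, read_gps_speed,
                             gps_classifier, min_confidence=0.8):
    """Classify from accelerometer data alone when the evidence is strong enough;
    otherwise switch on the GPS and classify with the extra information.

    Returns (mode, gps_used).
    """
    mode, confidence = accel_classifier(accel_window)
    if confidence >= min_confidence:
        return mode, False  # the energy-demanding sensor stays off
    speed = read_gps_speed()  # activated only when needed
    return gps_classifier(accel_window, speed), True
```

The energy saving comes from the fact that `read_gps_speed` is never called on the confident path, so the threshold directly trades accuracy against battery consumption.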
A similar approach is that proposed by Reddy et Al. [10], who developed a classification system of transportation modes using GPS and accelerometer data only, obtaining an accuracy better than 93% over a set of five transportation modes. The classifier is composed of a decision tree followed by a first-order discrete Hidden Markov Model and analyses the GPS speed every second, together with the variance and frequency components of the accelerometer signal.
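The role of the Hidden Markov Model stage can be illustrated with a minimal Viterbi smoother applied to the per-second labels produced by a first-stage classifier. The transition and emission probabilities below (`stay_prob`, `correct_prob`, spread uniformly over the other states) are hypothetical and are not the parameters trained in [10].

```python
import math


def viterbi_smooth(observed, states, stay_prob=0.9, correct_prob=0.8):
    """Return the most likely sequence of true modes given noisy per-second labels."""
    switch = (1.0 - stay_prob) / (len(states) - 1)
    wrong = (1.0 - correct_prob) / (len(states) - 1)

    def log_trans(prev, cur):
        return math.log(stay_prob if prev == cur else switch)

    def log_emit(state, label):
        return math.log(correct_prob if state == label else wrong)

    # Uniform prior over the initial state.
    scores = {s: -math.log(len(states)) + log_emit(s, observed[0]) for s in states}
    backpointers = []
    for label in observed[1:]:
        new_scores, pointers = {}, {}
        for s in states:
            best_prev = max(states, key=lambda p: scores[p] + log_trans(p, s))
            new_scores[s] = scores[best_prev] + log_trans(best_prev, s) + log_emit(s, label)
            pointers[s] = best_prev
        scores = new_scores
        backpointers.append(pointers)

    # Trace the best path backwards from the best final state.
    state = max(states, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

With these assumed parameters an isolated one-second outlier is overridden by its neighbours, while a sustained change of label is kept as a genuine mode switch.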
Figure 5 shows the distribution of the speed signal for the different transportation modes, as reported by the authors. An additional algorithm described in the paper turns on the classifier when the user goes outdoors, using changes in the connected cell tower as a trigger to start logging the GPS signal. The main drawbacks of this approach are that the set of transportation modes considered for classification is quite limited, as "motorized transport" is treated as a single class, and that the training process required the users to wear six smartphones at specified positions on their body, in order to train a generalized decision tree able to classify modes of transportation independently of the actual phone position during usage.

EAI
An interesting result reported by Reddy et Al. is the loss of accuracy caused by the use of other information than that of the accelerometer and the GPS, which is shown in Table 2.

Table 2. Classification accuracy decrease compared to GPS and accelerometer based system [10].
It is not clearly described how the WiFi and GSM information is integrated with the accelerometer and/or GPS data. Presumably, the GSM and WiFi signals were used to calculate the motion speed when the GPS speed is absent. The authors also state that the classifier with all four modalities resulted in a negligible increase in accuracy (0.6%) compared to the use of accelerometer and GPS only.
Another similar approach is that of Xiao et Al. [11], which uses speed statistics derived from GPS and cellular network information and the standard deviation of the magnitude of the force on the body obtained from accelerometer samples. This approach creates traces of positions from the GPS data, with GSM-based position estimates in case of GPS unavailability, and exploits this geographical information to detect stop positions. A decision tree classifies the standard deviation of the force and the maximum and average moving speed along a trace between two successive stops. It should be noted that the algorithm, as presented in the paper, does not distinguish among a wide variety of transportation modes, as it is limited to differentiating motorized transportation among bus, MRT (Mass Rapid Transit, a system that forms the backbone of public transportation in Singapore) and taxi.
An example of very accurate TMD based on GPS and GIS data is that developed by Stenneth et Al. [12], in which the knowledge of the underlying transportation network, in terms of bus locations, spatial rail and spatial bus stop information, is exploited. The authors state that the use of this information improves the accuracy of the TMD algorithm by 17% in comparison with the GPS-only approach and by 9% in comparison with GPS and GIS models, achieving an overall estimation accuracy of over 90%. In this approach, a centralized system collects data from smartphones, processes the information and sends the resulting classification back to the user. The GPS data are pre-processed in order to filter out invalid GPS points and are sent to the central station every 15 seconds, to avoid battery drain. The detection algorithm uses a Random Forest model and only five classification features: average speed, average rail line closeness, average acceleration, average bus closeness and candidate bus closeness. The main drawbacks of this approach are that the infrastructure knowledge required makes this solution difficult to deploy over large geographical areas and that the use of GPS data, even though collected only every 15 seconds, will probably reduce the smartphone battery life excessively.
Another example of a TMD solution based on GPS data is presented by Gonzalez et Al. [13], in which a Multi-Layer Perceptron neural network model is trained with GPS data to classify transportation modes among car, walk and bus. The interesting aspect of their implementation is the Critical Points algorithm (see Figure 6), which significantly reduces the number of GPS points that represent the user's path and that are sent to the server for classification, thus saving battery, network bandwidth and storage space. The overall accuracy obtained by the authors is greater than 91%, with 100% accuracy in the classification of walking segments.

An interesting perspective is that of enhancing model predictions by including knowledge of a user's past history. This approach would certainly require the setup of personal feature databases and higher computational complexity, but would exploit the normally repetitive trajectories of users in everyday transportation. A discussion of this approach can be found in [14].

Solutions based on other devices
TMD solutions based on other mobile devices refer to those approaches found in the literature in which data are collected by means of general portable devices, typically accelerometers and GPS loggers.
One of the first contributions (2004) to the activity recognition problem is that proposed by Bao and Intille [15], in which five biaxial accelerometers worn simultaneously on different parts of the user's body were used to collect acceleration data, from which mean, energy, frequency-domain entropy and correlation were extracted. The C4.5 decision tree classifier was selected as the best performing, recognizing a variety of 20 everyday activities with an overall accuracy of 84%. No online classification was implemented, but a detailed post-processing analysis was performed in order to validate the model with different data sets (the leave-one-subject-out validation approach was used).
A more recent work (2010) in the more specific field of transportation mode detection is that of Zheng et Al. [16]. Their approach consists of three parts: a change-point-based segmentation that partitions GPS trajectories into segments of different transportation modes, an inference model that classifies the segment features and a graph-based post-processing algorithm that improves inference performance. The classification algorithm can distinguish between walk, driving, bus and bike transportation modes.
The change-point segmentation algorithm relies on the assumption that walking should be a transition between different transportation modes and that during transitions between two modes the GPS speed should be close to zero, as people must stop and then go when changing their transportation mode. An example of a change-point-based graph is shown in Figure 7, where change points are represented as circles.
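A strongly simplified version of this segmentation idea is sketched below: each GPS point is marked as walk or non-walk by a speed threshold, and a change point is placed wherever that mark flips. The threshold `walk_speed_max` is a hypothetical value, and the merging of short or uncertain segments that a practical method such as [16] needs is deliberately omitted.

```python
def split_at_change_points(speeds, walk_speed_max=2.0):
    """Split a sequence of GPS speeds (m/s) into alternating walk / non-walk segments.

    Returns a list of (start_index, end_index_exclusive, kind) tuples; every
    boundary between two consecutive segments is a candidate change point.
    """
    segments = []
    start = 0
    for i in range(1, len(speeds) + 1):
        at_end = i == len(speeds)
        # A segment closes at the end of the data or when the walk mark flips.
        if at_end or (speeds[i] <= walk_speed_max) != (speeds[start] <= walk_speed_max):
            kind = "walk" if speeds[start] <= walk_speed_max else "non-walk"
            segments.append((start, i, kind))
            start = i
    return segments
```

The non-walk segments produced in this way are the ones that would then be handed to the inference model for classification.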
The post-processing algorithm calculates a transport mode probability considering both real-world constraints on transportation and typical user behaviour based on locations. The model is however not user-specific, as it is trained on datasets from different users.
Another recent contribution (2013) to this research area is that of Biljecki et Al. [17], in which the concept of single-mode segment, similar to the segments defined by Zheng et Al., is introduced. However, the method proposed by Biljecki et Al. differs from previous works for the following main reasons, as reported by the authors: a) it exploits the fuzzy concepts of membership functions and certainty factors (from the expert systems field); b) it uses OpenStreetMap data; c) there is a strong separation between reasoning and knowledge, so that parameters can be modified and new transportation modes can be easily added. Moreover, this solution can distinguish among 10 transportation modes and handles data with signal shortages and noise.
The features used in the expert system for classification are: three single values of speed in the segment, the mean speed, the mean moving speed, five average proximities of the segment to the infrastructures used by the selected transportation modes and the location of the trajectory with respect to water surfaces.
A GPS logger was used to collect data, which were added to data downloaded from the internet (from the OpenStreetMap database), obtaining a data set of 17.5 million GPS points for training and validation. The overall accuracy obtained is higher than 91%. The interesting characteristic of this approach is the use of expert systems in this kind of application. Indeed, the logical rules constructed on the membership functions are very easy to interpret and modify.
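To show why such rules are easy to read and modify, the following sketch combines trapezoidal membership functions into a certainty value per mode. Everything in it is hypothetical: the feature names, the trapezoid parameters, the three example modes and the use of the minimum as the combination operator are assumptions and are not taken from [17].

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 0 outside (a, d), 1 on [b, c], linear in between."""
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


# Knowledge base kept separate from the reasoning code: one trapezoid per feature.
RULES = {
    "walk": {"mean_speed_kmh": (0, 0, 6, 9)},
    "bicycle": {"mean_speed_kmh": (5, 10, 25, 35)},
    "train": {"mean_speed_kmh": (20, 50, 160, 200),
              "rail_distance_m": (0, 0, 20, 50)},
}


def certainty(features, rule):
    """Certainty of one mode: the weakest membership among its features."""
    return min(trapezoid(features[name], *params) for name, params in rule.items())


def classify(features):
    """Return the mode with the highest certainty, plus all certainties."""
    scores = {mode: certainty(features, rule) for mode, rule in RULES.items()}
    return max(scores, key=scores.get), scores
```

Adding a transportation mode in this sketch only requires a new entry in `RULES`, which mirrors the separation between reasoning and knowledge described above.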

This solution was implemented and tested with GPS receivers and GPS phones, which sent data to a web application for processing. During training, the subjects were asked to label their trajectories on the web application.

Discussion and solutions comparison
A summary of the main implementation features of the most relevant contributions described in sections 2 and 3 is reported in Table 5. This table compares the works found in the literature in terms of: identified classes; sensors and information used as input; implementation mode; duration of test data; number of users involved in the experimental phase; claimed accuracy.
It can be noticed that most of the effort in the literature is focused on smartphone-based solutions, even though not all the applications were developed to actually run online on a smartphone. Most of them used smartphones as sensors to measure accelerations and collect GPS position data. However, some of the offline solutions are suitable, in principle, for online implementation on smartphones.
One of the most relevant contributions in the set of smartphone-based solutions is that of Manzoni et Al. [3], because of the simplicity of the classification algorithm proposed and of the high number of transportation modes considered.
Accelerometer and GPS are the most used sensors, this information being sometimes integrated with GSM, maps and/or knowledge of the underlying transportation network. The highest accuracy (93.8%) is achieved by Nham et Al. [4] using only accelerometer data. However, their solution distinguishes only four different classes of transportation modes: walk, run, bicycle and motorized transport.
The second best accuracy (93.6%) is achieved by Reddy et Al. [10] using both accelerometer and GPS data and distinguishing five different classes (stationary, walk, run, bicycle, motorized transport). Both contributions group the transportation modes of car, bus, tram and train together in a single class named "motorized transport".
The third best accuracy (93.5%) is achieved by Stenneth et Al. [12], distinguishing among six different classes: car, bus, surface train, walk, bicycle, stationary. However, the detection is not based on accelerometer data, which, as already discussed, would preserve battery life, but on GPS data and on the knowledge of the underlying transportation network. As already said, this infrastructure knowledge makes the solution difficult to deploy over large geographical areas or to adapt to heterogeneous transport infrastructures.
Table 3 shows the shortlist of the three works detecting the highest number of transportation modes. Among these three, Biljecki et Al. [17] achieve the highest accuracy of 91.6%. However, their solution is based on GPS data and on the availability of map-based information such as that provided by OpenStreetMap. The average accuracy of the remaining two works is 83.5%, which could be considered as the state-of-the-art (SoA) benchmark accuracy. However, caution is recommended in this case because, for example, the dataset used by Manzoni et Al. is said to include "several hours" of data but is not actually quantified.
In contrast, the datasets of Hemminki et Al. include 150 hours of data, which makes them among the largest found in the literature. Furthermore, the experimental phase is detailed in different qualified scenarios, based on different smartphone models (three) and users (16) from four different countries, showing that the approach is effective in different geographic locations.
In conclusion, the work by Hemminki et Al., which is very recent (SenSys '13, Rome, November 2013), may be considered the most valuable one found in the literature by the authors of this review. However, the system proposed may still be improved, as discussed in sec. 5.
The accelerometer is the most widely used sensor for TMD applications, as it limits battery consumption and captures relevant features for detecting motion. In transportation mode detection, a key challenge is to distinguish among motorized classes such as bus, car, train and metro. This differentiation may sometimes be difficult because these transportation modes share common characteristics (such as the average speed and accelerations), which make it hard to identify suitable features for the classification algorithm. It is not incidental that the two contributions that achieve the highest accuracies (Nham et Al. [4] and Reddy et Al. [10]) consider only a single "motorized transport" class. Identifying the "walk" and "stationary" transportation modes is a simpler task because they are characterized by distinct features. For example, walking is characterized by higher values (with respect to the other classes) of the standard deviation and maximum of the total acceleration vector. On the other hand, the stationary class is characterized by lower values of the maximum and median of the total acceleration vector.

The most interesting and promising works found in the literature by the authors of this review are those by Manzoni et Al. [3] and Hemminki et Al. [1], belonging to the smartphone-based solutions, since: a) they detect a high number of relevant transportation modes; b) they rely only on accelerometer data (e.g. they do not rely on a maps database); c) they do not require dedicated devices. Indeed, the work by Hemminki et Al., given the quality of their validation phase in terms of both the extent (over 150 hours) and the variety of the dataset (16 users from four different countries using three different smartphone models), appears to be the most valuable work on transportation mode detection found by the authors of this paper. However, as recognized by the authors themselves, the system proposed still has the following drawbacks: it is susceptible to interference from extraneous kinematic events; it has a latency (20 s) in detecting the correct modality when switching to a motorized transportation modality. These could be significantly reduced by fusing measurements from additional sensors (e.g. changes in the GSM or WiFi signal environment, GPS speed or changes in the magnetic field).
Therefore, improvements may be made in the field of TMD, taking into account that technology evolution is providing increasingly powerful smartphones that embed a growing number of sensors which could be exploited to detect the mode of transportation of a person, such as, for example, the temperature sensors already integrated in the Samsung Galaxy S4 and in the new Samsung Galaxy S5 model.
In conclusion, the authors of this review point out the following topics as the most relevant for further investigation in future works: 1) Improvement of the latency required by TMD algorithms to identify a "motorized" class; 2) Improvement of the robustness of TMD algorithms, for example against interference from extraneous kinematic events, such as user interactions or changes in the orientation of the phone; 3) Reduction of the power consumption required by TMD algorithms, since battery life is one of the critical factors in today's smartphones. This could be achieved by exploiting low-power co-processors dedicated to the management of sensors, which have been integrated in the last year in the most recent smartphones; 4) Exploitation and integration in TMD algorithms of innovative information provided by new sensors, such as the already mentioned temperature sensor; 5) Adaptation of TMD algorithms to run on smart devices other than smartphones, such as smartwatches or smartglasses.

Figure 1. Overview of the system architecture including three classifiers proposed by Hemminki et Al. [1].


Figure 4. Power consumption of different activity recognition modes in daily lifestyle settings (evaluating the embedded A3R for two Android users) [6].

Figure 6. Comparison between Car trip data with All GPS Points (top) and Car trip data with only Critical Points (below) [13].

Table 1. PROs and CONs of server-based and smartphone-based solutions in the implementation of the classification algorithm.

Table 3 .
PROs and CONs of server-based and smartphone-based solutions in the implementation of the classification algorithm.

Table 5. Comparison of Transportation Mode Detection Solutions based on smartphones and other devices.