Toward Detection and Monitoring of Gait Pathology using Inertial Sensors under Rotation, Scale, and Offset Invariant Dynamic Time Warping

Walking ability can be degraded by a number of pathologies, including movement disorders, stroke, and injury. Personal activity tracking devices gather inertial data needed to measure walking quality, but the required algorithmic methods are an active area of study. To detect changes in walking ability, the similarity between a person’s current gait cycles and their known baseline gait cycles may be measured on an ongoing basis. This strategy requires a similarity measure robust to variability encountered in an outpatient scenario, including changes in walking surface, walking speed, and sensor orientation. Here we propose rotation, scale, and offset invariant dynamic time warping (RSOI-DTW), a variant of the well-known dynamic time warping (DTW) algorithm, as a generalization of DTW appropriate for three-dimensional inertial data. RSOI-DTW is invariant under rotation, scaling, and offset, yet it preserves the salient features of gait cycles required for gait monitoring. To support this claim, gait cycles from 21 subjects walking with four different styles were compared using both DTW and RSOI-DTW. The data show that RSOI-DTW converges quickly and achieves rotation, scale, and offset invariance. Both algorithms distinguish persons and detect abnormal walking, but only RSOI-DTW does so in the presence of sensor rotation. Variations in walking speed pose a challenge for both algorithms, but performance is improved by collecting baseline information at a variety of speeds.


INTRODUCTION
Movement disorders, stroke, traumatic injury, and other pathologies can degrade walking ability. A patient's walking quality can be monitored by a care provider to guide treatment decisions, measure the effectiveness of interventions, or provide prognostic information to the patient. While personal activity tracking devices make it easy to measure the quantity of a person's walking, they are not yet equipped to measure and monitor its quality over time.
The pathology detection and monitoring problem is challenging because of heterogeneity among persons, pathologies, and devices. What is abnormal in one person might be typical in another, and two different pathologies (say, hemiparesis and Parkinsonian gait) may have little in common. Further, each personal tracking device has its own unique combination of sensors, placement location(s) on the body, and attachment method(s). If inertial signal features are used to monitor walking quality, this heterogeneity must be considered carefully: the features important in one scenario may not be important in others.
To overcome these difficulties, monitoring can be based on a similarity measure, not extracted features. In a similarity-measure-based strategy, rather than learning the unique set of features important for each combination of persons, pathologies, and devices, a monitoring algorithm can instead observe a person's walking at baseline, then monitor its similarity to current walking patterns as time goes on. No prior information about pathology is needed, and a person's unique walking characteristics are not a hindrance.
In rehabilitation following stroke or trauma, similarity to baseline should return as walking ability is recovered. In chronic, progressive disease (including multiple sclerosis, the focus of our clinical research), the similarity to baseline can be quantified on an ongoing basis, and new baselines can be established to detect further degradation. Current walking is compared only to baseline measurements from the same person using the same device, so observed differences may be attributed primarily to changing walking patterns.
Patients would routinely complete a self-initiated walking test at home, wearing a personal activity tracking device or smartphone able to record inertial data. Gait cycles would be compared to baseline cycles using an appropriate similarity measure. For this purpose, we propose rotation, scale, and offset invariant dynamic time warping (RSOI-DTW), a variant of the well-known dynamic time warping (DTW) algorithm. We first establish the theoretical properties of RSOI-DTW, then explore the suitability of DTW and RSOI-DTW for pathology detection and monitoring using data gathered from a walking test with 21 healthy participants.
After a brief literature review, Section 3 describes the walking test and DTW algorithm, then introduces the RSOI-DTW algorithm. Section 4 gives results from the walking test, including evidence that RSOI-DTW (1) converges quickly; (2) achieves RSO invariance in practice; (3) retains the ability to distinguish persons despite its flexibility in matching sequences; and (4) identifies simulated pathology, warranting a trial with true pathology. The last portion of Section 4 revisits these results in the presence of fast walking, highlighting the challenge posed by varying gait speed.

Inertial Data Processing
DTW has been used successfully for activity recognition [10] and biometric gait recognition [4][9][11][13], and several variants have been proposed. Scale and offset invariant DTW, in which one sequence can be scaled or shifted to improve similarity, has been developed by several authors, notably Chen et al., who evaluated an iterative algorithm similar to ours on a number of time series data sets [2]. When analyzing gait, scale and offset invariance may mitigate variability due to walking surface, shoe type, attachment method, and moderate changes in speed.
In addition to scale and offset invariance, RSOI-DTW incorporates rotation invariance, allowing it to match gait cycles regardless of device orientation. This is a necessity in a real-world, self-testing scenario, because a particular orientation cannot be assumed. A rotation invariant DTW algorithm was used by Qiao and Yasuhara to analyze two-dimensional handwriting samples [12], and Bours et al. devised a rotation invariant algorithm based upon principal component analysis [1]. To our knowledge, no iterative, rotation invariant DTW variant appropriate for three-dimensional inertial data had been developed and tested prior to the current work.
However, a number of methods for inertial data processing not based on DTW have incorporated rotation invariance.

Device Based Disability Monitoring
Inertial devices have been used for disability monitoring in many different contexts. The ISway test developed by Mancini et al. uses an accelerometer to measure postural sway resulting from neurological impairment [8]. Salarian et al. created the iTug system, which uses inertial sensors to partly automate the Timed Up and Go test, a clinical measure of balance and mobility [15]. Spain et al. captured differences between multiple sclerosis subjects grouped by disability level using features derived from inertial sensors, but could not detect changes in those features over time [16]. Many studies have used daily step counts derived from inertial sensors as an outcome measure, but comparatively few have assessed features intrinsic to individual gait cycles.

Data Collection and Segmentation
Twenty-one subjects participated in the walking trial. Subjects wore a single ActiGraph accelerometer on their left hip, secured using an elastic belt with a pouch for the device. All subjects wore the same device. Each subject was asked to walk down a long corridor four times to demonstrate four different styles of gait: casual walking, fast walking, ataxic walking, and right leg circumduction. Ataxic walking is seen in persons with balance difficulties, characterized by a wide base and lateral swaying. Circumduction is the outward, circular swinging of one leg in swing phase; it occurs when the leg is rigid or spastic at the knee and/or ankle joint.
Subjects walked with each style in one direction for 40 steps, then turned, paused for five seconds, and walked back with the next style. Each style was demonstrated before the trial by a clinically trained research assistant, and subjects were given an opportunity to practice until comfortable.
To experimentally verify the rotation invariance of RSOI-DTW, three subjects completed a second trial in which they walked casually each time while the sensor orientation was changed.
The data were manually divided by person and walking style and segmented into gait cycles, defined as the data between consecutive left heel strikes. There are prominent peaks in the accelerometer signal at the point of heel strike in all walking styles, making the heel strikes easy to identify. Subsequent processing using DTW and RSOI-DTW exploits these gait cycles.

Dynamic Time Warping
Individual gait cycles were compared using DTW and RSOI-DTW, the variant of DTW described in the next section. Here we offer a brief, formal description of DTW. A more comprehensive treatment may be found in [7].
The DTW algorithm takes two sequences $X = (x_1, \ldots, x_m)$ and $Y = (y_1, \ldots, y_n)$ as inputs and returns a measure of similarity $d_{DTW}$, often called the DTW distance, between them. In this work, the $x_i$ and $y_i$ are three-dimensional acceleration vectors. Our implementation of DTW also returns warped sequences $X_W$ and $Y_W$ derived from $X$ and $Y$ by (possibly) repeating terms to improve alignment. More precisely, $X_W = ((x_1)^{a_1}, \ldots, (x_m)^{a_m})$ and $Y_W = ((y_1)^{b_1}, \ldots, (y_n)^{b_n})$, where $(\cdot)^k$ denotes $k$ repetitions of a term, and the $a_j$ and $b_j$ are positive integers found by the algorithm. Using this notation, the DTW distance $d_{DTW}$ is the squared Euclidean distance between $X_W$ and $Y_W$, defined as follows:
Definition 1. Given sequences $A = (a_1, \ldots, a_N)$ and $B = (b_1, \ldots, b_N)$, the squared Euclidean distance between $A$ and $B$ is:
$$d(A, B) = \sum_{i=1}^{N} \|a_i - b_i\|^2,$$
where $\|\cdot\|$ is the usual Euclidean norm.
To compute $d_{DTW}$, $X_W$, and $Y_W$, we first construct an $(m \times n)$ matrix $D$, where $D_{(i,j)} = \|x_i - y_j\|^2$. Intuitively, we then find the minimum cost path through $D$ from $D_{(1,1)}$ to $D_{(m,n)}$ subject to a path constraint. Letting $w_k$ be the $k$th element of a warping path $W$ (a possible path through $D$), we constrain $W$ to allow only three moves: repeat the current point in $X$, repeat the current point in $Y$, or move to the next point in both. Formally, if $w_k = (i, j)$, then $w_{k+1} \in \{(i, j+1), (i+1, j), (i+1, j+1)\}$. The optimal path from $(i, j)$ to $(m, n)$ and its cost $C_{(i,j)}$ are computed using dynamic programming, where $C_{(m,n)} = D_{(m,n)}$, and the remaining $C_{(i,j)}$ are given by the following recursion:
$$C_{(i,j)} = D_{(i,j)} + \min\{C_{(i,j+1)},\; C_{(i+1,j)},\; C_{(i+1,j+1)}\}.$$
This recursion may be carried out row-wise or column-wise, with $C_{(i,j)} = \infty$ for $i > m$ or $j > n$. The final DTW distance is $C_{(1,1)}$, and the warping path $W$ along with the warped sequences $X_W$ and $Y_W$ may be recovered from $C$.
In this work, we have resampled $Y$ to be the same length as $X$ and limited the warping path to the Sakoe-Chiba band [14] to reduce computation, so that $C_{(i,j)} = \infty$ whenever $|j - i|$ is greater than one fourth the length of $X$.
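The recursion and band constraint described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `sq_dist` and `dtw`, the `band` parameter, and the tie-breaking rule used when recovering the path are assumptions made here. Indices are 0-based, so the cost matrix is padded with one extra row and column of infinities.

```python
import math

def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def dtw(X, Y, band=None):
    """Return (distance, XW, YW) for two sequences of vectors.

    band is an optional Sakoe-Chiba half-width: cells with |j - i| > band
    keep an infinite cost. band=None disables the constraint.
    """
    m, n = len(X), len(Y)
    # Padding row/column stays infinite, mirroring C(i,j) = inf off the grid.
    C = [[math.inf] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if band is not None and abs(j - i) > band:
                continue
            cost = sq_dist(X[i], Y[j])
            if i == m - 1 and j == n - 1:
                C[i][j] = cost  # terminal cell: C(m,n) = D(m,n)
            else:
                C[i][j] = cost + min(C[i + 1][j], C[i][j + 1], C[i + 1][j + 1])
    if C[0][0] == math.inf:
        raise ValueError("no warping path exists inside the band")
    # Second pass over C: follow the cheapest successor to recover the path.
    i = j = 0
    XW, YW = [X[0]], [Y[0]]
    while (i, j) != (m - 1, n - 1):
        _, i, j = min((C[i + 1][j + 1], i + 1, j + 1),
                      (C[i + 1][j], i + 1, j),
                      (C[i][j + 1], i, j + 1))
        XW.append(X[i])
        YW.append(Y[j])
    return C[0][0], XW, YW
```

Under the setting described in the text, the call would look like `dtw(X, Y, band=len(X) // 4)` after resampling `Y` to the length of `X`; the integer division is a choice made for this sketch.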

Rotation, Scale, and Offset Invariant DTW
RSOI-DTW is an iterative algorithm that alternates between optimizing the rotation, scaling, and offset of the sequence $Y$, and optimizing the warping path using DTW. The former is an instance of the Procrustes problem, which may be solved in closed form using singular value decomposition. The details of this problem are beyond the current scope, but may be found in [5].
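For orientation, one standard SVD-based closed form for this least-squares problem is sketched below. It is not taken from the source, which defers the details to [5]; the symbols $\bar{x}$, $\bar{y}$, $M$, $U$, $\Sigma$, $V$, and $S$ are introduced here for illustration. The $x_i$ and $y_i$ stand for the aligned terms of the warped sequences, the singular values in $\Sigma$ are assumed to be in descending order, and $M$ is assumed to have rank at least 2.

```latex
\begin{aligned}
&\min_{s > 0,\; R \in SO(3),\; b \in \mathbb{R}^3}\;
  \sum_{i=1}^{N} \bigl\| x_i - (s R y_i + b) \bigr\|^2, \\
&\bar{x} = \tfrac{1}{N} \sum_{i=1}^{N} x_i, \qquad
 \bar{y} = \tfrac{1}{N} \sum_{i=1}^{N} y_i, \qquad
 M = \sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})^{\mathsf{T}} = U \Sigma V^{\mathsf{T}}, \\
&S = \operatorname{diag}\bigl(1,\, 1,\, \det(U V^{\mathsf{T}})\bigr), \qquad
 R = U S V^{\mathsf{T}}, \\
&s = \frac{\operatorname{tr}(\Sigma S)}{\sum_{i=1}^{N} \| y_i - \bar{y} \|^2}, \qquad
 b = \bar{x} - s R \bar{y}.
\end{aligned}
```

The factor $S$ is what keeps $R$ a proper rotation: without it, the unconstrained orthogonal solution $U V^{\mathsf{T}}$ could have determinant $-1$, that is, include a reflection.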
In this application, the rotation matrices must be limited to SO(3), the $(3 \times 3)$ orthogonal matrices of determinant 1. The elements of SO(3) are the rigid rotations in $\mathbb{R}^3$, excluding reflection; they correspond to the rotations possible for a rigid physical object. These matrices form a group under multiplication: in particular, they are invertible, and the inverses and products of rigid rotations are also rigid rotations.
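This section first defines the transformations allowed in RSOI-DTW (the RSO transformations), then provides the RSOI-DTW algorithm. Finally, it proves that RSOI-DTW is rotation, scale, and offset invariant under typical circumstances, and that the algorithm is guaranteed to terminate.
Proof of Proposition 1. Let $f_\alpha(x) = s_\alpha R_\alpha x + b_\alpha$ and $f_\beta(x) = s_\beta R_\beta x + b_\beta$ be RSO transformations. Then $(f_\alpha \circ f_\beta)(x) = s_\alpha s_\beta R_\alpha R_\beta x + (s_\alpha R_\alpha b_\beta + b_\alpha)$, which is an RSO transformation because $s_\alpha s_\beta \in \mathbb{R}^+$ and SO(3) is closed under multiplication. So, the RSO transformations are closed under composition. Likewise, $f_\alpha^{-1}(x) = s_\alpha^{-1} R_\alpha^{-1} x - s_\alpha^{-1} R_\alpha^{-1} b_\alpha$ is an RSO transformation because $s_\alpha^{-1} \in \mathbb{R}^+$ and SO(3) is closed under inverse. So, the RSO transformations are closed under inverse.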
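The closure properties of RSO transformations can be checked numerically with a small sketch. An RSO transformation $f(x) = sRx + b$ is represented here as a tuple `(s, R, b)`; this representation and the helper names (`apply`, `compose`, `inverse`, `rot_z`, `rot_x`) are assumptions made for illustration, not part of the source.

```python
import math

def mat_mul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def mat_vec(A, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def rot_z(theta):
    """Rotation about the z axis, an element of SO(3)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(theta):
    """Rotation about the x axis, an element of SO(3)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def apply(f, x):
    """Evaluate f(x) = s R x + b."""
    s, R, b = f
    Rx = mat_vec(R, x)
    return [s * Rx[i] + b[i] for i in range(3)]

def compose(f, g):
    """Return the RSO transformation x -> f(g(x))."""
    sf, Rf, bf = f
    sg, Rg, bg = g
    Rf_bg = mat_vec(Rf, bg)
    return (sf * sg, mat_mul(Rf, Rg), [sf * Rf_bg[i] + bf[i] for i in range(3)])

def inverse(f):
    """Return the RSO transformation undoing f."""
    s, R, b = f
    Rt = transpose(R)  # the inverse of a rotation matrix is its transpose
    Rt_b = mat_vec(Rt, b)
    return (1.0 / s, Rt, [-Rt_b[i] / s for i in range(3)])
```

Both results are again tuples of a positive scale, a rotation matrix, and an offset, which is the closure property in concrete form.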
With this background in place, we now present the RSOI-DTW algorithm:
Algorithm 1. Rotation, Scale, and Offset Invariant DTW
[...] where $f$ is an RSO transformation
9: return $d$, $X_W$, $f(Y_W)$
12: end procedure
This algorithm can be restricted for a particular use case by limiting $f$ to a subset of the RSO transformations. If rotation is not a concern, $R$ may be held to $I$, the identity matrix, reducing RSOI-DTW to scale and offset invariant DTW, as developed in [2]. Similarly, if scaling is not a concern, $s$ may be held to 1.
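The alternation between fitting the transformation and re-running DTW can be illustrated with the restricted variant mentioned above, in which $R$ is held to the identity so that only scale and offset are optimized. This keeps the sketch free of a singular value decomposition. It is a reconstruction from the prose, not the authors' Algorithm 1: the names `soi_dtw`, `fit_scale_offset`, and `dtw_path` are hypothetical, no Sakoe-Chiba band is applied, and the fit simply raises an error when the least-squares scale would not be positive.

```python
import math

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def dtw_path(X, Y):
    """Plain DTW without a band. Returns (distance, path of (i, j) pairs)."""
    m, n = len(X), len(Y)
    C = [[math.inf] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            tail = min(C[i + 1][j], C[i][j + 1], C[i + 1][j + 1])
            if tail == math.inf:  # only the terminal cell has no successor
                tail = 0.0
            C[i][j] = sq_dist(X[i], Y[j]) + tail
    i = j = 0
    path = [(0, 0)]
    while (i, j) != (m - 1, n - 1):
        _, i, j = min((C[i + 1][j + 1], i + 1, j + 1),
                      (C[i + 1][j], i + 1, j),
                      (C[i][j + 1], i, j + 1))
        path.append((i, j))
    return C[0][0], path

def fit_scale_offset(XW, YW):
    """Least-squares s and b minimizing sum ||x - (s*y + b)||^2 over aligned pairs."""
    n, dim = len(XW), len(XW[0])
    mx = [sum(x[k] for x in XW) / n for k in range(dim)]
    my = [sum(y[k] for y in YW) / n for k in range(dim)]
    num = sum((x[k] - mx[k]) * (y[k] - my[k])
              for x, y in zip(XW, YW) for k in range(dim))
    den = sum((y[k] - my[k]) ** 2 for y in YW for k in range(dim))
    if den == 0 or num <= 0:
        raise ValueError("no positive least-squares scale for this alignment")
    s = num / den
    return s, [mx[k] - s * my[k] for k in range(dim)]

def soi_dtw(X, Y, max_iter=100):
    """Scale and offset invariant DTW (rotation held to the identity)."""
    if len(X) != len(Y):
        raise ValueError("resample Y to the length of X first")
    path = [(i, i) for i in range(len(X))]  # start from the unwarped sequences
    d = math.inf
    for iterations in range(1, max_iter + 1):
        d_old = d
        XW = [X[i] for i, _ in path]
        YW = [Y[j] for _, j in path]
        s, b = fit_scale_offset(XW, YW)          # optimize the transformation
        fY = [[s * v + bk for v, bk in zip(y, b)] for y in Y]
        d, path = dtw_path(X, fY)                # optimize the warping path
        if d == d_old:                           # distance stable: stop
            break
    return d, s, b, iterations

# Illustrative data: Z is a scaled and shifted copy of Y.
X = [(math.sin(0.4 * t), 0.5 * math.cos(0.9 * t), 0.01 * t * t) for t in range(16)]
Y = [(math.sin(0.4 * (t + 0.5)), 0.5 * math.cos(0.9 * (t + 0.5)),
      0.01 * (t + 0.5) ** 2) for t in range(16)]
Z = [tuple(3.0 * v + c for v, c in zip(y, (1.0, -2.0, 0.5))) for y in Y]
d_xy = soi_dtw(X, Y)[0]
d_xz = soi_dtw(X, Z)[0]
```

In this example `d_xy` and `d_xz` are expected to agree up to rounding, whereas plain DTW distances to `Y` and `Z` differ widely. Extending the sketch to full RSOI-DTW would mean replacing `fit_scale_offset` with a Procrustes solver that also returns a rotation.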
Proposition 2. RSOI-DTW is rotation, scale, and offset invariant. More precisely, let $X$, $Y$, and $Z$ be sequences in $\mathbb{R}^3$ of equal length, where $Z = f_z(Y)$ for some RSO transformation $f_z$. If $d(X_W, f(Z_W))$ has a unique minimizer $f^*$ at each iteration of the RSOI-DTW algorithm, then RSOI-DTW$(X, Y)$ = RSOI-DTW$(X, Z)$.
Proof. Let $f_z$ be the RSO transformation taking $Y$ to $Z$, so that $f_z(Y) = Z$, and suppose $f_z(Y_W) = Z_W$ at the beginning of the $i$th iteration of RSOI-DTW$(X, Y)$ and RSOI-DTW$(X, Z)$. We proceed by induction; note that the $i = 1$ case holds, since we initialize $Y_W \leftarrow Y$ and $Z_W \leftarrow Z$.
Given a unique RSO transformation $f^*$ that minimizes $d(X_W, f(Z_W))$ in iteration $i$, the RSO transformation $(f^* \circ f_z)$ must be the unique minimizer of $d(X_W, f(Y_W))$. To see this, suppose there were some other RSO transformation $g$ such that $d(X_W, g(Y_W)) \le d(X_W, (f^* \circ f_z)(Y_W))$. Since $f_z^{-1}$ is the inverse of $f_z$, we have:
$$d(X_W, (g \circ f_z^{-1})(Z_W)) = d(X_W, g(Y_W)) \le d(X_W, (f^* \circ f_z)(Y_W)) = d(X_W, f^*(Z_W)),$$
violating our assumption that $f^*$ is the unique minimizer, because $g \circ f_z^{-1}$ is an RSO transformation distinct from $f^*$. Knowing that $f^*$ and $(f^* \circ f_z)$ are the unique minimizers found in line 7 of the $i$th iteration of RSOI-DTW$(X, Z)$ and RSOI-DTW$(X, Y)$, respectively, we conclude that the input to the DTW subroutine is $f^*(Z_W)$ in either case. Because of this, DTW returns the same distance $d$ and warping path $W$ in both cases, guaranteeing $f_z(Y_W) = Z_W$ at the beginning of the $(i+1)$th iteration, completing our inductive proof.
Intuitively, since the first step in RSOI-DTW is to optimally rotate, scale, and shift the input, the sequences $Y_W$ and $Z_W$ are both transformed to $f^*(Z_W)$ in the first iteration of the algorithm, and subsequent processing is identical.
Proof. First, notice that $f(X_W) = f(X)_W$, because applying the warping path $W$ is a repetition of terms, and the transformation $f$ is applied once to each term in either case.
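Let $d_i$, $W_i$, and $f_i$ be the distance, warping path, and optimal RSO transformation found in iteration $i$, so that $d_i = d(X_{W_i}, f_i(Y)_{W_i})$. The transformation $f_{i+1}$ minimizes $d(X_{W_i}, f(Y_{W_i}))$ over the RSO transformations, and DTW finds the warping path $W_{i+1}$ minimizing $d(X, f_{i+1}(Y))$. Together, we have:
$$d_{i+1} = d(X_{W_{i+1}}, f_{i+1}(Y)_{W_{i+1}}) \le d(X_{W_i}, f_{i+1}(Y)_{W_i}) \le d(X_{W_i}, f_i(Y)_{W_i}) = d_i.$$
Since this holds for all $i$, the sequence $\{d_i\}_{i \in \mathbb{N}}$ is monotonically decreasing. Further, since the $d_i$ are squared Euclidean distances, they are bounded below by zero. Therefore this sequence converges by the monotone convergence theorem, guaranteeing termination of the RSOI-DTW algorithm.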

Convergence and RSO Invariance of RSOI-DTW in Practice
Having established the theoretical properties of RSOI-DTW, we now show it performs as expected on our walking data. Figure 1 illustrates the use of RSOI-DTW to compare two gait cycles. Both cycles were taken from a single subject's casual walking segment. The raw cycles $C_1$ and $C_2$ are shown in the top and middle left plots, respectively. The bottom left plot shows $C_2$ after a randomly chosen RSO transformation $f$ is applied. The right panel shows how RSOI-DTW alters the cycles. When running RSOI-DTW$(C_1, C_2)$, a warped version of $C_1$ (top right) and a warped, transformed version of $C_2$ (middle right) are returned. Here the rotation, scale, and offset are small, because sensor alignment was consistent and no rotation was applied. When running RSOI-DTW$(C_1, f(C_2))$, the same two plots are returned. The warped version of $C_1$ (not shown) is identical to the top right plot, and the warped, rotated version of $f(C_2)$ (bottom right) is identical to the plot above it, because RSOI-DTW is invariant under $f$.
To verify that RSOI-DTW is invariant under a real rotation (that is, a misorientation of the sensor), three of our participants completed a second trial with the sensor in four different orientations: no rotation, a 90° rotation about the medial-lateral axis, a 180° rotation about the medial-lateral axis, and a 180° rotation about the vertical axis. The 125 rotated signals were compared to the non-rotated signals using DTW and RSOI-DTW, and the resulting distances were [...] (Table 1), and RSOI-DTW distances were similar for all four orientations (Figure 2). As additional support, randomly chosen RSO transformations were applied to each of our 21 participants' 12 casual gait cycles. RSOI-DTW was used to compare the transformed cycles to the original, non-transformed cycles, for a total of $\binom{252}{2}$, or 31,626, comparisons. These distances were compared to the corresponding distances obtained without first applying a transformation. In all 31,626 cases, the RSOI-DTW distances were identical up to rounding error, never differing by more than $10^{-13}$. Together with the experimental result, this suggests that RSOI-DTW is RSO invariant in practice when used on inertial time series data.
Figure 3 shows histograms of the RSOI-DTW convergence rate when comparing (a) pairs of casual walking cycles, and (b) fast walking cycles to casual walking cycles. In over 60,000 runs per plot, the algorithm most often required 7 iterations, rarely over 20, and never over 30. To ensure local optimality, we insisted that $d = d_{old}$ for convergence, meaning the warping path was stable. In our data, using a less strict (e.g. $10^{-5}$) convergence criterion typically reduces the number of iterations by one, and never by more than two.
As shown in Algorithm 1, each iteration calls DTW once and the Procrustes algorithm once. DTW involves $O(N^2)$ computations, where $N$ is the length of the inputs: computation is proportional to the number of pairings between points in $X$ and points in $Y$. However, RSOI-DTW must recover the warping path $W$ in addition to the distance $d$, requiring a second trip through the cost matrix and adding a multiple of $N^2$ computations. Run time is increased, but not by more than a factor of two. The Procrustes problem requires only $O(N)$ computations [5], and in our data set, DTW occupies the vast majority of run time. A conservative run time estimate may be obtained by multiplying the DTW run time by $2I$, where $I$ is the number of iterations required.

Distinguishing Between Persons Using DTW and RSOI-DTW
For the gait recognition problem, 1NN classification was used to match gait cycles from all 21 participants to the casual walking template cycles under DTW and RSOI-DTW distances. A test cycle $Y$ was classified as belonging to the owner of $X$, where $X$ is the template cycle minimizing DTW$(X, Y)$ or RSOI-DTW$(X, Y)$.
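The 1NN rule described above amounts to a linear scan over the template set. The sketch below assumes templates are stored as `(owner, cycle)` pairs and uses a plain squared Euclidean distance as a stand-in; in the setting of the paper the `distance` argument would be a DTW or RSOI-DTW distance. The names `classify_1nn` and `sq_euclid`, and the toy data, are hypothetical.

```python
import math

def sq_euclid(A, B):
    """Stand-in distance: squared Euclidean distance between equal-length cycles."""
    return sum((a - b) ** 2 for p, q in zip(A, B) for a, b in zip(p, q))

def classify_1nn(test_cycle, templates, distance=sq_euclid):
    """Return (owner, distance) of the template closest to test_cycle."""
    best_owner, best_d = None, math.inf
    for owner, template in templates:
        d = distance(template, test_cycle)
        if d < best_d:
            best_owner, best_d = owner, d
    return best_owner, best_d

# Toy template set: two cycles from one subject, one from another.
templates = [
    ("subject_a", [(0.0, 0.0, 1.0), (0.5, 0.0, 1.0), (1.0, 0.0, 1.0)]),
    ("subject_a", [(0.0, 0.1, 1.0), (0.6, 0.1, 1.0), (1.0, 0.1, 1.0)]),
    ("subject_b", [(0.0, 1.0, 0.0), (0.0, 1.5, 0.0), (0.0, 2.0, 0.0)]),
]
test_cycle = [(0.0, 0.05, 1.0), (0.55, 0.05, 1.0), (1.0, 0.05, 1.0)]
owner, best = classify_1nn(test_cycle, templates)
```

The returned best-match distance is the same quantity that the threshold-based decision problem operates on.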
Table 2 shows that gait recognition succeeds in our data set under both DTW and RSOI-DTW distances. Out of [...]
Gait recognition may also be treated as a decision problem: the algorithm must decide (YES/NO) whether a gait cycle of unknown origin belongs to a given person. As seen in Figure 4, a threshold is set on the best match between the tested cycle and the known subject's template cycles. If the best match distance is below the threshold, the cycle is accepted; otherwise it is rejected. Table 3 summarizes the decision problem results in all subjects by the equal error rate (EER), calculated by finding the threshold minimizing the difference between the false negative and false positive rates, then taking their average. Figure 4 shows decision problem results for subject 8, who had the worst EER (7.1%) under RSOI-DTW among all subjects because of two unusual, poorly matching cycles. The error rates are zero in all subjects under DTW and in 17 of 21 subjects under RSOI-DTW.
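The EER calculation described above can be sketched as follows. The inputs are best-match distances for cycles that should be accepted (`genuine`) and cycles that should be rejected (`impostor`). These names, the choice of candidate thresholds (every observed distance plus infinity), and the tie-breaking toward the smallest threshold are assumptions made for this illustration.

```python
import math

def error_rates(genuine, impostor, threshold):
    """A cycle is accepted when its best-match distance is below the threshold."""
    fnr = sum(d >= threshold for d in genuine) / len(genuine)   # wrongly rejected
    fpr = sum(d < threshold for d in impostor) / len(impostor)  # wrongly accepted
    return fnr, fpr

def equal_error_rate(genuine, impostor):
    """Return (EER, threshold) at the threshold where FNR and FPR are closest."""
    candidates = sorted(set(genuine) | set(impostor)) + [math.inf]

    def gap(t):
        fnr, fpr = error_rates(genuine, impostor, t)
        return abs(fnr - fpr)

    best_t = min(candidates, key=gap)
    fnr, fpr = error_rates(genuine, impostor, best_t)
    return (fnr + fpr) / 2, best_t

# Well-separated distances give an EER of zero.
eer_clean, t_clean = equal_error_rate([1.0, 2.0], [5.0, 6.0])
# One poorly matching genuine cycle (distance 6.0) overlaps the impostors.
eer_overlap, t_overlap = equal_error_rate([1.0, 2.0, 6.0], [5.0, 7.0, 8.0])
```

The second call shows how a single poorly matching cycle raises the EER of a subject, which is the situation reported for subject 8.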

Detecting Simulated Pathology
The pathology detection problem is similar to the gait recognition decision problem: the algorithm must decide (YES/NO) whether an unknown cycle represents normal gait or possible pathology. As before, a threshold is placed on the best match between the tested cycle and the known subject's template cycles. Table 4 shows that the two simulated pathologies were easily recognized in all subjects under both DTW and RSOI-DTW distances. The EER was nonzero in only one subject under RSOI-DTW. As before, only RSOI-DTW is capable of this accuracy despite the variability possible in a real-world use case.

Fast Walking
An algorithm monitoring walking ability should separate possible pathology from normal walking at any speed. [...] Figure 5 illustrates this difficulty in subject 10: while RSOI-DTW dramatically lowered the distances between fast cycles and the template cycles compared to DTW, the reduction is not enough for reliable pathology detection or gait recognition.
In light of this difficulty, either (1) walking speed must be consistent between cycles, or (2) the template set must include cycles at many speeds. To test the latter, fast walking cycles were added to the template set in each subject, with results shown in Table 6. With this modification, simulated pathology could again be distinguished from normal walking (casual or fast). Compared to the near perfect results in the previous section, however, the EER is high in several subjects. RSOI-DTW improves the EER in two subjects, but worsens it in others.

DISCUSSION
Under the ideal conditions in our walking test, both DTW and RSOI-DTW performed well. However, only RSOI-DTW is equipped to compensate for the peculiarities of real-world gait data. Section 4 illustrates the cost of this advantage. First, in rare cases, the flexibility of RSOI-DTW allowed it to closely match cycles from different subjects. This confused the decision algorithm more than the classification algorithm, because the latter had information about the match distance to other subjects. Second, RSOI-DTW increases run time by a factor of I to 2I, where I is the number of iterations. In this data set, I had a median value of 8. We recommend using DTW in a controlled, clinical scenario, and RSOI-DTW in real-world remote monitoring. Indeed, the RSOI-DTW algorithm can be constrained as much as the use case allows, as mentioned in Section 3.3.
RSOI-DTW solves the orientation problem, but not the gait speed problem: the distance between fast walking and casual walking was reduced, but still significant. Section 4.4 shows that increasing the size and variety of the template set improved performance substantially. We believe that further template collection may solve the problem.
Matching inertial time series can be broken into many subproblems. One could find the gravitational vector, using it to partly correct the orientation, then adjust for sensor bias, finally running DTW to optimize alignment, and so on. RSOI-DTW is attractive partly because it solves all of these problems at once. We have not compared RSOI-DTW to another rotation invariant algorithm, because there is no clear candidate: other rotation invariant approaches use entirely different methodologies.
In our view, the RSO transformation is the most general affine transformation reasonable for inertial data; yet RSOI-DTW preserves the distinctive cycle features needed to distinguish persons and walking styles.

CONCLUSIONS AND FUTURE WORK
These results confirm that RSOI-DTW is an appropriate similarity measure for real-world inertial gait data, and support the proposed approach to the detection and monitoring of gait pathology. RSOI-DTW matches gait cycles despite incorrect sensor orientation, partly compensates for changes in gait speed, distinguishes our 21 participants almost perfectly, and detects the changes in walking style in this trial.
Our subsequent work will focus on using DTW and RSOI-DTW to monitor true pathology in both clinic and outpatient settings. In particular, we intend to monitor persons with multiple sclerosis while exercising to identify changes in walking patterns induced by fatigue.
Long-term monitoring will require not just a similarity measure, but also a method for clustering cycles, tracking cluster progression, and summarizing it as an outcome measure. Clustering algorithms that accept arbitrary distance metrics, such as k-medoids or affinity propagation, may be an appropriate starting point.

Definition 2. An RSO transformation is an affine transformation $f : \mathbb{R}^3 \to \mathbb{R}^3$ of the form $f(x) = sRx + b$, where $s \in \mathbb{R}^+$, $R \in SO(3)$, and $b \in \mathbb{R}^3$.
Proposition 1. The RSO transformations are closed under composition and inverse, thus forming a subgroup of the affine group.

Figure 1: RSOI-DTW is rotation, scale, and offset invariant: when an RSO transformation is applied to cycle 2 (bottom left), the RSOI-DTW algorithm corrects it (bottom right).

Figure 2: RSOI-DTW distances are similar regardless of sensor orientation. The figure shows DTW (left) and RSOI-DTW (right) distances between incorrectly oriented cycles and the correctly oriented cycles from the same subject.

Figure 3: RSOI-DTW converges after 8 iterations on average. The figure shows histograms of the number of iterations required to compare pairs of cycles.

Figure 4: Threshold-based gait recognition is imperfect in subject 8 under RSOI-DTW due to two poorly matching gait cycles, but perfect in all but three of the other subjects.

Figure 5: Fast gait confuses the pathology detection and gait recognition algorithms in subject 10 under DTW, but less so under RSOI-DTW. The figure shows distances to the closest casual gait template cycle for several groups of cycles using DTW (left) and RSOI-DTW (right).
[6]en et al. corrected sensor orientation as part of model-based classification of upper limb movements and walking activities [3]. Recently, Gong et al. created a linear dynamical model based method for activity classification that is robust to sensor mounting errors, including rotation [6].

Table 1: RSOI-DTW achieves perfect 1NN recognition of cycles collected with the sensor incorrectly oriented. The table shows the number of cycles correctly classified under DTW and RSOI-DTW distances out of the 125 cycles collected from three subjects.

Table 2: 1NN classification of casual gait is almost perfect under both DTW and RSOI-DTW. However, when rotations are applied to individual cycles beforehand, only RSOI-DTW succeeds. The table shows the number of cycles correctly recognized out of the 252 tested.

Table 3: Threshold-based gait recognition achieves an equal error rate (EER) of zero in all subjects under DTW, and all but four subjects under RSOI-DTW. The table summarizes the EER for the decision problem in all subjects.

Table 4: Simulated pathology can easily be distinguished from casual gait. The table summarizes the equal error rate (EER) for the decision problem with DTW and RSOI-DTW.

Table 5: Fast gait is hard to distinguish from pathology using the casual gait cycles as templates. The table shows the equal error rate (EER) in all subjects when distinguishing normal gait (casual or fast) from simulated pathology.

Table 6: Including fast gait cycles in the template set dramatically improves performance when distinguishing normal gait from simulated pathology. The table summarizes the improved equal error rate (EER) in all subjects.