Design and Implementation of an Antagonistic Exercise Support System Using a Depth Image Sensor

Dementia is one of the main reasons for elderly people becoming dependent on care. Antagonistic exercise, which involves performing different movements with the upper and lower limbs on the left and right sides, is a form of exercise that uses cognitive and motor functions at the same time. Preventive care professionals who can lead this sort of exercise are few in number compared with elderly people, and are under a heavy burden. On the other hand, the Kinect has become popular as a low-cost device that can acquire human actions. In this study we designed and implemented an antagonistic exercise support system using a Kinect. The user’s joint information acquired from Kinect is used to estimate the exercise, and the system provides real-time feedback to the user. We performed recognition accuracy tests with young and elderly test subjects, and carried out an interview survey to ascertain elderly user opinions.


Introduction
As of 1 October 2016, over 27 percent of Japan's population was aged 65 or over [1]. As Japan's population becomes increasingly elderly, the number of people who are dependent on care will also increase, and preventive care will become even more important if elderly people are to live healthy, independent lives.
Preventive care aims to prevent people from deteriorating to the point where they become dependent on care, or to delay this for as long as possible. Dementia is one of the factors that cause elderly people to become care-dependent. Dementia causes loss of memory and attention that can affect everyday life and the taking of medicines, and is often complicated by physical infirmity. Effective ways of preventing dementia are thought to include lifestyle and dietary improvements, social activity, and appropriate exercise [2], [3]. Methods that combine motor function and cognitive function exercises have also been reported to be highly effective [4]. Antagonistic exercises aim to prevent dementia by having people perform different movements with the upper and lower limbs on the left and right sides, so that motor function and cognitive function are exercised simultaneously [5], [6]. Performing different actions with the left and right arms and legs stimulates both the brain and the body by providing simultaneous mental and physical exercise. A study that examined the effects of antagonistic exercise showed that it is effective at preventing the decline of cognitive function [7].
Preventive care and rehabilitation are generally performed with the guidance and assistance of a professional such as a physiotherapist. Although the number of physiotherapists has been increasing in recent years, it is insufficient to cater for the number of elderly people in Japan [8]. The burden on professionals is therefore increasing, and there is a need for an exercise system that allows elderly people to exercise by themselves in order to reduce this burden.
Although positive effects are achieved by continuously performing preventive care and other health activities, performing the same activities every day can be a mental strain. Therefore, research has been performed on maintaining the motivation of users and encouraging them to use exercise systems by incorporating games in which users perform voluntary activities [9], [10], [11].
Meanwhile, the Kinect system developed by Microsoft is able to recognize people's postures and the three-dimensional coordinates of their joints, and systems that use a Kinect to measure hand and foot movements for rehabilitation purposes have been researched and developed [12], [13]. Since the Kinect can detect real-world human postures, it can also be used to recognize antagonistic exercises. Recently, several Kinect-based commercial rehabilitation systems have been developed [14], [15].
Previously, we designed and developed a prototype lower limb chair exercise support system using a depth image sensor, and evaluated its performance and usability [16], [17]. That system recognizes and evaluates exercises based on 3D position data and joint angles from the skeletal and RGB data obtained from the Kinect sensor. In this study we designed, implemented and evaluated a system that supports antagonistic exercise using a depth sensor. It recognizes exercises by using skeletal data about the user's joints acquired from a depth sensor, and evaluates the user's exercises to provide real-time feedback. This system uses an audiovisual display to explain the exercise procedures to the user, and displays real-time video of the user to encourage the user to perform the exercises. It also has a rhythm game function whereby the user can exercise in time with music. The system provides four types of exercise: upper/lower limb antagonistic movement, upper limb left/right antagonistic movement, rock/paper/scissors using both arms and both legs, and duple/triple time exercises.
After discussing related research in Section 2, the design and implementation of this system are described in Sections 3 and 4. In Section 5, we describe and discuss recognition accuracy experiments conducted with young and elderly test subjects. Finally, we present a summary of this study in Section 6.

"Rehabilitainment" systems
Exercises related to preventive care and rehabilitation have to be performed every day, but tend to become a monotonous task that causes the patient mental stress. Research is therefore being done on so-called "rehabilitainment" systems, where game functions are added in order to keep users motivated and encourage them to use the system. In a study of rehabilitation for standing and sitting by Matsukuma et al. [9], entertainment features were added to the standing and sitting exercises in order to improve the patients' drive and persistence. Erazo et al. [18] built a rehabilitation system for patients with upper limb dysfunction based on a magic mirror game using a Kinect, and are evaluating the effects of this system. Also, the Dance2Rehab3D system of Bruckheimer et al. [11] supports rehabilitation of the upper limbs by using joint angles acquired with a Kinect in an interactive 3D environment with a water tank motif. In experiments conducted with stroke patients, they showed that this system is effective at reducing the level of fatigue experienced in upper limb training.

Exercises to prevent dementia
In addition to lifestyle improvements, social activity and improved diet, it is thought that cognitive function training and appropriate exercise can also be effective for the prevention of dementia [2]. Although activities related to dementia prevention have hitherto mostly consisted of exercises that only target motor functions, it has been reported that greater preventive effects can be achieved by performing cognitive function and motor function exercises simultaneously [4]. Recently, exercises have been developed in which cognitive function and motor function exercises are performed simultaneously.
Antagonistic exercise is a health behavior that combines cognitive function exercise and motor function exercise. In antagonistic exercise, the left/right and upper/lower limbs perform different actions rhythmically, which is expected to promote brain stimulation and suppress deterioration of brain function. A study by Tabira et al. [7] examined the usefulness of antagonistic exercise and its effects on the prefrontal area of the brain. After healthy elderly test subjects performed three different upper limb antagonistic exercises, it was shown that the people who completed the exercise had been stimulated in the lateral prefrontal cortex. The authors explained that antagonistic exercise is useful as an executive function exercise, but must be introduced in accordance with the executive ability of elderly people.

System overview
In this study, we built a system that supports antagonistic exercise by using a depth image sensor [19]. This system has support functions that help users to understand how to perform the exercises, thereby reducing the burden on instructors. For this purpose, it includes functions for displaying video of the user in real time, showing how the exercises should be performed, and providing spoken guidance. Also, to increase the user's understanding of the exercises and motivation to perform them, it has functions for performing exercises in time with music, and for evaluating these exercises to provide real-time feedback.
An overview of the system is shown in Figure 1. The system includes a Kinect and a monitor situated in front of the user, who is seated in a chair. Based on the positional data of each of the user's joints as obtained from the Kinect, the exercise is recognized by determining the distances between the user's joints. The monitor displays real-time video of the user, example images of the exercise, and the exercise evaluation results, and the system uses speech to describe the exercise and inform the user of the evaluation results. The system also has a game function whereby the user performs exercises in time with music, and the system evaluates the user's timing. The system's exercise support functions are shown in Table 1.

Table 1. The system's exercise support functions
To facilitate user interaction, this system avoids the use of a mouse and keyboard as much as possible, and is designed so that the users can initiate exercises by themselves. For this purpose, the Kinect is used to recognize the user's gesture actions. Numbers are displayed on the exercise screen, and an exercise can be selected by gesturing with the hands to indicate the number associated with this exercise. The main gestures and system operations are as follows:
• Holding a hand over a number: Select an exercise
• Raising the right hand: Continue with an exercise; proceed to the next step
• Raising the left hand: Stop an exercise

Recognizing antagonistic exercises
This system includes four types of exercise: upper/lower limb antagonistic movement, upper limb left/right antagonistic movement, rock/paper/scissors using both arms and both legs, and duple/triple time exercises. The antagonistic exercises are recognized by using three-dimensional position information from the joints in the user's legs and arms obtained from the Kinect, together with recognition of rock/paper/scissors moves. For each exercise, we use the distances between joints or the positional coordinates of the joints. Figure 2 shows the positional relationship between the Kinect and the user, and the coordinate axes.

Antagonistic movement of upper limbs
The antagonistic movement of the upper limbs is an exercise in which the user alternately holds out each hand while forming different rock/paper shapes with both hands. The exercise procedure is shown below.
(B1) Hold out the right hand in the shape of "rock" with the elbow extended, and hold the left hand in the shape of "paper" in front of the chest.
(B2) Form the opposite posture to (B1), with the right hand in the shape of "paper" in front of the chest, and the left hand held out in the shape of "rock".

Rock/paper/scissors with both hands and both feet
Rock/paper/scissors with both hands and both feet is an exercise in which the user makes "rock", "paper" and "scissors" shapes in sequence with both hands and both feet. The exercise procedure is shown below.
(C1) Both hands in the shape of "rock", legs closed in the shape of "rock"
(C2) Both hands in the shape of "scissors", legs open to front and back in the shape of "scissors"
(C3) Both hands in the shape of "paper", legs open to left and right in the shape of "paper"
(C4) Repeat steps (C1) through (C3).
Figure 5 shows the postures at each step of the exercise. In the figure, symbols Fl and Fr are the joint positions of the left and right feet, respectively. Figure 6 shows how the x-axis and z-axis distances between the feet change while performing rock/paper/scissors with both hands and both feet. Here, the horizontal axis represents the elapsed time, and one frame is approximately 33 ms. The vertical axis represents the variation of the x-axis or z-axis distance between the feet. From Equations (1) and (2), the status of the hands and feet can be classified into three types: open, closed, and somewhere in between. The distances between joints are measured by choosing two joints according to the exercise being estimated, and calculating the difference between their coordinates as an absolute value.
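The distance measurement and three-way classification described above can be illustrated with a minimal Python sketch. The names `Joint`, `axis_distance` and `classify_opening`, as well as the threshold values and sample coordinates, are assumptions introduced for illustration only; they are not the thresholds of Equations (1) and (2).

```python
from typing import NamedTuple


class Joint(NamedTuple):
    """3D joint position in metres, in the sensor's coordinate frame."""
    x: float
    y: float
    z: float


def axis_distance(a: Joint, b: Joint, axis: str) -> float:
    """Absolute difference between two joints along one axis ('x', 'y' or 'z')."""
    return abs(getattr(a, axis) - getattr(b, axis))


def classify_opening(distance: float, closed_max: float, open_min: float) -> str:
    """Map a joint distance to 'closed', 'open' or 'between'.

    closed_max and open_min are hypothetical thresholds chosen for this sketch.
    """
    if distance <= closed_max:
        return "closed"
    if distance >= open_min:
        return "open"
    return "between"


# Hypothetical foot positions: feet spread to the left and right.
left_foot = Joint(x=-0.30, y=0.10, z=2.00)
right_foot = Joint(x=0.25, y=0.10, z=2.05)

dx = axis_distance(left_foot, right_foot, "x")
state = classify_opening(dx, closed_max=0.15, open_min=0.40)
print(f"x-axis distance = {dx:.2f} m, state = {state}")
```

Using a single-axis absolute difference, rather than the full Euclidean distance, keeps left/right opening (x-axis) and front/back opening (z-axis) separate, which matches the way the feet are distinguished in Figure 6.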

Using the positional coordinates of joints to recognize exercises
Some exercises are estimated by using the positional coordinates of the user's hands. It is possible to judge whether the hands are raised or lowered, or if they are opened horizontally. If (x, y, z) are the three-dimensional coordinates of the user's hands, σx1 and σx2 are threshold values on the x-axis, and σy1 and σy2 are threshold values on the y-axis, then the hand positions on the (x, y) plane are judged according to Equations (3) through (6).

Recognizing rock/paper/scissors positions
Of the four types of antagonistic exercises performed by this system, rock/paper/scissors moves are performed in the exercises involving upper limb left/right antagonistic movement and rock/paper/scissors using both arms and both legs. Therefore, in this system the rock/paper/scissors positions are estimated by using the Kinect to detect the opening and closing of the hands.
Since the Kinect is able to detect fingers and thumbs, it can recognize when the hand is open and closed. The open/closed state of the hand is classified into four types: closed, open, lasso (two fingers extended), and unknown (any other state). These can be used to estimate rock/paper/scissors positions, since "closed" corresponds to "rock", "open" corresponds to "paper", and "lasso" corresponds to "scissors".
If Hdetect is the hand state detected by the Kinect, then the rock/paper/scissors state Hstate is defined by the following formula, where a closed fist corresponds to "rock", an open hand corresponds to "paper", and two extended fingers correspond to "scissors":
Hstate = "rock" if Hdetect = closed; "paper" if Hdetect = open; "scissors" if Hdetect = lasso.
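As a minimal sketch, the mapping from detected hand state to rock/paper/scissors state can be written as a lookup table in Python. The enum `HandState` and the functions `rps_state` and `is_step_c2_hands` are names assumed for this illustration; returning `None` for the "unknown" state is also an assumption, since the text does not say how that state is handled.

```python
from enum import Enum
from typing import Optional


class HandState(Enum):
    """The four hand states named in the text."""
    CLOSED = "closed"
    OPEN = "open"
    LASSO = "lasso"      # two fingers extended
    UNKNOWN = "unknown"  # any other state


# Hdetect -> Hstate, following the correspondence given in the text.
RPS_BY_HAND_STATE = {
    HandState.CLOSED: "rock",
    HandState.OPEN: "paper",
    HandState.LASSO: "scissors",
}


def rps_state(h_detect: HandState) -> Optional[str]:
    """Return 'rock', 'paper' or 'scissors'; None for UNKNOWN (assumed behavior)."""
    return RPS_BY_HAND_STATE.get(h_detect)


def is_step_c2_hands(left: HandState, right: HandState) -> bool:
    """Check the hand part of step (C2): both hands form 'scissors'."""
    return rps_state(left) == "scissors" and rps_state(right) == "scissors"


print(rps_state(HandState.CLOSED))
print(is_step_c2_hands(HandState.LASSO, HandState.LASSO))
```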

Evaluation of timing
In this system, one action of an exercise is made to correspond to one beat of music, and each beat is divided into five parts that are evaluated at three levels: "perfect", "great" and "good". The exercise timing and evaluation are shown in Figure 8. The region closest to the beat timing is given the highest evaluation of "perfect", and the adjacent regions (earlier or later) are evaluated as "great" and "good". Table 2 shows how these exercise timing regions are evaluated, and their corresponding game scores.
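The five-region timing evaluation can be sketched as follows. This is an illustration under stated assumptions: the five regions are taken to be of equal width and centred on the beat, and the function name `timing_grade` and its parameters are hypothetical. The actual region boundaries and game scores are those of Figure 8 and Table 2, which are not reproduced here.

```python
def timing_grade(action_time: float, beat_time: float, beat_period: float) -> str:
    """Grade an action by its offset from the nearest beat.

    Assumes five equal regions per beat, centred on the beat:
    good | great | perfect | great | good.
    """
    if beat_period <= 0:
        raise ValueError("beat_period must be positive")
    # Offset from the beat as a fraction of one beat period.
    offset = (action_time - beat_time) / beat_period
    # Wrap into [-0.5, 0.5) so the offset is measured to the nearest beat.
    offset = (offset + 0.5) % 1.0 - 0.5
    distance = abs(offset)
    if distance <= 0.1:   # central fifth of the beat
        return "perfect"
    if distance <= 0.3:   # the two neighbouring fifths
        return "great"
    return "good"         # the two outermost fifths


# Beats every 1.0 s, with one beat at t = 10.0 s.
for t in (10.05, 9.80, 10.45, 10.95):
    print(t, timing_grade(t, beat_time=10.0, beat_period=1.0))
```

Wrapping the offset to the nearest beat means an action just before the next beat is graded against that beat rather than the previous one.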

System configuration
The overall processing flow of the system is shown in Figure 9. This system broadly consists of five stages: exercise selection, exercise practice, synchronized exercise, rhythm game, and rhythm game results. The user chooses an exercise in the exercise selection stage, and learns how to perform the exercise in the exercise practice stage. In the synchronized exercise stage, the user practices performing a rhythmical exercise, and then plays a game in the rhythm game stage. The game scores and overall evaluation can be checked in the rhythm game results stage when the exercise is finished.
In this system, the user is first presented with real-time video and the four types of exercises, as shown in Figure 10. The numbers in the video on the left side of the figure correspond to the exercise menu items. An item is selected by holding a hand over the desired number for two seconds, whereupon a description of the corresponding exercise is displayed. The selection of an exercise is followed by an exercise practice stage where the user is presented with example images and speech describing how to perform the exercise, and the user performs the exercise a set number of times. When the user has finished practicing, the system moves into the synchronized exercise stage. In synchronized exercise, a sound marks a beat with a fixed rhythm, and the user performs the exercise in time with this beat. When the exercise has been performed a set number of times, the system moves into the rhythm game stage.
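The two-second hold used for menu selection can be sketched as a small dwell timer that is updated once per frame. The class name `DwellSelector` and its interface are assumptions made for this illustration; how the system decides which number the hand is over is outside the scope of the sketch.

```python
from typing import Optional


class DwellSelector:
    """Select a menu item once the hand has stayed over it long enough."""

    def __init__(self, dwell_seconds: float = 2.0) -> None:
        self.dwell_seconds = dwell_seconds
        self._candidate: Optional[int] = None  # item currently hovered
        self._since = 0.0                      # time the hover started

    def update(self, hovered_item: Optional[int], timestamp: float) -> Optional[int]:
        """Feed one frame; return the selected item, or None if nothing is selected yet."""
        if hovered_item != self._candidate:
            # Hand moved to another item (or away): restart the timer.
            self._candidate = hovered_item
            self._since = timestamp
            return None
        if hovered_item is not None and timestamp - self._since >= self.dwell_seconds:
            self._candidate = None  # reset so the selection fires only once
            return hovered_item
        return None


# Simulate a hand held over menu item 2 at about 30 frames per second.
selector = DwellSelector(dwell_seconds=2.0)
results = [selector.update(2, frame / 30) for frame in range(61)]
print(results[-1])
```

Restarting the timer whenever the hovered item changes prevents a hand that merely passes over a number from triggering a selection.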

Rhythm game
Figure 11 shows an example of the screen display during the rhythm game. After the rhythm game has been described, the user performs an exercise in time with music and a sound that plays at a fixed tempo. The circles at the right side of the figure are displayed in time with the exercise, and are synchronized with the music. A third circle shrinks with time from the outermost circle toward the innermost circle, and the user performs the exercise at the moment it overlaps the inner circle. A game score is calculated based on the time difference between when the shrinking circle overlaps the inner circle and when the exercise is performed, and the score gained so far is displayed inside the inner circle. When the music finishes, the final score is displayed together with an evaluation in four grades, S, A, B and C, which correspond to Excellent, Great, Good and Poor, respectively.

Experiments and Discussion
To evaluate the effectiveness of the antagonistic exercise recognition methods in this system, we performed recognition accuracy tests with ten test subjects: three males in their twenties, and seven people in their seventies, eighties and nineties (two males and five females). We also observed the elderly test subjects as they performed the exercises in this experiment, and followed this up with a survey to gather their opinions regarding the system's support functions.

Experimental method
In the recognition accuracy experiments conducted with young test subjects, the test subjects performed each of the four types of antagonistic exercise five times, and we investigated the number of times the system correctly detected the exercise movements being performed by the test subjects. The exercises were counted by regarding a single iteration of an exercise as a series of movements where each action is performed once. For example, in the case of the rock/paper/scissors using both arms and both legs exercise, forming a "rock" posture is regarded as one action, and the three actions corresponding to "rock", "paper" and "scissors" are regarded as one exercise. In the recognition accuracy tests conducted with elderly test subjects, the test subjects were given the choice of whether or not to participate in the experiment, were allowed to stop the exercise at any time, and were able to choose the types of exercises they wanted to participate in.
The experimental environment is shown in Figure 12. The Kinect was situated 0.7 m above the floor and 2.0 m away from the test subject. After a spoken explanation of the purpose and effects of the exercise system, the experiment was started. In this experiment, we measured the accuracy when performing the second-stage synchronized exercises out of the system's three stages. The number of times the actions recognized by the system matched the attempted actions performed by the test subjects was recorded as the number of recognitions achieved by the system. Here, the system's recognition rate is defined as the ratio of the number of recognitions achieved by the system to the number of attempted actions performed by the test subject. The system's recognition rate is expressed by the following formula:
Recognition rate = (number of recognitions achieved by the system) / (number of attempted actions performed by the test subject)
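The recognition rate defined above is a simple ratio, sketched below in Python. The function name `recognition_rate` and the example counts are hypothetical and are not taken from the experimental data.

```python
def recognition_rate(recognized: int, attempted: int) -> float:
    """Ratio of actions recognized by the system to actions attempted by the subject."""
    if attempted <= 0:
        raise ValueError("attempted must be a positive count")
    if not 0 <= recognized <= attempted:
        raise ValueError("recognized must be between 0 and attempted")
    return recognized / attempted


# Hypothetical example: 9 of 10 attempted actions were recognized.
rate = recognition_rate(recognized=9, attempted=10)
print(f"{rate:.0%}")
```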

Results and discussion of recognition experiment
Since participation in the experiment was optional for the elderly test subjects, the number of participating test subjects varied between exercises: four participated in the upper/lower limb antagonistic movement exercise, seven in the upper limb left/right antagonistic movement exercise, three in the rock/paper/scissors using both arms and both legs exercise, and none in the duple/triple time exercise. The lack of participation in the duple/triple time exercise may have been because this exercise has a high level of difficulty.
Figure 13 shows the average recognition rates for the young and elderly test subjects. The average recognition rate for each exercise with the young test subjects was 90% for upper/lower limb antagonistic movement, 100% for upper limb left/right antagonistic movement, 100% for rock/paper/scissors using both arms and both legs, and 80% for the duple/triple time exercise. The overall exercise recognition rate was 93%. The average recognition rate of the exercises when performed by elderly test subjects was 75% for upper/lower limb antagonistic movement, 76% for upper limb left/right antagonistic movement, 67% for rock/paper/scissors using both arms and both legs, and 73% overall.
It is thought that the recognition rate for the duple/triple time exercise with young test subjects was lower than for the other exercises because of the kind of values used for recognition. Exercises are recognized by using the x-axis distances of both arms and legs in upper/lower limb antagonistic movements, the z-axis distances of the hands and shoulders in upper limb left/right antagonistic movements, and the x-axis and z-axis distances of both legs in rock/paper/scissors using both arms and both legs, and these are all relative values between two coordinates. On the other hand, the duple/triple time exercise is recognized using the y-coordinate position of the left hand and the x- and y-coordinate positions of the right hand, which are all absolute coordinate positions.
In the case of relative values between joints, even if the user's position shifts during the exercise, it will still be possible to measure the change in the distance between joints. On the other hand, when using coordinate values, since the threshold value is also a coordinate value, a change in the user's position such as sitting up straight in the chair can make it impossible for the system to recognize exercises. This is thought to be one of the reasons why the recognition rate for the duple/triple time exercise was lower than for the other exercises. It is expected that the recognition rate for the duple/triple time exercise can be improved by using relative values to recognize this exercise.
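The difference between the two kinds of threshold can be illustrated with a short sketch. Everything here is an assumption for illustration: the function names, the choice of the shoulder as the reference joint, and all numeric values are hypothetical, and the sketch does not reproduce the system's actual recognition rules.

```python
def hand_raised_absolute(hand_y: float, threshold_y: float) -> bool:
    """Absolute test: compare the hand's y coordinate with a fixed coordinate threshold."""
    return hand_y > threshold_y


def hand_raised_relative(hand_y: float, shoulder_y: float, margin: float) -> bool:
    """Relative test: compare the hand's height above the shoulder with a margin."""
    return hand_y - shoulder_y > margin


# Hypothetical posture during practice: hand raised 0.25 m above the shoulder.
shoulder_y, hand_y = 0.40, 0.65
# The same arm posture after the whole upper body has shifted 0.10 m downward.
shifted_shoulder_y, shifted_hand_y = 0.30, 0.55

print(hand_raised_absolute(hand_y, threshold_y=0.60))          # recognized
print(hand_raised_absolute(shifted_hand_y, threshold_y=0.60))  # missed after the shift
print(hand_raised_relative(hand_y, shoulder_y, margin=0.20))   # recognized
print(hand_raised_relative(shifted_hand_y, shifted_shoulder_y, margin=0.20))  # still recognized
```

Because the relative test depends only on the difference between two joints of the same body, a uniform shift of the user cancels out, whereas the absolute test fails as soon as the hand no longer crosses the fixed coordinate.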
Comparing the results in Figure 13 for young and elderly test subjects, it can be seen that the recognition rates were lower for elderly people than for young people in all three types of exercise. It is thought that this difference in recognition rates was caused by a disparity in the variation of the joint data used when estimating the exercises. Figure 14 shows an example of the variation of joint distances for young and elderly users. It shows the variation of the x-axis distance between both feet during upper/lower limb antagonistic movement. For young users, the variation of this distance was more or less fixed, with a stable range of roughly 0.05 to 0.9 m. In contrast, the variation of this distance for elderly users was smaller, at approximately 0.05 to 0.5 m, and the width of this variation was also unstable.
In this system, the threshold values used for exercise recognition were determined during the exercise practice stage, and we measured the recognition accuracy during the synchronized exercise stage. When determining a user's threshold values during the practice stage, we made adjustments to accommodate individual differences such as the distances between joints. The young users performed the actions stably and without variation both in the practice phase and during exercise recognition, resulting in a high recognition rate. On the other hand, it seems that the elderly users did not move stably and sometimes moved differently between the practice phase and the exercise recognition phase, resulting in a lower recognition rate.
The threshold values used for exercise recognition in this system were set so as to accommodate individual differences for users who performed the exercises stably and with large movements of the joints. However, it is possible that there was not enough leeway for individual differences in the recognition of unstable exercises performed by elderly users with a smaller range of movement. Therefore, we need to develop a method to improve the action recognition rate even for users who have a small range of joint movement and do not perform the exercises stably.
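One possible way to set per-user thresholds from practice data is sketched below. This is not the calibration procedure used by the system, which is not detailed in the text: the function name `calibrate_thresholds` and the fractions 0.3 and 0.7 are assumptions for illustration. The two sample ranges follow the approximate foot-distance ranges reported for Figure 14 (0.05 to 0.9 m for young users, 0.05 to 0.5 m for elderly users).

```python
from typing import Sequence, Tuple


def calibrate_thresholds(
    practice_distances: Sequence[float],
    closed_fraction: float = 0.3,
    open_fraction: float = 0.7,
) -> Tuple[float, float]:
    """Derive (closed_max, open_min) from the range of distances seen in practice.

    The fractions are hypothetical: distances in the lowest 30% of the user's
    own range count as closed, and those in the highest 30% count as open.
    """
    if len(practice_distances) < 2:
        raise ValueError("need at least two practice samples")
    lowest, highest = min(practice_distances), max(practice_distances)
    span = highest - lowest
    if span <= 0:
        raise ValueError("practice samples show no movement")
    return lowest + closed_fraction * span, lowest + open_fraction * span


young = calibrate_thresholds([0.05, 0.50, 0.90])
elderly = calibrate_thresholds([0.05, 0.30, 0.50])
print("young:", young)
print("elderly:", elderly)

# A 0.5 m foot opening falls short of the young user's 'open' threshold
# but exceeds the threshold calibrated from the elderly user's own range.
print(0.5 >= young[1], 0.5 >= elderly[1])
```

Scaling the thresholds to each user's observed range addresses the smaller movement range, but not the instability of that range between practice and exercise; handling the latter would need something further, such as updating the range during the exercise itself.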

Observation and survey of exercises by elderly users
The elderly users felt that the duple/triple time exercise was very difficult, and thus none of them participated in this exercise during the experiment. The antagonistic exercises involved performing different movements rhythmically with the upper and lower limbs, and although the young users found this easy to understand and do, the elderly users did not. During the exercises, the elderly users were often unable to adopt the same postures as the younger users. Figure 15 shows the sample image for step C2 of the exercise "rock/paper/scissors using both arms and both legs" and an example of the posture of an elderly user. In (a), both hands are held apart while forming scissors shapes, and the system is able to recognize them correctly. In (b), the user's hands are formed into scissors shapes but are held out to the front so that one hand obscures the other from the Kinect's viewpoint, making it impossible to judge whether the obscured hand is forming a scissors shape. In posture C2, both hands form a "scissors" shape and the legs are arranged with one in front of the other, but in (b) it seems that the hands were placed one in front of the other to match the arrangement of the feet, resulting in one of the hands becoming obscured.
In this experiment, a single elderly user performed a rhythm game as a representative, while the other elderly users waited for their turn and watched the representative's example. Here, the elderly people waiting for their turn were not coerced into performing any of the exercises. The results of observing the elderly group are as follows:
• The test subjects participated actively in the games.
• Although the elderly people waiting for their turn were not encouraged to perform the exercises together with the test subject, they did do so.
• The elderly people waiting for their turn communicated among themselves while performing the exercise. One of the main topics of conversation was the test evaluation scores.
From the first observation, we confirmed that elderly people actively took part in exercises using this system, and that the system facilitates spontaneous use. From the second observation, of exercises performed by the system user and the other elderly people, we confirmed that many elderly people became interested in performing antagonistic exercises and using the system. Also, the third observation, made while communicating with the elderly test subjects, suggests that the system is likely to be effective in promoting communication among the elderly.

Conclusion
In this study, we designed, implemented and evaluated an antagonistic exercise support system using a depth sensor. This system recognizes exercises by using information about the user's joints acquired from a depth sensor, and evaluates the user's exercises to provide real-time feedback. The system uses an audiovisual display to explain the exercise procedures to the user, and displays example images to encourage the user to perform the exercises. It also has a rhythm game function whereby the user can exercise in time with music. To evaluate this system, we performed recognition rate experiments with young and elderly test subjects, and we also observed the elderly users and carried out an interview survey to ascertain their opinions.
In recognition experiments in which four types of exercise were performed by young and elderly users, the average recognition success rate for young users was 93%, while for elderly users it fell to 73%, showing that the exercise recognition rate is lower for elderly users than for young users. A possible reason for this is that the actions of elderly people have a smaller range of movement and are less stable than the actions of young people. Elderly users may perform exercise actions differently from the way they are presented by the system, and it is necessary to develop a method for increasing the recognition rate of actions even when they are performed by users whose exercise is not stable.
From observations of exercises performed by elderly people, it was found that elderly people communicated with one another while exercising, resulting in more people becoming interested in the antagonistic exercise, and showing that the exercise system may have the ability to promote communication among the elderly. In an interview survey, many of the elderly users reported feeling that the duple/triple time exercise was difficult. It is therefore essential to prepare multiple levels of difficulty for each exercise.
A future challenge is to improve the system's recognition rate for actions performed by elderly people. This will require improved recognition methods, such as automatically adjusting the threshold values according to the amount of variation in the joint data obtained during the exercise. High-level technology is needed for functions that judge whether the user performs actions correctly or incorrectly, or that point out incorrect actions using voice and video in real time, but this would be a very useful feature for users. It is also essential to have a function that infers the user's situation when the user is unable to perform an exercise properly, and uses speech to encourage the user to return to the course.

Figure 3 shows the postures at each step of the exercise. In the figure, symbols Hl, Hr, Fl and Fr are the joint positions of the left and right hands and the left and right feet, respectively.

Figure 4 shows the postures at each step of the exercise. In the figure, symbols Sl, Sr, Hl and Hr are the joint positions of the left and right shoulders and the left and right hands, respectively.

Figure 5. Rock/paper/scissors with both hands and both feet.

Figure 6. Variation of x-axis and z-axis distances between the feet while performing rock/paper/scissors with both hands and both feet.

Figure 7 shows a user performing the exercise. In the figure, symbols Hl and Hr are the joint positions of the left and right hands, respectively.

Figure 9. Overall processing flow of the system.

Figure 11. Example of the rhythm game screen.

Figure 13. Average recognition rates of exercises performed by young and elderly test subjects.

Figure 14. Example of variation of joint distances for young and elderly users.

Figure 15. Sample of posture C2 in rock/paper/scissors using both arms and both legs, and the same posture performed by an elderly user.

EAI Endorsed Transactions on Pervasive Health and Technology, 03 2017 - 07 2017, Volume 3, Issue 10