A novel SURF-RANSAC matching method for athletics posture recognition

In athletics, accurately identifying and correcting an athlete's wrong posture can improve the quality of daily training. During athletic movement, affine deformation of the human body occurs easily, which produces action feature points with low brightness and shading. Traditional methods extract these feature points and compare them with the correct posture to recognize and correct the posture, which makes real-time detection and correction of athletes' wrong postures fail. Therefore, this paper proposes a posture recognition and correction method for athletes based on depth image bone tracking. A threshold method is used to preprocess the image, and a Kalman filter is used to filter the acquired image. The motion feature points are obtained from the filtered image by a Gaussian distribution function. With the improved SURF-RANSAC method, marginal points and low-brightness action feature points are screened out. The Euclidean distance is used to determine the distance between two adjacent feature points, and the feedback monitoring principle is used to identify and correct the wrong posture. The simulation results show that the improved method can track and monitor track and field athletes' movements and complete the detection and recognition of track and field posture with high accuracy and strong stability.


Introduction
In the World Track and Field Championships, coaches and athletes of all countries attach great importance to competition results, and each improvement in results is inseparable from the improvement of athletes' skills [1,2]. Correct track and field posture helps athletes improve their skills, whereas wrong posture hinders their progress and prevents ideal results. Therefore, accurately identifying and correcting athletes' postures is very important for improving their skills and achieving good performance [3].
Under the action of special medium field, the coach's oral teaching cannot make all athletes improve their skills, and it is difficult for athletes to imitate accurately. At the same time, teaching track and field technique through human body demonstration also has disadvantages. It is difficult for athletes to correct their own detailed technical movements through hearing alone; they cannot obtain self-feedback information and cannot achieve the purpose of improving their own skills. Therefore, it is very important to apply multimedia technology to correct athletes' wrong postures. With video recording, positive and negative contrast of action pictures, on-site shooting and video replay, the posture recognition and correction of track and field athletes has become an important research topic for relevant experts and scholars, with great research value [4][5][6].
EAI Endorsed Transactions, Fang Duan
Some achievements have been made in the research of methods for recognizing athletes' wrong postures. Reference [7] identified and corrected the posture of track and field athletes through target tracking of their movements. Reference [8] used a dynamic module matching algorithm to detect and identify athletes' wrong postures, and completed posture recognition and correction of track and field athletes. In reference [8], SAA7111 and PGA were used to collect and store images without DSP intervention, DMA was used to transmit the images, and the DSP performed the recognition operation. The system integrated the functions of operation, acquisition, storage and recognition into a PCI card, and completed the posture recognition and correction of track and field athletes with the memory function of the PCI card. However, traditional methods all have defects to varying degrees. When track and field posture is affected by factors such as scale, noise and distance, the image is prone to deformation [9].
Aiming at the disadvantages of traditional methods, a new calculation method is proposed. By combining the bone tracking method of depth image acquisition with the SIFT algorithm, the moving target image is tracked and extracted. By adjusting the rotation angle, scale and brightness, the image remains relatively stable under affine transformation and noise. Finally, real-time tracking of track and field posture is completed, so that athletes can recognize their own movement errors, correct them in time, and obtain better results.

Principle of athlete posture recognition and correction
In the process of athlete posture recognition and correction, represents the background pixels of the athlete's posture image. Formula (1) is used to position each node when athletes are running.
We analyze the positioning nodes and extract the feature points of each joint of athletes' limbs in track and field by using equation (2).
Where R represents the radius of the athlete's activity range.
B represent the coordinates of the shoulder, elbow, wrist and palm joints of the left and right arms, respectively. According to these coordinates, the athlete's posture recognition and correction model is established by using equation
Affine deformation of the human body occurs easily in track and field sports, which produces action feature points with low brightness and shade. Traditional methods extract these feature points and compare them with the correct posture to recognize and correct the posture, which makes real-time detection and correction of athletes' wrong postures fail. Therefore, this paper proposes a posture recognition and correction method for track and field athletes based on depth image bone tracking.

Proposed posture extraction method
With the continuous progress of domestic IT technology, DV video hardware and related technology have gradually matured, and feature recognition and monitoring of moving target images has been widely used in various industries, including track and field sports. Firstly, the image is obtained by Kinect technology [10]. Secondly, a Kalman predictor and the SURF-RANSAC algorithm are used to detect and recognize the target posture in track and field. Based on the depth image capture and tracking algorithm, the position of the athletes' target movements can be estimated from the captured depth images, so that the complicated prediction problem is transformed into a relatively easy classification problem. The human track and field posture is taken as the target of tracking and detection. To better monitor the moving target, Kinect is used to obtain the captured depth images [11,12]. The method is as follows:
1) Kinect sensors on the left and right sides send and receive infrared rays from different directions. First, Kinect uses the infrared transmitter on the left to project infrared light onto objects in the scene. The objects reflect it as light spots, forming different three-dimensional "light coding". Then the infrared receiver on the right collects depth infrared images within the range of Kinect. Finally, the initial Kinect data and the collected infrared depth images are used in a series of calculations to obtain the 3D depth information within the range. Based on the obtained information parameters, bone tracking is carried out.
2) Based on the obtained information parameters, the image of each bone point is formed. First, the human body is separated from the background. Secondly, the overlapping parts of the human body in front, side and overhead (depression-angle) images are studied to obtain each image and each node.
3) The bone structure map is formed from the 20 key points obtained by "bone tracing". For each frame of depth image collected by the Kinect camera, the captured depth image of human movement is marked and classified, the result is used to estimate the location information of the key nodes, and the marked image is matched with the 3D nodes.

Kinect 3D motion image feature extraction
Through the above image extraction method, the location information of the moving target can be obtained, which provides the necessary basis for retrieving and recognizing wrong postures in track and field.
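The Kalman predictor named in the previous section is not specified further in the text. As a minimal sketch only, the following constant-velocity filter smooths one coordinate of a tracked joint over successive frames. The state model, the function name kalman_track and the noise variances q and r are all assumptions made for illustration, not values from the paper.

```python
def kalman_track(measurements, dt=1.0, q=1e-3, r=0.25):
    """Constant-velocity Kalman filter for one joint coordinate.

    State is (position, velocity). q and r are assumed process and
    measurement noise variances, not values taken from the paper.
    """
    x, v = measurements[0], 0.0                 # initial state
    p00, p01, p10, p11 = 1.0, 0.0, 0.0, 1.0     # initial covariance P
    estimates = []
    for z in measurements:
        # Predict: x' = F x with F = [[1, dt], [0, 1]], P' = F P F^T + Q
        x = x + v * dt
        n00 = p00 + dt * (p10 + p01) + dt * dt * p11 + q
        n01 = p01 + dt * p11
        n10 = p10 + dt * p11
        n11 = p11 + q
        # Update with a position measurement z (H = [1, 0])
        s = n00 + r                             # innovation variance
        k0, k1 = n00 / s, n10 / s               # Kalman gain
        y = z - x                               # innovation
        x = x + k0 * y
        v = v + k1 * y
        # P = (I - K H) P'
        p00 = (1 - k0) * n00
        p01 = (1 - k0) * n01
        p10 = n10 - k1 * n00
        p11 = n11 - k1 * n01
        estimates.append(x)
    return estimates
```

In use, each coordinate of each of the 20 skeleton key points would be passed through its own filter, for example kalman_track(x_values_of_left_wrist).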

Modified SURF-RANSAC
First, after feature points are detected and their main directions are determined based on the SURF algorithm [13][14][15], a circular neighborhood is constructed to extract 32-dimensional descriptors. Then the threshold adaptive method is used to complete the rough matching of feature points. Finally, a cosine constraint model is established from the feature vectors to optimize the RANSAC algorithm and complete feature point matching.

Dimension reduction of SURF descriptor
In order to reduce the data complexity of the SURF feature matching algorithm, this paper first improves the generation of feature descriptors to achieve dimensionality reduction. In the traditional algorithm, when the main direction is at different positions, the area over which the descriptor's statistical response is calculated is also inconsistent, as shown in figure 1(a). When the main direction deviates, the black area that should be counted is ignored, which reduces the regional similarity after rotation. Therefore, this paper takes the feature point as the center and uses a circular neighborhood to calculate feature descriptors, as shown in figure 1(b). The purpose is that, no matter how the main direction is offset, the range of descriptor calculation stays the same; compared with the rectangular region, the descriptor extraction time is reduced and the edge interference pixels are removed.

Figure 1. Rectangular and circular neighborhood comparison
In this paper, dimension reduction extraction of the SURF descriptor is divided into the following steps:
Step 1. With the feature point as the center, a circular neighborhood with radius R = 10S is selected. To achieve rotation invariance, the circular region is rotated according to the main direction of the feature point.
Step 2. The circular area is divided into an inner circle with radius 3.12S and an outer ring, and each is split into four quadrants along the X and Y axes, so that the neighborhood forms eight small areas (numbered 0 to 3 in each part), as shown in figure 2.
A circle has good rotation invariance, so if the descriptor extraction region is changed from a rectangle to a circle, the rotation time in the matching of each feature point is reduced. Reducing the descriptor dimension from 64 to 32 not only reduces the data complexity of the matching process but also improves the running speed of the algorithm.
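The circular descriptor can be sketched as follows. The text gives only the region layout, so the per-region statistics used here (sums of dx, dy, |dx| and |dy|, as in standard SURF) are an assumption, chosen because 8 regions × 4 values gives the 32 dimensions stated above. The function name, the input format and the quadrant numbering are also hypothetical.

```python
import math

def circular_descriptor(samples, s, r_outer=10.0, r_inner=3.12):
    """Build a 32-D descriptor from Haar responses in a circular neighborhood.

    samples: iterable of (x, y, dx, dy); coordinates are relative to the
             feature point and already rotated to its main direction.
    s:       feature scale; radii r_outer*s and r_inner*s follow the text.
    """
    sums = [[0.0] * 4 for _ in range(8)]    # 8 regions x 4 statistics
    for x, y, dx, dy in samples:
        r = math.hypot(x, y)
        if r > r_outer * s:
            continue                        # outside the circular neighborhood
        ring = 0 if r <= r_inner * s else 1
        # Quadrants 0..3, counter-clockwise (assumed numbering)
        if x >= 0 and y >= 0:
            quad = 0
        elif x < 0 and y >= 0:
            quad = 1
        elif x < 0 and y < 0:
            quad = 2
        else:
            quad = 3
        region = sums[ring * 4 + quad]
        region[0] += dx
        region[1] += dy
        region[2] += abs(dx)
        region[3] += abs(dy)
    vec = [v for region in sums for v in region]
    norm = math.sqrt(sum(v * v for v in vec))
    # Normalize to unit length so the descriptor is contrast invariant
    return [v / norm for v in vec] if norm > 0 else vec
```

Because every sample inside the circle is counted regardless of the main direction, the covered area does not change under rotation, which is the property the circular neighborhood is meant to provide.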

Threshold adaptive matching
The traditional SURF feature matching algorithm takes the Euclidean distance between the image to be tested and the template image as the similarity criterion after finding the matching point [16,17].
In traditional methods, a manually set threshold affects the matching accuracy and result, and one-way matching produces one-to-many matches. In the feature point matching stage, the initial matching feature set is screened by the threshold adaptive method, and each pair of feature points is compared with the calculated threshold. Finally, the "one-to-many" mismatches are eliminated by bidirectional feature matching. The specific steps of the improved feature matching process are as follows:
Step 1. Assume that the feature point sets of the two images are
Step 2. The ratio set
Step 4. The above image to be detected is reversed as the reference image, that is, point set $N_2$ is the set seeking the best matching point pair.
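Since the formulas of the steps above are not available, the following is only a sketch of the general idea: a ratio test whose threshold is computed from the data, followed by a reverse pass that keeps mutual matches only. Using the mean of the nearest to second-nearest distance ratios as the adaptive threshold is an assumption made for illustration, as are all function names.

```python
import math

def _nearest_two(d, others):
    """Return (index of nearest descriptor, nearest / second-nearest ratio)."""
    dists = sorted((math.dist(d, o), j) for j, o in enumerate(others))
    (d1, j1), (d2, _) = dists[0], dists[1]
    return j1, (d1 / d2 if d2 > 0 else 1.0)

def one_way(set_a, set_b):
    """Match each descriptor in set_a to set_b with a data-driven threshold."""
    cands = [_nearest_two(d, set_b) for d in set_a]
    # Assumed adaptive rule: threshold is the mean ratio over all candidates
    threshold = sum(r for _, r in cands) / len(cands)
    return {i: j for i, (j, r) in enumerate(cands) if r <= threshold}

def bidirectional_match(set_a, set_b):
    """Keep only pairs that are matched in both directions."""
    fwd = one_way(set_a, set_b)
    bwd = one_way(set_b, set_a)
    return sorted((i, j) for i, j in fwd.items() if bwd.get(j) == i)
```

Each set must contain at least two descriptors so that a second-nearest neighbor exists. The mutual check is what removes "one-to-many" matches: a point in the second image can confirm at most one partner in the first.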

Improved RANSAC
The RANSAC algorithm [18,19] is a stable and reliable algorithm for eliminating false matching points; an optimal mathematical model is obtained through continuous iteration. As shown in figure 3(a), the data set contains noise points and data points that can form a straight line. The basic principle of RANSAC is as follows. First, four non-collinear sample points are randomly selected and their transformation model matrix is calculated. Then an error threshold is set: data points whose error is within the threshold are classified as points on the fitted line, that is, interior points (inliers), and points whose error is greater than the threshold are outer points (outliers). Finally, the line is fitted iteratively until the number of interior points reaches its maximum and remains unchanged, as shown in figure 3(b).
The RANSAC algorithm has good robustness and can reduce the influence of noise, but it also has the shortcomings of an unstable number of iterations and high computational complexity.
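The line-fitting case illustrated in figure 3 can be sketched in a few lines. This is a generic RANSAC for a 2D line (two sample points per iteration), not the four-point transformation model used for image matching; the function name, the threshold and the iteration count are illustrative assumptions.

```python
import math
import random

def ransac_line(points, threshold=0.5, iterations=200, seed=0):
    """Fit a line a*x + b*y + c = 0 by RANSAC; return (line, inliers)."""
    rng = random.Random(seed)
    best_line, best_inliers = None, []
    for _ in range(iterations):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        a, b = y2 - y1, x1 - x2             # normal vector of the line
        norm = math.hypot(a, b)
        if norm == 0:
            continue                        # degenerate sample: identical points
        a, b = a / norm, b / norm
        c = -(a * x1 + b * y1)
        # Points whose distance to the line is within the threshold are inliers
        inliers = [p for p in points
                   if abs(a * p[0] + b * p[1] + c) <= threshold]
        if len(inliers) > len(best_inliers):
            best_line, best_inliers = (a, b, c), inliers
    return best_line, best_inliers
```

The fixed iteration count makes the cost visible: every iteration tests all points, which is the computational burden that the improvements in this section aim to reduce.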

Figure 3. RANSAC algorithm data fitting model
In the stage where the RANSAC algorithm eliminates mismatches, a cosine constraint is added to achieve accurate matching. This paper evaluates the similarity of two feature vectors by calculating the cosine of their included angle, because this cosine is rotation invariant and is not disturbed by scaling. Therefore, the feature vector cosine constraint is established, i.e.

$$\cos\theta = \frac{p \cdot q}{\|p\|\,\|q\|} \qquad (8)$$

In the formula, p and q are the feature vectors corresponding to feature points in the images to be matched. Principle of the feature vector cosine constraint: the cosine value between each pair of feature vectors is calculated in the experiment, and the cosine judgment threshold $w_C$ is set according to the distribution of the cosine values. Then the results calculated by formula (8) are compared with $w_C$. If the calculated cosine value is greater than $w_C$, the feature points corresponding to feature vectors p and q are judged as candidate matching points. The constraint method is combined with the RANSAC algorithm to achieve accurate matching.
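The cosine constraint amounts to a few lines of code. In this sketch the function names are hypothetical and the default threshold value 0.9 is a placeholder only; the paper sets the threshold from the observed distribution of cosine values and does not report a number.

```python
import math

def cosine_similarity(p, q):
    """Cosine of the angle between feature vectors p and q."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm > 0 else 0.0

def passes_cosine_constraint(p, q, w_c=0.9):
    """True if the pair (p, q) is kept as a candidate match.

    w_c is a placeholder threshold, not a value from the paper.
    """
    return cosine_similarity(p, q) > w_c
```

Scaling either vector leaves the result unchanged, for example cosine_similarity([1, 0], [2, 0]) equals cosine_similarity([1, 0], [1, 0]), which is the insensitivity to scaling that the text relies on.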
In order to reduce the number of iterations and improve the registration accuracy, the RANSAC algorithm is improved in this paper. The specific ideas are as follows.
Step 1. The smaller the ratio of the nearest-neighbor distance to the next-nearest-neighbor distance, the higher the confidence of the match. First, the 40% of points whose nearest-neighbor to next-nearest-neighbor distance ratio is near 0.45 are selected as the sample set.
Step 2. Four pairs of matching points are randomly selected from the feature point set to establish the constraint equations. Provided that the selected matching point pairs are not collinear, the least squares method is used to calculate the homography matrix F.
Step 3. The cosine of the feature vectors is used to constrain the remaining feature points: if the cosine of the feature vectors of a pair of matching points is greater than the threshold $w_C$, the pair is added to the interior points; otherwise, it is removed.
Step 4. If the number of feature points in the current interior point set is greater than that in the optimal interior point set, the current parameter model matrix is considered as the optimal matrix, and the number of iterations M is updated.
Step 5. Iteration continues until the interior points no longer change; the final set of interior points is then obtained and the final transformation matrix is calculated from it.
The improved RANSAC algorithm changes how matching points are selected: the proportion of interior points in the initial sample set increases, so the number of iterations is greatly reduced and the probability of selecting correct point pairs increases, while the feature vector cosine constraint improves the matching accuracy.
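The steps above can be sketched as follows, under several stated simplifications. To stay short, the homography is replaced by a 2D similarity transform w = a*z + b on complex image coordinates, which needs only two point pairs. Step 1 is read literally as "keep the 40% of matches whose ratio is closest to 0.45". A geometric tolerance tol is applied together with the cosine threshold. The fixed iteration count, the dictionary keys and all default values are hypothetical.

```python
import math
import random

def cosine(p, q):
    dot = sum(a * b for a, b in zip(p, q))
    n = math.sqrt(sum(a * a for a in p) * sum(b * b for b in q))
    return dot / n if n > 0 else 0.0

def improved_ransac(matches, w_c=0.9, tol=1.0, iterations=100, seed=0):
    """matches: list of dicts with keys 'src', 'dst' (complex coordinates),
    'ratio' (nearest / next-nearest distance ratio), 'desc_src', 'desc_dst'."""
    rng = random.Random(seed)
    # Step 1 (one reading of the text): 40% of matches with ratio nearest 0.45
    k = max(2, int(0.4 * len(matches)))
    pool = sorted(matches, key=lambda m: abs(m['ratio'] - 0.45))[:k]
    best_model, best_inliers = None, []
    for _ in range(iterations):
        # Step 2 (simplified): two pairs define a similarity transform
        m1, m2 = rng.sample(pool, 2)
        if m1['src'] == m2['src']:
            continue                        # degenerate sample
        a = (m2['dst'] - m1['dst']) / (m2['src'] - m1['src'])
        b = m1['dst'] - a * m1['src']
        # Step 3: keep pairs that fit the model and pass the cosine constraint
        inliers = [m for m in matches
                   if abs(a * m['src'] + b - m['dst']) <= tol
                   and cosine(m['desc_src'], m['desc_dst']) > w_c]
        # Step 4: remember the model with the largest interior point set
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (a, b), inliers
    return best_model, best_inliers
```

Sampling from the pre-selected pool rather than from all matches is what raises the chance that a random sample contains only correct pairs, which is why fewer iterations are needed.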

Experiments and analysis
The experimental platform is MATLAB 2017a. The operating environment is an Intel Core i5-8250U 1.60 GHz CPU, 16 GB memory and 64-bit Windows 10 [20][21][22][23][24]. To verify the timeliness and accuracy of the proposed algorithm, a large number of images are used in the experimental analysis of feature matching. The proposed algorithm is compared with three other algorithms, and the comparison results for three groups A, B and C are shown in figures 4-6. Three evaluation indexes are analyzed and verified: matching time, matching accuracy and algorithm robustness. The matching accuracy reflects the merits and demerits of feature descriptors: the higher the matching accuracy, the better the matching performance. Its calculation formula is as follows:

$$P = \frac{N_c}{N_c + N_e} \qquad (9)$$

Table 1 shows the comparison of matching accuracy among the traditional SIFT algorithm, the SURF algorithm, reference [25] and the new algorithm in this paper. As can be seen from the table and the experimental comparison results, although the SIFT algorithm detects more feature points, there are a large number of mismatched point pairs, such as "matching cross" and "one-to-many" cases, so its correct matching rate is not high. The experiments in figures 4 to 6 show that the matching results of the traditional SURF algorithm all contain matching crossover. Figure 7 shows that, in the same group of test experiments, the correct matching rate of the algorithm in this paper is higher than that of the other three algorithms, reaching more than 90%. Although the total number of matched pairs is small, the matching accuracy is improved by 10%~20%. This is because the threshold adaptive method and the bidirectional matching criterion reduce the false matching point pairs in the feature matching stage, and the feature vector cosine model optimizes the RANSAC algorithm to purify the feature point pairs. Experimental results show that the proposed algorithm has good stability and anti-interference ability.

To verify the timeliness of the algorithm, multiple groups are analyzed and compared, and the matching times of the algorithm in this paper and the other three algorithms are shown in Table 2. As can be seen from Table 2, the matching time of the proposed algorithm is the shortest for the same group of test experiments, and the matching speed is about 70% higher than that of the SIFT algorithm. Compared with the traditional SURF algorithm and reference [25], the matching speed is also improved.
This is because the circular region is used to construct descriptors in this paper, so that the description vector is reduced to 32 dimensions, reducing the data complexity. Meanwhile, the RANSAC algorithm is improved with the feature vector cosine constraint, the number of iterations is reduced by optimizing the sample model used to solve the transformation matrix, and the overall running time is significantly shorter than that of the traditional algorithm. Therefore, the algorithm presented in this paper has the advantages of speed and good performance.

In order to verify the robustness of the algorithm, this paper identifies the posture in track and field under different scenes and obtains the average recognition accuracy, as shown in Table 3. Table 3 shows that, compared with the traditional SURF algorithm, the matching accuracy and running time are improved, and the new algorithm has better stability and anti-interference ability. Therefore, the algorithm in this paper is suitable for affine transformation, blur, illumination change, rotation and other conditions, and it still has good robustness under different interference.

Aiming at the problems of high description dimension and low matching accuracy of the traditional SURF algorithm, this paper proposes an improved SURF-RANSAC algorithm for posture recognition in track and field sports. Firstly, in the feature point extraction stage, the original 64-dimensional feature descriptor is reduced to 32 dimensions to reduce the data complexity and matching time. Then the adaptive threshold method is adopted to avoid the influence of a manually set threshold on matching results, and the two-way matching criterion is adopted to eliminate "one-to-many" matching. Finally, the RANSAC algorithm is improved with the feature vector cosine constraint to further improve the matching accuracy.
Through the analysis of experimental results and comparison with three other algorithms, the proposed algorithm improves the matching accuracy and shortens the matching time, making the algorithm more effective and robust. In future research, the performance and adaptability of the algorithm should be further optimized so that it can be applied to image registration experiments in different scenes.