Exploiting Machine Learning Algorithms to Diagnose Foot Ulcers in Diabetic Patients

INTRODUCTION: Diabetic foot ulcer (DFU) is a complication of diabetes that affects many diabetic patients and causes open wounds on the foot. Untreated DFU can lead to infection and, eventually, amputation of the foot or leg. As diabetes is a major health problem faced by people of all age groups, identifying foot ulcers at an early stage is essential. In this context, this work proposes an efficient model to predict foot ulcers accurately.
OBJECTIVES: To predict DFU using an effective neural network algorithm on a suitable dataset that consists of risk factors and clinical outcomes of the disease.
METHODS: Machine learning (ML) techniques are now commonly used for predicting various diseases. To achieve the objectives, a neural network technique, namely extreme learning machine (ELM), is proposed to predict DFU accurately. In addition, three existing algorithms, namely K-nearest neighbor (KNN), support vector machine (SVM) with Gaussian kernel, and artificial neural network (ANN), are also considered. All are implemented in R.
RESULTS: The algorithms are compared in terms of five evaluation metrics: accuracy, zero-one loss, threat score/critical success index (TS/CSI), false omission rate (FOR), and false discovery rate (FDR). The values of accuracy, 0-1 loss, TS/CSI, FOR, and FDR obtained for ELM are 96.15%, 0.0385, 0.95, 0, and 0.05, respectively.
CONCLUSION: ELM outperformed the other algorithms on all the metrics. It is therefore recommended over the other algorithms for predicting diabetic foot ulcers.


Introduction
Diabetes leads to several complications, of which foot ulcers are one. Foot ulcers occur in both type-1 and type-2 diabetic patients. In type-1 diabetic patients, insulin is not produced, or is produced in too small a quantity, so the patient depends on insulin injections. In type-2 diabetic patients, insulin is produced, but it is not sufficient for [...] Foot ulcers are most common in diabetic patients who suffer from nerve damage (neuropathy). Ulcers most commonly occur on the ball of the foot and the bottom of the big toe. Smoking is also a significant contributor to foot ulcers: chain-smoking affects the small blood vessels, which decreases blood flow to the feet and may slow the healing process. The risk factors for foot ulcers are poor hygiene, alcohol consumption, retinopathy, neuropathy, nephropathy, cardiovascular disease, obesity, smoking, and peripheral vascular disease. If a diabetic patient has any of these risk factors, the chance of developing a diabetic foot ulcer increases [4]. An observational study [5] on the severity of DFU found that illiteracy and lower socioeconomic status are associated with increased severity of DFU, which was attributed to the habit of walking barefoot. Proper foot care and the use of footwear reduce the chance of increased severity. In a study of diabetic foot ulcers, more than 70% of the diabetic patients diagnosed with foot ulcers showed an increase in disease severity over the following 5 years [6]. Some preventive measures can reduce the severity of the disease, such as controlling blood sugar levels and blood pressure, maintaining proper skin and nail care, using good footwear outdoors and indoors, and avoiding smoking. Maintaining normal sugar levels and blood pressure improves healing and helps speed recovery [7].
A good health service is essential for a developing country to overcome limitations in resource availability, health care professionals, and government expenditure. Technology-driven services play a vital role in minimizing effort and expenditure for the government. Internet of things (IoT) and cloud computing are two leading technologies in the healthcare sector [8]. IoT has changed the tradition of consulting a medical expert either by visiting a hospital or through telecommunication. In healthcare it is known as the Internet of Medical Things (IoMT). IoT devices that monitor glucose levels, blood pressure, heart rate, and so on are widely used to track health conditions. As a result, the complexity of storing and analysing electronic medical records, i.e., the data collected from IoT devices, increases. Cloud computing eases this by storing vast amounts of data in the cloud [9]. Its major advantages are large storage capacity at low cost, backup maintenance, and secure data storage. Image processing and machine learning are also widely used in the healthcare sector, where a dataset is the primary requirement for predictive analysis of a disease. Image processing enhances images captured by various means and analyses the enhanced images to support a decision. It is widely used in fields such as medicine, robot vision, and face detection. In the healthcare sector, image enhancement plays a vital role. For example, analysis of enhanced images from medical imaging techniques such as CT scan, UV imaging, and X-ray helps to predict disease efficiently [10]. Machine learning, which includes neural network techniques, analyses and detects various problems and strengthens the decision. Most ML algorithms use a feature vector to analyse the data, and they can efficiently detect nonlinear relationships between predictor and target variables.
ML can also predict the output for unseen input data based on the available feature data, whereas image processing cannot predict accurate output for unseen input. Early diagnosis and treatment of diabetic foot ulcers reduce the risk of further complications. As discussed above, machine learning is widely used in the healthcare sector, helping the doctor or physician provide care at the initial stage. This can reduce costs and give better results, as proper care is provided early [11]. In this context, several machine learning algorithms are chosen to predict diabetic foot ulcers. K-nearest neighbor, SVM with Gaussian kernel, and artificial neural network are the existing algorithms chosen, and they are compared with the proposed algorithm, extreme learning machine (ELM). The implementation uses the R programming language. Accuracy, zero-one loss, threat score/critical success index, false omission rate, and false discovery rate are used to evaluate and compare the algorithms. In addition to these metrics, cross-entropy error or log loss is also used to [...]

[...] [25] presented their study on the prediction of diabetic foot ulcers based on dynamic pressure distribution. The selected dataset contains dynamic plantar pressure measurements for 28 non-diabetic patients and 56 diabetic patients without diabetic neuropathy. DFU is predicted separately in the three cases of the dataset. SVM is the ML technique selected for their work. First, the plantar surface is divided into 11 regions. Preprocessing is then performed, followed by feature extraction and selection. Finally, SVM is applied, obtaining precision, ROC, and accuracy of more than 95.2%, 0.946, and 94.6%, respectively, in all three cases of the considered dataset.

Patel et al. [26] focused on medical image processing to detect and classify the DFU wound. The steps involved in the foot ulcer detection system are image preprocessing, image segmentation, feature extraction, texture detection, and image classification. The classification techniques used in their work are KNN, SVM, fuzzy logic, Bayesian networks, and neural networks. The DFU wound is then classified into three groups: slough, necrotic, and granulation. The algorithms were not evaluated using metrics.
The algorithms were also not compared to identify the best one. Instead, a cluster of images for each of these three groups is obtained. Keerthika et al. [27] presented work on predicting DFU using an image dataset obtained by consulting doctors. The prediction is based on image segmentation, performed with the watershed and region growing algorithms, which extract the exact wounded area from the images. A single-stage SVM algorithm then performs classification. The output identifies the stage of the wound in the image, specifically the initial and final stages.
Cui et al. [28] presented their work on diabetic wound segmentation based on CNN, using the diabetic foot ulcer image dataset provided by New York University. Preprocessing was performed first, followed by image segmentation using a patch-based CNN and post-processing. Post-processing operates on the output of the CNN, a probability map, to give the final wound segmentation.

[...] Decision tree algorithms CART and random forest are implemented using evolutionary feature selection techniques, namely particle swarm optimization (PSO), bat algorithm (BA), genetic algorithm (GA), gravitational search algorithm (GSA), cuckoo search (CS), firefly (FF), and dragonfly (DF). The results are compared in terms of accuracy. The CART algorithm with firefly feature selection obtained an accuracy of 79.73%. The overall accuracy of the model is 77%.

Section 3 describes the methodology, including the dataset used and the architecture of the proposed work. Section 4 presents the four algorithms used in this work: the proposed technique, ELM, is explained in detail, and the remaining algorithms are explained briefly. Section 5 defines the metrics used, and Section 6 presents and discusses the results. Section 7 outlines future work. Finally, Section 8 concludes the paper.

Research approach
This section comprises details regarding the work's objectives, a description of the chosen dataset, and the methodology.

Objectives of work
Identifying a disease at an early stage allows treatment to begin immediately. Timely treatment reduces the chance of further complications and supports a speedy recovery. As diabetic foot ulcer is a major complication in diabetic patients, its early detection is vital to avoid future complications. This work addresses that problem, and its main objectives are as follows:
• Choose an adequate dataset and develop a useful predictive model for diabetic foot ulcers.
• Predict diabetic foot ulcers more accurately than existing related works.
• Identify the best performing algorithm among all the considered algorithms.

EAI Endorsed Transactions on Pervasive Health and Technology
Online First

In this work, the selected dataset contains various risk factors and clinical outcomes of diabetic foot ulcers, so that an effective model can be developed and an accurate result obtained. Three existing algorithms are chosen along with one proposed algorithm: KNN, SVM with Gaussian kernel, and ANN are the existing algorithms, and ELM is the proposed technique. To obtain an effective model, the best performing algorithm is identified and suggested for use in further works. Five evaluation metrics are considered for this purpose: accuracy, 0-1 loss, TS/CSI, FOR, and FDR. The entire process was carried out in R. The details of the dataset and the methodology are provided in the following subsections.

Dataset
The dataset selected in this work consists of 22 attributes and 133 instances. It was taken from the "Figshare" data repository. In this dataset, 21 attributes are the predictor or independent features, and 1 attribute is the target or dependent feature. Attributes 1 to 21 in table 1 are the predictor attributes. The 22nd attribute, named Ft_ulcer, is the target attribute that classifies the dataset into two categories: tested positive and tested negative for diabetic foot ulcers. Accordingly, this is a binary classification problem. The attributes neuropathy, nephropathy, retinopathy, PVD, and CDV are diabetic complications. Neuropathy is a nerve disease, nephropathy is a kidney disease, and retinopathy is an eye disease. Peripheral vascular disease (PVD) is a disease of poor blood circulation that mostly affects the blood vessels outside the heart and brain. CDV is cardiovascular disease, such as a heart attack or stroke. Ft_ulcer is the disease under consideration, foot ulcer.

Methodology
The methodology of the work is shown in figure 3. Data pre-processing is performed on the loaded dataset. The dataset is then split into 80% for training and 20% for testing, giving 107 and 26 instances, respectively. The three existing algorithms and the proposed algorithm are each implemented in R. Training each algorithm yields a trained model, which is evaluated on the test dataset to obtain the evaluation metrics. Accuracy, 0-1 loss, threat score/critical success index, false omission rate, and false discovery rate are the performance metrics considered in this work. All four algorithms are compared on these metrics to identify the best one. In this work, the proposed algorithm, ELM, obtained better values than the remaining algorithms.
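The 80/20 percentage split described above can be sketched as follows. This is a minimal Python illustration rather than the R code used in the work; the function name `split_indices`, the seed, and the choice to round the training size up (so that 133 instances give 107 for training and 26 for testing) are assumptions made for this example.

```python
import math
import random


def split_indices(n_instances, train_fraction=0.8, seed=42):
    """Shuffle row indices and split them into train and test index lists."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    # Assumption: the training size is rounded up, so 133 * 0.8 = 106.4 -> 107.
    n_train = math.ceil(n_instances * train_fraction)
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = split_indices(133)
print(len(train_idx), len(test_idx))
```

The two index lists are disjoint and together cover every instance, so each row of the dataset is used either for training or for testing, never both.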

Algorithms used
The algorithms used in this work are explained in this section. KNN, SVM with Gaussian kernel, and artificial neural network (ANN) are the existing algorithms considered, and a brief introduction to each is provided. Extreme learning machine (ELM), the proposed technique, is explained in detail.

[...] attribute, n is the total number of predictor attributes, and xt and yt indicate the data points in the training dataset and test dataset, respectively, for attribute t [31]. In this work, the optimal value of K chosen after implementation is 9.
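A K-nearest-neighbor classifier of the kind described above can be sketched in Python as follows. The Euclidean distance over the n predictor attributes is an assumption here, as is every function name; only the default K = 9 comes from the text.

```python
import math
from collections import Counter


def euclidean_distance(x, y):
    """Distance between a training point x and a test point y over all attributes."""
    return math.sqrt(sum((xt - yt) ** 2 for xt, yt in zip(x, y)))


def knn_predict(train_X, train_y, query, k=9):
    """Predict the label of `query` by majority vote among its k nearest neighbours."""
    neighbours = sorted(
        zip(train_X, train_y),
        key=lambda pair: euclidean_distance(pair[0], query),
    )
    nearest_labels = [label for _, label in neighbours[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]
```

With two classes, an odd K such as 9 avoids tied votes.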

SVM with Gaussian kernel
Support Vector Machine (SVM) offers several kernel methods that can be used for classification. The Gaussian kernel, or radial basis function (RBF) kernel, is one of them. In this technique, the feature space is transformed using formula (6). The low-dimensional feature space is mapped to a high-dimensional one in order to obtain better results when the data is not linearly separable. A hyperplane is then constructed in the high-dimensional feature space, and the hyperplane with the maximum margin is selected among all possible hyperplanes. In the case of disease prediction, this hyperplane separates the data into two categories, positive and negative, on which the prediction is based. Formula (6) is the Gaussian function, where σ is a free parameter and xt and xs are vectors in the low-dimensional feature space [32].
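As an illustration of the Gaussian function referred to above, the sketch below uses the commonly cited form K(xt, xs) = exp(-||xt - xs||² / (2σ²)). That this is exactly the form of formula (6) is an assumption, and the function name `gaussian_kernel` is hypothetical.

```python
import math


def gaussian_kernel(x_t, x_s, sigma=1.0):
    """Gaussian (RBF) kernel between two vectors; sigma is the free parameter."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x_t, x_s))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))
```

The kernel equals 1 for identical vectors and decays towards 0 as the vectors move apart, with σ controlling how quickly the similarity falls off.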

ANN
Artificial Neural Network (ANN) is one of the prominent ML techniques used for classification problems. It contains three kinds of layers: input, hidden, and output. The input data feeds the input layer, whose neurons hold those values and pass them on to the neurons in the hidden layers. There can be one or more hidden layers in an ANN. The value of each hidden neuron (k) of hidden layer m is calculated using formula (7), where L is the input neuron value and W is the weight of the connection between the input and hidden neurons. The hidden layer applies the activation function, namely the logistic sigmoid function given in formula (8), to the hidden neurons. As in the hidden layer, the values of the output layer neurons are calculated using formula (9), where HA is the activation value of the hidden neuron. The activation values of the output neurons are then obtained using the activation function in formula (10). This activation function determines the output value, which lies between 0 and 1 [33].

$$H_{km} = \sum L \cdot W \qquad (7)$$

$$HA_{k} = \frac{1}{1 + e^{-H_{k}}} \qquad (8)$$
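The forward pass described above (a weighted sum followed by the logistic sigmoid, first in the hidden layer and then in the output layer) can be sketched in Python as follows. The single hidden layer, the absence of bias terms, and all function names are assumptions made for this illustration; training by back propagation is not shown.

```python
import math


def sigmoid(z):
    """Logistic sigmoid, written to avoid overflow for large negative z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def layer_forward(inputs, weights):
    """weights[j] holds the connection weights from every input to neuron j."""
    return [
        sigmoid(sum(value * w for value, w in zip(inputs, neuron_weights)))
        for neuron_weights in weights
    ]


def ann_forward(inputs, hidden_weights, output_weights):
    """Propagate inputs through one hidden layer and the output layer."""
    hidden_activations = layer_forward(inputs, hidden_weights)
    return layer_forward(hidden_activations, output_weights)
```

Every output lies between 0 and 1, so it can be read as the predicted probability of the positive class.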

Extreme Learning Machine (ELM)
ELM is a feed-forward neural network that can be used for classification. It has only one hidden layer, so it is called a single hidden layer feed-forward neural network: it contains one input layer, one hidden layer, and one output layer. ELM trains much faster than an artificial neural network. Unlike ANN, it does not use back propagation. Instead, it uses a matrix pseudo-inverse to calculate the output weight matrix on which the prediction is based. A detailed stepwise algorithm for ELM is given below. In step 2, the input neuron values are set from the input instances. In step 3, random values are assigned to the input layer's weights and to the biases of the hidden layer neurons. In step 4, the output matrix of the hidden layer is calculated. Formula (11) is the general output function used to calculate each element of matrix H in formula (12). The activation function in this step is usually the sigmoid function. The matrix H from this step is used to calculate the weight matrix for the output layer in step 5. Formula (13) gives the output weight matrix, and formula (14) is its matrix representation. The matrices H and β from steps 4 and 5, respectively, are used to calculate the output values in step 6. Formula (15) calculates the output matrix T of predicted target values, and formula (16) is the matrix representation of T. Finally, the predicted values for the given input instances are read from this matrix T [34].
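In conventional ELM notation, and using the symbols listed in the algorithm's assumptions, the quantities computed in steps 4 to 6 can be written roughly as below. This is a sketch of the standard formulation, not a reproduction of formulas (11) to (16): the index j over the k input instances and the symbol H† for the Moore-Penrose pseudo-inverse of H are assumptions introduced here.

```latex
\begin{align}
h_t(x_j) &= g(w_t \cdot x_j + b_t), \qquad t = 1, \dots, p \\
H &= \begin{bmatrix}
       h_1(x_1) & \cdots & h_p(x_1) \\
       \vdots   & \ddots & \vdots   \\
       h_1(x_k) & \cdots & h_p(x_k)
     \end{bmatrix} \\
\beta &= H^{\dagger} D \\
T &= H \beta
\end{align}
```

Because the input weights and biases stay fixed after the random assignment, the only trained quantity is β, which is obtained in a single linear-algebra step rather than by iterative updates.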

Algorithm: Extreme Learning Machine (ELM)
INPUT: The dataset.
OUTPUT: Predicted output values for the given input instances.
ASSUMPTIONS: k is the total number of instances; xk is the input vector; ht is the output value of hidden neuron t, where t = 1, 2, ..., p; bt is the bias of hidden neuron t; wt is the weight vector for the connections between the input layer and hidden neuron t; β is the weight vector connecting the neurons of the hidden layer and the output layer; g() is the activation function.

STEPS
5. Obtain the output layer weight matrix β from the pseudo-inverse of H. Here D is the matrix containing the actual target values of the input instances.
6. Calculate the output for the input instances.
7. Return the predicted output values.
8. Stop

The pictorial representation of the generalized process of the algorithm is shown in figure 4. After the input is provided, values are assigned to the input layer neurons. Weights and biases are assigned randomly, as explained in the above algorithm. The hidden layer neurons connected to the input neurons are used to obtain the output matrix of the hidden layer. The final matrix for the output layer, which gives the predicted values, is obtained using this matrix.
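The ELM procedure described above can be sketched in Python using only the standard library. This is an illustrative sketch under stated assumptions, not the R implementation used in the work: the weights and biases are drawn uniformly from [-1, 1], a single output neuron is assumed, and the pseudo-inverse of H is approximated by solving the ridge-regularised normal equations (HᵀH + λI)β = HᵀD with a small λ, because the standard library has no pseudo-inverse routine. All function names and the `ridge` and `seed` parameters are hypothetical; the default of 35 hidden neurons is the value reported for ELM in the results.

```python
import math
import random


def sigmoid(z):
    """Logistic sigmoid, written to avoid overflow for large negative z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def hidden_output(X, W, b):
    """Hidden-layer output matrix H, with H[j][t] = g(w_t . x_j + b_t)."""
    return [[sigmoid(sum(w * x for w, x in zip(W[t], row)) + b[t])
             for t in range(len(W))] for row in X]


def solve_linear_system(A, y):
    """Solve A v = y by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [A[i][:] + [y[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    v = [0.0] * n
    for i in reversed(range(n)):
        v[i] = (M[i][n] - sum(M[i][c] * v[c] for c in range(i + 1, n))) / M[i][i]
    return v


def elm_train(X, D, n_hidden=35, ridge=1e-6, seed=0):
    """Assign random hidden weights and biases, then solve for beta."""
    rng = random.Random(seed)
    n_inputs = len(X[0])
    W = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
    b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    H = hidden_output(X, W, b)
    # Assumption: (H^T H + ridge * I) beta = H^T D stands in for beta = pinv(H) D.
    HtH = [[sum(h[i] * h[j] for h in H) + (ridge if i == j else 0.0)
            for j in range(n_hidden)] for i in range(n_hidden)]
    HtD = [sum(h[i] * d for h, d in zip(H, D)) for i in range(n_hidden)]
    beta = solve_linear_system(HtH, HtD)
    return W, b, beta


def elm_predict(X, W, b, beta):
    """Output values T = H beta for the given input instances."""
    H = hidden_output(X, W, b)
    return [sum(h_t * beta_t for h_t, beta_t in zip(h, beta)) for h in H]
```

For binary classification, each value returned by `elm_predict` would be thresholded (for example at 0.5) to obtain the predicted class. Scaling the predictor attributes to a common range before training is advisable, since the random weights are drawn from a fixed interval.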

Metrics and definitions to evaluate results
Accuracy, 0-1 loss, TS or CSI, FOR, and FDR are the evaluation metrics considered in this work. This section defines each metric. The evaluation of each metric for the best algorithm is discussed in section 6.

Accuracy
This metric is used to calculate the percentage of instances that are classified correctly. Accuracy was calculated based on formula 1.

Zero One Loss (0-1 Loss)
Zero-one loss is an evaluation metric that gives the misclassification rate, i.e., the fraction of instances that are classified incorrectly. Its value lies between 0 and 1, and a value nearer to 0 indicates better performance.

TS or CSI
Threat Score (TS) or Critical Success Index (CSI) is defined as the ratio of the number of instances correctly predicted as positive to the total number of instances excluding the correctly predicted negatives. Its value lies between 0 and 1, and a higher value is better.

$$TS = \frac{TP}{TP + FN + FP} \qquad (2)$$

FOR
False Omission Rate (FOR) is calculated from the parameters of the confusion matrix. It is defined as the ratio of the number of instances wrongly predicted as negative to the total number of instances predicted as negative. Its value lies between 0 and 1, where a value nearer to 0 is better.

FDR
False Discovery Rate (FDR) is calculated from the parameters of the confusion matrix. It is defined as the ratio of the number of instances wrongly predicted as positive to the total number of instances predicted as positive. Its value lies between 0 and 1, where a lower value is better.
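In terms of the confusion-matrix counts TP, TN, FP, and FN, the verbal definitions of accuracy, 0-1 loss, FOR, and FDR given in this section correspond to the standard expressions below. They are written here as a sketch in conventional notation and carry no equation numbers.

```latex
\begin{align}
\text{Accuracy} &= \frac{TP + TN}{TP + TN + FP + FN} \\
\text{0-1 loss} &= \frac{FP + FN}{TP + TN + FP + FN} = 1 - \text{Accuracy} \\
FOR &= \frac{FN}{FN + TN} \\
FDR &= \frac{FP}{FP + TP}
\end{align}
```

Accuracy is usually reported as a percentage, so the first expression is multiplied by 100 when it is quoted that way.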

Results analysis & discussion
This section presents, analyses, and discusses the results obtained by implementing the algorithms in R. Each metric is worked through for the proposed technique, extreme learning machine (ELM), and the existing algorithms were evaluated in the same way. The values of TP, TN, FP, and FN from the confusion matrix obtained for ELM are 19, 6, 1, and 0, in that order. These values are used to calculate the evaluation metrics in the following subsection.

Evaluation metrics
The evaluation of the proposed technique, ELM, on all the considered evaluation metrics is demonstrated below. These metrics are defined in section 5. In this work, the value of 0-1 loss is obtained using a built-in function in R. The 0-1 loss obtained for ELM is 0.0385, which indicates good performance in terms of this metric. The remaining four metrics also gave good results according to their respective criteria.
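The metric values reported for ELM can be reproduced from its confusion matrix (TP = 19, TN = 6, FP = 1, FN = 0). The Python sketch below uses the standard confusion-matrix definitions; the function name `evaluation_metrics` and the dictionary keys are hypothetical, and the code assumes that no denominator is zero.

```python
def evaluation_metrics(tp, tn, fp, fn):
    """Compute the five evaluation metrics from confusion-matrix counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,       # fraction classified correctly
        "zero_one_loss": (fp + fn) / total,  # fraction misclassified
        "ts_csi": tp / (tp + fn + fp),       # threat score / critical success index
        "for": fn / (fn + tn),               # false omission rate
        "fdr": fp / (fp + tp),               # false discovery rate
    }


elm_metrics = evaluation_metrics(tp=19, tn=6, fp=1, fn=0)
for name, value in elm_metrics.items():
    print(name, round(value, 4))
```

With these counts, accuracy is 25/26 ≈ 96.15%, 0-1 loss is 1/26 ≈ 0.0385, TS/CSI is 19/20 = 0.95, FOR is 0/6 = 0, and FDR is 1/20 = 0.05, which agree with the values reported for ELM.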

Results obtained
The results obtained by all the considered algorithms on the dataset described in section 3 are given below. The values of the five evaluation metrics for KNN, SVM, ANN, and ELM are provided in tables 2, 3, 4, and 5. ANN and ELM are neural network techniques. In this work, the number of neurons in the hidden layer is 10 for ANN and 35 for ELM. The two neural network techniques can also be compared using cross-entropy error (log loss). Cross-entropy error is a metric that measures classification performance when the output is given as a probability value, i.e., in the range [0, 1]. These predicted probabilities are rounded to 0 or 1 to obtain the confusion matrix used for calculating the above evaluation metrics. The cross-entropy error increases as the predicted probability diverges from the actual label. A lower value represents better performance, and a value of 0 indicates the best performance. Comparing the log loss of the two neural network techniques shows that the proposed technique, ELM, obtained the lower value of 1.3285 and outperformed ANN, whose value is 5.31375.

[...] [19, 21, 22, and 24-29] included the SVM technique, which was identified as the best performing algorithm in some works. KNN was used in [19, 22, and 26], and ANN was used in [24 and 26]. These techniques were taken from previous research works, and their performance is compared with that of the newly proposed algorithm, ELM. Both the present and previous works are based on ML techniques, but the previous articles did not address the ELM algorithm. Figures 5 and 6 show that ELM outperformed the remaining techniques. Hence, on this dataset, ELM predicts DFU more accurately than the other algorithms drawn from the literature.
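The cross-entropy error (log loss) used above to compare ANN and ELM can be sketched in Python as follows. The mean over instances, the clipping constant `eps`, and the function name `log_loss` are assumptions made for this illustration; the exact convention behind the reported values 1.3285 and 5.31375 is not stated, so the sketch is not expected to reproduce them.

```python
import math


def log_loss(actual_labels, predicted_probabilities, eps=1e-15):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(actual_labels, predicted_probabilities):
        p = min(max(p, eps), 1.0 - eps)  # clip to keep log() finite
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(actual_labels)
```

Confident correct predictions give a loss near 0, while confident wrong predictions are penalised heavily, which is why the metric separates two models whose rounded predictions might look similar.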

Figure 7 (axis labels): ref. [19], ref. [21], ref. [22], ref. [23], ref. [25], ref. [28], ref. [29], Proposed work

Cui et al. [28]; Patch-based CNN, SVM and U-net; Diabetic wound segmentation was done mainly on the basis of CNN. The steps involved are pre-processing, followed by image segmentation using patch-based CNN, and then post-processing. It was compared with SVM and U-net in terms of precision, sensitivity, specificity, pixel accuracy, mean IoU, dice, and MCC.

Some works in the literature used image classification algorithms like BCM, EACM, and CNN. As the dataset chosen in this work is a clinical dataset and not an image dataset, those algorithms were not considered here. The comparative study shows that ELM achieved very good values on every metric. Of the four techniques, ANN and ELM are neural network techniques, while KNN and SVM with Gaussian kernel are not. The advantage of using ELM over ANN was provided above table 2. Figure 7 compares the proposed technique with literature works that used accuracy for evaluation, in order to identify the work with the highest accuracy. Although the datasets used are different, the accuracies reported in the literature are lower than that of ELM in the proposed work, which indicates the effectiveness of the proposed DFU prediction model.

Pangaribuan et al. [35] considered ANN and ELM algorithms for diagnosing diabetes. Their work compares only ANN and ELM, for diabetes, in terms of mean squared error (MSE). A foot ulcer is one of the complications of diabetes. In the present work, an efficient model was developed to diagnose diabetic foot ulcers, in order to avoid the consequences of diabetes and its other side effects. ELM was compared with ANN and with two other prominent ML techniques, KNN and SVM with Gaussian kernel. In this comparison, ELM performed better than KNN and SVM (Gaussian kernel) as well as ANN. The scope of comparison was thus widened, and several metrics related to the loss or error rate were chosen for analysis in addition to accuracy. These points differentiate this paper from works such as [35]. As ELM outperformed the remaining techniques, including ANN, it is recommended over the other methods.

Future work
The proposed approach obtained an accuracy of 96.15%, the best among the compared algorithms. Future work will consider improving the output by using a dataset with additional attributes related to diabetic foot ulcers. In addition, the association between diabetes and its complications will be studied by evaluating different classification techniques on prominent metrics. Further works are encouraged to develop better techniques for predicting foot ulcers and other diabetes-related side effects and risks. It is also hoped that future works will study the effects of coronavirus on diabetic patients using recent machine learning methods.

Conclusion
Diabetes is a disease that affects many people across the world. As diabetic foot ulcer is a significant complication of diabetes, its accurate prediction is crucial. Early prediction followed by immediate treatment could save the diabetic patient from amputation. A practical model, built on the best of four considered ML techniques and informed by related works, achieved accurate diabetic foot ulcer prediction. Evaluating the performance of the considered algorithms shows that ELM obtained values of 96.15%, 0.0385, 0.95, 0, and 0.05 for accuracy, 0-1 loss, TS/CSI, FOR, and FDR, respectively. The proposed technique, extreme learning machine (ELM), thus achieved better results than the remaining methods. Hence, it is recommended to use ELM to predict diabetic foot ulcers.