Towards an Intelligent Monitoring System for Patients with Obstructive Sleep Apnea

Due to the growing incidence of chronic diseases and aging populations, the pressure to control costs and the expectations of continuous improvements in the quality of service have increased the need to understand how healthcare is provided and to determine whether cost-effective improvements to care practices can be made. In the case of people suffering from Obstructive Sleep Apnea, patients using self-administered nasal Continuous Positive Airway Pressure (CPAP) may receive information on the treatment only when they visit the lung specialist. In this paper, we propose an IoT-based Intelligent Monitoring System that relies on machine learning to achieve a threefold goal: (1) it detects compliance early in order to predict CPAP usage; (2) it monitors the actual degree of adherence to the treatment to keep both the patient and the lung specialists informed; and (3) it sends recommendations to the patient to empower her/him and to improve follow-up.


Introduction
The pressure to control costs and the expectations of continuous improvements in the quality of service have increased the need to understand how healthcare is provided and to determine whether cost-effective improvements to care practices can be made [1]. Thanks to new and inexpensive technologies, it is now possible and affordable to get information and perform medical tasks remotely, thus reducing the corresponding overall costs [2] [3]. Telemedicine is now a reality and can be applied to several fields of health practice to assist patients in self-care and adherence to treatments [4]. In particular, patient monitoring consists of using information technology to monitor patients who are not located in the same place as the health care provider. Meystre [5] reported how telemonitoring systems have been successfully adopted in cardiovascular, hematologic, respiratory, neurologic, metabolic, and urologic domains. In fact, it may help people stay healthy, and in their homes, longer [6]. Besides patients, the role of case managers, therapists, caregivers, social workers, as well as relatives is essential for remotely assisting monitored patients [4].
Filtering and analyzing data coming from teleassistance systems is becoming more and more relevant. In fact, data (both raw and processed) are continuously gathered and sent through the sensors. In this direction, data mining approaches that try to extract knowledge from Health Information Systems can provide useful results, targeting some of the main phases of the medical workflow: screening, risk stratification, and care pathways. Since the final goal is to ease and automate some interventions and triage in a cost-effective manner, recent studies make use of data mining and predictive methods to improve the execution of healthcare pathways [7]. In particular, compliance with and adherence to prescribed therapies for chronic diseases [8] [9] are common areas of application.
In a scenario of growing incidence of chronic diseases and aging populations, we focus on Obstructive Sleep Apnea (OSA). This is a chronic disease defined as repeated episodes of obstructive apnea and hypopnea during sleep, together with daytime sleepiness or altered cardiopulmonary function [10]. OSA treatment normally involves using a self-administered nasal Continuous Positive Airway Pressure (CPAP) device [11]. CPAP is applied to the upper airway with a nasal mask, nasal prongs, or a mask that covers both the nose and mouth [12] [13]. Recent studies show that CPAP therapy is very effective in improving symptoms, such as memory deficits and frontal lobe related abnormalities [14] [15]. Although it is a highly effective treatment, compliance with therapy (defined as use of the machine for more than four hours per night) remains a difficult issue. Many studies have statistically evaluated compliance over relatively brief periods of time (one to six months) [16] [17] [18]. Long-term studies are far fewer [19] [20]. In these reports, the reasons for noncompliance have often been inadequately documented. The degree to which a given patient adheres to the therapy appears to be more closely related to the relief of daytime symptoms and the restoration of alertness than to the severity of the apnea-hypopnea index [10].
In this paper, we focus on eHealth and propose an IoT-based intelligent monitoring system aimed at giving automatic remote support to patients suffering from OSA, as well as suitable feedback to lung specialists. Although technologies to make IoT a systematic reality are far from being assessed [21], there are several efforts to utilize IoT-based systems for remotely monitoring patients [22] [23] [24] [25]. Focusing on OSA, Vandenberghe and Geerts [26] performed a study on observations in hospital-based sleep centers performing traditional and ambulatory sleep studies to allow hospital-based sleep centers to deploy current practices in a targeted, meaningful, and accountable way. Kumar [27] proposed a methodology based on the IoT to help monitor patients with OSA and similar life-threatening diseases. The intelligent monitoring system presented in this paper differs from those proposed in the literature because it has a threefold goal: early compliance prediction, adherence degree monitoring, and patient empowerment. Furthermore, decision support to lung specialists and suitable recommendations to patients are automatically provided.
This paper improves and extends our previous work [28]. There, we focused on monitoring the adherence degree and on presenting the implementation of the app. Here, in contrast, we present advances and focus on the three functionalities, their implementation, and results from the clinical studies performed at the Hospital Arnau de Vilanova in Lleida (Spain).
The rest of the paper is organized as follows. Section 2 presents the data that have been used in the study and proposes the intelligent monitoring system and its main functionalities. In Section 3, we summarize the experiments that have been performed. Finally, Section 4 ends the paper with the conclusions.

Data and Methods
In the context of innovation in telemedicine, the myOSA project (RTC-2014-3138-1) is aimed at developing new ICT tools and remote clinical follow-up methods to allow the creation of continuous integrated services for the treatment of OSA.
Currently, in Spain, after a visit with a lung specialist, patients suffering from severe OSA are treated with a CPAP machine at home. Lung specialists and CPAP providers guide patients on how to use the device properly and prescribe using the machine for more than 4 hours daily in order to benefit from the therapy. From that moment on, the medical protocol schedules follow-up visits after 1, 3, 6 and 12 months and then once a year. As a consequence, it may happen that, during a visit, the lung specialist discovers that the patient is using the CPAP less than 4 hours or is not using it at all.
Modern versions of CPAP machines (e.g., AirSense 10 Autoset by RESMED) have a wireless Internet connection and offer remote access to information on their daily use. Thanks to these advances in CPAP devices, we developed the MyOSA system [28] that, by connecting the CPAP to the Internet and providing patients with an app on their smartphone, gives support to both patients and lung specialists. Its overall goal is to improve patients' compliance and achieve better follow-up. The system is composed of three tiers, as sketched in Figure 1. At home, the patient has the CPAP machine connected to the Internet and a smartphone with the app (MyOSA app). At the hospital, lung specialists are provided with a web application that summarizes relevant information and also gives support in medical decisions. Finally, the MyOSA platform, installed in the cloud, connects all the devices for data exchange. The core of the MyOSA platform is the Intelligent Monitoring System (see Section 2.2), which is composed of a set of intelligent algorithms aimed at predicting the expected adherence level to the therapy of a given patient. To improve therapy follow-up and give support to both patients and lung specialists, the MyOSA system relies on these algorithms, which receive raw data from the CPAP machine as input, transform and analyze them, and provide recommendations to both patients (through the MyOSA app) and lung specialists (through the web application).
The MyOSA app is available in Spanish and Catalan for both Android (see footnote 1) and iOS (see footnote 2) devices. The web application is for the exclusive use of lung specialists at Hospital Arnau de Vilanova in Lleida. Figure 2 shows three screenshots (two from the MyOSA app and one from the web application) of the MyOSA system.
Figure 2. Screenshots of the MyOSA system. On the left, an example of a recommendation to the patient through the app; in the center, the list of information given to a patient; on the right, the main board in the web application for follow-up by the lung specialist.

Data
Two datasets were collected to build the functionalities of the Intelligent Monitoring System.
To perform the analysis of compliance, 51 patients were recruited at Hospital Arnau de Vilanova (Lleida, Spain). They agreed to participate in the 6-month study, signing the informed consent (see footnote 3). However, 9 of them were excluded: the CPAP machine stopped working before the 6th month (3 patients); they did not attend the final visit at month 6 (5 patients); and 1 patient died during the study. The study variables from the remaining 42 patients were manually collected by lung specialists over four visits at month-0 (baseline or T0), at month-1 (T1), at month-3 (T3) and at month-6 (T6). During the first visit, clinicians gathered 77 features organized in five categories: clinical history (e.g., depression, anxiety, arterial hypertension (HTA), cardiopathy, neurological disease, respiratory disease), symptoms (e.g., irritability, apathy, depression, insomnia), co-morbidities (e.g., diabetes, obesity, dyslipidemia), therapies (e.g., beta blockers, diuretics), sleeping test (e.g., sleeping time, AHI, percentage of the night spent with oxygen saturation < 90% or CT90) and basal information (e.g., height, weight, BMI, systolic and diastolic blood pressure (TAS, TAD), oxygen saturation). In the second visit, when the patients had had the CPAP machine at home for one month, 16 new features related to monitoring were collected (e.g., nightly average use, abandonment, or adverse effects of the treatment, such as dry mouth, allergies, and cutaneous irritations). At the third month (T3), the same features as in T1 were gathered plus 5 new ones (i.e., height, weight, BMI, drugs removed, and drugs added). At month-6, although some other variables were collected, for the purpose of this study only the average number of hours of use per night was considered. Eventually, three datasets (D0, D1, and D3) with an incremental number of features (i.e., D1 features = D0 features + features collected at T1) were created with 77, 93 and 114 features, respectively. Hereinafter, we refer to the overall dataset as compliance-DS.
1 https://play.google.com/store/apps/details?id=myosa.www.oxigensalud.com.myosa&hl=en
2 https://itunes.apple.com/es/app/myosa/id1062842892?mt=8
3 CEIC-1283, approved by the Comité Ético de Investigación Clínica of the Institut Catalá de Salut.
The analysis of adherence was based on the follow-up data of 4207 patients (980 women) using CPAP (59 ± 25 days) in Spain. From the CPAP usage of each patient over time, we created 12 features to be able to extract patterns of use (e.g., maximum/minimum/average usage per day, number of consecutive days using CPAP more than 4 hours). Table 1 describes the values of the features used to study adherence. Hereinafter, we refer to this dataset as adherence-DS.
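As an illustration of how usage features of this kind might be derived, the following minimal sketch computes a few of them from a list of nightly usage hours. The function name, the feature names, and the sample values are assumptions made for illustration only; the actual 12 features are those summarized in Table 1.

```python
from statistics import mean

COMPLIANCE_THRESHOLD_HOURS = 4.0  # "more than 4 hours" per night, as in the text


def usage_features(hours_per_day):
    """Compute illustrative usage features from daily CPAP usage (in hours)."""
    longest_streak = 0
    current_streak = 0
    for hours in hours_per_day:
        if hours > COMPLIANCE_THRESHOLD_HOURS:
            current_streak += 1
            longest_streak = max(longest_streak, current_streak)
        else:
            current_streak = 0  # the run of compliant days is broken
    return {
        "max_usage": max(hours_per_day),
        "min_usage": min(hours_per_day),
        "avg_usage": mean(hours_per_day),
        "days_over_threshold": sum(
            h > COMPLIANCE_THRESHOLD_HOURS for h in hours_per_day
        ),
        "longest_streak_over_threshold": longest_streak,
    }


# Hypothetical week of usage data (hours per night), not taken from adherence-DS.
features = usage_features([5.0, 6.5, 3.0, 0.0, 4.5, 7.0, 8.0])
```

In this toy week, five nights exceed the threshold and the longest run of consecutive compliant nights is three.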

Methods
The Intelligent Monitoring System has three main modules (outlined in Figure 3) each of them aimed at providing a specific functionality and support to the end-users (patients and lung specialists, depending on the case): analysis of compliance, analysis of adherence, and usage-based recommendations.
Analysis of compliance. The objective of this module is to allow early detection of compliance with therapy to reduce the misuse of CPAP devices and dropout rates. As indicated in different studies, if a patient is compliant for the first 6 months, s/he will also be compliant in the future [29] [30]. According to the literature, we defined compliant CPAP users as those who used the device for more than 4 hours per night on average during the first 6 months of treatment. The module gives support to the lung specialists by predicting whether the patient will be compliant with the treatment 6 months after starting the therapy. This is achieved thanks to three compliance classifiers created for month-0, month-1 and month-3 of the treatment. In each of them, we labeled patients as "compliant" if they correctly followed the CPAP therapy prescription. Patients who did not achieve the prescribed treatment were labeled as "non-compliant".

Analysis of adherence. Taking into account the benefits of continuous and prolonged use of the machine for the health of patients with OSA, we have developed a module that provides the degree of adherence to the use of the machine. In particular, we have created a model to cluster patients according to their CPAP use profile. The resulting information is used, on the one hand, to inform patients about their adherence and, on the other hand, to inform lung specialists about therapeutic adherence.
Usage-based recommendations. With the aim of empowering patients with self-management tools, we defined a module to provide recommendations based on the patient's CPAP device monitoring. These recommendations, designed under the supervision of the experts (i.e., lung specialists involved in the myOSA project), are intended to encourage patients to continue using the CPAP device. This module continuously receives information on the use of the CPAP device and on the two previous functionalities, and automatically generates recommendations to the patients if certain conditions are met.

Analysis of compliance
On the compliance-DS we conducted a machine learning study in order to build three predictive models (namely, M0, M1, and M3), each one working with its own dataset (D0, D1, and D3, respectively). All the models are aimed at predicting compliance at month 6.
In order to ensure adequate performance evaluation, the three datasets were randomly divided into stratified train and test sets with a ratio of 70/30. For each dataset, we ran 10-fold cross-validation looking for the best hyper-parameter values using a grid search approach. The best models were identified by ranking the cross-validation performances (i.e., average f1-score) reported by each pipeline. This process was configured with two learning metrics (precision-weighted, denoted p_weighted, and f1-weighted) in order to find the pipelines that yielded the best predictive results. To complete this analysis, the best models of each dataset were evaluated on the independent test set in order to report their generalization ability.
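To make the evaluation protocol more concrete, the following minimal sketch shows one way to build a stratified 70/30 split and to compute a support-weighted F1 score using only the Python standard library. It is an illustration under stated assumptions, not the tooling used in the study: the function names, the random seed, and the class counts are hypothetical, and the grid search and 10-fold cross-validation are not reproduced here.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_ratio=0.3, seed=0):
    """Return (train_idx, test_idx) keeping class proportions roughly equal."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_ratio)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def f1_weighted(y_true, y_pred):
    """Per-class F1, averaged with weights proportional to class support."""
    support = Counter(y_true)
    total = 0.0
    for cls, n_true in support.items():
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = n_true - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += f1 * n_true
    return total / len(y_true)


# Hypothetical class counts (not the real class distribution of compliance-DS).
labels = ["compliant"] * 30 + ["non-compliant"] * 12
train_idx, test_idx = stratified_split(labels)
```

Weighting each class by its support keeps the majority class from being ignored while still reflecting performance on the minority class, which matters when the classes are unbalanced.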
The three datasets share some particularities, such as a small number of samples, the presence of missing values, class imbalance, and a high-dimensional feature space. To cope with these complexities, we designed a classification framework flexible enough to enable the execution of heterogeneous pipelines, that is, sequences of configurable machine learning steps. In particular, the pipelines were composed of 3 mandatory steps (i.e., imputation, variance filtering, data standardization), 2 optional steps (i.e., feature selection and sampling) and 2 final steps (i.e., classifier training and evaluation). In total, 80 pipelines were configured from 4 feature selection methods, 5 classifier algorithms, 2 sampling strategies and 2 evaluation metrics.
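The count of 80 pipelines follows from the Cartesian product of the options (4 × 5 × 2 × 2). The sketch below enumerates such configurations. It assumes that the fourth feature selection option is "no selection" and that the two sampling strategies are "SMOTE" and "no sampling"; the ordering of the optional steps and the dictionary layout are likewise illustrative choices, not the authors' implementation.

```python
from itertools import product

# Option names follow the abbreviations used in the paper; None means "skip".
FEATURE_SELECTION = ["combine_fs", "rfe_rf_fs", "lasso_fs", None]
CLASSIFIERS = ["LR", "k-NN", "RF", "SVM", "NN"]
SAMPLING = ["smote", None]  # assumed: SMOTE or no sampling
METRICS = ["precision_weighted", "f1_weighted"]

MANDATORY_STEPS = ["imputation", "variance_filter", "standardization"]


def build_pipelines():
    """Enumerate every pipeline configuration as a dict of step choices."""
    pipelines = []
    for fs, cls, sm, metric in product(
        FEATURE_SELECTION, CLASSIFIERS, SAMPLING, METRICS
    ):
        steps = list(MANDATORY_STEPS)
        if fs is not None:
            steps.append(fs)  # optional feature selection step
        if sm is not None:
            steps.append(sm)  # optional sampling step
        steps.append(cls)     # final classifier
        pipelines.append({"steps": steps, "metric": metric})
    return pipelines


pipelines = build_pipelines()
print(len(pipelines))  # 4 * 5 * 2 * 2 = 80
```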
The result of running (i.e., training or building) a pipeline (Pipe_i) on a dataset (D_j) with parameters (params_i) is a predictive model or classifier (M_{i,j}) with its associated predictive performance (Perf_{i,j}). Figure 4 shows the pipeline schema with the inputs, outputs and the different steps that configure the pipelines for compliance with CPAP therapy.
Feature imputation. A simple strategy was used to replace null values with the most frequent value (for categorical features) and with the mean value (for numerical features).
Variance filter. We used a simple filter method to eliminate features with zero variance, that is, features that have the same value in all the samples and therefore do not provide any additional information to the dataset.
Data standardization. All features were standardized to zero mean and unit variance to enable comparisons among features [31].
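The three mandatory steps described above can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions: the toy feature names and values are hypothetical, missing values are represented by None, and only the numerical column is standardized.

```python
from collections import Counter
from statistics import mean, pstdev


def impute(column, categorical=False):
    """Replace None with the mode (categorical) or the mean (numerical)."""
    observed = [v for v in column if v is not None]
    if categorical:
        fill = Counter(observed).most_common(1)[0][0]
    else:
        fill = mean(observed)
    return [fill if v is None else v for v in column]


def has_variance(column):
    """True unless every sample holds the same value."""
    return len(set(column)) > 1


def standardize(column):
    """Rescale a numerical column to zero mean and unit variance."""
    mu, sigma = mean(column), pstdev(column)
    return [(v - mu) / sigma for v in column]


# Hypothetical toy data: feature name -> (values, is_categorical).
raw = {
    "bmi": ([30.0, None, 34.0, 32.0], False),
    "diabetes": (["no", "yes", None, "no"], True),
    "uses_cpap": (["yes", "yes", "yes", "yes"], True),  # zero variance
}

imputed = {name: impute(vals, cat) for name, (vals, cat) in raw.items()}
kept = {name: vals for name, vals in imputed.items() if has_variance(vals)}
bmi_scaled = standardize(kept["bmi"])
```

Here the constant feature is dropped by the variance filter, and the standardized column has zero mean and unit variance by construction.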
Feature selection (fs). Due to the large number of features compared with the number of samples in each dataset (p > n), we relied on feature selection [32]. We adopted three feature selection methods with the intention of having several options from which to find the one that generated the best results. The combine_fs method ranks the features by their statistical significance with respect to the class. To do this, it applies ANOVA or chi-squared according to the data type of each feature and selects the subset of features by means of a configurable threshold. The recursive feature elimination (rfe_rf_fs) method [33] differs from the previous one, since it derives the importance of the features from a classifier model trained for that purpose (i.e., random forest). The lasso_fs method [34] uses a linear model configured with the L1 norm in order to provide the weight of the features. The possibility of not using any feature selection method was also considered.
Sampling (sm). To avoid the bias of many standard classifier learning algorithms towards the class with the larger number of instances, we introduced in the pipeline the possibility of using a sampling technique. In particular, we relied on SMOTE [35].
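The core idea of SMOTE is to create synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours. The following is a simplified sketch of that idea, not the reference algorithm of [35] nor the implementation used in the study; the function name, the default k, and the toy points are assumptions for illustration.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for i, p in enumerate(minority) if i != base_idx]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical minority-class samples in a 2-D feature space.
new_points = smote([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)], n_new=5)
```

Because each synthetic point lies on a segment between two existing minority samples, the new points stay inside the region already occupied by the minority class.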
Classifier training and evaluation. We selected several classification algorithms (cls) to cover various classification strategies (i.e., linear, non-linear, distance-based, and tree-based). Providing different classification strategies is especially appropriate for complex datasets in which the distribution of data is not clearly manifested. Therefore, we opted for logistic regression (LR) [36], k-nearest neighbor (k-NN) [37] and random forest (RF) [38]. Additionally, we selected support vector machines (SVM) [39] and artificial neural networks (NN) [40] as a subset of algorithms with less interpretative capacity but with potentially greater discriminatory capacity. Table 2 summarizes the parameters, methods, and performances of the best predictive models achieved in test for each time-point. Figure 5 shows a comparison between the validation and test performances of the best pipelines for each dataset.

Analysis of adherence
This study was performed on the data from adherence-DS. First, we performed a principal component analysis (PCA) on the standardized dataset to obtain a better understanding of the problem to solve. The first two components of the PCA together achieved an explained variance ratio of 0.72. This positive result allowed us to suitably visualize the data in only two dimensions. Figure 6 shows all the data on the two PCA dimensions, highlighting in different colors users with few, normal, and high consecutive usage of the CPAP device.
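For reference, and assuming that the reported value is the cumulative ratio of the first two components, the explained variance ratio can be written in terms of the eigenvalues of the covariance matrix of the standardized data, sorted in decreasing order. The symbols below are introduced here for illustration, with p = 12 being the number of usage features:

```latex
\[
  \mathrm{EVR}_{1:2}
  = \frac{\lambda_1 + \lambda_2}{\sum_{i=1}^{p} \lambda_i}
  \approx 0.72,
  \qquad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p \ge 0 .
\]
```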
After feature reduction, the k-means algorithm [41] was used to find clusters of similar patients. To this purpose, we built several clustering models using different values of k. The resulting models were compared using the silhouette metric. The best model was achieved with k = 3, obtaining a score of 0.31. With k = 4, k = 5 and k = 10, scores of 0.28, 0.25 and 0.23 were obtained, respectively. Once the best model was selected, a post-processing analysis was conducted to map the resulting clusters to adherence profiles. Table 3 summarizes the results for each cluster. According to the results, Cluster0 was assigned to "High" (more than 4 hours of usage, on average), Cluster1 to "Low" (less than 3 hours of usage, on average), and Cluster2 to "Medium" (in the range of 3-4 hours of usage, on average).
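The two steps described above, comparing clusterings with the silhouette metric and mapping clusters to adherence profiles, can be sketched as follows. This is a minimal illustration that does not reproduce the k-means fit itself; the function names and the toy points are hypothetical, and the handling of the exact boundary values (3 and 4 hours) is an assumption.

```python
import math
from statistics import mean


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (singleton clusters score 0)."""
    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        own = [
            math.dist(p, q)
            for j, (q, lab_q) in enumerate(zip(points, labels))
            if lab_q == lab and j != i
        ]
        if not own:
            scores.append(0.0)
            continue
        a = mean(own)  # mean distance to the other members of the same cluster
        b = min(       # mean distance to the nearest other cluster
            mean(math.dist(p, q) for q, lab_q in zip(points, labels) if lab_q == other)
            for other in set(labels)
            if other != lab
        )
        scores.append((b - a) / max(a, b))
    return mean(scores)


def adherence_profile(avg_hours):
    """Map a cluster's average nightly usage to the profiles named in the text."""
    if avg_hours > 4:
        return "High"
    if avg_hours < 3:
        return "Low"
    return "Medium"


# Hypothetical 1-D toy data with two obvious groups.
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
good = silhouette_score(points, [0, 0, 1, 1])  # well-separated clustering
bad = silhouette_score(points, [0, 1, 0, 1])   # clusters mixed together
```

A higher silhouette score indicates more compact and better separated clusters, which is why the candidate values of k are ranked by it.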

Usage-based recommendations
In order to provide suitable recommendations to OSA patients, we needed to gather a wider picture of the patient's behavior with respect to the CPAP treatment. To do this, we first selected 3 features related to the compliance analysis (see Table 4) and 5 related to the adherence analysis (see Table 5).
According to the above-mentioned features, their range of values, and the guidance of the clinicians, we modeled a set of rules. Each rule has a set of conditions (C) and a recommendation message (R). These rules are continuously evaluated throughout the patient's monitoring, and a recommendation message is sent whenever the conditions of a rule are met. In particular, three kinds of recommendations were identified: awards, feedback, and alerts. Awards are given to outstanding patients when they adhere considerably well to the therapy. Moreover, awards are given to empower patients when they move from one adherence level to a higher one. Feedback is given any time patients need to receive some specific recommendation to improve the use of the CPAP or to be encouraged to use it more. Alerts are sent when the patient belongs to the non-compliant cluster and needs to be supported. Alerts may also be sent when a patient moves from one adherence level to a lower one. Table 6 shows an example of recommendations based on adherence period, type, and gradient.
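A rule of this kind, a set of conditions (C) paired with a recommendation message (R), could be represented as in the following sketch. The rule contents, the patient-state field names, and the message texts are hypothetical examples written for illustration; the actual rules were defined together with the clinicians and are not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable

LEVELS = {"Low": 0, "Medium": 1, "High": 2}


@dataclass
class Rule:
    kind: str                          # "award", "feedback" or "alert"
    condition: Callable[[dict], bool]  # C: predicate over the patient state
    message: str                       # R: text sent to the patient


# Hypothetical rules, not the ones deployed in the MyOSA system.
RULES = [
    Rule(
        "award",
        lambda s: LEVELS[s["level"]] > LEVELS[s["previous_level"]],
        "Well done, your adherence has improved!",
    ),
    Rule(
        "alert",
        lambda s: LEVELS[s["level"]] < LEVELS[s["previous_level"]],
        "Your CPAP use has worsened lately. Please contact us if you have questions!",
    ),
    Rule(
        "feedback",
        lambda s: s["level"] == s["previous_level"] and s["avg_hours"] <= 4,
        "Try to use your CPAP for more than 4 hours every night.",
    ),
]


def recommend(state):
    """Return (kind, message) for every rule whose condition holds."""
    return [(r.kind, r.message) for r in RULES if r.condition(state)]


# A patient who dropped from "High" to "Medium" adherence triggers an alert.
messages = recommend(
    {"level": "Medium", "previous_level": "High", "avg_hours": 3.4}
)
```

Keeping conditions and messages together in one record makes it straightforward for clinicians to review, add, or retire individual rules without touching the evaluation loop.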

Conclusions
Recent advances in technologies related to remote monitoring and data analysis are driving a paradigm change in the way we traditionally understand the management of chronic diseases. In this paper, we presented an intelligent monitoring system aimed at remotely supporting patients with Obstructive Sleep Apnea. Three main functionalities have been provided: early compliance detection, adherence prediction, and rule-based recommendations. Satisfactory results were achieved for all functionalities. Patients access the system through an app installed on their smartphone.
Lung specialists may access a secure web application, through which they can manage patients' information and also access specific information coming from the data analysis of the continuous positive airway pressure devices. The experience of both patients and lung specialists involved in the 6-month study was very satisfactory. Their feedback, comments, and suggestions will be used to improve the current version of the system.

Figure 1. Overall view of the MyOSA system.

Figure 4. Pipeline steps for early compliance classification.

Table 1. Summary of the features related to the use of CPAP for the analysis of adherence.

Table 2. Performances of the best classification models for each dataset.
Figure 5. Classification compliance in validation and test for best predictive models.

Table 3. Cluster centroids for each cluster.

Table 4. Compliance features for usage-based recommendations.

Table 5. Adherence features for usage-based recommendations.

Table 6. Example of recommendations. Sample message: "Your use of the CPAP has worsened lately. It is very important to be consistent. Please contact us if you have questions!"