MediExpert: An Expert System based on Differential Diagnosis focusing on Educational Purposes

The early and accurate identification of a disease is important for its effective treatment. However, medical errors are a serious problem and pose a threat to patient safety. Appropriate and continuous education of medical personnel has been widely recognized as an important means of reducing medical errors and increasing the quality of the health system. In this paper, we present MediExpert, an expert system targeting the continuous education of health personnel. It also provides guidance to persons who cannot easily reach a healthcare unit, either because of age-related comorbidities or because they live far from one, while always recommending that they talk with their doctors. MediExpert is based on differential diagnosis, employs ontologies for the effective classification of health-related problems, and uses intelligent algorithms to enhance continuous education. We present the components of the system and elaborate on the benefits of using it for education.


Introduction
Diagnosis is one of the most important tasks performed by health providers, and the impact of diagnostic errors on patient safety is widely recognized [10]. Although such errors are quantifiable (for example, in the Harvard Medical Practice Study diagnostic errors accounted for 17% of preventable errors [6]), limited attention has been paid to reducing them, and thousands of patients die or suffer every year because of them. Expert systems could benefit both the diagnostic procedure and the education of health providers. An expert system is a computer system that emulates the decision-making ability of a human expert. Developing such a system requires the relevant knowledge to be extracted from an expert and then represented in a knowledge base for reasoning. Based on this knowledge, the system can aid human experts in decision making and in education. In this paper, we present MediExpert, an expert system for decision support and education on diagnosis. The final objective of MediExpert is primarily to provide preliminary medical diagnosis remotely, either in areas far from healthcare units (mainly rural areas) or in cases where elderly people cannot easily move, while always recommending that users talk with their doctors. In addition, it can support physicians in decision making without being limited to a specific medical specialty. Students of medical schools can also use it for educational purposes: they can practice diagnosis on hypothetical scenarios in which the system presents a set of symptoms and asks them to find the diseases potentially associated with it, and vice versa.
The novelty of the system lies in three points: it can be deployed widely, independently of the specific medical domain; it is simple for health experts to understand and use; and it uses an ontology to annotate and link the available knowledge, which allows it to identify partial matches and generalizations. The remainder of this paper is structured as follows. In Section 2, we elaborate on related work. In Section 3, we present the system architecture and elaborate on the components and algorithms used. Section 4 presents a demonstration scenario using the system, and Section 5 concludes the paper and presents directions for further work.

Related work
Over the years, several expert systems have been developed to support decision making in diagnostic procedures. APACHE III, for example, is a system [5] that tries to predict a person's risk of dying in a hospital. The prediction is based on a comparison with the medical history of 18,000 cases stored in the system database. It has an average of 95% predictive accuracy and uses a score-based mechanism. Although we also use a scoring mechanism, our system is mostly targeted at education and relies on already existing knowledge rather than on a history of cases. LISA, on the other hand, is a Clinical Information and Decision Support System for co-operative care in childhood acute lymphoblastic leukaemia [3]. It is primarily concerned with providing support during the patient's treatment period, when weekly decisions on drug dosing have to be made. Dose adjustment rules are applied using the Guideline Modelling Language PROforma, and the recommendations are provided in the clinical setting by the TALLIS PROforma approval mechanism. Our system, however, does not try to implement existing clinical guidelines but only to educate and to offer help in diagnostic procedures. ISABEL is a web-based clinical decision support system providing support for paediatric diagnostic decisions [11]. It uses Autonomy's natural language processing software and consists of a proprietary medical database with over 11,000 diagnoses and 4,000 drugs. It supports templates for querying and HL7 for interoperability. Although the system is promising and offers useful features such as natural language processing, it does not target the education of health personnel and, as a commercial product, it is difficult to modify for that purpose. PUFF is an expert system introduced to interpret pulmonary function test data [1]. Its reasoning is based on backward chaining and it uses about 400 rules in its knowledge base. Although promising, it is dedicated to pulmonary function tests.
Therapy Edge HIV is a web-based clinical decision support system introduced in 2005 that deals with HIV treatment [2]. Its reasoning is based on temporal guidelines that assess the patient's current state and create alternative treatment options. Therapy Edge HIV implements an API to communicate with external systems, providing information in XML format. Again, this system is limited to a specific domain of knowledge, whereas MediExpert has been developed as a generic tool. The expert system in [7] derives the differential diagnosis of epilepsy in childhood. Like our system, it uses a meta-rule, and the rules that are fired, in [7] as in MediExpert, are instances of this meta-rule. The system in [7] makes effective use of expert knowledge for diagnosis, but it does not run on the web and has no student teaching component. Docs 'n Drugs [8] is another intelligent tutoring system for web-based and case-oriented training in medicine. There, the development of a training case influences the correctness of the learner's answers, and ICD-10 and other ontologies are also used. We believe that using ontologies is a step in the right direction. However, user experience with that system was reported to be poor, so we investigate a simpler, cleaner approach. The training provided by Docs 'n Drugs is case oriented, while MediExpert provides more general medical training. Finally, COSMOS [4] is another web-based expert system that offers decision support in diagnosis to interdisciplinary experts. As such, it can be used for diagnostic purposes in health. However, due to the complexity of the temporal rules that need to be defined, the system is difficult for domain experts to use. In addition, it has not been exploited for educational purposes.

MediExpert architecture and components
MediExpert employs a three-layered architecture, shown in Figure 1. It consists of the web interface, the diagnostic sub-system and the knowledge base. In the sequel, we describe each of these layers in detail. In the top layer, two modules, the expert module and the student module, enable user interaction. This layer has been implemented using CSS, JavaScript and HTML. The expert module is a web application accessed through a web browser. Its interface enables experts to define the rules for differential diagnosis. The user interface is simple yet powerful, allowing new rules to be added without interruption. Those rules can come from medical books or from the existing knowledge of medical experts. In our experiments, we used rules from a standard differential diagnosis book [9], which is used for educational and instructional purposes and describes the diagnostic procedure. Different books might contain similar procedures and methods for medical diagnosis, and our knowledge base is flexible enough to be extended with additional knowledge. An example table indicating the differential diagnosis of cough is shown in Table 1.
Knowledge rules are formulated from this tabular information by a medical expert with the help of a knowledge engineer, as described in the sequel. The student module is implemented as a mobile application and can be used either in class or at home. It randomly selects cases with specific symptoms and lets students suggest a diagnosis. The suggested diagnosis is then compared to the diagnosis derived by our system and the results are returned to the user. In addition, the system lets students select a disease and then pick, from a list, the symptoms related to this disease. The system carries out the tutoring process by comparing the selected symptoms to the ones recorded in the bibliography. As such, the tool can be used for training health personnel and for helping with the diagnostic process.
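The comparison performed during tutoring can be illustrated with a minimal Python sketch. It is not the MediExpert implementation (which is written in Prolog); the function name `tutoring_feedback`, the returned fields and the example symptom set are assumptions made for illustration only, and the symptoms listed are not clinical reference data.

```python
def tutoring_feedback(selected, recorded):
    """Compare the symptoms a student selected with those recorded for a disease.

    Hypothetical helper: returns the correctly chosen symptoms, the recorded
    symptoms the student missed, the selected symptoms that are not recorded,
    and a success score in percent (share of recorded symptoms found).
    """
    selected, recorded = set(selected), set(recorded)
    correct = selected & recorded
    score = round(100 * len(correct) / len(recorded)) if recorded else 0
    return {
        "correct": sorted(correct),
        "missed": sorted(recorded - selected),
        "extra": sorted(selected - recorded),
        "score": score,
    }


# Illustrative (non-clinical) symptom set for one disease.
recorded_asthma = {"wheezing", "cough", "dyspnea", "chest tightness"}
feedback = tutoring_feedback({"cough", "wheezing", "fever"}, recorded_asthma)
print(feedback)
```

In this example the student found two of the four recorded symptoms, so the reported score is 50, with "fever" flagged as an extra selection.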

The diagnostic sub-system
We selected Prolog for the implementation of the application logic, as Prolog is a programming language based on logic. When some symptoms are provided and the disease has to be identified, the preconditions of the rules available in the Knowledge Base (KB) are unified with the input and the most similar disease or problem is returned. Conversely, when a disease is selected, its symptoms are retrieved from the available rules. Our algorithms also handle partial matches and return a matching percentage, which lets students understand their success rate. Differential Diagnosis (DD) is the process of determining, among two or more diseases with similar symptoms, the one from which the patient is suffering. This differentiation is based on a systematic comparison of symptoms, signs and laboratory findings. Our expert system follows the clinical approach to DD of Seller & Symons 2012 [9] and the International Classification of Diseases (ICD-10). Seller's and Symons' proverb: "If you don't think about it, you will never diagnose it." The reasoning in this approach is as follows.
Step 1: Initially, check that the Complaint (symptom) is a member of the driving complaint list.
Step 2: Then, follow the diagnosis by processing the rules derived from tables such as the one shown in Table 1. The construction of the rules is based on the following conditions and characteristics.
Step 2a: Nature of patient. Identifies those conditions that are most prevalent within a particular subgroup (e.g. children, the elderly and premenopausal, diabetic, hypertensive, and immunocompromised individuals).
Step 2b: Nature of symptoms. Further identifies conditions by amplifying additional characteristics of the symptoms (how, when, where, radiation, acute/chronic, and others).
Step 2c: Associated symptoms. Any additional complaint (e.g., headache) could contradict or confirm the diagnosis.
Step 2d: Precipitating and aggravating factors. For example, the pain of gastritis is worsened by the ingestion of most foods, particularly alcoholic beverages. Peptic ulcer pain usually begins an hour or so after eating and is generally relieved by eating. If epigastric pain occurs primarily, or is worsened, in the recumbent position, peptic esophagitis should be suspected.
Step 2e: Ameliorating factors. For example, if the patient experiences relief after eating or taking antacids, peptic ulcer or peptic esophagitis is the probable cause of pain. The pain of gastritis, though worsened by the ingestion of food and alcoholic beverages, may be relieved by antacids.
Step 2f: Physical findings. Physical examination can often provide major clues to the diagnosis.
Step 3: Derivation of diagnosis.
A novel feature of our system is that the constructed KB is complemented with terms from the ICD-10 † ontology, in order to ensure interoperability and a common understanding of the various health problems. ICD-10 is a standard medical classification ontology, which we exploit to record and identify similarities between health problems. The ICD-10 taxonomy can be represented as a tree with health problems as its nodes. In the 2017 version of ICD-10, there are four levels in the tree in addition to the root level. Sibling nodes at lower levels are more similar to each other than sibling nodes at upper levels. Our diagnostic sub-system is able to reason over the various levels, identifying for example that "S27.4 Injury of bronchus" is a subcategory of "S27 Injury of other and unspecified intrathoracic organs", and applies this in the decision support process.
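The kind of reasoning over the ICD-10 tree described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the system's Prolog code: the `PARENT` map is a small hand-written excerpt of the hierarchy, the node labels `ROOT` and `chapter XIX` are chosen here for readability, and `common_depth` is one simple, assumed way to express that nodes sharing a deeper ancestor are more similar.

```python
# Hand-written excerpt of the ICD-10 tree (child -> parent); labels are illustrative.
PARENT = {
    "chapter XIX": "ROOT",
    "S20-S29": "chapter XIX",
    "S26": "S20-S29",
    "S27": "S20-S29",
    "S27.3": "S27",
    "S27.4": "S27",
}


def path_to_root(code):
    """Return the list [code, parent, ..., top-most known ancestor]."""
    path = [code]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def is_subcategory(code, ancestor):
    """True if `ancestor` lies strictly above `code` in the tree."""
    return ancestor in path_to_root(code)[1:]


def common_depth(a, b):
    """Depth (root = 0) of the deepest node shared by the root paths of a and b."""
    depth = -1
    for x, y in zip(reversed(path_to_root(a)), reversed(path_to_root(b))):
        if x != y:
            break
        depth += 1
    return depth


print(is_subcategory("S27.4", "S27"))   # S27.4 lies below S27
print(common_depth("S27.4", "S27.3"))   # siblings under S27
print(common_depth("S27.4", "S26"))     # related only through the block S20-S29
```

With this excerpt, S27.4 and S27.3 share an ancestor at depth 3, while S27.4 and S26 share one only at depth 2, which matches the observation that siblings at lower levels are more similar.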

Meta-rule and form of rules
The reasoning process is driven by a meta-rule, which encodes the reasoning behind the DD tables. This meta-rule is general enough to cover all cases of health problems with some initial symptoms, and all domain-specific rules are instances of it. The meta-rule is as follows, where ∈, ⊆, ∩ and ≠ are set operators.

If (Complaint ∈ Driving_complaint_list) and
( (Nature_of_patient ⊆ Nature_of_patients_list) or

In the aforementioned rule, Condition is the diagnostic result, ICD-10_code is the ICD-10 term and the Diagnostic Studies are the studies to ensure the diagnostic result. An instance of the aforementioned rule, which encodes part of the knowledge shown in Figure 2, is the following:

For this example, the diagnostic elements are the following:
• The Condition is asthma
• The Diagnostic Studies are the pulmonary_function_tests

† http://www.icd10data.com/

The Knowledge Base
The sentences in the knowledge base are similar to normal text and are easily understood by any health scientist without computer programming expertise. The rules stored in our knowledge base have the following form, which corresponds directly to the layout of Table 1: [Id, Complaint, Nature_of_patient, Nature_of_symptoms, Associated_symptoms, Precipitating_and_aggravating_factors, Ameliorating_factors, Physical_findings, Condition, ICD-10_related_problems].
Note that each rule is represented as a list. The reasoning engine of MediExpert processes this representation as an "if-then" rule during diagnosis.
Domain experts generate these rules through a graphical user interface, and the rules are then stored internally as Prolog clauses. In addition, the constructed KB is complemented with terms from ICD-10 in order to ensure interoperability and a common understanding of the various health problems.
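To make the list form and its use as an "if-then" rule more concrete, the following Python sketch stores two rules with the fields listed above and scores them against a case, returning a matching percentage as mentioned for partial matches. It is only an illustration: MediExpert stores its rules as Prolog clauses, the scoring formula (share of a rule's expected findings present in the case) is an assumption, and the two rules contain invented, non-clinical example values.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    """One KB rule, mirroring the list form given in the text."""
    id: int
    complaint: str
    nature_of_patient: frozenset
    nature_of_symptoms: frozenset
    associated_symptoms: frozenset
    precipitating_and_aggravating_factors: frozenset
    ameliorating_factors: frozenset
    physical_findings: frozenset
    condition: str
    icd10_related_problems: tuple


FINDING_FIELDS = (
    "nature_of_patient", "nature_of_symptoms", "associated_symptoms",
    "precipitating_and_aggravating_factors", "ameliorating_factors",
    "physical_findings",
)


def match_percentage(rule, complaint, findings):
    """Assumed scoring: percent of the rule's expected findings present in the case."""
    if rule.complaint != complaint:
        return 0.0
    expected = set().union(*(getattr(rule, f) for f in FINDING_FIELDS))
    if not expected:
        return 0.0
    return 100.0 * len(expected & set(findings)) / len(expected)


def diagnose(rules, complaint, findings):
    """Return (percentage, condition) pairs with a non-zero match, best first."""
    scored = [(match_percentage(r, complaint, findings), r.condition) for r in rules]
    return sorted((s for s in scored if s[0] > 0), reverse=True)


fs = frozenset
# Invented example rules; values are illustrative, not clinical guidance.
KB = [
    Rule(1, "cough", fs({"child"}), fs({"nocturnal"}), fs({"wheezing"}),
         fs({"exercise"}), fs(), fs({"expiratory wheeze"}), "asthma", ("J45",)),
    Rule(2, "cough", fs({"adult"}), fs({"productive"}), fs({"fever"}),
         fs(), fs(), fs({"rhonchi"}), "acute bronchitis", ("J20",)),
]

result = diagnose(KB, "cough", {"child", "nocturnal", "wheezing", "fever"})
print(result)
```

Here three of the five findings expected by the first rule are present, so it is ranked first with a 60% match, while the second rule matches only on "fever".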

System Demonstration Scenario and Preliminary Results
In the sequel, we demonstrate the sub-system for the students, as shown in Figure 2. More specifically, we presented the system to 30 student nurses at the Technological Educational Institute of Crete, asking the class to identify the symptoms related to a specific complaint (e.g. "Chest Pain"). If a student selects the symptoms successfully, he or she proceeds to the next problem. A different approach is to present only the symptoms and then ask for the proper disease. As we exploit the ICD-10 terminology, we are able to use generalizations and specializations of the terms, with appropriate warnings (e.g. throat pain is a kind of pain, so the more generic symptom "pain" can stand in for it). The performed scenario is the following.
Students' scenario: In the first step, shown in Figure 2a, the mobile version of MediExpert presents to the student a list of symptoms of a specific condition. Then the program invites the student to choose the correct diagnosis from a list of possible answers (shown in Figure 2b). Finally, the program reveals the correct answer to the student, as shown in Figure 2c.
Preliminary results from students: All students using the system found it interesting and usable, and recognized its potential as an educational tool that helps them in the difficult process of linking symptoms and other factors with a diagnosis.
Experts' scenario: Next, we demonstrate the expert's web interface, focusing on the diagnosis process. The first page asks about the complaint of the patient (shown in Figure 3).

Figure 3. Complaint Selection
The next step presents the symptoms and factors as checkboxes, and the user checks the ones that are present in the specific case, as shown in Figure 4. Finally, the last step presents the diagnostic result, as shown in Figure 5.
Preliminary results from experts: We gave a short demonstration to two medical practitioners in order to get an indication of the usability of the system. The practitioners also recognized its usefulness and provided us with additional books from which to extract rules. They further proposed that, instead of having doctors write the rules, the system should provide a search/select functionality through which rules can be generated almost automatically with a few clicks. We are currently investigating these aspects, and the next version of the system, with an updated interface, is about to be released in the following weeks.

Conclusions and Discussion
This paper presents an expert system based on differential diagnosis for educational purposes. The main features of MediExpert are as follows. The system is easily extended with new differential diagnosis rules through a simple user interface, and it can be used to educate health students in the difficult task of diagnosis. The system employs ontologies to identify and classify diseases and implements intelligent algorithms for reasoning and for relating symptoms to diseases. A web interface and a mobile application support the communication of the system with its potential users. When diagnostic rules match only partially, the system derives alternative diagnoses by exploiting the ICD-10 tree to identify close diagnoses. In the envisioned deployment, the system will be available online for all, enabling students of medical schools to test their knowledge and individuals to learn more about the process of differential diagnosis and about what could be the cause of their symptoms. The system will always recommend that individuals discuss their findings with their doctor, as we by no means intend to replace the doctor. As a next step, we intend to carry out an extensive usability evaluation of the system with students from the medical university of Crete and to implement the collected change requests. In addition, we intend to investigate both adding a more sophisticated inference engine to the system and shifting from a rule-based system to one that uses a learning algorithm, which could help students see why they failed (gave the wrong diagnosis) and possibly provide more information.