Wireless Mobile Communication and Healthcare. 9th EAI International Conference, MobiHealth 2020, Virtual Event, November 19, 2020, Proceedings

Research Article

Explainable Deep Learning for Medical Time Series Data

  • @INPROCEEDINGS{10.1007/978-3-030-70569-5_15,
        author={Thomas Frick and Stefan Gl\"{u}ge and Abbas Rahimi and Luca Benini and Thomas Brunschwiler},
        title={Explainable Deep Learning for Medical Time Series Data},
        proceedings={Wireless Mobile Communication and Healthcare. 9th EAI International Conference, MobiHealth 2020, Virtual Event, November 19, 2020, Proceedings},
        proceedings_a={MOBIHEALTH},
        year={2021},
        month={7},
        keywords={Explainable deep learning; Convolutional Neural Network; Explanation quality metric; Medical time series data},
        doi={10.1007/978-3-030-70569-5_15}
    }
    
  • Thomas Frick
    Stefan Glüge
    Abbas Rahimi
    Luca Benini
    Thomas Brunschwiler
    Year: 2021
    Explainable Deep Learning for Medical Time Series Data
    MOBIHEALTH
    Springer
    DOI: 10.1007/978-3-030-70569-5_15
Thomas Frick1, Stefan Glüge2, Abbas Rahimi3, Luca Benini3, Thomas Brunschwiler1
  • 1: IBM Research Zurich, Smart System Integration
  • 2: Zurich University of Applied Sciences, Institute of Applied Simulation
  • 3: ETH Zurich, Integrated Systems Laboratory

Abstract

Neural Networks are powerful classifiers. However, they are black boxes and do not provide explicit explanations for their decisions. For many applications, particularly in health care, explanations are essential for building trust in the model. In the field of computer vision, a multitude of explainability methods has been developed to analyze Neural Networks by explaining what they have learned during training and what factors influence their decisions. This work provides an overview of these explanation methods in the form of a taxonomy. We adapt the different methods to time series data and benchmark them. Further, we introduce quantitative explanation metrics that enable us to build an objective benchmarking framework with which we extensively rate and compare explainability methods. As a result, we show that the Grad-CAM++ algorithm outperforms all other methods. Finally, we identify the limits of existing explanation methods on datasets with feature values close to zero.
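
To illustrate the kind of adaptation the abstract refers to, the sketch below shows how Grad-CAM++ can be applied to a 1D convolutional network on univariate time series, yielding a per-time-step relevance map instead of the usual 2D saliency map. This is a minimal, hypothetical illustration, not the authors' implementation: the model (TinyCNN1d), layer sizes, and helper names are assumptions made for the example.

    # Hypothetical sketch: Grad-CAM++ for a 1D CNN on time series (PyTorch).
    # Not the paper's code; model and names are illustrative assumptions.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TinyCNN1d(nn.Module):
        """Small 1D CNN classifier for univariate time series."""
        def __init__(self, n_classes=2):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(),
                nn.Conv1d(16, 32, kernel_size=7, padding=3), nn.ReLU(),
            )
            self.pool = nn.AdaptiveAvgPool1d(1)
            self.fc = nn.Linear(32, n_classes)

        def forward(self, x):                 # x: [B, 1, T]
            a = self.features(x)              # last conv activations: [B, 32, T]
            return self.fc(self.pool(a).squeeze(-1)), a

    def grad_cam_plus_plus(model, x, target_class):
        """Return a per-time-step relevance map in [0, 1] for target_class."""
        model.eval()
        logits, acts = model(x)
        acts.retain_grad()                             # keep dS/dA after backward
        logits[:, target_class].sum().backward()
        grads = acts.grad                              # [B, K, T]
        g2, g3 = grads ** 2, grads ** 3
        sum_acts = acts.sum(dim=-1, keepdim=True)      # channel-wise sum over time
        alpha = g2 / (2 * g2 + sum_acts * g3 + 1e-8)   # Grad-CAM++ weighting factors
        weights = (alpha * F.relu(grads)).sum(dim=-1)  # per-channel importance [B, K]
        cam = F.relu((weights.unsqueeze(-1) * acts).sum(dim=1))  # [B, T']
        # Resample to input length and normalise for visualisation.
        cam = F.interpolate(cam.unsqueeze(1), size=x.shape[-1],
                            mode="linear", align_corners=False).squeeze(1)
        return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)

    # Usage sketch on a random signal of length 256
    model = TinyCNN1d()
    x = torch.randn(1, 1, 256)
    relevance = grad_cam_plus_plus(model, x, target_class=1)  # shape [1, 256]

The relevance vector can then be overlaid on the input signal to highlight the time steps that contributed most to the predicted class, which is the form of explanation the benchmarking framework compares across methods.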