Decoding HDF5: Machine Learning File Forensics and Data Injection

Clinton Walker; Ibrahim Baggili; Hao Wang

Digital Forensics and Cyber Crime. 14th EAI International Conference, ICDF2C 2023, New York City, NY, USA, November 30, 2023, Proceedings, Part I

Research Article

Cite (BibTeX):

@INPROCEEDINGS{10.1007/978-3-031-56580-9_12,
    author={Clinton Walker and Ibrahim Baggili and Hao Wang},
    title={Decoding HDF5: Machine Learning File Forensics and Data Injection},
    proceedings={Digital Forensics and Cyber Crime. 14th EAI International Conference, ICDF2C 2023, New York City, NY, USA, November 30, 2023, Proceedings, Part I},
    proceedings_a={ICDF2C},
    year={2024},
    month={4},
    keywords={File Forensics, Machine Learning, HDF5, TensorFlow 2},
    doi={10.1007/978-3-031-56580-9_12}
}

Cite (plain text):

Clinton Walker, Ibrahim Baggili, Hao Wang. Decoding HDF5: Machine Learning File Forensics and Data Injection. ICDF2C. Springer, 2024. DOI: 10.1007/978-3-031-56580-9_12

Clinton Walker¹,*, Ibrahim Baggili¹, Hao Wang²

1: Baggil(i) Truth (BiT) Lab, Center of Computation and Technology
2: Division of Computer Science and Engineering, Louisiana State University

*Contact email: cwal117@lsu.edu

Abstract

The prevalence of Machine Learning (ML) in computing is rapidly expanding, and ML systems are continuously applied to novel challenges. As the adoption of these systems grows, their security becomes increasingly important. Any security vulnerability within an ML system can jeopardize the integrity of dependent and related systems. Modern ML systems commonly encapsulate trained models in a compact format for storage and distribution; TensorFlow 2 (TF2), for example, uses the Hierarchical Data Format 5 (HDF5) file format. This work explores the security implications of TF2's use of the HDF5 format to save trained models, aiming to uncover potential weaknesses via forensic analysis. Specifically, we investigate the injection and detection of foreign data in these packaged files using a custom tool external to TF2, leading to the development of a dedicated forensic analysis tool for TF2's HDF5 model files.
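To make the idea of detecting foreign data in an HDF5 file concrete, the following is a minimal sketch of one simple check. It is not the authors' tool. It reads the end-of-file address stored in the HDF5 superblock and compares it with the actual file size, so bytes appended after the HDF5 data show up as a positive difference. The function names are hypothetical, and the sketch assumes the superblock sits at offset 0 with a base address of 0 (no user block). It would not catch data injected inside the region the file already accounts for.

```python
# Illustrative sketch only: flags bytes that lie beyond the end-of-file
# address recorded in the HDF5 superblock. Assumes the superblock is at
# offset 0 and the base address is 0 (no user block).

HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def stored_eof_address(data: bytes) -> int:
    """Return the end-of-file address recorded in the superblock."""
    if len(data) < 16 or data[:8] != HDF5_SIGNATURE:
        raise ValueError("no HDF5 signature at offset 0")

    version = data[8]
    if version in (0, 1):
        # Versions 0 and 1: "size of offsets" is the byte at offset 13.
        # Fixed-width fields end at byte 24 (v0) or byte 28 (v1).
        size_of_offsets = data[13]
        first_address = 24 if version == 0 else 28
    elif version in (2, 3):
        # Versions 2 and 3: "size of offsets" is the byte at offset 9,
        # and the address fields start at byte 12.
        size_of_offsets = data[9]
        first_address = 12
    else:
        raise ValueError(f"unsupported superblock version {version}")

    # In all four versions the end-of-file address is the third
    # address field, after the base address and one other address.
    start = first_address + 2 * size_of_offsets
    field = data[start:start + size_of_offsets]
    if len(field) != size_of_offsets:
        raise ValueError("truncated superblock")
    return int.from_bytes(field, "little")


def bytes_past_eof(data: bytes) -> int:
    """Positive: extra trailing bytes. Negative: file looks truncated."""
    return len(data) - stored_eof_address(data)
```

As a usage example, one could read a saved model with `open("model.h5", "rb").read()` (the file name is a placeholder) and pass the bytes to `bytes_past_eof`. A result of 0 means the file size matches the recorded end-of-file address under the stated assumptions.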

Keywords: File Forensics, Machine Learning, HDF5, TensorFlow 2

Published: 2024-04-03
Appears in: SpringerLink

DOI link: http://dx.doi.org/10.1007/978-3-031-56580-9_12
