Digital Forensics and Cyber Crime. 10th International EAI Conference, ICDF2C 2018, New Orleans, LA, USA, September 10–12, 2018, Proceedings

Research Article

Associating Drives Based on Their Artifact and Metadata Distributions

  • @INPROCEEDINGS{10.1007/978-3-030-05487-8_9,
        author={Neil Rowe},
        title={Associating Drives Based on Their Artifact and Metadata Distributions},
        proceedings={Digital Forensics and Cyber Crime. 10th International EAI Conference, ICDF2C 2018, New Orleans, LA, USA, September 10--12, 2018, Proceedings},
        proceedings_a={ICDF2C},
        year={2019},
        month={1},
        keywords={Drives Forensics Link analysis Similarity Divergence Artifacts Metadata},
        doi={10.1007/978-3-030-05487-8_9}
    }
    
  • Neil Rowe
    Year: 2019
    Associating Drives Based on Their Artifact and Metadata Distributions
    ICDF2C
    Springer
    DOI: 10.1007/978-3-030-05487-8_9
Neil Rowe1,*
  • 1: U.S. Naval Postgraduate School
*Contact email: ncrowe@nps.edu

Abstract

Associations between drive images can be important in many forensic investigations, particularly those involving organizations, conspiracies, or contraband. This work investigated metrics for comparing drives based on the distributions of 18 types of clues. The clues were email addresses, phone numbers, personal names, street addresses, possible bank-card numbers, GPS data, files in zip archives, files in rar archives, IP addresses, keyword searches, hash values on files, words in file names, words in file names of Web sites, file extensions, immediate directories of files, file sizes, weeks of file creation times, and minutes within weeks of file creation. Using a large corpus of drives, we computed distributions of document association using the cosine similarity TF/IDF formula and Kullback-Leibler divergence formula. We provide significance criteria for similarity based on our tests that are well above those obtained from random distributions. We also compared similarity and divergence values, investigated the benefits of filtering and sampling the data before measuring association, examined the similarities of the same drive at different times, and developed useful visualization techniques for the associations.
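
To make the two association measures named in the abstract concrete, the following is a minimal sketch of a TF/IDF-weighted cosine similarity and a Kullback-Leibler divergence between the clue distributions of two drives. It is not the paper's implementation: the log-scaled term weighting, the additive smoothing constant `eps`, the function names, and the toy email-address counts are all assumptions chosen for illustration.

```python
import math
from collections import Counter

def tfidf_weights(counts, doc_freq, n_drives):
    """Weight each clue by log-scaled count times inverse drive frequency.
    (Assumed weighting; the paper may use a different TF/IDF variant.)"""
    return {
        clue: math.log(1 + c) * math.log(n_drives / doc_freq[clue])
        for clue, c in counts.items()
    }

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse weight vectors (dicts)."""
    dot = sum(w * b[clue] for clue, w in a.items() if clue in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def kl_divergence(p_counts, q_counts, eps=1e-9):
    """KL divergence D(P || Q) between two clue-count distributions.
    Additive smoothing (eps, an assumption) avoids division by zero
    for clues that occur on only one of the two drives."""
    clues = set(p_counts) | set(q_counts)
    p_total = sum(p_counts.values()) + eps * len(clues)
    q_total = sum(q_counts.values()) + eps * len(clues)
    divergence = 0.0
    for clue in clues:
        p = (p_counts.get(clue, 0) + eps) / p_total
        q = (q_counts.get(clue, 0) + eps) / q_total
        divergence += p * math.log(p / q)
    return divergence

# Hypothetical email-address counts found on three drives.
drives = {
    "A": Counter({"alice@example.com": 5, "bob@example.com": 2}),
    "B": Counter({"alice@example.com": 3, "carol@example.com": 4}),
    "C": Counter({"dave@example.com": 1}),
}

# Number of drives on which each clue appears.
doc_freq = Counter(clue for counts in drives.values() for clue in counts)

weights = {
    name: tfidf_weights(counts, doc_freq, len(drives))
    for name, counts in drives.items()
}

print("similarity A-B:", cosine_similarity(weights["A"], weights["B"]))
print("similarity A-C:", cosine_similarity(weights["A"], weights["C"]))
print("divergence A||B:", kl_divergence(drives["A"], drives["B"]))
```

In this sketch a higher cosine similarity and a lower divergence both suggest a stronger association. Note that the divergence is asymmetric, so `kl_divergence(p, q)` and `kl_divergence(q, p)` generally differ.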