Security and Privacy in Communication Networks. 18th EAI International Conference, SecureComm 2022, Virtual Event, October 2022, Proceedings

Research Article

Language and Platform Independent Attribution of Heterogeneous Code

Cite
BibTeX Plain Text
  • @INPROCEEDINGS{10.1007/978-3-031-25538-0_10,
        author={Farzaneh Abazari and Enrico Branca and Evgeniya Novikova and Natalia Stakhanova},
        title={Language and Platform Independent Attribution of Heterogeneous Code},
        booktitle={Security and Privacy in Communication Networks. 18th EAI International Conference, SecureComm 2022, Virtual Event, October 2022, Proceedings},
        proceedings_a={SECURECOMM},
        year={2023},
        month={2},
        keywords={Source code and Binary attribution Authorship attribution},
        doi={10.1007/978-3-031-25538-0_10}
    }
    
  • Farzaneh Abazari
    Enrico Branca
    Evgeniya Novikova
    Natalia Stakhanova
    Year: 2023
    Language and Platform Independent Attribution of Heterogeneous Code
    SECURECOMM
    Springer
    DOI: 10.1007/978-3-031-25538-0_10
Farzaneh Abazari, Enrico Branca, Evgeniya Novikova, Natalia Stakhanova*
    *Contact email: natalia@usask.ca

    Abstract

    Code authorship attribution aims to identify the author of source or binary code according to the author’s unique coding style characteristics. Recently, researchers have attempted to develop cross-platform and language-oblivious attribution approaches. Most of these attempts were limited to small sets of two or three languages or a few platforms. However, the rapid development of cross-platform malware and the general diversity of languages, platforms, and architectures raise concerns about the suitability of these techniques. In this paper, we propose a unified approach that supports attribution of code irrespective of its format. Our approach leverages an image-based code abstraction that preserves the developer’s coding style and lends itself to spatial analysis that reflects hidden patterns. We validate our approach on a set of Android applications, achieving 82.8%–100% accuracy with source and byte code. We further explore the robustness of our approach in attributing developers’ code written in 27 programming languages, compiled on 14 instruction set architecture types and 18 intermediate compiled versions. Our results on the GitHub dataset show that in the worst-case scenario the proposed approach can discriminate authors of code in heterogeneous formats with at least 68% accuracy.

    Keywords
    Source code and Binary attribution Authorship attribution
    Published
    2023-02-04
    Appears in
    SpringerLink
    http://dx.doi.org/10.1007/978-3-031-25538-0_10
    Copyright © 2022–2025 ICST