Multi-relational Instruction Association Graph for Cross-Architecture Binary Similarity Comparison

Qige Song; Yongzheng Zhang; Shuhao Li

Security and Privacy in Communication Networks. 18th EAI International Conference, SecureComm 2022, Virtual Event, October 2022, Proceedings

Research Article


Cite (BibTeX and plain text):

@INPROCEEDINGS{10.1007/978-3-031-25538-0_11,
    author={Qige Song and Yongzheng Zhang and Shuhao Li},
    title={Multi-relational Instruction Association Graph for Cross-Architecture Binary Similarity Comparison},
    proceedings={Security and Privacy in Communication Networks. 18th EAI International Conference, SecureComm 2022, Virtual Event, October 2022, Proceedings},
    proceedings_a={SECURECOMM},
    year={2023},
    month={2},
    keywords={Cross-architecture binary similarity comparison, IoT malware defense, instruction association graph, Relational graph convolutional network},
    doi={10.1007/978-3-031-25538-0_11}
}

Qige Song, Yongzheng Zhang, Shuhao Li. Multi-relational Instruction Association Graph for Cross-Architecture Binary Similarity Comparison. SECURECOMM. Springer, 2023. DOI: 10.1007/978-3-031-25538-0_11

Qige Song¹, Yongzheng Zhang², Shuhao Li¹,*

1: Institute of Information Engineering
2: China Assets Cybersecurity Technology CO.

*Contact email: lishuhao@iie.ac.cn

Abstract

Cross-architecture binary similarity comparison is essential in many security applications. Recently, researchers have proposed learning-based approaches to improve comparison performance. These approaches adopt a paradigm of instruction pre-training, individual binary encoding, and distance-based similarity comparison. However, instruction embeddings pre-trained on an external code corpus do not generalize across diverse real-world applications. Moreover, encoding cross-architecture binaries separately accumulates the semantic gap between instruction sets, which limits comparison accuracy. This paper proposes a novel cross-architecture binary similarity comparison approach based on a multi-relational instruction association graph. We associate mono-architecture instruction tokens through context relevance, and cross-architecture tokens through potential semantic correlations observed from different perspectives. We then exploit a relational graph convolutional network (R-GCN) to perform type-specific graph information propagation. Our approach bridges the gap between cross-architecture instruction representation spaces while avoiding the external pre-training workload. We conduct extensive experiments on basic block-level and function-level datasets to demonstrate the superiority of our approach. Furthermore, evaluations on a large-scale collection of reused functions from real-world IoT malware show that our approach is valuable for identifying malware propagated on IoT devices of various architectures.

Keywords: Cross-architecture binary similarity comparison, IoT malware defense, instruction association graph, Relational graph convolutional network
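
The type-specific propagation mentioned in the abstract can be illustrated with a minimal sketch of one generic R-GCN layer, in which each relation type has its own weight matrix and messages are averaged per relation before a ReLU. This is not the authors' implementation: the token names, the two relation labels ("context" for mono-architecture links, "cross" for cross-architecture links), the feature values, and the weight matrices are all hypothetical and chosen only to keep the example small.

```python
from collections import defaultdict


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rgcn_layer(features, edges, rel_weights, self_weight):
    """One generic R-GCN propagation step.

    features:    dict node -> feature vector (list of floats)
    edges:       list of (src, relation, dst); messages flow src -> dst
    rel_weights: dict relation -> weight matrix for that relation type
    self_weight: weight matrix applied to the node's own features
    """
    # Group incoming neighbours by (destination node, relation type).
    incoming = defaultdict(list)
    for src, rel, dst in edges:
        incoming[(dst, rel)].append(src)

    out = {}
    for node, h in features.items():
        acc = matvec(self_weight, h)  # self-loop term
        for rel, W in rel_weights.items():
            nbrs = incoming.get((node, rel), [])
            for src in nbrs:
                msg = matvec(W, features[src])
                # Normalise by the number of neighbours under this relation.
                acc = [a + m / len(nbrs) for a, m in zip(acc, msg)]
        out[node] = [max(0.0, a) for a in acc]  # ReLU
    return out


# Hypothetical instruction tokens from two architectures.
features = {
    "x86:mov": [1.0, 0.0],
    "x86:add": [0.0, 1.0],
    "arm:MOV": [1.0, 1.0],
    "arm:ADD": [2.0, 0.0],
}
# Undirected links are written as two directed edges.
edges = [
    ("x86:mov", "context", "x86:add"), ("x86:add", "context", "x86:mov"),
    ("arm:MOV", "context", "arm:ADD"), ("arm:ADD", "context", "arm:MOV"),
    ("x86:mov", "cross", "arm:MOV"), ("arm:MOV", "cross", "x86:mov"),
]
rel_weights = {
    "context": [[0.5, 0.0], [0.0, 0.5]],  # assumed values
    "cross": [[0.0, 1.0], [1.0, 0.0]],    # assumed values
}
self_weight = [[1.0, 0.0], [0.0, 1.0]]

out = rgcn_layer(features, edges, rel_weights, self_weight)
for token, vec in out.items():
    print(token, vec)
```

In this toy graph, "x86:mov" receives one message through the context relation and one through the cross relation, each transformed by a different matrix, which is the sense in which propagation is type-specific.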

Published: 2023-02-04
Appears in: SpringerLink

DOI link: http://dx.doi.org/10.1007/978-3-031-25538-0_11
