
Research Article
SEMEO: A Semantic Equivalence Analysis Framework for Obfuscated Android Applications
@INPROCEEDINGS{10.1007/978-3-030-94822-1_18,
    author={Zhen Hu and Bruno Vieira Resende E. Silva and Hamid Bagheri and Witawas Srisa-an and Gregg Rothermel and Jackson Dinh},
    title={SEMEO: A Semantic Equivalence Analysis Framework for Obfuscated Android Applications},
    proceedings={Mobile and Ubiquitous Systems: Computing, Networking and Services. 18th EAI International Conference, MobiQuitous 2021, Virtual Event, November 8-11, 2021, Proceedings},
    proceedings_a={MOBIQUITOUS},
    year={2022},
    month={2},
    keywords={Malware Android Security},
    doi={10.1007/978-3-030-94822-1_18}
}
Zhen Hu
Bruno Vieira Resende E. Silva
Hamid Bagheri
Witawas Srisa-an
Gregg Rothermel
Jackson Dinh
Year: 2022
MOBIQUITOUS
Springer
DOI: 10.1007/978-3-030-94822-1_18
Abstract
Software repackaging is a common approach for creating malware. Malware authors often use software repackaging to obfuscate code containing malicious payloads. This forces analysts to spend a large amount of time filtering out benign obfuscated methods in order to locate potentially malicious methods for further analysis. If an effective mechanism for filtering out benign obfuscated methods were available, the number of methods that analysts must consider could be reduced, allowing them to be more productive. In this paper, we present Semeo, an obfuscation-resilient approach for semantic equivalence analysis of Android apps. Semeo automatically and with high accuracy determines whether a repackaged and obfuscated version of a method is semantically equivalent to an original version thereof. Semeo further handles widely-used and complicated types of obfuscations, as well as the scenarios where multiple obfuscation types are applied in tandem. Our empirical evaluation corroborates that Semeo significantly outperforms the state-of-the-art, achieving 100% precision in identifying semantically equivalent methods across almost all apps under analysis. Semeo consistently provides over 80% recall when one or two types of obfuscation are used and 73% recall when five different types of obfuscation are compositely applied.
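For readers unfamiliar with the metrics quoted in the abstract, the following minimal sketch shows how precision and recall are conventionally computed for a binary "semantically equivalent or not" decision over method pairs. The function name `precision_recall` and the counts are hypothetical illustrations and are not taken from the paper or from Semeo's implementation.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) for a binary equivalence decision.

    true_positives:  pairs reported equivalent that really are equivalent
    false_positives: pairs reported equivalent that are not
    false_negatives: equivalent pairs that were missed
    """
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    # Guard against division by zero when nothing was predicted or present.
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical counts chosen only for illustration (not the paper's data).
precision, recall = precision_recall(
    true_positives=80, false_positives=0, false_negatives=20
)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

With these made-up counts, no pair is wrongly reported as equivalent, so precision is 100%, while 20 of the 100 truly equivalent pairs are missed, so recall is 80%.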