
Research Article
MalEfficient10%: A Novel Feature Reduction Approach for Android Malware Detection
@INPROCEEDINGS{10.1007/978-3-031-40467-2_5,
  author={Hemant Rathore and Ajay Kharat and Rashmi T and Adithya Manickavasakam and Sanjay K. Sahay and Mohit Sewak},
  title={MalEfficient10\%: A Novel Feature Reduction Approach for Android Malware Detection},
  proceedings={Broadband Communications, Networks, and Systems. 13th EAI International Conference, BROADNETS 2022, Virtual Event, March 12-13, 2023 Proceedings},
  proceedings_a={BROADNETS},
  year={2023},
  month={7},
  keywords={Android Feature Selection Machine Learning Malware Detection Static Analysis},
  doi={10.1007/978-3-031-40467-2_5}
}
Hemant Rathore
Ajay Kharat
Rashmi T
Adithya Manickavasakam
Sanjay K. Sahay
Mohit Sewak
Year: 2023
BROADNETS
Springer
DOI: 10.1007/978-3-031-40467-2_5
Abstract
The Android OS has recently gained immense popularity among smartphone users. It has also attracted many malware developers, leading to countless malicious applications in the ecosystem. Many recent reports suggest that conventional signature-based malware detection fails to protect Android smartphones from new and sophisticated malware attacks. Therefore, researchers are exploring machine learning-based malware detection systems that can discriminate between malware and benign applications effectively and efficiently. Existing literature suggests that many machine learning-based models use large feature sets for malware detection. However, classification models based on a large number of features are computationally expensive, time-consuming, and generalize poorly. Therefore, this paper proposes a reliable feature reduction approach that selects the most prominent features for effective and efficient malware detection. The proposed approach is tested on two different datasets, three distinct feature types, and twenty-six unique classifiers. The twenty-six baseline malware detection models, based on 724 features and thirteen classification algorithms, achieved an average accuracy and average AUC of 94.73% and 94.49%, respectively. We then performed feature reduction that works with both mutually exclusive and merged feature spaces of Android permissions, intents, and opcodes. The proposed feature reduction approach reduced the number of features from 724 to 72 (10% of the original features). We also list the reduced set of 72 features, comprising Android permissions, intents, and opcodes, used for malware detection. The twenty-six malware detection models built on the reduced features achieved an average accuracy and average AUC of 93.12% and 92.97%, respectively. The feature reduction thus lowers average accuracy and AUC by less than 2%. However, it reduces average test time and average training time by 85.25% and 91.45%, respectively, across the twenty-six Android malware detection models. Feature reduction therefore causes only a slight loss in effectiveness while yielding far more time-efficient malware detection models.
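The headline figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers reported above. The function names are illustrative and do not come from the paper, and the speedup factor assumes that a time reduction of r% leaves (100 - r)% of the original running time.

```python
# Figures reported in the abstract (all scores are percentages).
ORIGINAL_FEATURES = 724
REDUCED_FEATURES = 72

baseline = {"accuracy": 94.73, "auc": 94.49}
reduced = {"accuracy": 93.12, "auc": 92.97}
time_reduction = {"test": 85.25, "training": 91.45}


def retained_fraction(reduced_count: int, original_count: int) -> float:
    """Share of the original feature set that is kept, in percent."""
    return 100.0 * reduced_count / original_count


def absolute_drop(before: float, after: float) -> float:
    """Drop between two percentage scores, in percentage points."""
    return before - after


def speedup(reduction_percent: float) -> float:
    """Speedup factor implied by a given percentage reduction in time.

    Assumption: a reduction of r% leaves (100 - r)% of the original time.
    """
    return 100.0 / (100.0 - reduction_percent)


kept = retained_fraction(REDUCED_FEATURES, ORIGINAL_FEATURES)
drops = {metric: absolute_drop(baseline[metric], reduced[metric]) for metric in baseline}
speedups = {phase: speedup(r) for phase, r in time_reduction.items()}

print(f"features kept: {kept:.2f}% of the original set")
for metric, drop in drops.items():
    print(f"{metric} drop: {drop:.2f} percentage points")
for phase, factor in speedups.items():
    print(f"{phase} speedup: about {factor:.1f}x")
```

Under these assumptions, both drops stay below two percentage points, which matches the "less than 2%" claim when it is read as an absolute difference.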