7th IEEE International Workshop on Trusted Collaboration

Research Article

Similarity Analysis of Shellcodes in Drive-by Download Attack Kits

  • @INPROCEEDINGS{10.4108/icst.collaboratecom.2012.250507,
        author={Manoj Cherukuri and Srinivas Mukkamala and Dongwan Shin},
        title={Similarity Analysis of Shellcodes in Drive-by Download Attack Kits},
        booktitle={7th IEEE International Workshop on Trusted Collaboration},
        publisher={IEEE},
        proceedings_a={TRUSTCOL},
        year={2012},
        month={12},
        keywords={cloud services security shellcodes similarity web malware collaboration frameworks security},
        doi={10.4108/icst.collaboratecom.2012.250507}
    }
    
  • Manoj Cherukuri
    Srinivas Mukkamala
    Dongwan Shin
    Year: 2012
    Similarity Analysis of Shellcodes in Drive-by Download Attack Kits
    TRUSTCOL
    ICST
    DOI: 10.4108/icst.collaboratecom.2012.250507
Manoj Cherukuri1,*, Srinivas Mukkamala2, Dongwan Shin2
  • 1: New Mexico Institute of Mining and Technology
  • 2: New Mexico Institute of Science and Technology
*Contact email: manoj@cs.nmt.edu

Abstract

Drive-by downloads have become the primary attack vehicle for malware distribution in recent years. With the rise of targeted attacks, vulnerabilities in cloud-based services and web-based collaboration frameworks may become the principal targets for hosting drive-by download attacks. In this paper, we study the similarity of shellcodes across different attack kits. Shellcode is the malicious code used as the payload in drive-by download attacks. Specifically, we collected 15 different drive-by download attack kits and identified the shellcodes used in each kit. Because shellcodes are transmitted to the browser as JavaScript strings, we measured the similarity between regular strings and shellcodes defined in JavaScript. We disassembled the shellcodes and computed the mean of the Cosine Similarity, Extended Jaccard Similarity, and Pearson Correlation measures based on opcode frequencies. Our analysis shows that the shellcodes used as payloads across different attack kits were similar to one another and dissimilar to benign JavaScript strings. We also observed that some attack kits released in different years used the same shellcodes. Compared with an emulation-based approach, similarity analysis reduced the analysis time by 75%. Based on these results, a shellcode similarity measure could be an effective static mechanism for detecting shellcode-based drive-by download attacks.
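
The combined measure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names and the sample opcode lists are hypothetical, the opcode sequences are assumed to have already been produced by a disassembler, and the raw Pearson value (which can be negative) is averaged as-is because the abstract does not say whether it was rescaled.

```python
import math
from collections import Counter


def frequency_vectors(ops_a, ops_b):
    """Build aligned opcode-frequency vectors over the union of both opcode sets."""
    count_a, count_b = Counter(ops_a), Counter(ops_b)
    vocab = sorted(set(count_a) | set(count_b))
    return [count_a[o] for o in vocab], [count_b[o] for o in vocab]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def cosine(x, y):
    denom = math.sqrt(dot(x, x)) * math.sqrt(dot(y, y))
    return dot(x, y) / denom if denom else 0.0


def extended_jaccard(x, y):
    # Extended (Tanimoto) Jaccard: x.y / (|x|^2 + |y|^2 - x.y)
    denom = dot(x, x) + dot(y, y) - dot(x, y)
    return dot(x, y) / denom if denom else 0.0


def pearson(x, y):
    n = len(x)
    if n == 0:
        return 0.0
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = math.sqrt(dot(dx, dx)) * math.sqrt(dot(dy, dy))
    # A constant vector has zero variance; treat the correlation as 0 (assumption).
    return dot(dx, dy) / denom if denom else 0.0


def mean_similarity(ops_a, ops_b):
    """Mean of the three measures over opcode-frequency vectors."""
    x, y = frequency_vectors(ops_a, ops_b)
    return (cosine(x, y) + extended_jaccard(x, y) + pearson(x, y)) / 3


if __name__ == "__main__":
    # Hypothetical opcode sequences, for illustration only.
    kit_a = ["mov", "xor", "mov", "push", "call"]
    kit_b = ["mov", "xor", "push", "push", "call"]
    print(mean_similarity(kit_a, kit_b))
```

A score close to 1 indicates near-identical opcode distributions. Any decision threshold for flagging a JavaScript string as shellcode would have to be calibrated on real data and is not specified here.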