CLETer: A Character-level Evasion Technique Against Deep Learning DGA Classifiers

The detection of pseudo-random domain names generated by Domain Generation Algorithms (DGAs) is one of the effective ways to find botnets. Studying the vulnerability of deep learning models to adversarial attacks can enhance the robustness of the DGA detection mechanism. This paper proposes CLETer, an improved DGA that provides a character-level evasion technique against state-of-the-art DGA classifiers. Based on existing DGA domain names, CLETer can intelligently generate adversarial examples by quantifying the influence of every character on the classification result and then changing the important characters. These improved domain names can easily evade detection and show good transferability. The experimental results demonstrate that when modifying only two characters, CLETer effectively lowers the LSTM classifier's recall from 99.76% to 1.29% and drops the CNN classifier's recall from 99.36% to 3.64%. We also show that adversarial retraining is a viable defense strategy against CLETer.


Introduction
Malware has evolved into one of the greatest threats to cybersecurity. As one of the most sophisticated forms of modern malware, a botnet is a group of Internet-connected, malware-infected devices (bots) that receive operational instructions from command and control (C&C) servers directly controlled by attackers (botmasters). Botnets are used to conduct harmful cyber-attacks such as distributed denial-of-service [1], spam [2], information theft [3], etc. The implementation of these attacks depends on the information exchange between the botmaster and the bots, which is not only the core of botnet construction but also the key to the attack-defense game.
To evade detection and make botnets more robust, domain-flux is applied to establish a communication channel between bots and the C&C server. Domain-flux binds multiple domain names generated by domain generation algorithms (DGAs) to the IP address of the C&C server, with the botmaster and bots sharing the same DGA. DGAs can dynamically produce a large number of candidate domain names from a given seed, such as the current date/time [4], a trending topic, a random number, or a dictionary. The possible seeds act as shared keys that let the botmaster and infected bots generate the same domain list. Only a few of the candidate domain names are actually registered for subsequent attacks, making them harder to detect than traditional hard-coded malicious domain names. The detection of DGAs is therefore an important way to prevent botnet attacks.
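As an illustration of how such seed-based generation works, consider the following minimal Python sketch; it is a generic example, not the algorithm of any real malware family.

```python
import hashlib
from datetime import date

def generate_domains(seed: str, day: date, count: int = 10) -> list[str]:
    """Derive a deterministic daily domain list from a shared seed and date."""
    domains = []
    for i in range(count):
        data = f"{seed}-{day.isoformat()}-{i}".encode()
        label = hashlib.md5(data).hexdigest()[:12]  # pseudo-random SLD label
        domains.append(label + ".com")
    return domains

# A botmaster and its bots running the same code with the same seed obtain
# the same candidate list, e.g. generate_domains("k3y", date.today()).
```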
In the field of DGA detection, deep learning has been widely applied and has achieved productive results. At the same time, its potential security problems have gradually attracted researchers' attention. From the perspective of botnet operators, it is beneficial to generate DGA domain names that can evade security detection, so it is of great significance to research the generation of adversarial examples against deep learning models for DGA detection. Recent papers propose some novel approaches. DeepDGA [5] makes use of Generative Adversarial Networks (GANs) to generate pseudo-random domain names that are more difficult to distinguish from benign ones. Charbot [6] generates adversarial domain names entirely in a black-box setting and only needs to replace two random characters of existing benign domain names. While both approaches can decrease the accuracy of DGA classifiers, the generated adversarial examples are not tailored to the targeted model. Deception_DGA [31] utilizes knowledge of manually engineered features to construct a new DGA, which renders the random forest classifier powerless but works weakly on deep learning classifiers. MaskDGA [8] is a black-box attack method that generates adversarial examples by training a substitute model and applying the Jacobian-based saliency map; however, the substitute model must include a one-hot encoding layer to meet the requirements of the subsequent calculation. Taken together, our primary goal is to carry out the adversarial attack in a black-box manner, explore the targeted model's vulnerability without depending on knowledge of the model, and generate adversarial domain names that evade deep learning DGA classifiers.
In this paper, we introduce CLETer, a character-level evasion technique to "fool" deep learning-based DGA classification models. It applies to the black-box attack scenario, where the attacker does not know the details of the targeted model (including its network structure, parameters, or training data), but can query the targeted model and observe the output classification probability and label.
CLETer intelligently generates adversarial domain names by modifying existing DGA domain names in two steps: (1) evaluate every character's influence on the classification result of the targeted model, and (2) perform character-level transformations on the important characters. If a DGA domain name is initially detected as malicious but is finally classified as benign, the adversarial example is considered to be constructed successfully.
To evaluate the effectiveness of CLETer, we choose two character-level deep learning models as our targets and successfully apply CLETer to both known and unknown DGAs. The adversarial domain names significantly reduce the recall of the two targeted models to below 10% when no more than three characters are transformed. To defend against CLETer's attack, we adopt adversarial retraining as a countermeasure, which notably improves the robustness of the targeted models.
Our main contributions are as follows:
• We propose CLETer, a character-level evasion technique that generates adversarial domain names in a black-box manner by quantifying each character's influence on the classification result.
• We reveal the vulnerability of state-of-the-art DGA classifiers to simple adversarial attacks such as CLETer and provide a noteworthy perspective for enhancing the robustness of the DGA detection mechanism.
• We verify that adversarial retraining is an effective defense strategy against our method.

The Detection of DGAs
The detection of DGA domain names generally includes DNS traffic-based detection and text-based detection. However, obtaining contextual information from network traffic is costly, making it difficult to use for real-time monitoring and prevention. The randomness introduced in the construction of DGA domain names makes them significantly different from legitimate ones, so detection based on the domain name itself has become the mainstream method [9][10][11][12]. It can be divided into two types. One is machine learning based on feature extraction [13], which mainly exploits the obvious differences in character distribution between legitimate and DGA domain names. These manually defined features include string length, entropy, vowel/consonant ratio, N-gram statistics, etc. However, feature extraction is time-consuming, and these features are easily circumvented, which results in high false positives and poor performance in detecting new DGA domain names. The other popular practice is deep learning without manual feature extraction. Compared with traditional machine learning, deep learning [14] shows great superiority in automatically extracting features from mass data, and remarkable achievements have been made in DGA detection [15][16][17][18]. The most commonly used deep neural networks are LSTM networks and CNNs. LSTM [19,20] adds state information to the general RNN, enabling it to learn long-term dependencies and capture inter-character sequential relationships. The CNN is a deep neural network whose key structure is the convolution kernel; for the detection of DGA domain names, a CNN can model the domain at the character level and then extract the spatial structure features of DGA domain names through convolution operations.

The Adversarial Attack
Recent studies have shown that some deep learning models are vulnerable to adversarial attacks [21], that is, subtle perturbations to inputs will lead detection models to produce incorrect outputs. These artificially constructed instances, created by perturbing original inputs, are adversarial examples [22]. Adversarial attacks target the integrity of the Artificial Intelligence (AI) model and are typically divided into two types: poisoning attacks [23,24] against the data during the training phase, and evasion attacks [25] against the model during the test phase.
An evasion attack is one in which the attacker constructs specific input samples to fool the targeted system without changing the system itself. It can be classified in several ways. By application scenario, it is categorized into white-box attacks [26] and black-box attacks [27]. A white-box attack requires detailed information about the inside of the machine learning model, from which adversarial examples are computed directly. A black-box attack depends on the transferability [27,28] of adversarial examples; a substitute classifier can be trained to perform it [29]. From the perspective of attack results, evasion attacks are divided into untargeted and targeted attacks [30]; the difference is whether the category of the adversarial examples is restricted. An untargeted attack only aims for a successful misclassification without limiting the final category, whereas a targeted attack requires the generated adversarial examples to belong to a specific category. In this paper, the detection of DGA domain names is essentially binary classification, and the generated adversarial examples are misclassified as benign, so our attack is an untargeted attack.
Several studies have examined adversarial examples against deep learning-based DGA detection methods. Deception_DGA [31] is a novel DGA based on a white-box attack that can reduce the classification accuracy of a random forest classifier to 59.9%, relying on knowledge of the manually engineered features used by the FANCI system [13]. However, it works weakly on deep learning classifiers.
MaskDGA [8] is a black-box attack method to evade DGA detection. It first trains a simple CNN substitute model and then adds character-level perturbations to the input domain names based on the Jacobian-based saliency map. The resulting adversarial examples achieve a clear attack effect against four deep learning-based DGA classifiers.
With GANs being widely applied in cybersecurity, they are also gradually being used in the field of DGA classification. DeepDGA [5] is a generative model that produces pseudo-random domains that are more difficult to distinguish from benign domain names and can effectively evade detection by a random forest DGA classifier using manually selected features. Extending the training set with these generated domain names can enhance the robustness of the classification model. Furthermore, Khaos [32] was the first to use a Wasserstein Generative Adversarial Network (WGAN) to synthesize DGA domain names; it is easier to train than DeepDGA and performs more stably.
Compared with these new types of DGAs, Charbot [6] is a much simpler evasion attack against DGA detection. Charbot operates entirely in a black-box setting: it simply replaces two characters of a valid domain with characters drawn from the character distribution observed in DNS traffic, generating effective DGA domain names that fool state-of-the-art DGA classifiers.
Sharing these goals, we aim to evade detection with a simpler but still effective method. Thus we propose CLETer, an improved domain generation algorithm that achieves good evasion results through simple character-level perturbation alone. CLETer can be applied to any type of DGA, like putting an invisibility cloak on it to make it hard to find.

Method
In a black-box attack scenario, the targeted model's internal information is not available to us. When implementing CLETer, we only focus on the model's input-output relationship to explore its weaknesses. This is more applicable to real-world DGA detection.
Without changing the original network structure, CLETer can be applied as an extension module deployed in the real-time detection of DGAs (as presented in Figure 1). It requires the following main steps: (i) prepare a list of DGA domain names that seek to evade detection; (ii) determine the characters that have a significant influence on the classification result; and (iii) apply character-level transformations to them to evade the targeted deep learning model.
In this paper, we classify a domain name into two categories, where benign domain names are labeled as "0" and malicious DGA domain names are labeled as "1". A domain name and its category are denoted as a pair $(x, y)$, where $x = x_1 x_2 x_3 \dots x_n$ is an input character sequence and $y \in \{0, 1\}$. $x_i$ ($i \in [1, n]$) represents the character in the $i$-th position and is drawn from the valid character set $C$. In the character-level deep learning model, every valid character is a token treated as a categorical feature.
A deep learning-based DGA classification model is denoted as $F: X \to Y$, where $F$ is a function mapping an input domain name to its category. The classification result is generally a probability value. In this paper, if the probability value is above 0.5, we classify the domain name as malicious; otherwise, it is classified as benign.
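As a minimal Python sketch of this black-box interface (the names and types below are illustrative assumptions, not from the original implementation): the attacker only holds a `query` function that returns the malicious probability for a domain string.

```python
from typing import Callable

# The only capability assumed of the attacker: query the targeted model
# and observe the output probability that a domain is a DGA domain.
QueryFn = Callable[[str], float]

def classify(query: QueryFn, domain: str, threshold: float = 0.5) -> int:
    """Return 1 (malicious) if the probability exceeds the threshold, else 0 (benign)."""
    return 1 if query(domain) > threshold else 0
```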

Character Importance Ranking
In deep learning DGA classification models, characters at different positions and with different content have different influences on the classification result. We use a score to quantify the importance of every character: the higher the score, the greater the influence on the classification result, and the easier it is to mislead the DGA classifier by modifying that character.
For an input domain name $x = x_1 x_2 x_3 \dots x_n$, we can evaluate how each character $x_i$ affects the classification result, for example by removing the $i$-th character $x_i$ and observing the change. Inspired by DeepWordBug [33] and TEXTFOOLER [34], we define four kinds of scoring functions:
(i) Head Influence Score (HIS). The HIS quantifies a character's influence based on the preceding sequence. For the first character $x_1$, we define $\mathrm{HIS}(x_1) = 0$.
(ii) Tail Influence Score (TIS). The TIS quantifies a character's influence based on the subsequent sequence. For the last character $x_n$, we define $\mathrm{TIS}(x_n) = 0$.
(iii) Combined Influence Score (CIS). The CIS combines HIS and TIS, where $\lambda$ is a self-defined parameter used to adjust their relative weights.

(iv) OIS. The OIS quantifies a character's influence by the change in classification probability when that character is removed:

$\mathrm{OIS}(x_i) = F(x_1 \dots x_n) - F(x_1 \dots x_{i-1} x_{i+1} \dots x_n)$
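Stated explicitly, one plausible formalization of HIS, TIS, and CIS, inferred from the verbal definitions above and the corresponding scores in DeepWordBug [33], is:

$\mathrm{HIS}(x_i) = F(x_1 \dots x_i) - F(x_1 \dots x_{i-1})$
$\mathrm{TIS}(x_i) = F(x_i \dots x_n) - F(x_{i+1} \dots x_n)$
$\mathrm{CIS}(x_i) = \mathrm{HIS}(x_i) + \lambda \cdot \mathrm{TIS}(x_i)$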
We rank all the characters in descending order of their scores computed from the targeted model's outputs.
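A minimal Python sketch of these scoring functions and the ranking step, following the formalization above and assuming the black-box `query` function from the earlier sketch; all helper names are illustrative assumptions.

```python
def his(query, x: str, i: int) -> float:
    # Head Influence Score: effect of x[i] given the preceding prefix.
    return 0.0 if i == 0 else query(x[: i + 1]) - query(x[:i])

def tis(query, x: str, i: int) -> float:
    # Tail Influence Score: effect of x[i] given the subsequent suffix.
    return 0.0 if i == len(x) - 1 else query(x[i:]) - query(x[i + 1:])

def cis(query, x: str, i: int, lam: float = 1.0) -> float:
    # Combined Influence Score: HIS plus lambda-weighted TIS.
    return his(query, x, i) + lam * tis(query, x, i)

def ois(query, x: str, i: int) -> float:
    # Omission-based score: probability drop when x[i] is removed.
    return query(x) - query(x[:i] + x[i + 1:])

def rank_characters(score_fn, query, x: str) -> list[int]:
    # Positions sorted by descending importance score.
    return sorted(range(len(x)), key=lambda i: score_fn(query, x, i), reverse=True)
```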

Character-level Transformer
After calculating and ranking every character's score, the next step is to add character-level perturbations to construct adversarial examples. We do this primarily through a substitution operation.
Our perturbation mechanism is based on the score ranking: transforming a character with a higher score more easily confuses the targeted model into producing the wrong result. We first substitute the character with the maximum score with every other valid character in C and keep the result with the lowest classification probability; we then substitute the character with the second-largest score in the same way, and so on. A new domain name is obtained after every substitution, and the process stops once the new domain name is classified as benign. This new domain name is the adversarial example we want to obtain. We can also limit the maximum number of substitutions to m. Using the different scoring functions described in Section 3.1, the generation of adversarial examples through CLETer is summarized in Algorithm 1.

Algorithm 1 Adversarial Attack by CLETer
Require: DGA domain name x, targeted classifier F, valid character set C, character importance ranking L, maximum number of substitutions m
1: x_adv ← x
2: j ← 1
3: while F(x_adv) → y is malicious and j ≤ m do
4:   for each c in C do
5:     x′ ← Transform x_L[j] in x_adv to c
6:     if F(x′) < F(x_adv) then
7:       x_adv ← x′
8:   j ← j + 1
9: return x_adv

The transform process must follow the rules of legitimate domain names; for example, "-" cannot be used in the beginning or end position.
We consider an adversarial example to be generated successfully only if the classifier misclassifies it as benign.
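A minimal Python sketch of this greedy substitution procedure (Algorithm 1), reusing the hypothetical `query` and `rank_characters` helpers from earlier; the valid character set and the handling of failed attacks below are assumptions.

```python
VALID_CHARS = "abcdefghijklmnopqrstuvwxyz0123456789-"  # assumed valid set C

def cleter_attack(query, domain: str, score_fn, m: int = 5,
                  threshold: float = 0.5):
    """Greedily substitute up to m high-score characters until classified benign."""
    order = rank_characters(score_fn, query, domain)
    x_adv = domain
    for j in range(min(m, len(order))):
        if query(x_adv) <= threshold:
            return x_adv                       # classified benign: success
        pos = order[j]
        best, best_p = x_adv, query(x_adv)
        for c in VALID_CHARS:
            # legitimate-domain rule: '-' cannot begin or end the label
            if c == x_adv[pos] or (c == "-" and pos in (0, len(x_adv) - 1)):
                continue
            cand = x_adv[:pos] + c + x_adv[pos + 1:]
            p = query(cand)
            if p < best_p:                     # keep the lowest-probability result
                best, best_p = cand, p
        x_adv = best
    return x_adv if query(x_adv) <= threshold else None  # None: attack failed
```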

Experimental Setup
Datasets. To implement CLETer, we use open datasets from two sources:
• The Alexa one million domain names from alexa.com are used as the benign (negative) dataset.
• The malicious (positive) dataset is from the 360 DGA NetLab Open Project. It covers over 40 DGA families; each record contains the domain name, the family it belongs to, and its validation time.
This paper selects part of these data (see Table 1). The 25 DGA families are divided into two parts: 18 DGA families (each including over 1,000 domain names) are involved in training and used as detected data (*); the remaining 7 DGA families, not involved in training, are used as predicted data (+).
In the classification of domain names, only the key part needs to be extracted, so the first step is data preprocessing. Generally, a complete domain consists of two or more parts separated by dots (e.g. tsinghua.edu.cn). From right to left, these are the top-level domain (TLD, e.g. cn), the second-level domain (SLD, e.g. edu), the third-level domain (e.g. tsinghua), etc. Few DGA families generate third-level domains, so the domain names in this paper mainly refer to the SLD part; for domain names that do have a third-level domain, we pick the third-level part instead. Some examples are shown in Table 2.

Character-level LSTM Model.
The first targeted model is an LSTM-based DGA classifier. The embedding layer maps every character of the input domain name to a vector of dimension d; in our model, we choose d = 128. Then, the LSTM layer serves essentially as a feature extractor. For DGA binary classification, a sigmoid is applied in the logistic regression layer. Besides, a dropout layer is used between the LSTM layer and the logistic regression layer to prevent overfitting during training.
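A minimal Keras sketch of this LSTM classifier: d = 128, the sigmoid output, and the dropout layer follow the description above, while the vocabulary size, padded length, LSTM width, and dropout rate are assumptions for illustration.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

VOCAB_SIZE, MAXLEN = 40, 64  # assumed size of C and padded domain length

lstm_model = models.Sequential([
    layers.Input(shape=(MAXLEN,)),
    layers.Embedding(VOCAB_SIZE, 128),      # d = 128 character embedding
    layers.LSTM(128),                       # feature extractor (width assumed)
    layers.Dropout(0.5),                    # regularization before the output
    layers.Dense(1, activation="sigmoid"),  # logistic-regression layer
])
lstm_model.compile(optimizer="adam", loss="binary_crossentropy",
                   metrics=[tf.keras.metrics.AUC()])
```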
The detected data are involved in the training and test phases (see Table 1). Classification models are run using 10-fold cross-validation with a maximum of 25 epochs. The training set accounts for 80% of the data and the remaining 20% is used as the test set; validation accounts for 5% of the training set. Finally, we obtain a trained LSTM classifier with an AUC of 0.996.

Character-level CNN Model.
The other targeted model is a CNN-based DGA classifier. The embedding layer encodes the input domain name into a two-dimensional tensor. We perform multiple convolution operations Conv(t, k, n) to extract local features, and this information is then aggregated into a fixed-length feature vector. In this one-dimensional CNN, the filter sizes are k = {2, 3, 4, 5}, the number of convolution kernels is t = 256, and n denotes the input tensor. Max pooling is used to reduce the computational complexity of the model.
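A minimal Keras sketch of this CNN classifier: the filter sizes k = {2, 3, 4, 5}, t = 256 kernels per branch, and max pooling follow the description above, while the embedding dimension, vocabulary size, and input length are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

VOCAB_SIZE, MAXLEN = 40, 64  # assumed size of C and padded domain length

inp = layers.Input(shape=(MAXLEN,))
emb = layers.Embedding(VOCAB_SIZE, 128)(inp)      # 2-D tensor per domain
branches = []
for k in (2, 3, 4, 5):                            # filter sizes k
    conv = layers.Conv1D(filters=256, kernel_size=k, activation="relu")(emb)
    branches.append(layers.GlobalMaxPooling1D()(conv))  # max pooling per branch
merged = layers.Concatenate()(branches)           # fixed-length feature vector
out = layers.Dense(1, activation="sigmoid")(merged)
cnn_model = models.Model(inp, out)
cnn_model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=[tf.keras.metrics.AUC()])
```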
The training set and test set (see Table 1) account for 80% and 20% respectively. Finally, we obtain a trained CNN classifier with an AUC of 0.995.

Adversarial Attack
We apply the targeted models in two cases. One is detecting the 18 known DGAs involved in training; here we generate adversarial examples for the DGA domain names in the test set. The other is predicting the 7 unknown DGAs not involved in training; here we generate adversarial domain names for all of them.
Two scoring functions are adopted to calculate each character's score: CIS (λ = 1) and OIS. For example, for the DGA domain name "quisleiymnnmilp", the calculation result for each character is shown in Figure 2.
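Using the scoring helpers sketched earlier, the per-character scores for this example could be computed as follows (assuming `query` wraps one of the trained classifiers):

```python
# Hypothetical usage of the earlier sketch; `query` is a trained classifier wrapper.
domain = "quisleiymnnmilp"
cis_scores = [cis(query, domain, i, lam=1.0) for i in range(len(domain))]
ois_scores = [ois(query, domain, i) for i in range(len(domain))]
```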
Besides our method, we implement Random Substitution (RS) to generate adversarial examples: randomly select a character and transform it into another random character. To enhance the reliability of the experiment, the random substitution process is repeated more than 10 times, and average recalls are reported.

Experimental Results on Classification
We evaluate the effectiveness of the different adversarial methods on the two targeted models. The resulting recalls are presented in Table 3.
Compared with the results when no adversarial attack occurs, we first find that the targeted models have significantly lower recalls when classifying the adversarial examples generated by CLETer; the lower the recall, the better the evasion of DGA detection. Moreover, CLETer is effective for both known and unknown DGAs. Choosing different scoring functions to calculate the importance scores has different impacts on the classification results, and as the number of substitutions m increases, the recalls decrease more significantly.
When applying CLETer to the 18 DGA families involved in the training phase, the CIS and OIS scoring functions reliably decide which important characters to substitute, with CIS slightly superior to OIS. When two characters are transformed, the recall drops from 99.76% to 1.29% for the LSTM model and from 99.36% to 3.64% for the CNN model.
When the two targeted models predict DGA families not involved in training, the prediction results are slightly worse than the detection results even before any adversarial attack, because the networks did not learn the features of those DGAs. After applying CLETer, the recalls also decrease significantly, with OIS performing slightly better. With two substitutions, almost all the adversarial examples can evade detection by the CNN classifier. Moreover, the RS method hardly works as an adversarial attack.
Knowing how to select important characters to modify is the key to CLETer's success. For CIS and OIS, the higher a character's score, the more significant its effect on the classification result; that is, changing these important characters readily causes misclassification.
To sum up, CLETer is effective in fooling deep learning classifiers used to detect known DGAs and predict unknown DGAs.

Evaluation of Adversarial Defense
Adversarial examples generated by CLETer have achieved noteworthy attack results. To defend against the proposed evasion technique, we consider adversarial retraining as a countermeasure. Adversarial retraining is a proactive defense [35] that can make neural networks more robust. In this experiment, we randomly select 5,000 samples from the training sets of the two targeted models and generate adversarial examples for them respectively, using the CIS and OIS scoring functions with the number of perturbed characters m = 1. Then, we add the 5,000 adversarial domain names to the training set and retrain the classification model for another 2 epochs. Finally, we evaluate the retrained targeted models on a test set including 10,000 adversarial examples generated by CLETer (m ≤ 5). The resulting recalls are presented in Table 5.
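A minimal sketch of this retraining procedure, reusing the hypothetical `cleter_attack` helper and assuming an `encode` function that maps a domain string to a padded index sequence:

```python
import numpy as np

def adversarial_retrain(model, query, x_train, y_train, dga_domains,
                        n_samples=5000, epochs=2):
    """Augment training data with CLETer adversarial examples and retrain."""
    picked = np.random.choice(len(dga_domains), n_samples, replace=False)
    adv = [cleter_attack(query, dga_domains[i], score_fn=cis, m=1)
           for i in picked]
    adv = [a for a in adv if a is not None]        # keep successful examples
    x_adv = np.array([encode(a) for a in adv])     # encode: assumed helper
    y_adv = np.ones(len(adv))                      # adversarial DGAs stay malicious
    x_aug = np.concatenate([x_train, x_adv])
    y_aug = np.concatenate([y_train, y_adv])
    model.fit(x_aug, y_aug, epochs=epochs)
    return model
```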
From the experimental results, the retrained LSTM and CNN models can effectively detect adversarial examples generated by CLETer. This also proves that the effectiveness of CLETer is not accidental but follows certain inherent rules and features, giving it wide adaptability. Augmenting the training set with those adversarial examples notably improves the robustness of the targeted models.

Discussion
That well-designed classifiers can be defeated by only subtle perturbations is sufficient proof of the vulnerability of deep neural networks. In this paper, taking the confidence score of the model as the basis of the perturbation proves feasible. However, CLETer's successful application has been verified only on character-level deep neural models; whether it is effective against feature-based classifiers is unknown and requires further research. CLETer is a character-level evasion technique that replaces characters in existing DGA domain names. The experimental results show the effectiveness of CLETer in generating adversarial examples, but our character-level perturbations could be improved by combining more sophisticated techniques such as semantic or syntactic parsing, and the character-level operations could be extended to swaps, insertions, or deletions. Furthermore, devising more effective countermeasures against attacks such as CLETer is promising future work.

Conclusion
This paper proposed a simple but effective evasion technique for DGA domain names, called CLETer. CLETer intelligently generates adversarial examples from existing DGA domain names in two steps: first, use the confidence score to quantify the influence of every character in a domain name on the classification result; second, transform the important characters to obtain a new domain name. We proposed two scoring functions to calculate the confidence score and demonstrated their practicality in measuring influence on the classification. The adversarial examples generated by CLETer effectively evade the detection of LSTM and CNN classifiers and reduce their recalls to below 10%. CLETer operates in a black-box manner: we do not know the structure or parameters of the targeted model but directly observe its input-output relationship. The adversarial domain names generated by CLETer show good transferability across different deep learning classifiers. We also demonstrated the effectiveness of adversarial retraining in defending against CLETer's attack.
The successful application of CLETer reveals the vulnerability of deep learning DGA classifiers to such simple adversarial attacks. CLETer can be applied as an extension module and deployed flexibly.
Generating adversarial examples by CLETer provides a noteworthy perspective to enhance the robustness of the DGA detection mechanism.