Measuring the Cost of Software Vulnerabilities

Enterprises increasingly consider security an added cost, making it necessary for them to see a tangible incentive in adopting security measures. Despite data breach laws, prior studies have suggested that only 4% of reported data breach incidents have resulted in litigation in federal courts, showing the limited legal ramifications of security breaches and vulnerabilities. In this paper, we study the hidden cost of software vulnerabilities reported in the National Vulnerability Database (NVD) through stock price analysis. We perform a high-fidelity data augmentation to ensure data reliability and to estimate vulnerability disclosure dates as a baseline for estimating the implications of software vulnerabilities. We further build a model for stock price prediction using the nonlinear autoregressive neural network with exogenous factors (NARX) to estimate the effect of vulnerability disclosure on the stock price. Compared to prior work, which relies on linear regression models, our approach is shown to provide better prediction performance. Our analysis also shows that the effect of vulnerabilities on vendors varies and depends greatly on the specific software industry: whereas some industries are shown statistically to be affected negatively by the release of software vulnerabilities, even when those vulnerabilities are not broadly covered by the media, others were not affected at all.


Introduction
Vulnerabilities in software expose users to unwarranted environments, leading to security and privacy issues [1]. These vulnerabilities can be the result of flaws or bugs in the software code base [2]. Defects can be due to limited unit, performance, or stress testing, which leave the software liable to behave in an unintended fashion, exposing the product and its users alike to risk. In the event of such a vulnerability, users intuitively prefer vendors that treat such defects with the utmost priority: fixing them, reporting them to their users, and keeping their users' susceptibility in check. Failure to do so can put the vulnerable vendors at risk. To make matters worse, the number of security incidents and vulnerabilities has been growing at a rapid pace, leading to similar growth in the resources required for fixing them. In 2012, for example, Knight Capital, a financial services company, lost $400 Million USD because of a bug in their code; the company bought shares at the ask price and sold them at the bid price [5]. Losses from WannaCry (2017), a ransomware attack spanning over 150 countries and affecting more than 100,000 groups, are estimated at $4 Billion USD [6]. Virus attacks, such as Love Bug (2000), SirCam (2001), Nimda (2001), and CodeRed (2001), have had an impact of $8.75 Billion, $1.25 Billion, $1.5 Billion, and $2.75 Billion USD, respectively [7]. With the deployment of software in critical infrastructure, vulnerabilities could have an overwhelming impact. For example, defects such as the loss of radio contact between air traffic controllers and pilots, due to an unexpected shutdown of the voice communication system and a crash of the backup system within a minute of turning it on, could cost lives [8].
The cost of vulnerabilities is a variable that depends not only on the type of vulnerability, but also on the industry, the potential users, and the severity of the vulnerability as seen by those users. For example, users of security or financial software are more likely to lose trust in their product than users of general e-commerce applications. A more severe vulnerability is also more likely to impact a vendor than a minor software glitch. For example, a vulnerability that can be used to repeatedly launch a Denial of Service (DoS) attack could be viewed more severely by users than, say, an access control misconfiguration (e.g., a one-time access-token exposure). We note that while companies may have cyber insurance, they remain susceptible to losses due to vulnerabilities, and the cost of a vulnerability may stem from long-term impact on the brand. Although the immediate cost is borne by the insurance provider, such incidents result in increased insurance costs.
For publicly-traded drug and auto vendors, Jarrell and Peltzman [9] show that recalling products has an impact on value. Conversely, even as software vendors experience an increase in software vulnerabilities, researchers have shown that software vendors may not suffer significant losses due to those vulnerabilities [10], or that revenue and products may increase concurrently. However, there are also underlying costs associated with each software vulnerability, as mentioned above, and those costs may be invisible [10]. For example, Romanosky et al. [11] studied software-related data breaches in the United States and found that 4% of them resulted in litigation in federal courts, out of which 50% (2% of the originally studied cases) were won. Given that prior work has found no impact of vulnerabilities on vendors, the vendors do not seem to face any immediate effect, unlike the end users. In this work, we investigate how vendors could likewise be adversely impacted by vulnerabilities.
Contributions. We quantitatively analyze the loss faced by software vendors due to software vulnerabilities, through the lenses of stock price and valuation. To this end, this work makes the following contributions. (i) An evaluation of all the publicly disclosed vulnerabilities in the National Vulnerability Database (NVD) and their impact on their vendors. (ii) An accurate method for predicting the next day's stock price using the NARX Neural Network. (iii) An industry-impact correlation analysis, demonstrating that some industries are more prone to stock loss due to vulnerabilities than others. (iv) A vulnerability type analysis, indicating that different vulnerability types differ in their power to affect the stock return of a vendor.
Our work stands out in the following aspects compared to the prior work (more in section 2). (i) Unlike the prior work, which is event-based (tracking only vulnerabilities reported in the press), we use a comprehensive dataset of disclosed vulnerabilities in the National Vulnerability Database (NVD). Data breaches are events where a vulnerability in a vendor is exploited to gain access to its data storage with malicious intent. Per Spanos and Angelis [12], 81.1% of the prior work they surveyed was limited to security breaches, whereas we focus on all software vulnerabilities. Furthermore, per the same source, 32.4% of the prior work used Lexis/Nexis (a database of popular newspapers in the United States) as their source, 24.3% used the Data Loss Archive and Database (data on privacy breaches), 13.5% used CNET (a technology website), and 13.5% used Factiva (a global news database). In this study, we uniquely focus on the NVD. (ii) We design a model to accurately predict the next day's stock price, so as to precisely measure the effect of a vulnerability. Our approach outperforms the state-of-the-art approach using linear regression (e.g., while our mean-squared error (MSE) using ANN is below 0.6, linear regression results in an MSE of 6.24). (iii) Unlike the prior work, we did not exclude any vendors, as we considered publicly-traded vendors on nine different stock markets, namely NASDAQ, NYSE, EPA, ASX, STO, NYSEAMERICA, TYO, CVE, and NSE. Spanos and Angelis [12] found in their survey that 83.8% of the surveyed work used vendors traded on a US stock market, 13.5% used vendors from different countries, and only 2.9% (1 out of 34 works) used firms traded on TYO (the leading stock exchange in Japan) [12].
Organization. The rest of the paper is organized as follows: In section 2, we revisit the literature. In section 3, we present our approach to the problem. In section 4, we present our prediction model. In section 5, we evaluate the results.
In section 6 we further comment on the statistical significance of our results, followed by discussion, limitations and future work in section 7. We conclude the paper in section 8.

Related Work
Our work is an amalgam of different fields, connecting vulnerabilities to their economic effect on a vendor. Perception often relates vulnerabilities to their effect on end users; little has been said or done from the vendor's perspective.
A large body of work on software vulnerabilities has been reported to the community. The area has been approached from different fronts, making the topic multi-faceted; we review some of this work below.
Effect on Vendor's Stock. Hovav and D'Archy [10] and Telang et al. [13] analyzed, in event-based studies, vulnerabilities and their impact on vendors. While Hovav and D'Archy showed that the market exhibits no signs of significant negative reaction to vulnerabilities, Telang et al. showed that a vendor on average loses 0.6% of its stock value due to vulnerabilities. Goel et al. [14] pointed out that security breaches have an adverse impact of about 1% on the market value of a vendor. Campbell et al. [15] observed a significant negative market reaction to information security breaches involving unauthorized access to confidential data, but no significant reaction to nonconfidential breaches. Cavusoglu et al. [16] showed that the announcement of Internet security breaches has a negative impact on vendors' market value.
Anwar et al. [17] analyzed the effect of vulnerabilities on vendors and demonstrated that the impact depends on the product's industry sector. Gamero-Garrido et al. [18] characterized the effect of legal threats on vulnerability researchers and observed that 40% of the studied vendors allow academic researchers to evaluate their products, and 25% of security researchers stated they do not do so because they fear legal measures.
Vulnerability Analysis. Li and Paxson [19] outlined a method to approximate the public disclosure date by scraping the reference links in the NVD, which we use in this study. Nguyen and Massaci [20] pointed out that the vulnerable-versions data in the NVD is unreliable. Christey and Martin [21] outlined caveats with the NVD data, also suggesting its unreliability. Romanosky et al. [22] found that data breach disclosure laws, on average, reduce identity theft caused by data breaches by 6.1%. Similarly, Gordon et al. [23] found a significant downward shift in impact after the September 11 attacks. Steinke et al. [24] presented a vulnerability management framework. Stock et al. [25] focused on notifying affected vendors about vulnerabilities.
Zhao et al. [26] conducted an empirical study on data from two web vulnerability discovery ecosystems to analyze their trends. Trinh et al. [27] proposed an algorithm-based string solver to identify vulnerabilities in web applications. Saha [28] extended an attack graph-based vulnerability analysis framework to include complex security policies for efficient vulnerability analysis. Zhang et al. [29] used data from the NVD to predict the time to the next vulnerability and argued that the NVD has poor prediction capability. They also pointed out inconsistencies in the NVD, e.g., missing version information, missing vulnerability release times, and obvious errors.
Sabottke et al. [30] proposed a Twitter-based exploit detector to identify which vulnerabilities are likely to be exploited. Homaei and Shahriari [31] analyzed vulnerability reports between 2008 and 2014, and observed that security professionals can prevent 60% of them by focusing on just seven vulnerability categories. Horvath et al. [32] pointed out that CVSS metrics are more suitable for software products running in an IT environment than for products for personal use. Holm and Afridi [33] studied the reliability of CVSS through a survey of 384 experts, covering more than 3,000 vulnerabilities, and concluded that the outcome depends on the type of vulnerability. Allodi et al. [34] assessed vulnerabilities by evaluating information cues that increase assessment accuracy. Johnson et al. [35] assessed the credibility of CVSS scoring data using a Bayesian method and found CVSS quite reliable except for a few dimensions. They argued, by analyzing five vulnerability databases, that the NVD is the most reliable for CVSS quality.
Financial Impact of Defects. Jarrell and Peltzman [9] analyzed the impact of recalls in the drug and auto industries on vendors' stock value. Toward calculating the effect of a vulnerability, it is crucial to predict a hypothetical valuation of the stock in the absence of the vulnerability. Kar [36] suggested using an Artificial Neural Network (ANN) as a reliable method for predicting stock value. Farhang et al. [37] suggested that higher security investments in Android devices do not result in higher product prices for customers.

Methodology
The goal of this study is to determine the impact of publicly disclosed vulnerabilities on their vendors. Our dataset of publicly disclosed vulnerabilities is gathered from the information available in the National Vulnerability Database (NVD). Prior work has shown that product recalls have an adverse impact on a vendor's stock [9]. Taking a cue from the prior art, we consider the fluctuation in the stock price as a measure of the vulnerabilities' impact. To this end, we calculate the impact on the day a vulnerability was disclosed and on the days following it, with respect to the predicted value of the stock on the day of disclosure. However, we limit the window to the third day after the public disclosure of the vulnerability, to reduce the likelihood of interference from other factors that might affect the market value. The rest of this section explains in detail the steps taken to achieve this goal.

Data and Data Augmentation
The major repository for publicly disclosed vulnerabilities is the NVD [38]. Therefore, we use the NVD as the source dataset for our analysis. The dataset of publicly disclosed vulnerabilities is then augmented with stock data from Alpha Vantage [39] for analysis of the impact. Fig. 1 summarizes, at a high level, the flow of data creation, from the source of the data to the final dataset. In a nutshell, we extract information from JSON files downloaded from the NVD, and scrape the reference links provided by the NVD for each vulnerability to approximate its disclosure date.
Overall, the data from the NVD ranges from 1998 to 2018, making our dataset span over 20 years. The 17.1K vendors from the NVD are then searched over the internet for their stock market and company code. Using each vendor's stock market and code, we then gather the historical daily stock return data for that vendor, using Alpha Vantage as the source of historical data. For all the vendors that exist in Alpha Vantage, we analyze the impact of their vulnerabilities on their stock returns.
National Vulnerability Database (NVD). NVD is a vulnerability database maintained by the National Institute of Standards and Technology (NIST) that serves as a one-stop listing of all the vulnerabilities reported to MITRE [40]. Analysts at NVD further analyze the vulnerabilities before inserting them into the database.
Consequently, the NVD lists the following information for each vulnerability: the Common Vulnerabilities and Exposures Identifier (CVE-ID), vendor, product, Common Vulnerability Scoring System (CVSS) score, published date, Common Weakness Enumeration Identifier (CWE-ID) [41], description, reference links, etc. Additionally, the NVD uses both versions, version 2 and version 3 [42,43], of the CVSS (a widely used severity scoring technique).
CVSS version 3, released in the latter half of 2015, labels vulnerabilities as LOW, MEDIUM, HIGH, and CRITICAL, while version 2 classifies them into LOW, MEDIUM, and HIGH. Although version 3 has been adopted by the database, the NVD is yet to apply it throughout the dataset. The vendor attribute is the name of the vendor that has the vulnerability in their software, the product element is the name of the software that had the vulnerability, the CWE-ID is the type of the vulnerability or the reason for it, the description contextualizes the vulnerability including the exploit conditions, the published date is the date when the vulnerability entered the database, and the reference links point to additional details about the vulnerability, such as security advisories, vendor advisories, security threads, email threads, patch details, and commits (in the event of a vulnerability in an open-source vendor's product). In particular, the reference links contain vulnerability details, such as the date the vulnerability was reported to the vendor, the date it was acknowledged, the date it was patched, the disclosure date, and other information such as the products affected.
Data Preprocessing and Augmentation. The NVD provides data in XML or JSON format, distributed across multiple files such that each file represents the vulnerabilities of a specific year. Altogether, our dataset, built upon these files, comprises the vulnerabilities reported to the NVD until May 21, 2018. Additionally, we observe vulnerabilities that were inserted into the database but later removed from it. Such vulnerabilities can be identified by a description prefixed with "REJECT:". Moreover, the rejected vulnerabilities carry no other information; we therefore disregard all of them in our analysis. Finally, our dataset encompasses a total of 101,580 vulnerabilities.
The impact of a vulnerability can be felt on the date the vulnerability is disclosed to the public or on the days in its vicinity. Since the published date attribute captured in the NVD is the date when a vulnerability enters the database, and not the date when it was disclosed, it is important to find the date when it was publicly disclosed. To do so, we scrape the links present in the NVD and label the disclosure dates corresponding to each of the links (if present), similar to the approach taken by Li and Paxson [19]. For a vulnerability with multiple URLs, after labelling all the URLs with corresponding disclosure dates, we consider the earliest date as the public disclosure date of the vulnerability. It should be noted that we ignore links pointing to patches, as the date of patching may differ from the disclosure date (disclosure happens after the vulnerability is patched), and the market responds to the public disclosure date.
Algorithm 1 summarizes the aforementioned steps for determining the public disclosure date of a vulnerability. In particular, it takes the JSON files from the NVD as input and extracts the CVE-ID, reference links, and published date. We then scrape the individual URLs in the reference links and extract the public disclosure date corresponding to each of them. We then group the dates by CVE-ID and find the minimum of the dates, as described in steps 4 and 5. Lastly, the older of the two dates (the date from the links and the published date from the NVD) is taken as the approximate public disclosure date, as detailed in steps 7 to 10.
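The date-approximation logic of Algorithm 1 can be sketched as follows (a simplified Python rendering; the function name and example dates are ours, and scraping of the reference links is assumed to have already produced the candidate dates):

```python
from datetime import date

def approximate_disclosure_date(published, reference_dates):
    """Approximate the public disclosure date of a CVE (Algorithm 1 sketch).

    published       -- the NVD 'published' date (when the entry was inserted)
    reference_dates -- dates scraped from the entry's reference links
                       (patch links already excluded); may be empty
    """
    candidates = [d for d in reference_dates if d is not None]
    if not candidates:
        # No usable reference date: fall back to the NVD published date.
        return published
    # The older of the two estimates is taken as the disclosure date.
    return min(min(candidates), published)

# Hypothetical example: two reference links, one older than the NVD entry.
print(approximate_disclosure_date(
    date(2017, 3, 20), [date(2017, 3, 14), date(2017, 3, 18)]))  # 2017-03-14
```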
We record redundant vendor names in our dataset, e.g., schneider-electric vs. schneider_electric, trendmicro vs. trend-micro, and palo_alto_networks vs. paloaltonetworks, and consolidate the variants under a consistent vendor name. For all the vendors in the above dataset, we further augment the data by incorporating historical stock return data from Alpha Vantage.

Alpha Vantage. Alpha Vantage [39] is a community of researchers, engineers, and professionals, and a leading provider of free real-time and historical data on stocks, physical currencies, and digital currencies through an API. We found the market code for every vendor in our dataset, along with the company code, by searching for the list of vendor codes and the markets they are traded on; we obtained lists of the companies traded on NYSE and NASDAQ from Alpha Vantage. The open and close attributes are the values of the vendor's stock at market open and close on the given day, respectively, while the low and high attributes correspond to the low and high values of the vendor's stock price on that day. Upon careful examination of the vulnerable vendors in our dataset, and successful augmentation with the Alpha Vantage data, we arrive at an overall dataset of 202 vendors.
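As an illustration, a vendor's daily price series can be requested from Alpha Vantage's public query endpoint; the snippet below only constructs the request URL (TIME_SERIES_DAILY is part of the documented Alpha Vantage API, while the helper name and placeholder key are ours):

```python
from urllib.parse import urlencode

# Alpha Vantage's query endpoint; "demo" is a placeholder API key.
BASE = "https://www.alphavantage.co/query"

def daily_series_url(symbol, api_key="demo"):
    """Build the request URL for a vendor's daily open/high/low/close series."""
    params = {
        "function": "TIME_SERIES_DAILY",
        "symbol": symbol,
        "outputsize": "full",   # full history rather than the last 100 days
        "apikey": api_key,
    }
    return BASE + "?" + urlencode(params)

print(daily_series_url("ADBE"))
```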
Predicting Return. Calculating the impact of a vulnerability depends on estimating the return had the vulnerability not occurred. We therefore determine the stock return of a vendor for this hypothetical event by leveraging machine learning-based prediction models. To this end, we feed the open, low, high, and close values of the preceding days as inputs to our prediction model to predict the return for the day a vulnerability occurs. We describe our prediction model in section 4. We use this return (for the event of non-occurrence of the vulnerability) as a baseline against which to compare the actual return on the day of the vulnerability.

Press.
We contrast the impact of the vulnerabilities reported in the NVD with that of the vulnerabilities that capture media attention. Toward this, we collect four vulnerabilities that were reported in the media. Specifically, we search for news relating to "software vulnerabilities" in media outlets, such as Forbes and ZDNet, and capture four incidents for comparison, namely those of Alteryx, Dow Jones, Viacom, and Equifax.

Assessing Vulnerability's Impact
To assess the impact of vulnerabilities, we cluster our dataset by vendor. Additionally, there can be multiple vulnerability disclosures on the same day, while we have only daily stock return data for each vendor. Consequently, we compute the impact of vulnerabilities on a particular day, which means that, in the event of multiple vulnerabilities on one day, it is impossible to attribute the impact to individual vulnerabilities. Therefore, for all such days, we determine the overall impact of the vulnerabilities.
With this distinction in place, we further organize the vulnerabilities by disclosure date. It is also important to remember that, while a vulnerability can be disclosed on any day of the week, the market does not operate on weekends and holidays. Therefore, for every disclosure date that does not have stock information, the effect of the vulnerability can only be observed on the next operational day; thus, we approximate the vulnerability to have occurred on the next operating day of the market.
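The shift to the next operating day can be sketched as follows (the helper name and example dates are hypothetical):

```python
from datetime import date, timedelta

def next_trading_day(disclosed, trading_days):
    """Shift a disclosure date to the next day with stock data.

    trading_days -- set of dates on which the vendor's stock traded
    """
    day = disclosed
    # Walk forward until the market is open (skips weekends and holidays).
    while day not in trading_days:
        day += timedelta(days=1)
    return day

# Hypothetical example: a Saturday disclosure maps to the following Monday.
open_days = {date(2018, 5, 21), date(2018, 5, 22)}  # Mon, Tue
print(next_trading_day(date(2018, 5, 19), open_days))  # 2018-05-21
```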
Towards analyzing the cost of vulnerability against its corresponding vendor, we perform an event-based study. To do so, we realize a hypothetical event of non-occurrence of vulnerability, as described earlier.
We call this event realization the Normal Return. The actual performance of the stock, i.e., the stock return at the end of the day given the actual stock performance, is called the Actual Return. The comparison between the Normal Return and the Actual Return reflects the abnormality in return for an event. To quantify the abnormality on a day due to a vulnerability disclosure, we compare and contrast the Normal Return (R̂) and the Actual Return (R). Moreover, the impact of a vulnerability can also be delayed, considering the reaction time of consumers; to this end, we also compute the impact on the days following the vulnerability disclosure. Finally, we limit the impact calculation to the third day from disclosure, to limit the influence of external factors on market returns. In particular, we define the Abnormal Return (AR) as the deviation of the Actual Return from the Normal Return; mathematically, on day i, AR_i = R_i - R̂_i. Algorithm 2 shows the method used for calculating the impact of vulnerabilities: mapping the market codes to vendor names, identifying the public disclosure dates of the vulnerabilities, and then calculating the impact of the vulnerabilities on each disclosure date.
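A minimal sketch of the abnormal-return computation (the return values are hypothetical; the three-day window mirrors the limit described above):

```python
def abnormal_return(actual, normal):
    """Abnormal Return: deviation of the Actual from the Normal (predicted) return."""
    return actual - normal

def cumulative_abnormal_return(actuals, normals):
    """Sum the abnormal returns over the observation window (here, up to 3 days)."""
    return sum(a - n for a, n in zip(actuals, normals))

# Hypothetical returns (%) on the disclosure day and the two days after.
actuals = [-1.2, 0.3, -0.5]   # observed returns
normals = [0.1, 0.2, 0.0]     # model-predicted (Normal) returns
ars = [abnormal_return(a, n) for a, n in zip(actuals, normals)]
print(ars)  # deviations on days 0, 1, and 2
```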

Prediction
To quantify the impact of a vulnerability, we perform an event-based study, which compares and contrasts the Normal Return and the Actual Return. Our historical stock return dataset contains the Actual Return for each day. The Normal Return, defined as the return for the non-occurrence of an event that did occur, considering the past trends and ignoring the event, is determined as explained in the previous section. We leverage machine learning-based algorithms toward this determination. As aforementioned, our historical stock return dataset contains the following attributes for every vendor: date, open, close, high, and low. These attributes are fed as features to our prediction models.
Recognizing the nonlinear behavior of the returns, we make use of nonlinear prediction techniques to analyze and predict that behavior [44]. Additionally, we perform data preprocessing to improve the performance of the machine learning algorithms.
We begin by normalizing the feature set for standardization. Feature standardization projects the raw data into a new vector space where each feature has zero mean and unit standard deviation, i.e., every feature is represented in a space with a more specific and richer realization. A widely accepted method for feature standardization uses the mean and the standard deviation of the features. Mathematically, the mapping transforms the feature vector x into z, where z = (x - x̄)/σ, and x̄ and σ are the mean and standard deviation of the original feature vector x, respectively. These features are used as input to our machine learning-based prediction model to predict the Normal stock returns. In particular, we use the NARX Neural Network. To draw a parallel with the prior work and for comparison, we also use a linear regression model for prediction.
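The standardization step can be sketched as follows (a minimal example using Python's statistics module; the function name is ours):

```python
from statistics import mean, pstdev

def standardize(x):
    """Map a raw feature vector x to z = (x - x̄) / σ (zero mean, unit std)."""
    mu, sigma = mean(x), pstdev(x)
    return [(v - mu) / sigma for v in x]

# Hypothetical raw feature values (e.g., daily open prices).
z = standardize([10.0, 12.0, 14.0, 16.0])
print([round(v, 4) for v in z])  # [-1.3416, -0.4472, 0.4472, 1.3416]
```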

NARX Neural Network
The NARX neural network, generally applied for predicting the behavior of discrete-time nonlinear dynamic systems, is one of the most efficient tools for this task [44]. Among the unique characteristics of NARX is its ability to provide accurate forecasts of stock values by exploiting a recurrent neural network architecture with limited feedback, coming from the output neuron only. Compared to architectures that consider feedback from both hidden and output neurons, NARX has been shown to be more efficient and to yield better results [45].
Adapting the NARX neural network model to our needs, we determine the Normal Return y(t) on day t, where t is the day on which a vulnerability occurred at the vendor. The Normal Return y(t) is regressed on previous values of the output and of the exogenous input, and is represented by the model

y(t) = f(y(t-1), ..., y(t-d_n), u(t-1), ..., u(t-d_n)),

where u(t) is the exogenous input, i.e., the (low, high, open) values from the historical return data, and d_n is the number of days before the day of the vulnerability that are considered as input to the model for training. Additionally, y(t-i) is the Actual Return corresponding to the input u(t-i), d_n is the lag on both the exogenous inputs and the output of the system, and the function f is a multi-layer feed-forward network. The general architecture of the NARX neural network is shown in Fig. 2. For every vendor in the dataset, we divide the historical return data into training, validation, and test subsets of 70%, 15%, and 15% of the dataset, respectively. The training data is used to train a predictive model, and the Mean Squared Error (MSE) is used to evaluate the performance of the models. The MSE is defined as

MSE = (1/n) Σ (y_t - y_p)²,

where n is the number of samples, and y_t and y_p represent the Actual Return and the corresponding predicted Normal Return on the day of the vulnerability, respectively. A feed-forward neural network with one hidden layer is used as the predictor function of the NARX, and the Levenberg-Marquardt (LM) back-propagation learning algorithm [46] is employed to tune the weights of the neural network. The specifications of the proposed NARX neural network are presented in Table 1.
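The lagged-input structure described above can be illustrated with a short sketch. The following snippet (a simplification, not the authors' implementation, which trains a feed-forward network with Levenberg-Marquardt) builds the lagged regressor vectors [y(t-1), ..., y(t-d_n), u(t-1), ..., u(t-d_n)] and computes the MSE; the function names and toy series are hypothetical.

```python
def narx_regressors(y, u, d):
    """Build NARX training pairs in open-loop (series-parallel) form.

    y -- past Actual Returns (targets); u -- exogenous inputs per day
    d -- lag d_n: number of past days fed to the model
    Each sample for day t is [y(t-1)..y(t-d), u(t-1)..u(t-d)] -> y(t).
    """
    X, T = [], []
    for t in range(d, len(y)):
        lagged_y = [y[t - i] for i in range(1, d + 1)]
        lagged_u = [v for i in range(1, d + 1) for v in u[t - i]]
        X.append(lagged_y + lagged_u)
        T.append(y[t])
    return X, T

def mse(y_t, y_p):
    """Mean Squared Error between Actual (y_t) and predicted Normal (y_p) returns."""
    return sum((a - b) ** 2 for a, b in zip(y_t, y_p)) / len(y_t)

# Hypothetical series: returns y and (low, high, open) inputs u, lag d_n = 2.
y = [0.1, 0.2, 0.15, 0.3, 0.25]
u = [(1.0, 1.2, 1.1)] * 5
X, T = narx_regressors(y, u, 2)
print(len(X), len(X[0]))  # 3 samples, 2 + 2*3 = 8 regressors each
```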

Baseline for Comparison: ARIMA
In addition to the NARX neural network model, and to draw a parallel with prior work, we use linear regression to determine the Normal Return. Toward this, we use one of the most popular time series prediction models, the Autoregressive Integrated Moving Average (ARIMA) model [47]. In particular, using linear regression we conduct the prediction of the stock return for one vendor, namely Adobe, to determine the Normal Return. Conceptually, the AR portion of ARIMA signifies that the variable to be predicted is regressed on its own past values, while the MA portion indicates that the error of the regression model is a linear combination of past error values. The ARIMA model with external regressors x, for one-step-ahead prediction, can be written as

y_p(t) = µ + Σ_i φ_i y_t(t-i) + Σ_j θ_j ε(t-j) + β x(t),

where y_p and y_t are the Normal and Actual stock returns, respectively, µ is a constant, θ and φ are the MA coefficient and AR coefficient values, respectively, ε is the regression error, and β weights the external regressors.
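As a minimal illustration of the autoregressive idea, the sketch below fits only the AR part, an AR(1) model without differencing or MA terms, by least squares; this is a deliberate simplification of the ARIMA baseline, and the series and function names are hypothetical.

```python
def fit_ar1(series):
    """Least-squares fit of a simplified AR(1) model y(t) = mu + phi*y(t-1),
    a stand-in for the AR part of the ARIMA baseline (MA terms omitted)."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    phi = (sum((a - mx) * (b - my) for a, b in zip(x, y))
           / sum((a - mx) ** 2 for a in x))
    mu = my - phi * mx
    return mu, phi

def predict_next(series, mu, phi):
    """One-step-ahead Normal Return prediction."""
    return mu + phi * series[-1]

# Hypothetical noiseless series generated by y(t) = 1.0 + 0.5*y(t-1),
# so the least-squares fit recovers the coefficients exactly.
s = [1.0, 1.5, 1.75, 1.875]
mu, phi = fit_ar1(s)
print(round(mu, 6), round(phi, 6), round(predict_next(s, mu, phi), 6))  # 1.0 0.5 1.9375
```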
The results are shown only for Adobe; for the rest of the vendors, only the MSE is shown in Table 3. Fig. 3 depicts the Actual and Normal stock returns. The low value of the error strongly suggests that the NARX model can forecast the stock values with high accuracy. In addition, the error histogram in Fig. 5 represents the performance of the predictor: the majority of the instances are forecasted precisely, with very small prediction error. In Fig. 4, the visual representation suggests a weakness of fit with ARIMA in predicting the stock values, and the difference in MSE between the two models, 6.42 for ARIMA and 0.59 for NARX, quantitatively justifies the advantage of the proposed method over the existing methods in the literature.

Results
We perform our experiment over a large dataset of publicly available vulnerabilities, encompassing all possible publicly traded vendors. To begin, we augment the dataset and label the extracted vendors with their market and company codes. We then determine the impact of vulnerabilities on a vendor after grouping them by date, meaning that multiple vulnerabilities could correspond to a single date; the effect we see in Table 3 could therefore be due to one or more vulnerabilities. For every vulnerability disclosure date and vendor, we calculate the % Abnormal Return on days 0, 1, and 2 (AR_1, AR_2, and AR_3, respectively, as described above).
We present the results in Table 3, including the normalized MSE, the count of vulnerabilities, and the Abnormal Return on days 0, 1, and 2 for every vendor. We observe that 155 out of the 202 vendors suffer an adverse impact of vulnerabilities on their returns on at least one of these days. To assign a likelihood of an industry's returns being impacted by vulnerabilities, we use Highly-Likely when the number of vendors whose stock is affected negatively by vulnerabilities in the given industry is larger than the number not affected, Less-Likely otherwise, and Equally-Likely when the two counts are equal.
To investigate the industries that are less likely to be affected by vulnerabilities at the vendor level, we examine vulnerabilities from 10 such vendors. For every vendor, we observe that there are a few dates with a vulnerability where that vulnerability has no visible impact on the return. In other words, these dates see a surge in the vendor's stock returns despite the vulnerability occurrence, thereby nullifying the impact of vulnerabilities on the other days. For a better understanding, we then examine the descriptions of the vulnerabilities, leading to the following observations: 1. Vulnerabilities affecting vendors' stock negatively are of critical severity (vulnerabilities with a CVSS version 3 label of CRITICAL), while the rest are less severe (vulnerabilities with CVSS labels of HIGH or MEDIUM).

2. Vulnerabilities affecting vendors' stock returns negatively have a combination of a CVSS version 3 label of HIGH or CRITICAL and a description containing phrases such as "denial of service", "allows remote attacker to read/execute", "allows context-dependent attackers to conduct XML External Entity (XXE) attacks via a crafted PDF", and "allows context-dependent attackers to have unspecified impact via an invalid character". Additionally, vulnerabilities with descriptions such as "allows authenticated remote attacker to read/execute", "remote attackers to cause a denial of service", and "allows remote attackers to write to files of arbitrary types via unspecified vectors" have little (on days 0, 1, and 2) to no effect on the returns. Therefore, we can conclude that vulnerabilities involving unauthorized access have a higher cost, seen in their detrimental effect on the stocks.
3. Vulnerabilities with phrases such as "local users with access to" and "denial of service" in the description have no impact on the stock. Therefore, DoS attacks lacking a confidentiality factor have no impact on stock value.
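The observations above can be approximated by phrase matching against vulnerability descriptions. This is a rough sketch with a hypothetical phrase list, not the paper's manual review; crude substring matching will miss paraphrases:

```python
# Hypothetical phrase list distilled from the observations above: phrases
# associated with confidentiality breaches and negative stock reactions.
IMPACT_PHRASES = [
    "allows remote attacker to read/execute",
    "xml external entity",
    "unspecified impact via an invalid character",
]

def likely_impacts_stock(description):
    """True if the description contains a phrase associated with negative returns."""
    d = description.lower()
    return any(p in d for p in IMPACT_PHRASES)
```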
Severity effect. To study the significance of our results, we evaluate the impact of vulnerabilities against their severity. To do so, we first conduct a correlation analysis (Pearson correlation coefficient) between the impact and the severity of the vulnerabilities. As stated earlier, a public distribution date can have multiple vulnerabilities corresponding to it; therefore, the impact of a particular vulnerability is impossible to quantify in isolation. Our prior manual effort hints at the higher impact of more severe vulnerabilities. We start by assigning a severity index to every public distribution date. In doing so, we prioritize the CVSS version 3 scoring system over version 2: for vulnerabilities that do not have version 3 labels, we consider the version 2 label as their tag. Moreover, we prioritize a more severe vulnerability over a less severe one. More precisely, if a public distribution date has critical, high, medium, or low vulnerabilities, we consider the date to contain a critical severity vulnerability and label it as critical.
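The severity-index assignment can be sketched as follows, assuming a CRITICAL > HIGH > MEDIUM > LOW ordering and a per-vulnerability record with optional CVSS v3 and v2 labels (the record layout is hypothetical):

```python
# Assumed severity ordering; each record carries optional CVSS v3 and v2 labels.
SEVERITY_ORDER = {"LOW": 0, "MEDIUM": 1, "HIGH": 2, "CRITICAL": 3}

def date_severity(vulns):
    """Label a public distribution date with its most severe vulnerability,
    preferring the CVSS v3 label and falling back to v2 when v3 is absent."""
    labels = [v.get("cvss3") or v["cvss2"] for v in vulns]
    return max(labels, key=SEVERITY_ORDER.__getitem__)
```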
Having labeled the public distribution dates with severity labels, we examine the correlation between the impact and the severity labels. We observe a low positive correlation between severity and impact on the individual days. In particular, the correlation coefficient between severity and the impact on the day a vulnerability surfaces and the following two days is 0.119, 0.115, and 0.11, respectively. Although we observed a positive correlation, its low magnitude indicates that we cannot consider the severity of vulnerabilities an indicator of the impact on vendors. This shows the changing role of severity labels in the overall impact of vulnerabilities on their vendors. We followed the same steps for the vulnerabilities gathered from the press, and found that these vulnerabilities have an adverse effect on vendor stock in almost every case.
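A Pearson correlation of this kind can be computed directly; this is a generic sketch over toy severity/impact pairs, not the paper's data:

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy inputs: per-date severity indices vs. per-day abnormal returns.
severity = [0, 1, 2, 3, 2, 1]
impact = [-0.1, -0.3, -0.2, -0.9, -0.4, 0.2]
r = pearson(severity, impact)
```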

Statistical Significance
To understand the statistical significance of our results, we use the confidence interval of the observations as a guideline. Particularly, we measure the statistical confidence of the overall effect of vulnerabilities corresponding to a vendor on days 1, 2, and 3, respectively. Table 5 shows the confidence intervals (lower and upper limits) on days 1, 2, and 3, measured with 95% confidence. In Table 5, OAR1, OAR2, and OAR3 stand for the average effect (percent) at days 1, 2, and 3, respectively; CIi is the confidence interval for day i, where i ∈ {1, 2, 3}; and vendor names are abbreviated as in Table 3 and Table 4. 95% Confidence Interval. A 95% Confidence Interval (CI) is a range that contains the true mean of a population with 95% certainty. For a small population, the CI is almost the same as the range of the data, while for a large population only a tiny sample of the data lies within the confidence interval. In our study, we have noticed that our data populations are diverse: while some vendors have a small number of samples, others have a larger number, as shown, for example, in Fig. 6-Fig. 8. The 95% CI for a given population (distribution) can be calculated as x̄ ± 1.96 σ/√n, where x̄ is the mean of the population, σ is the standard deviation, and n is the number of samples.
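The interval computation follows the formula above; a minimal sketch, using the standard normal-approximation critical value 1.96 for 95% confidence:

```python
from math import sqrt

def confidence_interval_95(mean, sigma, n):
    """95% CI under the normal approximation: mean ± 1.96 * sigma / sqrt(n)."""
    margin = 1.96 * sigma / sqrt(n)
    return mean - margin, mean + margin

# Hypothetical vendor: average abnormal return of -1.2% over 100 samples
# with standard deviation 2.0; both bounds come out negative.
lo, hi = confidence_interval_95(-1.2, 2.0, 100)
```

When both bounds are negative, as in the hypothetical example, the true mean effect is likely negative, which is the reasoning applied to the vendors below.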
Putting it into perspective, while OARi, where i ∈ {1, 2, 3}, captures the overall effect of vulnerabilities corresponding to a vendor, the confidence interval (CIi, where i ∈ {1, 2, 3}) gives the confidence that the effect lies within its upper and lower bounds. Considering the data associated with Adobe in Table 5, for example, we can say with 95% confidence that the confidence interval for the population, CIi, contains the true mean, OARi. We also observe that: 1. OARi for the vendors in Table 3 and Table 4 are within their respective confidence intervals, which means that our results reported earlier are statistically significant.
2. The confidence intervals depict that the chance of the overall impact on a vendor due to vulnerabilities lying within its respective confidence interval is 95%.
3. The true mean for the vendors whose confidence intervals are bounded in the negative range is likely to be negative. Thus, a vulnerability is highly likely to have a negative impact on the vendor's stock returns on the days succeeding its disclosure.
4. The confidence intervals reveal that 19 of the 202 vendors in Table 3 and Table 4 are non-negative bounded, i.e., they have not been impacted by the vulnerabilities corresponding to them.

Discussion and Comparison
Prior work has made a concentrated effort to explore and comprehend the influence of data breaches on a victim's stock returns. Data breaches can be the result of the abuse of known vulnerabilities in the victim's software. Intuitively, while data breaches have a direct impact on the victim, vulnerabilities do not. In this work, we focus our effort on understanding the effect of vulnerabilities on the vendor, and our results prove otherwise, a conclusion strengthened by their statistical significance. Given the varying severity of the vulnerabilities, we study the role of severity in the impact on vendors' stock returns. Moreover, we compare our results with prior work. We discuss the results further in the rest of the section.

Comparison with Prior Works
Studies in the past have drawn varied conclusions about vulnerabilities. In particular, the studies reflect associations with certain aspects of those vulnerabilities, including correlation with the type of vulnerability, the effect of publicity, etc. In the following, we compare our findings with prior work across multiple aspects, e.g., vulnerability type, publicity, data source, methodology, and sector.
Confidentiality vs. non-confidentiality. Campbell et al. [15] observed a negative market reaction for information security breaches involving unauthorized access to confidential data, and reported no significant reaction to non-confidentiality-related breaches. Through our analysis, we reached a similar conclusion. Specifically, we found that vulnerabilities that negatively affect a vendor's stock have descriptions containing phrases indicating confidentiality breaches, such as "denial of service", "allows remote attacker to read/execute", "allows context-dependent attackers to conduct XML External Entity (XXE) attacks via a crafted PDF", and "allows context-dependent attackers to have unspecified impact via an invalid character". This observation is in line with prior work. Data source and effect (broadening scopes). Goel et al. [14] and Telang and Wattal [13] estimated the impact of vulnerabilities on the stock value of a given vendor by calculating a Cumulative Abnormal Return (CAR) and using a linear regression model. Their results are based on security incidents: while both gather data from the press, Telang and Wattal [13] also used a few incidents from Computer Emergency Response Team (CERT) reports. On the other hand, we consider a wide range of vulnerabilities, regardless of whether they were reported by the press. Our results show various trends and indicate the dynamic and wide spectrum of the effect of vulnerabilities on vendors' returns. Additionally, we discuss the caveats of the methodology used by Telang and Wattal [13] below.
Methodology (addressing caveats of prior work). Prior work has utilized CAR to measure the impact of vulnerabilities [13], which aggregates ARs over different days. However, we design a methodology that captures the impact of vulnerabilities with more precision. In particular, our method performs better for multiple reasons. First, CAR does not effectively capture the impact of a vulnerability, due to the information loss caused by aggregation: 1) CAR would indicate no effect if the upward magnitude of one or more of the days analyzed negates the downward magnitude of the other days; 2) we consider a vulnerability as having had an impact if the stock shows a downward trend on d1, d2, or d3, irrespective of the magnitude; and 3) our results, through a rigorous analysis, are statistically significant. Second, to demonstrate the caveats of CAR and show the advantages of our approach in capturing a better picture of the effect of vulnerabilities on the return, we consider both Samsung and Equifax in Table 3. On one hand, the impact of the vulnerability on Equifax on days 2 and 3 was significant (-14.02 and -24.09 vs. +1.52 on day 1), where CAR would capture the effect. On the other hand, such an effect would not be captured by CAR for Samsung (-0.08 and -0.08 on days 1 and 2 vs. +2.95 on day 3). Our approach considers the effect of vulnerabilities on the return over different days individually, thereby preserving the information rather than losing it to aggregation.
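The Samsung/Equifax contrast can be checked numerically with the per-day values quoted above; `car` and `per_day_impact` are illustrative helpers, not the paper's code:

```python
# Per-day abnormal returns (percent) from Table 3.
equifax = [+1.52, -14.02, -24.09]  # days 1, 2, 3
samsung = [-0.08, -0.08, +2.95]

def car(ar):
    """Cumulative Abnormal Return: aggregates the per-day ARs into one number."""
    return sum(ar)

def per_day_impact(ar):
    """Our criterion: any downward day counts as an impact, regardless of magnitude."""
    return any(a < 0 for a in ar)

# CAR flags Equifax (negative sum) but misses Samsung (positive sum),
# even though Samsung had two downward days that the per-day view preserves.
```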
We also compare the performance of our predictor with the linear regression-based models in the literature. Although Fig. 3 and Fig. 4 visually suggest similar performance in predicting the stock values, the difference in MSE between the two models, 6.42 for ARIMA and 0.59 for NARX, quantitatively shows the improvement of the proposed method over the existing methods in the literature.
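The MSE figures quoted above can be reproduced for any predictor with the standard definition; a minimal sketch:

```python
def mse(actual, predicted):
    """Mean squared prediction error between observed and predicted series."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Toy series: a perfect predictor yields 0.0; errors grow quadratically.
error = mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])  # (0 + 0 + 4) / 3
```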

Sector-based analysis.
Although it is intuitive that the cost of vulnerabilities to vendors is sector-dependent, a major shortcoming of the literature is that it fails to demonstrate this through analysis. By clustering vendors based on the industry they belong to, our results show the likelihood of an effect to be high in the software and consumer products industries, and lower in the device, networking, and conglomerate industries, as shown in Table 2. While Table 3 shows that a vulnerability may or may not have an effect on its vendor's stocks, Table 5 shows that individual vulnerabilities may affect the returns.

Shortcomings.
In this study, we find a significant effect of vulnerabilities on a given day, and we limited ourselves to the second day after the release of a vulnerability in order to minimize the impact of other factors. However, factors other than the vulnerability may affect the stock value, making the results less reliable and highlighting the correlational nature of our study (as opposed to causal). Eliminating the effect of those factors, once known, is an open question.
For impact estimation, this study utilizes two datasets: the NVD and the stock data obtained from Alpha Vantage. As such, this study is limited to vendors that are publicly traded. Moreover, among the publicly traded vendors, we are also limited to those whose data are captured by Alpha Vantage. For example, we notice that ATI/AMD is a publicly traded vendor, but was not captured by the service during the study period. However, we acknowledge that, given our tool, this shortcoming is not difficult to address, although it requires the ingestion of those additional vendors for analysis.
Moreover, apart from the effect on stock, a vendor may sustain other hidden and long-term losses, such as consumer churn (switching to other products or vendors), loss of reputation, and internal losses (such as the man-hours spent developing remedies), which we do not consider in our evaluation and which open various directions for future work. Furthermore, much of the effort depends upon the automated gathering of historical stock data; in this study, Alpha Vantage is used as the source. The lack of a source encompassing stock exchanges worldwide further limits the study.

Afsah Anwar et al. | EAI Endorsed Transactions on Security and Safety, 05 2020 - 06 2020 | Volume 7 | Issue 23 | e1

Vulnerabilities and Disclosure
Our analysis of the vulnerabilities shows that while a vulnerability may or may not have an impact on the stock, a vulnerability reported by the press is highly likely to impact the stock return. The diverse results for the vulnerabilities collected from NVD are explained by the severity of the vulnerabilities, where 1) the press may report on highly critical vulnerabilities that are more likely to result in loss, or 2) the vulnerabilities reported in the press may create a negative perception of the vendor, leading to a loss in its stock value. This, as a result, has led many vendors to not disclose vulnerabilities in order to cope with bad publicity. For example, Microsoft did not disclose an attack on its bug-tracking system in 2013 [48], demonstrating such behavior in vendors when dealing with vulnerabilities [49]. Recent reports also indicate similar behavior by Yahoo when its users' online accounts were compromised, and by Uber when its employees' and users' personal information was leaked. More broadly, a recent survey of 343 security professionals worldwide indicates that the management of 20% of the respondents considered cyber-security issues a low priority, alluding to the possibility of not disclosing vulnerabilities even when they affect their systems [50].

Conclusion and Future Work
We perform an empirical analysis of vulnerabilities from NVD and look at their effect on the vendors' returns. Our results show that the effect is industry-specific and depends on the severity of the reported vulnerabilities. We also compare the results with the vulnerabilities found in the popular press: while both affect the vendor's stock, vulnerabilities reported in the media have a much more adverse effect. En route, we also design a model to predict the stock return with high accuracy. Our work is limited in that we do not consider other external factors affecting the stock, or internal factors affecting long-term user behavior and driving the cost of vulnerabilities. Exploring those factors, along with regional differences and the cascading effect of vulnerabilities, will be our future work.