Univariate Interpolation-based Modeling of Power and Performance

Performance and power scale non-linearly with device utilization, making characterization and prediction of energy efficiency at a given load level a challenging task. A common approach to address this problem is the creation of power or performance state tables for a pre-measured subset of all possible system states. Approaches to determine performance and power for a state not included in the measured subset use simple interpolation, such as nearest neighbor interpolation, or define state-switching rules. This leads to a loss in accuracy, as unmeasured system states are not considered. In this paper, we compare different interpolation functions and automatically configure and select functions for a given domain or measurement set. We evaluate our approach by comparing interpolation of measurement data subsets against power and performance measurements on a commodity server. We show that for non-extrapolating models, interpolation is significantly more accurate than regression, with our automatically configured interpolation function improving modeling accuracy by up to 43.6%.


INTRODUCTION
Computational devices have to run at a great range of device utilization levels, with power and performance scaling non-linearly over the different load levels, as power-saving mechanisms, such as dynamic voltage and frequency scaling (DVFS), are being used. A common and highly accurate method for the characterization of the power and performance of a workload on a target system at a given load level is the measurement of said characteristics and the storing of those measurements in a table for re-use during management decisions. The stored measurement results contain power and performance values for a subset of possible system states. Determining the power and performance of the system at other states remains challenging. Currently, several approaches to estimate power consumption at these states exist: Some tools, such as [10], only consider the existing pre-defined states and determine the current state either by nearest neighbor interpolation or through other rule-based mechanisms. Another approach is the training of models based on measured data. Models range from simple models, such as the linear power model [1] and variations thereof [3], to more complex regression models, which also take additional system properties into account [6]. In comparison to approximation, interpolation increases prediction accuracy, as it does not sacrifice or approximate any of the pre-measured results. A great number of different interpolation methods exist. Depending on the system and the power or performance metric under observation, a different interpolation method may be optimal. In addition, some interpolation methods can be configured with varying degrees of freedom.
This paper presents a library for automated selection of interpolation and configuration strategies for performance and power characterization, with the goal of minimizing prediction errors for unmeasured performance and power. The major contributions of this paper are as follows: (1) We present an approach for automated selection and configuration of an interpolation strategy for a given set of performance or power measurements; (2) We propose a composition of piece-wise polynomial interpolators of varying degrees for the interpolation of a system's power over utilization function; (3) We demonstrate that for closed and bounded inputs, interpolation provides superior prediction accuracy on the system of measurement in comparison to approximation techniques, such as regression.
We evaluate our approach based on power and performance measurements using ten of the workloads of the SPEC SERT [5], measuring at 100 load levels per workload. Prediction accuracy is evaluated based on the methods' ability to predict power and performance for all load levels based on a smaller subset. We show that for bounded problem spaces, interpolation features superior accuracy compared to regression. Our automated interpolation configuration and selection improves modeling accuracy by 43.607% if additional reference data is available and by 31.36% if it is not.

INTERPOLATION
Scattered data interpolation [4] is the reconstruction of a continuous function f(x) from n different sample points {(x_1, f_1), (x_2, f_2), ..., (x_n, f_n)}. In this paper, we consider univariate functions, where f(x) is our power or performance metric and the input x the corresponding system metric (usually utilization). We consider the following interpolation functions:

• Nearest Neighbor Interpolation: f(x) = f_i, with x_i ∈ {x_1, ..., x_n} being the nearest neighbor to x, meaning that ∀ x_j ∈ {x_1, ..., x_n}: |x − x_i| ≤ |x − x_j|.
• Linear Interpolation: Given the two nearest neighbors of x, x_i and x_{i+1}, with x_i ≤ x and x_{i+1} > x: f(x) = f_i + (f_{i+1} − f_i) · (x − x_i) / (x_{i+1} − x_i).
• Shepard Interpolation: Inverse distance weighting over all sample points, f(x) = (Σ_i w_i(x) · f_i) / (Σ_i w_i(x)), with weights w_i(x) = |x − x_i|^(−p). Parameter p is freely configurable and usually selected based on experience.
• Polynomial Interpolation: f(x) = a_n x^n + a_{n−1} x^{n−1} + ... + a_0 x^0, with the coefficients a_i being the solution to a system of equations that guarantees that the polynomial of degree n passes through all n + 1 data points. To avoid oscillation (Runge's phenomenon [8]), we also split the set into subsets of size m and interpolate these using polynomial functions of degree m − 1.
• Spline Interpolation: A type of piece-wise polynomial interpolation that guarantees that the overall function remains continuous in all interpolated data points [2].
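As a minimal sketch (using NumPy/SciPy rather than the paper's library, and hypothetical power-over-utilization sample points), the interpolation functions above can be instantiated as follows:

```python
import numpy as np
from scipy.interpolate import CubicSpline, interp1d

# Hypothetical samples: utilization in [0, 1], power in W.
x = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
y = np.array([50.0, 120.0, 160.0, 185.0, 200.0])

nearest = interp1d(x, y, kind="nearest")   # nearest neighbor interpolation
linear = interp1d(x, y, kind="linear")     # piece-wise linear interpolation

def shepard(x0, p=2.0):
    """Shepard interpolation: inverse distance weighting with power parameter p."""
    d = np.abs(x0 - x)
    if np.any(d == 0):             # x0 coincides with a sample point
        return float(y[d == 0][0])
    w = d ** -p
    return float(np.sum(w * y) / np.sum(w))

# One polynomial of maximum degree n passing through all n + 1 points.
poly = np.polynomial.Polynomial.fit(x, y, deg=len(x) - 1)

spline = CubicSpline(x, y)                 # piece-wise cubic spline

print("linear(0.6) =", float(linear(0.6)), "spline(0.6) =", float(spline(0.6)))
```

All of these reproduce the sample points exactly; they differ only in how they fill the gaps between them.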

DETERMINING ACCURACY
During the automated selection and configuration process, we determine the accuracy of an interpolation function using one of two methods: interpolation against a reference dataset or cross-validation. The latter is the more common option, as interpolation accuracy improves with additional interpolated data; as a result, all available data is included in the interpolated dataset, leaving no additional data for referencing. However, a separate reference dataset is most useful when determining the optimal interpolation method for a given problem domain.
If a reference dataset R containing the tuples (x_i, y_i) is available, we calculate a set of absolute errors E = {e_1, ..., e_n} for interpolation function f as in Eq. 1:

e_i = |f(x_i) − y_i|, for all (x_i, y_i) ∈ R. (1)

If no reference dataset exists, we calculate the set of absolute errors E via cross-validation on the interpolated dataset I containing the tuples (x_i, y_i), with |I| = n. We create a set of cross-validation sets V_i as displayed in Eq. 2:

V_i = I \ {(x_i, y_i)}, for i = 1, ..., n. (2)

With f_i being the interpolation function constructed using V_i, we calculate the cross-validation errors as in Eq. 3:

e_i = |f_i(x_i) − y_i|, for i = 1, ..., n. (3)

Our implementation allows the metric for calculation of the final aggregate error of the error set E to be passed using a functional expression. In this paper, we use the arithmetic mean and median.
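Both error computations can be sketched as follows, assuming a leave-one-out scheme for the cross-validation sets, hypothetical sample data, and a simple linear interpolator standing in for an arbitrary interpolation-function constructor:

```python
import numpy as np
from scipy.interpolate import interp1d

# Interpolated dataset I: hypothetical (utilization, power) tuples.
I = [(0.0, 50.0), (0.25, 120.0), (0.5, 160.0), (0.75, 185.0), (1.0, 200.0)]

def build(points):
    """Stand-in constructor: linear interpolation over the given points."""
    xs, ys = zip(*points)
    return interp1d(xs, ys, kind="linear", fill_value="extrapolate")

def reference_errors(f, R):
    """Eq. 1: absolute errors of interpolation function f against reference set R."""
    return [abs(float(f(xi)) - yi) for xi, yi in R]

def cross_validation_errors(I):
    """Eqs. 2 and 3: leave one point out, rebuild f_i, measure the error at it."""
    errors = []
    for i, (xi, yi) in enumerate(I):
        Vi = I[:i] + I[i + 1:]                 # Eq. 2: V_i = I \ {(x_i, y_i)}
        fi = build(Vi)                          # f_i constructed from V_i
        errors.append(abs(float(fi(xi)) - yi))  # Eq. 3
    return errors

E = cross_validation_errors(I)
print("mean:", np.mean(E), "median:", np.median(E))  # aggregate error metrics
```

The aggregate metric (mean or median here) is the value used when comparing candidate interpolation functions.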

SELECTION AND CONFIGURATION
We allow selection of the best interpolation function for the problem's domain using an independent reference dataset containing a larger set of data points, which describes a problem similar to the dataset to be interpolated. For example, both datasets describe power per load level measurements, yet they were measured for different workloads on different machines. In such a case, we create a subset from the reference dataset. This subset is of the same size as the set that is to be interpolated and contains the data points with the closest possible input values (x-axis values) to the input values of the target dataset. We then select the best configuration and interpolation method for this subset by comparing the aggregate modeling errors of the potential interpolation methods. The function with the minimum aggregate error is selected as the final interpolation function. For functions with a configurable parameter (degree of freedom), this parameter must be auto-configured first. Finally, we transfer the selected method and configuration to the actual set to be interpolated.
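The subset-matching step can be sketched as follows; the function name and data are illustrative, not the library's API:

```python
import numpy as np

def matching_subset(reference_x, target_x):
    """For each input value of the target set, pick the reference data point
    with the closest x value, yielding a reference subset of the same size."""
    reference_x = np.asarray(reference_x)
    idx = {int(np.argmin(np.abs(reference_x - x))) for x in target_x}
    return sorted(idx)  # indices into the reference dataset

# Hypothetical example: reference measured at 101 load levels (0%..100% in
# 1% steps), target dataset measured at 25% steps.
ref_levels = np.linspace(0.0, 1.0, 101)
target_levels = [0.0, 0.25, 0.5, 0.75, 1.0]
subset = matching_subset(ref_levels, target_levels)
print(subset)  # indices of the closest reference load levels
```

The candidate interpolation methods are then scored on this subset against the full reference dataset, and the winner is transferred to the target set.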
In some cases, reference data is not available and selection of a single pre-configured interpolation method is not possible, either due to a lack of sufficient domain knowledge or because of the problem domain's nature. In this case, we select the best interpolation function for a given dataset by calculating the cross-validation error. We compute different cross-validation datasets, each with one data point removed. At least one data point must be removed for cross-validation, as the self-prediction error of an interpolation function is always 0; consequently, cross-validation using the full dataset is not possible. We evaluate the interpolation method's ability to predict the missing data point for each of the cross-validation datasets. The function with the minimum aggregate error over all cross-validation datasets is selected as the final interpolation function.
Among the interpolation functions used in this paper, two function types feature a configurable degree of freedom. We select the final configuration parameter using a hill-climbing approach, as expected values are well known and as the parameters in all of our cases have a specified minimum value. To apply hill-climbing, each parameter must have an initial parameter instance p_0 and a successor function h so that p_i = h(p_{i−1}). With f(p_i) being the parametrized interpolation function and e(f(p_i)) being its error metric, we iterate over the parameters p_i in ascending order (using h) until e(f(p_{i+1})) ≥ e(f(p_i)), i.e., until the error stops improving.
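A minimal sketch of this hill-climbing loop, with a toy error function standing in for the cross-validation error of a parametrized interpolation function:

```python
def hill_climb(p0, h, error, max_steps=100):
    """Advance parameter p via successor function h until the error metric
    stops improving; return the last improving parameter and its error."""
    p, e = p0, error(p0)
    for _ in range(max_steps):
        p_next = h(p)
        e_next = error(p_next)
        if e_next >= e:      # e(f(p_{i+1})) >= e(f(p_i)): stop climbing
            return p, e
        p, e = p_next, e_next
    return p, e

# Toy error surface with its minimum at p = 3; h simply increments p.
best_p, best_e = hill_climb(1, lambda p: p + 1, lambda p: (p - 3) ** 2)
print(best_p, best_e)
```

Because the search stops at the first non-improving step, this finds the global optimum only for unimodal error surfaces, which matches the paper's assumption that expected parameter values are well known.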
To improve interpolation accuracy for performance and power measurements, we introduce a new approach to the parametrization of piece-wise polynomials. It is designed to minimize the interpolation error due to state changes caused by device power management. These state changes cause discontinuous behavior in a power or performance function. Consequently, it pays to introduce breaks at these points when interpolating polynomials. Breakpoints are detected at the data points featuring the greatest difference between their range value and their successor's range value. Meaning that, given a set B of breakpoint indices with |B| = n, the following has to hold: ∀ i ∈ B, ∀ j ∉ B: |y_{i+1} − y_i| ≥ |y_{j+1} − y_j|. We use these breakpoints for piece-wise polynomial interpolation by interpolating polynomials over the subsets defined within breakpoint boundaries. The number of breakpoints remains a freely configurable parameter and can be determined using our hill-climbing approach.
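Breakpoint detection can be sketched as follows, assuming a hypothetical power curve with a single power-state jump:

```python
import numpy as np

def find_breakpoints(y, n):
    """Indices of the n data points with the largest absolute difference to
    their successor's value (candidate power-state switching points)."""
    diffs = np.abs(np.diff(y))                        # |y[i+1] - y[i]|
    return sorted(int(i) for i in np.argsort(diffs)[-n:])

# Hypothetical power curve with a jump between indices 3 and 4,
# e.g., a device power-management state change.
power = [50.0, 52.0, 54.0, 56.0, 80.0, 82.0, 84.0, 86.0]
bp = find_breakpoints(power, 1)
print(bp)  # split into [0..3] and [4..7] for piece-wise polynomials
```

Each resulting segment is then interpolated by its own polynomial, so the discontinuity never falls inside a single polynomial's support.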

EVALUATION
We evaluate the accuracy of our interpolation methods based on measurements using the SPEC Server Efficiency Rating Tool (SERT) [5]. We use all of SERT's mini-workloads (called worklets) except for the memory Capacity worklet, as it does not scale with load levels, and the XMLvalidate worklet, which did not scale correctly for fine-granular target load levels. The worklets used are: six different CPU worklets [11], two storage worklets, the memory Flood worklet, and the hybrid SSJ worklet. We exercise each of the worklets at 100 different load levels, with a separate idle power measurement serving as a 101st measurement. For each of these levels, throughput (in s⁻¹) and power consumption (in W) are measured. We select both evenly distributed as well as scattered subsets of measurements from the original 101 results. All models are evaluated based on their ability to accurately reconstruct the entire original measurement for the given workload using no additional data. We compute the relative absolute error (|p_model − p_reference| / p_reference) for each data point in the reference measurement, using the mean as the overall error metric. A smaller relative error corresponds to a more accurate model. We compare the accuracy of our interpolation functions with three common power modeling approaches:

• Linear Power Model: A common model in the literature [7]. It calculates power consumption at a target load level u ∈ [0, 1] as: p(u) = p_idle + (p_max − p_idle) · u.
• Linear Power Model (Exponential Correction): This power model, introduced in [3], modifies the linear power model using an exponential correction factor r, accounting for the curvature in power per utilization functions: p(u) = p_idle + (p_max − p_idle)(2u − u^r).
• Polynomial Fitting using Regression: We create polynomials of varying degrees to fit the measured results. The coefficients a_0, a_1, ..., a_n of the polynomial function p(u) = a_n u^n + ... + a_1 u + a_0 are fitted using ordinary least squares (OLS) regression.
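The three baseline models and the error metric can be sketched as follows; the data points are hypothetical, and NumPy's least-squares polynomial fit stands in for the OLS regression setup:

```python
import numpy as np

def linear_model(u, p_idle, p_max):
    """Linear power model: p(u) = p_idle + (p_max - p_idle) * u."""
    return p_idle + (p_max - p_idle) * u

def exp_corrected_model(u, p_idle, p_max, r):
    """Exponential correction [3]: p(u) = p_idle + (p_max - p_idle)(2u - u^r)."""
    return p_idle + (p_max - p_idle) * (2 * u - u ** r)

def poly_regression(u, p, degree):
    """Fit p(u) = a_n u^n + ... + a_0 by ordinary least squares."""
    return np.polynomial.Polynomial.fit(u, p, deg=degree)

def mean_relative_error(p_model, p_reference):
    """Mean of |p_model - p_reference| / p_reference over all data points."""
    p_model, p_reference = np.asarray(p_model), np.asarray(p_reference)
    return float(np.mean(np.abs(p_model - p_reference) / p_reference))

# Hypothetical power measurements at five load levels.
u = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
p_ref = np.array([50.0, 120.0, 160.0, 185.0, 200.0])
p_lin = linear_model(u, p_ref[0], p_ref[-1])
print("linear model mean relative error:", mean_relative_error(p_lin, p_ref))
```

On a concave power curve like this one, the plain linear model underestimates mid-range power, which is exactly the curvature the exponential correction factor r is meant to absorb.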

Comparison of Interpolation Methods
A comparison of the mean accuracy of the interpolation and modeling methods for the power per load level function of the hybrid SSJ workload is shown in Table 1. The displayed error metric is the mean of the relative absolute differences between each data point in the reference measurement and the corresponding model prediction. The table shows the error for interpolation methods and the power models. Interpolation with multiple configuration options (such as Shepard weights) is marked with the respective configuration. The best interpolation method changes depending on input dataset size. Compared to the simple power models, interpolation functions are highly accurate, with modeling errors reliably less than 1% for all dataset sizes; the two exceptions are nearest neighbor and Shepard interpolation. Polynomial interpolation provides the greatest accuracy; the optimal configuration depends on the size of the input dataset.
Regression is not as accurate as any of the piece-wise polynomial interpolation methods (equi-distant splits, dynamic splits, or splines) in cases of equi-distant data. For our scattered dataset, regression is only slightly less accurate than cubic spline interpolation.
It is also notable that dynamically split polynomial interpolation and interpolation using one polynomial of maximum degree feature identical accuracy for this workload. This effect is the result of the largest difference in power consumption for SSJ taking place right before full utilization. Consequently, the first breakpoint is set at full utilization, resulting in no change compared to polynomial interpolation without breakpoints. However, this observation is specific to SSJ and does not repeat for other workloads. The power of the CPU-bound LU workload, e.g., scales differently with increased load: LU exhibits a sharper increase in power consumption beginning at 60% load. This results in a visible impact on interpolation accuracy (see Table 2), as interpolation methods must correctly capture this sudden increase in power draw.

Interpolation using Reference Dataset
Next, we evaluate the effectiveness of automatically selecting a good interpolation function using an independent reference dataset. We choose the SSJ measurement set as our reference dataset and then select the interpolation function for any given subset based on the optimal interpolation function of SSJ for that subset.
The accuracy of interpolation based on automated interpolation selection and configuration using SSJ as the reference dataset is displayed in Table 3 (mean modeling errors of independent reference-based interpolation for representative workloads). Compared to using regression, interpolation using an independent reference dataset features an improvement of 43.607% (mean over the relative improvements) in accuracy. Compared to always choosing the single most commonly best interpolation method (dynamically split polynomial interpolation with one breakpoint), automated selection and configuration still improves model accuracy by 20.025% (mean over the relative improvements).

Interpolation using Cross-Validation
Although cross-validation-based configuration and selection lacks the additional data of its reference-based counterpart, it is still fairly accurate. Specifically, it is 31.36% more accurate than linear regression.
The major drawback of cross-validation-based configuration and selection is the difference in scale between the cross-validation problem and the final dataset to be interpolated. For example, when removing a data point from a set measured at 10% utilization intervals, the interpolation function has to bridge a 20% utilization gap. Its ability to do so is then taken to indicate its accuracy when interpolating equidistant data points at 10% intervals.

CONCLUSIONS
This paper introduces an approach to automated configuration and selection for interpolation of power and performance measurements. We show that for closed and bounded problems, interpolation is far more accurate than similar black-box modeling methods, such as regression. Compared to linear regression, we are able to improve accuracy of a power per utilization prediction by 43.607% if additional reference data is available and by 31.36% if it is not. Our automated approach can also improve prediction accuracy by up to 20.025%, compared to always selecting the usually best interpolation function.
The results in this paper enable more accurate predictions of power and performance based on pre-measured results. This, in turn, allows for better decision making in power and performance management. The interpolation approaches introduced in this paper can also be used for the generation of additional data for under-fitted models.
For future work, we plan to evaluate the accuracy of our approach in problem domains other than power and performance modeling, as it is generic enough that it should be possible to apply it in any space that can be modeled as an interpolation problem.