Modelling state spaces and discrete control using MILP: computational cost considerations for demand response

INTRODUCTION: Demand response (DR) has been proposed as a mechanism to induce electricity cost reductions and is typically assumed to require the adoption of time-differentiated electricity prices. Making the most of these requires using automated energy management systems to produce optimised DR plans. Mixed-integer linear programming (MILP) has been used for this purpose, including by modelling dynamic systems (DS). OBJECTIVES: In this paper, we compare the computational performance of MILP approaches for modelling state spaces and multi-level discrete control (MLDC) in DR problems involving DSs. METHODS: A state-of-the-art MILP solver was used to compute solutions and compare approaches. RESULTS: Modelling state spaces using decision variables proved to be the most efficient option in over 80% of cases. In turn, the new MLDC approaches outperformed the standard one in about 60% of cases despite performing in the same range. CONCLUSION: We conclude that using state variables is generally the better option and that all MLDC variants perform similarly.


Context and motivation
Demand response (DR) programs have been advanced as a way to ensure the existing power system infrastructure is used rationally and efficiently, and also to integrate increasing shares of variable renewable energy resources into power grids [1,2]. This potential implies the dissemination of intelligible information about the power system for a given period, most often in the form of time-differentiated electricity prices and ad-hoc demand reduction requests, so as to indirectly influence end-users' aggregate electricity consumption. Once in possession of this information, end-protracted interactions with end-users. One common approach is to model DR problems using mixed-integer linear programming (MILP) and produce optimised solutions via MILP solvers. A wide range of DR problems have been modelled using MILP including those involving dynamic systems (DS), such as domestic hot water (DHW) vessels and air-conditioned spaces, whose DR potential stems from their ability to defer loads to an extent that allows for significant flexibility and possibly lower costs. It is thus important to model DSs using MILP as accurately and efficiently as possible. This paper describes a study comparing the computational performance of different MILP formulations of the same DR problems involving dynamic systems.

Literature review
The MILP models for dynamic systems found in the literature primarily cover thermal loads such as hot water vessels, airconditioned spaces, radiators and refrigerated compartments, and are mostly physical phenomena-inspired as opposed to physics-agnostic models. Most commonly, these dynamic systems are modelled as single-node temperature models that can be derived from first order ordinary linear differential equations, and often rely on continuous decision variables to store state (e.g., temperature) values [3,[6][7][8][9][10][11][12]. Onedimensional multi-node models can also be found in the literature, particularly to model thermal stratification in storage vessels [13,14] and heat transfer across building envelopes [15,16]. In a few cases, thermal loads have also been modelled as time-shiftable or interruptible loads [17,18].
The models reviewed rely on combinations of temperature, power, and control constraints. Temperature constraints have generally been used to ensure safety and comfort standards [3,8,12], either constant or changing over time due to occupancy or demand, rather than the model's own accuracy (e.g., due to phase changes). Power constraints have been used to comply with individual equipment's power ratings [8,9,11,19] or those defined by fuses or the utility grid [3,17,20]. Control constraints have been used to reproduce full-and part-load operation using multi-level discrete (MLD) control, on-off control with hysteresis, minimum operation and inactivity periods [20,21], and to ensure some modes are mutually exclusive [8,22]. Accommodating some of these features may be a necessity but these also increase the computational burden, particularly for short time steps, due to the additional binary variables needed. Ultimately, models of a strongly-combinatorial nature (due to a high number of binary variables) may prove impossible to solve within an acceptable time-frame (i.e., close to real-time) using affordable and low power computational resources [6,8,12,[18][19][20][21][22]. In those cases, the attractiveness of MILP models for in-house DR planning is limited. It is, therefore, desirable to find ways of improving their performance.
Multi-level discrete control can require a large number of binary variables to reproduce in MILP models, depending on the number of levels. The conventional approach requires one binary variable per time interval and level [18,21]. Recently, two alternative approaches have been proposed that enable reducing the number of binary variables needed [23]. These are suitable for those cases in which all levels can be reproduced using combinations of a subset of those levels. This encompasses typical situations where reproducing fulland part-load operation of systems is needed, namely if the levels are between nought and one at regular steps. The two new approaches resulted in less computational effort in most comparisons undertaken (>60%). These consisted of DR problems and focused only on one set of levels [23].
The same study also evaluated the effect of DS state variables (SV) on the computational performance of MILP models. It found that removing SVs concerning a single DS through finite linear operations increased the computational effort in most cases (>90%) and significantly (~1,000%, on average) [23]. This suggests state variables should be used in MILP models to keep the computational effort down.

Objectives and approach
The study described here set out to complement previous efforts on the computational performance of alternative MILP approaches to model state spaces and MLD control in DR problems. We did so by considering an additional approach for MLD control, a revised methodology, and additional problem diversity in relation to [23]. The additional diversity concerns aspects that directly influence the modelling approaches as well as general problem data. The problems selected for testing the approaches were day-ahead DR planning exercises involving DSs. Two sets of problems were considered: one for evaluating the effect of state variables and another to compare MLD control modelling approaches. Each unique problem in a given set was reproduced using each of the respective modelling approaches. A state-of-theart solver was used to compute optimised solutions to these problems. The computation times required to obtain these solutions were then compared and analysed.
The endeavours presented in this paper provide a fuller understanding of the trade-offs between the different approaches investigated. These contributions result from:  Comparing four MLD control modelling approaches, including one based on the standard one using special ordered sets of type 1 (SOS1);  Comparing MLD control modelling approaches using four distinct sets of levels and not just one;  Evaluating the effect of modelling state spaces with or without continuous decision variables in multiple dynamic systems instead of just one;  Considering more base load, DHW consumption and indoor heat gain profiles (64 vs. 27 in [23]). This effort is presented over 5 additional sections: Section 2 outlines the type of DR problem we have focused on to assess the different modelling approaches, and describes the mathematical models used to reproduce those problems in accordance with each approach; Section 3 details the methodology employed including the problems selected and optimisation settings; Section 4 presents the optimisation results and Section 5 discusses their implications; finally, Section 6 summarises our main findings and future work. Modelling state spaces and discrete control using MILP: computational cost considerations for demand response 3

Mathematical models
The different modelling approaches compared were applied to a set of single-household DR problems formulated using MILP. The approaches developed differ among themselves only with regard to the subject of our inquiry: the ways to model state spaces and multi-level discrete (MLD) control. They are presented in the next sections by highlighting their differences in relation to the standard (STD) formulation.

Problem description
The DR problem addressed in this study is centred around a grid-connected single-family dwelling. The household's electricity consumption during a 36-hour period is charged in accordance with time-differentiated energy prices and a fixed demand rate dependent on the maximum power needs for the entire period under analysis. The automated home energy management system is tasked with preparing optimised plans and managing the operation of domestic appliances and energy resources to ensure the goals are met. This requires adjusting the electricity consumption profile, which can be partially influenced through a set of controllable appliances and systems. A small behind-the-meter electricity generation system may also be installed in the premises. If so, energy surpluses are fed to the grid and financially compensated.

Standard formulation
The objective for this model is to maximise the cash flow (revenue minus expenditure) during the planning period. This is achieved by simultaneously minimising energy and power costs and maximising revenue while meeting safety and comfort standards. Revenue is obtained by selling locallygenerated excess electricity to the utility whereas costs are due to a demand rate and static and dynamic electricity consumption components, with the latter being determined by |S TSAC | time shiftable appliance cycles (TSACs) and |S DS | dynamic systems (DSs). S TSAC is the set identifying each TSAC and S DS is the set identifying each DS. The resulting electricity consumption limits the range of acceptable demand rate options, which simultaneously imply limits for electricity supply and demand. Penalties are induced by DSs only, namely by violations of states' high (H) and low (L) thresholds or by states' increases (H) and decreases (L) relative to the initial conditions by the end of the planning period. S PEN,∆t,H and S PEN,∆t,L are the sets identifying the DSs subject to penalties for high and low state threshold violations, respectively. S PEN,PP,H and S PEN,PP,L are the sets identifying the DSs subject to penalties for state increases and decreases relative to the initial conditions, respectively.
The objective function is thus given by (1), where: Pk IMP and Pk EXP stand for the mean power drawn from and fed to the utility grid, respectively, during time interval k of the planning period (out of |S K | ∆t-long intervals, with S K being the set of planning period time intervals); yp,p+1 U indicates whether the power level Pp U corresponding to the demand rate level p (out of P options) was exceeded during the planning period; ∆xs PP,L and ∆xs PP,H represent the state decrease and increase, if any, relative to the initial conditions for the dynamic system s; and, ∆xs,k ∆t,H and ∆xs,k ∆t,L respectively represent the violations of high and low state thresholds set for system s and time interval k, if any. With regard to the objective function coefficients: pk IMP and pk EXP are respectively the prices charged and offered to the household for electricity consumed from and delivered to the grid during the time interval k; cp,p+1 U is the additional cost of exceeding the power level p; cs PP,H and cs PP,L are the penalties for increasing and decreasing the state of system s by one unit by the end of the planning period; and, cs,k ∆t,H and cs,k ∆t,L are the penalties prompted by violations of the high and low state thresholds by one unit for system s during time interval k. As such, the objective function includes penalties and a cash flow component with costs and revenue.
Revenue is obtained by injecting electricity into the utility grid, which can only take place if there is a local surplus (electricity production exceeds the local demand) and no electricity is simultaneously needed from the grid. The electricity production profile is predetermined and is used in conjunction with the static electricity consumption to arrive at Pk STATIC , which is the static net electricity consumption (consumption minus production) during time interval k. The option to import or export electricity also depends on the dynamic consumption and production components, which can ideally be influenced by both types of systems, though TSACs are not expected to generate electricity. S IMP and S EXP are the sets identifying the systems capable of importing and exporting electricity dynamically; Ps,k EXP and Ps,k IMP represent the mean power supply and demand from system s during time interval k; S TSAC ∩ S EXP = ∅.
Costs are incurred due to the electrical energy imported and due to the power needs for exporting and importing electrical energy during the planning period, (2)- (17). Increasing power needs translate into higher demand rates, as the demand rate selected (p) has to correspond to power levels (Pp U and Pp U λp U ) that are higher than or equal to the maximum mean power needs (P IMP,MAX and P EXP,MAX ) during the planning period, and the demand rate is a monotonicallyincreasing function of the power needs. λp U is the factor relating the import and export power levels permitted by the demand rate p. The electrical energy consumption costs arise due to the various loads presented by the TSACs and DSs.
TSACs are modelled as non-interruptible cycles defined by ordered sequences of power demand stages whose scheduling must respect predefined comfort periods [21,24]. In the model, (18)-(24), Ts K,i and Ts K,f represent the initial and final comfort period time slots for system s, respectively. In turn, fs,r is the power demand stage r for system s, ds is the number of stages for system s, and ws,r,k indicates whether the stage r for system s is to be scheduled for time interval k.
DSs are modelled using continuous-time solutions to the respective differential equations. In particular, the solution for a given time interval is used as the initial condition for the next time interval. The systems are also subject to the influence of controllable input signals, which are modelled as a set of modes (Ss M is the set identifying the modes for the DS s). These are primarily used to implement different controls and can be associated with electricity imports or exports to define their contribution to a given DS's power supply and demand: Ss M,IMP and Ss M,EXP are mutually exclusive sets identifying the modes that contribute to electrical energy imports and exports, respectively; Ps,m,k is the mean power associated with system s, mode m and time interval k. In the model, (25)-(47), xs is the state vector for the DS s, comprising one state variable for each time interval (xs,k, corresponding to the state by the end of the time interval k; xs,0 is the initial condition). The state equations, (25), are presented using matrices and a vector to encode the solutions to the differential equations for each time interval: bs DS is the right-hand side coefficient vector for system s; As DS concerns the role of state variables for system s; Bs,m DS concerns the role of input signal variables for system s and mode m. In turn, us,m is the vector corresponding to the input signal mode m for the DS s, which is made up of one nonnegative variable (us,m,k) per time interval (k). These correspond to the zeroorder hold input signal component variables for each time interval, which define how the systems can be controlled to reach feasible solutions and the most desirable outcome.
Feasibility requires that state variables be kept within upper (xs,k MAX ) and lower (xs,k MIN ) bounds, (32)- (33). In turn, the most desirable outcome implies minimal penalties, (34)-(41). Penalties may be incurred when the state variable for a given DS s and time interval k exceeds xs,k ∆t,H or falls below xs,k ∆t,L . Similarly, should the final state of the planning period (xs,k, with k=|S K |) exceed or fall below the initial conditions (xs,0), penalties may also be incurred. Due to the existence of upper and lower state bounds, upper and lower bounds can also be defined for the state violation and difference variables, (36)-(37) and (40)-(41). Moreover, the ability to obtain feasible solutions and the most desirable outcomes depends on the type of controls used, which we describe next.
The dynamic systems considered are controlled using input signals that must abide by a set of control constraints, (42)-(47). Three types of control have been considered for DSs: on/off control, continuous control and multi-level discrete control. On/off control is self-explanatory and uses one binary variable per time interval. Continuous control uses one binary variable and one continuous variable per time interval to allow for a variable input signal above a fixed minimum, when on. Finally, multi-level discrete control is meant to reproduce multiple operation setpoints, namely to reproduce full-and part-load operation. The standard formulation for this type of control uses one binary variable per time interval and level. S DS,OO is the set identifying the systems controlled by way of on/off control, S DS,MLD is the set identifying the systems using multi-level discrete control and S DS,CC is the set identifying the systems using continuous control, all of which are mutually-exclusive subsets of S DS .

Multi-level discrete control
The standard formulation requires a binary variable per system, level and time interval in addition to a constraint per system and time interval to implement multi-level discrete control, (43)-(44). Depending on the number of levels one seeks to reproduce, this formulation may prove to be strongly combinatorial. It is therefore desirable to reduce the number of binary variables required or, more generally, the time needed to obtain optimised solutions. The three alternative approaches considered are described next.

Special ordered sets of type 1
One of the approaches considered relied on the use of special ordered sets of type 1 (SOS1) [25,26]. These sets provide additional information to solvers to enable more efficient tree searches with branch and bound algorithms, particularly for multiple choice programming problems. In this case, the standard formulation was kept and the input variables (us,m,k) for the same system (s) and interval (k) were defined as SOS1.

Non-exclusive levels
The two other approaches considered rely on the possibility of reproducing the outcome enabled by a given level using combinations of levels rather than the original one. If this possibility exists, then under certain additional conditions, it is possible to reduce the number of binary variables without changing the problem. The reduction in the number of binary variables needed depends on whether a given set of levels can be reproduced using only a subset of those levels. If this is the case and the mutual-exclusiveness constraint (44) is removed, the variables associated with the redundant levels can be presolved in accordance with (48), where Ss M,RED ⊂ Ss M is the set of redundant levels in a system s ∊ S DS,MLD .
In removing the mutual-exclusiveness constraint, one may be allowing for previously inexistent levels to be considered. If that is the case, then additional constraints are necessary. In general terms, this requires ruling out all the combinations that can lead to those levels, which is possible using linearised 0-1 polynomial constraints, (49)-(50), where Ss,v M,1 and Ss,v M,0 are the sets of levels that equal one and nought, respectively, in the combination v out of a total of Vs combinations to exclude [27]. Should this require ruling out too many combinations, it may make sense to allow some levels to remain mutually exclusive among themselves and in relation to those that can be combined as in (51) (51) Preliminary results show this general approach can improve performances in relation to the standard one but not consistently or significantly [23]. These results refer to cases in which the number of binary variables could be reduced but not the number of constraints. It may thus be that this approach is only interesting if a substantial reduction in the number of binary variables is possible without requiring (too many) additional constraints. This can be ensured if all levels are equally spaced (constant step) between nought and one (or 100%), since considerable reductions of binary variables are possible depending on the number of levels (see Appendix A) and intermediate levels are implicitly excluded, limiting the number of additional constraints needed. These assumptions arguably encompass the majority of situations in which full-and part-load operation of systems may be needed in practice. With those conditions and depending on the choice of free levels, the constraints required, if any, may only have to prevent the maximum value (1 or 100%, if there are no mutually-exclusive levels at all) from being exceeded. This can be achieved with (49)-(50), which is the general approach (NEX-G), or with (52), which is specific to certain cases as described previously (NEX-S).

Formulations without state variables
State variables are not strictly necessary in the model presented, as these can be defined by functions of the input variables as given by (53). By using (53), the part of the objective function and any constraints depending on those variables can be made independent of state variables. Without loss of generality, consider a generic DS s and Gs equalities that depend on that system's state variables, (54). Those equalities can be shown to be independent of state variables as in (55)

Methodology
The study focused on the computational performance of alternative modelling approaches for MILP-formulated DR problems. Two groups of problems were prepared: one to compare different MLD control modelling approaches, and another to evaluate the effect of state variables. In each group, a given problem was recreated using each of the respective alternative modelling approaches. In the first group, the same problem was reproduced once for each of the multi-level discrete control approach (STD, SOS1, NEX-G and NEX-S). In the second group, each problem was recreated with and without state variables for select DSs. We considered three DSs in these problems, all of which were used individually. This required four scenarios per problem: one per DS plus one for the standard case, which used variables for all DSs. The problems were then solved and the solutions compared.

Problems
The problems created require scheduling six loads: three TSACs and three DSs. The TSACs selected sought to reproduce the operation cycles of a dishwasher (DW), a laundry machine (LM) and a tumble dryer (TD), whereas the DSs modelled correspond to an electric water heater (EWH), a refrigerator (REF) and an air-conditioned room (ACR). Each of the DSs modelled was assigned a different type of control: the EWH was set to use on-off control (s ∊ S DS,OO ); the REF was set to use continuous control (s ∊ S DS,CC ), as a way to represent on/off operation during less than one full time step; and, the heat pump (HP) that controls the ACR temperature was set to use MLD control (s ∊ S DS,MLD ). Four sets of multi-level discrete control levels were considered for this study, each of which used individually. In addition to the five levels used in [23], which correspond to 20%, 40%, 60%, 80% and 100%, we also considered: three levels, corresponding to 33%, 67% and 100%; four levels, corresponding to 25%, 50%, 75% and 100%; and ten levels, spread in 10% steps between 10% and 100%. With the NEX-G and NEX-S approaches, we relied on the subsets of levels indicated in Table 1, which were determined with the method described in Appendix A. Note that the sets of three and ten levels do not require additional constraints to avoid additional levels, whereas the sets of four and five levels do. The latter was the only one used in the group of problems designed to study the effect of state variables. For the SOS1 approach, we defined the weights for each SOS1 as the power associated with each level divided by the maximum among them.

Table 1. Sets of MLD control levels considered
The problems included in each group were selected to test each modelling approach under different conditions. For the state space group, four test scenarios were considered: one where all DSs have state variables; and three others where the REF, ACR and EWH are respectively without them. For the MLD control group, sixteen scenarios were considered: four sets of levels times four approaches. For each test scenario, we considered four time step durations (5, 10, 15 and 30 minutes), to consider the effect of problem size, and sixty four problem data combinations. These refer to multiple indoor heat gain, static electricity consumption and DHW mass flow rate profiles, shown in Figures 1-3. In total, 5120 problems were created, 1024 in the first group and 4096 in the second.

Problem data
The problems were defined using data collected from various sources. The primary source was [28], from which the data sourced included: the TSAC operation data and comfort periods; the demand rates and respective power levels; the electricity prices; the electricity generation profile; the static electricity consumption profile; the indoor heat gain profile, the outdoor temperature profile (for the ACR system); the indoor temperature profile (for the EWH and REF systems); and the mains water temperature profile. The exceptions concerned: the air-conditioned room's lumped capacity, which increased ten-fold relative to that of [28]; the heat pump performance data, obtained from [29]; the COP temperature dependence for the refrigerator, taken from [30]; the daily DHW volume (120 litres) and EWH temperature thresholds (xs,k ∆t,L = 55ºC, and xs,k ∆t,H = 70ºC); the EWH heat loss coefficient (U = 1.08 W/m 2 K); the DHW profile, instead based on the RAND profile [31]; and, the objective function coefficients corresponding to the various penalties. For a clearer exposition, we describe these separately in Appendix B. Finally, the indoor heat gain, DHW mass flow rate and electricity consumption profiles cited above were resampled according to the time step duration and used without changes and to generate the additional profiles, shown in Figures 1-3.

Computational resources and settings
Optimised solutions were produced using IBM's CPLEX 12.8.0.0, which ran on a shared machine featuring an Intel Xeon Gold 6138 CPU and 320 GB of RAM. The solver was invoked from MATLAB using the official CPLEX class methods. The same solver settings were adopted for all problems and corresponded to the standard solver settings except the termination criteria which included optimality, a 1% relative gap and a 2-hour computational budget.

Figures of merit
The analysis relied on the objective function (OF) and computation time results for each problem. In particular, we used the incumbent (IS) and best bound solutions (BBS) to each problem, in addition to the deterministic computation time (DCT) returned by CPLEX (in ticks), rather than the actual computation time (measured in seconds). The former is deemed a repeatable measure of optimisation effort, and less susceptible to external influences than the latter [32].
Additional metrics were computed using the main results, namely those defined in (58)-(59). In the equations, OFX W refers to the objective function value obtained for approach X using the solution of type W (i.e., either IS or BBS), while ∆OFX,Y W refers to the absolute value of the difference between OFX W and OFY W . In turn, DCTX is the DCT for approach X, and ∆DCTX,Y is the difference between DCTX and DCTY. If ∆DCTX,Y ABS is positive, it means approach X performed worse than approach Y, which is counted as a DCT increase for approach X. If it is negative, it is counted as a decrease.
The DCT comparisons should ideally involve solutions that produce the same OF result. In practice, we cannot guarantee this except in very simple cases (i.e., by reaching optimality) due to the process of solving MILP problems. Consider two independent problems that reflect the same basic problem through different modelling approaches. The optimal OF result is the same but the MIP gap can be met without the ISs producing the same OF result, as illustrated in Figure 4. In the case shown, the MIP gap is met in problem 1 with a more accurate BBS than in problem 2, which means a worse IS is tolerated in problem 1. Hence, reaching the same OF result implies optimality or chance. At the same time, OF differences will be lower than the MIP gap, provided it is met. The method this study relies on rests on a low MIP gap and a sufficiently large computational budget to narrow OF differences. On the other hand, one does not need to rely on these approximations for every comparison, namely if the best IS also produces the lowest DCT. In that case, the OF differences can only reinforce the original outcome, not alter it. We document the prevalence of each type of case and the maximum OF approximations needed, which should be under the target MIP gap provided it is met (see Figure 4). Finally, our analysis also takes the magnitude of DCT differences into account, as patterns of large DCT differences offer greater confidence than those of small DCT differences.

Overview
The optimised solutions produced by CPLEX were analysed, though given the high number of problems, only a limited selection was inspected visually. One of these solutions (reference problem: STD formulation; reference profiles; 15minute time steps; 5 levels) can be seen in Figures 5-8, specifically its respective power demand and system states. Figure 5 shows a large share of the dynamic demand being scheduled for the low electricity price periods (between 2 and 6 am of the first and second days). One can also observe in Figures 6-8 that the states are kept well within their upper and lower bounds and that minimal penalties were induced. It is therefore a sensible solution to the underlying DR problem. The analysis was primarily numerical in nature. It revealed that feasible solutions were obtained for all 5120 problems considered and that no optimal solutions were computed. Less than 1% of problems (113, to be exact) required more than 1 minute to be solved and the solutions to all but one problem met the relative MIP gap. The exception belongs to the MLD control problem set, for which the solution ensured a relative MIP gap of 1.08%. Otherwise, the relative MIP gap range was between 0.44% and 1%. In practical terms, the solutions obtained to equivalent problems were virtually identical as we will show in the next section.
A total of 3840 DCT comparisons between a given approach and the STD one were enabled. This includes 768 (256 per DS affected) to evaluate the effect of state variables and 3072 (1024 per non-STD approach) to compare MLD control modelling approaches. Out of these, 465 (61%) and 1506 (49%) imply neglecting OF differences but the approximations are negligible (<0.021€) and consistent with the MIP gap. Table 2 summarises the optimisation results.

Formulation equivalence
The optimised solutions obtained were used to check for anything that might indicate the different approaches are not equivalent. We did not find evidence suggesting they are not, though we are aware that the approaches cannot be positively determined to be equivalent by comparing the solutions except if all proved to be optimal. Despite this, the BBS indicate the problems are at least very similar: the BBS OF differences between alternative approaches did not exceed 0.02 €, for objective function values between -3.9 € and -4.4 € (see Table 2). The IS OF differences also proved to be in the same range and did not exceed 0.022€. Additionally, none of the incumbent solutions corresponding to alternative approaches led to better OF results than the least favourable OF result possible with the best bound solutions for the same problem, which means they do not contradict each other. These results strongly suggest the different approaches can be regarded as equivalent in reproducing a given problem.

Effect of removing state variables
Removing DS state variables from the models increased the computational cost of producing optimised solutions in most problems examined. Out of 768 comparisons, only 125 (16%) revealed improved computational performances by removing SVs. That figure breaks down into 33 (13%), 3 (1%) and 89 (35%) occurrences for scenarios involving the removal of state variables from the refrigerator (REF), air-conditioned room (ACR) and electric water heater (EWH) models, respectively. It is also possible to discern that, in general, the time step duration correlates with an increased number of computation time decreases (inverse correlation with the number of intervals). At the same time, this effect did not exceed 23% of cases (out of 64 cases) for any two time step durations, even we increased the time step duration six-fold. We provide the absolute frequency of the computation time decreases, in total and by time step duration, in Table 3. Table 3. Share of problems in which the removal of state variables in select DSs improved computational performances, in total and by time step duration. Table 4. Average, minimum (dark shade) and maximum (light shade) DCT differences due to the removal of SVs for select DSs, by DS and time step duration, as a share (%) of the average DCT for the cases with all SVs. Beyond the outcome of the comparisons, DCT differences were also evaluated quantitatively. In particular, we found DCT differences to vary by several orders of magnitude, for both DCT increases and decreases, as indicated in Table 4. At the same time, the maxima for any given time step duration are higher than the minima in absolute value. Indeed, average DCT differences for a given DS and time step duration were generally positive and highest if the comparison concerned the ACR system (average variations between 263-1,551%). The results for this system were the only ones to indicate an inverse correlation between the average DCT differences and the time step duration. This correlation was also found in the results for other systems though only if one time step duration is excluded (15 min.). Figure 9 illustrates this general trend.

MLD control modelling approaches
The comparisons undertaken showed that the STD approach was outperformed by every other one in at least 47% of cases. Overall, the NEX-G and NEX-S approaches performed best, reaching shorter DCTs than the STD one in 59% and 58% of cases, whereas the SOS1 approach only did so in 47%. The results proved to be similar for individual time step durations and sets of levels: the NEX-G, NEX-S and SOS1 approaches outperformed the STD one in 50-63%, 51-64% and 44-52% of cases. Notwithstanding this general trend, the NEX-G and NEX-S approaches performed best with the sets of 3 and 10 levels and with a time step duration of 15 minutes. Tables 5 and 6 summarise these results by approach. Table 5. Share of problems for which the SOS1, NEX-G and NEX-S approaches outperformed the standard approach, in total and by time step duration. Table 6. Share of problems for which the SOS1, NEX-G and NEX-S approaches outperformed the standard approach, in total and by number of levels.
The analysis also considered every possible comparison, which we summarise in Figure 10. These comparisons show each approach was able to outperform all others in at least 14% of cases. Among the various approaches, the NEX-G approach outperformed the others (1 st place) in more cases than any other (38% of cases) and placed last (4 th ) in fewer cases than any other too (9%). It also placed first and second in more cases than any other approach (67%). The NEX-S approach was second-best in this regard (51%). Moreover, while these two approaches generally performed about the same, their advantage was more pronounced with a time step of 15 minutes and also with the sets of 3 and 10 levels. With these sets, the NEX-G approach performed best in 49-53% of cases and did not finish last, whereas the NEX-S one never finished first but performed comparably. We illustrate the differences for sets of 5 and 10 levels in Figures 11-12.    On the other hand, the computational effort required by each approach was generally in the same range, as can be sensed from Figure 13. In comparison, the DCT differences induced by the use of SVs were more pronounced (Figure 9). Figure 13 also makes clear that the time step duration had an impact on the computational effort required with any of the approaches, despite a few outliers with one specific time step duration (15 minutes). As a result, the average DCT increases among the cases for the same time step duration were between 37% and 478% of the respective average DCT for the STD approach. The average DCT decreases were between 37% and 99%. The most significant average decreases (89-99%) concerned the aforementioned time step duration, whose exclusion would place the limit at 54%, that is, a reduction of slightly more than half of the average DCT. Given the share of DCT increases and decreases, the average DCT differences (which account for both types of occurrences) were generally much lower in absolute value, as indicated in Table 8. Table 8. Average, minimum (dark shade) and maximum (light shade) DCT differences for comparisons between non-STD and STD approaches, by approach and time step duration, as a share (%) of the average STD DCT.  Table 7. Summary of comparisons by outcome: share of problems (%) and average DCT difference in relation to the average DCT for the same time step duration (%).

Modelling state spaces
Upon using the latest results to compare the two approaches to model state spaces, we confirmed previous findings, identified new trends and obtained the necessary variation to attempt to explain causality [23]. Whereas previous results chiefly indicated large and nearly consistent DCT decreases by removing SVs, the new results also indicate that the absolute frequency and magnitude of the DCT decreases vary significantly with the DS affected. Case in point, the ACR system had the same number of SVs removed as the REF and EWH systems, yet only 3 DCT decreases were observed, as opposed to 33 and 89, respectively, and the normalised average DCT differences were between 6-8 times higher.
A possible cause for these results concerns the number of input signal variables whose respective problem matrix coefficients change due to the removal of SVs. According to (53)-(57), removing SVs causes an increase in the number of nonzero problem matrix coefficients associated with the input signal variables. This led to an increase in the number of nonzero problem matrix coefficients in the problems used, which we quantify in Table 9. That is, more coefficients associated with input signal variables were added than those removed for being linked to SVs. This increase implies more algebraic operations are needed (for each row in which SVs were previously used other than in state equations, if the corresponding input signal variables were not already being used), and could justify the trend toward DCT increases. The number of additional nonzero coefficients also provides a plausible explanation for the discrepancy between the results for the removal of the ACR's SVs and those of the EWH and REF DSs: the number of input signal modes (|Ss M |) is the only factor differentiating the number of input signal variables between DSs, and is higher for the ACR (5 modes) than for the REF (2 modes) and EWH (1 mode). Accordingly, the increase in the number of nonzero coefficients in relation to the standard approach caused by removing the ACR's SVs was higher than those of other DSs (400-1,312% vs 76-262%), for any and all time step durations † . Hence, the number of input signal modes is an important factor. same number of (additional) nonzero coefficients. This is so because the first REF mode was set up not to have a direct effect on the respective SVs.  Table 9. Relative variation of the number of nonzero problem matrix coefficients caused by the removal of state variables for select DSs, by time step duration [variation relative to the same time step duration].

MLD
We can conclude that removing SVs implies a trade-off between having smaller problem matrices and having more nonzero problem matrix coefficients. The former aspect can contribute to improved computational performances whereas the latter can require more effort per iteration. According to the results obtained, the latter effect tends to dominate, since more computational effort was required in the majority of cases, even in those with only one input signal mode (i.e., the lowest number of additional nonzero coefficients per SV removed). The conclusion to draw from these results seems to be that, though inessential, state variables can generally contribute to keep computational burden down.
Among the cases we did not examine is that of multi-node dynamic systems, which require more SVs (|S K | times the number of nodes). Hence, there might be a number of nodes beyond which the removal of SVs improves performances.

Selecting an MLD control approach
The comparisons involving the different MLD control approaches did not produce a clear frontrunner. Overall, the NEX-G approach appears to be the best option for the entire set of problems considered but each the three other approaches was able to outperform it in at least 14% of the cases. It was also observed that the NEX-G and NEX-S approaches were generally unable to achieve DCT reductions equivalent to more than half of the average DCT for the STD approach. Those approaches also led to average increases equivalent to at least 37% relative to the same reference. Accordingly, none of the approaches appears to be able to provide consistent and significant performance advantages.
The results also demonstrate that the NEX-G and NEX-S approaches tended to perform best with specific sets of levels, namely with the sets of 3 and 10 levels. These sets not only enable reducing the number of binary variables in relation to the STD approach but also dispense additional constraints. The latter is not with case with the sets of 4 and 5 levels. At the same time, the sets of 3 and 10 levels do not appear to have led to significantly different results (see Table 6) despite their differences. We refer to the different impacts on the number of binary variables (-33% vs. -60%, respectively) and on the number of redundant levels (none vs. multiple, respectively). Hence, the need for additional constraints appears to be the factor with the most impact on the NEX-G and NEX-S approaches. Future work should consider exploring these approaches with more sets of levels, different solvers, and by examining their formulation strength [33].

Conclusions
This study sought to compare the computational performance of different MILP modelling approaches for demand response problems involving dynamic systems. We compared the effect of modelling state spaces with and without continuous state decision variables. We also compared four different approaches to model full-and part-load operation of dynamic systems using multi-level discrete (MLD) control: one relying on the standard MILP formulation using individual binary variables for each level and time interval, only one of which can be positive per time interval; an identical one coupled with special ordered sets of type 1 (SOS1); and two new approaches (NEX-G and NEX-S) relying on the possibility of representing a given set of levels using combinations of a subset of those levels. Two groups of problems were created to study the two components of this study. Optimised solutions were computed using a state-ofthe-art MILP solver and their computation times compared.
We found that each MLD control approach was able to outperform all others in at least 14% of the problems. Overall, the NEX-G approach performed best, having secured the best and the worst performances in respectively more and fewer problems than any other. Along with the NEX-S approach, it also outperformed the standard approach in about 60% of cases. This advantage was more significant with the problems that enable these approaches to use fewer binary variables without additional constraints. Therefore, these approaches appear to be better options if those conditions are met. At the same time, the computational effort required with any of the approaches was generally in the same range.
Our study also found that removing the state variables associated with a given dynamic system generally led to computational cost increases. In particular, about 84% of cases evaluated required more computational effort to solve without those variables. The increase itself was found to correlate with the number of input signal variables that define system states. We attribute this outcome to an increase in the number of nonzero problem matrix coefficients. We conclude that, in general, the best approach is to use state variables.
Future work could consider evaluating the effect of state variables on multi-node dynamic system models. It might also prove interesting to compare the different MLD control approaches using more sets of levels and other solvers.  level m (∀m∊M) needs to exist in the subset, ymn indicates whether the level m (∀m∊M) is needed to reproduce the level n (∀n∊M), and wm indicates the same as um but is used to force the solver to favour low levels, should there be options (since favouring low levels tends to minimise the number of constraints needed to avoid combinations producing levels not found in the original set). To ensure this aspect of the model works as intended, the coefficients cm (∀m∊M) should be selected in accordance with (A.8), so as not to interfere with the main goal of minimising the number of levels needed. Similarly, the MIP gap termination criteria should take this into account, namely by being safely set below the minimum cm. Table A.1 summarises the model's results for a selection of supersets of equally-spaced levels between nought and one, which are of special interest for this study.

Appendix B. Functions used to define objective function penalties
The mathematical model used in this study is prepared to penalise two types of occurrences. One concerns violations of state thresholds, and the other, state variations relative to initial conditions. To define the respective objective function coefficients, we relied on a specific set of functions, (B.1)-(B.8), which we describe in this section.
Penalties caused by state variations relative to initial conditions were defined using the coefficients given by (B.1)-(B.4). These were defined using the following rationale: the penalty for a given state variation must be not be lower than the cost of avoiding it using the available means. Otherwise, those costs would be transferred to the next PP. This means the coefficient for a given variation depends on the existence of actuators capable of opposing that variation. If there is no such actuator (e.g., an exceedingly hot EWH), the coefficient is nought: (B.1) and (B.4). If there is (e.g., an underheated EWH), the coefficient is greater than or equal to cost of correcting a variation equivalent to one state unit (e.g., a 1ºC temperature difference) by the end of the PP, which we ensure using (B.2) and (B.3). These functions presuppose there is only one type of actuator per system and all of them imply energy consumption. Therefore, S DS,UW and S DS,DW are mutually-exclusive subsets of S DS identifying the systems with actuators capable of influencing states in an upward and downward sense, respectively. Additionally: xs PP,SS is the DS state for system s under steady-state conditions while using its actuator at rated capacity; Ps R is the rated power of the actuator for system s; τs is the time constant for system s. As for the penalties for violations of state thresholds, the respective coefficients were defined using (B.5)-(B.8). The coefficients were also defined in accordance with the type of actuator available and the type of penalty. For violations caused by excessive use of the actuator during interval k, the functions (B.5) and (B.8) ensure that the maximum penalty exceeds the maximum savings enabled by price differences if the actuator had to be used during one time interval. This requires using the maximum price difference. For violations caused by insufficient actuator use during interval k, the functions (B.6) and (B.7) ensure that the maximum penalty is greater than or equal to the cost of correcting the violation using the actuator during that time interval. In the functions, xs,k ∆t,SS is the DS state for system s under steady-state conditions while using its actuator at rated capacity in accordance with conditions available during interval k.