## Abstract

An autonomous adaptive model predictive control (MPC) architecture is presented for control of heating, ventilation, and air conditioning (HVAC) systems to maintain indoor temperature while reducing energy use. Although equipment use and occupancy change with time, existing MPC methods are not capable of automatically relearning models and computing control decisions reliably for extended periods without intervention from a human expert. We seek to address this weakness. Two major features are embedded in the proposed architecture to enable autonomy: (i) a system identification algorithm from our prior work that periodically re-learns building dynamics and unmeasured internal heat loads from data without requiring re-tuning by experts, and whose estimated model is guaranteed to be stable and to have desirable physical properties irrespective of the data; and (ii) an MPC planner with a convex approximation of the original nonconvex problem. The planner uses a descent and convergent method, with the underlying optimization problem being feasible and convex. A yearlong simulation with a realistic plant shows that both features of the proposed architecture (periodic model and disturbance update, and convexification of the planning problem) are essential to get performance improvement over a commonly used baseline controller. Without these features, long-term energy savings from MPC can be small, while with them the savings become substantial.

## 1 Introduction

Heating, ventilation, and air conditioning (HVAC) systems are responsible for approximately 40% of the total energy consumption of buildings in the USA [1]. Many researchers have recognized that, compared with traditional rule-based control systems, an optimization-based controller, especially model predictive control (MPC), is a highly promising approach to reduce energy use; see, for instance, the review paper [2].

In spite of extensive studies and even successful demonstration projects, e.g., Refs. [3–5], MPC has not been widely adopted in practice. The bottlenecks, which have been discussed extensively as well, can be summarized as a *lack of autonomy* of existing control architectures that use MPC. By *autonomous MPC*, we mean an MPC scheme capable of reliably computing high-quality control decisions at all times without the need for human intervention. The behavior of a building and its equipment is complex and uncertain, so the models needed by MPC must be learned from data. Since the building’s behavior also changes with time, albeit slowly, the models need to be updated over time. The overall architecture thus needs to be adaptive.

Although there is an extensive literature on identification of HVAC system models from data, the vast majority of the existing methods cannot be used for autonomous adaptation. These algorithms fit model parameters by solving a non-convex optimization problem, e.g., Refs. [6–9]. Depending on the type and quality of data used, they require re-tuning of hyper-parameters by a human expert. Clearly, such an approach cannot lead to an autonomous control system. Another issue is that although the unmeasurable internal heat gains from occupants are substantial, most identification methods still ignore them, which can lead to poor model quality. Works that identify models in a principled manner in the presence of large unknown disturbances are limited [8–10].

The planning problem that MPC solves at every decision instant to compute control commands should be feasible and convex. With a nonconvex problem, the planner can fail to converge to a local minimum within the allowed computation time. Infeasibility has the same effect. In either case, a rule-based controller must be used as a backup when the non-convex planner cannot provide a control command. Switching between controllers can cause poor performance. The MPC planning problem is usually non-convex due to bilinearities in models and cost functions [11–15]. Most works on HVAC MPC ignore the issue of reliability of the decisions computed by a non-convex planner, especially over long periods of operation.

In this paper, we propose an adaptive MPC architecture for HVAC systems, shown in Fig. 1, that is capable of operating autonomously for long periods of time without intervention of a human expert.

The “ID + prediction” block uses an algorithm proposed in our prior work [10] to identify the plant model and the unmeasured internal disturbance from easily measured input-output data. This algorithm involves solving an optimization problem that is always feasible and convex, and the model it identifies ($\hat{M}$) is guaranteed to be stable and to possess properties that are consistent with those of a building HVAC system. The algorithm has one hyperparameter that needs to be tuned only once. In short, the identification algorithm does not need any human intervention when new data is fed into it periodically, say, every week, to update the model. The past disturbance identified by the algorithm is used to forecast the future disturbance ($\hat{\bar{w}}$), which is in turn used by the MPC planner.

The “MPC planner” block of the system uses the model and disturbance forecasts to decide control commands so as to maintain indoor climate while reducing energy use. We provide a convex approximation algorithm to approximately solve the nominal non-convex planning problem. We show that the algorithm is feasible, descent, and convergent. Thus, the MPC planner block can compute decisions autonomously without a human expert. We also show that among the many ways of convexifying these types of non-convex problems, the proposed approach is the only one applicable to our specific problem structure.

The proposed convex planner and the associated analysis are the first novel contribution of the paper. The second contribution is the performance assessment of the closed-loop system over a yearlong period. Numerical results show that the proposed MPC scheme is more energy efficient and better at indoor climate control than a conventional baseline controller. More importantly, these simulations show that both features of the proposed design (periodic update of the model and disturbance, and the convexified MPC planner) are necessary to get the performance improvement over the baseline controller. These discoveries are made possible only by the long duration for which simulations are conducted. While that is perhaps not surprising for the role of periodic model update, the discovery on the role of convexification also required the yearlong simulation. In particular, the non-convex controller was seen to perform as well as the convex one in all but a few rare instances. In these few rare cases, however, the performance of the non-convex planner was catastrophic.

This study makes three additional contributions over the preliminary version [16]. (i) We provide new analysis regarding the appropriate convexification method for the MPC planner, while Ref. [16] did not have any such results. (ii) This study includes closed-loop simulations for a yearlong period while the preliminary version only included 3 weeks of simulation. (iii) Comparison with three additional architectures, each obtained by removing model update or convexification or both, is provided. These comparisons reveal which of these features of the proposal are useful or necessary (or not).

The rest of this paper is organized as follows. Section 1.1 provides a review of relevant work on HVAC MPC. Section 2 describes the HVAC control problem. Sections 3 and 4 present the components of “MPC planner” and “ID + prediction,” respectively. The simulation is set up in Sec. 5, and the results are presented in Sec. 6. Finally Sec. 7 concludes this work.

### 1.1 Literature Review.

Dynamic models of building HVAC systems are typically nonlinear, which makes the planning problem in MPC a non-convex optimization problem. The nonlinearity comes from bilinear terms, i.e., products between a state (temperature) and a control command (airflow rate); we will see this in Eqs. (4) and (6) later in this paper. Sometimes, the dynamic model is linearized to obtain a convex problem. Among works adopting this approach, some assume the value for a certain control command is known so that the product in which that command appears becomes linear in the remaining decision variables [14,17]. This removes a degree of freedom that MPC could otherwise use. Others linearize around a trajectory, which requires an optimal (or at least a near optimal) trajectory first [13,18]. The quality of a linearized model is sensitive to the choice of the trajectory, and determining such a trajectory is challenging. After all, if it were easy, there would be little need for MPC. Identifying a linear black box model directly from data is also not straightforward (we discuss this later in Sec. 3). Recent progress in this direction is made in Refs. [10,19], which identify a linear model in which the input is the heat gain due to the HVAC system. However, the model is still not linear with respect to the control commands such as air flowrate. Convex relaxation of the MPC planning problem is thus far from trivial.

There have been recent attempts at convexification of the non-convex planning problems encountered in HVAC MPC [12,15,20]. Reference [20] does not require constraints to be satisfied at all times, but only with a pre-defined probability. Therefore, the resulting solution may not satisfy actuator constraints. In Ref. [12], values of the Lagrange multipliers are required for reformulation of the problem. The convexification approach using a McCormick envelope considered in Ref. [15] requires feasibility of the original problem (without slack) for all time. The original problem is likely to be infeasible when the disturbance is large, since the actuator limits will then prevent the state constraints from being maintained.

Another particular challenge is that the internal disturbance is also a large part of the heat load and hence a large part of the energy consumption in buildings. Therefore, internal disturbance prediction is needed to achieve the promised performance of MPC. Some works ignore the effects of internal disturbance in the MPC formulation altogether, e.g., Ref. [21]. Many leave the disturbance forecast question aside, assuming that future disturbance is somehow known to MPC, e.g., Refs. [12,15,22,23]. Some use stochastic optimization to address uncertainty in disturbance forecasts, e.g., Ref. [13]. A few works forecast internal disturbance from its estimate, which is obtained from measuring its various surrogates and modeling the relationship, such as plug loads [5] and occupancy and CO_{2} [4].

## 2 Architecture

### 2.1 Problem Description.

The focus of this study is the indoor climate control of a single-zone HVAC system shown in Fig. 2.

In such a system, part of the air exhausted from the zone is recirculated and then mixed with outdoor air (OA) at a specified ratio. This mixed air (MA) is usually warm and humid, especially in hot-humid climates, and is therefore cooled and dehumidified by passing through a cooling coil. Dehumidification requires that the air is cooled enough for the water vapor to condense out of the air stream, so the conditioned air (CA) temperature (after the cooling coil) is usually too cold for a comfortable indoor climate. The air is therefore reheated by the reheat coil up to the supply air (SA) temperature and then delivered into the zone.

The goal of the control system designed in this study is to decide the control commands to maintain the zone temperature (*T*^{z}) within time-varying pre-determined bounds, while keeping the energy use as small as possible. The control commands are the setpoints for total airflow rate ($\dot{m}$) and supply air temperature (*T*^{sa}). Lower level PI controllers will maintain these setpoints by varying fan speed and reheat valve position.

Although the conditioned air temperature $T^{ca}$ and the outdoor air ratio *α* (ratio of outdoor air flowrate to supply air flowrate) can also be varied, in this study we assume they are fixed. The conditioned air temperature is typically set to 12.8 °C in order to maintain zone humidity, which is an important aspect of thermal comfort [24]. Similarly, the outdoor air ratio *α* is pre-specified at a constant value, and the minimum allowed value for the supply air flowrate $\dot{m}$ is computed so that the OA flowrate $\alpha \dot{m}$ meets ventilation requirements [25].

### 2.2 Control System Architecture.

The control architecture proposed in this study is shown in Fig. 1. It involves two main components: (i) the *ID and prediction* block and (ii) *MPC planner* block that uses the models and forecasts to compute control commands.

Model predictive control of a system *x*_{k+1} = *f*_{k}(*x*_{k}, *u*_{k}, *v*_{k}), *y*_{k} = *h*_{k}(*x*_{k}, *u*_{k}, *v*_{k}), with *x* being the state, *u* being the control command and *v* being the uncontrollable inputs, involves minimization of a cost function $J_i = \sum_{k=i}^{i+N-1} c_k(\hat{x}_{k+1}, u_k, \hat{v}_k)$ over the planning horizon *N*, with *c*_{k}(·) being the energy used during the interval between *k* and *k* + 1. At time index *i*, an optimization problem of minimizing *J*_{i} subject to the system model and other constraints is posed based on the current estimate $\hat{x}_i$ of the state *x*_{i} and forecasts $\hat{v}$ of uncontrollable inputs *v*. The solution to this problem yields optimal commands *u*_{i}, *u*_{i+1}, …, *u*_{i+N−1}. The first entry, *u*_{i}, is implemented. At the next time index *i* + 1, the procedure is repeated.
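The receding-horizon procedure described above can be illustrated with a minimal sketch. Everything in it is a toy assumption made for illustration and is not the planner of this study: a scalar linear model with hypothetical coefficients `A` and `B`, a small finite input set searched exhaustively, a hypothetical tracking-type stage cost, and a perfect disturbance forecast.

```python
from itertools import product

A, B = 0.9, 0.5  # hypothetical scalar model x_{k+1} = A*x_k + B*u_k + v_k


def plan(x0, v_hat, inputs):
    """Search all input sequences over the horizon and return the cheapest one."""
    best_cost, best_seq = float("inf"), None
    for seq in product(inputs, repeat=len(v_hat)):
        x, cost = x0, 0.0
        for u, v in zip(seq, v_hat):
            x = A * x + B * u + v          # model prediction of x_{k+1}
            cost += 0.1 * u * u + x * x    # toy stage cost c_k
        if cost < best_cost:
            best_cost, best_seq = cost, seq
    return best_seq


def run_mpc(x0, disturbance, horizon=3, inputs=(-1.0, 0.0, 1.0)):
    """Closed loop: plan over the horizon, apply only the first command, repeat."""
    x, applied, states = x0, [], [x0]
    for i in range(len(disturbance) - horizon + 1):
        v_hat = disturbance[i:i + horizon]   # forecast (assumed perfect here)
        u = plan(x, v_hat, inputs)[0]        # implement the first entry only
        x = A * x + B * u + disturbance[i]   # plant update
        applied.append(u)
        states.append(x)
    return applied, states
```

Calling `run_mpc(2.0, [0.0] * 8)` runs six planning steps; at each step the full input sequence is recomputed even though only its first entry is applied.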

In the adaptive architecture proposed here, the model is updated periodically by the system identifier, although at a much slower time scale than that of the control command update. In the numerical studies later reported, the model is updated every week while control commands are updated every 15 min.

## 3 (Block II) Model Predictive Control Planner

*C*_{pa} is the specific heat of air at constant pressure, *T*^{ca} (°C) is the conditioned air temperature, *COP* is the chiller performance coefficient, and the mixed air temperature $T^{ma}_k$ (°C) is given by

*T*^{oa} (°C), solar irradiance *η*^{sol} (kW/m^{2}), transformed disturbance $\bar{w}$):

Although *q*^{hvac} is considered the controllable input in Eq. (5), it cannot be commanded directly. Only $\dot{m}$ and *T*^{sa} can be commanded. Treating *q*^{hvac} as the controllable input helps in two ways. First, it makes the model (5) linear, which aids model identification (discussed in Sec. 4). Second, the linear model is a convex constraint in the optimization problem the planner has to solve. We emphasize that a linear model structure with $\dot{m}$ and *T*^{sa} as inputs, even though conceptually possible, is not useful for eventual use in MPC. The reason is that the sign of the DC gain (from $\dot{m}$ to *T*^{z}) depends on whether the control commands are having a cooling or heating effect on the zone. If the supply air temperature *T*^{sa} is higher than the zone temperature *T*^{z}, increasing $\dot{m}$ will increase the zone temperature, so the DC gain is positive. The opposite happens when *T*^{sa} is lower than *T*^{z}, in which case the DC gain has to be negative. However, a priori knowledge of whether the control inputs will lead to heating or cooling is not available, since that depends on both the state and the control command.
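The sign change of the gain can be made concrete with a short sketch. It assumes the standard sensible-heat relation, in which the HVAC heat gain is the product of airflow rate, specific heat, and the difference between supply and zone temperatures; Eq. (6) is not reproduced in this section, so this form and the numerical value of `C_PA` are assumptions made for illustration.

```python
C_PA = 1.006  # kJ/(kg*K), approximate specific heat of air (assumed value)


def q_hvac(m_dot, t_sa, t_z):
    """Heat gain (kW) delivered to the zone, assuming q = m_dot * C_pa * (T_sa - T_z)."""
    return m_dot * C_PA * (t_sa - t_z)


def gain_wrt_airflow(t_sa, t_z):
    """Sensitivity of q_hvac to the airflow rate; its sign flips with (T_sa - T_z)."""
    return C_PA * (t_sa - t_z)


heating_gain = gain_wrt_airflow(t_sa=30.0, t_z=22.0)   # supply air warmer than zone
cooling_gain = gain_wrt_airflow(t_sa=12.8, t_z=22.0)   # supply air colder than zone
```

Under this assumed relation, `heating_gain` is positive and `cooling_gain` is negative, so no single linear model from airflow rate to zone temperature can have the correct gain sign in both regimes.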

### 3.1 Nominal Non-Convex Planner.

The goal of the MPC planner is to compute the control commands over the planning horizon, namely the supply airflow rate $\dot{m}$ and the supply air temperature *T*^{sa}, to maintain thermal comfort while reducing energy use over that horizon. A direct translation of this goal into an optimization problem will be a non-convex problem, partly due to the bilinearity in Eq. (6). We first present this problem below, and then use it as a stepping stone to formulate a convex approximation that is actually used in the proposed MPC planner.

For notational simplicity, the current time index *i* is assumed to be 0 in this section. Define the decision variables as $z_k := [\dot{m}_k, T^{sa}_k, T^{ma}_k, T^{z}_k, q^{hvac}_k, x_{k+1}^T, \epsilon^{\min}_k, \epsilon^{\max}_k]^T \in \mathbb{R}^9$, in which $x \in \mathbb{R}^2$ is the state of the thermal model (5), $\dot{m}$ and *T*^{sa} are the control commands, and *N* is the planning horizon. Let $\hat{x}_0$ be the estimate of the current state obtained from a state estimator, and let $\hat{v}_k$ ($:= [\hat{T}^{oa}_k, \hat{\eta}^{sol}_k, \hat{\bar{w}}_k]^T$) be the prediction of the uncontrollable inputs, for $k = 0, \dots, N-1$. Specifically, $\hat{T}^{oa}$ and $\hat{\eta}^{sol}$ are from the weather forecast, and $\hat{\bar{w}}$ is provided by a disturbance predictor, which will be discussed later in Sec. 4.2.

The *nominal non-convex planning problem* is Problem (7), with sub-equations (7*a*) through (7*m*).

Actuator constraints $[\dot{m}^{\min}, \dot{m}^{\max}]$ and $[T^{sa,\min}, T^{sa,\max}]$ represent the lower and upper bounds of the airflow rate and the supply air temperature, respectively. The minimum supply airflow rate, $\dot{m}^{\min}$, is computed based on ventilation requirements [25]. To ensure that the reheat coil can only add heat, we require $T^{sa,\min} = T^{ca}$. Thermal comfort bounds are $[T^{z,\min}, T^{z,\max}]$. Slack variables *ε*^{min}, *ε*^{max} are used to relax the thermal comfort bounds from a fixed range $[T^{z,\min}, T^{z,\max}]$ to a variable range $[T^{z,\min} - \epsilon^{\min}, T^{z,\max} + \epsilon^{\max}]$. These slack variables help ensure that the problem is feasible. A high penalty parameter *ρ* encourages the slack variables to be small so that temperature violation, when it occurs, is small.
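As a small illustration of how the slack variables relax the comfort bounds, the sketch below computes the smallest non-negative slacks that make a given zone temperature admissible. The function name `comfort_slacks` is hypothetical, and the form of the penalty on the slacks in the cost is not reproduced here.

```python
def comfort_slacks(t_z, t_min, t_max):
    """Smallest slacks (eps_min, eps_max) >= 0 such that
    t_min - eps_min <= t_z <= t_max + eps_max."""
    eps_min = max(0.0, t_min - t_z)   # violation below the lower bound
    eps_max = max(0.0, t_z - t_max)   # violation above the upper bound
    return eps_min, eps_max


# Occupied-mode bounds of [21.9, 23.6] degC, as used in the simulations.
inside = comfort_slacks(22.5, 21.9, 23.6)
too_warm = comfort_slacks(24.0, 21.9, 23.6)
```

A temperature inside the band needs no slack, while a temperature of 24.0 °C needs an upper slack of about 0.4 °C; since some pair of non-negative slacks always exists, the relaxed comfort constraint can always be satisfied.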

*Problem (7) is feasible*.

### 3.2 Proposed Convex Planner.

The optimization problem (7) is non-convex since the equality constraint (7*b*) is bilinear, and the quadratic term in the cost (9) involves the indefinite matrix *P*. *The goal now is to approximate the problem (7) with a convex problem, so that the approximation is easy to solve and the obtained solution provides a good approximation to that of problem (7).*

The algorithm we propose to this end is described in Algorithm 1. It uses the convex-concave procedure (CCP) [27]. In Algorithm 1, the following terminology is used. Let *P* = *Q*(Λ^{+} + Λ^{−})*Q*^{T} be the eigen-decomposition of the real symmetric matrix *P* from Eq. (10), where $\Lambda^{+} \succeq 0$ is the positive semidefinite part and $\Lambda^{-} \prec 0$ is the negative definite part. Define $P^{+} := Q\Lambda^{+}Q^{T}$ and $P^{-} := Q\Lambda^{-}Q^{T}$.
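To illustrate this splitting, the following sketch decomposes a symmetric 2×2 matrix into a positive semidefinite part and a negative semidefinite part using a closed-form rotation. The 2×2 restriction and the function name `split_symmetric_2x2` are assumptions made to keep the example dependency-free; the matrix *P* of Eq. (10) is not reproduced here. The test matrix is the one that represents a bilinear product *xy* as a quadratic form.

```python
import math


def split_symmetric_2x2(a, b, d):
    """Split P = [[a, b], [b, d]] into (P_plus, P_minus) with P = P_plus + P_minus,
    where P_plus collects non-negative eigenvalues and P_minus the negative ones."""
    theta = 0.5 * math.atan2(2.0 * b, a - d)      # rotation that diagonalizes P
    c, s = math.cos(theta), math.sin(theta)
    vecs = [(c, s), (-s, c)]                       # orthonormal eigenvectors
    vals = [a * c * c + 2 * b * c * s + d * s * s,
            a * s * s - 2 * b * c * s + d * c * c]  # matching eigenvalues
    p_plus = [[0.0, 0.0], [0.0, 0.0]]
    p_minus = [[0.0, 0.0], [0.0, 0.0]]
    for lam, v in zip(vals, vecs):
        target = p_plus if lam >= 0 else p_minus
        for row in range(2):
            for col in range(2):
                target[row][col] += lam * v[row] * v[col]
    return p_plus, p_minus


# x*y = [x, y] [[0, 0.5], [0.5, 0]] [x, y]^T, an indefinite quadratic form.
P_PLUS, P_MINUS = split_symmetric_2x2(0.0, 0.5, 0.0)
```

For this matrix the split recovers the familiar identity *xy* = ¼(*x* + *y*)² − ¼(*x* − *y*)². For matrices of general size, a symmetric eigen-solver such as `numpy.linalg.eigh` could play the same role.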

#### Algorithm 1 Convex planner

**Input:** Initial guess $\zeta^{(0)}$.

$n \leftarrow 0$.

**repeat**

**Convexify:** Form:

**Solve for** $z^{*}$:

s.t. equality constraints (12), (7*b*)–(7*d*),

inequality constraints (7*e*)–(7*m*),

$k = 0, \dots, N-1$.

**Update iteration:** Set $n \leftarrow n+1$, $\zeta^{(n)} \leftarrow z^{*}$.

**until** $\Vert \zeta^{(n)} - \zeta^{(n-1)} \Vert \le \delta$;

**Output:** $z^{*} \leftarrow \zeta^{(n)}$

Proposition 2 guarantees reliable performance of Algorithm 1. Since problem (13) is feasible and convex, if the algorithm converges within the allowable time, it converges to a local minimum of the original non-convex problem. If the algorithm must be stopped before convergence due to inadequate time, the solution obtained has a lower cost than solutions from previous iterates since it is a descent algorithm.
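The descent behavior of the convex-concave procedure can be seen on a toy problem. The sketch below is not problem (13): it assumes a two-variable cost *z*₁² − *z*₂² + *c*ᵀ*z* over a box, chosen because the convexified subproblem then has a closed-form solution. At each iteration the concave term −*z*₂² is replaced by its linearization at the current iterate.

```python
def clip(value, lo, hi):
    return max(lo, min(hi, value))


def cost(z, c):
    """Toy non-convex cost: convex part z1^2 + c.z, concave part -z2^2."""
    return z[0] ** 2 - z[1] ** 2 + c[0] * z[0] + c[1] * z[1]


def ccp(zeta, c, lo=-1.0, hi=1.0, delta=1e-9, max_iter=50):
    """Convex-concave procedure on the toy cost over the box [lo, hi]^2."""
    history = [cost(zeta, c)]
    for _ in range(max_iter):
        # Linearizing -z2^2 at zeta gives the linear term (c2 - 2*zeta2) * z2.
        slope = c[1] - 2.0 * zeta[1]
        z1 = clip(-c[0] / 2.0, lo, hi)                 # minimizer of z1^2 + c1*z1
        z2 = lo if slope > 0 else hi if slope < 0 else zeta[1]
        z_new = (z1, z2)
        history.append(cost(z_new, c))
        step = max(abs(z_new[0] - zeta[0]), abs(z_new[1] - zeta[1]))
        zeta = z_new
        if step <= delta:                              # stopping rule on iterates
            break
    return zeta, history
```

Starting from `(0.0, 0.2)` with `c = (1.0, 0.1)`, the iterates settle at `(-0.5, 1.0)` and the recorded costs never increase. Starting from `(0.0, -0.2)` instead leads to `(-0.5, -1.0)`, which has a lower cost, so in this toy problem the point reached depends on the initial guess.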

#### 3.2.1 Choice of Convex Approximation Method.

Apart from the convex-concave procedure we used, there are many approximation methods for non-convex optimization problems that involve bilinearities. The commonly used methods are (i) Branch-and-Bound (BnB) [28] and (ii) Alternate Convex Search (ACS) [29]. Next, we show that these methods are not applicable to our problem (7), leaving CCP as the only candidate. The following two propositions will be needed for that discussion.

*Every solution of Problem (7) is a boundary solution.*

##### 3.2.1.1 Inapplicability of branch-and-bound (BnB).

BnB requires construction of a tight convex under-estimator of the NLP within any given region of the space of the variables [28]. The most widely used under-estimators are Lagrangian relaxation [30] and convex relaxation. However, Proposition 3 shows that the dual of our problem (7) is unbounded from below. Therefore, Lagrangian relaxation cannot be applied. For convex relaxation, common options are the McCormick envelope [28] and the reformulation linearization technique (RLT) [31]. Both of them reformulate a problem via the addition of certain nonlinear constraints that are generated by using the products of the bounding constraints. However, constructing such products requires knowledge of bounds on the variables involved. In our problem, thermal comfort limits do not have known bounds because of the introduction of slack variables. Hence, convex relaxation is also not applicable in our case.

##### 3.2.1.2 Inapplicability of alternate convex search (ACS).

ACS [29] divides the variable set into disjoint blocks and, in every step, optimizes only the variables of an active block while those of the other blocks are fixed. Analyses and examples from Refs. [32,33] show that this method will most likely fail to find a local optimum for problems with boundary solutions (our case). Only initial guesses that belong to a particular set will converge to a local optimum. Because there is no guarantee of convergence to a local minimum, we do not use ACS.

## 4 (Block I) Identification and Prediction

### 4.1 Identification.

*A*, *C* are the same as in Eq. (5). But while the four inputs in Eq. (5) were divided into controllable and not controllable, here they are divided into measurable and non-measurable. In particular, $u^{id}_k := [q^{hvac}, T^{oa}, \eta^{sol}]_k \in \mathbb{R}^3$ consists of the measurable inputs to the thermal dynamics, and the transformed disturbance $\bar{w}_k \in \mathbb{R}$ is the non-measurable input. Other than the regrouping, the two models are identical. Among the three components of $u^{id}_k$, *q*^{hvac} is computed from measurements of $\dot{m}$ and *T*^{sa} using Eq. (6), and the remaining two inputs can be obtained from a weather station. The output $T^{z}$ is measured with a sensor.

The system identification algorithm used here is the SPDIR method proposed in our earlier work [10]. Fix *i* as the current time when system identification is to be carried out. Define *τ*_{i} := {*i* − *N*, *i* − *N* + 1, …, *i* − 1} and let (*u*^{id}, *y*)_{j}, *j* ∈ *τ*_{i}, be the measured input–output data for the model (14) over that time interval. The SPDIR algorithm takes this data and produces an estimate of the model parameters $M := (A, B^{id}, F^{id}, C, D^{id}, G^{id})$ and an estimate of the transformed disturbance $\bar{w}_j, j \in \tau_i$. We denote these estimates $\hat{M}_i$ and $\hat{\bar{w}}_j, j \in \tau_i$, since they depend on *i*. The SPDIR algorithm is executed at time instants *i*, *i* + *N*_{ad}, *i* + 2*N*_{ad}, …, with *N*_{ad} large so that enough time has elapsed since the previous identification to warrant updating the estimates of the model and disturbance.

The SPDIR algorithm comes with the following guarantees [10]:

1. The computation involved in obtaining the estimates (model and disturbance signal) is a feasible and convex optimization problem with a strictly convex cost.

2. The model $\hat{M}_i$ is BIBO stable and has a positive DC gain from each of the three measurable inputs (outdoor temperature, solar irradiance, and HVAC heat injection) to indoor temperature.

3. There is exactly one parameter that requires tuning by a human expert. This tuning can be done once (with one data set). The two properties mentioned above hold irrespective of the value of this parameter.

The first property ensures that the system identification algorithm can be executed periodically without any human intervention, i.e., *autonomously*. Autonomy is also helped by the third property. The second property helps in two ways. First, it ensures that the model identified is consistent with the physics of HVAC systems. Second, it helps in state estimation. At every decision instant *i*, a Kalman filter is used to estimate the state of the thermal model (5), which is then used as the initial state by the MPC planner: $\hat{x}_0$ in Eq. (7*b*). The stability guarantee of the model mentioned above ensures that the Kalman filter is stable [34].

### 4.2 Forecasts of Uncontrollable Inputs.

Two types of uncontrollable inputs appear in the thermal model (5), and thus their forecasts over the planning horizon are needed by the MPC planner: weather variables and the transformed disturbance $\bar{w}$. These forecasts are obtained as follows:

- Weather variables: Obtain the forecast of $[T^{oa}, \eta^{sol}]_k^T$ over the next planning horizon from an online weather service.

- Transformed disturbance $\bar{w}$: If the prediction horizon does not contain a holiday, assign the disturbance for the same time interval from the previous week, as estimated by the system identifier, as the forecast. If the prediction horizon contains a holiday, use the disturbance estimate from the same time interval of the previous Saturday as the forecast. This method is similar to the one used in Ref. [5], except for the holiday corrections.
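The disturbance forecast rule can be sketched as follows. This is one reading of the rule, under assumptions that are ours: sample 0 falls on a Monday at midnight, holidays are given as a set of day indices, the rule is applied to the whole horizon at once, and "previous Saturday" means the most recent Saturday before the current day. The function name and arguments are hypothetical.

```python
def forecast_disturbance(w_est, i, horizon, holiday_days=(), steps_per_day=288):
    """Forecast the disturbance for samples i, ..., i + horizon - 1.

    w_est[j] is the disturbance identified for past sample j < i.
    Day index d = j // steps_per_day, with d % 7 == 0 assumed to be Monday.
    """
    steps_per_week = 7 * steps_per_day
    window = range(i, i + horizon)
    has_holiday = any(k // steps_per_day in holiday_days for k in window)
    if not has_holiday:
        if i < steps_per_week:
            raise ValueError("need at least one week of past estimates")
        # Same time interval, one week earlier.
        return [w_est[k - steps_per_week] for k in window]
    # Holiday in the horizon: reuse the most recent Saturday (weekday index 5).
    saturday = i // steps_per_day - 1
    while saturday >= 0 and saturday % 7 != 5:
        saturday -= 1
    if saturday < 0:
        raise ValueError("no past Saturday available")
    return [w_est[saturday * steps_per_day + k % steps_per_day] for k in window]
```

With two samples per day and `w_est = list(range(28))`, a call at `i = 14` with a two-sample horizon returns the samples from one week earlier, while marking day 7 as a holiday returns the two samples of the preceding Saturday (day 5).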

### 4.3 Putting Them All Together.

#### Algorithm 2 Proposed MPC architecture

**Input:** Planning horizon $N \in \mathbb{Z}^{+}$, control horizon $N_c \in \mathbb{Z}^{+}$, and model updating interval $N_{ad} \in \mathbb{Z}^{+}$.

**Setup:** $S_c := \{N_c, 2N_c, \dots\}$, $S_{ad} := \{N_{ad}, 2N_{ad}, \dots\}$

**for** $i = 1, 2, \dots$ **do**

**if** $i \in S_{ad}$ **then**

**Measure:** Input *u*^{id} and output *y* of the model (14), over the time interval $[i - N_{ad} : i - 1]$.

**System ID:** Estimate model $\hat{M}_i$ and disturbance $\hat{\bar{w}}[i - N_{ad} : i - 1]$ using the SPDIR algorithm from [10].

**end**

**Estimate state:** Estimate current state $\hat{x}[i]$ of thermal model (5) using a Kalman filter.

**Predict disturbance:** As described in Section 4.2.

**Optimize:** Compute control decisions $\dot{m}[i : i+N-1]$ and $T^{sa}[i : i+N-1]$ using Algorithm 1.

**Implement:** Apply $\dot{m}[i : i+N_c-1]$ and $T^{sa}[i : i+N_c-1]$ to the plant.

**end**
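The two time scales of the architecture can be illustrated by listing which steps run at each time index. The sketch below assumes, as our reading of the role of the set of control instants, that the planning steps run only at multiples of the control horizon, and that system identification runs at multiples of the model updating interval. The default values correspond to a 5 min sampling time with control updates every 15 min and model updates every week; the step names are labels only.

```python
PLANNING_STEPS = ["estimate_state", "predict_disturbance", "optimize", "implement"]


def schedule(num_steps, n_c=3, n_ad=2016):
    """Map each time index i (1-based) to the list of steps executed at i."""
    events = {}
    for i in range(1, num_steps + 1):
        actions = []
        if i % n_ad == 0:            # i in S_ad: re-identify model and disturbance
            actions.append("system_id")
        if i % n_c == 0:             # i in S_c (assumed): plan and apply commands
            actions.extend(PLANNING_STEPS)
        if actions:
            events[i] = actions
    return events


one_week = schedule(2016)
```

Over one week of 2016 samples, this schedule contains 672 planning instants and a single identification run at the final sample.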

### 4.4 Baseline Controller for Comparison.

The baseline controller is chosen to be the single-maximum controller, which is widely used in practice [35]. The single-maximum controller operates the HVAC system in three modes depending on where the zone temperature $T^{z}$ lies relative to the deadband $[T^{z,\min}, T^{z,\max}]$. When $T^{z}$ exceeds the upper bound $T^{z,\max}$, reheat is turned off and the supply airflow rate $\dot{m}$ is increased with the help of a PI controller. When the zone temperature is below the lower bound $T^{z,\min}$, the airflow rate $\dot{m}$ is kept at the minimum allowed value but the supply air temperature is increased with the help of a PI controller. When the zone temperature is in the deadband, the supply air temperature is kept at $T^{ca}$ and the flowrate is kept at the minimum allowed value.
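The three modes of the single-maximum logic can be sketched as below. For brevity the PI controllers are replaced by proportional-only terms with hypothetical gains `kp_m` and `kp_t`, so this is an illustration of the mode switching and not the baseline controller used in the simulations. The default limits are the values used in this study.

```python
def single_maximum(t_z, t_min, t_max, m_min, m_max=4.72,
                   t_ca=12.8, t_sa_max=37.8, kp_m=1.0, kp_t=5.0):
    """Return (airflow setpoint in kg/s, supply air temperature setpoint in degC)."""
    if t_z > t_max:
        # Cooling mode: reheat off, airflow increased with the temperature error.
        m_dot = min(m_max, m_min + kp_m * (t_z - t_max))
        return m_dot, t_ca
    if t_z < t_min:
        # Heating mode: minimum airflow, supply air temperature increased.
        t_sa = min(t_sa_max, t_ca + kp_t * (t_min - t_z))
        return m_min, t_sa
    # Deadband: minimum airflow and no reheat.
    return m_min, t_ca
```

For example, with the occupied-mode bounds [21.9, 23.6] °C and a minimum airflow of 1.90 kg/s, a zone temperature of 22.5 °C yields the deadband setpoints `(1.90, 12.8)`. A full implementation would add integral action to both loops.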

## 5 Simulation Setup

To assess the performance of the proposed control system, we perform closed-loop simulations for nearly a yearlong period with a realistic time-varying plant. Simulations with the baseline controller are also performed on the same plant for comparison. The plant model on which the controller acts is calibrated to mimic a large auditorium in a building on the University of Florida campus (Pugh Hall). The auditorium in Pugh Hall is served by an air handling unit, and it has the same HVAC system configuration as shown in Fig. 2.

### 5.1 Plant Description.

*q*^{int} (kW).

*T*^{w} (°C) is the wall temperature, *C*_{z}(*t*), *C*_{w}(*t*), *R*_{z}(*t*), and *R*_{w}(*t*) are the time-varying thermal capacitances and resistances of the zone and wall, respectively, and $A_e(t)$ is the effective area of the building for incident solar radiation. One can view this model as a time-varying version of the commonly used RC-network models of building thermal dynamics.

The time-varying plant parameters are shown in Fig. 3, which are chosen as follows. The average values of the time-varying parameters are chosen to be the same as the values given in Ref. [9], which contains the plant parameters estimated using data from an auditorium in Pugh Hall located in the University of Florida.

### 5.2 Closed Loop Parameters.

The planning horizon for MPC is 1 day and the control horizon is 15 min, with a sampling time Δ*t* = 5 min, so $N = 288$ and $N_c = 3$. These choices are inspired by the study presented in Ref. [36]. The total time span for MPC is 50 weeks. The number of decision variables for problems (7) and (13) is 2592 ($= 9N$). The plant was simulated in MATLAB by discretizing the differential equation (13). Future work will explore using publicly available MATLAB-based simulators such as Ref. [37].
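The horizon lengths and problem size quoted above follow directly from the sampling time, as the short computation below shows. The variable names are ours; the count of control updates is derived from the 50-week span and the 15 min control horizon.

```python
SAMPLE_MIN = 5                      # sampling time in minutes
PLANNING_HORIZON_MIN = 24 * 60      # 1 day
CONTROL_HORIZON_MIN = 15
VARS_PER_STEP = 9                   # dimension of the decision vector z_k

N = PLANNING_HORIZON_MIN // SAMPLE_MIN            # planning horizon in samples
N_C = CONTROL_HORIZON_MIN // SAMPLE_MIN           # control horizon in samples
NUM_DECISION_VARS = VARS_PER_STEP * N             # size of problems (7) and (13)
CONTROL_UPDATES = 50 * 7 * 24 * 60 // CONTROL_HORIZON_MIN  # planner calls in 50 weeks
```

This gives `N = 288`, `N_C = 3`, 2592 decision variables per planning problem, and 33,600 planning problems over the simulated period.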

Thermal comfort and flowrate constraints depend on whether the building is in occupied or unoccupied mode [24]. The maximum occupancy for the Pugh Hall auditorium is 229 persons, and its occupied mode (occ) is scheduled from 6:30 AM to 10:30 PM while the remaining time is deemed unoccupied (unocc). We used these parameters for the simulation. The thermal comfort bounds are [21.9, 23.6] °C for occupied mode and [21.1, 24.4] °C for unoccupied mode. The minimum allowed value for the supply airflow rate $\dot{m}^{\min}$ is computed based on the ventilation requirements specified in ASHRAE 62.1 [25]. More specifically, $\dot{m}^{\min,unocc}$ is computed assuming zero occupancy for the unoccupied period, and $\dot{m}^{\min,occ}$ assuming 31% occupancy for the occupied period. Note that for the baseline controller, $\dot{m}^{\min,occ}$ is kept as high as 1.90 kg/s; otherwise, the baseline controller fails to maintain the zone temperature satisfactorily. The remaining parameters are listed in Table 1.

Parameter | Unoccupied | Occupied | Unit |
---|---|---|---|
$T^{z,\min}$ | 21.1 | 21.9 | °C |
$T^{z,\max}$ | 24.4 | 23.6 | °C |
$\dot{m}^{\min}$ | 0.95 | 1.47, 1.90^{a} | kg/s |

Parameter | Value | Unit |
---|---|---|
$T^{sa,\min}$ | 12.8 | °C |
$T^{sa,\max}$ | 37.8 | °C |
$T^{sa,rate}$ | 0.56 | °C/min |
$\dot{m}^{\max}$ | 4.72 | kg/s |
$\dot{m}^{rate}$ | 0.2 | kg/s/min |
$T^{ca}$ | 12.8 | °C |
$COP$ | 3.5 | N/A |
*α* | 0.3 | N/A |
*a*_{f} | 417.5 | W/(kg/s)^{2} |


^{a} $\dot{m}^{\min,occ} = 1.47$ kg/s is used for the MPC controllers and $\dot{m}^{\min,occ} = 1.90$ kg/s is used for the baseline controller.

The uncontrollable input signals are chosen as follows: solar irradiance data *η*^{sol} is taken from NSRDB [38], and ambient temperature *T*^{oa} is taken from an online source,^{1} both for Gainesville, FL, for the year 2013. The internal heat load (*q*^{int}) is chosen by scaling CO_{2} data collected from the auditorium in Pugh Hall during the same year, which is shown in Fig. 4. The high-resolution and long-term data collection was made possible by using a custom-made data logger [39]. The rationale is that occupancy is correlated with the CO_{2} level. Note that the heat load is by design a large, time-varying, and aperiodic signal.

All numerical results presented in this work are obtained through MATLAB. Specifically, the plant is simulated in Simulink^{©}. The system identification problem from Ref. [10] for estimating the model and disturbance is solved using the CVX^{©} [40] package. For control computation, the nominal non-convex problem (7) is solved using the IPOPT^{©} [41] package, and the proposed convex problem (13) is solved using the CVX^{©} [40] package. We used a desktop computer with a 3.60 GHz × 8 CPU and 16 GB RAM, running Linux, for the closed-loop simulations.

## 6 Simulation Results

A total of five distinct controllers are tested through simulations on the same plant:

- **Baseline:** the single-max controller described in Sec. 4.4.
- **Proposed (Adapt-CVX):** the proposed controller (Algorithm 2), with both model update and convex planner for control computation.
- **NAdapt-CVX:** the proposed controller (Algorithm 2), *but without updating the dynamic model and the disturbance estimates*.
- **Adapt-NCVX:** the proposed controller (Algorithm 2), *but using the non-convex problem (7) instead of the convex problem from Algorithm 1 to compute commands.*
- **NAdapt-NCVX:** MPC with the nominal non-convex problem (7) for computing control commands, and without updating the dynamic model and the disturbance estimates. Note that this is the MPC architecture generally described in the literature.

In all the controllers that use a non-convex optimization, if the NLP solver is unable to converge before the control update interval is over, the decisions computed by the baseline controller are sent to the actuators.
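The fallback rule described above can be sketched in a few lines. This is only an illustrative sketch, not the authors' implementation: the names `PlannerResult` and `select_command` are hypothetical, and the 15 min interval is taken from the discussion in Sec. 6.2.2.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

# Control update interval in seconds (15 min, as quoted in Sec. 6.2.2).
CONTROL_INTERVAL_S = 15 * 60


@dataclass
class PlannerResult:
    """Hypothetical container for the outcome of one MPC planning call."""
    converged: bool                      # did the NLP solver report convergence?
    solve_time_s: float                  # wall-clock time spent in the solver
    command: Optional[Sequence[float]]   # planned control command, if any


def select_command(
    result: PlannerResult,
    baseline_command: Sequence[float],
    interval_s: float = CONTROL_INTERVAL_S,
) -> Tuple[List[float], bool]:
    """Return (command sent to actuators, whether the baseline fallback was used)."""
    usable = (
        result.converged
        and result.command is not None
        and result.solve_time_s <= interval_s
    )
    if usable:
        return list(result.command), False
    # Planner failed or ran out of time: fall back to the baseline decision.
    return list(baseline_command), True
```

The second return value makes it easy to log how often the fallback is triggered, which is what the "Planner failure (%)" column of Table 2 reports.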

### 6.1 Comparison With the Baseline Controller

The proposed MPC scheme outperforms the baseline controller both in maintaining zone temperature and in reducing energy use; see Table 2. Data on uncontrollable inputs, control commands, and the output (zone temperature) are shown in Fig. 5 for the full 50 weeks. Figure 6 zooms into one week: Aug. 26, 2013, to Sept. 1, 2013.

| Controller | Site EUI (kBtu/(ft^{2}·yr)) | Planner failure (%) | RMSE of T^{z} violation (°C) | Max T^{z} violation (°C) |
|---|---|---|---|---|
| Baseline | 72.9 | N/A | 0.45 | 2.3 |
| NAdapt-NCVX | 63.4 | 0.4 | 0.46 | 4.0 |
| NAdapt-CVX | 63.9 | 0 | 0.41 | 1.7 |
| Adapt-NCVX | 53.7 | 0.1 | 0.22 | 3.2 |
| Adapt-CVX (Proposed) | 53.5 | 0 | 0.23 | 1.1 |

In particular, the proposed controller (Algorithm 2) reduces energy use by 26.8% over the baseline controller, to EUI = 53.5 kBtu/(ft^{2}·yr); see Table 2. The baseline controller is already more efficient than the average controller in the field: its site EUI for the tested period is 72.9 kBtu/(ft^{2}·yr), which is lower than the median site EUI = 84.3 kBtu/(ft^{2}·yr) for college buildings in the United States [42].

The improvement in performance over the baseline controller is consistent with results in the literature that have compared MPC with baseline controllers. MPC’s ability to use disturbance forecasts and predictions from the model allows it to make better decisions than a pure output-feedback controller.
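For readers who want to reproduce comfort metrics like those in Table 2 on their own data, the following is a minimal sketch. The definitions here are assumptions, not taken from the paper: the violation at each sample is assumed to be the distance of the zone temperature outside the allowed band (zero inside it), and the RMSE is assumed to be taken over all samples. The function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def violation(temp: float, low: float, high: float) -> float:
    """Distance (in deg C) by which temp lies outside [low, high]; 0 if inside."""
    return max(0.0, temp - high, low - temp)


def comfort_metrics(
    temps: Sequence[float],
    lows: Sequence[float],
    highs: Sequence[float],
) -> Tuple[float, float]:
    """Return (RMSE of violation, maximum violation) over all samples.

    ASSUMPTION: the RMSE averages over every sample, including those
    with zero violation; the source may define it differently.
    """
    v = [violation(t, lo, hi) for t, lo, hi in zip(temps, lows, highs)]
    rmse = math.sqrt(sum(x * x for x in v) / len(v))
    return rmse, max(v)
```

Passing the bounds as sequences rather than scalars allows time-varying limits, for example different bounds during occupied and unoccupied hours.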

### 6.2 Benefit and Necessity of the Design Features

#### 6.2.1 Need for Model and Disturbance Update

We tested the value of adaptation by turning off the adaptation block. A model and a disturbance (for a week) are estimated from data from the first week of 2013, and they are then used by the controller for every week of the year. The resulting MPC controllers are referred to by the “NAdapt-” prefix, e.g., in Table 2. We see from the table that adaptation reduces energy use by about 16% and reduces zone temperature violation relative to the non-adaptive case.

Thus, adaptation—periodically updating models and disturbances from data—is both necessary (for indoor comfort) and beneficial (improves energy use) for an MPC-based controller for HVAC systems.
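The energy benefit of adaptation can be checked directly against the site EUI values in Table 2. The sketch below uses the rounded table entries, so the last digit may differ slightly from figures computed on unrounded data; the helper name `percent_savings` is hypothetical.

```python
# Site EUI values in kBtu/(ft^2*yr), copied from Table 2 (rounded).
EUI = {
    "NAdapt-NCVX": 63.4,
    "NAdapt-CVX": 63.9,
    "Adapt-NCVX": 53.7,
    "Adapt-CVX": 53.5,
}


def percent_savings(reference: float, candidate: float) -> float:
    """Relative reduction of candidate with respect to reference, in percent."""
    return 100.0 * (reference - candidate) / reference


# Effect of adaptation with the convex planner and with the non-convex planner.
savings_cvx = percent_savings(EUI["NAdapt-CVX"], EUI["Adapt-CVX"])
savings_ncvx = percent_savings(EUI["NAdapt-NCVX"], EUI["Adapt-NCVX"])

print(f"Adaptation savings, convex planner:     {savings_cvx:.1f}%")
print(f"Adaptation savings, non-convex planner: {savings_ncvx:.1f}%")
```

With the rounded table values this gives roughly 16.3% for the convex planner and 15.3% for the non-convex planner, in line with the "about 16%" figure quoted above.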

#### 6.2.2 Need for Convexification of the MPC Planner

NLP solvers such as IPOPT [41] are quite powerful. Thus, solving the non-convex MPC planning problem (7) is *usually* not an issue. On average it takes 2.7 s for IPOPT to find a local minimum of the non-convex problem, and it fails to do so within the available 15 min *only 0.1%* of the time. When this happens, the decisions from the baseline controller are used as control commands. The resulting switching control action can lead to a large violation in the indoor temperature. See Fig. 7 for an example of this phenomenon, in which the zone temperature exceeds the upper bound by 3.2 °C for an extended period of time. Thus, though a non-convex planner rarely fails, when it does it leads to a catastrophic loss of performance that will render the control system unacceptable to the user.

In contrast to MPC with a non-convex planner, the proposed MPC scheme with a convexified planner *always* finds a minimum within the available 15 min, taking 1.7 s on average to compute the control decisions. Partially as a result of that, *it is able to provide the best performance in maintaining zone temperature among all five controllers tested.*

Therefore, even though solving the nominal non-convex problem is rarely an issue, on those rare occasions the controller can cause serious disruption to occupants’ thermal comfort. It is unlikely that such a control system will be acceptable to building owners and occupants. In short, the convex approximation of the MPC planner is necessary.
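To give a sense of scale, the failure percentages in Table 2 can be converted into an approximate number of affected control intervals. The sketch below assumes one planning problem per 15 min control interval over the 50 simulated weeks, and that the failure percentage is counted per interval; both are assumptions made for illustration, and the table percentages are rounded.

```python
WEEKS_SIMULATED = 50          # length of the closed-loop simulation (Sec. 6.1)
CONTROL_INTERVAL_MIN = 15     # time available to the planner (Sec. 6.2.2)

# ASSUMPTION: one planning problem is solved per control interval.
MINUTES_PER_WEEK = 7 * 24 * 60
total_intervals = WEEKS_SIMULATED * MINUTES_PER_WEEK // CONTROL_INTERVAL_MIN


def expected_fallbacks(failure_percent: float) -> float:
    """Approximate number of intervals in which the baseline fallback is used."""
    return total_intervals * failure_percent / 100.0


print(total_intervals)            # number of control intervals in 50 weeks
print(expected_fallbacks(0.1))    # Adapt-NCVX
print(expected_fallbacks(0.4))    # NAdapt-NCVX
```

Under these assumptions there are 33,600 control intervals, so a 0.1% failure rate corresponds to roughly 34 fallback events and 0.4% to roughly 134. A handful of such events is enough to produce the large temperature excursions discussed above.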

It should be noted that the NAdapt-NCVX controller is the MPC scheme generally used in the literature, e.g., Refs. [20,22]. Without the benefits of the two design features, this controller has a maximum zone temperature violation of 4.0 °C, although such violations occur rarely, and it does not perform as well as the proposed controller in terms of energy use.

We remark that the performance delivered by the proposed MPC scheme is obtained under strong plant-model mismatch in the following aspects: (i) The plant is time-varying and nonlinear, while the MPC planner uses a linear model. (ii) The proposed MPC scheme assumes that both the plant and the disturbance are the same as those from the previous week, but the actual plant and disturbance do not satisfy this assumption.
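The week-ahead assumption in item (ii), together with the non-adaptive variant of Sec. 6.2.1, can be summarized in a short sketch. This is a hypothetical illustration of the bookkeeping only: the function names and the dictionary layout are assumptions, and the estimation step itself (the system identification method of Ref. [10]) is not reproduced here.

```python
from typing import Dict, List


def source_week(current_week: int, adaptive: bool) -> int:
    """1-based index of the week whose data supplies the model and disturbance.

    Adaptive controllers reuse estimates from the previous week;
    non-adaptive ("NAdapt-") controllers always reuse those from week 1.
    """
    if current_week < 2:
        raise ValueError("control starts in week 2; week 1 is used for estimation")
    return current_week - 1 if adaptive else 1


def disturbance_forecast(
    weekly_estimates: Dict[int, List[float]],
    current_week: int,
    adaptive: bool = True,
) -> List[float]:
    """Persistence forecast: copy the estimated disturbance of the source week."""
    return list(weekly_estimates[source_week(current_week, adaptive)])
```

For example, with `weekly_estimates = {1: [0.2, 0.8], 2: [0.3, 0.9]}`, the adaptive forecast for week 3 is `[0.3, 0.9]`, while the non-adaptive forecast is `[0.2, 0.8]`.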

## 7 Conclusion

This paper takes a first step toward designing an MPC-based control system for HVAC systems that can operate autonomously for long periods, without requiring intervention from human experts. Autonomy is made possible by two features: (i) automated periodic update of the thermal model and internal disturbance signals and (ii) a convex approximation of the MPC planner’s optimization problem. The yearlong simulations show that both features are essential to obtain a performance improvement over the simple baseline controller over a long period. The need for periodically re-learning the model and disturbance is easy to see in the context of buildings. The need for convexity in the planning problem is less obvious at the design stage, but was discovered from the simulation results. Even though the nominal non-convex planning problem can be solved successfully nearly 100% of the time, the rare instances in which it fails to converge cause dramatic fluctuations in the indoor temperature, rendering the control system an unlikely contender for real-life application. Without these features, though MPC can outperform the baseline controller in certain scenarios, the benefits may not be substantial enough to defray the additional cost of implementing MPC.

At the current stage, the proposed MPC architecture uses arguably one of the simplest schemes for forecasting of the internal disturbance. It is envisioned that a more accurate prediction scheme, possibly with the aid of technologies such as occupancy recognition or CO_{2} level sensing, should further improve performance of the MPC controller.

Many extensions of this work are possible. The most immediate next step is extending the proposed control scheme to include humidity dynamics and ventilation requirements, which will require including the outdoor airflow and the conditioned air temperature (downstream of the cooling/dehumidification coil; see Fig. 2) as part of the control commands. These two have been assumed fixed in this paper but in fact can be commanded through the building automation system. Including them in the list of control commands should reduce energy use even more and provide better thermal comfort. The challenge is to incorporate the nonlinear humidity dynamics in zone thermal models and the nonlinear process models of the cooling/dehumidification coil [22]. The autonomy achieved by the control system proposed here is due to the use of linear dynamic models. Other useful directions include extension to multi-zone buildings and improvements in the forecasting methodology for the internal disturbance.

## Acknowledgment

This research is partially supported by NSF through Grant Nos. 1463316 and 1934322.

## Conflict of Interest

There are no conflicts of interest.

## Data Availability Statement

The datasets generated and supporting the findings of this article are obtainable from the corresponding author upon reasonable request. Data provided by a third party are listed in Acknowledgment.

## References

*US Energy Use Intensity by Property Type*