Boundary modeling in model-based calibration for automotive engines via the vertex representation of the convex hulls

When using the convex hull approach in the boundary modeling process, Model-Based Calibration (MBC) software suites -- such as Model-Based Calibration Toolbox from MathWorks -- can be computationally intensive depending on the amount of data modeled. The reason for this is that the half-space representation of the convex hull is used. We discuss here another representation of the convex hull, the vertex representation, which proves capable to reduce the computational cost. Numerical comparisons in this article are executed in MATLAB by using MBC Toolbox commands, and show that for certain conditions, the vertex representation outperforms the half-space representation.


Introduction
Model-Based Calibration (abbr. MBC) is a systematic approach for more cost-effective and short-term development of automotive engines, that enables engineers to design more efficient automotive engines, e.g., more fuel-efficient and/or eco-friendly engines. For efficient design of automotive engines, mathematical models for automotive engines are created in MBC, and statistics and optimization are applied to the model by using MBC software, such as [9].
Boundary modeling is one of the processes in MBC used to represent/approximate a region where the automotive engine works normally, e.g., without misfire and knock of the engine. We call the region the admissible operation domain (abbr. AOD). In general, as it is assumed that internal-combustion engines are highly nonlinear systems, it is impossible to exactly represent the AOD of the automotive engine from a finite number of acquired data. Thus one approximates the AOD instead of representing it exactly. One of the approximations of AOD is to use the convex hull of a set of data. This is a simple way to approximate AOD from data and is implemented in MBC software, such as [9]. In addition to the convex hull, the use of support vector machine for the approximation of AOD is also proposed in [6].
An AOD is used as a constraint in constrained optimization problems. One can assume that some of optimal solutions will lie on the boundary of the feasible region, otherwise the constraints would be irrelevant. That is why a proper handling of AODs is important in engine optimization problems.
The motivation of this article comes from the comment in [4] that some of the MBC software suites spend much computational time constructing a convex hull boundary model. In general, two representations for the convex hull of a set of points are possible, the half-space representation and the vertex representation. The reason for the comment was that the half-space representation for the convex hull of a set of points is typically used by software like MBC Toolbox, instead of the vertex representation.
The contribution of this article is to propose the use of the convex hull in the vertex representation instead of the half-space representation. In practice, the former representation seems to perform better than the latter. In fact, the numerical comparison in this article shows that the vertex representation is less computationally intensive than the hyperplane representation in the case when the dimension of inputs for engine models is more than five.
The organization of this article is as follows: convex hull modeling theory is discussed in Section 2. Section 3 provides an application of the vertex representation of the convex hull and numerical experiments. Conclusion is given in Section 4. Throughout this article, we assume that the measured engine data was acquired by keeping the engine under test at steady condition by controlling its inputs.

Preliminaries
We give a brief introduction on boundary modeling via the convex hull in Section 2.1, and some definitions and facts on the convex hull for a set of points in Section 2.2. Refer to [1,3] for more details regarding the convex hull mathematical representation.

Boundary modeling in model-based calibration
The behavior of automotive engines is represented by the state space representation. One of the simplest formulations is as follows: where t is time, x, u and y are vectors which represent the state of the automotive engine, the input signals into the engine and the output signals from the engine, respectively. Control theory, statistics and optimization are applied to such mathematical models of automotive engines to design more fuel-efficient and/or eco-friendly engines. MBC is a systematic approach for aiding such an efficient design of automotive engines and consists of some processes, such as the design of experiments and the response surface methodology.
Boundary modeling is a functionality used in MBC, and is applied to define an AOD for a mathematical engine model. Input signals for automotive engines under development have specific operating ranges and dynamics. In addition, automotive engines may not behave normally when some specific input signals are used, leading to undesirable events such as misfire and knock of the engine. In boundary modeling, one defines that approximates/represents a region of input signals where automotive engines behave normally, e.g., without misfire and knock of the engine.
One of the approximations of the AOD is the convex hull of a set of a finite number of input signals by which the automotive engine behaves normally. This approximation may be too rough, but is a simple way to define an AOD in practice. In fact, it is implemented in some MBC software, such as [9]. Figure 1 displays examples of the approximation of the AOD by the convex hull. In Figure 1, black circles are input signals by which the automotive engine behaves normally, and red circles indicates input signals by which the automotive engine does not behave normally. The blue region is the approximation of the AOD via the convex hull. Note that as we mentioned, the approximation of the AOD by the convex hull may be rough. In fact, it does not always represent the region where the automotive engine behave normally. For instance, the approximation at the right of Figure 1 contains red circles, which means that the automotive engine does not behave normally around the circle.
The approximation of the AOD is used in other processes in MBC as follows: (P1) Problem of determining whether a new point is in the approximated AOD or not. This is mathematically formulated as the problem of determininĝ v ∈ P orv ∈ P, wherev is a new point and P is an approximation of the AOD.
(P2) Optimization of some objective functions over the approximated AOD or a subset of the AOD for more realistic situation in response surface methodology. This is mathematically formulated as is the objective function and g j (v) ≥ 0 is an engine operating constraint.

Convex hull for a set of points in R n
is called a convex combination of v 1 , . . . , v m . In particular, the set {αa + (1 − α)b : 0 ≤ α ≤ 1} is called the line segment with the endpoints a and b and denoted by [a, b].
A set K ⊆ R n is convex if for every a, b ∈ K, the line segment [a, b] is contained in K. We define the empty set ∅ as a convex set. Figure 2 displays an example of convex and nonconvex sets. In fact, for the set at the left of Figure 2, we see that for every a, b in the set, the line segment [a, b] is contained in the set, which implies that the set is convex. In contrast, the line segment [a, b] is not contained in the set at the right of Figure 2.
In other words, the extreme point of K is a point which does not have any convex combinations with other points in K. For instance, at the set of the left in Figure 2, the black circles at the corners indicate an extreme point of the convex set. We denote the set of extreme points in K by ext(K).
Clearly the convex hull of a set of a finite numbers of points in R n is a polytope. A half-space is a set which is defined as {x ∈ R n : a T x ≤ b}, with suitable a ∈ R n and b ∈ R. A set P is called polyhedron if P is formed as the intersection of finitely many half-spaces, i.e., there exist a 1 , . . . , a k ∈ R n and b 1 , . . . , b k ∈ R such that P = x ∈ R n : a T i x ≤ b i (i = 1, . . . , k) . Minkowski-Weyl's theorem ensures that every polytope can be reformulated as a polyhedron. This implies that one can describe the convex hull of a set of points by some half-spaces in addition to the V-representation, which is called the half-space representation (abbr. Hrepresentation).
Theorem 2.1 (Minkowski-Weyl) Every polytope is polyhedron, i.e., for a given polytope P , there exist a 1 , . . . , a k ∈ R n and b 1 , . . , k)}. Moreover, every bounded polyhedron is also polytope, i.e., for a given polyhedron We give two examples of the V-and H-representations. We see from these examples that one needs to choose a suitable representation of the convex hull from the viewpoint of computation. Figure 4 displays an example of 3-dimensional unit cube. This is already the H-representation. In fact, we define a i ∈ R n , b i ∈ R (i = 1, . . . , 2n) as follows: where e i is the ith n-dimensional standard unit vector. Then P can be reformulated by We remark that the V-representation of P needs 2 n extreme points in ext(P ), whereas the H-representation needs only 2n half-spaces. Example 2.3 (Cross-polytope) Let P = {x ∈ R n : |x 1 | + · · · + |x n | ≤ 1}. P is called the ndimensional cross-polytope. Figure 4 displays an example of the 3-dimensional cross-polytope. The H-representation of P is x ∈ R n : Here the H-representation is the intersection of 2 n half-spaces. In contrast, the V-representation of P can be formulated by 2n points in R n . In fact, since both e i and −e i are extreme points in P , the V-representation of P is P = conv({±e 1 , . . . , ±e n }).
A more compact representation of the convex hull is often useful from the viewpoint of computation. For instance, the V-representation in Example 2.2 and the H-representation in Example 2.3 require more computer memory even for small n, whereas the H-representation in Example 2.2 and the V-representation in Example 2.3 need less memory even for large n.
Hence the H-representation in Example 2.2 and the V-representation in Example 2.3 are more suitable to deal with in actual computers when the dimension n is large. 3 Application of the V-representation to model based calibration for automotive engines We propose a way to handle the V-representation of the convex hull of a set of points without conversion into the H-representation in Sections 3.2 and 3.3. This way uses the results in [10]. Before mentioning them, we discuss the computational difficulty in using some MBC software in Section 3.1.

Computational difficulty due to the H-representation
As we have already mentioned in Section 2.1, the convex hull of a set of input signals which make the automotive engine behave normally is one of the approximation of the AOD of the engine. Let V = {v 1 , . . . , v m } be a set of input signals v 1 , . . . , v m ∈ R n . Then the approximation via the convex hull is formulated as conv(V ) and is the V-representation. On the other hand, for both (P1) and (P2) in Section 2.1, it is converted into P = {v ∈ R n : Av ≤ b} for some A ∈ R k×n and b ∈ R k in some MBC software, such as [9]. This corresponds to the conversion of the V-representation of the convex hull conv(V ) into the H-representation, and after this conversion, (P1) and (P2) are respectively equivalent to (P1)' Problem of determining whether Av ≤ b or Av ≤ b, and (P2)' Solution of min v∈R n {f (v) : g j (v) ≥ 0 (j = 1, . . . , k), Av ≤ b}. In general, the conversion is computationally costly and generates too many half-spaces to be handled efficiently by actual computers available RAM memory. This is the main computational difficulty in using the approximation of the AOD via the convex hull implemented in some MBC software. Table 1 displays the computation time and the number of generated halfspaces for the conversion of the V-representation into the H-representation. In this numerical experiment 1 , we generated a set V of m points in [−1, 1] n randomly and used vert2lcon.m in [8], which calls the built-in function convexhulln in MATLAB based on Qhull [11]. "-" in Table 1 indicates that we do not compute the conversion because it spends more than 1000 sec. We observe from Table 1 that when n is not so large, the conversion is not so computationally intensive and is rather fast. However, when n is larger (typically more than 10), generating the convex hull in the H-representation becomes computationally intensive. Moreover, since it generates many half-spaces, we can expect that the optimization in (P2)' will also be computationally intensive.

Application of the V-representation to (P1) : to determine whether a new point is in the convex hull or not
Let V = {v 1 , . . . , v m } be a set of points in R n . For (P1) in Section 2.1, i.e., the problem of determining whetherv ∈ conv(V ) orv ∈ conv(V ), we have two approaches via H-representation and V-representation. In the approach via H-representation, after converting conv(V ) to the linear inequalities Av ≤ b, we need to check whether Av ≤ b or not. It is relatively easy to check Av ≤ b, while the conversion is computationally intensive for not so large m and/or n as in Table 1. On the other hand, in the approach via V-representation, the linear programming (abbr. LP) method is available. At the end of this subsection, we will show that the approach via V-representation is much faster than the H-representation. Fundamentally, LP can be regarded as an optimization problem, i.e., the problem of minimization or maximization of a linear objective function over a polyhedron. The simplex method and interior-point method are efficient algorithms to solve a LP problem or detect the infeasibility of the problem. In addition, linprog implemented in Optimization Toolbox offered by MathWorks and [5], are available as commercial software to solve LP problems. Refer to [2,7], for more details on LP.
One can determine whether a new pointv is in conv(V ) or not by solving the following LP problem: where c ∈ R n is fixed arbitrarily. Since any convex combination ofv with v 1 , . . . , v m is feasible in (1), we see that • if the optimal value of (1) is finite, thenv is in conv(V ), and, • otherwise (1) is infeasible, i.e., the feasible region is empty, and thusv is not in conv(V ).
Hence one can determine whether a new pointv is in P or not by solving (1) instead of constructing Av ≤ b for the H-representation of conv(V ). Table 2 displays the computation time for the same sets V of m points in R n as Table  1. Here we generatev ∈ [−1, 1] n randomly. We used linprog to solve all LP problems. Comparing Table 2 with Table 1, we see that the determination ofv ∈ conv(V ) via LP method is much faster in computation time than the conversion into the H-representation of conv(V ). This implies that the H-representation for (P1) will require more time to compute than the V-representation. For instance, in the case (m, n) = (1000, 9), the same V is used in Tables 1  and 2, and it spends 394.21 seconds to construct the H-representation of conv(V ), whereas it spends only 0.09 seconds to determine whetherv ∈ conv(V ) or not. Since we need to check Av ≤ b for the determination ofv ∈ conv(V ) via the H-representation, where A and b are constructed by the H-representation of conv(V ), the total amount of computation time via the H-representation is more than 394.21 seconds. Therefore, we can conclude from Tables 1 and 2 that the V-representation is less computationally intensive than H-representation.  3.3 Application of the V-representation to (P2) : an optimization problem in the frame of MBC response surface methodology As we have already mentioned in (P2) of Section 2.1, the following optimization problems are typically solved by using MBC models obtained using the response surface methodology: where P is the approximation of AOD by the convex hull for a set V = {v 1 , . . . , v m } of points in R n , i.e. P = conv(V ). Since any v ∈ conv(V ) can be represented by a convex combination of v 1 , . . . , v m , the optimization (2) can be equivalently reformulated as wheref (α 1 , . . . , α m ) = f m i=1 α i v i andg j is defined in a similar manner tof .
Before showing numerical comparison of (3) with (2), we mention some advantages and disadvantages of the formulation (3): (I) One can skip the process of constructing the H-representation of conv(V ). As we have already seen in Table 1, the conversion is computationally intensive, and thus one can greatly reduce the computational cost.
(II) Since one does not apply the conversion of the V-representation of conv(V ) into the H-representation, the number of inequality constraints in (3) is much lower than (2) formulated by the H-representation. Consequently, the feasibility check of a generated solution in algorithms of optimization for (3) is much easier than (2).
(III) In contrast, the number of variables in (3) increases. In fact, it is m, while for (2) is n, and thus the computational cost increases in one evaluation of a function value at a given solution. This is the disadvantage of the formulation (3). For instance, we will see in Table 3 that (2)  The dimension n of these data sets is 4, 7 and 9, respectively. We considered different n in order to investigate the scalability of our proposed approach and to compare the computational cost with the H-representation of the convex hull.
Next, for each data set, we have solved the following seven optimization problems, one for each operating point set: where P = conv(V ), and V consists of a subset of the initial 875 n-dimensional vectors, since the approach we adopted is a point-by-point one. The measured points in each subset are unique. As an indication, each local model consisted of 125 of such measurements, and for each local model a corresponding convex hull was generated. Next, an optimization problems was considered. For this, we generate seven objective functions f p (for example using BSFC as the objective to be minimized) and do not use any extra constraint g j (v) ≥ 0 in this numerical experiment, except for the boundary model constraint itself. In conclusion, we have performed a point-by-point minimization problem for BSFC. Table 3 displays numerical comparison of (3) with (2) formulated by the H-representation of P . In this numerical experiment 2 , we use MBC Toolbox [9] and compare computation time of (3) with (2). The third and fourth columns in Table 3 are the computation time of the conversion of P into the H-representation and the total of computation time for seven types of optimization, respectively. We do not describe the time in (3), but "-" in Table 3 because we do not convert P into the H-representation. We used fmincon with interior-point algorithm implemented in Optimization toolbox of MATLAB to solve both (2) and (3). The optimization settings that were used to obtain the solution are listed in Table 4. For the settings not listed in Table 4 the defaults settings were used.
We observe the following from Table 3.
(i) In (A type) and (B type), i.e., n = 4 and n = 7, (2) formulated by the conversion of H-representation is faster than (3), whereas in (C type), (3) is approximately 2 times faster than (2). In fact, as we can expect form Table 1, the number of linear inequalities in (2) considerably increases. Consequently, the evaluation of computed solutions at each iteration becomes computationally intensive.
(ii) The computation time of converting conv(V ) into the H-representation considerably increases as n increases. This can be also expected from   seven, the H-representation is not so computationally intensive. Otherwise it becomes more computationally intensive than the V-representation. Enumeration of all the extreme points in conv(V ) may be useful when the V-representation is applied to (P1) and (P2) described in Section 3. In fact, this is ensured by Krein-Milman's theorem that for every bounded closed convex set A, conv(A) = conv(ext(A)) holds. A simple way to enumerate all extreme points of conv(V ) is to solve the following LP problem for every v k ∈ V : If the optimal value is finite, then v k is not an extreme point in conv(V ) because v k is a convex combination with other v i except for v k . Otherwise (4) is infeasible, and thus v k is an extreme point. This way is used as a pre-processing for (P1) and (P2). If conv(V ) consists a few extreme points in comparison to the set V , then we can expect the improvement of performance for (P1) and (P2). See [10] for a much faster algorithm of the enumeration of all extreme points.