Sensor System and Observer Algorithm Co-Design For Modern Internal Combustion Engine Air Management Based on H₂ Optimization

This paper outlines a novel sensor selection and observer design algorithm for linear time-invariant systems with both process and measurement noise. The algorithm uses H₂ optimization to optimize the tradeoff between the observer error and the number of required sensors. The optimization problem is relaxed to a sequence of convex optimization problems that minimize a cost function consisting of the H₂ norm of the observer error and the weighted l₁ norm of the observer gain. An LMI formulation allows for efficient solution via semi-definite programming. The approach is applied here, for the first time, to a turbo-charged spark-ignited engine using exhaust gas recirculation to determine the optimal sensor sets for real-time intake manifold burnt gas mass fraction estimation. Simulation with the candidate estimator embedded in a high-fidelity GT-Power engine model demonstrates that the optimal sensor sets selected using this algorithm have the best H₂ estimation performance. Sensor redundancy is also analyzed based on the algorithm results. This algorithm is applicable to any type of modern internal combustion engine and reduces the system design time and experimental effort typically required for selecting optimal sensor sets.


INTRODUCTION
The control of fuel and air in spark-ignited engines has increasingly become a challenge with the incorporation of turbo-charging, exhaust gas recirculation (EGR), valvetrain flexibility, and more stringent emission regulations. To enable effective stoichiometric air-to-fuel ratio control, the engine flow and composition must be accurately and robustly measured or estimated. The only viable option is to use algorithms to estimate the engine mass flow and composition. Five difficulties have to be taken into consideration for the engine air handling sensor and observer (i.e. estimator) design problem: 1) nonlinear system dynamics; 2) measurement uncertainties, including sensor delays and noise; 3) multivariate interactions; 4) engine variability for different operation conditions and 5) the trade-off between estimation accuracy and sensor costs.
Previous studies in the field of engine air handling system management have focused on observer design based on pre-selected sensor sets (Wang, 2008; Simon and Garg, 2010; Chen and Wang, 2012; Rengarajan et al., 2018). However, considering the increasing complexity of today's engine systems and sensor characteristics, the choice of the optimal air handling sensor set is not obvious, and it can be time-consuming and error-prone if 'guess and check' experimental or simulation approaches are used. With the increasing variety of available sensors, the number of possible sensor combinations grows quickly. Brute-force experimentation with different sensors is very expensive and time-consuming, and may need to be redone even when there are minor changes to the engine system or control strategies. To address this problem, an algorithm for optimal sensor selection and observer design for the engine air handling system is outlined and demonstrated in this paper.
As described in more detail in the following paragraphs, several mathematical methods have been developed to solve the sensor selection problem, including greedy algorithms and convex optimization. Greedy algorithms aim to find a global optimum by making the locally optimal choice at each stage, but the solution they compute is not always globally optimal. Convex optimization problems have the property that any local optimum is also a global optimum. While most formulations are nonconvex, it is sometimes possible to convexify them with minimal or no impact on the solution, to take advantage of the solution properties as well as of available solvers (Joshi and Boyd, 2008; Tropp, 2006; Luo et al., 2010).
Sensor selection methods have been applied to various areas. In Kalandros and Pao (1998), the authors proposed three sensor selection algorithms for signal target tracking problems based on different resource and performance metrics. In another paper (Kalandros et al., 1999), the authors explored the use of randomization and a super-heuristic in multiple-target tracking problems to improve any given sensor set solution via random perturbation. That approach is more suitable for systems with little structure and for which the cost of solution evaluation is not high. In Hashemi et al. (2018), the authors studied a randomized greedy algorithm for near-optimal sensor scheduling in large-scale sensor networks. In Rao et al. (2015), a greedy algorithm based on two submodular cost functions, the weighted frame potential and the weighted log-det, was developed for the sensor selection problem in non-linear measurement models with additive normally distributed noise. Several sensor selection algorithms based on convex optimization or relaxation have also been applied to flexible structures. In Fardad et al. (2011), Münz et al. (2014), Zare and Jovanović (2018), and Dhingra et al. (2014), the weighted l₁ norm (without considering measurement noise) or l₂ norm of the observer gain was used to represent the sensor number, and was minimized along with the H₂ norm of the estimation error. The optimization problem was solved via SDP (Fardad et al., 2011; Münz et al., 2014), the alternating direction method of multipliers (ADMM) (Zare and Jovanović, 2018) and proximal methods (Dhingra et al., 2014). In Chepuri and Leus (2014), several functions of the Cramér-Rao bound (CRB) were used as a performance measure and the sensor selection problem was formulated to minimize the CRB functions and a sparse selection vector. In Joshi and Boyd (2008), the authors computed the optimal sensor set among candidate linear measurements corrupted with normally distributed noise, using the maximum-likelihood estimation errors as the performance measure.
Methods for engine sensor selection can be divided into experimentation-based and algorithm-based approaches. In Pekař et al. (2012), the best sensor configuration for a heavy-duty engine was found from experimental results by testing each sensor design. In Mushini and Simon (2005), the authors implemented a sensor selection algorithm for aircraft gas turbine engine health parameter estimation by minimizing a cost function of the estimation error and financial cost via a greedy algorithm. In Suard et al. (2008), the authors determined the best of three candidate sensor configurations for air-fuel-ratio control in a spark-ignited engine. A different controller was designed for each candidate sensor configuration, and an objective function incorporating the overall system cost and controller performance was optimized via a genetic algorithm. In Palmer et al. (2018), the authors proposed a methodology for fault diagnosis sensor selection based on D s optimal FDI test design that maximized the sensitivity of outputs to anticipated faults, and applied it to a diesel engine air handling system. The problem was solved using a heuristic method.
In the work described here, a simultaneous, coupled sensor selection and observer design method for the air handling system of the turbo-charged SI engine with EGR is proposed. In a manner similar to the approach taken in (Fardad et al., 2011; Münz et al., 2014), the strategy uses H₂ optimization and accounts for both process and measurement noise. The goal of this algorithm is to definitively, and accurately, determine the tradeoff between the necessary sensor number and the accuracy of intake manifold oxygen fraction estimation. The implemented cost function consists of the H₂ norm of the observer error and the weighted l₁ norm of the observer gain. The problem, once formulated, can be solved efficiently via semi-definite programming (SDP). After selecting the optimal sensor set, the algorithm computes the corresponding Kalman filter gain based on the selected sensor set. A method to estimate the modeling errors based on the comparison of reference data and modeling data is also proposed in this paper, enabling the application of the sensor selection framework to physical systems.
The rest of the paper is organized as follows: The statement of the sensor selection problem and its convex formulation; The application of the algorithm on a spark-ignited (SI) engine for intake manifold burnt gas mass fraction estimation for medium/high-speed operating conditions; Conclusions; Future work.

SENSOR SELECTION ALGORITHM BASED ON H₂ OPTIMIZATION
Consider the following linear continuous-time state-space model (Eq. 1), with the state variables x ∈ R^m, measured outputs y ∈ R^n, control inputs u ∈ R^p, disturbance inputs u_d ∈ R^s, unknown disturbance related to model uncertainty w ∈ R^m, and sensor noise v ∈ R^n. Both w and v are modeled as white noise. B_w ∈ R^{m×m} and H ∈ R^{n×n} are diagonal magnitude matrices. A Luenberger observer then takes the form of Eq. 2, where L ∈ R^{m×n} is the observer gain, z ∈ R^q is the weighted error, and W ∈ R^{q×m} is the weighting matrix that selects the errors of interest from all state errors. By introducing two auxiliary matrices (Eq. 3), the weighted error z can be formulated as in Eq. 4, where the error system G is the transfer function matrix from the stacked noise vector [w; v] to z.
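For reference, one form of Eqs 1-4 that is consistent with the dimensions and symbols defined above is sketched here. It is a reconstruction under assumptions rather than the authors' exact notation: the input matrices B and B_d, the estimate x̂ and the state error e are names introduced only for illustration.

```latex
\begin{aligned}
\dot{x} &= A x + B u + B_d u_d + B_w w, & y &= C x + H v \\
\dot{\hat{x}} &= A \hat{x} + B u + B_d u_d + L\,(y - C\hat{x}), & z &= W\,(x - \hat{x}) \\
\dot{e} &= (A - LC)\,e + \begin{bmatrix} B_w & -LH \end{bmatrix}
           \begin{bmatrix} w \\ v \end{bmatrix}, & e &= x - \hat{x} \\
G(s) &= W \left(sI - (A - LC)\right)^{-1} \begin{bmatrix} B_w & -LH \end{bmatrix}
\end{aligned}
```

Under this form the control and disturbance inputs cancel in the error dynamics, so only the process noise and the sensor noise (scaled by the observer gain) drive the weighted error z.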

Cost Function
For MIMO systems, the H₂ norm is the impulse-to-energy gain, or equivalently the steady-state variance of the outputs in response to white noise (Arzelier, 2008). Therefore, by minimizing the H₂ norm of the error system (4), the expected root-mean-square error (RMSE) of the observer in response to white noise input excitation is minimized. The H₂ norm of the error system G in Eq. 4 is expressed in Eq. 5, where E is the expectation operator. Considering the observer gain matrix L, the jth sensor measurement does not contribute to the state estimation results if every element in the jth column of L is zero. In this case, the absolute sum of the elements in the jth column of L is also zero. The number of sensors can therefore be reduced by minimizing the number of nonzero columns in L, which is the l₀ norm of the row vector p ∈ R^{1×n} of absolute column sums of L, i.e., ‖p‖₀. In order to optimize the tradeoff between the observer estimation error and the number of required sensors, a cost function is defined in Eq. 6, where ‖G‖₂ denotes the H₂ norm of the error system, or the expected root-mean-square weighted error, and α is a weighting factor between 0 and 1 that balances the effect of observer error and sensor number.
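To make the first term of the cost function concrete, the sketch below evaluates the H₂ norm of the observer error system for a given gain L by solving a Lyapunov equation for the controllability Gramian. It assumes the error dynamics ė = (A − LC)e + [B_w, −LH][w; v] with z = We, which is an assumed form consistent with the definitions above; the function names and the scalar example values are hypothetical.

```python
import numpy as np

def solve_lyapunov(A, Q):
    """Solve A X + X A^T + Q = 0 for X by Kronecker vectorisation."""
    n = A.shape[0]
    I = np.eye(n)
    # Row-major vec: vec(A X) = (A kron I) vec(X), vec(X A^T) = (I kron A) vec(X)
    K = np.kron(A, I) + np.kron(I, A)
    x = np.linalg.solve(K, -Q.reshape(-1))
    return x.reshape(n, n)

def h2_norm_error_system(A, C, L, Bw, H, W):
    """H2 norm of G(s) = W (sI - (A - L C))^-1 [Bw, -L H]; A - L C must be Hurwitz."""
    A_cl = A - L @ C
    B_cl = np.hstack([Bw, -L @ H])
    X = solve_lyapunov(A_cl, B_cl @ B_cl.T)  # controllability Gramian
    return float(np.sqrt(np.trace(W @ X @ W.T)))

# Hypothetical scalar example: one state, one sensor
A = np.array([[-1.0]]); C = np.array([[1.0]])
Bw = np.array([[1.0]]); H = np.array([[0.1]]); W = np.array([[1.0]])
no_sensor = h2_norm_error_system(A, C, np.array([[0.0]]), Bw, H, W)
with_sensor = h2_norm_error_system(A, C, np.array([[1.0]]), Bw, H, W)
print(no_sensor, with_sensor)
```

In this toy case the gain L = 1 lowers the expected weighted error relative to the model-only case L = 0, which is the quantity the sensor selection trades against the number of nonzero columns of L.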

H₂ Norm of the Observer Error
From Peet (2016), for an LTI system with a transfer function G(s) = C(sI − A)⁻¹B, the following statements are equivalent: (1) A is Hurwitz and ‖G‖₂² (the squared H₂ norm of the impulse response) < c².
(2) There exists a positive definite matrix P (i.e., P = Pᵀ ≻ 0) such that Eq. 7 holds, where the symbol ≺ in the first inequality of Eq. 7 denotes the negative definiteness of a matrix. By applying Eq. 7 to system (4), the optimization problem for the first target ‖G‖₂² in Eq. 6 can be formulated as Eq. 8. The optimization target and the first constraint in (8) are bilinear matrix inequalities (BMI), so (8) is not a convex optimization problem. Therefore, they need to be converted to linear matrix inequalities (LMI). A matrix S = PL (thus L = P⁻¹S) is defined, and the first constraint in Eq. 8 can be written as the LMI in Eq. 9. Via the Schur complement condition for positive semidefiniteness (Chong and Zak, 2013), for Δ₃ ≻ 0 the following two statements are equivalent: (1) the symmetric block matrix with diagonal blocks Δ₁ and Δ₃ and off-diagonal block Δ₂ is positive semidefinite; (2) the Schur complement Δ₁ − Δ₂Δ₃⁻¹Δ₂ᵀ is positive semidefinite. To apply the Schur complement condition to the optimization target in Eq. 8, we introduce a positive semi-definite matrix T, i.e., T = Tᵀ ⪰ 0. By substituting Δ₁ = T, Δ₂ = [PB_w  −SH]ᵀ and Δ₃ = P, the statement in Eq. 11 holds. The inequality in Eq. 11 can then be rewritten so that trace(T) bounds the optimization target whenever condition (10) is satisfied. Therefore, the optimization problem in (8) can be rewritten as Eq. 14, where the square root of trace(T) is the upper bound of the expected weighted root-mean-square error z. In Eq. 8, the optimization target is the H₂ norm of the error system. In Eq. 14, the optimization target is relaxed to the upper bound of the H₂ norm, so the direct optimization target is trace(T).
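The chain of steps above can be written out as follows. This is a sketch under assumptions: it uses the standard observability-Gramian form of the H₂ lemma and the error-system form with input matrix B̃ = [B_w, −LH], and it may differ in layout from the authors' Eqs 7-14. The symbol B̃ is introduced only for illustration.

```latex
\begin{aligned}
&\text{H}_2\text{ lemma for } G(s) = C(sI-A)^{-1}B: \quad
  A^{T}P + PA + C^{T}C \prec 0, \qquad \operatorname{trace}\!\left(B^{T} P B\right) < c^{2} \\[4pt]
&\text{Error system with } S = PL: \quad
  A^{T}P + PA - SC - C^{T}S^{T} + W^{T}W \prec 0 \\[4pt]
&\text{Schur complement } (\Delta_3 \succ 0): \quad
  \begin{bmatrix} \Delta_1 & \Delta_2 \\ \Delta_2^{T} & \Delta_3 \end{bmatrix} \succeq 0
  \iff \Delta_1 - \Delta_2 \Delta_3^{-1} \Delta_2^{T} \succeq 0 \\[4pt]
&\text{With } \Delta_1 = T,\ \Delta_2 = \begin{bmatrix} PB_w & -SH \end{bmatrix}^{T},\ \Delta_3 = P: \quad
  T \succeq \begin{bmatrix} PB_w & -SH \end{bmatrix}^{T} P^{-1} \begin{bmatrix} PB_w & -SH \end{bmatrix}
  = \tilde{B}^{T} P \tilde{B} \\[4pt]
&\Rightarrow\ \operatorname{trace}(T) \ \ge\ \operatorname{trace}\!\left(\tilde{B}^{T} P \tilde{B}\right),
  \qquad \tilde{B} = \begin{bmatrix} B_w & -LH \end{bmatrix}
\end{aligned}
```

Both the Lyapunov-type inequality and the block-matrix condition are linear in the unknowns (P, S, T), which is what allows the problem to be passed to an SDP solver.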

Weighted l₁ Norm of the Observer Gain Matrix
The second optimization target Σ_{j=1}^{n} ‖Σ_{i=1}^{m} |L_ij|‖₀ in Eq. 6 is nonconvex due to the l₀ norm. Such l₀ norm optimization problems are generally intractable, as the solution usually requires a combinatorial search (Candes et al., 2008). As proposed in Candes et al. (2008), the l₀ norm term ‖Σ_{i=1}^{m} |L_ij|‖₀ can be relaxed to a convex target by using the weighted l₁ norm μ_j^(k) Σ_{i=1}^{m} |L_ij|, where μ_j^(k) is the weight of column j at iteration count k.
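The reweighting idea can be illustrated with a few lines of code. The sketch below assumes the commonly used update from Candes et al. (2008), in which each weight is the reciprocal of the previous column magnitude plus a small constant ε; the exact update used in this paper is not restated here, so the formula, the function names and the example matrix are assumptions.

```python
def column_abs_sums(M):
    """Absolute column sums of a matrix given as a list of rows."""
    n_cols = len(M[0])
    return [sum(abs(row[j]) for row in M) for j in range(n_cols)]

def update_weights(M_prev, eps=1e-3):
    """Assumed reweighting rule: mu_j = 1 / (sum_i |M_ij| + eps)."""
    return [1.0 / (c + eps) for c in column_abs_sums(M_prev)]

def weighted_l1(M, weights):
    """Weighted l1 surrogate for the number of nonzero columns."""
    return sum(w * c for w, c in zip(weights, column_abs_sums(M)))

# Hypothetical 2-state, 2-sensor matrix whose second column is zero
M = [[2.0, 0.0],
     [-1.0, 0.0]]
weights = update_weights(M)
surrogate = weighted_l1(M, weights)
print(weights, surrogate)
```

With these weights, a column that is clearly nonzero contributes approximately one to the surrogate and a zero column contributes nothing, so the weighted l₁ norm approximates the column count that the l₀ norm measures.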

Optimization Problem
Using the lemma which is used and proved in Polyak et al. (2013): given a matrix L ∈ R^{m×n}, the following statements are equivalent: (1) The jth column of L is zero.
(2) The jth column of S = PL is zero for any P ≻ 0.
Combining the above lemma with (14) and the weighted l₁ norm, the optimization problem is formulated as Eq. 15. The optimization problem (15) can be solved iteratively with the CVX toolbox (Grant and Boyd, 2015). At each iteration, the algorithm updates the weight factor μ_j^(k) using a sufficiently small positive number ε, until the iteration can be stopped. Ideally, the jth sensor signal is not utilized for the state estimation and should be removed if the jth column of the observer gain matrix L is zero (Münz et al., 2014). Similarly, for a properly scaled system, the signal from sensor(s) j with small Σ_{i=1}^{m} |L_ij| will have very little impact on the estimation results and thus can be removed. However, it is hard to quantify the threshold of a 'small' observer gain and decide the number of sensors that need to be removed. Instead of directly comparing the observer gains of the sensors, the value of Σ_{j=1}^{n} μ_j^(k) Σ_{i=1}^{m} |S_ij|, representing the relaxed number of non-zero columns in the observer, is checked for every optimization result and used to decide the number of necessary sensors. This value is rounded to its nearest integer q, which is used as the number of selected sensors. For instance, if q is 3 for an optimization result, then the three sensors with the largest Σ_{i=1}^{m} |L_ij| are the selected optimal sensors. After computing the optimal sensor set, set α to 0 and remove the rows in C corresponding to unneeded sensors. Substitute α = 0 and the modified C into the optimization algorithm (15) again to calculate the observer gain matrix L = P⁻¹S based only on the selected optimal sensor combination.
FIGURE 1 | Engine architecture and candidate sensor placements.
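The selection step described above (round the relaxed column count to q, then keep the q sensors with the largest absolute column sums of L) can be sketched as follows. The function name, the gain matrix and the relaxed count are hypothetical values chosen only to illustrate the rule.

```python
import math

def select_sensors(L, relaxed_count):
    """Return the indices of the q columns of L with the largest absolute sums.

    q is the relaxed non-zero column count rounded to the nearest integer.
    """
    q = int(math.floor(relaxed_count + 0.5))  # round half up
    n_cols = len(L[0])
    col_sums = [sum(abs(row[j]) for row in L) for j in range(n_cols)]
    ranked = sorted(range(n_cols), key=lambda j: col_sums[j], reverse=True)
    return sorted(ranked[:q])

# Hypothetical gain for 2 states and 4 candidate sensors
L = [[0.5, 0.01, 0.2, 0.0],
     [0.4, 0.0, 0.3, 0.001]]
selected = select_sensors(L, relaxed_count=2.1)
print(selected)
```

The rows of C that correspond to the indices not returned would then be removed before re-solving the problem with α = 0.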

ALGORITHM APPLICATION ON A TURBO-CHARGED SI ENGINE MODEL FOR AIR HANDLING SYSTEM SENSOR DESIGNS
In this section, the proposed sensor selection algorithm is applied to a turbo-charged SI engine utilizing EGR. The goal of the sensor design is to choose the optimal sensor combination for accurately estimating the intake manifold gas composition. More specifically, the desired outcome is to be able to quickly determine the tradeoff between the intake manifold gas composition estimation error and the number of sensors.
The engine architecture is shown in Figure 1. For illustrative purposes, four available sensors are considered as candidates as shown in Table 1. A mass air flow sensor for inlet air (MAFa) can be placed upstream of the air and low pressure (LP) EGR confluence point, to measure the inlet air. A mass air flow sensor for high pressure flow (MAFh) can be placed downstream of the charge air cooler (CAC) to measure the cooled compressor mass flow rate. An EGR delta pressure sensor (EGR DP) can be located in the LP EGR valve to measure the LP EGR mass flow rate. Another option is a mass air flow sensor (MAF) put downstream of the air and LP EGR confluence point, but before the compressor, to measure total compressor inlet mass flow rate.
The model has 8 inputs, 1 disturbance input and 20 states, as shown in Tables 2-4, respectively. The nonlinear dynamic model equations can be written as in Eq. 16, and the detailed governing equations are listed in the Supplemental Material. Taking the actuator and sensor response times into consideration, states x₁₄ to x₂₀ are added. First-order actuator responses are considered for the throttle valve, LP EGR valve and waste-gate. The first-order approximation in Eq. 17 is used for the actuator and sensor dynamics, where x₀ is the commanded actuator input or the physical expression of the sensed variable without delay, and τ is the time constant. In this engine model, the valve mass flow rate outputs are modeled by the orifice equation (Eriksson and Nielsen, 2014), where γ is the gas specific heat ratio, A_eff is the effective valve area, P_out is the downstream pressure, and P_in and T_in are the upstream pressure and temperature. A virtual flow sensor, which is developed based on the speed-density equation, can be used for estimating the cylinder charge flow rate. Figure 2 shows the comparison of the linear model estimated charge mass flow rate and the GT-Power reference. The maximum error for the virtual flow sensor is within ±5.1%.
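As an illustration of the first-order actuator and sensor dynamics, the sketch below simulates a lag of the assumed form τ·dx/dt = x₀ − x, using the exact discretization for a command held constant over each time step. The form of the lag, the function name, the time constant and the step size are assumptions for illustration and are not taken from Table 1.

```python
import math

def first_order_lag(commands, tau, dt, x_init=0.0):
    """Simulate tau * dx/dt = x0 - x with a zero-order hold on the command x0."""
    alpha = 1.0 - math.exp(-dt / tau)  # exact discretisation for a held input
    x = x_init
    response = []
    for x0 in commands:
        x += alpha * (x0 - x)
        response.append(x)
    return response

# Unit step command, hypothetical time constant of 0.1 s sampled every 0.01 s
response = first_order_lag([1.0] * 10, tau=0.1, dt=0.01)
print(response[-1])
```

After one time constant the response has covered a fraction 1 − e⁻¹ of the step, which is the behavior that the added states x₁₄ to x₂₀ are meant to capture.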
The nonlinear model is linearized at the steady state (x_e, u_e, u_de, y_e) of 3200 rpm engine speed, 60° throttle valve angle, 11.6 mm waste-gate diameter and 10° LP EGR valve angle. All of the valves are butterfly valves. The equilibrium points of system states x₁ to x₂₀ are directly obtained from GT-Power simulation results.
The nominal model is linearized into a state-space format in the deviation variables δx = x − x_e, δu = u − u_e, δu_d = u_d − u_de and δy = y − y_e. An observer is designed from the linear state-space model as follows. The state-space representation is a well-known way to capture the system dynamics because of its effective computation and real-time implementation. The observer is a linear dynamic system that corrects the model estimation errors using measurements of the inputs and outputs of the real system.
In the simulation results that follow, the commanded engine throttle angle and number of firing cylinders are fixed at their linearization values. As studied in Rivas Perea (2016), an 11.5% brake-specific fuel consumption (BSFC) reduction and a 4.5% absolute indicated efficiency improvement can be achieved by introducing 10% cooled EGR in a 2 L, 4-cylinder, turbo-charged, direct-injected SI engine at 3,000 RPM part-load conditions. Considering that EGR tolerance decreases with increasing engine speed (Francqueville and Michel, 2014), for an engine operating speed range of 2,400-4,000 RPM, the wastegate and LP EGR valve are operated as shown in Figure 3 to vary the EGR ratio within 1.5-11%, a level that helps the SI engine improve fuel efficiency while maintaining combustion stability. Figure 4 shows the indicated mean effective pressure (IMEP) of the engine for the drive cycle (per Figure 3), which demonstrates the implementation of the proposed sensor selection algorithm for medium/high-speed operating conditions.

Unknown Disturbance
The process noise B w w and measurement noise Hv (per Eq. 1) are two necessary parameters to describe model and sensor errors. Incorrect description of the noise could result in significant worsening of estimation performances (Duník et al., 2017) and even the failure of the proposed sensor selection framework. Typically, the noise error covariance can be estimated by experimental tuning or computational methods (Duník et al., 2017;Kost et al., 2018;Solonen et al., 2014;Miran et al., 2019). The purpose of this section is to provide a simple and quick noise estimation method for the engine system based on experimental data to avoid repeated tuning work or complex computations. The sensor selection framework works well for the engine system with the diagonal noise covariance matrix estimated by the proposed method.
To implement the proposed sensor selection algorithm, the actual system is expressed as a linear state-space model with uncertainty represented by additive errors, where w is zero-mean unitary white noise and B_wᵀB_w is the process noise covariance matrix. The unknown disturbance B_w w comes from the un-captured dynamics and model linearization errors. In this application, B_w is assumed to be a diagonal matrix.
The modeling error B_w w is estimated by fitting the difference between the actual ẋ and the linear model estimated ẋ (Eq. 22), where the values of the variables (δx_GT, δu_GT, δu_d,GT) are from the GT-Power simulation result, which is used as the truth-reference, and ẋ_GT is the derivative of x_GT. Figures 5-7 show the unknown disturbance plots for boost manifold pressure x₁, exhaust manifold pressure x₃ and turbocharger speed x₇, respectively. The errors Δẋ_model,1, Δẋ_model,3 and Δẋ_model,7 are calculated based on Eq. 22, where the data for the states x_GT, inputs u_GT and disturbance inputs u_d,GT come directly from the GT-Power simulation result for the drive cycle in Figure 3.
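The residual computation can be sketched as below. The snippet assumes that the linear model has the form δẋ = A δx + B δu + B_d δu_d, that the reference derivative is obtained by finite differences of the logged states, and that data are stored with one row per sample; the matrix names B and B_d, the function name and the synthetic data are hypothetical.

```python
import numpy as np

def model_residual(t, dx, du, dud, A, B, Bd):
    """Reference state derivative minus the linear-model prediction, per sample.

    dx, du, dud hold deviations from the linearization point, one row per sample.
    """
    dx_dot = np.gradient(dx, t, axis=0)  # finite-difference derivative
    predicted = dx @ A.T + du @ B.T + dud @ Bd.T
    return dx_dot - predicted

# Synthetic check: data generated by the linear model itself leave ~zero residual
t = np.linspace(0.0, 1.0, 1001)
dx = np.exp(-t)[:, None]  # exact solution of dx/dt = -x
du = np.zeros((t.size, 1))
dud = np.zeros((t.size, 1))
A = np.array([[-1.0]]); B = np.array([[0.0]]); Bd = np.array([[0.0]])
residual = model_residual(t, dx, du, dud, A, B, Bd)
print(np.max(np.abs(residual)), residual.std(axis=0))
```

When the logged states come from a higher-fidelity reference instead, the residual is no longer near zero, and its per-state statistics are what the following subsections use to size the diagonal entries of B_w.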
An initial estimate of the process noise is the standard deviation of Δẋ_model in Eq. 22. Considering that non-normal noise (e.g., heavy-tailed or asymmetric) may not be well represented by the first two moments, i.e., the mean and the standard deviation (Kost et al., 2018), the initial estimated process noise is then tuned based on its higher moments, i.e., skewness and kurtosis, to better represent the modeling errors.

Skewness Correction for Unknown Disturbance Estimation
The skewness c₁ of the error Δẋ_model is first calculated (Eq. 23) to evaluate the asymmetry of the distribution and determine which B_w(i, i) estimation equation is used for each state, where μ_i and σ_i are the mean value and the standard deviation of Δẋ_model,i, respectively. μ_i and σ_i are defined in Eq. 24, where N is the number of sampled points. Positive skewness values mean that the data is skewed to the right (right-tail), and negative values suggest skewing to the left (left-tail) (Blanca et al., 2013). The larger the absolute skewness value is, the more significant the asymmetry is. For the states where the error Δẋ_model,i has small skewness (per Figure 6), the asymmetry is neglected and the unknown disturbance term B_w(i, i) is estimated by Eq. 25. For the states where the error Δẋ_model,i distributions have large skewness, the asymmetry should not be neglected when estimating the unknown disturbance. If the skewness c₁,i and the mean value μ_i have the same sign, the unknown disturbance of the state x_i is estimated by subtracting the absolute mean value |μ_i| from the standard deviation σ_i (per Figure 5); otherwise, the unknown disturbance is estimated by their sum (per Figure 7). The condition in Eq. 26 accounts for both the asymmetry and the non-zero mean of the error distributions. For instance, if the mean is positive and the skewness is negative (per Figure 7), the error has a positive bias and the majority of the errors are even more positive than the bias. In this situation, the standard deviation may underestimate the error effect, and thus we re-evaluate by adding the positive bias.
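The rule described above can be summarized in a short sketch. The population (divide-by-N) form of the standard deviation and the threshold separating 'small' from 'large' skewness are assumptions, since neither is stated in this section; the function names and sample data are hypothetical.

```python
import math

def moments(samples):
    """Mean, population standard deviation and skewness of a sample list."""
    n = len(samples)
    mu = sum(samples) / n
    sigma = math.sqrt(sum((s - mu) ** 2 for s in samples) / n)
    skew = sum(((s - mu) / sigma) ** 3 for s in samples) / n
    return mu, sigma, skew

def bw_from_skewness(samples, skew_threshold=0.5):
    """Diagonal B_w entry from the residual samples of one state.

    skew_threshold is a hypothetical cut-off for 'small' skewness.
    """
    mu, sigma, skew = moments(samples)
    if abs(skew) < skew_threshold:
        return sigma                # asymmetry neglected
    if skew * mu > 0:
        return sigma - abs(mu)      # skewness and mean share a sign
    return sigma + abs(mu)          # skewness and mean have opposite signs

symmetric = bw_from_skewness([-1.0, 0.0, 1.0])
same_sign = bw_from_skewness([0.0, 0.0, 0.0, 0.0, 10.0])
opposite_sign = bw_from_skewness([10.0, 10.0, 10.0, 10.0, 0.0])
print(symmetric, same_sign, opposite_sign)
```

The last two examples have the same standard deviation but different signs of skewness relative to the mean, so they land on the subtraction branch and the sum branch, respectively.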

Kurtosis Correction for Unknown Disturbance Estimation
The excess kurtosis c₂ of the error Δẋ_model distribution is then calculated to evaluate the outliers of the distribution and determine the correction made to the B_w(i, i) term. For the states which have negative excess kurtosis, the unknown disturbances have more data distributed outside the region of the peak than a normal distribution. The more negative the excess kurtosis is, the more outliers the distributions will have. When the magnitude of the excess kurtosis is large, Eqs 25, 26, which do not consider the extreme error distributions, may not be a proper way to estimate the unknown disturbance. Therefore, a correction is made to the unknown disturbance estimates of the states which have excess kurtosis lower than −1 (per Figure 5), where B_w0 is the modeling error estimated in Section 3.2.1.
For the states x₁₄ to x₂₀, which represent the delayed actuator and sensor responses, the unknown disturbance terms B_w(i, i) are set to 0. The details of the B_w(i, i) estimation for each state are listed in the Supplemental Material.
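A minimal sketch of the kurtosis check follows. It only computes the excess kurtosis and flags states below the −1 threshold mentioned above; the correction formula itself is not reproduced because it is not restated in this section. The population (divide-by-N) moments, the function names and the sample data are assumptions.

```python
def excess_kurtosis(samples):
    """Population excess kurtosis: m4 / m2^2 - 3."""
    n = len(samples)
    mu = sum(samples) / n
    m2 = sum((s - mu) ** 2 for s in samples) / n
    m4 = sum((s - mu) ** 4 for s in samples) / n
    return m4 / m2 ** 2 - 3.0

def needs_kurtosis_correction(samples, threshold=-1.0):
    """True when the excess kurtosis is below the threshold used in the text."""
    return excess_kurtosis(samples) < threshold

two_level = [-1.0, 1.0, -1.0, 1.0]      # excess kurtosis of -2
one_spike = [0.0, 0.0, 0.0, 0.0, 10.0]  # excess kurtosis of 0.25
print(needs_kurtosis_correction(two_level), needs_kurtosis_correction(one_spike))
```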

Measurement Noise
The diagonal measurement noise covariance matrix H is defined as: where sensor accuracy data comes from Table 1 and δW max is the maximum flow rate deviation with respect to its linearization point.
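As a sketch of how the diagonal entries could be assembled, the snippet below multiplies each sensor's fractional accuracy by the maximum flow rate deviation from the linearization point, following the product rule described in the Additional Discussion section. The function name and all numerical values are hypothetical and are not taken from Table 1.

```python
def measurement_noise_diagonal(accuracy, max_deviation):
    """Diagonal entries: fractional sensor accuracy times maximum flow deviation."""
    return [a * d for a, d in zip(accuracy, max_deviation)]

# Hypothetical accuracies (fractions) and maximum deviations (flow units)
accuracy = [0.02, 0.03]
max_deviation = [50.0, 10.0]
h_diag = measurement_noise_diagonal(accuracy, max_deviation)
print(h_diag)
```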

Sensor Selection Results
The sensor selection algorithm is applied to the scaled linear system to eliminate the effect of magnitude differences between measurements. Table 5 shows the optimal sensor set computed by the sensor selection algorithm (per Section 2) for different sensor number constraints. The iterative parameter ε is set to 1e-3 for single and two-sensor combinations and to 1.2e-2 for three-sensor combinations. The value of trace(T), representing the upper bound on the power of the expected estimation RMSE, is calculated by Eq. 15 when setting α = 0 for the normalized system. The upper bound of the expected RMSE, E{RMSE}_ub, and the expected RMSE, E{RMSE}, for the actual system can be expressed and related through the scaling parameter δx_{11,max} = max(x₁₁ − x_{e,11}) of the intake manifold burnt gas fraction x₁₁. E{RMSE}_ub is a very tight upper bound of E{RMSE} for this application, as shown in Table 5.
The algorithm identifies the EGR DP sensor as the best sensor if only a single sensor can be used. When two sensors are used, the optimal sensor set becomes EGR DP and MAFa, which measures the inlet air mass flow rate before the EGR confluence point (per Figure 1). The optimal three-sensor set combines EGR DP, MAFa and MAFh.
Different sensor sets with their corresponding observers are tested on the reference engine model in GT-Power. The four candidate sensors (per Table 1 and Figure 1) are placed in the GT-Power model. Per the Table 1 data, these four GT-Power outputs are filtered with the first-order functions described in Eq. 17 and corrupted by measurement noise before being sent to the observers, to account for sensor response and noise. The GT-Power and observer simulation structure is shown in Figure 8. The observer gain for each sensor set is computed by the optimization (15) with α = 0. The real-time inputs of the engine actuators (per Table 3) and the engine speed are known to the observer. The GT-Power cycle-averaged intake manifold burnt gas fraction is used as the truth-reference to validate the estimation results. The RMSE of the intake manifold burnt gas mass fraction estimation for each sensor set is calculated from 3.3 s to the end of the simulation to eliminate the effects of initial conditions. Figure 9 shows the estimation results of intake manifold burnt gas mass fraction when using different single sensor sets. As shown in Figure 9A, the EGR DP sensor has the most accurate estimation results at every step. Considering the overall estimation performance, the EGR DP sensor is the most accurate single-sensor option since it has the smallest root-mean-square error (RMSE), 0.498%, over the entire simulation. Without using any sensor, the maximum absolute estimation error is 1.744%. With the computed optimal sensor EGR DP, the maximum error is reduced to 1.014%, which is a 42% improvement compared to the model-only estimated result. The maximum errors for the single MAFa sensor, MAFh sensor and MAF sensor are 1.684, 1.724 and 1.724%, respectively. Figure 9B shows the transient tracking performance of intake manifold burnt gas mass fraction when using different single sensor sets. Per Figure 9B, the EGR DP sensor has the best tracking performance. [...] (per Table 1), there is little difference between the transient tracking performance of these two sensors (per Figure 9B). As shown in Table 6, the RMSE values for a single MAFa sensor, a single MAFh sensor and a single MAF sensor are 0.909, 0.935 and 0.935%, respectively. This indicates that if the EGR DP sensor fails, the next sensor to select is the MAFa sensor, based on the trace(T) and E{RMSE}_ub calculations. Though the MAFh sensor has slightly lower trace(T) and E{RMSE}_ub than the MAF sensor, their estimation performances are the same. Figure 10 shows the histograms of the estimation errors of the different single sensor sets. Compared to the optimal sensor EGR DP, the error distributions of the other three sensors are more spread out.

Two-Sensor Sets
The optimal two-sensor set computed by the sensor selection algorithm (per Section 2) is the combination of the EGR DP and MAFa sensors. This is verified in the coupled GT-Power/Observer simulation (per Figure 8). As shown in Figure 11, the computed optimal sensor set has the smallest estimation error for almost every step. Comparing the overall estimation performance of the optimal sensor set with the other five combinations, the optimal one has the lowest RMSE. With the computed optimal sensor set, the maximum error is reduced to 0.754%, which is a 57% improvement compared to the model-only estimated result. The maximum estimation error is 0.794% for the combination of the EGR DP sensor and the downstream compressor flow sensor MAFh, and 0.804% for the combination of the EGR DP sensor and the upstream compressor flow sensor MAF. For the combinations MAFa sensor + MAFh sensor and MAFa sensor + MAF sensor, the maximum estimation errors are 1.564 and 1.594%, respectively. When only the two compressor flow sensors are used, the maximum error is up to 1.724%.
In Table 7 and Figure 12, the simulated RMSE for different two-sensor set combinations monotonically increases with increasing E{RMSE}, as expected. The sensor sets with the first three lowest E{RMSE} all include the EGR DP sensor. Though the algorithm computes the combination of EGR DP sensor and MAFa sensor as the optimal two-sensor set, the combination of EGR DP sensor + MAFh sensor and EGR DP sensor + MAF sensor have similar estimation performances as the optimal one, as shown in Figure 11. These two combinations have very close E{RMSE} as well as the RMSE as shown in Table 7. When EGR DP is not considered in the two-sensor combination, such as the combination of MAFa sensor and MAFh sensor, there is a large increase in E{RMSE} as well the simulated RMSE. Additionally the two-sensor sets without the EGR DP sensor even have larger estimation errors than single EGR DP sensor. This indicates that under this operation condition, if only two sensors are allowed, the combination should include EGR DP sensor, and an EGR DP-only strategy would be preferred over a two-sensor strategy which did not include the EGR DP sensor. The optimal selection of the sensor in addition to the EGR DP sensor is MAFa sensor. The MAFh sensor may be considered as a backup selection to the MAFa sensor.   Figure 13 shows the histograms of different two-sensor combinations estimation errors. As shown, the best three twosensor combinations have the estimation errors distributions closer to 0. Figure 14 show the estimation results of intake manifold burnt gas mass fraction F b,im when using optimal sensor sets with different sensor numbers. As shown in Figure 14, the optimal two-sensor set has better estimation performance than the optimal single sensor. When more than two sensors can be used, all the optimal sensor set options have very similar estimation performances. 
FIGURE 10 | Histograms of intake manifold burnt gas mass fraction estimation error when only using one sensor.
FIGURE 11 | Intake manifold burnt gas mass fraction estimation when using two sensors.
FIGURE 12 | RMSE vs. trace(T) for two-sensor combinations.
FIGURE 13 | Histograms of intake manifold burnt gas mass fraction estimation error when using two sensors.

Optimal Sensor Sets
Based on the data shown in Table 5, the optimal single sensor, EGR DP, reduces the RMSE by 47.4% compared with the model-only estimate. The optimal two-sensor set reduces the RMSE by a further 25.9% relative to the optimal single sensor. Comparing the RMSE of the optimal three-sensor set, 0.367%, with that of the optimal two-sensor set, 0.369%, shows an accuracy improvement of only 0.5%. Adding a fourth sensor to the optimal three-sensor set does not improve the RMSE.
In Figure 15, the computed E{RMSE} and trace(T) follow similar trends. Using the optimal single sensor reduces trace(T) by 23.2% and E{RMSE} by 12.2% compared with the model-only estimation results. From the optimal single sensor to the optimal two-sensor set, trace(T) and E{RMSE} fall by 16.9% and 8.9%, respectively. From the optimal two-sensor set to the optimal three-sensor set, trace(T) is lowered by only 0.5% and E{RMSE} by 0.3%. From the optimal three-sensor set to the all-sensor set, both trace(T) and E{RMSE} remain the same. Comparing the trends, E{RMSE} (or trace(T)) and the simulated RMSE both show relatively large reductions from the model-only case to the single-sensor case and then to the two-sensor case, and only small decreases when a third or fourth sensor is added. E{RMSE} or trace(T) can therefore serve as a useful indicator of whether an additional sensor is necessary or redundant. The sensor selection results indicate that, although increasing the number of sensors reduces the RMSE, each added sensor brings very little improvement in estimation performance beyond two sensors. Depending on the estimation error requirement, it may be worth using a single EGR DP sensor or adding a MAFa sensor as a second sensor, but it may not be worth the cost of adding a third or fourth sensor for intake manifold gas composition estimation. Figure 16 shows the histograms of the estimation errors for the different optimal sensor sets. As the number of sensors increases, the estimation error distribution narrows and has smaller peaks at large errors.
FIGURE 14 | Intake manifold burnt gas mass fraction estimation when using optimal sensor sets.
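The redundancy argument above amounts to comparing the relative improvement from each added sensor against a tolerance. The sketch below illustrates this using the trace(T) reductions and the two RMSE values quoted in the text. The function names and the 1% threshold are assumptions made for illustration and are not part of the paper's algorithm.

```python
def relative_reduction(before, after):
    """Percent reduction of a metric when going from `before` to `after`."""
    return 100.0 * (before - after) / before


def smallest_sufficient_set(reductions_pct, threshold_pct):
    """Number of sensors beyond which no added sensor gains `threshold_pct`.

    reductions_pct[k] is the percent improvement obtained by going from
    k sensors to k + 1 sensors (k = 0 is the model-only case).
    """
    count = 0
    for k, gain in enumerate(reductions_pct):
        if gain >= threshold_pct:
            count = k + 1
    return count


# trace(T) reductions quoted in the text: 0->1, 1->2, 2->3, 3->all sensors.
trace_reductions = [23.2, 16.9, 0.5, 0.0]
# The 1% threshold is a hypothetical design tolerance, not from the paper.
print(smallest_sufficient_set(trace_reductions, threshold_pct=1.0))  # 2

# Simulated RMSE of the optimal two- and three-sensor sets from the text.
print(round(relative_reduction(0.369, 0.367), 1))  # 0.5
```

With these inputs the sketch returns two sensors, matching the conclusion that a third or fourth sensor is largely redundant for this estimation task. A tighter or looser threshold would of course move that boundary.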

Additional Discussion
The difference between the expected RMSE, E{RMSE}, and the simulated RMSE is shown in Tables 5-7, as well as in Figures 12 and 15. This difference could be explained as follows: (i) The computation of the expected RMSE, E{RMSE} (via Eq. 24), is based on the assumption that the process noise is zero-mean white noise. However, the actual unknown disturbance term ẋ_GT is not normally distributed for the example testing cycle. Since this paper focuses on selecting the optimal sensor set among candidate sensors for the engine system rather than on studying the differences between the engine model and the actual system, the quick and simple approximation of the process noise described in Section 3.2 was used. The valuable information provided by the sensor selection algorithm is the ranking of the different sensor sets and the relative increase or decrease between them. Further studies could focus on a more appropriate unknown disturbance estimation method, but this would not be expected to change the sensor selection results and thus was not the purpose of this study; (ii) The measurement noise is approximated by the product of the maximum deviation of the sensor measurement from its linearization point and the sensor accuracy (per Section 3.3). This simple approximation results in some differences between E{RMSE} and the simulated RMSE because the actual sensor measurement deviations are not symmetric about the linearization points, but it would not be expected to change the sensor selection results. Further studies could focus on a more appropriate measurement noise estimation method based on analytical approaches.
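The measurement noise approximation in point (ii) can be sketched as follows. This is an illustrative reading of the description above, not the procedure of Section 3.3: the function names are hypothetical, the sensor accuracy is assumed to be a fraction of reading, and the signal values are made up.

```python
def approx_noise_magnitude(measurements, linearization_point, accuracy):
    """Maximum deviation from the linearization point times sensor accuracy.

    `accuracy` is assumed to be a fraction (e.g. 0.02 for 2%).
    """
    max_dev = max(abs(y - linearization_point) for y in measurements)
    return max_dev * accuracy


def deviation_asymmetry(measurements, linearization_point):
    """Largest deviation above and below the linearization point."""
    devs = [y - linearization_point for y in measurements]
    return max(max(devs), 0.0), max(-min(devs), 0.0)


# Hypothetical sensor trace and linearization point (illustration only).
signal = [95.0, 100.0, 130.0]
y0 = 100.0

print(approx_noise_magnitude(signal, y0, accuracy=0.02))
print(deviation_asymmetry(signal, y0))  # (30.0, 5.0)
```

In this made-up trace the signal deviates far more above the linearization point than below it, so a single symmetric bound overstates the noise on one side. This is the kind of asymmetry the discussion identifies as a source of mismatch between E{RMSE} and the simulated RMSE.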

CONCLUSION
This paper outlines a sensor selection and observer design algorithm based on H 2 optimization that accounts for both process and measurement noise. The approach is (1) applied to an advanced turbo-charged spark-ignited engine architecture using exhaust gas recirculation; and (2) validated on a high fidelity engine simulation in GT-Power. The objective of the sensor selection and observer design algorithm is to minimize both the estimation error and the number of required sensors. The optimization problem is convexified and solved via SDP. A method to estimate the unknown model uncertainties was also developed. The high fidelity simulation results verified that the optimal sensor sets computed by the algorithm had the best estimation performance. Sensor redundancy was also analyzed based on the computation results. This algorithm reduces the computation time and experimental effort required to select optimal sensor sets.

FUTURE WORK
Future work on this algorithm could include the estimation of other key engine parameters, other observer designs, and the analysis of other engine operating conditions.
FIGURE 16 | Histograms of intake manifold burnt gas mass fraction estimation error when using optimal sensor sets.