Interval Extension of Neural Network Models for the Electrochemical Behavior of High-Temperature Fuel Cells

In various research projects, it has been demonstrated that feedforward neural network models (possibly extended toward dynamic representations) are efficient means for identifying numerous dependencies of the electrochemical behavior of high-temperature fuel cells. These dependencies include external inputs such as gas mass flows, gas inlet temperatures, and the electric current as well as internal fuel cell states such as the temperature. Typically, the research on using neural networks in this context is focused only on point-valued training data. As a result, the neural network provides solely point-valued estimates for such quantities as the stack voltage and instantaneous fuel cell power. Although advantageous, for example, for robust control synthesis, quantifying the reliability of neural network models in terms of interval bounds for the network’s output has not yet received wide attention. In practice, however, such information is essential for optimizing the utilization of the supplied fuel. An additional goal is to make sure that the maximum power point is not exceeded since that would lead to accelerated stack degradation. To solve the data-driven modeling task with the focus on reliability assessment, a novel offline and online parameterization strategy for interval extensions of neural network models is presented in this paper. Its functionality is demonstrated using real-life measured data for a solid oxide fuel cell stack that is operated with temporally varying electric currents and fuel gas mass flows.

In various research projects, it has been demonstrated that feedforward neural network models (possibly extended toward dynamic representations) are efficient means for identifying numerous dependencies of the electrochemical behavior of hightemperature fuel cells. These dependencies include external inputs such as gas mass flows, gas inlet temperatures, and the electric current as well as internal fuel cell states such as the temperature. Typically, the research on using neural networks in this context is focused only on point-valued training data. As a result, the neural network provides solely point-valued estimates for such quantities as the stack voltage and instantaneous fuel cell power. Although advantageous, for example, for robust control synthesis, quantifying the reliability of neural network models in terms of interval bounds for the network's output has not yet received wide attention. In practice, however, such information is essential for optimizing the utilization of the supplied fuel. An additional goal is to make sure that the maximum power point is not exceeded since that would lead to accelerated stack degradation. To solve the data-driven modeling task with the focus on reliability assessment, a novel offline and online parameterization strategy for interval extensions of neural network models is presented in this paper. Its functionality is demonstrated using real-life measured data for a solid oxide fuel cell stack that is operated with temporally varying electric currents and fuel gas mass flows.

INTRODUCTION
Because of their high efficiency factors, high-temperature fuel cells, for example, solid oxide fuel cells (SOFCs), are promising devices to use in the framework of a decentralized grid for co-generation of heat and electricity. An advantage of SOFCs compared to other fuel cell types is their capability to operate with a wide range of fuels such as pure hydrogen, different hydrocarbonates (e.g., methane and propane) or even carbon monoxide. To employ the latter types of fuels for power generation-which often appear as a kind of waste in industrial process engineering anyway-appropriate gas reformers need to be used in the fuel cell system. Thus, fuel cells can provide a powerful means to use waste in an energetically efficient way. Moreover, they operate in an environmentally friendly manner if the fuel gases are produced from renewable sources (Stambouli and Traversa, 2002;Gubner, 2005;Pukrushpan et al., 2005;Stiller, 2006;Stiller et al., 2006;Weber et al., 2006;Sinyak, 2007;Bove and Ubertini, 2008;Divisek, 2010;Huang et al., 2011Huang et al., , 2013Dodds et al., 2015;Arsalis and Georghiou, 2018). However, the operation of SOFCs according to the current state-of-the-art is restricted to a thermally and electrically constant, a-priori optimized operating point. This restriction needs to be removed if dynamic system employment in decentralized, island-like settings with high efficiency is desired. There, battery buffers and thermal energy storage (aiming at low-pass filtering temporal variations of the power demand) should be kept as small as possible to reduce costs for installation and maintenance. In any case, a dynamic fuel cell operation needs to be implemented in such as way that constraints on the admissible stack temperature and also on the fuel consumption are not violated.
Possibilities to describe both the thermal and electrochemical fuel cell behavior using neural network (NN) models with a shallow structure were investigated, for example, in Razbani and Assadi (2014) and Xia et al. (2018). In contrast to Rauh et al. (2021), other state-of-the-art NN models are purely data-driven black box representations and do not explicitly structure the NN to account for physical dependencies (such as proportionality between the consumed hydrogen mass flow and the exothermal heat production). Hence, those dependencies are typically extracted from a set of measured data by the training algorithm of the (shallow) NN. A shallow NN has only a small number of hidden layers with an optimally chosen small number of hidden layer neurons. In Rauh et al. (2021), it was shown that the structure of such models can be devised by using physical insight, leading, for example, to the introduction of multiplicative couplings between hydrogen mass flows (or electric currents) with temperature-dependent nonlinearities. These couplings directly represent the physical influence factors on the exothermal heat production in high-temperature operating phases that were derived in Rauh et al. (2016) in an equation-based form. A point-valued NN model describing not only the thermal but also the electrochemical system behavior was introduced in Rauh et al. (2021) and extended to (fractional) differential equation models in Rauh (2021). However, these publications do not analyze the robustness of point-valued NN models by the additional means of interval techniques. To quantify the reliability of the models and to deal with the uncertainty of a point-valued (static or dynamic) representation for the electrochemical fuel cell behavior, novel interval-based approaches are proposed in this paper. As such, these extended NN models provide the basis for improving the reliability of control procedures either during their offline design or for their online adaptation or optimization.
The main factors influencing the electric power characteristic of high-temperature fuel cells are the electric current and the supplied fuel mass flow (Pukrushpan et al., 2005;Bove and Ubertini, 2008;Divisek, 2010). Therefore, typical approaches for the derivation of physically inspired system descriptions make use of the representation of the stack's terminal voltage by means of electric equivalent circuit models. In those, the Nernst voltage serves as the system input from which voltage drop phenomena due to activation polarization, Ohmic polarization, and concentration polarization are subtracted. After that, the corresponding parameters can be identified using data from experiments comprising stepwise current variations and methods for impedance spectroscopy (Frenkel et al., 2019).
From a phenomenological point of view, these models can be simplified toward linear transfer function representations that are valid in the vicinity of a desired operating point . However, those simplified models become quite inaccurate if large and relatively fast variations of operating points occur. A promising approach to deal with this situation is to approximate the electric stack voltage by a multivariate polynomial with respect to the current and fuel mass flow and to estimate the corresponding coefficients in real time by means of a suitable Kalman filter. Such models have been applied successfully for the derivation of maximum power point tracking controllers and for the online optimization of the fuel efficiency under the constraint that the maximum power point must not be exceeded (to avoid system wear) Rauh, 2021).
The advantage of this online estimation approach, in contrast to the equivalent circuit representation, is the significantly reduced model complexity and the fact that the dependencies of the equivalent circuit parameters on system inputs such as the anode gas, cathode gas, and the internal stack temperatures do not need to be identified in advance. Instead, the parameter identification is performed at runtime using a Kalman filter-based scheme. However, this advantage turns into a drawback if the model needs to be applied offline for simulation, control design, and validation purposes. Both to solve this problem and to avoid explicit specifications of parameter dependencies, it is promising to employ NNs as a data-driven modeling option. In this paper, we propose to generalize static feedforward networks (Haykin et al., 2001) and dynamic nonlinear autoregressive models with exogenous inputs (NARX) (Nelles, 2020) using intervals. Such extensions quantify the accuracy of the system representation in comparison with available training and validation data sets that are obtained by suitable identification experiments.
Interval analysis (Jaulin et al., 2001;Moore et al., 2009) (IA) is a powerful mathematical and computational approach to result verification with a wide range of applications in engineering, medical science, (bio)mechanics and others. Methods based on IA ascertain formally that the outcome of a computer simulation implemented with their help is correct despite, for example, the use of floating point arithmetic (having a finite precision) or possible appearance of discretization errors (assuming that the underlying implementation is correct). The results are intervals with bounds expressed by floating point numbers which with certainty contain the exact solution to a mathematical function evaluation or a more complex dynamic system model. Such correctness properties are exploited, for example, when searching for all zeros of an algebraic function on a bounded domain or when simulating ordinary differential equations which may loose their stability properties if floating point solvers with inappropriate discretization step sizes were chosen. A method with result verification based on IA then always provides bounds for the system states that contain the true system dynamics. It is, furthermore, possible to quantify bounded uncertainty in output parameters from that in the input data. By using suitable model implementations that replace the system inputs and selected parameters by interval quantities, system models can be analyzed from the angles of sensitivity, reachability, and feasibility numerically in a reliable way. In our paper, we use IA primarily in its capacity to deal with uncertainty in measured data.
A common drawback of such rigor-preserving methods, caused by the dependency problem or the wrapping effect (Jaulin et al., 2001), is the possibility of too wide bounds for the solution sets (e.g., between −∞ and +∞), which is called overestimation. The dependency problem manifests itself if an interval variable occurs multiple times in the mathematical expression to be evaluated. The wrapping effect appears because of the need to enclose rotated, non-axis aligned (and in some cases even non-convex) sets in axis-parallel boxes.
An interval [x] [x ; x -], with x as the lower and x as the ( 1 ) For a binary operation ○ = { +, −, ·, /} and two intervals [x ; x -], [y ; y -], the corresponding interval operation can be defined as This formula can be simplified using further information about the operation, for example, [x ; x] + [y ; y] [x + y ; x + y] for the addition. For division of intervals, usually 0 ∉ [y ; y] is assumed. Based on this arithmetic, higher-level interval methods can be defined, for example, those for solving systems of algebraic or differential equations or for the identification of parameters for a given mathematical model given measured data affected by interval uncertainty (Neumaier, 1990;Jaulin et al., 2001;Nedialkov, 2011).
To quantify the accuracy of models for the electrochemical behavior of SOFCs relying on NN representations, information from the available training and validation data sets is exploited in terms of a two-stage offline procedure comprising: 1. Identification of a classical point-valued NN model for the system; 2. Identification of interval bounds for the model parameters at Stage 1 using offline optimization (leading to an interval parameterization of the NN).
In an optional third online identification stage of the proposed procedure, the interval parameters can be adapted further so that the interval extension of the NN representation becomes compatible with data points that are observed additionally.
The structure of the paper is as follows. Sec. 2 describes different options using which NN models and interval techniques can be combined. Based on this, the proposed interval extension of both static (feedforward) NN models and dynamic NARX models is presented and illustrated for an academic example in Sec. 3. After that, the interval approach is applied to modeling and simulation of the electric power characteristic of an SOFC system in Sec. 4. Finally, conclusions and an outlook on future research are given in Sec. 5.

INTERVAL EXTENSIONS OF NEURAL NETWORKS: PRINCIPLES
In general, it is possible to employ interval methods for the evaluation of NNs in two fundamentally different ways. The first approach is to perform a classical training of the NN so that a model is obtained that provides point-valued outputs if pointvalued data are applied to the network's input layer. Interval analysis can then be used to assess a posteriori the sensitivity of the network with respect to inputs that are described by suitable intervals. This kind of assessment can serve either to analyze the robustness of the NN or to quantify its range of possible outputs. The latter helps to examine the capability of the NN model to generalize with respect to unknown inputs and to perform a reachability or safety analysis for the modeled system. The second and less commonly used approach is to incorporate intervals directly into the training phase. In this case, layer weights and bias values are determined as interval quantities so that (uncertain) input data lead to interval bounds on the system outputs which themselves enclose uncertain but bounded (measured) data. In this section, we give references and examples for the first approach in Sec. 2.2 and for the second in Sec. 2.3 and Sec. 2.4 to highlight the specific combination of NN modeling with interval methods suggested in this paper.

Preliminaries
Our general goal is to use NN modeling to solve nonlinear regression problems. In the considered SOFC application, this task corresponds to approximating scalar-valued multivariate functions that represent the electric power of the fuel cell in terms of all relevant influence factors. As already stated in the introduction, we restrict ourselves to the case of shallow NNs with a suitably chosen internal structure (cf. Rauh et al., 2021) and an optimized number of hidden layer neurons. This kind of network structuring is similarly exploited by Lutter et al. (2019), where NNs are employed to describe mechanical systems with the help of an approach based on the Lagrange formalism.
In contrast to our shallow NN approach, deep learning techniques are also employed in the current applied NN research. Here, the focus is on probabilistic regressions in combination with importance sampling, noise contrastive estimation, and maximum likelihood estimation (Gustafsson et al., 2020;Andersson et al., 2021;Gedon et al., 2021). In the area of probabilistic NN techniques, the estimation of probability density functions in a data-based context should be mentioned as a further approach that might find applications in controloriented system modeling. Because of their limited applicability in the frame of control for fuel cell systems, other deep learning approaches, for example, for solving automatic (image) recognition tasks, are not discussed here. In our future research, we might consider extending the interval shallow NN modeling using the deep learning regression techniques that employ a feedforward or NARX structure similar to that considered in this paper. In such a way, the proposed approach could be extended not only to handle a control optimization in a bounded-error framework but also as an alternative to stochastic model predictive control techniques (González Querubín et al., 2020) with underlying deep learning system representations.
The general aim of this paper is thus to solve a function approximation problem (with possible memory effects represented by means of autoregressive NN structures) supposing that • measurements of the system inputs are available at certain time instants, • measurements of system outputs are available at the identical synchronized time steps, and • the networks are extended by systematically introducing interval parameters so that the computed output intervals cover all available measurements and that a re-training becomes possible as soon as new measured data arrive.
Classical methods for point-valued NN parameterization make use of well-known optimization techniques such as the Levenberg-Marquardt algorithm, conjugate-gradient approaches, or Bayesian regularization (Dan Foresee and Hagan, 1997;Nocedal and Wright, 2006). These tools perform well if feedforward NNs with sufficiently large numbers of hidden layer neurons are considered. However, excessive overparameterizations should be avoided since this might lead to overfitting or high sensitivity against input disturbances which both cause poor generalizability to unknown inputs.
To prevent under-and overfitting of the data by the used NN, we employ the singular value decomposition approach theoretically derived in Kanjilal et al. (1993) and investigated in detail from an application (fuel cell) perspective in Rauh et al. (2021) and Rauh (2021) to find reasonable numbers of neurons in the hidden layer(s). In addition, this procedure is applicable to eliminate system inputs that do not contribute essentially to the input-output relations to be identified.
To solve the tasks listed above, we denote the vector of NN inputs by q k ∈ R m , where k ∈ {1, . . . , k max } corresponds either to the sample number in a list of possible input values or to the data at the kth time instant if the system inputs possess a temporal relationship. Assuming a shallow NN with a single hidden layer, the vector y k ∈ R n of outputs can be computed by where g(·): R L ↦ R L is the activation function in the hidden layer, L is the number of hidden layer neurons and the output layer is linear. Here, W 1 ∈ R L×m and W 2 ∈ R n×L are weighting matrices of suitable dimensions with the bias vectors b 1 ∈ R L and b 2 ∈ R n .
In the frame of training feedforward NNs with the aim of function approximation, the vector g(x) of activation functions is usually chosen either as the so-called ReLU function or as a sigmoid-type function A typical choice for the latter is the hyperbolic tangent (implemented in MATLAB by the function tansig). Throughout this paper, we use the representation to avoid overestimation caused by a dependency of the numerator and denominator terms on the same variable in the alternative representation of the hyperbolic tangent function.
Remark 1. Both ReLU and sigmoid functions are capable of accurately representing input-output relationships in the frame of nonlinear function evaluation. ReLU representations are preferred in many current research works on regression problems in system identification because of their lower computational cost since efficient linear algebra routines and more simple gradient expressions can be applied during training phases. However, we make use of sigmoid representations in the hidden network layer in our paper because they automatically provide bounded outputs to input values with large magnitudes. This holds not only for pointvalued NN realizations but also for the interval-valued counterparts introduced in this contribution. For those, constant interval diameters can be expected for inputs with large magnitude if sigmoid activation functions are used while the interval diameters would grow inevitably in the ReLU case because the vector components of (Eq. 4) are unbounded for x i →+∞.

Interval Evaluation of Static Feedforward Neural Network Models
by a feedforward NN. For that purpose, we generate N = 500 normally distributed input samples q k with zero mean and standard deviation σ = 3 as the system inputs for which output measurements are available. These outputs y m,k , where the subscript m stands for measured data, are corrupted by uniformly distributed, additive noise from the interval Δ · [−1 ; 1] with Δ = 0.15. The presence of such noise processes is quite common in practice. It models the effects of random disturbances that can be considered as bounded due to quantization effects and the use of intelligent sensors detecting measurement outliers 1 . Figure 1 shows a comparison of the exact system behavior (Eq. 8), the training samples, and the corresponding point-valued NN representation, where the number of hidden layer neurons is set to L = 10 if either sigmoid-type or ReLU activation functions are used. In both cases, the NN was trained using a Bayesian regularization technique from MATLAB. During this process, the overall data set was subdivided randomly into training (70%), test (15%), and validation (15%) data.
A naive interval extension of the network (i.e., that directly replacing all floating point operations with the corresponding interval counterparts) for an equidistant interval mesh [q i ; where min(q i ) −10 and max( q i ) 10 is shown in Figure 2. This interval extension aims at quantifying the ranges of possible NN outputs without the necessity for sampling the inputs in a pointvalued form.
As mentioned in Remark 1, the use of a sigmoid hidden layer ensures bounded outputs for inputs with large absolute values. For both kinds of activation functions, the comparison of different mesh sizes for the input intervals reveals the problem of multiple interval dependencies that leads to overestimation in the computed output ranges. For that reason, it is necessary to apply techniques for the reduction of overestimation during the interval evaluation of NNs with interval-valued input data q k . This might include a (brute force) subdivision of input intervals 2 as shown in Figure 2 or algorithmic means such as centered form or affine representations and slope arithmetic (Cornelius and Lohner, 1984;Chapoutot, 2010). Due to the restriction to globally defined (and, for the sigmoid case, differentiable) functions in the hidden layer, these methods can be implemented using existing interval libraries such as INTLAB (Rump, 1999) or JULIAINTERVALS (Ferranti, 2021). An advantage of the latter is that it can be employed to execute the NN's source code directly on a GPU (graphics processing unit) to exploit data parallelism for large numbers of input samples q k .

Interval-Valued Parameterization of the Network's Activation Functions
In contrast to performing a set-based reachability analysis of NN models that are trained with the help of local optimization procedures and thus possess point-valued parameters b 1 , b 2 , W 1 , and W 2 as summarized in the previous subsection, the main goal of our paper is to determine suitable interval versions of these parameters to quantify the NNs' modeling quality.
Gradient-based training techniques can be generalized quite easily to determine interval-valued NN parameterizations (Oala et al., 2020). However, a drawback of the direct NN training using interval parameters might be the convergence to poor local minima of the considered cost function if inflating layer weights leads to similar output intervals as replacing selected bias terms by interval parameters. During the search for interval-valued NN parameterizations, the cost function should not only reproduce the training data with high accuracy but also reduce the interval diameters of the network's outputs, see Sec. 3 for further details. Additionally, we have to make sure that all (point-valued) system outputs (i.e., the actual point-valued measured data available for the training) are fully included in the networks' output intervals. This property can be relaxed, for example, by allowing a certain percentage of outliers that do not necessarily need to be included in the network's output range. This modification is similar to that from the relaxed intersection approach described in Jaulin and Bazeille (2009).
Poor convergence of the interval-based NN training relying on local optimization techniques by a direct search for interval parameters as stated above is caused by the fact that the conversion of different network parameters to interval quantities might lead to ambiguous output behavior, at least for some of the domains of the inputs q k . This fact is visualized in Figure 3 for the NN from the previous subsection. To limit the overestimation caused by interval-valued inputs q k ∈ [q k ], the evaluation is performed on the fine grid from Figure 2B. To illustrate the influence of converting either b 1 , b 2 , W 1 , or W 2 into interval quantities, each of them is multiplied independently with the interval bounds [0.95 ; 1.05] (parameter uncertainty of ±5%). Since b 1 , b 2 , W 1 , and W 2 each appear only once in Eq. 3, the NN does not exhibit classical dependency-based overestimation. However, the wrapping effect described in Jaulin et al. (2001) might still be present where the matrix multiplication is involved. Output ambiguities, concerning a similar shape of the enclosures [y k ], can be seen when comparing Figure 3A with Figure 3C (and Figure 3B with Figure 3D). This visual comparison indicates that converting all entries in W i to intervals might have a similar influence as the interval conversion of b i for some of the respective entries.
To circumvent such kind of poor convergence and to reduce the effect of the illustrated ambiguities, we propose a two-stage identification procedure for feedforward NNs with interval parameters. Rather than just replacing all entries by intervals provided by local optimization in a single stage, we show in Section 3 how to specifically determine those entries in b i and W i in a two-stage procedure for which the interval conversion leads to the smallest diameters of the NN output intervals while nonetheless including (all) measured data.

Interval Evaluation of Dynamic Neural Network Models
In contrast to static function approximations, dynamic NN models contain a kind of memory for previous input and state information. This can be realized, for example, by training recurrent networks with an Elman or Jordan structure (Elman, 1990;Mandic and Chambers, 2001), training NARX models (Nelles, 2020), or combining static function approximation networks with dynamic elements that can be represented by integer-or fractional-order transfer functions. The latter option leads to the dynamic NN models and Hammerstein-type nonlinearity representations discussed in Rauh (2021).
In all cases, interval parameters within the corresponding networks or also interval-valued system inputs and initial conditions lead to discrete-or continuous-time dynamic systems with interval parameters. For such systems, the wrapping effect of IA is an important issue to handle. For corresponding approaches, the reader is referred to the vast literature on verified simulation procedures for dynamic systems, cf. Nedialkov (2011);Lohner (2001). As far as dynamic NN-based model evaluations are concerned, Xiang et al. (2018) and the references therein provide a good overview of the current state-of-the-art. However, the computational effort increases by a large degree in comparison to the evaluation of purely static NN models. For that reason, we do not make use of a direct IA evaluation of any of the mentioned dynamic NN models. Instead, a NARX model, representing a point-valued approximation of the dynamic system behavior, is combined with a static interval-valued additive error correction model, both relying on the same finite window of input and output information.

PROPOSED INTERVAL PARAMETERIZATION OF NEURAL NETWORK MODELS
In this section, interval extensions of the static feedforward NN models and dynamic NARX models sketched in the previous section are presented in detail. For these two different types of models, we propose a unified interval methodology that quantifies the modeling quality.

The Static Case
For the purpose of static function approximations, an interval NN model structure shown in Figure 4 is considered. A generalization of the representation given in Eq. 3, namely, its interval extension aiming at a guaranteed enclosure of all measured samples y m,k , is denoted by where the interval parameters, highlighted by the thick lines in Figure 4, are defined for i ∈ {1, 2} according to In analogy to the definition of scalar interval variables, the inequalities Δb i ≤ Δ b i and ΔW i ≤ ΔW i hold in an element-wise manner. To make it possible to combine a point-valued NN parameterization with an IA-based uncertainty model, we assume that both b i and W i denote the interval midpoints to be identified in the first stage of the following procedure, while the additive bounds in the relations stated in (Eq. 10) are determined in the second stage.
In Figure 4 and all subsequent block diagrams of NN models in this paper, the parameters indicated by solid arrow heads are determined during the training and optimization phases of the proposed procedure while the non-filled arrow heads denote connections with fixed weights that are set to the constant value one.
We suggest the following interval procedure for static NN modeling.
Stage A Train the NN N (q k , W 1 , W 2 , b 1 , b 2 ) according to (Eq. 3) by means of a standard algorithm (e.g., trainbr in MATLAB) to minimize the quadratic cost function where T denotes the set of training samples' indices. The resulting parameters b 1 , b 2 , W 1 , W 2 define the interval midpoints that are kept constant during the following stage.
Stage B Determine the NN's interval extension. B1 Compute interval correction bounds for the parameter b 2 according to so that all measured ouput samples y m,k are included in the interval-valued NN outputs. In (Eq. 12), the symbol k denotes the tightest axis-aligned interval enclosure around all arguments of this operator.

B2 Determine the intermediate minimum cost function value
with w([x]) being the element-wise extension of the interval diameter definition from (Eq. 1) and the vector B3 Minimize the cost function by searching for optimal parameterizations of the additive interval bounds [Δb i ; Δ b i ] and [ΔW i ; Δ W i ] in (Eq. 10). Here, P k is a sufficiently large penalty term P k 100 · δ + Remark 2. Using this minimization procedure, implemented by means of a particle swarm optimization technique (Kennedy and Eberhart, 1995;Yang, 2014), it is guaranteed that the optimized bounds in Step B3 become tighter than the heuristic outer bounds in Step B1 in those domains for the inputs q k , where dense identification samples are available. This is a direct consequence of the fact that the relation J ≤ J* between the cost functions (Eq. 13) and (Eq. 15) is always satisfied if P k = 0. In this case, the NN is parameterized so that none of the measured samples lies outside the output interval bounds.
Remark 3. For scenarios in which a certain percentage of outliers are tolerated, the penalty term J* · P k in (Eq. 15) can be modified easily to ignore those samples in the summation that have the largest indicator values P k . For an example of this relaxation of the inclusion property, Sec. 4.
In Figure 5A and Figure 5B, a comparison is shown between the heuristic interval parameterization of a feedforward NN according to Step B1 and the systematic version according to Step B3 (for sigmoid activation functions). The comparison for the ReLU case is illustrated in Figures 5C,D. These two scenarios make use of the same data as in the previous section. Due to the symmetric (uniformly distributed) nature of the disturbance, the interval parameterization can be simplified in this example by setting to reduce the degrees of freedom during the particle swarm optimization in Stage B. For both kinds of NN, it can be seen clearly that the interval diameters are much smaller in the domains in which dense measurements are available. Outside those domains, the sigmoid version of the NN has the advantage of providing tighter bounds due to the inherited saturation of the hyperbolic tangent function (Eq. 7). For that reason, we use only the sigmoid implementation in the context of SOFC modeling (cf. Sec. 4).
In Table 1, there is a summary of the computing times 3 for evaluating the trained neural networks. Obviously, the slowest version is the naive evaluation of the point-valued networks in the form of the MATLAB network structure using the command sim. This can be accelerated significantly by employing automatic code generation (genFunction with the option MatrixOnly), which, however, pre-evaluates all NN-internal relations with the point-valued parameters. To obtain an implementation that is directly suitable for replacing the NN parameters with intervals, an extension of the Matlab functionalities by a vectorized implementation is necessary to which all parameters are passed as function arguments. Compared with this version, the interval-valued realization increases computing times by a factor 5 for ReLU activation functions and by a factor 10 in the nonlinear, sigmoid case.
Remark 4. Note that the true function (usually not available for the identification procedure) is not necessarily fully included in the determined interval outputs if the number of training points is insufficient in certain domains. If this phenomenon is detected at runtime after new measured data arrive, the model can be adapted by a re-application of Step B1. If computational resources allow this, Step B3, hot-started with the previously identified interval bounds, can also be re-applied.

Interval-Based Error Correction of NARX Models
In this paper, we extend the two different system structures for NARX models depicted in Figure 6 using IA. The first is the dynamic NN model N d,A in which both input variables q k and output information from a (previous) time window influence the dynamics in a nonlinear way ( Figure 6A). In contrast, the network N d,B in Figure 6B contains a simplified version in which only the input variables enter the nonlinear hidden layer with the sigmoid activation functions (Eqs. 5-7), while the previous outputs are fed back in a linear manner. For a compact mathematical representation of the networks (20) and with M ≥ 1 as the number of previous sampling points, we introduce the stacked vectors for input and output variables, respectively. As explained in Sec. 2.4, we suggest to extend both types of networks (ι ∈ {A, B} in Figure 6) using an additive correction that is implemented by means of a static NN with interval parameters such that the enclosure property y m,k ∈ N d,ι q k−M: k , y k−M: k−1 , W d,q,1 , W d,y,1 , with the combined input vector holds for all measured samples y m,k . Due to this specific structure, the network [N ] with interval parameters can be optimized with the help of the two-stage procedure described in Sec. 3.1 as soon as the structure and the parameters of the point-valued NARX models N d,ι have been fixed.
A detailed comparison of the alternatives from Figure 6 is given in the following section using the example of modeling the electrochemical behavior of an SOFC system.

NEURAL NETWORK MODELING OF THE ELECTRIC POWER OF A HIGH-TEMPERATURE FUEL CELL
SOFCs can only produce electric power from the supplied fuel gas if a certain minimum operating temperature is maintained in the interior of the fuel cell stack. Moreover, this temperature must be limited from above so that rapid wear of the system components is prevented. For that reason, the identification experiment used in this section only contains the high-temperature phases from the data sets in Rauh et al. (2021) and Rauh (2021). During this phase, an observer-based interval sliding mode controller (Rauh et al., 2015) is employed to keep the stack temperature in the vicinity of a constant value despite temporal changes of the electric SOFC power. During the identification experiment for the electric power characteristic of the SOFC, the fuel mass flow (hydrogen) and the electric current (specified by means of an electronic load) are varied according to the measured data shown  in Figure 8. During the first 0.7 h, the mass flow and current were specified as piecewise constant, while the remaining part of the data consists of a controlled system operation according to Frenkel et al. (2020).
Both, the open-loop and the controlled phases show that there exist numerous factors that influence the electric power characteristic of the SOFC. Describing all of these factors by physically motivated analytic models is a next to impossible task due to the following reasons. The electric power depends on the mass flows and temperatures of the supplied media in a strongly nonlinear way; the internal fuel cell temperature shows a temporally and spatially multi-dimensional dependency; and, last but not least, operating conditions depend on the electric current if the fuel cell is utilized in a current-controlled mode.   appear mainly because knowledge concerning the internal construction of a fuel cell stack is incomplete for the user due to manufacturer confidentiality, there are assembly constraints on gas supply manifolds and electric contact paths, manufacturing tolerances, and non-ideal material properties. An additional difficulty is that a perfectly constant, spatially homogeneous temperature cannot be achieved in practice if the electric load varies.
In previous work, it has been shown that point-valued static NN models such as the one in Figure 7 (and their dynamic extensions) can represent the measured SOFC system behavior quite accurately. To develop a shallow NN model with an acceptable hidden layer for the considered SOFC test rig, we identify system inputs having the largest influence on the electric power with the help of a principal component analysis based on the singular value decomposition (Kanjilal et al., 1993) as described in Rauh et al. (2021) and Rauh (2021). As shown in these papers and indicated in Figure 7, the relevant components of the input vector q k for the NN-based system identification are • the measurable stack temperatures ϑ m,(1,1,1),k and ϑ m,(1,3,1),k close to the gas inlet and outlet manifolds, • the electric current I k , • the inlet temperatures ϑ CG,m,k and ϑ AG,m,k at the stack's cathode and anode sides, and • the nitrogen and hydrogen mass flows _ m N 2 ,k and _ m H 2 ,k at the anode.
Moreover, we consider the influence of the mass flow _ m CG,k of preheated air at the cathode as a further NN input to achieve larger flexibility with respect to temporally varying operating conditions. In Figure 8, plots for selected measured input and output data are shown that are used throughout the remainder of this section. At the test rig, these inputs q k are sampled with a frequency of 10 Hz. Prior to the NN identification, we average these values to reduce the sampling frequency to 1 Hz. Our goal is to predict, based on these data, the SOFC stack voltage U k as the system output from which the instantaneous electric power P EL,k is determined by multiplying with the measured electric current I k (cf. Figure 7). Interval enclosures for the electric power can be used to forecast not only the uncertainty in the predicted system output but also for implementing set-based generalizations of the maximum power point tracking from Rauh (2021) and to forecast the uncertainty of fuel efficiency factors for specific operating points.
In this section, we first show results for traditional, pointvalued NN models for the electric power obtained with the help of static (Sec. 4.1) and dynamic (Sec. 4.2) networks. After that, we show how to employ the procedure described in Sec. 3 to parameterize interval extensions of both types of models reliably in Sec. 4.3 and Sec. 4.4. For that purpose, we exploit the fact shown in Rauh et al. (2021) and Rauh (2021) that using seven neurons in the hidden layer is sufficient for representing the SOFC system behavior in an accurate way if these neurons are parameterized by the sigmoid function (Eq. 7).

Point-Valued Static Neural Network Model
As the fundamental, static electric power model, the NN representation in Figure 7 is employed. This network is trained with the measured data described above by using the MATLAB NN toolbox. We rely on the Bayesian regularization back-propagation algorithm (parallelized on four CPU cores) with a maximum number of 5,000 epochs to solve the training task. During the network training, a worsening of the validation performance has been allowed in 50 subsequent iterations with a random subdivision into training (70%), test (15%), and validation (15%) data. For further details, see Rauh (2021).
The visualization of the training results in Figure 9 reveals that the largest approximation errors occur during those operating phases of the SOFC stack in which either the hydrogen mass flow or the electric stack current show significant temporal variations. Therefore, point-valued NARX models (not yet considered in previous publications by the authors) are discussed in the following subsection to reduce approximation errors in dynamic operating phases.

Point-Valued Dynamic NARX Model
As indicated in Figure 6, two different types of NARX models can be considered. These are either models with fully nonlinear characteristics of both the system inputs and the fed-back stack voltage (N d,A ) or a model simplification in which only nonlinearities with respect to the input variables are accounted for (N d,B ). To ensure comparability between    the previous static system model and the NARX representation, L = 7 is again used as the fixed number of hidden layer neurons. In addition, M = 10 (for the data set with the reduced sampling frequency of 1 Hz) serves as the memory length of the autoregressive model. This heuristically chosen value is motivated by a delay of a few seconds between the fuel cell stack's inputs and outputs that became visible during the experimental data acquisition. The training of both point-valued models N d,A ( Figure 10) and N d,B (Figure 11) is based on a two stage procedure (using the same optimization algorithm as for the static case in the previous subsection). First, an open-loop training is carried out. Second, the NARX model obtained from the first stage is converted into a closed-loop structure by feeding back the simulated outputs. This configuration is then re-optimized to further reduce approximation errors and to avoid instability of the closed-loop NN models. Here, directions for our future research will be to optimize the memory length M and to develop techniques for automatically selecting different memory lengths for the network inputs q k−M:k and for the output feedback y k−M:k−1 . When fully neglecting the output feedback, this would lead to nonlinear finite impulse response models in which the only delayed variables are the input data q k−M:k .
Compared with the static NN model, both NARX options reduce the deviations between measured and simulated data significantly, especially during the dynamic operating phases. This is confirmed in Table 2 that provides information on the roots of the corresponding mean square approximation errors (RMS). It can be clearly seen that approximation errors for both the voltage and the power are approximately 30% lower if the dynamic model N d,B with the linear output feedback is used instead of the static representation. These errors can be reduced even further if the fully nonlinear representation N d,A is used instead (cf. the visualization of the approximation errors in Figures 10B, 11B).

Interval Extension of the Static Neural Network Model
In this section, we employ the procedure described in Sec. 3 to parameterize the interval extension of the static NN model. The left column of Figure 12 shows the results of the direct application of this procedure to the complete available data set for the SOFC system. It can be seen that the operating domains with the largest uncertainty of the model (characterized by the largest interval diameters of the forecast electric power in Figure 12C) are those regions in which significant power variations due to sharp changes of the fuel cell current or the hydrogen mass flow occur (cf. Figure 8). Here, symmetric bounds have been determined for all NN parameters as suggested in (Eq. 19).
To reduce the resulting interval widths for the static approximation of the electric fuel cell power, the penalty term (Eq. 16) is modified in such a way that the largest positive deviations in (Eq. 18) of y m,k − y k and y m,k − y k -corresponding to outliers-are ignored in both cases for at most 2% of the measured voltage samples (cf. the right column of Figure 12). This modification results in an interval extension of the NN model that encloses almost all measured voltage and power data, respectively. Note that outliers occur only at those points where steplike current changes of large amplitudes take place. If the resulting interval-valued model is employed in future work to improve the robustness of the maximum power point tracking procedures or online optimization approaches for enhancing the fuel efficiency according to Rauh et al. (2020) and Rauh (2021), such large current changes can be avoided easily by imposing suitable rate constraints on the control signals.

Interval Extension of the NARX Model
In the final part of this section, interval extensions of both possible NARX models are presented in Figure 13. As in the previous subsection, we again compare the cases with and without admissible outliers. It can be seen that the fully nonlinear NARX model N d,A does not only provide smaller approximation errors for the point-valued system identification but also yields significantly smaller diameters in its interval extension.
In contrast to the interval extension of the static NN model, it can be seen that the interval diameters in Figures 13E,F are almost proportional to the electric current. This indicates that the identified interval network parameters in (Eq. 23) lead to almost constant interval widths if the fuel cell stack voltage is concerned as the output of the dynamic system model. On the one hand, this confirms quite strongly that the point-valued system model is parameterized in terms of a fairly robust minimum of the employed quadratic error functional in Stage A of the optimization algorithm. On the other hand, this indicates that further improvements of the modeling accuracy could be achieved only by replacing the feedback of the fuel cell voltage in the NARX models by the electric power. However, this option is not further investigated in this paper, because it would require removing the physically motivated multiplicative system output shown in Figure 7. As observed in Rauh et al. (2021), this would lead to the necessity of a larger number of hidden layer neurons already during the phase of optimizing the pointvalued system model. Therefore, a promising compromise for future research would be to modify the interval model (Eq. 23) in such a way that the point-valued NARX models N d,ι provide the stack voltage as a system output that is multiplied with the measured current I k , while [N ′] represents interval bounds for the power approximation error according to P el, m,k ∈ N d,ι q k−M: k , y k−M: k−1 , W d,q,1 , W d,y,1 , W d,2 , b d,1 , b d,2 · I k + N ′ q k ′, It is especially interesting to investigate this modified formulation in an ongoing research, in which the Kalman filter-based online power identification according to Rauh (2021) and the intervalvalued NN models from this paper are compared with respect to their capabilities to design robust procedures for the optimization of fuel efficiency while simultaneously tracking desired temporal responses for the stack power.

CONCLUSION AND OUTLOOK ON FUTURE WORK
In this paper, we have presented a novel possibility for extending both static NN models and NARX system approximations by error correction intervals. These corrections are chosen in such a way that the computed system outputs enclose available measured data and can be adjusted so that a certain percentage of outliers is permitted.
In future work, we plan to use the interval-valued NN models to assess the reliability of optimization-based control strategies which (in the sense of a predictive controller) adjust system inputs-such as the hydrogen mass flow-to implement tracking controllers for a nonconstant electric power under the simultaneous minimization of the fuel consumption. For this kind of application, it is necessary to avoid overshooting the maximum power point with certainty. So far, strategies have been developed which make use of a Kalman filterbased optimization method (Rauh, 2021). Generalizations to intervalvalued NN models can lead to the implementation of sensitivitybased predictive control procedures such as the one published in Rauh et al. (2011). Note that this approach would require intervalvalued partial derivatives of the NN outputs with respect to the gas mass flow and the electric current. This, however, can be achieved in a straightforward manner by overloading the operators using the interval data type of INTLAB with the functionalities for automatic differentiation available in the same toolbox.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article, further inquiries can be directed to the corresponding authors.