Parametric Curves Metamodelling Based on Data Clustering, Data Alignment, POD-Based Modes Extraction and PGD-Based Nonlinear Regressions

In the context of parametric surrogates, several nontrivial issues arise when a whole curve must be predicted from given input features. For instance, different samplings or ending points lead to non-aligned curves. This also happens when the curves exhibit a common pattern characterized by critical points at shifted locations (e.g., in mechanics, the elastic-plastic transition or the rupture point of a material). In such cases, classical interpolation methods fail to give physics-consistent results, and appropriate pre-processing steps are required. Moreover, when bifurcations occur in the parametric space, a coupling with clustering and classification algorithms is needed to enhance the accuracy of the surrogate. In this work we present several methodologies to overcome these issues. We also exploit such surrogates to quantify and propagate uncertainty, furnishing parametric statistical bounds for the predicted curves. The procedures are exemplified on two problems in Computational Mechanics.


INTRODUCTION
In a large variety of engineering applications, parametric surrogates are powerful tools (Simpson et al., 2001; Prud'homme et al., 2002; Audouze et al., 2013; Mainini and Willcox, 2015; Hesthaven et al., 2016; Benner et al., 2020a). They allow real-time monitoring and control of the most relevant physical quantities describing a given phenomenon. Moreover, they empower smart decision making, optimizing time and manufacturing costs. The uncertainty propagation in such models is also fundamental to operate efficiently in diagnosis and prognosis. In a non-intrusive framework, given an engineering problem, a Design of Experiments-DoE-based on the problem parameters is established and the corresponding responses of the system are collected into databases, which are used as training data to build the surrogate model via Machine Learning-ML-and Model Order Reduction-MOR-algorithms (Wang and Shan, 2007; Benner et al., 2015; Hesthaven and Ubbiali, 2018; Rajaram et al., 2020; Franchini et al., 2022; Khatouri et al., 2022). Such responses are usually the ensemble of several Quantities of Interest-QoI-observed, for instance, over time (i.e., time series) and can come both from experiments and numerical simulations. Therefore, each QoI is usually a curve, discretized according to the number of sampling points. This is the case when, for example, a material is tested and the force-displacement curve is extracted for different parameters p defining the material itself. It is also the case when a sensor placed on a mould records the pressure evolution while the injected resin fills the mould. In this paper, we propose several strategies to build parametric curve metamodels, illustrating the procedure over two applications in computational solid mechanics.
The parametric surrogate f_X takes as input a new combination of parameters p ∈ Ω and returns an approximation g̃(x; p) of g(x; p), that is:

f_X : Ω → G, p ↦ f_X(p) = g̃(·; p) ≈ g(·; p),

where G is a given functional space (in most engineering applications, G ⊆ L²(X)).
Our procedure mainly consists in the application of nonintrusive nonlinear regressions based on the sparse Proper Generalized Decomposition-sPGD- (Chinesta et al., 2011; Borzacchiello et al., 2017; Ibáñez et al., 2018; Sancarlos et al., 2021), these being efficient under the scarce-data-availability constraint. Indeed, in real engineering applications, when dealing with simulation-based metamodels, data availability is largely limited by the complexity of the Finite Element-FE-computations. From the High-Fidelity-HiFi-offline simulations, it is often possible to define a Reduced Order Model-ROM-, for instance, by extracting the most relevant Proper Orthogonal Decomposition-POD-modes from the training data (Raghavan et al., 2013; Fareed and Singler, 2019). Consequently, since the curve can be expressed in the extracted POD reduced basis through a set of weighting coefficients, the nonlinear regressions can be applied to predict such coefficients. A similar workflow is applied by Gooijer et al. (2021), where the POD-based surrogate models employ Radial Basis Function-RBF-interpolations. For the sake of completeness, it shall be noticed that the use of POD-based interpolations-PODI-has several limitations and drawbacks, particularly when dealing with nonlinear solution manifolds. To alleviate such issues, several works have been conducted in the framework of interpolations on Grassmann manifolds and their tangent spaces, improving the model robustness over the parametric space (Amsallem and Farhat, 2008; Mosquera et al., 2018, 2021; Friderikos et al., 2020, 2022).
However, ad-hoc physics-based data pre-processing is a fundamental step to be embedded in the procedure. Indeed, when different choices of the parameters carry radically different physical behaviours, the interpolation in the parametric space can lead to nonphysical solutions. In such cases, separate regression sub-models are built, requiring the coupling with some clustering and classification algorithms, leading to the so-called multiregression strategy.
Another non-trivial issue comes when the curves exhibit a common pattern characterized by some critical points resulting from a change in the physical behaviour. Indeed, a shift among the locations of such critical points in the different curves would cause nonphysical results when employing a classical interpolation. To overcome this matter, we propose a parametrization of the curves accounting for the locations of such critical points and allowing a curve alignment prior to the interpolation.
The main points addressed in this work are: 1. the parametric modeling of a quantity of interest using advanced sparse nonlinear regressions; 2. the parametric modeling of a curve where a data alignment is needed; 3. the statistical parametric modeling based on a parametrized physical model; 4. the statistical parametric model learned from scarce data (measurements); 5. and, finally, the concept of data clustering to overcome bifurcations in the parametric space.
The paper is structured as follows. Section 2 is mainly a review of some well-known techniques, except Subsection 2.3, which illustrates the implementation of the sPGD algorithm for the prediction of functions defined over an interval (i.e., curves). Elements of novelty are introduced in Sections 3 and 4, where 1) we propose a curve alignment prior to regression; 2) we define a statistical model for uncertainty propagation, furnishing confidence bounds for a parametric curve; 3) we employ a multi-regression, based on clustering and classification, to tackle bifurcations in the parametric space, enhancing the model accuracy. We exemplify the methodologies over two engineering applications in computational solid mechanics. The first application concerns a reduced order model for virtual materials characterized by a parametric Krupkowski hardening law; the second application is related to crack propagation analysis in parametric notched specimens under tensile loading. Section 5 is a short conclusion, in which possible further developments and approaches are discussed.

METHODS
In this section we briefly summarize the main tools in MOR employed in this work. For a complete description of the most recent advances in the MOR community, we refer to the handbooks by Benner et al. (2020c,b,a) and the plentiful bibliography therein.

POD
We assume that a numerical approximation of the unknown field of interest u(x, t) is known at the nodes x_i of a spatial mesh for discrete times t_j = (j − 1)Δt, with i ∈ [1, …, n_x] and j ∈ [1, …, n_t]. We use the notation u(x_i, t_j) ≡ u^j(x_i) ≡ u^j_i and define u^j as the vector of nodal values u^j_i at time t_j. The main objective of the POD is to obtain the most typical or characteristic structure ϕ(x) among these u^j(x), ∀j. For this purpose, we maximize the scalar

α = [ ∑_j ( ∑_i ϕ(x_i) u^j(x_i) )² ] / [ ∑_i ϕ(x_i)² ],

which leads to the following eigenvalue problem Cϕ = λϕ, where C = QQ^T is the two-point correlation matrix (symmetric and positive definite), Q being the n_x × n_t matrix whose jth column collects the nodal values u^j. In order to obtain a reduced-order model, we first solve the eigenvalue problem and select the r eigenvectors ϕ_i associated with the highest eigenvalues (truncated SVD at rank r), with in practice r ≪ n_x. These r eigenvectors are placed in the columns of a matrix B that allows reducing U into its reduced counterpart γ, according to U = Bγ. Then, considering the full-size system KU = F, we have KBγ = F. Premultiplying by B^T one gets B^T KBγ = B^T F and, with new definitions, the reduced counterpart becomes kγ = f.
The main drawback related to such a procedure is the size of the eigenproblem to be solved: the correlation matrix C is n_x × n_x, with n_x scaling with the number of nodes of the problem discretization, which in some applications reaches millions of degrees of freedom or more. The so-called Snapshot-POD alleviates this issue (Hilberg et al., 1994). The basic idea is that, when n_t ≪ n_x, it is much more convenient to solve the eigenvalue problem for C̃ = Q^T Q, whose size scales with n_t, and then retrieve the modes related to the highest eigenvalues.
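The Snapshot-POD route can be sketched in a few lines of NumPy; the synthetic snapshot matrix, its sizes, and the two spatial structures below are illustrative assumptions, not data from the paper's examples:

```python
import numpy as np

# Snapshot-POD sketch: when n_t << n_x, the eigenproblem of C~ = Q^T Q (size
# n_t x n_t) replaces the one of C = Q Q^T (size n_x x n_x).
rng = np.random.default_rng(0)
n_x, n_t = 200, 15
x = np.linspace(0.0, 1.0, n_x)
# snapshots spanned by two smooth spatial structures with random time weights
Q = np.outer(np.sin(np.pi * x), rng.standard_normal(n_t)) \
  + 0.1 * np.outer(np.sin(3 * np.pi * x), rng.standard_normal(n_t))

# small eigenproblem C~ v = lambda v
C_small = Q.T @ Q
eigvals, eigvecs = np.linalg.eigh(C_small)         # ascending eigenvalues
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# recover the spatial modes: phi_i = Q v_i / sqrt(lambda_i)
r = 2
Phi = Q @ eigvecs[:, :r] / np.sqrt(eigvals[:r])

# the same modes (up to sign) come from a truncated SVD of Q
U, S, Vt = np.linalg.svd(Q, full_matrices=False)
err = min(np.linalg.norm(Phi[:, 0] - U[:, 0]),
          np.linalg.norm(Phi[:, 0] + U[:, 0]))
print(err < 1e-8)
```

The final check confirms that the modes recovered from the small n_t × n_t eigenproblem coincide (up to sign) with the left singular vectors of Q.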

PODI
The origin of the non-intrusive POD comes from the so-called POD with interpolation. PODI considers different snapshots related to different values of the model parameter p, U(p_i), i = 1, …, n_s, assumed without loss of generality scalar and ordered, i.e., p_1 < … < p_{n_s}.
Then, as usual in POD-based MOR, the reduced basis is extracted, ϕ_1, …, ϕ_r. Now, for a given parameter p, with p_1 < p < p_{n_s} and p ∉ {p_1, p_2, …, p_{n_s}}, instead of expressing the searched solution in the reduced basis, U(p) = ∑_{i=1}^{r} γ_i(p) ϕ_i, and then looking for the coefficients γ_i(p) by Galerkin projection, i.e., by solving (B^T KB)γ = B^T F (which requires assembling the matrix and performing the matrix products before finally solving the reduced linear system of equations), PODI proceeds as follows.
• Sampling: U(p_i) ≡ U_i, i = 1, …, n_s;
• Reduced basis extraction: POD is applied to extract the reduced basis ϕ_1, …, ϕ_r;
• Reproduction: calculation of γ^i. For that, we look to express U_i = ∑_{j=1}^{r} γ^i_j ϕ_j. Premultiplying by ϕ_k and taking into account the orthonormality of the reduced basis, it results in γ^i_k = ϕ_k^T U_i. Repeating for all i ∈ {1, …, n_s} and k ∈ {1, …, r}, we obtain γ^i (the reduced counterpart of U_i).
• Interpolation: with the reduced solution representations γ^i ≡ γ(p = p_i), one is tempted, for any other p, to proceed by interpolation, i.e.,

γ(p) = ∑_{i=1}^{n_s} γ^i F_i(p),

with F_i(p) the approximation functions, which define an interpolation as soon as F_i(p_j) = δ_ij, with δ_ij the Kronecker delta.
• Reconstruction: with γ(p) obtained, the solution can be reconstructed everywhere from the nodal values U(p) = Bγ(p).
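The five PODI steps (sampling, reduced basis extraction, reproduction, interpolation, reconstruction) can be sketched as follows; the stand-in "solver", the sampling, and the piecewise-linear choice for F_i(p) are illustrative assumptions:

```python
import numpy as np

# PODI sketch for a scalar parameter p.
def solution(p, x):
    # stand-in for a high-fidelity solver
    return np.sin(np.pi * x) * p + np.sin(2 * np.pi * x) * p ** 2

x = np.linspace(0.0, 1.0, 100)
p_train = np.array([0.2, 0.4, 0.6, 0.8, 1.0])      # ordered sampling p_1 < ... < p_ns
U_snap = np.column_stack([solution(p, x) for p in p_train])

# reduced basis extraction (truncated SVD, rank r)
r = 2
U, S, Vt = np.linalg.svd(U_snap, full_matrices=False)
B = U[:, :r]                                       # columns: POD modes

# reproduction: gamma_i = B^T U_i (orthonormal basis => plain projection)
gamma_train = B.T @ U_snap                         # shape (r, n_s)

# interpolation of each reduced coefficient over p (piecewise-linear F_i)
p_new = 0.5
gamma_new = np.array([np.interp(p_new, p_train, gamma_train[k]) for k in range(r)])

# reconstruction: U(p) = B gamma(p)
u_podi = B @ gamma_new
err = np.linalg.norm(u_podi - solution(p_new, x)) / np.linalg.norm(solution(p_new, x))
print(err < 0.05)
```

Note that the reconstruction is exact within the span of the reduced basis; the residual error comes only from interpolating the coefficients γ(p) between the sampled parameter values.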

Extension to Multi-Parametric Settings
The just discussed procedure seems very appealing; however, its extension to highly multidimensional settings remains difficult, because usual approximation bases suffer from the so-called curse of dimensionality.
In the case of moderate dimensionality, the PODI algorithm is easily generalizable. For that purpose we first reformulate the PODI described above as follows: the reconstruction U(p) = Bγ(p) can be expressed in the equivalent form

U(p) = ∑_{k=1}^{r} γ_k(p) ϕ_k.

With γ^i_k ≡ γ_k(p_i) known, the interpolation can be expressed as

γ_k(p) = ∑_{i=1}^{n_s} γ^i_k F_i(p),

which is directly generalizable to the multi-parametric setting, where the scalar p is replaced by the parameter vector p, with the interpolation expressed now as

γ_k(p) = ∑_{i=1}^{n_s} γ^i_k F_i(p).

As previously mentioned, the main difficulty associated with the technique just described is interpolating when the number of parameters (the size of vector p) increases too much. Separated representations in sparse settings, addressed in Subsection 2.3, succeed in circumventing this difficulty.

Advanced Sparse PGD-Based Nonlinear Regressions
Here we discuss the PGD-based regression methods used to build metamodels depending on d features. In particular, we focus on the cases where, for a given choice of the parameters:
1. a single-valued output is measured; 2. a vector-valued output is measured; 3. a single-valued output is measured over a certain interval.

Single-Valued Output
In the case of a scalar output, the general problem consists of constructing the function f(p_1, …, p_d) that depends on d features (parameters) p_k, k = 1, …, d, taking values in the parametric space Ω, where a sparse sample of n_s points and the corresponding outputs are known.
The so-called sparse PGD (sPGD) expresses the function f through a low-rank separated representation constructed from rank-one updates within a greedy constructor:

f(p_1, …, p_d) ≈ f̃_M(p_1, …, p_d) = ∑_{m=1}^{M} ∏_{k=1}^{d} ψ^k_m(p_k).   (1)
In the previous expression, f̃_M refers to the approximation, M to the number of employed modes (sums), and ψ^k_m are the one-dimensional functions associated with mode m and dimension k.
Functions ψ^k_m, m = 1, …, M and k = 1, …, d, are expressed in a standard approximation basis N^k_m via coefficients a^k_m:

ψ^k_m(p_k) = ∑_{j=1}^{D} a^k_{m,j} N^k_{m,j}(p_k) = (N^k_m)^T a^k_m,   (2)

where D represents the number of degrees of freedom (nodes) of the chosen approximation and N^k_m is the vector collecting the shape functions.
In the context of usual regression, the approximation f̃_M results from the minimization

f̃_M = arg min_{f̃} ∑_{i=1}^{n_s} ( f(p^i) − f̃(p^i) )²,   (3)

where f̃_M takes the separated form of Eq. 1, n_s is the number of sampling points used to train the model and p^i are the vectors containing the input data points of the training set. Notice that, to avoid overfitting, the number of basis functions D must satisfy D < n_s. The approximation coefficients of each one-dimensional function are computed by employing a greedy algorithm, such that, once the approximation up to order M − 1 is known, the Mth-order term reads

f̃_M = f̃_{M−1} + ∏_{k=1}^{d} ψ^k_M(p_k).

The computed function is expected to approximate f not only on the training set but at any point p ∈ Ω.
The main issue is how to combine rich approximations with scarce available data while avoiding overfitting. For that purpose, a modal adaptivity strategy-MAS-was associated with the sPGD; however, in some cases it has been observed that the desired accuracy is not reached before overfitting occurs, or that the algorithm stops too early. The latter issue yields a PGD solution composed of low-order approximation functions, and thus a function that is not as rich as desired. Papers describing the just referred techniques are (Borzacchiello et al., 2017; Ibáñez et al., 2018).
In addition, in problems where just a few terms of the interpolation basis are present (that is, there are just some sparse nonzero elements in the interpolation basis to be determined), the strategy fails to recognize the true model and therefore lacks accuracy.
To overcome these difficulties, different regularizations were proposed (Sancarlos et al., 2021), combining L2 and L1 norms acting on the coefficients a^k_m, in order to increase the predictive performance beyond the sPGD capabilities, or to construct parsimonious models while improving predictive performance.
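A minimal greedy rank-one separated regression in the spirit of Eqs. 1–3 (without the MAS or the regularizations just mentioned) can be sketched as follows; the target function, polynomial degree, and alternating-least-squares settings are illustrative assumptions:

```python
import numpy as np

# Hedged sketch of a greedy rank-one separated regression in the spirit of the
# sPGD (no MAS, no regularization).
rng = np.random.default_rng(1)
d, n_s, deg, M = 2, 40, 2, 3

P = rng.uniform(-1.0, 1.0, size=(n_s, d))           # sparse training samples in Omega
f = P[:, 0] ** 2 * (1.0 + P[:, 1]) + 0.5 * P[:, 1]  # target outputs (separable-friendly)

def basis(t):
    # 1D polynomial basis N_j(t) = t^j, with D = deg + 1 degrees of freedom
    return np.vander(t, deg + 1, increasing=True)

N = [basis(P[:, k]) for k in range(d)]              # per-dimension design matrices
residual = f.copy()
for m in range(M):                                  # greedy enrichment, mode by mode
    a = [np.ones(deg + 1) for _ in range(d)]        # coefficients of the rank-one term
    for _ in range(50):                             # alternating least squares sweeps
        for k in range(d):
            other = np.ones(n_s)                    # product of the other factors
            for j in range(d):
                if j != k:
                    other *= N[j] @ a[j]
            A = N[k] * other[:, None]
            a[k], *_ = np.linalg.lstsq(A, residual, rcond=None)
    term = np.ones(n_s)
    for k in range(d):
        term *= N[k] @ a[k]
    residual = residual - term                      # update the residual greedily

rel_err = np.linalg.norm(residual) / np.linalg.norm(f)
print(rel_err)
```

Each greedy update freezes the previously computed modes and fits a new rank-one term to the residual by alternating least squares over the d one-dimensional coefficient vectors, mirroring the Mth-order enrichment described above.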

Vector-Valued Output
In the case of a multidimensional output, we seek the vector-valued function f(p_1, …, p_d) = (f_1(p_1, …, p_d), …, f_n(p_1, …, p_d)). This is a trivial extension of the single-valued case, where each component f_i(p_1, …, p_d), for i = 1, …, n, is fitted independently using the procedures explained in Subsection 2.3.1.

Single-Valued Output Over an Interval
Let us now consider the case where, for d features (parameters), the system output is a univariate function of the variable x, that is, g(x; p) for x ∈ X. The parametric surrogate f_X takes as input a new combination of parameters p ∈ Ω and returns an approximation g̃(x; p) of g(x; p), that is, f_X(p) = g̃(·; p) ∈ G, where G is a given functional space.
Usually, the target function g(x) is evaluated (known) in a finite number n_x of sampling points, that is, the discrete ensemble X = {x_j}_{j=1}^{n_x}. In this case, the coordinate x can be considered as an additional parameter, and the approximation problem can be reformulated as seeking the function

f(p_1, …, p_d, p_{d+1}), with (p_1, …, p_d) ∈ Ω and p_{d+1} ∈ X, defined over Ξ = Ω × X.

We have dropped the subscript X related to the variable x since the approximation problem has been recast into a new parametric framework defined by Ξ. The newly defined parametric coordinate p_{d+1} accounts for the location at which g(x) shall be approximated, that is, p_{d+1} ≡ x. Such a coordinate is thus much richer than the others, given the very fine discretization in n_x points available along this direction, compared to the sparse knowledge concerning the first d parametric coordinates belonging to Ω.

Equation 1 now reads

f̃_M(p_1, …, p_d, x) = ∑_{m=1}^{M} [ ∏_{k=1}^{d} ψ^k_m(p_k) ] ψ^{d+1}_m(x),

where the univariate functions of the first d parameters {ψ^k_m}_{k=1}^{d}, for m = 1, …, M, are still expressed by the same polynomial basis defined in Eq. 2. However, the functions ψ^{d+1}_m can be expressed through standard piecewise-linear basis functions (i.e., Lagrangian hat functions), defined over the n_x discretization points of the coordinate x:

ψ^{d+1}_m(x) = ∑_{j=1}^{n_x} a^{d+1}_{m,j} N_j(x),

where n_x is the number of discretization points and, with h denoting a uniform discretization step,

N_j(x) = max(0, 1 − |x − x_j| / h).

The minimization problem (Eq. 3) can also be recast over the enriched training set {(p^i, x_j), g(x_j; p^i)}. With these definitions made, the algorithm runs as previously explained.
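The hat-function expansion along the extra coordinate can be verified with a short sketch; the node count and the sample function g are illustrative assumptions:

```python
import numpy as np

# Lagrangian hat functions along the extra coordinate x.
n_x = 11
x_nodes = np.linspace(0.0, 1.0, n_x)
h = x_nodes[1] - x_nodes[0]                        # uniform discretization step

def hat(j, x):
    # piecewise-linear basis function centered at node j
    return np.maximum(0.0, 1.0 - np.abs(x - x_nodes[j]) / h)

# with nodal coefficients a_j = g(x_j), the expansion sum_j a_j N_j(x)
# reproduces the standard piecewise-linear interpolant of g
g = lambda x: np.sin(2 * np.pi * x)
x_eval = np.linspace(0.0, 1.0, 57)
expansion = sum(g(x_nodes[j]) * hat(j, x_eval) for j in range(n_x))
reference = np.interp(x_eval, x_nodes, g(x_nodes))
print(np.allclose(expansion, reference))
```

On a uniform grid the hat functions form a partition of unity, so the expansion with nodal coefficients coincides exactly with piecewise-linear interpolation.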

POD Modes Extraction
Here we reformulate the approximation problem of curves within a POD-based MOR builder, which can be seen as a data precompression and dimensionality reduction approach. Indeed, considering the training data {g_i(x)}_{i=1}^{n_s}, for x ∈ X = {x_j}_{j=1}^{n_x}, the following snapshot matrix can be built:

G = [g^1, g^2, …, g^{n_s}] ∈ ℝ^{n_x × n_s},

where each g^i ∈ ℝ^{n_x × 1} contains the evaluations of g_i(x) over the discrete ensemble X.
A reduced factorization of the snapshot matrix is then obtained via a standard truncated POD of rank r:

G ≈ U Σ V^T,

where U ∈ ℝ^{n_x × r}, Σ ∈ ℝ^{r × r}, V ∈ ℝ^{n_s × r}. From these, we can define the matrices of POD modes and coefficients, respectively:

Φ = U,    Λ = V Σ.

In particular, the matrix Φ contains, by columns, the functions of the reduced POD basis {ϕ_i(x)}_{i=1}^{r} evaluated at the points in X, while Λ collects the projection coefficients onto the reduced basis. A generic curve g_k(x) belonging to the training dataset, for k = 1, …, n_s and with x ∈ X, has the reduced counterpart g^{(r)}_k and, in particular, its discrete form reads

g^{(r)}_k = Φ (Λ_{k,•})^T,   (4)

where Λ_{k,•} denotes the kth row of the matrix Λ.
Let us now consider a parametric curve depending on d features p ∈ Ω, that is, g(x; p), for x ∈ X. From Eq. 4 it is clear that, once the reduced basis matrix Φ is available, such a function is projected onto this basis only through the POD (parametric) coefficients {λ_i(p)}_{i=1}^{r}:

g(x; p) ≈ ∑_{i=1}^{r} λ_i(p) ϕ_i(x).

The above equation suggests that a reduced-order parametric metamodel for the curves can be built considering only the set of coefficients {λ_i(p)}_{i=1}^{r}. In particular, the following parametric function shall be constructed:

λ(p) = (λ_1(p), λ_2(p), …, λ_r(p)),

from the available training dataset {p^k, Λ_{k,•} = (λ_{k,1}, λ_{k,2}, …, λ_{k,r})}_{k=1}^{n_s} obtained after the POD. This problem can be solved by the algorithm exposed in Subsection 2.3.2.
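The complete pipeline (snapshot matrix, truncated POD, one regression per coefficient λ_i(p), reconstruction) can be sketched as follows; a single scalar parameter and plain cubic polynomial fits are illustrative stand-ins for the sPGD regressions of the text:

```python
import numpy as np

# Sketch: compress training curves with a truncated POD, then regress the
# POD coefficients lambda_i(p).
x = np.linspace(0.0, 1.0, 120)
p_train = np.linspace(0.5, 1.5, 12)                      # n_s = 12 training samples
curve = lambda p: np.exp(-p * x) + 0.3 * p * np.sin(2 * np.pi * x)
G = np.column_stack([curve(p) for p in p_train])         # snapshot matrix, n_x x n_s

U, S, Vt = np.linalg.svd(G, full_matrices=False)         # truncated POD of rank r
r = 3
Phi = U[:, :r]                                           # POD modes (matrix Phi)
Lam = (np.diag(S[:r]) @ Vt[:r]).T                        # coefficients (matrix Lambda)

# one regression per POD coefficient lambda_i(p)
fits = [np.polyfit(p_train, Lam[:, i], 3) for i in range(r)]

def predict(p):
    lam = np.array([np.polyval(c, p) for c in fits])
    return Phi @ lam                                     # reconstruction on the basis

p_new = 0.93
rel_err = np.linalg.norm(predict(p_new) - curve(p_new)) / np.linalg.norm(curve(p_new))
print(rel_err < 0.05)
```

The regression is performed on r coefficients rather than on n_x curve values, which is the data-compression benefit announced above.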

Multi-Regression
Creating a unique regression over large physical and parametric domains is a tricky issue. Constructing a regression of a scalar quantity of interest is much more accurate than constructing a parametric curve (e.g., the parametric time evolution of the solution at a certain point), which in turn is much more accurate than constructing a regression of a whole field. The reason is that regressions are in general constructed using the L2-norm; consequently, if a given field exhibits strong localizations, these local behaviors are sacrificed for the benefit of a reasonably good solution everywhere (on average).
Thus, a valuable route for enhancing accuracy consists in partitioning the physical space, in order to perform a regression in each of the resulting patches. Local quasi-linear regressions perform in general better than rich nonlinear regressions in the whole space domain.
The main issue in using multiple regressions, one per patch, is that continuity can be lost on the patch boundaries. One could try to enforce continuity, for example within a Partition of Unity-PU-framework; however, continuity is not compulsory, and on the patch borders (or in their neighborhood) one can compute the regressions from both sides and average them. Another possibility is taking advantage of those discontinuities for refinement purposes, as usually done within the finite element method framework.
In the case of parametric models the issue that we just discussed not only affects the spatial domain, but also the parametric one. In that case, making a partition of the multiparametric space is not simple. One possibility consists in clustering the solutions related to the considered sampling, for example by invoking the k-means. Then, a nonlinear regression is created from the solutions in each cluster. Finally, the trickiest issue becomes the way of associating a cluster to any parameters choice, that is, performing an accurate classification. The procedure can be summarized in the following steps: 1. clustering high-fidelity solutions related to a design of experiments; 2. creating a regression model in each cluster (for instance, via the algorithms presented in Subsection 2.3); 3. constructing a classifier able to associate a cluster to any parameters choice and to select the most suitable regression model.
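The three steps can be sketched on a toy bifurcating response; the 2-means clustering, the per-cluster linear fits, and the nearest-centroid classifier below are illustrative stand-ins for the algorithms of Subsection 2.3:

```python
import numpy as np

# Three-step multi-regression sketch: cluster the responses, fit one regression
# per cluster, classify new parameter values.
rng = np.random.default_rng(2)
p = np.sort(rng.uniform(0.0, 1.0, 60))
g = np.where(p < 0.5, p, 1.0 - p)                  # tent-shaped response, kink at p = 0.5

# 1) cluster the (parameter, response) pairs with a small 2-means loop
X = np.column_stack([p, g])
centers = X[[0, -1]].copy()                        # extreme samples as initial means
for _ in range(20):
    labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
    centers = np.array([X[labels == c].mean(axis=0) for c in range(2)])

# 2) one linear regression per cluster
coefs = [np.polyfit(p[labels == c], g[labels == c], 1) for c in range(2)]

# 3) classifier: a new parameter picks the cluster with the nearest p-centroid
def predict(p_new):
    c = int(np.argmin(np.abs(p_new - centers[:, 0])))
    return np.polyval(coefs[c], p_new)

truth = lambda q: q if q < 0.5 else 1.0 - q
err = max(abs(predict(q) - truth(q)) for q in [0.1, 0.3, 0.7, 0.9])
print(err < 0.01)
```

A single global linear fit would smear the kink; the two local regressions recover each branch almost exactly, which is the accuracy gain the multi-regression strategy targets.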

k-Means
k-means is one of the earliest methods for unsupervised vector quantization in artificial intelligence (MacQueen, 1967). In essence, as Support Vector Machines-SVMs- (Cristianini and Shawe-Taylor, 2000) would do in the context of supervised learning models, k-means performs cluster analysis. In other words, this technique groups a set of objects such that every member of a cluster is more similar (closer) to the other members of that cluster than to any member of the other clusters.
In the case of k-means, this partition is made on the basis that each experimental datum pertains to the cluster with the nearest mean. As can be readily noticed, this is equivalent to computing Voronoi cells in the data. Formally, if we have a set of observations in the form of high-dimensional vectors (x_1, x_2, …, x_M), we aim at partitioning these M observations into k sets (k ≤ M), S = {S_1, S_2, …, S_k}, such that

S = arg min_S ∑_{i=1}^{k} ∑_{x ∈ S_i} ‖x − μ_i‖²,

where μ_i is the mean of each cluster.
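A minimal Lloyd's iteration for this objective can be sketched as follows; the two-blob dataset and the initialization are illustrative assumptions:

```python
import numpy as np

# Minimal Lloyd's iteration for k-means.
rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0.0, 0.3, (50, 2)),      # blob around (0, 0)
               rng.normal(3.0, 0.3, (50, 2))])     # blob around (3, 3)
k = 2
mu = X[[0, 50]].copy()                             # one initial mean per blob

def objective(X, mu, labels):
    # sum of squared distances of each point to its cluster mean
    return sum(((X[labels == i] - mu[i]) ** 2).sum() for i in range(k))

obj_history = []
for _ in range(10):
    # assignment step: each point goes to its nearest mean (Voronoi cells)
    labels = np.argmin(((X[:, None] - mu[None]) ** 2).sum(-1), axis=1)
    # update step: each mean becomes the centroid of its cluster
    mu = np.array([X[labels == i].mean(axis=0) for i in range(k)])
    obj_history.append(objective(X, mu, labels))

# Lloyd's algorithm never increases the objective
print(all(a >= b - 1e-9 for a, b in zip(obj_history, obj_history[1:])))
```

Each assignment/update pair can only decrease (or keep) the objective above, which is why the iteration converges to a local minimum of the clustering criterion.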

DATA ALIGNMENT AND UNCERTAINTY PROPAGATION
In this Section we will present the curve parameterization based on data alignment to obtain an accurate physics-informed interpolation. We will exemplify the procedure to study the mechanical response of parametric materials loaded in tension.
In this Section we consider a parametric study over dog-bone tensile test samples, as sketched in Figure 1. We are interested in the influence of the 3 parameters (n, K, ε_0) characterizing the Krupkowski hardening law (also known as the Swift hardening law), widely used in FEM software, linking the true stress and the true strain:

σ̄ = K (ε_0 + ε̄_p)^n,

where ε̄_p denotes the effective plastic strain, ε_0 the offset strain, n the strain hardening exponent and K the material constant.
Figure 2 (top) shows two patterns of the Force-Displacement curve, obtained for two different choices of the Krupkowski parameters (blue and orange lines). A classical interpolation of these two patterns would result in the nonphysical black dashed pattern.
In what follows, we propose a procedure to overcome such spurious effects, based on a curve alignment prior to interpolation. The method is illustrated over the Force-Displacement curves. However, for the sake of generality, we refer to such curves as generic functions g(x), presenting two characteristic behaviors in the so-called primary and secondary zones. In the specific case of Force-Displacement curves, the primary zone is the elastic response of the material, up to the yield point x_E. The secondary zone is the post-yield behaviour up to the failure point x_F, as illustrated in Figure 3. We will also refer to x_E as the "transition point" and to x_F as the "end point", related to the specimen fracture.
We assume that the behaviors in the primary and secondary zones, g^1(x) and g^2(x) respectively, as well as the transition and end points, x_E and x_F respectively, depend on a series of parameters grouped in the vector p, i.e., g^1(x; p) ≡ g(x ∈ [0, x_E]; p), g^2(x; p) ≡ g(x ∈ [x_E, x_F]; p), x_E(p) and x_F(p). Indeed, when considering different choices of the model parameters p_i = (K_i, n_i, ε_{0,i}), i = 1, …, n_s, one obtains a set of curves, such as the ones shown in Figure 4. Such curves correspond to a sparse DoE (Latin Hypercube) of 20 points in the 3-dimensional parametric space Ω = I_K × I_n × I_{ε_0}, considering the parameter bounds specified in Table 1. Numerical simulations have been carried out with the VPS simulation software from ESI Group. The variable x corresponds to the displacement in mm, while the function g(x) corresponds to the force in kN.

FIGURE 2 | Main issue encountered when using standard interpolations on non-aligned curves (the black dashed line represents the interpolation between the two colored lines).
Once the transition and end points of each curve have been determined, the curves can be rediscretized over the same number of points (through a standard piecewise-linear interpolation, for instance). To align them, we define the normalized coordinates

y = x / x_E,    z = (x − x_E) / (x_F − x_E),

expressions that hold for each curve g(x; p_i), i = 1, …, n_s, with y, z ∈ [0, 1]. Figure 5 depicts the functions g^1_i(y) ≡ g^1(y; p_i) and g^2_i(z) ≡ g^2(z; p_i).
Actually, this procedure amounts to performing an alignment based on a dilatation of the curves in the primary and secondary zones, as shown in Figure 6. In such a case, we can express the aligned curves as functions of x̃ ∈ [0, 2].
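The dilatation-based alignment can be sketched as follows; the bilinear toy curves and their critical points are illustrative assumptions, standing in for the simulated Force-Displacement curves:

```python
import numpy as np

# Dilatation-based alignment: each curve is rediscretized on the normalized
# coordinates y = x / x_E (primary zone) and z = (x - x_E) / (x_F - x_E)
# (secondary zone).
def toy_curve(x, x_E, x_F):
    # unit ramp up to the transition point, then a gentler slope up to x_F
    return np.where(x <= x_E, x / x_E, 1.0 + 0.2 * (x - x_E) / (x_F - x_E))

n_pts = 50
y = np.linspace(0.0, 1.0, n_pts)   # normalized primary-zone coordinate
z = np.linspace(0.0, 1.0, n_pts)   # normalized secondary-zone coordinate

aligned = []
for x_E, x_F in [(1.0, 3.0), (1.5, 2.5), (0.8, 4.0)]:
    x = np.linspace(0.0, x_F, 400)                 # original (non-aligned) sampling
    g = toy_curve(x, x_E, x_F)
    g1 = np.interp(y * x_E, x, g)                  # primary zone mapped to y in [0, 1]
    g2 = np.interp(x_E + z * (x_F - x_E), x, g)    # secondary zone mapped to z in [0, 1]
    aligned.append((g1, g2))

# after alignment every curve reaches its transition value at y = 1 (i.e., z = 0)
print(all(np.isclose(g1[-1], 1.0, atol=5e-2) for g1, _ in aligned))
```

After this step all curves share the same discretization and the same critical-point locations, so pointwise interpolation between them no longer mixes pre- and post-transition behaviors.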
Once the curves have been aligned, the nonlinear regressor presented in Subsection 2.3.3 can be invoked to build the parametric metamodel of the curve. This can be done separately in each zone or over the whole newly defined coordinate x̃. However, before proceeding with the regression, we perform a further parametrization via the Proper Orthogonal Decomposition to achieve an additional model reduction, as discussed next.

POD Modes Extraction
In order to extract the most significant modes able to describe these functions, the POD can be applied to each group of curves in Figure 5. This amounts to building the snapshot matrix within each group and performing a truncated SVD. In the case that serves here to illustrate the procedure, a single mode suffices to describe the almost linear functions in the primary zone, denoted by ξ_1(y), whereas in the secondary zone two functions are needed, ϕ_1(z) and ϕ_2(z).
Thus, any function g^1_i(y) can be expressed, ∀i, as

g^1_i(y) = α^i_1 ξ_1(y),

whereas the functions g^2_i(z), ∀i, read

g^2_i(z) = β^i_1 ϕ_1(z) + β^i_2 ϕ_2(z).

The α and β coefficients can be easily computed by simple projection, i.e.,

α^i_1 = ∫_0^1 g^1_i(y) ξ_1(y) dy,

where the normality of ξ_1(y) was used. In the same way, and taking into account the orthonormality of the functions ϕ_1(z) and ϕ_2(z),

β^i_1 = ∫_0^1 g^2_i(z) ϕ_1(z) dz,    β^i_2 = ∫_0^1 g^2_i(z) ϕ_2(z) dz.

Thus, for each curve g_i(x) we succeeded in extracting its five main descriptors: x^i_E, x^i_F, α^i_1, β^i_1 and β^i_2, all of them related to the features grouped in the vector p_i. Now, each of these descriptors can be expressed parametrically, x_E(p), x_F(p), α_1(p), β_1(p) and β_2(p), by using the regression techniques described in Subsection 2.3.1 for scalar quantities.

Curves Reconstruction
When considering a choice of the parameters p, the curve descriptors x_E(p), x_F(p), α_1(p), β_1(p) and β_2(p) are extracted from the regressions, the dimensionless coordinates defining both zones are calculated from

y = x / x_E(p),    z = (x − x_E(p)) / (x_F(p) − x_E(p)),

and, finally, the curve in each zone is reconstructed according to

g^1(y; p) = α_1(p) ξ_1(y),    g^2(z; p) = β_1(p) ϕ_1(z) + β_2(p) ϕ_2(z),

from which the curve g(x; p) can be straightforwardly obtained by mapping y and z back to the original coordinate x. To build the parametric metamodel, 17 curves have been used to train the sPGD regressor, while the remaining 3 were kept for testing. Figure 7 shows the resulting predictions over 3 training points and the test points.

FIGURE 7 | sPGD predictions (green line for training, red for testing) versus true curve (blue line).

Real-Time Calibration
Now, given an experimental curve g(x), its descriptors are extracted as follows:

• x_E from the point at which the change of behavior occurs (for instance, computing the function derivatives by means of finite differences);
• x_F is the terminal point;
• α_1 follows from y = x / x_E and ∫_0^1 g(y) ξ_1(y) dy = α_1;
• β_1 follows from z = (x − x_E) / (x_F − x_E) and ∫_0^1 g(z) ϕ_1(z) dz = β_1;
• β_2 follows from ∫_0^1 g(z) ϕ_2(z) dz = β_2.

Then, from the regression models x_E(p), x_F(p), α_1(p), β_1(p) and β_2(p), the inverse problem is solved to extract the associated parameters, p.

Statistical Model Derived by Parametric Curves
With the previously built surrogate model, the curve related to any possible value of p can be computed in real-time, i.e. g(x; p). In this section, this surrogate will be employed for uncertainty quantification.
We assume that each feature p_k in the vector p is characterized by a Gaussian distribution defined by its mean value μ_k and its variance σ²_k, that is, p_k ∼ N(μ_k, σ²_k). Assuming all the p_k to be independent, we get

p ∼ N(μ, diag(σ²)), with μ = (μ_1, …, μ_d) and σ² = (σ²_1, …, σ²_d),

where diag(•) is the diagonal matrix with diagonal •. The aim is linking the sensitivity over the input features with that over the output curve. This means computing estimators of the average M and the variance Σ of the curve descriptors for different choices of μ and σ, and from them, by using the regressions presented in Subsection 2.3, building the set of statistical surrogates

M_O(μ, σ),    Σ_O(μ, σ),   (5)

where O(p) denotes any QoI involved in the curve parametrization (i.e., an output depending on the input parameters; e.g., x_E, x_F, α_1, β_1 and β_2 in the example presented before) and M and Σ are the corresponding estimators of the mean and variance, respectively. This allows calculating the envelopes of the curves, for a given confidence, as sketched in Figure 8.
To build the surrogate (Eq. 5), for instance for the curve descriptor O(p), a training dataset of N_s points shall be generated:

{(μ_j, σ_j), (M_O^j, Σ_O^j)}_{j=1}^{N_s}.

This can be achieved by means of a Monte Carlo sampling, which gives the estimators of the mean and variance of the curves g(x; p_j(μ_j, σ_j)), and of any descriptor O(p_j), for j = 1, …, N_s. The whole procedure is summarized in Algorithm 1.
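The Monte Carlo construction of the envelopes, for one fixed choice of (μ, σ), can be sketched as follows; the toy surrogate standing in for g(x; p) is an illustrative assumption:

```python
import numpy as np

# Monte Carlo sketch of the statistical envelopes for one choice of (mu, sigma).
rng = np.random.default_rng(4)
x = np.linspace(0.0, 1.0, 80)
surrogate = lambda p: p[0] * x + p[1] * x ** 2        # stand-in for g(x; p)

mu, sigma = np.array([1.0, 0.5]), np.array([0.1, 0.05])
N_mc = 2000
samples = rng.normal(mu, sigma, size=(N_mc, 2))       # independent Gaussian features
curves = np.array([surrogate(p) for p in samples])    # N_mc x n_x

mean_curve = curves.mean(axis=0)                      # estimator of the mean M
std_curve = curves.std(axis=0)                        # square root of the variance
lower = mean_curve - 1.96 * std_curve                 # Gaussian-style confidence band
upper = mean_curve + 1.96 * std_curve

# sanity check: the analytic mean mu_0 x + mu_1 x^2 lies inside the band
analytic = mu[0] * x + mu[1] * x ** 2
print(np.all((analytic >= lower) & (analytic <= upper)))
```

Repeating this sampling over a DoE of (μ_j, σ_j) pairs produces the training dataset for the statistical surrogates of Eq. 5.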
Figure 9 shows the parametric curve and its statistical sensing, for a given choice of the input feature distribution parameters. Confidence intervals have been computed using Algorithm 1, for both the curve and the rupture point.

Statistical Model Derived From Measures
In this Section we consider that, for different choices of the problem features p_i, the measurement g_m(x; p_i) is collected. We assume that the measurements contain a significant uncertainty, modeled again, without loss of generality, by a Gaussian distribution of zero mean and variance σ², that is, N(0, σ²), with the variance assumed independent of the features p.
In these circumstances, applying a regression to fit those values g_m(x; p_i), that is, f_{X,m}(p_i) = g_m(x; p_i), according to the techniques described in Subsection 2.3, is not a valuable route. The most valuable solution consists of looking for the baseline regression f_X(p) such that the deviations D_i = f_{X,m}(p_i) − f_X(p_i) follow the distribution N(0, σ²), where both the regression parameters involved in f_X(p) and the variance (if not known a priori) are calculated. In some cases the sensor calibration allows identifying σ².
The just described procedure is very close to standard Bayesian inference.

Model Enrichment
When two regression models are known, for the sake of simplicity assumed scalar, one related to a physics-based model, f_{X,model}(p), and the second one to the measurements, f_{X,measure}(p) (both associated with the average values in case of uncertainty), the gap between them reads

Δf_X(p) = f_{X,measure}(p) − f_{X,model}(p).

Thus, the enriched model reads

f_{X,enrich}(p) = f_{X,model}(p) + Δf_X(p).

Since, in general, the nonlinear character of f_{X,measure}(p) is expected to be much higher than that of the gap Δf_X(p), a more valuable route consists in computing the discrete gap D(p_i) = f_{X,m}(p_i) − f_{X,model}(p_i), calculating the regression Δ̃f_X(p) fitting these discrete deviations, and forming the associated enriched model

f_{X,enrich}(p) = f_{X,model}(p) + Δ̃f_X(p).
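The gap-regression route can be sketched as follows; the toy model/measurement pair and the degree-1 gap fit are illustrative assumptions:

```python
import numpy as np

# Gap-regression enrichment sketch: fit a low-order regression to the discrete
# gap D(p_i) = f_measure(p_i) - f_model(p_i) and correct the physics-based model.
rng = np.random.default_rng(5)
p_train = np.linspace(0.0, 1.0, 15)

f_model = lambda p: np.sin(2 * np.pi * p)                  # physics-based surrogate
truth = lambda p: np.sin(2 * np.pi * p) + 0.3 * p + 0.1    # what sensors actually see
measures = truth(p_train) + rng.normal(0.0, 0.01, p_train.size)

# the gap is nearly linear, so a degree-1 fit suffices even though
# f_model itself is strongly nonlinear
gap_fit = np.polyfit(p_train, measures - f_model(p_train), 1)
f_enriched = lambda p: f_model(p) + np.polyval(gap_fit, p)

p_test = np.linspace(0.05, 0.95, 7)
err = np.max(np.abs(f_enriched(p_test) - truth(p_test)))
print(err < 0.05)
```

This illustrates why regressing the gap is preferable: the low-order deviation is much easier to fit from scarce noisy data than the strongly nonlinear response itself.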

DATA ALIGNMENT AND DATA CLUSTERING
Here we focus on the study of crack propagation in notched specimens loaded in tension, whose geometry is sketched in Figure 10. The test piece has a V-shaped notch defect, always at the same location (near the bottom middle). On the other side of the test piece there is a half-circle groove. The goal is to predict the crack propagation from the defect for different locations (S) and radii (R) of the groove, as well as different test piece thicknesses (h). Depending on the location of the groove, the crack will propagate differently from the defect to the groove. We have considered a sparse DoE (Latin Hypercube) of 50 points in the 3-dimensional parametric space Ω = I_R × I_S × I_h, with the parameter bounds specified in Table 2. Numerical simulations (carried out in the VPS software from ESI Group) employ an explicit analysis and the EWK rupture model (Kamoulakos, 2005), using a mesh of 1,096,218 solid elements.
We focus on the prediction of the Force-Displacement curves plotted in Figure 11, which are considered as the generic functions g(x), following the same notation of Section 3.
It can be observed that all the curves present a similar pattern in the first zone, monotonically increasing, while the response appears markedly different in the secondary zone. A first pre-processing step consists in splitting the zones as illustrated in Figure 12, where x_M denotes the point where the curve reaches its maximum value, while x_F denotes its endpoint.
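The splitting step amounts to locating x_M as the argmax of the sampled curve; a minimal sketch with a hypothetical toy curve:

```python
import numpy as np

def split_zones(x, g):
    """Split a force-displacement curve at x_M, the location of its maximum.

    Returns the monotonically increasing first zone g1 on [0, x_M] and the
    secondary zone g2 on [x_M, x_F], x_F being the curve endpoint."""
    i_max = int(np.argmax(g))
    return (x[: i_max + 1], g[: i_max + 1]), (x[i_max:], g[i_max:])

# toy piecewise-linear curve peaking at x_M = 6 (stand-in for a sampled F-d curve)
x = np.linspace(0.0, 10.0, 11)
g = np.where(x <= 6.0, x, 6.0 - 2.0 * (x - 6.0))
(x1, g1), (x2, g2) = split_zones(x, g)
```

The point x_M is duplicated in both zones so that each group of functions remains defined on a closed interval before normalization.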
Cutting the curves, we obtain the two groups of functions plotted in Figure 13, which are of course not aligned. However, they can be expressed as functions of normalized coordinates y and z, respectively, and aligned following the dilatation procedure discussed in Section 3.
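The dilatation-based alignment can be sketched as a linear rescaling of each support onto [0, 1] followed by resampling on a common grid (the toy curves below are hypothetical; the paper's curves come from the simulations of Figure 13):

```python
import numpy as np

def align_curves(curves, n_points=100):
    """Align curves with different supports via the dilatation y = x / x_F.

    Each curve (x, g) is mapped onto the normalized coordinate y in [0, 1]
    and resampled on a common grid, so that the endpoints (and hence the
    critical points bounding each zone) coincide before regression."""
    y = np.linspace(0.0, 1.0, n_points)        # common normalized grid
    aligned = []
    for x, g in curves:
        y_i = (x - x[0]) / (x[-1] - x[0])      # dilatation onto [0, 1]
        aligned.append(np.interp(y, y_i, g))
    return y, np.vstack(aligned)

# two toy curves sharing the same pattern but ending at different x_F
xa, xb = np.linspace(0.0, 1.0, 50), np.linspace(0.0, 2.0, 80)
y, A = align_curves([(xa, np.sin(np.pi * xa)), (xb, np.sin(np.pi * xb / 2.0))])
```

After alignment the rows of A are directly comparable, so pointwise regression (or POD) over the ensemble becomes physics-consistent.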
Once the alignment has been performed, using the usual nonlinear regression techniques of Subsection 2.3 and the same notations as Section 3, two regression models, one for each group, can be established:

f_X^1(p; x_M),  f_X^2(p; x_F).  (6)

In Eq. 6, for the sake of clarity, we have specified x_M and x_F, since these points are involved in the parametrization of the functions g_1(x) and g_2(x), respectively, and are thus expressed parametrically.
As we have previously pointed out, the second group of functions g_i^2(x), for i = 1, …, n_s, presents very different shapes depending on the features p_i. When bifurcations occur in the parametric space, the system responses related to two choices of the model parameters can be completely different. In such cases, a standard nonlinear regression over the full space can lead to inaccurate and nonphysical solutions. To enhance the accuracy of the model f_X^2(p), a more valuable route consists in exploring the parametric space prior to interpolation. This can be done via a clustering of the system responses. Once the clusters have been established, several regression submodels can be built, minimizing the risk of mixing spurious effects coming from other clusters.

Clustering
To exemplify the bifurcation problem in the parametric space, we consider two different configurations of the model parameters, resulting in the specimens shown in Figure 14. Figure 15 shows four snapshots of the displacement field related to the specimens in Figure 14, under axial tensile loading. The crack propagation follows two completely different patterns, drastically influencing the Force-Displacement curve, as shown in Figure 16.
The clustering step can be performed automatically by using a hierarchical clustering based on the curves' shape or on the location of the damaged elements in the finite element mesh. Once the clusters C_1 and C_2 have been established, two regression submodels can be trained, one for each cluster, and Eq. 6 becomes

f_X^2(p) = f_X^{2,1}(p) for C_1,  f_X^{2,2}(p) for C_2.  (7)

Figure 17 shows the functions in the secondary zone after the clustering.
In particular, one can remark that fracture occurs early for the tests belonging to cluster C_1, and the final part of the curve is characterized by a steep slope. On the contrary, tests belonging to cluster C_2 have an endpoint displacement around 15 mm and present a shallow slope. The clustering avoids averaging such different dynamics, clearly enhancing the quality of the regressor.
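The automatic clustering step can be sketched with SciPy's hierarchical clustering (Ward linkage on the sampled curve values). The two families of toy responses below are hypothetical stand-ins mimicking the steep C_1 and shallow C_2 behaviours:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def cluster_curves(curves, n_clusters=2):
    """Hierarchical clustering of the aligned secondary-zone curves.

    Ward linkage on the Euclidean distance between the sampled curve values;
    the dendrogram is cut into the requested number of clusters."""
    Z = linkage(curves, method="ward")
    return fcluster(Z, t=n_clusters, criterion="maxclust")

# toy responses: steep early rupture (like C1) vs shallow late rupture (like C2)
y = np.linspace(0.0, 1.0, 50)
steep = np.vstack([1.0 - y**4 + 0.01 * k for k in range(5)])
shallow = np.vstack([1.0 - 0.3 * y + 0.01 * k for k in range(5)])
labels = cluster_curves(np.vstack([steep, shallow]))
```

Each row of the input matrix is one aligned curve, so the same call applies unchanged to clustering based on damaged-element indicator vectors instead of curve shapes.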

Curves Reconstruction and Classification
For a newly defined choice of the model features p*, the curve g(x; p*) is obtained by joining the two zone predictions, where g_1 and g_2 are obtained through Eq. 7. The training of the regression models has been performed using 40 points of the DoE; the remaining 10 have been used for testing. Moreover, a Support Vector Machine classifier (a Random Forest classifier could also be used, for instance) has been trained to select the best regression submodel to predict g_2(x; p*). Such a classifier has shown perfect accuracy, as shown by the confusion matrices in Figure 18. Moreover, Figure 19 shows the separating surface and the classified points in the 3-dimensional parametric space.
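The classifier-driven submodel selection can be sketched with scikit-learn's SVC on a hypothetical DoE; the bifurcation rule, the labels and the submodels below are toy stand-ins for the clusters and regressions of the actual problem:

```python
import numpy as np
from sklearn.svm import SVC

# hypothetical DoE in the (R, S, h) parametric space with known cluster labels
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(50, 3))      # 50 design points (toy, not a true LHS)
labels = (X[:, 1] > 0.5).astype(int)         # toy bifurcation driven by the S feature

# train on 40 points of the DoE, keep the remaining 10 for testing
clf = SVC(kernel="rbf", C=10.0).fit(X[:40], labels[:40])
accuracy = clf.score(X[40:], labels[40:])

def predict_secondary(p_star, submodels, clf):
    """Select the regression submodel f_X^{2,c} for the new point p*."""
    c = int(clf.predict(np.atleast_2d(p_star))[0])
    return submodels[c](p_star)

# toy submodels standing in for the per-cluster regressions of Eq. 7
g2 = predict_secondary([0.2, 0.9, 0.7], {0: lambda p: 0.0, 1: lambda p: 1.0}, clf)
```

Swapping `SVC` for `sklearn.ensemble.RandomForestClassifier` requires no other change, since both expose the same `fit`/`predict` interface.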
Figures 20 and 21 show the predictions on the training and test sets, respectively, for 4 data points.

CONCLUSION
In this paper we have focused on several nontrivial issues encountered when a whole curve shall be predicted from a given number of features. Major ingredients are the data alignment, to achieve physics-consistent interpolations among curves, and the data clustering, to detect bifurcations in the parametric space. The proposed methodologies rely on adopting specific parametrizations of the curve and a physics-based pre-processing prior to the application of any regression technique. We have also suggested a reduced-order parametrization of the curve via POD, requiring the prediction of a few scalar quantities (the POD coefficients) instead of the whole curve. Here, without loss of generality, we have preferred sPGD-based nonlinear regressions, these being efficient in high-dimensional parametric spaces under the scarce-data constraint. Indeed, since our data come from numerical simulations of complex engineering problems, due to the high computational cost of the offline simulations, not much data is usually available. Moreover, one important achievement of the work is the definition of a statistical sensing for uncertainty propagation based on the parametric model.
We have focused on two applications in computational mechanics: 1) plastic materials with a parametric hardening law, and 2) crack propagation in parametric notched specimens. However, these methodologies can be applied to any time series or generic curve stemming from any context. For instance, in our current research, we are successfully applying these techniques to several other problems (to cite a few: the study of two-phase flow dynamics in a heated channel, and composite forming processes involving reactive resin injection molding). Moreover, we are investigating other physics-based curve interpolation strategies based on Optimal Transport-OT- (Torregrosa et al., 2022) and other mappings.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

AUTHOR CONTRIBUTIONS
All the authors participated in the definition of techniques and algorithms. All authors read and approved the final manuscript.