# MACHINE LEARNING AND DATA MINING IN MATERIALS SCIENCE

EDITED BY: Norbert Huber, Surya R. Kalidindi, Benjamin Klusemann and Christian Johannes Cyron

PUBLISHED IN: Frontiers in Materials

#### Frontiers eBook Copyright Statement

The copyright in the text of individual articles in this eBook is the property of their respective authors or their respective institutions or funders. The copyright in graphics and images within each article may be subject to copyright of other parties. In both cases this is subject to a license granted to Frontiers. The compilation of articles constituting this eBook is the property of Frontiers.

Each article within this eBook, and the eBook itself, are published under the most recent version of the Creative Commons CC-BY licence. The version current at the date of publication of this eBook is CC-BY 4.0. If the CC-BY licence is updated, the licence granted by Frontiers is automatically updated to the new version.

When exercising any right under the CC-BY licence, Frontiers must be attributed as the original publisher of the article or eBook, as applicable.

Authors have the responsibility of ensuring that any graphics or other materials which are the property of others may be included in the CC-BY licence, but this should be checked before relying on the CC-BY licence to reproduce those materials. Any copyright notices relating to those materials must be complied with.

Copyright and source acknowledgement notices may not be removed and must be displayed in any copy, derivative work or partial copy which includes the elements in question.

All copyright, and all rights therein, are protected by national and international copyright laws. The above represents a summary only. For further information please read Frontiers' Conditions for Website Use and Copyright Statement, and the applicable CC-BY licence.

ISSN 1664-8714 ISBN 978-2-88963-651-8 DOI 10.3389/978-2-88963-651-8

Topic Editors:

Norbert Huber, Helmholtz Centre for Materials and Coastal Research (HZG) and Hamburg University of Technology, Germany

Surya R. Kalidindi, Georgia Institute of Technology, United States

Benjamin Klusemann, Leuphana University of Lüneburg and Helmholtz Centre for Materials and Coastal Research (HZG), Germany

Christian Johannes Cyron, Hamburg University of Technology and Helmholtz Centre for Materials and Coastal Research (HZG), Germany

Citation: Huber, N., Kalidindi, S. R., Klusemann, B., Cyron, C. J. (2020). Machine Learning and Data Mining in Materials Science. Lausanne: Frontiers Media SA. doi: 10.3389/978-2-88963-651-8


# Editorial: Machine Learning and Data Mining in Materials Science

Norbert Huber<sup>1,2</sup>\*, Surya R. Kalidindi<sup>3</sup>, Benjamin Klusemann<sup>1,4</sup> and Christian J. Cyron<sup>1,5</sup>

<sup>1</sup> Institute of Materials Research, Materials Mechanics, Helmholtz-Zentrum Geesthacht, Geesthacht, Germany, <sup>2</sup> Institute of Materials Physics and Technology, Hamburg University of Technology, Hamburg, Germany, <sup>3</sup> School of Mechanical Engineering and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA, United States, <sup>4</sup> Institute of Product and Process Innovation, Leuphana University of Lüneburg, Lüneburg, Germany, <sup>5</sup> Institute of Continuum and Materials Mechanics, Hamburg University of Technology, Hamburg, Germany

Keywords: machining of data, machine learning, data mining, materials design, materials processing, scale bridging

**Editorial on the Research Topic**

#### **Machine Learning and Data Mining in Materials Science**

The development of new materials, the incorporation of new functionalities, and even the description of well-studied materials strongly depend on the capability of individuals to deduce complex structure-property relationships. A significant challenge in this field remains the "curse of dimensionality". Even for moderately complex materials, a considerable number of parameters is often required to characterize their composition and microstructure (or their processing conditions) uniquely. Modeling of materials thus faces the challenge of high-dimensional parameter spaces, in which numerous parameter combinations have to be sampled and studied thoroughly. Relying on experiments alone is typically prohibitively expensive in such spaces. Thus, the combination of experimental and computational approaches is receiving increasing attention. The complex interdependencies in the resulting data sets can be studied using machine learning approaches. Artificial neural networks and data-driven approaches can significantly help to identify, approximate, and visualize structure-property relationships of interest. In this way, they can accelerate our understanding and effective utilization of complex hierarchical materials.

This Research Topic is a compilation of contributions on current ideas and novel concepts for the advancement of machine learning, data mining, and data-driven approaches in the context of the design of materials and materials processing. This includes general methods as well as their application to decoding the complex relationships along the chain from composition through processing and structure to mechanical properties.

The review article by Bock et al. provides an overview of the state of the art in machine learning and statistical learning approaches in the field of continuum materials mechanics. Furthermore, works on experiment- and simulation-based data mining in combination with machine learning tools are presented. The reviewed papers are categorized as descriptive, predictive, or prescriptive depending on whether they aim at identification, prediction, or even optimization of essential characteristics. The potential of utilizing machine learning in materials science to empower significant acceleration of knowledge generation is highlighted. The other review article within this collection by Talapatra et al. discusses the need for and challenges of optimal experimental setups as a key factor for accelerating the discovery of materials. The authors review the most important challenges and opportunities connected with the concept of optimal experiment design and present successful examples that have led to materials discovery via this concept.

Edited and reviewed by: Roberto Brighenti, University of Parma, Italy

> \*Correspondence: Norbert Huber norbert.huber@hzg.de

#### Specialty section:

This article was submitted to Computational Materials Science, a section of the journal Frontiers in Materials

Received: 17 December 2019 Accepted: 18 February 2020 Published: 28 February 2020

#### Citation:

Huber N, Kalidindi SR, Klusemann B and Cyron CJ (2020) Editorial: Machine Learning and Data Mining in Materials Science. Front. Mater. 7:51. doi: 10.3389/fmats.2020.00051

Advances in machine learning and data mining methods are addressed in particular in three articles of this special issue. Fritzen et al. developed a multi-fidelity surrogate model allowing for adaptive on-the-fly switching between different surrogate models in a concurrent two-scale simulation. The first surrogate model is based on reduced order modeling, whereas the second is an artificial neural network (ANN). The methodology provides a suitable basis for the generalization of the applied machine learning techniques to different applications. Aydin et al. show how the bottleneck of computational data generation can be widened by an effective combination of simulations with different accuracy and computational cost. A cheap low-fidelity computational model is used to start the training of the ANN, and the training then gradually switches to higher-fidelity data as it progresses. This multi-fidelity strategy can reduce the total computational cost by one half up to one order of magnitude. González et al. emphasize how to enhance suitable physical models with available experimental data. Rather than substituting physical models with data, the authors use the data to correct and enhance the physical law/model of interest, ensuring thermodynamic consistency. The proposed technique thus represents an appealing alternative to purely data-driven machine learning of models from data.

Characteristics of the lower scales often significantly influence or dominate the macroscopic behavior of materials, making an appropriate characterization of the lower scales indispensable. Unfortunately, common (crystal) structure identification techniques often cannot adequately describe the structure of individual atoms in grain boundaries (GBs). To address this problem, Snow et al. used a form of Common Neighbor Analysis for the identification and characterization of arbitrary atomic structures found around GBs. The resulting structure descriptors are used as input to machine learning algorithms, here PCA with linear regression, for the development of atomic structure-property models for GBs. In the same spirit, Homer et al. developed a new structural representation, called the scattering transform, for the characterization of GBs. This approach uses wavelet-based convolutional neural networks to characterize grain boundaries. The learning results are compared to a SOAP (smooth overlap of atomic positions) based representation, which reveals some benefits of the scattering transform, e.g., learning well on larger datasets and providing physically interpretable information. At the microscale, Steinberger et al. used a machine learning based approach for the classification of coarse-grained dislocation microstructures. As potential machine learning features, the dislocation microstructure is described via different dislocation density field variables. It is shown that the accuracy of machine learning models varies with different sets of microstructure features and spatial discretization. This can also be used as an indicator for testing the ability of a coarse-grained model to capture the underlying mechanisms accurately. At the macroscale, Furat et al. present various applications for the segmentation of tomographic imaging data by combining machine learning methods and conventional image processing techniques. They demonstrate the applicability of their approach using the example of grain-wise segmentation of time-resolved CT data obtained in between Ostwald ripening steps of an AlCu specimen. Richert et al. investigated algorithms used for the measurement of complex 3D microstructures with respect to over- and underestimation of the thickness of curved features, which can lead to a significant error in the prediction of mechanical properties. Here, artificial neural networks are applied for the reconstruction of the true geometry from the image processing data within voxel resolution.

In terms of materials modeling along the process-property-structure-performance chain, Würger et al. successfully used a combination of experiments, machine learning, data mining, density functional theory, and molecular dynamics calculations to determine property-structure relationships in magnesium alloys with respect to corrosion. Corrosion inhibition properties of still untested molecules are estimated and a relationship between corrosion inhibition efficiency and the corresponding molecular structure of magnesium corrosion inhibitors is established. Castillo and Kalidindi present a two-step Bayesian framework for the estimation of the intrinsic single crystal elastic stiffness parameters from measurements of spherical indentation stress-strain responses in multiple individual grains of a polycrystalline sample, whose crystal lattice orientations have been measured using the electron back-scattered diffraction technique. It is shown that the introduction of a Bayesian framework can greatly reduce the number of simulations necessary to establish this function. The novel framework is presented and demonstrated for a cubic polycrystalline Fe-3%Si sample and a hexagonal polycrystalline pure titanium sample. In the approach by Reimann et al., the macroscopic material behavior is described via a machine learning algorithm trained on micromechanical simulations, i.e., uniaxial loading of representative volume elements of the microstructure of interest. In this regard, the trained algorithm can be interpreted as a macroscopic constitutive relation. The approach is illustrated for damage modeling as well as for microstructure design that leads to targeted mechanical properties.

Menon et al. present a general hierarchical machine learning (HML) model for predicting the stress-at-break, strain-at-break, and Tan δ for thermoplastic and thermoset polyurethanes. The algorithm was trained on a library of 18 polymers. HML reduces data requirements through robust embedding of domain knowledge and surrogate data in a middle layer that bridges input variables (composition) and output responses (mechanical properties). The HML predictions are shown to be more accurate than those from a random forest model directly relating composition and properties, suggesting that embedding domain knowledge provides significant advantages in predicting the properties of complex material systems based on small datasets. Huber addresses a number of fundamental questions regarding the topological description of materials characterized by a highly porous three-dimensional structure. Via data mining, the interdependencies of topological parameters and relationships between topological parameters with mechanical properties are discovered. The determination of the average coordination number turned out to be a difficult problem, which is solved by artificial neural networks by reconstructing the information on low-coordinated junctions that are not detectable from a common structure analysis.

The reviews and original articles compiled in this Research Topic give a taste of the potential of coupling approaches from materials science, modeling, and simulation with data mining and machine learning. This offers exciting perspectives for solving challenging problems, such as decoding and computational modeling of complex structure-process-property relationships, replacement of computationally demanding submodels in multiscale simulations, or classification and interpretation of imaging data.

# AUTHOR CONTRIBUTIONS

This Editorial was jointly written by all authors who also served as Guest Editors for the Research Topic.

# FUNDING

This work was supported by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), Project Number 192346071, SFB 986 "Tailor-Made Multi-Scale Materials Systems: M<sup>3</sup>", projects B4 and B9, and by the Office of Naval Research N00014-18-1-2879.

**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Huber, Kalidindi, Klusemann and Cyron. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# A Review of the Application of Machine Learning and Data Mining Approaches in Continuum Materials Mechanics

#### Frederic E. Bock<sup>1</sup>\*, Roland C. Aydin<sup>1</sup>, Christian J. Cyron<sup>1,2</sup>, Norbert Huber<sup>1,3</sup>, Surya R. Kalidindi<sup>4</sup> and Benjamin Klusemann<sup>1,5</sup>

<sup>1</sup> Institute of Materials Research, Materials Mechanics, Helmholtz-Zentrum Geesthacht, Geesthacht, Germany, <sup>2</sup> Institute of Continuum and Materials Mechanics, Hamburg University of Technology (TUHH), Hamburg, Germany, <sup>3</sup> Institute of Materials Physics and Technology, Hamburg University of Technology (TUHH), Hamburg, Germany, <sup>4</sup> School of Mechanical Engineering and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA, United States, <sup>5</sup> Institute of Product and Process Innovation, Leuphana University of Lüneburg, Lüneburg, Germany

#### Edited by:

Roberto Brighenti, University of Parma, Italy

#### Reviewed by:

Andreas Menzel, Technical University Dortmund, Germany

Daojian Cheng, Beijing University of Chemical Technology, China

> \*Correspondence: Frederic E. Bock frederic.bock@hzg.de

#### Specialty section:

This article was submitted to Computational Materials Science, a section of the journal Frontiers in Materials

> Received: 04 February 2019 Accepted: 26 April 2019 Published: 15 May 2019

#### Citation:

Bock FE, Aydin RC, Cyron CJ, Huber N, Kalidindi SR and Klusemann B (2019) A Review of the Application of Machine Learning and Data Mining Approaches in Continuum Materials Mechanics. Front. Mater. 6:110. doi: 10.3389/fmats.2019.00110

Machine learning tools represent key enablers for empowering material scientists and engineers to accelerate the development of novel materials, processes and techniques. One of the aims of using such approaches in the field of materials science is to achieve high-throughput identification and quantification of essential features along the process-structure-property-performance chain. In this contribution, machine learning and statistical learning approaches are reviewed in terms of their successful application to specific problems in the field of continuum materials mechanics. They are categorized according to whether their task is descriptive, predictive or prescriptive, that is, whether they ultimately aim at identification, prediction or even optimization of essential characteristics. The choice of the most appropriate machine learning approach depends strongly on the specific use case, type of material, kind of data involved, spatial and temporal scales, formats, and desired knowledge gain, as well as on affordable computational costs. Different examples are reviewed involving the case-by-case application of different types of artificial neural networks and other data-driven approaches such as support vector machines, decision trees and random forests as well as Bayesian learning, and model order reduction procedures such as principal component analysis, among others. These techniques are applied to accelerate the identification of material parameters or salient features for materials characterization, to support rapid design and optimization of novel materials or manufacturing methods, to improve and correct complex measurement devices, or to better understand and predict fatigue behavior, among other examples. Besides experimentally obtained datasets, numerous studies draw the required information from simulation-based data mining. Altogether, it is shown that experiment- and simulation-based data mining in combination with machine learning tools provide exceptional opportunities to enable highly reliable identification of fundamental interrelations within materials for characterization and optimization in a scale-bridging manner. The potential of further utilizing applied machine learning in materials science and of significantly accelerating knowledge output is pointed out.

Keywords: machine learning, materials mechanics, data mining, process-structure-property-performance relationship, knowledge discovery

### INTRODUCTION

A key motivation of applying machine learning methods in continuum materials mechanics is the prospect of enabling, accelerating or even simplifying the discovery and development of novel materials for future deployment. One of the main challenges is to gain information on how to tailor material characteristics in order to generate a successful combination of (all) anticipated properties and performance attributes. Therefore, identifying coupled physical phenomena at different spatiotemporal scales, accounting for statistical uncertainties and controlling the parameter space within the materials structures are core interests in designing (new) materials for specific applications. For scale-bridging, to couple the effects of process parameters to microstructural features and to resulting material properties and performance characteristics, it is also important to consider the statistical variance of the process at hand. Additionally, with respect to a more fundamental level, data mining enables scientists to investigate and understand complex nonlinear relationships. In these cases, data mining and machine learning approaches often appear as intermediate steps in approaching and penetrating a problem until a point where the nature of the relationship of interest can be captured by more general physics-based models replacing the trained algorithms. More specifically, machine learning approaches based on rigorous statistical approaches (e.g., Bayesian inference) offer unique opportunities to calibrate objectively (based on available data) unknown model forms and/or parameter values in physics-based models.

Methodologically, there are soft boundaries between the disciplines of data mining and machine learning, both of which are also related to the discipline of applied statistics, as together they form the toolsets of data science. These methods cannot be seen separately, as they are strongly interrelated (Witten et al., 2011). The process for data mining according to the cross-industry standard (Chapman et al., 1999) typically consists of (i) problem understanding; (ii) data understanding; (iii) data preparation; (iv) data modeling and (v) data evaluation via machine learning; as well as (vi) deploying the trained algorithm. Hence, the application of machine learning and data mining approaches usually involves adequate pre-processing of the relevant data as well as training, testing and validating the applied algorithms. Subsequently, post-learning tasks such as feature optimization and decision-making are frequently performed for the prescriptive purpose of optimization.
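As a minimal sketch of the data preparation and evaluation steps named above, the snippet below splits a dataset into training, validation and test subsets and standardizes the input using statistics from the training subset only. The function names, the 70/15/15 split and the synthetic (process parameter, property) pairs are illustrative assumptions and are not prescribed by the cross-industry standard.

```python
import random
import statistics

def split_dataset(records, fractions=(0.7, 0.15, 0.15), seed=0):
    """Shuffle reproducibly, then cut into training, validation and test subsets."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_train = round(fractions[0] * len(shuffled))
    n_val = round(fractions[1] * len(shuffled))
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

def fit_scaler(values):
    """Mean and (population) standard deviation, estimated on training data only."""
    return statistics.mean(values), statistics.pstdev(values)

def standardize(values, mean, std):
    return [(v - mean) / std for v in values]

# Hypothetical records: (process parameter, measured property)
records = [(float(t), 2.0 * t + 1.0) for t in range(20)]

train, validation, test = split_dataset(records)
mean, std = fit_scaler([x for x, _ in train])

# The same training statistics are reused for the held-out subsets,
# so no information from the test data leaks into pre-processing.
train_x = standardize([x for x, _ in train], mean, std)
validation_x = standardize([x for x, _ in validation], mean, std)
test_x = standardize([x for x, _ in test], mean, std)
```

Fitting the scaler on the training subset alone is a common design choice: it keeps the test subset untouched until the final evaluation.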

For challenges within continuum materials mechanics, different data-based approaches have been proposed in the literature. Because the data involved often span different spatial and temporal scales, we address the issues along the process-structure-property-performance (p-s-p-p) chain. We therefore divide our review of different machine learning and data mining approaches into four main sections depending on the main field of application: process parameters, microstructure, mechanical properties and performance. Furthermore, each field is divided into three categories that refer to the type of machine learning or data mining task and the objective pursued: descriptive (e.g., identifying unknown patterns), predictive (e.g., approximations based on available knowledge) and prescriptive (e.g., optimization based on machine learning controlled decision-making). This differentiation follows Delen and Ram (2018), who formulated it for business analytics. Similarly, Tan et al. (2009) divided machine learning tasks into two major categories: predictive and descriptive. However, in the context of materials mechanics and process-structure-property-performance linkages, a prescriptive machine learning task section appears suitable to account for implemented optimizations. Consequently, we follow the subsequent classification of the different approaches investigated in this review:

A descriptive approach is of explanatory nature and means that patterns within data can be recognized based on correlations, trends or anomalies to answer questions on "why does microstructure Y with properties Z occur for process parameter X and how do they affect materials performances such as fatigue and failure?"

A predictive approach is used to foresee specific consequences induced by certain factors; thus, previously non-existing results are generated through applying correlation, regression, classification, or statistical inference techniques to process and analyze existing data for answering questions such as "what kind of microstructure Y will occur with particular properties Z if process parameter X is changed?"

A prescriptive approach in this context means to provide insight on "what should be done in terms of process parameters X to obtain microstructure Y with properties Z?" to not only identify and predict but also to implement optimized results with respect to improved actions, e.g., in terms of a process-microstructure-property relation.

A preliminary collection of descriptions of important machine learning methods is provided, as they are used either alone or in various combinations in the different studies discussed. In this regard, Witten et al. (2011) state: "Experience shows that no single machine learning scheme is appropriate to all data mining problems. The universal learner is an idealistic fantasy. As [...] real datasets vary, and to obtain accurate models the bias of the learning algorithm must match the structure of the domain. Data mining is an experimental science." The selection of studies presented in this article is based on the applicability of machine learning and data mining approaches to solve challenges in continuum materials mechanics and is by no means exhaustive. An overview on linking process, structure, property and performance characteristics for additive manufacturing of metals via data analytics was provided by Smith et al. (2016). They focused on computational and experimental methods. Machine learning methods are not allocated into a unique class of methods but can be contained under subsections of the reduced-order modeling section of different data sources relevant for data analytics and data mining, as shown in **Figure 1**.

FIGURE 1 | Overview of different data analytics methods applicable within the field of continuum materials mechanics, motivating the development of accurate and comprehensive databases and making them accessible. Three data sources compose a common data structure: experiments, process models and reduced order models. Experiments lead to empirical determination of characteristics along the p-s-p-p chain. With process models, these characteristics can be described and predicted. Via reduced order models, data can be compressed and patterns recognized. Available data can be analyzed via data mining and machine learning to generate new knowledge. Own figure based on the idea of Smith et al. (2016).

# SHORT OVERVIEW AND DESCRIPTION OF MACHINE LEARNING AND DATA MINING METHODS

Machine learning as a scientific discipline is still emerging and thus undergoing continuous change. While many of the methods and algorithms employed have been known for decades, in recent years, new approaches have matured to a degree that it is valid to consider machine learning a new and still nascent field, despite its already comprehensive development over a considerable period of time. As such, what constitutes machine learning exactly (as opposed to, e.g., descriptive statistics) remains only fuzzily defined. With data-driven methodologies being incorporated into domains such as materials science, new variants and adapted machine learning methods have been devised or are in the process of being fitted to the challenges and data profiles unique to materials science. This methodological domain-specificity should not be construed to preclude the importance of "mainstay methods" of machine learning such as artificial neural networks, which are in theory all-purpose and adaptable to approximating ("learning") any function inherent in data [Universal Approximation Theorem (Hornik, 1991)]. However, as approaches from data science augment and merge with traditional research procedures of materials science, the methods listed in this chapter cannot claim to be an exhaustive enumeration of machine learning methods viable for (continuum) materials mechanics, as constant changes within the next few years are expected.

Within this context, the most commonly encountered class of machine learning (to the extent that it is sometimes used interchangeably with the term machine learning itself) is the class of artificial neural networks (ANNs). Derived from a simple precursor formulation dating back as far as 1958, the perceptron (Rosenblatt, 1958), ANNs have gained in popularity as increasing computing power and availability of data alleviate the two bottlenecks which previously curtailed their use. The perceptron itself was conceived as a simple one-layer neural network and used as a linear classifier.
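To make the notion of a one-layer linear classifier concrete, the following is a minimal sketch of a perceptron with the classic error-driven update rule, trained on a toy, linearly separable problem (logical AND). The function names, the learning rate and the data are illustrative assumptions rather than a reproduction of Rosenblatt's original formulation.

```python
def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """Fit weights and bias of a single-layer perceptron with a step activation."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            prediction = 1 if activation > 0 else 0
            error = target - prediction  # zero when the sample is classified correctly
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

def predict(weights, bias, x):
    """Classify by the side of the learned hyperplane on which x lies."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

# Toy, linearly separable data: logical AND of two binary inputs
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]

weights, bias = train_perceptron(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
```

Because the decision boundary is a single hyperplane, this classifier can only separate classes that are linearly separable, which is one motivation for stacking several layers.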

In their simplest modern form, feedforward neural networks (FFNNs) (Haykin, 1998; Russell et al., 2016) are multilayer perceptrons, i.e., layers of vertices (neurons) in which each neuron computes an output based on inputs from the previous layer. The signal traverses the network in a unidirectional manner and is gradually transformed from an input signal into an output signal as it percolates through the network, hence the name "feedforward." Such networks are usually trained by gradient-descent error minimization, with the gradients obtained via backpropagation. The error is determined by comparing the current network output to the correct output (which is available in a supervised learning scenario). Individual neuron behavior is then changed during the training/learning process by altering the connection weights associated with each edge within the network topology. Many different approaches exist to address the various difficulties that are typically intrinsic to neural networks, for instance, overfitting, local minima, determining the optimal number of layers and neurons per layer, choice of activation function, and human interpretability of the network, among others.
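The training procedure described above can be sketched for a deliberately tiny network with one scalar input, one hidden tanh layer and a linear output. This is a minimal illustration in plain Python; the network size, learning rate, number of steps and the quadratic toy target are assumptions chosen for readability, not settings taken from any of the reviewed studies.

```python
import math
import random

def forward(params, x):
    """Forward pass: input -> hidden tanh layer -> linear output neuron."""
    w1, b1, w2, b2 = params
    hidden = [math.tanh(w * x + b) for w, b in zip(w1, b1)]
    output = sum(w * h for w, h in zip(w2, hidden)) + b2
    return hidden, output

def mean_squared_error(params, data):
    """Supervised error: network output compared with the known target."""
    return sum((forward(params, x)[1] - y) ** 2 for x, y in data) / (2 * len(data))

def backprop_gradients(params, data):
    """Propagate the output error backwards to obtain d(loss)/d(weight)."""
    w1, b1, w2, b2 = params
    n_hidden = len(w1)
    g_w1, g_b1, g_w2, g_b2 = [0.0] * n_hidden, [0.0] * n_hidden, [0.0] * n_hidden, 0.0
    for x, y in data:
        hidden, output = forward(params, x)
        delta_out = (output - y) / len(data)  # error signal at the output neuron
        g_b2 += delta_out
        for j in range(n_hidden):
            g_w2[j] += delta_out * hidden[j]
            # Chain rule through tanh: d tanh(z)/dz = 1 - tanh(z)^2
            delta_hidden = delta_out * w2[j] * (1.0 - hidden[j] ** 2)
            g_w1[j] += delta_hidden * x
            g_b1[j] += delta_hidden
    return g_w1, g_b1, g_w2, g_b2

def train(data, n_hidden=4, lr=0.1, steps=2000, seed=0):
    """Full-batch gradient descent on the connection weights."""
    rng = random.Random(seed)
    params = ([rng.uniform(-0.5, 0.5) for _ in range(n_hidden)], [0.0] * n_hidden,
              [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)], 0.0)
    initial_loss = mean_squared_error(params, data)
    for _ in range(steps):
        g_w1, g_b1, g_w2, g_b2 = backprop_gradients(params, data)
        w1, b1, w2, b2 = params
        params = ([w - lr * g for w, g in zip(w1, g_w1)],
                  [b - lr * g for b, g in zip(b1, g_b1)],
                  [w - lr * g for w, g in zip(w2, g_w2)],
                  b2 - lr * g_b2)
    return params, initial_loss, mean_squared_error(params, data)

# Toy supervised task: approximate y = x^2 on [-1, 1]
data = [(x / 10.0, (x / 10.0) ** 2) for x in range(-10, 11)]
params, initial_loss, final_loss = train(data)
```

Comparing `initial_loss` with `final_loss` shows whether the weight updates reduced the error on the training data; it says nothing about overfitting, which would require a separate validation set.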

Once enough layers of neurons are stacked (the number of layers is referred to as the depth of the underlying ANN), the network enters the regime of deep learning (DL). With an increased number of layers, new difficulties come into focus, such as the vanishing gradient problem (Hochreiter, 1998) or computation time, which are not absent in simpler architectures but are exacerbated in the case of DL. DL architectures are, for the most part, agnostic concerning the type of ANN, i.e., any kind of ANN can form the basis for a DL architecture. Two subtypes of ANN architectures have gained in popularity particularly over the last two decades, to the degree that for complex problems, usually one of the two is encountered at least as a component of the overall ANN architecture, or as a pre-/post-processing step of the learning pipeline. These are convolutional neural networks and long short-term memory networks.

Convolutional neural networks (CNNs) are mostly formulated as a variant of FFNNs and were pioneered in 1980 (Fukushima, 1980), and then reformulated in their contemporary form in 1999 (LeCun et al., 1999). They are particularly well suited for image recognition, i.e., recognizing patterns in visual data (Schmidhuber, 2015; Russell et al., 2016). Typically, a convolutional layer is shifted across the data akin to a filter/detector in computer vision algorithms, requiring only a few parameters because the weights of the "filter" are replicated across the visual field. Pooling and normalization layers allow for stepwise data simplification and for variable feature sizes, respectively. While the suitability of CNNs for materials science may not be immediately apparent, there are examples of direct applications such as materials texture recognition (Cang and Ren, 2016; Lubbers et al., 2017; Cecen et al., 2018), as well as indirect applications in which, e.g., non-visual materials data may be interpreted (Schwarzer et al., 2019).
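The weight replication and pooling mentioned above can be made concrete with a small sketch. The 6x6 toy image, the hand-picked edge-detection kernel and the helper names `conv2d_valid` and `max_pool` are illustrative assumptions; in a trained CNN the kernel entries would be learned rather than prescribed.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide one shared kernel over the image (cross-correlation, no padding)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fit are dropped."""
    h = feature_map.shape[0] // size * size
    w = feature_map.shape[1] // size * size
    blocks = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

# Toy 6x6 "micrograph": dark left half, bright right half (a vertical edge).
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-picked 3x3 vertical-edge detector (assumption; normally learned).
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)

feature_map = conv2d_valid(image, kernel)   # shape (4, 4)
pooled = max_pool(feature_map)              # shape (2, 2)

# Parameter count: one shared kernel versus a fully connected mapping
# from all 36 pixels to all 16 feature-map entries.
shared_params = kernel.size
dense_params = image.size * feature_map.size
print(feature_map[0], pooled, shared_params, dense_params)
```

The same nine weights are reused at every image position, which is why the convolutional layer needs far fewer parameters than a dense layer producing an output of the same size.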

Long short-term memory (LSTM) units (Hochreiter and Schmidhuber, 1997; Russell et al., 2016) can be used as building blocks of ANNs because they offer specialized memory neurons/units that chiefly deal with the vanishing gradient problem (or, inversely, the exploding gradient problem), which often leads to sub-optimal local minima, especially as the number of neuronal layers increases. As deep learning architectures have gained in popularity, LSTMs and variants thereof have gained in popularity in lockstep as a way of circumventing such local entrapments. In short, "saving" important data points over time from being drowned out and distributing their error correction signal over longer periods allows for better information storage concerning important events.

For dynamic problems, in which a "data-point" is often exposed to a temporal evolution of the materials state and encompasses a series of actions, approaches other than those for the static case are often preferred. While many of these have not yet migrated into materials science to a significant degree, it can be expected that customized methods suited for dynamic problems will gain increasing importance, both as the complexity of the target functions to be learned rises and with the incorporation of state dependencies (usually as a function of time). Two of the major approaches to such problems are reinforcement learning, which itself comprises different methods such as Q-learning (Watkins and Dayan, 1992), and recurrent neural networks (RNNs) (Lipton, 2015; Russell et al., 2016), which allow for directed cycles within the ANN topology and thus for signals to oscillate and overlap with the computation of subsequent samples. As a result, data is selectively passed across sequence steps.

Worth mentioning are randomized neural networks, which add random excitatory/inhibitory spikes to individual neurons (Gallicchio et al., 2017) without stable internal states (Maass et al., 2002) and radial basis neural networks (Orr, 1996), which are typically shallow FFNNs using individual neuron-specific radial basis functions to sum over neuron inputs and thus allow for better individual neuron specialization.

One noteworthy alternative to neural networks is the Support Vector Machine (SVM), introduced by Cortes and Vapnik (1995). SVMs deviate from ANNs in that they map to neither a continuous (regression problem) nor a discrete (decision problem) output, but rather separate patterns through either hyperplanes in the linearly separable case or a kernel function in the nonlinearly separable case. This kernel transformation ("trick") maps the input vectors into a transformed feature space in which separability is possible (using, e.g., a maximum margin). Whereas ANNs usually employ several layers, often composed of simple neurons, SVMs can be interpreted as a specialized single neural node.
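The idea behind the kernel transformation can be sketched with a toy example. The homogeneous quadratic kernel, the XOR-like point set and the hand-chosen separating hyperplane below are illustrative assumptions; an actual SVM would determine the maximum-margin hyperplane by solving an optimization problem, which is not shown here.

```python
import math

def phi(x):
    """Explicit feature map belonging to the quadratic kernel in 2D."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def quad_kernel(x, z):
    """k(x, z) = (x . z)^2, evaluated without ever forming phi."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

# XOR-like labels: no straight line separates them in the original 2D space.
points = [(1.0, 1.0), (-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0)]
labels = [1, 1, -1, -1]

# In the transformed space, the coordinate sqrt(2)*x1*x2 alone separates the
# classes, i.e., a hyperplane with normal w = (0, 1, 0) and zero offset
# (chosen by hand for illustration).
w = (0.0, 1.0, 0.0)
predictions = [1 if dot(w, phi(p)) > 0 else -1 for p in points]

# The kernel reproduces the inner product in feature space ("kernel trick").
kernel_matches = all(
    math.isclose(quad_kernel(p, q), dot(phi(p), phi(q)), abs_tol=1e-12)
    for p in points
    for q in points
)
print(predictions, kernel_matches)
```

Because only inner products are needed, the kernel can be evaluated directly in the original space, and the (possibly very high-dimensional) feature map never has to be computed explicitly.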

Q-learning is a noteworthy reinforcement learning algorithm concerned with learning policies, i.e., optimal choices for sequential decision problems (Watkins and Dayan, 1992; Mnih et al., 2013; van Hasselt et al., 2016), which is based on a reward signal. In this particular variant, the reward signal is based on a Q-function, which is a specific reward function that trades off immediate against future rewards using a discount factor. Reward-based learning methods such as Q-learning are of particular importance when the space of possible paths, i.e., of sequences of actions to be taken, is large and the available training information populates that space only sparsely.
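A minimal tabular sketch may help to illustrate the role of the Q-function and the discount factor. The five-state chain environment, the reward of 1 at the goal and the values of the learning rate, discount factor and exploration rate are hypothetical choices made for this illustration only.

```python
import random

random.seed(0)

N_STATES, GOAL = 5, 4        # states 0..4; reaching state 4 ends an episode
ACTIONS = (-1, +1)           # move left or right along the chain
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2   # learning rate, discount, exploration

# Q[s][a]: current estimate of the discounted return of action a in state s.
Q = [[0.0, 0.0] for _ in range(N_STATES)]

def step(state, action_idx):
    """Deterministic toy environment with a reward only at the goal."""
    nxt = min(max(state + ACTIONS[action_idx], 0), GOAL)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL

for _episode in range(500):
    s = 0
    for _t in range(100):    # cap the episode length
        # Epsilon-greedy action selection with random tie-breaking.
        if random.random() < EPSILON or Q[s][0] == Q[s][1]:
            a = random.randrange(2)
        else:
            a = 0 if Q[s][0] > Q[s][1] else 1
        s2, r, done = step(s, a)
        # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a').
        target = r if done else r + GAMMA * max(Q[s2])
        Q[s][a] += ALPHA * (target - Q[s][a])
        s = s2
        if done:
            break

greedy_policy = [ACTIONS[0 if q[0] > q[1] else 1] for q in Q[:GOAL]]
print(greedy_policy)
```

The learned greedy policy moves toward the goal from every state, and the Q-values decay with the distance to the goal according to the discount factor.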

Monte-Carlo methods (Andrieu et al., 2003) are usually a way of inferring numerical approximations via random (i.i.d.) sampling of a subset of the underlying data. In the context of machine learning, they typically refer to methods to decrease high-complexity environments and datasets and distill e.g., a workable, smaller set of actions to be used by a reinforcement learning algorithm such as Q-learning.
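The basic principle of approximating a quantity by random sampling can be sketched as follows. The example estimates the area fraction of a circular inclusion in a unit cell; the geometry, the function name `estimate_area_fraction` and the sample size are assumptions made purely for illustration.

```python
import math
import random

random.seed(42)

def estimate_area_fraction(radius, n_samples):
    """Estimate the area fraction of a centered circular inclusion in the
    unit square by drawing independent, uniformly distributed points."""
    hits = 0
    for _ in range(n_samples):
        x, y = random.random(), random.random()
        if (x - 0.5) ** 2 + (y - 0.5) ** 2 <= radius ** 2:
            hits += 1
    return hits / n_samples

exact = math.pi * 0.3 ** 2
estimate = estimate_area_fraction(0.3, 100_000)
print(f"exact: {exact:.4f}, Monte-Carlo estimate: {estimate:.4f}")
```

The statistical error of such an estimate decreases with the inverse square root of the number of samples, independent of the dimension of the sampled space.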

A random forest (Breiman, 2001) is an ensemble learning method that combines multiple (typically weak but computationally tractable) predictors, in this case decision trees, into a composite "super predictor" which (depending on the selection of the decision trees) is often not subject to the same constraints as the individual, weaker predictors.
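The ensemble idea can be sketched with a deliberately simplified forest. The tree depth, the number of trees, the misclassification-based split criterion and the synthetic two-feature dataset are assumptions for illustration; established implementations differ in many details (e.g., impurity measures and feature subsampling strategies).

```python
import numpy as np

rng = np.random.default_rng(1)

def best_split(X, y, feature):
    """Best threshold on one feature, judged by the misclassification count."""
    best_t, best_err = None, len(y) + 1
    for t in np.unique(X[:, feature])[:-1]:
        left, right = y[X[:, feature] <= t], y[X[:, feature] > t]
        # Each side predicts its majority class; the minority counts as errors.
        err = (min(left.sum(), len(left) - left.sum())
               + min(right.sum(), len(right) - right.sum()))
        if err < best_err:
            best_t, best_err = t, err
    return best_t

def grow_tree(X, y, depth):
    """Shallow decision tree that picks one random feature per split."""
    majority = int(y.mean() >= 0.5)
    if depth == 0 or y.min() == y.max():
        return majority
    feature = int(rng.integers(X.shape[1]))
    t = best_split(X, y, feature)
    if t is None:
        return majority
    mask = X[:, feature] <= t
    return (feature, t,
            grow_tree(X[mask], y[mask], depth - 1),
            grow_tree(X[~mask], y[~mask], depth - 1))

def predict_tree(tree, x):
    while isinstance(tree, tuple):
        feature, t, left, right = tree
        tree = left if x[feature] <= t else right
    return tree

# Synthetic data (assumption): the class depends on a diagonal boundary,
# which a single axis-aligned split cannot represent.
X = rng.random((200, 2))
y = (X.sum(axis=1) > 1.0).astype(int)

forest = []
for _ in range(25):
    idx = rng.integers(len(X), size=len(X))   # bootstrap sample
    forest.append(grow_tree(X[idx], y[idx], depth=4))

def predict_forest(x):
    votes = sum(predict_tree(tree, x) for tree in forest)
    return int(votes > len(forest) / 2)       # majority vote

X_test = rng.random((500, 2))
y_test = (X_test.sum(axis=1) > 1.0).astype(int)
forest_acc = float(np.mean([predict_forest(x) == t for x, t in zip(X_test, y_test)]))
print(f"forest accuracy on held-out points: {forest_acc:.3f}")
```

Each tree sees a different bootstrap sample and different random features, so the individual staircase-like decision boundaries differ; the majority vote averages over them.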

Bayesian learning (Russell et al., 2016) in the context of machine learning is not a single class of methods but rather an approach into which other machine learning methods can be embedded. Utilizing Bayesian inference, rooted in Bayes' theorem, leads to an optimal update of prior distributions based on available observations. The approach is often used for the parameter estimation of a given learning algorithm or to compare the probabilistic fit between a model and the data to be modeled, either to infer the desired model complexity or to decide between similarly complex models (Neal, 1996). Gaussian process regression models offer a nonparametric approach to building reduced-order models employing a special form of Bayesian learning that might be ideally suited to continuum materials mechanics problems, because such problems typically come with relatively small datasets and little prior knowledge of model forms.
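A compact sketch of Gaussian process regression illustrates how a prior is updated by a handful of observations. The squared-exponential kernel, its hyperparameters, the noise level and the toy data are assumptions for illustration; in applications, the hyperparameters would typically be inferred from the data.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between two sets of 1D inputs."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / length_scale ** 2)

def gp_posterior(x_train, y_train, x_test, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    K_ss = rbf_kernel(x_test, x_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha                      # Bayesian update of the prior mean
    v = np.linalg.solve(L, K_s)
    cov = K_ss - v.T @ v                      # posterior covariance
    return mean, np.sqrt(np.clip(np.diag(cov), 0.0, None))

# Four observations of a hypothetical response.
x_train = np.array([0.0, 1.0, 2.5, 4.0])
y_train = np.sin(x_train)

# Query at a training point, between training points, and far away from them.
x_test = np.array([1.0, 3.25, 8.0])
mean, std = gp_posterior(x_train, y_train, x_test)
print(mean, std)
```

The predicted standard deviation is close to zero at an observed point, moderate between observations and approaches the prior value far away from the data, which is the built-in uncertainty estimate that makes this approach attractive for small datasets.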

Fuzzy c-means (FCM) clustering (Bezdek et al., 1984), which is related to k-means clustering, is usually used as a type of unsupervised learning, often in the context of feature extraction in data mining. The objective in such cases is to cluster data points alongside salient features which are not pre-defined (Mai et al., 2016). Fuzzy clustering denotes that feature-classification is not binary but rather given in terms of probability assignments to multiple features. The algorithm randomly distributes cluster centers among the data and iteratively changes the cluster positions until an objective cost function is minimized.
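The iterative scheme described above can be sketched as follows. The fuzzifier value `m = 2`, the fixed number of iterations and the two synthetic point clouds are assumptions for illustration only.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, seed=0):
    """Alternate membership and center updates for a fixed number of steps."""
    rng = np.random.default_rng(seed)
    # Start from randomly chosen data points as cluster centers.
    centers = X[rng.choice(len(X), size=n_clusters, replace=False)]
    for _ in range(n_iter):
        # Distance of every point to every center (epsilon avoids division by zero).
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        # Membership update: proportional to d^(-2/(m-1)), normalized per point.
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
        # Center update: membership-weighted mean of the data.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
    return centers, U

# Two synthetic point clouds (assumption) around (0, 0) and (5, 5).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.5, size=(50, 2)),
    rng.normal(5.0, 0.5, size=(50, 2)),
])

centers, U = fuzzy_c_means(X, n_clusters=2)
centers = centers[np.argsort(centers[:, 0])]   # order clusters for readability
print(centers)
```

In contrast to k-means, each point carries a graded membership in every cluster; the memberships of a point sum to one.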

Principal component analysis (PCA) (Abdi and Williams, 2009) is typically used as a simplifying preprocessing step and not considered as a pure machine learning algorithm in itself. In cases where data points consist of more parameters (i.e., higher dimensionality) than the learning model can handle, a number of parameters may be merged until the data becomes tractable for the desired learning model ("dimensionality reduction"). PCA achieves this by projecting the data into a lower dimensional space to retain the maximum amount of variance. The projection leads to a reduced number of parameters, the "principal components," which capture the maximum amount of information among their axes.
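The projection step can be sketched using a singular value decomposition. The synthetic dataset, in which three measured variables are driven by a single hidden factor, is an assumption chosen to make the dimensionality reduction easy to verify.

```python
import numpy as np

def pca(X, n_components):
    """PCA via singular value decomposition of the mean-centered data."""
    mean = X.mean(axis=0)
    Xc = X - mean
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]              # principal directions (rows)
    explained = (S ** 2) / np.sum(S ** 2)       # fraction of variance per component
    scores = Xc @ components.T                  # data in the reduced space
    return scores, components, explained[:n_components]

# Synthetic data (assumption): three variables driven by one hidden factor t,
# plus a small amount of noise.
rng = np.random.default_rng(0)
t = rng.normal(size=(200, 1))
X = np.hstack([t, 2.0 * t, -t]) + 0.05 * rng.normal(size=(200, 3))

scores, components, explained = pca(X, n_components=1)
print(components[0], explained[0])
```

Here a single principal component captures almost all of the variance, so the three original parameters can be replaced by one score per data point with very little loss of information.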

Multi-fidelity methods (Aydin et al., 2019) may grow to be especially relevant for materials research, which often involves computationally expensive simulations, in contrast to other domains where training data is either already present or inexpensive to generate. Multi-fidelity approaches use a large number of cheap low-fidelity computational models for generating the training data for the majority of the training process, only switching to higher-fidelity simulations once the learning contribution per computational cost spent surpasses that of the lower-fidelity model. The magnitude of saving depends on the type of simulation, especially on the feasibility of defining lower-fidelity versions of the computational model, and on the degree to which the used machine learning algorithm relies on gradient-descent-based methods.

In the context of materials science, all of the methods mentioned in this section (with the exception of PCA) are mostly used solitarily, i.e., as the only major component of the machine learning approach. However, as the subfield matures and gains further dissemination into materials science research, they are expected to be used in tandem more often in the future. They can be either combined into (parallel) ensemble methods for which each method contributes a prediction, or into a consecutive serial learning pipeline, in which e.g., clustering is used for feature determination in conjunction with a CNN for subsequent feature learning.

Lastly, it may be worth noting that simple (and borderline merely statistical) learning algorithms such as regression (Russell et al., 2016) and decision trees (Quinlan, 1986) should not be neglected, especially when only a low-sample regime is available offering only a limited amount of extractable information (as is often the case with time and resource expensive experimental setups). Typical machine learning methodologies may be inapplicable, in which case the aforementioned learning algorithms may constitute the most suitable tool to "train" a predictor.

#### PROCESS PARAMETERS

In this section, applications of machine learning approaches to identify, approximate and optimize process parameters for a variety of results are discussed. The choice of process parameters is responsible for many features that arise in the ensuing process-microstructure-property-performance chain. Examples include identifying correlations between process parameters and resulting microstructures (Popova et al., 2017), predicting process-time requirements and part-geometry results (Xiong et al., 2014), as well as correcting measurements, e.g., of resulting residual stress fields (Chupakhin et al., 2017). One set of features arising from the process parameters relates to the material microstructure itself. On the one hand, direct models can be used to discover relevant relationships between causes (i.e., inputs) and effects (i.e., outputs) (Xiong et al., 2014; Popova et al., 2017). On the other hand, once such a forward model is validated, it can be suitably interrogated for inverse relationships needed in design, where the goal is to identify the specific process parameters (and histories) that lead to a desired optimization of the effects (Upadhyay et al., 2012).

#### Descriptive

Descriptive tasks such as pattern recognition and correlation have been performed by Popova et al. (2017) for the implementation of a data-driven workflow to identify relationships between process parameters and resulting microstructures in additive manufacturing. The proposed workflow included data pre-processing, microstructure quantification and dimensionality reduction to extract and validate process-structure linkages (in the form of reduced order models). The microstructures obtained via additive manufacturing techniques are complex and highly depend on specific process conditions. The generation of synthetic data of these microstructures was accomplished via applying the Monte-Carlo method. The dataset consisted of ∼1,600 unique microstructures. The particular method applied in each step of the workflow depends on the amount and type of data available. For building a reduced-order model, three different approaches were used: first, a so-called chord length distribution was employed to quantify microstructural features such as grain sizes, shape distributions and anisotropies. Second, a dimensionality reduction and model reconstruction was achieved by PCA. Third, a multivariate polynomial regression was used for building a surrogate model to efficiently exploit the data. As a result, a framework was created to substitute constitutive models that are typically comprised of comprehensive multiscale and multiphysical field equations by approaches such as advanced statistics and machine learning that lead to highly efficient identification of process-structure-property linkages.

#### Predictive

For the prediction of bead layer geometries in additive manufacturing by robotic gas metal arc welding based on the chosen process parameters, Xiong et al. (2014) used two different prediction approaches: a feedforward artificial neural network, see **Figure 2**, and a second-order regression analysis. Important characteristics of the manufactured part, such as thickness of the weld bead layer, surface quality and dimensional accuracy affecting the geometry of the deposited layers, were included in the training of the ANN. The predictive equation of the second-order regression analysis consisted of a quadratic polynomial considering four influential factors: wire feed rate, welding speed, arc voltage and the distance between nozzle and plate. When comparing the prediction results of both methods to experimental findings, the bead width was marginally underestimated and the bead height slightly overestimated. The deviations were assumed to stem from heat accumulation effects that were not accounted for by either approach. Overall, both prediction approaches led to reasonable results; however, the error of the ANN was consistently lower than that of the second-order regression analysis. This is due to the superior capability of the ANN to approximate nonlinear processes. Thus, a neural network might be preferable for predicting deposited layer width and height with reasonable accuracy in future research (Xiong et al., 2014).
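The structure of a second-order regression in four factors can be sketched as follows. The data below are synthetic and purely illustrative: the normalized factor ranges, the coefficients of the hypothetical response and the noise level are assumptions and do not reproduce the measurements or the fitted model of Xiong et al. (2014).

```python
import itertools
import numpy as np

def quadratic_design_matrix(X):
    """Columns: 1, x_i, and all products x_i * x_j with i <= j."""
    n, k = X.shape
    cols = [np.ones(n)] + [X[:, i] for i in range(k)]
    cols += [X[:, i] * X[:, j]
             for i, j in itertools.combinations_with_replacement(range(k), 2)]
    return np.column_stack(cols)

rng = np.random.default_rng(0)

# Hypothetical, normalized process factors in [-1, 1]:
# wire feed rate, welding speed, arc voltage, nozzle-to-plate distance.
X = rng.uniform(-1.0, 1.0, size=(60, 4))

# Synthetic response (assumption), e.g., a bead width in arbitrary units.
y = (5.0 + 1.5 * X[:, 0] - 2.0 * X[:, 1] + 0.5 * X[:, 2]
     + 0.8 * X[:, 0] * X[:, 1] - 0.6 * X[:, 3] ** 2
     + 0.01 * rng.normal(size=60))

A = quadratic_design_matrix(X)                 # 1 + 4 + 10 = 15 terms
coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # ordinary least squares
print(np.round(coef, 2))
```

With four factors, the full quadratic polynomial has 15 coefficients (one constant, four linear, four pure quadratic and six interaction terms), all of which enter the model linearly and can therefore be fitted by ordinary least squares.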

Prediction of the required cutting force during the turning process of a titanium alloy was performed by Upadhyay et al. (2012) with a neural network. In comparison to the experiments that were conducted based on design of experiment (DOE) using the response surface method, the neural network predictions showed better performance on a small but statistically well-distributed dataset. Sahu et al. (2018) also used neural networks to predict the surface roughness in the turning process of a titanium alloy while considering the three controllable process parameters cutting speed, feed rate and cutting depth as input. Additionally, they were able to link them to the measurable outputs: cutting force, feed force and acceleration.

The prediction of higher-order microstructure statistics as a function of the process parameters from both multiscale experimental and simulation datasets was demonstrated in recent studies (Brough et al., 2017a; Khosravani et al., 2017; Yabansu et al., 2017; Popova et al., 2018). In these preliminary explorations, reduced-order models were built using a combination of dimensionality reduction (using PCA), feature engineering (using Pearson correlations), and regression. Clearly, there are many opportunities for the application of more advanced machine learning approaches to this class of problems.

#### Prescriptive

The identification of process parameters to be applied for obtaining anticipated results can be achieved by completing a prescriptive task. Such a prescriptive task for measurement correction on residual stress fields after laser shock peening (LSP), obtained through the hole drilling method, was performed by Chupakhin et al. (2017) via the use of an ANN. The process of LSP allows deep compressive residual stresses to be introduced locally (Ding and Ye, 2006), which is of particular interest in applications prone to fatigue failure. Hole drilling is the commonly used technique to determine the depth-dependent residual stress field, but the method is limited to residual stresses below 60% of the yield stress. Chupakhin et al. (2017) developed an ANN for correcting the measured residual stress profile. About 250 training patterns were computed from elastic-plastic FEM simulations of a pre-stressed plate with increasing hole depth by random combination of material properties and residual stress profiles covering the typical range of alloys and LSP profile shapes. The computed deformation field on the surface of the plate was analyzed using the Integral method (Schajer, 1988), which is also used in hole drilling experiments. This "measured" residual stress profile served as input, while the residual stress profile applied to the plate served as desired output. The dataset revealed that the error is still below 10% up to a residual stress of 80% of the yield strength. For larger values, the error of the hole drilling method can rise dramatically and requires a correction using the ANN. Based on the corrected residual stress profiles, the relationship between process parameters and residual stresses could be determined via DOE. This allowed for designing the LSP process to generate desired residual stresses in 2.0 mm-thick AA2024-T3 sheet material (Chupakhin et al., 2019).

#### MICROSTRUCTURE

Numerous research results have been published on microstructural quantification (Altschuh et al., 2017; Voyles, 2017; Gobert et al., 2018), classification (DeCost and Holm, 2015; Chowdhury et al., 2016; DeCost et al., 2017), evolution (Gomberg et al., 2017) and reconstruction (Sundararaghavan and Zabaras, 2005; Bostanabad et al., 2016). Bridging length-scales around the microstructure can be pursued via either bottom-up approaches, e.g., through homogenization or via top-down approaches, e.g., through localization. Moreover, it can be achieved through descriptive, predictive and prescriptive approaches. Based on the descriptive identification of linkages between process parameters, generated microstructures and resulting mechanical properties (Deshpande et al., 2016; Cecen et al., 2018), as well as the related fatigue performances and failure mechanisms (Spear et al., 2018), it is possible to predict or even prescriptively tailor and optimize microstructural features.

#### Descriptive

The descriptive characterization of the microstructure of random heterogeneous materials remains an important challenge in materials mechanics. To this end, descriptors such as n-point spatial correlations (also called n-point statistics) are used. Sundararaghavan and Zabaras (2005) showed that SVMs in combination with PCA can help to classify microstructures and reconstruct three-dimensional representative volume elements (RVE) using such descriptors, as shown in **Figure 3**, with nearly real-time efficiency. This idea was significantly extended by Niezgoda et al. (2013), who suggested representing the microstructure by stochastic processes that allow for a largely automated classification of microstructures. The framework also naturally leads to the delineation of a comprehensive space of microstructures (Niezgoda et al., 2008) and to the instantiation of microstructures from statistics (Fullwood et al., 2008; Turner and Kalidindi, 2016).
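As an illustration of such descriptors, the two-point autocorrelation of a two-phase microstructure can be computed efficiently with fast Fourier transforms. The sketch below assumes periodic boundary conditions and uses a random, spatially uncorrelated 32x32 microstructure; both are simplifying assumptions, and the function name `two_point_autocorrelation` is hypothetical.

```python
import numpy as np

def two_point_autocorrelation(phase):
    """Periodic two-point autocorrelation of a binary phase indicator.

    Entry r of the result is the probability that two points separated by
    the vector r both lie in the phase of interest.
    """
    F = np.fft.fftn(phase)
    return np.fft.ifftn(F * np.conj(F)).real / phase.size

rng = np.random.default_rng(0)

# Synthetic microstructure (assumption): each pixel belongs to the phase of
# interest with probability 0.3, independently of its neighbors.
micro = (rng.random((32, 32)) < 0.3).astype(float)

stats = two_point_autocorrelation(micro)
volume_fraction = micro.mean()
print(stats[0, 0], volume_fraction)
```

The value at zero separation recovers the volume fraction of the phase, and for this uncorrelated example the values at all other separations scatter around the squared volume fraction. Flattened arrays of such statistics for an ensemble of microstructures are a typical input to a subsequent PCA.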

Fast and Kalidindi (2011) presented an efficient approach for localization, i.e., calculating the strain field in the relevant volume element for given loading conditions, based on the materials knowledge systems (MKS) (Kalidindi et al., 2010; Landi et al., 2010). The core of this approach is the description of the material response (e.g., microscale strain field) via a series of convolution integrals. Statistical continuum theory (Kröner, 1977) provides the basis for the approach, i.e., it inspires the model form for the reduced-order model. Central to the MKS is the calibration of the influence filters present in these linkages. This calibration is accomplished using results from numerical models, typically from finite element calculations of the responses of microscale volume elements (MVE) or RVE, respectively. Different model building approaches have been used in this body of work. Fast and Kalidindi (2011) used linear regression, removing redundancies by employing a reduced-row echelon form. This work demonstrates the suitability of the kernel-based series model form, which systematically adds more terms as higher levels of microscale interactions need to be captured.

The MKS approach from Kalidindi et al. (2010) was also utilized in a study on elastic localization kernels for single phase polycrystalline microstructures (Yabansu et al., 2014) as well as for a wide range of cubic polycrystals (Yabansu and Kalidindi, 2015). The goal was also to efficiently achieve scale bridging in modeling and simulation of materials involving numerous scales. It is claimed that the most advanced material structures possess a highly hierarchical internal structure with different length scales. Therefore, the MKS framework is used to capture high dimensional local state spaces of advanced material systems for the prediction of elastic strain fields in a broad class of cubic polycrystalline microstructures (**Figure 4**). Significant reduction of required computational effort was achieved through spectral representations of the influence functions that are highly compact.

In the area of materials characterization and microscopy, Voyles (2017) focused on improving the quality of data obtained empirically from instruments (microscopes, in this case) and on optimally deriving information from these data, to ultimately develop generalizations and gain new knowledge. In addition, microstructure quantification and feature identification in porous membranes was studied by Altschuh et al. (2017). Data generation was conducted via a newly developed microstructure generator to produce a large ensemble of porous structures that contain a large variety of different features, such as pore shape, pore size, degree of porosity, and specific surface area. Experimental data was obtained via high-resolution X-ray tomography to measure the morphology of real porous membranes. To be able to compare the two different datasets, statistical representations for both simulated and real membrane microstructures were calculated and compared based on a PCA of two-point spatial correlations. This leads to an objective measure of the difference between any two selected microstructures and thus to a quantification of the porous membrane structures. A PCA on these two-point statistics was used to obtain low dimensional representations of the microstructures and to classify them. For the basic microstructure, the most dominant features are porosity, pore size, stretching direction and stretching factor. These features were identified as a basis in the low dimensional space. The high variety of microstructure characteristics and its influence on the low dimensional space led to the identification of linkages. As a result, the basis vector and the principal component value were successfully used to estimate the features of the real membranes.

#### Predictive

For the purpose of providing an efficient linkage for localization, Liu et al. (2015b) compared the performance of different approaches based on machine learning and data mining concepts. One particular goal was to overcome limits in terms of applicability of the previous linkage approach based on the extension of statistical continuum theory to higher elastic contrasts of the composition (Kalidindi et al., 2011). The linkage is established by setting up a predictive model consisting of two aspects, feature extraction and regression. Three test cases were analyzed to evaluate the influence of different steps in generating the data-driven predictive model for localization. First, the influence of additional information about the neighboring voxels, called feature space, on predicting the response of the voxel under consideration is studied. However, computational performance decreases linearly because training time grows with the number of included neighbors (i.e., the feature dimension), and the prediction quality may even deteriorate. Secondly, the influence of differently defined features on the representation of the considered voxels is systematically analyzed and subsequently ranked. Based on this ranking, combinations of different top-ranked features are compared, showing improved performance in contrast to simply adding information of the neighboring grains. However, there exists a feature threshold beyond which the error increases with increasing information. Thirdly, the performances of different regression models are compared, showing that a random forest regression model outperformed the considered support vector regression model and M5 model tree in terms of accuracy, at only a moderately increased training time compared to the M5 model tree.

This approach was extended by Liu et al. (2017) through considering context detection, i.e., "finding the right high-level, low dimensional, knowledge representation in order to create coherent learning environments" (Liu et al., 2017). In this regard, a two-step approach is used: first, the context of the data is identified, and secondly, a predictive model is constructed for each context. This procedure, also called multi-agent learning, leads to an increased efficiency and accuracy of the predictive model. The key difference to the previous work of the authors is the identification of microstructure similarities, called macro-features, and the assembly of similar microstructures into subsets using the k-means clustering algorithm. Subsequently, each subset is handled with the approach presented in their previous work (Liu et al., 2015b) and discussed above, using the best 57 features, called micro-features. Three strategies of identifying the microstructure macro similarities are investigated and their performance compared. These strategies include context detection based on volume fractions alone, on "designed macroscale microstructure descriptors" (Liu et al., 2017) and on pair correlation functions. The results showed an improvement of 38% compared to the best results presented by Liu et al. (2015b). The accuracy of the different strategies for the macro feature extraction was nearly identical.

Automatic microstructure recognition was implemented by Chowdhury et al. (2016) in a case study on image-driven machine learning methods. Dendritic morphologies were of particular interest, with the aim of performing classification with a minimum of required prior expert knowledge. Thus, the anticipated knowledge gain was claimed to be equivalent to human performance, but not beyond. The first classification task was to differentiate between dendritic and non-dendritic microstructures. The second classification task aimed to recognize longitudinal and transverse dendrite orientations via a subsequent binary classification task performed on cross-sectional views. Images with different magnitudes and from different material compositions served as initial data input. Feature extraction and dimensionality reduction were used to represent micrographs as feature vectors. These feature vectors were then used for training, validating and testing various classification models. They consist of a set of detected features in an image; thus, images were represented by high-dimensional feature vectors (**Figure 5**). Feature selection is performed to increase computational efficiency by reducing the length of the feature vector while still retaining all relevant image information (e.g., reducing sparsity of the vector). Various dimensionality reduction methods were tested; in conclusion, convolutional neural networks were rated best for both classification tasks, with an accuracy of 92–98%, as they generalized best.

Microstructural images were used by Ling et al. (2017) to set up a data-driven model for microstructure classification, using pre-trained convolutional neural networks within the framework of Keras (Chollet et al., 2015) and Tensorflow (Abadi et al., 2016). The specific model was trained, tested and validated with the aim of processing different datasets through generalization, including the identification of the required number of features and an evaluation of the interpretability of results. First, the microstructural images were transformed by using CNNs, followed by texture featurization and classification through a random forest algorithm (**Figure 6**). The required computational effort is proportional to the number of features. Mean texture featurization showed good performance owing to its comparatively low number of features, which requires less memory space and enables efficient computation of the random forest classifier. Overall, an appropriate method for featurizing images obtained via scanning electron microscope (SEM) was developed and applied. Sufficient generalization was achieved by basing the input on different datasets, as opposed to only one single dataset, which allowed for various prediction targets.

A descriptive and predictive approach is proposed by Hu et al. (2018) for the efficient simulation of grain and pore growth in aluminum alloys during solidification in a casting process. A cellular automaton (CA) is combined with backpropagation neural networks (BPNN), resulting in a so-called CA-BPNN method to simulate the growth of pores and grains. Computational effort is reduced since the high-dimensional continuous governing equations that account for porosity do not have to be solved. The neural network is used on data obtained from a process simulation of the solidification via CA to economically identify the relationship between porosity and solidification parameters<sup>1</sup>, such as solidus velocity, initial hydrogen content as well as spatial and temporal thermal gradients. These relationships are considered in the transition functions, which compose the rules for the cellular automaton model and affect the simulated pore growth in addition to the governing equations of the numerical simulation (**Figure 7**).

For metallurgical texture analysis, in particular classification of zones of titanium alloy microstructures into either α and β or α + β phases, respectively, Mesquita Sá Junior et al. (2018) used a randomized neural network for identifying microstructural features. In particular, linear discriminant analysis (Fukunaga, 1990) and SVMs reached good and similar precisions for both types of microstructures. For example, this approach was applied for classifying titanium alloys processed via friction stir welding, as the existing phase type has a strong effect on the mechanical material properties.

An example of a predictive task based on the descriptive approach of defect pattern recognition was presented by Gobert et al. (2018). For in-situ detection of discontinuities, such as defects, during the process monitoring of additively manufactured metals via powder bed fusion, a supervised machine learning approach was implemented on high-resolution images recorded via computer tomography during the building of layers. For geometrically describing discontinuities, adjacent voxels that exhibit anomalies were clustered. The assignment of each anomaly voxel to its corresponding discontinuity was achieved through k-means clustering, which is based on the minimum distance of a voxel to the center of its cluster. The aim was to detect discontinuities with diameters between 20 and 200 µm. Furthermore, visual features in the form of high-dimensional feature vectors were extracted and evaluated through binary classification via SVMs. Once the ensemble classifier was trained, the accuracies amounted to 80% and better for predictively detecting defects during the process, validated with three-dimensional computer tomography images of the manufactured parts.

<sup>1</sup>Consequently, this work simultaneously qualifies for the Predictive Section of this article: prediction of process parameters.

third position within the e.g., first, second, third, fourth or fifth stack. The outputs were processed via featurization of the texture and the ultimate classification of these features was achieved by using a random forest. Reprinted from Ling et al. (2017), Copyright (2017), with permission from Elsevier.

#### Prescriptive

A particular challenge is the identification and prediction of optimized microstructure configurations to prescribe the best material properties for specific applications. Liu et al. (2015a) presented an approach for microstructure optimization, enhanced by machine learning methods, as outlined in **Figure 8**. Although a number of methods for determining the properties directly from a given representation are available, the traditional structure-property optimization, representing the inverse procedure, is complex. The optimization problem might be of high dimension, multiple objectives have to be fulfilled, and the result is often non-unique, all of which degrades the performance of classical optimization methods. In Liu et al. (2015a), a machine learning-based structure-property optimization scheme, see **Figure 8**, is introduced and applied to the design of a magnetoelastic Fe-Ga alloy for five different design problems. At the core of the new scheme are random data construction as well as feature selection and classification algorithms to refine the search path and to reduce the search region, respectively. The latter two steps aim to reduce the search space and thereby decrease the computational cost of finding the optimal solution. The microstructure of the magnetoelastic Fe-Ga alloy was represented by an orientation distribution function (ODF). In combination with a crystal plasticity model, all relevant properties considered in this work were obtained via homogenization. For the random microstructure data generation, four randomization methods were used to ensure sufficient randomness and polarization: random intervals, random k intervals, random every k and best-first assignment. The search path refinement is based on supervised feature ranking methods [χ² (Liu and Setiono, 1995), information gain (Quinlan, 1986), f-score (Steinwart and Christmann, 2008) and SVM-weight (Chang and Lin, 2008)] to identify the most promising path, i.e., crystal orientations or ODF dimension. For the search space reduction, a rule-based classification tree (decision tree) is used, e.g., to identify promising orientation regions. For the design problems published by Liu et al. (2015a), the original region could be reduced by 80–99%. A gradient-based line search is employed to perform the mathematical optimization. The authors compare the outcome and the performance of the machine learning-based scheme to three other approaches, namely an exhaustive search, a generalized pattern search and linear programming (LP) as well as a genetic algorithm (GA). Overall, the results by Liu et al. (2015a) illustrate that the machine learning-based scheme outperforms all other approaches, considering optimality, efficiency<sup>2</sup> and completeness of the solution, in particular when dealing with nonlinear problems.
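
To make the search path refinement more tangible, the following sketch ranks discretized features by information gain, one of the supervised ranking criteria named above. The toy data, the binary discretization and the function names are assumptions for illustration; they do not stem from Liu et al. (2015a).

```python
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((labels.count(c) / n) * log2(labels.count(c) / n)
                for c in set(labels))

def information_gain(feature, labels):
    """Entropy reduction obtained by splitting the labels on a discrete feature."""
    n = len(labels)
    remainder = 0.0
    for value in set(feature):
        subset = [lab for f, lab in zip(feature, labels) if f == value]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder

# Hypothetical toy data: rows are candidate microstructures, columns are
# discretized ODF dimensions (0 = low, 1 = high); the label marks whether the
# homogenized property was "good" (1) or "poor" (0).
samples = [
    (1, 0, 1), (1, 1, 0), (1, 0, 0), (1, 1, 1),
    (0, 0, 1), (0, 1, 0), (0, 0, 0), (0, 1, 1),
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

gains = [information_gain([s[j] for s in samples], labels) for j in range(3)]
ranking = sorted(range(3), key=lambda j: gains[j], reverse=True)
```

Here the first ODF dimension separates the two classes perfectly and is therefore ranked first, which is the kind of information a search path refinement could exploit.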

Brough et al. (2017b) set up a prescriptive framework for capturing and communicating critical information regarding the material structure evolution in spatiotemporal multiscale simulations to reduce the number of required experiments.

<sup>2</sup>LP was faster for linear problems. GA was reported to be faster for nonlinear problems as well but the authors Liu et al. (2015a) reported that the algorithm worked poorly for the considered problem.

Copyright (2018), with permission from Elsevier.

They aimed to establish the desired process-structure-property linkages by generalizing the MKS framework through the introduction of different basis functions and the exploration of their benefit. Using Cahn-Hilliard based phase field simulations to predict microstructure evolution and Green's function based influence kernels to identify the underlying embedded physics led to a speed-up by a factor of three compared to an optimized numerical integration algorithm. It is important to distinguish the direction of relationships. Consequently, the kernels in the MKS localization approach are calibrated with results from numerical tools such as FEM. Once the linkages are calibrated and validated, the influence kernels can be used to predict the local responses of new microstructures at minimum computational costs. Thus, this approach is well suited for exploring a very large number of potential microstructures. The extracted kernels were insensitive to details of the initial microstructure, enabling the application of the kernel to any initial microstructure within the material system selected for that kernel and allowing for expanding the domain size without significant alteration of the accuracy. The overall achievement of this study was the rapid exploration of the underlying physics via Green's function based influence kernels at exceptionally low computational costs, opening up superior opportunities for spatiotemporal multiscale bridging.
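
The calibrate-once, predict-often idea behind influence kernels can be sketched in one dimension. The example below is a strongly simplified stand-in for the MKS localization: it assumes a single microstructure variable, periodic boundaries and a linear response, and it replaces the expensive numerical tool by a hypothetical `true_kernel`. All names and values are illustrative assumptions.

```python
import cmath
import random

def dft(x):
    """Naive discrete Fourier transform (O(n^2), sufficient for a small sketch)."""
    n = len(x)
    return [sum(x[s] * cmath.exp(-2j * cmath.pi * k * s / n) for s in range(n))
            for k in range(n)]

def idft(X):
    """Inverse of dft(); returns the real part for real-valued signals."""
    n = len(X)
    return [(sum(X[k] * cmath.exp(2j * cmath.pi * k * s / n) for k in range(n)) / n).real
            for s in range(n)]

def localize(kernel, micro):
    """Local response as a periodic convolution of kernel and microstructure."""
    n = len(micro)
    return [sum(kernel[t] * micro[(s - t) % n] for t in range(n)) for s in range(n)]

def calibrate_kernel(micros, responses, eps=1e-12):
    """Least-squares kernel fit, decoupled per frequency in Fourier space."""
    n = len(micros[0])
    num = [0j] * n
    den = [0.0] * n
    for m, p in zip(micros, responses):
        M, P = dft(m), dft(p)
        for k in range(n):
            num[k] += M[k].conjugate() * P[k]
            den[k] += abs(M[k]) ** 2
    return idft([num[k] / (den[k] + eps) for k in range(n)])

# Hypothetical "ground truth" kernel standing in for an expensive solver.
true_kernel = [0.5, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.2]
rng = random.Random(1)
train_micros = [[rng.random() for _ in range(8)] for _ in range(5)]
train_responses = [localize(true_kernel, m) for m in train_micros]

# Calibrate once, then predict the local response of an unseen microstructure.
kernel = calibrate_kernel(train_micros, train_responses)
new_micro = [rng.random() for _ in range(8)]
prediction = localize(kernel, new_micro)
reference = localize(true_kernel, new_micro)
```

Because the convolution decouples in Fourier space, the calibration reduces to one scalar least-squares problem per frequency, which hints at why such kernels are cheap to evaluate once they are available.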

# MECHANICAL PROPERTIES

Mechanical material properties must be precisely predicted and controlled, as they are strongly linked to and highly affected by process parameters and the resulting microstructures. Mechanical behavior in simulations is often described by means of constitutive equations. Even before the recent rise in popularity of machine learning in the scientific community, several approaches had been suggested to replace constitutive equations by data-based methods such as artificial neural networks (Hashash et al., 2004; Oeser and Freitag, 2009). Such approaches are particularly promising for problems where it remains poorly understood how to describe the material behavior appropriately by means of constitutive equations, for example, remodeling of biological bones as discussed by Hambli et al. (2011). Other examples include the prediction of compressive strength and elastic modulus of sandcrete materials (Asteris et al., 2017) and approximation of yield strength while respecting diverse physical constraints for the design of a nickel-based superalloy (Conduit et al., 2017). In general, descriptive tasks, such as pattern recognition, predictive tasks, such as classification, as well as prescriptive tasks, such as optimization, are implemented to fulfill the material property requirements of particular material applications.

#### Descriptive

Hambli et al. (2011) substituted constitutive equations by a data-driven ANN. A combined model composed of a finite element (FE) simulation and an ANN was developed for simulating the remodeling process of bones and linking the mesoscopic scale of the "trabecular network" level to the macroscopic scale of the complete bone level, as shown in **Figure 9**. While the FE simulation was implemented on the macroscopic scale, an ANN was used to provide predictions at the mesoscopic scale. The FE analysis was based on digital CT image voxels used to build mesoscale RVEs, whereas the ANN was provided with parameters of the bone materials as well as boundary conditions and applied stresses. The anticipated outputs were the updated bone properties.

In the data-driven approach presented by Kirchdoerfer and Ortiz (2016), the need for empirical material modeling, which can require extensive efforts, is circumvented by performing more efficient calculations directly from a material dataset obtained through experiments. Through the combination of experimental data, relevant constraints and essential conservation laws, the data-driven calculations were restricted to remain within boundaries prescribed by principles of conservation and relevant limits related to the specific problem. In particular, through the data-driven model, the nearest possible state of a materials datapoint of interest to the experimental dataset is assigned to a point in the computational material model that simultaneously fulfills the boundary conditions. This nearest possible state is determined via a distance-minimization function in the phase space between the experimental data points and the newly proposed data points from the data-driven computational model. The approach was applied to a mechanical problem of a non-linear three-dimensional truss system with linear elastic properties. The developed data-driven solvers showed good convergence, especially in comparison to a classical finite element model analysis. An extension of this approach was the investigation of its robustness with respect to noise induced by outliers within experimental datasets, which was achieved through a cluster analysis (Kirchdoerfer and Ortiz, 2017). Furthermore, the data-driven computing approach is extended in Kirchdoerfer and Ortiz (2018) to time-dependent problems such as predicting annealing processes.
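
The distance minimization in phase space can be illustrated for a single bar. The sketch below assumes a weighted distance of the form C/2 (ε − ε')² + 1/(2C) (σ − σ')² with a purely numerical weight C, a hypothetical bar-spring arrangement, and synthetic data; it searches all data points exhaustively, which is only feasible for such a one-element problem and is not the solver of the original work.

```python
def data_driven_bar(data, a, b, C):
    """One-element data-driven solve.

    data : list of measured (strain, stress) pairs
    a, b : admissible states satisfy stress = a - b * strain
           (equilibrium and compatibility combined)
    C    : positive weight defining the phase-space distance
    """
    best = None
    for eps_d, sig_d in data:
        # Closest admissible state to this data point under the weighted norm
        # d = C/2 (eps - eps_d)^2 + 1/(2C) (sig - sig_d)^2.
        eps = (C * eps_d + b * (a - sig_d) / C) / (C + b * b / C)
        sig = a - b * eps
        dist = 0.5 * C * (eps - eps_d) ** 2 + 0.5 * (sig - sig_d) ** 2 / C
        if best is None or dist < best[0]:
            best = (dist, (eps, sig), (eps_d, sig_d))
    return best

# Hypothetical setup: bar (length L, area A) in series with a spring of
# stiffness k; the total elongation u0 is prescribed, so that
# eps * L + sig * A / k = u0.
L, A, k, u0 = 1.0, 1.0, 50.0, 0.01
a, b = k * u0 / A, k * L / A

# Synthetic "experimental" data from a linear material with assumed modulus E.
E = 100.0
data = [(i * 1e-5, E * i * 1e-5) for i in range(1001)]

dist, (eps, sig), nearest = data_driven_bar(data, a, b, C=E)
```

With sufficiently dense data, the admissible state found this way approaches the solution of the classical model, here a strain of a / (E + b); no constitutive law is fitted at any point.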

Ibañez et al. (2018a) proposed a data-driven computational approach to compensate for the inability of existing constitutive models to be extended or generalized for describing new experimental results without significant adaptation efforts. To describe the elastic material behavior, there was no need for a constitutive model that could reflect linear and non-linear elastic behavior or yield conditions. Two different linearization strategies were proposed for an iterative solver that defines points in the material model fulfilling both constitutive and equilibrium equations within large experimental datasets.

More recently, however, Ibáñez et al. (2018b) proposed an approach that combines governing equations with constitutive plasticity models and experimental data via machine learning. Because it retains the constitutive equations, the approach is claimed to be more accurate and efficient than approaches without a model. The use of sparse proper generalized decomposition (s-PGD) made it possible to correct constitutive plasticity models in order to minimize the error between the results generated by the model and those obtained via experiments. Through this approach, it was possible to utilize substantial knowledge already contained in the model, as opposed to training an algorithm from scratch.

Liu et al. (2018) proposed a so-called deep material network that was implemented for modeling materials on multiple scales, based on homogenization of two-dimensional RVEs. With data obtained from linear elastic RVE calculations, the deep generic material network was trained via stochastic gradient descent with backpropagation and enhanced via model compression by removing redundancies<sup>3</sup> in the network. As a result, learning and convergence were achieved in less time. A number of connected building blocks, as is common for generic algorithms, are used in combination with solutions from homogenization of two-dimensional elastic RVEs to preserve important information about the mechanical physics. The trained network was validated with numerical simulations for cases of linear elasticity, nonlinear plasticity and finite-strain hyperelasticity exposed to large deformations. It thus provides a description of mechanical microstructure-property linkages, but it can also be used for prediction purposes during material development.

<sup>3</sup>Removing redundancies refers to the deletion of nodes whose function is f = 1.

Hambli et al. (2011), Copyright (2010), with permission from Springer Nature.

To derive relationships between process parameters, microstructure and mechanical properties for additively manufactured materials, Yan et al. (2018) proposed a comprehensive, data-driven model, containing multiple scales to respect numerous underlying physical phenomena. To enable an efficient and accurate data-driven mechanical simulation for material design, a reduced order modeling technique was developed, the so-called self-consistent clustering analysis (SCA), which is based on the works of Liu et al. (2016a) and Liu et al. (2016b). The SCA was used on the mesoscale to connect the microstructural model to macroscopic properties. Processed data consisted of voxels from non-linear materials with complex microstructural morphologies. Instead of solving constitutive equations for each voxel, clusters of voxels were formed, e.g., via the k-means method, and constitutive equations were solved for each of those clusters. As a result, SCA served as a reduced order method that offers a valuable compromise between efficiency and accuracy of the results.

Huber (2018) addressed a number of fundamental questions regarding the topological description of materials characterized by a highly porous three-dimensional structure with bending as the major deformation mechanism. This is the dominant deformation mechanism in nanoporous gold, foams, porous membranes and some architecture materials. Highly efficient finite-element beam models were used for generating data on the mechanical behavior of structures with different topologies, ranging from highly coordinated bcc to Gibson–Ashby structures (Gibson and Ashby, 1997). Random cutting enabled a continuous modification of average coordination numbers ranging from the maximum connectivity to the percolation-cluster transition of the 3D network. Via data mining, the interdependencies of topological parameters as well as relationships between topological parameters and mechanical properties were discovered. It was found that the average coordination number serves as a common key for determining the cut fraction, the scaled genus density, and the macroscopic mechanical properties. The dependencies of macroscopic Young's modulus, yield strength, and Poisson's ratio on the cut fraction (or average coordination number) could be represented as master curves, covering a large range of structures from a coordination number of 8 (bcc reference) to 1.5, close to the percolation-cluster transition. As an interesting outcome, the data for macroscopic Young's modulus and yield strength are covered by a single master curve. This led to the important conclusion that the relative loss of macroscopic strength due to pinching-off of ligaments corresponds to that of macroscopic Young's modulus. In principle, the derived master curves can be used to design the macroscopic stiffness, strength and Poisson's ratio of open pore materials by adjusting the connectivity of the material, leading to a prescriptive approach.

#### Predictive

Effective macroscopic mechanical properties of a material with a given microstructure can be predicted via computational homogenization. This is another typically time-consuming task in materials mechanics, for which machine learning techniques such as ANNs have recently been proposed as a viable and computationally efficient alternative (Le et al., 2015). Predicting the mechanical properties of a material depending on its processing can be challenging even with computational methods, because an accurate physical model that could reliably link processing parameters and materials properties is often lacking. In such cases, artificial neural networks are often used to predict mechanical properties based on mechanical models and experimental data (Chopra et al., 2016).

The prediction of rising or falling material hardness, based on residual stresses and contact pressure in spherical indentation tests, was investigated by several groups via experiments and FEM simulations. Heerens et al. (2009) presented a model that allows computing the change in hardness for arbitrary in-plane biaxial residual stress states, including the special cases of uniaxial, equibiaxial, and pure shear residual stress. Relevant for this review is the way this model was found. Based on a 3D FEM model, hardness training patterns were generated for randomly chosen elastic-plastic materials with nonlinear work hardening. For each pattern, a pair of data with and without residual stresses was computed. It turned out that an ANN could easily predict the increase or decrease of the hardness relative to the material without residual stress when the two in-plane residual stress components σ<sub>1</sub>, σ<sub>3</sub>, and the average contact pressure σ<sub>r</sub> are given as inputs. When this happens, there is a high chance that the underlying relationship Π(σ<sub>1</sub>, σ<sub>3</sub>, σ<sub>r</sub>) can be represented by a simple model. Motivated by this, the data was systematically analyzed with respect to the interdependencies using an ANN. To this end, physical knowledge was incorporated in the formulation of the ANN inputs and output. The authors studied the ANN prediction error by feeding the following information as inputs: the ratio of indentation depth to spherical indenter radius h/R, the normalized Mises stress σ<sub>f</sub>/σ<sub>r</sub>, and the normalized hydrostatic pressure p/σ<sub>r</sub>. Although not explicitly illustrated in the original contribution by the authors, a sequential omission of single inputs reveals an error pattern in the predicted output that is specific to each input, see **Figure 10**. Therefore, as the major outcome of the applied machine learning approach, it can be concluded that both the normalized von Mises stress and hydrostatic pressure are equally important to solve the problem. The model published by Heerens et al. (2009) is based on this insight and would not exist without the intermediate step of using the ANN. While the ANN was descriptive and limited to the range of training data, the derived model is general and predictive.
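
The sequential omission of single inputs can be mimicked with any regression model. The sketch below deliberately swaps the ANN for a 1-nearest-neighbor regressor to stay dependency-free, and it uses synthetic data in which the output depends on the first two inputs only; data, model and function names are assumptions and do not reproduce the study of Heerens et al. (2009).

```python
import random

def nn_predict(train_x, train_y, x):
    """1-nearest-neighbor regression: return the target of the closest sample."""
    best = min(range(len(train_x)),
               key=lambda i: sum((u - v) ** 2 for u, v in zip(train_x[i], x)))
    return train_y[best]

def mean_abs_error(train_x, train_y, test_x, test_y, keep):
    """Test error when only the input columns listed in `keep` are used."""
    def pick(row):
        return [row[j] for j in keep]
    reduced = [pick(r) for r in train_x]
    errors = [abs(nn_predict(reduced, train_y, pick(r)) - y)
              for r, y in zip(test_x, test_y)]
    return sum(errors) / len(errors)

# Hypothetical data: the output depends on inputs 0 and 1 but not on input 2.
rng = random.Random(0)

def sample(n):
    xs = [[rng.random() for _ in range(3)] for _ in range(n)]
    return xs, [x[0] + x[1] for x in xs]

train_x, train_y = sample(200)
test_x, test_y = sample(50)

# Omit one input at a time and record the resulting test error.
error_without = {
    j: mean_abs_error(train_x, train_y, test_x, test_y,
                      keep=[i for i in range(3) if i != j])
    for j in range(3)
}
```

Omitting a relevant input raises the error markedly, whereas omitting the irrelevant one does not; comparing such error patterns is one simple way to judge which inputs a simplified model has to retain.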

Ghosh et al. (2014) used a multilayer neural network to predict the porosity, the yield strength, the ultimate tensile strength and the elongation of aluminum alloys during solidification, based on input parameters such as solidus velocity, initial hydrogen content as well as spatial and temporal thermal gradients. The training error and number of cycles were reduced via numerical optimization of the ANN training structure by using the quasi-Newton Broyden-Fletcher-Goldfarb-Shanno algorithm (Nocedal and Wright, 2006). Good agreement between ANN predictions and empirically determined mechanical properties was obtained.

The effective stiffness of high contrast elastic composites is predicted by Yang et al. (2018), based on a deep learning approach. They used a multi-layered CNN with a rectified linear unit (ReLU) function for neuronal activation to model linkages between microstructure and mechanical properties at the macroscale. The architecture of a CNN, as shown in **Figure 11**, usually consists of a convolutional layer for objective extraction of important features from two or three-dimensional images, followed by a pooling layer for reducing feature map dimensions and a fully connected layer before concluding with the output layer that consists of one node, yielding the anticipated material property.
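
The layer sequence described above (convolution, ReLU activation, pooling, fully connected layer, single output node) can be traced in a forward pass on a tiny image. The filter, the weights and the 5 × 5 input are hand-picked placeholders rather than trained values, and the sketch is not the architecture of Yang et al. (2018).

```python
def conv2d_valid(image, kernel, bias=0.0):
    """Valid 2D cross-correlation of a single-channel image with one filter."""
    kh, kw = len(kernel), len(kernel[0])
    rows, cols = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[bias + sum(kernel[a][b] * image[i + a][j + b]
                        for a in range(kh) for b in range(kw))
             for j in range(cols)] for i in range(rows)]

def relu(fmap):
    """Rectified linear unit applied element-wise to a feature map."""
    return [[max(0.0, v) for v in row] for row in fmap]

def max_pool(fmap, size=2):
    """Non-overlapping max pooling that reduces the feature map dimensions."""
    return [[max(fmap[i + a][j + b] for a in range(size) for b in range(size))
             for j in range(0, len(fmap[0]) - size + 1, size)]
            for i in range(0, len(fmap) - size + 1, size)]

def dense(features, weights, bias=0.0):
    """Fully connected layer with a single output node."""
    return bias + sum(w * f for w, f in zip(weights, features))

def cnn_forward(image, kernel, weights):
    pooled = max_pool(relu(conv2d_valid(image, kernel)))
    flat = [v for row in pooled for v in row]
    return dense(flat, weights)

# Placeholder input and parameters (assumed, not trained).
image = [[1.0] * 5 for _ in range(5)]
kernel = [[1.0, 1.0], [1.0, 1.0]]
weights = [0.25, 0.25, 0.25, 0.25]
stiffness = cnn_forward(image, kernel, weights)
```

In an actual model, many filters per layer and all weights would be learned from pairs of microstructure images and computed effective properties; the sketch only shows how the data flows from image to scalar output.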

Cang et al. (2018) enhanced the accuracy of predicting mechanical properties of heterogeneous materials from image data by using a CNN in combination with a morphology-aware generative model. The generative machine learning model was used at low computational cost to generate artificial but authentic material samples, which are required when only a limited set of original data, typically from experiments, is available for training. Morphology constraints lead to a morphology distribution of the generated samples that is identical to that of the original data. Through a comparison, it could be shown that this material property distribution matched the original material property distribution better than the one generated with a state-of-the-art Markov Random Field model (Li, 1995); hence an improvement of a predictive structure-property model was reached.

right: omitting indentation depth to spherical indenter radius h/R; bottom left: omitting normalized Mises stress σ<sub>f</sub>/σ<sub>r</sub>; bottom right: omitting normalized hydrostatic pressure p/σ<sub>r</sub>. Omitting an input is visible in a specific pattern of error in the predicted output.

#### Prescriptive

The identification of material parameters for constitutive models is commonly the key for optimization of processes and for designing parts that undergo complex loading histories. Irrespective of whether a deterministic or stochastic optimization algorithm is used, the aim is a set of parameters that corresponds to the best fit. However, it is usually unclear whether the result is unique. Huber and Tsakmakis (2001) developed a neural network tool that allowed identifying the material parameters of a finite deformation viscoplasticity model with static recovery. This complex inverse problem was solved in a very general way by enriching the information fed to the machine learning approach via a specifically designed loading history for cyclic loading with different loading rates and inserted relaxation phases. The identification process was split into a sequence of specialized ANNs, which used the results from the previous steps. All inputs and outputs are defined in dimensionless form. The outputs are normalized by measurable quantities that incorporate a priori knowledge, wherever possible, via a simple estimate of the desired output. This improves the accuracy of the ANN considerably, because the approximation task is reduced to the correction of the estimate. In this way, the Young's modulus was determined first, then the equilibrium behavior of the nonlinear isotropic and kinematic hardening rules, and finally the parameters responsible for viscosity and static recovery terms. Subsequently, the nonlinear elastic-plastic deformation behavior of thin Al films was identified by Huber et al. (2002) from nanoindentation experiments. In contrast to the common rule of maximum 10% indentation depth, the required additional information for a unique identification was provided by purposely deep penetration of twice the film thickness. This concept made it possible to break the geometric similarity of the pyramidal indent and to enrich the input data of the ANN by sufficient independent information about the mechanical behavior of the film. To minimize the computational costs for pattern generation, a strategy was applied in which five patterns served for training and another five patterns for validation. With each training cycle, the previous validation patterns were added to the training dataset and five new validation patterns were generated. Based on this approach, 40 patterns turned out to be sufficient to achieve a comparable training and validation error. The enrichment of the input information by modifying the loading history was also key for a successful unique parameter identification based on spherical indentation tests (Huber and Tyulyukovskiy, 2004; Klötzer et al., 2006; Tyulyukovskiy and Huber, 2006). The developed identification approach was successfully applied to determine the material parameters of EUROFER 97 steel. The high quality of the identified material behavior and the prescriptive capability for generating very different loading histories was demonstrated by a comparison of the predicted stress-strain behavior with cyclic tension-compression tests from specimens made from the same material.

Conduit et al. (2017) utilized a neural network for the design of a nickel-based polycrystalline superalloy. Specifically defined physical criteria were fulfilled by the approach; therefore, modeling, discovering and optimizing novel alloys with respect to required design specifications was possible. Experimental validation of parameters, such as the yield stress, showed that the relevant properties for a particular application were improved in comparison to commercially available materials through the prescriptive discovery of a material composition that is most suited for the particular use case. Examples of the successful application of this approach are the development of a new nickel-based superalloy for high temperature application (Conduit et al., 2014) as well as the predictive and prescriptive design of a molybdenum-based alloy that fulfills desired requirements such as yield stress and hardness properties for a die-forging application (Conduit et al., 2018).

# PERFORMANCE

When materials are exposed to loads that depend significantly on the temporal scale, the performance of the material, such as fatigue and failure, becomes highly relevant. Specific material behavior that eventually leads to fatigue is governed by phenomena such as crack initiation, growth, and coalescence under static and cyclic loading, among others. Using machine learning approaches for the identification of linkages to fracture initiation (Jha et al., 2018), crack growth (Younis et al., 2018) as well as fatigue life performance (Paulson et al., 2018) is essential for choosing and designing the best characteristics along the process-structure-property-performance chain.

#### Descriptive

To uncover and quantify relevant microstructural factors influencing the fatigue behavior, Jha et al. (2018) used a data-analytics approach based on principal component analysis (PCA) (Jolliffe, 1986) and fuzzy c-means (FCM) clustering (Bezdek et al., 1984). Through crystal plasticity finite element (CPFEM) calculations of RVEs that are statistically representative of Ti-6242S microstructures, 33 different metrics (slip and geometry) for 25 grains, as well as for their neighborhood (8), were determined. To predict early fatigue crack growth, the Fatemi–Socie fatigue indicator parameter (FIP) (Fatemi and Socie, 1988) was calculated from the CPFEM results as well. Jha et al. (2018) showed that considering single metrics/factors alone is not sufficient to determine or to rate their influence on the fatigue behavior. Thus, linear PCA was used to reveal the influence of the different metrics on the FIP. This is achieved by analyzing the FIP value as a function of the principal components and identifying the critical regions of principal components showing a high FIP value. Afterwards, the contribution of each metric to the principal component (variable coefficients) leading to the critical regions is obtained. By this analysis, the authors could conclude that the "microstructural configuration with high FIP roughly corresponds to a combination of α particles oriented to produce high normal stress on the basal plane and a neighborhood that imposes high shear strain" (Jha et al., 2018). The authors showed that the suggested data analysis could reveal contributions of several parameters that would be impossible to identify by direct analysis or by experimental characterization alone. To "reveal unique microstructural configurations" (Jha et al., 2018) leading to high FIP values, and the occurrence rate of such configurations, a clustering analysis in principal component space was performed. For this purpose, kernel-based PCA in combination with FCM data clustering is applied. The results of this analysis showed that only certain configurations have a high FIP, appearing at a low occurrence rate, as expected from experimental observations.
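
In contrast to hard clustering, fuzzy c-means assigns each data point a degree of membership in every cluster. The sketch below applies the standard FCM update rules to a handful of hypothetical scores in a two-dimensional principal component space; the data, the fuzzifier m = 2 and the initial centers are assumptions, and the preceding (kernel) PCA step is not included.

```python
def memberships(points, centers, m):
    """Membership u[j][i] of point j in cluster i (each row sums to one)."""
    u = []
    for p in points:
        # Euclidean distances to all centers, floored to avoid division by zero.
        d = [max(sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5, 1e-12)
             for c in centers]
        u.append([1.0 / sum((di / dk) ** (2.0 / (m - 1.0)) for dk in d)
                  for di in d])
    return u

def fuzzy_c_means(points, centers, m=2.0, iterations=100, tol=1e-9):
    """Alternate membership and center updates until the centers stop moving."""
    dim = len(points[0])
    for _ in range(iterations):
        u = memberships(points, centers, m)
        # Center update: mean of all points weighted by membership^m.
        new_centers = []
        for i in range(len(centers)):
            w = [row[i] ** m for row in u]
            new_centers.append(tuple(
                sum(wj * p[x] for wj, p in zip(w, points)) / sum(w)
                for x in range(dim)))
        shift = max(abs(a - b)
                    for c, nc in zip(centers, new_centers)
                    for a, b in zip(c, nc))
        centers = new_centers
        if shift < tol:
            break
    return memberships(points, centers, m), centers

# Hypothetical scores of six configurations in principal component space.
scores = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (3.0, 3.1), (3.2, 2.9), (2.9, 3.0)]
u, centers = fuzzy_c_means(scores, centers=[scores[0], scores[-1]])
```

The membership values make it possible to flag configurations that clearly belong to one group as well as borderline cases, which a hard assignment would hide.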

Corrosion is another highly complex mechanism. It is strongly influenced by alloy composition and microstructure, but also by the environmental conditions under which the alloy has to bear mechanical loads. Metallic biomaterials made from Mg alloys have the potential to be biodegradable. For implants in the form of screws and plates, the degradation rate needs to be designed such that the implant bears the load until the bone has sufficiently healed and takes over the mechanical load. The challenge is the large number of parameters in conjunction with the long duration of a corrosion test. Based on a very limited number of 69 samples, Willumeit et al. (2013) applied an ANN to first analyze the most important parameters and then visualize and identify the dependencies on all parameters under investigation. As the most important outcome, it was found that the concentrations of NaCl and CO<sub>2</sub> are the two most important factors that control the corrosion rate. While the first was well known, the second was revealed by this study. This finding is particularly important because the CO<sub>2</sub> concentration differs significantly between in vitro and in vivo experiments. The trained ANN makes it possible to design further experiments in specific areas of interest as well as to quantitatively predict the corrosion rate for given environmental parameters.

#### Predictive

In general, health monitoring and lifetime prediction for engineering structures has traditionally been a largely data-driven area. Recent progress in Bayesian methods and machine learning, in particular artificial neural networks, has motivated a considerable number of publications introducing new data-driven approaches for lifetime prediction (Freitag et al., 2009; Silverio Freire Júnior et al., 2009; Figueira Pujol and Andrade Pinto, 2011; Sikorska et al., 2011; Mosallam et al., 2016).

Machine learning has proven beneficial for lifetime predictions in particular for systems where accurate physical models for mechanical analysis are absent so far. A typical example is lifetime prediction for interfaces (e.g., Jia and Davalos, 2006). To predict fatigue properties not only on the microscopic material level but rather on the system level including factors such as macroscopic geometry, Wang et al. (2016) proposed a framework based on artificial neural networks. For a review of machine learning approaches specifically for crack growth prediction, the reader is referred to Wang et al. (2017).

Vassilopoulos et al. (2007) used ANNs to predict the fatigue life of composite materials based on only approximately half of the experimental data usually required for the analysis. Thus, stress-cycle (S-N) curves and constant life diagrams (CLDs), which are helpful for structural design, could be generated more efficiently and with satisfactory accuracy for 10<sup>4</sup>–10<sup>7</sup> cycles. The loading conditions investigated, modeled and validated were tension-tension, tension-compression and compression-compression. The R-ratio refers to the different loading amplitudes imposed onto the specimens (Schijve, 2001). The approach was validated for two different glass-fiber reinforced polymers (GFRP) with dissimilar laminate sequences, as shown in **Figure 12**.
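
For orientation, the three loading conditions can be related to the R-ratio under the common convention R = σ<sub>min</sub>/σ<sub>max</sub> for a constant-amplitude cycle, with tensile stresses counted as positive. The short sketch below assumes this convention; the stress values are hypothetical and not taken from Vassilopoulos et al. (2007).

```python
def r_ratio(sigma_min, sigma_max):
    """Stress ratio R = sigma_min / sigma_max of a constant-amplitude cycle."""
    if sigma_max == 0:
        raise ValueError("R is undefined for sigma_max = 0")
    return sigma_min / sigma_max

def loading_regime(sigma_min, sigma_max):
    """Classify a cycle by the signs of its extreme stresses (tension positive)."""
    if sigma_min > sigma_max:
        raise ValueError("sigma_min must not exceed sigma_max")
    if sigma_min >= 0:
        return "tension-tension"
    if sigma_max <= 0:
        return "compression-compression"
    return "tension-compression"

# Hypothetical cycles given as (sigma_min, sigma_max) in MPa.
cycles = [(10.0, 100.0), (-100.0, 100.0), (-100.0, -10.0)]
summary = [(r_ratio(lo, hi), loading_regime(lo, hi)) for lo, hi in cycles]
```

Under this convention, tension-tension cycles have 0 ≤ R < 1, tension-compression cycles have R < 0, and compression-compression cycles have R > 1, which is why a single R value per test series is a compact input for a data-driven fatigue model.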

For predicting the fatigue crack growth in aluminum alloys, Zhi et al. (2016) utilized a recurrent neural network. The linkage between applied stress load and the resulting crack growth within the material was approximated via feedback loops at the output layer. As a result, the fine crack growth evolution could be accurately simulated, as validated by experiments.

Wang et al. (2017) compared three different machine learning approaches for predicting fatigue crack growth within aluminum alloys. A three-layered, fully connected feed-forward neural network (one hidden layer) proved advantageous over both a radial basis function network (RBFN) and a genetic-algorithm-optimized back-propagation network (GABP): its optimization and extrapolation results agreed best with the experimentally obtained data.

To predict the fatigue strength of numerous steels based on composition and process parameters, Agrawal et al. (2014) applied established machine learning techniques, such as feature selection, regression and classification through the use of artificial neural networks, decision trees, and multivariate polynomial regression. To evaluate the capability of predicting the fatigue strength of steels, the most promising parameters were ranked accordingly. Identifying the salient linkages between composition, processing and properties was realized by using the open access material database MatNavi from the Japan National Institute for Material Science (NIMS)<sup>4</sup> (Ogata and Yamazaki, 2012). It was shown that the most appropriate predictive modeling technique can vary depending on the steel type. "Hierarchical predictive modeling" was used for sequential processing of the data at different scales, starting with an initial classification to determine the steel type and followed by choosing and applying the most appropriate method for the particular steel grade to predict the fatigue strength (Agrawal et al., 2014).

The predictive modeling techniques successfully applied to the fatigue strength of steels were developed further into an open access online tool by Agrawal and Choudhary (2018). The so-called Steel-Fatigue-Strength-Predictor (Agrawal and Choudhary, 2016) uses data-driven ensemble data mining based on composition and process characteristics of steels to predict their fatigue strength<sup>5</sup>. Datasets on the fatigue behavior of steel were again taken from the MatNavi materials database and built into a forward process-structure-property-performance model. The framework provides a selection of 40 different modeling techniques that are selected based on the specific properties of the steel alloy(s) of interest. To identify composition and processing parameters that have a significant impact on the fatigue strength, feature selection techniques were applied to determine a small subset of the corresponding attributes. Thus, the model with the highest accuracy is tailored to the data of the specific material in the Steel-Fatigue-Strength-Predictor, providing insight into design preferences for optimal fatigue strength of parts made of various types of steel.

To identify and predict the impact of the microstructure on the high-cycle fatigue performance, a data-driven mechanical model of a matrix using crystal plasticity was built by Smith et al. (2016) and Kafka et al. (2018) for the specific application of manufacturing drawn tubes of nickel titanium for arterial stents, as shown in **Figure 13**. The fatigue crack incubation life is simulated for a particular microstructure exposed to high-cycle fatigue loads. Via a parametric study, the authors showed that the width of included voids in the material was inversely proportional to the fatigue life, whereas the diameter of the voids was directly proportional to the predicted fatigue performance.

For predicting the dynamic fracture growth and coalescence in brittle materials and to foresee failure, Moore et al. (2018) used random forests and decision trees, whereas Schwarzer et al. (2019) utilized recurrent graph convolutional neural networks. The overall aim of both studies was to bypass computationally intensive simulations for predicting fracture evolution. By applying machine learning approaches whose training is based on high accuracy finite-discrete simulations, prediction of fracture growth statistics was achieved in a few seconds. Even though training with simulation results from a microscale model requires computational effort, the predictions of fracture growth statistics and time to failure of the material via the trained random forest or neural network were very efficient. Moore et al. (2018) disregarded ANNs despite an increase in accuracy of about 10% compared to random forests, because of the extensive amount of data required to prevent over-fitting and the accompanying computational effort. Instead, they used random forests and decision trees to efficiently perform an uncertainty quantification. Schwarzer et al. (2019) circumvented the challenge of needing a large, statistically meaningful experimental dataset by using a significant number of simulations; thus, the accuracy could be increased. For that, a deep neural network was used; specifically, a graph convolutional network for fracture feature recognition within the material. Subsequently, a recurrent neural network was utilized for modeling the corresponding feature evolution. Training was achieved with a set of time series of graphs that represent the results of a total of 145 simulations. Based on their initial state, the evolution of multiple material properties can be predicted simultaneously. The error for the averaged fracture size prediction in comparison to the corresponding simulation results was 2%, for the fracture size distribution, the error was 13%, and for the predicted time to failure, the absolute error was 15%. Due to the relatively modest size of the initial dataset used for training, further accuracy was achieved by training the network on incorrect predictions that were previously produced as output. As a result, the loss function could be further reduced; thus, the network learned from its mistakes and the error of the predictions could be gradually decreased.

<sup>4</sup>https://mits.nims.go.jp/index\_en.html

<sup>5</sup>http://info.eecs.northwestern.edu/SteelFatigueStrengthPredictor.

# Prescriptive

Performance properties of components that highly depend on an ideal design of the part can also be optimized via an improved non-dominated sorting genetic algorithm II (NSGA-II) (Deb et al., 2002), as shown by Wang et al. (2011) for the multi-objective optimization of wind turbine blades, specifically with respect to maximum power coefficient and minimum blade mass. Wang et al. (2011) modified NSGA-II by incorporating the controlled elitism and dynamic crowding distance (DCD) methods. Ultimately, the design of a 5-megawatt wind turbine blade was optimized by increasing its performance while simultaneously decreasing its mass. Further examples of prescriptive approaches that strongly touch the field of control theory can be found in Padhye and Deb (2011), Gao et al. (2016), and Klancnik et al. (2016), but a discussion of these is beyond the scope of the current paper.
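The core ranking step of NSGA-type algorithms, sorting candidate designs into successive non-dominated (Pareto) fronts, can be illustrated with a minimal sketch. It uses a straightforward repeated-filtering scheme rather than the fast non-dominated sort of Deb et al. (2002), and it includes neither controlled elitism nor dynamic crowding distance. The blade designs and their values are hypothetical and serve only to show the two-objective setting (maximize power coefficient, minimize mass).

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in all objectives and better in at least one.

    All objectives are assumed to be minimized.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated_fronts(points: List[Tuple[float, ...]]) -> List[List[int]]:
    """Sort point indices into successive Pareto fronts (front 0 is the best)."""
    remaining = list(range(len(points)))
    fronts = []
    while remaining:
        front = [
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        ]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts


# Hypothetical designs: (power coefficient, blade mass in tonnes).
designs = [(0.48, 18.0), (0.45, 16.0), (0.44, 19.0), (0.48, 17.0), (0.40, 16.5)]

# Negate the power coefficient so that both objectives are minimized.
objectives = [(-cp, mass) for cp, mass in designs]
fronts = non_dominated_fronts(objectives)
```

For these toy values, designs 1 and 3 form the first front: neither can be improved in one objective without worsening the other.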

# SUMMARY AND OUTLOOK

In conclusion, it was shown that numerous machine learning approaches are already applied successfully within the field of continuum materials mechanics, either alone or in various combinations, for performing tasks that are descriptive, predictive or prescriptive in nature. As a result, the discovery and development of novel materials can be accelerated through highly reliable descriptions, predictions or prescriptions of anticipated characteristics along the process-structure-property-performance chain, often in a scale-bridging manner.

Machine learning and data mining approaches need to be established as standard tools for scientists and engineers who are experts in generating data through experiments and numerical analyses and who sit at the key position to their data sources, with the best and most comprehensive access to them (Agarwal and Dhar, 2014). This can be achieved when materials scientists and engineers collaborate closely with computer and data scientists, statisticians, physicists and other experts across various fields to further incorporate data science tools into established workflows for solving problems in materials mechanics and engineering. One example is the combination of data-driven machine learning and statistical approaches with traditional constitutive model-based simulation tools to perform data-driven simulations (Ibáñez et al., 2018b). In particular, data-based and physics-based modeling can complement one another, in the sense that combining purely data-driven approaches with well-tested physics-based models built on known constitutive equations leads to highly reliable and efficient hybrid analytics and simulations.

Award-winning algorithms in other learning domains as diverse as handwritten digit recognition [e.g., using the standardized MNIST dataset as a benchmark (LeCun et al., 1999) and AlphaGo for the Go board game (Silver et al., 2017)] are often noteworthy not only for their performance, but moreover for their simplicity and lack of complex architectures. It is thus reasonable to predict that for materials science, learning algorithms will diverge in both directions, with more composite methods employing many of the algorithms expounded in this review, and with simple learning architectures that will have been proven adaptable and performant within the scope of most materials research. In this regard, so-called capsules, which are embedded in multiple layers within a neural network, exhibit potential for future use, as they achieve improved results on the MNIST handwritten digit database in terms of highly overlapping digit recognition with state-of-the-art performance in comparison to CNNs (Sabour et al., 2017). Thus, they will also be useful in areas of materials mechanics. Reinforcement learning is another example of a promising method for future application within continuum materials mechanics. A successful application on materials outside continuum material mechanics was provided by Popova et al. (2018) on applying deep reinforcement learning for the design of a chemical library, in particular, to the de novo design of drug molecules with specifically desired properties. Thus, in general, reinforcement learning is a suitable approach where materials are involved and decision-making is required.

As machine learning and data mining are fueled by data, the availability of useful and comprehensive datasets to machine learning experts within the field of continuum materials mechanics needs to be increased by establishing common data infrastructures and shared databases. One noteworthy difference between materials mechanics and other, more traditional machine learning domains is the comparative expense of obtaining training data, either experimentally or via simulation. Such simulations can be prohibitively expensive, which may require new methods of coupling materials simulations with machine learning, for example via multi-fidelity models for generating training data (Aydin et al., 2019), which have been shown to realize significant computational savings. Because data collection and assimilation can incur significant costs, materials data management is important for data-driven approaches. Knowing how to store, archive, retrieve and share reliable data is essential. So is metadata that describes the context and content of the data, including when, where, how and under what conditions the data was created and what data processing has already been performed; such metadata increases the usefulness of shared data. Additionally, online tools designed for research collaboration across disciplines can help to integrate novel machine learning and data mining tools into existing workflows. In addition to making decisions along the materials development process based on empirical knowledge and instinct, experts would benefit greatly, in terms of cost and time reduction, from incorporating data-driven approaches such as machine learning into the development of materials and their processing. That way, knowledge gained from investigations, whether they succeeded or failed, can be recorded, stored, accessed and transferred to other challenges, which is extremely valuable because it saves costs (Kalidindi and Graef, 2015).

Open source tools such as the highly abstracted neural network library Keras<sup>6</sup>, which works within the framework of other libraries such as TensorFlow<sup>7</sup>, empower scientists and engineers to efficiently use machine learning and data mining tools that are implemented in the programming language Python, the de facto standard programming language for machine learning. The good readability of Python's syntax makes these tools more convenient and accessible for newcomers to machine learning and data mining (Chollet, 2018). Another example of a platform for shared computing and data resources is UNICORE<sup>8</sup>, a mature software interface that provides access to a computing and data infrastructure including high-performance computing, clusters and file systems (Benedyczak et al., 2016).

Ultimately, proper usage of machine learning and data mining approaches has to be practiced and made easily accessible to materials researchers and engineers in order to enable employment across the range from theoretical foundations to practical applications. Furthermore, synergies between various disciplines such as data science and materials science still hold a substantial potential for applying machine learning tools most efficiently to face particular challenges within the field of materials mechanics.

# AUTHOR CONTRIBUTIONS

FB and BK conceptualized and created the basic structure of the article. FB wrote the main part and assembled all of the manuscript. All authors contributed substantially to the comprehensiveness of this review and wrote sections of the manuscript. All authors read, discussed, and revised the article and approved the submitted version.

<sup>6</sup>https://keras.io/

<sup>7</sup>https://www.tensorflow.org/

<sup>8</sup>https://www.unicore.eu/

#### FUNDING

FB and BK acknowledge support from the Helmholtz-Association via an ERC-Recognition-Award under contract number ERC-RA-0022. From CC and NH, support from Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), Projektnummer 192346071, SFB 986, is acknowledged. SK acknowledges support from NIST 70NANB18H039.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Bock, Aydin, Cyron, Huber, Kalidindi and Klusemann. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Experiment Design Frameworks for Accelerated Discovery of Targeted Materials Across Scales

Anjana Talapatra<sup>1</sup>, Shahin Boluki<sup>2</sup>, Pejman Honarmandi<sup>1</sup>, Alexandros Solomou<sup>3</sup>, Guang Zhao<sup>2</sup>, Seyede Fatemeh Ghoreishi<sup>4</sup>, Abhilash Molkeri<sup>1</sup>, Douglas Allaire<sup>4</sup>, Ankit Srivastava<sup>1</sup>, Xiaoning Qian<sup>2</sup>, Edward R. Dougherty<sup>2</sup>, Dimitris C. Lagoudas<sup>1,3</sup> and Raymundo Arróyave<sup>1</sup>\*

<sup>1</sup> Department of Materials Science & Engineering, Texas A&M University, College Station, TX, United States, <sup>2</sup> Department of Electrical & Computer Engineering, Texas A&M University, College Station, TX, United States, <sup>3</sup> Department of Aerospace Engineering, Texas A&M University, College Station, TX, United States, <sup>4</sup> Department of Mechanical Engineering, Texas A&M University, College Station, TX, United States

#### Edited by:

Christian Johannes Cyron, Hamburg University of Technology, Germany

#### Reviewed by:

Miguel A. Bessa, Delft University of Technology, Netherlands Zhongfang Chen, University of Puerto Rico, Puerto Rico

> \*Correspondence: Raymundo Arróyave rarroyave@tamu.edu

#### Specialty section:

This article was submitted to Computational Materials Science, a section of the journal Frontiers in Materials

> Received: 07 February 2019 Accepted: 05 April 2019 Published: 24 April 2019

#### Citation:

Talapatra A, Boluki S, Honarmandi P, Solomou A, Zhao G, Ghoreishi SF, Molkeri A, Allaire D, Srivastava A, Qian X, Dougherty ER, Lagoudas DC and Arróyave R (2019) Experiment Design Frameworks for Accelerated Discovery of Targeted Materials Across Scales. Front. Mater. 6:82. doi: 10.3389/fmats.2019.00082

Over the last decade, there has been a paradigm shift away from labor-intensive and time-consuming materials discovery methods, and materials exploration through informatics approaches is gaining traction at present. Current approaches are typically centered around the idea of achieving this exploration through high-throughput (HT) experimentation/computation. Such approaches, however, do not account for the practicalities of resource constraints, which eventually result in bottlenecks at various stages of the workflow. Regardless of how many bottlenecks are eliminated, the fact that ultimately a human must make decisions about what to do with the acquired information implies that HT frameworks face hard limits that will be extremely difficult to overcome. Recently, this problem has been addressed by framing the materials discovery process as an optimal experiment design problem. In this article, we discuss the need for optimal experiment design and the challenges in its implementation, and finally present some successful examples of materials discovery via experiment design.

Keywords: materials discovery, efficient experiment design, Bayesian Optimization, information fusion, materials informatics, machine learning

# 1. INTRODUCTION

Historically, the beginning of materials research centered around learning how to use the elements and minerals discovered in nature. The chief challenge at the time was the separation of the pure metal from the mined ore, which led over time to the science of metallurgy, the foundation of current-day materials research. Humans then discovered that these pure metals could be combined to form alloys, followed by the principles of heat treatment. These advances shaped history, since the ability to invent new and exploit known techniques to use metals and alloys to forge weapons for sustenance and defense was instrumental in the success, expansion and migration of early civilizations. Additionally, there is evidence that the oft-quoted sequence of copper, tin bronze and iron, which lend their names to the "ages" of human progress, occurred in different parts of the world, sometimes even simultaneously (Tylecote and Tylecote, 1992). Thus, the desire to harvest materials from nature and use them to improve the quality of life is a uniquely human as well as universal trait.

With the acceleration of scientific advances over the last few centuries, mankind has moved on from developing applications based on available materials to demanding materials to suit desired applications. Science and technology are continuously part of this contentious chicken-and-egg situation, in which it is folly to prioritize either scientific knowledge for its own sake or the use of science as just a tool to fuel new applications. Regardless, the majority of scientific breakthroughs from the materials perspective have resulted from an Edisonian approach and have been guided primarily by experience, intuition and, to some extent, serendipity. Further, bringing the possibilities suggested by such discoveries to fruition takes decades and considerable financial resources. Also, such approaches, when successful, enable the investigation of only a very small fraction of a given materials design space, leaving vast possibilities unexplored. No alchemic recipes exist, however desirable, which, given a target application and desired properties, enable one to design the optimized material for that application. However, to tread some distance on that alchemic road, extensive recent work has centered on the accelerated and cost-effective discovery, manufacturing, and deployment of novel and better materials, as promoted by the Materials Genome Initiative (Holdren, 2011).

# 1.1. Challenges in Accelerated Materials Discovery Techniques

The chief hurdle when it comes to searching for new materials with requisite or better properties is the scarcity of physical knowledge about the class of materials that constitute the design space. Data regarding the structure and resultant properties may be available, but what is usually lacking is the fundamental physics that delineates the processing-structure-property-performance (PSPP) relationships in these materials. Additionally, the interplay of structural, chemical and microstructural degrees of freedom introduces enormous complexity, which exponentially increases the dimensionality of the problem at hand, limiting the application of traditional design strategies.

To bypass this challenge, the current focus of the field is on the use of data-to-knowledge approaches, the idea being to implicitly extract the material physics embedded in the data itself with the use of modern-day tools: machine learning, design optimization, manufacturing scale-up and automation, multi-scale modeling, and uncertainty quantification with verification and validation. Typical techniques include the utilization of High-Throughput (HT) computational (Strasser et al., 2003; Curtarolo et al., 2013; Kirklin et al., 2013) and experimental frameworks (Strasser et al., 2003; Potyrailo et al., 2011; Suram et al., 2015; Green et al., 2017), which are used to generate large databases of materials feature/response sets, which then must be analyzed (Curtarolo et al., 2003) to identify the materials with the desired characteristics (Solomou et al., 2018). HT methods, however, fail to account for constraints on the available experimental/computational resources, nor do they anticipate the existence of bottlenecks in the scientific workflow that unfortunately render impossible the parallel execution of specific experimental/computational tasks.

Recently, the concept of optimal experiment design, within the overall framework of Bayesian Optimization (BO), has been put forward as a design strategy to circumvent the limitations of traditional (costly) exploration of the design space. This was pioneered by Balachandran et al. (2016), who put forward a framework that balanced the need to exploit available knowledge of the design space with the need to explore it by using a metric (Expected Improvement, EI) that selects the best next experiment with the end goal of accelerating the iterative design process. BO-based approaches rely on the construction of a response surface of the design space and are typically limited to the use of a single model to carry out the queries. This is an important limitation, as at the beginning of a materials discovery problem there is often not sufficient information to elucidate the feature set (i.e., model) that is most closely related to the specific performance metric to be optimized.

Additionally, although these techniques have been successfully demonstrated in a few materials science problems (Seko et al., 2014, 2015; Frazier and Wang, 2016; Ueno et al., 2016; Xue et al., 2016a,b; Dehghannasiri et al., 2017; Ju et al., 2017; Gopakumar et al., 2018), the published work tends to focus on the optimization of a single objective (Balachandran et al., 2016), which is far from the complicated multi-dimensional real-world materials design requirements.

In this manuscript, we discuss the materials discovery challenge from the perspective of experiment design, i.e., goal-oriented materials discovery, wherein we efficiently exploit available computational tools, in combination with experiments, to accelerate the development of new materials and materials systems. In the following sections, we discuss the need for exploring the field of materials discovery via the experiment design paradigm and then specifically discuss three approaches that address the prevalent limitations discussed above: (i) a framework that is capable of adaptively selecting among competing models connecting materials features to performance metrics through Bayesian Model Averaging (BMA), followed by optimal experiment design; (ii) a variant of the well-established kriging technique specifically adapted for problems where models with varying levels of fidelity related to the property of interest are available; and (iii) a framework for the fusion of information that exploits correlations among sources/models and between the sources and 'ground truth', in conjunction with a multi-information source optimization framework that identifies, given current knowledge, the next best information source to query and where in the input space to query it via a novel value-gradient policy. We also present examples of applications of these approaches in the context of single-objective and multi-objective materials design optimization problems, and of information fusion applied to the design of dual-phase materials and CALPHAD-based thermodynamic modeling.

# 2. EXPERIMENT DESIGN

The divergence of modern science from its roots in natural philosophy was heralded by the emphasis on experimentation in the sixteenth and seventeenth centuries as a means to establish causal relationships (Hacking, 1983) between the degrees of freedom available to the experimenter and the phenomena being investigated. Experiments involve the manipulation of one or more independent variables followed by the systematic observation of the effects of the manipulation on one or more dependent variables. An experiment design then, is the formulation of a detailed experimental plan in advance of doing the experiment that when carried out will describe or explain the variation of information under conditions hypothesized to reflect the variation. An optimal experiment design maximizes either the amount of 'information' that can be obtained for a given amount of experimental effort or the accuracy with which the results are obtained, depending on the purpose of the experiment. A schematic of the experiment design process is shown in **Figure 1**.

Taking into consideration the large number of measurements often needed in materials research, design problems are formulated as multi-dimensional optimization problems, which typically require training data in order to be solved. Prior knowledge regarding parameters and features affecting the desired properties of materials is of great importance. Often, however, prior knowledge is inadequate, and the presence of large uncertainty is detrimental to the experiment design. Hence, additional measurements or experiments are necessary in order to improve the predictive ability of the model with respect to the design objective. Naturally, it is then essential to direct experimental efforts such that the targeted material may be found with a minimal number of experiments. This may be achieved via an experiment design strategy that is able to distinguish between different experiments based upon the information they can provide. Thus, the experiment design strategy results in the choice of the next best experiment, which is determined by optimizing an acquisition function.

# 2.1. Experiment Design Under Model Uncertainty

In most materials design tasks, there are multiple information sources at the disposal of the materials scientist. For example, the relationships between the crystal structure and properties/performance can in principle be developed through experiments as well as (computational) models at different levels of fidelity and resolution (atomistic scale, molecular scale, continuum scale). Traditional holistic design approaches such as Integrated Computational Materials Engineering (ICME), on the other hand, often proceed on the limited and (frankly) unrealistic assumption that there is only one source available to query the design space. For single information sources and sequential querying, there are two traditional techniques for choosing what to query next in this context (Lynch, 2007; Scott et al., 2011). These are (i) efficient global optimization (EGO) (Jones et al., 1998) and its extensions, such as sequential Kriging optimization (SKO) (Huang et al., 2006) and value-based global optimization (VGO) (Moore et al., 2014), and (ii) the knowledge gradient (KG) (Gupta and Miescke, 1994, 1996; Frazier et al., 2008). EGO uses a Gaussian process (Rasmussen, 2004) representation of available data, but does not account for noise (Schonlau et al., 1996, 1998). SKO also uses Gaussian processes, but includes a variable weighting factor to favor decisions with higher uncertainty (Scott et al., 2011). KG differs in that, while it can also account for noise, it selects the next solution on the basis of the expected value of the best material after the experiment is carried out. In the case of multiple uncertain sources of information (e.g., different models for the same problem), it is imperative to integrate all the sources to produce more reliable results (Dasey and Braun, 2007). In practice, there are several approaches for fusing information from multiple models. Bayesian Model Averaging (BMA), multi-fidelity co-kriging (Kennedy and O'Hagan, 2000; Pilania et al., 2017), and fusion under known correlation (Geisser, 1965; Morris, 1977; Winkler, 1981; Ghoreishi and Allaire, 2018) are three such model fusion techniques that enable robust design. These approaches are discussed in detail in the following sections.

#### 2.1.1. Bayesian Model Averaging (BMA)

The goal of any materials discovery strategy is to identify an action that results in a desired property, which usually amounts to optimizing an objective function of the action over the Materials Design Space (MDS). In materials discovery, each action is equivalent to an input or design parameter setup. If complete knowledge of the objective function exists, then the materials discovery challenge is met. In reality, however, this objective function is a black box, of which little if anything is known, and the cost of querying such a function (through expensive experiments/simulations) at arbitrary query points in the MDS is very high. In these cases, a parametric or non-parametric surrogate model can be used to approximate the true objective function. Bayesian Optimization (BO) (Shahriari et al., 2016) corresponds to these cases, where the prior model is sequentially updated after each experiment.

Irrespective of whether prior knowledge about the form of the objective function exists and/or many observations of the objective values at different parts of the input space are available, there is an inherent feature selection step, where different potential feature sets might exist. Moreover, there might be a set of possible parametric families as candidates for the surrogate model itself. Even when employing non-parametric surrogate models, several choices for the functional form connecting degrees of freedom in the experimental space and the outcome(s) of the experiment might be available. These translate into different possible surrogate models for the objective function. The common approach is to select a feature set and a single family of models and fix this selection throughout the experiment design loop; however, this is often not a reliable approach due to the small initial sample size that is ubiquitous in materials science.

This problem was addressed by a subset of the present authors by framing experiment design as Bayesian Optimization under Model Uncertainty (BOMU), and incorporating Bayesian Model Averaging (BMA) within Bayesian Optimization (Talapatra et al., 2018). The acquisition function used is the Expected Improvement (EI) which seeks to balance the need to exploit available knowledge of the design space with the need to explore it. Suppose that f ′ is the minimal value of the objective function f observed so far. Expected improvement evaluates f at the point that, in expectation, improves upon f ′ the most. This corresponds to the following utility function:

$$I(\mathbf{x}) = \max(f' - f(\mathbf{x}), 0) \tag{1}$$

If ŷ and s are the predicted value and its standard error at x, respectively, then the expected improvement is given by:

$$E[I(\mathbf{x})] = (f' - \hat{y})\,\Phi\left(\frac{f'-\hat{y}}{s}\right) + s\,\phi\left(\frac{f'-\hat{y}}{s}\right)\tag{2}$$

where φ(·) and Φ(·) are the standard normal density and distribution functions, respectively (Jones et al., 1998). The Bayesian Optimization under Model Uncertainty approach may then be described as follows:


Incorporating BMA within Bayesian Optimization produces a system capable of autonomously and adaptively learning not only the most promising regions in the materials space but also the models that most efficiently guide such exploration.
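The expected improvement of Equation (2) is simple to evaluate once a surrogate model supplies a prediction ŷ and a standard error s for each candidate. The following minimal sketch (for a minimization problem) uses only the Python standard library. The function names, the zero-uncertainty convention and the candidate values are assumptions made for illustration and are not taken from Talapatra et al. (2018).

```python
import math


def norm_pdf(z: float) -> float:
    """Standard normal density, phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z: float) -> float:
    """Standard normal distribution function, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(f_best: float, y_hat: float, s: float) -> float:
    """Expected improvement of Equation (2) for minimization.

    f_best is the smallest objective value observed so far (f' in the text).
    For s == 0 the improvement is deterministic (assumed convention).
    """
    if s <= 0.0:
        return max(f_best - y_hat, 0.0)
    z = (f_best - y_hat) / s
    return (f_best - y_hat) * norm_cdf(z) + s * norm_pdf(z)


def next_experiment(f_best: float, predictions: dict) -> str:
    """Return the candidate whose (y_hat, s) pair maximizes the expected improvement."""
    return max(predictions, key=lambda c: expected_improvement(f_best, *predictions[c]))


# Hypothetical surrogate predictions (y_hat, s) for three candidate materials.
candidates = {
    "A": (1.0, 0.0),   # already measured, no uncertainty left
    "B": (1.2, 0.5),   # worse prediction, large uncertainty
    "C": (0.9, 0.05),  # slightly better prediction, small uncertainty
}
chosen = next_experiment(1.0, candidates)
```

With these toy numbers the uncertain candidate B is selected over the slightly better but nearly certain candidate C, which illustrates how EI trades off exploitation against exploration.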

under Model Uncertainty (BOMU). Initial data and a set of candidate models are used to construct a stochastic representation of an experiment/simulation. Each model is evaluated in a Bayesian sense and its probability is determined. Using the model probabilities, an effective acquisition function is computed, which is then used to select the next point in the materials design space that needs to be queried. The process is continued iteratively until target is reached or budget is exhausted. Used with permission from Talapatra et al. (2018).

The framework is also capable of defining optimal experimental sequences in cases where multiple objectives must be met. We note that recent works have begun to address the issue of multi-objective Bayesian Optimization in the context of materials discovery (Mannodi-Kanakkithodi et al., 2016; Gopakumar et al., 2018). Our approach, however, is different in that the multi-objective optimization is carried out simultaneously with feature selection. The overall framework for autonomous materials discovery is shown in **Figure 2**.

#### 2.1.2. Multi-Fidelity co-kriging

As discussed in Pilania et al. (2017), co-kriging regression is a variant of the well-established kriging technique specifically adapted for problems where models with varying levels of fidelity (i.e., variations both in computational cost and accuracy) related to the property of interest are available. This approach was put forward by Kennedy and O'Hagan (2000), who presented a cogent mathematical framework to fuse heterogeneous variable-fidelity information sources, paving the way for multi-fidelity modeling. This framework was then adapted by Forrester et al. (2007), who demonstrated its application in an optimization setting via a two-level co-kriging scheme. The auto-regressive co-kriging scheme may be applied to scenarios where l levels of variable-fidelity estimates are available; however, practical limitations pertaining to computational efficiency emerge when the number of levels l or the number of data points grows large. Recent work by Le Gratiet and Garnier (Le Gratiet, 2013; Le Gratiet and Garnier, 2014) showed that any co-kriging scheme with l levels of variable-fidelity information sources can be effectively decoupled, and equivalently reformulated in a recursive fashion as l independent kriging problems, thereby circumventing this limitation. This facilitates the construction of predictive co-kriging schemes by solving a sequence of simpler kriging problems of lesser complexity. In the context of materials discovery, this approach was successfully implemented by Pilania et al., who presented a multi-fidelity co-kriging statistical learning framework that combines variable-fidelity quantum mechanical calculations of bandgaps to generate a machine-learned model that enables low-cost accurate predictions of the bandgaps at the highest fidelity level (Pilania et al., 2017). Similarly, Razi et al. introduced a novel approach for enhancing the sampling convergence for properties predicted by molecular dynamics based upon the construction of a multi-fidelity surrogate model using computational models with different levels of accuracy (Razi et al., 2018).
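For orientation, the auto-regressive structure underlying the co-kriging schemes discussed above can be written compactly. The notation below (f<sub>t</sub>, ρ<sub>t−1</sub>, δ<sub>t</sub>) is introduced here for illustration only and is not taken from the cited works verbatim; it assumes a constant scaling factor between successive fidelity levels:

$$f_{t}(\mathbf{x}) = \rho_{t-1}\, f_{t-1}(\mathbf{x}) + \delta_{t}(\mathbf{x}), \qquad t = 2, \ldots, l$$

Here f<sub>t</sub> is the Gaussian process describing the response at fidelity level t, ρ<sub>t−1</sub> scales the next-lower fidelity level, and δ<sub>t</sub> is a discrepancy Gaussian process assumed independent of f<sub>t−1</sub>. Under this reading, the recursive reformulation amounts to fitting f<sub>1</sub> first and then one discrepancy model δ<sub>t</sub> per level, each as an ordinary kriging problem.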

#### 2.1.3. Error Correlation-Based Model Fusion (CMF) Approach

As mentioned earlier, model-based ICME-style frameworks tend to focus on integrating tools at multiple scales under the assumption that there is a single model/tool which is significant at a specific scale of the problem. This ignores the use of multiple models that may be more/less effective in different regions of the performance space. Data-centric approaches, on the other hand, tend to focus (with some exceptions) on the brute-force exploration of the MDS, not taking into account the considerable cost associated with such exploration.

In Ghoreishi et al. (2018), the authors presented a framework that addresses the two outstanding issues listed above in the context of the optimal micro-structural design of advanced high strength steels. Specifically, they carried out the fusion of multiple information sources that connect micro-structural descriptors to mechanical performance metrics. This fusion is done in a way that accounts for and exploits the correlations between the individual information sources (reduced-order models constructed under different simplifying assumptions regarding the partitioning of the total strain, stress or deformation work among the phases constituting the micro-structure) and between each information source and the ground truth (represented in this case by a full-field micro-structure-based finite element model). This finite element model is computationally expensive, and is considered as a higher fidelity model as part of a multi-fidelity framework, the intention being to create a framework for predicting ground truth. Specifically, the purpose of the work is not to match the highest fidelity model, but to predict the properties a material would exhibit when actually created, i.e., at ground truth. There is usually no common resource trade-off in this scenario, in contrast to traditional computational multi-fidelity frameworks that trade computational expense and accuracy.

In this framework, the impact of a new query to an information source on the fused model is of value. The search is performed over the input domain and the information source options concurrently to determine which next query will lead to the most improvement in the objective function. In addition, the exploitation of correlations between the discrepancies of the information sources in the fusion process is novel and enables the identification of ground truth optimal points that are not shared by any individual information sources in the analysis.

A fundamental hypothesis of this approach is that any model can provide potentially useful information to a given task. This technique thus takes into account all potential information any given model may provide and fuses unique information from the available models. The fusion goal then is to identify dependencies, via estimated correlations, among the model discrepancies. With these estimated correlations, the models are fused following standard practice for the fusion of normally distributed data. To estimate the correlations between the model deviations when they are unknown, the reification process (Allaire and Willcox, 2012; Thomison and Allaire, 2017) is used, which refers to the process of treating each model, in turn, as ground truth. The underlying assumption here is that the data generated by the reified model represents the true quantity of interest. These data are used to estimate the correlation between the errors of the different models and the process is then repeated for each model. The detailed process of estimating the correlation between the errors of two models can be found in Allaire and Willcox (2012) and Thomison and Allaire (2017).
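The fusion of normally distributed estimates with correlated errors mentioned above can be sketched for the simplest case of two information sources. This is a minimal illustration under stated assumptions, not the implementation of Ghoreishi et al. (2018): both sources are assumed unbiased, their error correlation `rho` is assumed already known (in the framework it would come from the reification step), and the function name and all numbers are hypothetical.

```python
def fuse_two_sources(mu1, s1, mu2, s2, rho):
    """Minimum-variance fusion of two unbiased normal estimates.

    mu1, mu2: means reported by the two sources at a design point.
    s1, s2:   standard deviations of the two estimates (both > 0).
    rho:      correlation between the errors of the two sources.
    Returns the fused mean and fused variance.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    cov = rho * s1 * s2
    denom = s1 ** 2 + s2 ** 2 - 2.0 * cov
    mean = ((s2 ** 2 - cov) * mu1 + (s1 ** 2 - cov) * mu2) / denom
    var = (1.0 - rho ** 2) * s1 ** 2 * s2 ** 2 / denom
    return mean, var


# Uncorrelated errors reduce to the familiar inverse-variance weighting.
print(fuse_two_sources(0.0, 1.0, 2.0, 1.0, rho=0.0))

# Correlated errors change the weights; here the second source adds
# nothing beyond the first, so the fused variance equals s1**2.
print(fuse_two_sources(0.0, 1.0, 1.0, 2.0, rho=0.5))
```

The second call illustrates why ignoring error correlation can overstate confidence: treating the two sources as independent would report a fused variance below that of the better source, even though no new information is available.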

A flowchart of the approach is shown in **Figure 3**. The next step then is to determine which information source should be queried and where to query it, concurrently, so as to produce the most value with the tacit resource constraint in mind. For this decision, a utility referred to as the value-gradient utility is used, which accounts for both the immediate improvement in one step and the expected improvement in two steps. The goal here is to produce rapid improvement, with the knowledge that every resource expenditure could be the last, but at the same time, to be optimally positioned for the next resource expenditure. In this sense, equal weight is accorded to the next-step value and the next-step (knowledge) gradient information, hence the term value-gradient. As mentioned in the previous section, in the BMA approach, the Expected Improvement (EI) metric is used to choose the next query point, while in this approach, the value gradient is used.

The knowledge gradient, which is a measure of expected improvement, is defined as:

$$\nu^{KG}(\mathbf{x}) = E[V^{N+1}(H^{N+1}(\mathbf{x})) - V^{N}(H^{N})|H^{N}] \tag{3}$$

where H<sup>N</sup> is the knowledge state, and the value of being at state H<sup>N</sup> is defined as V<sup>N</sup>(H<sup>N</sup>) = max<sub>x∈χ</sub> H<sup>N</sup>. The KG policy for sequentially choosing the next query is then given as:

$$\mathbf{x}^{KG} = \operatorname*{argmax}_{\mathbf{x} \in \chi} \nu^{KG}(\mathbf{x}) \tag{4}$$

and the value-gradient utility is given by:

$$U = \mu_{fused}^{*} + \max_{\mathbf{x} \in \chi} \nu^{KG}(\mathbf{x}) \tag{5}$$

where µ<sup>∗</sup><sub>fused</sub> is the maximum value of the mean function of the current fused model and max<sub>x∈χ</sub> ν<sup>KG</sup>(x) is the maximum expected improvement that can be obtained with another query as measured by the knowledge gradient over the fused model.
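To make Equations (3)–(5) concrete, the sketch below computes the knowledge gradient and the value-gradient utility for a small discrete design space. It relies on a simplifying assumption that is not part of the framework described above: beliefs about the alternatives are treated as independent normal distributions, for which the knowledge gradient has a well-known closed form. In the actual framework the fused Gaussian process is correlated across the design space, so this is only a stand-in. All names and numbers are hypothetical.

```python
import math


def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def knowledge_gradient(mu, sigma, noise_sd):
    """KG of measuring each alternative once (maximization).

    mu, sigma: current belief means and standard deviations
               (assumed independent across alternatives).
    noise_sd:  standard deviation of the measurement noise.
    """
    kg = []
    for i, (m, s) in enumerate(zip(mu, sigma)):
        if s == 0.0:
            kg.append(0.0)  # nothing left to learn about this alternative
            continue
        # Standard deviation of the change in the posterior mean.
        sigma_tilde = s * s / math.sqrt(s * s + noise_sd * noise_sd)
        best_other = max(m2 for j, m2 in enumerate(mu) if j != i)
        z = -abs(m - best_other) / sigma_tilde
        kg.append(sigma_tilde * (z * norm_cdf(z) + norm_pdf(z)))
    return kg


def value_gradient_utility(mu, sigma, noise_sd):
    """Equation (5): best current mean plus the largest KG value."""
    return max(mu) + max(knowledge_gradient(mu, sigma, noise_sd))


mu = [1.0, 1.2, 0.7]      # hypothetical belief means
sigma = [0.5, 0.1, 0.8]   # hypothetical belief standard deviations
kg = knowledge_gradient(mu, sigma, noise_sd=0.2)
x_kg = max(range(len(mu)), key=lambda i: kg[i])  # Equation (4)
print(kg, x_kg, value_gradient_utility(mu, sigma, noise_sd=0.2))
```

The utility combines what is already known (the first term) with what one more query is expected to add (the second term), which matches the equal weighting described in the text.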

# 2.2. Application of Experiment Design Framework: Examples

#### 2.2.1. Multi-Objective Bayesian Optimization

A Multi-objective Optimal Experiment Design (OED) framework (see **Figure 4**) based on Bayesian optimization techniques was reported by the authors in Solomou et al. (2018). The material to be optimized was selected to be precipitation-strengthened NiTi shape memory alloys (SMAs), since complex thermodynamic and kinetic modeling is necessary to describe the characteristics of these alloys. The specific objective of the Bayesian Optimal Experimental Design (BOED) framework in this study was to provide an Optimal Experiment Selection (OES) policy to guide an efficient search for precipitation-strengthened NiTi SMAs with selected target properties by efficiently solving a multi-objective optimization problem. The Expected Hypervolume Improvement (EHVI) (Emmerich et al., 2011) acquisition function was used to perform multi-objective optimization. EHVI balances the trade-off between exploration and exploitation for multi-objective BOED problems, similar to EI for single-objective problems. EHVI is a scalar quantity that allows a rational agent to select, sequentially, the next best experiment to carry out, given current knowledge, regardless of the number of objectives, or dimensionality, of the materials design problem. The optimal solutions in a multi-objective optimization problem are typically referred to as Pareto optimal solutions, the Pareto front, or Pareto front points. The Pareto optimal solutions in a selected multi-objective space correspond to the points of the objective space that are not dominated by any other points in the same space.
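The notion of non-dominated points can be illustrated with a short sketch. It assumes, purely for illustration, that every objective is to be minimized (for target-type objectives one could minimize the absolute deviation from the target); the function names and sample points are hypothetical.

```python
def dominates(a, b):
    """True if point a dominates point b (all objectives minimized):
    a is no worse in every objective and strictly better in at least one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Return the points that are not dominated by any other point."""
    return [p for p in points
            if not any(dominates(q, p) for q in points)]


# Hypothetical two-objective evaluations of five queried materials.
evaluated = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0), (2.0, 3.0), (3.0, 3.0)]
print(pareto_front(evaluated))
```

This pairwise check scales quadratically with the number of points, which is adequate for the small experimental budgets considered here.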

For the NiTi SMA, the considered input variables were the initial homogeneous Ni concentration of the material before precipitation (c) and the volume fraction (v<sub>f</sub>) of the precipitates, while the objective functions were functions of the material properties of the corresponding homogenized SMA. The framework was used to discover precipitated SMAs with (objective 1) an austenitic finish temperature A<sub>f</sub> = 30°C and (objective 2) a specific thermal hysteresis, defined as the difference between the austenitic finish temperature and the martensitic start temperature, A<sub>f</sub> − M<sub>s</sub> = 40°C. The problem was solved for two case studies, in which the selected continuous MDS is discretized with a coarse and a dense mesh, respectively. The refined MDS has n<sub>T</sub> = 21021 combinations of the considered variables c and v<sub>f</sub>. The utility of the materials queried by the BOED framework (OES policy) within a predefined experimental budget is compared with the utility of the materials queried following a Pure Random Experiment Selection (PRES) policy and a Pure Exploitation Experiment Selection (PEES) policy within the same budget. An experimental budget of n<sub>B</sub> = 20 material queries is assumed; for the OES and PEES policies, this budget is allocated to n<sub>I</sub> = 1 randomly queried material and n<sub>E</sub> = 19 sequentially designed experiments.

The results are shown in **Figure 5A**. It is seen that the OES policy, even under the limited experimental budget, queries materials that belong to the region of the objective space which approaches the true Pareto front. This is clear from comparing the Pareto front calculated based on the results of the OES (blue dashed line) with the true Pareto front found during case study 1 (red dotted line). The results also show that the materials queried by the PRES policy are randomly dispersed in the objective space, as expected, while the materials queried by the PEES policy are clustered in a specific region of the objective space consisting of materials with similar volume fraction values, which is anticipated given the purely exploitative nature of the policy. Further analysis demonstrates that the OES policy on average queries materials with better utility than the other two policies, while the PRES policy exhibits the worst performance.

The same performance trends are maintained in the equivalent comparisons conducted for various experimental budgets, as shown in **Figure 5**, which also shows the corresponding curves for the coarse mesh. It is apparent that if the OES policy is employed to query a material in a discrete MDS with defined variable bounds, its advantage over the PRES policy becomes more pronounced as the discretization of the MDS is refined: the gap between the PRES and OES policies for the densely discretized MDS (red lines) is much larger than that for the coarsely discretized MDS (blue lines). The results of the BOED framework thus demonstrate that the method can efficiently approach the true Pareto front of the objective space of the materials discovery problems considered. Such treatment was also carried out for a three-objective problem with the additional objective of maximizing the maximum saturation strain (H<sup>sat</sup>) that the material can exhibit, and similar conclusions were drawn.

While exact algorithms for the computation of EHVI have been developed recently (Hupkens et al., 2015; Yang et al., 2017), such algorithms are difficult to extend to problems with more than three objectives. Recently, a subset of the present authors (Zhao et al., 2018) developed a fast exact framework for the computation of EHVI with an arbitrary number of objectives by integrating a closed-form expression for the (hyper)volume of hyperrectangles with existing approaches (While et al., 2012; Couckuyt et al., 2014) to decompose hypervolumes. This framework is capable of computing EHVIs for problems with an arbitrary number of objectives with saturating execution times, as shown in **Figure 6**.
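The quantity whose expectation EHVI takes, the hypervolume improvement, is easy to compute in two objectives. The sketch below is a plain two-dimensional illustration with all objectives minimized and a user-chosen reference point; it is not the exact many-objective algorithm of Zhao et al. (2018), and the function names and numbers are hypothetical.

```python
def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded by the reference point
    `ref`, with both objectives minimized."""
    hv = 0.0
    best_f2 = ref[1]
    for f1, f2 in sorted(points):       # sweep in order of increasing f1
        if f1 >= ref[0] or f2 >= best_f2:
            continue                    # outside the box or dominated
        hv += (ref[0] - f1) * (best_f2 - f2)
        best_f2 = f2
    return hv


def hypervolume_improvement(front, candidate, ref):
    """Increase in dominated area if `candidate` were added to `front`."""
    return (hypervolume_2d(list(front) + [candidate], ref)
            - hypervolume_2d(front, ref))


front = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
ref = (4.0, 4.0)
print(hypervolume_2d(front, ref))
print(hypervolume_improvement(front, (1.5, 1.5), ref))
print(hypervolume_improvement(front, (3.0, 3.0), ref))
```

EHVI averages this improvement over the surrogate model's predictive distribution for a candidate, so a candidate that is likely to be dominated contributes little, while one that may extend the front contributes more.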

#### 2.2.2. Bayesian Model Averaging: Search for MAX Phase With Maximum Bulk Modulus

As was mentioned above, the BMA framework has been developed by the present authors to address the problem of attempting a sequential optimal experimental design over a materials design space in which very little information about the causal relationships between features and the response of interest is available. This framework was demonstrated by efficiently exploring the MAX ternary carbide/nitride space (Barsoum, 2013) through Density Functional Theory (DFT) calculations by the authors in Talapatra et al. (2018). Because of their rich chemistry and the wide range of values of their properties (Aryal et al., 2014), MAX phases constitute an adequate material system to test simulation-driven (specifically, DFT-based) materials discovery frameworks.

The problem was formulated with the goals of identifying the material(s) with (i) the maximum bulk modulus K and (ii) the maximum bulk modulus and minimum shear modulus, under a resource constraint that permits querying a total of 20% of the MDS. The case of the maximum bulk modulus K is designed as a single-objective optimization problem, while the second problem is designed as a multi-objective problem. Features describing the relation between the material and the objective functions were obtained from the literature and domain knowledge.

In this work, a total of fifteen features were considered: empirical constants which relate the elements comprising the material to its bulk modulus; valence electron concentration; electron to atom ratio; lattice parameters; atomic number; interatomic distance; the groups according to the periodic table of the M, A & X elements, respectively; the order of the MAX phase (whether of order 1 corresponding to M2AX or order 2 corresponding to M3AX2); the atomic packing factor; average atomic radius; and the volume/atom. In relevant cases, these features were composition-weighted averages calculated from the elemental values and were assumed to propagate as per the Hume-Rothery rules. Feature correlations were used to finalize six different feature sets, which are denoted as F<sub>1</sub>, F<sub>2</sub>, F<sub>3</sub>, F<sub>4</sub>, F<sub>5</sub>, and F<sub>6</sub>.

Complete details may be found in Talapatra et al. (2018). Some representative results are shown here. Calculations were carried out for different numbers of initial data instances, N = 2, 5, 10, 15, 20. One thousand five hundred instances of each initial set N were used to ensure a stable average response. The budget for the optimal design was set at ≈ 20% of the MDS, i.e., 80 materials or calculations. In each iteration, two calculations were done. The optimal policy used for the selection of the compound(s) to query was based on the EI for the single-objective case and the EHVI for the multi-objective case. The performance trends for all problems across different values of N are consistent. The technique is found not to depend significantly on the quantity of initial data. Here, representative results for N = 5 are shown.

**Figure 7A** indicates the maximum bulk modulus found in the experiment design iterations based on each model (feature set), averaged over all initial data set instances for N = 5. The dotted line in the figure indicates the maximum bulk modulus (300 GPa) that can be found in the MDS. F<sub>2</sub> is found to be the best performing feature set on average, converging fastest to the maximum bulk modulus. F<sub>6</sub> and F<sub>5</sub>, on the other hand, are uniformly the worst performing feature sets on average, converging the slowest. It is evident that a regular optimization approach will work as long as a good feature set is available a priori. **Figure 7B** shows the swarm plots indicating the number of calculations required to discover the maximum bulk modulus in the MDS using experiment design based on single models for the 1500 initial data instances with N = 5. The width of the swarm plot at every vertical axis value indicates the proportion of instances where the optimal design parameters were found at that number of calculations. Bottom-heavy, wide bars, with the width decreasing with the number of steps, are desirable, since they indicate that a larger number of instances needed fewer steps to converge. The dotted line indicates the budget allotted, which was 80 calculations. Instances that did not converge within the budget were allotted a value of 100. Thus, the width of the plots at a vertical value of 100 corresponds to the proportion of instances which did not discover the maximum bulk modulus in the MDS within the budget. From this figure, it is seen that for F<sub>1</sub>, F<sub>2</sub>, and F<sub>4</sub> the maximum bulk modulus was identified within the budget in almost 100% of instances, while F<sub>5</sub> is the poorest feature set, for which the maximum was identified in very few instances.

Unfortunately, due to the small sample size and the large number of potential predictive models, the feature selection step may not result in the true best predictive model for efficient Bayesian Optimization. Small sample sizes pose a great challenge in model selection due to the inherent risk of imprecision, and no feature selection method performs well in all scenarios when sample sizes are small. Thus, by selecting a single model as the predictive model based on small observed sample data, one ignores the model uncertainty.

To circumvent this problem, the Bayesian Model Averaging (BMA) method was used. Regression models based on the aforementioned six feature subsets were adopted in the BMA experiment design. The BMA coefficients were evaluated in two ways: first-order (BMA<sub>1</sub>) and second-order (BMA<sub>2</sub>) Laplace approximation. **Figure 7C** shows the comparison of the average performance of both the first-order and second-order BMA over all initial data set instances with the best performing model (F<sub>2</sub>) and the worst performing model (F<sub>6</sub>). It can be seen that the performance of both the first-order and second-order BMA in identifying the maximum bulk modulus is consistently close to that of the best model (F<sub>2</sub>). BMA<sub>1</sub> performs as well as, if not better than, F<sub>2</sub>. **Figure 7D** shows the corresponding swarm plots indicating the number of calculations required to discover the maximum bulk modulus in the MDS for N = 5 using BMA<sub>1</sub> and BMA<sub>2</sub>. It can be seen that for a very high percentage of cases the maximum bulk modulus can be found within the designated budget. In **Figures 7E,F**, the average model coefficients (posterior model probabilities) of the models based on different feature sets over all instances of the initial data set are shown with the increasing number of calculations for BMA<sub>1</sub> and BMA<sub>2</sub>, respectively. Thus, we see that, while prior knowledge about the fundamental features linking the material to the desired material property is certainly essential to build the Materials Design Space (MDS), the BMA approach may be used to auto-select the best features/feature sets in the MDS, thereby eliminating the requirement of knowing the best feature set a priori. Also, this framework is not significantly dependent on the size of the initial data, which enables its use in materials discovery problems where initial data is scant.
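The mechanics of model averaging can be sketched in a few lines. The example below assumes that an (approximate) log marginal likelihood is already available for each candidate model, for instance from a Laplace approximation, and that each model returns a normal predictive distribution at the query point. The function names and numbers are hypothetical and do not reproduce the calculations of Talapatra et al. (2018).

```python
import math


def posterior_model_probs(log_evidences, priors=None):
    """Posterior model probabilities from log marginal likelihoods.

    Uses the log-sum-exp trick so that very negative log evidences
    do not underflow. A uniform model prior is assumed by default.
    """
    k = len(log_evidences)
    if priors is None:
        priors = [1.0 / k] * k
    logs = [le + math.log(p) for le, p in zip(log_evidences, priors)]
    top = max(logs)
    unnorm = [math.exp(v - top) for v in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def bma_predict(weights, means, variances):
    """Mean and variance of the model-averaged (mixture) prediction."""
    mean = sum(w * m for w, m in zip(weights, means))
    second_moment = sum(w * (v + m * m)
                        for w, m, v in zip(weights, means, variances))
    return mean, second_moment - mean * mean


# Hypothetical log evidences for three feature sets.
weights = posterior_model_probs([-10.0, -8.0, -12.0])
mean, var = bma_predict(weights, means=[250.0, 280.0, 230.0],
                        variances=[100.0, 64.0, 144.0])
print(weights, mean, var)
```

The averaged variance contains both the within-model variances and the spread between model means, which is how disagreement among feature sets shows up as additional predictive uncertainty.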

#### 2.2.3. Multi-Source Information Fusion: Application to Dual-Phase Materials

In Ghoreishi et al. (2018), the authors demonstrated the Multi-Source Information Fusion approach in the context of the optimization of the ground truth strength-normalized strain hardening rate for dual-phase steels. They used three reduced-order models (iso-strain, iso-stress, and iso-work) to determine the impact of quantifiable micro-structural attributes on the mechanical response of a composite dual-phase steel. The finite element model of the dual-phase material is considered as the ground truth, with the objective being the maximization of the (ground truth) normalized strain hardening rate at ε<sub>pl</sub> = 1.5%. The design variable then is the percentage of the hard phase, f<sub>hard</sub>, in the dual-phase material. A resource constraint of five total queries to (any of) the information sources before a recommendation for a ground truth experiment is made was enforced. If ground truth results were found to be promising, five additional queries were allocated to the information sources. The initial intermediate Gaussian process surrogates were constructed using one query from each information source and one query from the ground truth.

The value-gradient policy discussed earlier was used to select the next information source and the location of the query in the input space for each iteration of the process. For comparison purposes, the KG policy operating directly on the ground truth was also used, to reveal the gains that can be had by considering all available information sources. To facilitate this, a Gaussian process representation was created and updated after each query to ground truth. The convergence results of the fusion approach using all information sources and the KG policy on the ground truth are indicated in **Figure 8**. Here, the dashed line represents the optimal value of the ground truth quantity of interest. The proposed approach clearly outperforms the knowledge gradient applied directly to the ground truth, and also converges to the optimal value much faster, thereby reducing the number of needed ground truth experiments. This performance gain may be attributed to the ability of the information fusion approach to efficiently utilize the information available from the three low fidelity information sources to better direct the querying at ground truth. The original sample from ground truth used for initialization was taken at f<sub>hard</sub> = 95%, which is far away from the true optimum, as can be observed in the left column of **Figure 9**. The proposed framework was thus able to quickly direct the ground truth experiment to a higher quality region of the design space by leveraging the three inexpensive available information sources.

30. Image sourced from Ghoreishi et al. (2018); use permitted under the Creative Commons Attribution License CC-BY-NC-SA.

**Figure 9** shows the updates to each information source Gaussian process surrogate model and the fused model representing the total knowledge of ground truth for iterations 1, 15, and 30 of the information source querying process. Note that an iteration occurs when an information source is queried, which is distinct from any queries to ground truth. As is evident from the left column, the first experiment from ground truth and the first query from each information source gave scant information about the location of the true objective. By iteration 15, however, the fused model, shown by the smooth red curve, has identified the best region of the design space, although it still underpredicts the ground truth at this point. At iteration 15, only three expensive ground truth experiments have been conducted. By iteration 30, six ground truth experiments have been conducted and the fused model is very accurate in the region surrounding the optimal design for ground truth. It is clear from **Figure 9** that none of the information sources share the ground truth optimum. It is worth highlighting that the ability of the proposed framework to find this optimum rested upon the use of correlation-exploiting fusion, and would not have been possible using traditional methods.

**Figure 10** presents the history of the queries to each information source and the ground truth. Note that the iteration now counts queries to each information source as well as ground truth experiments. From the figure, it is evident that all three information sources are exploited to find the ground truth optimal design, implying that, however imperfect, the optimal use of all sources available to the designer is essential in order to identify the optimal ground truth.

#### 2.2.4. Bayesian Model Averaging and Information Fusion: CALPHAD-Based Thermodynamic Modeling

Calculation of phase diagrams (CALPHAD) is one of the fundamental tools in alloy design and an important component of ICME. Uncertainty quantification of phase diagrams is the first step required to provide confidence for decision making in property- or performance-based design. In work that was the first of its kind (Honarmandi et al., 2019), the authors independently generated four CALPHAD models describing Gibbs free energies for the Hf-Si system. The calculation of the Hf-Si binary phase diagram and its uncertainties is of great importance, since adding hafnium to niobium-silicide-based alloys (promising turbine airfoil materials with high operating temperatures) increases their strength, fracture toughness, and oxidation resistance significantly (Zhao et al., 2001). The Markov Chain Monte Carlo (MCMC) Metropolis-Hastings toolbox in Matlab was then utilized for probabilistic calibration of the parameters in the applied CALPHAD models. These results are shown for each model in **Figure 11**, where it is seen that there is very good agreement between the results obtained from model 2 and the data, with a very small uncertainty band and consequently a small Model Structure Uncertainty (MSU) (Choi et al., 2008). Models 3 and 4, on the other hand, show large uncertainties for the phase diagrams, which are mostly attributed to MSU. In the context of BMA, the weights of the applied models were calculated to be 0.1352, 0.5938, 0.1331, and 0.1379, respectively, indicating that model 2 has more than four times the weight of each of the other models, which otherwise have similar Bayesian importance, consistent with the phase diagram results in **Figure 11**. The phase diagram obtained using BMA is shown in **Figure 12**. The posterior modes of the probability distributions in the BMA model exactly correspond to the posterior modes of the probability distributions in model 2. Thus, the best model results can be considered as the optimum results for the average model, but with broader uncertainties, contributed by the inferior models. In BMA, each model has some probability of being true and the fused estimate is a weighted average of the models. This method is extremely useful when the model-building process is based on a weighted average over the models' responses, and/or when less risk (more confidence) in design is sought through the broader uncertainty bands provided by a weighted average over the uncertainties of the models' responses.
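The probabilistic calibration step can be illustrated with a minimal random-walk Metropolis-Hastings sampler. The sketch below is written in Python rather than Matlab and calibrates a single parameter of a toy model (the mean of noisy observations with a flat prior), so it only conveys the mechanics; the CALPHAD calibration of Honarmandi et al. (2019) involves many parameters and a thermodynamic forward model. All names, data, and settings are hypothetical.

```python
import math
import random


def log_posterior(theta, data, sigma):
    """Log posterior (up to a constant) for a flat prior and
    independent Gaussian observation errors of known width sigma."""
    return -0.5 * sum((d - theta) ** 2 for d in data) / sigma ** 2


def metropolis_hastings(log_post, theta0, step, n_samples, seed=0):
    """Random-walk Metropolis-Hastings with a Gaussian proposal."""
    rng = random.Random(seed)
    theta, lp = theta0, log_post(theta0)
    chain = []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)
        lp_new = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if lp_new >= lp or rng.random() < math.exp(lp_new - lp):
            theta, lp = proposal, lp_new
        chain.append(theta)
    return chain


data = [1.0, 1.2, 0.8, 1.1, 0.9]   # hypothetical observations
chain = metropolis_hastings(lambda t: log_posterior(t, data, 0.1),
                            theta0=0.5, step=0.05, n_samples=6000)
kept = sorted(chain[1000:])        # discard burn-in, sort for quantiles
post_mean = sum(kept) / len(kept)
lower = kept[int(0.025 * len(kept))]
upper = kept[int(0.975 * len(kept))]
print(post_mean, (lower, upper))   # posterior mean, 95% credible interval
```

Propagating the retained samples through the forward model, rather than only the best-fit value, is what yields credible intervals on predicted quantities such as phase boundaries.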

FIGURE 11 | Optimum Hf-Si phase diagrams and their 95% Bayesian credible intervals (BCIs) obtained from models 1–4 (A–D) after uncertainty propagation of the MCMC calibrated parameters in each case. Reproduced with permission from Honarmandi et al. (2019).

Error Correlation-based Model Fusion was then used to fuse the models together in two ways: (i) fusing models 1, 3, and 4 to examine whether the resulting fused model may be closer to the data and have reduced uncertainties, and (ii) fusing models 1, 2, 3, and 4 together. **Figure 13A** shows that the approach can provide a phase diagram in much better agreement with data and with less uncertainty compared to phase diagrams obtained from each one of the applied models individually. This result implies that random CALPHAD models can be fused together to find a reasonable estimation for the phase diagram instead of relying on trial and error to find the best predicting model. It is also apparent that better predictions can be achieved, as shown in **Figure 13B**, if model 2 (the best model) is also involved in the model fusion. The information fusion technique allowed the acquisition of more precise estimations and lower uncertainties compared to results obtained from each individual model. In summary, the average model obtained from BMA shows larger 95% confidence intervals compared to any one of the individual models, which can provide more confidence for robust design but is likely too conservative. On the other hand, the error correlation-based technique can provide results closer to the data, with smaller uncertainties than the individual models used for the fusion. The uncertainty reductions through this fusion approach are also verified through the comparison of the average entropies (as a measure of uncertainty) obtained for the individual and fused models. Therefore, random CALPHAD models can be fused together to find reasonable predictions for phase diagrams with no need to go through the cumbersome task of identifying the best CALPHAD models.
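As a reminder of why entropy can serve as an uncertainty measure, consider the case where the predictive distribution of a quantity is normal with standard deviation σ. This Gaussian form is an assumption made here for illustration; the exact entropy estimator used in the original study is not restated in this section. The differential entropy is then

$$h = \frac{1}{2}\ln\left(2\pi e\,\sigma^{2}\right)$$

so that a narrower predictive distribution (smaller σ) has lower entropy. Averaging h over the points of a phase diagram gives a single number with which individual and fused models can be compared.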

# 3. CONCLUSIONS AND RECOMMENDATIONS

In this work, we have reviewed some of the most important challenges and opportunities related to the concept of optimal experiment design as an integral component for the development of Materials Discovery frameworks, and have presented some recent work by these authors that attempts to address them.

As our understanding of the vagaries implicit in different design problems progresses, tailoring experiment design strategies around the specific material classes under study while further developing the experiment design frameworks will become increasingly feasible and successful. As techniques improve, we will be able to access and explore increasingly complex materials design spaces, opening the door to precision tailoring of materials to desired applications. Challenges in the form of the availability and generation of sufficient and relevant data of high quality need to be continuously addressed. The optimal way to accomplish this would be the implementation of universal standards, centralized databases and the development of an open access data-sharing system in conjunction with academia, industry, government research institutions and journal publishing agencies, an effort that is already underway.

### AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

#### FUNDING

The authors acknowledge the support of NSF through the projects NSF-CMMI-1534534, NSF-CMMI-1663130, and DGE-1545403. RA and ED also acknowledge the support of NSF through Grant No. NSF-DGE-1545403. XQ acknowledges the support of NSF through the project CAREER: Knowledge-driven Analytics, Model Uncertainty, and Experiment Design, NSF-CCF-1553281 and NSF-CISE-1835690 (with RA). AT and RA also acknowledge support by the Air Force Office of Scientific Research under AFOSR-FA9550-78816-1-0180 (Program Manager: Dr. Ali Sayir). RA and DA also acknowledge the support of ARL through [grant No. W911NF-132-0018]. The open access publishing fees for this article have been covered by the Texas A&M University Open Access to Knowledge Fund (OAKFund), supported by the University Libraries and the Office of the Vice President for Research.

### ACKNOWLEDGMENTS

Calculations were carried out in the Texas A&M Supercomputing Facility.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Talapatra, Boluki, Honarmandi, Solomou, Zhao, Ghoreishi, Molkeri, Allaire, Srivastava, Qian, Dougherty, Lagoudas and Arróyave. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# On-the-Fly Adaptivity for Nonlinear Twoscale Simulations Using Artificial Neural Networks and Reduced Order Modeling

#### Felix Fritzen<sup>1</sup> \*, Mauricio Fernández <sup>1</sup> and Fredrik Larsson<sup>2</sup>

<sup>1</sup> EMMA - Efficient Methods for Mechanical Analysis, Institute of Applied Mechanics, University of Stuttgart, Stuttgart, Germany, <sup>2</sup> Material and Computational Mechanics, Department of Industrial and Materials Science, Chalmers University of Technology, Göteborg, Sweden

A multi-fidelity surrogate model for highly nonlinear multiscale problems is proposed. It is based on the introduction of two different surrogate models and an adaptive on-the-fly switching. The two concurrent surrogates are built incrementally starting from a moderate set of evaluations of the full order model. To this end, a reduced order model (ROM) is generated. Using a hybrid ROM-preconditioned FE solver, additional effective stress-strain data is simulated while the number of samples is kept to a moderate level by using a dedicated and physics-guided sampling technique. Machine learning (ML) is subsequently used to build the second surrogate by means of artificial neural networks (ANN). Different ANN architectures are explored and the features used as inputs of the ANN are fine-tuned in order to improve the overall quality of the ML model. Additional ML surrogates for the stress errors are generated. For these, conservative design guidelines for error surrogates are presented by adapting the loss functions of the ANN training in pure regression or pure classification settings. The error surrogates can be used as quality indicators in order to adaptively select the appropriate (i.e., efficient yet accurate) surrogate. Two strategies for the on-the-fly switching are investigated and a practicable and robust algorithm is proposed that eliminates relevant technical difficulties attributed to model switching. The provided algorithms and ANN design guidelines can easily be adopted for different problem settings and, thereby, they enable generalization of the used machine learning techniques for a wide range of applications. The resulting hybrid surrogate is employed in challenging multilevel FE simulations for a three-phase composite with pseudo-plastic micro-constituents. Numerical examples highlight the performance of the proposed approach.

Keywords: reduced order modeling (ROM), machine learning, artificial neural networks (ANN), surrogate modeling, error control, on-the-fly model adaptivity, multiscale simulations

#### Edited by:

Christian Johannes Cyron, Hamburg University of Technology, Germany

#### Reviewed by:

Alireza Yazdani, Brown University, United States

Ercan Gürses, Middle East Technical University, Turkey

\*Correspondence: Felix Fritzen fritzen@mechbau.uni-stuttgart.de

#### Specialty section:

This article was submitted to Computational Materials Science, a section of the journal Frontiers in Materials

Received: 04 February 2019 Accepted: 05 April 2019 Published: 03 May 2019

#### Citation:

Fritzen F, Fernández M and Larsson F (2019) On-the-Fly Adaptivity for Nonlinear Twoscale Simulations Using Artificial Neural Networks and Reduced Order Modeling. Front. Mater. 6:75. doi: 10.3389/fmats.2019.00075

# 1. INTRODUCTION

In computer-assisted materials design and in the simulation of complex materials with rich microstructure, major challenges remain to be solved despite the outstanding advances made in recent years. For example, the discretization of all microstructural features in a monolithic finite element (FE) simulation is infeasible because the length scales involved range from micrometers up to meters, which would lead to a ludicrous complexity of the resulting overall model. By accounting for a separation of length scales, the FE<sup>2</sup> ansatz (Feyel, 1999; Miehe, 2002) can lead to some savings over the monolithic approach by replacing the heterogeneous material by microscopic FE problems at the macroscopic integration points, leading to a partial decoupling of microscopic and macroscopic degrees of freedom. Still, the number of overall unknowns is prohibitive and calls for massively improved computational efficiency in terms of CPU time, memory savings and information compression. Novel strategies that contribute to the vision of a fully connected investigation of materials and aspire to predict process-structure-property relationships across multiple length and time scales are, thus, much sought-after, see, e.g., Schmitz and Prahl (2016). Due to the rapid growth of available material and simulation data, data-integrated approaches that exploit information from different sources in order to complement or substitute simulations and experiments are experiencing increased attention, see, e.g., Kalidindi and De Graef (2015), Kalidindi (2015), and Ramakrishna et al. (2018).

Owing to advances in machine learning and computational resources, a zoo of data-driven methods comprising, e.g., kernel methods, principal component analysis, and artificial neural networks, has gained immense momentum over the last few years. The successful implementation of these techniques in materials research is an active field. For instance, in Chupakhin et al. (2017) artificial neural networks and finite element computations have been combined in order to predict the influence of plasticity on the residual stress field measured by hole drilling. Principal component analysis of n-point microstructure statistics has shown excellent performance in examining microstructure-property relationships, see, e.g., Çeçen et al. (2014), Gupta et al. (2015), and Altschuh et al. (2017). In Bélisle et al. (2015), several machine learning methods have been considered in the context of molecular dynamics. Liu et al. (2015) show how data mining and machine learning are combined in order to efficiently approximate the elastic localization in voxelized microstructures. Another branch of data-driven materials research, which explores the use of convolutional neural networks and deep learning in order to deliver accurate structure-property linkages, is currently in heavy development, see, e.g., Çeçen et al. (2018) and Yang et al. (2018).

While data-driven approaches have their appeal, the structure of the underlying physical problem can be accounted for only in parts. For instance, established balance laws and thermodynamic principles are hard to incorporate in the aforementioned methods. Reduced order models for the microscopic problem offer an advantageous compromise between physics-informed modeling and computational efficiency. Purely data-driven surrogates lack accuracy (i) if the amount of training data is limited, (ii) if the validity domain is left, or (iii) if the error of the surrogate with respect to the reference solution is to be estimated. In these scenarios, reduced order models following physical principles offer, in general, better accuracy and robustness. For example, in Fritzen and Leuschner (2013) a highly efficient potential based reduced order model has been developed. This ansatz has a natural physical supporting argument, since a reduced basis for the solution field is generated based on snapshot data of FE computations. The approach has been demonstrated to achieve substantial speed-ups and memory savings, see also Fritzen et al. (2014) and Fritzen and Hodapp (2016). Other developments in this field comprise the NTFA (e.g., Michel and Suquet, 2003) and NTFA-TSO (Michel and Suquet, 2016) or hyper-reduced simulations and related schemes (Ryckelynck, 2009; Soldner et al., 2017). In order to improve the incorporation of surrogates obtained from reduced order models, a goal-oriented error estimation or quality indication is required. The quantity of interest (QoI) is the effective stress and its accuracy (up to a prescribed tolerance) is essential for the reliability of the overall predictions. In Lu et al. (2018), for example, neural networks have been successfully trained to approximate the nonlinear microscopic electric material law of graphene/polymer nanocomposites, but without error control or model adaptivity in macroscopic simulations.
A macroscopic goal-oriented approach combining reduced order modeling and machine learning techniques has been demonstrated in Trehan et al. (2017) for two-dimensional oil-water subsurface flow systems and in Freno and Carlberg (2018) for three-dimensional mechanical problems. The approach considers a reduced order model and an a posteriori correction through machine learning methods. The ansatz shows promising results, but it requires the evaluation of the reduced order model. For twoscale simulations with a macroscopic and a microscopic problem, even the evaluation of a reduced order model for the microscopic problem may not always be viable due to the large number of evaluations needed in the macroscopic problem. It is, therefore, necessary to seek efficient alternatives incorporating a hierarchy of surrogate models of different computational complexity and different accuracy in the QoI. Here, one may also take into consideration physics-informed artificial neural networks, as done in Raissi et al. (2018), in order to obtain the field solution of the balance equations for the microscopic physical problem at hand and then compute the QoI for the macroscopic scale. Such approaches are highly attractive, but not suitable for the objectives of twoscale computations, since the surrogate for the microscopic model is only required to return the QoI for the macroscale and additional calibration of the artificial neural networks for the microscopic solution field would only increase the computational costs without any benefits for the convergence of the macroscopic problem.

In the context of mechanical multiscale FE simulations, the present work aims at the adaptive combination of the physics-informed reduced order model (ROM) of Fritzen and Kunc (2018) for nonlinear hyperelastic problems (i.e., no history dependency) with artificial neural networks (ANNs), for which feedforward neural networks are considered. Here, the ANNs are trained on FE computations of the full three-dimensional microstructure and material at hand for a set of loading strains (input quantity). The QoI (output quantity), for which the ANNs are trained, is the effective stress. The trained ANNs are then used as a highly efficient constitutive relation surrogate for the nonlinear material at hand. In macroscopic FE computations, the ANN material law surrogate is to be used, if possible, at every integration point for the given effective strain. Judged by quality indicators such as accuracy or range of validity, however, the ANN constitutive relation surrogate may be inaccurate or unreliable for a given strain. The present work, therefore, further considers the error modeling of trained ANNs and discusses guidelines for inducing conservative properties in the error models, which are calibrated through ANNs in standard regression or classification approaches. Additionally, strategies are proposed for adaptive ANN-ROM schemes, in which the more accurate but expensive ROM is called at an integration point only if the quality indicator demands it.
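The switching idea described above can be illustrated with a minimal Python sketch. It is not the algorithm of the paper: the function names `ann_stress`, `rom_stress`, and `error_indicator`, the tolerance, and the toy material responses are hypothetical stand-ins chosen only to make the example executable.

```python
import numpy as np

def adaptive_stress(strain, ann_stress, error_indicator, rom_stress, tol):
    """Pick the surrogate for one integration point; returns (stress, label)."""
    # Try the cheap surrogate first and trust it only if the quality
    # indicator predicts an error below the prescribed tolerance.
    if error_indicator(strain) <= tol:
        return ann_stress(strain), "ANN"
    # Otherwise fall back to the more accurate but more expensive ROM.
    return rom_stress(strain), "ROM"

# Hypothetical stand-ins (assumptions, not the models of the paper).
C = 2.0 * np.eye(6)

def rom_stress(e):
    # Placeholder for the ROM evaluation of the effective stress.
    return C @ np.tanh(e)

def ann_stress(e):
    # Placeholder for the trained ANN; only good for small strains here.
    return C @ e

def error_indicator(e):
    # Placeholder for a trained error surrogate (grows with strain norm).
    return np.linalg.norm(e) ** 3

for strain in (1e-3 * np.ones(6), 0.5 * np.ones(6)):
    stress, used = adaptive_stress(strain, ann_stress, error_indicator,
                                   rom_stress, tol=1e-3)
    print(used, np.round(stress, 4))
```

In a macroscopic FE code such a selector would be called once per integration point and load step, so the cost of the indicator itself has to stay well below the cost of the ROM.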

The manuscript is organized as follows: In section 2, two concurrent surrogate models for the QoI obtained by reduced order modeling and from purely data-driven ANNs are described. The twoscale mechanical problem is introduced and the challenges in the goal-oriented error estimation of derived quantities of interest remaining in nonlinear reduced order modeling are detailed. Then, the data generation for the training of the ANNs is illustrated, followed by the guidelines for the material law and error approximation. At the end of the section, adaptive twoscale simulation strategies including on-the-fly model switching are presented. Section 3 offers numerical examples for a three-phase pseudo-plastic material: The ANN is used as a direct surrogate for the QoI. This is adaptively complemented by a more robust and reliable reduced order model based on the concept of quality indicators. Multiscale FE simulations comparing the different multiscale simulation techniques are presented. The manuscript ends with a concluding summary of the results in section 4.

# 2. REDUCED ORDER MODELING AND ARTIFICIAL NEURAL NETWORKS

#### 2.1. Twoscale Framework

#### 2.1.1. Problem Setting

The simulation of microstructured solids with a sufficient separation of length scales is investigated. More precisely, a macroscopic domain $\overline{\Omega} \subset \mathbb{R}^3$ with characteristic length $L$ and an attached microstructure with characteristic length $L_\mu \ll L$ are considered. The microstructure is assumed to be ergodic and the existence of a periodic Representative Volume Element (RVE) $\Omega$ is assumed. In the following, macroscopic fields are overlined, $\overline{\bullet}$. The twoscale problem consists of the concurrent solution of the macroscopic boundary value problem (BVP)

$$(\overline{\mathrm{P}}): \quad \operatorname{div}\left(\overline{\boldsymbol{\sigma}}(\overline{\boldsymbol{\varepsilon}})\right) = \mathbf{0} \quad \text{with} \quad \overline{\boldsymbol{\varepsilon}} = \operatorname{sym}\operatorname{grad}(\overline{\mathbf{u}}) \quad + \text{BC} \tag{1}$$

and, for each macroscopic point $\overline{\mathbf{x}} \in \overline{\Omega}$, of the solution of the RVE problem

$$\begin{aligned} (\mathrm{P}): \quad \operatorname{div}\left(\boldsymbol{\sigma}(\boldsymbol{\varepsilon})\right) &= \mathbf{0} \quad \text{with} \quad \boldsymbol{\varepsilon} = \operatorname{sym}\operatorname{grad}(\mathbf{u}) \\ \text{and} \quad \frac{1}{|\Omega|}\int_{\Omega} \boldsymbol{\varepsilon}\,\mathrm{d}V &= \overline{\boldsymbol{\varepsilon}}. \end{aligned} \tag{2}$$

Here $\mathbf{u}$, $\overline{\mathbf{u}}$ denote displacements, $\boldsymbol{\varepsilon}$, $\overline{\boldsymbol{\varepsilon}}$ are infinitesimal strain tensors and $\boldsymbol{\sigma}$, $\overline{\boldsymbol{\sigma}}$ denote the stress fields on the microscopic and macroscopic domain, respectively. The solution of (P) defines the missing constitutive relation for the macroscopic stress $\overline{\boldsymbol{\sigma}}$ via the volume average

$$\overline{\boldsymbol{\sigma}} = \frac{1}{|\Omega|}\int_{\Omega} \boldsymbol{\sigma}\,\mathrm{d}V. \tag{3}$$

The two BVPs are strongly coupled since the solution $\overline{\mathbf{u}}$ of $(\overline{\mathrm{P}})$ defines the boundary condition for (P) via $\overline{\boldsymbol{\varepsilon}}$, while the solution of (P) implicitly provides the missing constitutive equation via (3).

A straightforward yet computationally costly approach to solving the twoscale problem is given in terms of the FE<sup>2</sup> method (Feyel, 1999; Miehe, 2002): Here the microscopic problem is solved at each macroscopic integration point and the effective tangent operator is used in order to allow for Newton-Raphson iterations of the nonlinear macroscopic BVP. In the following, the solution of the microscopic BVP using Finite Elements is considered as the reference solution, i.e., it denotes the Full Order Model (FOM). In **Figure 1** the macroscopic and microscopic problems are illustrated in the context of FE<sup>2</sup>.

#### 2.1.2. Reduced Order Model (ROM)

Given the massive computational demands of the FE<sup>2</sup> technique and the limited availability of computational resources, the use of well-established reduced order models (ROMs) to replace the costly microscopic BVP evaluations has become an accepted alternative for dissipative and pseudo-plastic hyperelastic materials (Radermacher and Reese, 2016; Fritzen and Kunc, 2018). The reduced basis of dimension $N$ obtained from the snapshot Proper Orthogonal Decomposition (POD, Sirovich, 1987) can be expressed in terms of a matrix $\underline{\underline{U}}(\underline{x})$, where each column represents a displacement field. The reduced parameterization of the solution is then given in vector notation via ($i = 1, \ldots, N$)

$$\underline{u}(\underline{x}) = \underline{\underline{\overline{\varepsilon}}}\,\underline{x} + \underline{\underline{U}}(\underline{x})\,\underline{\xi}, \quad \underline{\varepsilon}(\underline{x}) = \underline{\overline{\varepsilon}} + \underline{\underline{E}}(\underline{x})\,\underline{\xi}, \quad \left(\underline{\underline{E}}(\underline{x})\right)_{\bullet i} = \operatorname{vec}\left(\operatorname{sym}\operatorname{grad}\left(\left(\underline{\underline{U}}(\underline{x})\right)_{\bullet i}\right)\right), \tag{4}$$

where $(\underline{\underline{A}})_{\bullet i}$ refers to the $i$th column of the corresponding matrix $\underline{\underline{A}}$. Here, the matrix and vector notation of the effective strain $\overline{\boldsymbol{\varepsilon}}$ are used concurrently for convenience. In the following, attention is limited to pseudo-plastic materials, i.e., to strongly nonlinear hyperelastic solids for which the stress $\boldsymbol{\sigma}$ and the stiffness $\mathbb{C}$ are defined as the gradients of a free energy function $W(\boldsymbol{\varepsilon})$ according to

$$\underline{\sigma} \equiv \boldsymbol{\sigma} = \frac{\partial W(\boldsymbol{\varepsilon})}{\partial \boldsymbol{\varepsilon}}, \qquad \underline{\underline{C}} \equiv \mathbb{C} = \frac{\partial^2 W(\boldsymbol{\varepsilon})}{\partial \boldsymbol{\varepsilon}\,\partial \boldsymbol{\varepsilon}}. \tag{5}$$

Following Fritzen and Kunc (2018), the reduced problem is to find the coefficients $\underline{\xi} \in \mathbb{R}^N$ solving

$$\underline{r}(\underline{\overline{\varepsilon}}, \underline{\xi}) = \int_{\Omega} \underline{\underline{E}}^{\mathsf{T}}\,\underline{\sigma}\,\mathrm{d}V \stackrel{!}{=} \underline{0}. \tag{6}$$

While the effective stress is obtained from simple volume averaging of σ, the effective tangent stiffness is computed via

$$\underline{\underline{\overline{C}}} = \frac{1}{|\Omega|}\left(\int_{\Omega} \underline{\underline{C}}\,\mathrm{d}V - \underline{\underline{K}}\,\underline{\underline{J}}^{-1}\,\underline{\underline{K}}^{\mathsf{T}}\right) \quad \text{with} \quad \underline{\underline{K}} = \int_{\Omega} \underline{\underline{C}}\,\underline{\underline{E}}\,\mathrm{d}V, \quad \underline{\underline{J}} = \int_{\Omega} \underline{\underline{E}}^{\mathsf{T}}\,\underline{\underline{C}}\,\underline{\underline{E}}\,\mathrm{d}V, \tag{7}$$

which follows from straightforward linearization of (6). The accuracy of the ROM depends on the quality and number of the snapshots and on the reduced dimension $N$. It shall be noted that the Galerkin ROM inherits the properties of classical Finite Elements, i.e., the solution is Galerkin orthogonal and, thus, energy optimal. It follows the basic physical principle of energy minimization. From a theoretical perspective this motivates the robustness and accuracy of the ROM even beyond the considered parameters used during the generation of the snapshot data, i.e., the ROM can be considered to generalize.
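To make the reduced problem more tangible, the following Python sketch solves a quadrature version of the residual equation (6) by Newton iterations and then condenses the effective tangent. Everything specific is an assumption made for illustration: the toy energy $W = \tfrac{k}{2}|\varepsilon|^2 + \tfrac{c}{4}|\varepsilon|^4$, the random strain modes standing in for a POD basis, the two-phase stiffness pattern, and all function names. With $\underline{\underline{K}}$ stored as a $6 \times N$ matrix, the condensation term is evaluated as $\underline{\underline{K}}\,\underline{\underline{J}}^{-1}\underline{\underline{K}}^{\mathsf{T}}$.

```python
import numpy as np

def material(eps, k, c=50.0):
    """Toy hyperelastic law W = k/2 |eps|^2 + c/4 |eps|^4 (an assumption)."""
    n2 = eps @ eps
    sigma = (k + c * n2) * eps
    C = (k + c * n2) * np.eye(6) + 2.0 * c * np.outer(eps, eps)
    return sigma, C

def assemble(eps_bar, xi, E, w, k):
    """Quadrature approximations of the integrals over the RVE."""
    N = E.shape[2]
    r, J = np.zeros(N), np.zeros((N, N))
    K, C_int, s_int = np.zeros((6, N)), np.zeros((6, 6)), np.zeros(6)
    for E_q, w_q, k_q in zip(E, w, k):
        sigma, C = material(eps_bar + E_q @ xi, k_q)
        r += w_q * E_q.T @ sigma        # reduced residual
        J += w_q * E_q.T @ C @ E_q      # Jacobian of the residual w.r.t. xi
        K += w_q * C @ E_q              # coupling matrix, shape 6 x N
        C_int += w_q * C
        s_int += w_q * sigma
    return r, J, K, C_int, s_int

def solve_rom(eps_bar, E, w, k, tol=1e-12, max_iter=30):
    """Newton iteration for xi; returns xi, effective stress and tangent."""
    xi = np.zeros(E.shape[2])
    for _ in range(max_iter):
        r, J, K, C_int, s_int = assemble(eps_bar, xi, E, w, k)
        if np.linalg.norm(r) < tol:
            break
        xi = xi - np.linalg.solve(J, r)
    else:
        raise RuntimeError("Newton iteration did not converge")
    vol = w.sum()
    C_bar = (C_int - K @ np.linalg.solve(J, K.T)) / vol
    return xi, s_int / vol, C_bar

def make_toy_rve(n_q=30, N=4, seed=0):
    """Random weights, two 'phases', and zero-mean strain modes (all made up)."""
    rng = np.random.default_rng(seed)
    w = rng.uniform(0.5, 1.5, n_q)
    w /= w.sum()
    k = np.where(np.arange(n_q) < n_q // 2, 1.0, 10.0)
    E = rng.standard_normal((n_q, 6, N))
    E -= np.einsum("q,qij->ij", w, E)   # modes carry no average strain
    return E, w, k

E, w, k = make_toy_rve()
eps_bar = np.array([0.02, -0.01, 0.005, 0.01, 0.0, -0.015])
xi, sigma_bar, C_bar = solve_rom(eps_bar, E, w, k)
print("reduced coefficients:", xi)
print("effective stress:", sigma_bar)
```

A useful sanity check for such an implementation is to compare the condensed tangent with central finite differences of the effective stress with respect to the effective strain.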

#### 2.2. Goal-Oriented Error Estimation

For the microscopic BVP, using the ROM (or any other approximation of the FOM) naturally introduces an error into the solution of the problem, and into the quantity of interest (QoI). In this work the latter is the effective stress. Hence, in order to enable error control for the macroscopic boundary value problem, it is crucial to estimate the error in the QoI, see, e.g., Larsson and Runesson (2011). For this purpose, we define the error in displacements on the microscale as $\underline{e}(\underline{x}) = \underline{u}^{\mathrm{F}}(\underline{x}) - \underline{u}^{\mathrm{R}N}(\underline{x})$, where $\underline{u}^{\mathrm{F}}$ and $\underline{u}^{\mathrm{R}N}$ are the solutions to the microscopic problem (2) using the FOM and $N$-dimensional ROM, respectively. In view of the reduced kinematics (4), we can parameterize the error in the ROM as

$$\underline{e}(\underline{x}) = \underline{\underline{U}}^{\mathrm{F}}(\underline{x})\,\underline{\xi}_e, \tag{8}$$

where $\underline{\underline{U}}^{\mathrm{F}}$ denotes the (finite element) shape functions pertinent to the FOM and $\underline{\xi}_e$ denotes the nodal values of the fully resolved error. The FOM and $N$-dimensional ROM effective stress functions are referred to for clarity as

$$\underline{\overline{\sigma}}^{\mathrm{F}}(\underline{\overline{\varepsilon}}) = \text{effective stress of FOM for given effective strain } \underline{\overline{\varepsilon}}, \tag{9}$$

$$\underline{\overline{\sigma}}^{\mathrm{R}N}(\underline{\overline{\varepsilon}}) = \text{effective stress of } N\text{-dimensional ROM for given effective strain } \underline{\overline{\varepsilon}}, \tag{10}$$

while the corresponding error is addressed as

$$\underline{e}_{\overline{\sigma}} = \underline{\overline{\sigma}}^{\mathrm{F}} - \underline{\overline{\sigma}}^{\mathrm{R}N}. \tag{11}$$

Now, consider the corresponding FOM residual equation analogous to (6) and the error in the QoI given in (11) in terms of the error in the solution defined in (8). Through linearization, we obtain the error equation and the linearization of the macroscopic stress error

$$\underline{\underline{J}}^{\mathrm{F}}\,\underline{\xi}_e \approx -\underline{r}^{\mathrm{F}}, \qquad \underline{e}_{\overline{\sigma}} \approx \underline{\underline{K}}^{\mathrm{F}}\,\underline{\xi}_e, \tag{12}$$

respectively. Here $\underline{r}^{\mathrm{F}}$, $\underline{\underline{J}}^{\mathrm{F}}$, and $\underline{\underline{K}}^{\mathrm{F}}$ define the residual, Jacobian and linearized stress error in (6) and (7), with $\underline{\underline{E}}$ replaced by $\underline{\underline{E}}^{\mathrm{F}}$ defining the strains of the finite element shape functions. The now standard method of goal-oriented error estimation can be carried out by solving the suitably formulated dual (or adjoint) problem (see, e.g., Oden and Prudhomme, 2001)

$$\left(\underline{\underline{J}}^{\mathrm{F}}\right)^{\mathsf{T}} \underline{\underline{\xi}}^{*} = \left(\underline{\underline{K}}^{\mathrm{F}}\right)^{\mathsf{T}}. \tag{13}$$

Finally, (12) and (13) can be combined to yield the result

$$\underline{e}_{\overline{\sigma}} \approx -\left[\underline{\underline{\xi}}^{*}\right]^{\mathsf{T}} \underline{r}^{\mathrm{F}}. \tag{14}$$

We note that the estimator (14) has, in particular, the following properties: (i) It is restricted to estimating the linearized error contribution, (ii) it requires the assembly of the entire FOM residual and Jacobian, and (iii) it requires the solution of the dual problem using the FOM to formally hold. Even if the linearization error is negligible, the high computational cost involved in assembling the full (FOM) Jacobian and residual of the problem makes this technique unattractive for use in conjunction with highly efficient ROM approximations. Possible remedies rely on hierarchical approximations of (13). One could, for instance, solve the dual problem using an enriched ROM, rather than the FOM. However, designing a robust hierarchical scheme requires means of guaranteeing that the enriched basis is sufficient. In view of the discussion above, we shall henceforth consider alternative methods for estimating (and controlling) the error in the macroscopic stress from each microscopic problem.
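The algebra behind the dual-weighted residual estimate can be checked on a small linear example. In the Python sketch below, random matrices are used as hypothetical stand-ins for the FOM Jacobian, the linearized stress operator, and the residual; the function names are illustrative. The point is only that the primal route (solve for the error coefficients, then map to the stress error) and the dual route (solve the adjoint problem once, then weight the residual) give the same linearized stress error.

```python
import numpy as np

def stress_error_primal(J, K, r):
    """Solve J xi_e = -r, then evaluate the linearized stress error K xi_e."""
    xi_e = np.linalg.solve(J, -r)
    return K @ xi_e

def stress_error_dual(J, K, r):
    """Solve the dual problem J^T xi* = K^T, then weight the residual."""
    xi_star = np.linalg.solve(J.T, K.T)   # one column per stress component
    return -xi_star.T @ r

# Hypothetical stand-ins for the FOM quantities (random, well conditioned).
rng = np.random.default_rng(0)
n = 40                                    # number of FOM unknowns (made up)
A = rng.standard_normal((n, n))
J = A @ A.T + n * np.eye(n)
K = rng.standard_normal((6, n))
r = rng.standard_normal(n)

print(stress_error_primal(J, K, r))
print(stress_error_dual(J, K, r))
```

The dual solve involves six right-hand sides that do not depend on the residual, which is what makes the adjoint formulation attractive whenever the same Jacobian is reused; the cost of assembling and factorizing the full Jacobian remains, as discussed above.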

#### 2.3. Artificial Neural Networks (ANNs)

#### 2.3.1. Generation of Data

#### **2.3.1.1. Design of input data / loading directions**

The present work is concerned with materials based on state-dependent models for, e.g., pseudo-plasticity. For such material models, see, e.g., Kunc and Fritzen (2018), the transition zone between the elastic and plastic domain is found in the vicinity of the origin in strain space. In this transition zone, a pronounced nonlinearity and change of slope take place, not only from the elastic to the plastic domain but also depending on the load direction in the plastic domain, followed by a saturation behavior for increasing load amplitudes. This material behavior motivated the Concentric Sampling (CS) approach proposed by Kunc and Fritzen (2018) for pseudo-plastic materials, which is also used in this work. Based on the CS approach, $n_d$ almost uniformly distributed unit vectors (directions) $\underline{d}^{(i)} \in \mathbb{R}^6$ ($i = 1, \ldots, n_d$) are generated. Samples along the generated directions are considered with an exponentially growing step width from the origin. The primal strain dataset $\hat{\mathcal{D}}_\varepsilon$ is addressed as

$$\hat{\mathcal{D}}_\varepsilon = \{\underline{\overline{\varepsilon}} \in \mathbb{R}^6 : \underline{\overline{\varepsilon}} = r\,\underline{d},\ r \in \mathcal{D}_r,\ \underline{d} \in \mathcal{D}_d\}, \quad \mathcal{D}_r = \{r_1, \dots\}, \quad \mathcal{D}_d = \{\underline{d}^{(1)}, \dots\} \tag{15}$$

with the primal strain norm discretization $\mathcal{D}_r$ and set of directions $\mathcal{D}_d$. The definition (15) corresponds to a tensor decomposition into direction and amplitude. For many materials the volume changes are rather small compared to isochoric deformations. This effect is particularly pronounced for (pseudo-) plastic materials. In order to sample the strain space in a problem-specific manner, a rescaling of the strains defined in (15) may be convenient. The present work solely rescales the spherical part (sph) of each primal strain (i.e., the dilatation), while the deviatoric part (dev) remains unchanged. The actual strain dataset is described by

$$\mathcal{D}_\varepsilon = \left\{\underline{\overline{\varepsilon}} \in \mathbb{R}^6 : \underline{\overline{\varepsilon}} = \underline{\hat{T}}(\underline{\hat{\overline{\varepsilon}}}) = \frac{1}{\hat{r}}\operatorname{sph}(\underline{\hat{\overline{\varepsilon}}}) + \operatorname{dev}(\underline{\hat{\overline{\varepsilon}}}),\ \underline{\hat{\overline{\varepsilon}}} \in \hat{\mathcal{D}}_\varepsilon\right\}, \qquad \#(\mathcal{D}_\varepsilon) = \#(\mathcal{D}_d)\,\#(\mathcal{D}_r), \tag{16}$$

where $\hat{r}$ specifies the rescaling of the spherical part. The number of strain samples $\#(\mathcal{D}_\varepsilon)$ is given by the product of the number of directions $\#(\mathcal{D}_d)$ and the number of amplitudes per direction $\#(\mathcal{D}_r)$.
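The construction of the strain dataset can be sketched in a few lines of Python. Several choices below are assumptions rather than details taken from the text: directions are drawn as normalized Gaussian vectors (a simple stand-in for the almost uniform directions of the CS approach), the amplitudes are geometrically spaced as one possible reading of "exponentially growing step width", strains are stored as 6-vectors whose first three entries are the normal components, and the function and parameter names are hypothetical.

```python
import numpy as np

# Identity tensor in a 6-component vector notation (normal components first).
I_VEC = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])

def sph(eps):
    """Spherical part of strain vectors, shape (..., 6)."""
    return (eps[..., :3].sum(axis=-1, keepdims=True) / 3.0) * I_VEC

def dev(eps):
    """Deviatoric part of strain vectors, shape (..., 6)."""
    return eps - sph(eps)

def concentric_samples(n_dir, n_amp, r_min, r_max, r_hat, seed=0):
    """Return primal samples r*d and the samples with rescaled dilatation."""
    rng = np.random.default_rng(seed)
    # Stand-in for the CS directions: normalized Gaussian vectors.
    d = rng.standard_normal((n_dir, 6))
    d /= np.linalg.norm(d, axis=1, keepdims=True)
    # Amplitudes with geometrically growing spacing away from the origin.
    radii = np.geomspace(r_min, r_max, n_amp)
    primal = (radii[None, :, None] * d[:, None, :]).reshape(-1, 6)
    # Rescale only the spherical part; the deviatoric part is untouched.
    rescaled = sph(primal) / r_hat + dev(primal)
    return primal, rescaled

primal, rescaled = concentric_samples(n_dir=8, n_amp=5, r_min=1e-4,
                                      r_max=1e-1, r_hat=10.0)
print(primal.shape, rescaled.shape)
```

The total number of samples is the product of the number of directions and the number of amplitudes per direction, in line with (16).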

#### **2.3.1.2. Generation of output data**

For the training of the artificial neural networks (ANNs), training (T), validation (V), and random (Monte Carlo, MC) datasets, referred to as $\mathcal{D}^{\mathrm{T}}_\varepsilon$, $\mathcal{D}^{\mathrm{V}}_\varepsilon$, and $\mathcal{D}^{\mathrm{MC}}_\varepsilon$, respectively, are generated. The latter are not obtained using CS, but using a uniformly random set of directions in strain space. They are mainly used for unbiased testing of the surrogate independent of the proximity to the training and validation set. The output of interest in the present work is, primarily, the effective stress, but also some error measures for the derived surrogates, which will be defined in the following sections.

Technically, the process of generating the data samples is challenging. In order to obtain reliable data, the FOM and the ROM must be evaluated thousands of times. Each sample consists of an effective strain $\underline{\overline{\varepsilon}}$ and the related effective stress $\underline{\overline{\sigma}}$. In order to boost the performance of the simulations, a ROM-preconditioned solver for the FOM has been developed: First, an accurate (i.e., high-dimensional) ROM is solved for each load path. Then the FEM is accelerated by taking the ROM solution as initial guess for the nodal displacements during the first increment and, during the subsequent load steps, by taking the ROM displacement increment as initial guess for the FEM displacement adjustment. This not only brings the initial guess close to the final solution but also leads to an accurate global stiffness matrix that can be combined with Quasi-Newton techniques. The ROM-accelerated FE showed a 20% reduction in the number of Newton iterations, despite the use of a Quasi-Newton scheme. This is remarkable in view of the less accurate stiffness matrix of the Quasi-Newton scheme; the faster convergence must be attributed to the improved initial guess for the FE displacement vector reconstructed from the ROM solution. Overall, this approach provides significant computational improvements over a naive FE based data generation. Further, it is noteworthy that the high-dimensional ROM solution can be used to derive a hierarchy of lower-dimensional ROM solutions needing virtually no additional Newton-Raphson iterations via linearization. More precisely, the trailing entries of a ROM solution can be eliminated by making use of the Schur complement, which leads to an adjustment of the remaining reduced coefficients. In our tests this downscaling of high quality ROM solutions to $N$-dimensional ROMs proved an efficient tool.

#### 2.3.2. Surrogate Model for the Effective Stress

#### **2.3.2.1. Feature design**

For the successful training of ANNs the normalization of the input and output data and the design of appropriate inputs (usually referred to as features) through linear or nonlinear transformations is essential. Compared to image data and convolutional neural networks, which usually take advantage of the intrinsic connection of image data and convolution, the present input data (strain data) is low-dimensional and necessarily requires sensible mechanical guidance during feature design. From a pure data-driven perspective, general batch normalization can greatly improve the prediction quality of a network. But in the present problem setting the input and output data have a clear physical nature. Therefore, based on mechanical reasoning, the consideration of the dependency of the material law on the spherical (ε¯ ◦ ) and deviatoric (ε¯ ′ ) degrees of freedom of the strain offers a material theoretic starting point. This linear transformation is addressed as

$$\underline{T}^{\text{sd1}}(\underline{\bar{\varepsilon}}) = \begin{bmatrix} \underline{\bar{\varepsilon}}^{\circ} \\ \underline{\bar{\varepsilon}}^{\prime} \end{bmatrix} = [\bar{\bar{\varepsilon}}^{\circ}, \bar{\bar{\varepsilon}}\_{1}^{\prime}, \bar{\bar{\varepsilon}}\_{2}^{\prime}, \bar{\bar{\varepsilon}}\_{3}^{\prime}, \bar{\bar{\varepsilon}}\_{4}^{\prime}, \bar{\bar{\varepsilon}}\_{5}^{\prime}]^{\mathsf{T}} \in \mathbb{R}^{6} \,. \tag{17}$$

Additionally, the deviatoric part of the strain can be split into its norm and direction

$$\underline{T}^{\text{sd2}}(\underline{\bar{\varepsilon}}) = \begin{bmatrix} \bar{\varepsilon}^{\circ} \\ \|\underline{\bar{\varepsilon}}^{\prime}\| \\ \underline{\bar{\varepsilon}}^{\prime} / \|\underline{\bar{\varepsilon}}^{\prime}\| \end{bmatrix} \in \mathbb{R}^{7} \,. \tag{18}$$

After either of these transformations, a corresponding normalization is performed in order to prepare the strain features for the subsequent evaluation through the ANN: For T<sup>sd1</sup>, each component of the vector T<sup>sd1</sup>(ε¯) is shifted by its mean and then divided by its standard deviation over the training dataset D<sup>T</sup><sub>ε</sub>, i.e., component-wise shifting and scaling are applied. For T<sup>sd2</sup>, the first component (i.e., the volumetric strain) is scaled according to the standard procedure, while the deviatoric strain amplitude is divided by its peak value and the deviatoric direction remains unchanged. In the following, the shifted and scaled inputs are referred to as x<sup>[0]</sup> ∈ ℝ<sup>D</sup> with D ∈ {6, 7}.
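The two feature maps and the component-wise normalization can be illustrated with a minimal NumPy sketch. The ordering of the 6-dimensional strain vector (a Mandel-type notation is assumed), the particular orthonormal deviatoric basis, and all function names are illustrative assumptions and are not taken from the original implementation.

```python
import numpy as np

# Assumed vector ordering: [e11, e22, e33, sqrt(2)*e23, sqrt(2)*e13, sqrt(2)*e12].
# SPH spans the spherical direction; the rows of DEV form one possible
# orthonormal basis of the 5-dimensional deviatoric subspace (illustrative).
SPH = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0]) / np.sqrt(3.0)
DEV = np.array([
    [1.0, -1.0, 0.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, -2.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 1.0],
])
DEV /= np.linalg.norm(DEV, axis=1, keepdims=True)

def t_sd1(strain):
    """Map strains of shape (n, 6) to [spherical, 5 deviatoric] coefficients."""
    strain = np.atleast_2d(strain)
    return np.column_stack([strain @ SPH, strain @ DEV.T])

def t_sd2(strain, eps=1e-14):
    """Map strains of shape (n, 6) to [spherical, |dev|, dev direction] in R^7."""
    f = t_sd1(strain)
    norm = np.linalg.norm(f[:, 1:], axis=1, keepdims=True)
    direction = f[:, 1:] / np.maximum(norm, eps)  # guard against a zero deviator
    return np.column_stack([f[:, :1], norm, direction])

def fit_standard_scaler(features):
    """Component-wise mean and standard deviation over the training set."""
    return features.mean(axis=0), features.std(axis=0)

# Synthetic stand-in for the training strains (not the data of the paper).
rng = np.random.default_rng(0)
train = 0.01 * rng.standard_normal((100, 6))
mean, std = fit_standard_scaler(t_sd1(train))
x0 = (t_sd1(train) - mean) / std  # shifted and scaled ANN inputs
```

Because the assumed basis is orthonormal, the first transformation preserves the Euclidean norm of the strain vector, which makes it easy to check an implementation.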

#### **2.3.2.2. Architecture of the artificial neural network**

In the present work, feedforward neural networks are used. This choice within the plethora of available artificial neural networks is driven by the fact that the function to be calibrated depends exclusively on the current state ε¯: the effective stress of the FOM σ¯<sup>F</sup>(ε¯). It should be remarked that for problems with history dependency, e.g., path-dependent plasticity or damage in cyclic loading, feedforward neural networks could, in principle, be considered, but recurrent neural networks offer much better alternatives. They are specially designed for time series and they feed back outputs of the model into the prediction of the subsequent cycle. Generally, the training costs of recurrent neural networks are much higher than those of feedforward neural networks, since a large number of input paths is required instead of points in the input space. For the problem at hand, recurrent neural networks offer no advantages. Hence, we choose feedforward neural networks for the rest of the present work. Hereby, networks consisting of L > 1 layers are taken into account. For each layer l = 1, . . . , L consisting of n<sup>[l]</sup> neurons, the inputs x<sup>[l−1]</sup> ∈ ℝ<sup>n[l−1]</sup> and outputs x<sup>[l]</sup> ∈ ℝ<sup>n[l]</sup> are related by weights W<sup>[l]</sup>, biases b<sup>[l]</sup> and activation functions a<sup>[l]</sup> via the recursion

$$\underline{x}^{[l]} = a^{[l]} (\underline{\underline{W}}^{[l]} \underline{x}^{[l-1]} + \underline{b}^{[l]}) \in \mathbb{R}^{n^{[l]}}, \qquad \underline{\underline{W}}^{[l]} \in \mathbb{R}^{n^{[l]} \times n^{[l-1]}}, \qquad \underline{b}^{[l]} \in \mathbb{R}^{n^{[l]}}, \tag{19}$$

complemented by n<sup>[0]</sup> = D. The weights and biases of the ANN are parameters that need to be calibrated with training data by solving an unconstrained optimization problem. The choice of activation functions is a hyperparameter that can heavily influence the quality of the surrogate. Its selection depends on the intuition of the user, complemented by thorough testing in terms of architecture sweeps. In the present context, differentiability of the stress surrogate is desired, as it allows for a computation of the tangent stiffness at low computational expense through automatic differentiation. This requirement naturally favors smooth activation functions. Our ANN implementation is based on Python3 (v3.4.3) using Google's TensorFlow library (v1.12.0), which offers automatic differentiation capabilities. For architecture tests the following activation functions have been used:


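• the identity function (Id) a(x) = x,
• the rectified linear unit (RELU) a(x) = max(0, x),
• the softplus function (SP) a(x) = log(1 + exp(x)),
• and the hyperbolic tangent (TANH) a(x) = tanh(x).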

The identity function (Id) passes its input unaltered, such that a linear combination of the activation functions of the previous layer is returned. This is particularly desired in the last layer, in order to obtain an optimized linear combination of nonlinear functions as the final output y = x<sup>[L]</sup> of the ANN. The evaluation of a single input strain through the whole ANN is addressed by the composition of all layers

$$\text{ANN}(\underline{\bar{\varepsilon}}) = \underline{x}^{[L]}(\underline{\bar{\varepsilon}}) = a^{[L]} \left( \underline{\underline{W}}^{[L]} a^{[L-1]} (\dots) + \underline{b}^{[L]} \right) . \tag{20}$$
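The layer recursion and the composition of all layers can be sketched in a few lines of NumPy. This is a plain forward pass under assumed layer sizes and a random initialization. It does not reproduce the TensorFlow implementation mentioned in the text, and the helper names are hypothetical.

```python
import numpy as np

def softplus(x):
    # Numerically stable evaluation of log(1 + exp(x)).
    return np.logaddexp(0.0, x)

def identity(x):
    return x

def init_network(layer_sizes, rng):
    """Random weights W[l] of shape (n[l], n[l-1]) and zero biases b[l]."""
    return [
        (rng.standard_normal((n_out, n_in)) / np.sqrt(n_in), np.zeros(n_out))
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]

def forward(params, activations, x0):
    """Evaluate x[l] = a[l](W[l] x[l-1] + b[l]) for l = 1, ..., L."""
    x = x0
    for (W, b), a in zip(params, activations):
        x = a(W @ x + b)
    return x

rng = np.random.default_rng(0)
sizes = [6, 16, 16, 6]  # assumed: n[0] = D = 6, two hidden layers, 6 outputs
params = init_network(sizes, rng)
activations = [softplus, softplus, identity]  # identity in the last layer
y = forward(params, activations, np.zeros(6))
```

Using the identity in the final layer returns a linear combination of the last hidden layer's outputs, which is the configuration described above.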

#### **2.3.2.3. Loss function**

The training of the ANN requires an objective function that provides an error measure respecting the nature of the outputs. In the context of ANNs, the objective function is referred to as the loss function. Similar to the inputs, the outputs, i.e., the effective stress of the FOM σ¯<sup>F</sup> defined in (9), should also be scaled using an invertible transformation

$$
\underline{p}(\underline{\bar{\varepsilon}}) = \underline{T}_{\sigma}(\underline{\bar{\sigma}}^{\mathcal{F}}(\underline{\bar{\varepsilon}})) \in \mathbb{R}^{d_{\sigma}}.\tag{21}
$$

Here, the same transformations T<sup>sd1</sup> and T<sup>sd2</sup> as for the inputs are considered for T<sub>σ</sub> during architecture testing. The evaluation of the ANN is analogously abbreviated as

$$
\underline{\tilde{p}}(\underline{\bar{\varepsilon}}) = \text{ANN}(\underline{\bar{\varepsilon}}) \in \mathbb{R}^{d\_{\sigma}}.\tag{22}
$$

In this work, the mean squared error (MSE) is chosen as the loss function

$$\text{MSE} = \frac{1}{d\_{\sigma}} \text{mean} \left( \left\| \underline{p} - \tilde{\underline{p}} \right\|^2 \right) \,. \tag{23}$$

The MSE (23) is then optimized with respect to the ANN parameters, i.e., the weights and biases are identified starting from a random initialization. The ANN output is then obtained through an inverse transformation

$$
\underline{\bar{\sigma}}^{\text{ANN}}(\underline{\bar{\varepsilon}}) = \underline{T}^{-1}\_{\sigma}(\text{ANN}(\underline{\bar{\varepsilon}}))\,. \tag{24}
$$

It should be remarked that, from the perspective of physics-informed artificial neural networks, one may also consider the incorporation of the norm of the non-symmetric part of the gradient ∂σ¯<sup>ANN</sup>/∂ε¯ in the loss function. This would help to calibrate the network such that its gradient is likely to be close to symmetric. But since this cannot be assured for arbitrary input ε¯, number of layers, neurons and activation functions, the present work prefers to solely consider (23) for the loss function, calibrate σ¯<sup>ANN</sup> as well as possible and simply symmetrize the resulting gradient ∂σ¯<sup>ANN</sup>/∂ε¯. Hereby, it should be stressed that a symmetric gradient ∂σ¯<sup>ANN</sup>/∂ε¯ is essential for the hyperelastic/pseudo-plastic material considered in this work, since the assembled system matrix of the macroscopic problem is symmetric by the corresponding material theory. A non-symmetric system matrix in the macroscopic problem would also increase the computational costs, since solvers for non-symmetric matrices would then be required.
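The a posteriori symmetrization and a simple measure of the non-symmetric part can be sketched as follows. The 6 × 6 matrix is a synthetic stand-in for the gradient obtained by automatic differentiation, and a vector notation in which matrix symmetry corresponds to the major symmetry of the tangent is assumed.

```python
import numpy as np

def symmetrize(tangent):
    """Return the symmetric part of a (6, 6) tangent matrix."""
    return 0.5 * (tangent + tangent.T)

def relative_asymmetry(tangent):
    """Frobenius norm of the skew part relative to the norm of the matrix."""
    skew = 0.5 * (tangent - tangent.T)
    return np.linalg.norm(skew) / np.linalg.norm(tangent)

rng = np.random.default_rng(1)
base = rng.standard_normal((6, 6))
# Hypothetical tangent: symmetric positive definite matrix plus a small
# non-symmetric perturbation mimicking an almost symmetric ANN gradient.
C_ann = base @ base.T + 6.0 * np.eye(6) + 0.01 * rng.standard_normal((6, 6))
C_sym = symmetrize(C_ann)
asym = relative_asymmetry(C_ann)
```

A quantity such as `relative_asymmetry` could also serve as the additional loss term mentioned above, although that variant is not pursued in the text.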

The quality of the ANN during training is checked, not with respect to the training dataset, but with the validation dataset D<sup>V</sup><sub>ε</sub> via the mean relative norm error (MRNE)

$$\text{MRNE} = \underset{\mathrm{D}_{\varepsilon}^{\mathrm{V}}}{\text{mean}} \left( \frac{\|\underline{\bar{\sigma}}^{\mathrm{F}} - \underline{\bar{\sigma}}^{\text{ANN}}\|}{\|\underline{\bar{\sigma}}^{\mathrm{F}}\|} \right). \tag{25}$$

In addition to that, the mean coefficient of determination R<sup>2</sup><sub>σ</sub> of the effective stress is evaluated

$$R\_{\sigma}^{2} = \frac{1}{6} \sum\_{i=1}^{6} R\_{i}^{2}, \quad R\_{i}^{2} = 1 - \frac{\text{mean}\left(\left(\bar{\sigma}\_{i}^{\text{F}} - \bar{\sigma}\_{i}^{\text{ANN}}\right)^{2}\right)}{\text{mean}\left((\bar{\sigma}\_{i}^{\text{F}})^{2}\right) - \left(\text{mean}(\bar{\sigma}\_{i}^{\text{F}})\right)^{2}}. \tag{26}$$

The coefficient of determination is bounded from above by one, which is attained if and only if the surrogate coincides with the reference for all queries.
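Both quality measures, (25) and (26), translate directly into array operations. The following sketch assumes that the reference and surrogate stresses are stored as arrays of shape (number of samples, 6). The data are synthetic and the function names are illustrative.

```python
import numpy as np

def mrne(sigma_ref, sigma_model):
    """Mean relative norm error over all samples, cf. (25)."""
    num = np.linalg.norm(sigma_ref - sigma_model, axis=1)
    den = np.linalg.norm(sigma_ref, axis=1)
    return np.mean(num / den)

def r2_sigma(sigma_ref, sigma_model):
    """Mean coefficient of determination over the six components, cf. (26)."""
    mse_i = np.mean((sigma_ref - sigma_model) ** 2, axis=0)
    var_i = np.mean(sigma_ref ** 2, axis=0) - np.mean(sigma_ref, axis=0) ** 2
    return np.mean(1.0 - mse_i / var_i)

rng = np.random.default_rng(2)
sigma_ref = 100.0 * rng.standard_normal((50, 6))  # synthetic reference stresses
sigma_model = 1.01 * sigma_ref                    # surrogate with 1% overshoot

mrne_value = mrne(sigma_ref, sigma_model)
r2_value = r2_sigma(sigma_ref, sigma_model)
```

For the uniformly scaled surrogate in this example, every sample has a relative norm error of 1%, so the MRNE equals 0.01, while the coefficient of determination stays slightly below one.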

#### 2.3.3. Surrogate Model for the Error in the Quantity of Interest

#### **2.3.3.1. Error regression and classification**

In this section, we are interested in the calibration of ANNs taking strain data as input and delivering quantitative and qualitative error estimates for the stress. On the one hand, for a given strain, it might be of interest to predict the error of the stress surrogate against the FOM stress. On the other hand, it might not be of particular interest to know the exact error value, but rather to know whether the error is acceptable, i.e., whether it is smaller than a prescribed tolerance. The quantitative error prediction leads to a classical regression problem, whereas the binarized response gives rise to an ordinary classification problem.

In the error regression problem, for a given model σ¯<sup>M</sup> ∈ {σ¯<sup>RN</sup>, σ¯<sup>ANN</sup>} of the effective stress, we are interested in the absolute and relative norm errors

$$e\_a^{\mathcal{M}}(\bar{\underline{\varepsilon}}) = \left\| \bar{\underline{\sigma}}^{\mathcal{F}}(\bar{\underline{\varepsilon}}) - \bar{\underline{\sigma}}^{\mathcal{M}}(\bar{\underline{\varepsilon}}) \right\| \; \; \; \quad e\_r^{\mathcal{M}}(\bar{\underline{\varepsilon}}) = \frac{\left\| \bar{\underline{\sigma}}^{\mathcal{F}}(\bar{\underline{\varepsilon}}) - \bar{\underline{\sigma}}^{\mathcal{M}}(\bar{\underline{\varepsilon}}) \right\|}{\left\| \bar{\underline{\sigma}}^{\mathcal{F}}(\bar{\underline{\varepsilon}}) \right\|} . \; \tag{27}$$

For the error classification problem, we consider the indicator function

$$\chi^{\mathcal{M}}(\underline{\bar{\varepsilon}}) = \begin{cases} 1 & \text{if } e^{\mathcal{M}}_a(\underline{\bar{\varepsilon}}) < \tau_a \text{ or } e^{\mathcal{M}}_r(\underline{\bar{\varepsilon}}) < \tau_r, \\ 0 & \text{else}, \end{cases} \tag{28}$$

with prescribed absolute and relative tolerances τ<sub>a</sub> and τ<sub>r</sub>, respectively. The outcome of χ<sup>M</sup> is particularly useful in order to decide on the subsequent treatment: For χ<sup>M</sup> = 1, the error is considered acceptable and the surrogate can be used, while χ<sup>M</sup> = 0 should trigger an adaptive refinement. For instance, the classifier χ<sup>M</sup> can decide whether the stress surrogate σ¯<sup>M</sup> at a macroscopic integration point is acceptable or whether a more dedicated surrogate is needed.
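The error measures (27) and the indicator (28) can be evaluated sample-wise as sketched below. The stress values are made up for illustration (in MPa), while the tolerances of 2 MPa and 0.02 are the ones used later in section 3.2.2.

```python
import numpy as np

def norm_errors(sigma_ref, sigma_model):
    """Absolute and relative norm errors per sample, cf. (27)."""
    e_abs = np.linalg.norm(sigma_ref - sigma_model, axis=1)
    e_rel = e_abs / np.linalg.norm(sigma_ref, axis=1)
    return e_abs, e_rel

def indicator(sigma_ref, sigma_model, tau_a, tau_r):
    """Return 1 where either tolerance is met and 0 elsewhere, cf. (28)."""
    e_abs, e_rel = norm_errors(sigma_ref, sigma_model)
    return ((e_abs < tau_a) | (e_rel < tau_r)).astype(int)

# Illustrative uniaxial stress states (hypothetical numbers).
sigma_ref = np.array([[100.0, 0, 0, 0, 0, 0],
                      [10.0, 0, 0, 0, 0, 0],
                      [200.0, 0, 0, 0, 0, 0]])
sigma_model = np.array([[101.0, 0, 0, 0, 0, 0],   # e_a = 1, e_r = 0.01
                        [13.0, 0, 0, 0, 0, 0],    # e_a = 3, e_r = 0.3
                        [203.0, 0, 0, 0, 0, 0]])  # e_a = 3, e_r = 0.015
chi = indicator(sigma_ref, sigma_model, tau_a=2.0, tau_r=0.02)
```

The third sample shows the effect of the "or" in (28): its absolute error exceeds the absolute tolerance, but the relative tolerance is met, so the sample is still marked as acceptable.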

For error regression and classification, the fully connected feedforward ANNs as described by (19) and the same activation functions as in section 2.3.2.2 are used. For the binary classification, the final ANN layer consists of a single neuron whose output is regarded as a log-odds value. Such outputs are usually referred to as logits in binary classification.

#### **2.3.3.2. Loss function**

One of the desired properties, considering possible safety requirements in the error regression and classification, is to obtain if not accurate, then at least conservative results. In order to achieve a conservative behavior, for the error regression problem we consider the function

$$\phi\_{\alpha}(\mathbf{x}) = \max(\mathbf{x}, \mathbf{0}) + \alpha \max(-\mathbf{x}, \mathbf{0})\,,\tag{29}$$

which weights negative input values by the factor α. The function φ<sub>α</sub> can be used to penalize underestimation of the error (for α > 1) when applied to the scalar argument of the MSE for the true error e<sup>M</sup> (representing the absolute error e<sup>M</sup><sub>a</sub> or the relative error e<sup>M</sup><sub>r</sub> of the model M ∈ {RN, ANN}) and its ANN surrogate e˜<sup>M</sup>

$$\text{MSE}_{\alpha} = \underset{\mathrm{D}_{\varepsilon}^{\mathrm{T}}}{\text{mean}} \left( |\phi_{\alpha}(e^{\mathcal{M}}(\underline{\bar{\varepsilon}}) - \tilde{e}^{\mathcal{M}}(\underline{\bar{\varepsilon}}))|^{2} \right). \tag{30}$$

MSE<sub>α</sub> is considered as the loss function for error regression, where α acts as a penalty parameter. The corresponding R<sup>2</sup> value and the relative conservative amount (RCA) over the validation dataset

$$R_{e}^{2} = 1 - \frac{\underset{\mathrm{D}_{\varepsilon}^{\mathrm{V}}}{\text{mean}}\left((e^{\mathcal{M}} - \tilde{e}^{\mathcal{M}})^{2}\right)}{\underset{\mathrm{D}_{\varepsilon}^{\mathrm{V}}}{\text{mean}}\left((e^{\mathcal{M}})^{2}\right) - \left(\underset{\mathrm{D}_{\varepsilon}^{\mathrm{V}}}{\text{mean}}(e^{\mathcal{M}})\right)^{2}},$$

$$\text{RCA}_{e} = \frac{\#\{\underline{\bar{\varepsilon}} \in \mathrm{D}_{\varepsilon}^{\mathrm{V}} : e^{\mathcal{M}}(\underline{\bar{\varepsilon}}) \le \tilde{e}^{\mathcal{M}}(\underline{\bar{\varepsilon}})\}}{\#\{\mathrm{D}_{\varepsilon}^{\mathrm{V}}\}}\tag{31}$$

are used to assess the quality of the prediction.
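The asymmetric penalty and the relative conservative amount can be sketched as follows. The sketch passes the difference e˜ − e to φ<sub>α</sub>, so that the α-weighted negative branch is active whenever the surrogate underestimates the true error. This sign convention is an assumption made here to match the stated goal of penalizing underestimation for α > 1. The error values and names are illustrative.

```python
import numpy as np

def phi_alpha(x, alpha):
    """Piecewise linear function (29): x for x >= 0 and alpha * |x| for x < 0."""
    return np.maximum(x, 0.0) + alpha * np.maximum(-x, 0.0)

def mse_alpha(e_true, e_pred, alpha):
    # Assumed sign convention: a negative argument means the predicted error
    # is smaller than the true error, i.e., the error is underestimated.
    return np.mean(phi_alpha(e_pred - e_true, alpha) ** 2)

def rca(e_true, e_pred):
    """Fraction of samples with a conservative prediction, cf. (31)."""
    return np.mean(e_true <= e_pred)

e_true = np.array([1.0, 2.0, 3.0, 4.0])  # hypothetical true errors
e_over = e_true + 0.5                    # conservative predictions
e_under = e_true - 0.5                   # non-conservative predictions

loss_over = mse_alpha(e_true, e_over, alpha=4.0)
loss_under = mse_alpha(e_true, e_under, alpha=4.0)
rca_over, rca_under = rca(e_true, e_over), rca(e_true, e_under)
```

With α = 4, the same deviation of 0.5 contributes 0.25 to the loss when it is conservative and 4.0 when it is not, which is the mechanism that drives the trained network toward a higher RCA.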

For the error classification of model M ∈ {RN, ANN}, due to the binary nature of (28), the last layer of the ANN is defined as the composition of a standard sigmoid function and a shifted step function, i.e.,

$$\tilde{\chi}^{\mathcal{M}}(\underline{\bar{\varepsilon}}) = s \circ \tilde{\chi}^{\mathcal{M}}_{0}(\underline{\bar{\varepsilon}}), \qquad \tilde{\chi}^{\mathcal{M}}_{0}(\underline{\bar{\varepsilon}}) = \frac{1}{1 + \exp(-\text{ANN}(\underline{\bar{\varepsilon}}))},$$

$$s(x) = \begin{cases} 1 & x > 1/2, \\ 0 & \text{else}. \end{cases} \tag{32}$$

The loss function for classification chosen in this work is the weighted binary cross entropy

$$\eta_{w} = -\text{mean}\left(w\,\chi^{\mathcal{M}}(\underline{\bar{\varepsilon}})\,\log\left(\tilde{\chi}_{0}^{\mathcal{M}}(\underline{\bar{\varepsilon}})\right) + (1-\chi^{\mathcal{M}}(\underline{\bar{\varepsilon}}))\,\log\left(1-\tilde{\chi}_{0}^{\mathcal{M}}(\underline{\bar{\varepsilon}})\right)\right). \tag{33}$$

Herein, false positive predictions dominate the cross entropy for w > 1, while 0 < w < 1 puts the focus on false negative classification. We define the overall accuracy of the classifier as the expectation of finding the same response in the true indicator χ<sup>M</sup> and in the surrogate χ˜<sup>M</sup>:

$$\text{ACC} = 1 - \underset{\mathrm{D}_{\varepsilon}^{\mathrm{V}}}{\text{mean}} \left( |\tilde{\chi}^{\mathcal{M}} - \chi^{\mathcal{M}}| \right). \tag{34}$$

Further, the accuracy within the bin b ∈ {0, 1} is defined as the conditional probability

$$\text{ACC}_b = 1 - \underset{\{\underline{\bar{\varepsilon}} \in \mathrm{D}_{\varepsilon}^{\mathrm{V}} : \chi^{\mathcal{M}}(\underline{\bar{\varepsilon}}) = b\}}{\text{mean}} \left( |\tilde{\chi}^{\mathcal{M}} - \chi^{\mathcal{M}}| \right). \tag{35}$$

The reader should note that ACC<sub>0</sub> is more relevant when seeking conservative estimates. The overall classification is robust only if both ACC<sub>0</sub> and ACC<sub>1</sub> are close to unity; even for a seemingly good ACC (e.g., around 0.98), the critical ACC<sub>0</sub> could be inappropriate. This effect is particularly important if the surrogate has only few outliers requiring further processing.
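The classifier output (32), the weighted cross entropy (33), and the accuracy measures (34) and (35) can be sketched with NumPy. The labels and logits below are invented so that the overall accuracy is 0.98 while the accuracy in the bin b = 0 is only 0.5. The function names are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def weighted_bce(chi_true, logits, w, eps=1e-12):
    """Weighted binary cross entropy (33); w scales the term with chi_true = 1."""
    p = np.clip(sigmoid(logits), eps, 1.0 - eps)
    return -np.mean(w * chi_true * np.log(p) + (1.0 - chi_true) * np.log(1.0 - p))

def classify(logits):
    """Sigmoid followed by the shifted step function, cf. (32)."""
    return (sigmoid(logits) > 0.5).astype(int)

def accuracy(chi_true, chi_pred):
    return 1.0 - np.mean(np.abs(chi_pred - chi_true))

def bin_accuracy(chi_true, chi_pred, b):
    mask = chi_true == b
    return 1.0 - np.mean(np.abs(chi_pred[mask] - chi_true[mask]))

# Hypothetical validation set: 96 acceptable samples and 4 outliers, two of
# which are wrongly classified as acceptable.
chi_true = np.array([1] * 96 + [0] * 4)
logits = np.array([3.0] * 96 + [3.0, 3.0, -3.0, -3.0])
chi_pred = classify(logits)

acc = accuracy(chi_true, chi_pred)
acc_0 = bin_accuracy(chi_true, chi_pred, 0)
acc_1 = bin_accuracy(chi_true, chi_pred, 1)
loss = weighted_bce(chi_true, logits, w=1.0)
```

The example reproduces the situation described above: an overall accuracy of 0.98 hides the fact that half of the critical samples in the bin b = 0 are misclassified.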

#### 2.4. Hybrid ANN/ROM Multi-Level Finite Element Simulation

#### 2.4.1. General Hybrid Approach

In order to build a twoscale simulation model relying on the finite element method on the larger scale, the material model must be replaced by the homogenized response of the heterogeneous solid. In sections 2.1.2 and 2.3.2, the use of ROM and ANN as surrogates for the effective stress tensor and the effective tangent stiffness is described in detail. Both surrogates can be combined by introducing an indicator function χ(**x**): Ω → {0, 1}, which adaptively selects between the rapid and purely data-driven (but less physical) ANN if χ = 1 and the physics-driven ROM if χ = 0. The indicator function represents the binarized confidence in the accuracy of the ANN surrogate.

First, a simple ansatz for χ is chosen by setting χ to one if the current strain at the macroscopic position **x** ∈ Ω falls within the region covered by samples during the training of the ANN. In the present study this is equivalent to the kinematic indicator

$$\chi^{\text{K}}(\bar{\boldsymbol{x}}) = \begin{cases} 1 & \text{if } \|\underline{\bar{\varepsilon}}(\bar{\boldsymbol{x}})\|_{W} \le \varepsilon_{0}, \\ 0 & \text{else}. \end{cases} \tag{36}$$

Here, ε<sub>0</sub> = max(D<sub>r</sub>) is the peak amplitude used during Concentric Sampling and ‖·‖<sub>W</sub> denotes a weighted norm that transforms elements of D<sub>ε</sub> defined via (16) back into normalized directions:

$$\|\underline{\bar{\varepsilon}}\|_{W} = \sqrt{\hat{r}^2 \left\| \text{sph}(\underline{\bar{\varepsilon}}) \right\|_2^2 + \left\| \text{dev}(\underline{\bar{\varepsilon}}) \right\|_2^2}. \tag{37}$$
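The kinematic indicator (36) with the weighted norm (37) can be sketched as follows. The values rˆ = 5 and ε<sub>0</sub> = 0.04 are taken from section 3.2.1, whereas the Mandel-type vector ordering of the strain and the function names are assumptions of this sketch.

```python
import numpy as np

# Assumed ordering: [e11, e22, e33, sqrt(2)*e23, sqrt(2)*e13, sqrt(2)*e12].
SPH = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0]) / np.sqrt(3.0)

def weighted_norm(strain, r_hat):
    """Weighted norm (37) of a single strain vector of length 6."""
    sph = (strain @ SPH) * SPH  # spherical part
    dev = strain - sph          # deviatoric part
    return np.sqrt(r_hat**2 * (sph @ sph) + dev @ dev)

def kinematic_indicator(strain, r_hat, eps0):
    """Return 1 inside the sampled region of strain space and 0 outside."""
    return int(weighted_norm(strain, r_hat) <= eps0)

r_hat, eps0 = 5.0, 0.04
shear_like = np.array([0.02, -0.02, 0.0, 0.0, 0.0, 0.0])  # purely deviatoric
volumetric = 0.006 * np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])

chi_shear = kinematic_indicator(shear_like, r_hat, eps0)
chi_vol = kinematic_indicator(volumetric, r_hat, eps0)
```

The weighting makes the indicator far more restrictive for volumetric strains: the purely volumetric state in the example has a smaller Euclidean norm than the deviatoric one, yet it falls outside of the training region.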

The use of the ROM outside of the training domain is motivated by its relation to energy minimization, i.e., by preserving the key physical characteristics of the full order model while restricted to a relevant subspace of the solution manifold.

A second indicator can be obtained by evaluating the accuracy of the ANN. To this end, a binary classifier χ˜<sup>ANN</sup>: Sym(ℝ<sup>3×3</sup>) → {0, 1} is employed following the procedure outlined in section 2.3.3. The indicator function is then replaced by the classifier: χ(**x**) = χ˜<sup>ANN</sup>(ε¯(**x**)).

#### 2.4.2. Technical Issues Related to on-the-fly Model Switching

At first, the concept of the indicator function χ marking the confidence region for the ANN and employing the ROM elsewhere sounds straightforward. However, this simple approach does not work in practice, as the two concurrent surrogates do not provide continuous approximations of the stresses. This can be illustrated by letting C ⊆ Sym(ℝ<sup>3×3</sup>) denote the confidence region of the ANN in strain space. It should be noted that C may contain several holes, depending on the chosen quality indicator determining a point or region in strain space as admissible or not. On the boundary ∂C of the confidence region there is a hard transition between the two surrogates, which induces a stress jump, leading to a non-smooth material response. When switching between ANN and ROM on-the-fly, i.e., when deciding for each query adaptively which surrogate should be evaluated, convergence of the macroscopic problem is disrupted, rendering the straightforward implementation of a quality-indicator-guided adaptive procedure infeasible. One may try to solve this problem with multi-fidelity approaches, see, e.g., Meng and Karniadakis (2019), where multiple nested surrogates (e.g., artificial neural networks) based on data groups of different accuracy/fidelity and amounts are trained. Unfortunately, such multi-fidelity data approaches are not applicable for the problem at hand. In order to motivate this more clearly, consider again **Figure 1** and the strategy illustrated in **Figure 2** for a macroscopic boundary value problem solved with FE and calling for an on-the-fly model switching at the integration points for the computation of σ¯ for prescribed ε¯.

FIGURE 2 | Macroscopic FE boundary value problem (P¯) with on-the-fly model switching at integration points for the computation of the effective microscopic stress σ¯ for prescribed microscopic effective strain ε¯ for the microscopic problem (P): first, χ<sup>K</sup> checks if ε¯ is in the training region of the ANN surrogate for σ¯; if the quality of the ANN is acceptable based on χ˜<sup>ANN</sup>, then σ¯<sup>ANN</sup> is evaluated and passed to the macroscopic FE problem; but if either ε¯ is outside of the ANN training range or the ANN surrogate is not accurate enough, then a previously selected accurate ROM of corresponding dimension N is evaluated and then passed to the macroscopic problem.

In the context of twoscale simulations, the problem is not the accuracy/fidelity of the training data of the microscopic problem, but (1) the usage of a surrogate outside of its training range (based on χ<sup>K</sup> for the ANN stress surrogate) and (2) the point-wise quality of the surrogate with respect to prescribed tolerances (χ˜<sup>ANN</sup> for the ANN effective stress surrogate), which define the boundary of the confidence region C and trigger the model switching. Both events can occur in twoscale simulations, since the input field at the macroscopic scale (i.e., ε¯(**x**¯)) is not known for arbitrary macroscopic geometry and boundary conditions, such that point-wise at the macroscopic scale the ANN microscopic surrogate for the effective stress may be evaluated far outside of its training range or may be inaccurate. If the ANN effective stress surrogate is inaccurate, then, e.g., a fixed ROM of sufficient accuracy can be initiated, as depicted in **Figure 2**. Naturally, in order to lower the number of ROM evaluations, one could simply enhance the existing networks σ¯<sup>ANN</sup> and χ˜<sup>ANN</sup> during the online computation by re-training using additional samples. However, there is no methodology available that can a priori guarantee accuracy gains without the need for extensive architecture sweeps and substantial sampling of extended and/or refined regions in the input space. Therefore, such an online re-training is not a viable option at the moment and alternatives need to be investigated. Contrary to the inherent properties of ANNs and the related training, (i) the ROM solution is obtained in a physically guided procedure, (ii) the errors of the ROMs drop with increasing dimension, and (iii) the ROM has no intrinsic validity domain limitation in strain space. This motivates the use of a ROM of sufficient dimension outside of the validity domain of the ANN stress surrogate. Approaches for the algorithmic realization of the dynamic switching between concurrent surrogates are described in the sequel.

#### **2.4.2.1. Staggered hybrid ANN/ROM algorithm**

The first approach consists of a staggered procedure, where the ANN is used as the only stress surrogate in a first run of the twoscale simulation (see Algorithm 1). Thereby, a first overall response is gathered. This is followed by a second run, in which all integration points having seen a zero quality indicator during any of the load steps of the first run are forced to use the ROM surrogate. This set is then kept constant, i.e., switching from ANN to ROM is one-way. This procedure enables the use of the ANN solution as an initial guess for the subsequent hybrid run, which leads to low iteration counts and improved performance. During the second run, the difference of the ANN and the ROM can be evaluated to provide valuable post-processing data in order to better understand the quantitative impact of the model modifications, see also the examples in section 3.3.2. Two major disadvantages of this approach are (i) the irreversibility of the ROM activation, which can lead to substantial computational costs, and (ii) the possible failure during the first run, if the ANN surrogate becomes non-convergent. The latter can, e.g., occur if the local magnitude of ε¯ on the macroscale falls far outside of the range of the training data.

**Algorithm 1:** Staggered hybrid ANN/ROM twoscale simulation algorithm.

**Algorithm 2:** Adaptive on-the-fly ANN/ROM twoscale simulation algorithm.
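The selection of the ROM points for the second run of the staggered procedure can be sketched as follows. The indicator history is a hypothetical array recorded during the first (ANN-only) run. The sketch illustrates only the selection rule described in the text, not the complete Algorithm 1.

```python
import numpy as np

def rom_points_for_second_run(indicator_history):
    """Select the integration points that must use the ROM in the second run.

    indicator_history: array of shape (n_load_steps, n_points) holding the
    0/1 quality indicator recorded in the first run (hypothetical layout).
    """
    return np.any(indicator_history == 0, axis=0)

# Three load steps, three integration points (illustrative values).
history = np.array([[1, 1, 1],
                    [1, 0, 1],
                    [1, 1, 1]])
use_rom = rom_points_for_second_run(history)
```

The second integration point is assigned to the ROM for the entire second run although its indicator returns to one in the last load step, which reflects the irreversibility named as the first disadvantage.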

#### **2.4.2.2. Adaptive on-the-fly ANN/ROM algorithm**

A second on-the-fly model selection procedure, solving both of the aforementioned issues, is described in Algorithm 2: It re-initializes the quality indicator in favor of the ANN at the beginning of each load increment. During the subsequent nonlinear Newton-Raphson iterations of the same increment, the indicator is updated in a monotonic way, i.e., switching from ANN to ROM is allowed but not vice versa (see line 5 in Algorithm 2). The computational efficiency can be improved by substituting only part of the equilibrium iteration by the ROM.
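The re-initialization and the monotonic update of the indicator can be illustrated in isolation. The freshly evaluated indicators per Newton-Raphson iteration are supplied as hypothetical arrays, and the function name is an assumption; the sketch does not reproduce the complete Algorithm 2.

```python
import numpy as np

def run_increment(chi_per_iteration, n_points):
    """Monotone indicator update within a single load increment.

    chi_per_iteration: iterable of 0/1 arrays, one per Newton-Raphson
    iteration, with the newly evaluated indicator at each integration point.
    """
    chi = np.ones(n_points, dtype=int)  # re-initialize in favor of the ANN
    for chi_new in chi_per_iteration:
        chi = np.minimum(chi, chi_new)  # ANN -> ROM allowed, never ROM -> ANN
    return chi

# Two load increments with three integration points (illustrative values).
increment_1 = [np.array([1, 1, 1]), np.array([1, 0, 1]), np.array([1, 1, 1])]
increment_2 = [np.array([1, 1, 1]), np.array([1, 1, 1])]
chi_1 = run_increment(increment_1, 3)  # second point stays on the ROM
chi_2 = run_increment(increment_2, 3)  # fresh start: all points use the ANN
```

Keeping the indicator monotone within an increment avoids repeated switching between the two surrogates during the equilibrium iterations, while the reset at the next increment removes the irreversibility of the staggered approach.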

#### 3. NUMERICAL EXAMPLES

#### 3.1. Underlying Material Model

An artificial heterogeneous solid consisting of three phases is investigated. It consists of a laminate structure of two pseudo-plastic materials, where the two layers share the same elastic parameters (E<sub>1</sub> = E<sub>2</sub> = 75 GPa, ν<sub>1</sub> = ν<sub>2</sub> = 0.3) but have different yield strength and hardening behavior: The first layer has a yield stress of 100 MPa and a linear hardening slope of 2,000 MPa, whereas the second layer has a yield stress of 115 MPa in the absence of hardening. The third phase is represented by a spherical inclusion that is centered on the interface of the two layers. The inclusion is assumed linear elastic with properties mimicking a ceramic inclusion made of SiC (E = 400 GPa, ν = 0.2), see **Figure 3**. The volume fractions of the two plastic layers are 46.73% each and that of the inclusion is 6.54%. The material was designed to induce a directional dependency of the effective material behavior (see right plot in **Figure 3** for an example). This feature makes the identification of the unknown homogenized response more challenging and thereby provides a benchmark problem for the developed methodology.

#### 3.2. Quantitative Comparison of ROM and ANN Surrogate Models

#### 3.2.1. Effective Stress Surrogate

The strain space is sampled as described in section 2.3.1 for an effective strain amplitude discretization D<sub>r</sub> = {0.0005, 0.002, 0.0035, 0.005, 0.0075, 0.01, 0.015, 0.025, 0.04}.

The spherical/volumetric part of the primal strain dataset is rescaled with rˆ = 5. Then, 1152 training, 288 validation and 512 Monte Carlo directions are generated, yielding 10368 training, 2592 validation and 4608 Monte Carlo effective strain points in ℝ<sup>6</sup>.
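The reported dataset sizes follow from combining every direction with each of the nine amplitudes in D<sub>r</sub>, as the following short check shows. The variable names are illustrative.

```python
# Amplitude discretization D_r and the number of sampled directions per dataset.
amplitudes = [0.0005, 0.002, 0.0035, 0.005, 0.0075, 0.01, 0.015, 0.025, 0.04]
directions = {"training": 1152, "validation": 288, "Monte Carlo": 512}

# Each direction is combined with every amplitude.
points = {name: n * len(amplitudes) for name, n in directions.items()}
total = sum(points.values())
```

In total, 17568 effective strain points are available across the three datasets.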

An initial architecture testing phase is conducted. The activation functions and transformations illustrated in section 2.3.2 are considered, together with a varying number of layers and neurons. The architecture test with L ∈ {3, . . . , 6} and number of neurons per hidden layer n<sup>[l]</sup> ∈ {16, 32, 64, 128}, l ∈ {1, . . . , L − 1}, yields that none of the activation functions (RELU, SP, TANH) shows a remarkable advantage over the others, even for as many as 10,000 epochs with whole batch training at a learning rate of 0.001 using an ADAM optimizer. However, the feature design of input (effective strain) and output data (effective stress of the FOM) has a major influence. Hereby, the most successful combination is identified to be the use of the spherical-deviatoric transformation T<sup>sd1</sup> for the input as well as for the output. The transformation T<sup>sd2</sup> did not show major advantages in the final objective function values.

Based on the initial architecture testing, the softplus function (SP) has been chosen for further investigations, due to its monotonicity and differentiability in view of an expected monotonic stress behavior and the need for tangent operators in future FE multiscale computations. In **Table 1**, different architectures are tabulated, showing the performance of each ANN. Based on the MRNE and R<sup>2</sup><sub>σ</sub> values for the validation dataset (and the corresponding values MRNE<sub>MC</sub> and R<sup>2</sup><sub>σMC</sub> evaluating the MC dataset), ANN1, comprised of six layers with five softplus hidden layers and 128 neurons per hidden layer, is chosen for the final evaluation. In **Figure 4** the prediction of ANN1 for the von Mises effective stress σ¯<sub>vM</sub> is depicted for the three directions tabulated in **Table 2** of the training (dirT12, dirT23, and dirTmixed) and validation datasets (dirV12, dirV23, and dirVmixed), showing a good agreement with the FOM data. It should be noted that the directions dirT/V12 have a (12) dominant component, meaning that the hardening material shown in **Figure 3** is activated, while dirT/V23 have a (23) dominant component allowing for a localization of the deformation in the non-hardening material, see **Figure 4**. The effective strain directions dirT/Vmixed show some examples for combined loading and the corresponding material response, see **Figure 4**. The reader should take into account that the ANNs have been trained with strain data up to a norm of 0.04 in the primal strain set Dˆ<sup>T</sup><sub>ε</sub> (corresponding to the last data point for each loading direction in **Figure 4**). Beyond this norm value, the output of ANN1 was expected to keep increasing due to the properties of the softplus function.

TABLE 1 | ANNs for the effective stress surrogate with corresponding choice of input features, network architecture, intermediate transformation of stress data T<sub>σ</sub>, measures MRNE and R<sup>2</sup><sub>σ</sub> for the validation dataset, and MRNE<sub>MC</sub> and R<sup>2</sup><sub>σMC</sub> for the MC dataset.

TABLE 2 | Effective strain load directions ε¯/‖ε¯‖ for the inspection of the effective von Mises stress σ¯<sub>vM</sub> in the evaluation of ANN1.


However, due to the tendency of the ANN to increasingly overestimate the stresses and the artificial stiffening at load amplitudes beyond the training data, ANN1 is not expected to deliver accurate results beyond an effective strain norm of approximately 0.04 with respect to the primal strain set Dˆ<sup>T</sup><sub>ε</sub>. Finally, in addition to the a posteriori symmetrization of the gradient ∂σ¯<sup>ANN1</sup>/∂ε¯, numerical tests were carried out to verify that (i) the gradient obtained via automatic differentiation is almost symmetric (with an average error lower than 1%) and (ii) the symmetrized gradient matches the algorithmic tangent of the ROM with 96 modes up to relative errors of around 1.5%. These two checks support the chosen approach. For better transparency of these results, the authors offer **Supplemental Data**, see section Supplementary Material, containing the FOM data, the trained ANN1 and commands for the reproduction of all corresponding results.

#### 3.2.2. Error Surrogates

For the error regression and classification, it is first necessary to gain an overview regarding the quality of the N-dimensional ROMs and of the best of the trained ANN effective stress surrogates σ¯ ANN1 of the previous section.

In **Figure 5** the cumulative distribution functions of the absolute norm error e<sup>M</sup><sub>a</sub> (ANE) and of the relative norm error e<sup>M</sup><sub>r</sub> (RNE) for the validation set D<sup>V</sup><sub>ε</sub> are shown for ROMs of different dimensions N and for σ¯<sup>ANN1</sup>. It is clearly visible that for increasing ROM dimension, the accuracy of the ROM improves for both the ANE and the RNE. This is expected, since the higher the ROM dimension, the richer the underlying function space, i.e., the distance to the solution manifold of the full order model decreases. It should be noted that the ANN effective stress model σ¯<sup>ANN1</sup> performs well against ROM16 and ROM24. The ROM32 yields a mean ANE of 1.019 MPa and a mean RNE of 0.007. It is from now on assumed that the accuracy of the ROM32 suffices for future multiscale FE simulations, i.e., an a priori quality assessment is made.

TABLE 3 | ANNs for error regression with corresponding choice of input feature, network architecture, penalty parameter α, corresponding quality indicators R<sup>2</sup><sub>e</sub> and RCA<sub>e</sub> for the validation dataset and R<sup>2</sup><sub>eMC</sub> and RCA<sub>eMC</sub> for the MC dataset.

We first demonstrate the error regression solely for the N-dimensional ROMs and the corresponding ANE and RNE. These error measures could be used for an adaptive selection of a ROM once its estimated errors are available. An architecture test for ANNs with number of layers L ∈ {3, . . . , 6}, neurons per hidden layer n<sup>[l]</sup> ∈ {16, 32, 64}, up to 10,000 epochs and whole batch training is performed. A selection of the trained ANNs is tabulated in **Table 3**.

The ANNs e˜<sub>a</sub><sup>R16|1/2</sup>, tabulated in **Table 3**, are depicted in **Figure 6**. The influence of the penalty parameter α, introduced in (29), can be seen in **Figure 6** (left plot), where it becomes visible that the majority of points are found on the upper side of the diagonal, i.e., the predicted error is larger than the error in the validation set. This is reflected in the relative conservative amount (RCA), see **Table 3**. The usual trade-off is that increasing α yields conservative behavior (i.e., a higher RCA), but reduces the accuracy in terms of R<sup>2</sup><sub>e</sub>. Analogous behavior is observed for e<sub>r</sub>, as tabulated in **Table 3** (bottom half). Large values of α yield reduced R<sup>2</sup><sub>e</sub> values, due to the dilemma of balancing a reduction of the loss function while preserving conservative behavior.


TABLE 4 | ANNs for error classification for σ¯ R16/24/32 and σ¯ ANN1 of the previous section with tolerances τ<sup>a</sup> = 2 MPa and τ<sup>r</sup> = 0.02.

The error classification is conducted for the absolute and relative tolerances τ<sup>a</sup> = 2 MPa and τ<sup>r</sup> = 0.02, respectively. Architecture testing for L ∈ {3, . . . , 6} and n<sup>[l]</sup> ∈ {16, 32, 64} for the hidden layers yields results of varying quality depending on the weight w on the false positive. Depending on the number of positive and negative outcomes, the weights should be adapted. For the architecture testing of this work, the ratio

$$w_0 = \frac{\#(\mathbf{D}_\varepsilon^\mathrm{T} : \chi(\bar{\boldsymbol{\varepsilon}}) = 0)}{\#(\mathbf{D}_\varepsilon^\mathrm{T} : \chi(\bar{\boldsymbol{\varepsilon}}) = 1)} \tag{38}$$

is considered. If the number of negative outcomes in the training data #(D<sup>T</sup><sub>ε</sub> : χ(ε¯) = 0) is higher than the number of positives, then w<sub>0</sub> > 1 holds. The consideration of w = w<sub>0</sub> in the binary cross entropy partly equilibrates the influence of the false positive (i.e., classified accurate but violating the tolerance) and false negative (i.e., classified inaccurate but within tolerance). But it may also overly bias the cross entropy during training, yielding poor accuracy in one bin. Therefore, w is sampled between unity and w<sub>0</sub> in four evenly spaced steps during architecture testing. A selection of trained ANNs is tabulated in **Table 4**.
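The weighting described above can be sketched as follows. The precise weighted cross entropy is defined in an earlier section of the article; the form used in `weighted_binary_cross_entropy` (weight w applied to the term that penalizes false positives) and the reading of "four evenly spaced steps" as four values including both end points are assumptions made for this illustration. Function names and labels are hypothetical.

```python
import math


def class_ratio(labels):
    """Ratio of negative (chi = 0) to positive (chi = 1) outcomes, cf. (38)."""
    negatives = sum(1 for y in labels if y == 0)
    positives = sum(1 for y in labels if y == 1)
    return negatives / positives


def weighted_binary_cross_entropy(labels, probabilities, w):
    """Assumed form: w scales the term active for chi = 0, i.e., false positives."""
    eps = 1e-12
    total = 0.0
    for y, p in zip(labels, probabilities):
        p = min(max(p, eps), 1.0 - eps)  # guard against log(0)
        total -= y * math.log(p) + w * (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def weight_candidates(w0, steps=4):
    """Evenly spaced weights between unity and w0 (end points included)."""
    return [1.0 + k * (w0 - 1.0) / (steps - 1) for k in range(steps)]


# Illustrative labels: 1 = within tolerance, 0 = tolerance violated.
labels = [0, 0, 0, 1, 0, 1, 0, 0]
w0 = class_ratio(labels)
candidates = weight_candidates(w0)
```

Each candidate weight would then be used for one training run in the architecture test, and the resulting classifiers compared on the validation dataset.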

Classification ANNs with acceptable accuracy with respect to the validation dataset are obtained for the 16-, 24-, and even for the 32-dimensional ROM. These ANNs, denoted as χ˜ R16/24/32 in **Table 4**, offer, in principle, the opportunity for an adaptive ROM scheme, in which for a given effective strain the lowest-dimensional but still acceptable ROM can be automatically identified for the chosen tolerances. In addition to the error classification of different ROMs, an attempt to classify the quality of the ANN labeled σ¯ ANN1 in **Table 1** is made for the same tolerances with χ˜ ANN1, see last row in **Table 4**. The surrogate σ¯ ANN1 already contains intrinsic information about the training dataset, owing to its optimization with respect to this dataset. In order to avoid an over-calibration, the training, validation and Monte Carlo datasets have been concatenated, randomly reordered and split into new training and validation datasets containing 90% and 10% of the data, respectively. The classifier χ˜ ANN1 for σ¯ ANN1 is trained on these new datasets. An extensive architecture test is performed with the same parameters as for the ROMs. The classifier for σ¯ ANN1, denoted as χ˜ ANN1 in **Table 4**, reaches acceptable accuracy, which is, however, notably lower than that achieved by the ROM classifiers. In retrospect, a justification for the lower performance of the classifier χ˜ ANN1 is found in the higher regularity of the ROM solution, which matches the behavior of the full order model (FOM). This is explained by the ROM inheriting the mathematical structure and the physical principles of the FOM. The classifiers of this section allow for on-the-fly model switching, as illustrated in **Figure 2**, which is exemplified in the following section. For this purpose, the ROM32 is considered for Algorithm 1 and Algorithm 2 due to its sufficient accuracy, see **Figure 5**.
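The adaptive ROM scheme mentioned above, in which the lowest-dimensional acceptable ROM is identified for a given effective strain, could be organized as in the following minimal sketch. The classifiers are represented by plain callables; the toy classifiers based on a strain norm are purely hypothetical placeholders for the trained ANNs χ˜ R16/24/32 and carry no physical meaning.

```python
import math
from typing import Callable, Dict, Optional, Sequence

Classifier = Callable[[Sequence[float]], bool]


def select_rom_dimension(
    strain: Sequence[float],
    classifiers: Dict[int, Classifier],
    fallback: Optional[int] = None,
) -> Optional[int]:
    """Return the smallest ROM dimension whose classifier accepts the strain."""
    for dimension in sorted(classifiers):
        if classifiers[dimension](strain):
            return dimension
    return fallback  # no ROM classified as accurate within the tolerances


def strain_norm(strain: Sequence[float]) -> float:
    return math.sqrt(sum(component * component for component in strain))


# Hypothetical stand-ins for the trained classifiers (True = within tolerance).
toy_classifiers: Dict[int, Classifier] = {
    32: lambda strain: strain_norm(strain) <= 0.03,
    16: lambda strain: strain_norm(strain) <= 0.01,
    24: lambda strain: strain_norm(strain) <= 0.02,
}

small = select_rom_dimension([0.005, 0.0, 0.0, 0.0, 0.0, 0.0], toy_classifiers)
medium = select_rom_dimension([0.015, 0.0, 0.0, 0.0, 0.0, 0.0], toy_classifiers)
large = select_rom_dimension([0.05, 0.0, 0.0, 0.0, 0.0, 0.0], toy_classifiers)
```

When no classifier accepts the strain, the `fallback` value decides how to proceed, e.g., by resorting to the highest-dimensional ROM available.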

# 3.3. Multiscale Simulation Based on Adaptive ANN-ROM-Scheme

#### 3.3.1. Twoscale Problem

The hybrid methods introduced in Algorithms 1 and 2 are used in actual three-dimensional twoscale simulations. The results are compared to FE2R simulations (in the spirit of Fritzen and Hodapp, 2016), in which the reduced order model is used as a stress surrogate at all points of the macroscopic structure. The FE2R solution is considered as the reference owing to the high accuracy of the ROM with 32 modes (see **Figure 5**, section 3.2.2).

The macroscopic problem (P¯) depicted in **Figure 1** is borrowed from Fritzen and Hodapp (2016), while the microstructure is described through the material of section 3.1. Three different three-dimensional mesh densities are considered on the macroscopic level: M1 (1,734 elements/13,872 int. points), M2 (6,318 elements/50,512 int. points) and M3 (53,790 elements/430,320 int. points). All three models consist of trilinear hexahedral elements with selectively reduced integration (i.e., B-bar elements are used). The loading in terms of a 2% stretch of the macroscopic specimen is applied in 10 equally spaced increments up to 0.2%, followed by nine increments of 0.2% amplitude each, in order to better cover the transition between elastic and elasto-plastic behavior for Algorithm 2.
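The load schedule described above can be written out explicitly. The following short sketch, with hypothetical variable names, builds the 19 cumulative stretch levels; integer arithmetic in units of 0.01% is used only to avoid floating-point round-off.

```python
# Cumulative stretch levels in units of 0.01% (integers avoid round-off).
fine_levels = [2 * k for k in range(1, 11)]          # 0.02%, 0.04%, ..., 0.20%
coarse_levels = [20 + 20 * k for k in range(1, 10)]  # 0.40%, 0.60%, ..., 2.00%

# Convert to percent of macroscopic stretch.
load_schedule_percent = [level / 100 for level in fine_levels + coarse_levels]
```

The ten fine increments resolve the onset of plasticity, and the nine coarse increments of 0.2% each bring the total stretch to 0.2% + 9 × 0.2% = 2%.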

#### 3.3.2. Staggered Adaptive Procedure cf. Algorithm 1

First the staggered procedure introduced in Algorithm 1 is used. It is found that the first run, which relies on the ANN surrogate only, achieves excellent runtimes when the ANN is evaluated on graphics cards (here: one Nvidia GTX Titan Black): one evaluation of the surrogate at each of the 430,320 integration points of the finest mesh M3 takes approximately 15 s. It shall be noted that this includes a major execution overhead<sup>1</sup>.

<sup>1</sup>For simplicity each evaluation launches a new Python instance, reloads the model from a file and returns the results to the FE code through another file.

A general dilemma of twoscale simulations that was observed for the FE2R method by Fritzen and Hodapp (2016) is also present here: Local outliers of the strain field attain magnitudes that quickly exceed the range of the inputs used during training of the ANN stress surrogate. The number of outliers becomes more relevant for the finer discretizations. This reveals a major shortcoming of Algorithm 1: While the simulation for mesh level M1 terminated cleanly in roughly 3 h wall-clock time, with most of the computing time being spent during the hybrid ANN/ROM phase, M2 did not converge for loadings larger than 1.2% due to locally excessive strains that lead to a spurious stress response of the ANN. The finer mesh M3 fails to converge beyond 0.8% of overall stretch. Additionally, the ANN version failed to improve the accuracy beyond a certain limit, i.e., it failed to achieve quadratic convergence beyond a critical load amplitude. In **Figure 7** a comparison of the tension force of the ANN-only run (lines) and of the subsequent hybrid run (symbols) is shown. During the hybrid run the number of integration points evaluating the ROM is determined from the quality indicator at the end of the last load step of the first run. For M1 this amounts to 960 out of 13,872 integration points (6.92%)<sup>2</sup>. These numbers illustrate that the ROM must be evaluated approximately 42,000 times for M1 (44 Newton iterations were needed in total), which leads to a substantial computational effort. Surprisingly, the ANN-only run and the hybrid run are hard to distinguish in the overall force-stretch plots, i.e., the ANN appears to yield good accuracy for this test. This is supported by the rather small absolute errors in the effective stress tensor, see **Figure 7** (right) for the final load and mesh M1. In summary, owing to the aforementioned convergence issues, the staggered procedure can only be used if the peak strain in the macroscopic problem is sufficiently low. In that case the solution can be expected to give accurate predictions.
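The numbers quoted for M1 can be retraced with a short calculation. It is assumed here that each flagged integration point requires one ROM evaluation per Newton iteration, which is consistent with the quoted figure of approximately 42,000 evaluations; the variable names are illustrative.

```python
flagged_points = 960       # integration points evaluated with the ROM (mesh M1)
total_points = 13872       # integration points of mesh M1
newton_iterations = 44     # total number of Newton iterations of the hybrid run

# Share of integration points that require the ROM, in percent.
flagged_share_percent = 100.0 * flagged_points / total_points

# Assumption: one ROM evaluation per flagged point and Newton iteration.
rom_evaluations = flagged_points * newton_iterations
```

This gives a share of about 6.92% and 42,240 ROM evaluations in total.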

The number of quadrature points marked for evaluation with the more reliable ROM increases steadily in the adaptive scheme when the kinematic indicator χ<sup>K</sup> is used, which marks points outside of the training range as not trustworthy for the ANN.

#### 3.3.3. Single Pass on-the-Fly Adaptive Algorithm cf. Algorithm 2

The crucial ingredient of the on-the-fly adaptive scheme, described in Algorithm 2, is the irreversible update of the quality indicator during each load increment. Thereby, alternating model selection is prevented. All three macroscopic models, M1, M2, and M3, converged without any issues. The resulting macroscopic tension force of all three is compared in **Figure 8**, where the hybrid curve from **Figure 7** and the FE2R curve for the ROM featuring 32 modes are also shown for M1. It is observed from **Figure 8** (right) that all algorithms yield virtually identical results. Closer inspection reveals, however, that the FE2R and adaptive algorithms have nearly indistinguishable slopes (apart from a negligible shift), whereas the ANN model is slightly curved, i.e., it shows a qualitative difference relative to the reference solution which becomes more pronounced with increasing load amplitude.

<sup>2</sup>The numbers for M2 and M3 are not representative as the final load was not achieved.

The adaptive algorithm has the advantage that the number of macroscopic integration points that require evaluation of the ROM depends only on the current state. For the considered proportional loading, and when using the kinematic indicator χ<sup>K</sup>, the relative number of integration points requiring the ROM grows monotonically with increasing load, cf. **Figure 9**.
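The flag handling of the on-the-fly scheme can be illustrated by a minimal sketch. Algorithm 2 itself is defined in section 2.4 and is not reproduced here; the sketch only assumes that all points start with the ANN at the beginning of a load increment and that, within the increment, a point flagged for the ROM is never switched back. The kinematic indicator is modeled by a simple bound on a strain norm, which is a hypothetical simplification, as are all names and numbers.

```python
def ann_is_trusted(strain_norm, training_limit):
    """Hypothetical kinematic indicator: the ANN is trusted inside the training range."""
    return strain_norm <= training_limit


def update_rom_flags(rom_flags, strain_norms, training_limit):
    """Irreversible update: a point flagged for the ROM stays flagged."""
    return [
        flagged or not ann_is_trusted(norm, training_limit)
        for flagged, norm in zip(rom_flags, strain_norms)
    ]


def run_increment(newton_strain_norms, training_limit):
    """Process the Newton iterations of one load increment."""
    number_of_points = len(newton_strain_norms[0])
    rom_flags = [False] * number_of_points  # re-initialization: ANN at every point
    flagged_counts = []
    for strain_norms in newton_strain_norms:
        rom_flags = update_rom_flags(rom_flags, strain_norms, training_limit)
        flagged_counts.append(sum(rom_flags))
    return rom_flags, flagged_counts


# Illustrative strain norms at three integration points over three Newton iterations.
iterations = [
    [0.5, 1.2, 0.8],
    [0.5, 0.9, 1.1],
    [0.4, 0.8, 0.9],
]
final_flags, flagged_counts = run_increment(iterations, training_limit=1.0)
```

In this toy run the second point returns into the training range after the first iteration but remains assigned to the ROM, so that the model selection cannot alternate between Newton iterations.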

In order to investigate the practical usefulness of the ANN classifier, the on-the-fly adaptive simulation using the kinematic quality indicator is compared with the same simulation supplemented by the ANN classifier discussed in section 3.2.2. As expected, the solid but not fully satisfactory accuracy of the ANN classifier (see **Table 4**) induces a large number of additional ROM evaluations, thereby increasing the computation time considerably (by a factor of approximately 7), see **Figure 10**. Notably, the ANN classifier adds a considerable number of points at the left and right constriction and close to the holes. In order to assess the relevance of these additional points, the macroscopic tensile force was investigated: It varies by less than 0.3%, except in the very first load step, where the difference is around 0.6%.

# 4. CONCLUDING SUMMARY

A multi-fidelity approach for generating surrogate models of the effective stress tensor for use in twoscale simulations is developed in section 2.1. At first, a ROM is derived from data gathered during full field simulations. The estimation of the error in the effective stress tensor (representing the QoI) of the ROM is discussed from a theoretical perspective in section 2.2. The mathematical structure of the error estimate reveals that the ROM error estimation incurs a computational cost that is almost equivalent to or even beyond that needed to solve a more dedicated ROM, thereby making it hard to justify such estimates when computational efficiency is required.

In our view this dilemma can only be resolved by finding alternative surrogates with low computational complexity but moderate to good accuracy, complemented by adaptive strategies for local model refinement that employ costly computational methods only when needed. In this regard, ANNs are seen as promising candidates for the calibration of surrogate models for the effective stress and for classification that can trigger adaptive refinement. In section 2.3, the layout and the theoretical background of ANNs are discussed, together with different feature designs for the inputs and outputs based on the mechanical nature of the strain and stress. For the calibration of the stress surrogate, the mean squared error is used as loss function, while the quality of the trained ANN is checked on the validation dataset with the mean coefficient of determination and the mean relative norm error. In the case of error regression, a penalized mean squared error is proposed, which allows the conservative calibration of trained ANNs. For the error classification based on prescribed tolerances, the weighted cross entropy is used in order to allow for a better focus on the more important warning case if the warning case density is low. Based on the proposed models, the core contribution of the present work consists of two model-adaptive algorithms that address the convergence issues encountered in the naive implementation of on-the-fly adaptive surrogate selection, see section 2.4. The first, staggered algorithm is based on a two-run approach, in which the first run is conducted solely with the ANN effective stress surrogate and flags points evaluated outside of the strain training region, such that only these points are evaluated with the high-accuracy ROM in a second run. The second algorithm offers a more flexible on-the-fly model-adaptive approach by allowing the re-initialization of the ANN at the beginning of each load increment.

Numerical examples of the illustrated approaches are presented in section 3 for a three-phase pseudo-plastic material with microstructure. First, ANNs are trained in order to approximate the effective stress. The surrogate of choice, σ¯ ANN1, achieved a mean relative norm error of 0.0189 and a mean coefficient of determination of 0.9995 and yields an accurate tangent stiffness, owing to its formulation based on the automatic differentiation capabilities of the TensorFlow library. The accuracy of the ANN stress surrogate is found to lie between that of the ROMs of dimension 24 and 32 (see section 3.2). The ansatz for error regression of ROMs of different dimensions is presented, showing the possibilities for the calibration of a conservative ROM error estimator. In view of subsequent twoscale simulations with adaptive model selection, error classification is carried out for ROMs of different dimensions and for the trained ANN stress surrogate. The classification accuracy achieved for the ROMs is higher than for the ANN stress surrogate, which indicates that the physics-informed ROM still shows a clearer pattern than the trained ANN stress surrogate, due to its inherited mathematical structure and underlying physical principles.

The trained ANNs are then used in twoscale mechanical FE simulations, based on the two developed algorithms of section 2.4. The staggered algorithm produces sensible results but has two limitations: First, the number of macroscopic quadrature points marked for correction grows irreversibly. Second, the ANN surrogate must be sufficiently robust and of at least moderate accuracy in a prohibitively large part of the strain space. This requirement stems from the fact that local strain outliers lead to queries that are far outside of the usual training range of the ANN. This effect is found to be more pronounced when the macroscopic mesh density is increased, which further complicates the robust surrogate construction using purely data-driven methods in general, see section 3.3.2. The second algorithm offers a true on-the-fly adaptivity in which the ANN surrogate can be recovered, e.g., during unloading. It is observed in section 3.3.3 that this second algorithm offers the fastest convergence among the considered twoscale simulations, being approximately 3–10 times faster than the staggered algorithm and around 20 times faster than the fully coupled FE2R algorithm using the ROM with 32 modes for all stress predictions. The adaptive on-the-fly model of the second algorithm therefore offers an attractive approach which combines a low number of ROM evaluations with good convergence.

The final test using the additional error classifier for the ANN stress surrogate introduced a high number of additional negative outcomes (i.e., ANN error greater than tolerances), considerably increasing the number of integration points requiring the ROM. This was expected due to the low accuracy achieved during the training of the classifier, more specifically, due to the low accuracy for the positive outcome ACC<sup>1</sup> and corresponding high amount of positive outcomes reflected by w0, see **Table 4**. On the one hand, this last approach offers a conservative twoscale scheme relying on the robust ROM. On the other hand, the approach loses computational efficiency, which leaves the error classification for rapid multiscale problems still an open issue. Future improvements should, therefore, focus on further theoretical or hybrid error estimators and an improved error classification for the effective stress. The latter could benefit from datasets spanning larger portions of the strain space. The authors are convinced that the ambitious goal of reliable twoscale simulations can only be achieved by fusing data-driven methods together with dedicated theories. For example, the reader should consider that at least two quality indicators are indeed necessary for a reliable model-adaptive ansatz. One quality indicator should address the trained ANN stress surrogate over the sampled strain (input) space and one quality indicator should definitely return a warning if the input leaves the training region. This is quintessential, since the range of macroscopic strain is not a priori known and the behavior of the purely data-driven ANN outside the training region may endanger the convergence and quality of the multiscale simulation. The results of the present work support the usability of ANNs in computational materials science, although further research ideally addressing the combination of theory and data is urgently required.

# AUTHOR CONTRIBUTIONS

FF conducted the finite element and reduced order model computations on the microscale and the twoscale simulations based on his self-developed code and wrote the corresponding sections. MF implemented and trained the artificial neural networks and wrote the corresponding sections. FL conducted, together with FF, preliminary work on theoretical error estimates solely based on the reduced order model, which yielded the theoretical insights into the computational expenses and the necessity for alternative approaches. FL wrote the corresponding section on the theoretical error estimation.

# FUNDING

The contributions of MF and FF are funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – FR2702/6 – within the scope of the Emmy Noether Group EMMA – Efficient Methods for Mechanical Analysis. The contributions of FL are funded by the Swedish Research Council (VR) under grant no. 2015-05422.

### ACKNOWLEDGMENTS

Vivid discussions within the scope of Cluster of Excellence SimTech (DFG EXC310 and EXC2075) regarding machine learning and data-driven model surrogation are highly appreciated. FF and MF further acknowledge the valuable discussions with Steffen Freitag (Ruhr-Universität-Bochum) on the topic of ANN-based regression and classification.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmats.2019.00075/full#supplementary-material

Supplemental Data | Supplementary material is provided in the form of three HDF5 datasets (containing all FEM and ROM results used for the ANN training). Further, the stress surrogate σ¯ ANN1 and the quality indicator for this ANN are provided including a Python interface for accessing the file.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Fritzen, Fernández and Larsson. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# General Multi-Fidelity Framework for Training Artificial Neural Networks With Computational Models

Roland Can Aydin<sup>1</sup> \*, Fabian Albert Braeu<sup>2</sup> and Christian Johannes Cyron<sup>1,3</sup>

<sup>1</sup> Institute of Materials Research, Materials Mechanics, Helmholtz-Zentrum Geesthacht, Geesthacht, Germany, <sup>2</sup> Institute for Computational Mechanics, Technical University of Munich, Munich, Germany, <sup>3</sup> Institute of Continuum Mechanics and Materials Mechanics, Hamburg University of Technology, Hamburg, Germany

Training of artificial neural networks (ANNs) relies on the availability of training data. If ANNs have to be trained to predict or control the behavior of complex physical systems, often not enough real-world training data are available, for example, because experiments or measurements are too expensive, time-consuming or dangerous. In this case, generating training data by way of realistic computational simulations is a viable and often the only promising alternative. Doing so can, however, be associated with a significant and often even prohibitive computational cost, which forms a serious bottleneck for the application of machine learning to complex physical systems. To overcome this problem, we propose in this paper an approach that is both systematic and general. It uses cheap low-fidelity computational models to start the training of the ANN and gradually switches to higher-fidelity training data as the training of the ANN progresses. We demonstrate the benefits of this strategy using examples from structural and materials mechanics. We demonstrate that in these examples the multi-fidelity strategy introduced herein can reduce the total computational cost, compared to simple brute-force training of ANNs, by between one half and one order of magnitude. This multi-fidelity strategy can thus be hoped to become a powerful and versatile tool for the future combination of computational simulations and artificial intelligence, in particular in areas such as structural and materials mechanics.

#### Edited by:

Roberto Brighenti, University of Parma, Italy

#### Reviewed by:

Ercan Gürses, Middle East Technical University, Turkey; Youyong Li, Soochow University, China

> \*Correspondence: Roland Can Aydin roland.aydin@hzg.de

#### Specialty section:

This article was submitted to Computational Materials Science, a section of the journal Frontiers in Materials

> Received: 04 February 2019 Accepted: 25 March 2019 Published: 17 April 2019

#### Citation:

Aydin RC, Braeu FA and Cyron CJ (2019) General Multi-Fidelity Framework for Training Artificial Neural Networks With Computational Models. Front. Mater. 6:61. doi: 10.3389/fmats.2019.00061

Keywords: artificial intelligence, homogenization, material science, machine learning, simulation

# INTRODUCTION

Over the last years, we have witnessed several groundbreaking advances in artificial intelligence (AI) that were based on a simple idea: a virtual training environment was created by setting up some general rules. Subsequently, an AI, typically represented by an artificial neural network (ANN), was placed in this training environment and allowed to practice until it reached a superhuman level of mastery. The rules of the training environment were, for example, the rules of the board game Go in the AlphaGo project (Silver et al., 2016). Even when the machine learning component was not provided any prior knowledge about the game other than the ruleset itself it achieved superhuman mastery simply by training in a virtual training space (Silver et al., 2017). Other research projects rely on virtual environments as used in computer games in order to train AIs to perform intelligent actions or solve certain problems (Vinyals et al., 2017). This research typically uses training environments defined by rules whose complexity is far below that of real physics.

Consequently, the generation of training data is feasible at such a low computational cost that one can straightforwardly apply a Big Data paradigm, in which the training of the ANN is the bottleneck, not the availability of data. With physically more realistic virtual training environments one could train AIs to solve problems, for example, from mechanical engineering or materials science, that so far require intense human interaction. Creating physically realistic models of systems and processes in mechanical engineering and materials science is the principal objective of computational mechanics. It is thus natural to combine computational mechanics and machine learning to create what one may refer to as "computational mechanics intelligence," that is, a kind of artificial/computational intelligence that is endowed with an accurate understanding of a certain mechanical problem and which is trained in a virtual environment created by methods from computational mechanics. This paradigm can actually be understood as a natural extension of the fast-growing body of research that seeks to apply machine learning to various areas of mechanical engineering or materials science such as fatigue (Mosallam et al., 2016; Wang et al., 2017), homogenization (Yang et al., 2018), or process design (Hu et al., 2018). The idea to couple computational mechanics and artificial intelligence in an intimate way bears great promise to open up new ways to endow AI with an understanding of real physics. However, it faces the great challenge that creating training data for an AI by means of realistic models of complex physical systems and processes can be prohibitively expensive computationally. This is a main reason why the attempts to couple computational mechanics and artificial intelligence, although first started long ago (cf. Waszczyszyn and Ziemianski, 2001), have so far remained very limited both in scope and number.
The key to the future success of computational mechanics intelligence is thus developing smart strategies for using computational models to train AIs at an acceptable overall computational cost. If realistic computational models are used for training AIs, the computational cost of the models typically by far surpasses the computational cost of the AI training itself. It is thus of paramount importance to find ways to reduce in particular the computational cost associated with the generation of training data by means of computational models. In the area of computational quantum mechanics, a variable-fidelity method for the calculation of bandgaps was recently proposed (Pilania et al., 2017). It appears promising to also combine, for classical computational mechanics on the continuum scale, simulations with multiple different fidelity levels in order to reduce the computational cost of generating training data. In this paper, we will introduce a systematic and general framework for using computational methods in a smart way in order to create training data for AIs. This framework relies on a multi-fidelity strategy which couples the learning progress of the AI with the resolution of the computational models used to generate training data. It trades unnecessary precision of the error gradient used especially in the early stages of AI training for computational efficiency. The main objective of our multi-fidelity strategy is to significantly speed up the training of ANNs in domains in which training data have to be generated by means of computational models and where this process constitutes a significant portion of the overall computational cost associated with endowing an AI with physical intelligence.

The outline of the paper is as follows. In section "Problem setting," we briefly describe the type of problem on which this article focuses. In section "Methods," we delineate the architecture and general learning algorithms of the ANNs used in this paper. Moreover, we introduce our novel multi-fidelity framework for coupling computational models and AI training. In section "Numerical Examples," we demonstrate the benefits of this multi-fidelity framework using examples from both structural mechanics and materials mechanics. Finally, the section "Conclusions" summarizes and discusses the broader implications of the multi-fidelity framework introduced herein.

#### PROBLEM SETTING

Herein we consider the following general problem: an AI is to be trained to perform some kind of action or make some kind of prediction within or with respect to a physical system. Experiments with and measurements within the physical system in order to generate training data for the AI are assumed to be expensive, time-consuming, or dangerous, so that it is preferable to generate training data by means of "simulated experiments" or "simulated measurements" performed with a realistic computational model of the physical system of interest. The computational cost of these simulated experiments is assumed to surpass by far the computational cost of the AI training itself on the basis of given training data. It is therefore the bottleneck in coupling artificial intelligence and computational models.

The general problem delineated above mainly appears in two settings. In the first one, the AI is used to make predictions about the behavior of a physical system under variable input. The motivation may be that using a comprehensive classical computational model for making these predictions for each different input case of interest may be much more expensive than using an AI as a cheap surrogate model. In this setting, the computational model is used to compute for a large number of input values realistic approximations of the associated output of the system. These input and computationally approximated output values together form training samples which can be used for training an AI to approximate the behavior of the computational model. Typically, this is achieved by means of a backpropagation training algorithm where the internal parameters of the AI are adjusted until for a given input the AI produces an output sufficiently similar to the one of the computational model. This adjustment is based on the current error of the AI, that is, the current deviation of the AI output from the output of the computational model for a certain input (cf. **Figure 1**, left).

In the second problem setting, the AI is to be trained to take action in a specific physical environment. To this end, the AI is connected, typically in a closed-loop setting, to a computer simulation of this physical environment in which actions and consequences can be evaluated much faster and cheaper than in real-world experiments (cf. **Figure 1**, right).

This setting becomes relevant, for example, if an AI should be used to control complex processes and systems such as an autonomously driving car, an autonomously flying drone or robots in a manufacturing plant. The difference between this setting and the first one is mainly that the input to the AI cannot be arbitrarily chosen before starting the training but typically results, at least in part, only in the course of the training process as a consequence of closed-loop interactions between AI and computational model. Moreover, AI training can typically not be simply based on a difference between AI output and model output but will more often rely on some objective function evaluated on the current state of the system.

Despite these differences, both the above problem settings have in common that AI training requires the evaluation of computationally expensive models. In the next section we will delineate a multi-fidelity strategy that can heavily alleviate the associated computational cost, which otherwise is often prohibitively high for realistic computational models of complex systems. While the general concept of this strategy is applicable to both the above delineated problem settings, we will focus in this article on the first problem setting illustrated in **Figure 1**, left.

# METHODS

#### Architecture of Artificial Intelligence

ANNs are chosen as the most general-purpose and widespread type of learning algorithm. This choice is made so as to minimally constrain the generalizability of the multi-fidelity approach introduced in this research. The specific ANNs used herein are feed forward neural networks (FFNs) based on several densely connected hidden layers. Their learning process thus falls into the realm of so-called deep learning. This choice is not mandatory, and it is important to note that other choices of the machine learning algorithm would be equally viable to exemplify the comparative advantages of the multi-fidelity framework developed herein. While the results of this research are not wholly agnostic with respect to the choice of learning algorithm and specific parameters (learning rate, activation functions, etc.), the above choice has been made so as to maximize the generalizability of the results obtained herein and not make them merely an artifact of a peculiarity of a highly specific learning method.

#### Learning Algorithm

There are various ways in which ANNs can be trained to imitate a function, the most common of which is supervised learning. Even though there is a host of different approaches to supervised learning, with differences ranging from small details to entirely different architectures, all these approaches share a couple of common elements. In general, supervised learning algorithms compare the current output of an ANN for a given input with the correct solution, and base the correction of the internal parameters of the neural network (i.e., the "learning") on an error which is the difference between current and correct output. Although there are also derivative-free methods, correcting the internal parameters of the network requires in most approaches the computation of a gradient of the said error with respect to the parameters directly governing the output of the ANN. In a so-called "backpropagation algorithm" this gradient is propagated through the ANN and used to update thresholds/weights for the individual neurons in the ANN.

To train the ANN, a large amount of training data is required. Each training sample consists of one tuple of an input (to the computational model or the ANN) and the corresponding output which the ANN should learn to ideally yield in response to the respective input values. In our setting the output which the ANN should learn to reproduce is generated by means of a computational model. Our objective is training the ANN to reproduce the input-output behavior of the computational model. To this end, the ANN is fed one training sample after the other. In supervised learning (i.e., when the desired output for specific input to the ANN can at least in principle be computed from the beginning on), training samples are typically not used individually for backpropagation training but rather the error and error gradients used for backpropagation training are computed across so-called batches of N samples and then used. This reduces computational cost and perturbations of the learning process due to specific numerical features of individual samples. The error over a batch of N training samples is computed in our framework as a root mean square error (RMSE) (Russell and Norvig, 2016).

$$e_{\rm ANN} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(\hat{y}_i - y_i\right)^2} \tag{1}$$

with ŷ<sub>i</sub> the results predicted by the ANN for the input of sample i within the training batch, and y<sub>i</sub> the corresponding output provided by the computational model. The error gradient, which is needed for backpropagation training, can be computed by a simple finite-difference-like approximation of the derivatives of (1).
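
The batch error (1) and a finite-difference approximation of its gradient can be illustrated with a minimal NumPy sketch. The function names, the central-difference scheme, the step size, and the toy linear model standing in for the ANN are assumptions made for illustration only; practical backpropagation implementations typically evaluate such gradients analytically.

```python
import numpy as np

def batch_rmse(y_pred, y_true):
    """Root mean square error over a batch, as in Eq. (1)."""
    y_pred = np.asarray(y_pred, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

def finite_difference_gradient(error_fn, params, step=1e-6):
    """Central-difference approximation of d(error)/d(params)."""
    params = np.asarray(params, dtype=float)
    grad = np.zeros_like(params)
    for k in range(params.size):
        shift = np.zeros_like(params)
        shift.flat[k] = step
        grad.flat[k] = (error_fn(params + shift) - error_fn(params - shift)) / (2.0 * step)
    return grad

# Toy stand-in for an ANN (hypothetical): a linear model y = w[0] * x + w[1].
x = np.array([0.0, 1.0, 2.0, 3.0])
y_model = 2.0 * x + 1.0  # plays the role of the computational model output

def error(w):
    return batch_rmse(w[0] * x + w[1], y_model)

# Both components are negative here: increasing either weight reduces the error.
grad = finite_difference_gradient(error, np.array([1.5, 1.0]))
```

In this toy setting the negative gradient points toward the weights (2, 1) that reproduce the model output exactly.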

#### Multi-Fidelity Training

#### General Idea

Typical algorithms for AI training, such as gradient-based backpropagation algorithms for ANNs, adjust the internal parameters of the ANN in a stepwise, iterative manner to improve its performance. When ANN training starts, mainly a coarse adjustment of the internal parameters toward reasonable values takes place because a tailor-made, problem-specific initialization of these parameters is typically not possible. This way, the AI is endowed with a first coarse understanding of the basic properties of the problem to which it is applied. Only later on is the ANN fine-tuned. In this process, the initial adjustments of the internal parameters need not be accurate because in subsequent training steps their precise values will still change in a way that cannot be foreseen directly in the beginning. Small perturbations of the initial steps can thus be expected to remain without major impact on the overall result of the training process. Thus, it is sufficient if, during the initial training stage, the internal parameters of the ANN are altered in a way that points roughly in the right direction. To this end, it is sufficient to use in the initial stage of ANN training samples generated by means of coarse low-fidelity computational models, which may exhibit considerable numerical approximation errors but are computationally cheap. Only later on, as ANN training progresses, does one have to move gradually toward samples generated by means of more accurate and computationally more expensive higher-fidelity models. Following this strategy, one can use cheap low-fidelity samples for a large part of the ANN training and needs computationally expensive high-fidelity samples only at the very end of the training process and thus only in a very limited number.

Exploiting such a multi-fidelity strategy, which is illustrated in **Figure 2**, can substantially reduce the overall computational cost (compared to a non-optimized brute-force training of the ANN). It is worth emphasizing that the main objective of this multi-fidelity strategy is indeed a reduction of the computational cost of training ANNs to a given level of accuracy rather than training ANNs more accurately. More precisely, our multi-fidelity framework is based on the following course of action. We first define a number of different fidelity levels along with associated computational models. We start generating training data for the ANN using the computational model with the lowest fidelity. We continue using the low-fidelity model as long as the overall performance of the ANN increases and we can expect that using a computational model at the current level of fidelity helps to train the ANN efficiently. As soon as this is no longer the case, we move on toward higher-fidelity training data. In practice, this is typically possible by simply using computational models based on a finer discretization. More details on this will be given below.

For simplicity we focus in the discussion herein and in particular in the examples section on ANNs as a widely used basis for AI and on computational models based on the finite element method, which is widely used both in solid and fluid mechanics as well as many other areas of continuum physics. We stress, however, that we expect the multi-fidelity strategy introduced herein to be generalizable also to cases where the computational models used are not based on finite-element discretizations or where more complex or specialized machine learning architectures are used than ANNs. In fact, we expect the multi-fidelity strategy introduced in this paper to be applicable as long as the following three conditions are satisfied. First, there must be a necessity to generate training data for the AI by means of computational models. Second, the computational expense of generating training data using these models should vastly exceed the computational expense of training the AI itself. Third, it must be possible to create for the physical system or process of interest computational models with varying levels of accuracy, lower levels of accuracy thereby being associated also with a lower computational cost.

#### Criteria for Switching to Higher Fidelity Levels During Training

In standard problems of supervised learning, all training data are available from the beginning. In this case, the training data can be divided into batches and the following course of action is common: all batches are fed into the ANN, whereby, however, only 90% of the samples from each batch are used for training and 10% are retained for validation purposes. Once all batches have been fed into the ANN, a so-called epoch is completed. At this point the RMSE with respect to the samples retained for validation is computed. Subsequently, the next epoch starts, in which all batches are again fed into the ANN. This sequence is interrupted, however, as soon as the RMSE computed for the validation samples after an epoch has increased compared to the previous epoch. The motivation for this strategy is the avoidance of overfitting (Domingos, 2012; Russell and Norvig, 2016), which is the tendency of learning algorithms to learn not the intended, generalizable target function but rather features of the specific training data unless the training process for a given set of data is stopped in time. Once the RMSE for the validation samples, which were not used for training and are unknown to the ANN, starts to increase even though the RMSE for the training data themselves may still continue to decrease, overfitting can be assumed to have started.

In our setting, we have to pursue a slightly modified course of action. The reason is that the complete set of training data is not available from the beginning on but rather has to be generated during learning because in the beginning it is not even known how many training samples have to be generated from computational simulations in order to achieve a reasonable performance of the ANN. To overcome this problem, we start our training with data generated by means of a computational model at the lowest fidelity level. We generate one or several batches of training data, depending on criteria discussed in more detail below. In each batch, we retain 10% of the samples for validation and use the remaining 90% for training. We train the ANN looping in a batch-wise manner through all the training samples. Looping one time through all available training samples is called an epoch. We do so again and again and complete this way more and more epochs until either a certain predefined maximal number E of epochs has been completed or until the RMSE based on the validation samples increases (which is a sign of overfitting). Once this point is reached, a so-called super-epoch is considered to be completed. After each super-epoch, we evaluate the recent training progress of the ANN. We do so on the basis of two quantities. The first quantity is the current RMSE of the ANN

$$e_{\rm ANN}^{\rm max} = \sqrt{\frac{1}{\bar{N}} \sum_{i=1}^{\bar{N}} \left( \hat{y}_i - y_i^{\rm max} \right)^2} \tag{2}$$

on the basis of N̄ output values y<sub>i</sub><sup>max</sup> which are generated in the very beginning of the whole training process using a computational model at the highest fidelity level and which are never used for training.
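
The super-epoch procedure described above (repeat epochs until either E epochs are completed or the validation RMSE increases) can be sketched as follows. This is a minimal illustration under stated assumptions: `split_batch`, `run_super_epoch`, and the two callables `train_one_epoch` and `validation_rmse` are hypothetical names, and the scripted RMSE values in the usage lines are made up.

```python
import random

def split_batch(samples, validation_fraction=0.1, seed=0):
    """Shuffle a batch and split it into training and validation samples."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, round(validation_fraction * len(shuffled)))
    return shuffled[n_val:], shuffled[:n_val]

def run_super_epoch(train_one_epoch, validation_rmse, max_epochs=100):
    """Run epochs until max_epochs is reached or the validation RMSE rises.

    train_one_epoch: callable looping once over all training batches.
    validation_rmse: callable returning the RMSE on the retained samples.
    Returns (number of completed epochs, last validation RMSE).
    """
    last = validation_rmse()
    for epoch in range(1, max_epochs + 1):
        train_one_epoch()
        current = validation_rmse()
        increased = current > last  # interpreted as a sign of overfitting
        last = current
        if increased:
            return epoch, last
    return max_epochs, last

# Toy usage with a scripted validation RMSE (hypothetical numbers).
train, val = split_batch(range(20))
scripted = iter([1.0, 0.8, 0.6, 0.7, 0.5])
epochs, rmse = run_super_epoch(lambda: None, lambda: next(scripted))
```

In the toy usage the super-epoch ends after the third epoch, because the scripted validation RMSE rises from 0.6 to 0.7 at that point.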

To obtain the second quantity required to evaluate the recent training progress of the ANN at the current fidelity level i, we compute an estimate of the approximation error of the computational model at this fidelity level. To this end, we assume that accuracy differences between the highest fidelity level at which the computational model is available and the lowest one with i = 1 are very large so that we can approximate the approximation error of the computational model at the lowest fidelity level with i = 1 as

$$\begin{split} e_{\rm CM}^{1} &\approx \sqrt{\frac{1}{\bar{N}} \sum_{i=1}^{\bar{N}} \left[ y_{i}^{1} - y_{i}^{\rm exact} \right]^{2}} \approx \sqrt{\frac{1}{\bar{N}} \sum_{i=1}^{\bar{N}} \left[ y_{i}^{1} - y_{i}^{\rm max} + O\left( e_{\rm CM}^{\rm max} \right) \right]^{2}} \\ &\approx \sqrt{\frac{1}{\bar{N}} \sum_{i=1}^{\bar{N}} \left[ y_{i}^{1} - y_{i}^{\rm max} \right]^{2}} \end{split} \tag{3}$$

where y<sub>i</sub><sup>1</sup> and y<sub>i</sub><sup>exact</sup> are the output values which a computational model with the lowest fidelity level and a fictitious exact model of the physical system of interest yield when provided the same input values that make the computational model at the highest fidelity level yield the above introduced y<sub>i</sub><sup>max</sup>. O(e<sub>CM</sub><sup>max</sup>) is a term on the order of magnitude of the RMSE of the computational model at the highest fidelity level compared to the fictitious exact model. We assume now that the modeling error of the computational model at fidelity level i is governed by the discretization length h<sub>i</sub> used in the computational model, and that it converges monotonically to zero with the p-th power of h<sub>i</sub>. Then the error of the computational model at the i-th fidelity level can be estimated as

$$e_{\rm CM}^i \approx \left(\frac{h_i}{h_1}\right)^p e_{\rm CM}^1 \approx \left(\frac{h_i}{h_1}\right)^p \sqrt{\frac{1}{\bar{N}} \sum_{i=1}^{\bar{N}} \left[y_i^1 - y_i^{\rm max}\right]^2} \tag{4}$$

The training progress after a super-epoch is always considered unsatisfactory if the following criterion applies:

(T1) The RMSE e<sub>ANN</sub><sup>max</sup> according to (2) after the current super-epoch is worse than the RMSE s super-epochs ago.

Criterion (T1) indicates that the ANN is no longer learning general features of the problem but rather batch-specific information. We average the RMSE over the last s super-epochs when monitoring the training progress in order to reduce the impact of random fluctuations during a single super-epoch. Depending on the exact training strategy (cf. section "Different Training Strategies" for more details) one may consider the training progress after a super-epoch also unsatisfactory if the following additional criterion applies:

(T2) The RMSE e<sub>ANN</sub><sup>max</sup> according to (2) after a super-epoch is smaller than or equal to q times the approximation error e<sub>CM</sub><sup>i</sup> of the computational model at the current fidelity level estimated on the basis of (4).

Criterion (T2) indicates that the ANN has been trained to a level of accuracy comparable to that of the computational model at the current fidelity level. Naturally, training beyond this level is impossible, and q can be understood as a safety factor which ensures that training always stops earlier, taking into account that (4) is just a rough estimate.

If the training progress after a super-epoch is found to be satisfactory on the basis of criteria such as (T1) and possibly also (T2), we continue training at the same fidelity level. Otherwise, we switch to the next higher fidelity level. If there is no higher fidelity level, we terminate training.
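
The error estimate (4) and the two criteria can be combined into a small decision routine. The sketch below is an illustration only: all function names are hypothetical, and (T1) is implemented literally as a comparison with the RMSE recorded s super-epochs earlier, without the additional averaging over super-epochs mentioned above.

```python
def estimate_model_error(h_i, h_1, p, e_cm_1):
    """Eq. (4): scale the lowest-level model error with (h_i / h_1) ** p."""
    return (h_i / h_1) ** p * e_cm_1

def criterion_t1(rmse_history, s):
    """(T1): the current RMSE is worse than the RMSE s super-epochs ago."""
    if len(rmse_history) <= s:
        return False  # not enough history yet to compare
    return rmse_history[-1] > rmse_history[-1 - s]

def criterion_t2(current_rmse, q, e_cm_i):
    """(T2): the RMSE is at most q times the estimated model error."""
    return current_rmse <= q * e_cm_i

def progress_is_satisfactory(rmse_history, s, q, e_cm_i, use_t2=True):
    """Return False if (T1) or, optionally, (T2) applies."""
    if criterion_t1(rmse_history, s):
        return False
    if use_t2 and criterion_t2(rmse_history[-1], q, e_cm_i):
        return False
    return True

# Hypothetical numbers: halved mesh size, quadratic convergence.
e_cm_2 = estimate_model_error(h_i=0.5, h_1=1.0, p=2, e_cm_1=0.08)
keep_level = progress_is_satisfactory([0.50, 0.40, 0.30, 0.25], s=3, q=2.0, e_cm_i=e_cm_2)
```

With these made-up values the RMSE of 0.25 is still above q times the estimated model error and better than three super-epochs ago, so training would continue at the same fidelity level.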

#### Different Training Strategies

In this paper, we compare altogether three different training strategies. The first one is a simple single-fidelity approach where the whole training of the AI is based on samples generated at the same (maximal) level of fidelity and thus also computational cost. As viable alternatives to this simple brute-force approach, which is most widely practiced so far, we propose and compare in the following also two different multi-fidelity strategies.

#### Single-Fidelity Training

Training ANNs with data generated from computational models is so far mostly performed in a single-fidelity paradigm where all training data are generated with the same computational model and thus with the same fidelity level. Therefore, we test this approach also herein as a benchmark to measure the performance gains that can be achieved by the multi-fidelity framework introduced herein. To ensure comparability between single- and multi-fidelity training, single-fidelity training is terminated as soon as either criterion (T1) from section "Criteria for switching to higher fidelity levels during training" is satisfied or the RMSE of the ANN has decreased to a threshold RMSE<sub>min</sub>, which is chosen to be equal to q e<sub>CM</sub><sup>i</sup> from criterion (T2) in the multi-fidelity training algorithm with which the single-fidelity training algorithm is compared. The algorithm for single-fidelity training is provided as pseudocode in **Table 1**.

TABLE 1 | Pseudocode for single-fidelity training.


#### Unidirectional Multi-Fidelity Training

In the simplest "brute-force" multi-fidelity approach, we unidirectionally loop through different fidelity levels. We start at the lowest one and move on to the next higher fidelity level as soon as one of the two termination criteria (T1) and (T2) from section "Criteria for Switching to Higher Fidelity Levels During Training" is satisfied. If this happens on the highest fidelity level, we terminate training completely. Before starting the training process, we have to define the number n of fidelity levels as well as the computational models assigned to them. When using finite element discretizations as we do herein, one can create computational models at n different fidelity levels simply by starting at the lowest fidelity level and then refining the discretization mesh by a certain factor (e.g., a factor of two or four) from one fidelity level to the next. The algorithm for unidirectional multi-fidelity training is provided as pseudocode in **Table 2**.

TABLE 2 | Pseudocode for unidirectional multi-fidelity training.

```
SET current fidelity level to the lowest one
REPEAT
    GENERATE a new batch not yet filled with samples
    FOR each empty slot in the current batch
        GENERATE a new training sample with computational model at current fidelity level
        ADD new training sample to the current batch
    END FOR
    TRAIN artificial neural network with the current batch
    TEST whether termination criterion (T1) or (T2) is satisfied
    IF termination criterion (T1) or (T2) applies
        CHANGE current level of fidelity to next higher fidelity level
    END IF
UNTIL fidelity level exceeds maximal fidelity level OR computational budget spent
```

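
The pseudocode of Table 2 can be mirrored in Python roughly as follows. This is a sketch under stated assumptions rather than the implementation used for the reported results: the three callables `generate_batch`, `train_on_batch`, and `should_switch` are hypothetical placeholders for the computational model, the ANN training step, and the evaluation of (T1)/(T2), and `max_batches` stands in for the computational budget.

```python
def train_unidirectional(n_levels, generate_batch, train_on_batch,
                         should_switch, max_batches=1000):
    """Loop once from the lowest to the highest fidelity level.

    generate_batch(level) -> batch of samples from the model at `level`
    train_on_batch(batch) -> trains the ANN on the batch
    should_switch(level)  -> True if (T1) or (T2) applies at `level`
    Returns the fidelity level used for each generated batch.
    """
    level = 0  # lowest fidelity level
    levels_used = []
    while level < n_levels and len(levels_used) < max_batches:
        batch = generate_batch(level)
        train_on_batch(batch)
        levels_used.append(level)
        if should_switch(level):
            level += 1  # irreversible move to the next higher level
    return levels_used

# Toy usage: pretend the switching criterion fires after two batches per level.
batches_per_level = {}

def toy_generate(level):
    batches_per_level[level] = batches_per_level.get(level, 0) + 1
    return [level]

schedule = train_unidirectional(
    n_levels=3,
    generate_batch=toy_generate,
    train_on_batch=lambda batch: None,
    should_switch=lambda level: batches_per_level[level] >= 2,
)
```

Training ends either when the switching criterion fires at the highest level or when the batch budget is exhausted, matching the UNTIL condition of the pseudocode.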
#### Bidirectional Multi-Fidelity Training

In the brute-force multi-fidelity approach switching to a higher fidelity level is based on the criteria (T1) and (T2) from section "Criteria for switching to higher fidelity levels during training" and it is irreversible. Evaluation of criterion (T2) requires the approximation (4). In certain cases, this approximation may exhibit an unusually high error, for example, because the errors of the computational models at the different fidelity levels are not related by a simple power law. This happens in practice in particular if the lowest fidelity levels are based on extremely coarse discretizations because simple power laws governing convergence mostly apply only in the limit of infinitesimal (or practically at least relatively fine) discretization lengths. If (4) exhibits a high error, application of (T2) may lead to a strongly reduced computational efficiency. To overcome this problem, one can skip in such cases termination criterion (T2) as a whole and rather adopt the following bidirectional approach for switching between different fidelity levels: at a given fidelity level one first generates one batch at the next higher and one batch at the next lower fidelity level (if these exist) as well as several batches at the given fidelity level. The number of batches at the given fidelity level is adjusted such that the computation time spent on the generation of these batches surpasses the one spent for the generation of the single batch at the next higher fidelity level by a factor of m > 1. In the following we always assume m = 4, noting that the exact choice of m has only minor impact. Higher choices for m would reflect a lower rate at which the learning efficiency per computational resource is compared between fidelity levels. 
However, as additional batches at the comparatively lower fidelity level only gradually diminish their contribution to the learning process, our results were insensitive to delayed comparisons (and thus delayed transitions to a different fidelity level) as reflected by higher choices of m.

This procedure ensures that at each fidelity level one spends by far most of the computation time on generating training samples at this very fidelity level but that one also has available one "trial" batch at the next higher and next lower fidelity level, respectively. Now one uses the generated batches at these three fidelity levels for training the ANN successively in three super-epochs. For each fidelity level, we evaluate the ratio between the reduction of the RMSE during the associated super-epoch and the computation time spent on generating the underlying training samples. The fidelity level for which this ratio is maximal is chosen as the new "current" fidelity level. Note that this rule admits not only increasing the fidelity level but also decreasing it in case this is found to be computationally beneficial. For the sake of clarity, the bidirectional multi-fidelity strategy is described in detail as pseudocode in **Table 3**.
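
The level-selection rule described above could be sketched as follows. The function names, the dictionary layout of `trials`, and all numbers in the usage lines are hypothetical; rounding the batch count up with `math.ceil` is likewise an assumption about how the factor m would be enforced.

```python
import math

def batches_at_current_level(time_per_batch_current, time_per_batch_higher, m=4):
    """Batches needed at the current level so that their total generation
    time is at least m times that of one batch at the next higher level."""
    return math.ceil(m * time_per_batch_higher / time_per_batch_current)

def select_fidelity_level(trials):
    """Pick the level with the largest RMSE reduction per unit compute time.

    trials maps a fidelity level to a tuple
    (rmse_before, rmse_after, compute_time) of its super-epoch.
    """
    def efficiency(level):
        rmse_before, rmse_after, compute_time = trials[level]
        return (rmse_before - rmse_after) / compute_time
    return max(trials, key=efficiency)

# Toy usage around current level 2 (all numbers are made up).
n_batches = batches_at_current_level(time_per_batch_current=1.0,
                                     time_per_batch_higher=4.0, m=4)
trials = {
    1: (0.50, 0.49, 0.25),   # one trial batch at the next lower level
    2: (0.49, 0.25, 16.0),   # several batches at the current level
    3: (0.25, 0.24, 4.0),    # one trial batch at the next higher level
}
new_level = select_fidelity_level(trials)
```

Because the rule compares efficiencies rather than absolute RMSE reductions, a cheap lower level can win even if its absolute improvement is small, which is what permits the fidelity level to decrease.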



The bidirectional multi-fidelity strategy completely bypasses termination criterion (T2) and replaces it with a bidirectional switching strategy between different fidelity levels, which ensures that the vast majority of computational time is always spent on samples at the fidelity level that currently enables the computationally most efficient learning. Bidirectional multi-fidelity training can be particularly useful if the computational models used at the different global fidelity levels exhibit an approximation error that depends strongly on the input parameter regime, so that for certain choices of input parameters the fidelity (in terms of an absolute model error) is much higher than for others. Suppose that in such a case the samples are generated with input values that are not randomly distributed but, for example, first probe one parameter regime with a relatively high approximation error of the computational model and subsequently another regime with a relatively low approximation error. It will then typically be efficient to reduce the global fidelity level when moving from the first to the second input parameter regime, because this keeps the approximation error of the training samples roughly constant across the two regimes. This is beneficial because the approximation error of the samples should be linked to the smoothly changing approximation performance of the ANN.

# NUMERICAL EXAMPLES

#### General

For the computational examples in this section we implemented FFNs with three densely connected hidden layers in Python 3.6.5 using the Keras 2.2.0 library with TensorFlow 1.8.0 (Abadi et al., 2016) as a backend. Learning is accomplished via backpropagation, using Adam (Kingma and Ba, 2014) as an optimizer for gradient descent. The activation functions are rectified linear units. The number of layers and neurons per layer are specified in **Figure 3**.
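
To make the architecture concrete, the forward pass of a feedforward network with three densely connected hidden layers and rectified linear units can be written in a few lines of NumPy. This is a framework-agnostic illustration and not the Keras code used for the examples: the layer widths, the weight initialization, and the linear output layer are assumptions (the actual layer sizes are those given in Figure 3), and training with Adam is not shown.

```python
import numpy as np

def init_ffn(layer_sizes, seed=0):
    """Create (weights, bias) pairs for consecutive dense layers."""
    rng = np.random.default_rng(seed)
    return [
        (rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]

def ffn_forward(params, x):
    """Evaluate the network: ReLU on hidden layers, linear output layer."""
    a = np.asarray(x, dtype=float)
    for weights, bias in params[:-1]:
        a = np.maximum(a @ weights + bias, 0.0)  # rectified linear unit
    weights, bias = params[-1]
    return a @ weights + bias

# Hypothetical sizes: 3 inputs, three hidden layers of 16 neurons, 1 output.
params = init_ffn([3, 16, 16, 16, 1])
prediction = ffn_forward(params, np.ones((5, 3)))  # batch of 5 input tuples
```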

We generated all the training data used in the following examples by means of the in-house finite element code BACI (written in C++ and developed at the Institute for Computational Mechanics of the Technical University of Munich, Germany) and a 12-core Intel Xeon E5-2680v3 "Haswell." In the following, we will skip units, assuming thereby implicitly appropriately normalized quantities. For generating computational models, linear finite element discretizations with variable discretization lengths were used. The error convergence power in (4) is thus assumed to be p = 2 for displacement-based problems as in section "Two-Dimensional Problem: Elastic Deformation of a Thin Membrane Under Loading" and p = 1 for stress/strain-based problems as in section "Three-Dimensional Problem: Elastic Modulus of RVE With Material Inclusion." The maximal number of epochs within a super-epoch is chosen as E = 100. This number in fact rarely matters and is simply used to ensure that no peculiar situation can arise where the ANN training process can get stuck. The number s of super-epochs for evaluating whether the recent training progress of the ANN has been satisfactory is chosen as s = 3. For the combination of our learning architecture and problem settings, averaging the error over at least the last 3 super-epochs was found to be sufficient to preclude spurious deteriorations of the RMSE from affecting transition points. The number of test samples with maximal fidelity used in (2) and (4) is chosen as N̄ = 100. While this choice is heuristic in nature, it is legitimate because the only requirement on this number is that it be high enough to ensure reasonably accurate error estimates and low enough to limit the computational cost of generating the testing samples compared to the training samples.

#### Two-Dimensional Problem: Elastic Deformation of a Thin Membrane Under Loading

#### Problem Description

We consider a square rubber membrane of edge length L = 10. This membrane is modeled as a two-dimensional continuum with pure in-plane extensional and shear stiffness governed by the non-linear neo-Hookean strain energy function

$$
\psi = \frac{\mu}{2} \left[ tr\left(\mathbf{C}\right) - 3\right] \tag{5}
$$

with the material parameter µ = 1 and the (two-dimensional, in-plane) right Cauchy-Green tensor **C** and its trace tr(**C**). The membrane forms in its stress-free initial configuration a plane square of edge length L = 10. Before subjecting it to any other loading, it is uniformly stretched in both in-plane directions by a factor of 1.6 and then fixed at all four edges (zero Dirichlet boundary conditions). Subsequently, the membrane is loaded with four out-of-plane loads. These loads are uniform surface loads of magnitude f = 5, each acting on a circular domain of radius R = 0.5. Note that the prestretch by a factor of 1.6 is used here to endow the membrane with a non-zero out-of-plane stiffness at any point in time, which is beneficial for numerically stable computational modeling of the problem.

Our objective is training an ANN to predict the out-of-plane displacement of the center of the membrane, given the (in general variable) positions of the center points of the four surface loads. To this end, we provide the ANN with training data where each sample consists of four randomly chosen load positions, which form the input to the ANN, and the associated out-of-plane displacement of the membrane center, which is computed using finite element simulations. For these simulations, we use computational models with n = 5 different levels of fidelity, which correspond to uniform finite element discretizations with N<sub>el</sub><sup>2d</sup> ∈ {10<sup>2</sup>, 20<sup>2</sup>, 40<sup>2</sup>, 80<sup>2</sup>, 160<sup>2</sup>} 4-noded rectangular linear membrane elements. Per element, the unusually high number of 64 Gauss points was used to evaluate the surface loads on the membrane with an acceptable approximation error even when coarse spatial discretizations are applied. The membrane problem and its discretization with N<sub>el</sub><sup>2d</sup> = 100 elements are illustrated in **Figure 4**. In **Figure 5** the output of computational models at different fidelity levels (corresponding to 100, 1,600, and 25,600 finite elements) is illustrated for the same input (i.e., same out-of-plane loading).
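
As a worked illustration of estimate (4) for these five fidelity levels, assume that the discretization length h<sub>i</sub> is taken as the element edge length in the stress-free configuration, so that each refinement halves h<sub>i</sub>. With p = 2, as stated above for displacement-based problems, the estimated model error then drops by a factor of four per level:

$$
h_i = \frac{L}{10 \cdot 2^{i-1}}, \qquad \frac{h_i}{h_1} = 2^{-(i-1)}, \qquad e_{\rm CM}^i \approx \left(2^{-(i-1)}\right)^2 e_{\rm CM}^1 = 4^{-(i-1)}\, e_{\rm CM}^1, \qquad i = 1, \ldots, 5
$$

For the finest level, i = 5, this gives an estimated error of e<sub>CM</sub><sup>1</sup>/256. These factors follow from (4) under the stated assumption only; they are an idealized estimate and not measured errors.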

#### Results

We trained an ANN with the architecture from **Figure 3** with the three different strategies from section "Different Training Strategies" (single-fidelity, unidirectional multi-fidelity, and bidirectional multi-fidelity) to predict the out-of-plane displacement of the center for given positions of the four membrane loads. Multi-fidelity training required a higher number of samples but was consistently found to reduce the overall computational cost of the training significantly due to the much lower average computational cost of the samples used. **Figure 6** and **Table 4** report sample numbers, computational cost, and learning progress across the different fidelity levels for both single- and multi-fidelity training. In single-fidelity training only samples with highest fidelity (25,600 finite elements) were used for training. It should be noted that the numbers reported are the median over 100 training instances per strategy (reusing generated samples) in order to eliminate the impact of the randomness that is inherent to a training concept where samples are created on the basis of randomly chosen input values. As can be seen in the last column of **Table 4**, multi-fidelity training enabled us to train the ANN to the same level of accuracy as single-fidelity training at a computational cost that was consistently between half an order of magnitude and one order of magnitude lower.

With unidirectional multi-fidelity training, computational savings depend on the hyperparameter q in criterion (T2). For sufficiently small choices of q, the criterion (T2) becomes unfulfillable and is functionally removed. Conversely, sufficiently large choices of q lead to a permissive criterion (T2) which will be satisfied whenever it is checked, resulting in fast transitions to the highest fidelity level. In that case, the unidirectional multi-fidelity strategy resembles the standard single-fidelity approach, preceded only by a few batches at lower fidelity levels. The best choice of q depends on how accurately equation (4) describes the numerical approximation error, which cannot be easily determined a priori for many problem settings. This heuristic factor is completely eliminated in bidirectional multi-fidelity training, where switching between different fidelity levels is performed automatically so as to ensure a computationally efficient training progress. Interestingly, bidirectional multi-fidelity training yields this way computational savings that are, at least in this example, comparable to the ones possible with unidirectional multi-fidelity training for favorable choices of q.

FIGURE 4 | In a computational model with N<sub>el</sub><sup>2d</sup> = 100 4-noded membrane finite elements, an initially stress-free elastic rubber membrane is first subjected to an in-plane prestretch by a factor of 1.6 in both directions, then fixed at the boundaries and subjected to out-of-plane loading with circular surface loads (red circles with crosses). Finally, the out-of-plane displacement of the membrane center is recorded in the loaded configuration as output of the computational model. This output forms together with the positions of the four loads a training sample for the ANN with the specific fidelity level corresponding to a discretization with N<sub>el</sub><sup>2d</sup> = 100 finite elements.

TABLE 4 | Computation times and sample numbers for different training strategies for the membrane problem from section "Two-dimensional problem: elastic deformation of a thin membrane under loading."

#### Three-Dimensional Problem: Elastic Modulus of RVE With Material Inclusion

#### Problem Description

The second example originates from the field of materials mechanics and is related to the broader area of computational homogenization. A classical problem in this area is determining the homogenized (macroscopic) mechanical properties of a material whose microstructure is known in the form of a representative volume element (RVE). The RVEs studied here are cubes with edge length L = 10. They consist of a matrix material into which an ellipsoidal inclusion is embedded in the center. Size and shape of the inclusion are uniquely defined by the three ellipsoidal semiaxes a, b, and c, which can vary in a specific prescribed range. Both the semiaxes and the edges of the RVE are assumed to be aligned with the three coordinate axes x, y, and z. The origin of the coordinate system coincides with the center of the RVE (cf. **Figure 7**). Both the material of the inclusion and that of the surrounding matrix are assumed to be isotropic with Poisson's ratio ν = 0.3. Young's modulus of the matrix is E<sub>m</sub> = 1 and that of the inclusion is E<sub>i</sub> = 100. We are interested in computing Young's modulus E<sub>x</sub> in x-direction of a material consisting of RVEs of the above described type, depending on the exact geometry of the ellipsoidal inclusion. To this end, one can subject RVEs of the above type to a mechanical loading of the following type: at the face of the RVE oriented in positive x-direction one imposes as a boundary condition a uniform displacement u<sub>x</sub>(x = L/2) = 0.05 and at the opposite face oriented in negative x-direction a uniform displacement u<sub>x</sub>(x = −L/2) = −0.05. This mimics an average strain ε<sub>x</sub> = 0.01 of the RVE in x-direction. On the other faces of the RVE, displacement is constrained such that its component orthogonal to the respective face is uniform across the whole face and that the average outer normal traction vector on each face is zero. Under these conditions, one can compute

$$E\_{\mathbf{x}} = \frac{|f\_{\mathbf{x}}|}{L^2 \varepsilon\_{\mathbf{x}}} \tag{6}$$

with the total reaction force f<sub>x</sub> in x-direction on the faces of the RVE oriented in positive and negative x-direction. Our objective is training an ANN to predict E<sub>x</sub>, given the (in general variable) lengths a, b, and c of the semiaxes of the ellipsoidal inclusion as input parameters, assuming a, b, c ∈ [2, 4]. To this end, we generate training samples, each consisting of a random input tuple (a, b, c) and the associated E<sub>x</sub>. The values of E<sub>x</sub> are computed by means of finite element simulations of the above-described problem. In these simulations, the RVEs are discretized with a uniform mesh of 8-noded hexahedral linear finite elements (**Figure 7**). We use simulations on four different fidelity levels, corresponding to discretizations with N<sub>el</sub><sup>3d</sup> = 4<sup>3</sup>, 8<sup>3</sup>, 16<sup>3</sup>, 32<sup>3</sup> elements, respectively (**Figure 8**).
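As a minimal sketch of how Equation (6) turns a simulated reaction force into a training label, the snippet below draws a random inclusion geometry and evaluates E<sub>x</sub>. The function names (`sample_semiaxes`, `average_strain`, `effective_modulus`) and the use of Python's `random` module are assumptions made for illustration only; the finite element solver that would supply f<sub>x</sub> is not shown.

```python
import random

L_EDGE = 10.0   # RVE edge length L
U_FACE = 0.05   # prescribed displacement magnitude on the two x-faces

def sample_semiaxes(rng, low=2.0, high=4.0):
    """Draw one random input tuple (a, b, c) from [low, high]^3."""
    return tuple(rng.uniform(low, high) for _ in range(3))

def average_strain(u_face=U_FACE, edge=L_EDGE):
    """Average strain in x: total elongation 2*u_face over the edge length."""
    return 2.0 * u_face / edge

def effective_modulus(f_x, edge=L_EDGE, eps_x=None):
    """Equation (6): E_x = |f_x| / (L^2 * eps_x)."""
    if eps_x is None:
        eps_x = average_strain(edge=edge)
    return abs(f_x) / (edge ** 2 * eps_x)

rng = random.Random(0)
a, b, c = sample_semiaxes(rng)

# Sanity check: a homogeneous RVE with E = 1 under uniaxial stress carries
# |f_x| = E * eps_x * L^2 = 1, so Equation (6) should return E_x close to 1.
print(effective_modulus(f_x=-1.0))
```

The homogeneous case is a convenient check of any implementation of Equation (6), because the prescribed lateral boundary conditions then reduce to a uniaxial stress state with a known answer.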

#### Results

The results for the three-dimensional RVE problem closely resemble the results for the two-dimensional membrane problem presented in the previous "Results" section. In two and three dimensions alike, multi-fidelity strategies require the generation of more samples but still reduce the total computational cost significantly because the average computation time per sample is much lower. **Figure 9** and **Table 5** report sample numbers, computational cost, and learning progress across the different fidelity levels for both single- and multi-fidelity training. In single-fidelity training, only samples with the highest fidelity (262,144 finite elements) were used. Again, the numbers reported here are the median over 100 training instances per strategy, for the reasons discussed in the previous "Results" section. As can be seen in the last column of **Table 4**, multi-fidelity training enables us to train the ANN to the same level of accuracy at a computational cost that is consistently and significantly below that of single-fidelity training. Unlike in the previous two-dimensional example, unidirectional multi-fidelity training yields the best results in this three-dimensional example not for 1 < q < 10 but rather for q < 1, which indicates that the error estimates used for switching between the different fidelity levels may be considerably inaccurate. The interaction between the choice of q and the unidirectional multi-fidelity results follows the pattern described in the previous "Results" section. Again one can see that bidirectional multi-fidelity training, in which the parameter q is not required and switching between fidelity levels is instead performed automatically by dynamically probing the computational efficiency of training at different levels of fidelity, is a robust and viable tool that yields an efficiency gain of around half an order of magnitude compared to brute-force single-fidelity training.

#### CONCLUSIONS

Machine learning and artificial intelligence have attracted rapidly increasing interest in mechanical engineering and materials science over the last years. One of the major challenges in this area is training ANNs to predict or control the behavior of complex physical systems for which not enough real-world training data are available, for example, because experiments or measurements are too expensive, time-consuming, or dangerous. In this case, generating training data by way of realistic computational simulations is a viable and often the only promising alternative. Doing so can, however, be associated with a significant computational cost, which forms a serious bottleneck for the application of machine learning to complex physical systems. To overcome this problem, we propose in this paper a new systematic approach. It exploits the fact that, in its initial stage, training an ANN mainly aims at endowing the ANN with a coarse understanding of the general features of a problem. Using training data from detailed and thus computationally expensive models can be expected to be a waste of computational resources in this stage because coarse low-fidelity models often already capture the most salient features of a physical system but at a much lower computational cost. Based on this observation, we introduced herein a general and systematic multi-fidelity framework for training ANNs with data generated by computational models with different fidelity levels. Such models can easily be generated in the context of widely used computational methods such as the finite element method by varying the discretization length. In this framework, cheap low-fidelity computational models are used to generate the training data for the early stages of ANN training. As the training of the ANNs progresses, one gradually switches to higher-fidelity training data generated by means of more accurate and computationally more expensive models. This strategy is very general in nature and can in principle be applied to any problem where ANNs are trained with computational models whose accuracy can straightforwardly be controlled, for example, by way of a discretization length. This is true not only for the finite element method, which we are using herein, but also for numerous other methods for solving partial differential equations, such as finite difference methods, mesh-free discretization schemes such as the moving least squares method, or particle-based methods such as smoothed particle hydrodynamics (SPH). In this article, we focused on two application areas, structural mechanics and materials mechanics.
In these areas computational models are already widely used and coupling them with machine learning appears the natural next step to address several key problems such as efficient prediction of the behavior of complex mechanical systems under variable (e.g., loading) conditions or efficient homogenization of the mechanical behavior of complex materials with a heterogeneous microstructure depending on certain features of this microstructure.

TABLE 5 | Computation times and sample numbers for different training strategies for the membrane problem from section "Two-dimensional problem: elastic deformation of a thin membrane under loading."

We developed in this article a general multi-fidelity framework and discussed two slightly different versions of it. The first one is based on an estimate of the error of the computational models used at different fidelity levels. It involves a heuristic correction factor q. While it may often be possible to determine this correction factor based on simple rules of thumb, in other cases this may be more difficult. This dependency on q, which does not straightforwardly generalize to complex hybrid models (e.g., finite elements on one level, molecular dynamics on another level, other discretization schemes), is a drawback of the unidirectional multi-fidelity variant, which furthermore offers no clear method of a priori calculating the most efficient transition points between levels of fidelity. To eliminate this heuristic element and the problems it entails, we proposed a second version of our multi-fidelity framework where switching between different fidelity levels is controlled in a smart and fully automated way. To this end, our training algorithm probes at each fidelity level training samples also from neighboring fidelity levels and dynamically switches to the fidelity level where currently the largest training progress per computational cost can be achieved. We consider this method a robust variant of the multi-fidelity strategy, relying on fewer parameters than the unidirectional approach. In summary, we found that our multi-fidelity training strategy enables us to train ANNs to the same level of accuracy as standard (single-fidelity) approaches but at a computational cost that is lower by around half to one order of magnitude. This gives rise to the hope that the general multi-fidelity strategy introduced herein can become a powerful and versatile tool for the future combination of computational simulations and artificial intelligence, in particular in the area of structural and materials mechanics.
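The switching rule of the bidirectional variant can be illustrated with a short sketch. It is not the implementation used for the results reported above: the function name `choose_next_level`, the way training progress is measured (here, a reduction of some validation error during a short probing phase), and all numbers are assumptions made purely for illustration.

```python
def choose_next_level(current, probes, n_levels):
    """Return the fidelity level with the largest progress per unit cost.

    probes maps a fidelity level to a pair (error_reduction, cost) measured
    during a short probing phase. Only the current level and its direct
    neighbors are considered, so the level changes by at most one step.
    """
    candidates = [lvl for lvl in (current - 1, current, current + 1)
                  if 0 <= lvl < n_levels and lvl in probes]

    def efficiency(lvl):
        error_reduction, cost = probes[lvl]
        return error_reduction / cost

    return max(candidates, key=efficiency)

# Hypothetical probing results: (error reduction, computation time in s).
probes = {0: (0.001, 1.0), 1: (0.010, 4.0), 2: (0.060, 16.0)}
print(choose_next_level(current=1, probes=probes, n_levels=4))
```

With these made-up numbers the finer level wins because its error reduction outweighs its higher cost. Once coarse samples stop improving the network, their efficiency drops and the rule moves toward higher fidelity without any user-chosen factor q.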

We conclude this paper by noting that the two specific multi-fidelity training algorithms introduced in this paper, the unidirectional and the bidirectional training algorithm, are but a starting point. There are various ways in which the underlying general idea of systematic multi-fidelity training can be further developed and optimized. For example, one could employ for ANN training batches in which samples with several different fidelity levels are mixed. This would enable a seamless transition between different fidelity levels during training, which might yield additional computational savings.

#### AUTHOR CONTRIBUTIONS

RA and CC conceived of the presented approach and developed and employed the machine learning model. FB developed and performed the computational examples. RA wrote the main part and assembled all of the manuscript. All authors discussed the results and contributed to the final manuscript.

#### ACKNOWLEDGMENTS

Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) − Projektnummer 192346071 − SFB 986; Projektnummer 257981274.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Aydin, Braeu and Cyron. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Learning Corrections for Hyperelastic Models From Data

#### David González<sup>1</sup>, Francisco Chinesta<sup>2</sup> and Elías Cueto<sup>1</sup>\*

<sup>1</sup> Aragon Institute of Engineering Research, Universidad de Zaragoza, Zaragoza, Spain, <sup>2</sup> ESI Group Chair and PIMM Lab, ENSAM ParisTech, Paris, France

Unveiling physical laws from data is seen as the ultimate sign of human intelligence. While there is a growing interest in this direction within the machine learning community, some recent works have attempted to simply substitute physical laws with data. We believe that getting rid of centuries of scientific knowledge is simply nonsense. There are models whose validity and usefulness are beyond any doubt, so trying to substitute them with data seems to be a waste of knowledge. While it is true that fitting well-known physical laws to experimental data is sometimes a painful process, a good theory continues to be practical and to provide useful insights for interpreting the phenomena taking place. That is why we present here a method to construct, based on data, automatic corrections to existing models. Emphasis is put on the correct thermodynamic character of these corrections, so as to avoid violations of first principles such as the laws of thermodynamics. These corrections are sought under the umbrella of the GENERIC framework (Grmela and Oettinger, 1997), a generalization of Hamiltonian mechanics to non-equilibrium thermodynamics. This framework ensures the satisfaction of the first and second laws of thermodynamics, while providing a very appealing context for the proposed automated correction of existing laws. In this work we focus on solid mechanics, particularly large strain (visco-)hyperelasticity.

#### Edited by:

Christian Johannes Cyron, Technische Universität Hamburg, Germany

#### Reviewed by:

Kevin Linka, Technische Universität Hamburg, Germany

Youyong Li, Soochow University, China

> \*Correspondence: Elías Cueto ecueto@unizar.es

#### Specialty section:

This article was submitted to Computational Materials Science, a section of the journal Frontiers in Materials

Received: 05 November 2018 Accepted: 25 January 2019 Published: 15 February 2019

#### Citation:

González D, Chinesta F and Cueto E (2019) Learning Corrections for Hyperelastic Models From Data. Front. Mater. 6:14. doi: 10.3389/fmats.2019.00014

Keywords: data-driven computational mechanics, hyperelasticity, model correction, GENERIC, machine learning

# 1. INTRODUCTION

In a very recent paper about how to construct machines that could eventually learn and think like humans, Lake et al. (2017) state that "machines should build causal models of the world that support explanations and understanding, rather than merely solving pattern recognition problems" and that "model building is the hallmark of human-level learning, or explaining observed data through the construction of causal models of the world". Indeed, machine learning of physical laws could be seen as the ultimate form of machine intelligence, and this should be done, of course, from data.

There is a very active field of research around this way of reasoning. For instance, in Brunton et al. (2016) a method is presented that operates on a bag of terms like sines, cosines, exponentials, etc., so as to find an expression that is sparse (i.e., it incorporates few of these terms) while still explaining the experimental data. Similar approaches include techniques to find reduced-order operators from data (Peherstorfer and Willcox, 2015, 2016) or the possibility of constructing physics-informed machine learning (Raissi et al., 2017a,b; Swischuk et al., 2018).

In the field of computational materials science, this approach seems to have begun with the works of Kirchdoerfer and Ortiz (2016, 2017a). In these and subsequent works, they present a method in which the constitutive equation is substituted by experimental data, which could possibly be noisy (Kirchdoerfer and Ortiz, 2017b; Ayensa-Jiménez et al., 2018). In them, it is recognized that some equations (notably, equilibrium and compatibility) are of a higher epistemic nature, while constitutive equations, which are often phenomenological and therefore of lower epistemic value, could easily be replaced by data (Latorre and Montáns, 2014). The criterion is to establish a distance measure that indicates the closest experimental datum to be employed every time the constitutive law is called at the finite element integration point level.

In some of our previous works, this approach is further generalized by defining the concept of constitutive manifold, a low-dimensional embedding for the stress-strain pairs (see Lopez et al., 2018). Thus, by alternating between stress-strain pairs that satisfy either equilibrium or the constitutive equation, the solution that satisfies the three families of equations is found, regardless of the non-linearity of the behavior. Several methods have been studied for the construction of this constitutive manifold (Ibañez et al., 2017).

Another inherent difficulty in trying to machine-learn models is that of the adequate level of description. Every physical phenomenon can be described at different levels of detail. In the case of fluid mechanics, for instance, these levels range, in descending order of detail, from molecular dynamics to thermodynamics. In between, different theories have been developed that take care of different descriptors of the phenomenon taking place: the Liouville description, the Fokker-Planck equation, and hydrodynamics, to name but a few of the different possibilities (Español, 2004). Thus, there should be a compromise between detail in the description and the resulting computational tractability of the approach. This is something very difficult to discern for an artificial intelligence.

The risk of employing an approach based upon pure data regression is that of violating (due to the inherent noise in data, for instance) some basic principles such as the laws of thermodynamics: conservation of energy and non-negative entropy production. Trying to avoid these possible inconsistencies, in González et al. (2018) we developed a data-driven method that operates under the framework of the GENERIC formalism (Grmela and Oettinger, 1997; Öttinger, 2005). The General Equation for Non-Equilibrium Reversible-Irreversible Coupling (GENERIC) constitutes a generalization of Hamiltonian mechanics. Therefore, under the GENERIC umbrella, the equations satisfy basic thermodynamic principles by construction.

Thus, the problem translates to finding, by means of data, the right expression of the particular GENERIC formalism for the system at hand (or its finite element approximation, if we work in a purely numerical framework). The resulting approximation is thermodynamically sound and very appealing from the numerical point of view. The stability of the GENERIC approach and its thermodynamic consistency (in particular, the conservation of symmetries in the formulation) have been thoroughly investigated in previous works, whose reading is highly recommended (Romero, 2009, 2010).

However, even if the usual parameter fitting procedure from experimental data is often painful and, notably, gives a poor fit of the results on many occasions, we believe that well-known constitutive equations should not be discarded, thus wasting centuries of scientific discovery. Instead, we believe that it is interesting to simply correct those models that sometimes do not fit the results perfectly (sometimes only locally, in a delimited region of the phase space). This is the approach followed in Ibañez et al. (2018), where corrections are developed to yield criteria so as to render them compliant (to a specified tolerance level) with the available experimental results. A similar approach has been pursued recently in Lam et al. (2017) for a study on the interaction of aircraft wings. In these approaches, the chosen level of description is defined by the (poor) model, so, in principle, no further decision needs to be taken, as will be discussed later on.

The GENERIC formalism is valid for all levels of description, and could also help in deriving corrections from data that still maintain the thermodynamic properties of the resulting model. Hyperelastic models fall within Hamiltonian mechanics, i.e., they represent a purely conservative material. However, rubbers or foams, for instance, usually present some degree of viscoelasticity. In such cases, Hamiltonian mechanics will no longer be the right formalism to develop their constitutive equations. GENERIC should be preferred instead.

In this paper we study how to learn these corrections from data. First, in Section 2 we review the basics of the GENERIC formalism, with an emphasis on hyperelastic and visco-hyperelastic materials. In Section 3 we explain how to employ GENERIC to develop corrections to existing models from data, while in Section 4 we introduce, by means of an academic example in finite dimensions, the basic ingredients of our approach. This will be further detailed for visco-hyperelastic materials in Section 5. The paper ends with a discussion of the techniques just developed and the future lines of research in Section 6.

# 2. A REVIEW OF THE GENERIC FORMALISM

# 2.1. The Basics

The GENERIC formalism was introduced by Grmela and Oettinger (1997) in a seminal paper in an attempt to give a common structure to non-Newtonian fluid models. The establishment of such a model in the GENERIC framework starts by selecting appropriate state variables. This is not straightforward in a general case in which we have no prior information about the precise behavior of the system at hand. However, for most systems, and especially when we start from known models, as is the case in this work, simple rules exist for the selection of such variables (Öttinger, 2005). In fact, selecting mutually dependent variables does not constitute a problem, as most of the literature on GENERIC demonstrates. Let us call these variables **z**<sub>t</sub> = **z**(t): I → S, **z** ∈ C<sup>1</sup>(0, T], and emphasize their obvious time dependency in the interval I = (0, T]. S represents the space in which these variables live, which obviously depends on the particular system under scrutiny. The final objective of the GENERIC model is to establish an expression for the time evolution of these variables, **ż**(t).

The GENERIC equation takes, under these assumptions, the form

$$\dot{\mathbf{z}}\_{t} = \underbrace{\mathbf{L}(\mathbf{z}\_{t})\nabla E(\mathbf{z}\_{t})}\_{\text{Hamiltonian}} + \underbrace{\mathbf{M}(\mathbf{z}\_{t})\nabla S(\mathbf{z}\_{t})}\_{\text{Dissipative}}, \quad \mathbf{z}(0) = \mathbf{z}\_{0}. \tag{1}$$

The first term on the right-hand side represents the Hamiltonian, or conservative, part of the behavior of the system. In it, **L**(**z**<sub>t</sub>) is the so-called Poisson matrix. The second term is responsible for the dissipative behavior of the system, with **M**(**z**<sub>t</sub>) the so-called friction matrix. Here, E(**z**<sub>t</sub>) represents the total energy of the system, while S(**z**<sub>t</sub>) represents its entropy.

For Equation (1) to give a valid description of any physical system, it must be supplemented with the so-called degeneracy conditions:

$$\mathbf{L}(\mathbf{z}) \cdot \nabla S(\mathbf{z}) = \mathbf{0},\tag{2a}$$

$$\mathbf{M}(\mathbf{z}) \cdot \nabla E(\mathbf{z}) = \mathbf{0}.\tag{2b}$$

Together with these conditions, **L**(**z**) must be skew-symmetric and **M**(**z**) symmetric and positive semi-definite. If all of these requirements are met, then it holds that

$$\dot{E}(\mathbf{z}) = \nabla E(\mathbf{z}) \cdot \dot{\mathbf{z}} = \nabla E(\mathbf{z}) \cdot \mathbf{L}(\mathbf{z}) \nabla E(\mathbf{z}) + \nabla E(\mathbf{z}) \cdot \mathbf{M}(\mathbf{z}) \nabla S(\mathbf{z}) = 0,\tag{3}$$

which is, in fact, the equation of conservation of energy for the system. Additionally, these conditions ensure the satisfaction of

$$\dot{S}(\mathbf{z}) = \nabla S(\mathbf{z}) \cdot \dot{\mathbf{z}} = \nabla S(\mathbf{z}) \cdot \mathbf{L}(\mathbf{z}) \nabla E(\mathbf{z}) + \nabla S(\mathbf{z}) \cdot \mathbf{M}(\mathbf{z}) \nabla S(\mathbf{z}) \ge 0,\tag{4}$$

or, equivalently, the fulfillment of the second principle of thermodynamics.
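Equations (2) to (4) can be checked numerically on a small toy system. The sketch below uses a damped oscillator with state **z** = (q, p, s), an energy E = p²/(2m) + kq²/2 + Ts and an entropy S = s. This system, its parameter values, and all function names are assumptions chosen only for illustration (they are not taken from the examples of this paper); NumPy is assumed to be available.

```python
import numpy as np

# Hypothetical parameters: mass, stiffness, friction coefficient, temperature.
m, k, gamma, T = 1.0, 2.0, 0.3, 1.5

# Skew-symmetric Poisson matrix for the state z = (q, p, s).
L = np.array([[0.0, 1.0, 0.0],
              [-1.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])

def grad_E(z):
    """Gradient of E = p^2/(2m) + k q^2/2 + T s."""
    q, p, s = z
    return np.array([k * q, p / m, T])

def grad_S(z):
    """Gradient of the entropy S(z) = s."""
    return np.array([0.0, 0.0, 1.0])

def M(z):
    """Symmetric, positive semi-definite friction matrix with M grad_E = 0."""
    v = z[1] / m
    return gamma * np.array([[0.0, 0.0, 0.0],
                             [0.0, T, -v],
                             [0.0, -v, v * v / T]])

def generic_rates(z):
    """Return z_dot from Equation (1) together with E_dot and S_dot."""
    dE, dS = grad_E(z), grad_S(z)
    z_dot = L @ dE + M(z) @ dS
    return z_dot, dE @ z_dot, dS @ z_dot

z = np.array([0.7, -1.2, 0.4])
z_dot, E_dot, S_dot = generic_rates(z)
print("degeneracy (2a):", np.allclose(L @ grad_S(z), 0.0))
print("degeneracy (2b):", np.allclose(M(z) @ grad_E(z), 0.0))
print("E_dot ~ 0:", abs(E_dot) < 1e-12, " S_dot >= 0:", S_dot >= 0.0)
```

In this toy system the momentum equation reduces to the familiar damped oscillator, while the energy lost by the mechanical part reappears in the term Ts, so that the total energy stays constant and the entropy can only grow.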

Notably, Equation (1) constitutes the most general framework to develop a valid constitutive equation in the light of the principles of thermodynamics. A valid constitutive model must satisfy the GENERIC equation, and any possible correction to it should not take the result outside this framework. For a thorough description of a long list of models under the GENERIC formalism, the interested reader can consult Öttinger (2012). To exemplify the concepts just introduced, consider the simplest case of a conservative mechanical system whose time evolution can be expressed, in the Hamiltonian framework, by resorting to a description of the type **z**<sub>t</sub> = {**q**<sub>t</sub>, **p**<sub>t</sub>}, where **q**<sub>t</sub> represents the position and **p**<sub>t</sub> the momentum. In that situation, the system is purely Hamiltonian and

$$\mathbf{L}(\mathbf{z}) = \begin{bmatrix} \mathbf{0} & \mathbf{1} \\ -\mathbf{1} & \mathbf{0} \end{bmatrix},$$

with no entropy evolution, i.e., **M** = **0**. In this simple situation, **L**(**z**) turns out to be the canonical symplectic matrix and the GENERIC description of the system reduces to that of a Hamiltonian system.
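The purely Hamiltonian case can be illustrated with a single degree of freedom. The sketch below integrates **ż** = **L**∇E for a quadratic energy with the implicit midpoint rule, a scheme that conserves quadratic energies up to round-off. The oscillator, its stiffness, the time step, and the function names are assumptions made for illustration; NumPy is assumed to be available.

```python
import numpy as np

# Canonical symplectic matrix for one degree of freedom, z = (q, p).
L = np.array([[0.0, 1.0],
              [-1.0, 0.0]])

# Hypothetical quadratic energy E(z) = 0.5 * z^T A z (unit-mass oscillator
# with stiffness 4), so that grad E(z) = A z.
A = np.diag([4.0, 1.0])

def energy(z):
    return 0.5 * z @ A @ z

def midpoint_step(z, dt):
    """One implicit midpoint step for z_dot = L A z (a linear solve)."""
    I = np.eye(2)
    return np.linalg.solve(I - 0.5 * dt * L @ A, (I + 0.5 * dt * L @ A) @ z)

z = np.array([1.0, 0.0])
E0 = energy(z)
for _ in range(1000):
    z = midpoint_step(z, dt=0.05)
print("relative energy drift:", abs(energy(z) - E0) / E0)
```

The conservation follows from the skew-symmetry of **L**: the discrete energy increment is proportional to a quadratic form in **L**, which vanishes identically, mirroring the first term of Equation (3).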

#### 2.2. Hyperelasticity Under the Prism of GENERIC

It is important to highlight the fact that, for hyperelastic materials, the expression

$$\dot{\mathbf{z}}\_t = \mathbf{L}(\mathbf{z}\_t) \nabla E(\mathbf{z}\_t)$$

represents the usual hyperelastic problem under the Hamiltonian formalism (Romero, 2013). Indeed, if we choose **z**(**x**, t) = [**x**(**X**, t), **p**(**X**, t)]<sup>⊤</sup>, where **x** = φ(**X**) is the deformed configuration of the solid and **p** represents the material momentum density, then,

$$\dot{\mathbf{z}} = \begin{bmatrix} \dot{\mathbf{x}} \\ \dot{\mathbf{p}} \end{bmatrix} = \mathbf{L}\nabla E = \mathbf{L} \begin{bmatrix} \frac{\partial E}{\partial \mathbf{x}} \\ \frac{\partial E}{\partial \mathbf{p}} \end{bmatrix}.$$

The total energy of an elastic body can be decomposed as

$$E = W + K,$$

i.e., the sum of elastic and kinetic energies. Here, we assume a strain energy density potential w of the form

$$W = \int\_{\Omega} w(\mathbf{C}) \, d\Omega,$$

where **C** represents the right Cauchy-Green deformation tensor. While, in general, the strain energy density for an isotropic case would be of the form w = w(**X**, **C**, S), in the context of isotropic hyperelasticity (a purely Hamiltonian case) this dependence is often simply w = w(**C**). In turn, the kinetic energy will be

$$K = \int\_{\Omega} \frac{1}{2\rho\_0} |\mathbf{p}|^2 \, d\Omega.$$

In this framework, it is clear that

$$\frac{\partial E}{\partial \mathbf{x}} = \frac{\partial W}{\partial \mathbf{x}} = \nabla\_X \cdot \mathbf{P} = \nabla\_X \cdot [\mathbf{F}\mathbf{S}],$$

where **P** and **S** represent, respectively, the first and second Piola-Kirchhoff stress tensors and **F** is the deformation gradient. Given that

$$
\mathbf{p} = \rho\_0 \mathbf{V} = \rho\_0 \frac{\partial \mathbf{x}}{\partial t},
$$

with **V** the material velocity and ρ<sub>0</sub> the density in the reference configuration, we finally obtain

$$\dot{\mathbf{z}} = \begin{bmatrix} \dot{\mathbf{x}} \\ \dot{\mathbf{p}} \end{bmatrix} = \mathbf{L} \nabla E = \mathbf{L} \begin{bmatrix} \nabla\_X \cdot \mathbf{P} \\ \frac{\mathbf{p}}{\rho\_0} \end{bmatrix}.$$

This implies that

$$\mathbf{L} = \begin{bmatrix} \mathbf{0}\_{3 \times 3} & \mathbf{I}\_{3 \times 3} \\ -\mathbf{I}\_{3 \times 3} & \mathbf{0}\_{3 \times 3} \end{bmatrix},\tag{5}$$

which is fully compliant with the GENERIC framework, see Equation (1). This model is readily seen as equivalent to

$$
\dot{\mathbf{x}} = \frac{\mathbf{p}}{\rho\_0},
$$

$$\nabla\_X \cdot \mathbf{P} = \dot{\mathbf{p}},$$

which correspond to the definition of the material momentum density and the equilibrium equation, respectively.
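To make the relation **P** = **F****S** concrete, the following sketch evaluates both Piola-Kirchhoff stresses for one particular choice of w(**C**). The compressible neo-Hookean energy used here, the parameter values, and the function names are assumptions introduced only for illustration, since no specific energy is prescribed above; NumPy is assumed to be available. The analytic stress is compared with a finite-difference derivative of the energy with respect to **F**.

```python
import numpy as np

MU, LAM = 1.0, 2.0  # hypothetical Lame-type material parameters

def strain_energy(F):
    """Compressible neo-Hookean energy density w(C), with C = F^T F."""
    C = F.T @ F
    lnJ = np.log(np.linalg.det(F))
    return 0.5 * MU * (np.trace(C) - 3.0) - MU * lnJ + 0.5 * LAM * lnJ ** 2

def second_pk(F):
    """S = 2 dw/dC = MU (I - C^-1) + LAM ln(J) C^-1 for the energy above."""
    C_inv = np.linalg.inv(F.T @ F)
    lnJ = np.log(np.linalg.det(F))
    return MU * (np.eye(3) - C_inv) + LAM * lnJ * C_inv

def first_pk(F):
    """First Piola-Kirchhoff stress P = F S."""
    return F @ second_pk(F)

def first_pk_numeric(F, h=1e-6):
    """Central finite-difference approximation of P = dw/dF."""
    P = np.zeros((3, 3))
    for i in range(3):
        for j in range(3):
            dF = np.zeros((3, 3))
            dF[i, j] = h
            P[i, j] = (strain_energy(F + dF) - strain_energy(F - dF)) / (2 * h)
    return P

F = np.array([[1.10, 0.05, 0.00],
              [0.00, 0.95, 0.02],
              [0.01, 0.00, 1.03]])
print(np.max(np.abs(first_pk(F) - first_pk_numeric(F))))
```

The printed value is the largest entrywise mismatch between the analytic and the numerical stress and should be small compared with the stress itself. Any other hyperelastic energy can be checked in the same way by replacing `strain_energy` and `second_pk`.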

Under this rationale, the possible viscous effects in the material would be described by the second term on the right-hand side of Equation (1).

REMARK. We have stated that, under the GENERIC formalism, an isotropic Hamiltonian or conservative hyperelastic model can be written in the form w = w(**C**) and therefore will not depend on S. This discussion is strongly related to that of the adequate level of description of the model. In fact, many hyperelastic models exist that depend on different parameters that can influence their viscous behavior; see, for instance, Mihai and Goriely (2017).

Indeed, by introducing a new potential (entropy) in the formulation, what we are doing is introducing ignorance of these details, while still taking into account their influence on the results. It is the same process we face if we are not interested in tracking every molecule of a gas in a container but prefer instead a description based on macro-scale magnitudes such as pressure, volume, and temperature. The process of coarse-graining the description in a non-equilibrium setting makes it necessary to introduce a new potential that accounts for the neglected information: entropy (Español, 2004; Pavelka et al., 2018). Thus, in the correction procedure that we are about to introduce, there will be no need to add new variables to the model, only an adequate entropy potential to the formulation.

The problem of constructing a valid constitutive model under the GENERIC point of view is therefore reduced to that of finding the particular structure of the terms **L**(**z**), E(**z**), **M**(**z**), and S(**z**). The classical approach is to do it analytically, as in Romero (2009, 2010), for instance, or Vázquez-Quesada et al. (2009) and Español (2004), to name but a few of the examples in the literature. A different approach is to find the structure of these terms numerically, from data. This will be done possibly with the help of manifold learning techniques such as LLE (Roweis and Saul, 2000) or isomap (Tenenbaum et al., 2000), among others. It is the approach followed by the authors in González et al. (2018) and, in some sense, it is also the approach followed by Millán and Arroyo (2013) without even knowing the structure of GENERIC. This approach is also somehow related to the use of compositional rules to construct models (Grosse et al., 2012). This last reference shares with the approach herein the need of identifying the structure of several matrices that are then used to develop models—in that case, of phenomena that do not even obey the laws of physics, such as voting tendencies, for instance.

#### 3. CORRECTING MODELS IN A GENERIC FRAMEWORK

In this work we do not seek to unveil models by means of GENERIC and experimental data. As explained in the introduction, we believe that it is simply nonsense to discard models that have proven useful for decades. In the case of hyperelasticity, these include, among a wide list of references, the works of Treloar (1975), Ogden (1984) or Holzapfel and Gasser (2000). These models, as analyzed before, already have a GENERIC structure.

Purely hyperelastic materials are strictly conservative. However, soft living matter, for instance, which is often modeled under the hyperelastic theory, presents some non-negligible viscous effects (Peña et al., 2011; García et al., 2012). In that case, in the light of the GENERIC formalism, it is necessary to complement the model with a dissipative part, i.e., to determine the precise form of **M**(**z**) and S(**z**).

What we will do in this work, in fact, is to assume that an inexact model exists, so that a correction is needed,

$$\mathbf{z}^{\text{corr}} = \mathbf{z}^{\text{exp}} - \mathbf{z}^{\text{mod}},$$

where "corr", "exp" and "mod" stand, respectively, for correction, experimental and model. We will develop a correction in the GENERIC framework so as to guarantee that the corrected model for the experimental results will also have a GENERIC structure. To this end, we cast the correction in the form

$$\dot{\mathbf{z}}^{\text{corr}} = \mathbf{L}\nabla E(\mathbf{z}^{\text{corr}}) + \mathbf{M}\nabla S(\mathbf{z}^{\text{corr}}).$$

We do not consider a correction for **L** or **M**. In the light of the previous remark, **L** is assumed to be identical to that of the model (we consider the same state variables). As for **M**, the correction of the model could have an important influence on its form (recall again the remark in the previous section: we attribute to S the possible presence of fine-grained state variables that are not considered in the Hamiltonian part of the model). We therefore discard any possible **M** coming from the inexact model and instead re-compute it from scratch. With these assumptions, the resulting model that fits the experimental results will have the form

$$\dot{\mathbf{z}}^{\text{exp}} = \dot{\mathbf{z}}^{\text{mod}} + \dot{\mathbf{z}}^{\text{corr}} = \mathbf{L}\nabla E(\mathbf{z}^{\text{corr}}) + \mathbf{M}\nabla S(\mathbf{z}^{\text{corr}}) + \mathbf{L}\nabla E(\mathbf{z}^{\text{mod}}),$$

so that, finally,

$$\dot{\mathbf{z}}^{\text{exp}} = \mathbf{L} \left( \nabla E(\mathbf{z}^{\text{corr}}) + \nabla E(\mathbf{z}^{\text{mod}}) \right) + \mathbf{M} \nabla S(\mathbf{z}^{\text{corr}}),$$

which proves that the corrected model for **z**<sup>exp</sup> possesses a GENERIC structure with a correction in the Hamiltonian term.

Consider that a set of n<sub>meas</sub> experimental measurements Z = {**z**<sub>0</sub><sup>exp</sup>, **z**<sub>1</sub><sup>exp</sup>, . . . , **z**<sub>n<sub>meas</sub></sub><sup>exp</sup>} is available. The predictions of the inexact model are then subtracted from the experimental results. The final objective is therefore to obtain a discrete approximation

$$\frac{z\_{n+1}^{\text{corr}} - z\_n^{\text{corr}}}{\Delta t} = \mathsf{L}\,\mathsf{DE}(z\_{n+1}^{\text{corr}}) + \mathsf{M}(z\_{n+1}^{\text{corr}})\mathsf{DS}(z\_{n+1}^{\text{corr}}),$$

to the GENERIC structure of the discrepancy between model and experiments, by identifying DE(**z**), and possibly also M(**z**) and DS(**z**). Here DE and DS represent the discrete gradients (in a finite element sense).

Therefore, the proposed algorithm consists in solving the following minimization problem (possibly constrained by the degeneracy conditions) within a time interval J ⊆ I:

$$\mu^\* = \lbrace \mathsf{M}, \mathsf{DE}, \mathsf{DS} \rbrace = \operatorname\*{arg\,min}\_{\mu} \left\| \mathbf{z}(\mu) - \mathbf{z}^{\text{meas}} \right\|,$$

with **z**<sup>meas</sup> ⊆ Z, a subset of the total available experimental results. See the discussion in González et al. (2018) about how to determine the right size of the sample set, the possibility of employing monolithic or staggered strategies, etc.
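To make the regression idea concrete, the following is a minimal sketch on a toy problem; it is not the authors' algorithm (their Matlab code is not reproduced here). It assumes a damped oscillator with state z = (q, p, s), a known Poisson matrix, and a friction term whose only unknown is a scalar coefficient `gamma`. All function names and parameter values are hypothetical. The coefficient is recovered by least squares on the midpoint form of the discrete GENERIC residual, using pseudo-experimental data generated by the same toy system.

```python
import numpy as np

# Toy thermodynamic oscillator (all values are illustrative assumptions).
M_MASS, K_SPRING, C_HEAT, THETA0 = 1.0, 4.0, 1.0, 1.0

# Poisson matrix for z = (q, p, s): canonical in (q, p), no action on s.
L = np.array([[0.0, 1.0, 0.0],
              [-1.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])


def temperature(z):
    """theta = de/ds for the assumed internal energy e(s) = c*theta0*exp(s/c)."""
    return THETA0 * np.exp(z[2] / C_HEAT)


def grad_E(z):
    """Gradient of E = p^2/(2m) + k q^2/2 + e(s)."""
    q, p, _ = z
    return np.array([K_SPRING * q, p / M_MASS, temperature(z)])


def friction_direction(z):
    """M(z) grad S(z) divided by gamma, with grad S = (0, 0, 1)."""
    v = z[1] / M_MASS
    return np.array([0.0, -v, v * v / temperature(z)])


def rhs(z, gamma):
    return L @ grad_E(z) + gamma * friction_direction(z)


def simulate(z0, gamma, dt, n_steps):
    """Classical RK4 integration; returns an array of shape (n_steps + 1, 3)."""
    traj = [np.asarray(z0, dtype=float)]
    for _ in range(n_steps):
        z = traj[-1]
        k1 = rhs(z, gamma)
        k2 = rhs(z + 0.5 * dt * k1, gamma)
        k3 = rhs(z + 0.5 * dt * k2, gamma)
        k4 = rhs(z + dt * k3, gamma)
        traj.append(z + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0)
    return np.array(traj)


def fit_gamma(traj, dt):
    """Least-squares estimate of gamma from the midpoint GENERIC residual."""
    z_mid = 0.5 * (traj[1:] + traj[:-1])
    z_dot = (traj[1:] - traj[:-1]) / dt
    num = den = 0.0
    for zm, zd in zip(z_mid, z_dot):
        residual = zd - L @ grad_E(zm)   # what the Hamiltonian part misses
        basis = friction_direction(zm)
        num += residual @ basis
        den += basis @ basis
    return num / den


gamma_true, dt = 0.3, 0.01
z_exp = simulate([1.0, 0.0, 0.0], gamma_true, dt, 2000)  # pseudo-experimental data
gamma_hat = fit_gamma(z_exp, dt)
```

Because the unknown enters the residual linearly, the fit reduces to a one-line normal equation. Identifying full matrices such as M(**z**) and the discrete gradients would require a constrained optimizer instead.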

In the next Section this procedure is exemplified with the help of an academic example in finite dimensions.

## 4. AN INTRODUCTORY EXAMPLE

We first consider an example analyzed in Romero (2009) and then again in González et al. (2018). The system is a double pendulum whose masses are connected by thermoelastic springs. It comprises two masses m<sub>1</sub> and m<sub>2</sub> connected by springs of internal energy e<sub>a</sub> and e<sub>b</sub>. They oscillate around a fixed point, see **Figure 1**. We employ the classical notation of Hamiltonian mechanics, where **q**<sub>i</sub>, **p**<sub>i</sub>, i = 1, 2, represent positions and momenta, respectively. For the springs, their respective entropies are s<sub>j</sub>, and their lengths at rest are denoted by λ<sub>j</sub><sup>0</sup>, j = a, b.

The set of state variables for this double pendulum will be therefore

$$\begin{aligned} \mathcal{S} &= \lbrace \mathbf{z} = (\mathbf{q}\_1, \mathbf{q}\_2, \mathbf{p}\_1, \mathbf{p}\_2, s\_a, s\_b) \\ &\in (\mathbb{R}^2 \times \mathbb{R}^2 \times \mathbb{R}^2 \times \mathbb{R}^2 \times \mathbb{R} \times \mathbb{R}), \ \mathbf{q}\_1 \neq \mathbf{0}, \ \mathbf{q}\_2 \neq \mathbf{q}\_1 \rbrace. \end{aligned}$$

The GENERIC structure for this problem requires the total energy of the system, which is composed of the kinetic energy of the masses and the internal energy of the springs, i.e.,

$$E(\mathbf{z}) = K\_1(\mathbf{z}) + K\_2(\mathbf{z}) + e\_a(\lambda\_a, s\_a) + e\_b(\lambda\_b, s\_b),$$

with

$$
\lambda\_a = \sqrt{\mathbf{q}\_1 \cdot \mathbf{q}\_1}, \quad \lambda\_b = \sqrt{(\mathbf{q}\_2 - \mathbf{q}\_1) \cdot (\mathbf{q}\_2 - \mathbf{q}\_1)}.
$$

The temperature in the springs, θ<sub>j</sub>, is assumed to originate from the Joule effect,

$$\theta\_j = \frac{\partial e\_j}{\partial s\_j}, \quad j = a, b.$$

The conductivity in the springs will be denoted by κ. Under this rationale, the resulting equations for the double pendulum will be

$$\begin{aligned} \dot{\mathbf{q}}\_i &= \frac{\mathbf{p}\_i}{m\_i}, \\ \dot{\mathbf{p}}\_i &= -\frac{\partial}{\partial \mathbf{q}\_i} (e\_a + e\_b), \\ \dot{s}\_j &= \kappa \left( \frac{\theta\_k}{\theta\_j} - 1 \right), \end{aligned}$$

with i = 1, 2, j = a, b, k ≠ j. Therefore, the gradients of the GENERIC formalism take the form

$$\nabla E(\mathbf{z}) = \left(f\_a \mathbf{n}\_a - f\_b \mathbf{n}\_b, f\_b \mathbf{n}\_b, \frac{\mathbf{p}\_1}{m\_1}, \frac{\mathbf{p}\_2}{m\_2}, \theta\_a, \theta\_b\right), \tag{6a}$$

$$\nabla S(\mathbf{z}) = (\mathbf{0}, \mathbf{0}, \mathbf{0}, \mathbf{0}, 1, 1), \tag{6b}$$

with f<sub>j</sub>, **n**<sub>j</sub>, j = a, b, the forces in the springs and the unit vectors along their respective directions.

The Poisson and friction matrices are, in this case,

$$\mathbf{L}(\mathbf{z}) = \begin{pmatrix} \mathbf{0} & \mathbf{0} & \mathbf{1} & \mathbf{0} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{1} & \mathbf{0} & \mathbf{0} \\ -\mathbf{1} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & -\mathbf{1} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} \end{pmatrix}, \quad \mathbf{M}(\mathbf{z}) = \begin{pmatrix} \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \kappa \frac{\theta\_b}{\theta\_a} & -\kappa \\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} & -\kappa & \kappa \frac{\theta\_a}{\theta\_b} \end{pmatrix}. \tag{7}$$
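The structure of these operators can be checked numerically. The sketch below assumes that the entropy block of **M** is κ[[θ<sub>b</sub>/θ<sub>a</sub>, −1], [−1, θ<sub>a</sub>/θ<sub>b</sub>]], which is the form implied by the entropy equations above together with the degeneracy condition **M**∇E = **0**. The temperatures, conductivity and mechanical entries of ∇E are arbitrary, hypothetical values. It verifies the degeneracy conditions, energy conservation and non-negative entropy production.

```python
import numpy as np


def double_pendulum_operators(theta_a, theta_b, kappa):
    """Return L and M for z = (q1, q2, p1, p2, s_a, s_b), with q_i, p_i in R^2."""
    L = np.zeros((10, 10))
    L[0:4, 4:8] = np.eye(4)     # dq/dt = +dE/dp
    L[4:8, 0:4] = -np.eye(4)    # dp/dt = -dE/dq
    M = np.zeros((10, 10))
    # Assumed entropy block: gives ds_j/dt = kappa * (theta_k / theta_j - 1).
    M[8:, 8:] = kappa * np.array([[theta_b / theta_a, -1.0],
                                  [-1.0, theta_a / theta_b]])
    return L, M


rng = np.random.default_rng(0)
theta_a, theta_b, kappa = 300.0, 320.0, 0.5   # hypothetical values

# Arbitrary mechanical part of grad E; only the temperatures matter for M.
grad_E = np.concatenate([rng.normal(size=8), [theta_a, theta_b]])
grad_S = np.concatenate([np.zeros(8), [1.0, 1.0]])

L, M = double_pendulum_operators(theta_a, theta_b, kappa)
z_dot = L @ grad_E + M @ grad_S

energy_rate = grad_E @ z_dot     # dE/dt, expected to vanish
entropy_rate = grad_S @ z_dot    # dS/dt, expected to be non-negative
```

The skew-symmetry of **L** and the symmetry of **M**, combined with **L**∇S = **0** and **M**∇E = **0**, are what make the energy rate vanish for any state, not only for the values chosen here.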

However, we will assume that this description of the system is not available (it will be used only as a ground truth to determine errors) and that the system is thought to be purely Hamiltonian.

In this scenario, the goal of our method will be that of unveiling the dissipative part of the model so as to correct the pure Hamiltonian behavior of the assumed model. In other words, the system will be considered as modeled by

$$\dot{\boldsymbol{z}}\_t = L(\boldsymbol{z}\_t) \nabla E(\boldsymbol{z}\_t),$$

with **L** as in Equation (7) and ∇E as defined in Equation (6a). Results for the ground truth, the assumed (purely Hamiltonian) model and the corrected model are shown in **Figure 2**.

The mean squared error of the assumed model with respect to the pseudo-experimental data was initially 0.1732%. Note the small influence of the Joule effect on the results. However, after a correction is found and the dissipative character of the model is taken into account, this error decreases to 0.0125%, i.e., by one order of magnitude.

# 5. CORRECTIONS TO HYPERELASTIC MODELS

In order to show the full capabilities of the proposed method, we consider now an example of a visco-hyperelastic material whose precise constitutive model is to be corrected from experimental data.

# 5.1. Ground Truth. Pseudo-Experimental Data

The pseudo-experimental data is obtained by finite element simulation of a visco-hyperelastic Mooney-Rivlin material in which

$$W = C\_1(\overline{I}\_1 - 3) + C\_2(\overline{I}\_2 - 3) + D\_1(J - 1)^2,\tag{8}$$

with Ī<sub>1</sub> = J<sup>−2/3</sup>I<sub>1</sub> and Ī<sub>2</sub> = J<sup>−4/3</sup>I<sub>2</sub>, and where the invariants of the right Cauchy-Green tensor **C** are defined as I<sub>1</sub> = λ<sub>1</sub><sup>2</sup> + λ<sub>2</sub><sup>2</sup> + λ<sub>3</sub><sup>2</sup> and I<sub>2</sub> = λ<sub>1</sub><sup>2</sup>λ<sub>2</sub><sup>2</sup> + λ<sub>2</sub><sup>2</sup>λ<sub>3</sub><sup>2</sup> + λ<sub>3</sub><sup>2</sup>λ<sub>1</sub><sup>2</sup>, respectively. J represents, as usual, the determinant of the gradient of deformation tensor. In this case, C<sub>1</sub> = 27.56 MPa, C<sub>2</sub> = 6.89 MPa and D<sub>1</sub> = 0.0029 MPa.

To model the viscoelastic behavior of this rubberlike material, it is assumed that the material's shear modulus G and bulk modulus K evolve in time. This evolution is modeled by means of a Prony series in terms of the instantaneous moduli,

$$\frac{G(t)}{G\_0} = 1 - \sum\_{i=1}^{2} \overline{g}\_i^p \left( 1 - \exp\left(-\frac{t}{\tau\_i}\right) \right),$$

$$\frac{K(t)}{K\_0} = 1 - \sum\_{i=1}^{2} \overline{k}\_i^p \left( 1 - \exp\left(-\frac{t}{\tau\_i}\right) \right),$$

with ḡ<sub>i</sub><sup>p</sup> = [0.2, 0.1] and k̄<sub>i</sub><sup>p</sup> = [0.5, 0.2]. The relaxation times take the values τ<sub>i</sub> = [0.1, 0.2] seconds, respectively. With these values, the initial instantaneous Young's modulus takes the value E = 206.7 MPa, with Poisson's ratio ν = 0.45.
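As an illustration of the two Prony series above, the following sketch evaluates the normalized relaxation of the shear and bulk moduli with the coefficients and relaxation times quoted in the text. The function and constant names are hypothetical. Only the ratios G(t)/G<sub>0</sub> and K(t)/K<sub>0</sub> are computed, since the instantaneous moduli themselves are not needed for the check.

```python
import numpy as np

G_BAR = np.array([0.2, 0.1])   # shear Prony coefficients from the text
K_BAR = np.array([0.5, 0.2])   # bulk Prony coefficients from the text
TAU = np.array([0.1, 0.2])     # relaxation times in seconds


def prony_ratio(t, weights, tau=TAU):
    """Normalized modulus 1 - sum_i w_i * (1 - exp(-t / tau_i)) at time(s) t."""
    t = np.asarray(t, dtype=float)[..., None]   # broadcast over the series terms
    return 1.0 - np.sum(weights * (1.0 - np.exp(-t / tau)), axis=-1)


times = np.linspace(0.0, 1.0, 101)      # one-second creep plateau
g_ratio = prony_ratio(times, G_BAR)     # G(t) / G_0
k_ratio = prony_ratio(times, K_BAR)     # K(t) / K_0
```

With these coefficients the long-time limits are 1 − (0.2 + 0.1) = 0.7 for the shear modulus and 1 − (0.5 + 0.2) = 0.3 for the bulk modulus, and both relaxation times are short compared with the one-second plateaus used in the loading histories.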

Data was generated from a total of 557 different loading processes applied to the same specimen, which was subjected to load histories of different amplitudes. In every case, a first plane stress state (σ<sub>x</sub>, σ<sub>y</sub>, τ<sub>xy</sub>), whose values are not correlated, is applied during a short impulse of 0.021 seconds and then maintained at a constant value for one more second, allowing the material to creep. This is followed by a second loading process of 0.021 seconds at a different (σ<sub>x</sub>, σ<sub>y</sub>, τ<sub>xy</sub>) value, followed by a final plateau of one more second. These two stress states were different for each one of the 557 experiments. The results are stored in the form of 557 different Z vectors, each representing a trajectory in time.

# 5.2. Modeling the Results With a Purely Hyperelastic Model

After the generation of the pseudo-experimental data, we tried to reproduce these results with a deliberately wrong model: the material was assumed to follow a Mooney-Rivlin model with no viscous response (and thus purely Hamiltonian or conservative). The comparison of the experimental results and the predictions given by this (poor) model is shown in **Figure 3**.

It seems obvious that a classical Mooney-Rivlin model cannot reproduce the viscous behavior of the reference material. In the next section a correction to this model is developed based on the available data and the procedure introduced in Section 3.

# 5.3. Correction of the Dissipative Part of the Model

Knowing in advance that the pseudo-experimental results come from a viscous modification to a Mooney-Rivlin model, a first attempt is made to find a correction by incorporating a dissipative part in the GENERIC description of the model. To this end, a fitting procedure of the dissipative GENERIC terms was carried out for each one of the experimental results.

In **Figure 4** results are shown for one of the 557 tests. Experimental results, the Mooney-Rivlin prediction and the subsequent GENERIC correction are shown. As can be noticed, experimental results are captured to a high degree of accuracy. For the particular test shown in **Figure 4**, the mean squared error was 0.018%. All the tests showed similar levels of error.

# 5.4. What if Some Terms Need no Correction?

Of course, in general we will not know in advance that a particular model is the best for the Hamiltonian part of the behavior. In a general situation both parts of the model will need to be corrected. To show the robustness of the presented method, we demonstrate here that if we try to correct the Hamiltonian part of the model, the method is able to detect that it is already correct (Mooney-Rivlin) and that it needs no correction. The method proceeds by correcting the dissipative part only, obtaining the same levels of error as in the preceding section.

#### 5.5. Constructing the Good Model

The final goal of the method is not to reproduce each one of the experimental results, but to be able to construct a true model from data. To this end, we first unveil the underlying manifold structure of the experimental data. The time series of **z**<sup>exp</sup>(t) is grouped into a high-dimensional vector, one for each of the 557 experiments. These vectors are then embedded, by means of Locally Linear Embedding (LLE) techniques (Roweis and Saul, 2000), onto a low-dimensional manifold. This makes it possible to unveil the true neighborhood structure of the experimental data and, notably, to perform rigorous interpolation among data on the manifold structure rather than in the Euclidean space.

The first step when applying LLE techniques to a set of high-dimensional data is to find the right dimensionality of the embedding space. To do so, the eigenvalues of the projection matrix are usually studied. These are depicted in **Figure 5**.

The first LLE eigenvalue is always close to zero within machine precision, and is discarded. The next "isolated" eigenvalues represent the true dimensionality of the embedding space (in this case, three). The rest of the eigenvalues are usually much closer to each other and do not represent the right dimensionality of the embedding space. Therefore, it seems that the right dimensionality of the embedding space is three, or perhaps even two.

FIGURE 4 | Comparison of the Mooney-Rivlin model prediction and its subsequent GENERIC correction with the experimental results for one particular experiment.

Locally Linear Embedding techniques need some user intervention to determine, by trial and error, the adequate number of neighbors for each datum. In this case we assume some 20 neighbors for each one. The key step in finding a good low-dimensional embedding of the data is to find the weights **W** that minimize the functional

$$\mathcal{F}(\mathbf{W}) = \sum\_{m=1}^{557} \left\| \mathbf{z}\_m - \sum\_{i=1}^{20} W\_{mi} \mathbf{z}\_i \right\|^2.$$

Once these weights are found, LLE assumes that they continue to be valid in the low-dimensional embedding, and looks for the new coordinates ξ in this space accordingly, by minimizing a new functional

$$\mathcal{G}(\boldsymbol{\xi}\_1, \dots, \boldsymbol{\xi}\_{557}) = \sum\_{m=1}^{557} \left\| \boldsymbol{\xi}\_m - \sum\_{i=1}^{20} W\_{mi} \boldsymbol{\xi}\_i \right\|^2.$$
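The two minimizations above can be sketched with NumPy alone. This is a generic LLE implementation in the spirit of Roweis and Saul (2000), not the authors' code. The regularization parameter `reg`, the synthetic data set and all function names are assumptions made for illustration. The weights are constrained to sum to one for each datum, and the embedding coordinates are taken from the bottom eigenvectors of (I − W)<sup>T</sup>(I − W), skipping the first one.

```python
import numpy as np


def lle_weights(X, n_neighbors, reg=1e-3):
    """Reconstruction weights W (each row sums to one) minimizing F(W)."""
    n = X.shape[0]
    W = np.zeros((n, n))
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    for m in range(n):
        idx = np.argsort(d2[m])[1:n_neighbors + 1]     # skip the point itself
        Z = X[idx] - X[m]                               # neighbors centered at z_m
        C = Z @ Z.T                                     # local Gram matrix
        C += reg * np.trace(C) * np.eye(n_neighbors)    # regularize a singular C
        w = np.linalg.solve(C, np.ones(n_neighbors))
        W[m, idx] = w / w.sum()
    return W


def lle_embedding(W, n_components):
    """Coordinates minimizing G for fixed W, plus all eigenvalues in ascending order."""
    n = W.shape[0]
    A = np.eye(n) - W
    eigvals, eigvecs = np.linalg.eigh(A.T @ A)
    # The first eigenvector (eigenvalue close to zero) is constant and is discarded.
    return eigvecs[:, 1:n_components + 1], eigvals


# Hypothetical stand-in for the experimental vectors: points on a curved sheet in R^3.
rng = np.random.default_rng(0)
u = rng.uniform(0.0, 1.0, size=(200, 2))
X = np.column_stack([u[:, 0], u[:, 1], np.sin(3.0 * u[:, 0]) * u[:, 1]])

W = lle_weights(X, n_neighbors=10)
xi, eigvals = lle_embedding(W, n_components=2)
```

Inspecting the leading entries of `eigvals` is the numerical counterpart of the eigenvalue study used to choose the dimensionality of the embedding space.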

This procedure allows us to find the constitutive manifold, as defined in Ibañez et al. (2017). It is shown in **Figure 6**. The objective of this validation procedure will be to try to reproduce a control point in the manifold—a complete loading history, in fact—by obtaining its GENERIC model from the neighboring experimental points. This control point is shown in red in **Figure 6**.

In **Figure 7** the result of the interpolated model (continuous line) and the eight neighboring experimental results (dashed lines) that served to construct the final GENERIC model for the red point in **Figure 6** are shown. The mean squared error with respect to the control experimental history was 0.174%.

#### 5.6. Full Model Correction

In the preceding sections we assumed that the Hamiltonian part of the model (basically, a Mooney-Rivlin model) was known and that the model needed only some amendment in its dissipative part. In this section we study the performance of the proposed technique if every term in the assumed model is wrong.

To this end, we assume for the solid a neo-Hookean model with no viscous dissipation. The neo-Hookean model is basically equal to Mooney-Rivlin, see Equation (8), with C<sub>2</sub> = 0. To make things even more difficult, we assume a bad calibration of the instruments so that, for this "wrong" model, C<sub>1</sub> = 68.9 MPa (2.5 times the right value for the Mooney-Rivlin model, which is the actual one) and D<sub>1</sub> = 0.0016.
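To give a feeling for how far the "wrong" model is from the reference one, the sketch below evaluates the strain energy of Equation (8), as written in the text, from the principal stretches, for both parameter sets. The function name and the chosen stretch state (an isochoric uniaxial stretch of 1.2) are hypothetical. The viscous part is ignored here, so only the Hamiltonian discrepancy is illustrated.

```python
import numpy as np


def mooney_rivlin_energy(stretches, c1, c2, d1):
    """Strain energy W of Equation (8) from the three principal stretches."""
    l1, l2, l3 = stretches
    J = l1 * l2 * l3                                   # volume ratio
    I1 = l1**2 + l2**2 + l3**2
    I2 = (l1 * l2)**2 + (l2 * l3)**2 + (l3 * l1)**2
    I1_bar = J ** (-2.0 / 3.0) * I1                    # deviatoric invariants
    I2_bar = J ** (-4.0 / 3.0) * I2
    return c1 * (I1_bar - 3.0) + c2 * (I2_bar - 3.0) + d1 * (J - 1.0) ** 2


TRUE_MODEL = dict(c1=27.56, c2=6.89, d1=0.0029)    # Mooney-Rivlin ground truth
WRONG_MODEL = dict(c1=68.9, c2=0.0, d1=0.0016)     # miscalibrated neo-Hookean

lam = 1.2                                           # hypothetical uniaxial stretch
stretches = (lam, lam ** -0.5, lam ** -0.5)         # isochoric: J = 1

w_true = mooney_rivlin_energy(stretches, **TRUE_MODEL)
w_wrong = mooney_rivlin_energy(stretches, **WRONG_MODEL)
w_rest = mooney_rivlin_energy((1.0, 1.0, 1.0), **TRUE_MODEL)
```

Setting `c2=0.0` is all that is needed to recover the neo-Hookean form, which is why the same function serves both models.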

Proceeding as in the previous sections, we first computed corrections for each one of the 557 different experimental time series. For one of these tests, the prediction given by the "wrong" (neo-Hookean) model, the experimental results (coming from the Mooney-Rivlin model) and the corresponding corrected model predictions are shown in **Figure 8**.

FIGURE 6 | Obtained constitutive manifold by embedding the experimental results onto a three-dimensional space. Only a portion of the 557 experimental results are shown for clarity. In red, control point employed to validate the approach. Note that it is surrounded by a user-defined number of neighbors, whose GENERIC model is employed to obtain, by interpolation by means of the LLE weights, the sought model.

For this particular case (every experiment provided similar results), the initial error for the prediction given by the "wrong" neo-Hookean model was 13.05%. After correction, the relative mean square error in the time history was 0.092%.

Once all 557 experiments have been corrected, the constitutive manifold for this material can be constructed by LLE methods, as detailed in Section 5.5.

With this constitutive manifold thus constructed we can now evaluate the behavior of any new strain-stress state by simply locating it in the manifold, determining its surrounding neighbors, and employing the LLE weights to interpolate its GENERIC terms. This was done for one of the experimental results, that was removed from the manifold for control purposes, and interpolated from its neighbors. The result of this process is shown in **Figure 9**.
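The interpolation step can be sketched as follows. This is a generic illustration under stated assumptions, not the authors' implementation. Each experiment is assumed to be represented by its coordinates on the manifold and by a vector of identified model parameters. The names `coords`, `params` and `interpolate_on_manifold` are hypothetical, and the parameters are taken to be an affine function of the coordinates so that the exact answer is known.

```python
import numpy as np


def reconstruction_weights(x_new, X, n_neighbors, reg=1e-3):
    """Indices of the nearest neighbors of x_new and their LLE-type weights."""
    d2 = np.sum((X - x_new) ** 2, axis=1)
    idx = np.argsort(d2)[:n_neighbors]
    Z = X[idx] - x_new                                  # neighbors centered at x_new
    C = Z @ Z.T
    C += reg * np.trace(C) * np.eye(n_neighbors)        # regularize a singular C
    w = np.linalg.solve(C, np.ones(n_neighbors))
    return idx, w / w.sum()                             # weights sum to one


def interpolate_on_manifold(x_new, X, values, n_neighbors):
    """Weighted combination of the neighbors' values using the LLE weights."""
    idx, w = reconstruction_weights(x_new, X, n_neighbors)
    return w @ values[idx]


rng = np.random.default_rng(1)
coords = rng.uniform(0.0, 1.0, size=(500, 3))           # hypothetical manifold coordinates
A = np.array([[1.0, -2.0, 0.5], [0.3, 0.0, 1.0]])
b = np.array([0.1, -0.4])
params = coords @ A.T + b                               # hypothetical model parameters

x_new = np.array([0.5, 0.5, 0.5])                       # control point
estimate = interpolate_on_manifold(x_new, coords, params, n_neighbors=12)
exact = A @ x_new + b
```

Because the weights sum to one and reconstruct the control point from its neighbors, any quantity that varies smoothly over the manifold is interpolated with an error controlled by the neighborhood size.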

The mean squared error along the time history with respect to the control experiment was 1.057%.

# 6. DISCUSSION

From the results just presented, it is clear that the proposed technique is an appealing alternative for the machine learning of models from data. Instead of constructing data-driven models from scratch, constructing only corrections to existing, well-known models has been shown to provide very accurate results that greatly improve these models.

One key ingredient in these developments is the concept of constitutive manifold, which makes it possible to interpolate experimental results in the right manifold structure. Existing works simply choose the nearest experimental neighbor but, notably, this neighborhood is found in a Euclidean space (Kirchdoerfer and Ortiz, 2016) or in a Mahalanobis space (Ayensa-Jiménez et al., 2018).

The presented method is robust even if some parts of the model need no correction. The final method, as has been presented, has the important property of being sound from the thermodynamic point of view, guaranteeing, thanks to its GENERIC structure, the conservation of energy and positive production of entropy.

From the numerical point of view, the resulting GENERIC-based time integration schemes have already demonstrated their ability to conserve the right symmetries of the system (see, for instance, Romero, 2009 or González et al., 2018). In sum, we believe that the technique just presented, which should be extended to other types of systems, has a promising future.

### DATA AVAILABILITY STATEMENT

The datasets generated for this study will be available upon request.

# AUTHOR CONTRIBUTIONS

DG, FC, and EC contributed in the conception and design of the study. DG coded the Matlab program. EC wrote the first draft of the manuscript. All authors contributed to manuscript revision, read and approved the submitted version.

# FUNDING

This work has been supported by the Spanish Ministry of Economy and Competitiveness through Grants number DPI2017-85139-C2-1-R and DPI2015-72365-EXP and by the Regional Government of Aragon and the European Social Fund, research group T24 17R.

#### REFERENCES

Treloar, L. R. G. (1975). The Physics of Rubber Elasticity. Clarendon Press.

Vázquez-Quesada, A., Ellero, M., and Español, P. (2009). Consistent scaling of thermal fluctuations in smoothed dissipative particle dynamics. J. Chem. Phys. 130:034901. doi: 10.1063/1.3050100

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 González, Chinesta and Cueto. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# A Simple Approach to Atomic Structure Characterization for Machine Learning of Grain Boundary Structure-Property Models

#### Brandon D. Snow, Dustin D. Doty and Oliver K. Johnson\*

Department of Mechanical Engineering, Brigham Young University, Provo, UT, United States

Grain boundaries (GBs) have a significant influence on the properties of crystalline materials. Machine learning approaches present an attractive route to develop atomic structure-property models for GBs because of the complexity of their structure. However, the application of such techniques requires an appropriate descriptor of the atomic structure. Unfortunately, common crystal structure identification techniques cannot be applied to characterize the structure of the vast majority of GB atoms (50–98% are classified as "other"). This suggests a critical need for atomic structure descriptors capable of identifying arbitrary atomic environments. In this work we present a simple procedure that facilitates the identification of arbitrary atomic structures present in GBs. We apply this approach to characterize the atomic structure of the 388 GBs from the Olmsted data set (Olmsted et al., 2009). We show how this approach facilitates visualization of GB atomic structures in a way that reveals important structural information. We test the recently proposed hypothesis that Σ3 GBs contain facets of the GBs that form the corners of the corresponding GB plane fundamental zone. Finally, we briefly demonstrate how the structure descriptors resulting from our approach can be used as inputs to machine learning approaches for the development of atomic structure-property models for GBs.

Keywords: grain boundary, machine learning, atomic structure identification, common neighbor analysis, faceting

#### Edited by:

Norbert Huber, Helmholtz Centre for Materials and Coastal Research (HZG), Germany

#### Reviewed by:

Peter Larsen, Massachusetts Institute of Technology, United States

Liam Huber, Max-Planck-Institut für Eisenforschung GmbH, Germany

> \*Correspondence: Oliver K. Johnson ojohnson@byu.edu

#### Specialty section:

This article was submitted to Computational Materials Science, a section of the journal Frontiers in Materials

> Received: 03 March 2019 Accepted: 09 May 2019 Published: 31 May 2019

#### Citation:

Snow BD, Doty DD and Johnson OK (2019) A Simple Approach to Atomic Structure Characterization for Machine Learning of Grain Boundary Structure-Property Models. Front. Mater. 6:120. doi: 10.3389/fmats.2019.00120

# 1. INTRODUCTION

Grain boundaries (GBs) play an important role for many material properties, such as hydrogen embrittlement (Bechtle et al., 2009), creep (Gertsman and Tangri, 1997; Watanabe et al., 2009), corrosion resistance (Shimada et al., 2002; Tan et al., 2008), and conductivity (Zhang et al., 2006). While the structure of GBs is most often characterized experimentally by their five macroscopic crystallographic degrees of freedom (Ashby et al., 1978), it is the atomic structure that fundamentally governs their properties (Katritzky and Fara, 2005). Atomistic simulation has been used to investigate the atomic structure of GBs and how it correlates with their observed properties (Zhang et al., 2009). However, the atomic structure of GBs is much more complicated than their crystallographic structure and traditional crystal identification descriptors are not designed to classify the structure of the vast majority of atoms present at GBs. As an example, we analyzed the 388 GBs constructed by Olmsted et al. (2009) using common crystal structure identification methods: bond-angle analysis (BAA) (Ackland and Jones, 2006), common neighbor analysis (CNA) (Faken and Jónsson, 1994), and polyhedral template matching (PTM) (Larsen et al., 2016). **Table 1** provides the percentage of the GB atoms that were unclassified (i.e., classified as "other"/unknown structures) by each technique across all 388 GBs and across the subset of 41 Σ3 GBs. The fact that 50–98% of the GB atoms remain unclassified makes it difficult to identify atomic structure-property relationships for GBs, and suggests a critical need for new techniques that can describe the complex atomic structure of GBs.

Due to the complex and high-dimensional nature of GB atomic structures, machine learning and related statistical approaches provide an attractive route for the development of atomic structure-property models. However, the inability to resolve atomic structure within GBs complicates such an effort because the effect of distinct atomic environments cannot be extracted if these environments cannot be distinguished. If it were possible to fully characterize the atomic structure of GBs, dimensionality reduction techniques such as feature selection (e.g., decision trees) and feature transformation (e.g., principal component analysis) could be applied to identify the atomic environments that govern properties of interest. Labeled data from simulations could then be provided to train supervised machine learning algorithms, and predictive models could be developed that would significantly expand our understanding of atomic structure-property relations for GBs.

As demonstrated above, common crystal structure identification techniques are insufficient for this task. Consequently, several authors have developed methods to identify arbitrary non-crystalline atomic structures for applications such as developing interatomic potentials (Bartók et al., 2013), analyzing colloidal crystallization (Reinhart et al., 2017), and characterizing grain boundaries (Banadaki and Patala, 2017; Rosenbrock et al., 2017; Priedeman et al., 2018). A brief summary of their work is given in section 2. While these methods are effective, they are also significantly more complex than simple crystal structure identification techniques that are in common use. The major contribution of the present work is to bridge this gap.

By employing a simple version of common neighbor analysis (CNA) and leveraging information that is already available (but which is normally discarded), we develop an approach that (i) can characterize arbitrary atomic environments, while also being both (ii) simple to implement, and (iii) built upon a descriptor that is already familiar to the atomistic modeling community. We demonstrate that, in spite of its simplicity, it can be employed for predictive purposes as part of a machine learning strategy to develop GB structure-property models. We anticipate that the simplicity and effectiveness of this approach will facilitate the development of predictive structure-property models for GBs as well as other applications that involve lower symmetry atomic structures such as those present in metallic glasses.

TABLE 1 | Comparison of characterization methods applied to the Olmsted GB data set (Olmsted et al., 2009).

For all crystal structure identification methods, a large portion of the grain boundary atoms remain unclassified. Note that the FCC atoms are excluded from this calculation. Because the listed methods are restricted to finding a pre-defined set of reference structures, the GB atoms are only classified if they are either HCP, BCC, ICO, or SC. This is useful in the case of the coherent twin which is 100% HCP, but additional measures are needed to characterize any other grain boundary.

# 2. BACKGROUND

There has been great interest in characterizing atomic structures over the last decade, and several reviews are available in the literature (Stukowski, 2012; Priedeman, 2018), so only a brief description is given here.

# 2.1. Identification of Crystalline Atomic Environments

Common methods used to identify crystalline structures include the centrosymmetry parameter (Kelchner et al., 1998), common neighbor analysis (CNA) (Faken and Jónsson, 1994), polyhedral template matching (PTM) (Larsen et al., 2016), and Voronoi cell analysis methods (Bernal, 1959; Rahman, 1966; Bernal and Finney, 1967; Finney, 1970; Hsu and Rahman, 1979; Sheng et al., 2006; Lazar et al., 2015).

The centrosymmetry parameter is computed from the positions of an atom's n nearest neighbors and is used to determine whether or not the atom is within a bulk crystal or a defect. CNA, PTM, and Voronoi analysis methods all classify the atomic structure of an atom by comparison of its local environment to a library of known structures, usually face-centered cubic (FCC), hexagonal close-packed (HCP), body-centered cubic (BCC), icosahedral (ICO), and, for some of these methods, simple cubic (SC).
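As a concrete illustration of the first of these descriptors, the sketch below computes a centrosymmetry parameter from the vectors joining an atom to its n nearest neighbors. It assumes one common pairing strategy, in which the squared norms |**r**<sub>i</sub> + **r**<sub>j</sub>|<sup>2</sup> are evaluated for all neighbor pairs and the n/2 smallest are summed. The function names are hypothetical, and the neighbor search itself is taken as given.

```python
from itertools import combinations

import numpy as np


def centrosymmetry(neighbor_vectors):
    """Sum of the n/2 smallest |r_i + r_j|^2 over all pairs of neighbor vectors."""
    r = np.asarray(neighbor_vectors, dtype=float)
    n = len(r)
    pair_sums = sorted(float(np.sum((r[i] + r[j]) ** 2))
                       for i, j in combinations(range(n), 2))
    return sum(pair_sums[: n // 2])


def fcc_neighbors(a=1.0):
    """The 12 nearest-neighbor vectors of an atom in a perfect FCC lattice."""
    vectors = []
    for i, j in [(0, 1), (0, 2), (1, 2)]:
        for si in (-0.5, 0.5):
            for sj in (-0.5, 0.5):
                v = [0.0, 0.0, 0.0]
                v[i], v[j] = si * a, sj * a
                vectors.append(v)
    return np.array(vectors)


perfect = fcc_neighbors()
distorted = perfect.copy()
distorted[0] += np.array([0.1, 0.0, 0.0])   # displace a single neighbor

csp_perfect = centrosymmetry(perfect)       # every neighbor has an exact opposite
csp_distorted = centrosymmetry(distorted)   # inversion symmetry is broken
```

The value vanishes for a centrosymmetric environment and grows with distortion, which is why it can flag a defect but carries no information about which structure the defect has.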

These methods provide valuable tools for identifying the location, and in some cases the types, of defects present in an atomistic model. However, as with all tools (including those that we present in this paper), each method has certain drawbacks and limitations. The main disadvantages of the centrosymmetry parameter are that the number of neighbors, n, is a user-defined parameter, and the centrosymmetry parameter doesn't give any insight into what the local structure is if it is part of a defect. While some of the limitations of CNA have been reduced by the introduction of an adaptive cutoff radius (Stukowski, 2012), the method is typically just used to determine whether an atom belongs to one of a small set of predetermined environments. PTM uses a more robust Voronoi method to identify neighbors, but it too relies on comparison with a small library of known environments. Voronoi analysis generally characterizes local environments by the number of faces with a particular number of edges, but this approach fails to distinguish between some common environments (FCC and HCP) (Bernal, 1959; Rahman, 1966; Bernal and Finney, 1967; Finney, 1970; Hsu and Rahman, 1979; Sheng et al., 2006). The recently developed Voronoi Topology (VoroTop) technique (Lazar et al., 2015) uses planar graph representations to address this issue by including information about the arrangement of the faces, but requires a large database of nearly degenerate variants of the known Voronoi cells to compare against, since small atomic displacements can significantly affect the Voronoi cell topology. As with the other crystal structure identification methods, the VoroTop technique has primarily employed a small library of known structures. Additional environments can be added to these libraries, but this must be done manually.

# 2.2. Identification of Non-crystalline Atomic Environments

To adequately analyze the local atomic structure of defects, such as GBs, a method is needed that can classify atoms without a priori knowledge of the structures present (i.e., without reliance on a small precomputed list of known structures). Several recent publications have presented methods to identify arbitrary local environments (Bartók et al., 2013; Banadaki and Patala, 2017; Reinhart et al., 2017; Rosenbrock et al., 2017; Priedeman et al., 2018), and a brief description of each is given here.

Bartók et al. (2013) developed an atomic structure descriptor based on the superposition of Gaussian kernels centered at atomic positions, referred to as the SOAP kernel/descriptor. SOAP is unique in that it is a continuous descriptor (making it robust against small changes in atomic positions) unlike most other descriptors that are discrete in nature. SOAP has recently been applied to characterize GBs by Rosenbrock et al. (2017) and Priedeman et al. (2018).

Banadaki and Patala (2017) presented the polyhedral unit model, which compares the neighborhood around voids in atomic structures (at which vertices in the Voronoi tessellation are centered) against an exhaustive library of configurations of close-packed spheres for up to 12 spheres. A benefit of the polyhedral unit model is that an RMSD value can be calculated to quantify how close of a match particular structures are to their reference structures, but the resulting polyhedra are centered on a void as opposed to an atom which is the more common representation of an atomic environment.

Reinhart et al. (2017) developed an algorithm called Neighborhood Graph Analysis (NGA), which implemented CNA with an adaptive cutoff radius to produce CNA signatures for arbitrary environments present in colloidal crystallization simulations. The adaptive cutoff, however, produces an asymmetric neighborhood graph (i.e., atom B may be a neighbor to atom A, but that does not imply atom A will be in the neighborhood set of atom B), which can artificially increase the number of unique environments (i.e., there is an overpartitioning of the configuration space). This is compensated for by employing a machine learning algorithm to determine relationships between otherwise discrete signatures and consolidate similar environments that have different signatures. Reinhart et al. subsequently developed a modified version of their original algorithm, which they call the "fast NGA" (fNGA) algorithm (Reinhart and Panagiotopoulos, 2018), which defines neighbors using a Delaunay triangulation (similar to PTM), and which uses graphlets to dramatically reduce the computational cost of the consolidation step. The present work can be seen as a simplified version of Reinhart's original approach.

While all of these methods are effective at classifying noncrystalline atomic environments, they are complex and in some cases computationally expensive. In this paper we present a comparatively simple alternative based on CNA to identify arbitrary local environments without the use of a predetermined library of structures. Because of its simplicity and the fact that it only requires some minor post-processing (code provided in **Supplementary Material**) of traditional CNA data that is already ubiquitously available in existing software packages, our approach can be easily adopted. While our method, like others, suffers from over-partitioning of the space of unique atomic environments, we show that it is, nevertheless, possible to gain insight into important structure-property relationships. We demonstrate the usefulness of this technique by characterizing the unique atomic environments (UAEs) present in the 388 GBs of the Olmsted data set (Olmsted et al., 2009). We also test the recent hypothesis (Banadaki and Patala, 2016) that the structures of Σ3 GBs may decompose into facets of the GBs occupying the corners of the corresponding GB plane fundamental zone (FZ). Finally, we give a brief example of how the UAEs identified using our approach might serve as inputs to machine learning strategies for the development of atomic structure-property models for GBs.

# 3. METHODS

#### 3.1. Traditional Common Neighbor Analysis

In the traditional CNA method, a set of three indices j, k, l is defined, which describes the topology of the graph formed by the nearest neighbor atoms (see **Figure 1**). The three indices are computed for each neighboring atom to define their relationship to the central atom. The first index j enumerates the number of shared nearest neighbors (e.g., in **Figure 1** the four light purple atoms are nearest neighbors of both the central atom and the dark purple atom, so for the dark purple atom j = 4). The index k enumerates the number of bonds between shared nearest neighbors (e.g., in **Figure 1** there are two dashed purple lines indicating two distinct bonds between shared nearest neighbors, so for the dark purple atom k = 2). Finally, the index l enumerates the number of bonds in the longest bond-chain formed by shared neighbors (e.g., in **Figure 1** the dashed purple lines do not share an atom, so the longest bond-chain between shared nearest neighbors is 1, giving l = 1 for the dark purple atom). CNA indices are calculated for each atom pair. The local environment (i.e., "atomic structure") of a particular atom is then defined by the set of CNA indices of all of its nearest neighbors. As has been done in prior literature (Stukowski, 2012; Reinhart et al., 2017), we refer to this as an atom's CNA signature to distinguish it from the atom's CNA indices. For example, the CNA signature of an atom whose local structure corresponds to an FCC lattice would be denoted {12 × (4, 2, 1)}, indicating that it possesses 12 nearest neighbors, each with CNA indices of (4, 2, 1). An atom with a less symmetric local environment, such as one belonging to a GB, might have a CNA signature of {2 × (3, 1, 1), 3 × (4, 2, 1), 2 × (4, 2, 2), 2 × (4, 3, 3), 1 × (4, 4, 4), 2 × (5, 4, 4)}, indicating a total of twelve nearest neighbors, but which have different CNA indices.
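The index definitions above can be illustrated with a minimal Python sketch (this is not the authors' code, which relies on OVITO). It computes (j, k, l) for every neighbor of one interior atom in an ideal FCC block and tallies the resulting signature. The lattice constant of 4.05 Å, all function names, and the treatment of l as the number of bonds in the largest connected group of bonds among shared neighbors are assumptions of this illustration; the fixed 3.5 Å cutoff is the value adopted in this work.

```python
from collections import Counter
from itertools import combinations

import numpy as np


def fcc_block(a=4.05, reps=5):
    """Ideal FCC block of reps^3 unit cells; a is an assumed lattice constant (angstroms)."""
    basis = np.array([[0, 0, 0], [0.5, 0.5, 0], [0.5, 0, 0.5], [0, 0.5, 0.5]])
    cells = np.array([[x, y, z] for x in range(reps)
                      for y in range(reps) for z in range(reps)])
    return a * (cells[:, None, :] + basis[None, :, :]).reshape(-1, 3)


def neighbor_sets(positions, cutoff):
    """Fixed-cutoff neighbor list: one set of neighbor indices per atom."""
    dist = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    return [set(np.flatnonzero((row > 1e-9) & (row < cutoff)).tolist()) for row in dist]


def cna_indices(i, n, nbrs):
    """CNA triplet (j, k, l) describing neighbor n of central atom i."""
    common = nbrs[i] & nbrs[n]                      # shared nearest neighbors
    bonds = {(p, q) for p, q in combinations(sorted(common), 2) if q in nbrs[p]}
    j, k, l = len(common), len(bonds), 0
    unvisited = set(bonds)
    while unvisited:                                # grow each connected group of bonds
        stack, size = [unvisited.pop()], 0
        while stack:
            p, q = stack.pop()
            size += 1
            linked = [b for b in unvisited if p in b or q in b]
            unvisited.difference_update(linked)
            stack.extend(linked)
        l = max(l, size)                            # assumption: l = largest bond group
    return (j, k, l)


def cna_signature(i, nbrs):
    """Multiset of CNA triplets over all neighbors of atom i."""
    return Counter(cna_indices(i, n, nbrs) for n in nbrs[i])


positions = fcc_block()
nbrs = neighbor_sets(positions, cutoff=3.5)
# Pick an atom deep inside the block so that its environment is complete.
center = int(np.argmin(np.linalg.norm(positions - 4.05 * np.array([2, 2, 2]), axis=1)))
signature = cna_signature(center, nbrs)
print(signature)
```

For the interior atom of this ideal block, the tally should reduce to the FCC signature {12 × (4, 2, 1)} quoted above.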

We note that neighbors can be identified using various methods, the primary ones being a fixed cutoff radius or an adaptive cutoff (Stukowski, 2012; Reinhart et al., 2017). In this work we chose to use a fixed cutoff of 3.5 Å (which falls between the first and second nearest neighbors for the FCC lattice, see **Figure 3A**). The fixed cutoff was chosen both because of its simplicity and because it resulted in fewer unique signatures than the adaptive methods (2205 vs. 3716) for the structures that we analyzed.

Once the CNA signature of every atom has been computed, atomic structures are identified by comparison with the CNA signatures of a predefined library of known structures, typically limited to FCC, HCP, BCC, and ICO. In standard usage, any atom whose CNA signature does not match that of one of the predefined structural templates remains unclassified and is labeled as "other." This is sufficient to identify the location of defects because "other" atoms typically are found at defects. However, it is generally insufficient to resolve the structure of those defects. Because GBs consist of mostly "other" atoms, their internal atomic structure cannot typically be resolved. Furthermore, if two GBs both contain all "other" atoms, it is difficult to distinguish between them.

# 3.2. Fully-Leveraged CNA

To address this issue, we note that the information necessary to distinguish "other" atoms from one another is already available and encoded in their respective CNA signatures; it is just typically ignored in standard practice. To exploit this information, one must simply identify all of the unique CNA signatures. These define distinct atomic structure classes, and in some sense the resulting list constitutes an extended structure library. Atoms are then classified using this extended structure library. However, unlike a predefined library, it is constructed at the time of analysis and is compatible with arbitrary atomic structures (one does not need to know a priori what structures one is looking for). Furthermore, the "other" category is entirely eliminated, as all atoms are classified and belong to one of the UAEs that were identified.

To extract the complete CNA signature of each atom in the structures that we analyzed, we used built-in functions that can be run as part of a pipeline in the Open Visualization Tool (OVITO) (Stukowski, 2010); an example Python script is available in the online OVITO documentation. We modified this script for our particular application, and we provide our modified version in the accompanying **Supplementary Material**. Once extracted, the unique CNA signatures were identified in MATLAB and each was assigned a unique numerical class ID (we also provide this code in the **Supplementary Material**), which was subsequently imported into OVITO as a custom particle property, allowing for color-coding and visualization.
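The post-processing step described above amounts to building the structure library on the fly. The following Python sketch illustrates the idea; it is not the MATLAB code from the Supplementary Material, and the function names and toy input are hypothetical. Each atom is assumed to arrive as a list of (j, k, l) triplets, one per neighbor.

```python
from collections import Counter


def canonical_signature(triplets):
    """Order-independent, hashable form of a CNA signature."""
    return tuple(sorted(Counter(triplets).items()))


def assign_uae_ids(per_atom_triplets):
    """Give every distinct signature its own integer class ID (no "other" class)."""
    library = {}   # canonical signature -> class ID, extended as new signatures appear
    labels = []
    for triplets in per_atom_triplets:
        key = canonical_signature(triplets)
        labels.append(library.setdefault(key, len(library)))
    return labels, library


# Toy input: two FCC atoms, two HCP atoms (listed in different neighbor order),
# and one GB-like atom with the example signature from section 3.1.
fcc = [(4, 2, 1)] * 12
hcp = [(4, 2, 1)] * 6 + [(4, 2, 2)] * 6
gb_like = ([(3, 1, 1)] * 2 + [(4, 2, 1)] * 3 + [(4, 2, 2)] * 2
           + [(4, 3, 3)] * 2 + [(4, 4, 4)] + [(5, 4, 4)] * 2)
atoms = [fcc, hcp, gb_like, list(reversed(hcp)), fcc]

labels, library = assign_uae_ids(atoms)
print(labels)
```

Because the signature is stored as a sorted tally, the order in which neighbors are listed does not affect the class ID, so the two HCP atoms receive the same label.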

# 4. RESULTS AND DISCUSSION

# 4.1. Classifying "Other" Atoms in GBs

We applied the fully-leveraged CNA approach to characterize all of the atoms in the Olmsted data set (Olmsted et al., 2009), which contains atomic structures for a total of 388 GBs in Al with variation across all five crystallographic degrees of freedom, including 41 Σ3 GBs. Here we present the results of that analysis. The vast majority of the atoms belong to the grain interiors and are FCC, and could be easily characterized by existing methods. We, therefore, focus on the GB atoms, which are generally classified as "other"/unidentified structures by reference-structure-based techniques. We define an atom as belonging to the GB if at least one of its nearest neighbors is not FCC. This results in all of the non-FCC atoms, as well as many FCC atoms inside or adjacent to the GB, being identified with it (for some tilt GBs, if the dislocation spacing is sufficiently large, there will be FCC atoms in the GB plane that are entirely surrounded by other FCC atoms, and these would not be counted as GB atoms by this definition). Using this definition, there are a total of 462,955 GB atoms, out of a total of 11,922,451 atoms contained in the Olmsted data set (the non-GB atoms belong to the bulk crystal and are all FCC). While some GBs properly contain FCC atoms in their interior (e.g., low-angle GBs have FCC atoms between dislocations), the focus of this work is on characterizing non-FCC atoms. Consequently, we will present our results in two ways: (i) relative to all 462,955 GB atoms (FCC and non-FCC), and (ii) relative to only the non-FCC GB atoms (of which there are 227,401).
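The GB-atom definition used here can be stated in a few lines of Python. This is a sketch under assumptions: the function name, the per-atom FCC flags, and the toy chain of atoms are hypothetical, and the rule is applied literally (an atom is a GB atom when at least one of its neighbors is not FCC).

```python
def gb_atom_mask(is_fcc, neighbors):
    """True for atoms that have at least one non-FCC nearest neighbor."""
    return [any(not is_fcc[n] for n in nbrs) for nbrs in neighbors]


# Toy one-dimensional chain of six atoms: two non-FCC atoms in the middle.
is_fcc = [True, True, False, False, True, True]
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]

mask = gb_atom_mask(is_fcc, neighbors)
n_gb_atoms = sum(mask)
n_non_fcc_gb_atoms = sum(m and not f for m, f in zip(mask, is_fcc))
print(mask, n_gb_atoms, n_non_fcc_gb_atoms)
```

In this toy chain the two FCC atoms adjacent to the defect are counted as GB atoms, while the outermost FCC atoms are not, which mirrors the two tallies (all GB atoms versus non-FCC GB atoms only) used in the results.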

**Figure 2A** shows the distribution of GB atomic environments across all 388 GBs for the fully-leveraged CNA approach. Out of the nearly 500,000 GB atoms (across all 388 GBs), there are 2205 unique CNA signatures. However, noting the log scale of the y-axis, only 448 signatures are needed to account for approximately 90% of the non-FCC GB atoms (see **Figure 2B**), and only 167 are needed if the GB atoms with FCC structure are included<sup>1</sup>. While this still represents a non-negligible number of unique environments, it is a considerable reduction in dimensionality for a general set of grain boundaries, which would otherwise require a total of at least 682,203 parameters to describe the atomic configurations (3 parameters for each of the 227,401 non-FCC GB atoms; Rosenbrock et al., 2017).
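The coverage statistic quoted above (how many of the most prevalent UAEs are needed to account for 90% of the atoms) can be computed from the per-UAE atom counts. The sketch below uses hypothetical toy counts rather than the Olmsted data, and the function name is an assumption; the last line simply reproduces the parameter count from the non-FCC GB atom total reported in the text.

```python
import numpy as np


def n_uaes_for_coverage(counts, target=0.90):
    """Smallest number of most-prevalent UAEs whose atoms reach the target fraction."""
    ordered = np.sort(np.asarray(counts))[::-1]          # most prevalent first
    cumulative = np.cumsum(ordered) / ordered.sum()      # curve analogous to Figure 2B
    return int(np.searchsorted(cumulative, target)) + 1


toy_counts = [500, 300, 120, 40, 20, 10, 5, 3, 1, 1]     # hypothetical atoms per UAE
needed = n_uaes_for_coverage(toy_counts)

# Three Cartesian coordinates for each of the 227,401 non-FCC GB atoms.
cartesian_parameters = 3 * 227_401
print(needed, cartesian_parameters)
```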

We note that, using an alternative spatially continuous descriptor, smooth overlap of atomic positions (SOAP) (Bartók et al., 2013), Rosenbrock et al. initially found 800,000 UAEs for the same 388 GBs in Ni, using a neighborhood cutoff distance of 5 Å (Rosenbrock et al., 2017). In the SOAP method, as well as other methods such as PTM, a similarity measure is employed, enabling two structures that differ by only a small perturbation to still be classified as the same environment, which is one way to correct for the overpartitioning phenomenon. After using a similarity metric within a machine learning framework, the original 800,000 UAEs were consolidated to only 145 distinct UAEs. We note that, as with any similarity-based consolidation approach, the resulting number of unique environments depends on the user-specified similarity threshold.

The simple approach to UAE identification embodied in the fully-leveraged CNA does not employ a similarity threshold, so it is expected that the UAE space will be over-partitioned. This manifests itself in the relatively long-tailed distribution of UAEs in **Figure 2**, which are produced by small deviations in atomic position that cause a single environment to produce multiple CNA signatures (i.e., UAEs that are not frequently observed are most likely slightly distorted versions of other UAEs). The underlying cause of this phenomenon is the difficulty in unambiguously defining atomic neighbors in non-crystalline regions. To illustrate this, compare the radial distribution function (RDF) for bulk FCC with that of a grain boundary, as shown in **Figure 3**. The clear separation of the first and second peaks—corresponding to the first and second nearest neighbors, respectively—in the RDF of the FCC lattice (**Figure 3A**) facilitates the selection of an appropriate neighbor cutoff radius. However, as expected, the RDF for the grain boundary atoms (**Figure 3B**) does not show a clear separation between first and second neighbors, making CNA sensitive to small perturbations of atomic position and changes in the cutoff radius. This also means that the number of UAEs identified by the fully-leveraged CNA approach of the present work depends on the user chosen cutoff radius. This challenge exists for any method that attempts to characterize GB atoms, because there is no clear choice as to which atoms should be included in the neighborhood, and the resulting structures are likely to over-partition the UAE space.

As mentioned earlier, work has been done by Reinhart et al. (2017) to establish a machine learning approach to identify environments that have similar structure but different CNA signatures and combine them into a single environment (i.e., clustering in the UAE space). This effectively implements a similarity metric for CNA, and was successful in its application to surfaces of colloidal crystals. However, this process is computationally expensive and does not result in a single universal partitioning of the UAE space, so the repartitioning would need to be recalculated (or at least updated) for every new data set to be characterized. In spite of the overpartitioning that results from the simple fully-leveraged CNA approach, and in the absence of environment consolidation, we find that useful analysis can still be performed to evaluate GB structure-property models as will be described in section 4.3.

For the subset of Σ3 GBs, the number of UAEs is reduced considerably. **Figure 4** shows the distribution of atomic environments found in the subset of 41 Σ3 GBs, for which there were only 117 unique CNA signatures. Moreover, the vast majority of the GB atoms (roughly 90%) correspond to one of just 44 UAEs (or only 29 UAEs if GB atoms with FCC structure are included). This kind of dimensionality reduction for descriptions of GB atomic structure may make inference of GB atomic structure-property models significantly more tractable. Furthermore, this information can be used to compare the structural similarity of different GBs as will be discussed in section 4.3.

# 4.2. Visualization

Without resorting to the more advanced machinery of SOAP or Reinhart's machine learning approach, most analysis of atomic structures relies on the simpler reference-structure-based crystal structure identification techniques. Because they were designed to identify crystalline regions, and not GBs, 50% to 98% of the GB atoms in the Olmsted data set are, unsurprisingly, classified as "other" by the reference-structure-based techniques, making the atomic structure of these GBs largely opaque to classical analysis. As revealed by our fully-leveraged CNA technique, the fact that only 44 UAEs dominate the Σ3 GBs studied here suggests the possibility of discovering new GB structural information for very little computational effort, and within the familiar CNA framework. We illustrate this through visualization, by coloring GB atoms according to their UAE identifier. As an example, **Figures 5A,B** provide a rendering of a Σ3 $[5\bar{1}\bar{2}]$ GB with atoms colored according to standard practice (using the traditional CNA approach). The FCC atoms (in green) are identified, but all of the atoms at the GB are classified as "other"/unidentified environments. In contrast, **Figure 5C** shows the same GB atoms colored using the atomic environment classes identified by our fully-leveraged CNA technique. It is evident that this GB contains a structured arrangement of atomic environments and is quasi-two-dimensional. This new approach reveals structure that was previously unresolvable using the common crystal structure identification techniques, and for far less computational effort than the more advanced techniques.

<sup>1</sup> Including the GB atoms that have FCC structure only adds one UAE, but because GB atoms that possess FCC structure make up a significant percentage of the total GB atoms, fewer UAEs are required to represent 90% of the total GB atoms.

FIGURE 2 | (A) Histogram of UAEs found in the 388 Olmsted GBs. Note that this is on a log scale and there are approximately 5 × 10<sup>5</sup> GB atoms. (B) Cumulative sum of the proportion of atoms that can be described using a given number of UAEs. Approximately 90% of the non-FCC GB atoms can be described by one of the 448 most prevalent UAEs (only 167 UAEs are required if the GB atoms with FCC structure are included).

In addition to the ability to easily obtain important structural information for a single GB, coloring each atom according to its local environment facilitates identification of structural similarity among different GBs. In the case of Σ3 GBs, it has been hypothesized that GBs may form facets whose structure corresponds to that of the GBs that occupy the corners of the relevant boundary plane fundamental zone (FZ) (Banadaki and Patala, 2016). However, a test of this hypothesis would require comparison of the atomic structures of various GBs, which would be difficult using reference-structure-based descriptors that leave nearly all of those atoms unclassified. For example, the top row of **Figure 6** shows three different Σ3 GBs that are near each other in the FZ. While terrace-like features are apparent, it is unclear whether these represent facets of the same structure. Using the fully-leveraged CNA procedure, the bottom row of **Figure 6** makes it clear that each of these GBs does in fact contain very similar environments, giving some evidence in support of the faceting hypothesis. A more complete analysis of faceting in Σ3 GBs, enabled by the fully-leveraged CNA technique, is provided in section 4.3.

Visualizing a grain boundary in this manner also highlights higher-order defects, or defects inside of other defects (note the dark purple environments that decorate the ledges in **Figure 6**).

FIGURE 5 | (A) OVITO rendering of a Σ3 $[52\bar{1}]$ GB, with atoms colored by traditional CNA. Green atoms are FCC, white are "other". Notice that all of the GB atoms remain unidentified. (B) All FCC atoms removed. (C) Atoms colored by the UAE IDs obtained from our fully-leveraged CNA procedure.

# 4.3. Application

Here we apply the fully-leveraged CNA technique to investigate the relationship between atomic structure and GB properties. As mentioned previously, it was recently hypothesized by Banadaki and Patala (2016) that Σ3 GBs may be composed of facets whose structure corresponds to that of the three GBs that define the corners of the Σ3 GB plane FZ. Based on this hypothesis, Banadaki and Patala developed a structure-property model to predict the GB energy of an arbitrary Σ3 GB as a weighted average of the GB energies of the FZ corners. This model showed good agreement with GB energies calculated by MD for many cases. However, the GB structures were never analyzed to test whether the hypothesized structural faceting actually occurred. The fully-leveraged CNA approach presented here provides an opportunity to test this hypothesis.

The total numbers of UAEs found in each of the GBs that define the corners of the Σ3 GB plane FZ are provided in **Table 2**. It is notable that the UAEs appearing in each of the corner GBs form disjoint sets. This implies that they are in some sense orthogonal structures, which might at first appear to support the possibility of faceting. However, the total number of environments (117) found across all of the 41 Σ3 GBs is greater than the total number of environments found in the FZ corners (12), and, as shown in **Figure 7**, these additional environments are not concentrated at ledges between facets, but constitute significant portions of the non-corner GBs.

These environments are present in many of the non-corner GBs, but other environments are also present, increasing the overall GB energy.

Several key observations can be derived from **Figure 7**. First, there are in fact some regions of the FZ where the GBs are made of facets of the corner GBs. In particular, GBs near the [111] coherent twin (θ = φ = 0) show obvious facets whose structure is that of the coherent twin. Also, GBs along the right boundary of the FZ (θ = 90°) show some evidence of faceting (this behavior near the $[2\bar{1}\bar{1}]$ corner was also noted by Banadaki and Patala, 2017), though for many of these GBs the structure of these facets does not correspond to any of the FZ corners. As for the rest of the FZ, there is no clear evidence of faceting for the Olmsted Al GBs. It is important to note, however, that the ability of a GB to facet in an atomistic model may depend on the size of the simulation cell that was employed to construct it (see Race et al., 2014; Humberson and Holm, 2017, for a discussion of the impact of simulation cell size), so that it is possible that if larger simulation cells were used, faceting might be observed more generally. Moreover, it has been shown that there can be many metastable atomic structures for the same GB (Han et al., 2017), some of which have nearly degenerate energies. Thus, it is also possible that there are distinct isoenergetic configurations, or that the atomic structures in this data set may not be the lowest energy configurations, which might otherwise exhibit the hypothesized faceting structure. Indeed, Banadaki and Patala found atomic structures for the Σ3 GBs with considerably lower energies in many cases (Banadaki and Patala, 2016), which may have exhibited faceting more generally, and this may be one explanation for the better fit of the faceting model's energy predictions to their data than to the Olmsted data (see **Figure 9**).
Regardless of whether or not the atomic structures in the Olmsted data set are ground state structures or (at least in some cases) metastable structures, the fully-leveraged CNA approach can be applied to characterize the atomic structure that is present, whatever it happens to be. Furthermore, if ground state structures were available, our fully-leveraged CNA approach would easily identify more general faceting if it were to occur in those structures.

Although structural faceting does not occur generally for the Olmsted atomic structures, relatively smooth trends in the composition of UAEs are observed across the FZ. **Figure 8** shows the fraction of atoms in each GB whose atomic environments match those of each of the FZ corners. For all three corners, smooth trends in atomic environment composition are observed along θ = 90° from $[2\bar{1}\bar{1}]$ to $[10\bar{1}]$ (for the $[2\bar{1}\bar{1}]$ corner it is smooth, but not monotonic, see **Figure 8B**). Smooth trends also occur along φ = 0 from [111] to $[2\bar{1}\bar{1}]$ and near the coherent twin. Furthermore, as the crystallographic distance to one of the corner GBs increases, the proportion of atomic environments belonging to that corner decreases. This suggests that in the absence of faceting (which represents a sort of structural segregation behavior) there may be a sort of mixing behavior of atomic environments from each of the FZ corners for these GB structures.
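The corner-environment fractions discussed above can be sketched as a simple set-membership tally. In the Python below, the UAE IDs assigned to each FZ corner, the corner labels used as dictionary keys, and the example GB are all hypothetical placeholders; only the bookkeeping is meant to be illustrative.

```python
from itertools import combinations


def corner_fractions(gb_labels, corner_uaes):
    """Fraction of a GB's atoms whose UAE belongs to each FZ corner's set of UAEs."""
    total = len(gb_labels)
    return {corner: sum(label in uaes for label in gb_labels) / total
            for corner, uaes in corner_uaes.items()}


# Hypothetical UAE IDs observed in each of the three corner GBs.
corner_uaes = {"corner_A": {0}, "corner_B": {1, 2, 3}, "corner_C": {4, 5}}

# Corner UAE sets that share no IDs, as reported for the corner GBs in the text.
corners_disjoint = all(a.isdisjoint(b)
                       for a, b in combinations(corner_uaes.values(), 2))

gb_labels = [0, 0, 0, 1, 2, 4, 7, 7, 8, 9]        # UAE ID of each atom in one toy GB
fractions = corner_fractions(gb_labels, corner_uaes)
non_corner_fraction = 1.0 - sum(fractions.values())
print(corners_disjoint, fractions, round(non_corner_fraction, 3))
```

Evaluating these fractions for every GB and plotting them over the FZ would give maps analogous to **Figure 8**; the remainder is the share of non-corner environments.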

Because we do not observe structural faceting generally, it is not surprising that the faceting model does not predict the energies of the Olmsted data set well. However, for some regions of the FZ there are also deviations between the faceting model's predictions and the calculated GB energies for the lower energy atomic structures obtained by Banadaki and Patala. It is notable that where these deviations do occur, they are almost exclusively underpredictions. Our observations here may partially explain this behavior. The faceting model predicts GB energy as a weighted average of the energies of the GBs at the FZ corners, which ignores the energetic contribution of the line defects that will likely exist at the junction of distinct facets, and underpredictions are therefore consistent with this omission. These line defects are likely to be composed of atomic environments that are not present in the FZ corners, and which may have higher cohesive energies. In fact, we find that the non-corner atomic environments have an average cohesive energy<sup>2</sup> that is 3.5 × 10<sup>−21</sup> J (0.022 eV) higher than the average for the atomic environments that belong to the FZ corners. This may seem like a small difference, but because many GBs contain a large portion of non-corner environments (a median of 49% of the GB atoms) the cumulative effect can be significant. A rough estimate is illustrative: if 50% of a GB's atoms (e.g., 500 of 1000) are non-corner environments and possess the average non-corner environment cohesive energy (−5.30 × 10<sup>−19</sup> J or −3.31 eV) then with a GB area of 1800 Å<sup>2</sup> (the average cross-section for an Olmsted simulation cell) the non-corner environments would contribute approximately 0.097 J/m<sup>2</sup> to the GB energy, which is similar to the magnitude of the underpredictions shown in **Figure 9**.
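The rough estimate above can be reproduced with a few lines of arithmetic. The sketch below assumes that the quoted contribution is the excess cohesive energy per non-corner atom (3.5 × 10<sup>−21</sup> J) multiplied by the number of non-corner atoms and divided by the GB area; this reading is an interpretation, adopted because it recovers the quoted value. The variable names are illustrative.

```python
EV_TO_J = 1.602176634e-19          # one electronvolt in joules (exact SI definition)
ANGSTROM2_TO_M2 = 1e-20            # one square angstrom in square meters

excess_per_atom_J = 3.5e-21        # non-corner minus corner average cohesive energy (from the text)
n_noncorner_atoms = 500            # 50% of a 1000-atom GB (from the text)
gb_area_A2 = 1800.0                # average Olmsted cell cross-section (from the text)

excess_total_J = n_noncorner_atoms * excess_per_atom_J
contribution_J_per_m2 = excess_total_J / (gb_area_A2 * ANGSTROM2_TO_M2)
excess_per_atom_eV = excess_per_atom_J / EV_TO_J   # consistency check against 0.022 eV

print(f"{contribution_J_per_m2:.3f} J/m^2, {excess_per_atom_eV:.3f} eV per atom")
```

Under these assumptions the result is about 0.097 J/m<sup>2</sup>, and the per-atom excess converts to about 0.022 eV, consistent with the values quoted in the text.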

#### 4.4. Simple UAE Model

This suggests that a model based on atomic environments might provide improved predictions for GB energy. We note that important work in this area has already been performed by Rosenbrock et al. within the SOAP framework (Rosenbrock et al., 2017). The rigorous development of such a model is beyond the scope of the present work, whose primary objective has been to present a simple atomic structure characterization technique (the fully-leveraged CNA approach) that enables characterization of GB atomic structure that was unresolvable using crystal structure identification approaches. Nevertheless, we provide a simple and brief example of how the resulting UAEs might be incorporated into machine learning or other model development approaches.

<sup>2</sup> We computed the cohesive energies in LAMMPS (Plimpton, 1995) using the same potential that Olmsted et al. used to produce these Al structures (the Ercolessi & Adams potential for aluminum; Liu et al., 2004).

We treat the fraction of each UAE as a predictor (independent) variable and the energy of the GB as the response (dependent) variable. This implies a 2205-dimensional space (corresponding to the 2205 UAEs observed across all 388 GBs). We employ PCA to perform feature transformation and selection and find that only 84 principal components (linear combinations of the original variables) are required to explain 95% of the variance in the data. Thus, we have reduced the dimensionality of the problem from 2205 to 84 dimensions. Using these 84 transformed variables, we employ 5-fold cross-validation to train a simple linear regression model. Comparison of the resulting model to the calculated GB energies for all 388 GBs is provided in **Figure 10**, with the subset of Σ3 GBs highlighted. Comparison of the model predictions to the Olmsted simulations for the subset of Σ3 GBs as a function of boundary plane orientation is also provided in **Figure 9** (compare filled vs. open squares). The resulting model predictions agree well with the calculated values, and the model predicts the correct GB energy with less than 10% error for 89.69% of the 388 GBs (and 92.68% of the Σ3 GBs). We note, in particular, the improved predictions of the UAE model across the θ = 90° arc of the FZ from $[2\bar{1}\bar{1}]$ to $[10\bar{1}]$ (the green filled squares agree well with the green open squares in the right panel of **Figure 9**) as compared to the faceting model (solid green line).
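The workflow just described (UAE fractions as features, PCA retaining 95% of the variance, then a linear model evaluated with 5-fold cross-validation) can be sketched with NumPy alone. Everything in the example is an assumption made for illustration: the data are synthetic (random compositions over 50 hypothetical UAEs with made-up energy weights), the function names are invented, and PCA is fitted once on all samples before the folds are formed. It is not the authors' model and will not reproduce their numbers.

```python
import numpy as np


def pca_reduce(X, variance=0.95):
    """Project centered data onto the fewest principal components reaching the variance target."""
    Xc = X - X.mean(axis=0)
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = np.cumsum(s**2) / np.sum(s**2)
    n_comp = min(int(np.searchsorted(explained, variance)) + 1, len(s))
    return Xc @ Vt[:n_comp].T, n_comp


def cv_linear_predictions(Z, y, n_folds=5, seed=0):
    """Out-of-fold predictions of an ordinary least-squares model with intercept."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    pred = np.empty(len(y))
    for test_idx in np.array_split(order, n_folds):
        train_idx = np.setdiff1d(order, test_idx)
        A_train = np.column_stack([np.ones(len(train_idx)), Z[train_idx]])
        coef, *_ = np.linalg.lstsq(A_train, y[train_idx], rcond=None)
        A_test = np.column_stack([np.ones(len(test_idx)), Z[test_idx]])
        pred[test_idx] = A_test @ coef
    return pred


# Synthetic stand-in data: 388 "GBs", each described by fractions of 50 hypothetical UAEs.
rng = np.random.default_rng(1)
n_gbs, n_uaes = 388, 50
X = rng.dirichlet(1.0 / (1.0 + np.arange(n_uaes)), size=n_gbs)   # rows sum to 1
weights = rng.uniform(0.2, 1.0, size=n_uaes)                      # made-up energy per UAE
y = X @ weights + rng.normal(0.0, 0.01, size=n_gbs)               # made-up GB energies

Z, n_comp = pca_reduce(X)
pred = cv_linear_predictions(Z, y)
within_10 = float(np.mean(np.abs(pred - y) / np.abs(y) < 0.10))
print(n_comp, round(within_10, 3))
```

The final quantity mirrors the accuracy measure reported above (the share of GBs predicted with less than 10% error), here evaluated on out-of-fold predictions for the synthetic data.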

# 5. CONCLUSION

In this work, we have presented an atomic structure characterization technique (the fully-leveraged CNA approach) that (i) can characterize arbitrary atomic environments, while also being both (ii) simple to implement, and (iii) built upon a descriptor that is already familiar to the atomistic modeling community. This enables characterization of GB atomic structure that was previously unresolvable using crystal structure identification techniques, and for lower computational effort than more advanced techniques. We show that it is possible to describe GB atomic structure in terms of the proportion of the unique atomic environments (UAEs) resulting from the use of our method.

We find that a relatively small number of UAEs account for a large proportion of the GB atoms, suggesting the possibility of a significant dimensionality reduction in the description of GB atomic structure. Specifically, we found that to describe 90% of the non-FCC GB atoms present in the 388 GBs of the Olmsted data set, only 448 UAEs (CNA signatures) are required, and for the subset of 41 Σ3 GBs only 44 UAEs are necessary. This dimensionality reduction suggests that these UAEs can act as atomic structure descriptors that might be incorporated into machine learning approaches to develop improved GB structure-property models.

We demonstrated how visualization of the UAEs reveals important GB structural information. As an example, we investigated the possible description of Σ3 GBs as being composed of facets of the GBs occupying the corners of the corresponding boundary plane fundamental zone (FZ). We found that for the Olmsted data set such faceting does occur in certain regions of the FZ, but not generally. Instead, an apparent mixing of atomic environments from the GBs defining the FZ corners was observed, together with the appearance of numerous environments not present in the FZ corners. These observations are consistent with the good agreement of the faceting model with calculated GB energies for some regions of the FZ, as well as the observed underprediction in other regions.

Finally, we provided a brief example to illustrate how the UAE fractions can be used as GB atomic structure descriptors that can serve as input to machine learning approaches for the development of GB atomic structure-property models.

# DATA AVAILABILITY

The datasets for this study will not be made publicly available because some data has been used with permission. All other data is available upon request to the corresponding author.

# AUTHOR CONTRIBUTIONS

OJ designed the project and trained the final model. BS and DD developed all of the analysis codes and performed the calculations and analysis. All authors contributed to preparation of the manuscript.

# FUNDING

This work has been supported by the Department of Mechanical Engineering at Brigham Young University.

# ACKNOWLEDGMENTS

We are grateful to Dr. Eric R. Homer and Dr. Srikanth Patala for fruitful discussions, and to Dr. David Olmsted for permission to use the Al GB structures.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmats.2019.00120/full#supplementary-material


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Snow, Doty and Johnson. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Machine-Learning Informed Representations for Grain Boundary Structures

Eric R. Homer<sup>1</sup>\*, Derek M. Hensley<sup>2</sup>, Conrad W. Rosenbrock<sup>2</sup>\*, Andrew H. Nguyen<sup>2</sup> and Gus L. W. Hart<sup>2</sup>

<sup>1</sup> Department of Mechanical Engineering, Brigham Young University, Provo, UT, United States, <sup>2</sup> Department of Physics and Astronomy, Brigham Young University, Provo, UT, United States

The atomic structure of grain boundaries plays a defining but poorly understood role in the properties they exhibit. Due to the complex nature of these structures, machine learning is a natural tool for extracting meaningful relationships and new physical insight. We apply a new structural representation, called the scattering transform, that uses wavelet-based convolutional neural networks to characterize the complete three-dimensional atomic structure of a grain boundary. Machine learning of GB energy, mobility, and shear coupling using the scattering transform representation is compared and contrasted with learning using a representation based on the smooth overlap of atomic positions (SOAP). While predictions using the scattering transform are not as good as those of SOAP, other factors suggest that the scattering transform may yet play an important role in GB structure learning. These factors include the ability of the scattering transform to learn well on larger datasets, in a process similar to deep learning, as well as its ability to provide physically interpretable information about what aspects of the GB structure contribute to the learning through an inverse scattering transform.

#### Edited by:

Benjamin Klusemann, Leuphana University, Germany

#### Reviewed by:

Michele Ceriotti, École Polytechnique Fédérale de Lausanne, Switzerland Robert Horst Meißner, Hamburg University of Technology, Germany

#### \*Correspondence:

Eric R. Homer eric.homer@byu.edu Conrad W. Rosenbrock rosenbrockc@gmail.com

#### Specialty section:

This article was submitted to Computational Materials Science, a section of the journal Frontiers in Materials

> Received: 11 February 2019 Accepted: 26 June 2019 Published: 16 July 2019

#### Citation:

Homer ER, Hensley DM, Rosenbrock CW, Nguyen AH and Hart GLW (2019) Machine-Learning Informed Representations for Grain Boundary Structures. Front. Mater. 6:168. doi: 10.3389/fmats.2019.00168

Keywords: machine learning, grain boundaries, atomic structure, characterization, SOAP, scattering transform

# 1. INTRODUCTION

Grain boundaries (GBs) in crystalline materials are complex structures that can have a significant influence on material properties. The structural complexity derives from the fact that when any two crystals are joined, there are macroscopic and microscopic degrees of freedom that influence their behavior. With a proper understanding of how material properties are influenced by these degrees of freedom, materials engineers could develop materials with enhanced properties. This has been accomplished in a handful of cases using GB engineering (Watanabe et al., 2009; Randle, 2010). Unfortunately, the majority of materials used in society have not benefited from these efforts as GB engineering primarily focuses on one special type of GB, the twin boundary. Continued efforts in tailoring material properties as a result of GB engineering will require a more complete understanding of GB structure-property relationships.

At the macroscopic level, the structural degrees of freedom are well known and defined by the crystallography of the joined crystals (Frank, 1988; Patala et al., 2012; Patala and Schuh, 2013). At the microscopic level, the structural degrees of freedom are defined by the configuration of the atoms and the macroscopic degrees of freedom can be viewed as constraints (Tadmor and Miller, 2011; Han et al., 2016).

Since material properties are derived from the atom configurations, or microscopic degrees of freedom, more attention must be given to characterization of atom configurations at GBs. A full description of the microscopic structure is given by the position of all the atoms, leading to 3N positional degrees of freedom for N atoms. Due to the challenge of fully defining GB structures through their 3N degrees of freedom, a variety of other structural metrics have been defined.

Among the commonly used structural descriptors of GBs are the structural unit model (Frost et al., 1982; Sutton and Vitek, 1983; Balluffi and Bristowe, 1984; Rittner and Seidman, 1996; Tschopp and McDowell, 2007; Spearot, 2008; Han et al., 2017), dislocation arrays (Read and Shockley, 1950; Bishop and Chalmers, 1968; Wolf, 1989; Medlin et al., 2001), and common neighbor analysis (Honeycutt and Andersen, 1987). These have unique capabilities and provide intuition primarily in characterizing quasi-2-dimensional GB structures but have limitations in characterizing fully 3-dimensional GB structures. More recently a number of other models have emerged to overcome limitations in the common techniques; these include polyhedral template matching (Larsen et al., 2016), Voronoi cell topology (Lazar, 2018), and polyhedral unit model (Banadaki and Patala, 2017).

As modern machine learning techniques push the limits of scientific discovery, there are several important lessons to learn from the deep learning community. The first is the remarkable discovery that the accuracy of a model can continue increasing, instead of asymptoting, as more data is added. That discovery required a universally applicable, generalized approach to extracting descriptors (i.e., features) from data using convolutional networks. These lessons should inform our approach to machine learning in materials. Specifically, given the availability of algorithms and limited data in GB science, the important gap to fill is in the creation of universal descriptors that fully characterize the 3-dimensional GB structure.

Rosenbrock et al. (2017) recently introduced the use of two new descriptors that help address this gap. The first is the application of the Smooth Overlap of Atomic Positions (SOAP) formalism to GBs. Typical applications of SOAP include accurately modeling potential energy surfaces (Szlachta et al., 2014; John and Csányi, 2017; Mocanu et al., 2018) and reactivity (Caro et al., 2018) of molecules (Cisneros et al., 2016) and solids (De et al., 2016; Sosso et al., 2018), pressure, temperature, and composition phase diagrams of materials (Baldock et al., 2016), defects (Dragoni et al., 2018), and dislocations (Maresca et al., 2018). SOAP is also convenient for characterization of GBs because it possesses the following desirable properties: (i) enables comparison between GBs, (ii) is invariant with respect to structural symmetries, rotations, and permutations, (iii) is smoothly varying while accommodating structural perturbations, (iv) is applicable to general, three-dimensional GB structures, and (v) is amenable to automated characterization and discovery of structures. Rosenbrock et al. (2017) also introduced a new descriptor called the local environment representation. This representation finds unique sets of local environments that are repeated throughout a set of GBs. In recent work, Priedeman et al. (2018) used the local environment representation and found that among 494,495 GB atoms, there were only 55 unique local atomic environments that were repeated in different combinations and arrangements to construct all the GBs.

Using these descriptors and their ability to compare environments, Rosenbrock et al. (2017) applied machine learning to predict both static and dynamic GB properties based on the static GB structure. The predictions for the static property of GB energy were the most accurate, which is reasonable considering that it is a property that is influenced by each atom's contribution to the whole energy. For the dynamic properties of mobility trend and shear coupling, however, the predictions were not as good, and it was reasoned that longer range information about atomic structures was likely required to make better predictions. Since SOAP is a local-environment descriptor, we propose that an alternative descriptor is necessary to characterize the structure at multiple scales. Importantly, the characterization metric must still be automated and satisfy invariance requirements.

We present the scattering transform (ST, Bownik, 1997; Benedetto and Pfander, 1998; Pfander and Benedetto, 2002; Benítez et al., 2010; Goh and Lee, 2010; Goh et al., 2011; Lanusse et al., 2012; Mallat, 2012) as a second, universal descriptor for GB systems that includes multi-scale features. We present its ability as a representation to learn energy, mobility, and shear coupling from GB structures, and compare the results with the published SOAP methodology. We also compare the results with a combined representation by SOAP and ST. While the results indicate that there is room for improvement, we demonstrate how additional data can improve learning by ST. Finally, we demonstrate how an inverse ST, using relevance propagation, can identify key features of the GB structure that are useful for the machine learned predictions.

# 2. MATERIALS AND METHODS

# 2.1. SOAP

To generate the first representation, the averaged SOAP representation, we create a SOAP descriptor (Bartók et al., 2010; Bartók et al., 2013) for each atom in the GB. Briefly, the process of calculating the SOAP descriptor starts by placing a Gaussian on each local neighbor of a specified atom i.

$$\rho\_i(\vec{r}) = \sum\_j e^{-(\vec{r}\_{ij} - \vec{r})^2 / 2\sigma\_{\text{atom}}^2} f\_{\text{cut}}(|\vec{r}\_{ij}|) \tag{1}$$

where f<sub>cut</sub> is a smooth cutoff function that ensures compact support at radius r<sub>cut</sub>, and r<sub>ij</sub> is the vector from the position of atom i to the position of atom j. We define these Gaussians as the species independent neighbor density of i. To simplify the representation of this neighbor density it is expanded in an orthonormal basis,
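As a minimal sketch of Equation (1), the neighbor density can be evaluated directly at a point. The cosine form of the cutoff, the parameter values, and the function names `f_cut` and `neighbor_density` are assumptions made for illustration; they are not taken from the implementation used in this work.

```python
import math

def f_cut(r, r_cut):
    """Assumed smooth cutoff: equals 1 at r = 0 and decays to 0 at r = r_cut."""
    if r >= r_cut:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_cut) + 1.0)

def neighbor_density(r, center, neighbors, sigma_atom=0.5, r_cut=5.0):
    """Evaluate rho_i(r) of Equation (1); r is given relative to the central atom i."""
    total = 0.0
    for pos in neighbors:
        r_ij = [p - c for p, c in zip(pos, center)]          # vector from atom i to atom j
        dist_ij = math.sqrt(sum(x * x for x in r_ij))
        d2 = sum((a - b) ** 2 for a, b in zip(r_ij, r))       # |r_ij - r|^2
        total += math.exp(-d2 / (2.0 * sigma_atom ** 2)) * f_cut(dist_ij, r_cut)
    return total

center = (0.0, 0.0, 0.0)
# Hypothetical neighbor positions; the last one lies outside the cutoff radius.
neighbors = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (6.0, 0.0, 0.0)]

rho_on_atom = neighbor_density((1.0, 0.0, 0.0), center, neighbors)
rho_far = neighbor_density((0.0, 0.0, 4.0), center, neighbors)
rho_outside = neighbor_density((6.0, 0.0, 0.0), center, neighbors)
```

The density is largest on top of a neighbor inside the cutoff and essentially vanishes at the position of the neighbor beyond r<sub>cut</sub>, which is the compact support the cutoff function is meant to enforce.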

$$\rho\_i(\vec{r}) = \sum\_{nlm} c\_{i,nlm} g\_n(r) Y\_{lm}(\hat{r}),\tag{2}$$

where g<sub>n</sub> are an orthonormal radial basis, Y<sub>lm</sub> are spherical harmonics, and c<sub>i,nlm</sub> are the expansion coefficients.

The overlap of two different site environments is defined to be:

$$S(\rho\_i, \rho\_k) = \int \rho\_i(\vec{r}) \rho\_k(\vec{r}) d^3r,\tag{3}$$

and is permutationally invariant (because of the sum over the j neighbors in ρ<sub>i</sub> of Equation 1). Rotational invariance is achieved by integrating over all rotations of one of its arguments,

$$
\tilde{K}(\rho\_i, \rho\_k) = \int d\hat{R} \, |S(\rho\_i, \hat{R}\rho\_k)|^p,\tag{4}
$$

where R̂ is a 3D rotation operator (element of SO(3)), and p is a small integer, e.g., 2. The value for p loosely defines the "multi-bodyness" of the expansion, similar to how the power of a binomial relates to the number of cross-terms in its expansion. For example, (a + b)<sup>2</sup> = a<sup>2</sup> + 2ab + b<sup>2</sup>, where the ab cross-term shows interaction between a and b. Thus, p = 2 roughly corresponds to 3-body interactions and a value of p = 4 roughly corresponds to 5-body interactions. A more complete description for creating SOAP descriptors from local environments is documented in detail elsewhere (Bartók et al., 2013; Rosenbrock et al., 2017).

This process has already been efficiently implemented and can be found in the Python-based pycsoap code<sup>1</sup> (Nguyen and Rosenbrock, to be submitted). Rosenbrock et al. (2018) discuss selecting atoms to include in the GB and considerations for tuning parameters.

The difficulty with applying local-environment descriptors directly is that the method produces an M × N matrix for each GB, where M is the number of atoms in the GB, and N is the length of each SOAP vector. Machine learning requires a single vector describing each data point in the dataset, which motivates an averaging of this SOAP matrix over the M atoms to produce the averaged SOAP representation, as defined by Rosenbrock et al. (2017) and De et al. (2016). While this representation was referred to as the ASR (for Averaged SOAP Representation) in previous works (Rosenbrock et al., 2017), we simply refer to it here as SOAP. In other words, this SOAP vector represents the average local atomic environment of all the atoms in the GB. Collecting all these averaged SOAP vectors for a collection of GBs produces the feature matrix for machine learning.
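The averaging step can be sketched in a few lines. The example below assumes the per-atom descriptors are already available as plain lists; the function name `averaged_representation` and the toy numbers are hypothetical and only illustrate how GBs with different numbers of atoms M map to vectors of the same length N.

```python
def averaged_representation(per_atom_vectors):
    """Average an M x N list of per-atom descriptor vectors into one length-N vector."""
    m = len(per_atom_vectors)
    n = len(per_atom_vectors[0])
    return [sum(vec[k] for vec in per_atom_vectors) / m for k in range(n)]

# Two toy GBs with different numbers of atoms (M = 2 and M = 3) but the same N = 3.
gb_a = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]]
gb_b = [[0.0, 1.0, 1.0], [0.0, 2.0, 1.0], [0.0, 3.0, 1.0]]

# One row per GB: this is the feature matrix passed to the learning algorithm.
feature_matrix = [averaged_representation(gb) for gb in (gb_a, gb_b)]
```

Because the average is taken over atoms, the row length depends only on N, so GBs of any size can be stacked into a single feature matrix.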

# 2.2. Scattering Transform

The ST is similar to a multi-layer, convolutional neural network. However, instead of using the discrete convolutions typical in deep learning approaches, based on integer kernel matrices, the ST uses continuous convolution with wavelet functions. For a time series signal, the Fourier transform gives information about the frequency content of the signal. Wavelets, by analogy, are localized in both time and frequency by defining a scaling parameter for the wavelet function that limits its extent in time. The wavelet transform is then executed as a convolution between the scaled, time-frequency wavelet function and the signal.

The analysis functions for this wavelet transform are defined as:

$$
\psi\_{a,b}(t) = \frac{1}{\sqrt{a}} \psi\left(\frac{t-b}{a}\right) \tag{5}
$$

where a represents the scale (i.e., large values of a correspond to "long" basis functions that will identify long-term trends in the signal to be analyzed) and b represents a shift. The unscaled wavelet function ψ(t) is usually a bandpass filter. High-frequency basis functions are obtained by going to small scales; therefore, scale is loosely related to the inverse frequency. One can choose shifts and scales to obtain a constant relative bandwidth analysis known as the wavelet transform. To accomplish this, we use a real bandpass filter with zero mean.

Then we can define a continuous wavelet transform for an arbitrary function f(t) as:

$$f \ast \psi\_{a,b} = \int\_{R} \psi\_{a,b}^{\ast}(t) f(t) dt,\tag{6}$$

where ψ\*<sub>a,b</sub>(t) represents the complex conjugate of ψ<sub>a,b</sub>(t) and R is the domain of the signal. This is similar to the Short Time Fourier Transform but with a variable window. Once again, we are measuring the similarity between a function, f(t), and an elementary function (which is shifted and scaled).
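Equations (5) and (6) can be approximated numerically with a Riemann sum. The sketch below assumes a Mexican-hat wavelet as the real, zero-mean band-pass filter; this choice, the sampling step, and the function names are illustrative assumptions and not the wavelets used in this work.

```python
import math

def mexican_hat(t):
    """Real, zero-mean band-pass wavelet (assumed choice for this illustration)."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)

def psi_ab(t, a, b):
    """Scaled and shifted analysis function of Equation (5)."""
    return mexican_hat((t - b) / a) / math.sqrt(a)

def cwt_coefficient(signal, times, a, b):
    """Riemann-sum approximation of Equation (6); the wavelet is real, so psi* = psi."""
    dt = times[1] - times[0]
    return sum(psi_ab(t, a, b) * f for t, f in zip(times, signal)) * dt

times = [0.01 * k for k in range(4001)]                  # 0 <= t <= 40
signal = [math.sin(2.0 * math.pi * t) for t in times]    # oscillation with period 1

shifts = (20.0, 20.25)
# A small scale resolves the fast oscillation; a large scale averages it out.
response_small_scale = max(abs(cwt_coefficient(signal, times, 0.25, b)) for b in shifts)
response_large_scale = max(abs(cwt_coefficient(signal, times, 4.0, b)) for b in shifts)
```

The small-scale wavelet responds strongly to the oscillation while the large-scale wavelet gives a response close to zero, which illustrates the loose correspondence between scale and inverse frequency.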

For a multi-dimensional signal, a multi-dimensional wavelet can be constructed as the Cartesian product between wavelets defined in each dimension. In other words, the domain for the function of interest f(t) changes to f(x, y, z), and the convolution integral is still defined over the domain of f .

Applied to GBs, the 3D ST is computed as a sequence of multidimensional, multi-scale wavelet transforms, interleaved with non-linear transforms that take the absolute value of their input signal (i.e., modulus nonlinearities). The process of introducing these nonlinearities is described below.

The general formulation of the ST used here is depicted in **Figure 1** where a series of layered convolutions are used to obtain the feature representation. In the first step, and similar to the SOAP formalism, a Gaussian density is applied to the atom positions to obtain the density f. When implemented numerically, some discretization of f is inevitable; the continuous signals are sampled at a specified resolution (a tunable parameter).

In the first layer (0), a Gaussian filter φ<sub>J<sub>0</sub></sub>(f) at scale J<sub>0</sub> blurs the density f. The coefficients of the blurred density are subsampled, averaged, and stored as part of the ST representation. During subsampling, a discretized vector is sampled at a coarser resolution to form a smaller vector for the final representation.

To obtain the second layer (1), various wavelet transforms are applied to f; the convolutions f ∗ ψ<sub>j<sub>1</sub>,0</sub> are computed at various length scales j<sub>1</sub> before calculating the modulus (absolute value) of each of these averaged coefficients as another part of the ST representation. This modulus operation introduces the nonlinearities mentioned earlier. After computing the modulus, we again blur using a Gaussian filter φ<sub>J<sub>1</sub></sub>(f) and subsample, this time at scale J<sub>1</sub>, and store the resulting coefficients as part of the scattering representation.

To obtain the third layer (2) another wavelet transform is applied, yielding f ∗ ψ<sub>j<sub>1</sub>,0</sub> ∗ ψ<sub>j<sub>2</sub>,0</sub> for each length scale j<sub>2</sub>. Each of these again has the modulus operator applied, is blurred, and is subsampled to produce coefficients as done in previous layers. Similar to other convolutional neural networks, this process could continue for many more layers. Of course, the ability to capture the relevant features will depend upon the relative scales of the atomic structures and the wavelets employed. Once the scales of the wavelets have been set, these features will not be affected by including more copies of a periodic structure, like those often present in GBs. In this respect, the scattering features are not dependent on increased system size.

<sup>1</sup>This is available from the Python Package Index using pip install pycsoap.

The ST produces a 1 × N vector for each GB, where N is determined by the ST parameters (i.e., chiefly the number of convolutional layers, the number and scale of the wavelet functions, and the severity of the subsampling). In contrast to SOAP, the ST produces a single vector per GB and thus requires no additional statistical post-processing to produce the feature vector for the GB.
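The layered procedure can be illustrated with a one-dimensional toy version. This is only a sketch under several assumptions that are not taken from the implementation used in this work: a Mexican-hat wavelet, periodic convolution on a grid, a single blur scale, averaging in place of subsampling, and hypothetical function names.

```python
import math

def circular_convolve(signal, kernel, half_width):
    """Periodic convolution of a sampled signal with a kernel given on integer offsets."""
    n = len(signal)
    offsets = range(-half_width, half_width + 1)
    weights = [kernel(k) for k in offsets]
    return [sum(w * signal[(i - k) % n] for w, k in zip(weights, offsets))
            for i in range(n)]

def gaussian(k, scale):
    return math.exp(-0.5 * (k / scale) ** 2) / (scale * math.sqrt(2.0 * math.pi))

def mexican_hat(k, scale):
    u = k / scale
    return (1.0 - u * u) * math.exp(-0.5 * u * u) / math.sqrt(scale)

def scattering_features(f, scales=(1.0, 2.0, 4.0), blur_scale=4.0):
    half_width = int(5 * max(max(scales), blur_scale))

    def blur_and_average(signal):
        blurred = circular_convolve(signal, lambda k: gaussian(k, blur_scale), half_width)
        return sum(blurred) / len(blurred)

    def wavelet_modulus(signal, scale):
        out = circular_convolve(signal, lambda k: mexican_hat(k, scale), half_width)
        return [abs(v) for v in out]                 # modulus nonlinearity

    features = [blur_and_average(f)]                 # layer 0: blurred density
    for j1 in scales:
        u1 = wavelet_modulus(f, j1)
        features.append(blur_and_average(u1))        # layer 1: one wavelet transform
        for j2 in scales:
            u2 = wavelet_modulus(u1, j2)
            features.append(blur_and_average(u2))    # layer 2: two wavelet transforms
    return features

# Toy periodic "density": the same unit cell repeated 4 and 8 times.
unit_cell = [0.0, 1.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 1.0, 0.0]
features_4_cells = scattering_features(unit_cell * 4)
features_8_cells = scattering_features(unit_cell * 8)
```

In this toy setting the feature vector has a fixed length set by the number of layers and scales, and the two vectors agree to within rounding, consistent with the statement that additional copies of a periodic structure do not change the scattering features.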

Given the availability of discrete convolutional neural network software that is optimized for both CPU and GPU architectures, it is worth noting why continuous convolutions are worth the extra implementation effort compared to using discrete convolutions. Convolutional neural networks in deep learning were developed to handle image learning tasks, which are inherently discrete due to pixels in images. Physical systems, like the atomistic view of GBs, have smooth transitions that are represented more naturally by spherical harmonics and continuous wavelet functions. While it is true that neural network architectures can approximate curved decision boundaries<sup>2</sup> , continuous wavelets are a more natural choice because they lead to a sparser representation (Hirn et al., 2015, 2017; Eickenberg et al., 2017).

# 2.3. Grain Boundary Structures and Properties

The SOAP and the ST are both representations that provide a feature matrix that is convenient for machine learning of GB structures. In the present work, we learn on the Olmsted GB database, which is a collection of 388 computed Ni GBs created by Olmsted et al. (2009a) using the Foiles-Hoyt embedded atom method (EAM) potential (Foiles and Hoyt, 2006).

The GB structures were created following standard methods where a fairly comprehensive list of initial atomic configurations are each minimized to determine which of all the configurations represents the minimum energy structure of the GB (Olmsted et al., 2009a). Using these GB structures, a variety of properties can be measured or calculated from simulations; for this work, our interest is in energy, temperature-dependent mobility, and shear coupling of the 388 GBs.

The GB energy is defined as the excess energy relative to the bulk as a result of the irregular structure of the atoms in the GB (Tadmor and Miller, 2011). It is important to note that GB energy is normally defined as a static property of the system measured at T = 0 K, and all atomistic structures examined in the machine learning are the T = 0 K structures associated with this calculation. The GB energies for the Olmsted GB database are available in the supplemental materials of Olmsted et al. (2009a). Since the energies for this dataset were calculated using an EAM potential, learning energies serves merely as a benchmark to demonstrate whether a given descriptor captures any physically relevant information useful for machine learning.

Temperature-dependent mobility and shear coupled GB migration are two dynamic properties related to the behavior of a migrating GB. The mobility of a GB is defined as the proportionality factor relating how fast a GB will migrate when subjected to a given driving force (Gottstein and Shvindlerman, 2010). The temperature-dependent mobility has to do with how the mobility changes with temperature. In most cases, mobility is a thermally activated process, where the mobility increases with increasing temperature. However, in analyzing the temperature-dependent mobility of the GBs in the Olmsted database (Olmsted et al., 2009b), Homer et al. (2014) noticed four broad categories of temperature-dependent mobility: (i) thermally activated, (ii) non-thermally activated, (iii) mixed modes, and (iv) immobile/unclassifiable. These categories correspond with whether the mobility follows an Arrhenius relationship with temperature (thermally activated), does not follow an Arrhenius relationship with temperature (non-thermally activated), shows some mixed mode combination of thermally activated and non-thermally activated, or is immobile or simply unclassifiable.

In addition, when GBs migrate, they can also exhibit a coupled shear motion, in which the motion of a GB normal to its surface couples with lateral motion of one of the two crystals (Cahn et al., 2006; Homer et al., 2013). GBs are then classified as either exhibiting shear coupling or not.

# 2.4. Machine Learning

The SOAP and ST structure characterizations of the 388 GBs in the Olmsted database are calculated using the methods described above. Parameters for these calculations are defined for the SOAP as the radial basis cutoff (nmax), angular basis (spherical harmonic) cutoff (lmax), and the radial cutoff (rcut) which are set to 18, 18 and 5.0 respectively in the present work. For the ST the parameters are defined as the size of the density discretization grid (density=0.25), the number of convolutional layers as seen in **Figure 1** (Layers=2, which also includes Layer 0), a parameter that defines a singular spherical harmonic angular function (SPH\_L=4), the number of wavelets at different scales used at each layer (n\_trans=16), and the number of angular augmentations in the azimuthal and polar angles (n\_angle1=16, n\_angle2=16). An angular augmentation is when the density function is duplicated and rotated to form a new density function, which is also fed through the scattering network. The vectors produced from the rotated density function are then concatenated to form the final ST vector. For example, with n\_angle1 = 16 and n\_angle2 = 16, we end up with 256 copies of the density function, each of which produces a scattering vector. These are then concatenated together to produce the final ST vector. This provides a level of rotational invariance since it is not explicit in the ST.
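The angular augmentation can be sketched as follows. The uniform angle spacing, the rotation convention, and the three-component `toy_descriptor` (a stand-in for the scattering network) are assumptions for illustration only.

```python
import math

def rotate(point, azimuth, polar):
    """Rotate a point about the z-axis by `azimuth`, then about the y-axis by `polar`."""
    x, y, z = point
    x, y = (x * math.cos(azimuth) - y * math.sin(azimuth),
            x * math.sin(azimuth) + y * math.cos(azimuth))
    x, z = (x * math.cos(polar) + z * math.sin(polar),
            -x * math.sin(polar) + z * math.cos(polar))
    return (x, y, z)

def toy_descriptor(points):
    """Hypothetical stand-in for the scattering network: a rotation-dependent 3-vector."""
    return [sum(abs(p[axis]) for p in points) for axis in range(3)]

def augmented_vector(points, n_angle1=16, n_angle2=16):
    """Concatenate the descriptors of all n_angle1 x n_angle2 rotated copies."""
    final = []
    for a in range(n_angle1):
        for b in range(n_angle2):
            azimuth = 2.0 * math.pi * a / n_angle1   # assumed uniform spacing
            polar = math.pi * b / n_angle2           # assumed uniform spacing
            rotated = [rotate(p, azimuth, polar) for p in points]
            final.extend(toy_descriptor(rotated))
    return final

atoms = [(1.0, 0.0, 0.0), (0.0, 1.5, 0.0), (0.0, 0.0, 2.0)]   # hypothetical positions
st_vector = augmented_vector(atoms)
n_copies = len(st_vector) // 3
```

With 16 angles in each direction the final vector contains 256 concatenated copies of the per-orientation descriptor, so its length grows with the product of the two angle counts.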

With both the SOAP and ST providing feature matrices, we are now able to apply a machine learning approach on the SOAP, ST, and combined SOAP+ST characterizations of the GBs. The combined SOAP+ST characterization feature vector is created by simply concatenating the SOAP and ST vectors together. Gradient boosted decision trees [as implemented in xgboost (Chen and Guestrin, 2016)] are used to analyze and predict the GB energy, temperature-dependent mobility, and shear coupling.

<sup>2</sup>The interactive 2D playground at https://playground.tensorflow.org demonstrates this nicely.

For the machine learning of the properties, it is important to note that GB energy is a continuous quantity, while temperature dependent mobility trend and shear coupling are classification properties. The mobility and shear coupling properties present an imbalanced class problem, where one class contains many more samples than the other classes. Consequently, the machine learning models favor this larger class to minimize error, but this degrades the ability of the model to generalize to new data. For example, imagine a binary classification problem where the training data has 99% in one class and only 1% of the other. The machine learning model will perform best by just predicting 100% of the first class. Thus to address this issue, we used the Synthetic Minority Over-sampling Technique (SMOTE), which is a standard approach used in imbalanced class machine learning problems (Han et al., 2005), as implemented in the imblearn package to oversample the minority classes. We can conceptualize SMOTE by imagining a line segment connecting each instance of the minority class to every other instance of that minority class. The algorithm then synthetically creates instances of the minority class randomly along these line segments and adds them to the data set, thus oversampling and balancing the number of samples in each class. This approach could present issues if any classes are not separable (e.g., the classes overlap), but even in these cases SMOTE is expected to improve learning over simply using the imbalanced classes.
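The line-segment picture of SMOTE can be written down directly. The sketch below follows that simplified conceptualization (interpolating between randomly chosen pairs of minority instances) rather than the imblearn implementation, which to our understanding restricts the pairs to nearest neighbors; the function name and toy data are hypothetical.

```python
import random

def oversample_minority(minority, n_new, rng):
    """Create n_new synthetic points, each on a segment between two minority instances."""
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(minority, 2)           # two distinct minority instances
        t = rng.random()                          # position along the segment from a to b
        synthetic.append([x + t * (y - x) for x, y in zip(a, b)])
    return synthetic

rng = random.Random(0)
# Hypothetical two-feature data: 20 majority samples and 4 minority samples.
majority = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(20)]
minority = [[5.0, 5.0], [6.0, 5.0], [5.0, 6.0], [6.0, 6.0]]

new_points = oversample_minority(minority, len(majority) - len(minority), rng)
balanced_minority = minority + new_points
```

Every synthetic point lies between two existing minority instances, so the minority class is enlarged without leaving the region it already occupies in feature space.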

In addition to using SMOTE to address the class imbalance, we also consider two different splits of the temperature-dependent mobility. In a 4 class split, we use the four categories as defined above (Homer et al., 2014). In a 3 class split, we essentially combined the non-thermally activated and mixed modes into a single class, such that the three classes are essentially, (i) thermally activated, (ii) mobile but not thermally activated, and (iii) immobile/unclassifiable. The original machine learning on this data by Rosenbrock et al. (2017) used this same 3 class split.
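The relationship between the two splits amounts to a relabeling, sketched below. The label strings and the example list are hypothetical; only the merging of the non-thermally activated and mixed-mode categories comes from the text.

```python
# Map each of the four categories onto the three-class split.
FOUR_TO_THREE = {
    "thermally activated": "thermally activated",
    "non-thermally activated": "mobile but not thermally activated",
    "mixed modes": "mobile but not thermally activated",
    "immobile/unclassifiable": "immobile/unclassifiable",
}

# Hypothetical four-class labels for a handful of GBs.
four_class_labels = [
    "thermally activated",
    "mixed modes",
    "non-thermally activated",
    "immobile/unclassifiable",
    "mixed modes",
]
three_class_labels = [FOUR_TO_THREE[label] for label in four_class_labels]
```

Merging the two categories increases the number of samples in the combined class, which is relevant to the class-imbalance problem discussed above.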

We trained each model with a 50–50 train-test split. While decision trees have many different tunable hyperparameters, only the number of estimators (the number of trees) was tuned, using a process called Early Stopping (Zhang et al., 2005) with 5 fold cross validation. An ensemble of decision trees is trained by adding trees in multiple fitting rounds, with each new tree's parameters optimized using a loss function. With early stopping, the model grows only until the accuracy has failed to improve for a specified number of rounds. Thus, the optimal number of estimators can be found to minimize the chance of over-fitting.
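The early-stopping rule can be sketched independently of any particular boosting library. The function below assumes that a validation error is available after each fitting round (in a cross-validated setting this would be the error averaged over the folds); the function name, the patience value, and the error sequence are hypothetical.

```python
def early_stopping_rounds(validation_errors, patience):
    """Return the number of trees to keep, given the validation error after each round."""
    best_error = float("inf")
    best_round = 0
    for round_index, error in enumerate(validation_errors, start=1):
        if error < best_error:
            best_error, best_round = error, round_index
        elif round_index - best_round >= patience:
            break                                   # no improvement for `patience` rounds
    return best_round

# Hypothetical validation errors after each boosting round (not data from this study).
errors = [0.50, 0.41, 0.36, 0.33, 0.34, 0.35, 0.33, 0.36, 0.37]
n_estimators = early_stopping_rounds(errors, patience=3)
```

In this example the error stops improving after the fourth round, so four trees are retained and the later rounds, which would only fit the training data more closely, are discarded.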

# 3. RESULTS AND DISCUSSION

A summary of the machine learning results of GB energy, temperature-dependent mobility, and shear coupling by the SOAP, ST, and Combined SOAP+ST methods is found in **Table 1**. To provide a reference against which to judge the machine learning results, we define a baseline "Random" quantity, as implemented in the original SOAP formulation (Rosenbrock et al., 2017). For this "Random" column, energies are drawn from a normal distribution with the same mean and standard deviation as the training data and then compared to the actual values in the validation data. For the mobility and shear coupling classification, classes are selected at random from the training data and compared against the validation data.
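The "Random" baseline can be sketched as follows. The choice of mean absolute error and accuracy as the comparison metrics, the use of the sample standard deviation, and all names and numbers are assumptions for illustration; the text does not specify these details.

```python
import random
import statistics

def random_energy_baseline(train_energies, test_energies, rng):
    """Mean absolute error of energies drawn from a normal fit to the training data."""
    mu = statistics.mean(train_energies)
    sigma = statistics.stdev(train_energies)
    guesses = [rng.gauss(mu, sigma) for _ in test_energies]
    return statistics.mean([abs(g - e) for g, e in zip(guesses, test_energies)])

def random_class_baseline(train_labels, test_labels, rng):
    """Accuracy of labels picked at random from the training labels."""
    guesses = [rng.choice(train_labels) for _ in test_labels]
    return sum(g == t for g, t in zip(guesses, test_labels)) / len(test_labels)

rng = random.Random(0)
# Hypothetical energies (J/m^2) and class labels, not values from the Olmsted database.
train_energies = [0.9, 1.1, 1.3, 1.0, 1.2]
test_energies = [1.0, 1.2, 0.8]
baseline_mae = random_energy_baseline(train_energies, test_energies, rng)
baseline_accuracy = random_class_baseline(["a", "a", "b"], ["a", "b", "a", "a"], rng)
```

Because the class guesses are drawn from the training labels, the baseline reproduces the class frequencies of the training data, so a model must exceed this accuracy to show that it has learned something beyond the class imbalance.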

The ST results for energy and temperature-dependent mobility are statistically better than random and demonstrate that this new, universal representation is capable of learning certain GB structure-property relationships. However, it does not perform as well as the SOAP, and does not improve predictions even when it is combined with SOAP (SOAP+ST). Valid predictions are being made, but on different features of the GB atomic structure.

It is worth noting that the predictions of temperature-dependent mobility are worse for the 4 class split than the 3 class split. We attribute this to the reduced number of GBs in each class on which to learn and then make predictions, which aggravates the imbalanced class problem. If our attribution is correct, this suggests that even a minor increase in data for each class (e.g., from 4 to 3 classes of the 388 GBs) can have a significant impact on the learning and prediction ability.

On its own, the ability to predict GB properties using machine learning has only limited benefits. For example, predicting the energy of the GBs here is merely an exercise. Computing energies from structures is not difficult, but predicting the mobility and shear coupling of a GB is, and these properties have implications for material processing and deformation. Thus, we desire to use machine learning models to highlight new physical processes governing these properties. ST was introduced here because it targets different features of the GB atomic structure than SOAP. It follows then that each may highlight different physical processes that contribute to the same structure-property relationship, an assertion that would be borne out by improvements to the machine learning accuracies.

A comparison of the learning rates is provided in **Figure 2**. In this figure it can be seen that the SOAP has better training and test accuracies than ST. Furthermore, according to the current slopes of the learning rates, there is no indication, at this point, that ST will perform better than SOAP. For now, one must conclude that ST learns different information about the GB structures, and this information is less helpful for accurate property prediction than the information provided by SOAP.

Interestingly, the SOAP+ST has the lowest training error, while having slightly worse test error than SOAP alone. This is indicative that the information provided by ST is useful in improving the training accuracy of the model. Unfortunately, the increase in error from SOAP alone to SOAP+ST indicates that the additional information provided by ST does not generalize to accurate property predictions on other GB structures. This would indicate that the SOAP+ST is suffering from over-fitting.

To understand and interpret these results, it is helpful to examine the characteristics of the SOAP and ST descriptors. While SOAP is formally complete in its rotational invariance (see Equation 4), the ST is formally complete in its translational invariance due to its convolution integral in Equation (6). In practice, the rotational invariance for ST is introduced by augmenting the representation with several discretely rotated copies of the data. Thus rotational invariance is only approximate for ST, whereas it is formally exact for SOAP. On the other hand, because ST uses multiple wavelets at different scales, it formally handles multi-scale translational invariance. Translational invariance for the SOAP representation originates in the use of local environments defined relative to a central atom, though the length-scale is limited by the cutoff radius of the SOAP descriptor.

The SOAP representation uses spherical harmonics to capture the angular information in the local environment density function. For this implementation of ST, we used periodic spherical harmonic wavelets to capture the periodicity of the GB structure in the dimensions of the boundary plane. It is likely that this choice of basis introduced some similarity in the features extracted by both SOAP and ST, but SOAP remains a local approach while ST operates at multiple scales.

One could also characterize multiple scales using SOAP by concatenating multiple SOAP vectors with varying cutoff and σ<sub>atom</sub> parameters, as has been done in other works (Bartók et al., 2017; Willatt et al., 2018). At larger radial cutoffs, the surface area of the sphere for the local environment grows as r<sub>cutoff</sub><sup>2</sup>, which introduces larger distances between atoms at the surface of the sphere. If the width of the Gaussian density (σ<sub>atom</sub>) placed at each atom remains small, the angular resolution of the SOAP expansion cannot distinguish atom densities well. Thus, increasing the width of the Gaussian at each atom in proportion to the radial cutoff compensates for this geometrical effect so that more distant atoms are still resolved well. However, larger Gaussians placed at neighboring atoms close to the central atom cause structural information to be washed out. This necessitates including multiple SOAP vectors at different cutoffs and σ<sub>atom</sub> values. To demonstrate the effectiveness of this approach, we compare the accuracy of this method with the others listed in **Table 1**. Here it can be seen that the multi-scale SOAP performs almost equal to standard SOAP, with values slightly worse for several properties. This also means that it performs better than ST and SOAP+ST.

While one could conclude from these results that ST does not provide sufficient improvement to the learning to justify its use, we believe there are some reasons to withhold judgment. There are three attributes to the ST that should be considered further. These are (i) data availability, (ii) interpretability, and (iii) overall utility as a structural descriptor.

First, concerning data availability, the ST uses layered convolutional neural networks, which generally provide high accuracy predictions in machine learning. It is worth noting that convolutional neural networks are frequently trained with tens of thousands or more datapoints. It is possible that more data may simply be required for the convolutional neural network used by ST to accurately learn GB properties.

One can increase the size of the GB dataset by constructing additional GB structures, which is time consuming and nontrivial. Or, one can increase the dataset by simulating existing GB structures at finite temperatures, where thermal fluctuations will lead to a large number of similar atomic configurations. We employ the latter approach in simulations of a Σ5 (0 1 3̄)/(0 1̄ 3̄), ⟨100⟩ symmetric tilt GB at 100 K over 10 ns and generate 1000 configurations, or snapshots, for that GB. If the ST is used to train a model on some configurations and test the model on the remaining, ST predicts with low mean absolute error. For example, with a single GB trained on 250 configurations and tested on the other 750 configurations, a mean absolute error of 0.002 J/m<sup>2</sup> is obtained. On the other hand SOAP trained on that same data results in a mean absolute error of 0.0015 J/m<sup>2</sup>. Thus, with significantly more data ST improves significantly, though still not better than SOAP in this case.

The expanded MD dataset demonstrates that ST performs well with additional data. However, such datasets are moving toward the realm of "big data." For example, if one desires to predict properties for any conceivable GB structure, significantly more data will be needed to train a general ST model.

FIGURE 3 | … extracted every 10 picoseconds. Both models look down the [100] tilt axis of the crystals. The units for the inverse scattering transform are arbitrary.

The second attribute of ST that is worth discussing is the interpretability of the results and the ability to learn the underlying physics surrounding the machine learning predictions. By using the ST to provide the feature matrix, one can also perform an inverse scattering transform using relevance propagation to understand what aspects of the structure are influencing the learning. Specific details on the application of relevance propagation to ST are forthcoming (Nguyen, to be submitted). However, **Figure 3** shows heatmaps generated using relevance propagation for the energy learning task. In **Figure 3A** we show a relevance propagation heatmap for learning of GB energy using a 50/50 split of the Olmsted database (i.e., the learning task reported in **Table 1**). Contrast that with the relevance propagation heatmap in **Figure 3B**, where energy was learned from a 500/500 split of the MD configurations noted above. In comparing the two images it is clear that **Figure 3A** highlights a seemingly random selection of atoms that is not consistent with the symmetry of the periodic structure of the GB. In **Figure 3B**, the well-known kite structure from the structural unit model is highlighted, despite the fact that the model had no knowledge of this structure a priori. Thus, the inverse ST relevance propagation heatmaps may allow one to identify the relevant features of the GB structure that correlate with the property of interest. The heatmaps in **Figure 3** would be different for each property even though the structure of the GB might be the same. This could be crucial to the identification of the relevant features of the GB structure controlling different properties.

Furthermore, while Rosenbrock et al. (2017) demonstrated that a derived form of SOAP, called the local environment representation, provides a way to interpret relevant GB structures, SOAP itself can be difficult to interpret. The multi-scale SOAP, which can provide longer range structural information, would be more difficult to interpret than SOAP by itself. Thus, while ST may not lead to the highest prediction accuracy, its interpretability through relevance propagation may render it a useful tool.

The overall utility as a structural descriptor is the third attribute of ST that is worth considering. To consider this we compare ST to a range of structural descriptors and their properties.

In **Table 2** we summarize descriptors introduced for characterizing GBs, and from which machine learning models could be built. In addition to the metrics described in this work we also compare attributes against the structural unit model (SUM), dislocation arrays (DA), common neighbor analysis (CNA), polyhedral template matching (PTM), Voronoi cell topology (VCT), and the polyhedral unit model (PUM), all of which were mentioned in the introduction.

We judge each descriptor based on its usefulness across several metrics. The properties of interest are: Easily Visualized: one can convey the structures through visual means; Easily Interpreted: one can easily identify the relevant characteristics and differences between structures; Comparison: one can quantitatively compare the structures to one another; Invariance: the characterization is invariant to rotations, permutations, and/or translations; Perturbations: perturbations in the structure are captured as small changes in the metric; Smoothly Varying: the metric is continuous and varies smoothly for larger changes in structure; 3D GB Structures: the characterization works for quasi-2D and complex 3D GB structures; Automation: the characterization process can be automated; Connectivity: the technique characterizes how all the atoms in the GB are connected; Multi-scale: the technique characterizes both short- and long-range structural information; Subunit Discovery: the technique does not require a preset list of structures, it can discover them on its own.

While there are notable things about each descriptor and some of the entries in **Table 2** are subjective, we will focus on a few properties of interest. In particular, we'll focus on a few of the properties not present in SOAP.

The structural unit model is abbreviated as SUM, dislocation arrays as DA, common neighbor analysis as CNA, polyhedral template matching as PTM, Voronoi cell topology as VCT, polyhedral unit model as PUM, averaged SOAP representation as SOAP, local environment representation as LER, and scattering transform as ST. A check mark (✓) indicates that the descriptor exhibits a particular property. 'R' indicates that the researcher using the tool is largely responsible for whether or not the atomic structure description has a particular property (since that property is extracted manually).

First, the ability to automate the description is an essential requirement to move GB science into the big data age. This property is shared by many descriptors. Second is the ability to provide multi-scale characterization. Many techniques possess this ability if the researcher knows what they are doing, but ST is the only technique that possesses it inherently. Third and fourth are easily visualized and easily interpreted, two properties that are more subjective. Neither of these properties is a strength of SOAP<sup>3</sup>, but both could be strengths of ST, as evidenced by the heatmaps in **Figure 3**. Fifth is connectivity. ST does not possess this outright as one might consider in the structural unit model or in a graph description. However, it should be noted that while **Figure 3** colors each of the atoms by their relevance in predicting energy, the continuous nature of ST and the inverse ST means that relevance scores are available continuously throughout the space; one could produce high resolution heatmaps. Having a detailed 3D "importance density" for a grain boundary would allow connectivity values between a graph of nearest-neighbor atoms to be quantified (for example by integrating the density along the path connecting the atoms). These edge weights in the connectivity graph could be thresholded to provide alternate views of connectivity. This definition of connectivity is somewhat different from the traditional definition. The heatmaps also change based on the property of interest rather than being static. That, in turn, may be more useful for discovering the physical underpinnings of structure-property relationships. This approach might also allow one to fulfill the final property of subunit discovery. Again, this isn't currently present in ST, but one could imagine how the inverse ST heatmaps might enable this property.
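The edge-weight idea above can be sketched numerically. The following is only an illustration under stated assumptions: the function `importance_density` is a hypothetical Gaussian stand-in for an inverse-ST relevance field (not an output of the actual ST), the atom positions and neighbor pairs are made up, and the threshold value is arbitrary.

```python
import numpy as np

def importance_density(points):
    """Hypothetical smooth relevance field: a Gaussian blob centered at the origin."""
    return np.exp(-np.sum(points**2, axis=-1))

def edge_weight(pos_a, pos_b, density, n_samples=201):
    """Approximate the line integral of `density` along the straight segment a -> b."""
    t = np.linspace(0.0, 1.0, n_samples)
    path = pos_a + t[:, None] * (pos_b - pos_a)      # sample points along the segment
    values = density(path)
    length = np.linalg.norm(pos_b - pos_a)
    # Trapezoidal rule in the arc-length variable s = t * length.
    return length * np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(t))

# Made-up atom positions and assumed nearest-neighbor pairs.
atoms = np.array([[-0.5, 0.0, 0.0], [0.5, 0.0, 0.0], [3.0, 0.0, 0.0]])
edges = [(0, 1), (1, 2)]
weights = {e: edge_weight(atoms[e[0]], atoms[e[1]], importance_density) for e in edges}

threshold = 0.5  # arbitrary cutoff, chosen for illustration only
connected = [e for e, w in weights.items() if w >= threshold]
```

Here only the bond passing through the high-relevance region survives the threshold; lowering the threshold would give an alternate, denser view of connectivity.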

Considering these three attributes of ST, there is reason to believe that the ST, or something very similar, might become an important descriptor for GB data science. However, given the evidence presented here, one must proceed with caution, and consider other ways to achieve the same goals of encoding the most useful information about GB structures for property prediction and discovery of the underlying physics.

# 4. CONCLUSION

The success of machine learning in GB data science will largely be guided by the development of tools that capture the physical essence of GB structure-property relationships. These tools must be automated and universally applicable to large and complex GB structures. Since the machine learning is merely a stepping stone to discovery of the underlying physics, these tools should also satisfy certain mathematical constraints related to invariances and smoothness.

We introduced a new descriptor, the Scattering Transform (ST) (Bownik, 1997; Benedetto and Pfander, 1998; Pfander and Benedetto, 2002; Benítez et al., 2010; Goh and Lee, 2010; Goh et al., 2011; Lanusse et al., 2012; Mallat, 2012), based on continuous, multi-scale wavelet transforms interleaved with modulus nonlinearities. We showed that this descriptor can effectively learn GB structure-property relationships for energy and does reasonably well for temperature-dependent mobility. It should be noted that the SOAP descriptor surpassed the ST in prediction accuracy and remains the optimal descriptor for the properties and structures compared here.

However, we also demonstrated that despite its inability to achieve the same prediction accuracy as SOAP, ST has complementary features that may make it a useful descriptor of GB structure. First, the ST information content is different from and complementary to that of the SOAP descriptor. The ST has the ability to encode multi-scale structural information and be visualized using an inverse ST that generates a heatmap. Importantly, the inverse ST provides evidence of the prevailing wisdom that multi-level convolutional networks require large amounts of data in order to truly learn the physics underlying structure-property relationships. This helps contextualize the performance of ST relative to the averaged SOAP representation and other SOAP-based representations. It also motivates the building of much larger GB databases.

The ST has the potential to be a powerful tool in understanding GB structure-property relationships. As we continue to push the limits of our understanding in GB structure-property relationships, it will be most valuable to (i) focus on building larger databases of GB structure-property mappings, which currently represents the greatest limitation, and (ii) continue to introduce new descriptors that satisfy as many of the desirable characteristics as possible.

<sup>3</sup> SOAP lends itself to interpretation by either (i) optimizing a reference structure by minimizing the kernel metric distance, much like the local environment representation, or (ii) applying relevance propagation to the SOAP vector. However, the first approach provides only a local analog and the second approach suffers information loss due to the angular integral. Thus, while certainly useful, the inverse SOAP operations do not have the same global resolution as an inverse scattering transform.

#### DATA AVAILABILITY

The datasets for this manuscript are not publicly available. Requests to access the datasets should be directed to Stephen Foiles, foiles@sandia.gov.


#### AUTHOR CONTRIBUTIONS

CR, AN, EH, and GH all conceived the idea for this work. AN wrote the code for the scattering transform. DH performed all the calculations. All were involved in writing the manuscript.

#### FUNDING

DH, CR, AN, and GH are supported under ONR (MURI N00014-13-1-0635). EH is supported by the U.S. Department of Energy, Office of Science, Basic Energy Sciences under Award #DE-SC0016441.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Homer, Hensley, Rosenbrock, Nguyen and Hart. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Machine Learning-Based Classification of Dislocation Microstructures

#### Dominik Steinberger\*, Hengxu Song and Stefan Sandfeld\*

Institute of Mechanics and Fluid Dynamics - the Micromechanical Materials Modelling Group, Freiberg University of Mining and Technology, Freiberg, Germany

Dislocations—the carrier of plastic deformation—are responsible for a wide range of mechanical properties of metals or semiconductors. Those line-like objects tend to form complex networks that are very difficult to characterize or to link to macroscopic properties on the specimen scale. In this work a machine learning based approach for classification of coarse-grained dislocation microstructures in terms of different dislocation density field variables is used. The performance of the model combined with domain knowledge from the underlying physics helps to shed light on the interplay between coarse-graining voxel size and the set of suitable or even required density variables for a faithful microstructure characterization.

#### Edited by:

Christian Johannes Cyron, Hamburg University of Technology, Germany

#### Reviewed by:

Liming Xiong, Iowa State University, United States Julien Guénolé, RWTH Aachen Universität, Germany

#### \*Correspondence:

Dominik Steinberger dominik.steinberger@imfd.tu-freiberg.de Stefan Sandfeld stefan.sandfeld@imfd.tu-freiberg.de

#### Specialty section:

This article was submitted to Computational Materials Science, a section of the journal Frontiers in Materials

Received: 04 February 2019 Accepted: 31 May 2019 Published: 18 June 2019

#### Citation:

Steinberger D, Song H and Sandfeld S (2019) Machine Learning-Based Classification of Dislocation Microstructures. Front. Mater. 6:141. doi: 10.3389/fmats.2019.00141

Keywords: machine learning, dislocation, classification, plasticity, microstructure

# 1. INTRODUCTION

One of the primary mechanisms of plastic deformation in crystalline material is the movement of dislocations. Dislocations are one-dimensional lattice defects that cause a distortion of the crystallographic lattice. The distortion results in a stress state through which dislocations interact. Once subjected to a sufficiently large stress they may start to move within a crystallographic plane, the slip plane. In addition to the interaction through their stress fields, dislocations may also form junctions or even may climb, i.e., move perpendicular to their slip plane. Thus, understanding the complex relation between dislocation microstructures and the emerging mechanical behavior is important from a fundamental point of view but is also required for designing new materials with tailored material properties. To this end, both experimental as well as numerical approaches can give important input to such developments.

In recent years, experimental methods reached the point where a three-dimensional imaging of dislocations is possible (Chen et al., 2013; Yamasaki et al., 2015). On the other side of the spectrum, due to improvements in algorithms and increasing computational power, numerical methods are able to simulate the evolution of dislocation microstructure in large specimens of up to several tens of µm (Rao et al., 2019). The drastic increase in the amount of available data sets as well as the degree of complexity of such dislocation microstructure results in a growing need for suitable algorithms and concepts for their analysis. The recent resurgence of machine learning algorithms offers a novel way for exploring this data in great detail.

Machine learning algorithms have been successfully applied in a variety of fields within materials science so far, e.g., prediction of stable compounds (Saal et al., 2013), prediction of the crystal structure (Ghiringhelli et al., 2015), band gap prediction (Dey et al., 2014), microstructure characterization (Chowdhury et al., 2016; Bostanabad et al., 2018), or material structure-property linkages (Cecen et al., 2018). The challenge of machine learning in the context of dislocation microstructures is the extraction and selection of features that are able to accurately capture the properties of both single dislocations and dislocation networks. Features characterizing dislocation microstructures should capture as much of their geometrical character with as few parameters as possible. In typical dislocation dynamics simulations the microstructure is represented as a network of lines, each of which requires many geometrical parameters for its definition. This makes it problematic to operate directly on these objects. We therefore take a new approach based on the discrete-to-continuous (D2C) framework (Sandfeld and Po, 2015; Steinberger et al., 2016), which "borrows" the field variables of a continuum theory of dislocation dynamics. This method was already successfully used to study the emergent microstructural features of molecular dynamics simulations of plastic deformation via scratching (Gunkelmann et al., 2017) and during shock loading (Kositski et al., 2018).

In the following, we will start by introducing the main steps for converting mathematical lines into continuum field variables, along with the most important simulation method, the "discrete dislocation dynamics," which provides the data for all subsequent analysis. We will then apply the briefly introduced machine learning algorithms to an example problem, which will allow us to study the information content of each of these field quantities. This will help to understand whether different sets of field variables suffice as features for machine learning dislocation microstructures.

### 2. METHODS

In the following, the D2C framework is outlined as a means of converting discrete dislocation microstructures into continuous fields while retaining a variable amount of information. Then, the generation of dislocation microstructures within samples via discrete dislocation dynamics is summarized. Lastly, the machine learning algorithms used to classify the sample size based on the dislocation microstructure are described in detail.

#### 2.1. D2C—Discrete-to-Continuous

The D2C framework (Sandfeld and Po, 2015; Steinberger et al., 2016) is based on treating dislocations as directed curves with additional physical properties, i.e., the slip plane normal, and the Burgers vector. Dislocations represent the boundary of an area over which slip displacement between two adjacent lattice planes has occurred. Dislocations can not end at arbitrary sites within the crystal, but only at free surfaces, grain boundaries, other dislocations, or other defects. A dislocation is characterized by


Locally, the character of a dislocation depends on its orientation with respect to the Burgers vector. If **l** ∥ **b**, the character is of "screw" type; if **l** ⊥ **b**, it is of "edge" type; in all other cases it is of "mixed" type.

The so-called Kröner–Nye tensor (Nye, 1953; Kröner, 1958) is defined via

$$\boldsymbol{\alpha} = \sum_{S} \rho_{S}\, \mathbf{b}_{S} \otimes \mathbf{l}_{S} = \sum_{\mathbf{b}} \int_{0}^{2\pi} \int_{0}^{2\pi} \rho_{\mathbf{b}}(\theta, \varphi)\, \mathbf{b} \otimes \mathbf{l}(\theta, \varphi)\, \mathrm{d}\varphi\, \mathrm{d}\theta, \tag{1}$$

where S denotes a possible set of dislocations within a volume sharing a Burgers vector and line tangent. The integral over the spherical angles θ and ϕ denotes an integration over all possible orientations in three-dimensional space. It was the first attempt to describe dislocations along with structural information as continuous fields. As opposed to simplistic measures such as the total dislocation density ρ<sup>t</sup>, which is defined as the line length per averaging volume, the Kröner–Nye tensor captures the local dislocation character in terms of the relative orientation of the line directions of the dislocations with respect to the Burgers vector. However, contributions of dislocations with opposite character cancel each other out: e.g., consider two straight line segments with the same Burgers vector but opposite line directions **l**<sup>+</sup> and **l**<sup>−</sup> = −**l**<sup>+</sup>: their average contribution is **b** ⊗ **l**<sup>+</sup> + **b** ⊗ **l**<sup>−</sup> = **0**. Thus, only information about so-called "geometrically necessary" dislocations, i.e., dislocations that contribute to plastic distortion within the averaging volume, is taken into account. A number of continuum theories for predicting the evolution of dislocations are based on the Kröner–Nye tensor (Acharya and Roy, 2006; Roy et al., 2006; Xia and El-Azab, 2015).
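The cancellation described above can be checked with a few lines of NumPy. This is a minimal sketch, assuming straight segments so that the density of each set reduces to its line length divided by the averaging volume; the function name `kroener_nye` and all numerical values are illustrative and not taken from the D2C implementation.

```python
import numpy as np

def kroener_nye(segments, volume):
    """Sum of (length / volume) * b ⊗ l over straight segments given as (b, l, length)."""
    alpha = np.zeros((3, 3))
    for burgers, tangent, length in segments:
        alpha += (length / volume) * np.outer(burgers, tangent)
    return alpha

b = np.array([1.0, 0.0, 0.0])        # shared Burgers vector (arbitrary units)
l_plus = np.array([0.0, 1.0, 0.0])   # unit line direction
l_minus = -l_plus                    # opposite line direction
volume = 1.0                         # averaging volume (arbitrary units)

dipole = [(b, l_plus, 2.0), (b, l_minus, 2.0)]
alpha_dipole = kroener_nye(dipole, volume)       # contributions cancel
alpha_single = kroener_nye(dipole[:1], volume)   # a single segment does not cancel
total_density = sum(length for _, _, length in dipole) / volume  # line length is still counted
```

The dipole contributes nothing to the tensor although its line length still enters the total dislocation density, which is exactly the information loss discussed in the text.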

Another theory for evolving continuous dislocation fields is the so-called higher-dimensional continuum dislocation dynamics theory developed by Hochrainer and co-workers (Hochrainer et al., 2007; Sandfeld et al., 2010). Within this theory, dislocations are represented by density and "curvature density" fields, both of which are not only a function of the spatial position **r**, but also of the orientations θ and ϕ of the dislocations. While this concept contains much important information, the extra degrees of freedom also add a high degree of complexity. This can be remedied by expanding the density and curvature fields using a Fourier series. The resulting infinite hierarchy of field equations can then be truncated. For the density field of n-th order it is (Hochrainer et al., 2014; Hochrainer, 2015)

$$\rho^{(n)}(\mathbf{r}) = \int_0^{2\pi} \int_0^{2\pi} \rho(\mathbf{r}, \theta, \varphi)\, \mathbf{l}(\theta, \varphi)^{\otimes n}\, \mathrm{d}\varphi\, \mathrm{d}\theta, \tag{2}$$

with **l**(θ, ϕ)<sup>⊗n</sup> denoting the n-times outer product of **l**(θ, ϕ). The zeroth-order term of the series,

$$\rho^{(0)}(\mathbf{r}) = \int_0^{2\pi} \int_0^{2\pi} \rho(\mathbf{r}, \theta, \varphi)\, \mathrm{d}\varphi\, \mathrm{d}\theta, \tag{3}$$

recovers the total dislocation density at position **r**. The first-order term,

$$\rho^{(1)}(\mathbf{r}) = \int_0^{2\pi} \int_0^{2\pi} \rho(\mathbf{r}, \theta, \varphi)\, \mathbf{l}(\theta, \varphi)\, \mathrm{d}\varphi\, \mathrm{d}\theta, \tag{4}$$

represents the "line excess density". If computed separately for each slip system, it is the "geometrically necessary" dislocation density for this slip system. The second-order term,

$$\rho^{(2)}(\mathbf{r}) = \int_0^{2\pi} \int_0^{2\pi} \rho(\mathbf{r}, \theta, \varphi)\, \mathbf{l}(\theta, \varphi) \otimes \mathbf{l}(\theta, \varphi)\, \mathrm{d}\varphi\, \mathrm{d}\theta, \tag{5}$$

denotes the "line direction density". If computed separately for each slip system in a coordinate system that is based on the Burgers vector of that slip system, it can be interpreted as the density of edge- and screw-type dislocation character. The theory based on these fields and an additional field (the curvature density of the dislocations) is able to represent the kinematics of dislocation motion for simplified single slip situations, which was shown by Sandfeld and Po (2015) by comparison with discrete dislocation dynamics simulations.

The symbols refer to the ones used by their respective continuous field theory; **l**(u) denotes the line vector of the dislocation, and **b**<sub>c</sub> its Burgers vector.

Numerically, the computation of the fields based on discrete dislocation data is carried out in the following way. The subvolume of interest within a specimen is discretized into voxels Ω<sub>i</sub>. Microstructure features may then be extracted for each voxel by treating each dislocation as a parameterized curve c(u) via

$$\frac{1}{V_{\Omega_i}} \sum_{c \in \Omega_i} \int_{\mathcal{L}_{\Omega_i}} f_c(u)\, \mathrm{d}u, \tag{6}$$

where u is the arc length, and V<sub>Ω<sub>i</sub></sub> is the volume of the voxel Ω<sub>i</sub>. f<sub>c</sub>(u) denotes a field specific term that relies on the geometrical and physical properties of the dislocation curve c. An overview of the continuous fields used as features and their corresponding term for f<sub>c</sub>(u) is compiled in **Table 1**.
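For straight segments that lie entirely inside one voxel, the integral in Equation (6) reduces to the segment length times the value of f<sub>c</sub>. The sketch below illustrates this for the three CDD fields, assuming f<sub>c</sub> = 1, **l**, and **l** ⊗ **l** for ρ<sup>(0)</sup>, ρ<sup>(1)</sup>, and ρ<sup>(2)</sup>, respectively, in line with Equations (3) to (5). The function name `voxel_fields` and the input values are hypothetical and not part of the D2C code.

```python
import numpy as np

def voxel_fields(tangents, lengths, voxel_volume):
    """Evaluate Eq. (6) for straight segments inside one voxel.

    tangents: (N, 3) unit line directions; lengths: (N,) segment lengths.
    """
    tangents = np.asarray(tangents, dtype=float)
    lengths = np.asarray(lengths, dtype=float)
    rho0 = lengths.sum() / voxel_volume                               # assumed f_c = 1
    rho1 = (lengths[:, None] * tangents).sum(axis=0) / voxel_volume   # assumed f_c = l
    rho2 = np.einsum("n,ni,nj->ij", lengths, tangents, tangents) / voxel_volume  # assumed f_c = l ⊗ l
    return rho0, rho1, rho2

# Made-up example: two antiparallel segments along x and one segment along y.
tangents = [[1, 0, 0], [-1, 0, 0], [0, 1, 0]]
lengths = [2.0, 2.0, 1.0]
rho0, rho1, rho2 = voxel_fields(tangents, lengths, voxel_volume=10.0)
```

In this example the antiparallel pair cancels in ρ<sup>(1)</sup> but still contributes to ρ<sup>(0)</sup> and ρ<sup>(2)</sup>, and the trace of ρ<sup>(2)</sup> equals ρ<sup>(0)</sup> because the line directions are unit vectors.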

#### 2.2. The Discrete Dislocation Dynamics Method

The discrete dislocation dynamics methods represent dislocations as polygonal chains, i.e., an ordered sequence of segments. Forces acting on those segments, or their vertices, due to other dislocations, external load, and/or image forces due to surfaces are computed, and subsequently used to move the dislocations according to a velocity law. Additionally, local rules are implemented to take dislocation reactions like cross-slip or junction formation into account. A velocity law combined with the local rules can then be time-integrated to update the dislocation positions.

#### 2.3. Data Generation and Simulation Set Up

The generation and evolution of dislocation microstructures were performed using the MODEL discrete dislocation dynamics code (Po et al., 2012; Po and Ghoniem, 2014). Cube-shaped copper samples with edge lengths of 30, 60, and 90 nm were filled with dipolar edge loops that were randomly placed on all slip systems up to a total dislocation density of ≈ 5 × 10<sup>16</sup> m<sup>−2</sup>. Throughout the simulations, the effect of open boundaries on the dislocations was taken into account and dislocations were allowed to exit the samples. Subsequently, these random structures were relaxed without application of an external stress.

Due to the open boundaries, image forces act on the dislocations that attract them to the free surfaces, where parts of them leave the specimen. The attraction is stronger the closer the dislocation is to the surface. Therefore, we expect the dislocation density ρ<sup>(0)</sup> to be smaller at the boundaries of the specimen. Furthermore, if unhindered, the remaining part of the dislocation should be oriented perpendicular to the surface. This preference of dislocation line direction should show in the line direction density ρ<sup>(2)</sup>. Thus, the region close to the surfaces should exhibit dislocation microstructure features that are different from those of the center of the sample. For simplicity, formation of junctions was not considered in the present study, which resulted in a large simulation speedup and allowed more samples to be generated.

Overall, 306 realizations of the 30 nm specimen, 238 of the 60 nm specimen, and 207 of the 90 nm specimen were generated. Due to the relatively small number of samples that can be investigated in this study, slip system specific dislocation densities would be prone to overfitting. Instead, only the line directions of each dislocation within the subvolume are taken into account regardless of the slip system. The local deformation character of the dislocation ensemble is therefore only considered by the Kröner–Nye tensor due to it taking the Burgers vector into account.

Due to the different size of the samples, the dislocation arrangement close to the surface can be expected to differ between the sample sizes. Therefore, we ask the following question: Can a machine learning model be trained to classify the sample size based on the dislocation microstructure within a subvolume at the surface of the specimen?

A 30 × 30 × 30 nm subvolume at the center of the side oriented toward the negative x-direction was chosen, see **Figure 1**. For the 30 nm sample, the whole volume is thus taken into account, including all the surfaces. In the larger specimens, the subvolume only contained one free surface. Assuming that the microstructure features are able to capture the characteristics of the microstructure, the classification of the 30 nm sample should be easier than the classification of the 60 nm and 90 nm samples.

# 2.4. Machine Learning of Dislocation Microstructures

Machine learning algorithms rely on the description of samples by common features, that are then typically used for classification, regression, and/or clustering. A feature is a measurable property of a sample that provides information about a sample and puts it into relation with other samples. Classification describes the procedure of trying to infer a label for one or several samples based on the features of other samples with a known label. In this work, a Gaussian naive Bayes classifier is used, which is briefly explained in the following. For more details, see Domingos and Pazzani (1997), and Hand and Yu (2001).

Bayes' theorem states that the probability P of a sample with features X̂ belonging to class y<sub>i</sub> is given by

$$P(y = y_i \mid X = \hat{X}) = \frac{P(X = \hat{X} \mid y = y_i)\, P(y = y_i)}{P(X = \hat{X})}. \tag{7}$$

Here, P(A | B) denotes the conditional probability of A under the condition B. The predicted class then is the class for which this probability is the highest considering the given feature vector. Thus, the denominator of Equation (7) becomes irrelevant, as it does not depend on the class. Both P(y = y<sub>i</sub>), the probability that the class is y<sub>i</sub>, and P(X = X̂ | y = y<sub>i</sub>), the probability that the features are X̂ given that the class is y<sub>i</sub>, are results of the supervised learning procedure. The former is computed as the number of times class y<sub>i</sub> was observed within the training data divided by the total number of training samples. The latter is assumed to be modeled by a Gaussian distribution for each occurring class individually, with the mean and standard deviation being computed from the features of the specimens belonging to that class within the training dataset.
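The procedure described above can be written down compactly. The following is a minimal from-scratch sketch, not the scikit-learn implementation used in this work; the class name `TinyGaussianNB`, the small variance offset, and the synthetic two-class data are assumptions made for illustration. As in any naive Bayes model, the features are treated as independent within each class.

```python
import numpy as np

class TinyGaussianNB:
    """Minimal Gaussian naive Bayes; features are treated as independent within each class."""

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.classes_ = np.unique(y)
        # Prior P(y = y_i): relative frequency of each class in the training data.
        self.log_prior_ = np.log(np.array([np.mean(y == c) for c in self.classes_]))
        # Per-class, per-feature mean and variance of the Gaussian likelihood.
        self.mean_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        # The small constant is an added safeguard against zero variance.
        self.var_ = np.array([X[y == c].var(axis=0) for c in self.classes_]) + 1e-9
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        diff = X[:, None, :] - self.mean_[None, :, :]   # shape (samples, classes, features)
        log_likelihood = -0.5 * np.sum(
            np.log(2.0 * np.pi * self.var_)[None, :, :] + diff**2 / self.var_[None, :, :],
            axis=2,
        )
        # The denominator of Eq. (7) is omitted: it is the same for every class.
        return self.classes_[np.argmax(log_likelihood + self.log_prior_[None, :], axis=1)]

# Synthetic two-class data with well-separated means (illustration only).
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(0.0, 1.0, size=(50, 2)), rng.normal(5.0, 1.0, size=(50, 2))])
y_train = np.array([0] * 50 + [1] * 50)

model = TinyGaussianNB().fit(X_train, y_train)
pred = model.predict([[0.0, 0.0], [5.0, 5.0]])
```

Working with log-probabilities avoids numerical underflow when many features are multiplied together, which becomes relevant for the several hundred voxel features used later.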

A simple example of the Gaussian naive Bayes classification can be seen in **Figure 2**. The samples shown as dots were used to train a Gaussian naive Bayes classifier. Subsequently, the feature space was sampled for its classification areas and they are shown accordingly. Interfaces between these areas are called decision boundaries and represent ambiguous feature combinations.

FIGURE 2 | … features to classify samples into three distinct classes, represented by their color. A Gaussian naive Bayes classifier was trained using the samples seen as dots, and subsequently the areas whose feature combinations would lead to a specific classification were colored accordingly. It can be seen that not all samples would be classified correctly even though they have been part of the training data.

In this work, the features used for machine learning are the microstructure features in the subvolumes of each sample. This is done to make them comparable with respect to the voxel size and position of the features. If, instead, we used the whole specimen size, the data would not be comparable.

The performance of classification models is then measured by cross-validation and the accuracy score, i.e., the number of correctly labeled samples divided by the total number of samples that were labeled. Additionally, so-called confusion matrices may be computed. They reveal details of the mislabeling by keeping track of the true label and the one predicted by the machine learning model.
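Both measures can be computed directly from pairs of true and predicted labels. The sketch below uses made-up labels purely for illustration; the helper names `accuracy` and `confusion` are not from the original work, and the row/column convention (rows: true label, columns: predicted label) is an assumption that matches the common scikit-learn convention.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

def confusion(y_true, y_pred, labels):
    """Count matrix with rows indexed by true label and columns by predicted label."""
    index = {label: k for k, label in enumerate(labels)}
    matrix = np.zeros((len(labels), len(labels)), dtype=int)
    for t, p in zip(y_true, y_pred):
        matrix[index[t], index[p]] += 1
    return matrix

labels = [30, 60, 90]                 # specimen edge lengths in nm, used as class labels
y_true = [30, 30, 60, 60, 90, 90]     # made-up true labels
y_pred = [30, 30, 60, 90, 60, 90]     # made-up predictions
acc = accuracy(y_true, y_pred)
cm = confusion(y_true, y_pred, labels)
```

The diagonal of the matrix counts the correctly labeled samples, so its trace divided by the total number of samples recovers the accuracy score; the off-diagonal entries show which classes are confused with each other.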

To measure the influence of the spatial resolution and its interplay with the different features on the classification score, different combinations of spatial discretizations and density features are applied. Each subvolume was subdivided into up to 8 segments along each direction, resulting in up to 512 voxels Ω<sub>i</sub>. Subsequently, the features were computed within each of those voxels using the D2C framework.
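How the number of features grows with the discretization can be illustrated with a simple binning of line length into a regular voxel grid. This is a simplified sketch and not the D2C implementation: it assumes that each segment is assigned entirely to the voxel containing its midpoint (segments are not split at voxel boundaries), and the segment data are random placeholders.

```python
import numpy as np

def total_density_grid(midpoints, lengths, box_length, n_per_axis):
    """Total line length per voxel volume on an n x n x n grid (midpoint assignment)."""
    edges = [np.linspace(0.0, box_length, n_per_axis + 1)] * 3
    length_per_voxel, _ = np.histogramdd(midpoints, bins=edges, weights=lengths)
    voxel_volume = (box_length / n_per_axis) ** 3
    return length_per_voxel / voxel_volume

rng = np.random.default_rng(0)
midpoints = rng.uniform(0.0, 30.0, size=(200, 3))   # hypothetical segment midpoints in nm
lengths = rng.uniform(0.5, 2.0, size=200)           # hypothetical segment lengths in nm

# One flattened feature vector per discretization: 1, 8, and 512 voxels.
features = {n: total_density_grid(midpoints, lengths, 30.0, n).ravel() for n in (1, 2, 8)}
```

With 8 segments per axis a single scalar field already yields 512 features per sample, and tensor-valued fields multiply this number by their number of components, which is worth keeping in mind given the few hundred samples per class.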

For each combination of spatial discretization and features, 30 shuffled stratified 5-fold cross-validations were performed to determine the average accuracy scores and confusion matrices of the models. Throughout this work, the Python packages NumPy (Oliphant, 2015) version 1.16.0, and scikit-learn (Pedregosa et al., 2011) version 0.20.2 were used.
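One way to set up such a validation with scikit-learn is sketched below. It is an assumption that the repeated cross-validation corresponds to `RepeatedStratifiedKFold` with 5 splits and 30 repeats; the original scripts are not shown in the text. The feature matrix is a synthetic stand-in (shifted Gaussian clouds), so the resulting score says nothing about the actual dislocation data; only the class sizes are taken from the text.

```python
import numpy as np
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)
sizes = {30: 306, 60: 238, 90: 207}   # realizations per specimen size, as stated in the text
n_features = 8                         # placeholder for the number of voxel features

# Synthetic stand-in features: one shifted Gaussian cloud per class.
X = np.vstack([rng.normal(loc=k, scale=1.0, size=(n, n_features))
               for k, n in enumerate(sizes.values())])
y = np.concatenate([np.full(n, label) for label, n in sizes.items()])

# 30 repetitions of a shuffled, stratified 5-fold split give 150 accuracy scores.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=30, random_state=0)
scores = cross_val_score(GaussianNB(), X, y, cv=cv, scoring="accuracy")
mean_accuracy = scores.mean()
```

Stratification keeps the class proportions of the unbalanced data set (306/238/207) roughly constant in every fold, so that the averaged accuracy is not dominated by how a particular split happens to distribute the classes.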

# 3. RESULTS

Dislocation structures in specimens of three different sizes are created using the open source discrete dislocation dynamics code MODEL according to the relaxation procedure outlined in the previous section. Examples for such dislocation structures within a subvolume are shown in the top row of **Figure 3**. All subvolumes exhibit a depletion of dislocations close to the surface. This behavior is most pronounced for the 30 nm specimens. Applying the D2C coarse-graining to the discrete dislocation structure we obtain continuous dislocation dynamics (CDD) field data. To be able to directly compare the microstructures of different specimen sizes, we cut samples of equal sizes from each specimen size (compare **Figure 1**). Typical density distributions for different specimen sizes and with two different discretizations are illustrated in **Figure 3**.

The overall total dislocation density of the 30 nm specimens is smaller than that of the 60 nm and 90 nm specimens. Furthermore, the smallest sample also shows a highly symmetric density morphology, while the average total dislocation density in the larger samples exhibits a gradient, i.e., an increase in the negative x-direction, with smaller density at the free surface on the right. Along the other two directions no gradient can be observed. This is a result of the way we cut samples out of the specimens of different sizes: only the smallest sample has free surfaces everywhere, while the samples from the 60 nm and 90 nm specimens have only one free surface, i.e., the one with outward normal pointing in the positive x-direction.

FIGURE 4 | Average accuracy score of the microstructure features in combinations that are used within their respective theories over the number of voxels along each axis used for the spatial discretization.

Having presented general observations of the microstructure, we now turn to the results of the machine learning model.

**Figure 4** shows the average accuracy scores computed from the machine learning model. They were obtained for different combinations of microstructure features and for different coarse-graining voxel sizes. These particular combinations are commonly used in continuous dislocation simulation models. It can be seen that, in particular for large voxel sizes (≤ 3 × 3 × 3 voxels), the accuracy score of the Kröner–Nye tensor is low compared to those obtained for the CDD field variables. For higher resolutions, the Kröner–Nye tensor α scores higher than the total dislocation density ρ (0) but still does not perform as well as using more than one CDD feature at the same time. Using (combinations of) the CDD field variables from Hochrainer's CDD theory, the general trend is that a larger number of involved fields leads to a better or at least comparable score. Using the direction line density ρ (2) in addition to the excess line density ρ (1) and the total density ρ (0) leads to a significant improvement in the accuracy score (green curve in **Figure 4**).

To study the influence of the different features in more detail, we investigate the accuracy score when using only a single feature and when using combinations of two features in **Figure 5**, on the left and on the right, respectively. If only one voxel is considered for the spatial discretization, the total line density {ρ (0)} is the best predictor of the sample size, followed by the direction line density {ρ (2)}. The latter starts to perform better for resolutions of more than one voxel in each direction. The excess line density {ρ (1)} improves with higher resolution, outperforming the total dislocation density {ρ (0)} for more than four voxels per axis and the direction line density {ρ (2)} for more than seven voxels per axis. Field combinations involving ρ (2) perform better than those without it, the exception being the highest resolution of eight voxels per axis. For low resolutions the combination {ρ (0) , ρ (2)} is more accurate than the combination {ρ (1) , ρ (2)}. For more than two voxels per direction the accuracy of the two becomes comparable.

Confusion matrices obtained when using only the total dislocation density as a feature are shown for different resolutions in **Figure 6**. There, the vertical axis shows the real specimen size and the horizontal axis shows the size inferred by the classification algorithm. One observes that the 30 nm samples are always labeled correctly: each matrix has a "1" in the top left. Larger samples are mislabeled more often, with a stronger tendency to mislabel a specimen as a smaller one, i.e., the 90 nm sample is more often classified as 60 nm than the other way around. This effect is less pronounced for higher resolutions. At the same time the accuracy of correctly labeling the 60 nm samples slightly decreases.

The confusion matrix of the best performing combination of features and resolutions, {ρ (0) , ρ (1) , ρ (2)}, for one voxel, is shown in **Figure 7** on the right. The predicted size of 30 nm samples perfectly matches the actual size. Sixty and ninety nanometer samples are predicted correctly with an accuracy of above 0.8. Specimens that could not be predicted correctly were never labeled as 30 nm, and the degree of false labeling of the two larger samples (i.e., identifying a 60 nm specimen as a 90 nm specimen and vice versa) is balanced.

The combination of the Kröner–Nye tensor and a resolution of two voxels per direction performed worst out of the investigated combinations. Its confusion matrix is shown in **Figure 7** on the left. While specimens of size 30 nm are not predicted perfectly, they still remain those that are most accurately predicted. False predictions are not limited to just the next smaller or larger sizes, as roughly 5 % of 30 nm samples are classified as 90 nm, and roughly 11 % the other way around. Slightly more than half of the 60 and 90 nm samples are classified as 60 nm.

These two extreme cases also summarize all other combinations of continuous fields and resolution: the 30 nm specimens are much more reliably classified than the larger specimens. If larger specimens are mislabeled, the tendency is that the 90 nm specimen is classified as being 60 nm more often than vice versa.

# 4. DISCUSSION

# 4.1. Accuracy of Classifying 30 nm vs. 60 nm, and 90 nm

The confusion matrices show that there is a striking difference in the accuracy with which subvolumes of 30 nm specimens are classified compared to those of the larger specimens. The reason for this is that the subvolumes of 30 nm specimens have six free boundaries, whereas the subvolumes of larger specimens only have one. As seen in **Figure 3**, this leads to distinct density features for the 30 nm specimens/subvolumes. On the one hand, the overall dislocation density is lower compared to the larger specimens. As dislocations are attracted to free surfaces through which they can leave the specimen, more free surfaces closer to the subvolume result in a lower density. On the other hand, there is no gradient of the dislocation density like in the larger specimens. While the dislocations close to the free surfaces in the larger specimens are able to leave the samples, dislocations closer to the center of the specimen cannot. On average, this leads to a large density gradient in the subvolumes of larger samples. These are likely the features learned by the model, and they lead to a high accuracy in distinguishing the 30 nm subvolumes from larger ones, regardless of the resolution.

Classification of subvolumes of the larger specimens is less accurate as their basic features are the same: both have one free surface, while their other surfaces are inside the specimen. However, the distance of the "inner" subvolume surfaces to the specimen surfaces is different for the 60 nm and the 90 nm samples. This likely leads to more subtle differences in the dislocation microstructure that have to be represented as features for the machine learning algorithm to recognize them. For this, two options seem to be available:


Both ways also alleviate the asymmetry in mislabeling subvolumes of the larger specimens. If one looks at the "feature efficiency," i.e., how many features are used to get the best accuracy, including more information via higher order terms of Hochrainer's CDD theory is the better solution.

# 4.2. Resolution and Features

Is it possible to identify a simple or generic recipe that helps to choose the "right" resolution or "correct" number of features? To answer this question the accuracy scores for all feature combinations and numbers of voxels for each direction are summarized in **Figure 8**.

When only using one microstructure feature set, the total dislocation density ρ (0) performs best for low resolutions. While the performance of ρ (0) remains rather unchanged for higher resolutions, other single microstructure features perform better at different resolutions. This can be explained by the length scales of the features compared to the spatial resolution. If there is only a single voxel, then the details that are captured by ρ (2) are averaged out. The performance increases as the resolution gets higher up to a point of about two average dislocation spacings. At this point the length scale of the details represented by ρ (2) likely coincides with the resolution and leads to good performance.

The poor performance for the line excess density ρ (1) and the Kröner–Nye tensor α at resolutions below one average dislocation spacing can furthermore be explained by the chosen dislocation configuration. Both measures are only able to describe dislocation configurations that show an excess of a particular dislocation type that ultimately leads to a plastic distortion of the lattice. In our example, the average dislocation character is balanced, i.e., on average there is no plastic distortion in the specimen. Thus, the local formation of substructures of different "character excess" is averaged out if the resolution is chosen too low. As the resolution is increased and approaches a comparable scale, the performance of these measures increases and, in some cases, even surpasses that of the total dislocation density.

Why is it not possible to simply increase the resolution together with using four or more CDD field variables? As the resolution increases, so does the number of features and the likelihood of overfitting. This can be seen in the performance of the combination of the fields {ρ (0) , ρ (1) , ρ (2)} in **Figure 5**. The performance advantage of {ρ (0) , ρ (1) , ρ (2)} over {ρ (0)} at lower resolutions can be attributed to the addition of ρ (2) alone, as is evident from comparing the performance of {ρ (0) , ρ (1) , ρ (2)} and {ρ (0) , ρ (2)}. As the resolution is increased, the performance slightly decreases and reaches another maximum for the resolution of four voxels per axis. This coincides with an increase in the accuracy of all field combinations that contain ρ (1). Up until this point the likelihood of having overfitted is small. The subsequent continuing drop in accuracy may then be attributed to overfitting.

Overfitting, however, may not be the only culprit of a decrease in performance for higher resolutions. As dislocations are one-dimensional objects embedded into three-dimensional space, the size of the voxels, i.e., the size of the domain for statistical averaging, can be too small. In extreme cases, no correlation may be found for characteristic dislocation arrangements that, due to too fine a resolution, are, e.g., contained in different voxels. The link to the underlying physics is given by the mean dislocation spacing, $\bar{x} = 1/\sqrt{\rho_0}$. If the voxel size is smaller than $\bar{x}$, the likelihood of finding two dislocations inside the same averaging volume becomes small. Thus, a single voxel is rather a probe of properties of a single line segment but will not be able to represent any non-local structural details of more complex dislocation networks. In **Figure 4** the voxel size as a multiple of the initial mean dislocation spacing $\bar{x}$ is indicated on top of the diagrams. In both plots one can observe that for voxels smaller than $\approx \bar{x}$ the accuracy is strongly reduced. Therefore, one can conclude that the mean dislocation spacing might be a useful quantity to estimate a reasonable lower limit for the voxel size. This highlights the fact that including domain knowledge is beneficial.
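The suggested lower limit for the voxel size can be checked with a few lines of Python. The density and the subvolume edge length below are hypothetical placeholders rather than values from the simulations, and the function names are introduced only for this sketch.

```python
import math

def mean_dislocation_spacing(rho0):
    """Mean dislocation spacing, 1 / sqrt(rho0), for a density rho0 in 1/m^2."""
    return 1.0 / math.sqrt(rho0)

def voxel_size_in_spacings(subvolume_edge, n_segments, rho0):
    """Voxel edge length expressed as a multiple of the mean dislocation spacing."""
    return (subvolume_edge / n_segments) / mean_dislocation_spacing(rho0)

# Hypothetical values, not taken from the paper.
rho0 = 1.0e16    # total dislocation density in 1/m^2 (assumption)
edge = 35.0e-9   # subvolume edge length in m (assumption)

for n in (1, 2, 4, 8):
    ratio = voxel_size_in_spacings(edge, n, rho0)
    verdict = "ok" if ratio >= 1.0 else "finer than the mean spacing"
    print(f"{n} voxels per axis: voxel edge = {ratio:.2f} mean spacings ({verdict})")
```

With such a check, discretizations whose voxels fall below one mean spacing can be flagged before any model is trained.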

# 4.3. Implications of Simplifications of the Simulations

Clearly, the DDD setup that was used in this work is not entirely realistic since junction formation was not allowed. However, the main point still remains valid: continuous fields are sufficient as features for machine learning of dislocation microstructures. Junction formation would not hinder us in extracting the line directions, but actually give us access to more features by, e.g., differentiating between lines of "pristine" type and "junction" type. The number of available features further increases if junction features such as the resulting Burgers vector or the angle between junction line and the original dislocation lines were taken into account. This, of course, also means that more samples would be required to avoid overfitting.

# 5. CONCLUSION

A variety of continuous fields "borrowed" from a continuous dislocation dynamics theory was introduced as potential machine learning features that are able to describe dislocation microstructures. Using discrete dislocation dynamics, relaxed dislocation configurations of samples of different size were created. Through the D2C framework, the microstructure features of the discrete data provided by the discrete dislocation dynamics code were extracted. The performance of these features was investigated by predicting the size of a specimen based on samples of dislocation microstructure. It was shown that the accuracy of machine learning models trained with these features varies with different sets of microstructure features and spatial discretizations. Finding the key characteristic microstructural features in these systems and linking them to the underlying physics seems to be a very promising way, not just for "learning dislocation dynamics" but also for guiding the development of coarse-grained continuum theories of dislocations, for example those based on atomistics (Xiong et al., 2011) or on the phase field method (Rodney et al., 2003). If a machine learning model were trained to distinguish between the detailed and the coarse-grained simulations based on the proposed microstructure features and its performance turned out to be poor, this could imply that the coarse-grained model is able to capture the underlying mechanisms accurately.

Last but not least, the present work might also be a first step toward guiding the development of new, possibly specialized continuum theories of dislocation dynamics, since the classification performance of certain field variables can be an indicator of their importance. Understanding the interplay between voxel size and accuracy might be able to guide, e.g., finite element based simulation frameworks toward an "information-based" mesh refinement.

# AUTHOR CONTRIBUTIONS

DS designed, implemented and performed the data analysis. HS created the simulation set up and ran simulations. SS designed the project. DS and SS discussed the results and wrote the manuscript. All authors contributed to the manuscript, read and approved the submitted version.

# ACKNOWLEDGMENTS

The authors acknowledge funding from the European Research Council Starting Grant, A Multiscale Dislocation Language for Data-Driven Materials Science, ERC Grant Agreement No. 759419 MuDiLingo.

# REFERENCES


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Steinberger, Song and Sandfeld. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Machine Learning Techniques for the Segmentation of Tomographic Image Data of Functional Materials

Orkun Furat<sup>1</sup>\*, Mingyan Wang<sup>2</sup>, Matthias Neumann<sup>1</sup>, Lukas Petrich<sup>1</sup>, Matthias Weber<sup>1</sup>, Carl E. Krill III<sup>2</sup> and Volker Schmidt<sup>1</sup>

*<sup>1</sup> Institute of Stochastics, Ulm University, Ulm, Germany, <sup>2</sup> Institute of Functional Nanosystems, Ulm University, Ulm, Germany*

In this paper, various kinds of applications are presented, in which tomographic image data depicting microstructures of materials are semantically segmented by combining machine learning methods and conventional image processing steps. The main focus of this paper is the grain-wise segmentation of time-resolved CT data of an AlCu specimen which was obtained in between several Ostwald ripening steps. The poorly visible grain boundaries in 3D CT data were enhanced using convolutional neural networks (CNNs). The CNN architectures considered in this paper are a 2D U-Net, a multichannel 2D U-Net and a 3D U-Net where the latter was trained at a lower resolution due to memory limitations. For training the CNNs, ground truth information was derived from 3D X-ray diffraction (3DXRD) measurements. The grain boundary images enhanced by the CNNs were then segmented using a marker-based watershed algorithm with an additional postprocessing step for reducing oversegmentation. The segmentation results obtained by this procedure were quantitatively compared to ground truth information derived by the 3DXRD measurements. A quantitative comparison between segmentation results indicates that the 3D U-Net performs best among the considered U-Net architectures. Additionally, a scenario, in which "ground truth" data is only available in one time step, is considered. Therefore, a CNN was trained only with CT and 3DXRD data from the last measured time step. The trained network and the image processing steps were then applied to the entire series of CT scans. The resulting segmentations exhibited a similar quality compared to those obtained by the network which was trained with the entire series of CT scans.

Keywords: machine learning, segmentation, X-ray microtomography, polycrystalline microstructure, Ostwald ripening, statistical image analysis

#### Edited by:

*Benjamin Klusemann, Leuphana University, Germany*

#### Reviewed by:

*Stefan Sandfeld, Freiberg University of Mining and Technology, Germany*

*Tim Dahmen, German Research Centre for Artificial Intelligence, Germany*

#### \*Correspondence:

*Orkun Furat, orkun.furat@uni-ulm.de*

#### Specialty section:

*This article was submitted to Computational Materials Science, a section of the journal Frontiers in Materials*

Received: *04 February 2019* Accepted: *07 June 2019* Published: *25 June 2019*

#### Citation:

*Furat O, Wang M, Neumann M, Petrich L, Weber M, Krill CE III and Schmidt V (2019) Machine Learning Techniques for the Segmentation of Tomographic Image Data of Functional Materials. Front. Mater. 6:145. doi: 10.3389/fmats.2019.00145*

# 1. INTRODUCTION

In materials science, supervised machine learning techniques are used to describe relationships between the microstructure of materials and their physical properties (Stenzel et al., 2017; Xue et al., 2017). Roughly speaking, these techniques provide high-parametric regression or classification models. However, to analyze the microstructure and to determine quantitative descriptors for its morphology or texture, one often requires image acquisition techniques like X-ray microtomography or electron backscatter diffraction (EBSD). Therefore, image processing is necessary for analysis, which generally entails some sort of semantic segmentation of image data. The non-trivial task of segmentation can range from determining the material phases that are present in image data to the detection and extraction of single particles, grains or fibers. The quality of the segmentation has a significant influence on the subsequent analysis of the material's microstructure and macroscopic physical properties.

Thus, in the present paper, we focus on machine learning techniques that provide assistance in the segmentation of image data. In recent years, numerous approaches that deal with this issue have been considered in various fields, among which convolutional neural networks (CNNs, Goodfellow et al., 2016) in particular enjoy increasing popularity. In the field of object detection in 2D images the Region-CNN (R-CNN, Girshick et al., 2014) was successfully used for determining bounding boxes around objects of interest. This architecture was subsequently enhanced, resulting in the Fast R-CNN (Girshick, 2015) and Faster R-CNN (Ren et al., 2017). However, in many applications it does not suffice to obtain a bounding box around objects of interest. A much finer segmentation was achieved by He et al. (2017), who extended the Faster R-CNN architecture to assign image pixels to object instances detected in 2D image data. Recently, another CNN architecture, namely the U-Net (Ronneberger et al., 2015), was used for the segmentation of biomedical 2D image data. In later works, variations of the U-Net were introduced which are able to process and segment volumetric image data, see Çiçek et al. (2016) and Falk et al. (2019). Furthermore, conventional segmentation techniques, like the watershed transform (Beucher and Lantuéjoul, 1979), have been utilized in combination with methods from machine learning in segmentation tasks, see Naylor et al. (2017) and Nunez-Iglesias et al. (2013).

In the present paper, we give a short overview of several applications in the field of materials science in which we successfully combined methods of statistical learning (including random forests, feedforward and convolutional neural networks) with conventional image processing techniques for segmentation, classification and object detection tasks, see e.g., Furat et al. (2018), Neumann et al. (2019), and Petrich et al. (2017). This shows the flexibility of the approach of combining conventional image processing with machine learning techniques, where the latter can be used either for preprocessing image data to increase the performance of conventional image processing algorithms or for postprocessing segmentations obtained by conventional means in order to improve segmentation qualities.

Based on our experience from previous studies, we apply similar techniques to the segmentation of time-resolved tomographic image data of polycrystalline materials. More precisely, the focus of the present paper is on data of an AlCu alloy that was repeatedly imaged by X-ray computed tomography (CT) following periods of Ostwald ripening. In order to investigate the relationship between grain geometry and functional properties, the study of grain boundary movement (caused by the growth of grains during the ripening process) is of particular interest (Werz et al., 2014). Therefore, it is necessary to segment the CT image data into single grains. Due to the poor visibility of grain boundaries at high volume fractions in CT data (Werz et al., 2014), this task is demanding, especially when targeted using conventional image processing approaches.

Consequently, we will utilize convolutional neural networks, in particular architectures based on the U-Net (Ronneberger et al., 2015), for enhancing and predicting grain boundaries from CT data obtained after several ripening steps. More precisely, we use single- and multichannel U-Nets which receive 2D input and can be applied slice-by-slice to image stacks. Additionally, we trained a 3D U-Net which can evaluate volumetric data at a lower resolution, due to higher memory consumption. For training the neural networks we use "ground truth" information derived from 3D X-ray diffraction (3DXRD) microscopy, which allows grains and their boundaries to be extracted from the technique's measurement of local crystallographic orientation. The trained networks can then recover grain boundaries of poor visibility in CT data reasonably well, without drawing on additional 3DXRD information.

The rest of this paper is organized as follows. In section 2, we give a short overview of some applications that combine machine learning methods with conventional techniques of image processing for the semantic segmentation and classification of image data. Section 2.1 deals with the trinarization of the microstructure of Ibuprofen tablets using random forests and the watershed algorithm (Neumann et al., 2019). Then, in section 2.2, particulate systems of minerals are considered that are of interest in the mining industry. Here, a feedforward neural network is used to refine particle-wise segmentations obtained from the watershed algorithm (Furat et al., 2018). The watershed algorithm and feedforward neural networks are also combined in section 2.3. However, in the latter case, the focus lies on the detection of particle cracks in the 3D microstructure of lithium-ion batteries (Petrich et al., 2017).

The main results of the present paper are given in section 3. To begin with, in section 3.1, we describe the problem at hand when considering CT image data of AlCu alloys. In section 3.2, we utilize 3DXRD microscopy data to train three neural networks to extract grain boundaries from CT image data: a 2D U-Net for slice-by-slice evaluation, a multichannel 2D U-Net which can process consecutive slices and a 3D U-Net which uses full 3D information at a lower resolution. The grain boundary predictions of these networks are then segmented into single grains with conventional image processing tools (Spettl et al., 2015). In section 3.3, we quantitatively compare the presented methods by matching segmented grains to the "ground truth" obtained by 3DXRD measurement. Then, in section 3.4 we discuss how similar approaches can be utilized in other fields in which "ground truth" measurements are not easily feasible. Finally, section 4 concludes.

# 2. OVERVIEW OF PREVIOUS RESULTS

In this section, we give a short overview of different applications in the field of materials science in which we successfully combined methods of statistical learning, including random forests, feedforward and convolutional neural networks with conventional image processing techniques for segmentation, classification and object detection tasks.

# 2.1. Segmentation of Ibuprofen Tablets

In Neumann et al. (2019), a hybrid algorithm combining machine learning techniques with conventional tools of image analysis has been used to trinarize tomographic image data representing the microstructure of Ibuprofen tablets, i.e., to classify each voxel of the grayscale image as one of the three phases the tablet consists of. These phases are microcrystalline cellulose (MCC), Ibuprofen (API) and pores. In the following, we describe the challenges of this particular trinarization problem and briefly summarize the developed hybrid trinarization algorithm. Moreover, we discuss to which extent it improves the algorithms which are based either on machine learning techniques or on conventional image analysis. For details, we refer to Neumann et al. (2019).

A 2D slice of the 3D image data, which is obtained by synchrotron tomography and represents the microstructure of Ibuprofen tablets, is visualized in **Figure 1**. The image data consists of cubic voxels with a side length of 0.438 µm, while the resolution limit is at about 2 µm. Although there is a good contrast between the three constituents of the tablets, it is challenging to perform an algorithmic trinarization, mainly due to the following two aspects. First, the grayscale values of some voxels within MCC are in the same range as the grayscale values of those voxels which belong clearly to API. Second, long thin pores occur at the boundary of MCC particles, the corresponding grayscale values of which are similar to the ones of API. These two aspects suggest that in this application it is not reasonable to rely only on thresholding of grayscale values in order to obtain a physically coherent trinarization.

To deal with these challenges by means of machine learning, a random forest algorithm is used, i.e., a classification algorithm is considered which is based on a large number of randomized decision trees (James et al., 2013). To train the random forest algorithm, N voxels of a 2D slice of the image are manually classified by visual inspection. On the same 2D slice, M different filters are applied. Doing so, we obtain for each of the N manually classified voxels an (M+1)-dimensional feature vector. It contains the original grayscale value of the voxel as well as the M grayscale values after application of the M filters. The random forest is trained to classify the voxels, i.e., to trinarize the image, by means of these feature vectors. For this purpose, Ilastik (Sommer et al., 2011) is used in combination with the parallelized random forest implemented in the computer vision library VIGRA. The results of the random forest algorithm are visualized in **Figure 1B**. One can observe that it leads to a satisfactory trinarization. Regarding the challenges mentioned above, the random forest algorithm leads to a good classification of MCC particles, even if an occurrence of API inside them is suggested by small grayscale values. Moreover, the long and thin pores at the boundary of MCC particles are reflected well in the trinarized image, since the algorithm is trained to detect such thin pores. However, this leads to wrongly detected pore voxels at the boundary between MCC and API when there is no indication for pores, neither by grayscale values nor by physical reasons. This effect can be removed by combining the random forest algorithm with a trinarization which is based on conventional image analysis and uses the watershed algorithm.
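The construction of the (M+1)-dimensional feature vectors can be illustrated with a small Python sketch. The two filters used here (a 3 × 3 mean and a 3 × 3 maximum, so M = 2) are chosen for brevity and are assumptions; they are not the filters configured in Ilastik, and the toy image is invented.

```python
def neighborhood(image, r, c):
    """Gray values in the 3x3 window around (r, c), clipped at the image border."""
    rows, cols = len(image), len(image[0])
    return [image[i][j]
            for i in range(max(0, r - 1), min(rows, r + 2))
            for j in range(max(0, c - 1), min(cols, c + 2))]

def mean_filter(image, r, c):
    """Hypothetical filter 1: local mean gray value."""
    window = neighborhood(image, r, c)
    return sum(window) / len(window)

def max_filter(image, r, c):
    """Hypothetical filter 2: local maximum gray value."""
    return max(neighborhood(image, r, c))

def feature_vector(image, r, c, filters):
    """Original gray value followed by the M filter responses at pixel (r, c)."""
    return [image[r][c]] + [f(image, r, c) for f in filters]

# Toy 2D slice with a single bright pixel (illustrative data only).
image = [[0, 0, 0],
         [0, 9, 0],
         [0, 0, 0]]
filters = [mean_filter, max_filter]
center = feature_vector(image, 1, 1, filters)
corner = feature_vector(image, 0, 0, filters)
```

The feature vectors of the N manually labeled voxels, together with their labels, would then form the training set of the classifier.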

The main idea of the watershed-based trinarization is as follows. At first, the pore space is determined via global thresholding. Here the threshold value is manually chosen by visual inspection. In the second step, regions in which the deviation of grayscale values is relatively small are determined by the watershed algorithm (Beucher and Lantuéjoul, 1979; Meyer, 1994; Beare and Lehmann, 2006). Then, each of these regions is classified as either API or MCC according to its average grayscale value. The results of the watershed-based trinarization, visualized in **Figure 1**, show that this approach leads to an appropriate trinarization when only the grayscale values are considered without any additional physical information about the material. However, the random forest trinarization is significantly better with respect to the detection of MCC particles and long, thin pores. Nevertheless, the watershed-based trinarization does not detect unrealistic pores at the boundary between API and MCC. Thus, the information obtained by the watershed-based trinarization can be used to further improve the random forest trinarization.

In particular, each pore voxel v of the random forest trinarization is relabeled as API voxel if the closest pore voxel in the watershed-based trinarization has a distance of more than 8.76 µm and the closest voxel classified as API in the random forest trinarization has a distance of at most 8.76 µm. The latter condition is necessary since pores within MCC which are not detected by the watershed-based trinarization should not be removed. The value of 8.76 µm is manually chosen and is justified by visual inspection of the obtained result. A 2D slice of the final trinarization is shown in **Figure 1D**. The combination of the random forest trinarization with the watershed-based trinarization meets the challenges of classifying the three constituents of Ibuprofen tablets. Based on the trinarized image, a characterization of the 3D microstructure of Ibuprofen tablets is performed by means of spatial statistics in Neumann et al. (2019).
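The relabeling rule can be sketched as follows. This is a brute-force illustration on a handful of 2D voxel coordinates, not the implementation of Neumann et al. (2019): the dictionary representation, the function names, and the toy threshold of 3 voxels are assumptions. For the real data, 8.76 µm corresponds to 20 voxels at a voxel side length of 0.438 µm.

```python
import math

def nearest_distance(voxel, candidates):
    """Euclidean distance from voxel to the closest candidate (inf if there is none)."""
    return min((math.dist(voxel, c) for c in candidates), default=math.inf)

def relabel_pores(rf_labels, ws_labels, max_dist):
    """Relabel spurious pore voxels of the random forest (RF) result as API.

    rf_labels, ws_labels: dicts mapping voxel coordinates to 'pore', 'API' or 'MCC'
    for the random forest and the watershed-based (WS) trinarization, respectively.
    """
    ws_pores = [v for v, lab in ws_labels.items() if lab == "pore"]
    rf_api = [v for v, lab in rf_labels.items() if lab == "API"]
    result = dict(rf_labels)
    for v, lab in rf_labels.items():
        if lab != "pore":
            continue
        far_from_ws_pore = nearest_distance(v, ws_pores) > max_dist
        close_to_rf_api = nearest_distance(v, rf_api) <= max_dist
        if far_from_ws_pore and close_to_rf_api:
            result[v] = "API"
    return result

# Toy example along one row of voxels (coordinates in voxel units, invented data).
rf = {(0, 0): "MCC", (1, 0): "pore", (2, 0): "API", (10, 0): "pore", (20, 0): "pore"}
ws = {(0, 0): "MCC", (1, 0): "API", (2, 0): "API", (10, 0): "pore", (20, 0): "MCC"}
final = relabel_pores(rf, ws, max_dist=3)
```

In this toy example only the pore voxel next to API is relabeled, while the pore confirmed by the watershed-based result and the isolated pore far from any API voxel are kept.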

# 2.2. Segmentation of Mineral Particle Systems

In the previous section, we discussed how to combine tools of conventional image processing with machine learning techniques to determine the material's phases in tomographic image data. However, in many applications a much finer segmentation is required, e.g., for tomographic images of particle or grain systems the segmentation has to correctly separate these objects from the background and from each other. For such segmentation problems, modified versions of the watershed algorithm, which entail some sort of pre- or postprocessing of image data, often yield good results (Roerdink and Meijster, 2000; Rowenhorst et al., 2006b; Spettl et al., 2015; Kuchler et al., 2018). The preprocessing steps are necessary to determine unique markers for each particle or grain, from which the watershed algorithm grows regions which lead to a segmentation of the image. A carefully adjusted marker detection is required: If multiple markers are determined in a single particle, the watershed splits the particle into multiple fragments, see **Figure 2B**. This issue is referred to as oversegmentation. On the other hand, too few markers lead to a segmentation in which multiple particles are assigned to a single region. The marker detection is especially difficult if the particles depicted in the image data have irregular, for example elongated or plate-like, shapes. Therefore, a postprocessing step is required to correct the mentioned issues, e.g., by merging regions to overcome oversegmentation.

In Furat et al. (2018), X-ray microtomography (XMT) image data of a mixture of particles was considered. These particles consist of ores and other minerals and have a size of about 100 µm, see **Figure 2A**. In order to analyze particle properties from such image data, for example, the distributions of volume or some shape characteristics, one needs to extract single particles from image data via segmentation. However, the watershed algorithm often fails for the considered data, since, for example, elongated particles are segmented into multiple fragments. In Furat et al. (2018), a postprocessing step was described which utilizes machine learning techniques, more precisely a feedforward neural network, to eliminate oversegmentation.

To this end, an oversegmented image Iover of a tomographic grayscale image I of the sample under consideration was represented by an undirected graph G = (V, E), where each vertex v ∈ V represents a region of the oversegmented image Iover. Furthermore, the set E contains an edge e = (v1, v2) between two vertices v1, v2 ∈ V if the corresponding regions are adjacent in the oversegmented image Iover. The goal of the neural network was the elimination of edges between adjacent regions which belong to different particles, while preserving those between regions which lie in the same particle. This led to a reduced set of edges Ẽ ⊂ E. A remaining edge (v1, v2) ∈ Ẽ indicated that the corresponding adjacent regions should be merged in the oversegmented image. For the neural network to decide whether to remove an edge e ∈ E, it required input in the form of feature vectors xe ∈ ℝ^p, obtained from the original grayscale image I.

Among the components of the input vectors xe, local contrast information was stored. More precisely, the absolute gradient image of I was computed using Sobel operators, see Soille (2013). For an edge e = (v1, v2), the voxels in the vicinity of the interface between the two regions surrounding the vertices v1 and v2 were considered for the computation of the first four moments of the absolute gradient values in this local neighborhood. These values were stored in the feature vector xe. Furthermore, xe was appended with the relative frequencies of the histogram of the local absolute gradient values. Analogously, the first four moments and the relative frequencies of the histogram of local grayscale values of the original image I were stored in xe. Note that the previously described features of the vector xe contain only local contrast information. Therefore, some local geometry features were included in a similar manner. By computing local curvatures, the first four moments and histogram frequencies of curvatures were obtained in the vicinity of the interface between v1 and v2. Another geometrical feature which was considered characterizes the shape of the interface itself. More precisely, a principal component analysis (PCA) of the voxels which form the interface between the adjacent regions was performed (Hastie et al., 2009). The eigenvalues obtained by the PCA were stored in the feature vector xe.

Then the classification problem was formulated as

$$f(x_e) = \begin{cases} 1, & \text{if } v_1 \text{ and } v_2 \text{ belong to the same particle,} \\ 0, & \text{else,} \end{cases} \tag{1}$$

for each edge e = (v1, v2) ∈ E. As a model for the classifier f, a feedforward network was chosen and the target values for feature vectors xe were determined by manually segmenting a small cut-out of the image data. The trained network f was then used to classify which edges e should be removed, i.e., edges with f(xe) = 0. **Figure 2B** depicts the initial graph, in which edges are set between adjacent regions. After the edge reduction with the neural network, regions connected by an edge e with f(xe) = 1 get merged, thus leading to a less oversegmented system of particles, see **Figure 2C**.
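To illustrate the graph-based postprocessing, the sketch below builds the edge set E from a label image and merges all regions that remain connected after the edge reduction. It is a simplified illustration under stated assumptions: the function names are hypothetical, region labels are assumed to be non-negative integers, and the classifier decisions f(xe) are passed in as a plain list rather than computed by a trained network.

```python
import numpy as np


def adjacency_edges(labels):
    """Return sorted edges (v1, v2), v1 < v2, between face-adjacent regions."""
    edges = set()
    for axis in range(labels.ndim):
        a = np.moveaxis(labels, axis, 0)
        left, right = a[:-1].ravel(), a[1:].ravel()
        differs = left != right
        for u, v in zip(left[differs], right[differs]):
            edges.add((int(min(u, v)), int(max(u, v))))
    return sorted(edges)


def merge_regions(labels, edges, keep_edge):
    """Merge regions joined by edges with keep_edge == 1, i.e., f(x_e) = 1."""
    parent = {int(v): int(v) for v in np.unique(labels)}

    def find(v):
        # Union-find root lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for (v1, v2), keep in zip(edges, keep_edge):
        if keep == 1:
            r1, r2 = find(v1), find(v2)
            if r1 != r2:
                parent[max(r1, r2)] = min(r1, r2)

    # Lookup table mapping every old label to the label of its merged region.
    lut = np.zeros(int(labels.max()) + 1, dtype=labels.dtype)
    for v in parent:
        lut[v] = find(v)
    return lut[labels]
```

Using a union-find structure means that chains of kept edges (region 1 merged with 2, and 2 with 3) collapse into a single particle without any additional bookkeeping.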

#### 2.3. Crack Detection in Lithium-Ion Cells

In sections 2.1 and 2.2, machine learning is applied to image segmentation problems. In this section, we present an approach that goes one step further and employs similar techniques. Instead of identifying individual particles, however, the relationship between two particles is investigated, which makes it possible to localize regions of interest in electrodes of lithium-ion batteries.

Lithium-ion batteries are among the most commonly used types of batteries since they combine several beneficial properties, such as high energy density and low self-discharge. However, one of their biggest disadvantages is their vulnerability to thermal runaway caused, e.g., by overheating or overcharging, which can lead to disastrous incidents like fires or even explosions. An active research field deals with the design of lithium-ion batteries with minimal risk of failure. It is known that during thermal runaway the particles in the electrode material break (Finegan et al., 2016), and the resulting increase in surface area intensifies the heat generation (Jiang and Dahn, 2004; Geder et al., 2014). However, many questions are still unanswered and an in-depth analysis on how the microstructure of the electrodes affects the safety of the battery requires information on the locations of the broken particles in post-mortem cells.

FIGURE 2 | (A) 2D cut-out of tomographic image data of ore particles. (B) Oversegmented image obtained by the watershed transformation. Red lines are set between adjacent regions. Note that some regions are adjacent in 3D but not in the visualized planar section. (C) Segmentation after a postprocessing step using a neural network.

FIGURE 3 | Overview of the model development for the crack detection in lithium-ion batteries. Reprinted from Petrich et al. (2017), Figure 1, with permission from Elsevier.

For this purpose, in Petrich et al. (2017) a method is presented that allows an automatic detection of particle cracks in tomographic image data of lithium-ion batteries and thus reduces the amount of manual labeling, which is tedious at best or outright infeasible for large datasets. More precisely, a commercial LiCoO<sub>2</sub> cell was overcharged, which led to a thermal runaway. The post-mortem sample was imaged in a lab-based X-ray nano-CT system. To prepare the data for further analysis, it was denoised and binarized, and individual particles were segmented. In Petrich et al. (2017), pairs of adjacent particles are considered and categorized into one of the following classes.


The goal is to automatically classify pairs of particles with methods from machine learning, which require hand-labeling only for a small subset of the data. In the presented case, an expert labeled 294 particle pairs. An important part of many machine learning applications is to translate the problem at hand to quantitative features. In order to facilitate this feature engineering step, synthetic data was used, for which it is possible to generate arbitrarily many particle pairs and their true class labels. This means that the quality of several features on a bigger artificial dataset (3693 instances) was investigated by training many different classification models and evaluating their performance. The best features were selected and a new model was trained and tested on the hand-labeled dataset, which was used for validation. An overview of the approach is visualized in **Figure 3**.

For the simulated dataset, first, a system of individual pristine particles was generated based on the stochastic microstructure model introduced in Feinauer et al. (2015a) and Feinauer et al. (2015b), then a certain percentage of particles were broken in two parts as described in Petrich et al. (2017). The individual particles were discretized in a single 3D image and the same image preprocessing was performed as on the tomographic image data. Because in each step—the particle creation, the breakage, and the image preprocessing—the relationships of the particles to their neighbors were tracked, it is possible to generate a list of particle pairs and their true class label. This list was subsampled such that there were the same number of instances for each class.

Based on this simulated dataset, numerical features were designed. For these, not only the individual particles were considered, but also a combination of the two, which here means the morphological closing (Soille, 2013) of the two particles. Some features are straightforward, like the ratio of the volume of the smaller particle to the volume of the larger one or the volume of the combined particles divided by the sum of the individual volumes. The same ratios were calculated for the surface area. The next quantity is more complicated, but also showed more predictive power. Here, for each voxel on the boundary of the particles the distance to the other particle is computed and the histogram of these values forms another (multidimensional) feature.
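A few of the features described above can be computed directly from two binary particle masks. The following sketch is an illustration only: the function name, the number of histogram bins, the histogram range and the number of closing iterations are assumptions, and the surface-area ratios are omitted for brevity.

```python
import numpy as np
from scipy import ndimage


def pair_features(p1, p2, bins=10, max_dist=20.0, closing_iter=1):
    """Feature vector for a pair of particles given as boolean masks.

    Both masks live on the same voxel grid and are assumed not to touch
    the image border (binary_closing treats the outside as background).
    """
    v1, v2 = int(p1.sum()), int(p2.sum())
    vol_ratio = min(v1, v2) / max(v1, v2)

    # "Combined" particle: morphological closing of the union of both masks.
    combined = ndimage.binary_closing(p1 | p2, iterations=closing_iter)
    combined_ratio = combined.sum() / (v1 + v2)

    # Boundary voxels are those removed by a single erosion step.
    b1 = p1 & ~ndimage.binary_erosion(p1)
    b2 = p2 & ~ndimage.binary_erosion(p2)

    # Distance from each boundary voxel to the respective other particle.
    d_to_p2 = ndimage.distance_transform_edt(~p2)
    d_to_p1 = ndimage.distance_transform_edt(~p1)
    dists = np.concatenate([d_to_p2[b1], d_to_p1[b2]])

    hist, _ = np.histogram(dists, bins=bins, range=(0.0, max_dist))
    hist = hist / max(hist.sum(), 1)  # relative frequencies
    return np.concatenate([[vol_ratio, combined_ratio], hist])
```

Intuitively, two fragments of a broken particle have many boundary voxels at a very small distance to the other fragment, so the mass of the histogram concentrates in the first bins.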

As in section 2.2, a multilayer perceptron (MLP), i.e., a feedforward neural network, here with one hidden layer, was chosen for the classification. For an introduction to MLPs and machine learning in general, see Bishop (2006) and Hastie et al. (2009). The input for the classifier was the standardized feature vector described above. The sigmoid function is used for the non-linear activation functions in the input and hidden layer, and the softmax function for the output layer. The network was trained with the quasi-Newton method L-BFGS (Nocedal and Wright, 2006), which minimizes the cross-entropy loss with L2 regularization. The hyperparameters (i.e., the number of hidden neurons and the weight of the L2 regularization term) were tuned with a 5-fold stratified cross-validation maximizing the accuracy.
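A comparable training setup can be sketched with scikit-learn. Note that the text does not state which software was used, so the library choice, the function name and the hyperparameter grids below are assumptions. In scikit-learn's `MLPClassifier`, the `logistic` activation applies to the hidden layer only, and `alpha` is the weight of the L2 penalty.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def fit_pair_classifier(X, y, hidden_sizes=(5, 10, 20),
                        alphas=(1e-3, 1e-1, 1e1), seed=0):
    """Tune and fit a one-hidden-layer MLP on feature vectors X with labels y."""
    pipe = make_pipeline(
        StandardScaler(),                      # standardize the features
        MLPClassifier(activation="logistic",   # sigmoid hidden units
                      solver="lbfgs",          # quasi-Newton optimizer
                      max_iter=500,
                      random_state=seed),
    )
    grid = {
        "mlpclassifier__hidden_layer_sizes": [(h,) for h in hidden_sizes],
        "mlpclassifier__alpha": list(alphas),  # L2 regularization weight
    }
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    search = GridSearchCV(pipe, grid, scoring="accuracy", cv=cv)
    return search.fit(X, y)
```

The returned object exposes the selected hyperparameters via `best_params_` and the refitted model via `best_estimator_`, which can then be evaluated on a held-out test set.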

With this setup two classifiers were built, one for the simulated and one for the hand-labeled dataset. In each set 75% of the instances were used to train the classifier and the rest to evaluate its performance. The results for the simulated dataset (2769 samples for training, 924 for testing) are shown in **Table 1**. The overall accuracy is 82.1%. The evaluation results for the hand-labeled data (220 samples for training, 74 for testing) are presented in **Table 2**. Here, the classifier achieved an accuracy of 73.0%.

All in all, a good prediction performance is observed. It is not surprising that the hand-labeled data is harder to classify than the simulated dataset, since the breakage algorithm in particular gives only an approximation of the real degraded microstructure of the electrode of a lithium-ion battery. However, the similarity of the results shows that it is a valid strategy to perform the feature engineering on the simulated dataset. As can be seen in **Table 2**, the classifier mostly struggles with separating the PREPROCESSSEP and BROKEN classes, but this is hard even for humans, as can be seen in **Figures 4B,C**. Further examples of particle pairs with their true and predicted classes are depicted in **Figure 4**.

# 3. SEGMENTATION OF TIME-RESOLVED TOMOGRAPHIC IMAGE DATA

#### 3.1. Description of the Problem

Tomographic image data of materials provides extensive information regarding microstructure, from which the latter's influence on a given sample's functional properties can be assessed. However, in most applications, this type of analysis becomes possible only after successful segmentation of the image data. Moreover, for some materials it can be difficult to obtain adequate CT data for analysis, for example, when the material is comprised of phases covering a broad spectrum of mass densities, which can lead to beam-hardening artifacts. Other issues can occur when a given specimen is homogeneous in density or X-ray attenuation, which causes low contrast in the resulting image data. The latter is a challenge in the case of polycrystalline materials, for which the grain microstructure manifests itself through heterogeneities in crystallographic orientation. The interfaces between neighboring grains, which are called grain boundaries, give rise to such small changes in X-ray attenuation that the boundaries are invisible to standard (i.e., absorption-contrast) CT measurements. Consequently, techniques that exploit other grain-to-grain contrast mechanisms, such as 3D electron backscatter diffraction (3D EBSD) or 3DXRD microscopy, must be utilized to image single-phase polycrystalline materials (Rowenhorst et al., 2006a; Bhandari et al., 2007; Schmidt et al., 2008; Poulsen, 2012).

TABLE 1 | Performance metrics for the classifier based on the simulated test data.

*Reprinted from Petrich et al. (2017), Table 1, with permission from Elsevier.*

TABLE 2 | Performance metrics for the classifier based on the hand-labeled test data.

*Reprinted from Petrich et al. (2017), Table 2, with permission from Elsevier.*

Alternatively, if a particular material has a two-phase region in which one phase decorates the grain boundaries of the other phase, then it may be possible to map out the network of grain boundaries directly using only CT. For example, in Werz et al. (2014), tomographic measurements were performed on an Al-5 wt.% Cu alloy at various stages of Ostwald ripening, during which a liquid layer of a minority phase was present between the grains of the solid majority phase. X-ray absorption contrast arose from the higher concentration of Cu in the liquid than in the solid phase; this contrast was easily visible in CT reconstructions of the characterized volume, see **Figure 5A**. The subsequent image analysis is described in Spettl et al. (2015), in which modified conventional image processing techniques were employed to perform a grain-wise segmentation of the considered image data.

Although the liquid phase is responsible for making the polycrystalline microstructure visible to X-ray tomography, the liquid itself can interact strongly with the network of grain boundaries, thereby exerting a non-negligible influence on the equilibrium shape of grains or on the migration kinetics of boundaries during Ostwald ripening. For this reason, we consider the analysis of CT image data for an Al-5 wt.% Cu alloy containing only 2% (by volume) of the liquid phase. This sample was imaged a total of seven times by CT; between each measurement the specimen experienced 10 min of Ostwald ripening.

From here on, we refer to the resulting 3D images as C0, . . . , C6, see **Figure 8** (left column). Note that the grain boundaries become less distinct during the Ostwald ripening process, which exacerbates the difficulty of segmenting individual grains by standard image processing algorithms. Therefore, we turn our attention to machine learning techniques, namely convolutional neural networks (CNNs) (Goodfellow et al., 2016), to extract grain boundaries from the tomographic images Ct. In contrast to the method described in section 2.2, in which a neural network was used as a postprocessing step to refine a segmentation, CNNs are employed in the present section as a preprocessing step to enhance and predict grain boundaries. Another key difference between the methods described here and in sections 2.2 and 2.3 is that the present CNNs do not require user-defined image features for their decision making, but are able to determine their own features. More precisely, the trainable parameters of a CNN are discrete kernels that can detect (depending on the kernel size) local features via convolution with input images. The aggregation of such local features allows the detection of larger-scale features. Thus, CNNs are capable of learning and incorporating multi-scale features into their decision-making process.

FIGURE 6 | (A) Cross-section of *C*2 depicting the sample after 20 min of Ostwald ripening. These images will be used as training input for the CNN. The blue square indicates the size of an 80 × 80 cutout with respect to the original resolution of the CT data. After downsampling the data to the resolution of 240 × 240 × 420 voxels an 80 × 80 cutout has the relative size indicated by the red square. (B) Segmentation of the corresponding section obtained via 3DXRD microscopy. (C) Cross-section of the extracted grain boundary image *L*2 from the 3DXRD data. The grain boundary images *Lt* will be used as target images for the input images *Ct* during training of the CNN.

of size 2 × 2 × 2. Up-convolutional layers of size 2 × 2 × 2 are indicated by green arrows. Merge layers are visualized by gray arrows. The layer (black arrow) generating the output is a convolutional layer with kernel size 1 × 1 × 1 and a sigmoid activation function. The sizes of input, feature and output images during training are given in the boxes. After training, the network can receive arbitrarily sized images as input, provided their size in each direction is a multiple of 2^4 = 16.

#### 3.2. Materials and Methods

Like every supervised machine learning technique, CNNs require training data in the form of pairs of input and desired target images. In the context of the present paper, this means that for each 3D image obtained by CT we require a corresponding 3D image in which the grain boundaries have already been extracted. Such grain boundary images were obtained by an additional image acquisition technique: at each imaging step t = 0, . . . , 6, in addition to CT measurements (Ct) the same sample volume was characterized by 3DXRD microscopy. This paired information will be used to train CNNs such that they are able to predict grain boundaries from CT image data without additional 3DXRD imaging. In the following, we provide additional details regarding the nature of the data, the chosen CNN architectures and the training procedure.

Both CT and 3DXRD measurements were carried out on Al-5 wt.% Cu at beamline BL20XU of the synchrotron radiation facility SPring-8. The sample had a cylindrical shape with a diameter of 1.4 mm. Mounted on a rotating stage, it was illuminated by a monochromatic X-ray beam with an energy of 32 keV. We recorded both far-field and near-field diffraction patterns on 2D detectors. Following the reconstruction routine described in Schmidt et al. (2008) and Schmidt (2014), the grain morphology together with the crystallographic orientation of individual grains was mapped. Heat treatment of the sample took place at 575 °C, at which temperature the microstructure consisted of a mixture of solid and liquid phases according to the Al-Cu phase diagram (Massalski, 1996). Under these conditions, the sample undergoes slow but steady Ostwald ripening. After an annealing time of 10 min, the sample was cooled to room temperature and characterized by both CT and 3DXRD microscopy. In total, the specimen was held for 60 min at 575 °C and mapped seven times. Due to small misalignments that occurred each time the sample was removed from the X-ray beamline for annealing, it was necessary to register sequential CT and 3DXRD measurements according to the method described in Dake et al. (2016).

Reconstruction and processing of the 3DXRD data yielded the local crystallographic orientation, from which segmented 3D images of grains and thus grain boundary images Lt were obtained (Schmidt, 2014), see **Figures 6B,C**. Since the state of the specimen did not change between CT and 3DXRD measurements, the images Lt derived from the latter depict the true grain boundary systems of the corresponding reconstructed CT images Ct, for each t = 0, . . . , 6. The CT images Ct had a size of 960 × 960 × 1678 voxels, with cubic voxels of size 0.75 µm.

Due to the registration step of CT and 3DXRD measurements, the grain boundaries visible in Ct are aligned with those of Lt for each t = 0, . . . , 6. A cross-section of such a matching pair is visualized in **Figures 6A,C**. As a consequence, we can formulate the issue of detecting grain boundaries from CT images as a regression problem. More precisely, we seek a function f with

$$f(C_t) \approx L_t, \tag{2}$$

for each 3D CT image Ct with values in the interval [0, 1] and the corresponding binary grain boundary image Lt with values in {0, 1}, with 1 indicating grain boundaries and 0 grain interiors.

As regression models for the function f we use CNNs based on the U-Net architecture. In recent years, this architecture has been used successfully in several segmentation tasks, see Çiçek et al. (2016) and Ronneberger et al. (2015). The U-Net uses several max-pooling layers, which downsample the image data. Then, even small kernels applied to downsampled data can detect large-scale features, see **Figure 7** for the architecture of the considered U-Net with volumetric input. In order to inspect the capabilities of the U-Net architecture, we used CT measurements of an Al-5 wt.% Cu sample having a liquid content of 7% (and thus grain boundaries with good visibility) to train such a neural network to handle two-dimensional input images. **Figure 5** indicates that this U-Net can predict the location of grain boundaries, even when they are not visible in CT data. This visual inspection of the results obtained for 2D input images motivates the use of a U-Net for three-dimensional CT images of such materials with the low liquid content of 2%, see **Figure 8** (left column).

Now, we describe the architecture of the chosen 3D U-Net for detecting grain boundaries in 3D data. A size of 3 × 3 × 3 for the trainable kernels of the 3D U-Net depicted in **Figure 7** was chosen. The activation functions of the 3D U-Net's hidden layers are rectified linear unit (ReLU) functions (Glorot et al., 2011), and for the output layer a sigmoid function was chosen, such that the voxel values of output images are normalized to values in the interval (0, 1). Due to memory limitations, the training could only be performed on cutouts from the images Ct and Lt with a size of 80 × 80 × 80 voxels. Since these cutouts cover relatively small volumes, see **Figure 6A**, they do not provide the necessary size for learning large-scale features with the 3D U-Net. In order to remedy this, the CT image data was downsampled from 960 × 960 × 1678 voxels to 240 × 240 × 420 voxels, with some manageable loss of information. Analogously, we upsampled the corresponding grain boundary images Lt, which initially had a voxel size of 5 µm, to obtain the same voxel and image size. For simplicity, we denote the resampled CT and grain boundary images by Ct and Lt, respectively. Then, training was performed on cutouts with 80 × 80 × 80 voxels, which can represent larger grain boundary structures at this scale after downsampling, see **Figure 6A**. The cutouts were taken randomly from the images Ct and the corresponding sections of the grain boundary images Lt.

Note that, in contrast to the U-Net architecture proposed in Ronneberger et al. (2015), we padded the convolved images of the CNN such that input and output images have the same size. Thus, the network's input is not restricted to images with a size of 80 × 80 × 80 voxels, i.e., it can be applied to the entire scaled CT image stack (240 × 240 × 420 voxels) after training. The only limitation is that the number of voxels in each direction of the input images must be a multiple of 2^4 = 16, which can be achieved by padding the image stack. This constraint arises from the four 2 × 2 × 2 max-pooling layers (which downsample images) followed by the four up-convolutional layers, see **Figure 7**. The number of max-pooling layers, which we call the depth of the U-Net in the following, can be increased such that the network can learn features of a larger scale. Note that in this case, the numbers of convolutional, up-convolutional and merge layers are adjusted accordingly. Furthermore, we point out that the cutouts used for training were taken from image data among all seven time steps, but only from the first 200 slices of each image stack; thus, the remaining 220 slices could be used for validation and testing. In order to increase the efficiency of the available training data, we utilized data augmentation (Goodfellow et al., 2016), i.e., during training, pairs of chosen input and corresponding target cutouts were transformed randomly, yet pairwise in the same manner, via rotations/reflections. In this way we increased the number of available input-target pairs, and, additionally, the predictions of the neural network became more stable with respect to rotated images.
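The data handling described above (random cutouts from the training slices, pairwise augmentation, and padding to a multiple of 16) can be sketched with NumPy. This is an illustrative sketch rather than the original code: the function names, the assumption that the slice axis is the last array axis, and the reflective padding mode are choices made here.

```python
import numpy as np


def random_cutout_pair(C, L, size, rng, n_train_slices=200):
    """Draw a random cubic cutout from C and the same region from L.

    Only the first `n_train_slices` slices along the last axis are used,
    so that the remaining slices stay available for validation and testing.
    """
    limits = list(C.shape)
    limits[-1] = min(limits[-1], n_train_slices)
    starts = [int(rng.integers(0, lim - size + 1)) for lim in limits]
    sl = tuple(slice(a, a + size) for a in starts)
    return C[sl], L[sl]


def augment_pair(c, l, rng):
    """Apply the same random rotation and reflections to input and target."""
    k = int(rng.integers(0, 4))                      # number of 90° rotations
    axes = tuple(int(a) for a in rng.choice(3, size=2, replace=False))
    c, l = np.rot90(c, k, axes), np.rot90(l, k, axes)
    for axis in range(3):
        if rng.random() < 0.5:
            c, l = np.flip(c, axis), np.flip(l, axis)
    return c, l


def pad_to_multiple(img, multiple=16):
    """Pad each axis at its end so that its length is a multiple of `multiple`."""
    pad = [(0, (-s) % multiple) for s in img.shape]
    return np.pad(img, pad, mode="reflect"), pad
```

For the downsampled stack of 240 × 240 × 420 voxels, only the last axis needs padding: 420 is extended by 12 voxels to 432 = 27 · 16, and the prediction is cropped back to 420 slices afterwards.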

As cost function for the training procedure, we chose the binary cross-entropy (negative log-likelihood) function, see Goodfellow et al. (2016). The U-Net's initial kernel weights were drawn from a truncated normal distribution. Then, training of the kernel parameters was performed with the Adam stochastic gradient descent method (Kingma and Ba, 2015), using 50 epochs with 300 steps per epoch and a batch size of 1. These training hyperparameters were manually tuned, while the batch size of 1 was chosen due to memory limitations. The network was implemented using the Keras package in Python, see Chollet (2015), and training was performed on an NVIDIA GeForce GTX 1080 graphics processing unit (GPU).
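For reference, the binary cross-entropy for a single training cutout can be written as follows. The symbol W, denoting the set of voxels of the cutout, is introduced here for illustration only, and the normalization by the number of voxels |W| is an assumption (it corresponds to the usual averaging convention and does not change the minimizer):

$$\mathcal{L}\big(f(C_t), L_t\big) = -\frac{1}{|W|} \sum_{x \in W} \Big[ L_t(x) \log f(C_t)(x) + \big(1 - L_t(x)\big) \log\big(1 - f(C_t)(x)\big) \Big]$$

Since the sigmoid output layer restricts f(Ct)(x) to the open interval (0, 1), both logarithms are well defined for every voxel x.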

After the training procedure, we applied the CNN, denoted by f, to each of the seven available CT images Ct on an Intel Core i5-7600K CPU; that is, we computed predictions L̂t for the grain boundary network from

$$\hat{L}_t = f(C_t), \quad \text{for each } t = 0, \dots, 6. \tag{3}$$

**Figure 8** (middle column) visualizes the outputs L̂t of the network in a cross-section (slice 350) that was not used for training. Initial inspection indicates that the predictions of the neural network become less reliable with increasing time or, equivalently, with decreasing visibility of grain boundaries in the CT data. Nevertheless, the predictions are reasonably good even for the final time step.

Since the training and application of the 3D U-Net is coupled with high memory usage, we reduced, as already mentioned, the initial resolution of the CT data. Furthermore, **Figure 5** indicates that a 2D U-Net, which can be used at higher resolutions due to lower memory requirements, is capable of detecting grain boundaries from 2D slices, at least for grain boundaries with good visibility. Therefore, we trained a 2D U-Net using slices instead of volumetric cutouts. Since the 2D architecture requires less memory than the 3D U-Net, for training we used patches of size 256 × 256 × 1 voxels which were taken from the CT images downsampled to the resolution of 480 × 480 × 839 voxels instead of 240 × 240 × 420 voxels. In order to allow the 2D U-Net to learn features at a scale comparable to the 3D U-Net, we increased the depth (as defined above) of the 2D U-Net from 4 to 5. After training, the 2D U-Net was applied slice-by-slice to the seven image stacks, resulting in volumetric grain boundary predictions. Because the 2D U-Net evaluates consecutive slices independently, the network's output can lead to discontinuous grain boundary predictions, see **Figure 9**. To overcome this, we used a 2D U-Net that was trained with 2D multichannel images with a size of 256 × 256 × 11 voxels. More precisely, it was trained with sets of 11 consecutive CT slices and a ground truth slice corresponding to the 6th input slice. This way, when predicting grain boundaries in consecutive CT slices, the network receives overlapping and correlated information, which reduces the discontinuities in the network's output. In order to give the multichannel U-Net additional information for its grain boundary predictions, we did not limit it to slice-by-slice predictions in one single axial direction, namely top-to-bottom, of the image stacks. Thus, to obtain the final grain boundary predictions of the multichannel U-Net, the slice-by-slice predictions are computed in three directions (top-to-bottom, left-to-right, front-to-back). For each CT image stack, this results in three grain boundary predictions, which are then averaged, resulting in the final volumetric grain boundary predictions.

FIGURE 13 | (A) 2D cross-section of a CT image containing reconstruction artifacts and (B) the corresponding prediction of the 3D U-Net.
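The multichannel slice-by-slice prediction and the averaging over three directions can be sketched as follows. Here, `predict_slice` is a hypothetical stand-in for the trained multichannel 2D U-Net: it receives an array of shape (H, W, n_channels) and returns the prediction for the central slice. How the first and last slices of a stack are handled is not stated in the text, so the edge replication used below is an assumption.

```python
import numpy as np


def predict_along_axis(volume, predict_slice, axis, n_channels=11):
    """Slice-by-slice prediction along `axis` with overlapping input windows."""
    half = n_channels // 2
    v = np.moveaxis(volume, axis, 0)
    # Replicate the outermost slices so that every slice has a full window
    # of n_channels neighbors (assumption, see lead-in).
    padded = np.pad(v, ((half, half), (0, 0), (0, 0)), mode="edge")
    out = np.empty(v.shape, dtype=float)
    for i in range(v.shape[0]):
        window = padded[i:i + n_channels]          # (n_channels, H, W)
        out[i] = predict_slice(np.moveaxis(window, 0, -1))
    return np.moveaxis(out, 0, axis)


def predict_three_directions(volume, predict_slice, n_channels=11):
    """Average the predictions computed along the three coordinate axes."""
    preds = [predict_along_axis(volume, predict_slice, axis, n_channels)
             for axis in range(3)]
    return np.mean(preds, axis=0)
```

Because consecutive windows share 10 of their 11 slices, neighboring predictions are based on strongly correlated inputs, which is the mechanism that reduces discontinuities between slices.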

As of now, the procedures described above do not provide a grain-wise segmentation. More precisely, the outputs of the U-Net architectures are 3D images with voxel values in the interval (0, 1). Therefore, the network predictions must be binarized in order to localize the grain boundaries, which, however, do not necessarily enclose grains completely. Consequently, additional image processing steps must be carried out in order to obtain a full segmentation of individual grains.

FIGURE 14 | Segmentation results obtained by a 3D U-Net that was trained only with CT/3DXRD data from time step *t* = 6. (A) Kernel density estimation (blue) of relative errors in grain volume. The red curve is the density of relative errors in volume under the condition that the grain is completely visible in the cylindrical sampling window. (B) Kernel density estimation of relative errors in grain volume obtained by the segmentation procedure for each time step *t* = 0, . . . , 6.

To that end, we binarize the grain boundary predictions L̂t of the networks (**Figure 10A**) with a manually determined global threshold followed by morphological closing (**Figure 10B**). The binarization is followed by a marker-based watershed transformation (**Figure 10C**), which is performed on the (inverted) Euclidean distance transform of the binary images. In order to reduce oversegmentation in the image obtained by the watershed transformation, a final postprocessing step is carried out in which adjacent regions are merged if the overlap between one region and the convex hull of the neighbor is too large (**Figure 10D**). For more details on the marker selection procedure for the watershed transformation or the postprocessing step we refer the reader to Spettl et al. (2015).
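The first steps of this pipeline (thresholding, closing, distance transform and marker-based watershed) can be sketched with SciPy. This is a simplified illustration: the markers are obtained here by thresholding the distance transform, which is merely a stand-in for the marker selection of Spettl et al. (2015), and the function name, default parameters and the use of `scipy.ndimage.watershed_ift` are assumptions. The convex-hull-based merging step is not included.

```python
import numpy as np
from scipy import ndimage


def segment_grains(boundary_prob, threshold=0.5, closing_iter=1,
                   min_marker_dist=2.0):
    """Grain labels from a boundary prediction with values in (0, 1).

    `closing_iter` must be at least 1.
    """
    boundaries = boundary_prob > threshold

    # Morphological closing; the image is padded first because SciPy treats
    # everything outside the image as background during the erosion step.
    p = closing_iter
    closed = ndimage.binary_closing(np.pad(boundaries, p, mode="edge"),
                                    iterations=closing_iter)
    boundaries = closed[(slice(p, -p),) * boundaries.ndim]

    # Distance of every interior voxel to the closest boundary voxel.
    dist = ndimage.distance_transform_edt(~boundaries)

    # Simplified marker selection: connected regions far from any boundary.
    markers, n_markers = ndimage.label(dist > min_marker_dist)

    # watershed_ift expects an unsigned integer image: invert and rescale.
    inv = dist.max() - dist
    inv = np.round(inv / max(inv.max(), 1e-12) * 65535).astype(np.uint16)
    labels = ndimage.watershed_ift(inv, markers.astype(np.int32))
    return labels, n_markers
```

Since the watershed floods the inverted distance map from the markers, two grains are separated even if the predicted boundary between them contains a gap, provided that each grain receives its own marker.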

In order to quantitatively compare segmentations of the CT images C0, . . . , C6 obtained by the 3D U-Net, 2D U-Net and multichannel U-Net followed by the postprocessing steps described above with segmentations derived from the 3DXRD measurements, we first match grains among these segmentations. More precisely, each grain GXRD ⊂ ℝ³ observed in a segmentation obtained by 3DXRD microscopy is assigned to a grain Gseg ⊂ ℝ³ in the corresponding segmentation of the CT image data. We formulated this as a linear assignment problem (Burkard et al., 2012), which minimizes the sum of the volumes of the symmetric differences of matched grains

$$\nu_3(G_{\text{XRD}} \,\Delta\, G_{\text{seg}}) = \nu_3(G_{\text{XRD}} \setminus G_{\text{seg}}) + \nu_3(G_{\text{seg}} \setminus G_{\text{XRD}}), \tag{4}$$

where ν3(·) denotes the volume and GXRD Δ Gseg is the symmetric difference given by

$$G_{\text{XRD}} \,\Delta\, G_{\text{seg}} = \left( G_{\text{XRD}} \setminus G_{\text{seg}} \right) \cup \left( G_{\text{seg}} \setminus G_{\text{XRD}} \right). \tag{5}$$

Thus, we will be able to quantitatively compare pairs of matched grains (G<sub>XRD</sub>, G<sub>seg</sub>) which, in turn, allows a comparison of the presented methods.
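The matching step can be illustrated with a minimal Python sketch. It assumes that grains are stored as sets of voxel coordinates and that each segmented grain is used at most once; the helper `box` and the toy data are hypothetical, and the brute-force search merely stands in for a proper assignment solver (for realistic grain counts one would use, e.g., `scipy.optimize.linear_sum_assignment`). It is not the implementation used for the results reported here.

```python
from itertools import permutations


def symmetric_difference_volume(grain_a, grain_b, voxel_volume=1.0):
    """Equation (4): volume of the symmetric difference of two voxel sets."""
    # For Python sets, `^` is the symmetric difference, cf. Equation (5).
    return voxel_volume * len(grain_a ^ grain_b)


def match_grains(grains_xrd, grains_seg):
    """Assign each XRD grain to a distinct segmented grain at minimal total cost.

    Brute force over all injective assignments; only feasible for a few grains.
    """
    if len(grains_xrd) > len(grains_seg):
        raise ValueError("need at least as many segmented grains as XRD grains")
    cost = [[symmetric_difference_volume(gx, gs) for gs in grains_seg]
            for gx in grains_xrd]
    best_total, best_assignment = float("inf"), None
    for perm in permutations(range(len(grains_seg)), len(grains_xrd)):
        total = sum(cost[i][j] for i, j in enumerate(perm))
        if total < best_total:
            best_total, best_assignment = total, perm
    return list(best_assignment), best_total


def box(x0, x1, y0, y1, z0, z1):
    """Hypothetical helper: axis-aligned block of voxels as a set of tuples."""
    return {(x, y, z)
            for x in range(x0, x1)
            for y in range(y0, y1)
            for z in range(z0, z1)}


# Toy data: two reference grains and a segmentation whose shared
# boundary is shifted by one voxel (and whose grain order is swapped).
grains_xrd = [box(0, 4, 0, 4, 0, 4), box(4, 8, 0, 4, 0, 4)]
grains_seg = [box(5, 8, 0, 4, 0, 4), box(0, 5, 0, 4, 0, 4)]

# assignment[i] is the index of the segmented grain matched to XRD grain i.
assignment, total = match_grains(grains_xrd, grains_seg)
```

In this toy case the one-voxel boundary shift contributes a slab of 16 voxels to each matched pair, which is exactly the kind of deviation the grain-wise error measures are meant to capture.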

#### 3.3. Results

Even though the CNNs described in section 3.2 do not provide grain-wise segmentation of CT data, they can significantly enhance CT images such that conventional image-processing techniques can be readily used to obtain a grain-wise segmentation. By following the approach described in Spettl et al. (2015), we obtained grain-wise segmentations of the considered data set, despite its rather indistinct grain boundaries. A visual comparison between grain boundaries extracted from the segmentation utilizing a 3D U-Net and the true grain boundaries obtained by 3DXRD microscopy indicates that the segmentation is reasonably good, with some oversegmented grains remaining, see **Figures 10E,F**. A more quantitative comparison is made possible by the grain matching procedure described in section 3.2, i.e., we compute quantities that measure how much grains segmented from CT deviate from matched grains observed in the ground truth data. More precisely, we determine for pairs of matched grains the relative errors r<sub>V</sub> in grain volume given by

$$r_V = \frac{|\nu_3(G_{\rm XRD}) - \nu_3(G_{\rm seg})|}{\nu_3(G_{\rm XRD})}. \tag{6}$$

Also, we computed errors r<sub>c</sub> in grain barycenter location normalized by the volume-equivalent diameter of the grain G<sub>XRD</sub>. These values are given by

$$r_c = \frac{\| c(G_{\rm XRD}) - c(G_{\rm seg}) \|}{\sqrt[3]{\frac{6}{\pi}\, \nu_3(G_{\rm XRD})}}, \tag{7}$$

where ‖ · ‖ denotes the Euclidean norm and c(G<sub>XRD</sub>), c(G<sub>seg</sub>) are the barycenters of the grains G<sub>XRD</sub> and G<sub>seg</sub>, respectively. **Figure 11** visualizes the quartiles of these relative errors in grain characteristics for the segmentation procedures based on the trained 3D U-Net, 2D U-Net and multichannel U-Net. For reference, we also included results obtained by the conventional segmentation procedure without applying neural networks, which was conceptualized for grain boundaries with good visibility and is described in Spettl et al. (2015). These results indicate that the segmentation procedures based on the U-Net architecture perform better than the conventional method. Among the machine learning approaches, the slice-by-slice approach with the 2D U-Net performs worst with a median value for r<sub>V</sub> of 0.37. This could be explained by the discontinuities of grain boundary predictions for consecutive slices, see **Figure 9**. By enhancing the slice-by-slice approach with the multichannel U-Net, we achieve a significant drop of this error down to 0.21. The segmentation approach based on the 3D U-Net performs best with a median error of 0.14, because it is able to learn 3D features for characterizing the grain boundary network embedded in the volumetric data.
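For concreteness, the error measures of Equations (6) and (7) can be written as a short Python sketch. The function names and the toy numbers are illustrative assumptions; grain volumes and barycenters are taken as already computed from a pair of matched grains.

```python
import math


def relative_volume_error(vol_xrd, vol_seg):
    """Equation (6): relative error in grain volume."""
    return abs(vol_xrd - vol_seg) / vol_xrd


def barycenter(voxels):
    """Barycenter of a grain given as a collection of (x, y, z) voxel centers."""
    n = len(voxels)
    return tuple(sum(v[k] for v in voxels) / n for k in range(3))


def relative_barycenter_error(c_xrd, c_seg, vol_xrd):
    """Equation (7): barycenter distance over the volume-equivalent diameter."""
    d_eq = (6.0 * vol_xrd / math.pi) ** (1.0 / 3.0)
    return math.dist(c_xrd, c_seg) / d_eq


# Toy values (not taken from the data set): a segmented grain that is
# 14% too small, and a barycenter shifted by a quarter of the diameter.
r_v = relative_volume_error(100.0, 86.0)

sphere_volume = 4.0 * math.pi / 3.0  # sphere of diameter 2
r_c = relative_barycenter_error((0.0, 0.0, 0.0), (0.5, 0.0, 0.0), sphere_volume)
```

Normalizing by the volume-equivalent diameter makes r<sub>c</sub> dimensionless, so that errors for small and large grains can be pooled in one distribution.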

Kernel density estimations (Botev et al., 2010) of the relative errors for the 3D U-Net approach are visualized in **Figures 12A,B** (blue curves). Furthermore, **Figures 12C,D** depict these densities for each of the seven observed time steps t = 0, ..., 6. Note that, as expected, the errors show a tendency to grow with increasing time step. In order to analyze possible edge effects, i.e., a reduced segmentation quality for grains located at the boundary of the cylindrical sampling window, we computed error densities only for grains located in the interior of the sampling window, see **Figures 12A,B**. The plots (red curves) indicate that, indeed, the segmentation procedure based on the 3D U-Net works better for interior grains. This effect can be explained by the information that is missing for grains that are cut off by the boundary of the sampling window.

#### 3.4. Discussion

Although our procedures based on preprocessing with CNNs followed by conventional image processing do not lead to perfect grain segmentations, see **Figure 12**, especially the method utilizing the 3D U-Net delivers relatively good results when considering the nature of the available CT data. Furthermore, the neural network is able to reduce local artifacts, like liquid inclusions in the grain interiors, which cause small areas of high contrast far from grain boundaries, see **Figure 8** (first row). Yet, we warn that the predictions of the trained U-Net are prone to error when there are large-scale image artifacts in the input images, as illustrated in **Figure 13**. One possible way to reduce the effect of such artifacts is to consider a modified architecture of the 3D U-Net, with larger kernels or more pooling layers, such that even larger features can be considered.

Nevertheless, without the machine learning approach, i.e., the preprocessing provided by the 3D U-Net, the segmentation of CT data for later measurement time steps with poorly visible grain boundaries is a complex and time-consuming image processing problem. Still, in the presented procedure, conventional image processing, i.e., binarization and the watershed transform, was necessary to obtain a grainwise segmentation of the considered data. Thus, the segmentation techniques considered in sections 2 and 3 show the flexibility of combining the watershed transform with machine learning techniques either for pre- or postprocessing image data for the purpose of segmenting tomographic image data of functional materials.

Note that, in the 3D U-Net approach, there are some machine learning techniques that could have been adopted to further reduce the need for some of the subsequent image processing steps. For example, the binarization step could be incorporated into the network by using the Heaviside step function as an activation function in the output layer. Morphological operations, like the closing operation utilized in the procedure above, could be implemented by additional convolutional layers with non-trainable kernels followed by thresholding. In this way, the necessary postprocessing steps will be considered during the training procedure of the 3D U-Net. Alternatively, by describing a segmentation with an affinity graph on the voxel grid, it is possible to obtain segmented images as the final output of CNNs, see Turaga et al. (2010). Note that such approaches require cost functions which allow a quantitative comparison between segmentations, see e.g., Briggman et al. (2009) and Liebscher et al. (2015). Furthermore, we point out that there are techniques for obtaining a grain-wise segmentation by fitting mathematical tessellation models to tomographic image data using Bayesian statistics and a Markov chain Monte Carlo approach, see Chiu et al. (2013). In our case, such techniques could be applied directly to tomographic or even to enhanced grain boundary images obtained by the 3D U-Net.
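The idea of expressing morphological closing through convolutions with fixed kernels followed by thresholding can be sketched in plain Python for a 2D binary image. This is only an illustration under simplifying assumptions (a symmetric 3 × 3 structuring element, constant padding, hypothetical function names); in a network, the convolutions would be layers with frozen weights.

```python
def conv2d_same(image, kernel, pad_value=0):
    """2D cross-correlation with constant padding; the kernel is fixed."""
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0
            for a in range(kh):
                for b in range(kw):
                    y, x = i + a - ph, j + b - pw
                    inside = 0 <= y < h and 0 <= x < w
                    acc += kernel[a][b] * (image[y][x] if inside else pad_value)
            out[i][j] = acc
    return out


def threshold(image, level):
    """Heaviside-type step: 1 where the value reaches `level`, else 0."""
    return [[1 if v >= level else 0 for v in row] for row in image]


def closing(binary, kernel):
    """Closing = dilation followed by erosion, each as convolution + threshold."""
    kernel_sum = sum(sum(row) for row in kernel)
    # Dilation: at least one foreground pixel under the kernel.
    dilated = threshold(conv2d_same(binary, kernel, pad_value=0), 1)
    # Erosion: all pixels under the kernel are foreground (outside counts as 1).
    return threshold(conv2d_same(dilated, kernel, pad_value=1), kernel_sum)


kernel = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]

# A horizontal grain boundary with a one-pixel gap in the middle.
image = [[0] * 7 for _ in range(5)]
image[2] = [1, 1, 1, 0, 1, 1, 1]

closed = closing(image, kernel)
```

After closing, the gap in the boundary line is filled while the rows above and below remain background, which is the effect exploited before the watershed step.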

Moreover, we note still another possible application of machine learning methods for the analysis of CT image data. In many applications, "ground truth" measurements are destructive, which means that they can be carried out only for the final time step of a sequence of measurements. This limits the available training data for machine learning techniques.

We simulated such a scenario with our data by using solely the CT image C<sub>6</sub> and the 3DXRD data L<sub>6</sub> of the last measured time step to train an additional 3D U-Net. Analogously to the procedure described in section 3.2, this network was applied to the entire series of CT measurements. The resulting grain boundary predictions were then segmented using the same image processing steps as described in section 3.2. **Figure 14** indicates that the relative errors of grain volumes are comparable to the errors made when considering every time step during training, see **Figure 12**. This result suggests that a "ground truth" measurement of only the final time step would suffice for training in our scenario. Similarly, machine learning approaches might be interesting for the segmentation and analysis of time-resolved CT data in various applications in which "ground truth" measurements cannot be made during experiments, but only afterwards, in a destructive or time-consuming manner.

# 4. CONCLUSIONS

We gave a short overview of some applications in the field of materials science in which we successfully combined methods of statistical learning, including random forests, feedforward and convolutional neural networks with conventional image processing techniques for segmentation, classification and object detection tasks. More precisely, the methods of sections 2 and 3 utilize machine learning as either a pre- or postprocessing step for the watershed transform to achieve phase-, particle- or grainwise segmentations of tomographic image data from various functional materials—showing how flexible the approach of combining the watershed transform with methods from machine learning is. In particular, we presented such an approach for segmenting CT image data of an Al-5 wt.% Cu alloy with very low volume fraction of liquid between grains. In total, we considered seven CT measurements of the sample, between which were interspersed Ostwald ripening steps. Especially at later times, the aggregation of liquid leads to a decrease in contrast of the image data, i.e., grain boundaries become less distinct in the image data, which makes segmentation by conventional image processing techniques quite difficult and unreliable. Therefore, we employed matching grain boundary images—which had been extracted from the same sample by means of 3DXRD microscopy—as "ground truth" information for training various CNNs: a 2D U-Net which can be applied slice-by-slice to entire image stacks, a multichannel 2D U-Net which considers multiple slices at once for grain boundary prediction in a planar section of the image stack and, finally, a 3D U-Net which was trained with volumetric cutouts at a lower resolution. After the training procedure, the U-Nets were able to enhance the contrast at grain boundaries in the CT data. Especially, the 3D U-Net successfully predicted the locations of many grain boundaries that were either missing from the image data or poorly visible. 
This shows that machine learning methods can facilitate difficult image processing tasks, provided that "ground truth" data is available, e.g., data obtained via additional measurements or manual image labeling. Since the images output by the convolutional neural networks were not themselves grainwise segmentations, we applied conventional image processing algorithms to the outputs to obtain full segmentations at each considered time step and for each presented method. These were compared quantitatively with "ground truth" segmentations extracted from 3DXRD measurements. The resulting relative errors in grain volume and locations of grain centers of mass indicated that the machine learning-based segmentation procedures worked reasonably well, particularly for grains that were not cut off by the boundary of the observation window. Finally, we trained an additional 3D U-Net only with CT and 3DXRD data obtained during the final time step. This simulated the common scenario in which a "ground truth" measurement can be performed only at the very end of an experiment. The 3D U-Net trained in this manner was applied as before to the entire CT data set, followed by conventional image processing steps, yielding grain segmentations. Quantitative comparison of the latter to segmentations derived from 3DXRD data indicated that the approach produced good results. Even though a trained neural network does not make 3DXRD measurements obsolete, the procedure presented here can potentially reduce the amount of 3DXRD beam time that is needed for accurate segmentation and microstructural analysis. Likewise, we believe that a similar approach might be particularly beneficial whenever nondestructive CT measurements can be carried out in situ, but "ground truth" information can be acquired only by a destructive measurement technique.

#### AUTHOR CONTRIBUTIONS

OF, MN, LP, and MWe reviewed previous results on machine learning for segmentation of image data. Tomographic image data of the AlCu specimen has been provided by MWa and CK. The network training, segmentation and analysis of AlCu CT image data was performed by OF. All authors discussed the results and contributed to writing of the manuscript. CK and VS designed the research.

#### ACKNOWLEDGMENTS

The financial support of the German Research Foundation (DFG) for funding this research project (SCHM997/23-1) is gratefully acknowledged. The authors thank Murat Cankaya for the processing of image data. In addition, the authors are grateful to the Japan Synchrotron Radiation Research Institute for the allotment of beam time on beamline BL20XU of SPring-8 (Proposal 2015A1580).



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Furat, Wang, Neumann, Petrich, Weber, Krill and Schmidt. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Computation of Thickness and Mechanical Properties of Interconnected Structures: Accuracy, Deviations, and Approaches for Correction

#### Claudia Richert<sup>1</sup>\*, Anton Odermatt<sup>1,2</sup> and Norbert Huber<sup>1,2</sup>

<sup>1</sup> Institute of Materials Research, Materials Mechanics, Helmholtz-Zentrum Geesthacht, Geesthacht, Germany, <sup>2</sup> Institute of Materials Physics and Technology, Hamburg University of Technology, Hamburg, Germany

Identifying local thickness information of fibrous or highly porous structures is challenging. The analysis of tomography data calls for computationally fast, robust, and accurate algorithms. This work investigates systematic errors in the thickness computation and the impact of the observed deviations on the predicted mechanical properties using a set of 16 model structures with varying ligament shape and solid fraction. Strongly concave, cylindrical, and convex shaped ligaments organized in a diamond structure are analyzed. The predicted macroscopic mechanical properties represent a highly sensitive measure for systematic errors in the computed geometry. Therefore, the quality of proposed correction methods is assessed via FEM beam models that can be automatically generated from the measured data and allow an efficient prediction of the mechanical properties. The results show that low voxel resolutions can lead to an overprediction of up to 30% in the Young's modulus. A model scanned with a resolution of 200 voxels per unit cell edge (8M voxels) reaches an accuracy of a few percent. Analyzing models of this resolution with the Euclidean distance transformation showed an underprediction of up to 20% for highly concave shapes, whereas cylindrical and slightly convex shapes are determined at high accuracy. For the Thickness algorithm, the Young's modulus and yield strength are overpredicted by up to 100% for highly concave ligament shapes. A proposed Smallest Ellipse approach corrects the Thickness data and reduces this error to 20%. It can be used as input for a further robust correction of the Thickness data using an artificial neural network. This approach is highly accurate, with remnant errors in the predicted mechanical properties of only a few percent. Furthermore, the data from the FEM beam models are compared to results from FEM solid models, providing deeper insights toward further developments on nodal corrections for FEM beam models. As expected, the FEM beam models show an increasing overprediction of the compliance with increasing solid fraction. Unexpectedly, however, the mechanical strength can be underpredicted or overpredicted, depending on the ligament shape. Therefore, a nodal correction is needed that reconciles conflicting requirements in terms of stiffness and strength.

Keywords: tomography, skeletonization, thickness correction, artificial neural network, nanoporous gold, trabecular bone, foams, FEM beam model

#### Edited by:

Nicola Maria Pugno, University of Trento, Italy

#### Reviewed by:

Ercan Gürses, Middle East Technical University, Turkey Douglas Soares Galvao, Campinas State University, Brazil

> \*Correspondence: Claudia Richert claudia.richert@hzg.de

#### Specialty section:

This article was submitted to Computational Materials Science, a section of the journal Frontiers in Materials

Received: 04 February 2019 Accepted: 28 November 2019 Published: 18 December 2019

#### Citation:

Richert C, Odermatt A and Huber N (2019) Computation of Thickness and Mechanical Properties of Interconnected Structures: Accuracy, Deviations, and Approaches for Correction. Front. Mater. 6:327. doi: 10.3389/fmats.2019.00327

# INTRODUCTION

Lacking a detailed morphological and topological description of the microstructure, the structure-property relationship of open-pore materials, such as metal foams, elastomeric foams, or nanoporous gold (NPG), is commonly described by the Gibson-Ashby scaling law, in which the solid fraction is the most important parameter characterizing the material's morphology (Gibson and Ashby, 1997; Ashby et al., 2000). During the last two decades, the morphological characterization and prediction of mechanical properties of open-pore materials have gained increasing attention, thanks to the improving resolution of X-ray, FIB, and TEM micro-/nanotomography instruments, complemented by advancing image processing algorithms and computational modeling techniques. Tomography and FEM simulations on metal and elastomeric foams date back to Nieh et al. (1998), Nieh et al. (2000), and Kinney et al. (2001). A very detailed analysis of cell volume and strut length distributions, number of faces per cell, junction coordination number, and the shape of the most representative cells was carried out by Dillard et al. (2005) based on a 3D quantitative image analysis of open-cell nickel foams under tension and compression loading using X-ray microtomography.

First studies on NPG were conducted by Rösner et al. (2007) using TEM on dealloyed gold leaf. Hu et al. (2016), Mangipudi et al. (2016), and Ziehmer et al. (2016) analyzed NPG samples of larger volumes, obtained from focused ion beam (FIB) sectioning and scanning electron microscope (SEM) imaging. Through these thorough works, a systematic analysis of the NPG morphology in terms of ligament size distribution and connectivity density became possible for the first time. Because the ligaments are of nanoscale dimension, these investigations are all based on high-resolution SEM images, for which techniques for automated image processing are an asset. Hu et al. (2016) and Mangipudi et al. (2016) use the 3D Biggest Sphere Thickness algorithm by Hildebrand and Rüegsegger (1997) for estimating the ligament size distribution of 3D volumes.

For the geometrical description of the ligaments in an NPG network, Pia and Delogu (2015) proposed a parabolic shape with a square cross-section connected in cubic nodes. The parameters for the parabolic shape and their statistical distribution were manually determined from 2D SEM images. Badwe et al. (2017) analyzed 2D SEM images using digital image analysis to obtain ligament size histograms that were fit to the Weibull distribution. To obtain the ligament size distribution, they apply the skeletonization and the distance map transformation separately to the original binary SEM image, using the open-source software ImageJ. The multiplication of these two results yields the skeleton annotated with the corresponding diameter at each skeleton point. Consistent with the results of Rösner et al. (2007) and Hu et al. (2016), the mean ligament distributions were reported to be nearly self-similar for the examined ligament sizes. Stuckner et al. (2017) present a Python package, AQUAMI, which automatically analyzes microstructural features from micrographs. The approach is similar to that of Badwe et al. (2017), which was published independently, but requires no manual calculation in ImageJ. The average diameter and diameter distribution of the morphologies in each phase are calculated using a medial axis transform and a distance transform. McCue et al. (2018) use AQUAMI to data-mine 2D NPG images from 28 published manuscripts with regard to mean ligament diameter, length, and solid phase fraction. They point out the difficulty of comparing results obtained by different measuring approaches, ranging from manually measuring the thinnest part of the ligament to computational estimations, and the systematic discrepancies that result. Furthermore, as a minimum criterion for meaningful image analysis, they propose to use images with a resolution of at least 10 pixels per ligament diameter, because of the errors reported otherwise.

In summary, two algorithms are predominantly used in the literature to estimate the ligament size distribution: the Thickness algorithm, which is able to analyze 3D volumes, and the Euclidean distance transformation (EDT), which is applied for analyzing 2D SEM images by Badwe et al. (2017), Stuckner et al. (2017), and McCue et al. (2018). The EDT calculates at each point of the structure the distance to the nearest background point. The Thickness algorithm by Hildebrand and Rüegsegger (1997) is implemented in image analysis programs, such as the open-source program Fiji by Schindelin et al. (2012). It calculates the local thickness at a point as the diameter of the largest sphere that is completely inside the structure and contains the evaluated point. The mean thickness is calculated as the volume-weighted average of the local thickness. The algorithm is commonly used to estimate the mean trabecular thickness of trabecular bone (Day et al., 2000; Almhdie-Imjabber et al., 2014), or other bone structures (Witkowska et al., 2014), because it is a powerful and fast volume-based algorithm. In the context of NPG, the Thickness algorithm has been applied for analyzing 3D tomography data or voxel models by Hu et al. (2016), Mangipudi et al. (2016), Richert and Huber (2018), and Soyarslan et al. (2018a,b).
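The two definitions can be made concrete with a brute-force 2D sketch in Python. It is a didactic discretization only (distances are measured between pixel centers, and all function names and the toy geometry are assumptions), not the implementation found in Fiji, and it is far too slow for real tomography data.

```python
import math


def edt(mask):
    """Distance from each foreground pixel to the nearest background pixel."""
    h, w = len(mask), len(mask[0])
    background = [(i, j) for i in range(h) for j in range(w) if not mask[i][j]]
    return [[min(math.hypot(i - bi, j - bj) for bi, bj in background)
             if mask[i][j] else 0.0
             for j in range(w)] for i in range(h)]


def local_thickness(mask):
    """Diameter of the largest inscribed disk that contains each pixel."""
    dist = edt(mask)
    h, w = len(mask), len(mask[0])
    thick = [[0.0] * w for _ in range(h)]
    for ci in range(h):
        for cj in range(w):
            r = dist[ci][cj]  # radius of the largest disk centered here
            if r == 0.0:
                continue
            for i in range(h):
                for j in range(w):
                    if mask[i][j] and math.hypot(i - ci, j - cj) < r:
                        thick[i][j] = max(thick[i][j], 2.0 * r)
    return thick


# Toy geometry: a wide block (7 x 7 pixels) attached to a narrow neck (3 pixels high).
rows, cols = 9, 14
mask = [[0] * cols for _ in range(rows)]
for i in range(1, 8):
    for j in range(1, 8):
        mask[i][j] = 1
for i in range(3, 6):
    for j in range(8, 13):
        mask[i][j] = 1

dist = edt(mask)
thick = local_thickness(mask)

# Mean thickness as the area-weighted (in 3D: volume-weighted) average.
foreground = [(i, j) for i in range(rows) for j in range(cols) if mask[i][j]]
mean_thickness = sum(thick[i][j] for i, j in foreground) / len(foreground)
```

On the neck axis far from the block, both measures agree (`thick[4][11]` equals `2 * dist[4][11]`). At the neck pixel next to the junction, however, a large disk centered inside the block reaches into the neck, so `thick[4][8]` clearly exceeds `2 * dist[4][8]`. This illustrates that the biggest sphere containing a point need not be centered at that point.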

By the definition of Hildebrand and Rüegsegger (1997), the biggest sphere at a skeleton point p<sub>skel</sub> does not need to be centered at this point. Liu et al. (2014) show for an object formed by two overlapping disks of different scales that the Thickness algorithm shows a bias toward the larger disk. They furthermore show that an equivalently working Smallest Sphere approach results in the same artifact, but in the opposite direction. The authors propose the definition of the thickness of a point p as the diameter of the maximum inscribed sphere whose circumference is farthest from p. Furthermore, for the skeleton, the property must be satisfied that the thickness at a skeleton point p<sub>skel</sub> is the diameter of the biggest sphere centered at p<sub>skel</sub>. They also introduced a star-line-based algorithm, where the thickness at an axial voxel is defined as the minimum intercept of a straight line with the boundary. The minimum-intercept length measure is highly robust under small random shifts of axial voxels. One drawback of this thickness computation method lies in the increased computation time, because interpolated intensity values at multiple sample points have to be computed on individual star-lines for each axial voxel. For more details and other thickness approaches see also the literature cited by Liu et al. (2014). The tendency to overpredict the thickness of structures was also reported by Maier et al. (2017) for cartilage thickness, in comparison to other thickness estimation approaches. Such an overprediction is unproblematic when studying the self-similarity of structures, or when comparing mean values or distributions. However, for the prediction of mechanical properties using FEM, the correct diameter distribution along the ligament axis is crucial. Richert and Huber (2018) showed that the Thickness algorithm reaches its limits when applied to typical shapes of NPG ligaments, due to the strongly varying diameter along the ligament axis. The resulting overestimation of the ligament radius by up to 30% has a strong impact on the predicted mechanical stiffness, which can deviate by a factor of more than two. In their conclusions, Richert and Huber (2018) mentioned the need for a correction method for tracing back an identified ligament shape to the corresponding true geometry, which could be based on inverse methods, such as optimization or machine learning. This important finding has been ignored by Soyarslan et al. (2018b), who used the diameter information as determined from the Thickness algorithm in their beam-FE model, without any local validation of the detected diameters or discussion of possible consequences for their mechanical prediction.
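The order of magnitude of this sensitivity can be made plausible with a simple idealization that is not taken from the cited work: assume a slender ligament of uniform circular cross-section with radius r and Young's modulus E. Classical beam theory then gives an axial stiffness proportional to the cross-sectional area and a bending stiffness proportional to the second moment of area,

$$k_{\rm axial} \propto E A = E \pi r^2, \qquad k_{\rm bend} \propto E I = E \, \frac{\pi r^4}{4}.$$

Overestimating the radius by 30%, i.e., replacing r by 1.3 r, therefore scales these stiffnesses by

$$\frac{k_{\rm axial}(1.3\, r)}{k_{\rm axial}(r)} = 1.3^2 = 1.69, \qquad \frac{k_{\rm bend}(1.3\, r)}{k_{\rm bend}(r)} = 1.3^4 \approx 2.86.$$

Under this assumption, a bending-dominated ligament would already exceed the factor of two mentioned above; real ligaments with varying diameter will deviate from this estimate.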

Further literature research revealed that a plugin for the 3D Euclidean distance transformation (EDT) by Ollion et al. (2013), among others, also exists in the open-source program Fiji, which seems to have gone unnoticed by groups working on the analysis of 3D data. As this algorithm computes the distance from a given voxel of the structure to the nearest background voxel, the extracted axis-to-surface distance will have the tendency to underpredict the ligament diameter for highly convex or concave ligament shapes. The reason for this is that the smallest distance is determined by the normal from the surface contour to an axis point, which is smaller than the diameter measured normal to the ligament axis. It is, however, unclear how large the deviations are for the typical geometries found in open-pore materials and how big their impact is on the mechanical properties in comparison to the results from the Thickness algorithm.
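A simple worked example, based on an idealized geometry that is assumed here for illustration only, shows the mechanism. Consider a locally conical ligament segment whose radius varies linearly along the axis coordinate z with half-opening angle α,

$$r(z) = r_0 + z \tan\alpha .$$

In a plane containing the axis, the surface contour is a straight line, and the shortest distance from the axis point at z<sub>0</sub> to this line is

$$d(z_0) = \frac{r(z_0)}{\sqrt{1 + \tan^2\alpha}} = r(z_0) \cos\alpha \le r(z_0),$$

provided that the foot of the perpendicular lies on the conical segment. The distance transformation thus returns r cos α instead of r; for α = 30°, for instance, this amounts to an underprediction of about 13%, while for a cylinder (α = 0) the result is exact.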

Motivated by these findings, this paper aims to lay a solid basis for error estimation and thickness correction for the different algorithms. The availability of a method for an accurate characterization represents a key element for producing data sets of high quality, consisting of pairs of structure information and related mechanical properties. As demonstrated by Huber (2018) for the topology term of the structure-property relationship, a larger number of such patterns is needed for deriving a fairly general representation using data mining and machine learning approaches. This is particularly an issue when pooling data from different sources, which make use of different algorithms.

Following a detailed investigation of the sources of over- and underestimation in the computed thickness data, approaches for the correction of data from the Thickness algorithm are proposed: a Smallest Ellipse algorithm, which resides in between the Biggest Sphere approach and the Smallest Sphere approach, and an artificial neural network approach. Similarly, an artificial neural network approach is proposed for the correction of data from the Euclidean distance transformation. The results clearly show that the artificial neural network is able to correct the over- and underpredicted thickness depending on the position along the ligament axis. The drawback is that it is limited to the range of ligament shapes used during training. Recommendations are given in terms of generalization to asymmetric ligaments as a requirement for applications to larger structures of higher complexity.

# METHODOLOGY

Previous analysis by Richert and Huber (2018) on actual NPG tomography data produced by Hu et al. (2016) revealed a diameter overestimation of the NPG structure by the Biggest Sphere Thickness algorithm by Hildebrand and Rüegsegger (1997), implemented in the open-source program Fiji by Schindelin et al. (2012), in the Thickness Plugin by Dougherty and Kunzelmann (2007). Richert and Huber (2018) analytically calculated the influence of the overestimated ligament diameters on the mechanical stiffness for single parabolic ligaments, showing an overestimation by up to a factor of 8. These results clearly show the significance of the error to be expected as a function of the ligament geometry, but it is unclear how strongly this effect is reflected in the macroscopic properties of a Representative Volume Element (RVE). It can be argued that the macroscopic response of an interconnected structure could be less sensitive to local deviations in the ligament geometries. Furthermore, the amount and effect of possible underestimations by the distance transformation need to be investigated. An impression of the discrepancy between the two algorithms is obtained by analyzing the tomography data of Hu et al. (2016), shown in **Figure 1**. The Thickness (Th) and Euclidean distance transformation (EDT) information are consistently evaluated along the skeleton voxels. It can be seen that the determined averages of 400 nm (Th) and 308 nm (EDT) deviate significantly. It is therefore important to investigate each algorithm with respect to ligament shape and to propose correction methods, where needed.

FIGURE 1 | Ligament diameter distribution of NPG tomography with Thickness (Th) and Euclidean distance transformation (EDT) algorithm. The histograms are normalized to an area of one and fitted with the Gaussian distribution. Shifted distributions with average ligament diameter of 400 nm (Th) and 308 nm (EDT) are observed.

It should be noted that, when working with tomography data, several crucial image-processing steps are necessary beforehand, such as image noise filtering, brightness and contrast adjustment, registration and segmentation. For the latter, it is necessary to set a threshold value that decides whether a voxel is attributed to the solid or to the pore space, and the proper choice of this parameter is absolutely critical for all following steps. Commonly, this parameter is calibrated via the relative density of the material, which is independently measured. While this ensures that the tomography reflects the relative density of the material on average, it does not guarantee that local features are precisely detected. In the case of the NPG-epoxy composite tomography data produced by Hu et al. (2016), specific settings in the FIB-SEM process made the ligaments easily distinguishable without interfering with the ligament network structure underneath the cross-section. In this case, the segmentation in Fiji using a single-value grayscale threshold for the image stack was thus applicable. An image processing error of ±2% in volume fraction was found by manually changing the image contrast, brightness and threshold value for the segmentation process for that data set (Hu, 2017).
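The calibration of the threshold via the relative density can be sketched in a few lines of Python. The sketch assumes that solid voxels appear brighter than pore voxels and that the gray values are given as a flat list; the function names and the toy data are hypothetical, and the exhaustive search over candidate thresholds is chosen for clarity rather than speed.

```python
def solid_fraction(gray_values, threshold):
    """Fraction of voxels attributed to the solid (gray value >= threshold)."""
    return sum(1 for g in gray_values if g >= threshold) / len(gray_values)


def calibrate_threshold(gray_values, target_fraction):
    """Threshold whose solid fraction is closest to the measured relative density."""
    candidates = sorted(set(gray_values))
    return min(candidates,
               key=lambda t: abs(solid_fraction(gray_values, t) - target_fraction))


# Toy gray values: four bright (solid) voxels among ten.
gray_values = [10, 20, 30, 40, 200, 210, 220, 50, 60, 230]
threshold = calibrate_threshold(gray_values, target_fraction=0.4)
fraction = solid_fraction(gray_values, threshold)
```

Such a calibration fixes only the global solid fraction. Shifting the threshold within the gap between the two gray value populations leaves the fraction unchanged in this toy case, whereas in real data it moves every ligament surface, which is why local features are not guaranteed to be detected precisely.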

This study focuses on analyzing the influence of the Thickness and EDT algorithms on NPG-like RVEs, which are based on known geometries. Emphasis is placed on providing data of sufficiently complex but well-defined 3D structures, for which the exact diameter information is known at each position along the ligament axis. To this end, ligaments with a smooth parabolic-spherical ligament shape as suggested by Richert and Huber (2018) are organized in a diamond structure. This topology is frequently used for mechanical modeling of 3D open-pore materials (Nachtrab et al., 2011; Huber et al., 2014; Roschning and Huber, 2016; Jiao and Huber, 2017a,b; Huber, 2018). In contrast to the conventional FEM approaches, which are computationally expensive, FEM beam models allow for fast computation even for large plastic deformation, which is a requirement for larger parameter studies of larger and more realistic RVEs. The drawback of this method is the underprediction of stiffness and strength, which needs to be compensated via a correction of the nodal mass (Huber et al., 2014; Roschning and Huber, 2016; Jiao and Huber, 2017b). An attractive alternative for the numerical simulation of foam-like materials is the Finite Cell Method (Parvizian et al., 2007; Düster et al., 2008, 2017). Recently, Gnegel et al. (2019) applied this approach for predicting the elastic-plastic deformation behavior of pure and polymer-coated NPG based on the tomography data of Hu et al. (2016). In combination with experimental macroscopic compression data, it was possible to determine the elastic-plastic properties of the gold phase and of the polypyrrole coating, which is only a few nanometers thick. This requires reducing the explicitly modeled 3D structure to a sub-sample of the available tomography dataset such that the model could be computed in a reasonable time. Therefore, FEM beam models remain an attractive candidate for computing larger models.

For the sake of a systematic in-depth comparison of all methods under investigation, the geometries in this work are limited to symmetric shapes. Altogether, 16 idealized model geometries plus three additional validation geometries are generated covering the relevant range of ligament shapes from concave to convex. For each model geometry, a high-resolution voxel representation serves as basis for testing various approaches of thickness detection and correction. In addition to the assessment of the error in the determined geometry, the effect on the mechanical properties is computed for each structure and correction method using the FEM beam modeling approach developed in a series of previous works (Huber et al., 2014; Jiao and Huber, 2017a; Huber, 2018; Richert and Huber, 2018).

Motivated by the reported differences between the skeleton FEM beam model and the FEM solid model (Richert and Huber, 2018), FEM solid models are created via PCL scripting in MSC Patran, complementing the reference FEM beam models. The results will provide further insights into the differences between FEM beam and FEM solid models for various ligament shapes in terms of elastic and plastic deformation behavior. The results are also relevant for the further development of nodal corrections for more general ligament shapes as an extension to the simple ball-and-stick geometries investigated by Jiao and Huber (2017b).

**Figure 2** gives an overview of the workflow applied in the following sections. Details on the individual approaches are provided at the beginning of each section. To mimic the FEM skeleton beam model building process from tomography data by Richert and Huber (2018), the RVE geometry information is scanned by a Python script with a defined voxel resolution. The output is a voxelized tiff stack, which is needed as input for the Skeletonize, AnalyzeSkeleton, Thickness and 3D Distance Map Plugin evaluations in Fiji (Lee et al., 1994; Dougherty and Kunzelmann, 2007; Arganda-Carreras et al., 2010; Ollion et al., 2013). The whole procedure of building the FEM skeleton beam model from tomography data is described in detail in the Appendix of Richert and Huber (2018). The simulation of the original FEM beam model vs. the FEM skeleton beam model created with the Thickness information will reveal the impact of the flawed diameter estimation on the mechanical behavior of the ligament network. This allows us also to individually analyze the errors originating from the voxel resolution, the skeletonization, and the ligament discretization on the macroscopic elastic-plastic response.

After the analysis of the influencing parameters with regard to their effect on the geometry computation, the question arises to what extent the error of each algorithm can be reduced afterwards. Concerning the Thickness algorithm, we focus in this work on two different correction approaches. Geometrically, it is clear why the Thickness algorithm overestimates the diameters of strongly varying ligament shapes as found in NPG. This is why a direct reconstruction approach is developed, opting for an ellipse as the final scanning volume. This so-called Smallest Ellipse (SE) algorithm resides in between the Biggest Sphere approach and the Smallest Sphere approach and is therefore a promising technique for efficiently balancing the thickness data between over- and underprediction. A second correction approach is based on an artificial neural network (ANN), which efficiently allows for a global mapping from the measured, overpredicted ligament shapes to the corrected ones. The ANN approach is also applied for correcting data from the EDT algorithm.

**Figure 2** | … and Euclidean distance transformation are done in Fiji. 3rd step: FEM skeleton beam models are built via Python scripting with Thickness (Th), Euclidean distance transformation (EDT), and corrected diameters using the Smallest Ellipse (SE). 4th step: additional artificial neural network (ANN) correction approach.

#### REFERENCE FEM MODELS AND THEIR PROPERTIES

#### Reference Geometry of the Unit Cell

To study the effect of the overestimation in the thickness and the quality of approaches for correction, 16 diamond unit cells are generated. By shifting the diamond structure proposed by Huber et al. (2014) by a quarter of a unit cell length in all three coordinate directions (Soyarslan et al., 2017), four ligaments with complete nodes at both ends are positioned in the center of the RVE. These core ligaments are later analyzed with respect to their thickness distribution by different algorithms, as they remain unaffected by cuts at the boundary of the RVE.

In what follows, the investigation of the mechanical behavior is limited to macroscopic compression, which is commonly used in experiments (Jin et al., 2009; Huber et al., 2014; Hu et al., 2016; Liu and Jin, 2017). The resulting macroscopic properties are only valid for this loading direction. Due to the inherent anisotropy of the diamond structure, the mechanical response can differ for compression, tension, and shear. The elastic properties, though, can be considered isotropic in tension and compression, because elastic properties by definition reflect small deformations. Furthermore, because of the perfect symmetry of the unit cell in the x-, y-, and z-directions, isotropy in these directions is naturally given as long as the loading is consistently either tension or compression. Thus, the stress-strain curves will show perfect agreement for small strains, whereas with increasing strain, the stress-strain curves for tensile loading tend to rise faster than the curves for compression loading. Under tensile loading, the ligaments tend to align in the loading direction (see Sun et al., 2013) and are able to bear higher loads compared to compression loading, where the ligaments deform into an S-shape due to bending (Huber et al., 2014). Therefore, the yield strength is slightly larger in tension than in compression, and the difference is more pronounced for thin ligaments, because they align more easily in the tensile direction, like fibers. These mechanisms are demonstrated for two example structures G<sub>11</sub> and G<sub>14</sub> in **Supplementary Section 2.3**. For the scope of this work it is sufficient to concentrate on compression, because errors in the ligament geometry will be reflected similarly in all mechanical properties and loading scenarios. In what follows, we will investigate the errors in the thickness determination depending on the algorithm that is used, as well as their correction. To this end, we use a diamond structure consisting of identical ligaments with well-known geometry. Because of this replication, the macroscopic behavior of the structure gives an indication of the response of a single ligament that is part of a more complex network.

Variable ligament shapes are incorporated in the form of a continuous parabolic-spherical shape introduced by Richert and Huber (2018), see Figure 10 therein. To also incorporate asymmetric ligament shapes observed by Richert and Huber (2018), the ends are defined by two different radii r<sub>end,l</sub> and r<sub>end,r</sub> for the left and right junction, respectively. The resulting gradient along the ligament with length l is included in Equation (1) through the parameter b. The locations x<sub>Q,l</sub> and x<sub>Q,r</sub>, at which the parabolic shape transitions into the spherical parts of the ligament, are determined iteratively such that a smooth ligament with a tangential transition is achieved (see Richert and Huber, 2018).

$$r(\mathbf{x}) = \begin{cases} \sqrt{r\_{end,l}^2 - (l/2 + \mathbf{x})^2} & -l/2 \le \mathbf{x} < \mathbf{x}\_{Q,l} \\\ a\mathbf{x}^2 + b\mathbf{x} + \mathbf{c} & \mathbf{x}\_{Q,l} \le \mathbf{x} \le \mathbf{x}\_{Q,r} \\\ \sqrt{r\_{end,r}^2 - (l/2 - \mathbf{x})^2} & \mathbf{x}\_{Q,r} < \mathbf{x} \le l/2 \end{cases} \tag{1}$$

The axial coordinate x has its origin at the middle of the ligament, such that the ligament mid radius is given by r<sub>mid</sub> = c. For the in-depth study of the thickness determination and correction as well as their effect on the mechanical properties, the ligament shape is kept symmetric by setting b = 0. In this case, r<sub>end</sub> = r<sub>end,l</sub> = r<sub>end,r</sub> and x<sub>Q,l</sub> = −x<sub>Q,r</sub>.
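Once its parameters are known, Equation (1) can be evaluated directly. The following Python sketch is an illustration, not the authors' code: the function name `ligament_radius` and the example values are assumptions, and the transition points must be supplied by the caller because the iterative tangency search of Richert and Huber (2018) is not reproduced here. The example uses the trivially smooth cylindrical case (a = b = 0, c = r<sub>end</sub>, x<sub>Q</sub> = ±l/2) and assumes the nearest-neighbor distance of the diamond lattice, l = √3/4 for a unit cell size of 1, as ligament length.

```python
import math

def ligament_radius(x, l, r_end_l, r_end_r, a, b, c, x_ql, x_qr):
    """Radius r(x) of Equation (1); x is measured from the ligament middle."""
    if not -l / 2 <= x <= l / 2:
        raise ValueError("x must lie within [-l/2, l/2]")
    if x < x_ql:
        # spherical part at the left junction
        return math.sqrt(r_end_l**2 - (l / 2 + x)**2)
    if x <= x_qr:
        # parabolic part in between the two transition points
        return a * x**2 + b * x + c
    # spherical part at the right junction
    return math.sqrt(r_end_r**2 - (l / 2 - x)**2)

# Assumed example values (not taken from Table 1): a cylindrical ligament.
L_LIG = math.sqrt(3) / 4   # diamond nearest-neighbor distance for a_UC = 1
R_END = 0.1

def cylinder_profile(n_points=21):
    """Sample r(x) at equidistant axial positions from -l/2 to l/2."""
    xs = [L_LIG * (i / (n_points - 1) - 0.5) for i in range(n_points)]
    return [ligament_radius(x, L_LIG, R_END, R_END, 0.0, 0.0, R_END,
                            -L_LIG / 2, L_LIG / 2) for x in xs]

if __name__ == "__main__":
    print(cylinder_profile())
```

For a concave or convex ligament, the same function applies with a ≠ 0, provided that the caller passes transition points that fulfill the tangency condition.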

In what follows, the unit cell size aUC is set to 1, i.e., all absolute lengths are given as fraction of the unit cell size. The 16 geometries are chosen to cover ratios of ligament mid to end radius rmid/rend from 0.5 to 1.25 in increments of 0.25. This is the relevant range of ligament shapes as identified from a 3D tomography of a NPG sample (Richert and Huber, 2018). As the second geometry parameter, the end radius was varied from rend = 0.1 to 0.175 in increments of 0.025. Through the combination of these two parameters a large range of solid fractions is covered that exceeds the typical range of NPG samples from very low (ϕmin ≈ 0.1) to very large values (ϕmax ≈ 0.5). Based on the two chosen parameters rmid and rend, the parameter c in Equation (1) can be determined following Richert and Huber (2018).
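The parameter grid described above can be written down in a few lines. In this sketch, the assignment of the first digit of the geometry code to the end radius and of the second digit to the ratio r<sub>mid</sub>/r<sub>end</sub> is inferred from how G<sub>11</sub>, G<sub>14</sub>, G<sub>41</sub>, and G<sub>44</sub> are described in the text; the dictionary layout and names are assumptions for illustration.

```python
# End radii and mid-to-end ratios as stated in the text (unit cell size = 1).
R_END_VALUES = [0.1, 0.125, 0.15, 0.175]
RATIO_VALUES = [0.5, 0.75, 1.0, 1.25]

def build_geometries():
    """Return the 16 geometry definitions keyed by their two-digit code."""
    return {
        f"G{i}{j}": {"r_end": r_end, "ratio": ratio, "r_mid": ratio * r_end}
        for i, r_end in enumerate(R_END_VALUES, start=1)   # first digit (assumed): end radius
        for j, ratio in enumerate(RATIO_VALUES, start=1)   # second digit (assumed): r_mid / r_end
    }

if __name__ == "__main__":
    for code, params in build_geometries().items():
        print(code, params)
```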

#### Reference FEM Solid and Beam Models

Reference FEM beam and solid models are generated for all geometries defined in **Table 1**. A detailed description of how the reference FEM beam model is created is given in **Supplementary Section 1**. The solid unit cells are built using PCL scripting in MSC Patran 2017 and, after a Boolean operation on all ligament and junction volumes, are meshed in a single meshing operation with C3D10 three-dimensional 10-node quadratic tetrahedron elements (Abaqus, 2014). The number of elements ranges from 9,445 to 38,279 for the structures with the lowest (G<sub>11</sub>) and highest solid fraction (G<sub>44</sub>), respectively, with average element sizes of 0.05. The solid fractions given in **Table 1** are obtained from the FEM solid model in Abaqus via the history output VOL. Examples for the most filigree structures with rend = 0.1 are shown in **Figure 3**. Due to the small ligament diameter, these structures will show the highest sensitivity with respect to effects of voxel resolution, discretization, and the accuracy of the algorithms applied to these data.

In addition to the solid models that serve as common reference for all mechanical properties, FEM beam models with 20 beam elements per ligament of type B31 [two-node shear flexible Timoshenko beams in space; (Abaqus, 2014)] are built using the code developed by Huber (2018). The code is modified for assigning a variable ligament shape to the beam elements depending on their position relative to the middle of the ligament.

For the mechanical properties, a Young's modulus of E<sub>s</sub> = 80 GPa, a Poisson's ratio of ν = 0.42, a yield strength of σ<sub>y,s</sub> = 500 MPa, and a work-hardening rate of E<sub>T</sub> = 1,000 MPa are chosen. These parameters represent the mechanical behavior of the ligaments in NPG reasonably well (Huber et al., 2014; Hu et al., 2016; Roschning and Huber, 2016; Huber, 2018).

The translation of the ligament shape given in Equation (1) for a single ligament into a physically meaningful radius distribution for the interconnected structure is described in detail in **Supplementary Section 2**. Through the intersection of three convex ligaments, the actual size of the nodal mass increases to the value R, which is defined by the triple point, i.e., the point where the surfaces of three ligaments intersect. This surface point is closest to the center of the nodal mass. Therefore, all reference FEM beam models are based on the radius R of the biggest sphere that fits in the nodal area. The corresponding radii are computed as the distance from the center of the junction to the surface in the direction of the triple point, which is found at an angle of 70.53° relative to the ligament axis. The value R is assigned to all elements positioned between the ligament end, which is the center of the nodal mass, and the axial position of the triple point T. This approach avoids case sensitivity and allows the results from different models to be compared. All geometric parameters for the structures defined in **Table 1** are provided in **Supplementary Section 4**, **Supplementary Table 1**. **Supplementary Figure 6A** shows that there is only a moderate effect on the macroscopic Young's modulus. For most ligaments, the stiffening is below 10%. However, for the yield strength shown in **Supplementary Figure 6B**, the incorporation of R becomes relevant for cylindrical and convex shaped ligaments, for which a strength increase by up to 20% and 40%, respectively, is achieved.
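The value of 70.53° can be made plausible with a short geometric check. The sketch below assumes that the four ligaments of a junction point along the tetrahedral ⟨111⟩ directions of the diamond structure and that, by symmetry, the triple point of three ligaments lies on the axis opposite to the fourth ligament. It is a plausibility check under these assumptions, not the construction used in the Supplementary Material.

```python
import math

def angle_deg(u, v):
    """Angle between two 3D vectors in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    return math.degrees(math.acos(dot / (math.hypot(*u) * math.hypot(*v))))

# Assumed ligament directions at one junction of the diamond structure.
LIGAMENTS = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]

def triple_point_direction(excluded):
    """Direction equidistant from the three ligaments other than `excluded`."""
    return tuple(-c for c in LIGAMENTS[excluded])

if __name__ == "__main__":
    t = triple_point_direction(3)
    print("ligament-ligament angle:", angle_deg(LIGAMENTS[0], LIGAMENTS[1]))
    for lig in LIGAMENTS[:3]:
        print("ligament-triple point angle:", angle_deg(lig, t))
```

Under these assumptions, the angle equals arccos(1/3), the supplement of the tetrahedral angle between two ligaments.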

#### Boundary Conditions

For a finite model size, the choice of the boundary conditions can significantly influence the material response. Miehe and Koch (2002) showed for shearing of a composite microstructure modeled with 2D solid elements that prescribed displacement boundary conditions lead to a stiffer response compared to periodic boundary conditions. Diebels and Steeb (2002) showed that boundary layers of rotations form under simple shear of a foam, leading to a size effect. In this study, we investigate the effect of errors in ligament geometry on macroscopic properties, and effects of boundary conditions should be avoided. Therefore, the chosen boundary conditions emulate an infinite periodic microstructure. Due to the perfect symmetry of the diamond structure, all simulations can be based on one unit cell with prescribed displacement and rotation boundary conditions; for details see **Supplementary Section 1.2**. For the FEM beam model, this approach is equivalent to periodic boundary conditions, while it significantly simplifies the meshing of a 3D FEM solid model.

Note to **Table 1**: Two digits numbering the row and column in this table are used for coding the geometry.

The displacement boundary conditions impose the known deformation behavior of the structure on all surface nodes using \*EQUATION in Abaqus. To this end, nodes on the planes x = 0, y = 0, and z = 0 are set to zero displacement normal to the corresponding plane. Nodes in the planes at coordinate x = 1, y = 1, and z = 1 are set to remain in a plane that is controlled by a dummy node. All nodes on the mid planes are forced to move half the displacement of the corresponding nodes in the plane at coordinate 1. Finally, in the beam models, all rotational degrees of freedom are set to zero for all surface nodes. As no displacement boundary conditions are applied to the five internal junction nodes within the RVE, these nodes are allowed to move and rotate without any constraint. Nevertheless, they behave identically to the nodes at the boundaries, which have their rotational degrees of freedom fixed, and achieve full periodicity of the stress and deformation fields. This indicates that the chosen boundary conditions are equivalent to periodic boundary conditions. More details are given in **Supplementary Section 1.2** (see **Supplementary Figures 2, 3**).
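The coupling of surface nodes to a dummy node can be written as linear constraint equations. As a minimal sketch, the following Python helper writes one two-term Abaqus \*EQUATION block of the form u<sub>node</sub> − f · u<sub>dummy</sub> = 0, with f = 1 for nodes in the plane at coordinate 1 and f = 0.5 for nodes on the mid plane. The helper name and all node numbers are hypothetical, and the authors' actual input files may be organized differently (e.g., using node sets).

```python
def equation_card(node, dof, dummy_node, factor):
    """Return an Abaqus *EQUATION block for u_node - factor * u_dummy = 0.

    The data lines follow the keyword format: number of terms, then
    (node, degree of freedom, coefficient) for each term.
    """
    return f"*EQUATION\n2\n{node}, {dof}, 1.0, {dummy_node}, {dof}, {-factor}\n"

def plane_constraints(top_nodes, mid_nodes, dof, dummy_node):
    """Constraints for one loading direction (dof = 1, 2, or 3)."""
    cards = [equation_card(n, dof, dummy_node, 1.0) for n in top_nodes]
    cards += [equation_card(n, dof, dummy_node, 0.5) for n in mid_nodes]
    return "".join(cards)

if __name__ == "__main__":
    # Hypothetical node numbers; 9999 plays the role of the dummy node.
    print(plane_constraints(top_nodes=[101, 102], mid_nodes=[55], dof=3,
                            dummy_node=9999))
```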

For elastic computations, a compression strain of 1% is applied on the dummy node of plane z = 1; for predicting the elastic-plastic stress-strain behavior, the structure is compressed by 20% strain using large deformation theory (NLGEOM) with a start increment of 0.001. The Young's modulus is always determined from the first loading increment.

For geometries G<sub>11</sub> and G<sub>14</sub> (rend = 0.1, rmid/rend = 0.5 and rmid/rend = 1.25, respectively), a size study with RVEs of increasing model size confirmed that the chosen displacement boundary conditions yield results identical to periodic boundary conditions, both being independent of the model size. The results are presented in **Supplementary Section 1.2**. As shown in **Supplementary Figure 3**, the computations with simple symmetry conditions, as used, e.g., by Huber et al. (2014), asymptotically approach this value with increasing model size (see also the size study in the Appendix of Huber, 2018). For applying the displacement boundary conditions in the solid model, a search tolerance of 1% of the unit cell allows collecting enough FE nodes that are sufficiently close to the position of the corresponding surface nodes of the FEM beam model.

**Figure 4** shows contour plots for the corresponding FEM solid and beam models at a deformation in the elastic-plastic transition. Elements exceeding the yield stress of 500 MPa are colored in gray. They represent the distribution of the plastic zones, which are in good agreement for the solid model and the corresponding beam model for the convex ligament shape G<sub>14</sub>, as can be seen from **Figures 4C,D**. However, for structure G<sub>11</sub> with concave ligaments shown in **Figure 4A**, the plastic zones are organized in the FEM solid model along the tension and compression side in the thin regions of the ligaments and cross the junction volume in the middle into the neighboring ligament. Due to the kinematics implemented in the FE beam elements, the FEM beam model in **Figure 4B** cannot capture this complex deformation and localizes the plastic strains in elements in the transition region from the ligament to the nodal mass.

#### Reference Macroscopic Mechanical Properties

In the following section, the results obtained from the FEM beam model and the FEM solid model are presented for the reference geometries defined in **Table 1**. This serves two goals. The first goal is to precisely determine the differences between the macroscopic properties of the FEM beam model relative to the FEM solid model of the very same geometry for all ligament geometries. Second, for all further investigations, the FEM beam models serve as reference for the FEM skeleton beam models derived from the voxel models. This allows potential effects from different sources to be clearly separated, such as the different behavior of FEM beam and solid models, the thickness algorithms (section FEM Skeleton Beam Models), and the quality assessment of the developed correction methods (section Methods for Thickness Correction).

**Figure 4** | … structure G<sub>11</sub>; (C) solid model of structure G<sub>14</sub>; (D) beam model of structure G<sub>14</sub>.

The macroscopic properties Young's modulus $E$ and yield strength $\sigma\_y$ are derived from engineering stress and strain measures (see **Supplementary Section 1.2**, subsection Macroscopic Evaluation). Complete sets of the resulting mechanical properties for the structures defined in **Table 1** are provided in the form of absolute values in **Supplementary Section 4**, **Supplementary Tables 2–4**. An overview of the macroscopic mechanical properties predicted by the reference FEM beam model ($E^{(ref)}$, $\sigma\_y^{(ref)}$), normalized to the corresponding values of the reference FEM solid model ($E$, $\sigma\_y$), is given in **Figure 5**. The shaded regions indicate solid fractions that are out of the range of NPG (Liu and Jin, 2017; Soyarslan et al., 2018a). It should be noted that a direct comparison with NPG samples via the solid fraction is not possible, because a significant percentage of the solid fraction can exist in the form of dangling ligaments, whereas our diamond structure is fully connected. Therefore, the larger range of solid fractions in this theoretical work can be useful for covering the relevant ligament shapes determined by Richert and Huber (2018).

**Figure 5A** confirms that the FEM beam model generally underpredicts the macroscopic Young's modulus relative to the solid model, which is due to the well-known effect of the increased lever length (Huber et al., 2014; Roschning and Huber, 2016). The FEM beam model is more compliant compared to the solid model, because the full distance from the middle of the ligament to the ligament end, i.e., the half ligament length l/2, is available for bending deformation, independent of the ligament thickness. In contrast to this, the nodal mass in the solid model reduces the lever length available for bending of the ligament depending on the size of the nodal mass relative to the ligament radius. The node is stiffened and deformation is moved into the transition zone from the ligament to the nodal mass. For more details, we refer to Huber et al. (2014) and Roschning and Huber (2016). Jiao and Huber (2017b) carried out a study on the effect of the nodal mass for a ball-and-stick model and suggested a nodal corrected beam model to compensate for the softening in the beam model by adjusting the radii and the Young's modulus of the elements in the nodal region.

There is a clear trend toward the stiffness of the solid model for decreasing ratio rmid/rend, which goes along with a decreasing solid fraction. This means that the more concave the ligament is, the closer the macroscopic mechanical stiffness is to that of the FEM solid model. Therefore, concave ligaments require less nodal correction to raise the stiffness by about 30% (rmid/rend = 0.5) or 80% (rmid/rend = 0.75), while cylindrical and convex ligaments require an additional stiffening by more than a factor of 2. This disproves an application of a single "stiffness intensity factor" as proposed by Soyarslan et al. (2018b) independent of the local ligament shape and solid fraction ϕ.

In contrast to the elastic behavior, the effect on the macroscopic strength, computed at 1% plastic strain, depends strongly on the specific ligament shape (see **Figure 5B**). On average, the yield strength predicted by the FEM beam model is comparable to that of the FEM solid model. However, for specific ligament shapes the ratio of the yield strengths ranges from 0.6 to 1.6. An example is shown in **Figures 4A,B**. From the contour plots for both types of models it can be deduced that for concave ligament shapes, the plastic zone in the FEM solid model, **Figure 4A**, is distributed over a larger volume extending from one ligament via the nodal mass into the neighboring ligament. In contrast to this, for the FEM beam model shown in **Figure 4B**, the plastic deformation localizes in elements located in the transition zone from the ligament to the nodal mass. Therefore, the levers and resulting bending moments causing plastic deformation are longer in the solid model, effectively reducing its mechanical strength. This can explain the unexpectedly high strength of the FEM beam model for specific geometries.
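The two macroscopic properties used here can be extracted from a stress-strain curve in a few lines. The sketch below follows the definitions stated in the text (Young's modulus from the first loading increment, strength at 1% plastic strain) and assumes that the plastic strain is obtained as ε − σ/E, that strain and stress are given as positive engineering values starting from the unloaded state, and that linear interpolation between increments is sufficient. The function name and the test curve are hypothetical.

```python
def macroscopic_properties(strain, stress, plastic_strain=0.01):
    """Return (E, sigma_y) from an engineering stress-strain curve.

    E is the slope of the first loading increment; sigma_y is the stress at
    which the plastic strain eps - sigma / E reaches `plastic_strain`.
    """
    e_mod = (stress[1] - stress[0]) / (strain[1] - strain[0])
    eps_p = [e - s / e_mod for e, s in zip(strain, stress)]
    for i in range(1, len(eps_p)):
        if eps_p[i] >= plastic_strain:
            # linear interpolation between the two neighboring increments
            w = (plastic_strain - eps_p[i - 1]) / (eps_p[i] - eps_p[i - 1])
            return e_mod, stress[i - 1] + w * (stress[i] - stress[i - 1])
    raise ValueError("curve does not reach the requested plastic strain")

if __name__ == "__main__":
    # Synthetic bilinear curve for illustration only (not data from the paper).
    strain = [0.0, 0.01, 0.03]
    stress = [0.0, 10.0, 12.0]
    print(macroscopic_properties(strain, stress))
```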

Based on the good agreement of the yield strength averaged over all geometries, one could argue that a structure that contains a large range of ligament shapes does not require a nodal correction for the mechanical strength. This surprising result has important consequences for the interpretation of stress-strain curves predicted from FEM beam models derived from skeletonized structural data, because the elastic and plastic properties need to be treated differently.

#### FEM SKELETON BEAM MODELS

The FEM skeleton beam model building approach of Richert and Huber (2018) is based on tomography data sets of real NPG provided by Hu et al. (2016). The common problem for this and similar works (Mangipudi et al., 2016; Soyarslan et al., 2018b) is that the desired thickness information normal to the ligament axis is not easily available. The 16 model geometries, defined in section Reference Geometry of the Unit Cell, enable us to systematically study the different sources of over- and underprediction and to qualify proposed correction methods. Furthermore, the sensitivity with respect to the voxel resolution, the skeletonization, and the discretization of the ligaments is studied.

#### RVE Size and Voxelization

To mimic the procedure according to the analysis of tomography data, a Python script is used to scan the reference RVEs for given ligament geometries. This scan produces a black (pore) and white (gold) tiff-stack in the chosen voxel resolution. Details on the tomography of the FEM beam models via parallel processing are provided in **Supplementary Section 3**. The tiff files of the 16 model geometries are available for download as the Data Sheet 2.zip folder of the **Supplementary Material**. Details of the files are provided in **Supplementary Section 5**. The code is validated using the open visualization tool Ovito by Stukowski (2010) confirming that the solid fraction of the voxelized model is below 1% error. To avoid boundary issues during the skeletonization and thickness analysis, as discussed by Richert and Huber (2018), a larger RVE of size 3 × 3 × 3 unit cells is used, similar to Soyarslan et al. (2018b). However, for the voxelization, the scanbox edge length around the mid-point is limited to 1.5 times of the unit cell size aUC, so that on all sides exactly one additional ligament (0.25 of one unit cell) is connected to the center unit cell. The skeletonization is carried out on the resulting RVE of size 1.5 in the open-source software Fiji (Schindelin et al., 2012) with the BoneJ Plugin (Doube et al., 2010) Skeletonize 3D based on the thinning algorithm by Lee et al. (1994). The diameter estimation is carried out with the BoneJ Plugin Thickness (Dougherty and Kunzelmann, 2007) based on the Biggest Sphere algorithm by Hildebrand and Rüegsegger (1997) and the 3D Mathematical Morphology (TANGO) Plugin operation 3D Distance Transform by Ollion et al. (2013). The skeleton forms the beam element axis and the thickness data is used to calculate the section radii of the beam elements. For the FEM skeleton beam model building, only the data within the volume of the center unit cell is used. 
For further details about the procedure, see the Appendix of Richert and Huber (2018).

The geometry G<sub>11</sub> with the smallest diameter was chosen to determine the accuracy as a function of the voxel resolution. This most filigree structure with rend = 0.1 and rmid/rend = 0.5 is shown in **Figure 6A**. Due to the small ligament diameter, it has the highest sensitivity with respect to effects of voxel resolution and beam discretization. The structure was scanned with 60, 100, 200, and 300 voxels per unit cell edge length Nv/aUC (see **Figure 6**), yielding volume fractions of 9.2, 9.4, 8.0, and 7.9%, respectively, which shows a dependence on the voxel resolution. With the unit cell edge length aUC = 1, one voxel has an edge length of 1/60 (0.0167), 1/100 (0.01), 1/200 (0.005), and 1/300 (0.0033) for the different resolutions, respectively. The smallest radius of the structure is 0.05 in the middle of the ligament. With the lowest resolution of 60 voxels per unit cell edge length, this results in only three voxels making up the ligament radius. With the resolution of 100 voxels shown in **Figure 6B**, the minimum quality of 10 voxels per ligament diameter proposed by McCue et al. (2018) is met. The unsatisfactory quality of the 60 voxels resolution leads to steps in the beam diameters and an uneven replication of the ligament profile, as visible in **Figure 6A**. As a consequence, local narrow necks are averaged out, which leads to a stiffening of the mechanical response. In contrast, the 200 and 300 voxels resolutions show a satisfactory quality of the surface (see **Figures 6C,D**).
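The resolution figures quoted above follow from simple arithmetic, which the following sketch reproduces for the smallest radius of 0.05 and a unit cell size of 1. The function name and the dictionary keys are chosen for illustration only.

```python
def voxel_metrics(n_voxels, r_min=0.05, a_uc=1.0):
    """Voxel edge length and number of voxels across the thinnest cross-section."""
    edge = a_uc / n_voxels
    return {
        "edge_length": edge,
        "voxels_per_radius": r_min * n_voxels / a_uc,
        "voxels_per_diameter": 2.0 * r_min * n_voxels / a_uc,
    }

if __name__ == "__main__":
    for n in (60, 100, 200, 300):
        m = voxel_metrics(n)
        print(f"{n:3d} voxels: edge = {m['edge_length']:.4f}, "
              f"radius = {m['voxels_per_radius']:.1f} voxels, "
              f"diameter = {m['voxels_per_diameter']:.1f} voxels")
```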

#### Skeletonization and Beam Discretization

When analyzing the effect of the different voxel resolutions on the mechanical behavior of the FEM skeleton beam models, the skeletonization and, originating from that, the discretization of the beam elements are further sources of errors. The skeleton of the structure is the one-voxel-wide centerline. It is obtained by surface thinning, as implemented in Fiji. Richert and Huber (2018) discuss different discretization approaches, where the most accurate approach appears to be to construct the beam axis as the connection between the centers of neighboring voxels (1 V/E). However, due to the discrete cubic shape of a voxel, this can lead to abrupt direction changes of up to 90° between two beam elements (zigzag). Especially for curved ligaments, as found in actual NPG tomography data, this has a great effect. This zigzag skeleton path results in a more compliant mechanical behavior, as shown by Richert and Huber (2018). The other approach is to average over a certain number of voxels. An approach averaging over five voxels was tested by Richert and Huber (2018). On the one hand, this solves the issue with the skeleton zigzag; on the other hand, it results in a lower number of beam elements per ligament, so that the ligament shape may be badly represented. Necks are averaged out and the macroscopic stiffness and strength are probably overestimated. This effect is similar to that of a low voxel resolution.

**Figure 8** | … ligament (Bez 20 E/L). The values are fit to a simple hyperbolic function $E^{(Th)} = k/(N\_v/a\_{UC}) + E^{(Th)}\_{\infty}$. The parameter $E^{(Th)}\_{\infty}$ approximates the values for an infinite number of voxels $N\_v/a\_{UC}$, being 590 and 580 MPa for Bezier and 1 V/E discretization, respectively. The percentage deviation from those values is inscribed.

A new approach is introduced in this paper, where the skeleton voxels are fit by a Bezier function. This results in a smooth line, with the start and end nodes being fixed in their position. The Bezier fit is not forced to go exactly through the individual skeleton points of the ligament, so no overshoots arise, as would be the case for a spline fit. The Bezier approach has the additional advantage that the desired number of equidistant beam elements per ligament can be chosen independently of the length and skeleton voxel number of the current ligament. For assuring comparability with the reference FEM beam model, 20 two-node shear flexible Timoshenko beam elements in space (B31) are used (see section Reference FEM Solid and Beam Models). For the boundary conditions, see section Boundary Conditions.
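To illustrate the idea of a Bezier fit with fixed end nodes, the following Python sketch fits a cubic Bezier curve to a voxel path by least squares and resamples it into equidistant beam nodes. The degree of the curve, the chord-length parametrization, and all function names are assumptions made for this example; the paper does not specify these details, so this should not be read as the authors' implementation.

```python
import math

def bernstein3(t):
    """Cubic Bernstein basis values at parameter t."""
    s = 1.0 - t
    return (s**3, 3 * s * s * t, 3 * s * t * t, t**3)

def fit_cubic_bezier(points):
    """Least-squares cubic Bezier through a voxel path; end points stay fixed."""
    p0, p3 = points[0], points[-1]
    # Chord-length parametrization of the skeleton voxels (assumed choice).
    dist = [0.0]
    for a, b in zip(points, points[1:]):
        dist.append(dist[-1] + math.dist(a, b))
    ts = [d / dist[-1] for d in dist]
    # 2x2 normal equations for the two free control points P1 and P2.
    a11 = a12 = a22 = 0.0
    rhs1, rhs2 = [0.0] * 3, [0.0] * 3
    for t, p in zip(ts, points):
        b0, b1, b2, b3 = bernstein3(t)
        a11 += b1 * b1
        a12 += b1 * b2
        a22 += b2 * b2
        for k in range(3):
            rest = p[k] - b0 * p0[k] - b3 * p3[k]
            rhs1[k] += b1 * rest
            rhs2[k] += b2 * rest
    det = a11 * a22 - a12 * a12
    p1 = tuple((a22 * rhs1[k] - a12 * rhs2[k]) / det for k in range(3))
    p2 = tuple((a11 * rhs2[k] - a12 * rhs1[k]) / det for k in range(3))
    return p0, p1, p2, p3

def bezier_point(ctrl, t):
    """Evaluate the cubic Bezier curve at parameter t."""
    b = bernstein3(t)
    return tuple(sum(b[i] * ctrl[i][k] for i in range(4)) for k in range(3))

def equidistant_nodes(ctrl, n_elements=20, n_dense=2000):
    """Nodes of n_elements beam elements of (nearly) equal arc length."""
    dense = [bezier_point(ctrl, i / n_dense) for i in range(n_dense + 1)]
    arc = [0.0]
    for a, b in zip(dense, dense[1:]):
        arc.append(arc[-1] + math.dist(a, b))
    nodes, j = [dense[0]], 0
    for e in range(1, n_elements):
        target = arc[-1] * e / n_elements
        while arc[j + 1] < target:
            j += 1
        w = (target - arc[j]) / (arc[j + 1] - arc[j])
        nodes.append(tuple(dense[j][k] + w * (dense[j + 1][k] - dense[j][k])
                           for k in range(3)))
    nodes.append(dense[-1])
    return nodes

def staircase(n_cycles):
    """Hypothetical zigzag voxel path along the cube diagonal."""
    path = [(0, 0, 0)]
    for dx, dy, dz in [(1, 0, 0), (0, 1, 0), (0, 0, 1)] * n_cycles:
        x, y, z = path[-1]
        path.append((x + dx, y + dy, z + dz))
    return path

if __name__ == "__main__":
    for node in equidistant_nodes(fit_cubic_bezier(staircase(4))):
        print(node)
```

The zigzag of the voxel path is smoothed out, while the first and last node coincide with the first and last skeleton voxel, which keeps the junction positions unchanged.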

The models for the different discretization approaches based on the Thickness data are shown in **Figure 7**. The diamond structures analyzed in this paper have initially a straight ligament axis (**Figure 7A**). By using the discretization of one voxel per beam element (1 V/E) on a 60 voxels scanned structure, kinks are clearly visible in **Figure 7B** as tilted elements. Also for the 200 voxels scan resolution, the 1 V/E discretization shows kinks (see **Figure 7C**). This phenomenon is not avoidable due to the discrete voxel size, shape and orientation of the ligaments in space, even for the ideal geometries used in this work. This problem is solved via the newly introduced Bezier fit, which shows nicely aligned beam elements (see **Figure 7D**). Besides the discretization issues, the diameter overestimation through the Thickness algorithm is clearly visible in all three FEM skeleton beam models (**Figures 7B–D**), when compared to the reference geometry presented in **Figure 7A**.

The FEM skeleton beam model was built from the four different voxel resolutions of the geometry G<sub>11</sub> based on the Thickness diameter estimation algorithm. Furthermore, the two different discretization approaches, with either each voxel being represented by one beam element (1 V/E) or a Bezier fit (Bez 20 E/L), are applied to the skeleton and diameter data. The results for the Young's modulus are displayed in **Figure 8**. The values are fit to a simple hyperbolic function $E^{(Th)} = k/(N\_v/a\_{UC}) + E^{(Th)}\_{\infty}$, where the parameter $E^{(Th)}\_{\infty}$ approximates the Young's modulus for a model with an infinite number of voxels, $N\_v/a\_{UC} = \infty$, as 590 MPa and 580 MPa for Bezier and 1 V/E discretization, respectively. The percentage deviation from those values is inscribed. The focus is here set solely on the effect of the voxel resolution and the two different beam element discretizations. The deviations from the reference beam model stemming from the diameter estimations are addressed in sections Thickness Analysis and Effect on Mechanical Properties.
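A hyperbolic function of this form is linear in 1/(Nv/aUC), so the two parameters can be obtained by an ordinary linear least-squares fit. The sketch below shows this; the input moduli are synthetic placeholder values generated from an assumed k, since the individual data points of **Figure 8** are not listed in the text, and the function name is chosen for illustration.

```python
def fit_hyperbola(n_voxels, moduli):
    """Least-squares fit of E = k / N + E_inf, which is linear in u = 1 / N."""
    u = [1.0 / n for n in n_voxels]
    m = len(u)
    u_mean = sum(u) / m
    e_mean = sum(moduli) / m
    s_ue = sum((ui - u_mean) * (ei - e_mean) for ui, ei in zip(u, moduli))
    s_uu = sum((ui - u_mean) ** 2 for ui in u)
    k = s_ue / s_uu
    e_inf = e_mean - k * u_mean
    return k, e_inf

if __name__ == "__main__":
    resolutions = [60, 100, 200, 300]
    # Synthetic placeholder data from an assumed k = 15000 and E_inf = 590 MPa;
    # these are NOT the values computed in the paper.
    moduli = [590.0 + 15000.0 / n for n in resolutions]
    k, e_inf = fit_hyperbola(resolutions, moduli)
    for n, e in zip(resolutions, moduli):
        print(f"{n:3d} voxels: deviation from E_inf = {100 * (e / e_inf - 1):.1f}%")
```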

Overall, the 1 V/E models show slightly lower Young's modulus values than the Bezier models, and also lower deviations from their asymptotic value of 580 MPa at Nv/aUC = ∞. As the skeleton is straight in the reference geometry, the effect of the increased compliance caused by the kinks in the ligament axis with the 1 V/E discretization is small. For the lowest voxel resolution (60 V), the stiffness is overpredicted by up to 43%, while the accuracy increases for higher resolutions. The Young's modulus of the models with a resolution of 200 voxels shows around 10% remaining difference from the predicted value at Nv/aUC = ∞. Further refinement slowly increases the accuracy, but rapidly increases the computational time. Thus, all further computations will use the voxel resolution of 200 voxels per unit cell edge length Nv/aUC with the Bezier fit to ensure comparability with the reference structures created with 20 elements per ligament. The remaining uncertainty in the prediction of the mechanical properties is up to 12% due to the voxel resolution and beam element discretization. The resulting voxel edge length of 1/200 (0.005) defines the achievable accuracy limit for the geometrical characterization in the following sections.

#### Thickness Analysis

This section discusses the geometry derived with the Thickness algorithm (Th) and the Euclidean distance transformation (EDT) from the voxel scan of the underlying reference geometries, given in **Table 1**. **Figure 9A** shows the mean radii ⟨r<sup>(.)</sup>⟩ obtained from averaging over all 20 elements of a ligament, normalized by the mean radius of the reference geometry ⟨r<sup>(ref)</sup>⟩. It can be seen that the deviation of ⟨r<sup>(.)</sup>⟩/⟨r<sup>(ref)</sup>⟩ from unity increases with increasing concavity, independent of the end radius r<sub>end</sub> and the algorithm used. For the Thickness algorithm, the largest value of 1.2 is comparable to the results of Richert and Huber (2018), where values up to 1.3 have been reported using the mathematically exact ligament geometry as reference. It could be argued that a deviation of 20% in the geometry is still acceptable. However, as shown by Richert and Huber (2018), this causes serious overpredictions in the mechanical stiffness of the ligament by a factor of two. As expected, the data from the EDT show an underprediction for increasing concavity, but the relative deviations are significantly smaller compared to the Thickness algorithm.

The advantage of the object-oriented programming approach is that it enables a local analysis of the parameters of individual ligaments at specific positions. **Figures 9B–D** show selected results for the effect on the local thickness determined at the end, quarter, and middle position of the ligament, respectively. From this series, the strengths and weaknesses of each algorithm can be discussed. In the overall comparison, the EDT algorithm is of superior accuracy. At the mid and end positions, where the tangent of the ligament shape is flat, the diameter is determined with high accuracy. Only in the transition from end to mid position, represented by the quarter positions in **Figure 9C**, can the expected underprediction be seen in the EDT data. In the worst case, which represents the largest diameter change, i.e., structure G<sup>41</sup>, the deviation is −30%.

For the Thickness algorithm, the local overestimation of the r<sub>mid</sub> value depends increasingly on the absolute radius of the ligament end, the more concave the ligament is. This is a result of the following mechanism: The Thickness algorithm propagates the size of the nodal region into the ligament region. Firstly, all skeleton points inside the nodal sphere are assigned the value R<sub>node</sub> ≥ r<sub>end</sub>, forming a nodal plateau of constant radius. Secondly, from there the ligament shape assumes a smooth transition from R<sub>node</sub> to r<sub>mid</sub>. However, in the extreme case of a very thick ligament, the two nodal spheres can even overlap at the mid position of the ligament. This would extend the plateau over the whole ligament length. Due to this, the determined radius at the mid-point r<sup>(Th)</sup><sub>mid</sub> can take any value from r<sup>(ref)</sup><sub>mid</sub> to R<sub>node</sub>.
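The plateau mechanism can be made tangible with a strongly simplified one-dimensional sketch. It assumes that the largest sphere centred at an axis point has exactly the local ligament radius, ignores the nodal sphere R<sub>node</sub> and all 3D effects, and uses hypothetical dimensions. It is therefore an illustration of the propagation idea, not the Thickness algorithm of Hildebrand and Rüegsegger (1997) itself. The helper names are invented for this example.

```python
def parabolic_profile(r_end, r_mid, length, n):
    """Radius at n equidistant axis points of a symmetric parabolic ligament."""
    xs = [-length / 2 + length * i / (n - 1) for i in range(n)]
    rs = [r_mid + (r_end - r_mid) * (2 * x / length) ** 2 for x in xs]
    return xs, rs


def biggest_sphere_1d(xs, rs):
    """Assign to each axis point x the largest radius r(c) of any sphere
    centred at an axis point c that still covers x, i.e. |x - c| <= r(c)."""
    return [max(rc for c, rc in zip(xs, rs) if abs(x - c) <= rc) for x in xs]


# Hypothetical concave ligament: thick ends, thin middle
xs, r_ref = parabolic_profile(r_end=0.15, r_mid=0.075, length=0.4, n=201)
r_th = biggest_sphere_1d(xs, r_ref)

# Hypothetical cylindrical ligament of the same length for comparison
_, r_cyl = parabolic_profile(r_end=0.1, r_mid=0.1, length=0.4, n=201)
r_cyl_th = biggest_sphere_1d(xs, r_cyl)

mid = len(xs) // 2
print("reference mid radius:", r_ref[mid])
print("propagated mid radius:", r_th[mid])
print("overestimation factor at mid:", r_th[mid] / r_ref[mid])
```

In this simplified setting the end radius is carried into the ligament as a plateau, and the mid radius is raised because thicker neighbouring spheres still cover the mid-point, whereas an isolated cylinder is reproduced unchanged.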

In the following, we investigate the impact of the determined geometries on the macroscopic mechanical properties. In particular, we address the question of how far the averaged data or the local effects in the geometrical characterization are relevant for the mechanical behavior.

#### Effect on Mechanical Properties

In section Thickness Analysis, the deviations of the average and local thicknesses were determined for the 16 reference RVEs. Because the diameter enters the moment of inertia to the fourth power in the stiffness calculation, the overestimation of the Young's modulus and yield strength is expected to be even higher. To quantify this effect, 16 FEM skeleton beam models are built from the 200 voxel resolution scans (section RVE Size and Voxelization), with a Bezier curve fit to the skeleton axis (section Skeletonization and Beam Discretization). In **Figures 10A,B**, the macroscopic Young's modulus E and the yield strength σ<sub>y</sub>, respectively, obtained from the FEM skeleton beam model are compared to the values from the corresponding reference FEM beam model.
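As a rough plausibility check of this sensitivity, consider a single beam with circular cross-section of diameter d in pure bending. This is an idealization introduced here for illustration, not the macroscopic FEM result. With the second moment of area I and the elastic section modulus W, a uniform overestimation of the diameter by 20% gives

$$I = \frac{\pi d^{4}}{64}, \qquad W = \frac{\pi d^{3}}{32}, \qquad \frac{I\left(1.2\,d\right)}{I\left(d\right)} = 1.2^{4} \approx 2.07, \qquad \frac{W\left(1.2\,d\right)}{W\left(d\right)} = 1.2^{3} \approx 1.73$$

Under these assumptions, the bending stiffness roughly doubles and the bending moment at first yield increases by about 70%, which is of the same order as the factor of two reported for the single ligament.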

The factor of overestimation of the Young's modulus for the Thickness algorithm, presented in **Figure 10A**, is similar for structures with the same ratio r<sub>mid</sub>/r<sub>end</sub>, independent of the absolute r<sub>end</sub> value. Strongly concave structures show the highest deviations, up to a factor of 2. Toward cylindrical and convex structures, the deviation decreases to a factor of 1.2. The trend in the yield strength data in **Figure 10B** is similar, showing the highest overestimations for strongly concave ligaments. With decreasing concavity, however, the decay is more pronounced. Furthermore, stronger variations for different r<sub>end</sub> values are observed, especially for the concave ligaments. There, smaller r<sub>end</sub> values show higher overestimation, ranging from 1.68 to 2.15. The higher sensitivity of the yield strength is caused by the fact that the onset of plastic deformation results from the combination of the weakest cross-section and the applied bending moment, which in turn depends on the lever arm acting on this cross-section. In contrast, the elastic deformations spread over the whole ligaments and into the junction volumes and are therefore less sensitive to the local geometry (Huber et al., 2014).

FIGURE 10 | Results of the macroscopic mechanical properties for the FEM skeleton beam models based on the Thickness (Th) algorithm or Euclidean distance transform (EDT), normalized to the results from the reference FEM beam models: (A) Young's modulus E and (B) yield strength σ<sub>y</sub>. The superscript (.) corresponds to (Th) or (EDT).

It should be noted that cylindrical and convex ligaments show overall the lowest overestimation, which is still about 20% for both macroscopic properties. This is astonishing, as one might imagine that a cylindrical ligament should be perfectly reproduced by the Thickness algorithm. However, this is only true for a cylindrical ligament of infinite length. For the interconnected structure, which contains junction volumes that are larger than the cylindrical ligaments, the overestimation in mechanical properties is due to the mechanism discussed in section Thickness Analysis.

In line with the findings from the geometric analysis presented in **Figure 9**, the predicted deviations in the macroscopic mechanical properties for the EDT data are much smaller than those obtained for the Thickness algorithm. The results can be considered accurate for cylindrical and convex shapes, while for concave shapes the stiffness and strength are underestimated by up to 20%. If this is acceptable, the EDT can be used without further correction. It should be noted that stronger concavities or asymmetries as well as non-circular cross-sections can further increase these deviations for the EDT as well.

# METHODS FOR THICKNESS CORRECTION

Because of this impact on the mechanical response, we present in the following sections possible correction approaches for both thickness algorithms. The high sensitivity of the mechanical properties to the geometric characterization justifies using the predicted Young's modulus and yield strength throughout these sections as the relevant measure for assessing the quality of each approach.

#### Smallest Ellipse Approach

Coming from the Biggest Sphere Thickness approach by Hildebrand and Rüegsegger (1997), the idea is to compensate its systematic trend of overestimation by the opposing equivalent, which is the Smallest Sphere approach discussed by Liu et al. (2014). Between these two extremes, a Smallest Ellipse (SE) approach can be considered, as schematically presented in **Figure 11**. As input data, the coordinates of the medial axis and the respective Thickness values are used. Each point x along the axis that is located in a smallest ellipse inscribed into the Thickness data r<sup>(Th)</sup> is assigned the value of the ellipse major axis as the radius r<sup>(SE)</sup>. The ellipse allows some flexibility in the range of radius assignment near the point under investigation. To this end, the linear eccentricity e of the ellipse was determined independently of the geometries defined in **Table 1**. Approximately 800 ligament geometries were created, reproducing the range of ligament geometries detected by Richert and Huber (2018), including asymmetric ligament shapes. A linear eccentricity of e = 0.75 produced the lowest errors.

The obvious drawback of the proposed Smallest Ellipse approach is that the minimum diameter of a ligament is bound to the minimum Biggest Sphere Thickness value. This can be seen in **Figure 11**, where a gap remains in the center of the ligament between the minimum radius of the original reference geometry and the reconstructed radius. In the nodal areas, the Biggest Sphere Thickness value represents the upper limit, which is correctly reproduced. The algorithm is robust, since it does not require any assumption on a model function, parameter bounds, or parameter start values, and it works for symmetric and asymmetric ligament shapes.

The correction of the geometries via the Smallest Ellipse approach leads to an overall improvement in the predicted macroscopic mechanical properties (see **Figure 12**). The previously observed overestimation by factors from 1.2 to 2.0 based on the Thickness data (see **Figure 10A**) is now reduced to an almost constant factor between 1.1 and 1.25, i.e., the concave ligaments are improved most. As discussed before, the yield strength shows a somewhat stronger sensitivity to the different ligament shape parameters, while the overall improvement is comparable to that of the Young's modulus. In summary, the reconstruction of the ligament shape with the simple Smallest Ellipse approach represents a substantial improvement over the Thickness data, although some geometrical inaccuracies remain in the thinner region.

#### Artificial Neural Network Correction Approach

In contrast to the Smallest Ellipse approach, which does not require an assumption with regard to the ligament geometry, computational methods such as optimization strategies or artificial neural networks can be applied to reconstruct the original ligament shape. In our case, the ligament shape is limited to symmetric-parabolic shapes, which in principle allows applying both strategies in a straightforward manner. Apart from the drawback of restricting the generality to the class of ligament shapes included in the assumed model function, this has the advantage of dealing with the ligament as a whole dataset. Optimization strategies require parameter bounds and parameter start values for the model function. They are furthermore computationally demanding, because the parameter identification must be carried out individually and independently for each ligament. Therefore, we focus on the development of an artificial neural network (ANN).

Young's modulus; (B) Macroscopic yield strength at 1% plastic strain.

For details on the ANNs, we refer to Huber (2018) and the literature cited there. For the training of the ANN, the 16 symmetric ligament geometries defined in **Table 1** are used. Pattern files are written in the following style: According to Equation (2), the input vector **X** consists of the radii computed for all elements along the ligament skeleton, normalized by their average, in the form r<sup>(.)</sup>(x<sub>i</sub>)/⟨r<sup>(.)</sup>⟩, where (.) can be set to any of the three algorithms, namely (Th), (SE), or (EDT). This set of data represents the shape of the ligament from one end to the other as measured by the corresponding algorithm.

As one further input value, the normalized position 2x<sub>i</sub>/l is given, for which the correction factor shall be determined. According to Equation (3), the output vector **Y** consists of just one value, which is the correct radius divided by the radius determined by the algorithm, r<sup>(ref)</sup>(x<sub>i</sub>)/r<sup>(.)</sup>(x<sub>i</sub>), at the position x<sub>i</sub>. Because an ANN represents a continuous approximation of the presented data, it is very difficult to predict the steps contained in r<sup>(ref)</sup> as shown in **Figure 4**. Therefore, the prediction of the output is limited to the positions within the triple points. Per ligament, 14 patterns are created, which are related to the 14 element radii for which the correct radius needs to be computed. Each ANN consists of four layers with 21 neurons in the input layer, 15 and 10 neurons in the two hidden layers, and 1 neuron in the output layer, and is trained for 10,000 epochs. The resulting mean squared training and validation errors are MSE<sub>T</sub><sup>(Th)</sup> = 2.37 · 10<sup>−5</sup> and MSE<sub>V</sub><sup>(Th)</sup> = 1.87 · 10<sup>−4</sup>; MSE<sub>T</sub><sup>(SE)</sup> = 8.82 · 10<sup>−6</sup> and MSE<sub>V</sub><sup>(SE)</sup> = 8.44 · 10<sup>−5</sup>; MSE<sub>T</sub><sup>(EDT)</sup> = 1.73 · 10<sup>−5</sup> and MSE<sub>V</sub><sup>(EDT)</sup> = 1.89 · 10<sup>−4</sup>, respectively.

$$\mathbf{X} = \left\{ \frac{r^{(.)}\left(x_{1}\right)}{\left\langle r^{(.)}\right\rangle}, \frac{r^{(.)}\left(x_{2}\right)}{\left\langle r^{(.)}\right\rangle}, \dots, \frac{r^{(.)}\left(x_{20}\right)}{\left\langle r^{(.)}\right\rangle}, \frac{2x_{i}}{l} \right\} \tag{2}$$

$$\mathbf{Y} = \left\{ \frac{r^{(ref)}\left(x_{i}\right)}{r^{(.)}\left(x_{i}\right)} \right\} \tag{3}$$
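The pattern definition of Equations (2) and (3) can be sketched as follows. The function `build_patterns`, the choice of the 14 inner element indices, the position convention (element midpoints measured from the ligament centre), and the example radii are assumptions made for illustration; the source does not specify them. The parameter count further assumes a fully connected network with one bias per neuron.

```python
def build_patterns(r_alg, r_ref, x, length, first=3, last=17):
    """Create (X, Y) patterns for one ligament following Equations (2) and (3).

    r_alg : 20 radii from the thickness algorithm, r^(.)(x_i)
    r_ref : 20 reference radii, r^(ref)(x_i)
    x     : 20 element positions measured from the ligament centre (assumed)
    first, last : index range of the 14 inner elements (hypothetical choice)
    """
    mean_r = sum(r_alg) / len(r_alg)
    shape = [r / mean_r for r in r_alg]           # 20 normalized radii
    patterns = []
    for i in range(first, last):
        inputs = shape + [2.0 * x[i] / length]    # 21 input values, Equation (2)
        target = [r_ref[i] / r_alg[i]]            # 1 output value, Equation (3)
        patterns.append((inputs, target))
    return patterns


# Hypothetical example: parabolic reference shape, uniformly 10% overestimated
length = 0.4
n_elem = 20
x = [-length / 2 + length * (i + 0.5) / n_elem for i in range(n_elem)]
r_ref = [0.075 + 0.075 * (2 * xi / length) ** 2 for xi in x]
r_alg = [1.1 * r for r in r_ref]
patterns = build_patterns(r_alg, r_ref, x, length)

# Layer sizes as stated in the text; weights plus biases if fully connected
layers = [21, 15, 10, 1]
n_parameters = sum((n_in + 1) * n_out for n_in, n_out in zip(layers, layers[1:]))
print(len(patterns), "patterns per ligament,", n_parameters, "trainable parameters")
```

Normalizing the radii by their mean separates the ligament shape from its absolute size, so that the network only has to learn a dimensionless correction factor for each queried position.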

As indicated by the very low training and validation errors, the reconstruction of the correct ligament shape seems to be a simple task for the ANN. The ANNs are able to determine the original ligament radius within one voxel accuracy, independent of which algorithm provides the input data.

To validate the generalization capability of the trained ANNs, three new validation geometries are generated within the range of the existing 16 geometries. They are defined with r<sub>end</sub> = 0.1375 and r<sub>mid</sub>/r<sub>end</sub> = [0.625, 0.875, 1.125]. The geometries of the three validation examples are shown in **Figure 13**. The degree of the remaining deviations is illustrated by 1-voxel-wide (±0.5 v) and 2-voxel-wide (±1 v) bands. The radii determined along the ligaments are within or very close to the 2-voxel-wide band for all three validation geometries and three ANN types. This corresponds to plus or minus one voxel, which is the accuracy limit defined by the voxel resolution. Only for the corrected Thickness data of the strongly concave ligament in **Figure 13A** are the determined values far outside the 2-voxel-wide band.
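The band criterion used for the validation can be expressed compactly. The sketch below assumes the voxel edge length of 1/200 stated earlier; the helper names and the example radii are hypothetical and only illustrate how a reconstructed profile would be checked against the ±0.5 v and ±1 v bands.

```python
VOXEL = 1.0 / 200.0   # voxel edge length for 200 voxels per unit cell edge


def max_deviation_in_voxels(r_pred, r_ref, voxel=VOXEL):
    """Largest absolute radius deviation along the ligament, in voxel units."""
    return max(abs(p - q) for p, q in zip(r_pred, r_ref)) / voxel


def within_band(r_pred, r_ref, half_width_voxels, voxel=VOXEL):
    """True if all radii lie within +/- half_width_voxels of the reference."""
    return max_deviation_in_voxels(r_pred, r_ref, voxel) <= half_width_voxels


# Mid radii of the three validation geometries defined in the text
r_end = 0.1375
r_mid_values = [ratio * r_end for ratio in (0.625, 0.875, 1.125)]

# Hypothetical reconstructed radii for illustration only
r_ref_example = [0.100, 0.090, 0.086, 0.090, 0.100]
offsets = [0.002, -0.003, 0.004, 0.001, -0.002]
r_pred_example = [r + d for r, d in zip(r_ref_example, offsets)]

print("max deviation [voxels]:", max_deviation_in_voxels(r_pred_example, r_ref_example))
print("within 1-voxel-wide band:", within_band(r_pred_example, r_ref_example, 0.5))
print("within 2-voxel-wide band:", within_band(r_pred_example, r_ref_example, 1.0))
```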

If the Thickness data are pre-processed with the Smallest Ellipse algorithm, the accuracy improves significantly for this difficult case as well. The ANN is now able to achieve accuracies that are within the theoretical resolution limit of the voxel discretization and comparable to the results of the EDT (**Figures 13A–C**). This results from the capability of the ANN to memorize the relationship between ligament shapes and their corrections as a whole and to smoothly interpolate this relationship for untrained geometries. Due to this, the ANN approach has superior performance compared to the local Smallest Ellipse approach, discussed in the previous section, which cannot fully recover the information in the thinnest part of the ligament. The drawback, however, is that this method is so far limited to symmetric ligaments. For further evaluations of actual tomography data in the parameter space of r<sup>∗</sup><sub>sym</sub> and r<sup>∗</sup><sub>asym</sub>, as found by Richert and Huber (2018), the incorporation of a linear gradient according to Equation (1) is required. With this, asymmetric ligaments can be represented, as they occur with high probability in real NPG. Motivated by the promising results presented in this section, such an extension will be the scope of future work.

As for the resulting mechanical behavior, very small deviations of at most 10% are observed for the 16 trained geometries for both Young's modulus and yield strength. The three additional validation examples are also predicted by the ANN with the very same accuracy, provided that the Thickness data are improved by the Smallest Ellipse approach before being fed to the ANN. It is remarkable that, despite some remaining error in the geometry reconstruction, the resulting errors in the mechanical properties are negligible. The reason for this is that the ANN on average determines the ligament shape correctly, with perhaps some small over- and underpredictions in different regions of the ligament. In contrast, the Thickness algorithm and the reconstruction via the Smallest Ellipse approach systematically overestimate the geometry, and therefore the mechanical properties are biased toward higher values.

# APPLICATION TO EXPERIMENTAL TOMOGRAPHY DATA

To test the methods presented in section Methods for Thickness Correction beyond the 16 idealized diamond structures, the NPG tomography data set of Hu et al. (2016) is used, which stems from a nanoporous gold sample with a nominal average ligament diameter of 400 nm. Three FEM skeleton beam models are generated based on the Thickness, EDT, and Smallest Ellipse corrected data of the NPG tomography, as described in section FEM Skeleton Beam Models. The ANN approach is not applicable in its present form, as it requires an extension toward more general shapes. The mesh of the reference solid model, consisting of 10-node tetrahedral elements (C3D10), was provided by Hu et al. (2016). For both types of models, symmetry boundary conditions are used and a compressive loading in the z-direction is applied. In this way, the results give additional insight into the effect for a more commonly used boundary condition and a realistic, aperiodic microstructure. The resulting macroscopic stress-strain behaviors are plotted in **Figure 14**; the insets clearly show the differences between the resulting beam models and the solid model, where the black line shows the traced outline of the solid model.

The macroscopic Young's moduli and yield strengths at 1% plastic strain are computed as 432 and 10.0 MPa (Thickness); 310 and 7.5 MPa (Smallest Ellipse); and 160 and 3.7 MPa (EDT), respectively. For comparison, the Young's modulus and yield strength of the FEM solid model were computed as 370 and 5.5 MPa. Consistent with the trend observed for the idealized diamond structures, the models based on Thickness and EDT diameter information show the highest and lowest values, respectively. From the idealized diamond structures, we computed ratios (Thickness/EDT) in the average diameter ranging from 1.05 (highest convexity) to 1.4 (highest concavity). The corresponding ratios in the predicted mechanical properties range from 1.2 to 2.5 (Young's modulus) and 1.2 to 2.7 (yield strength). For the NPG tomography, the ligament diameter distributions resulting from the Thickness and EDT algorithms showed a ratio in the average ligament diameter of 1.3 (see section Methodology and **Figure 1**). As shown in **Figure 14**, the (Thickness/EDT) ratios computed for the macroscopic mechanical properties are 2.7 (Young's modulus) and 2.7 (yield strength). Therefore, geometry and property ratios for the real material are close to or above the upper limits found for the idealized diamond structures. This is reasonable, because the diamond structure exhibits straight skeleton lines and symmetric ligament profiles, while the skeleton paths in the NPG sample are more randomized and ligament profiles are strongly asymmetric, as reported by Richert and Huber (2018). Therefore, the ligaments show additional gradients along their axis, and such gradients are found to be the source of error in both algorithms, Thickness and EDT.
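The ratios quoted above follow directly from the reported values, as the short calculation below shows. Only the numbers given in the text are used; the dictionary layout and variable names are chosen for this illustration.

```python
# Reported values: (Young's modulus in MPa, yield strength at 1% plastic strain in MPa)
results = {
    "Thickness": (432.0, 10.0),
    "Smallest Ellipse": (310.0, 7.5),
    "EDT": (160.0, 3.7),
    "Solid": (370.0, 5.5),
}

# Thickness/EDT ratios of the beam model predictions
ratio_E = results["Thickness"][0] / results["EDT"][0]
ratio_sy = results["Thickness"][1] / results["EDT"][1]

# Each beam model prediction relative to the FEM solid model
E_solid, sy_solid = results["Solid"]
relative_to_solid = {
    name: (E / E_solid, sy / sy_solid)
    for name, (E, sy) in results.items()
    if name != "Solid"
}

print(f"Thickness/EDT: E ratio = {ratio_E:.2f}, yield ratio = {ratio_sy:.2f}")
for name, (rel_E, rel_sy) in relative_to_solid.items():
    print(f"{name}: E/E_solid = {rel_E:.2f}, sy/sy_solid = {rel_sy:.2f}")
```

Relative to the solid model, the Thickness-based prediction lies above and the EDT-based prediction below the reference for both properties, while the Smallest Ellipse result lies below it in stiffness but above it in yield strength.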

Furthermore, the resulting Young's modulus of the EDT beam model is only 43% of that of the solid model. This confirms the expectation of Richert and Huber (2018) that the FEM solid model should be stiffer and stronger than the FEM skeleton beam model, as in the latter, the stiffening and strengthening effect of the nodal mass (Jiao and Huber, 2017b) is not yet accounted for.

# CONCLUSIONS AND OUTLOOK

While the accurate determination of the thickness of geometrical features from 2D images is straightforward, the situation changes dramatically for 3D structures. Various algorithms exist, but each has its specific drawbacks regarding implementation, computational cost, or accuracy. The Thickness algorithm by Hildebrand and Rüegsegger (1997) is the most commonly used algorithm. It is usually applied without an assessment of the error, because information about the correct thickness of the structures under investigation is not available. A study by Richert and Huber (2018) of typical ligament shapes identified from 3D FIB tomography data of NPG revealed that the error in the measured geometry can reach values up to 30%. The overestimated thickness data lead to an overestimation of the mechanical stiffness by a factor of two or more. Although an implementation of the 3D Euclidean distance transformation (EDT) is available, for example, in the plugin TANGO, this algorithm has so far not been used in 3D analysis. In contrast to the Thickness algorithm, it tends to underpredict the diameter for curved shapes. A first comparison of both algorithms with tomography data of NPG revealed a difference in the computed average ligament diameter of 30%. This and the detailed results obtained on the local radii for the different algorithms highlight how important it is to understand the individual algorithm used and what the produced data represent in relation to the measure of interest. This is particularly an issue when pooling data from various sources that make use of different algorithms.

To provide RVEs with well-defined geometries, this work is based on idealized model structures consisting of ligaments with circular cross-sections and smooth parabolic-spherical shape, organized in a diamond structure. Sixteen high-resolution voxel models and finite element models are provided, covering the relevant shapes from concave to convex ligaments. These models serve as reference for the error assessment of both the determined geometry and the elastic-plastic mechanical properties along the thickness determination and correction chain. Furthermore, the provided test structures can be used for validation of any newly developed algorithm for the determination or correction of thickness information from voxel data.

FIGURE 14 | Stress-strain curves of the NPG tomography of Hu et al. (2016) modeled as FEM solid model and predictions from FEM skeleton beam models based on Thickness, Smallest Ellipse, and EDT diameter information. Insets show regions of interest for comparison of the determined geometries.

To decouple this study from the known effect that FEM beam models show a more compliant and weaker macroscopic stress-strain behavior compared to FEM solid models, the differences in both properties are computed for each geometry. As expected, the FEM beam model is more compliant than the FEM solid model. The data show an increasing deviation for increasing mid to end radius ratio, while the ligament size has only a marginal effect. In contrast, the yield strength values are distributed below and above those of the FEM solid model. This surprising result leads to the conclusion that the stress-strain curve computed by Richert and Huber (2018) need not necessarily fall below the curve predicted by the FEM solid model after the geometry is corrected, because a newly developed nodal correction for these ligament shapes may not necessarily increase the strength.

An investigation of the sensitivity with regard to the voxel resolution revealed that the predicted mechanical stiffness is significantly overestimated with decreasing voxel number. For the most filigree structures and a resolution of 60 voxels per unit cell length, the error reaches up to 30% in comparison to a resolution of 300 voxels. Increasing the resolution to 200 voxels reduces the error to 3%.

Applying the Thickness algorithm to the data with 200 voxels resolution yielded largest overestimations of 20% in the average radius and 70% in the radius in the middle of the ligament. The impact on the Young's modulus and yield strength is up to 100% overestimation for concave shapes. This is not as high as predicted in the single ligament study by Richert and Huber (2018), but is still unacceptable. The Euclidean distance transformation resulted in an underprediction in the macroscopic mechanical properties of up to 20% for concave ligaments.

In view of these results, two approaches for the correction of the computed thickness are proposed. Firstly, a Smallest Ellipse correction approach, which can be interpreted as the counterpart of the Biggest Sphere Thickness algorithm, allows reducing the error in Young's modulus to 20% and in yield strength to 30% for all ligament shapes. Secondly, artificial neural networks were trained using patterns consisting of the estimated thickness information from the Thickness, Smallest Ellipse, and Euclidean distance transformation algorithms together with the original ligament shapes. It could be shown that the accuracies achieved for most cases are within a few voxels. The resulting deviations in the mechanical properties are within a few percent, even for untrained validation patterns. This demonstrates the great potential of ANNs to accurately approximate complex non-linear relationships as a whole. A correct reconstruction is even possible for data for which the input information is incomplete in terms of the original ligament shape. However, relative to the ANN corrected Thickness data, the accuracy can be significantly increased by presenting the data from the Smallest Ellipse algorithm. This shows that it is advisable to reduce the complexity of the problem as far as possible by using existing algorithms or estimates, even if they are of limited accuracy. Such strategies have been successfully applied before, and the outcome of this work emphasizes once more the importance of incorporating a priori knowledge in the preparation of the ANN definition and pattern generation when high accuracy is a requirement. This is particularly important for solving highly non-linear and complex inverse problems (Huber and Tsakmakis, 1999, 2001; Tyulyukovskiy and Huber, 2006).

An obvious drawback of the ANN approach is that it must be trained for the parameter space of possible shapes to be identified. This means that the evaluation of tomography data in the parameter space of r<sup>∗</sup><sub>sym</sub> and r<sup>∗</sup><sub>asym</sub>, as found by Richert and Huber (2018), requires an expansion by a linear gradient along the ligament axis or the incorporation of even more general shapes. In addition, the results of the Thickness and EDT algorithms should be critically evaluated with respect to effects from non-circular cross-sections that might occur in real samples.

Thus, future research should be directed toward approaches that provide sufficient geometrical accuracy for a large range of possible ligament geometries, where the accuracy should always be evaluated in view of the predicted mechanical properties.

Finally, the Thickness, Smallest Ellipse, and EDT algorithms are applied to the experimental NPG tomography data set of Hu et al. (2016). The average diameters and predicted stress-strain curves consistently showed Thickness to EDT ratios at the upper limit of the range computed for the idealized diamond structures. This is consistent with the finding that gradients in the ligament diameter along the axis are responsible for systematic over- and underestimation by the algorithms. Obviously, this effect is enhanced by the random nature and strong asymmetry of real ligaments. Furthermore, the stress-strain curve of the solid model lies in between the Thickness and EDT prediction. While the overprediction based on Thickness data confirms the result reported by Richert and Huber (2018), the EDT curve being significantly below the result of the FEM solid model now opens the perspective for an implementation of a physically meaningful nodal correction in the FEM beam model.

# AUTHOR CONTRIBUTIONS

CR and NH conceptualized and designed the study, and wrote and revised the manuscript. NH created the geometries, FEM solid and FEM beam models, and coded the Python scripts for voxel scanning. CR analyzed the voxel scans and carried out the FEM skeleton beam modeling. AO developed the Smallest Ellipse approach and carried out the thickness correction using this method. CR developed the ANN for thickness correction and analyzed the errors of all methods in terms of geometry and mechanical properties.

# FUNDING

This work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation)—Projektnummer 192346071—SFB 986.

# ACKNOWLEDGMENTS

Kaixiong Hu and Erica T. Lilleodden are acknowledged for making the FIB tomography data set and the FEM solid model of NPG available.


# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmats.2019.00327/full#supplementary-material




**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Richert, Odermatt and Huber. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Data Science Based Mg Corrosion Engineering

Tim Würger<sup>1,2</sup>, Christian Feiler<sup>1</sup>, Félix Musil<sup>3</sup>, Gregor B. V. Feldbauer<sup>4</sup>, Daniel Höche<sup>1,5</sup>, Sviatlana V. Lamaka<sup>1</sup>, Mikhail L. Zheludkevich<sup>1,6</sup> and Robert H. Meißner<sup>1,2</sup>\*

<sup>1</sup> MagIC - Magnesium Innovation Centre, Institute of Materials Research, Helmholtz Centre for Materials and Coastal Research, Geesthacht, Germany, <sup>2</sup> Institute of Polymers and Composites, Hamburg University of Technology, Hamburg, Germany, <sup>3</sup> Laboratory of Computational Science and Modelling, Institute of Materials, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland, <sup>4</sup> Institute of Advanced Ceramics, Hamburg University of Technology, Hamburg, Germany, <sup>5</sup> Computational Material Design, Faculty of Mechanical Engineering, Helmut-Schmidt-University, Hamburg, Germany, <sup>6</sup> Faculty of Engineering, Institute for Materials Science, University of Kiel, Kiel, Germany

#### Edited by:

Surya R. Kalidindi, Georgia Institute of Technology, United States

#### Reviewed by:

Tim Kowalczyk, Western Washington University, United States; Ruifeng Lu, Nanjing University of Science and Technology, China

#### \*Correspondence:

Robert H. Meißner robert.meissner@tuhh.de

#### Specialty section:

This article was submitted to Computational Materials Science, a section of the journal Frontiers in Materials

Received: 04 January 2019 Accepted: 14 March 2019 Published: 05 April 2019

#### Citation:

Würger T, Feiler C, Musil F, Feldbauer GBV, Höche D, Lamaka SV, Zheludkevich ML and Meißner RH (2019) Data Science Based Mg Corrosion Engineering. Front. Mater. 6:53. doi: 10.3389/fmats.2019.00053

Magnesium exhibits a high potential for a variety of applications in areas such as transport, energy and medicine. However, untreated magnesium alloys are prone to corrosion, restricting their practical application. Therefore, it is necessary to develop new approaches that can prevent or control corrosion and degradation processes in order to adapt to the specific needs of the application. One potential solution is using corrosion inhibitors which are capable of drastically reducing the degradation rate as a result of interactions with the metal surface or components of the corrosive medium. As the sheer number of potential dissolution modulators makes it impossible to obtain a detailed atomistic understanding of the inhibition mechanisms for each additive, other measures for inhibition prediction are required. For this purpose, a concept is presented that combines corrosion experiments, machine learning, data mining, density functional theory calculations and molecular dynamics to estimate corrosion inhibition properties of still untested molecules. Concomitantly, this approach will provide a deeper understanding of the fundamental mechanisms behind the prevention of corrosion events in magnesium-based materials and enables more accurate continuum corrosion simulations. The presented concept facilitates the search for molecules with a positive or negative effect on the inhibition efficiency and could thus significantly contribute to the better control of magnesium/electrolyte interface properties.

Keywords: machine learning, property-structure relationship, high-throughput screening, corrosion inhibition, density functional theory, magnesium, dimensionality reduction

# 1. INTRODUCTION

Light-weight materials such as magnesium and its alloys are of high interest for the industrial sector. Potential applications can be found in the automobile industry as structural components (Kulekci, 2008), in batteries as anode material (Aurbach et al., 2000; Höche et al., 2018) and in medical engineering as biocompatible, resorbable implants (Brar et al., 2009). However, dealing with corrosion is a challenging task in various engineering disciplines. Durability and versatility strongly depend on the corrosion properties of the applied material, and for most applications as a structural component, corrosion activity has to be minimized. Yet, for other applications the corrosion properties have to be adapted to the intended use. For example, introducing magnesium as battery or implant material requires the corrosion or degradation to proceed at a certain rate. Consequently, the development of reliable, predictive models and methods for general dissolution control is crucial.

There are several concepts to protect magnesium from corrosion, ranging from alloying to surface coatings (Gray and Luan, 2002; Blawert et al., 2006; Jia et al., 2016). Recent studies strongly suggest that the re-deposition of released noble impurities (e.g., iron) results in higher corrosion rates as the size of cathodically active sites at the magnesium surface increases over time up to a state of equilibrium (Höche et al., 2016; Li et al., 2016; Mercier et al., 2018; Michailidou et al., 2018). Concerning the iron re-deposition mechanism, a promising strategy to prevent or control corrosion in magnesium-based materials is the introduction of chemical substances that either form stable complexes with the released iron species or block their access to the surface (Lamaka et al., 2016; Yang et al., 2018).

Novel methods for inhibition prediction of not yet tested compounds based on modern data science techniques are in high demand to predict whether a molecule is a potential inhibitor or even further promotes dissolution of the used material. Hence, the high experimental effort and costs of testing multiple compounds for their corrosion inhibition potential can be circumvented. The molecular structure of potential corrosion inhibiting additives is easily obtained nowadays and thus, represents a promising starting point to identify property-structure relationships as well as to predict the inhibition efficiencies of uninvestigated additives. Following this strategy, Ceriotti et al. developed sophisticated methods to vividly illustrate property-structure landscapes by employing SOAP (Smooth Overlap of Atomic Positions) kernels (Bartók et al., 2013) to create a high-dimensional similarity measure and reducing it to a two-dimensional visualization with the dimensionality reduction algorithm "sketch-map" (Ceriotti et al., 2011, 2013). Moreover, this approach is particularly suited for high-dimensionality data from atomistic simulations as it was already successfully applied to molecular crystals (Musil et al., 2018) and high-throughput structural databases (De et al., 2016, 2017).

In this study, the SOAP kernel and sketch-map are applied to a corrosion inhibition database of multiple molecular compounds to improve the understanding of the inhibition-structure relationship. Furthermore, the obtained results can be used directly to qualitatively predict the inhibition properties of not yet tested compounds, thus allowing for a data-driven design of anti-corrosion additives for magnesium-based materials.

# 2. MATERIALS AND METHODS

#### 2.1. Corrosion Experiments

The balance between magnesium dissolution and hydrogen evolution dominates the aqueous magnesium corrosion process. Due to the processing of magnesium with various methods (Pekguleryuz et al., 2013), noble impurities such as iron are impossible to avoid. Thus, local galvanic cells form in the material that locally promote corrosion, resulting in increased magnesium dissolution, hydrogen evolution and the release of impurities such as iron. Finding molecules that form stable soluble or insoluble complexes with the released impurities is a promising way to screen for dissolution modulators and provides the basis for our workflow.

In a systematic screening for magnesium corrosion inhibitors (Lamaka et al., 2017), the influence of various organic molecules on the hydrogen evolution rate in magnesium corrosion was investigated. Here, the compounds were either previously reported as magnesium corrosion inhibitors or chosen based on their ability to form stable soluble complexes with Fe<sup>2+/3+</sup> in order to prevent iron re-deposition (Höche et al., 2016; Lamaka et al., 2016). Hydrogen evolution tests were performed for six different alloys as well as three grades of pure magnesium. Based on the resulting hydrogen evolution rate, the inhibitors were ranked by their inhibition efficiencies, where positive values of up to 99% correspond to suppressed Mg corrosion (referred to as corrosion inhibitors) and negative values to promoted dissolution of Mg (referred to as corrosion promoters) with respect to a reference experiment in 0.5% NaCl electrolyte without any additives. The potential inhibitors were dissolved in 0.5% NaCl to obtain concentrations of 0.05 M and the initial pH was adjusted to values in the range of 5.5–7.2. Further experimental details can be found in the original publication (Lamaka et al., 2017). In this study, only inhibition results for commercial purity magnesium (CP-Mg) with 220 ppm iron content are considered.
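The sign convention of the inhibition efficiency can be illustrated with a minimal sketch. The formula below, which compares the hydrogen evolution rate with an additive to that of the additive-free reference, is a commonly used definition and is assumed here; the exact expression used in the original screening (Lamaka et al., 2017) may differ. The function name and the numerical rates are hypothetical.

```python
def inhibition_efficiency(rate_with_additive: float, rate_reference: float) -> float:
    """Inhibition efficiency in percent relative to the additive-free reference.

    Assumed definition: IE = (1 - rate_with_additive / rate_reference) * 100.
    Positive values indicate suppressed corrosion, negative values promoted dissolution.
    """
    if rate_reference <= 0:
        raise ValueError("reference hydrogen evolution rate must be positive")
    return (1.0 - rate_with_additive / rate_reference) * 100.0


# Hypothetical hydrogen evolution rates in arbitrary but consistent units.
reference_rate = 2.0
print(inhibition_efficiency(0.02, reference_rate))  # strong inhibitor, close to the upper limit
print(inhibition_efficiency(3.0, reference_rate))   # promoter, negative efficiency
```

With this convention an additive that leaves the rate unchanged has an efficiency of zero, and there is no lower bound for promoters.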

# 2.2. Molecular Similarity

SMILES (simplified molecular-input line-entry system) strings of the experimentally investigated compounds are used to create molecular structures using the small molecule topology generator STaGE (Lundborg and Lindahl, 2015). As implemented in the high-throughput workflow of STaGE, the geometries of the structures are optimized with GAMESS/US (Schmidt et al., 1993; Gordon and Schmidt, 2005) using the B3LYP functional (Becke, 1993; Stephens et al., 1994) with the 6-311++G(d,p) basis set and an SCF convergence criterion of 10<sup>−6</sup>. As the inhibitor molecules are experimentally tested in solution, the optimizations are performed using a polarizable water model (c-PCM) (Barone and Cossi, 1998; Cossi et al., 2002, 2003; Wang and Li, 2009). Further information on the computational details is given in the **Supplementary Material**.

We quantify the structural and chemical similarity between inhibitor structures using the SOAP-REMatch kernel (Bartók et al., 2013; De et al., 2016) to investigate the relation between their structure and associated properties. The SOAP kernel compares local atomic environments and the REMatch (Regularized Entropy Match) kernel condenses the local similarities between two structures into a global similarity measure. A local environment is defined within a spherical region of radius r<sub>c</sub> centered on an atom and is built by a superposition of Gaussian functions with width ξ. The larger r<sub>c</sub> is chosen, the more structural information surrounding the atom is included. The SOAP kernel measures the rotationally and translationally invariant overlap between two such local environments and can be raised to a power ζ to discriminate more between large (∼0.9) and medium (<0.6) similarities. The combination of the local similarities can be tuned by the hyperparameter γ of the REMatch kernel. For large values (γ ∼ 10) more equal weights are assigned to the local similarities, while for small values (γ ∼ 0.01) only the best matching pairs of local environments are selected to compute the global similarity (see De et al., 2016 for more details).
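The role of γ can be made concrete with a small sketch of an entropy-regularized match between two sets of local environments. It assumes the local SOAP similarities are already available as a matrix and uses plain Sinkhorn scaling; the function name, iteration count and input values are hypothetical, and this is not the implementation used for the results in this article.

```python
import math


def rematch_kernel(local_kernel, gamma, n_iter=200):
    """Entropy-regularized best-match average of a local similarity matrix.

    local_kernel[i][j] is the similarity (between 0 and 1) of environment i
    in structure A and environment j in structure B.
    """
    n, m = len(local_kernel), len(local_kernel[0])
    # Gibbs weights: a small gamma sharply favors the best-matching pairs.
    # Note: for very small gamma and poor matches these can underflow to zero.
    gibbs = [[math.exp(-(1.0 - c) / gamma) for c in row] for row in local_kernel]
    u = [1.0 / n] * n
    v = [1.0 / m] * m
    for _ in range(n_iter):
        # Sinkhorn scaling toward row sums 1/n and column sums 1/m.
        v = [(1.0 / m) / sum(gibbs[i][j] * u[i] for i in range(n)) for j in range(m)]
        u = [(1.0 / n) / sum(gibbs[i][j] * v[j] for j in range(m)) for i in range(n)]
    # Global similarity: match weights times local similarities.
    return sum(u[i] * gibbs[i][j] * v[j] * local_kernel[i][j]
               for i in range(n) for j in range(m))


# Hypothetical local similarities between two small structures.
raw = [[1.0, 0.55], [0.60, 0.90]]
zeta = 2.0  # sharpening exponent applied to the local kernel
local = [[k ** zeta for k in row] for row in raw]
print(rematch_kernel(local, gamma=2.0))   # many pairs contribute
print(rematch_kernel(local, gamma=0.01))  # dominated by the best-matching pairs
```

For two structures with mutually dissimilar environments, a small γ yields a global similarity close to the average of the best matches, whereas a large γ approaches the plain average over all pairs.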

To help the visualization of potential structure-property relationships, we consider each structure to lie in a high-dimensional space defined by the SOAP-REMatch kernel, which is transformed into a distance (Berg et al., 1984), and we project this information onto a two-dimensional map using sketch-map (Ceriotti et al., 2011). This dimensionality reduction technique concentrates the unavoidable distortions of the space such that close/distant, i.e., similar/dissimilar, structures in the high-dimensional space keep this relationship in the low-dimensional space. This behavior is achieved by a sigmoid function that is applied to the distances and is mainly influenced by the switching distance σ, as well as a and b as tuning parameters (see **Supplementary Material**).

Thus, it is possible to create a two-dimensional similarity landscape that allows assessing the molecular similarity by analyzing relative positions and cluster formations. However, due to the form of the sigmoid function, far apart points can be arbitrarily far apart in the low-dimensional projection, which makes a physical interpretation of distances between basins in the projection impossible.
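The two transformations described above can be sketched as follows. The kernel-induced distance assumes a normalized kernel with unit self-similarity, and the sigmoid is written in the form given in the sketch-map literature (Ceriotti et al., 2011); both function names are hypothetical and the parameter values are purely illustrative.

```python
import math


def kernel_distance(k_ab: float) -> float:
    """Distance induced by a normalized kernel, assuming K(A, A) = K(B, B) = 1."""
    return math.sqrt(max(0.0, 2.0 - 2.0 * k_ab))


def sketchmap_sigmoid(r: float, sigma: float, a: float, b: float) -> float:
    """Switching function applied to distances; equals 0.5 at r = sigma."""
    return 1.0 - (1.0 + (2.0 ** (a / b) - 1.0) * (r / sigma) ** a) ** (-b / a)


print(kernel_distance(0.9))  # similar structures give a short distance
for r in (0.5, 2.0, 8.0):
    # Short distances map close to 0, distances well beyond sigma saturate near 1.
    print(r, sketchmap_sigmoid(r, sigma=2.0, a=3.0, b=3.0))
```

Because the function saturates for distances well beyond σ, pairs that are far apart contribute almost nothing to the fit regardless of how far apart they end up, which is why distances between separate basins carry no physical meaning.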

#### 2.3. The Inhibition Prediction Workflow

When combining the presented methods, it is possible to visualize the relationship between the molecular structures and the corresponding inhibition efficiencies in a property-structure landscape where all experimentally tested structures act as landmark points. Subsequently, the inhibition efficiencies of uninvestigated compounds can be predicted following the proposed workflow (**Figure 1**).

Experimental inhibition efficiencies obtained from a corrosion inhibitor databank (Lamaka et al., 2017), as well as molecular similarity measures determined with the SOAP-REMatch kernel, can be combined to create a two-dimensional property-structure landscape for the tested inhibitor molecules. Here, clusters can indicate correlations between the inhibition efficiency and inhibitor structure, making it possible to relate certain molecular structures to potentially promoting or inhibiting corrosion properties. For now, the small sample size favors an unsupervised machine learning technique where the decision boundaries are drawn by a human, instead of a supervised learning algorithm (Kotsiantis et al., 2007).

Consequently, to predict the inhibition properties of an untested compound its structural relationship to the landmark points has to be determined. This can be accomplished by out-of-sample embedding, where the new structure is projected into the generated sketch-map by reproducing the distances to the previously defined landmark points (Ceriotti et al., 2011). Once the structure is projected into the property-structure landscape, its relative position to previously identified clusters can help to assess the impact on corrosion events. Concurrently, this approach indicates whether it is reasonable to examine untested additives in further corrosion experiments, hence saving a tremendous amount of time and resources compared to an experimental high-throughput approach.
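A minimal sketch of out-of-sample embedding is given below. It places one new structure in the two-dimensional map so that its sigmoid-transformed distances to the landmarks match those in the high-dimensional space. Several simplifications are assumed: the same sigmoid parameters are used in both spaces, the minimization is a repeated grid search rather than a proper optimizer, and all names and coordinates are hypothetical.

```python
import math


def sigmoid(r, sigma=2.0, a=3.0, b=3.0):
    """Sketch-map style switching function (same parameters in both spaces here)."""
    return 1.0 - (1.0 + (2.0 ** (a / b) - 1.0) * (r / sigma) ** a) ** (-b / a)


def stress(point, landmarks_2d, hd_distances):
    """Mismatch between transformed high- and low-dimensional distances."""
    total = 0.0
    for (lx, ly), d_hd in zip(landmarks_2d, hd_distances):
        d_ld = math.hypot(point[0] - lx, point[1] - ly)
        total += (sigmoid(d_hd) - sigmoid(d_ld)) ** 2
    return total


def embed_out_of_sample(landmarks_2d, hd_distances, half_width=10.0, levels=6, grid=21):
    """Project one new structure by zooming grid searches around the best point."""
    best = (sum(x for x, _ in landmarks_2d) / len(landmarks_2d),
            sum(y for _, y in landmarks_2d) / len(landmarks_2d))
    for _ in range(levels):
        step = 2.0 * half_width / (grid - 1)
        candidates = [(best[0] - half_width + i * step, best[1] - half_width + j * step)
                      for i in range(grid) for j in range(grid)]
        best = min(candidates, key=lambda p: stress(p, landmarks_2d, hd_distances))
        half_width = step  # zoom in around the current best candidate
    return best


# Hypothetical landmark positions and distances of a new structure to them.
landmarks = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0)]
true_point = (1.3, 2.7)
hd = [math.hypot(true_point[0] - x, true_point[1] - y) for x, y in landmarks]
print(embed_out_of_sample(landmarks, hd))
```

In this toy case the high-dimensional distances are generated from a known two-dimensional position, so the search should recover that position; with real kernel distances the stress does not vanish and the result is only a best compromise.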

# 3. RESULTS

To create a sketch-map displaying the relationship between corrosion inhibition efficiency and inhibitor structure, a total of 80 compounds was chosen out of the 151 experimentally tested structures provided in a corrosion inhibitor database (Lamaka et al., 2017). All structures were chosen based on a common inhibitor concentration of 0.05 M during the hydrogen evolution experiment to avoid concentration dependencies. Before conducting any analysis, the dataset was further subdivided at random into 74 training and 6 test structures: 74 structures for creating the sketch-map and six structures for validating the inhibition workflow.
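The random subdivision into landmark and validation structures can be sketched in a few lines. The compound identifiers and the seed are hypothetical; the actual selection used in this study is not reproduced here.

```python
import random


def split_dataset(items, n_test, seed):
    """Randomly split items into (training, test) lists with a reproducible seed."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


compound_ids = [f"compound_{i:02d}" for i in range(80)]  # hypothetical identifiers
landmarks, validation = split_dataset(compound_ids, n_test=6, seed=0)
print(len(landmarks), len(validation))  # 74 6
```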

After geometry optimization, we measure the structural and chemical similarity between these structures using the SOAP-REMatch kernel (De et al., 2016). In order to improve the understanding of the property-structure relationship, the influence of the respective parameters is examined. Indeed, to achieve a wide range of applicability, the SOAP-REMatch kernel and sketch-map technique rely on a few hyper parameters that need to be tuned accordingly (see De et al., 2017; Musil et al., 2018 for a more comprehensive discussion). Depending on the choice of hyper parameters, structural data points are either divided into clusters based on an observable similarity or appear completely scattered. Hence, the parameters have a strong impact on the identifiability of correlations between structure and investigated property. After thorough investigation of the parameter behavior (see **Supplementary Material**), a set of parameters is chosen that allows the division of structural data points with similar corrosion properties into clusters.

To put a higher focus on the local atomic structure for the similarity determination using SOAP, a cutoff radius r<sub>c</sub> = 3.0 Å is chosen which includes all moieties of interest but neglects the overall molecular structure for most of the investigated dissolution modulators. For a good balance between strict similarity requirements and a sufficient number of pairs of local environments, the Gaussian width is set to ξ = 0.3. By choosing ζ = 2.0 the discrimination between large and medium similarities is increased, thus amplifying clustering effects. Based on the parameter γ = 2.0, a broad selection of well matching pairs of local environments, as determined by the SOAP kernel, is taken into account to compute the global similarity using the SOAP-REMatch kernel.

As the dimensionality reduction with sketch-map is based on a sigmoid function, the corresponding parameters have to be optimized for the given data. Again, optimizing the parameters with respect to cluster formations of the structural data points, choosing σ = 2 as well as the tuning parameters a = b = 3 results in the sketch-maps shown in **Figures 2**, **3**. The data points originating from the input structures are divided into two elongated "islands," a small island in the lower left and a larger island in the upper right part of the sketch-map. It is noteworthy that aromatic compounds are solely found in the lower region whereas aliphatic compounds are distributed in the upper island of the sketch-map. This leads us to the conclusion that the chosen parameters are well suited to generate a sketch-map of the investigated molecule database.

When coloring the structural data points according to their corresponding inhibition efficiency (**Figure 2A**), the upper right island is further divided into two clusters, where the left cluster is populated by corrosion inhibitors (green) and the right cluster mostly by corrosion promoters (purple) or moderately inhibiting (light green) additives. The lower left island is dominantly populated by corrosion inhibitors, except for three structures on its outer edge. Cluster formations clearly indicate a property-structure relationship, which allows a cautious correlation of inhibition efficiency and molecular structure.

For the inhibition prediction it is desired to project not yet tested compounds into the generated sketch-map and relate their position to the three identified clusters. When a new structure is projected into an area with dominantly corrosion inhibitors or promoters, it is assumed to share similar inhibition properties and can be further investigated experimentally if desired. For purposes of validation, six structures of the experimental database are randomly chosen and projected into the sketch-map by determining their global similarity including all structures. Subsequently, the distance of the new structures is related to the 74 defined landmarks and used to compute the required projections. As a guide for the eye, the three previously identified clusters are outlined with dashed lines and colored according to the median inhibition efficiency in the respective region (**Figure 2B**).
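How a projected structure is related to the identified clusters can be sketched as follows. In this study the cluster outlines are drawn by hand; the nearest-landmark assignment below is only a stand-in for that step, and all coordinates, labels and efficiencies are hypothetical.

```python
import math
from statistics import median


def cluster_medians(efficiencies, labels):
    """Median inhibition efficiency (in %) for each cluster label."""
    groups = {}
    for value, label in zip(efficiencies, labels):
        groups.setdefault(label, []).append(value)
    return {label: median(values) for label, values in groups.items()}


def nearest_landmark_cluster(point, landmarks_2d, labels):
    """Cluster label of the landmark closest to a projected point."""
    index = min(range(len(landmarks_2d)),
                key=lambda i: math.dist(point, landmarks_2d[i]))
    return labels[index]


# Hypothetical landmark coordinates, cluster labels and efficiencies.
landmarks_2d = [(0.0, 0.0), (0.5, 0.2), (5.0, 5.0), (5.5, 4.8), (9.0, 5.0), (9.3, 5.4)]
labels = ["aromatic", "aromatic", "aliphatic_left", "aliphatic_left",
          "aliphatic_right", "aliphatic_right"]
efficiencies = [80.0, 90.0, 70.0, 60.0, -40.0, -120.0]

medians = cluster_medians(efficiencies, labels)
new_point = (5.2, 5.1)  # projected position of an untested compound
cluster = nearest_landmark_cluster(new_point, landmarks_2d, labels)
print(cluster, medians[cluster])
```

The cluster median serves only as a qualitative indication; a structure projected far from all landmarks should not be assigned to any cluster this way.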

As the new structures differ strongly in topology, it is natural that the computed projections lead to differing positions in the sketch-map. In relation to the landmarks, structures containing unusually coordinated atoms, additional atom species or an unusual number of functional groups are projected in regions far away from the observed islands, indicating discrepancies in similarity. However, except for one structure at the top of the sketch-map, structural similarities to the defined landmarks result in projections within or close to the generated sketch-map, where the corresponding inhibition efficiencies match the relative positioning to the inhibitor and promoter clusters fairly well.

The generated sketch-map can also be used to correlate the dissolution modulator structure to other properties, for instance the HOMO-LUMO gap (HL gap). The energetic difference between the highest occupied and lowest unoccupied molecular orbital (HOMO and LUMO) is indicative of the affinity of the investigated corrosion modulators to transition metals (Griffith and Orgel, 1957), where formation of these complexes is more likely with lower HL gaps as this allows for an energetically more favorable overlap of the involved orbitals. Moreover, the HL gap is a sound indicator for chemical reactivity as the stability of a molecule increases with larger HL gaps. Concomitantly, the reactivity of the dissolution modulator decreases (Aihara, 2000). Consequently, aromatic ligands (e.g., pyridine derivatives) are more likely to form complexes with transition metals (e.g., Fe, Ni) than aliphatic ligands that, in general, exhibit larger HL gaps. Hence, the HL gap might be an important parameter that has to be taken into account in future studies to adequately predict the capability of an untested compound to prevent the re-deposition of noble impurities like iron (Höche et al., 2016; Lamaka et al., 2016). The HL gaps were calculated on the TPSSh/def2SVP level of density functional theory using Turbomole 7.2 (TURBOMOLE, 2017) for each of the 80 compounds (**Figure 3**). As computing the HL gaps using the B3LYP/6-311++G\*\* level of theory that was employed for the STaGE calculations is computationally rather demanding, TPSSh/def2SVP is chosen here as a fast and accurate alternative. Comparing the optimized geometries for each functional, no structural discrepancies could be observed.

Coloring the sketch-map according to the calculated HL gaps puts further emphasis on the expected separation between aromatic and aliphatic compounds in the investigated dataset. Aromatic structures in the lower left island have rather low values of 3.2–5.3 eV whereas aliphatic compounds in the top right island correspond to rather high energy gaps of 5.5–7.4 eV. Although this outcome corroborates our current working hypothesis, further work is required to quantitatively correlate the HL gap to the inhibition efficiency of potential inhibitor molecules based on the employed sketch-map approach.
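As a small worked example, the HL gap follows directly from the two frontier orbital energies. The sketch below assumes orbital energies given in Hartree and compares the result with the two ranges reported above; the orbital energies and function names are hypothetical, and the range check is a description of the reported data, not a classifier.

```python
HARTREE_TO_EV = 27.211386  # conversion factor from Hartree to electronvolt


def hl_gap_ev(e_homo_hartree: float, e_lumo_hartree: float) -> float:
    """HOMO-LUMO gap in eV from orbital energies in Hartree."""
    return (e_lumo_hartree - e_homo_hartree) * HARTREE_TO_EV


def reported_range(gap_ev: float) -> str:
    """Name the range from the text that a gap falls into, if any."""
    if 3.2 <= gap_ev <= 5.3:
        return "range reported for aromatic compounds"
    if 5.5 <= gap_ev <= 7.4:
        return "range reported for aliphatic compounds"
    return "outside the reported ranges"


# Hypothetical frontier orbital energies in Hartree.
for homo, lumo in [(-0.24, -0.08), (-0.27, -0.03)]:
    gap = hl_gap_ev(homo, lumo)
    print(round(gap, 2), reported_range(gap))
```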

# 4. DISCUSSION

The acquired property-structure landscape in **Figure 2A** uncovers a clear relationship between inhibitor structure and inhibition efficiency, with only a few outliers in the defined corrosion inhibitor and promoter clusters. Furthermore, almost all new molecules that are projected into the sketch-map and match the landmarks in similarity are correctly positioned within or close to the defined clusters according to their corresponding inhibition efficiency. Hence, we are confident that the presented concept is suitable to predict the potential of uninvestigated corrosion inhibitors or promoters based on their resemblance to a defined landmark structure.

However, similarity values obtained using the SOAP-REMatch kernel depend strongly on the chemistry of the input structures. The direct effect can be observed in **Figure 2B**. Molecules that differ strongly in similarity (due to unusually coordinated atoms, varying atom species or an unusual number of functional groups) are positioned far away from the observed islands. The origin of this behavior lies within the SOAP-REMatch kernel, where similarity measures are computed based on the overlap of local atomic environments. Hence, comparing a relatively large molecule to a high number of relatively small molecules leads to low similarity values, and thus a large distance in high-dimensional space, given that a large cutoff radius r<sub>c</sub> is provided. Variations in the number and type of functional groups are affected by this behavior as well. For the given case, a relatively small cutoff radius r<sub>c</sub> = 3.0 Å is chosen, leading to a higher focus on local atomic bonds than on the overall molecular structure. Thus, given a significant similarity between local atomic bond networks of landmarks and projections, even large structures can be assigned to clusters of smaller molecules within the sketch-map. For similarity measures between structures containing different elements, a separate density is built for each atomic species and an overlap of differing local environments corresponds to zero (De et al., 2016). Therefore, molecules containing atomic species that differ from the ones included in the landmark structures are also more likely to be projected further away. However, here the only investigated structure containing a different atom species (phosphorus in phenylphosphonic acid) is projected directly into the cluster of aliphatic corrosion inhibitors, even though it contains an aromatic ring. A possible reason for this behavior is the low cutoff radius r<sub>c</sub> that gives greater weight to the similarity of the oxygen arrangement within the phosphonic acid functional group than to the phenyl ring. Since no other structure containing phosphorus is provided, the structure thus appears most similar to the aliphatic compounds. Following the same reasoning, the projected structure at the top of the sketch-map, as well as the landmark structure at the far left, are spaced further away from the aliphatic cluster as their local structure (arrangement of carbon and oxygen atoms) differs significantly. With respect to the proposed inhibition prediction workflow, the presented results already suggest important factors for future hydrogen evolution experiments. Accordingly, using out-of-sample embedding to find structures that match the already defined clusters, potential corrosion inhibitors or promoters can be identified. However, as the proposed inhibition prediction is to be understood as a first clue with respect to the inhibition properties, the predicted inhibition efficiency still has to be validated experimentally. To improve the prediction potential of the proposed concept, more data points from hydrogen evolution experiments are required. With an increasing number of tested compounds, the presented sketch-map can be extended by newly tested structures, thus facilitating the search for new inhibitor molecules with new properties even further. Moreover, structures projected into unexplored regions may indicate promising starting points for the discovery of novel additives with interesting inhibition properties that would not have been considered for testing otherwise.

Based on the structures of already investigated dissolution modulators within the inhibitor clusters, yet unexplored molecules can be identified that might yield promising corrosion inhibition or promotion properties. In this manner, a small number of unknown structures has been selected for testing in future hydrogen evolution experiments, comprising the sodium salt of 6-hydroxypyridine-3-carboxylic acid and quercetin (based on the aromatic cluster) as well as the sodium salt of hexanoic acid (based on the aliphatic cluster). Using out-of-sample embedding to get a first indication of the inhibition performance (see **Figure S4**), the sodium salt of 6-hydroxypyridine-3-carboxylic acid and quercetin can be identified as potential corrosion inhibitors, whereas the sodium salt of hexanoic acid is expected to promote corrosion.

Even though the proposed workflow works well for the considered data, there are still certain factors to be aware of. On the one hand, the used input structures are all geometrically optimized with an implicit solvent model which might not represent the actual molecular geometry on the surface or in coordination complexes at all. On the other hand, the large number of tunable parameters when using the SOAP-REMatch kernel and sketch-map makes it difficult to fully understand its outcome, as the fine-tuning process contains a lot of trial and error as well as visual inspection (see **Supplementary Material**). For the given case, this strategy is still reasonable as the aim of finding a property-structure relationship with respect to the inhibition efficiency, as well as predicting the inhibition performance of new compounds is accomplished. However, a comprehensive understanding of the underlying physical concepts behind the occurring inhibition mechanisms still requires further work.

Due to the data-hungry nature of most machine learning applications, like sketch-map, more input structures are desired to improve its validity and prediction abilities since the proposed inhibition prediction workflow is highly dependent on the provided experimental input data. Thus, possible outliers within the inhibition efficiencies are to be expected without a sufficient amount of data points. However, the generation of new experimental data points is limited by costly and time-consuming hydrogen evolution investigations. Experimental conditions have to be accurately defined as small discrepancies in the experimental environment of the chosen structures may already have a severe impact on the predictive performance of the generated property-structure landscape. Consequently, we aim to employ high-throughput MD or DFT computations to identify properties that correlate well with the experimentally determined inhibition efficiencies. A promising starting point for this in silico approach is the presented set of HL gaps (**Figure 3**). The separation between lower and higher energy gaps for aromatic and aliphatic compounds, respectively, matches the spatial separation due to the SOAP-REMatch kernel and sketch-map. Looking at the property-structure landscape more carefully, small point clusters within the islands can be identified that indicate some property-structure relationship. An example is given by the four data points in the far right of the sketch-map, shown with their corresponding molecular structures. The further right the structure lies within the sketch-map, the higher the HL gap of the respective compound becomes. Of course, the property-structure landscape does not allow investigations of this behavior in more detail. Nevertheless, it represents a potential relation between the molecular structure and the energy gap of the frontier orbitals that can be further examined using other measures. Hence, we are currently investigating whether the calculated HL gaps also correlate with the inhibition efficiency.

Since the molecular compounds are tested in solution, another interesting parameter is the free energy of solvation. However, no obvious relationship to the inhibition properties can be observed so far (**Figure S3A**). Consequently, future work will focus on the determination of the free energy of solvation for corrosive species (e.g., Mg or Fe ions) in a solution containing the dissolution modulators to yield more accurate, and correlatable, results with respect to the occurring inhibition mechanisms. Here, STaGE is a powerful tool to screen free energies of solvation for high numbers of molecular compounds requiring very few input parameters (Lundborg and Lindahl, 2015). Moreover, even simpler properties such as the number of certain functional moieties within an inhibitor molecule can provide a deeper insight into a potential correlation to the experimentally determined inhibition efficiency. For instance, the property-structure landscape in **Figure S3B** indicates that nitrogen plays an immediate role in the corrosion inhibition mechanism of aliphatic compounds.

In subsequent steps, by providing material and system parameters like the free energy of solvation or adsorption for inhibitor molecules, by predicting HOMO-LUMO gaps or by computing energy levels related to coordination complexes, physico-chemical entities at nano- and microscale, relevant for mathematically based system modeling, can be derived. For example, the shift in the electrochemical potential due to changes of the free energy of adsorption (Groß, 2018) or effective (ion-) transport parameters like diffusion coefficients can be calculated. Furthermore, based on the molecule data, the cluster formation and its interaction with the surface can be analyzed more accurately by molecular dynamics studies. As a consequence, more precise calculations of elemental surface coverage, concentration distributions of chemical species or averaged, system relevant surface kinetic parameters become possible and provide more profound input data for upscaled continuum corrosion models (Höche, 2015). Typically, this kind of information is experimentally difficult to access but of main interest for setting up advanced non-empirical corrosion models which are required to enhance computational corrosion and system engineering capabilities. The developed data science based concept can be applied for analyzing or even learning from corrosion simulation results by correlating simulation predictions and molecular structures.

In conclusion, it was possible to create a property-structure landscape based on the results of hydrogen evolution measurements that vividly demonstrates the relationship between corrosion inhibition efficiency and corresponding molecular structure of magnesium corrosion inhibitors. After creating a high-dimensional similarity measure with the SOAP-REMatch kernel between 74 tested compounds, the similarity matrix is reduced to a two-dimensional visualization with sketch-map, providing a reference to qualitatively predict the inhibition behavior of yet to be tested molecules. Aside from the inhibition efficiency, other properties such as the HL gap were also correlated with the inhibitor structure, matching impressively well the spatial separation into aliphatic and aromatic compounds. The predictive performance of the proposed workflow is still limited by the relatively low amount of available experimental input data. However, the discovered corrosion inhibitor and promoter clusters provide a valuable reference for inhibition prediction and identification of yet unexplored structures, thus facilitating the search for potential corrosion inhibitors and increasing the efficiency of corrosion inhibition experiments and corrosion models.

# AUTHOR'S NOTE

The datasets for this manuscript are not publicly available because they are published in a closed access journal (https://doi.org/10.1016/j.corsci.2017.07.011). Requests to access the datasets should be directed to sviatlana.lamaka@hzg.de. The sketch-maps used in this article are published on https://interactive.sketchmap.org/.

# AUTHOR CONTRIBUTIONS

TW, SL, CF, MZ, and RM: contributed the conception and design of the study. CF and GBVF: ran DFT simulations. FM: performed the statistical analysis. SL: provided experimental data. TW: wrote the first draft of the manuscript. RM, CF, GBVF, FM, and DH: wrote sections of the manuscript. All authors contributed to the manuscript revision, read and approved the submitted version.

# FUNDING

Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) Projektnummer 192346071-SFB 986. GBVF acknowledges financial support by the Austrian Research Promotion Agency (FFG) within the project PL2N A, project number 865011. FM is supported by NCCR MARVEL, funded by the Swiss National Science Foundation. MMDi IDEA project funded by HZG is gratefully acknowledged.


#### ACKNOWLEDGMENTS

Prof. D. Winkler from La Trobe University, Australia as well as Prof. M. Ceriotti from École Polytechnique Fédérale de Lausanne, Switzerland are acknowledged for discussions about setting up machine learning for magnesium dissolution modulators.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmats.2019.00053/full#supplementary-material


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Würger, Feiler, Musil, Feldbauer, Höche, Lamaka, Zheludkevich and Meißner. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# A Bayesian Framework for the Estimation of the Single Crystal Elastic Parameters From Spherical Indentation Stress-Strain Measurements

Andrew Castillo and Surya R. Kalidindi\*

*George W. Woodruff School of Mechanical Engineering, Georgia Institute of Technology, Atlanta, GA, United States*

#### Edited by:

*Ping Wu, Singapore University of Technology and Design, Singapore*

#### Reviewed by:

*Thomas R. Bieler, Michigan State University, United States*

*Yong Ni, University of Science and Technology of China, China*

> \*Correspondence: *Surya R. Kalidindi*

*surya.kalidindi@me.gatech.edu*

#### Specialty section:

*This article was submitted to Computational Materials Science, a section of the journal Frontiers in Materials*

> Received: *08 January 2019* Accepted: *27 May 2019* Published: *13 June 2019*

#### Citation:

*Castillo A and Kalidindi SR (2019) A Bayesian Framework for the Estimation of the Single Crystal Elastic Parameters From Spherical Indentation Stress-Strain Measurements. Front. Mater. 6:136. doi: 10.3389/fmats.2019.00136*

This paper presents a two-step Bayesian framework for the estimation of the intrinsic single crystal elastic stiffness parameters from the measurements of spherical indentation stress-strain responses in multiple individual grains of a polycrystalline sample, whose crystal lattice orientations have been measured using the electron backscatter diffraction technique. The first step requires the establishment of the functional dependence of the indentation elastic modulus on the lattice orientation and the intrinsic single crystal elastic stiffness parameters. Previous efforts for this step required a large database of computationally expensive finite element (FE) simulations in order to establish this function with adequate accuracy. In this paper, it is shown that the introduction of a Bayesian framework can greatly reduce the number of simulations necessary to establish this function, while introducing practically useful measures of uncertainty which can guide the selection of specific additional simulations that are expected to best improve the predictive accuracy of the function. The second step involves a Markov Chain Monte Carlo (MCMC) sampling of the distribution of possible values for the single crystal elastic stiffness parameters based on a given set of experimentally measured elastic indentation moduli in individual grains of different lattice orientations. This second step is accomplished by calibrating the available experimental data to the function established in the first step. This novel framework is presented and demonstrated in this paper for an as-cast cubic polycrystalline Fe-3% Si sample and a hexagonal polycrystalline commercially pure titanium (CP-Ti) sample.

Keywords: Bayesian inference, Monte Carlo simulation, single crystal elastic constants, design of experiments (DoE), parameter uncertainties, spherical nanoindentation

# INTRODUCTION

Continued development and application of physics-based multiscale materials models is largely hampered by the lack of protocols for reliably estimating the intrinsic material properties at the microscale (e.g., the grain-scale properties in modeling of polycrystalline materials). In recent years, instrumented indentation techniques have been demonstrated to be capable of providing consistent and reliable measurements at the lower length scales (up to submicron length scales) (Vlassak and Nix, 1994; Uchic et al., 2004; Pathak and Kalidindi, 2015; Weaver et al., 2016; Khosravani et al., 2018). Although small-scale mechanical measurements are now quite reliable, it has not been a straightforward process to extract the intrinsic material properties from such measurements. As specific examples, one would hope to estimate the values of the single crystal elastic constants and the critical resolved shear strengths from the instrumented nanoindentation measurements. Reliable and robust protocols for addressing this gap are emergent (Sánchez-Martín et al., 2014; Priddy, 2016; Patel and Kalidindi, 2017).

Currently employed strategies for extracting intrinsic material properties from indentation tests have generally involved the calibration of physics-based finite element (FE) models of these tests to the corresponding set of experimental measurements (Bhattacharya and Nix, 1988; Zambaldi et al., 2012; Patel et al., 2014; Priddy, 2016). In this regard, it has been pointed out in recent work (Patel and Kalidindi, 2017) that these protocols are much more robust when the calibration is attempted in the form of the normalized indentation stress-strain curves as opposed to directly matching the load-displacement curves. This is mainly because the initial elastic response and the elastic-plastic transition occur over a very short early portion of the load-displacement curve that is not easily identified and isolated, resulting in a very high sensitivity of the extracted values of the intrinsic material properties to small changes in the calibration procedures.

The calibration of the FE simulated indentation stress-strain curves to the experimentally measured indentation stress-strain curves for any selected material system essentially involves solving an inverse problem. In other words, the guessed values of the intrinsic material properties of interest become inputs to the FE simulations. Typically, one has to search over a large multidimensional space to find the best-fits between the FE predictions and the measurements. The main challenge comes from the high computational expense of FE simulations of the indentation experiments. It should be noted that establishing each data point on the FE predicted indentation stress-strain curve needs the simulation of a suitable unloading segment (Patel and Kalidindi, 2017), and this drives up the cost of the simulation significantly. Given all of the complexity described, the only logical path forward is to establish a reduced-order model for the FE simulations of the indentation test, and to use the reduced-order model in solving the inverse problem described above. In recent work (Patel et al., 2014), we have formalized this approach as a two-step process: (1) establishing a reduced-order model calibrated to FE simulations of indentations that takes the relevant intrinsic material properties as inputs and predicts indentation properties (defined suitably on an indentation stress-strain curve), and (2) the extraction of the intrinsic material properties from the available measurements (typically performed on grains of different orientations in a polycrystalline sample) through calibration with the reduced-order model established in step (1). The second step described above typically involves the solution to an optimization problem (i.e., minimizing the difference between the measurements and the predictions from the reduced-order model). The viability of this two-step protocol for extracting the values of the single crystal elastic constants and the critical resolved shear strengths in Fe-3%-Si has been demonstrated in recent work (Patel et al., 2014; Patel and Kalidindi, 2017).

The main difficulty with the two-step protocol described above lies in building the reduced-order model [i.e., step (1)]. Because of the need to cover a large space (for example, for extracting single crystal elastic constants, the input space of interest is the product space spanning all combinations of the single crystal elastic constants, C11, C12, C44, and all possible grain orientations), one needs to generate a large amount of FE simulation data in order to establish a high-fidelity reduced-order model. The difficulty of this task is amplified significantly in dealing with hcp crystals, where the number of intrinsic properties is significantly larger (for example, modeling the elastic deformation in hcp crystals requires specification of five independent single crystal elastic constants). In prior work (Patel et al., 2014), the reduced-order models were built using standard regression approaches. Although these regression approaches produced excellent results, they do not scale well to problems with larger numbers of intrinsic properties (because of the need to generate a large amount of data spanning the entire input domain).

The primary goal of this paper is to demonstrate the utility of Bayesian strategies for (i) optimizing the reduced-order model building effort involved in step (1), and (ii) providing estimates of the desired intrinsic material parameters (specifically, the single crystal elastic constants) with uncertainty measures from available experimental data (spherical indentation measurements). Toward these goals, we will develop and present a Bayesian inference framework for both steps of the two-step protocol described above. Bayesian inference has been instrumental in model-building tasks with limited amounts of data (MacKay, 1998; Rasmussen, 1999; Huan and Marzouk, 2013; Gelman, 2014). The adoption of a Bayesian inference framework for the extraction of the intrinsic material properties from indentation measurements offers the following main advantages: (i) it is expected to dramatically reduce the number of FE simulations needed to produce the reduced-order model generated in step (1), and (ii) it provides a much more rigorous quantification of the uncertainty in the estimates of the intrinsic material properties obtained in step (2), while accounting for the uncertainty in the measurements as well as other sources. In this paper, we first develop the framework, and subsequently demonstrate its application to the extraction of single crystal elastic properties in selected cubic and hexagonal metals.

# NEW BAYESIAN INFERENCE FRAMEWORK FOR THE ESTIMATION OF INTRINSIC MATERIAL PROPERTIES FROM INDENTATION MEASUREMENTS

Let **c** denote the set of intrinsic material properties to be established. For cubic crystals, this represents the set of three elastic constants, i.e., **c** = {C11, C12, C44}. Let **P** denote an available set of observations of the indentation properties corresponding to the set of crystal orientations **G**. This set of observations could come from either FE simulations or physical experiments. We shall note the source of the data using the subscripts sim and exp on these variables. Furthermore, in the notation employed in this paper, a set of values for a variable is denoted by a bold uppercase symbol, while an individual element of the set is denoted by its non-bold counterpart. As an example, a single value of the indentation property will be denoted by P. A collection of variables is also denoted by bold symbols. As an example, a single crystal orientation is denoted by **g**, as it comprises a set of three Bunge-Euler angles (Bunge, 1993), whereas a collection of grain orientations is represented by **G**. Likewise, a single set of intrinsic material parameters is denoted by **c**, whereas a collection of the sets of intrinsic material parameters is denoted by **C**. Employing this notation, the central tasks in the two-step protocol developed in this work are the following:

1. Establish the reduced-order model $P = \hat{P}(\mathbf{c},\mathbf{g})$ from the simulation data {**P**<sub>sim</sub>, **C**<sub>sim</sub>, **G**<sub>sim</sub>}.
2. Estimate **c** for a given sample from the experimental data {**P**<sub>exp</sub>, **G**<sub>exp</sub>} using the reduced-order model established in the first task.


Prior experimental work (Pathak et al., 2009) in single-phase polycrystalline metals has focused on exploring the dependence of indentation modulus on the lattice orientation of the indented grains (i.e., individual crystals). These findings were verified by suitable FE simulations (Patel et al., 2014). Recently, a reduced-order model which captures the dependence of indentation modulus on both orientation and an arbitrary set of intrinsic material parameters has been established from FE simulations. The mathematical form of the reduced-order model for the present application is adopted from this prior work (Patel et al., 2014) as

$$P = \hat{P}(\mathbf{c,g}) \approx \sum\_{l=0}^{L} \sum\_{m=1}^{M(l)} \sum\_{\mathbf{q}}^{\mathbf{Q}} A\_l^{mq} \mathbf{K}\_l^m \left(\mathbf{g}\right) \tilde{P}^{\mathbf{q}}(\overline{\mathbf{c}}) \tag{1}$$

$$\overline{c}\_{j} = \frac{2c\_{j} - c\_{j}^{min} - c\_{j}^{max}}{c\_{j}^{max} - c\_{j}^{min}} \tag{2}$$

where $\mathbf{K}\_l^m(\mathbf{g})$ denote the symmetrized surface spherical harmonics basis over the relevant orientation space of interest, and $\tilde{P}^{\mathbf{q}}(\overline{\mathbf{c}})$ denote a multivariate Legendre polynomial product basis. In other words, one can express $\tilde{P}^{\mathbf{q}}(\overline{\mathbf{c}}) = P\_{q\_1}(\overline{c}\_1) P\_{q\_2}(\overline{c}\_2) \ldots P\_{q\_R}(\overline{c}\_R)$, where $\mathbf{q} = (q\_1, q\_2, \ldots, q\_R)$ forms a multi-index array, each element of which is a nonnegative integer allowed to vary from 0 to the selected maximum degree, Q, i.e., $q\_j \in [0, Q]$. The use of Legendre polynomials provides an orthonormal basis over the range [−1, 1], to which each of the elastic constants is rescaled in accordance with Equation (2), where $c\_j^{max}$ and $c\_j^{min}$ are the maximum and minimum values of the j-th elastic constant under consideration. In Equation (1), m and l index the surface spherical harmonic basis, where M(l) enumerates the spherical harmonics that implicitly reflect the crystal symmetries of interest (Bunge, 1993; Adams et al., 2012). The integers Q and L denote the truncation levels adopted in the use of Equation (1). It is emphasized here that the model form used in Equation (1) denotes a Fourier representation using an orthonormal basis that has been previously shown to produce compact representations for mechanical responses of crystalline solids (Proust and Kalidindi, 2006; Knezevic et al., 2008; Patel et al., 2014; Yabansu et al., 2014; Yabansu and Kalidindi, 2015; Patel and Kalidindi, 2017). One of the central features of a Fourier representation is that the Fourier coefficients $A\_l^{mq}$ are completely independent of each other. The goal of the reduced-order modeling task here is to estimate the values of $A\_l^{mq}$, expressed in a vector notation as **A**, from the sparse amount of available data, as it is being generated from the expensive FE simulations. Even more importantly, our goal is to drive the model building in an optimal way by identifying the specific set of inputs for the next FE simulation such that it maximizes the improvement to the reduced-order model being built.
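To make the rescaling in Equation (2) and the Legendre product basis concrete, the following Python sketch evaluates every product term for one set of cubic elastic constants. It is only an illustration under stated assumptions: the function names (`rescale`, `legendre`, `legendre_product_basis`) are hypothetical, the bounds are the ranges quoted later for the cubic case study, the sample point is arbitrary, and the standard (unnormalized) Legendre polynomials are used, so an additional normalization factor would be needed for a strictly orthonormal basis. The spherical harmonics part of Equation (1) is not covered.

```python
from itertools import product


def rescale(c, c_min, c_max):
    """Map c in [c_min, c_max] to [-1, 1], following Equation (2)."""
    return (2.0 * c - c_min - c_max) / (c_max - c_min)


def legendre(n, x):
    """Standard (unnormalized) Legendre polynomial P_n(x) via Bonnet's recurrence."""
    p_prev, p_curr = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        # (k + 1) P_{k+1}(x) = (2k + 1) x P_k(x) - k P_{k-1}(x)
        p_prev, p_curr = p_curr, ((2 * k + 1) * x * p_curr - k * p_prev) / (k + 1)
    return p_curr


def legendre_product_basis(c, bounds, max_degree):
    """Evaluate P_q1(c1) * ... * P_qR(cR) for every multi-index q with q_j in [0, max_degree]."""
    c_bar = [rescale(cj, lo, hi) for cj, (lo, hi) in zip(c, bounds)]
    basis = {}
    for q in product(range(max_degree + 1), repeat=len(c)):
        value = 1.0
        for qj, xj in zip(q, c_bar):
            value *= legendre(qj, xj)
        basis[q] = value
    return basis


# Assumed inputs: bounds in GPa for (C11, C12, C44); the evaluation point is arbitrary.
bounds = [(50.0, 250.0), (40.0, 150.0), (15.0, 120.0)]
basis = legendre_product_basis([150.0, 95.0, 120.0], bounds, max_degree=2)
```

With three elastic constants and a maximum degree of 2, the dictionary holds 27 terms, one per multi-index **q**, which shows how quickly the number of unknown coefficients grows with Q and with the number of intrinsic parameters.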

#### Building the Reduced-Order Model

The reduced-order model [see Equation (1)] needs to be built such that it makes good predictions for the indentation modulus over a large domain of input parameters (**c**,**g**). Given the large domain of the input parameters (e.g., covering the range of values for the three independent parameters defining cubic elasticity and the two independent parameters defining the indentation direction in the crystal reference frame) and the high cost of executing an FE simulation for generating each data point, it is highly desirable to explore Bayesian regression approaches for estimating the unknown Fourier coefficients in Equation (1). Let the corresponding sets of intrinsic parameters, **c**, used as inputs to simulations be denoted as **C**<sub>sim</sub>. The data generated from FE simulations will be denoted {**P**<sub>sim</sub>, **C**<sub>sim</sub>, **G**<sub>sim</sub>} following the notation introduced earlier.

Bayesian approaches treat model parameters [e.g., Fourier coefficients in Equation (1)] as stochastic variables exhibiting a distribution of values. Most importantly, Bayes' theorem allows one to update the distributions for the model parameters given new data (i.e., observations) and is commonly expressed as

$$P(A|D) = \frac{P(D|A)P(A)}{P(D)}\tag{3}$$

where P(A) denotes the prior belief (expressed as a distribution) on the values of the unknown model parameters, P(D|A) denotes the likelihood of sampling the observations D for specified values of the model parameters, and P(A|D) denotes the posterior (updated) belief on the values of the unknown model parameters given the observations D. The denominator P(D) in Equation (3) is generally referred to as the probability of the evidence, and is often difficult to establish. However, it mainly serves as a normalization factor for the posterior distribution. Since the distributions are often defined with known normalization factors, it is often possible to skip the evaluation of P(D) in practical implementations of Bayes' rule described in Equation (3) (Box, 1973).

It is expedient to treat the distributions associated with all the stochastic variables in Equation (3) as normal (i.e., Gaussian) distributions. As a specific example, the ith observed value of the indentation modulus is modeled as being generated from a deterministic model, with added stochastic noise, as

$$P\_i = \hat{P}\_i\left(\mathbf{A}, \mathbf{c}\_i, \mathbf{g}\_i\right) + \epsilon\_i, \; \epsilon\_i \sim \mathcal{N}(\mathbf{0}, \beta^{-1})\tag{4}$$

where $\mathcal{N}(0, \beta^{-1})$ denotes a normal distribution with a zero mean and a variance of $\beta^{-1}$. Note that the stochastic noise is assumed to be independent of location in the parameter space, i.e., homoscedastic. The likelihood for a set of N independently observed indentation moduli can be established using the product rule as

$$p\left(\mathbf{P}\_{sim}|\mathbf{A},\mathbf{C}\_{sim},\mathbf{G}\_{sim},\beta\right) = \prod\_{i=1}^{N} p\left(P\_i|\mathbf{A},\mathbf{c}\_i,\mathbf{g}\_i,\beta\right) \tag{5}$$

As noted earlier, the model parameters **A** are also treated as stochastic variables. The prior belief on these variables is assumed to be specified by a normal distribution with a zero mean and a large variance of $\alpha^{-1}$ as

$$p\left(\mathbf{A}|\alpha\right) = \mathcal{N}\left(\mathbf{A}|\mathbf{0}, \alpha^{-1}\mathbf{I}\right) \tag{6}$$

The application of Bayes' rule [Equation (3)] to the problem at hand results in

$$p\left(\mathbf{A}|\mathbf{P}\_{sim},\mathbf{C}\_{sim},\mathbf{G}\_{sim},\alpha,\beta\right) = \frac{p\left(\mathbf{P}\_{sim}|\mathbf{A},\mathbf{C}\_{sim},\mathbf{G}\_{sim},\beta\right) p\left(\mathbf{A}|\alpha\right)}{p\left(\mathbf{P}\_{sim}|\mathbf{C}\_{sim},\mathbf{G}\_{sim},\alpha,\beta\right)}\tag{7}$$

where $p\left(\mathbf{A}|\mathbf{P}\_{sim},\mathbf{C}\_{sim},\mathbf{G}\_{sim},\alpha,\beta\right)$ denotes the posterior (updated) distribution on the model parameters. The denominator in Equation (7) reflects the probability of the observed outcomes irrespective of the model parameters **A** chosen, and can be described by the marginalization of the likelihood with respect to the model parameters as

$$p\left(\mathbf{P}\_{sim}|\mathbf{C}\_{sim},\mathbf{G}\_{sim},\alpha,\beta\right) = \int\_{\mathbf{A}} p\left(\mathbf{P}\_{sim}|\mathbf{A},\mathbf{C}\_{sim},\mathbf{G}\_{sim},\beta\right) p\left(\mathbf{A}|\alpha\right) d\mathbf{A} \tag{8}$$

In a fully Bayesian approach, the precision parameters, α, β, may also be treated as stochastic variables (Gelman, 2004). This allows for a separate application of Bayes' theorem expressed as

$$p\left(\alpha,\beta|\mathbf{P}\_{sim},\mathbf{C}\_{sim},\mathbf{G}\_{sim}\right) \propto p\left(\mathbf{P}\_{sim}|\mathbf{C}\_{sim},\mathbf{G}\_{sim},\alpha,\beta\right) p\left(\alpha,\beta\right) \tag{9}$$

Alternately, one can use point estimates from the maximization of the likelihood in Equation (9), denoted as $\hat{\alpha}, \hat{\beta}$. This is equivalently interpreted as the maximization of the evidence of the observed data in Equation (8) (MacKay, 1992b). With this approach, the posterior distributions of model coefficients in Equation (7) can be solved analytically (while assuming normal distributions for the various variables involved) (MacKay, 1992a, 1996; Christopher, 2006). The updated posterior distribution computed using the approach described above is generally expected to be sharper (i.e., lower variance) compared to the prior belief.
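The analytical posterior mentioned above can be illustrated with a minimal Python sketch based on the standard Gaussian results for a linear-in-parameters model with fixed precisions: the posterior covariance is S = (αI + βΦᵀΦ)⁻¹ and the posterior mean is βSΦᵀt, where Φ collects the basis function values and t the observations. This is not the authors' MATLAB implementation. The two-term basis [1, x], the toy data, and the names `posterior`, `phi_rows`, and `targets` are assumptions made for illustration, and the precisions are simply fixed here rather than obtained by evidence maximization.

```python
def posterior(phi_rows, targets, alpha, beta):
    """Posterior mean and covariance for two coefficients of a linear-in-parameters model.

    Assumes the prior N(0, alpha^-1 I) and Gaussian noise N(0, beta^-1),
    with both precisions held fixed.
    """
    # Posterior precision matrix: alpha * I + beta * Phi^T Phi (2 x 2, symmetric).
    a = alpha + beta * sum(r[0] * r[0] for r in phi_rows)
    b = beta * sum(r[0] * r[1] for r in phi_rows)
    d = alpha + beta * sum(r[1] * r[1] for r in phi_rows)
    det = a * d - b * b
    cov = [[d / det, -b / det], [-b / det, a / det]]
    # Posterior mean: beta * S * Phi^T t.
    v0 = beta * sum(r[0] * t for r, t in zip(phi_rows, targets))
    v1 = beta * sum(r[1] * t for r, t in zip(phi_rows, targets))
    mean = [cov[0][0] * v0 + cov[0][1] * v1, cov[1][0] * v0 + cov[1][1] * v1]
    return mean, cov


# Toy data generated from t = 1 + 2x (hypothetical, noise-free for clarity).
xs = [-1.0, 0.0, 1.0]
phi_rows = [[1.0, x] for x in xs]
targets = [1.0 + 2.0 * x for x in xs]
alpha, beta = 1e-6, 100.0  # broad prior (large variance 1/alpha), small noise variance 1/beta
mean, cov = posterior(phi_rows, targets, alpha, beta)
```

In this toy case the posterior variances on the diagonal of `cov` are far smaller than the prior variance 1/α, which is the sharpening of the posterior relative to the prior that the text describes.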

Of course, the available observations may not produce a posterior distribution that is sharp enough (i.e., the uncertainty associated with the posterior may still be too high for a given application). In such cases, one needs to examine carefully where one should produce additional data points (i.e., new observations) in order to maximize the sharpening of the posterior distributions. The general approach to solving this problem (i.e., identifying the new data points exhibiting the maximum potential for improving the model accuracy and reliability) involves making predictions for new inputs, and identifying the specific inputs that exhibit the highest variance (i.e., uncertainty) in their predictions as the locations where new observations should be generated (MacKay, 1992a; Atkinson, 2007). This kind of rational approach for deciding where to generate new data points is critical for situations where data generation is expensive (as is the case with the FE simulations of the spherical indentation for the present case study). The predictions for new inputs are obtained by the marginalization over the posterior distribution of the model parameters as

$$p\left(P|\mathbf{c},\mathbf{g},\mathbf{P}\_{sim},\mathbf{C}\_{sim},\mathbf{G}\_{sim},\hat{\alpha},\hat{\beta}\right) = \int\_{\mathbf{A}} p\left(P|\mathbf{A},\mathbf{c},\mathbf{g},\hat{\beta}\right) p\left(\mathbf{A}|\mathbf{P}\_{sim},\mathbf{C}\_{sim},\mathbf{G}\_{sim},\hat{\alpha},\hat{\beta}\right) d\mathbf{A} \tag{10}$$

where (**c**,**g**) denote the new inputs. Therefore, the specific set of inputs that exhibits the highest variance for the prediction can be readily identified. Once this set of inputs is identified and the corresponding FE simulation is performed, the next step is to update the distribution of model coefficients with the newly acquired observation. This update is natural in a Bayesian framework in the sense that any knowledge acquired previously can be incorporated through the prior as

$$p\_{N+1}\left(\mathbf{A}|\mathbf{P}\_{sim},\mathbf{C}\_{sim},\mathbf{G}\_{sim},\hat{\alpha},\hat{\beta}\right) \propto p\_{N+1}\left(\mathbf{P}\_{sim}|\mathbf{A},\mathbf{C}\_{sim},\mathbf{G}\_{sim},\hat{\beta}\right) p\_N\left(\mathbf{A}|\mathbf{P}\_{sim},\mathbf{C}\_{sim},\mathbf{G}\_{sim},\hat{\alpha},\hat{\beta}\right) \tag{11}$$

The posterior distribution of the parameters can continually be updated as incoming data is sequentially added by setting the prior as the previously inferred posterior distribution of model coefficients as shown in Equation (11). Updates to the posterior distribution of model coefficients are performed until sufficient model convergence and prediction performance is attained. Model convergence is determined through the change in values of the model coefficients and parameters as data is added. Model performance is evaluated through various error metrics such as the leave-one-out-cross-validation (LOOCV) error (MacKay, 1992a; Christopher, 2006). Building the reduced-order model and critically evaluating its reliability and robustness completes the first step of the two-step protocol. It should be noted that this is intended to be performed only once for a given class of materials.
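The sequential design loop described above (predict, pick the input with the highest predictive variance, run the simulation, update the posterior) can be sketched in a few lines of Python. This is a toy illustration under stated assumptions rather than the procedure used in the paper: it uses a two-coefficient linear model with basis [1, x], fixed precisions, and a hypothetical `expensive_simulation` function standing in for an FE run. The update uses the standard Gaussian identities in precision form, and the predictive variance is 1/β + φᵀSφ.

```python
def add_observation(precision, h, phi, t, beta):
    """Use the current posterior as the prior and fold in one observation, as in Equation (11).

    precision is the inverse posterior covariance; h equals precision times the posterior mean.
    """
    new_precision = [[precision[i][j] + beta * phi[i] * phi[j] for j in range(2)]
                     for i in range(2)]
    new_h = [h[i] + beta * phi[i] * t for i in range(2)]
    return new_precision, new_h


def covariance(precision):
    """Invert the 2 x 2 precision matrix."""
    (a, b), (c, d) = precision
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def predictive_variance(phi, cov, beta):
    """Variance of the prediction at a new input: noise variance plus parameter uncertainty."""
    quad = sum(phi[i] * cov[i][j] * phi[j] for i in range(2) for j in range(2))
    return 1.0 / beta + quad


def features(x):
    return [1.0, x]


def expensive_simulation(x):
    """Hypothetical stand-in for an FE simulation."""
    return 1.0 + 2.0 * x


alpha, beta = 1.0, 25.0
precision, h = [[alpha, 0.0], [0.0, alpha]], [0.0, 0.0]

# Initial design: three inputs clustered near x = 0.1.
for x in [0.0, 0.1, 0.2]:
    precision, h = add_observation(precision, h, features(x), expensive_simulation(x), beta)

candidates = [0.0, 0.5, 1.0]
picked = []
variance_at_one_before = predictive_variance(features(1.0), covariance(precision), beta)
for _ in range(2):
    cov = covariance(precision)
    # Choose the unseen input whose prediction is currently most uncertain.
    x_next = max(candidates, key=lambda x: predictive_variance(features(x), cov, beta))
    candidates.remove(x_next)
    precision, h = add_observation(precision, h, features(x_next),
                                   expensive_simulation(x_next), beta)
    picked.append(x_next)
variance_at_one_after = predictive_variance(features(1.0), covariance(precision), beta)
```

In this toy setting the first input selected is the one farthest from the initial design, and the predictive variance at that input drops once its observation has been folded into the posterior.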

#### Estimating Intrinsic Material Properties From Indentation Measurements

For the second step of the protocol, our goal is to employ the reduced-order model built in the first step together with indentation measurements obtained from a given sample to estimate its intrinsic material properties. Let **P**exp, **G**exp denote such experimental measurements. The posterior distribution for the intrinsic material properties can be sampled from yet another application of the Bayes' rule as

$$p\left(\mathbf{c}|\mathbf{A},\mathbf{P}\_{exp},\mathbf{G}\_{exp},\sigma\right) \propto p\left(\mathbf{P}\_{exp}|\mathbf{A},\mathbf{c},\mathbf{G}\_{exp},\sigma\right) p\left(\mathbf{c}\right) \tag{12}$$

where **A** denotes the parameters in the reduced-order model built in the first step. Although point estimates can be obtained by maximizing the likelihood in Equation (12), in the spirit of building a robust framework capable of accounting for various sources of uncertainty, we have decided to pursue the computation of the posterior distribution on the intrinsic material properties through sampling techniques. In order to sample from the posterior distribution defined in Equation (12), we need to establish the likelihood of the set of experimental observations. A likelihood can be constructed by assuming that the experimental observations (i.e., data points) are independent and normally distributed, i.e., the experimental data points are observations drawn from normal distributions with means estimated by the reduced-order model and variances, σ, estimated from the experimental data of the measured indentation property at M grain orientations (Fisher and Renken, 1964; Box, 1973; Bates and Campbell, 2001; Ferraioli et al., 2012). This likelihood is expressed as

$$p\left(\mathbf{P}\_{exp}|\mathbf{A},\mathbf{c},\mathbf{G}\_{exp},\sigma\right) = \prod\_{i=1}^{M} \mathcal{N}\left(P\_{i,exp}|\hat{P}\left(\mathbf{A},\mathbf{c},\mathbf{g}\_{i,exp}\right),\sigma\_i\right) \tag{13}$$
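As a small sketch of how Equation (13) might be evaluated in practice, the Python function below sums the logarithms of the individual normal densities. Working in log space is a choice made here to avoid numerical underflow of the product and is not stated in the text. The function name, the argument names, and the numbers in the example are hypothetical, and each σ<sub>i</sub> is treated as a variance, following the wording above.

```python
import math


def log_likelihood(measured, predicted, variances):
    """Logarithm of the product of normal densities in Equation (13).

    measured  : experimentally measured indentation moduli, one per grain
    predicted : reduced-order model predictions for the same grain orientations
    variances : measurement variance for each grain (assumed interpretation of sigma_i)
    """
    total = 0.0
    for p, p_hat, var in zip(measured, predicted, variances):
        total += -0.5 * math.log(2.0 * math.pi * var) - 0.5 * (p - p_hat) ** 2 / var
    return total


# Made-up values in GPa, for illustration only.
measured = [200.0, 215.0, 190.0]
variances = [4.0, 9.0, 4.0]
close_fit = log_likelihood(measured, [201.0, 214.0, 191.0], variances)
poor_fit = log_likelihood(measured, [220.0, 230.0, 170.0], variances)
```

A candidate set of elastic constants whose predictions sit closer to the measurements, relative to the measurement variances, receives the higher log-likelihood, which is what drives the sampling in the second step.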

The evaluation of the likelihood described in Equation (13) is performed using the reduced-order model, $\hat{P}(\mathbf{A},\mathbf{c},\mathbf{g})$, built in the first step of the two-step protocol. In this work, the sampling from the posterior distribution of intrinsic material parameters [Equation (12)] is accomplished using Markov Chain Monte Carlo (MCMC). The goal of MCMC is to generate a Markov chain whose samples approximate draws from the posterior distribution of interest, provided the number of samples drawn is very large. The Markov chain is generated by the acceptance and rejection of a large number of transitions through the space of intrinsic material parameters based on an acceptance probability. In practice, a class of algorithms has been developed to define these transitions; these are referred to as Metropolis-Hastings algorithms (Gelman, 2004). In this work, Single Component Metropolis Hastings (SCMH) is applied, which considers component-wise transitions (Haario et al., 2005). In the algorithm below, for a given step t, partial updates are performed for the sample **c**<sub>t</sub> for each component j until all components are updated.

The basic steps for the implementation of the SCMH algorithm are as follows:


1. Initialize the chain with a starting sample, $\mathbf{c}\_0$.

2. Generate a proposed transition, $\mathbf{c}^\*$, for component j

$$\mathbf{c}^\* \sim q\_j\left(\mathbf{c}|\mathbf{c}\_t\right)$$

where $q\_j(\*)$ proposes $\mathbf{c}^\*$ differing from $\mathbf{c}\_t$ only in component j, which is sampled from a normal distribution with mean $c\_t^j$ and variance $v\_j^2$

$$c^{\*j} \sim \mathcal{N}\left(c^j|c\_t^j, v\_j^2\right)$$

3. Calculate the acceptance probability of the transition, $\alpha(\*)$

$$\alpha\left(\mathbf{c}^\*|\mathbf{c}\_t\right) = \min\left(1, \frac{p\left(\mathbf{c}^\*|\mathbf{A},\mathbf{P}\_{exp},\mathbf{G}\_{exp},\sigma\right) q\_j\left(\mathbf{c}\_t|\mathbf{c}^\*\right)}{p\left(\mathbf{c}\_t|\mathbf{A},\mathbf{P}\_{exp},\mathbf{G}\_{exp},\sigma\right) q\_j\left(\mathbf{c}^\*|\mathbf{c}\_t\right)}\right) = \min\left(1, \frac{p\left(\mathbf{P}\_{exp}|\mathbf{A},\mathbf{c}^\*,\mathbf{G}\_{exp},\sigma\right) p\left(\mathbf{c}^\*\right)}{p\left(\mathbf{P}\_{exp}|\mathbf{A},\mathbf{c}\_t,\mathbf{G}\_{exp},\sigma\right) p\left(\mathbf{c}\_t\right)}\right)$$

4. Update the chain (accept/reject the proposed transition)

a. Draw a sample, r, from a standard uniform distribution

b. If α > r, set $\mathbf{c}\_t = \mathbf{c}^\*$; otherwise $\mathbf{c}\_t$ is retained

5. Repeat steps (2–4) until all components of $\mathbf{c}\_t$ are updated, then proceed to a new step.

While the probability of a proposed transition is described by the proposal distribution $q\_j(\*)$, the probability of accepting the transition is given by $\alpha(\*)$. By assuming a flat prior for p(**c**), the acceptance probability of a proposed transition is completely specified by the posterior probability of the states, evaluated to within a normalizing constant using Equation (13) (Chib and Greenberg, 1995; Gelman, 2004). The variances of the proposal distributions, $v\_j^2$, are tuned during the "burn-in" period in order to reach an acceptance rate of ∼0.23. An acceptance rate of around 0.23 has been shown to provide efficient convergence of the Markov chain for Gaussian posteriors (Roberts et al., 1997). All of the computations described above were realized using functions readily available in MATLAB (2016).
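The SCMH steps listed above can be sketched in Python as follows. This is a minimal illustration and not the MATLAB implementation used in the paper: the target `toy_log_post` is a hypothetical two-parameter stand-in for the posterior of Equations (12) and (13), with a flat prior on an assumed box and a Gaussian likelihood, the proposal standard deviations are fixed by hand rather than tuned toward an acceptance rate near 0.23, and the comparison α > r is carried out in log space for numerical robustness.

```python
import math
import random


def scmh(log_post, start, step_sizes, n_steps, rng):
    """Single-component Metropolis-Hastings with symmetric normal proposals.

    log_post   : function returning the log of the unnormalized posterior density
    start      : starting sample; it must have finite log-posterior
    step_sizes : proposal standard deviations v_j, one per component
    """
    current = list(start)
    current_lp = log_post(current)
    chain, accepted = [], 0
    for _ in range(n_steps):
        for j in range(len(current)):
            # Step 2: propose a change in component j only.
            proposal = list(current)
            proposal[j] = rng.gauss(current[j], step_sizes[j])
            proposal_lp = log_post(proposal)
            # Step 3: the symmetric proposal terms cancel, leaving the posterior ratio.
            log_alpha = min(0.0, proposal_lp - current_lp)
            # Step 4: accept if alpha > r, with r drawn from a standard uniform distribution.
            if math.log(1.0 - rng.random()) < log_alpha:
                current, current_lp = proposal, proposal_lp
                accepted += 1
        # Step 5: all components updated, record the sample for this step.
        chain.append(list(current))
    return chain, accepted / (n_steps * len(current))


def toy_log_post(c):
    """Hypothetical target: flat prior on [0, 10]^2 times an independent Gaussian likelihood."""
    if not all(0.0 <= cj <= 10.0 for cj in c):
        return float("-inf")
    return -0.5 * ((c[0] - 3.0) ** 2 / 0.25 + (c[1] - 7.0) ** 2 / 1.0)


chain, rate = scmh(toy_log_post, [5.0, 5.0], [1.0, 2.0], 5000, random.Random(0))
kept = chain[1000:]  # discard an assumed burn-in of 1,000 steps
mean_0 = sum(s[0] for s in kept) / len(kept)
mean_1 = sum(s[1] for s in kept) / len(kept)
```

The returned acceptance rate `rate` is the quantity one would monitor while adjusting `step_sizes` during burn-in. Proposals that fall outside the flat prior's support have zero posterior density and are always rejected.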

# CASE STUDY: CUBIC POLYCRYSTALS

#### Problem Statement

For our first case study, we revisit the extraction of the single crystal elastic constants {C11, C12, C44} of the bcc metal Fe-3% Si, which was previously attempted using standard regression techniques. In the previous study (Patel et al., 2014), a total of 2,286 simulations were needed to establish a high-fidelity reduced-order model in the first step of the two-step protocol. The simulated database consisted of the indentation modulus corresponding to 300 distinct sets of cubic stiffness constants within the domain 50 GPa ≤ C<sub>11</sub> ≤ 250 GPa, 40 GPa ≤ C<sub>12</sub> ≤ 150 GPa, and 15 GPa ≤ C<sub>44</sub> ≤ 120 GPa across 9 orientations selected within the fundamental zone of the relevant orientation space (Patel et al., 2014). We note that these ranges encompass a vast number of cubic metals (Simmons, 1971). It is anticipated that the proposed Bayesian framework will need significantly fewer FE simulations to adequately capture the FE predicted indentation modulus within the same parameter space in a robust reduced-order model.

#### Model Building Process

The Bayesian model building process enables sequential design strategies through the identification of high value simulations which will best improve the predictive capability of the model. Since a database of simulations is already available, simulations are treated as "unseen" and are sampled based on the determined utility of performing the simulation. Before beginning the sequential design process, an initial set of simulations must be performed to establish an initial model.

For the present study, a set of 123 FE simulations was selected from the previously performed 2,286 simulations as this initial set. This initial set was selected to correspond to the boundaries of the intrinsic material parameter space. Following initialization, the reduced-order model in Equation (1) was considered with different truncation levels of L = 8, 10, 12 for the symmetrized surface spherical harmonics (differently shaped symbols in **Figure 1**) and Q = 1, 2, 3 for the maximal degree of the respective Legendre polynomials (Bunge, 1993) (different colors of symbols in **Figure 1**). The truncation levels of the reduced-order model can be treated as hyperparameters, and must be selected so as to produce the most robust and accurate reduced-order models. Leave-one-out cross-validation (LOOCV) was performed at various times during the update process, and the resulting error is plotted in **Figure 1** for the different truncation levels considered.

dashed line indicates 300 training points.

There is clear improvement in cross-validation error up to truncation levels Q = 2, L = 10, with little improvement for higher truncation levels. The plots in this figure also provide guidance on where to stop the model building effort (i.e., when there is no appreciable improvement in the accuracy of the reduced-order model being built). In addition to the LOOCV, the norm of the vector of model coefficients at each update step (see **Figure 2B**) and the angular difference of the vector of model coefficients from the previous update step (see **Figure 2A**) were taken into consideration in determining when to stop the model building effort. Based on these considerations (see **Figures 1**, **2A,B**), it was decided to stop the model building effort after using 300 training points (this includes the set of 123 training points used for initialization). The predictive accuracy of the reduced-order model for the remaining FE simulations (i.e., 2,286 − 300 = 1,986) is presented in **Figure 3** as a parity plot. The resulting mean absolute prediction error of the reduced-order model was found to be 2.16 GPa (see **Figure 3**), while the LOOCV error was found to be 2.22 GPa (see **Figure 1**). This is comparable to previous efforts based on standard regression techniques and utilizing the full database of 2,286 FE simulations, where the LOOCV error was reported to be in the range of 2–2.5 GPa (Patel et al., 2014).
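The two coefficient-based diagnostics mentioned above (the change in norm and the angular change of the coefficient vector between update steps) can be sketched as follows. The tolerances and the helper name `should_stop` are assumptions made for illustration; the text does not state numerical thresholds, and the stopping decision there was made by inspecting the plots.

```python
import math


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def angle_between(u, v):
    """Angle (radians) between two coefficient vectors."""
    cos = sum(a * b for a, b in zip(u, v)) / (norm(u) * norm(v))
    return math.acos(max(-1.0, min(1.0, cos)))  # clamp against round-off


def should_stop(history, angle_tol, norm_tol):
    """True when the last update barely rotated or rescaled the coefficients.

    history: list of coefficient vectors, one per model update step.
    angle_tol, norm_tol: hypothetical thresholds chosen by the analyst.
    """
    if len(history) < 2:
        return False
    prev, curr = history[-2], history[-1]
    d_angle = angle_between(prev, curr)
    d_norm = abs(norm(curr) - norm(prev)) / norm(prev)
    return d_angle < angle_tol and d_norm < norm_tol


# Illustrative coefficient vectors from three successive update steps.
history = [
    [1.00, 0.500, 0.0],
    [1.20, 0.900, 0.1],
    [1.21, 0.905, 0.1],
]
stop_now = should_stop(history, angle_tol=0.01, norm_tol=0.01)
```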

#### Extracting Intrinsic Material Parameters

At any point during the model building process, the Bayesian framework presented in section Building the Reduced-Order Model can be used to sample the posterior distribution on the material parameters via the MCMC approach. To accomplish this second step of the proposed framework, one needs to evaluate the likelihood function [see Equation (13)]; this requires the reduced-order model obtained in Step (1) as well as the relevant experimental indentation data. The reduced-order model with truncations Q = 2, L = 10 obtained after using 300 training points (described in section Model Building Process) was selected for this example case study. Experimental data, including the mean and associated variance of measured indentation moduli, were previously reported for 11 different grains in a polycrystalline sample of Fe-3%-Si (Pathak et al., 2009). Using the MCMC procedure described in section Estimating Intrinsic Material Properties from Indentation Measurements, 50,000 samples were drawn. The resulting multivariate distribution is shown in **Figure 4** as three univariate distributions.
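A minimal sketch of this second step is given below, assuming a random-walk Metropolis sampler, a uniform prior over the parameter bounds, and a Gaussian likelihood with known measurement variances. The two-parameter `surrogate` function is a hypothetical stand-in for the reduced-order model, the data are synthetic, and the step size and burn-in length are illustrative choices rather than the settings used in the study.

```python
import math
import random


def surrogate(c, angle):
    # Hypothetical stand-in for the reduced-order model: maps parameters c
    # and an orientation angle to an indentation modulus (GPa).
    return c[0] + c[1] * math.cos(angle) ** 2


def log_posterior(c, data, bounds):
    """Uniform prior inside the bounds plus a Gaussian log-likelihood."""
    for value, (lo, hi) in zip(c, bounds):
        if not lo <= value <= hi:
            return -math.inf
    return -0.5 * sum((m - surrogate(c, a)) ** 2 / var for a, m, var in data)


def metropolis(data, bounds, start, step, n_samples, seed=0):
    """Random-walk Metropolis sampling of the posterior on c."""
    rng = random.Random(seed)
    current = list(start)
    current_lp = log_posterior(current, data, bounds)
    samples = []
    for _ in range(n_samples):
        proposal = [v + rng.gauss(0, step) for v in current]
        lp = log_posterior(proposal, data, bounds)
        if rng.random() < math.exp(min(0.0, lp - current_lp)):
            current, current_lp = proposal, lp
        samples.append(tuple(current))
    return samples


# Synthetic "measurements": (angle, mean modulus, variance) for 11 grains.
rng = random.Random(1)
true_c = (150.0, 40.0)
angles = [i * (math.pi / 2) / 10 for i in range(11)]
data = [(a, surrogate(true_c, a) + rng.gauss(0, 2.0), 4.0) for a in angles]

bounds = [(100.0, 200.0), (0.0, 100.0)]
samples = metropolis(data, bounds, start=(130.0, 60.0), step=1.0, n_samples=20000)
kept = samples[5000:]  # discard burn-in
posterior_mean = [sum(s[i] for s in kept) / len(kept) for i in range(2)]
```

Because the surrogate is cheap to evaluate, tens of thousands of posterior samples cost little compared with a single FE simulation, which is the practical motivation for the two-step protocol.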

FIGURE 2 | (A) Variation in the angular change in the vector of model coefficients between model update steps. (B) Variation in the magnitude of the vector of model coefficients during the model building process.

To recap, Step (1) of the protocol used a minimal number of finite element simulations to establish a high-fidelity reduced-order model. Using previously reported experimental data (Pathak et al., 2009), the established reduced-order model was then used to sample the distribution of elastic constants in Step (2) of the protocol. The distributions for the parameters extracted here are in very good agreement with the literature values. Estimates of the elastic constants from the current study, typical values reported in the literature (Simmons, 1971), and estimates reported from the previous study based on ordinary regression (on the full set of 2,286 FE simulations) (Patel et al., 2014) are shown in **Table 1**.

It is emphasized that the previous study did not attempt any form of uncertainty quantification with respect to these estimates. It is important to note that the highest relative uncertainty in the present study was associated with the estimation of C<sub>44</sub>, which deviated the most from the reported literature values. Since the literature values seldom report the associated uncertainty, it is very difficult to identify the source of the small disagreement between the C<sub>44</sub> values extracted here from the indentation measurements and the literature values obtained using completely different techniques. This small difference could be attributed to experimental measurement errors (in both the indentation protocols employed here and the more conventional measurement protocols employed in the literature). We also note that it should be possible to refine the methodology presented here [i.e., Step (2) of the protocol] to identify specific additional grain orientations for indentation measurements that might improve the estimates of C<sub>44</sub> by reducing its variance. Such refinements will be pursued in future work.

TABLE 1 | Comparison of reported estimates for single crystal elastic constants of the bcc-metal, Fe-3%-Si. All units are in GPa.


*<sup>a</sup>Simmons (1971), <sup>b</sup>Patel et al. (2014).*

# CASE STUDY: HEXAGONAL POLYCRYSTALS

#### Problem Statement

In order to demonstrate the versatility of the proposed framework, attention is now turned to the extraction of the elastic constants, **c** = {C<sub>11</sub>, C<sub>12</sub>, C<sub>44</sub>, C<sub>33</sub>, C<sub>13</sub>}, for the hcp metal CP-Ti (commercially pure titanium) (Simmons, 1971). Unlike the previous case study, a database of previously performed FE simulations was not readily available for this case study. Therefore, FE simulations were designed and performed specifically for this study as demanded by the Bayesian inference framework in Step (1) of the protocol.

The FE model used for this study is the previously validated finite element model (Patel et al., 2014) developed using the commercial software ABAQUS (ABAQUS, 2014). The sample mesh consisted of 12,610 C3D8 continuum 3-D elements and is shown in **Figure 5**. The simulated indents were performed using an analytically defined rigid indenter with a tip radius of 16 µm, consistent with the size used in the experiments on single crystal CP alpha-Ti grains reported in the literature (Weaver et al., 2016). The dimensions of the sample mesh were taken as 9.6 × 9.6 × 4.8 µm. The FE model was validated by comparing simulated indentation moduli to the theoretical values reported by Vlassak and Nix (1994) for zinc single crystals, **c** = {161.1, 34.2, 38.3, 61.1, 50.3} GPa, as shown in **Figure 5**. The comparisons confirm the linear relationship between the indentation load (P) and the elastic indentation depth (h<sub>e</sub>) raised to a power of 3/2 for hcp single crystals, as predicted by Vlassak and Nix (1994) [note that the original Hertz theory (Hertz, 1896) is restricted to isotropic materials].
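The linear P versus h<sub>e</sub><sup>3/2</sup> relationship gives a direct route from a simulated load-depth curve to an indentation modulus. The sketch below assumes the Hertz-type form P = (4/3) E √R h<sub>e</sub><sup>3/2</sup> with a rigid indenter on a flat sample, so that the effective modulus and radius reduce to the sample indentation modulus and the indenter radius. The function name and the synthetic data are hypothetical.

```python
import math


def indentation_modulus(loads, depths, radius):
    """Estimate the indentation modulus from elastic load-depth data.

    Fits P = slope * h**1.5 through the origin by least squares and inverts
    slope = (4/3) * E * sqrt(R).  With depths and radius in micrometres and
    loads in millinewtons, the result is in GPa.
    """
    x = [h ** 1.5 for h in depths]
    slope = sum(p * xi for p, xi in zip(loads, x)) / sum(xi * xi for xi in x)
    return 3.0 * slope / (4.0 * math.sqrt(radius))


radius = 16.0                                    # indenter tip radius, µm
depths = [0.005 * k for k in range(1, 11)]       # elastic depths, µm
true_modulus = 120.0                             # GPa, illustrative value only
loads = [4.0 / 3.0 * true_modulus * math.sqrt(radius) * h ** 1.5 for h in depths]

estimate = indentation_modulus(loads, depths, radius)
```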

For building the reduced-order model [Step (1) of the protocol], we need to identify the specific ranges of the intrinsic material properties of interest. For this study, the bounds of the ranges for the single crystal elastic constants were taken as 80 GPa ≤ C<sub>11</sub> ≤ 240 GPa, 40 GPa ≤ C<sub>12</sub> ≤ 120 GPa, 30 GPa ≤ C<sub>44</sub> ≤ 90 GPa, 70 GPa ≤ C<sub>33</sub> ≤ 210 GPa, and 40 GPa ≤ C<sub>13</sub> ≤ 90 GPa; these were chosen to encompass a large number of hcp metals of future interest to our research (Priddy, 2016). The transverse elastic isotropy of the hcp symmetry implies that the elastic indentation response is dependent solely on the declination angle (Φ) between the indenter axis and the c-axis of the hcp crystal. Therefore, one only needs to explore the orientation space defined by 0 ≤ Φ ≤ π/2 radians. Our goal will be to employ the sequential design strategy once again to efficiently explore the multi-dimensional parameter space identified above in establishing a reliable and robust reduced-order model for the FE indentation simulations over the entire parameter space of interest.

#### Model Building Process

As with the previous case study, the truncation parameters (Q, L) are important hyper-parameters in the model building process. Since these are not known a priori, we need to build reduced-order models with different values of these hyper-parameters and make suitable selections. The basic strategy employed here is as follows: (1) reduced-order models with lower truncation levels are initially established, (2) the truncation level is increased systematically if the performance of the established reduced-order model is deemed inadequate, and (3) the model building process is stopped when either the accuracy of the reduced-order model is deemed adequate or the improvements in the accuracy are deemed insignificant. The LOOCV errors obtained from this process for the different truncation levels are depicted in **Figure 6**. A set of 760 simulations was used as the initial set for all of these model-building exercises. This number was chosen to be slightly larger than the number of terms in the expansion of Equation (1) for the case (Q = 2, L = 4), which results in a total of 729 terms in the expansion. This initial set was identified using a Latin hypercube design (LHD) (McKay et al., 2000) across the 6-dimensional parameter space {C<sub>11</sub>, C<sub>12</sub>, C<sub>44</sub>, C<sub>33</sub>, C<sub>13</sub>, Φ}.
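A Latin hypercube design of this size can be sketched in a few lines. The version below is a plain random LHD (one point per stratum in each dimension, with strata paired at random across dimensions); the study may have used a different or optimized variant, and the function name is hypothetical. The bounds are those stated in the problem statement, with the declination angle spanning 0 to π/2.

```python
import math
import random


def latin_hypercube(n, bounds, seed=0):
    """Random Latin hypercube: each dimension is split into n equal strata
    and every stratum is sampled exactly once."""
    rng = random.Random(seed)
    columns = []
    for lo, hi in bounds:
        strata = list(range(n))
        rng.shuffle(strata)  # random pairing of strata across dimensions
        columns.append([lo + (hi - lo) * (s + rng.random()) / n for s in strata])
    return [tuple(col[i] for col in columns) for i in range(n)]


# Bounds for {C11, C12, C44, C33, C13} in GPa and the declination angle in rad.
bounds = [
    (80.0, 240.0),
    (40.0, 120.0),
    (30.0, 90.0),
    (70.0, 210.0),
    (40.0, 90.0),
    (0.0, math.pi / 2),
]
design = latin_hypercube(760, bounds)
```

Each row of `design` would define one FE simulation input. The stratification guarantees that every one-dimensional projection of the design covers its range evenly, which is the property that makes an LHD attractive as an initialization set.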

Following the initialization, additional simulations were chosen by screening for the highest prediction uncertainty across a denser LHD of 2,440 sets of inputs (a total of 3,200 design points including the initialization set). The LOOCV error decreases for all truncation levels as data are added, apart from a slight increase for the truncations (Q = 3, L = 4, 6) after 2,200 data points, which, given the small changes (<0.3 GPa), is attributed to noise. It is apparent from **Figure 6** that the truncation level combination (Q = 2, L = 4) outperforms the others throughout the model building process. The good accuracy of the reduced-order model built for this case study becomes apparent after about 2,200 FE simulations, exhibiting a LOOCV error of 1.3 GPa, as seen from **Figures 6**, **7**.
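The uncertainty-based screening step can be illustrated with a small sketch. It assumes a Bayesian linear model with an isotropic Gaussian prior (precision `alpha`) and Gaussian noise (precision `beta`), for which the predictive variance at a candidate input has a closed form. The one-dimensional quadratic basis, the precision values, and the function names are all hypothetical and stand in for the full reduced-order model.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def features(x):
    # Illustrative basis; the real model uses many more terms.
    return [1.0, x, x * x]


def predictive_variance(x_new, x_train, alpha, beta):
    """Noise variance plus phi^T (alpha I + beta Phi^T Phi)^-1 phi."""
    rows = [features(x) for x in x_train]
    p = len(rows[0])
    a = [
        [beta * sum(r[i] * r[j] for r in rows) + (alpha if i == j else 0.0)
         for j in range(p)]
        for i in range(p)
    ]
    phi = features(x_new)
    s_phi = solve(a, phi)
    return 1.0 / beta + sum(f * s for f, s in zip(phi, s_phi))


def next_simulation(candidates, x_train, alpha=1e-3, beta=25.0):
    """Pick the unseen candidate with the largest predictive variance."""
    return max(candidates, key=lambda x: predictive_variance(x, x_train, alpha, beta))


x_train = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]   # inputs already simulated
candidates = [0.2, 0.6, 1.0]               # unseen design points
chosen = next_simulation(candidates, x_train)
```

Here the candidate farthest from the existing training inputs is selected, because the model is least certain where it must extrapolate. Note that the predictive variance of this linear model depends only on the inputs, not on the simulated outputs, so candidates can be ranked before any new FE simulation is run.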

In order to generate a validation set, the selection process was continued to generate another set of 600 FE simulations. We argue that this approach is likely one of the best strategies for building validation sets, as the elements of the validation set are selected based on the highest values of the prediction uncertainty. The prediction errors for the validation set of 600 FE simulations using the reduced-order model built with the training set of 2,200 FE simulations are shown in **Figure 8**. This comparison yields a mean absolute error of 1.2 GPa.

It is important to recognize that the parameter space was purposefully chosen to be applicable to many hcp metals of future interest to our research (Priddy, 2016). Predictions are very good over the chosen parameter space, as shown in **Figure 8**. Therefore, within the defined parameter space, future extraction efforts would no longer necessitate the generation of a new model. Furthermore, there is little value in performing additional simulations within the defined parameter space in an attempt to significantly improve the reduced-order model. The convergence of the associated model parameters in **Figure 7** provides evidence that the reduced-order model is unlikely to change drastically with the introduction of new simulations.

FIGURE 7 | (A) Variation in the angular change in the vector of model coefficients between model update steps. (B) Variation in the magnitude of the vector of model coefficients during the model building process.

It should be noted that the significantly larger training set needed for this case study compared to the previous case study can be attributed to the following reasons: (i) the present case study involved a six-dimensional input space whereas the previous one involved a five-dimensional input space, (ii) the ranges of values for the inputs in this case study were selected to be significantly larger than in the previous one, and (iii) the degree of elastic anisotropy and contrast captured in this case study is significantly larger compared to the previous case study. The degree of single crystal elastic anisotropy can be quantified by the universal elastic anisotropy index, A (Ranganathan and Ostoja-Starzewski, 2008; de Jong et al., 2015), defined as

$$A = 5\frac{G_v}{G_r} + \frac{K_v}{K_r} - 6 \tag{14}$$

where K and G are the bulk and shear moduli provided by the Voigt and Reuss estimates (indicated by subscripts v and r, respectively) of a macroscopically homogeneous polycrystalline material with uniform texture (Hill, 1952). A maximum universal elastic anisotropy index of 7.2 was noted for the earlier cubic case study discussed in this paper, compared to 66.2 encountered in the current hcp case study. It is therefore quite reasonable that the number of training data points needed is significantly higher.
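As a worked illustration, the index can be evaluated for a single set of hexagonal constants. The sketch assumes the form given by Ranganathan and Ostoja-Starzewski (2008), which subtracts 6 so that the index vanishes for an isotropic crystal, and uses closed-form Voigt and Reuss averages specialized to hexagonal symmetry with C<sub>66</sub> = (C<sub>11</sub> − C<sub>12</sub>)/2. The function names are hypothetical, and the input values are the typical CP-Ti literature constants quoted in this case study.

```python
def hexagonal_voigt_reuss(c11, c12, c44, c33, c13):
    """Voigt and Reuss bulk (K) and shear (G) moduli for hexagonal symmetry."""
    c66 = 0.5 * (c11 - c12)
    c_sq = (c11 + c12) * c33 - 2.0 * c13 ** 2
    m = c11 + c12 + 2.0 * c33 - 4.0 * c13
    k_v = (2.0 * (c11 + c12) + 4.0 * c13 + c33) / 9.0
    g_v = (m + 12.0 * c44 + 12.0 * c66) / 30.0
    k_r = c_sq / m
    g_r = 15.0 / (18.0 * k_v / c_sq + 6.0 / c44 + 6.0 / c66)
    return k_v, g_v, k_r, g_r


def universal_anisotropy_index(c11, c12, c44, c33, c13):
    """A = 5 G_v / G_r + K_v / K_r - 6 (zero for an isotropic crystal)."""
    k_v, g_v, k_r, g_r = hexagonal_voigt_reuss(c11, c12, c44, c33, c13)
    return 5.0 * g_v / g_r + k_v / k_r - 6.0


# Typical literature constants for CP-Ti (GPa): C11, C12, C44, C33, C13.
a_ti = universal_anisotropy_index(162.0, 92.0, 47.0, 180.0, 69.0)

# Sanity check: an isotropic set (C44 = (C11 - C12) / 2, C33 = C11, C13 = C12).
a_iso = universal_anisotropy_index(3.0, 1.0, 1.0, 3.0, 1.0)
```

CP-Ti itself is only mildly anisotropic by this measure. The much larger maximum quoted in the text refers to the most extreme combinations within the sampled parameter ranges, not to titanium.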

#### Extracting Intrinsic Material Parameters

FIGURE 9 | MCMC sampling of the posterior distribution for the intrinsic single crystal elastic constants of CP-Ti.

The focus is now turned to the sampling of the posterior distribution of the elastic constants, **c** = {C<sub>11</sub>, C<sub>12</sub>, C<sub>44</sub>, C<sub>33</sub>, C<sub>13</sub>}, via MCMC. Similar to the previous case study, in order to sample from the posterior distribution of the intrinsic material parameters, the likelihood function in Equation (13) must be computed using the available experimental data and the reduced-order model established in Step (1) (corresponding to truncation levels Q = 2, L = 4 using a training set of 2,200 FE simulation data points). The experimental data for this case study were obtained from a prior openly shared dataset (Weaver et al., 2016). This data set included indentation moduli for 50 different crystal orientations on a CP-Ti sample. Following the procedure described in section Estimating Intrinsic Material Properties from Indentation Measurements, 50,000 samples were drawn using the MCMC approach. The resulting posterior distributions are shown in **Figure 9** for each of the five intrinsic hcp elastic stiffness parameters. The maximum-a-posteriori (MAP) estimates, the mean values, and the standard deviations of the distributions are reported as a table in the same figure. The reported mean values for the elastic constants were found to be {155, 89, 49, 174, 55} GPa for {C<sub>11</sub>, C<sub>12</sub>, C<sub>44</sub>, C<sub>33</sub>, C<sub>13</sub>}, respectively. Typical literature values reported are {162, 92, 47, 180, 69} GPa (Fisher and Renken, 1964). With the exception of C<sub>13</sub>, the extracted intrinsic stiffness parameters show good agreement with values reported in the literature (mean values are within 5%). It is also interesting to note that the extracted distribution for C<sub>13</sub> exhibits the highest relative uncertainty.
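The "within 5%" statement can be checked directly from the two sets of values quoted above. The short calculation below uses only those numbers; the variable names are illustrative.

```python
# Posterior mean values from this study and typical literature values (GPa).
extracted = {"C11": 155.0, "C12": 89.0, "C44": 49.0, "C33": 174.0, "C13": 55.0}
literature = {"C11": 162.0, "C12": 92.0, "C44": 47.0, "C33": 180.0, "C13": 69.0}

# Relative deviation of each extracted mean from the literature value, in %.
deviation = {
    name: 100.0 * abs(extracted[name] - literature[name]) / literature[name]
    for name in extracted
}

within_5_percent = sorted(name for name, d in deviation.items() if d < 5.0)
```

Four of the five constants deviate by roughly 3 to 4.5%, whereas C<sub>13</sub> deviates by about 20%, consistent with its being singled out in the text.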

This indicates the relatively low sensitivity of the indentation modulus to changes in C<sub>13</sub> when compared to the other elastic stiffness constants. As noted in the previous case study, it should be possible to extend the framework presented here to focus exclusively on improving the estimation of C<sub>13</sub> (Huan and Marzouk, 2013). However, such an effort could only be justified after the uncertainty in the values reported in the literature is rigorously quantified. The variance in the predictions of the surrogate model at selected orientations is compared with the experimental data in **Figure 10**.

Evaluations of the reduced-order model at various orientations using samples from the posterior distributions of the elastic constants provide the possible mean indentation moduli for the observed experimental indentation moduli, as described in section Estimating Intrinsic Material Properties from Indentation Measurements. Since the reduced-order model coupled with a sampled set of elastic constants from the posterior distribution provides the respective mean indentation modulus as a function of orientation, the predictions should be more tightly packed in regions in which there are more observations, reflecting a higher certainty of the mean. The prediction uncertainty from MCMC is in fact shown to be highest at low declination angles, while the uncertainty is lowest at high declination angles, where much more data is available. Furthermore, this observation suggests that there is much more value in conducting additional tests at the lower declination angles, specifically in the range of 0–0.2 radians, than at the higher declination angles. This could be highly valuable input to experimentalists for their future studies.

#### CONCLUSIONS

A statistical framework has been presented for the robust extraction of the intrinsic material parameters from available experimental observations from spherical indentation stress-strain protocols. The two-step Bayesian inference framework enables the specification of uncertainty in the measurement data, which is then transferred to the uncertainty in the values of the extracted intrinsic material properties. Most importantly, the new framework presented in this paper demonstrates potential for significantly speeding up the materials characterization effort by focusing on experiments that are likely to deliver the maximum value in establishing the desired properties. This is accomplished by employing a numerical model of the experiment itself (here accomplished using a finite element model). Although the numerical model can be very expensive, it is only needed for a one-time effort in establishing a reduced-order model [Step (1) of the proposed two-step protocol]. Once the reduced-order model is established, the calibration of the available experimental data to the theory [Step (2) of the proposed two-step protocol] can be accomplished with minimal computational resources. The versatility and the robustness of the proposed new framework are demonstrated with two case studies: (i) extraction of three elastic constants for Fe-3%-Si, and (ii) extraction of the five elastic constants for CP-Ti. In both case studies, the ranges of intrinsic material parameters considered cover a significant number of polycrystalline hcp and cubic metals. This makes both models highly applicable to new case studies within these material classes. For material classes outside of those explored here, the main challenge is indeed Step (1) of the protocol, which requires the establishment of a high-fidelity reduced-order model from suitable FE simulations, while Step (2) remains the same. In the event that the extracted parameters in Step (2) fall outside of the extents of the databases used to construct the reduced-order model, additional simulations considering the new bounds would become necessary. Finally, the use of a Bayesian framework opens new avenues for the development of autonomous (fully guided by the computer) scientific explorations. It is anticipated that the framework is extensible to a large number of other applications in multiscale materials modeling (e.g., extraction of the values of slip resistances from indentation measurements, extraction of the values of parameters in phase-field models based on available microstructure datasets).

#### AUTHOR CONTRIBUTIONS

AC performed simulations, database organization, data analysis and prepared the manuscript. SK provided guidance and prepared the manuscript.

### FUNDING

The authors acknowledge funding from AFOSR award FA9550-18-1-0330 (Program Manager: J. Tiley). AC acknowledges funding from the National Science Foundation Grant 1258425.

#### REFERENCES

Hertz, H. (1896). Miscellaneous Papers. New York, NY: MacMillan and Co., Ltd.

Hill, R. (1952). The elastic behaviour of a crystalline aggregate. Proc. Phys. Soc. Sec. A 65, 349–354. doi: 10.1088/0370-1298/65/5/307

Huan, X., and Marzouk, Y. M. (2013). Simulation-based optimal Bayesian experimental design for nonlinear systems. J. Comput. Phys. 232, 288–317. doi: 10.1016/j.jcp.2012.08.013

Khosravani, A., Morsdorf, L., Tasan, C. C., and Kalidindi, S. R. (2018). Multiresolution mechanical characterization of hierarchical materials: spherical nanoindentation on martensitic Fe-Ni-C steels. Acta Mater. 153, 257–269. doi: 10.1016/j.actamat.2018.04.063

Knezevic, M., Kalidindi, S. R., and Mishra, R. K. (2008). Delineation of first-order closures for plastic properties requiring explicit consideration of strain hardening and crystallographic texture evolution. Int. J. Plastic. 24, 327–342. doi: 10.1016/j.ijplas.2007.05.002

MacKay, D. J. (1998). "Introduction to Gaussian processes," in NATO ASI Series F Computer and Systems Sciences (Springer-Verlag), 133–166.

MacKay, D. J. C. (1992a). Bayesian interpolation. Neural Comput. 4, 415–447. doi: 10.1162/neco.1992.4.3.415

MacKay, D. J. C. (1992b). Information-based objective functions for active data selection. Neural Comput. 4, 590–604. doi: 10.1162/neco.1992.4.4.590

MacKay, D. J. C. (1996). "Hyperparameters: optimize, or integrate out?," in Maximum Entropy and Bayesian Methods (Dordrecht: Springer), 43–59.

MATLAB (2016). MATLAB, Version 9.1.0 (R2016b). The MathWorks Inc.

McKay, M. D., Beckman, R. J., and Conover, W. J. (2000). A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics 42, 55–61. doi: 10.1080/00401706.2000.10485979

Patel, D., and Kalidindi, S. (2017). Estimating the slip resistance from spherical nanoindentation and orientation measurements in polycrystalline samples of cubic metals. Int. J. Plastic. 92, 19–30. doi: 10.1016/j.ijplas.2017.03.004

Patel, D. K., Al-Harbi, H. F., and Kalidindi, S. R. (2014). Extracting single-crystal elastic constants from polycrystalline samples using spherical nanoindentation and orientation measurements. Acta Mater. 79, 108–116. doi: 10.1016/j.actamat.2014.07.021

Pathak, S., and Kalidindi, S. R. (2015). Spherical nanoindentation stress–strain curves. Mater. Sci. Eng. R Rep. 91, 1–36. doi: 10.1016/j.mser.2015.02.001

Pathak, S., Stojakovic, D., and Kalidindi, S. R. (2009). Measurement of the local mechanical properties in polycrystalline samples using spherical nanoindentation and orientation imaging microscopy. Acta Mater. 57, 3020–3028. doi: 10.1016/j.actamat.2009.03.008

Priddy, M. W. (2016). Exploration of forward and inverse protocols for property optimization of Ti-6Al-4V (Doctoral dissertation). Georgia Institute of Technology.

Proust, G., and Kalidindi, S. R. (2006). Procedures for construction of anisotropic elastic-plastic property closures for face-centered cubic polycrystals using first-order bounding relations. J. Mech. Phys. Solids 54, 1744–1762. doi: 10.1016/j.jmps.2006.01.010


Roberts, G. O., Gelman, A., and Gilks, W. R. (1997). Weak convergence and optimal scaling of random walk metropolis algorithms. Ann. Appl. Probabil. 7, 110–120. doi: 10.1214/aoap/1034625254

Sánchez-Martín, R., Pérez-Prado, M. T., Segurado, J., Bohlen, J., Gutiérrez-Urrutia, I., Llorca, J., et al. (2014). Measuring the critical resolved shear stresses in Mg alloys by instrumented nanoindentation. Acta Mater. 71, 283–292. doi: 10.1016/j.actamat.2014.03.014

Simmons, G. (1971). Single crystal elastic constants and calculated aggregate properties: a handbook, 2nd Edn. ed. H. Wang (Cambridge, MA: M.I.T. Press).

Uchic, M. D., Dimiduk, D. M., Florando, J. N., and Nix, W. D. (2004). Sample dimensions influence strength and crystal plasticity. Science 305, 986–989. doi: 10.1126/science.1098993

Vlassak, J. J., and Nix, W. D. (1994). Measuring the elastic properties of anisotropic materials by means of indentation experiments. J. Mech. Phys. Solids 42, 1223–1245. doi: 10.1016/0022-5096(94)90033-7

Weaver, J. S., Priddy, M. W., McDowell, D. L., and Kalidindi, S. R. (2016). On capturing the grain-scale elastic and plastic anisotropy of alpha-Ti with spherical nanoindentation and electron back-scattered diffraction. Acta Mater. 117, 23–34. doi: 10.1016/j.actamat.2016.06.053


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Castillo and Kalidindi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Modeling Macroscopic Material Behavior With Machine Learning Algorithms Trained by Micromechanical Simulations

Denise Reimann<sup>1</sup>, Kapil Nidadavolu<sup>1,2</sup>, Hamad ul Hassan<sup>1</sup>, Napat Vajragupta<sup>1</sup>, Tobias Glasmachers<sup>3</sup>, Philipp Junker<sup>4</sup> and Alexander Hartmaier<sup>1</sup>\*

<sup>1</sup> Interdisciplinary Centre for Advanced Materials Simulation, Ruhr-Universität Bochum, Bochum, Germany, <sup>2</sup> Department of Metallurgical and Materials Engineering, Indian Institute of Technology, Madras, India, <sup>3</sup> Institut für Neuroinformatik, Ruhr-Universität Bochum, Bochum, Germany, <sup>4</sup> Lehrstuhl für Mechanik-Materialtheorie, Ruhr-Universität Bochum, Bochum, Germany

#### Edited by:

Norbert Huber, Helmholtz Centre for Materials and Coastal Research (HZG), Germany

#### Reviewed by:

Ercan Gürses, Middle East Technical University, Turkey Elías Cueto, University of Zaragoza, Spain

> \*Correspondence: Alexander Hartmaier alexander.hartmaier@rub.de

#### Specialty section:

This article was submitted to Computational Materials Science, a section of the journal Frontiers in Materials

Received: 04 February 2019 Accepted: 10 July 2019 Published: 13 August 2019

#### Citation:

Reimann D, Nidadavolu K, ul Hassan H, Vajragupta N, Glasmachers T, Junker P and Hartmaier A (2019) Modeling Macroscopic Material Behavior With Machine Learning Algorithms Trained by Micromechanical Simulations. Front. Mater. 6:181. doi: 10.3389/fmats.2019.00181

Micromechanical modeling of material behavior has become an accepted approach to describe the macroscopic mechanical properties of polycrystalline materials in a microstructure-sensitive way. The microstructure is modeled by a representative volume element (RVE), and the anisotropic mechanical behavior of individual grains is described by a crystal plasticity model. Such micromechanical models are subjected to mechanical loads in a finite element (FE) simulation and their macroscopic behavior is obtained from a homogenization procedure. However, such micromechanical simulations with a discrete representation of the material microstructure are computationally very expensive, in particular when conducted for 3D models, such that it is prohibitive to apply them for process simulations of macroscopic components. In this work, we suggest a new approach to develop microstructure-sensitive, yet flexible and numerically efficient macroscopic material models by using micromechanical simulations for training Machine Learning (ML) algorithms to capture the mechanical response of various microstructures under different loads. In this way, the trained ML algorithms represent a new macroscopic constitutive relation, which is demonstrated here for the case of damage modeling. In a second application of the combination of ML algorithms and micromechanical modeling, a proof of concept is presented for the application of trained ML algorithms for microstructure design with respect to desired mechanical properties. The input data consist of different stress-strain curves obtained from micromechanical simulations of uniaxial testing of a wide range of microstructures. The trained ML algorithm is then used to suggest grain size distributions, grain morphologies and crystallographic textures, which yield the desired mechanical response for a given application. For validation purposes, the resulting grain microstructure parameters are used to generate RVEs accordingly, and the macroscopic stress-strain curves for those microstructures are calculated and compared with the target quantities. The two examples presented in this work demonstrate clearly that ML methods can be trained by micromechanical simulations, which capture material behavior and its relation to microstructural mechanisms in a physically sound way. Since the quality of the ML algorithms is only as good as that of the micromechanical model, it is essential to validate these models properly. Furthermore, this approach allows a hybridization of experimental and numerical data.

Keywords: machine learning, micromechanical modeling, crystal plasticity, damage, homogenization, microstructure design

# 1. INTRODUCTION

Most of the processes that happen in nature are too complex to analyze, have too many independent parameters, and sometimes even the interrelation between parameters is unknown. In materials science, Machine Learning (ML) techniques such as Support Vector Regression (SVR) (Swaddiwudhipong et al., 2005; Owolabi et al., 2014, 2015), linear regression models (Cheng et al., 2017) and Neural Networks (Ihom and Offiong, 2015) are becoming more and more important to describe complex phenomena for which the governing principle is not known or the proper implementation of which is too tedious and prone to errors. Among others, ML techniques have also been used in the field of materials science to predict material properties (Swaddiwudhipong et al., 2005; Lin et al., 2008; Versino et al., 2017), characterize microstructure (Lubbers et al., 2017; Gola et al., 2018) and even to design better and more efficient materials (Liu et al., 2015). A large number of applications of ML methods in materials science lie in the area of microstructure classification. However, it is beyond the scope of this article to provide a comprehensive literature overview on this topic.

A number of strategies have been proposed in the literature for the prediction of different material properties. Swaddiwudhipong et al. used least square support vector machines (LS-SVMs) to relate load-displacement curves from indentation directly to the elastic modulus and yield stress of materials obeying power law hardening. They used data from simulations of indentation with different geometries in ABAQUS (Swaddiwudhipong et al., 2005). They were able to validate their predicted material parameters against the actual material values based on uniaxial tests to a reasonably good accuracy. Lin et al. used an Artificial Neural Network (ANN) to predict the dependence of the flow stress of 42CrMo steel on temperature, strain and log strain rate by training on experimental data (Lin et al., 2008). They used a feed-forward network with a back propagation learning algorithm, which showed good agreement with the experimental values. ML techniques are also gaining more importance in the field of crystal plasticity and microstructural modeling. Mangal and Holm investigated the formation of stress hotspots in polycrystalline materials (Mangal and Holm, 2018) under uniaxial tensile deformation by integrating full field crystal plasticity based deformation models and ML techniques. They used synthetic 3D microstructures and defined a number of crystallographic and geometric factors to describe the relevant features. They found that the Schmid factor, the equivalent diameters of the grains, the distance from the inverse pole figure and the average misorientations are the most influential factors. They showed that Random Forest models can predict stress hotspots with an area under the receiver operating characteristic curve (ROC-AUC) of 0.7403 in FCC materials.

In recent years, ML has gained significant interest in the mechanics of materials community. In this context, finite element (FE) simulations provide a powerful tool for understanding deformation and damage mechanisms, because they yield insight into local stresses and strains within components under complex loading states, where experiments can merely assess the global component behavior. The combination with data-driven ML techniques enables further applications of numerical modeling, in particular the efficient use of inverse methods for model parameter identification. In 2006, Tyulyukovskiy and Huber used neural networks trained by FE simulations of spherical indentation with a variety of material parameters to solve the inverse problem of identifying material parameters from experimental load-indentation measurements (Tyulyukovskiy and Huber, 2006). Artificial neural networks were also used by Abbassi et al. (2013) to calibrate parameter sets of the Gurson-Tvergaard-Needleman model to describe ductile damage behavior during sheet forming. Furthermore, Collins et al. used neural networks to approximate the yield and ultimate tensile strength as a function of microstructural properties (such as phase volume fractions) (Collins et al., 2012). The hole drilling method is widely used to determine residual stresses in a component. However, the method has its limitations because the evaluation methods are typically based on the assumption of linear elastic material behavior. To overcome this limitation, Chupakhin et al. developed a method to correct the stress analysis for effects of plastic deformation, and hence to increase the range of applicability of the hole drilling method (Chupakhin et al., 2017).

Phenomenological models formulated in a mathematically closed form as analytical functions are currently the state of the art for the computational modeling of ductile damage behavior on the macro- as well as on the micro-scale. To create an appropriate estimation of specific material behavior, damage evolution has to be described by appropriate constitutive relationships. For bridging material behavior from the microscopic to the macroscopic scale, a micromechanical modeling approach that explicitly considers microstructural features becomes an appealing solution. One benefit of this modeling technique is the possibility to derive microstructure-property relationships through microstructure-based simulations. However, using FE simulations to describe a macroscopic process (such as deep drawing or sheet bending) while explicitly considering the microstructure is computationally prohibitive. One common multiscale approach is the FE² method, which combines the micro- and the macroscale and therefore enables one to include microstructural information in a macroscopic model (see El Halabi et al., 2013; Schröder, 2014). Another approach is based on the response surface method, which has been applied in the literature for the numerical homogenization of nonlinear porous materials (Beluch and Hatlas, 2019). At the same time, adding microstructural information directly into current analytical damage models seems to be overly complex. Hence, different homogenization approaches to map damage from the micro- to the macro-scale are required to bring microstructural information into macroscopic simulations. To accomplish this, a novel approach using an ML based framework is suggested here and compared to the well-established analytical damage model proposed by Chaboche (see Chaboche, 1988; Ambroziak, 2007).

With the emergence of ML in materials research in recent years, another application of ML has been the design of microstructures that meet targeted mechanical properties. To fulfill this challenging goal, a set of microstructure-property relationships must be available as training data. Therefore, another clear application of a micromechanical modeling approach is to use results from microstructure simulations as training data for ML models for microstructure design.

In a second application (cf. section 5), microstructure-based simulations are used to create training data for ML models that are able to predict microstructural properties for a given flow curve. The input of these trained ML models is the flow curve and the output is the grain size of the microstructure. In this part, microstructure models with various grain size distribution parameters are simulated by using a nonlocal crystal plasticity model, and they are homogenized to obtain the flow curves. The simulation results are fed to selected ML models as training data.

This paper is structured as follows: First, the FE simulation model and the crystal plasticity material model are explained in section 2, which also includes the homogenization method of the simulation data. In section 3, the ML algorithms (Support Vector regression (SVR) and Random Forest regression (RFR)) are described. Afterwards, the two applications of the ML algorithms discussed in this publication are given. In section 4, the approach to homogenize damage from the micro- to the macroscale is given, and the prediction of microstructural features from the flow curve is presented in section 5. Finally, the conclusion is given in section 6.

# 2. MATERIAL MODELING

In this section, the basic framework of micromechanical modeling is detailed. The described model consists of a geometrical description of the grain structure of a polycrystal with equiaxed grains. This microstructure model is generated with a so-called dynamic microstructure generator (DMG) (Boeff, 2016), which uses a particle simulation to distribute the centers for a subsequent radical Voronoi tessellation. The constitutive modeling of plastic deformation in the individual grains is carried out with a crystal plasticity method implemented as a user-defined material model (UMAT) for ABAQUS. The data set consists of finite element (FE) simulations on the micro-scale for the homogenization of damage as well as of plastic properties.

# 2.1. Representative Volume Element

For the investigation in both applications, quasi-2D representative volume elements (RVEs) were generated using the DMG, which couples a particle simulation method with a radical Voronoi tessellation algorithm (Boeff, 2016). In the first step, the target grain size distribution is defined as a log-normal distribution. Hence, the average grain diameter as well as the standard deviation are required. Based on the prescribed distribution parameters, the number and size of spheres that mimic the targeted grain size distribution are defined. In the second step, the spheres are randomly distributed in a finite volume that is larger than the intended final RVE. This finite volume is then compressed, allowing the spheres to move freely under a repulsive potential so that they do not overlap. In the third step, the updated positions and diameters of the spheres from selected time steps are fed to a radical Voronoi tessellation algorithm from the open-source software Voro++ (Rycroft, 2009) to construct RVEs. The resulting grain size distribution of each of these RVEs is compared to the targeted grain size distribution, and the RVE with the minimum difference is selected. It must be noted that the boundary of an RVE generated using the DMG is rugged in order to leave the grains intact and to improve the mesh quality. In the fourth step, to create the RVE for the microstructure simulations, the geometry of the 2D RVE is extruded by 1% of the side length of the RVE and meshed with eight-node linear brick elements (C3D8) by using CUBIT (Sandia National Laboratories, 2016).
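The first step of the RVE generation can be illustrated with a short sketch. It assumes that the prescribed average and standard deviation refer to the grain diameter itself (not to its logarithm) and converts them to the parameters of the underlying normal distribution. The function names and the number of spheres are hypothetical and are not taken from the DMG.

```python
import math
import random


def lognormal_parameters(mean_diameter, std_diameter):
    """Map the arithmetic mean/std of the diameter to (mu, sigma) of ln(diameter)."""
    sigma_log = math.sqrt(math.log(1.0 + (std_diameter / mean_diameter) ** 2))
    mu_log = math.log(mean_diameter) - 0.5 * sigma_log ** 2
    return mu_log, sigma_log


def sample_sphere_diameters(mean_diameter, std_diameter, n_spheres, seed=0):
    """Draw seed-sphere diameters from the target log-normal distribution."""
    rng = random.Random(seed)
    mu_log, sigma_log = lognormal_parameters(mean_diameter, std_diameter)
    return [rng.lognormvariate(mu_log, sigma_log) for _ in range(n_spheres)]


# Target distribution with a mean of 6.0 um and a standard deviation of 1.0 um.
diameters = sample_sphere_diameters(6.0, 1.0, n_spheres=20000)
sample_mean = sum(diameters) / len(diameters)
sample_std = math.sqrt(
    sum((d - sample_mean) ** 2 for d in diameters) / (len(diameters) - 1)
)
print(f"mean = {sample_mean:.2f} um, std = {sample_std:.2f} um")
```

In a real generator the number of spheres would follow from the RVE size; the large sample here only serves to check that the sampled statistics approach the targets.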

In the final step, periodic boundary conditions, following an approach introduced by Smit et al. (1998), are applied to the RVE. Further details on the implementation are described in Kulosa et al. (2017). The basic idea of this approach is that opposite nodes are coupled such that their displacements are the same. The global boundary conditions and strain are imposed on the reference vertex points V1, V2, V4, and H1, which are located at the corners of the RVE. An example of an RVE generated by using the introduced method is illustrated in **Figure 1**. Furthermore, comparisons of the diameter distributions of the defined seed spheres and the constructed RVEs are illustrated in **Figures 1C,D** for an average grain size µ of 6.0 µm with a standard deviation σ of 1.0 µm and for an average grain size µ of 13.0 µm with a standard deviation σ of 1.0 µm, respectively. The comparison shows that the grain size distributions of both RVEs are in good agreement with the targeted size distributions. In the next section, the crystal plasticity-based material model is described.

# 2.2. Crystal Plasticity Model

The material behavior in the FE simulation is described by a phenomenological crystal plasticity model. To resolve the heterogeneous deformation resulting from abrupt changes in mechanical behavior across grain boundaries of the considered polycrystal and to consider size effects between small and large grains, a nonlocal crystal plasticity model proposed by Ma and Hartmaier (2014) is implemented. As the applied nonlocal crystal plasticity model is already described in Ma and Hartmaier (2014), only an overview of the formulation is given. For further details on the nonlocal flow rule, the reader is referred to Ma and Hartmaier (2014). In the following, quantities written in bold letters refer to vectors (small letters) and second rank tensors (capital letters). From the kinematics of deformation, the total deformation gradient **F** can be multiplicatively decomposed into the elastic deformation gradient **F**<sup>e</sup> and the plastic deformation gradient **F**<sup>p</sup>,

$$\mathbf{F} = \mathbf{F}^{\mathrm{e}} \mathbf{F}^{\mathrm{p}}. \tag{1}$$

The elastic deformation is calculated using Hooke's law. The plastic deformation is characterized by the plastic velocity gradient **L**<sup>p</sup>, which is a function of the plastic deformation gradient **F**<sup>p</sup> and its rate,

$$\mathbf{L}^{\mathrm{p}} = \dot{\mathbf{F}}^{\mathrm{p}} \left(\mathbf{F}^{\mathrm{p}}\right)^{-1}. \tag{2}$$

For this study, crystallographic slip of dislocations is defined as the only mechanism for plastic deformation. Thus, **L**<sup>p</sup> is taken as the sum of the shear rates of all slip systems,

$$\mathbf{L}^{\mathrm{p}} = \sum\_{\alpha=1}^{N} \dot{\gamma}\_{\alpha} \mathbf{M}\_{\alpha}. \tag{3}$$

Here, γ̇<sub>α</sub> is the plastic shear rate. **M**<sub>α</sub> = **d**<sub>α</sub> ⊗ **n**<sub>α</sub> is the Schmid tensor for slip system α, which is defined by the slip direction **d**<sub>α</sub> and the slip plane normal **n**<sub>α</sub>. The symbol ⊗ denotes the dyadic product of two vectors resulting in a second rank tensor. The total number of slip systems is N.
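As a minimal sketch of Equation (3), the Schmid tensors and the resulting plastic velocity gradient can be assembled as follows. Only two FCC slip systems are used for brevity, and the shear rates are hypothetical values chosen for illustration. Because slip direction and plane normal are orthogonal, each Schmid tensor is traceless, so the resulting plastic flow is volume preserving.

```python
import math


def unit(vector):
    """Return the vector scaled to unit length."""
    length = math.sqrt(sum(c * c for c in vector))
    return [c / length for c in vector]


def schmid_tensor(slip_direction, plane_normal):
    """Dyadic product M = d (x) n of the normalized slip direction and plane normal."""
    d, n = unit(slip_direction), unit(plane_normal)
    return [[d[i] * n[j] for j in range(3)] for i in range(3)]


def plastic_velocity_gradient(shear_rates, schmid_tensors):
    """Sum of shear rate times Schmid tensor over all slip systems, Equation (3)."""
    L_p = [[0.0] * 3 for _ in range(3)]
    for rate, M in zip(shear_rates, schmid_tensors):
        for i in range(3):
            for j in range(3):
                L_p[i][j] += rate * M[i][j]
    return L_p


# Two of the twelve FCC <110>{111} slip systems (direction, plane normal).
slip_systems = [((1, -1, 0), (1, 1, 1)), ((1, 0, -1), (1, 1, 1))]
schmid_tensors = [schmid_tensor(d, n) for d, n in slip_systems]
shear_rates = [1.0e-3, 2.0e-3]  # hypothetical shear rates in 1/s

L_p = plastic_velocity_gradient(shear_rates, schmid_tensors)
trace = L_p[0][0] + L_p[1][1] + L_p[2][2]  # vanishes for crystallographic slip
```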

With respect to the nonlocal crystal plasticity model proposed by Ma and Hartmaier (2014), the flow rule and the hardening law can be expressed as:

$$\dot{\gamma}\_{\alpha} = \dot{\gamma}\_{0} \left| \frac{\tau\_{\alpha} + \hat{\tau}\_{\alpha}^{\text{GNDk}}}{\hat{\tau}\_{\alpha} + \hat{\tau}\_{\alpha}^{\text{GNDi}}} \right|^{p\_1} \text{sgn}(\tau\_{\alpha} + \hat{\tau}\_{\alpha}^{\text{GNDk}}), \tag{4}$$

and,

$$\dot{\hat{\tau}}\_{\alpha} = \sum\_{\beta=1}^{N} h\_0 \chi\_{\alpha\beta} \left( 1 - \frac{\hat{\tau}\_{\alpha}}{\hat{\tau}\_{\text{sat}}} \right)^{p\_2} \left| \dot{\gamma}\_{\beta} \right|, \tag{5}$$

where γ̇<sub>0</sub> is the reference shear rate, and p<sub>1</sub> is the inverse value of the strain rate sensitivity. Furthermore, h<sub>0</sub> is the reference hardening parameter, χ<sub>αβ</sub> is the cross hardening matrix, which is assigned as 1.0 for coplanar slip systems and 1.4 otherwise, τ̂<sub>sat</sub> is the saturation slip resistance, and p<sub>2</sub> is a fitting parameter. The initial value of the slip resistance τ̂<sub>α</sub> is defined as τ̂<sub>0</sub>, and sgn() is a mathematical function that extracts the sign of a real number. The resolved shear stress τ<sub>α</sub> for each slip system can be calculated from the stress **S**<sub>α</sub> in the intermediate configuration, i.e., the state involving only the plastic deformation gradient **F**<sup>p</sup>, as

$$
\tau\_{\alpha} = \mathbf{S}\_{\alpha} : \mathbf{M}\_{\alpha}. \tag{6}
$$
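The flow rule and the hardening law can be sketched for the local limit, in which the GND contributions to the flow rule vanish. This is a simplified illustration and not the UMAT implementation: all parameter values are hypothetical, the clamp at saturation and the explicit Euler update are assumptions made for this example.

```python
import math


def shear_rate(tau, tau_hat, gamma_dot_0, p1):
    """Power-law flow rule without GND back stresses (local model)."""
    return gamma_dot_0 * abs(tau / tau_hat) ** p1 * math.copysign(1.0, tau)


def slip_resistance_rates(tau_hat, shear_rates, h0, tau_sat, p2, chi):
    """Hardening law, Equation (5): rate of the slip resistance on every system."""
    rates = []
    for a in range(len(tau_hat)):
        # Clamp at zero so that a saturated system stops hardening (assumption).
        saturation = max(0.0, 1.0 - tau_hat[a] / tau_sat) ** p2
        rates.append(
            sum(h0 * chi[a][b] * saturation * abs(shear_rates[b])
                for b in range(len(tau_hat)))
        )
    return rates


# Hypothetical parameters for two non-coplanar slip systems (stresses in MPa).
gamma_dot_0, p1, p2 = 1.0e-3, 20.0, 2.0
h0, tau_sat = 1000.0, 200.0
chi = [[1.0, 1.4], [1.4, 1.0]]  # 1.0 for coplanar systems, 1.4 otherwise
tau_hat = [100.0, 100.0]        # current slip resistances
tau = [100.0, -50.0]            # resolved shear stresses

rates = [shear_rate(t, r, gamma_dot_0, p1) for t, r in zip(tau, tau_hat)]
tau_hat_dot = slip_resistance_rates(tau_hat, rates, h0, tau_sat, p2, chi)

dt = 0.01  # explicit Euler step in s, for illustration only
tau_hat_new = [r + dt * r_dot for r, r_dot in zip(tau_hat, tau_hat_dot)]
```

The large exponent p<sub>1</sub> makes the second system, loaded at half its slip resistance, practically inactive, which is the typical behavior of such power-law flow rules.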

The flow rule in Equation (4) contains two additional back stresses, τ̂<sub>α</sub><sup>GNDk</sup> and τ̂<sub>α</sub><sup>GNDi</sup>, describing the hardening contributions from geometrically necessary dislocations (GNDs) (Ma and Hartmaier, 2014). The nonlocal constitutive model, in this context, is derived from the concept of super GND densities and incorporates the plastic strain gradient. Within a continuum mechanical approach, it is not possible to define crystallographic GNDs based on the Nye tensor in a unique way. To capture the internal stresses resulting from GNDs, the concept of super dislocations is followed, which allows us to define the dislocation Burgers vectors and line directions uniquely (Ma and Hartmaier, 2014). This hardening from plastic strain gradients is split up into an isotropic hardening part τ̂<sub>α</sub><sup>GNDi</sup> and a kinematic hardening part τ̂<sub>α</sub><sup>GNDk</sup>.

The second rank dislocation density tensor **G** in the reference configuration is computed from the curl of **F**<sup>p</sup> as introduced by Nye (1953),

$$G\_{ij} = -F\_{ik,l}^{\mathrm{p}} \Delta\_{jkl}, \tag{7}$$

where Δ<sub>jkl</sub> is the third rank permutation tensor and the index ", l" denotes the derivative with respect to the Cartesian coordinate l. It must be noted that in Equation (7) the dislocation density tensor is written in index notation (**G** = G<sub>ij</sub>). Since a reconstruction of meaningful crystallographic dislocation populations in a unique way is impossible, a unique definition of super GNDs is obtained by projecting the dislocation density tensor to the global Cartesian coordinates of the system. As a result, the stress fields of the crystallographic GNDs can be described with good accuracy (Ma and Hartmaier, 2014), and the GND density tensor can be segmented into nine independent parts ρ̄<sub>α</sub> by evaluating

$$\sum\_{\alpha=1}^{9} \bar{\rho}\_{\alpha} \mathbf{d}\_{\alpha} \otimes \mathbf{t}\_{\alpha} = \frac{1}{b} \mathbf{G}, \tag{8}$$

where **d**<sub>α</sub> and **t**<sub>α</sub> are permutations of the Cartesian unit vectors as determined in Ma and Hartmaier (2014), and b is the magnitude of the crystallographic Burgers vector. The super GND densities for α = 1, 2, 3 represent screw-type superdislocations, while the remaining 6 components represent edge-type superdislocations, which are vital for determining the internal stress fields as a consequence of the super GNDs.

The isotropic hardening for the dislocation slip contributed by these super GNDs can be expressed using a Taylor-type equation,

$$\hat{\tau}\_{\alpha}^{\text{GNDi}} = c\_{1} \mu b \sqrt{\sum\_{\beta=1}^{9} \chi\_{\alpha\beta}^{\text{GNDi}} \left| \bar{\rho}\_{\beta} \right|}. \tag{9}$$

Here, c<sub>1</sub> is the Taylor hardening coefficient or a geometrical factor [38], and µ is the shear modulus. χ<sub>αβ</sub><sup>GNDi</sup> is the cross hardening matrix between crystallographic mobile dislocations and super GNDs.

The long-range internal stresses, caused by GNDs in dislocation pile-ups, contribute to the kinematic hardening effect. This part is calculated by evaluating the second order gradient of **F**<sup>p</sup>, which results in a super GND gradient ρ̄<sub>α,l</sub> in the form

$$
\bar{\rho}\_{\alpha,l} = \frac{1}{b} G\_{jk,l} d\_{\alpha j} t\_{\alpha k} \,. \tag{10}
$$

By evaluating these gradients within a small volume of dimension L<sup>3</sup>, the internal stresses **S**<sup>GND</sup> in the intermediate configuration caused by dislocation pile-ups at grain boundaries can be calculated as explained in Ma and Hartmaier (2014). Thus, the kinematic hardening can be given by:

$$
\hat{\tau}\_{\alpha}^{\text{GNDk}} = \mathbf{S}^{\text{GND}} : \mathbf{M}\_{\alpha}. \tag{11}
$$

For the FCC crystal structure, dislocation slip on the common crystallographic ⟨110⟩{111} slip systems is considered. For the BCC crystal structure, on the other hand, only dislocation slip on the crystallographic ⟨111⟩{110} slip systems is taken into account.

#### 2.2.1. Damage Model

For the first application, damage homogenization using Machine Learning (ML), a formulation to compute the local damage is additionally needed for the prediction of the damage evolution (cf. section 4). The damage of a material can be assessed by using the damage parameter D, which is defined as the ratio of the damaged volume to the initial volume (cf. Lemaître, 1985) and can therefore take values between zero and one. The increase of damaged volume leads to a reduction of the stiffness of the material. In general, for an ideal isotropic and uniaxial case, the damage parameter D<sup>stiff</sup> can be described in terms of the Young's modulus as

$$D^{\rm stiff} = 1 - \frac{E^{\rm damage}}{E^{\rm initial}},\tag{12}$$

where E<sup>initial</sup> is the initial Young's modulus and E<sup>damage</sup> is the Young's modulus after the damage has occurred. More generally, both quantities can be interpreted as the material stiffness along a given loading path. In our model, the damage is calculated numerically using a ramp function, which depends on the equivalent plastic strain p. The equivalent plastic strain is computed from the Frobenius norm (Gentle, 2007) as

$$
p = \sqrt{\frac{2}{3}} \parallel \mathbf{E}\_{\mathrm{p}} \parallel\_{\mathrm{F}}, \tag{13}
$$

where the subscript F indicates the Frobenius norm, and **E**<sub>p</sub> is the plastic Green-Lagrange strain, which is computed by using the plastic deformation gradient **F**<sup>p</sup> (Haupt, 2002). The plastic deformation gradient is computed according to Equations (2) and (3) in section 2.2 using the plastic velocity gradient **L**<sup>p</sup>, which depends on the shear rate γ̇<sub>α</sub> and the Schmid tensor **M**<sub>α</sub>. After an initial threshold value of the plastic strain is reached locally, the damage increases linearly with the plastic strain. Once the upper limit of the plastic strain is reached, the damage parameter attains its maximum value. Locally, the damage parameter is computed as follows,

$$D = \frac{p - p\_1}{p\_2 - p\_1} \quad \text{for} \quad p\_1 \le p \le p\_2 \,\,,\tag{14}$$

where p<sub>1</sub> and p<sub>2</sub> are the lower and the upper limit of the plastic strain, respectively. In **Figure 2**, the damage model is shown graphically. For values smaller than the lower limit of the plastic strain, the damage parameter equals zero. For plastic strains higher than the upper limit, the damage parameter reaches its maximum value, which numerically is realized by setting the parameter close to, but not equal to, one (D<sub>max</sub> = 0.999). The damage evolution is the rate of the damage parameter. Here, the limits were chosen so that the resulting model reaches its uniaxial tensile strength at around 10% total strain: p<sub>1</sub> = 0.3 and p<sub>2</sub> = 0.5. Note that the limits were not chosen to describe a specific alloy.
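The local damage computation of Equations (13) and (14) can be sketched in a few lines. The sketch assumes that the plastic Green-Lagrange strain is obtained from the plastic deformation gradient in the usual way, i.e., as one half of (**F**<sup>p</sup>)<sup>T</sup>**F**<sup>p</sup> minus the identity; the function names and the example deformation gradient are hypothetical.

```python
import math


def green_lagrange_strain(F):
    """E = 1/2 (F^T F - I) for a 3x3 deformation gradient given as nested lists."""
    E = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            c_ij = sum(F[k][i] * F[k][j] for k in range(3))
            E[i][j] = 0.5 * (c_ij - (1.0 if i == j else 0.0))
    return E


def equivalent_plastic_strain(F_p):
    """Equation (13): scaled Frobenius norm of the plastic Green-Lagrange strain."""
    E_p = green_lagrange_strain(F_p)
    frobenius = math.sqrt(sum(E_p[i][j] ** 2 for i in range(3) for j in range(3)))
    return math.sqrt(2.0 / 3.0) * frobenius


def local_damage(p, p_lower=0.3, p_upper=0.5, d_max=0.999):
    """Ramp function: zero below p_lower, linear in between, capped at d_max."""
    if p <= p_lower:
        return 0.0
    return min((p - p_lower) / (p_upper - p_lower), d_max)


# Hypothetical isochoric plastic stretch of 10% along the first axis.
stretch = 1.1
F_p = [[stretch, 0.0, 0.0],
       [0.0, stretch ** -0.5, 0.0],
       [0.0, 0.0, stretch ** -0.5]]
p = equivalent_plastic_strain(F_p)
damage = local_damage(p)  # still zero, because p is below the lower limit
```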

#### 2.3. Homogenization Methods

In the previous section, the material model for the microscopic FE simulations was described. For the ML algorithms, homogenized values (or global values) that describe the RVE are used. In the following, the global homogenized parameters have the superscript RVE. The homogenization procedure is different for the two applications presented here (cf. sections 4 and 5). For the prediction of the damage evolution, the global values are homogenized according to the Hill-Mandel condition (Hill, 1963, 1972) in section 2.3.1 and with respect to the stiffness reduction in section 2.3.2. It is necessary to use such a volume-averaging technique because the damage needs to be calculated locally. For the prediction of microstructural features from the flow curve, macroscopic stress and strain tensors are calculated following the approach of Nemat-Nasser (Nemat-Nasser, 1999). In this case, we only need to formulate a macroscopic stress and strain tensor in order to calculate the flow curve. Therefore, we use a much simpler and numerically more efficient approach described in section 2.3.3. In this section, methods to homogenize global values or macroscopic properties from microstructure simulations are described.

#### 2.3.1. Volume-Average Method

From the FE simulation, the value of each Gauss point is extracted, i.e., eight values for each element (cf. section 2.1). The bullets in brackets stand for the parameter that is homogenized, i.e., the stresses and strains. The Gauss point values of each element are averaged, so one value for each element is obtained,

$$(\bullet)\_{e} = \frac{1}{8} \sum\_{\text{Gauss}=1}^{8} (\bullet)\_{\text{Gauss}} \,. \tag{15}$$

Here, the index Gauss refers to the current Gauss point, which can take values from one to eight. To obtain global representative values for each time step, the local values, which are the average values of the eight Gauss points, are averaged by using the element volume,

$$(\bullet)^{\text{RVE}} = \frac{1}{V^{\text{RVE}}} \sum\_{e=1}^{N\_{\text{el}}} (\bullet)\_{e} \cdot V\_{e},\tag{16}$$

where the index e indicates the current element and N<sub>el</sub> is the total number of elements. The symbol V is the volume, and V<sup>RVE</sup> is the total volume of the RVE. As Equation (16) shows, the global (homogenized) value is the sum of the local element values multiplied by the corresponding element volumes, which is then scaled by the total volume of the RVE. This averaging procedure is well established for stresses and strains (see Jänicke, 2010; Nguyen et al., 2012b) and is based on the Hill-Mandel condition (cf. Hill, 1963, 1972). Nevertheless, Equation (16) is not appropriate to define a suitable measure for the homogenized damage state: consider a microstructure that is fully damaged, i.e., a crack with distinct width runs through the entire ensemble of grains. Then, the volume fraction of the damaged areas may be less than a few percent of the entire microstructural volume. However, the microstructure is not able to sustain any load (in the direction that caused the damage evolution). Consequently, the volume-averaged damage would underestimate the effective damage state at the macroscale, as measured according to Equation (17), to a large extent: the volume average of the damage does not reflect the true physical properties of the microstructure. We thus propose a different homogenization scheme for the damage variable in the next subsection.
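The two averaging steps of Equations (15) and (16), together with the reason why they are unsuitable for the damage variable, can be illustrated with a small sketch. The RVE below is hypothetical: it consists of 100 equally sized elements, three of which are assumed to be fully damaged and to form a crack through the microstructure.

```python
def element_average(gauss_values):
    """Equation (15): mean of the Gauss point values of one element."""
    return sum(gauss_values) / len(gauss_values)


def volume_average(element_values, element_volumes):
    """Equation (16): volume-weighted mean over all elements of the RVE."""
    weighted_sum = sum(v * vol for v, vol in zip(element_values, element_volumes))
    return weighted_sum / sum(element_volumes)


# Hypothetical RVE: 100 equally sized C3D8 elements, 3 of them form a crack.
n_elements, n_cracked = 100, 3
gauss_damage = [[0.999] * 8 if e < n_cracked else [0.0] * 8
                for e in range(n_elements)]
element_damage = [element_average(values) for values in gauss_damage]
element_volumes = [1.0] * n_elements

averaged_damage = volume_average(element_damage, element_volumes)
print(f"volume-averaged damage: {averaged_damage:.4f}")
```

Although the crack would prevent this microstructure from carrying load, the volume-averaged damage is only about 3%, which is the underestimation described above.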

#### 2.3.2. Homogenization of the Damage Variable

As indicated earlier, a homogenization of the damage variable via volume averaging is not appropriate for defining a reasonable measure for the effective damage state that can be used for a description of the macroscopic behavior. In the available literature, some attempts have been made to solve this problem. Nguyen et al. developed a multiscale cohesive damage model to determine the macroscopic behavior of a quasi-brittle material. They homogenized the response of a microscale sample representing the heterogeneous microstructure inside the adhesive crack (see Nguyen et al., 2012a,b). Fish and Yu derived a closed-form expression relating microscopic, mesoscopic and overall strain and damage for brittle materials (Fish and Yu, 2001). These approaches are, however, only applicable to the small strain regime and to brittle or semi-brittle materials. Souza and Allen developed a homogenization-based multiscale framework for impact modeling of heterogeneous viscoelastic materials, in which the damage was modeled through a field of evolving microcracks using XFEM and a cohesive law (Souza and Allen, 2009). In the above-mentioned approaches, the correlation between damage evolution and large plastic strain is missing. It was, hence, necessary to develop an approach that is also valid in the large plastic strain regime. We, therefore, define a homogenization approach that is in accordance with the definition of the damage parameter (at the microscale):

$$D^{\rm RVE} := 1 - \frac{C\_{\rm D}}{C\_0},\tag{17}$$

where C<sub>D</sub> and C<sub>0</sub> define the effective structural stiffness of the microstructure in the damaged state (subscript D) and the initial state (subscript 0). Consequently, D<sup>RVE</sup> has an identical meaning to the local definition of the damage variable according to Equation (12). The important difference is, however, that Equation (17) also accounts for geometrical aspects. Thereby, D<sup>RVE</sup> depends on both the damage (evolution) and the microstructural arrangement provided by the specific microstructural composition, e.g., in terms of grain sizes, grain orientation and grain boundaries.

The values for the stiffness C<sub>D</sub> and C<sub>0</sub> can be extracted from the equivalent stress σ<sup>eq</sup> and the equivalent elastic strain ε<sub>e</sub><sup>eq</sup> (both scalar-valued quantities): The equivalent strain results from volume averaging of the local elastic strain components, which follow from the local total strain and the local plastic parts as functions of time. In a comparable manner, the equivalent stress results from the volume averaging of the stress distribution. Then, the initial stiffness is defined by:

$$C\_{0} := \frac{\sigma\_0^{\rm eq}}{\epsilon\_{\rm e,0}^{\rm eq}} \tag{18}$$

and for the damaged stiffness we define accordingly:

$$C\_{\rm D} := \frac{\sigma\_{\text{D}}^{\text{eq}}}{\epsilon\_{\text{e,D}}^{\text{eq}}}. \tag{19}$$

The initial stiffness represents the stiffness of the undamaged state, indicating that the tuple (ε<sub>e,0</sub><sup>eq</sup>, σ<sub>0</sub><sup>eq</sup>) can be read off the equivalent stress/equivalent strain curve at any load step before damage sets in. Here, the initial stiffness was computed as the slope between the first stress/equivalent strain point and the point corresponding to the maximum stress. The damaged stiffness C<sub>D</sub> evolves in time because the ratio of σ<sub>D</sub><sup>eq</sup> to ε<sub>e,D</sub><sup>eq</sup> is no longer constant (in contrast to C<sub>0</sub>): the crack evolution at the microscale renders the volume-averaged equivalent stress σ<sub>D</sub><sup>eq</sup> a monotonically decreasing function such that C<sub>D</sub> tends to zero, whereas C<sub>D</sub> = C<sub>0</sub> just before damage sets in. Accordingly, D<sup>RVE</sup> ∈ [0, 1], where D<sup>RVE</sup> = 0 indicates a completely intact and undamaged microstructure, whereas D<sup>RVE</sup> = 1 represents a completely damaged microstructure. Consequently, this measure can be used for future applications of the approach presented here: the microstructural behavior is computed for reference states on which the machine-learning algorithm is built. This results in an effective material model for the simulation at the macroscale that takes into account the microstructural effects synthesized in the effective damage parameter D<sup>RVE</sup>. The macroscopic damage evolution is computed as the change of the homogenized damage parameter D<sup>RVE</sup> with respect to the time t according to

$$\dot{D}\_{n}^{\text{RVE}} = \frac{D\_{n}^{\text{RVE}} - D\_{n-1}^{\text{RVE}}}{t\_{n} - t\_{n-1}},\tag{20}$$

where n indicates the current time step. Note that we apply this homogenization approach only for monotonic loading paths in this work.
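A minimal sketch of Equations (17) to (20) is given below. The homogenized stress and strain histories are hypothetical numbers chosen for illustration, and the initial stiffness is taken as the slope between the first point and the point of maximum stress, as described above.

```python
def initial_stiffness(stress_eq, strain_eq_elastic):
    """Slope between the first point and the point of maximum equivalent stress."""
    k_max = max(range(len(stress_eq)), key=lambda k: stress_eq[k])
    return ((stress_eq[k_max] - stress_eq[0])
            / (strain_eq_elastic[k_max] - strain_eq_elastic[0]))


def effective_damage(stress_eq, strain_eq_elastic, c_0):
    """Equation (17) with the secant stiffness C_D of Equation (19)."""
    return [1.0 - (s / e) / c_0 for s, e in zip(stress_eq, strain_eq_elastic)]


def damage_rate(damage, times):
    """Equation (20): backward difference of the homogenized damage."""
    return [(damage[n] - damage[n - 1]) / (times[n] - times[n - 1])
            for n in range(1, len(damage))]


# Hypothetical homogenized history (time in s, stress in MPa).
times = [0.0, 1.0, 2.0, 3.0]
strain_eq_elastic = [0.001, 0.002, 0.002, 0.002]
stress_eq = [200.0, 400.0, 300.0, 100.0]

c_0 = initial_stiffness(stress_eq, strain_eq_elastic)
d_rve = effective_damage(stress_eq, strain_eq_elastic, c_0)
d_rve_rate = damage_rate(d_rve, times)
```

For this data, the damage stays at zero up to the maximum stress and then grows as the secant stiffness drops, which mirrors the behavior described for C<sub>D</sub>.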

#### 2.3.3. Homogenization of Macroscopic Stress and Strain Tensors From Periodic Boundary Conditions

With respect to the periodic boundary conditions applied to the RVE, the global deformation is imposed on the four reference nodes V<sub>1</sub>, V<sub>2</sub>, V<sub>4</sub>, and H<sub>1</sub>, as highlighted in **Figure 1A**. The RVE boundary nodes are coupled to the kinematics of these reference nodes. Therefore, macroscopic quantities can be homogenized directly from the nodal displacements, reaction forces, and position vectors of these reference nodes as introduced in Kulosa et al. (2017). For further details on the implemented homogenization technique, the reader is referred to Boeff (2016) and Kulosa et al. (2017). The macroscopic strain tensor can be formulated from the nodal displacements u<sub>i</sub><sup>node</sup> and be mathematically expressed as:

$$
\epsilon^{\rm RVE} = \begin{bmatrix}
\frac{u\_1^{V\_2}}{\Delta x} & \frac{1}{2} \left(\frac{u\_1^{V\_4}}{\Delta y} + \frac{u\_2^{V\_2}}{\Delta x}\right) & \frac{1}{2} \left(\frac{u\_1^{H\_1}}{\Delta z} + \frac{u\_3^{V\_2}}{\Delta x}\right) \\
\frac{1}{2} \left(\frac{u\_1^{V\_4}}{\Delta y} + \frac{u\_2^{V\_2}}{\Delta x}\right) & \frac{u\_2^{V\_4}}{\Delta y} & \frac{1}{2} \left(\frac{u\_2^{H\_1}}{\Delta z} + \frac{u\_3^{V\_4}}{\Delta y}\right) \\
\frac{1}{2} \left(\frac{u\_1^{H\_1}}{\Delta z} + \frac{u\_3^{V\_2}}{\Delta x}\right) & \frac{1}{2} \left(\frac{u\_2^{H\_1}}{\Delta z} + \frac{u\_3^{V\_4}}{\Delta y}\right) & \frac{u\_3^{H\_1}}{\Delta z}
\end{bmatrix}.
\tag{21}
$$

Δx, Δy, and Δz are the dimensions of the periodic box in the global Cartesian coordinate system. Similarly, the macroscopic stress tensor can be formulated from the reaction force vectors **F**<sup>node</sup> at the four reference nodes and the current nodal position vectors **x**<sup>node</sup> of the reference nodes, which is given as:

$$\sigma^{\rm RVE} = \frac{1}{V^{\rm RVE}} \text{sym} [ (\mathbf{x}^{V\_4} - \mathbf{x}^{V\_1}) \otimes \mathbf{F}^{V\_4} + (\mathbf{x}^{V\_2} - \mathbf{x}^{V\_1}) \otimes \mathbf{F}^{V\_2} + (\mathbf{x}^{H\_1} - \mathbf{x}^{V\_1}) \otimes \mathbf{F}^{H\_1} ]. \tag{22}$$

The symmetrization function is defined as sym(**A**) = 1/2 (**A** + **A**<sup>T</sup>) for a tensor **A** and its transpose **A**<sup>T</sup>. With the formulated macroscopic stress and strain tensors, the von Mises stress (σ<sub>vM</sub>) and the equivalent plastic strain (p) can be calculated accordingly.
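Equations (21) and (22) can be evaluated with a few helper functions, sketched below. The reference node data are hypothetical and describe a uniaxial tension state; the sketch additionally assumes that V<sup>RVE</sup> is supplied by the caller and that V<sub>1</sub> sits at the origin. The strain tensor is written as the symmetric part of a displacement gradient whose columns are the displacements of V<sub>2</sub>, V<sub>4</sub>, and H<sub>1</sub> divided by Δx, Δy, and Δz, which reproduces Equation (21) component by component.

```python
import math


def sym(A):
    """Symmetric part 1/2 (A + A^T) of a 3x3 tensor."""
    return [[0.5 * (A[i][j] + A[j][i]) for j in range(3)] for i in range(3)]


def macroscopic_strain(u_v2, u_v4, u_h1, box):
    """Equation (21): V2, V4, and H1 control the x, y, and z direction."""
    nodes = (u_v2, u_v4, u_h1)
    gradient = [[nodes[j][i] / box[j] for j in range(3)] for i in range(3)]
    return sym(gradient)


def macroscopic_stress(x_v1, positions, forces, volume):
    """Equation (22): symmetrized sum of (x_node - x_V1) (x) F_node over V4, V2, H1."""
    total = [[0.0] * 3 for _ in range(3)]
    for x_node, f_node in zip(positions, forces):
        for i in range(3):
            for j in range(3):
                total[i][j] += (x_node[i] - x_v1[i]) * f_node[j]
    return [[component / volume for component in row] for row in sym(total)]


def von_mises(sigma):
    """Von Mises equivalent stress of a symmetric 3x3 stress tensor."""
    mean = (sigma[0][0] + sigma[1][1] + sigma[2][2]) / 3.0
    dev = [[sigma[i][j] - (mean if i == j else 0.0) for j in range(3)]
           for i in range(3)]
    return math.sqrt(1.5 * sum(dev[i][j] ** 2 for i in range(3) for j in range(3)))


# Hypothetical quasi-2D RVE (lengths in um, forces in uN) under uniaxial tension.
box = (10.0, 10.0, 0.1)
u_v2, u_v4, u_h1 = (0.1, 0.0, 0.0), (0.0, -0.03, 0.0), (0.0, 0.0, -0.0003)
strain = macroscopic_strain(u_v2, u_v4, u_h1, box)

x_v1 = (0.0, 0.0, 0.0)
positions = [(0.0, 9.97, 0.0), (10.1, 0.0, 0.0), (0.0, 0.0, 0.0997)]  # V4, V2, H1
forces = [(0.0, 0.0, 0.0), (200.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
stress = macroscopic_stress(x_v1, positions, forces, volume=10.0)
sigma_vm = von_mises(stress)
```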

# 3. MACHINE LEARNING

This section gives a short description of the types of supervised learning algorithms used in this work. In case of supervised learning (as opposed to unsupervised learning), the actual output is known and has to be approximated by the algorithm. In general, the algorithm learns to predict the target output for given features (input parameters) with a minimal error by adjusting parameters. A function **y**(**x**) is created by the Machine Learning (ML) algorithms, where **y** is the predicted output depending on the input features **x**. In general, the input and output are vectors, their length depending on the given problem. Here, for both applications (predicting the damage evolution in section 4 and predicting the grain size from the flow curve in section 5), there are several input features, so that the input is a vector. However, the output is a single scalar quantity. Furthermore, the target values are real-valued and known, and therefore supervised regression algorithms are used. For both cases, Support Vector regression (SVR) and Random Forest regression (RFR) algorithms are used. In this section, both algorithms (sections 3.1 and 3.2) are explained briefly with respect to regression.

#### 3.1. Support Vector Regression

Following the work of Hastie et al. (2008), SVR is an extension of linear regression that can also be used for non-linear problems. In the following, the theory of SVR is briefly described. A more detailed description can be found in **Appendix 7.3**. The main idea is to find a function fitting the given data points so that all points lie within a (small) distance ε of the function (see **Figure 3A**). In **Figure 3A**, a simple two-dimensional problem is shown, in which all data points are supposed to be described by a linear function. The green area is called the margin, and its width is equal to two times ε. To obtain the best fit, a convex optimization problem is solved that seeks the flattest function keeping the data points within the margin (cf. Smola and Schölkopf, 2004; Hastie et al., 2008). Furthermore, so-called slack variables ξ are introduced, which measure the distance by which the target distance ε is violated (cf. **Figure 3A**). The data points that lie on or outside the boundary of the margin are the so-called support vectors. In addition, a regularization or cost parameter C is specified. It balances the contradictory goals of a good fit vs. a simple model by weighting the penalty for the slack variables. With a larger value of C, outliers have more influence in shaping the predicted output. To enable the algorithm to develop complex non-linear functions, so-called kernels are introduced (Ng, 2016). Kernels are customizable to the needs of the target domain, which gives the algorithm the advantage of being adaptable to many problems. With the kernel function, it is possible to map the input data into an enlarged feature space. Since this mapping is in general non-linear, kernels enable SVRs to represent highly non-linear functions. In this work, the Gaussian radial basis function kernel

$$k\_{\text{rbf}}(\mathbf{x}\_1, \mathbf{x}\_2) = \exp\left(-\gamma \parallel \mathbf{x}\_1 - \mathbf{x}\_2 \parallel^2\right) \tag{23}$$

is used (Müller and Guido, 2017). The parameter γ controls the width of the Gaussian kernel. The decision function is then no longer linear, but rather a flexible weighted sum of Gaussian kernels.
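The building blocks described above can be written down compactly. The sketch below shows the kernel of Equation (23), the ε-insensitive loss that corresponds to the slack variables, and the resulting decision function as a weighted sum of kernels. The support vectors, weights, and intercept are hypothetical placeholders for values that a trained SVR would provide; no training is performed here.

```python
import math


def rbf_kernel(x1, x2, gamma):
    """Equation (23): Gaussian radial basis function kernel."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return math.exp(-gamma * squared_distance)


def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Deviation beyond the margin of half-width epsilon (zero inside the margin)."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def svr_predict(x, support_vectors, weights, intercept, gamma):
    """Decision function: intercept plus a weighted sum of Gaussian kernels."""
    return intercept + sum(
        w * rbf_kernel(sv, x, gamma) for sv, w in zip(support_vectors, weights)
    )


# Hypothetical trained model with two support vectors in a 2D feature space.
support_vectors = [(0.0, 0.0), (1.0, 1.0)]
weights = [2.0, -1.0]
intercept = 0.5
gamma = 0.5

prediction = svr_predict((0.0, 0.0), support_vectors, weights, intercept, gamma)
violation = epsilon_insensitive_loss(y_true=2.5, y_pred=prediction, epsilon=0.1)
```

A larger γ narrows each Gaussian, so the prediction is dominated by the nearest support vectors and the fitted function becomes more flexible.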

#### 3.2. Random Forest Regression

RFRs are a combination of multiple Decision Trees (DTs) or, more precisely in our case, regression trees. RFR is a prototypical ensemble method, which builds a highly accurate predictive model by combining many simple models (often referred to as weak learners). Each DT predicts an output, and their results are averaged. DTs are hierarchy-based models where the goal is to find the right answer by "asking as few if/else questions" as possible (Müller and Guido, 2017). For regression, each node tests whether a feature value is below or above a threshold value. The main idea is to split the feature space into regions using recursive binary partitioning (cf. Hastie et al., 2008), so that every new data point can be assigned to one region. A visual example of an RFR is given in **Figure 3B**. Here, the single DT has a so-called depth of two. The tree depth is equal to the largest number of consecutive nodes in a tree. Every DT starts with a root (the top node), which contains the first question, e.g., whether a chosen feature of the data point is smaller or larger than a specific value. The nodes of the last layer of the tree are called leaves. Each leaf corresponds to one target value, i.e., a single value of the output domain. Each data point is assigned to exactly one leaf by following the decisions down the tree. If a leaf contains only data points that correspond to the same target value, the leaf is called pure. Using DTs with pure leaves results in a model that can fit the training data perfectly, but can result in over-fitting. There are four important algorithm parameters that are tuned for the RFR in this work (cf. Müller and Guido, 2017). The number of DTs used (estimators) influences the amount of over-fitting and also the computation time. In addition, the maximum depth of each tree can be chosen specifically, or the tree is built until each leaf is pure or reaches a minimum number of samples inside the node. Furthermore, a criterion to decide whether to split a node needs to be defined, e.g., the mean squared error. Another important parameter is the maximum number of features used for splitting a node. In general, a low value of this parameter means that the trees differ from each other and that each tree may need to be very deep to be sufficiently accurate. A high maximum feature parameter, or setting the value equal to the total number of features, results in DTs that are quite similar and thus defeats the purpose of an ensemble in the first place. The training data are fitted well by building deep trees and using the most distinctive features.
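The averaging over trees and the tuning parameters described above can be sketched with scikit-learn's `RandomForestRegressor`. The data below are synthetic and the parameter values are illustrative assumptions, not the settings used in this work; the split criterion is left at its default because its name differs between scikit-learn versions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic regression data (illustration only).
rng = np.random.RandomState(0)
X = rng.uniform(size=(200, 4))
y = np.sin(4.0 * X[:, 0]) + 0.5 * X[:, 1]

forest = RandomForestRegressor(
    n_estimators=100,   # number of regression trees in the ensemble
    max_depth=None,     # grow each tree until its leaves are pure
    max_features=0.5,   # fraction of features considered at each split
    random_state=0,
)
forest.fit(X, y)

# The forest prediction is the average of the individual tree predictions.
tree_predictions = np.array([tree.predict(X) for tree in forest.estimators_])
mean_of_trees = tree_predictions.mean(axis=0)
forest_prediction = forest.predict(X)
```

Lowering `max_features` decorrelates the trees, which is the mechanism the ensemble relies on to reduce over-fitting.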

# 4. HOMOGENIZE DAMAGE EVOLUTION FROM MICRO- TO MACROSCALE

As mentioned in section 1, a new method to map damage from the micro- to the macroscale using Machine Learning (ML) is proposed. Based on the described representative volume element (RVE) (cf. section 2.1) and the local crystal plasticity model (cf. section 2.2) with damage (cf. section 2.2.1), several finite element (FE) simulations using Abaqus (version 6.12–3) are conducted. Here, the local crystal plasticity model is used; hence, no influence of the geometrically necessary dislocations (GNDs) is considered. The main aim of the damage evolution application is to show that the global material response, gained from FE simulations, can be generally approximated with ML algorithms. For this application, we do not compare results obtained with different meshes. The material parameters are given in Table 4 in the **Appendix 7.1**. First, the data set for the ML algorithms is explained (cf. section 4.1), then the ML parameters are presented (cf. section 4.2). Finally, the results are given in section 4.3. Note that all parameters, such as stresses and strains, are the global, hence homogenized (cf. section 2.3), parameters. For simplicity, the superscript RVE of the global parameters is omitted throughout the current section 4.

# 4.1. Data Set

A variety of loading states are simulated to make the data base valid for damage occurring under general monotonic loading paths. Hence, nine displacement-controlled simulations with different loading states are performed: uniaxial tension, biaxial tension cases, and shearing as shown in **Figure 1B**. The nine loading cases comprise uniaxial tension in x- (Tx) and y-direction (Ty) and biaxial tension Txy, T2xy, and Tx2y (see **Figure 1B**). In addition, four shearing cases were applied: shearing in x- (Sx, S2x) and y-direction (Sy, S2y) according to **Figure 1B**. The RVE used for the creation of the data set for ML is presented in **Figure 1A** in section 2.1. It contains 51 grains with a mean grain size of 59 µm and a standard deviation of 10 µm, which results in a grain size range between 40 and 90 µm. The material model, as well as the damage model and the homogenization methods, are described in sections 2.2 and 2.3, respectively. First, the local results are presented. Then the global material behavior is presented. As an example, **Figure 4** shows contour plots of the von Mises stress and the damage parameter for uniaxial tension in x-direction.

In both **Figures 4A,B**, a strain localization in the form of a band can be seen inside the RVE. Note that the damaged zone is split up because of the periodic boundary conditions. It is due to such a morphology of the damage band that a new homogenization scheme is required to homogenize it from the micro- to the macroscale (see section 2.3.2). At the flanks of the localization band, the stress is close to zero and the damage parameter has reached its maximum of 0.999. Furthermore, it is noted that the damage band propagates through the grains, i.e., in a transgranular manner, as one would expect for a ductile material, where damage and fracture occur by void nucleation, growth and coalescence. From the simulations, relevant parameters for the homogenization are extracted: equivalent total, elastic and plastic strain, equivalent plastic strain rate, von Mises and hydrostatic stress, as well as the element volume.

Locally, the parameters are computed as follows: The equivalent plastic strain is computed as described in section 2.2.1 and its rate is computed equivalently to the rate of the damage parameter according to Equation (20). The equivalent total and elastic strains are computed in the same way as the equivalent plastic strain using the Frobenius norm and the Green-Lagrange strain (cf. Equation 13). The total deformation gradient **F** is calculated as the gradient of the displacement, and the elastic deformation tensor is computed as **F**<sup>e</sup> = **F** **F**<sup>p−1</sup> (Haupt, 2002). The von Mises and hydrostatic stress are computed according to Gross et al. (2011). The extracted values are homogenized as described in sections 2.3.1 and 2.3.2: The global stress and strain values are the volume average of the local (element) values, and the global damage is calculated based on the stiffness reduction of the entire RVE. This results in eight global parameters: equivalent plastic strain (p) and its rate (ṗ), equivalent total (ε<sub>t</sub><sup>eq</sup>) and equivalent elastic strain (ε<sub>e</sub><sup>eq</sup>), von Mises stress (σ<sub>vM</sub>), hydrostatic stress (σ<sub>hyd</sub>), and the damage parameter (D) and its rate (Ḋ). After the homogenization, a further pre-processing of the points is applied (see **Appendix 7.2**), which spaces the data equally with respect to the equivalent plastic strain. Each data point represents one time step of the FE simulation. For each time step, there is a set of parameters consisting of the global parameters previously mentioned. Therefore, the complete data set has the size 9 × (•) × 8, where (•) is the number of time increments for each of the 9 loading cases applied to the single RVE (cf. **Figure 1A**), and 8 is the number of global parameters (p, ṗ, σ<sub>vM</sub>, σ<sub>hyd</sub>, ε<sub>t</sub><sup>eq</sup>, ε<sub>e</sub><sup>eq</sup>, D, Ḋ). In total, the nine loading cases comprise 3454 time increments. For all data points used in this application, the reader is kindly referred to Data Sheet 1\_v1 in the **Supplementary Material**. The global values are used as the data set for the training and testing of the ML algorithms. The number of training and test data has been verified to be sufficient by using so-called learning curves, which are further described in section 4.2. The global material response in terms of the von Mises stress and damage rate with respect to the equivalent total strain can be seen in **Figures 5A,B**. The global behavior is given in the following by showing the five out of the nine loading cases with the most significant difference in the material response.
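The volume averaging of local stress and strain values mentioned above can be sketched as follows. This minimal example covers only the volume-weighted mean of element values; the global damage, which is obtained from the stiffness reduction of the RVE, is not included. The function name `volume_average` and the numerical values are hypothetical.

```python
import numpy as np


def volume_average(local_values, element_volumes):
    """Volume-weighted mean of element values over the RVE."""
    local_values = np.asarray(local_values, dtype=float)
    element_volumes = np.asarray(element_volumes, dtype=float)
    return np.sum(local_values * element_volumes) / np.sum(element_volumes)


# Hypothetical local von Mises stresses (MPa) and element volumes.
sigma_vm_local = np.array([100.0, 200.0, 400.0])
volumes = np.array([1.0, 1.0, 2.0])

# (100*1 + 200*1 + 400*2) / 4 = 275
sigma_vm_global = volume_average(sigma_vm_local, volumes)
```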

It can be seen in **Figures 5A,B** that different loading conditions result in (quantitatively) different stress and damage evolution, although the general curve shapes are (qualitatively) similar. Each loading condition shows a distinct starting point for the initiation of damage, which corresponds to the maximum stress occurring at different global strains. In addition to the given global plots, it is worth having a look at the maximum global damage parameter. For the uniaxial and biaxial tension, the value is quite similar: 35.77% of maximum global damage for biaxial tension (Txy), and 35.78 and 37.7% for uniaxial tension in x-direction (Tx) and y-direction (Ty), respectively. The two shearing cases Sx and S2y have a lower maximum value for the global damage parameter. For shearing in x-direction (Sx), the maximum global damage occurring is 24.86%, and for S2y it is 27.18%. It should be noted that even though the tension cases share a similar maximum global damage value, the evolution of damage with respect to the total equivalent strain is different, as seen in **Figure 5B**. In the following, the ML models and their parameters are presented.

# 4.2. Machine Learning Models and Parameters

For this application, Support Vector regression (SVR) and Random Forest regression (RFR) are used to predict the damage evolution Ḋ. The ML is conducted using Python and scikit-learn 0.19.1 (cf. Pedregosa et al., 2011). The data set is split into training (75%) and testing (25%) data sets. For validation, the training set with 75% of the data is used as the "complete" data set, and therefore further split into a training set (for validation purposes) with 56.25% of all data points, and a validation set with 18.75% of the data. The testing data set is unseen data that is only used for evaluating the final model, i.e., after the final ML parameters are set. The validation set acts as a test set during the fitting of the ML parameters. Before splitting the data into sets, the order of the data was randomized in a way that can be reproduced (constant random state of 666; Müller and Guido, 2017; Pedregosa et al., 2011). To assess the accuracy of the learning algorithms, the so-called R<sup>2</sup> score is used (Müller and Guido, 2017), which is computed as one minus the ratio of the mean squared error to the variance (Pedregosa et al., 2011),

$$R^2 = 1 - \frac{\sum \left(y\_{\text{true}} - y\_{\text{pred}}\right)^2}{\sum \left(y\_{\text{true}} - y\_{\text{true}, \text{mean}}\right)^2}. \tag{24}$$

Here, y represents the output vector, the index "true" indicates the reference output data, "true, mean" the mean value of the reference output data, and the index "pred" the output data predicted by the ML algorithm. The total number of data points is assessed to be sufficient by using learning curves (cf. Ng, 2016) and cross-validation. Learning curves are a tool to check whether the number of data points used for training and testing is sufficient (Pedregosa et al., 2011). The training data is split several times into different set sizes to see the development of the training and validation score with respect to the number of data points used. Here, RFR is used to train and validate the model since its training process is very robust and shows only little sensitivity to the training parameters. As mentioned in section 4.1, a total number of 3454 data points are available. For the training and validation, 56.25 and 18.75% of the data is used, i.e., 2071 (training) and 519 (validation) data points, respectively. The training data is split seven times, so that the following absolute training split sizes result: 207, 517, 828, 1139, 1449, 1760, 2071. The validation set is 20% of each split. The resulting training and validation scores converge after using 1449 training data points to 97.3–97.5% for training and 82.3–82.8% for validation.

Selecting the most predictive subset of features can help to avoid over-fitting. Therefore, the features were chosen according to feature importance methods and to a ductile damage model from the literature (cf. Equation 25), which is formulated in a mathematically closed form as an analytical function. Feature importance is used to assess the influence of each feature on the result. The importance can be understood as a measure of how informative each feature is and therefore of how strongly it shapes the result. For the feature importance, RFR is used (cf. section 3.2) with the only non-default parameter being the number of Decision Trees (DTs), which is set to 500. Note that feature importance gives a rank of all features with respect to their impact on the results. Less important features are not necessarily trivial, and neglecting them does not automatically improve the results. Nevertheless, feature importance can provide an understanding of the relationship between input and output parameters with respect to the ML algorithms. As mentioned above, the training data contain 56.25% of the data, and the validation set accounts for 18.75% of all data points. The other 25% of the data is the test set, i.e., the unseen data that is only used after the training process to assess the ability of the ML algorithm to generalize. As mentioned in section 4.1, for the feature importance all extracted features are used without additional polynomial features or interactions. The results of the feature importance are presented in a bar plot in **Figure 6**.
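The nested data split and the R<sup>2</sup> score of Equation (24) can be sketched as follows. The data are a synthetic stand-in (400 samples, 7 features), not the data set of this work; only the split fractions and the random state of 666 follow the text. The last line is an assumption about how the listed learning-curve sizes arise: seven evenly spaced fractions between 0.1 and 1.0 of the 2071 training points, truncated to integers, appear to be consistent with the values given above.

```python
import numpy as np
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for features and target (illustration only).
rng = np.random.RandomState(666)
X = rng.normal(size=(400, 7))
y = X[:, 0] ** 2 + 0.1 * X[:, 1]

# First split: 75% (training + validation) and 25% test data.
X_trval, X_test, y_trval, y_test = train_test_split(
    X, y, test_size=0.25, random_state=666)
# Second split: 56.25% training and 18.75% validation of all data.
X_train, X_val, y_train, y_val = train_test_split(
    X_trval, y_trval, test_size=0.25, random_state=666)

fractions = (len(X_train) / len(X), len(X_val) / len(X), len(X_test) / len(X))


def r2(y_true, y_pred):
    """R^2 score as written in Equation (24)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Trivial constant predictor, used only to compare both R^2 implementations.
y_dummy = np.full_like(y_test, y_train.mean())
score_manual = r2(y_test, y_dummy)
score_sklearn = r2_score(y_test, y_dummy)

# Assumed construction of the learning-curve training set sizes.
split_sizes = (np.linspace(0.1, 1.0, 7) * 2071).astype(int)
```

A model that always predicts the mean of the reference data has a score of zero, so the score measures the improvement over this trivial baseline.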

The feature importance analysis identifies the damage parameter as the most informative feature with a validation score of 89.37%. This leaves a total importance of around 10% for all other features, with the hydrostatic stress being the second most relevant feature (3.4%) and the plastic strain rate the least important feature (0.9%). Selecting the more important half of the features (D, σ<sub>vM</sub>, σ<sub>hyd</sub>, ε<sub>t</sub><sup>eq</sup>) produces a validation score of about 94.58%. One can see that the damage parameter and the stresses seem to be the most relevant, based on feature importance. For the RFR, selecting only the most relevant features (D, σ<sub>vM</sub>, σ<sub>hyd</sub>, ε<sub>t</sub><sup>eq</sup>) leads to better results than other feature combinations. In contrast, these input parameters induce a lower accuracy for the SVR. Choosing the same features for SVR as for RFR results in a training score of just about 77.6% and a test score of 82.2%. Both score values are below an acceptable level. The SVR cannot extract enough information from the given features to approximate the damage rate sufficiently. Leaving out features therefore causes an under-fitting problem, so that all features are used for the SVR: D, p, ṗ, σ<sub>vM</sub>, σ<sub>hyd</sub>, ε<sub>t</sub><sup>eq</sup>, ε<sub>e</sub><sup>eq</sup>. Taking a look at the analytical ductile damage model,

$$\dot{D} = \left(\frac{\sigma\_{\rm vM}}{2 \,ES \, (1 - D)^2} \left[\frac{2}{3} (1 + \nu) + 3(1 - 2\,\nu) \left(\frac{\sigma\_{\rm hyd}}{\sigma\_{\rm vM}}\right)^2\right] \right)^s \cdot \dot{p},\tag{25}$$

one can see a similarity in the input parameters compared to the results of the feature importance (Chaboche, 1988; Ambroziak, 2007). In the above Equation (25), s and S are material damage parameters. In addition, constant material parameters such as the Young's modulus (E = 228.96 GPa) and Poisson's ratio (ν = 0.27) are used, which can be calculated from uniaxial stress and strain curves. The input parameters of the analytical damage model are similar to the parameters selected by the feature importance, namely the damage parameter and the stresses. Later, the ML results are compared to the analytical damage model given in Equation (25) to investigate whether ML algorithms can describe the damage evolution at least as well as a well-established closed-form damage model.
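For illustration, Equation (25) can be transcribed term by term into a short function. This is a sketch under assumptions: the function name `damage_rate` and the example state are hypothetical, all stress-like quantities are assumed to be given in MPa (so that E = 228960 MPa), and the values s = 5.06 and S = 0.24 MPa are those reported for the analytical model in section 4.3.

```python
def damage_rate(sigma_vm, sigma_hyd, D, p_dot, E, nu, S, s):
    """Damage rate according to Equation (25), transcribed term by term."""
    # Bracketed term depending on the ratio of hydrostatic to von Mises stress.
    bracket = (2.0 / 3.0) * (1.0 + nu) \
        + 3.0 * (1.0 - 2.0 * nu) * (sigma_hyd / sigma_vm) ** 2
    # Term in round brackets, raised to the power s in Equation (25).
    base = sigma_vm / (2.0 * E * S * (1.0 - D) ** 2) * bracket
    return base ** s * p_dot


# Material parameters from the text; units assumed to be MPa throughout.
E, nu = 228960.0, 0.27
S, s = 0.24, 5.06

# Hypothetical state for illustration (uniaxial tension: sigma_hyd = sigma_vm / 3).
rate = damage_rate(sigma_vm=500.0, sigma_hyd=500.0 / 3.0,
                   D=0.1, p_dot=0.01, E=E, nu=nu, S=S, s=s)
```

The inputs of this function (D, σ<sub>vM</sub>, σ<sub>hyd</sub>, ṗ) mirror the features discussed above, and the damage rate vanishes when no plastic strain accumulates (ṗ = 0).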

Furthermore, cross-validation is used to find the best ML parameters. First, the most appropriate method to scale the data is determined for the SVR (as RFR does not require a scaling of the data). The input data are scaled to zero mean and unit variance (standard scaler; Pedregosa et al., 2011). Moreover, cross-validation is used to assess the most suitable kernel and whether to use additional polynomial features. In this case, the Gaussian kernel and no additional polynomial features result in the highest accuracy. Furthermore, grid-search, i.e., a search for the parameter set that results in the highest accuracy, is used to find the best values of the regularization parameter C (cf. Equation 25) and the Gaussian kernel coefficient γ (cf. Equation 32). During grid-search, both parameters are fitted simultaneously. Here, the epsilon-SVR model is used, which is named after the parameter ε that can be found in Equation (28) of the **Appendix 7.3**. This precision parameter defines the distance between data point and target value that is still considered accurate and has no negative influence on the overall accuracy (Pedregosa et al., 2011). For the RFR, cross-validation is used to choose the best number of DTs, the maximum tree depth and the split criterion of a node. A number of 500 DTs gives the best results with respect to a reasonable compromise on the computation time. Each DT is built until all leaves are pure, i.e., each last node corresponds only to a single target value, and the criterion to split a node is the mean absolute error. In **Table 1**, the optimum parameters of both ML algorithms are summarized. The other parameters, as defined in the scikit-learn library (Pedregosa et al., 2011), are set to their default values. RFR is rather robust with respect to the parameter values. Generally, SVR is more sensitive to parameter tuning. Therefore, its parameters were tuned within a smaller range. Within this range, the SVR parameters are not as sensitive to tuning. For example, changing the values of the parameters C and γ from their optimized values (based on the grid-search method) by 10% changes the training score by about 0.02% and the test score by around 0.04%. With the described data set and the fitted ML parameters, the two algorithms are trained. The results of the training processes are given in the next section 4.3.

FIGURE 6 | … shown up to 4% for clarification because all features, except for the damage parameter, show values < 4%.
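The scaling and the simultaneous grid-search over C and γ described above can be sketched with a scikit-learn pipeline. The data are synthetic, and the grid values, the value of `epsilon` and the number of folds are illustrative assumptions rather than the settings used in this work. Placing the scaler inside the pipeline ensures that it is fitted only on the training part of each cross-validation fold.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic regression data (illustration only).
rng = np.random.RandomState(666)
X = rng.uniform(-1.0, 1.0, size=(300, 3))
y = np.exp(X[:, 0]) + X[:, 1] * X[:, 2]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=666)

# Standard scaling followed by epsilon-SVR with a Gaussian (RBF) kernel.
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("svr", SVR(kernel="rbf", epsilon=0.01)),
])

# Hypothetical search grid; C and gamma are varied simultaneously.
param_grid = {
    "svr__C": [1.0, 10.0, 100.0],
    "svr__gamma": [0.01, 0.1, 1.0],
}
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

best_params = search.best_params_
test_score = search.score(X_test, y_test)   # R^2 on the unseen test data
```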

#### 4.3. Results and Discussion

For the homogenization of damage, two algorithms are used: SVR and RFR. The same randomly partitioned data set for the training and the testing process is used for both algorithms. The final training processes are conducted by using the previously defined parameters (cf. **Table 1**) and by using the features that lead to the best results as described in section 4.2. For SVR, the training and testing processes both reach a high accuracy: 99.73% (training) and 98.25% (testing). The high accuracy for training as well as testing indicates no over- or under-fitting (high variance or high bias) problems. The same applies for the RFR: the training score is 97.66%, and the test score is 97.48%. The results of the algorithms are presented in **Figures 7A,B**. In both cases, only the testing data set (25% of all data) is displayed in the form of a plot of the predicted against the target damage rate. The red lines in both figures represent the 5% mismatch area calculated based on the R<sup>2</sup> score (see Equation 24). Here, the predicted data are the output of the ML algorithm, and the real data are the reference damage rate gained from the simulations.

TABLE 1 | ML parameters used for SVR and RFR (scikit-learn library, cf. Pedregosa et al., 2011); other parameters are default values.

From the data set, one can see that the majority of data points have damage rate values below 1/s. Hence, the damage evolution is predicted more accurately for such values. Even though both algorithms have a sufficiently high accuracy on the test set, the SVR is able to approximate the damage rate more accurately for higher damage rates. The SVR fails to approximate the damage rate sufficiently for values near zero, as one can see in **Figure 7A**, where a reference damage rate of around 0.3/s is predicted for one point. The RFR has a lower accuracy for larger damage rate values but can approximate values near zero more accurately than the SVR. Some data points were predicted incorrectly with an error of more than 5% for both algorithms, but the SVR shows less scattering inside the ±5% mismatch area.

In **Figure 8A**, the ML algorithm results are given in a damage rate against equivalent total strain plot for five loading cases. Both ML algorithms are able to capture the damage evolution with increasing strain, even though not every value can be predicted perfectly. Hence, the ML algorithms are capable of predicting the damage evolution for different loading states precisely. In general, the ML algorithms can approximate the material response with respect to damage behavior almost as well as the full-field FE simulations, as shown in **Figures 7**, **8**. The comparison of the trained ML algorithms to the analytical damage model is given in **Figure 8B** for the test data points of the uniaxial tension in x-direction. For the analytical damage model, the two material parameters had to be adjusted: s = 5.06 (−) and S = 0.24 (MPa) (cf. Equation 25). Both SVR and RFR are trained as described previously and shown in **Figure 7**. The ML algorithms are able to describe the damage evolution well, as mentioned above. Nevertheless, SVR shows a slight over-fitting problem as the damage rate marginally decreases after the maximum of around 1.58 (1/s) (cf. **Figure 8B**). For the RFR, one can see a small roughness in the curve, but no over-fitting is visible. Consequently, the RFR method is more robust in describing the damage evolution for the presented cases than the SVR method. Furthermore, the fitting of the algorithm parameters is less demanding for RFR than for SVR.

The analytical damage model is able to describe the general damage evolution (see blue line in **Figure 8B**). Nonetheless, some limitations of the analytical model are worth noting. According to the analytical model, a damage evolution is visible even before the actual damage initiation occurs (after about 10% of total strain). The reason for this is that in the numerical model we explicitly prescribed a strain limit as the initiation criterion, while the damage evolves in the analytical model as soon as plasticity occurs; however, due to the selection of parameters it stays small until some level of plastic strain is reached. After the actual damage initiation, the analytical damage model also shows a gradual increase, although some difference is observed between the analytical model and the numerical simulations regarding the point of sharp increase in damage. Moreover, the analytical model is compared to only one loading state, and its generalization to a variety of loading states would require re-adjusting its material parameters. Furthermore, the analytical model does not allow microstructural quantities to be taken into account.

FIGURE 7 | ML results of the test set (25% of the data) with predicted values plotted against target values, a ±5% mismatch border (red line) of the R<sup>2</sup> score and points with an R<sup>2</sup> score lower than or equal to 95% (cyan color). (A) SVR with a score of R<sup>2</sup> = 98.25%. (B) RFR with a score of R<sup>2</sup> = 97.48%.

FIGURE 8 | … tension in x- and y-direction: Tx and Ty, biaxial tension: Txy, shearing in x- and y-direction: Sx and S2y (numerical data points before damage initiation are not plotted) and (B) compared to the analytical damage model according to Equation (25) with the parameters s = 5.06 (−) and S = 0.24 (MPa) for the test data of the uniaxial tension in x-direction.

# 5. PROPERTY-BASED DESIGN OF MICROSTRUCTURES

With the micromechanical modeling approach, the influence of important microstructural features on the mechanical response can be investigated through numerical simulations, yielding microstructure-property relationships. Thus, it is possible to use synthetic microstructures in the form of representative volume elements (RVEs), together with their homogenized mechanical response resulting from micromechanical simulations, as training data for Machine Learning (ML) algorithms. Consequently, the input parameters of the ML models are the required mechanical properties, and the trained models shall recommend microstructures that possess such properties, which represents one way of microstructure design.

## 5.1. Virtual Mechanical Testing of RVEs

In a first step, 74 RVEs consisting of 100 grains with various grain size distribution parameters following a log-normal distribution function were generated using the dynamic microstructure generator (DMG) introduced in section 2.1. In this context, the average grain size µ and the standard deviation σ are varied between 6–13 µm and 0.1–1 µm, respectively. To exclude any influence of crystallographic orientation on the deformation behavior of RVEs, 100 different sets of randomly chosen Euler angles have been assigned to all RVEs. In this way, the remaining factor influencing the strain hardening behavior must be the grain size distribution parameters of the microstructure. In the next step, the nonlocal crystal plasticity model described in section 2.2 is implemented as a user-defined material model (UMAT) and applied in a finite element (FE) simulation with the commercial software ABAQUS to assess the mechanical response of RVEs. By using a nonlocal crystal plasticity model, size effects including the influence of grain size are taken into account. For this part of the study, a BCC crystal structure is assigned to all grains in the RVEs; nonlocal crystal plasticity parameters are summarized in Table 4 in the **Appendix 7.1** (Vajragupta et al., 2017).

# 5.2. Homogenization of Empirical Hardening Law

In the next step, the mechanical response of RVEs is simulated under a uniaxial tension loading condition, and macroscopic flow curves are homogenized from reference nodes using the method introduced in section 2.3.3. Examples of two RVEs with different grain size distribution parameters and the corresponding homogenized flow curves are shown in **Figure 9**. With a nonlocal crystal plasticity model, the influence of the grain size on the strain hardening behavior can be observed. These results support the validity of the implemented strain gradient crystal plasticity model and demonstrate that grain size effects can be incorporated properly in microstructure simulations. For the sake of simplicity, these flow curves are fitted with an empirical isotropic hardening law in order to reduce the dimensionality of the training data. In this context, the modified Voce law (Kim et al., 2013) is chosen and expressed as

TABLE 2 | Optimized ML parameters of SVR and RFR (scikit-learn library, cf. Pedregosa et al., 2011) for prediction of microstructural features from flow curve.

$$
\sigma \text{s} = Y\_0 + R\_0 p + R\_{\text{inf}} (1 - \exp(-\beta p)). \tag{26}
$$

Y<sub>0</sub>, R<sub>0</sub>, R<sub>inf</sub>, and β are material parameters to be determined, and p is the equivalent plastic strain. To parameterize the aforementioned hardening law from the results of the RVE simulations, the nonlinear least-squares fitting method is applied (Bates and Watts, 1988). As a result, two sets of calibrated modified Voce isotropic hardening parameters from two selected RVE simulations are used to plot flow curves as illustrated in **Figure 9**. From the comparison, both fitted flow curves are in good agreement with the simulation results and can be used to represent the microstructure simulations. Furthermore, the evolution of these fitted material parameters with respect to the average grain size is plotted in **Figure 10**.
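The nonlinear least-squares fit of the modified Voce law in Equation (26) can be sketched with `scipy.optimize.curve_fit`. This is an illustrative assumption about tooling, since the fitting software used in this work is not stated. The "flow curve" below is generated from hypothetical parameter values and is not simulation data.

```python
import numpy as np
from scipy.optimize import curve_fit


def modified_voce(p, Y0, R0, Rinf, beta):
    """Modified Voce hardening law of Equation (26)."""
    return Y0 + R0 * p + Rinf * (1.0 - np.exp(-beta * p))


# Hypothetical parameters used to generate a synthetic flow curve (MPa).
true_params = (300.0, 500.0, 150.0, 40.0)
p = np.linspace(0.0, 0.2, 101)            # equivalent plastic strain
sigma = modified_voce(p, *true_params)    # synthetic flow stress

# Nonlinear least-squares fit starting from a rough initial guess.
initial_guess = (250.0, 400.0, 100.0, 20.0)
fitted_params, covariance = curve_fit(
    modified_voce, p, sigma, p0=initial_guess, maxfev=10000)
```

The four fitted values replace the full flow curve as a low-dimensional description of the hardening behavior, which is what makes them convenient as ML features.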

From **Figure 10**, the influence of the average grain size on the fitted material parameters of the modified Voce law is observed. According to Equation (26), Y<sub>0</sub> is directly related to the yield stress. The fitted Y<sub>0</sub> plotted in **Figure 10A** decreases linearly with an increasing average grain size, and the standard deviation causes a scatter of Y<sub>0</sub> at the same average grain size. From **Figures 10B,C**, R<sub>0</sub> and R<sub>inf</sub> decrease non-linearly with larger average grain size. These two parameters behave similarly to the Hall-Petch relation. However, the standard deviation does not contribute to a scatter of R<sub>0</sub> and R<sub>inf</sub>. The parameter β, which inversely governs the slope of the hardening law, increases with an increasing average grain size. With respect to the hardening law, a smaller average grain size results in a more pronounced strain hardening behavior. In the next step, these microstructure simulation results are fed as training data to the ML models.

### 5.3. Training of Machine Learning Models

For this application, Support Vector regression (SVR) and Random Forest regression (RFR) are implemented to predict the average grain size producing a given material behavior, which is described by the parameters of the modified Voce hardening law. SVR and RFR are performed using Python and scikit-learn 0.19.1 (Pedregosa et al., 2011). The data are split into training (80%) and testing (20%) data sets. Similar to section 4, the R<sup>2</sup> score is used to evaluate the performance of the ML models. To determine the hyperparameters of the selected ML models yielding the highest accuracy, grid-search with 3-fold cross-validation is applied, which exhaustively considers all combinations of hyperparameters in a search space.

In **Table 2**, the optimized parameters of both ML models are summarized, while other parameters as introduced in the scikit-learn library (Pedregosa et al., 2011) are set to default values.
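The inverse mapping from hardening parameters to grain size, together with the grid-search over all hyperparameter combinations, can be sketched as follows for the RFR. The relations between grain size and Voce parameters used to generate the data are hypothetical placeholders, as are the grid values; only the number of samples (74), the 80/20 split and the 3-fold cross-validation follow the text.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

# Hypothetical data: 74 average grain sizes (µm) and made-up Voce parameters.
rng = np.random.RandomState(0)
d = rng.uniform(6.0, 13.0, size=74)
features = np.column_stack([
    200.0 + 600.0 / np.sqrt(d),   # Y0   (placeholder relation)
    300.0 + 400.0 / np.sqrt(d),   # R0   (placeholder relation)
    100.0 + 200.0 / np.sqrt(d),   # Rinf (placeholder relation)
    20.0 + 2.0 * d,               # beta (placeholder relation)
])

# 80% training and 20% test data.
X_train, X_test, d_train, d_test = train_test_split(
    features, d, test_size=0.2, random_state=0)

# Grid-search evaluates every combination in the grid with 3-fold CV.
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 5]}
search = GridSearchCV(RandomForestRegressor(random_state=0), param_grid, cv=3)
search.fit(X_train, d_train)

n_candidates = len(search.cv_results_["params"])   # 2 x 2 combinations
test_score = search.score(X_test, d_test)          # R^2 on the test data
```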

#### 5.4. Results and Discussion

The training processes of both ML models for predicting the grain size from the flow curve are performed by using the defined parameters (cf. **Table 2**). For SVR, both training and testing give a high accuracy of 99.39% (training) and 97.95% (testing), respectively. These results indicate no over- or under-fitting issues. Similarly, the trained RFR also results in a high accuracy for both training (99.62%) and testing (97.86%). The results of the algorithms for the test data set (20% of all data) in the form of predicted grain size vs. reference grain size are shown in **Figure 11**. In this case, the predicted grain size data are the output of the ML models and the reference grain size data are the grain sizes of the RVEs used in the microstructure simulations. From **Figure 11**, most of the data points from both trained ML models are within 5% error, and only a few data points show an error of more than 5%. Therefore, it can be concluded that there is no significant difference between both models in terms of scatter around the 100% accuracy line.

Furthermore, the trained ML models are tested with data that are out of range of the training data. In this context, an RVE consisting of 100 grains with an average grain size of 15 µm and a standard deviation of 0.1 µm is generated using DMG. Plastic behavior is again described with a nonlocal crystal plasticity model, with parameters given in Table 4 in the **Appendix 7.1**, and uniaxial tension loading conditions are simulated. This microstructure simulation is homogenized to obtain the macroscopic flow curve, and the modified Voce law parameters are determined using the nonlinear least-squares fitting method accordingly. The fitted modified Voce law parameters are summarized in **Table 3**. For the validation process, these parameters are then used as input for both trained ML models to determine the average grain size.

By comparing the average grain size of the RVE used to produce the flow curve with the grain sizes predicted by the ML models, significant deviations are observed when out-of-range data are used, because the predicted grain sizes are always within the range of the training data. Therefore, such results show that an application of these trained ML models is only valid when the input data are within a certain range. Furthermore, it must be verified that the output data lie within the space covered by the training data. To further improve the accuracy and to extend the applicability of the trained ML models, more training data covering a wider range of grain sizes should be used. In any case, within the range of the training data, the predicted grain sizes are still in very good agreement with the reference data.
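Why predictions stay within the training range can be illustrated for the RFR: each tree returns the mean of training targets in a leaf, so the averaged prediction cannot exceed the largest training target. The sketch below uses a single hypothetical feature with a made-up dependence on grain size; it is not the data of this work, and the argument as stated applies to tree ensembles only, not to SVR.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical training data: grain sizes between 6 and 13 µm and one
# made-up, yield-stress-like feature that decreases with grain size.
rng = np.random.RandomState(0)
d_train = rng.uniform(6.0, 13.0, size=74)
X_train = (200.0 + 600.0 / np.sqrt(d_train)).reshape(-1, 1)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, d_train)

# Query with the feature value of an out-of-range grain size of 15 µm.
d_query = 15.0
x_query = np.array([[200.0 + 600.0 / np.sqrt(d_query)]])
d_pred = model.predict(x_query)[0]

# The prediction is bounded by the range of the training targets.
within_training_range = d_train.min() <= d_pred <= d_train.max()
```

A simple range check of the inputs against the training data before prediction is therefore a reasonable safeguard when such models are used for design.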

# 6. CONCLUSION

In this work, two novel applications of Machine Learning (ML) in materials science were presented and discussed. Both included microstructurally informed representative volume elements (RVEs) and crystal plasticity material modeling and used finite element (FE) simulations to study the mechanical response of different microstructures to applied loads. The results of the FE simulations were used to train and test the ML algorithms.

The first application was the approximation of damage evolution in an RVE using Support Vector regression (SVR) and Random Forest regression (RFR). Furthermore, their results were compared to the analytical damage model, which was formulated in a mathematically closed form as an analytical function. The FE simulations included several loading conditions to be generally valid for monotonic load paths. The data gained from the simulations were homogenized and pre-processed before being used as training data for the ML algorithms. Both regression schemes succeeded in predicting the damage evolution correctly, with an accuracy (R<sup>2</sup> score) higher than 97% on the test data set. Additionally, both algorithms were able to predict the damage rate for different loading conditions appropriately. Comparing the results of ML to the analytical damage model, the limitations of such an analytical model became visible. Both ML methods, SVR and RFR, were able to describe the damage evolution of a microstructure with very good precision. However, for the prediction of damage evolution, SVR showed a lower ability to generalize to unseen data than RFR; furthermore, RFR showed less over-fitting, and its parameters are easier to calibrate.

TABLE 3 | Summary of fitted modified Voce parameters from the microstructure model with an average grain size of µ = 15.0 µm and a standard deviation of σ = 0.1 µm, and predicted grain size using the trained ML models.

It is observed that damage homogenization with ML algorithms exhibits several interesting features that are also observed in real experiments and macroscopic modeling, e.g., the shape of the damage evolution curve over the total equivalent strain or the fact that once damage is initiated, the increase in plastic strain leads to a sharp increase in the damage rate. These investigations show the capabilities of this method to predict macroscopic damage such that in future macroscopic applications, like deep drawing or sheet bending, it will become possible to include microstructure information into the constitutive relations of the materials.

The second application of ML methods aimed at predicting the necessary grain size in microstructure models to produce given flow curves with a desired work hardening behavior. This was accomplished, again, by using SVR and RFR. In this context, 74 RVEs with various grain size distribution parameters were generated, simulated for uniaxial tension and homogenized to obtain macroscopic flow curves. These simulated flow curves were fitted with a modified Voce law, and the obtained parameters together with the grain size distribution parameters of the RVEs were used as input for the ML algorithms. For both ML models, the grain size prediction gave good accuracy, with R<sup>2</sup> scores higher than 97.8% on the test data set. However, when out-of-range data were applied to the trained ML models, the predicted grain sizes strongly deviated from the reference quantities. It is hence concluded that the trained ML models are restricted to the space covered by the training data. To further enhance the prediction accuracy, training data should cover a wider range of grain sizes. In any case, with a proper range of training data, one can see the prospect of using ML models to suggest microstructural parameters that produce desired mechanical properties.

#### REFERENCES


# AUTHOR CONTRIBUTIONS

DR and KN performed the numerical simulations. AH, TG, and PJ defined the research scope. HuH and NV contributed to the paper writing.

# ACKNOWLEDGMENTS

We acknowledge support by the DFG Open Access Publication Funds of the Ruhr-Universität Bochum.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmats.2019.00181/full#supplementary-material

Data Sheet 1 | Input data for homogenization of damage evolution from micro- to macroscale.

Data Sheet 2 | Appendix.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Reimann, Nidadavolu, ul Hassan, Vajragupta, Glasmachers, Junker and Hartmaier. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Hierarchical Machine Learning Model for Mechanical Property Predictions of Polyurethane Elastomers From Small Datasets

#### Aditya Menon<sup>1</sup> , James A. Thompson-Colón<sup>2</sup> and Newell R. Washburn<sup>3</sup> \*

<sup>1</sup> Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, PA, United States, <sup>2</sup> Covestro LLC, Pittsburgh, PA, United States, <sup>3</sup> Department of Chemistry and Department of Biomedical Engineering, Carnegie Mellon University, Pittsburgh, PA, United States

Polyurethanes are a broad class of material that finds application in coatings, foams, and solid elastomers. The urethane chemistry allows a diversity of monomers to be used, and prediction of mechanical properties, which are determined by complex interplay between monomer chemistry and chain architecture, is an unresolved challenge. Urethanes are based on aromatic or cyclic isocyanates and linear or branched polyols, and polymerization results in linear chains for bifunctional monomers or branched chains for multifunctional monomers. Strong intermolecular interactions between aromatic groups result in the formation of hard-segment domains that generate physical crosslinks between disorganized rubbery domains and anchor the material microstructure, contributing to resistance to deformation. Here, a general hierarchical machine learning (HML) model for predicting the stress-at-break, strain-at-break, and Tan δ for thermoplastic and thermoset polyurethanes is presented. The algorithm was trained on a library of 18 polymers with different diisocyanates, bifunctional or trifunctional polyols, and NCO:OH index. HML reduces data requirements through robust embedding of domain knowledge and surrogate data in a middle layer that bridges input variables (composition) and output responses (mechanical properties). In this work, the middle layer included information on overall polymer composition, predictions of chain architecture derived from Monte Carlo simulations of polymerization, information on interchain interactions from empirically derived molecular potentials and shifts in infrared (IR) spectroscopy absorbances. The HML predictions are shown to be more accurate than those from a random forest model directly relating composition and properties, suggesting that embedding domain knowledge provides significant advantages in predicting the properties of complex material systems based on small datasets.

Keywords: polyurethane, machine learning, structure-property relationship, property prediction, tunability

#### Edited by:

Christian Johannes Cyron, Hamburg University of Technology, Germany

#### Reviewed by:

David Cereceda, Villanova University, United States; Youyong Li, Soochow University, China

> \*Correspondence: Newell R. Washburn washburn@andrew.cmu.edu

#### Specialty section:

This article was submitted to Computational Materials Science, a section of the journal Frontiers in Materials

> Received: 05 February 2019 Accepted: 08 April 2019 Published: 08 May 2019

#### Citation:

Menon A, Thompson-Colón JA and Washburn NR (2019) Hierarchical Machine Learning Model for Mechanical Property Predictions of Polyurethane Elastomers From Small Datasets. Front. Mater. 6:87. doi: 10.3389/fmats.2019.00087

# INTRODUCTION

Polyurethanes are ubiquitous materials found in coatings, foams, and solid elastomers (Oertel, 1994; Engels et al., 2013). Prototypical polyurethanes are formed through step-growth polymerization of an aromatic diisocyanate and an aliphatic diol, resulting in the formation of a material having aggregated aromatic hard segments bridged by rubbery segments. This microstructure is the basis for the remarkable mechanical properties of polyurethanes, characterized by primarily elastic behavior and large values of ultimate elongation. However, modern polyurethanes are based on a highly diverse family of monomers that provide control over the number of reactive isocyanate or alcohol groups, which allows the preparation of linear or branched materials; over the monomer chemistry, which tunes the interactions between hard segments and soft segments; and over the NCO:OH index, which controls the degree of polymerization. Developing a model to predict the mechanical properties of these materials based on all these compositional variables is an unresolved challenge.

There are many analytical approaches to predicting properties of polymers based on their linear or crosslinked structure. Viscoelastic and rheological properties of linear polymers have been predicted by tube models (Milner and McLeish, 1998; Pattamaprom et al., 2000; van Ruymbeke et al., 2002, 2005), in which reptation theory has been extended with contour length fluctuations and constraint release mechanisms, leading to successful predictions, especially for low molecular weight polymers. For crosslinked polymers, mechanical properties have been predicted using group interaction modeling (GIM) (Foreman et al., 2008), where a mean-field potential is calculated from the cohesive energy and other molar constants derived using a group-additivity approach based on each component in the repeating unit. Another model for the prediction of thermo-mechanical behavior uses a molecular-modeling approach (Shenogina et al., 2012), whereas Eom et al. (2003) show the effect of native topology on the mechanical strength of crosslinked polymer chains.

With the advent of machine learning in many traditional scientific disciplines and the Materials Genome Initiative, there have also been many data-driven, ML-based approaches for the prediction of polymer properties (de Pablo et al., 2014; Agrawal and Choudhary, 2016). Viscoelastic properties have been modeled using a multiscale computational framework based on the inverse Boltzmann method (Li et al., 2012), and specific properties (mechanical, thermal, optical, and electrical) have been learned from microscopic, mesoscopic, and macroscopic structures in polymer databases available online using artificial neural networks (Roy et al., 2006). Recently, Kim et al. (2018) have developed a polymer informatics platform which trains machine learning models on a dataset of high-throughput DFT calculations and experimental data from the polymer literature.

Since most ML-based approaches rely on large datasets, Hierarchical Machine Learning (HML) was developed to predict the properties of complex material systems from small experimental datasets by utilizing an intermediate layer between the desired responses and the system variables (Menon et al., 2017). These intermediate variables are based on latent physicochemical factors from domain knowledge pertaining to the material system. This methodology was validated on a system of dispersant-dosed concentrated MgO suspensions, which acted as a non-setting model of cement. Building upon this work, HML was successfully utilized to design a superplasticizer tailored specifically for metakaolin-Portland cement blends (Menon et al., 2018).

In this work, HML was applied to a system of linear and crosslinked polyurethanes to model three mechanical responses (stress-at-break, strain-at-break, and Tan δ) from the system variables: polymer structure, molecular weights and densities of the reactants (diisocyanates, polyols), chain length of the polyol, and isocyanate:alcohol (NCO:OH) index. The intermediate variables utilized for predicting the mechanical responses were chosen to represent, in a simplified way, the intermolecular, interchain and crosslinking behaviors in the system. The input dataset of 18 synthesized polymers was split into a training set of 14 and a test set of 4 data points. A model was developed on the training set and validated against the test set. Finally, the HML algorithm was compared with a random forest (RF) model (Breiman, 2001; Pedregosa et al., 2011) that directly predicted mechanical properties based on composition using the same training set.
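The 14/4 partition of the 18 polymers can be illustrated with a short Python sketch. How the split was actually drawn is not stated here, so the random shuffling, the seed, and the function name are assumptions made purely for illustration.

```python
import random


def split_train_test(n_samples, n_test, seed=0):
    """Shuffle sample indices and hold out `n_test` of them for testing."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return sorted(indices[n_test:]), sorted(indices[:n_test])


# 18 synthesized polymers: 14 for training, 4 for testing.
train_idx, test_idx = split_train_test(18, 4)
```

With so few samples, the scores obtained on four held-out points depend noticeably on which polymers are held out, which is worth keeping in mind when comparing models.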

# TRAINING DATASET

# Materials

The oligomers and monomers used to build the training set, poly(tetramethylene ether) glycol (PTMEG) (M<sub>n</sub> = 1,000), polycaprolactone triol (PCL) (M<sub>n</sub> = 900), toluene-2,4-diisocyanate (TDI), hexamethylene diisocyanate (HDI), and isophorone diisocyanate (IPDI), were purchased from Sigma-Aldrich, as was the catalyst used for the polymerization reaction, dibutyltin dilaurate (DBTDL). Dichloromethane was acquired from EMD Millipore. All of the materials were used as received.

# Polymer Synthesis

The training set consisted of 18 samples, all of which were prepared by reacting a bifunctional diisocyanate with either a bifunctional or trifunctional polyol at NCO:OH indices of 1.0, 1.2, or 1.5, as shown in **Supplementary Table 1**. The reactions were carried out at room temperature in 8 ml of dichloromethane as a solvent in the presence of DBTDL as a catalyst. Films were cast from the synthesized polymers, left to dry at room temperature for 24 h, and then dried in a vacuum oven for 24 h at 60 °C to remove any residual solvent.

# Measurements

Stress-at-break and strain-at-break were measured for all polymers in a universal testing machine (Instron). Gage width, parallel section width and thickness for each sample tested were 12 mm, 4 mm, and 2 mm, respectively. Tan δ for all polymers was measured via a frequency sweep in a Discovery HR-2 rheometer (TA Instruments), and the value at 1 Hz was used as the characteristic system response in this study. FT-IR analysis was performed on 2 mm thick film specimens in a Frontier Spectrometer (PerkinElmer) in the standard wavenumber range (4000–700 cm<sup>−1</sup>).

# Monte Carlo Modeling

Monte Carlo simulations of step-growth polymerization were performed in R (R Core Team, 2018). We focused on parameters which would provide insight into the crosslinking tendency of polymer chains due to polyfunctionality in either the polyisocyanate or the polyol reactants. These parameters were determined through a Monte Carlo simulation of a spatially homogeneous chemical ensemble of monomers; for these simulations we used 200,000 monomers. The algorithm performs an event-based stochastic process analogous to the approach described by Mikes and Dusek (1982), then repeats the stochastic process until all of the limiting reactive groups are consumed, which in our simulations was the OH group. For the recipes with an NCO:OH index of 1, the simulation is complete at this point. For the recipes with index 1.2 and 1.5, we further simulate moisture cure. Moisture cure refers to the reaction of some of the remaining unreacted -NCO groups with ambient moisture to form -NH<sub>2</sub> groups, which react further with -NCO to form urea bonds. In these cases, once all the -OH is consumed, the appropriate amount of water is added to the ensemble and the stochastic process is continued until no more bonds can be formed. The simulation provides parameters that describe the connectivity of all monomers in the post-gel thermoset. For our analysis, we identify and quantify three types of molecular configurations, shown in **Figure 1**: sol, the oligomers that are not part of the infinite network formed during polymerization; elastic links, the effective links formed between the reactive groups, which form part of the crosslinked core gel component; and danglers, the pendant groups that are attached to the infinite network and are also known as tethered plasticizer and ineffective links.

The relevant parameters are the effective crosslinks per kg of polymer (nEff), calculated using Miller and Macosko's recursive method (Miller and Macosko, 1976); the average molecular weight of the elastic links weighed as a percent of total polymer weight (elastic\_link\_mtw); the average molecular weight of the core gel component weighed as a percent of total polymer weight (Mtw); the percentage of sol present in the synthesized polymer (sol\_pctWgt); and the percentage of core gel component in the synthesized polymer (Core\_pctWgt).
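The essence of such a simulation can be conveyed with a strongly simplified Python sketch (the study itself used R). The sketch assumes equal reactivity of all groups, pairs -NCO and -OH groups at random until the -OH groups are consumed, omits the moisture-cure step and the Miller-Macosko analysis, uses a much smaller ensemble than the 200,000 monomers mentioned above, and treats the largest molecule as a stand-in for the gel. Fractions are counted by number of monomers, not by weight, and all function and variable names are hypothetical.

```python
import random
from collections import Counter


def find(parent, i):
    """Union-find root lookup with path halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def simulate_step_growth(n_polyol, polyol_functionality, index, seed=0):
    """Randomly pair NCO and OH groups until every OH group has reacted.

    Returns the fraction of monomers in the largest molecule (a proxy for
    the gel) and the fraction in all other molecules (a proxy for the sol).
    """
    rng = random.Random(seed)
    n_oh = n_polyol * polyol_functionality
    n_iso = round(index * n_oh / 2)  # each diisocyanate carries 2 NCO groups
    n_monomers = n_polyol + n_iso
    parent = list(range(n_monomers))

    # One list entry per reactive group, holding the id of its monomer.
    oh_groups = [m for m in range(n_polyol) for _ in range(polyol_functionality)]
    nco_groups = [n_polyol + m for m in range(n_iso) for _ in range(2)]
    rng.shuffle(oh_groups)
    rng.shuffle(nco_groups)

    # For index >= 1, OH is limiting, so zip stops once it is consumed.
    for oh_owner, nco_owner in zip(oh_groups, nco_groups):
        root_a, root_b = find(parent, oh_owner), find(parent, nco_owner)
        if root_a != root_b:
            parent[root_b] = root_a  # a urethane bond merges two molecules

    sizes = Counter(find(parent, m) for m in range(n_monomers))
    gel_fraction = max(sizes.values()) / n_monomers
    return gel_fraction, 1.0 - gel_fraction


# Trifunctional polyol at index 1.0 versus bifunctional polyol at index 1.5.
gel_triol, sol_triol = simulate_step_growth(2000, 3, index=1.0)
gel_diol, sol_diol = simulate_step_growth(2000, 2, index=1.5)
```

In this toy ensemble the trifunctional polyol produces one molecule containing almost all monomers, whereas the bifunctional polyol with excess diisocyanate yields only short linear chains until further curing reactions are considered.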

# HML MODELING

HML modeling was performed in Python (Rossum, 1995) using the scikit-learn library (Pedregosa et al., 2011) for machine learning estimators. HML modeling has now been used with multiple systems; in all of them, the top layer represents a complex system response that has to be either predicted or optimized with respect to a bottom layer, which consists of simple, experimentally tunable variables. Between the two lies an intermediate middle layer, which consists of physical or chemical factors parameterized from the bottom-layer variables through surrogate physical measurements and existing physical/chemical relationships pertaining to the specific material system. The algorithm can be better understood with the scheme shown in **Figure 2**.

The bottom layer of input or experimental variables in the model for PU consisted of the repeating unit in the synthesized polymer split into chemical structural units per kg of polymer, the molecular weights and densities of the diisocyanates and polyols, the NCO:OH indices, and the estimated chain length of the polyols, assuming a PDI of 1. The chemical structures of the diisocyanates and polyols used in the training set are depicted in **Figure 3**. Each of the 18 polymers synthesized from these reactants has been described by a vector comprising the structural units per kg of polymer (c, ch, ch2, ch3, c6h6, co, nh, o, nh2), calculated from the groups present in the reactants (the diisocyanates and polyols), together with the NCO:OH index, the chain length of the polyols, and the total weight of the polymer synthesized.

The middle layer, or the intermediate physical/chemical variables, has been grouped into three categories. The first probes the intermolecular interactions and absorption characteristics of the synthesized polymer molecule through FT-IR spectroscopy (Siesler, 1980). The presence of spectral features in specific regions of the spectrum is indicative of certain functional groups which have vibrational modes with large displacements and are minimally affected by the presence of other functional groups or atoms (Griffiths and de Haseth, 2007). For our study on polyurethanes, the wavenumbers exhibited by the CO and NH groups, as well as the ratio of the absorbance values of NH to CO groups for each sample, are of particular importance and are expected to correlate well with the mechanical responses. It has been observed before that the CO stretching vibration and the NH stretching vibration show different wavenumbers depending on the degree of H-bonding as well as on the crosslinking introduced by a trifunctional polyol, both of which significantly impact the mechanical properties of such polyurethanes (Tsai et al., 1998). It is expected that a particular degree of these factors, H-bonding and crosslinking, results in the optimal mechanical response for a particular end-use application. The degree of H-bonding is also influenced by the symmetry of the chemical structure of the reactants and the presence of an even or odd number of atoms (Caracciolo et al., 2009), making these variables highly significant for model prediction. These variables have been parameterized with respect to all the bottom-layer variables using a Gaussian process regression framework from the scikit-learn ML library in Python (Williams and Rasmussen, 1996; Pedregosa et al., 2011). The FT-IR variables were then recalculated using the predict function of the Gaussian process models to be used with the top-layer variables.
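The idea of parameterizing a middle-layer variable by Gaussian process regression and then regenerating it with a prediction step can be sketched without external dependencies. The study itself relied on scikit-learn; the hand-written posterior mean below, with a fixed squared-exponential kernel and no hyperparameter optimization, is only a stand-in. The two input features and the CO wavenumbers are illustrative values, not measured data.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two feature vectors."""
    sq = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-0.5 * sq / length_scale ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_gp(X, y, noise=1e-6):
    """Precompute the weights alpha = (K + noise * I)^-1 (y - mean)."""
    mean = sum(y) / len(y)
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    alpha = solve(K, [v - mean for v in y])
    return X, alpha, mean


def predict_gp(model, x_new):
    """Posterior mean of the Gaussian process at a new input."""
    X, alpha, mean = model
    return mean + sum(w * rbf(x_new, x) for w, x in zip(alpha, X))


# Illustrative bottom-layer inputs (two scaled features) and CO wavenumbers.
X_train = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
co_wavenumber = [1683.0, 1700.0, 1732.0, 1715.0]

model = fit_gp(X_train, co_wavenumber)
regenerated = [predict_gp(model, x) for x in X_train]  # middle-layer values
interpolated = predict_gp(model, (0.5, 0.5))           # unseen composition
```

With a small noise term, the regenerated values at the training inputs closely reproduce the measurements, while the same prediction step supplies middle-layer values for compositions that were never measured.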

The second set of middle layer variables consists of intermolecular chain interactions and properties pertaining to the polyurethane polymer system. Hard segment (HS%) and soft segment (SS%) contents were calculated from the mass of diisocyanate and polyol, respectively, with respect to the total mass of the polymer. Similarly, the % aromatic and % cyclic (non-aromatic) nature was calculated based on the mass of the respective structural units with respect to the total mass of the polymer. The solubility parameter and cohesive energy density (CED) were calculated from the molar attraction constant, molar volume and cohesive energy of the polymer repeating unit using a group additivity-based approach on the structural units present in the bottom layer. The values for the group contributions to the molar attraction constant, molar volume and cohesive energy are readily available in the literature and have been used extensively for other polymer systems.
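As a worked example of the segment percentages, the following Python sketch computes HS% and SS% per mole of polyol from the NCO:OH index. It assumes that the whole diisocyanate mass counts as hard segment and the whole polyol mass as soft segment, and it neglects any mass change from moisture cure. The function name is hypothetical, and the molar mass of HDI (about 168.19 g/mol) is an approximate value supplied for illustration.

```python
def segment_percentages(mw_diisocyanate, mn_polyol, polyol_functionality, index):
    """Hard- and soft-segment mass percentages, per mole of polyol."""
    # index = moles NCO / moles OH, and each diisocyanate carries 2 NCO groups.
    moles_diisocyanate = index * polyol_functionality / 2.0
    mass_hard = moles_diisocyanate * mw_diisocyanate
    mass_soft = mn_polyol
    total = mass_hard + mass_soft
    return 100.0 * mass_hard / total, 100.0 * mass_soft / total


# HDI with PTMEG (Mn = 1,000, bifunctional) at an NCO:OH index of 1.0.
hs, ss = segment_percentages(168.19, 1000.0, 2, 1.0)
```

Under these assumptions the example gives a hard-segment content of roughly 14%, and raising the index or the polyol functionality increases HS% because more diisocyanate is needed per mole of polyol.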

The third set of middle layer variables were the predictions of chain architecture from the Monte Carlo simulations described in the section above. Based on the existing literature, these three categories of variables sufficiently model the main forces and interactions that govern the mechanical behavior of polyurethanes: the microstructure consisting of soft and hard domains, which controls permanent deformation, high modulus and tensile strength; hydrogen bonding between neighboring polymer chains, which controls the elasticity as well as the strain deformation behavior; and chemical crosslinking, the simulation of which addresses the mechanical behavior due to network formation.

TABLE 2 | Training and test scores for HML and Random Forest modeling of mechanical properties as a function of composition.

HML provides significantly greater accuracy through embedding domain knowledge in the algorithm, allowing it to build predictive models from small datasets.

Finally, the top layer, which consists of the system responses (stress-at-break, strain-at-break, and Tan δ), has been modeled with the middle-layer variables using a random forest regression model from the scikit-learn library in Python. Random forest regression is an ensemble learning technique based on multiple decision trees learned from the provided variables. One of the advantages of a random forest model is its use of bagging, or bootstrap aggregation, where each decision tree is fitted to a sample drawn from the input set with replacement, so that the sample has the same size as the original input set. The predictions of all the decision trees are then averaged to improve the prediction accuracy and to control overfitting. The number of decision trees used was 100, and the maximum depth of the trees was left unrestricted, since modeling was performed on a sparse dataset and memory consumption and computational efficiency were not a concern. The estimator used from the scikit-learn library is a "RandomForestRegressor" with the following attributes: bootstrap = True, criterion = "mse", max\_depth = None, max\_features = "auto", max\_leaf\_nodes = None, min\_impurity\_decrease = 0.0, min\_impurity\_split = None, min\_samples\_leaf = 1, min\_samples\_split = 2, min\_weight\_fraction\_leaf = 0.0, n\_estimators = 100, n\_jobs = 4, oob\_score = False, random\_state = 0, verbose = 0, warm\_start = False. Each of these attributes is described on the scikit-learn website for random forest regression:

https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html

and the code is available with the **Supplementary Files**.
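For readers who wish to reuse the reported estimator settings, they can be collected as keyword arguments. The sketch below only builds plain dictionaries and does not import scikit-learn. The replacements for later scikit-learn releases (`"squared_error"` for `"mse"`, `1.0` for `"auto"`, and dropping `min_impurity_split`) are assumptions about how the library has evolved and should be checked against the installed version; the helper name `modernize` is hypothetical.

```python
# Settings as reported in the text, keyed by RandomForestRegressor argument.
reported_settings = {
    "bootstrap": True,
    "criterion": "mse",
    "max_depth": None,
    "max_features": "auto",
    "max_leaf_nodes": None,
    "min_impurity_decrease": 0.0,
    "min_impurity_split": None,
    "min_samples_leaf": 1,
    "min_samples_split": 2,
    "min_weight_fraction_leaf": 0.0,
    "n_estimators": 100,
    "n_jobs": 4,
    "oob_score": False,
    "random_state": 0,
    "verbose": 0,
    "warm_start": False,
}

# Assumed equivalents in later scikit-learn releases (verify before use).
RENAMED = {"criterion": "squared_error", "max_features": 1.0}
REMOVED = {"min_impurity_split"}


def modernize(settings):
    """Return a copy of `settings` adapted to the assumed newer argument names."""
    updated = {k: v for k, v in settings.items() if k not in REMOVED}
    updated.update(RENAMED)
    return updated


modern_settings = modernize(reported_settings)
```

Under these assumptions, the estimator would be created with `RandomForestRegressor(**modern_settings)` in a recent scikit-learn installation, or with `RandomForestRegressor(**reported_settings)` in the release used at the time of the study.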

#### RESULTS AND DISCUSSION

**Figure 4** shows the mechanical responses for the training set. In general, Tan δ and strain-at-break are expected to decrease and stress-at-break is expected to increase with increasing NCO:OH index, associated with the transition from a rubbery toward a glassy state (Petrović et al., 2002; Levine et al., 2012) due to the increasing number of urea linkages in the polymer network from moisture cure of excess NCO content. Furthermore, similar trends are expected when a bifunctional polyol is replaced with a higher functionality polyol, owing to crosslinking and network formation (Dušek and Dušková-Smrčková, 2000). Crosslinking also occurs either physically through hydrogen bonding between hard urethane segments or chemically through allophanate linkages due to excess NCO content during the polymerization reaction (Kontou et al., 1990). We see that even though some of our samples show the expected behavior, others behave differently in one or more of the response metrics. In **Figure 5**, it can be observed that there is poor correlation between the various responses over a wide range of attained measurements. The diagonal grid represents the one-dimensional spread of values for single responses, whereas the top right section represents the scatter plot correlation between response pairs, and the bottom grid represents the two-dimensional spread as well as the density of values for the response pairs. This suggests that there are multiple, possibly competing, factors controlling these responses.

In order to deconvolute the relationship between these mechanical responses and the variables in the bottom layer of the model, the middle layer variables of our algorithm were parameterized in terms of the variables in bottom layer. Here, CO wavenumber, NH wavenumber and the ratio of NH absorbance per CO absorbance were modeled with a Gaussian Process regression using a train/test split to ensure accuracy and predictive capability. The train and test scores for the three variables are shown in **Table 1**. The IR values were regenerated from the learned GP model to be used further in the next training step.

The solubility parameter and the cohesive energy density for the polymers were calculated using group contribution methods (Van Krevelen and Te Nijenhuis, 2009). The solubility parameter is given by equation (1),

$$\delta = \frac{\sum_i F_i}{\sum_i V_{m,i}} \tag{1}$$

where F<sub>i</sub> is the molar attraction contribution and V<sub>m,i</sub> is the molar volume contribution of the ith structural unit in the bottom layer, and the cohesive energy density is given by equation (2),

$$e_{\rm coh} = \frac{\sum_i E_{{\rm coh},i}}{\sum_i V_{m,i}} \tag{2}$$

where E<sub>coh,i</sub> is the cohesive energy contribution of the ith structural unit in the bottom layer. HS%, SS%, nEff, elastic\_link\_mtw, Mtw, sol\_pctWgt, and core\_pctWgt were calculated and simulated as mentioned in the previous section.
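Equations (1) and (2) amount to summing tabulated group contributions over the structural units of the repeating unit. A minimal Python sketch is given below; the group names follow the bottom-layer notation, but the numerical contributions are placeholders chosen for readability and are not the tabulated values from the literature. The function names are hypothetical.

```python
def solubility_parameter(contributions, counts):
    """Equation (1): sum of F_i divided by sum of V_m,i over all groups."""
    total_f = sum(n * contributions[g]["F"] for g, n in counts.items())
    total_v = sum(n * contributions[g]["V"] for g, n in counts.items())
    return total_f / total_v


def cohesive_energy_density(contributions, counts):
    """Equation (2): sum of E_coh,i divided by sum of V_m,i over all groups."""
    total_e = sum(n * contributions[g]["E_coh"] for g, n in counts.items())
    total_v = sum(n * contributions[g]["V"] for g, n in counts.items())
    return total_e / total_v


# Placeholder contributions per group (NOT tabulated literature values).
contributions = {
    "ch2": {"F": 300.0, "V": 16.0, "E_coh": 4000.0},
    "o": {"F": 150.0, "V": 10.0, "E_coh": 3000.0},
}

# Number of each structural unit in a hypothetical repeating unit.
counts = {"ch2": 4, "o": 1}

delta = solubility_parameter(contributions, counts)
e_coh = cohesive_energy_density(contributions, counts)
```

Replacing the placeholder dictionary with tabulated contributions in consistent units yields the two middle-layer variables for each polymer.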

After generating the set of middle layer variables, a random forest regression model was fitted between the latter and the mechanical responses (stress-at-break, strain-at-break, and Tan δ), and the train/test scores for all responses are shown in **Table 2**. The feature importance values from the RF model are shown in **Figure 6**, the predicted vs. test values are shown in **Figure 7** and the ensemble averaged trees are shown in **Figure 8**.
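The train and test scores can be read as coefficients of determination, assuming they were obtained from the `score` method of the scikit-learn regressors, which returns R<sup>2</sup>. A minimal, dependency-free Python sketch of that quantity follows; the example values are made up.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up test responses for four held-out samples.
measured = [1.0, 2.0, 3.0, 4.0]

perfect = r2_score(measured, measured)              # exact predictions
mean_only = r2_score(measured, [2.5, 2.5, 2.5, 2.5])  # always predict the mean
```

A score of 1 corresponds to exact predictions, a score of 0 to a model that is no better than predicting the mean of the test responses, and negative scores are possible for worse models. With only four test points, a single poorly predicted sample can change the score substantially.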


Interestingly, the most important features for the prediction of strain-at-break from the trained model were CO wavenumber, NH absorbance per CO absorbance, cohesive energy density, and NH wavenumber. This could be due to the vibrational shift in the CO and NH bands, which relates to the chemical environment obtained with different reactant combinations (polyols and diisocyanates), whereas the ratio of carbonyl peak absorbance to amide peak absorbance may indicate the effect of the NCO:OH index. The shift in frequencies is an indication of the hydrogen bonding strength between polymer chains, primarily due to interactions between CO and NH groups. Strong hydrogen bonding between the groups weakens the bonds within the carbonyl and NH groups. For example, HDI and PTMEG 1000 exhibit strong hydrogen bonding and thus a higher strain-at-break, with a CO wavenumber of ∼1,683 cm<sup>−1</sup> and an NH wavenumber of ∼3,318 cm<sup>−1</sup>. Conversely, between HDI and PCL 900, weaker hydrogen bonding reduces the strain-at-break, with a CO wavenumber of ∼1,732 cm<sup>−1</sup> and an NH wavenumber of ∼3,380 cm<sup>−1</sup>. Cohesive energy density is a proxy for intermolecular forces within polymer chains and, as such, is strongly linked to the forces required for mechanical deformation of a polymeric material. CED is also notable as a property prediction tool for calculating the relative strain at failure for similarly networked chains, i.e., in the presence of moderate chemical crosslinking, CED can improve the toughness of chains during large strain deformation (Safranski and Gall, 2008). Other parameters in our model, such as % cyclic and Mtw (average molar weight of the core gel component as a percentage of total polymer weight), also have a smaller yet noticeable influence on strain.

Apart from the FTIR derived variables, the model for stress shows a reliance on hard segment %, core\_pctWgt (percentage of core gel component) and % aromatic behavior. This makes sense as the hard segment in a polyurethane is the load bearing component under mechanical deformation and as a result, induces most of the elastic response in the system. At higher HS%, urethane—urethane hydrogen bonding in particular is also increased. The core gel component represents the crosslinked structure in a polyurethane which again corresponds to mechanical strength and load bearing nature of the polymer. It is interesting that the model identified % aromatic behavior as an important feature: it impacts mechanical strength due to much more efficient hydrogen bonding and pi-stacking between aromatic groups in neighboring chains. Aromatic groups in isocyanates account for stiffer chains and result in a higher melting point polyurethane as well.

Tan δ represents the damping behavior in the mechanical performance of a viscoelastic polymer, i.e., the ratio of the viscous (loss) response to the elastic (storage) response. Thus, it was not surprising to observe both hard segment % and soft segment % as important features; however, % cyclic behavior also has a significant impact on Tan δ. Even though cyclic groups contribute to stiffness and rigidity, they contribute less than aromatic groups due to the possibility of configurational isomerism as well as non-planar structures. This might explain the ability of these groups to absorb more energy while mechanical stress is applied and to provide a good balance between elastic and viscous performance. Other crucial features identified in the model were the CO wavenumber, cohesive energy density, and mer solubility parameter.

Each random forest model shown in **Figure 8** represents an averaged decision tree from an ensemble of decision trees for each mechanical response. The random forest regressor from scikit-learn uses bootstrap aggregating, in which multiple decision trees are fitted to samples of the training data drawn randomly with replacement. Within each tree, the data are split on feature values chosen to maximize the reduction in mean squared error, and splitting continues until the data are fully partitioned among the leaf nodes. Bootstrap aggregation is an effective stochastic method of avoiding overfitting in the trained model and reduces the variance of the result without increasing the bias of the model. In random forests, this also allows for an out-of-bag (OOB) error estimate, which measures the prediction error of a trained decision tree on the data not used in that tree, thus reducing the need for an independent validation dataset.
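Bootstrap sampling and the resulting out-of-bag set can be made concrete with a few lines of Python. The sketch assumes a training set of 14 samples, as in this study; the seed and the function name are arbitrary choices for illustration.

```python
import random


def bootstrap_indices(n_samples, rng):
    """Draw n_samples indices with replacement; the rest are out-of-bag."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    out_of_bag = sorted(set(range(n_samples)) - set(in_bag))
    return in_bag, out_of_bag


rng = random.Random(0)
in_bag, out_of_bag = bootstrap_indices(14, rng)

# Expected fraction of distinct samples seen by one tree for n = 14.
expected_unique_fraction = 1.0 - (1.0 - 1.0 / 14) ** 14
```

On average, each tree therefore sees about 65% of the distinct training samples, and the remaining samples are available for the out-of-bag error estimate of that tree.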

In order to evaluate how the sparseness of the training set affects prediction, a standard big data approach was taken in which the same random forest framework was applied to a training set containing only the mechanical responses and our tunable formulation variables, namely polyol choice, diisocyanate choice, and NCO:OH index. As expected, the model converged with significantly lower test scores, as shown in **Table 2**, and the predicted responses vs. test data can be seen in **Figure 9**.

Compared with other approaches, this methodology mainly benefits from planned surrogate physical and chemical measurements and from the existing scientific literature, which embed domain knowledge in statistical learning to strongly improve predictive capability. In traditional materials industries, high-throughput data for a single product family with a specific end-user application are hard to collect; doing so requires a huge investment in time, effort, and cost. Thus, as shown above, computational techniques relying on big data will not be beneficial for shortening the research and development cycle in such industries. Analytical approaches are better in some respects, as they have physical laws and chemistry as underpinnings for property prediction; however, most of these approaches are too complex to apply practically in an industry setting and are highly sensitive to a lack of required data or measurements. Design of experiments offers an accepted method for predicting structure-property relationships for small datasets, yet it does not provide insight into how the underlying forces interact with each other to achieve a specific system response, since it is purely statistical in nature, and it may not accurately predict synergies between variables. HML aims to learn from the shortcomings as well as the advantages of the previously mentioned approaches by building upon existing scientific domain expertise with far fewer measurements, providing a tool not only for property prediction but also for elucidating the nature of the physical and chemical interactions that shape a system response.

#### CONCLUSION

Using the HML algorithm, mechanical responses of a training set of polyurethanes were modeled as a function of monomer chemistry, index, and chain architecture. The accuracy was compared against a random forest model, and it was found that HML produced significantly better predictions of the test data. This was attributed to the integration of an intermediate layer of variables comprising domain-knowledge-based physicochemical factors, which significantly improved the model relating experimental formulation variables and mechanical responses of the cured elastomers. Some of the advantages of this approach are (a) the possibility of modeling categorical and qualitative responses of polyurethane products to formulation and processing variables and (b) predicting the properties of novel monomers, such as bio-based materials. In future work, we intend to model such responses and also test our model by substituting polyols and diisocyanates to further investigate the predictive nature and capability of the HML algorithm on polymer systems.

#### AUTHOR CONTRIBUTIONS

AM (CMU) conducted experiments, performed machine-learning modeling, and wrote parts of the manuscript. JT-C (Covestro, LLC) performed Monte Carlo simulations and wrote parts of the manuscript. NW (CMU) oversaw all aspects of the research and wrote parts of the manuscript.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmats.2019.00087/full#supplementary-material


**Conflict of Interest Statement:** JT-C is employed by Covestro, LLC and NW has started a company to explore commercial applications of the algorithm described here and declares a potential conflict of interest. NW owns Ansatz AI, LLC. AM declares no competing financial interest.

Copyright © 2019 Menon, Thompson-Colón and Washburn. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Connections Between Topology and Macroscopic Mechanical Properties of Three-Dimensional Open-Pore Materials

#### Norbert Huber 1,2 \*

1 Institute of Materials Research, Materials Mechanics, Helmholtz-Zentrum Geesthacht, Geesthacht, Germany, <sup>2</sup> Institute of Materials Physics and Technology, Hamburg University of Technology, Hamburg, Germany

This work addresses a number of fundamental questions regarding the topological description of materials characterized by a highly porous three-dimensional structure with bending as the major deformation mechanism. Highly efficient finite-element beam models were used for generating data on the mechanical behavior of structures with different topologies, ranging from highly coordinated bcc to Gibson–Ashby structures. Random cutting enabled a continuous modification of average coordination numbers ranging from the maximum connectivity to the percolation-cluster transition of the 3D network. The computed macroscopic mechanical properties (Young's modulus, yield strength, and Poisson's ratio), combined with the cut fraction, average coordination number, and statistical information on the local coordination numbers, formed a database consisting of more than 100 different structures. Via data mining, the interdependencies of topological parameters and the relationships between topological parameters and mechanical properties were discovered. A scaled genus density could be identified, which assumes a linear dependency on the average coordination number. Feeding statistical information about the local coordination numbers of detectable junctions with a coordination number of 3 and higher to an artificial neural network enables the determination of the average coordination number without any knowledge of the fully connected structure. This parameter serves as a common key for determining the cut fraction, the scaled genus density, and the macroscopic mechanical properties. The dependencies of macroscopic Young's modulus, yield strength, and Poisson's ratio on the cut fraction (or average coordination number) could be represented as master curves, covering a large range of structures from a coordination number of 8 (bcc reference) to 1.5, close to the percolation-cluster transition. The suggested fit functions with a single adjustable parameter agree with the numerical data within a few percent error. Artificial neural networks allow a further reduction of the error by at least a factor of 2. All data for macroscopic Young's modulus and yield strength are covered by a single master curve. This leads to the important conclusion that the relative loss of macroscopic strength due to pinching-off of ligaments corresponds to that of macroscopic Young's modulus. Experimental data in the literature support this unexpected finding.

Keywords: open-pore materials, topology, structure–property relationship, elastic-plastic deformation behavior, machine learning, data mining

#### Edited by:

Roberto Brighenti, Università degli Studi di Parma, Italy

#### Reviewed by:

Liviu Marsavina, Politehnica University of Timișoara, Romania

Stefan Scheiner, Technische Universität Wien, Austria

> \*Correspondence: Norbert Huber norbert.huber@hzg.de

#### Specialty section:

This article was submitted to Computational Materials Science, a section of the journal Frontiers in Materials

Received: 11 August 2018 Accepted: 30 October 2018 Published: 26 November 2018

#### Citation:

Huber N (2018) Connections Between Topology and Macroscopic Mechanical Properties of Three-Dimensional Open-Pore Materials. Front. Mater. 5:69. doi: 10.3389/fmats.2018.00069

#### INTRODUCTION

The mechanical properties of materials with interconnected open-porous structures can be tuned by the choice of the material, the pore fraction, and the connectivity of the solid fraction. Such materials include open-pore foams (Gibson and Ashby, 1997; Ashby et al., 2000), nanoporous metals (Biener et al., 2006, 2007; Balk et al., 2009; Weissmüller et al., 2009), and architectured meta-materials (Jang et al., 2013; Zheng et al., 2014). Nanoporous Gold (NPG), with its fascinating mechanical and functional properties, has recently received significant attention due to advances in materials development that allow the production of specimens of millimeter size containing billions of nanoscaled ligaments. This material exhibits a bi-continuous network of nanoscale pores and solid "ligaments", which are connected in nodes. Hence, it serves as an ideal model material for the investigation of structure-property relationships of open-porous materials in general.

Continuum micromechanics models, including the Self Consistent Method and the Mori–Tanaka Model, allow for an efficient prediction of the effective elastic properties of composites for given phase moduli and volume fractions. For a survey, see Zaoui (2002). To a certain extent, such models can predict the effective properties when the inclusions are pores. For example, Scheiner et al. (2016) extended this micromechanics concept to predict the micro–macro relations in the double-porous medium of hierarchically organized physiological bone and validated the model for a porosity of 10%. Motivated by the limitation to small pore fractions and homogeneity of the microstructure, Gong et al. (2011) extended the Mori–Tanaka model for porous materials of finite size. However, even such extended micromechanics models predict non-zero effective properties for porosities close to 100%. Furthermore, they assume that the entire solid fraction is bearing load.

The solid fraction ϕ is used as the major parameter in several theoretical models for predicting the macroscopic mechanical behavior of porous materials (Roberts and Garboczi, 2002; Sun et al., 2013; Huber et al., 2014; Pia and Delogu, 2015; Mangipudi et al., 2016). The Gibson–Ashby model (Gibson and Ashby, 1997) is the commonly used basis for all these models. In what follows, E<sub>s</sub> and σ<sub>ys</sub> denote the Young's modulus and yield stress of the solid phase. The scaling of the macroscopically effective values of Young's modulus E and yield stress σ<sub>y</sub> is dependent on the solid fraction ϕ in the form:

$$\frac{E}{E\_s} = C\_E \varphi^{n\_E},\tag{1}$$

$$\frac{\sigma\_{y}}{\sigma\_{ys}} = C\_{\sigma} \varphi^{n\_{\sigma}}.\tag{2}$$

As summarized by Ashby and Bréchet (2003), for bending-dominated behavior, we have n<sub>E</sub> = 2 and n<sub>σ</sub> = 3/2, while for tension-dominated behavior, n<sub>E</sub> = n<sub>σ</sub> = 1. An extension of the Gibson–Ashby scaling law for Young's modulus was proposed by Roberts and Garboczi (2002), who computed the density and microstructure dependence of Young's modulus and Poisson's ratio for four different isotropic random models. The data for the low-coordination number node-bond model (0.03 ≤ ϕ ≤ 0.3) were found to be well-described by the Gibson–Ashby scaling law, Equation (1), with n<sub>E</sub> = 2. For high densities, an equation with three parameters is suggested:

$$\frac{E}{E\_s} = C \left( \frac{\varphi - \varphi^P}{1 - \varphi^P} \right)^m,\tag{3}$$

where ϕ = ρ/ρ<sub>s</sub> is the solid fraction of the material. The fitting parameters ϕ<sup>P</sup> = −0.0056 and m = 2.12, determined for the simulation data, can be interpreted as the percolation threshold and exponent.
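As a minimal sketch, Equations (1)–(3) can be evaluated as follows. The function names and the default prefactors C<sub>E</sub> = C<sub>σ</sub> = C = 1 are illustrative assumptions, not values from the cited works; the exponents and the fit parameters ϕ<sup>P</sup> = −0.0056 and m = 2.12 are those quoted in the text.

```python
def gibson_ashby_modulus(phi, n_E=2.0, C_E=1.0):
    """Equation (1): E / E_s = C_E * phi**n_E (C_E = 1 is an assumed placeholder)."""
    return C_E * phi ** n_E


def gibson_ashby_strength(phi, n_sigma=1.5, C_sigma=1.0):
    """Equation (2): sigma_y / sigma_ys = C_sigma * phi**n_sigma (C_sigma = 1 assumed)."""
    return C_sigma * phi ** n_sigma


def roberts_garboczi_modulus(phi, phi_P=-0.0056, m=2.12, C=1.0):
    """Equation (3): E / E_s = C * ((phi - phi_P) / (1 - phi_P))**m (C = 1 assumed)."""
    x = (phi - phi_P) / (1.0 - phi_P)
    if x <= 0.0:
        # At or below the percolation threshold the network carries no load.
        return 0.0
    return C * x ** m


# Bending-dominated scaling at a solid fraction of 0.25
print(gibson_ashby_modulus(0.25), gibson_ashby_strength(0.25))
```

With the bending-dominated exponents, the relative strength decays more slowly with decreasing solid fraction than the relative modulus.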

Soyarslan et al. (2018) used Equation (3) to fit data computed from 3D Representative Volume Elements (RVE) of nanoporous microstructures. The RVEs are obtained using Cahn's method of generating a Gaussian random field by taking a superposition of standing sinusoidal waves that have fixed wavelength but are random in direction and phase. From the data for the macroscopic elastic modulus of the RVE for varying solid fraction, the percolation threshold for the random field microstructures is computed to be ϕ<sup>P</sup> = 0.159 with an exponent of m = 2.56. Moreover, it was found that the scaled genus per volume can be represented by an analytical expression that depends on the solid phase fraction, with its maximum value at a solid fraction of ϕ = 0.5 and reaching the percolation threshold at a solid fraction of ϕ = 0.159.

An equation very similar to Equation (3) has been proposed for modeling the macroscopic Young's modulus of porous microstructures produced by sintering (Phani and Niyogi, 1987):

$$\frac{E}{E\_0} = \left(1 - \frac{p}{p\_c}\right)^f. \tag{4}$$

In this equation, the variables p and p<sub>c</sub> represent the porosity and the percolation threshold, respectively. E<sub>0</sub> is the Young's modulus of the material free of pores, E<sub>0</sub> = E(p = 0). In the context of 3D percolation theory, the model assumes a value f = 3.75 for a cluster dominated by bond-bending forces when the size of the system tends to infinity in all dimensions (Sahimi, 1994, p. 185). Smaller samples and sample preparation can have a strong influence on the value of f, leading to lower values close to f = 1.2. The percolation threshold from different sources varies from 0.06 to 60 Vol% (Kováčik, 1999). The interpretation of the value of f in terms of pore geometry is discussed by Phani and Niyogi (1987) with respect to the grain morphology and pore structure of the material. They conclude that for larger values f ≈ 3, the pores deviate from the spherical shape and are interconnected to a certain extent. The lower the value of f, the more isometric and isolated the pore phase is, and vice versa. Equations (3) and (4), and thus the exponents m and f, are equivalent owing to the relation between the solid fraction and the porosity

$$
\varphi = 1 - p.\tag{5}
$$
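The equivalence can be verified in one line. As an assumption made here for the comparison, the percolation threshold expressed as a solid fraction is taken as ϕ<sup>P</sup> = 1 − p<sub>c</sub>, in line with Equation (5), and the prefactor in Equation (3) is set to C = 1 with E<sub>s</sub> = E<sub>0</sub>:

$$\frac{\varphi - \varphi^P}{1 - \varphi^P} = \frac{(1 - p) - (1 - p\_c)}{1 - (1 - p\_c)} = \frac{p\_c - p}{p\_c} = 1 - \frac{p}{p\_c}.$$

Under these assumptions, Equation (3) reduces to Equation (4) with m = f.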

Experimental work, including macroscopic testing (Liu et al., 2016; Liu and Jin, 2017) and 3D FIB tomography (Hu et al., 2016; Ziehmer et al., 2016), gives evidence that nanoporous metals, which can be interpreted as a network of nanosized ligaments, contain a considerable fraction of so-called dangling ligaments. They originate from pinch-off events during the coarsening of the nanoporous metal, due to atomic diffusion during heat treatment. Thus, using the solid fraction ϕ in Equations (1, 2) significantly overestimates the mass contributing to load transfer within the ligament network (Mameka et al., 2015; Hu et al., 2016; Liu et al., 2016; Liu and Jin, 2017). It is, therefore, proposed to make use of the effective solid fraction ϕ<sub>eff</sub>, which considers only the load-bearing mass of ligaments in the network. In this case, the effective solid fraction ϕ<sub>eff</sub> is determined indirectly via measurement of Young's modulus under compressive deformation, assuming Equation (1) to hold for the effective solid fraction (Liu et al., 2016; Liu and Jin, 2017; Jin et al., 2018).

For a spatial network structure with complex topological and morphological characteristics, the coordination number also plays an important role (Jinnai et al., 2001). The authors investigated 3D images of morphologies arising in an ordered block copolymer at equilibrium and a polymer blend during spinodal decomposition. They conclude that the coordination number is particularly important with regard to the assignment of bi-continuous morphologies, since it can be used to differentiate between closely related morphologies such as gyroid and diamond. Recent works investigate the skeletons of NPG obtained from FIB tomography and artificially generated structures and similarly report that mainly triple junctions and a few percent of quadruple junctions exist (Hu et al., 2016; Mangipudi et al., 2016). It can be speculated that the average coordination number is slightly higher than 3, which would be very close to the coordination number of the Gibson–Ashby unit cell (Gibson and Ashby, 1997).

Several finite element models (FEMs) simplify the 3D open-pore structure to cubic or diamond unit cells (Nachtrab, 2011; Liu and Antoniou, 2013; Huber et al., 2014; Husser et al., 2017). Hu et al. (2016) compare the simulation results from the 3D model of their FIB tomography of NPG with that of a Gibson–Ashby structure of the same solid fraction. The first FEM models built from 3D FIB tomography data were presented independently by Hu et al. (2016) and Mangipudi et al. (2016). The model of Hu et al. (2016) has been further refined by Richert and Huber (2018), who analyzed the detected ligament shapes and investigated the predictive capability of the FEM beam model in comparison to the 3D solid model of Hu et al. (2016). Soyarslan et al. (2018) used complex artificially generated structures and FEM solid modeling for validating an analytical solution that relates the solid fraction to the scaled genus density. This helps to explain the divergence of experimental and numerical data from the Gibson–Ashby scaling law for Young's modulus with decreasing solid fraction.

To investigate the effect of changing connectivity on the macroscopic properties at a constant solid fraction in a more general way, Nachtrab et al. modeled the behavior of metal foams based on a diamond structure (Nachtrab, 2011; Nachtrab et al., 2011, 2012). The reduction of the connectivity was included by splitting nodes with a coordination of 4 into two nodes, each with a coordination of 2. This led to fibrous structures with a percolation threshold p<sub>c</sub> close to 1. For the prediction of the mechanical properties of selected additively manufactured open-pore structures, a voxel-based FE scheme was used. This scheme is, however, computationally demanding and therefore significantly limited the number of investigated structures.

To get closer to realistic microstructures, we use RVEs that are built following the idea proposed by Huber et al. (2014), where NPG is modeled as a randomized diamond structure using beam elements. The approach allows us to define a solid fraction ϕ by the radius r and length l of the individual ligaments (Roschning and Huber, 2016). It is also possible to vary the ligament shape (Jiao and Huber, 2017a) and to integrate nodal masses for predicting both elastic and plastic mechanical behavior comparable to the RVE, which is built with solid elements, while maintaining the computational efficiency (Jiao and Huber, 2017b). This technique enables quantitative prediction of the macroscopic Young's modulus, Poisson's ratio, and yield strength for a large number of structures.

By mechanically deactivating randomly selected ligaments in a 3D network, pinched-off (or dangling) ligaments are systematically studied in this work for the first time. The remaining load-bearing ligaments form the mass that defines the effective solid fraction ϕ<sub>eff</sub>. In this way, we can shed new light on the effect of dangling ligaments in open-pore materials and expect to gain a more general and deeper understanding of the interdependencies between the coordination number, scaled genus density, effective solid fraction, percolation threshold, and the scaling behavior of mechanical properties for 3D network structures.

#### METHODS

If we use the notation of Equation (4), a fully connected structure consisting of a given number of cylindrical ligaments has a macroscopic Young's modulus E<sub>0</sub>. This value, corresponding to a cut fraction ζ = 0, can be computed pointwise using FEM simulations for a given solid fraction ϕ, defined by the ligament radius-to-length ratio r/l, i.e., E<sub>0</sub> = Ê<sub>0</sub>(ϕ) = E<sub>0</sub>(r/l), following the approach for the diamond structure (Huber et al., 2014; Roschning and Huber, 2016; Jiao and Huber, 2017a,b).

As soon as the cut fraction ζ reaches the percolation-to-cluster transition, the structure breaks and the mechanical stiffness becomes zero. Consequently, we can set the porosity p in Equation (4) to be equal to the cut fraction ζ, and the percolation threshold for cutting is defined by the parameter ζ<sub>c</sub>. In a more general form, Equation (4) suggests that the macroscopic Young's modulus can be written as a multiplicative decomposition

$$E = \hat{E}\_0(\varphi)\,\hat{E}\_c(\zeta). \tag{6}$$

While Ê<sub>0</sub>(ϕ) is well-investigated, we focus in this work on mining the relationship between Young's modulus and cut fraction, Ê<sub>c</sub>(ζ), from numerical data.
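The decomposition in Equation (6) can be sketched in a few lines. Here the cut-dependent factor is taken in the power-law form of Equation (4) with p replaced by ζ and p<sub>c</sub> by ζ<sub>c</sub>, which is only one candidate form; the function names and the example numbers are hypothetical.

```python
def modulus_cut_factor(zeta, zeta_c, f):
    """Candidate E_c(zeta): Equation (4) with p -> zeta and p_c -> zeta_c."""
    if zeta >= zeta_c:
        # Beyond the percolation threshold the structure has lost connectivity.
        return 0.0
    return (1.0 - zeta / zeta_c) ** f


def macroscopic_modulus(E0, zeta, zeta_c, f):
    """Equation (6): E = E_0(phi) * E_c(zeta), with E_0 supplied by the caller."""
    return E0 * modulus_cut_factor(zeta, zeta_c, f)


# Hypothetical numbers: E0 in MPa for the fully connected structure,
# threshold zeta_c = 0.5 and exponent f = 3 chosen purely for illustration.
print(macroscopic_modulus(E0=1000.0, zeta=0.25, zeta_c=0.5, f=3.0))
```

Keeping Ê<sub>0</sub> as an input reflects the separation made in the text: the dependence on the solid fraction is treated as known, and only the cut-dependent factor has to be learned from data.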

#### Finite Element Simulations

For all FEM simulations in this work, ABAQUS (Abaqus, 2014) was used, while the raw models were built based on the unit cells as defined in **Table 1** using Patran 2017 and then modified by Python scripting. Images of the FEM models and further details are provided in **Data Sheet 1**, **Supplementary Sections 1, 2**. As a substantial extension of previous work on FEM beam models, which concentrated exclusively on the diamond lattice with a coordination number of 4, the RVE beam-modeling technique (Huber et al., 2014; Roschning and Huber, 2016) is generalized in this work for structures with coordination numbers ranging from 8 (bcc) through 6 (simple cubic) and 4 (diamond) to 3 (Gibson–Ashby). For all structures, the bcc structure serves as reference, because in terms of the coordination number, a lower coordination can always be reached by cutting connections in a higher coordinated structure. By orienting the <111>-direction of the cubic structure along the loading direction, all investigated structures deform by bending, which is the major deformation mechanism of NPG (Huber et al., 2014; Griffiths et al., 2017; Jiao and Huber, 2017a). In this way, it is ensured that the scaling laws for the mechanical properties are based on the same deformation mechanism.

Notes to **Table 1**: Values are given for a unit cell of size a. Values for the percolation probability ϕ<sup>P</sup> are taken from Sykes and Essam (1964) for z ≥ 4. The value of ϕ<sup>P</sup> for z = 3 is computed using Equation (7).

The unit cells, as described in section Unit Cell Geometries, serve as building blocks for the generation of the RVE, which is described in section RVE Generation. Motivated by the high flexibility in the model setup and computational efficiency even for large 3D networks, all following unit cells and RVEs are built using the FEM beam model approach originally developed for the diamond structure (Huber et al., 2014). This approach has been thoroughly investigated and validated in subsequent works with respect to the solid fraction and macroscopic mechanical properties for cylindrical and parabolic ligaments (Roschning and Huber, 2016; Jiao and Huber, 2017a,b). The randomization of the structure was found to be an important parameter for adjusting the macroscopic behavior of the structures to experimental results, particularly for calibrating the elastic Poisson's ratio (Huber et al., 2014; Roschning and Huber, 2016; Lührs et al., 2017). So far, only fully connected 3D networks have been investigated. It is, thus, of obvious interest to quantitatively investigate the effect of cutting a fraction of the connections in the ligament network.

#### Unit Cell Geometries

The unit cell geometries for different coordination numbers z of 3 (Gibson–Ashby or GA), 4 (diamond or dia), 6 (simple cubic or cub), and 8 (bcc) are depicted in **Table 1**. The number of cuts until complete decohesion of the structure can be treated as a general problem of topology. A characteristic parameter is the percolation threshold ζ<sub>c</sub> = 1 − ϕ<sup>P</sup>, at which the structure loses its connectivity and the macroscopic mechanical properties become zero. For most structures used in this work, the critical percolation probabilities ϕ<sup>P</sup> for the "bond problem" are known (Domb and Sykes, 1961; Sykes and Essam, 1964), and the data can be summed up with a simple rule of thumb, which is valid with an accuracy of a few percent from z = 4 to z = 12 (Ziman, 1968):

$$1 - \zeta\_{c} = \varphi^{P} \cong \frac{1.5}{z}. \tag{7}$$

The data for the critical percolation probabilities ϕ<sup>P</sup> and percolation thresholds ζ<sub>c</sub> are included in **Table 1**. For the Gibson–Ashby structure, which is located at the lower end of connectivity, Equation (7) predicts a value of ζ<sub>c</sub> = 0.5.
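As a quick illustration of Equation (7), the rule of thumb can be evaluated for the four coordination numbers considered here. The function name is hypothetical, and the printed values are rule-of-thumb estimates only; the literature values collected in **Table 1** remain the reference for z ≥ 4.

```python
def percolation_threshold(z):
    """Equation (7): zeta_c = 1 - phi_P with phi_P ~ 1.5 / z (rule of thumb)."""
    phi_P = 1.5 / z
    return 1.0 - phi_P


structures = [("Gibson-Ashby", 3), ("diamond", 4), ("simple cubic", 6), ("bcc", 8)]
for name, z in structures:
    print(f"{name:13s} z = {z}  zeta_c ~ {percolation_threshold(z):.4f}")
```

The estimate grows with the coordination number, so a higher coordinated structure tolerates a larger fraction of cut ligaments before it loses connectivity.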

The unit cell defines the ligament length l dependent on the unit cell size a, as seen in **Table 1**. The solid fraction of the fully connected structure can be calibrated to any value via the ligament radius r for each structure. For the generation of the data in section Macroscopic Mechanical Properties, a solid fraction of ϕ = 0.25 was used, which is a typical value for NPG (Weissmüller et al., 2009). For simplicity, the solid volume was estimated as the total volume of N<sub>L</sub> cylindrical ligaments, V<sub>s</sub> = N<sub>L</sub>πr<sup>2</sup>l, with the numbers given in **Table 1**. The solid fraction is adjusted such that ϕ = V<sub>s</sub>/a<sup>3</sup> = 0.25, ignoring overlapping volumes or gaps in the cylindrical ligaments in the nodal area.
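Solving ϕ = N<sub>L</sub>πr<sup>2</sup>l/a<sup>3</sup> for the ligament radius gives the calibration step in closed form. The sketch below uses textbook values for the conventional diamond cubic cell (16 bonds of length √3·a/4) purely as an illustration; these numbers are an assumption and may differ from the conventions used in **Table 1**, and the function name is hypothetical.

```python
import math


def ligament_radius(phi, a, n_ligaments, length):
    """Radius r that satisfies phi = N_L * pi * r**2 * l / a**3."""
    return math.sqrt(phi * a ** 3 / (n_ligaments * math.pi * length))


a = 1.0                            # unit cell size (arbitrary length unit)
l_dia = math.sqrt(3.0) / 4.0 * a   # assumed: nearest-neighbour distance in diamond cubic
n_dia = 16                         # assumed: bonds per conventional diamond cubic cell

r = ligament_radius(0.25, a, n_dia, l_dia)
print(f"r = {r:.4f} a, r/l = {r / l_dia:.4f}")
```

Because overlaps and gaps at the nodes are ignored, the radius obtained this way is an estimate in the same sense as the volume formula itself.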

#### RVE Generation

The generation of periodic RVEs from unit cells is straightforward when the RVE boundaries are aligned with the unit cell boundaries. Because the simple cubic structure would normally deform under compression in its original orientation, the periodic cubic structure was generated to be large enough that after rotating the <111> direction into z-direction, a cube of the size of the RVE is completely filled. All ligaments penetrating the boundaries of the RVE were clipped at the boundary plane and the structure was cut to the size of the RVE.

For the fundamental investigation of the effect of cutting of 3D structures, represented by the dependency Ê<sub>c</sub>(ζ) in Equation (6), the problem can be simplified to the relevant information of connectivity. A refined modeling that considers the randomization of the structure (Huber et al., 2014; Roschning and Huber, 2016), the incorporation of variable ligament shapes (Jiao and Huber, 2017a), or nodal mass using the so-called nodal-corrected beam model (NCBM) (Jiao and Huber, 2017b) is related to the dependency Ê<sub>0</sub>(ϕ) in Equation (6). This is set aside for generating more realistic structures for validation in section Randomized Diamond Structures With Nodal Correction. For details on the generation of such RVEs, please refer to **Data Sheet 1** in the **Supplementary Section 1**.

In the diamond structure (Huber et al., 2014; Roschning and Huber, 2016), the boundary conditions are chosen as symmetry conditions applied to the nodes in the planes x = 0, y = 0, and z = 0. The load is applied as a homogeneous displacement of all nodes on the top side of the RVE, applying a compressive strain of maximum 15%. To capture the boundary conditions of a uniaxial compression experiment, all nodes on the remaining faces are free to move. For the mechanical properties of the solid fraction, Young's modulus E<sub>s</sub> = 80 GPa, Poisson's ratio ν = 0.42, yield strength σ<sub>y,s</sub> = 500 MPa, and work-hardening rate E<sub>T</sub> = 1000 MPa were chosen. These parameters represent the mechanical behavior of the ligaments in NPG reasonably well (Huber et al., 2014; Hu et al., 2016; Roschning and Huber, 2016).

The cut fraction ζ defines the number of cut ligaments relative to the total number of ligaments in the RVE. Cutting of ligaments is realized by setting the Young's modulus for a set of FE elements, which form a randomly selected ligament, to a low value of E<sub>cut</sub> = 10<sup>−3</sup>E<sub>s</sub>. This ensures that otherwise free-floating parts of the model remain connected, despite being mechanically negligible. In this way, convergence can be achieved in the FE simulations even for structures beyond the percolation threshold. By the random removal of ligaments from a higher coordinated structure, for example, a bcc structure, it is possible to provide an initial structure that has an average coordination number equal to that of lower coordinated structures. For example, by removing half of the ligaments, the bcc structure is turned into a structure with the same average coordination number as the diamond structure. For more details on the data structure and data processing, please refer to **Data Sheet 1** in the **Supplementary Section 2**.
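The cutting procedure can be sketched independently of the FE software. The following is not the author's script: the function names, the seeding, and the representation of ligaments as list indices are assumptions for illustration. The relation for the average coordination number assumes that every ligament connects two nodes and that the average is taken over all nodes of the original structure.

```python
import random


def cut_ligaments(n_ligaments, zeta, E_s=80.0e3, cut_factor=1.0e-3, seed=0):
    """Return one Young's modulus (MPa) per ligament after random cutting.

    A fraction zeta of the ligaments is selected at random and assigned the
    reduced modulus E_cut = cut_factor * E_s instead of being deleted.
    """
    rng = random.Random(seed)
    n_cut = round(zeta * n_ligaments)
    cut_ids = set(rng.sample(range(n_ligaments), n_cut))
    return [E_s * cut_factor if i in cut_ids else E_s for i in range(n_ligaments)]


def average_coordination(z_full, zeta):
    """Average coordination number after cutting a fraction zeta of ligaments."""
    return z_full * (1.0 - zeta)


moduli = cut_ligaments(n_ligaments=1000, zeta=0.5)
print(sum(1 for E in moduli if E < 80.0e3), "ligaments cut")
print(average_coordination(8, 0.5))  # bcc with half of the ligaments removed
```

Keeping the cut ligaments in the model with a negligible stiffness, rather than deleting them, mirrors the reasoning in the text: the stiffness matrix stays well-posed even when parts of the network would otherwise float freely.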

In previous works, an RVE size of 4 × 4 × 4 unit cells was used for the fully connected diamond structure, and effects of structural randomization were averaged from 5 to 10 realizations of RVEs of the same size (Huber et al., 2014; Roschning and Huber, 2016; Jiao and Huber, 2017a,b). Preliminary studies on diamond structures with random cutting of ligaments show that an RVE size of 6 × 6 × 6 unit cells represents a good compromise between accuracy and computational cost, as seen in **Data Sheet 1** in the **Supplementary Section 3**. At this size, the macroscopic Young's modulus is predicted with an accuracy of 5%, while elastic-plastic compression up to 15% strain takes 2 CPUh for a single realization. Furthermore, bcc and cubic structures serve to create representative structures with reduced coordination numbers by removing a given fraction of ligaments. For these two higher coordinated structures, the RVE size was therefore increased to 12 × 12 × 12 unit cells.

#### Artificial Neural Networks

Feed-forward artificial neural networks (ANN) (Haykin, 1998) are a machine-learning technique that enables the approximation of arbitrary non-linear relationships between multiple inputs and outputs (Yagawa and Okuda, 1996). An ANN can mathematically be represented as an operator that maps an input vector **x** to an output vector **y**

$$\mathbf{y} = \mathcal{N}(\mathbf{x}, \mathbf{w}).\tag{8}$$

The synaptic weights **w** of the flexible function 𝒩 are calibrated by training the ANN with patterns, consisting of pairs of input data **x** and desired outputs **d**. The training algorithm minimizes the error for all outputs and all patterns presented to the ANN during training through the iterative adjustment of the synaptic weights **w**. In this context, the training increments are counted in epochs. A percentage of the provided patterns (typically 10%) is kept for validation and is not used for training.

The error for a presented set of patterns is computed from the squared error for all K outputs and P patterns by the following equation:

$$E = \sum\_{p=0}^{P} \sum\_{k=0}^{K} \left(y\_k^{(p)} - d\_k^{(p)}\right)^2. \tag{9}$$

Normalizing the squared error E by KP yields the mean squared error MSE, which allows the comparison of results from different pattern sizes and neural network architectures. This is helpful for comparing the prediction quality of training and validation patterns, as denoted by MSE<sub>T</sub> and MSE<sub>V</sub>, respectively.
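The error measures can be written out explicitly. The sketch below assumes that the ANN outputs and the desired outputs are stored as nested lists with one row of K values per pattern; the function names and the sample numbers are hypothetical.

```python
def squared_error(outputs, targets):
    """Equation (9): sum of squared differences over all patterns and outputs."""
    return sum(
        (y - d) ** 2
        for y_row, d_row in zip(outputs, targets)
        for y, d in zip(y_row, d_row)
    )


def mean_squared_error(outputs, targets):
    """Squared error normalized by K * P (K outputs, P patterns)."""
    P = len(outputs)
    K = len(outputs[0])
    return squared_error(outputs, targets) / (K * P)


# Two patterns with two outputs each (made-up numbers)
y = [[1.0, 2.0], [3.0, 4.0]]
d = [[1.0, 0.0], [3.0, 2.0]]
print(squared_error(y, d), mean_squared_error(y, d))
```

Evaluating the same function once on the training patterns and once on the validation patterns gives the two quantities that are compared during training.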

Observing the development of the training and validation error during training provides important insight into whether the function N exists and if the information in the input data **x** is sufficiently complete for obtaining the desired outputs **d**. A very limited decay of the training error indicates that the problem at hand cannot be (uniquely) solved. A decay of the training error to low values along with a significant increase in validation error indicates overlearning, which leads to lack of generalization. In this case, the ANN tends to classify and memorize each pattern individually and is not able to interpolate between the patterns. When the ANN reaches low training and validation error, it can be used to predict the output for any input, provided that the data are within the training range of the ANN. For details about the ANN simulation software and its application to various problems in mechanics and materials science please refer to Huber et al. (2002), Tyulyukovskiy and Huber (2006), Tyulyukovskiy and Huber (2007), Willumeit et al. (2013), and Chupakhin et al. (2017).

#### MACROSCOPIC MECHANICAL PROPERTIES

This section addresses the general question as to whether a relationship exists between cut fraction and macroscopic mechanical properties and if so, how this relationship can be represented. This type of problem can be addressed by data mining. In addition, there are a number of specific questions. The literature suggests that the behavior of the mechanical properties follows a power-law behavior, as given in Equation (4). It is unclear whether the values for the percolation threshold from literature collected in **Table 1**, which were computed for the fundamental problems of ferromagnetic crystals and electron transport, can be transferred to our solid mechanics context.

Sykes and Essam (1964) propose Equation (7) for computing the percolation probability from the coordination numbers ranging from 4 to 12 with only a few percent error. The open question is how accurate this rule of thumb is for values below 4. If it still describes the overall dependency sufficiently well, we could speculate that once the average coordination number of a 3D network reaches a value of 1.5, there is no further cut possible without losing connectivity. This value appears to be surprisingly low.
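The value of 1.5 follows from a short counting argument, sketched here under the assumption that every ligament connects exactly two nodes and that the average is taken over all nodes of the original structure. Cutting a fraction ζ of the ligaments then reduces the average coordination number, written here as $\bar{z}$ (a symbol introduced for this sketch), proportionally:

$$\bar{z}(\zeta) = z\,(1 - \zeta), \qquad \bar{z}(\zeta\_c) = z\,\varphi^P \cong z \cdot \frac{1.5}{z} = 1.5.$$

If Equation (7) holds, the average coordination number at the threshold is therefore the same for every initial coordination number z.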

#### Study of Percolation Behavior

In this section, the behavior of the macroscopic Young's modulus and yield strength are studied for initially fully connected structures, as listed in **Table 1**, by varying the cut fraction ζ in 10% increments. The macroscopic Young's modulus was computed from the response of the RVE after the first loading increment, which is fully elastic. To determine the yield strength, the corresponding plastic strain was computed at each load increment and the macroscopic stress-plastic strain curve was interpolated to 0.2% of plastic strain. The results for the different structures under investigation are shown in **Figure 1**, normalized to the value of the corresponding fully connected structure. The scatter of five realizations for each cut fraction is visible from the symbols, representing the individual numerical results.
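The evaluation of the yield strength can be illustrated as follows. Two assumptions are made in this sketch that are not stated in the text: the plastic strain is obtained by subtracting the elastic part σ/E from the total macroscopic strain, and the interpolation between load increments is linear. The function name and the sample curve are hypothetical.

```python
def yield_strength(strains, stresses, E, offset=0.002):
    """Stress at a plastic strain of `offset`, or None if it is never reached."""
    # Assumed definition: plastic strain = total strain - elastic strain.
    plastic = [eps - sig / E for eps, sig in zip(strains, stresses)]
    for i in range(1, len(plastic)):
        p0, p1 = plastic[i - 1], plastic[i]
        if p0 <= offset <= p1 and p1 > p0:
            # Linear interpolation between the two bracketing increments.
            t = (offset - p0) / (p1 - p0)
            return stresses[i - 1] + t * (stresses[i] - stresses[i - 1])
    return None


# Made-up bilinear curve: modulus 100, first yield at stress 1.0
strains = [0.0, 0.01, 0.02, 0.03]
stresses = [0.0, 1.0, 1.1, 1.2]
print(yield_strength(strains, stresses, E=100.0))
```

Returning `None` when 0.2% plastic strain is never reached makes it explicit which simulations cannot contribute a yield strength value.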

For the normalized macroscopic Young's modulus, as shown in **Figure 1A**, the behavior was fitted by adjusting the exponent f in Equation (4), while the values for the percolation threshold p<sub>c</sub> from **Table 1** were inserted as a predefined parameter for the respective structure. It can be seen that the exponent f increases with decreasing coordination number. The fit results presented in **Figure 1A** confirm that the value of f cannot be understood as an invariant number, as suggested in the literature (Sahimi, 1994). Instead, it strongly depends on the initial structure under study. For high coordination numbers, the exponent tends toward 1, while for low coordination numbers it exceeds the value of 3, confirming the findings summarized by Kováčik (1999). However, in our work, this value is not related to the extension of a pore morphology, which could be interpreted as a network of cut ligaments within the RVE, but instead it is related to the coordination number of the respective fully connected structure.

It is striking how well the very same behavior applies to the yield strength shown in **Figure 1B**. The fit curves determined from the Young's modulus also show strong agreement with the strength data for diamond, cubic, and bcc structures. For the Gibson–Ashby structure, it is not clear whether this applies as well, because only a few simulations reached sufficient plastic strains. Owing to the missing statistics and the numerical issues, the few remaining data points are open to question. Irrespective of this uncertainty, the result strongly suggests that the Young's modulus and yield strength follow the very same behavior for partially cut structures.

#### Scaling of Mechanical Properties

The numerical experiment carried out in section Study of Percolation Behavior is based on the idea of random cutting of ligaments of an initially fully connected structure. For each type of structure, the degradation of the macroscopic mechanical properties follows a non-linear behavior that is defined by the coordination number of the corresponding fully connected structure. Following this line of thinking leads to the speculation that the behavior of a structure with a certain fraction of missing connections might be defined by the original unit cell structure.

On the other hand, simple math suggests that cutting 25% of the ligaments in a bcc structure with a coordination number z = 8, for example, yields the coordination number of the cubic structure (z = 6). The question at hand is whether the topology of a structure with higher coordination number can be effectively transformed into the topology of any structure of lower coordination number. It follows that the macroscopic properties for a given solid fraction are defined by the average coordination number via the second part of Equation (6). Ensuring a consistent scaling of mechanical properties, however, requires that the structures under consideration deform through the same mechanism. In this work, we therefore concentrate on structures that show bending as the dominant deformation mechanism (Huber et al., 2014).

#### Young's Modulus

The hypothesis presented in the previous paragraph is tested as follows: Starting from a fully connected bcc structure, a new starting structure is generated, in which a defined fraction of ligaments, ζini, is removed. The steps were chosen such that the connectivity is continuously reduced from 8 to 4 in steps of 1, i.e., ζini ∈ {0, 0.125, 0.25, 0.375, 0.5}. Again, for each of these structures, the macroscopic Young's modulus was subsequently calculated as a function of the cut fraction ζ, which was increased in 10% increments, with five random realizations for each increment. The results, analyzed according to section Study of Percolation Behavior in **Data Sheet 1** (**Supplementary Section 4**), suggest that it should be possible to combine the data from different structures in a single curve.

To this end, the total cut fraction ζtot is calculated from ζtot = 1−(1−ζini)(1−ζ ), where ζ is defined as the cut fraction relative to the remaining solid fraction of the pre-cut structure. All data related to the macroscopic Young's modulus E are normalized by the value computed for the fully connected bcc structure, which is denoted as E0,bcc. The results for the bcc structure are compiled in **Figure 2A** as black crosses.

FIGURE 2 | (A) Construction of the master curve derived from bcc data and validation with data obtained independently from fully connected cubic, diamond, and Gibson–Ashby (GA) structures. (B) Deviation between numerical data and the proposed master curve.

The curve constructed from the bcc data in **Figure 2A** clearly shows that the power law function Equation (4) is not capable of describing the behavior from the fully connected structure down to the percolation to cluster transition. Therefore, we suggest a function consisting of an initially linear descent for ζtot ≥ 0 with a sigmoidal transition toward the percolation threshold ζ<sub>c</sub> in the following form:

$$\frac{E}{E\_{0,bcc}} = \tilde{E}\left(\zeta\_{tot}\right) = 1 - a\_0 \zeta\_{tot} + \frac{a\_1}{1 + \exp\left(-a\_2\left(\zeta\_{tot} - \zeta\_c\right)\right)}, \quad 0 \le \zeta\_{tot} \le \zeta\_c. \tag{10}$$

The parameters in Equation (10) can be adjusted to satisfy the conditions E/E0,bcc = 0 and d(E/E0,bcc)/dζtot = 0 at ζtot = ζ<sub>c</sub> by setting a<sub>1</sub> = 2(a<sub>0</sub>ζ<sub>c</sub> − 1) and a<sub>2</sub> = 4a<sub>0</sub>/a<sub>1</sub>, because the sigmoidal term takes the value a<sub>1</sub>/2 and the slope a<sub>1</sub>a<sub>2</sub>/4 at ζtot = ζ<sub>c</sub>. The percolation threshold ζ<sub>c</sub> = 0.822 is taken from Sykes and Essam (1964); see also **Table 1** for the bcc structure. By using the literature value for infinite structure size, a treatment of the finite-size scaling effect (Sahimi, 1994; Nachtrab, 2011) in the numerical data can be avoided. This leaves a single adjustable parameter a<sub>0</sub>, which is determined from the linear slope of the numerical data at low total cut fractions as a<sub>0</sub> = 1.55. It follows that a<sub>1</sub> = 0.55 and a<sub>2</sub> = 11.31.
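The master curve of Equation (10) and its parameter constraints can be checked numerically with a short sketch. The function names `master_curve_parameters` and `master_curve` are chosen here for illustration and are not taken from the Supplementary code; the values a0 = 1.55 and ζc = 0.822 are those stated in the text.

```python
import math

ZETA_C = 0.822  # percolation threshold of the bcc structure (Sykes and Essam, 1964)
A0 = 1.55       # slope fitted to the bcc data at low total cut fractions

def master_curve_parameters(a0=A0, zeta_c=ZETA_C):
    """Return (a1, a2) from the two constraints at the percolation threshold."""
    a1 = 2.0 * (a0 * zeta_c - 1.0)  # enforces a vanishing value at zeta_c
    a2 = 4.0 * a0 / a1              # enforces a vanishing slope at zeta_c
    return a1, a2

def master_curve(zeta_tot, a0=A0, zeta_c=ZETA_C):
    """Normalized Young's modulus E/E0,bcc according to Equation (10)."""
    if not 0.0 <= zeta_tot <= zeta_c:
        raise ValueError("Equation (10) is defined for 0 <= zeta_tot <= zeta_c")
    a1, a2 = master_curve_parameters(a0, zeta_c)
    return 1.0 - a0 * zeta_tot + a1 / (1.0 + math.exp(-a2 * (zeta_tot - zeta_c)))

a1, a2 = master_curve_parameters()
```

Rounded to two decimals, `a1` and `a2` reproduce the values 0.55 and 11.31 quoted above, and `master_curve(ZETA_C)` vanishes to within floating-point accuracy.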

For validation of the master curve, the data from section Study of Percolation Behavior, computed for the cubic, diamond, and Gibson–Ashby structure, are included in **Figure 2A** according to the following procedure. As the coordination number of the bcc structure zbcc = 8 is used as reference, the calculation of the total cut fraction can be done in the following form

$$\zeta\_{tot} = 1 - \frac{z\_x}{z\_{bcc}} \left( 1 - \zeta\_{ini} \right) \left( 1 - \zeta \right), \tag{11}$$

where the term z<sub>x</sub>/zbcc scales the current structure x with a coordination number z<sub>x</sub> relative to that of the bcc reference structure zbcc and defines the starting point for the total cut fraction on the master curve. Alternatively, the average coordination number z̄ = z<sub>x</sub>(1 − ζini)(1 − ζ) of an RVE can be determined by averaging the coordination number over all the internal nodes within the RVE. Incorporating nodes at boundaries would add a bias toward lower coordination numbers. The numerical data confirm the following linear relationship:

$$
\overline{z} = z\_{bcc}(1 - \zeta\_{tot}).\tag{12}
$$

This is confirmed with an accuracy of 5% for all structures under investigation. The vertical adjustment of the starting point of an initially fully connected structure x is defined by normalizing the Young's modulus data, such that the value E0,x/E0,bcc calculated from Equation (10) for ζini = 0 and ζ = 0 is met. The corresponding values are given in **Data Sheet 1**, **Supplementary Table 1**.
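Equations (11) and (12) can be illustrated with a minimal sketch. The function names are hypothetical; the example reproduces the observation made earlier that a bcc structure with 25% of its ligaments cut and an uncut cubic structure (z = 6) share the same position on the master curve.

```python
Z_BCC = 8  # coordination number of the bcc reference structure

def total_cut_fraction(zeta, zeta_ini=0.0, z_x=Z_BCC):
    """Equation (11): total cut fraction relative to the fully connected bcc structure."""
    return 1.0 - (z_x / Z_BCC) * (1.0 - zeta_ini) * (1.0 - zeta)

def average_coordination_number(zeta_tot):
    """Equation (12): linear relation between total cut fraction and average coordination."""
    return Z_BCC * (1.0 - zeta_tot)

zeta_bcc_cut = total_cut_fraction(zeta=0.25)      # bcc with 25% of the ligaments cut
zeta_cubic = total_cut_fraction(zeta=0.0, z_x=6)  # uncut simple cubic structure
z_bar = average_coordination_number(zeta_cubic)
```

Both structures map to ζtot = 0.25, which corresponds to an average coordination number of 6.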

Furthermore, **Figure 2A** includes the data for the other structures (cub, dia, GA) of **Figure 1A** mapped to E/E0,bcc vs. ζtot. The overall agreement with the master curve appears to be very good. The quantitative comparison, as presented in **Figure 2B**, shows the deviation between the numerical data and the master curve with an error of < 2% for all structures, except for the uncut Gibson–Ashby structures, which show a deviation of 5%. Although this is better by a factor of two than the power law fit using Equation (4) in **Figure 1A**, it should be kept in mind that this accuracy is relative to the macroscopic Young's modulus E0,bcc of a fully connected bcc structure with a relatively high coordination number.

Remembering the strong agreement of the macroscopic yield strength data with the fit curves for macroscopic Young's modulus presented in **Figure 1B**, it can be expected that the same master curve as determined for Young's modulus can be applied to the macroscopic yield strength as well. This is shown in **Figure 3A** for bcc structures with different degrees of initial cutting. For low cut fractions (or high coordination numbers), the yield strength data fall about 3% below the master curve. However, with increasing cut fraction (or for lower coordination numbers), the difference reduces. For ζtot ≥ 0.55 (z ≤ 3.6), the two properties show a perfect match.

FIGURE 3 | (A) Macroscopic yield strength of bcc structures with different degrees of initial cutting, together with the master curve Equation (10), using the parameters as determined for macroscopic Young's modulus. (B) Validation of the master curve with data of fully connected simple cubic (cub), diamond (dia), and Gibson–Ashby (GA) structures.

The validation carried out by using data from the structures with originally different unit cell geometry and coordination number is shown in **Figure 3B**. It can be seen that the scatter is larger, particularly for the cubic structure. The cubic structure is also located a few percent above the values of the other structures. It can be argued that the cubic structure is the only one in which the unit cell does not agree with the coordinate directions of the RVE boundary. Due to the rotation in <111> direction, numerous ligaments are cut at the RVE boundary to form a cube of 12 × 12 × 12 unit cells. The shorter ligaments show a higher strength due to the reduced lever available for bending (Huber et al., 2014). It can therefore be concluded that σy/σy0,bcc = E/E0,bcc = Ẽ(ζtot) holds within the numerical accuracy and Equation (10) can be identically applied for predicting both the scaling behavior of the macroscopic Young's modulus and the yield strength.

#### Poisson's Ratio

The successful construction of a master curve for the macroscopic Young's modulus and yield strength from numerical data motivates the search for a master curve for Poisson's ratio. Starting from the bcc structure with increasing fraction of initial cuts leads to the behavior shown in **Figure 4A**. In contrast to the behavior of the Young's modulus and yield strength, the initial slope for low total cut fractions ζtot ≳ 0 is close to zero and then takes progressively negative values. At ζtot ≳ 0.7, the data show a minimum value. The scatter strongly increases while the curve changes direction toward larger values. As the structure rapidly loses connectivity with ζtot → 0.822, the lateral expansion of the RVE is based on very few connections within the 3D network, causing the large scatter and the change in the overall trend. It could be speculated that Poisson's ratio should theoretically continue downwards toward zero when approaching the percolation threshold. Based on this assumption, a simple fit function with a single adjustable parameter can be formulated that assumes an elliptic shape:

$$\frac{\nu}{\nu\_{0,bcc}} = \tilde{\nu}\left(\zeta\_{tot}\right) = \left(1 - \left(\frac{\zeta\_{tot}}{\zeta\_c}\right)^n\right)^{1/n}, \quad 0 \le \zeta\_{tot} \le \zeta\_c. \tag{13}$$

The master curve for Poisson's ratio, plotted in **Figure 4A** as a dashed curve, uses ζ<sub>c</sub> = 0.822 as fixed percolation threshold for the bcc structure, similar to Equation (10), and an exponent n = 1.75.
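The elliptic fit function of Equation (13) can be evaluated with the following sketch. The function name `poisson_master_curve` is an assumption made for illustration; the default values n = 1.75 and ζc = 0.822 are those given in the text.

```python
def poisson_master_curve(zeta_tot, n=1.75, zeta_c=0.822):
    """Normalized Poisson's ratio according to Equation (13)."""
    if not 0.0 <= zeta_tot <= zeta_c:
        raise ValueError("Equation (13) is defined for 0 <= zeta_tot <= zeta_c")
    return (1.0 - (zeta_tot / zeta_c) ** n) ** (1.0 / n)

# Evaluate the curve on a coarse grid between the uncut state and the threshold
grid = [0.822 * k / 10 for k in range(11)]
values = [poisson_master_curve(min(z, 0.822)) for z in grid]
```

The curve starts at 1 with a nearly horizontal tangent, decreases monotonically, and reaches zero at the percolation threshold, which reflects the assumption that Poisson's ratio continues toward zero.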

In contrast to the macroscopic Young's modulus and strength, which are measured in loading direction, Poisson's ratio characterizes the lateral expansion normal to the loading direction. It is, therefore, not obvious that the master curve can also apply to structures built from very different unit cells, as their deformation mechanisms could significantly differ. However, both the simple cubic and the diamond structure agree equally well with the master curve. Interestingly, the diamond structure, which starts as a fully connected structure at the low coordination number of z = 4, shows a further continuation of the downwards trend along the master curve and confirms the hypothesis that Poisson's ratio should actually continue toward zero as the percolation threshold is approached. This hypothesis is further supported through additional simulations conducted for the low-coordinated Gibson–Ashby structure, loaded in <111> direction, which are incorporated in **Figure 4B**.

#### Relationship Between Scaled Genus Density and Average Coordination Number

Throughout the previous analysis, the total cut fraction ζtot was used as an independent variable for the characterization of the connectivity. By this approach, common issues with determining the percolation threshold p<sub>c</sub> and exponent could be avoided. For measuring the total cut fraction of a real structure, e.g., from a skeleton of a FIB tomography (Hu et al., 2016; Ziehmer et al., 2016; Hu, 2017; Richert and Huber, 2018), the related fully connected reference is required; however, this is unknown. Alternatively, the average coordination number z̄ of a 3D network could be measured, because it is connected with the total cut fraction by the linear relationship, as given in Equation (12). But even if the skeleton of a structure is available, the determination of the average coordination number z̄, as defined in this work, is difficult.

By averaging the coordination numbers of all junctions, Nz, the average coordination number z should be obtained. The problem is that any junction with fewer than three connections cannot be recognized. A junction that connects two branches is invisible because the two branches form a single longer branch. A node that has lost all connections physically reduces to a void junction, which is undetectable in any case. Thus, one would naturally obtain z = 3 as the lower limit, irrespective of how many more cuts are introduced in a structure. This is consistent with the results of Ioannidis and Chatzis (2000), where only pores with z ≥ 3 are considered as valid nodes in topological context. Consequently, with ongoing removal of connections, the number of detectable junctions starts to decrease at the same time.

A third parameter that is frequently used is the genus density g<sub>V</sub>. The genus g is the maximum number of non-intersecting closed curves along which the object can be cut without dividing it into two parts (Richeson, 2008). As no internal pores are present in our structures, the genus equals the connectivity. For 3D networks consisting of solid struts, as represented by a graph G, the genus g is calculated from the Euler characteristic χ(G) = 1 − g, where χ(G) := V − E, with V and E being the number of graph vertices and the number of graph edges, respectively (Nachtrab, 2011; Hu et al., 2016). Note that this calculation of the genus assumes connected structures. As we do not account for the formation of free floating clusters, this can lead to negative values of g, because the formation of clusters and the cutting of all load-bearing rings may happen before reaching the percolation threshold.
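The genus calculation from the Euler characteristic can be sketched for a small graph. The function `genus` and the cube example are illustrative assumptions and not part of the original workflow; vertices that have lost all edges are passed explicitly so that they still count toward V.

```python
from itertools import product

def genus(edges, vertices=None):
    """Genus g = 1 - chi(G) with chi(G) = V - E, assuming a connected graph."""
    edge_set = {frozenset(edge) for edge in edges}
    vertex_set = set(vertices) if vertices is not None else set()
    for edge in edge_set:
        vertex_set.update(edge)
    return 1 - (len(vertex_set) - len(edge_set))

# Wireframe of a single cube: 8 vertices, 12 edges between nearest neighbours
cube_vertices = list(product((0, 1), repeat=3))
cube_edges = [(u, v) for u in cube_vertices for v in cube_vertices
              if u < v and sum(a != b for a, b in zip(u, v)) == 1]

g_cube = genus(cube_edges)                  # five independent rings
g_tree = genus([(0, 1), (1, 2), (2, 3)])    # a chain without rings
g_clusters = genus([], vertices=[0, 1])     # two disconnected nodes
```

The last case gives g = −1 and shows how disconnected clusters lead to negative values when the connectedness assumption is violated.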

Because the genus increases with increasing structure size, it is commonly scaled to a characteristic volume, g<sub>V</sub> = g/V<sub>c</sub>. To compare the topology of different structures, the dimensionless product g<sub>V</sub>S<sub>V</sub><sup>−3</sup> is used. In the context of nanoporous metals, 1/S<sub>V</sub> is typically chosen as characteristic length, representing the reciprocal of the interfacial area per volume of a given system (Kwon et al., 2010). This definition can be applied to 3D solid structures with an interface separating the solid fraction and the pore space, for which all characteristic lengths are linearly dependent due to the geometrical similarity of the structure under investigation (Kwon et al., 2010; Hu et al., 2016; Mangipudi et al., 2016; Hu, 2017). Therefore, the importance of the characteristic length scale for the normalization and the associated challenges in its experimental determination are still under debate (Lilleodden and Voorhees, 2018).

The large data set for various structures sheds some light onto this. The way in which the structures have been generated in this work enables the setting of any arbitrarily chosen value for the ligament radius, independent of the topology of the structure. Consequently, the interfacial area is fully decoupled from the genus, which is in contrast to the approach of generating artificial nanoporous structures based on the Cahn–Hilliard equation (Kwon et al., 2010; Sun et al., 2013; Mangipudi et al., 2016; Soyarslan et al., 2018). Moreover, Soyarslan et al. (2018) could show for this type of structures that the solid fraction controls the scaled genus density and a closed form relationship exists that uniquely relates the two quantities to each other.

FIGURE 5 | Calculated scaled genus density plotted vs. average coordination number for different structures and cut fractions: (A) genus per unit cell volume vs. average coordination number. (B) fingerprint of various definitions for the characteristic length l<sub>V</sub> with the condition g′<sub>V</sub> l<sub>V</sub><sup>3</sup> = const fulfilled only for the characteristic length l<sub>V,J</sub> (green).

By using the large set of data for structures covering a large range of coordination numbers and cut fractions, we are able to determine which characteristic length, more generally denoted as l<sub>V</sub>, allows the transfer of results for the scaled genus density among the different structures. For comparing RVEs of different sizes, all results are normalized by the number of unit cells in the model, i.e., g<sub>V</sub> = (1 − χ(RVE))/N<sup>3</sup>, where N = 12 for all bcc- and cubic-based structures, and N = 6 for all diamond- and Gibson–Ashby-based structures. The results for g<sub>V</sub> are plotted in **Figure 5A**. All curves intersect at g<sub>V</sub> = 0, which indicates that the genus is correctly calculated, as this particular point should be common for all structures, independent of the scaling. The data suggest that the intersection with g<sub>V</sub> = 0 corresponds to an average coordination number z̄ ≲ 2. Below this point, i.e., for g<sub>V</sub> < 0, clusters form and the mechanical properties are zero. For z̄ > 2, the curves g<sub>V</sub>(z̄) separate because the different unit cells have a different genus, as seen in **Table 1**.

We can now derive a fingerprint from the data in **Figure 5A**, which supports the search for the characteristic length l<sub>V</sub>. Following Kwon et al. (2010), the scaled genus density g<sub>V</sub><sup>∗</sup>(z̄) is defined as

$$g\_V^\*(\overline{z}) := g\_V(\overline{z})\, l\_V^3. \tag{14}$$

As long as the structures under investigation are self-similar, any characteristic length l<sub>V</sub> can be chosen, such as 1/S<sub>V</sub> or the ligament diameter ⟨D⟩ (Hu et al., 2016). However, when the self-similarity is no longer conserved, we need to select a characteristic length that works for all structures. Our data set supports the search for l<sub>V</sub> to fulfil the condition g<sub>V</sub><sup>∗</sup>(z̄) = const, independent of the structure. As can be seen in **Figure 5A**, g<sub>V</sub> depends linearly on z̄ in the upper right area of the plot. In this region, the condition g<sub>V</sub><sup>∗</sup>(z̄) = const can be replaced by g<sub>V</sub><sup>∗</sup>′ = g′<sub>V</sub> l<sub>V</sub><sup>3</sup> = const. The slopes g′<sub>V</sub> characterizing g<sub>V</sub>(z̄) > 1 are listed in **Table 1**.

A number of possible characteristic lengths l<sub>V</sub> can be obtained from the geometrical parameters defining the structure of the different unit cells, such as the coordination number z, the ligament length l, the ligament radius r, the number of ligaments per unit cell N<sub>L</sub>, and the number of junctions per unit cell N<sub>J</sub>. We can exclude the coordination number z as the independent variable, as well as combinations with the ligament radius r for the aforementioned reasons. As one example of this category of characteristic lengths, the inverse of the ratio of surface area by volume is tested. Normalizing the unit cell volume by the surface area of the cylindrical ligaments in the unit cell, we can estimate S<sub>V</sub><sup>−1</sup> by l<sub>V,S</sub> := a<sup>3</sup>/(N<sub>L</sub>2πrl). The other characteristic lengths are the ligament length l<sub>V,l</sub> := l, the total ligament length in the unit cell, l<sub>V,ltot</sub> := N<sub>L</sub>l, and characteristic lengths calculated from the volume per junction and from the volume per ligament, as l<sub>V,J</sub> := (a<sup>3</sup>/N<sub>J</sub>)<sup>1/3</sup> and l<sub>V,L</sub> := (a<sup>3</sup>/N<sub>L</sub>)<sup>1/3</sup>, respectively.

The dependency of g′<sub>V</sub> l<sub>V</sub><sup>3</sup> for the different definitions of l<sub>V</sub> plotted in **Figure 5B** reveals that only the characteristic length l<sub>V,J</sub> satisfies the condition g′<sub>V</sub> l<sub>V</sub><sup>3</sup> = const. If this is inserted in Equation (14), we get

$$g\_V^\*\left(\overline{z}\right) := \frac{g\left(\overline{z}\right)}{N^3 N\_J} = \frac{g\left(\overline{z}\right)}{N\_{J,RVE}}.\tag{15}$$

Therefore, the definition of a scaled genus density, which combines all structures in a single curve, requires a normalization of the genus by the number of junctions of the original, fully connected structure, NJ,RVE, given by the number of unit cells in the RVE, N<sup>3</sup>, multiplied by the number of junctions per unit cell, N<sub>J</sub>. This finding is consistent with Ioannidis and Lang (1998) and Ioannidis and Chatzis (2000), where the genus per node was used.

By knowing the characteristic length for scaling the genus density, we can derive a closed form relationship for g<sub>V</sub><sup>∗</sup> depending on the RVE size N, which can be analyzed for any unit cell, as seen in **Data Sheet 1**, **Supplementary Section 5**. The results shown in **Figure 6A** reveal that structures with a scaled genus density that is sufficiently insensitive to the surface require an RVE size in the order of 100. Thus, the relationship g<sub>V</sub><sup>∗</sup>(z̄) that holds for large structures should be determined from the analytical solution for the infinite RVE size. To confirm this approach, the numerical and analytical data are plotted in **Figure 6B**. The strong agreement for RVE sizes of 6 to 12 with corresponding curves for finite structure size validates the analytical solution provided in Supplementary Equations (9–14) in **Data Sheet 1**.

FIGURE 6 | (A) Scaled genus density vs. RVE size for different structures calculated from the analytical solution in Data Sheet 1, Supplementary Section 5. (B) Scaled genus density vs. average coordination number calculated for the RVEs, compared to the analytical solution dependent on the RVE size.

For a periodic structure of infinite size, the scaled genus density can be calculated analytically depending on the RVE size, as seen in **Data Sheet 1**, **Supplementary Figure 8**, with values given in **Data Sheet 1**, **Supplementary Table 2**. The numerical data extend the relationship between the genus and the average coordination number for infinite structure size and z̄ ≥ 3 as given by Ioannidis and Lang (1998) and Ioannidis and Chatzis (2000) to lower values:

$$g\_V^\*\left(\overline{z}\right) = \frac{g\left(\overline{z}\right)}{N\_{J,RVE}} = \overline{z}/2 - 1,\ \overline{z} \ge 2.\tag{16}$$

Equation (16) does not predict the nonlinear runout, which is clearly visible in **Figure 5A** for bcc and diamond and in **Figure 6B**, where the data show a curvature deviating to the left for z < 2, relative to the linear extrapolation of the analytical solution for N = ∞. This behavior is a result of the formation of clusters at the lowest average coordination numbers close to and beyond the percolation to cluster transition.
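The linear part of Equation (16) can be made plausible by simple counting. The sketch below is an illustration under stated assumptions, not the analytical solution of the Supplementary Material: it considers a hypothetical periodic simple cubic lattice, keeps all junctions of the fully connected structure as graph vertices, and ignores boundary effects and cluster formation. Because each ligament connects two junctions, the number of ligaments equals z̄ NJ,RVE/2, so that g/NJ,RVE = z̄/2 − 1 + 1/NJ,RVE.

```python
def genus_per_junction(n_junctions, n_ligaments):
    """Scaled genus density g / N_J with g = 1 - (V - E), all junctions kept as vertices."""
    return (1 - (n_junctions - n_ligaments)) / n_junctions

def mean_coordination(n_junctions, n_ligaments):
    """Each ligament contributes to the coordination number of two junctions."""
    return 2.0 * n_ligaments / n_junctions

N = 12                               # unit cells per edge (hypothetical periodic lattice)
n_junctions = N ** 3                 # one junction per simple cubic unit cell
n_ligaments_full = 3 * n_junctions   # z = 6, each ligament shared by two junctions

results = []
for zeta in (0.0, 0.25, 0.5):
    n_ligaments = round((1.0 - zeta) * n_ligaments_full)
    z_bar = mean_coordination(n_junctions, n_ligaments)
    g_star = genus_per_junction(n_junctions, n_ligaments)
    results.append((z_bar, g_star, z_bar / 2.0 - 1.0))
```

In this idealized setting, the counted value differs from z̄/2 − 1 only by the term 1/NJ,RVE, which vanishes for infinite structure size.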

It thereby follows that the scaled genus density is independent of the structure if the genus g is normalized to the number of nodes NJ,RVE in the fully connected structure. Other characteristic lengths, such as the reciprocal of the interfacial area per volume of a given system (Kwon et al., 2010) or the mean ligament diameter ⟨D⟩ (Hu et al., 2016), work for structures that are self-similar, but they do not allow a comparison between results from non-similar structures.

Another important result is the unique relationship between the scaled genus density and the average coordination number, which is linear as long as z̄ ≥ 2. Whether the genus might nevertheless provide additional linearly independent information on the topology is an important question that is investigated in the following section.

#### MACHINE LEARNING

From section Scaling of Mechanical Properties, we know the percolation threshold ζ<sub>c</sub>, at which all mechanical properties reach the value of zero. Inserting this value in Equation (12) for ζtot leads to the corresponding minimum average coordination number zmin ≈ 1.5. It is shown in section Relationship Between Scaled Genus Density and Average Coordination Number that the genus reaches zero at z̄ = 2. It remains an open question how meaningful data are for values below z̄ = 2. In any case, the valid range from z̄ = 2 to 3, which corresponds to a positive genus, has not been touched upon so far in topology for the reasons explained by Ioannidis and Chatzis (2000). As a consequence, all structures approaching z̄ = 3 are systematically overestimated with respect to their coordination number. In the following section, the difficult task of interpreting topological data for lowest average coordination numbers is solved via machine learning.

#### Determination of Average Coordination Number

For an overview, a number of ANNs are trained and analyzed for different choices of inputs **x**. Starting with a complete set, more and more inputs that are hard or impossible to measure are removed. The investigated cases are summarized in **Table 2**, together with the architecture of the ANNs and the achieved MSE values. The input is formed by the statistics of local connectivity. For each structure, we count the number of junctions for each coordination number z, starting from the lowest value of z = 0 to the highest value z = 8, denoted by N<sub>0</sub> to N<sub>8</sub>. They are normalized by the total of detectable junctions, N<sup>(zmin)</sup> = Σ<sub>z=zmin</sub><sup>8</sup> N<sub>z</sub>. All data for creation of the patterns are generated from the whole set of structural models presented in section Macroscopic Mechanical Properties, including all variants of initial cuts and subsequent cutting. In total, 585 patterns are used, of which 10% are kept for validation. Each ANN is trained for 20,000 epochs with no sign of overlearning. As common output definition for all variants ANN0 to ANN3, a single output neuron is used to predict the average coordination number z̄, which is computed for each pattern by

$$y := \overline{z} = \sum\_{z=0}^{8} z N\_z / N^{(0)}.\tag{17}$$

The errors collected in **Table 2** show that the mean squared error increases, as expected, with reduction in inputs for junctions with lower coordination numbers. For ANN3, the uncertainty increases particularly for very low average coordination numbers, as shown in **Figure 7A**. For obvious reasons, all data from the Gibson–Ashby structure lead to a constant output value (highlighted in purple). The distribution of the predicted values for ANN1 (red symbols in **Figure 7A**) confirms that the missing information about the number of void junctions can be largely reconstructed from the remaining data derived from all other structures and has almost no effect, even for lowest coordination numbers.

For visualizing the performance of the ANN, an estimate of the average coordination number z is calculated by averaging all coordination numbers provided to the input for ANN3:

$$\tilde{z} = \sum\_{z=3}^{8} z N\_z / N^{(3)}.\tag{18}$$

It can be seen from **Figure 7A** that the machine-learning approach has the capability of interpreting the presented data in the context of the information of all structures provided during training. Knowing the big picture obviously helps to reconstruct missing information in the input data with reasonable accuracy. This can be understood by visualizing the statistical distribution of the local coordination numbers, which follow typical patterns according to the probability of cutting, as seen in **Data Sheet 1**, **Supplementary Section 6**.
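The difference between Equations (17) and (18) can be illustrated with a short sketch. It assumes, purely for illustration, that ligaments at a junction of a bcc structure survive independently with a fixed probability, so that the junction counts follow a binomial distribution; the actual distributions are those of **Data Sheet 1**, **Supplementary Section 6**. The function names are hypothetical.

```python
from math import comb

def coordination_histogram(z_full, survival_probability):
    """Assumed binomial fractions of junctions with z = 0 ... z_full remaining ligaments."""
    p = survival_probability
    return [comb(z_full, z) * p ** z * (1.0 - p) ** (z_full - z)
            for z in range(z_full + 1)]

def average_coordination(counts, z_min=0):
    """Equation (17) for z_min = 0 and Equation (18) for z_min = 3."""
    total = sum(counts[z_min:])
    return sum(z * n for z, n in enumerate(counts) if z >= z_min) / total

def ann_input(counts, z_min):
    """Junction counts for z >= z_min, normalized by the number of detectable junctions."""
    total = sum(counts[z_min:])
    return [n / total for n in counts[z_min:]]

counts = coordination_histogram(z_full=8, survival_probability=0.4)
z_bar = average_coordination(counts)             # all junctions, Equation (17)
z_tilde = average_coordination(counts, z_min=3)  # detectable junctions only, Equation (18)
x = ann_input(counts, z_min=3)
```

Under this assumption, the estimate from detectable junctions can never fall below 3 and lies above the true average of 3.2, which is the kind of systematic overestimation the ANN is trained to correct.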

TABLE 2 | ANN definitions and squared errors after training for varying degree of information about junctions with low coordination number.

Section Relationship Between Scaled Genus Density and Average Coordination Number leaves us with the question of whether the genus could provide additional linearly independent information on the average coordination number. To investigate this, the input definition of the ANN can be enriched by adding an estimated scaled genus density using the accessible number of junctions in the structure, g/N<sup>(3)</sup>. Using such data, however, limits the generality of the approach to perfectly ordered structures. As soon as structures are randomized or cut in planes that do not meet planes of the unit cells, the genus is biased to lower values due to the cutting of originally closed curves. Thus, the incorporation of the genus requires an input definition that is insensitive to the boundary, similarly to the computation of z̄ from junctions located inside the RVE. To this end, ligaments that touch the boundary are removed from the structure before calculating the genus of the inner graph, denoted as g<sup>(i)</sup>. The normalization by the number of detectable junctions inside the RVE boundaries, corrected to a structure of infinite size via the factor g<sub>V,∞</sub>/g<sub>V,RVE</sub> (see **Data Sheet 1**, **Supplementary Table 3**), leads to an additional input g<sup>(i)</sup>/N<sup>(3)</sup> := g<sup>(i)</sup><sub>RVE</sub>/N<sup>(3)</sup> · g<sub>V,∞</sub>/g<sub>V,RVE</sub>. This input definition works without any knowledge about the fully connected structure.

After training, this neural network, denoted as ANN3gi, performs better than ANN2 by a factor of 3, as can be seen from the mean squared errors in **Table 2** and the predicted output data plotted in **Figure 7B**. This shows that the additional information on the scaled genus density, despite being a rough estimate of the mathematically correct value, particularly helps in reducing the uncertainty for z ≤ 3.

#### Young's Modulus and Yield Strength

The master curve Equation (10) developed in section Young's Modulus brings the data generated for different structures very close to a single curve. For our data set, the accuracy compared to the master curve can be improved without limiting the generality of the approach. A second artificial neural network is trained, which corrects the macroscopic Young's modulus for each pattern relative to the prediction of the master curve Ẽ(ζtot), given by Equation (10). For reasons of consistency, we use the same input definition as used for ANN3 (see **Table 2**) but apply the following output definition:

$$y := E/E\_{0,bcc} - \tilde{E}(\zeta\_{tot}).\tag{19}$$

After this artificial neural network, denoted as ANN3E, is trained, the mean squared training and validation errors come to MSE<sub>T</sub> = 1.21 · 10<sup>−3</sup> and MSE<sub>V</sub> = 1.03 · 10<sup>−3</sup>, respectively. An accuracy of ±0.01 for the output is reached, which is an improvement by a factor of 2 compared to the master curve. Only a few data points are located outside this limit. Trials including the estimate of the scaled genus density in the input definition, as used for ANN3gi, do not improve the result. This is possibly because this additional information is only relevant close to and beyond the percolation threshold, where the mechanical properties are anyway approaching zero.
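The output definition of Equation (19) means that the network learns only the deviation from the master curve, which is added back to Equation (10) for a prediction. A minimal sketch of this bookkeeping is given below; the function names and the example pattern are hypothetical, and the trained network is represented only by the correction value it would return.

```python
import math

A0, ZETA_C = 1.55, 0.822
A1 = 2.0 * (A0 * ZETA_C - 1.0)
A2 = 4.0 * A0 / A1

def master_curve(zeta_tot):
    """Equation (10) with the parameters determined for the bcc data."""
    return 1.0 - A0 * zeta_tot + A1 / (1.0 + math.exp(-A2 * (zeta_tot - ZETA_C)))

def correction_target(e_normalized, zeta_tot):
    """Equation (19): training target as deviation from the master curve."""
    return e_normalized - master_curve(zeta_tot)

def corrected_modulus(zeta_tot, predicted_correction):
    """Recombine the master curve with a correction predicted by the network."""
    return master_curve(zeta_tot) + predicted_correction

# Hypothetical pattern (illustrative numbers, not data from this study)
zeta_tot, e_normalized = 0.30, 0.55
y = correction_target(e_normalized, zeta_tot)
e_recovered = corrected_modulus(zeta_tot, y)
```

Training on the deviation keeps the target values small and centered near zero, so the network only has to represent the structure-dependent scatter around the master curve.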

In the same way as for the macroscopic Young's modulus, an artificial neural network ANN3sy is trained for correcting the macroscopic yield strength relative to the master curve, with the inputs as defined for ANN3 and the output definition being

$$\chi := \sigma_{y}/\sigma_{y0,bcc} - \tilde{\sigma}_{y}(\zeta_{tot}),\tag{20}$$

where σ˜y(ζtot) ≡ E˜(ζtot) is given by Equation (10). The yield strength shows a larger scatter in the numerical data and also larger deviations from the master curve, as seen in **Figure 3**. As a neural network interprets only the general relationship hidden in the data as a whole, the scatter of the data is also reflected in the overall training error, which is double that of the Young's modulus. The resulting mean squared training and validation errors are MSE<sup>T</sup> = 5.22 · 10<sup>−4</sup> and MSE<sup>V</sup> = 4.99 · 10<sup>−4</sup>, respectively. Relative to the span of the training range, from −0.03 to 0.06, the accuracy is improved by a factor of about 4, leaving a remaining uncertainty of ±0.01. This uncertainty results from the sensitivity of the RVEs to local plastic yielding, which seems to be influenced more strongly by the realization of random cutting than the macroscopic Young's modulus.

An overview of the workflow developed in sections Macroscopic Mechanical Properties and 4 is given in **Data Sheet 1**, **Supplementary Section 7**. The Supplementary data files (**Data Sheet 2**) for training and validation of the ANNs are specified in **Data Sheet 1**, **Supplementary Section 8**; the ANNs are provided in **Data Sheet 3** as Supplementary Python code including selected example problems as described in **Data Sheet 1**, **Supplementary Section 9**.

# VALIDATION AND APPLICATION

#### Randomized Diamond Structures With Nodal Correction

Literature on NPG, studying the topological properties from artificially generated 3D structures and 3D FIB tomography, reports a large number of three-fold junctions and a smaller number of quadruple junctions (Mangipudi et al., 2016). A diamond structure, as suggested by Huber et al. (2014), can be tuned using the cut fraction to meet any ratio of three-fold and quadruple junctions. The randomization of the finite element beam model by an additional parameter A as a multiple of the unit cell size allows the prediction of realistic macroscopic properties, including Poisson's ratio (Huber et al., 2014; Roschning and Huber, 2016). A nodal-corrected beam model can be applied, mimicking the effect of the nodal mass on the deformation behavior similar to a solid model (Jiao and Huber, 2017b).

To validate the approach developed in this work, we use such an extended model, which describes the elastic-plastic deformation behavior of NPG more realistically compared to the perfect 3D periodic structures without nodal masses employed in the previous section. To this end, additional structures with randomization values ranging from A = 0.1 to A = 0.3 and cut fractions ζ up to 0.4 are generated. Examples of RVEs of size N = 6 unit cells are given in **Data Sheet 1**, **Supplementary Figure 3**.

For all randomized structures, the chosen ligament radius is r/a = 0.118. The geometry and property parameters for the nodal-corrected beam model are given in **Data Sheet 1**, **Supplementary Section 1**. In addition to the randomization, the major difference from perfectly ordered crystals is that distorted ligaments are now cut at the RVE boundaries. With increasing degree of randomization, the RVE also loses junctions that are shifted outside the RVE boundaries. This allows the approach to be tested for more general structures.

#### Topology

**Figure 8A** presents the results for the determined average coordination number z vs. the correct values. The solid curve indicates the exact solution. The ANN3 outputs (circles) agree for all three randomizations and are very close to the exact solution, with a slight tendency toward underestimation by about 0.1. The results for the highest cut fraction ζ = 0.4 show the highest scatter toward low values, by an average of −0.2 (on average 10% deviation), due to missing information on the statistics for coordination numbers z < 3. The comparison with the estimate z˜, on the other hand, shows that ANN3 significantly improves the determination of the average coordination number for z < 3.5. The accuracy is further improved by including the additional input g<sup>(i)</sup>/N<sup>(3)</sup> (ANN3gi, cross symbols). The scatter is reduced compared to ANN3 and the outputs are very close to the correct values. Only for the lowest values of z does a slight underestimation along with some scatter occur.

Data Sheet 1, Supplementary Sections 1–3.

Based on the value of z, the scaled genus density g<sup>∗</sup><sub>V</sub>(z) = g/N<sub>J,RVE</sub> is calculated from Equation (16), as seen in **Figure 8B**. After scaling the data for the perfectly ordered crystal (A = 0.0, blue diamonds) to infinite structure size by a factor of gV,∞/gV,RVE = 1.314 (see **Data Sheet 1**, **Supplementary Table 3**), they fall nicely onto the master curve. As expected, the genus falls immediately below the master curve for the randomized structures, because about 50% of the distorted ligaments are now cut by the RVE boundary. A similar effect occurs in the analysis of tomographic data, where the boundary of the inspected volume is introduced artificially and does not exist as a real boundary in the larger sample. In this sense, the elevated values from the master curve Equation (16), calculated with the identified average coordination number from ANN3gi, reflect the scaled genus density of the infinite-size structure.

#### Macroscopic Young's Modulus and Yield Strength

According to the workflow depicted in **Data Sheet 1**, **Supplementary Figure 10**, the total cut fraction ζtot, calculated from z, serves as the key input for the prediction of the mechanical properties based on the master curves, Equations (10, 13). Equation (11) determines the relevant part of the master curves. For zdia/zbcc = 0.5 and ζ = 0 to 0.5, we obtain the range ζtot = 0.5 to 0.75 and the initial value E0,dia/E0,bcc = 0.239, as also seen in **Data Sheet 1**, **Supplementary Table 1** for ζini = 50%. The master curves are entered into **Figure 9** as solid curves. The related cut fraction ζ, which is 0 for the fully connected diamond structure, is shown on the top axis of these plots. All numerical results are entered as solid symbols with the same color-coding as explained for **Figure 9A** for the different degrees of randomization.
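The conversion between the cut fraction of a chosen reference structure, the total cut fraction, and the average coordination number can be sketched as follows. Equation (11) is not reproduced in this section, so the linear form below is an assumption, chosen because it reproduces the quoted range ζtot = 0.5 to 0.75 for zdia/zbcc = 0.5 and ζ = 0 to 0.5, and because the average coordination number is stated to decrease linearly with the cut fraction. The function names are hypothetical.

```python
Z_BCC = 8.0  # coordination number of the bcc reference structure


def total_cut_fraction(zeta: float, z0_over_zbcc: float) -> float:
    """Total cut fraction relative to bcc for a structure with z0/z_bcc.

    Assumed linear form: zeta_tot = 1 - (z0/z_bcc) * (1 - zeta).
    """
    return 1.0 - z0_over_zbcc * (1.0 - zeta)


def coordination_from_total_cut(zeta_tot: float, z_bcc: float = Z_BCC) -> float:
    """Average coordination number, assumed linear in the total cut fraction."""
    return z_bcc * (1.0 - zeta_tot)


def total_cut_from_coordination(z: float, z_bcc: float = Z_BCC) -> float:
    """Inverse relation: total cut fraction from an identified z."""
    return 1.0 - z / z_bcc


# Diamond reference (z_dia/z_bcc = 0.5): range of the total cut fraction.
zeta_tot_start = total_cut_fraction(0.0, 0.5)
zeta_tot_end = total_cut_fraction(0.5, 0.5)
z_start = coordination_from_total_cut(zeta_tot_start)
```

With these relations, an average coordination number identified by the ANN fixes the position on the ζtot axis of the master curves.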

The plots for the Young's modulus and yield strength, as shown in **Figures 9A,B**, show no dependence on the degree of randomization. This supports the hypothesis that the scaling of mechanical properties, as formulated in Equation (6) for Young's modulus, holds. The values E/E0,bcc and σy/σy0,bcc, as determined by the artificial neural networks ANN3E and ANN3sy, respectively, are added as cross symbols. Both ANNs are able to predict the displacement by about −0.02 relative to the master curve, thus resembling the position of the numerical data extremely well. Uncertainties in the determined total cut fraction appear as a scatter of the determined values along the displaced master curve. Only in the upper left corner of the plot are the values of E/E0,bcc and σy/σy0,bcc displaced. It can be assumed that these specific data points are treated rather as outliers by the ANN during training, because fully connected diamond and Gibson–Ashby structures appear outside the overall trend in **Figure 2B**.

From **Figure 9B**, it can be seen that it is possible to predict the yield strength for RVEs that cannot be numerically solved due to convergence problems. This happens more often for increasing randomization and cut fraction, made visible by the frequency of green and red solid symbols with zero values. This nicely shows that the presented approach allows the prediction of macroscopic mechanical properties for structures that cannot be solved with computer simulations.

#### Poisson's Ratio

**Figure 10A** presents the results for the Poisson's ratio, which show different behavior for the three randomizations. While the data for A = 0.1 (black solid symbols) follow the master curve for all cut fractions, the data for A = 0.2 (green solid symbols) show a minimum value at ζtot = 65%, while values increase for higher cut fractions. This phenomenon, already observed for the perfect crystals (see **Figure 4**), is expected. However, for the RVE with maximum randomization of A = 0.3 (red solid symbols), the minimum value moves up to the starting point at zero cut fraction and all data show a very large scatter.

This deviation from the master curve motivates additional simulations for the same randomizations but without nodal correction. The results entered in **Figure 10A** as open symbols (beam model, BM) do not show such strong deviations. Up to A = 0.2, all data fall nicely on the master curve. However, for A = 0.3, a similar behavior can be observed as for the nodal-corrected RVE, with larger deviations for increasing cut fraction.

This seemingly odd behavior can be understood if it is considered that randomization has a strong effect on the elastic Poisson's ratio (Huber et al., 2014; Roschning and Huber, 2016; Jiao and Huber, 2017a), which rapidly decreases with increasing randomization. In addition, plotting the absolute values in **Figure 10B** shows that the nodal correction, combined with randomization, decreases the Poisson's ratio even further. This effect is not mentioned by Jiao and Huber (2017b), because in their study, nodal correction is only discussed in relation to the macroscopic stress-plastic strain response of the RVE.

Poisson's ratio is a critical parameter for the calibration of the randomization. Data from different sources display a range for NPG from ν = 0.165 to 0.2 (Roschning and Huber, 2016). This range, highlighted in yellow in **Figure 10B**, can now be analyzed with respect to the cut fraction as an additional degree of freedom. This limits the choice of realistic combinations of cut fraction and randomization, for which the deviation of the numerical data from the master curve is negligible. The sensitivity of Poisson's ratio is much stronger with respect to the randomization in comparison to the cut fraction. If the randomization is around A = 0.2, the data with nodal correction are even insensitive to the cut fraction. To calibrate the model by the experimental data, we can determine possible combinations (A, ζ) by moving from zero to maximum cut fraction along the yellow-shaded area. This again underlines the necessity of determining the average coordination number through a structural analysis. With the known average coordination number, the position on the x-axis (total cut fraction) is defined and Poisson's ratio can be used for calibrating the randomization parameter A.

experimental data is taken from Roschning and Huber (2016).

#### Data From Macroscopic Compression Experiments

The determination of the effective solid fraction, which mechanically contributes to the ligament network of NPG, is the scope of the studies by Liu et al. (2016) and Liu and Jin (2017). The authors report a large range of samples with ligament sizes from 5 to 500 nm. The degree of connectivity was changed via the alloy composition prior to coarsening. Coarsening of sets of samples after dealloying for four different initial solid fractions gave a large set of samples, forming a valuable database. The measured macroscopic Young's modulus was used for determining the effective solid fraction. The major assumption is that only connected ligaments contribute to the mechanical stiffness, which is given by the Gibson–Ashby scaling relation Equation (1), rewritten as ϕeff = (E/E<sub>S</sub>)<sup>1/2</sup>. The difference between ϕ and ϕeff is attributed to dangling ligaments.

#### Determination of Cut Fraction

In this work, the mass of dangling ligaments corresponds to the fraction of cut ligaments according to

$$
\alpha = \varphi_{\text{eff}} / \varphi = (1 - \zeta). \tag{21}
$$

In Equation (21), α is the fraction of load-bearing ligaments, as introduced by Liu et al. (2016) and Liu and Jin (2017). Consequently, ζ represents the fraction of cut ligaments as introduced in this work. For samples with lower solid fraction ϕ ∼ 0.26, the macroscopic Young's modulus takes very low values. As no percolation threshold is considered by Liu et al. (2016) and Liu and Jin (2017), the calculated effective solid fraction reaches values close to 0 and the cut fraction tends to 1. From the results of Soyarslan et al. (2018), we know that the network loses its connectivity at a solid fraction ϕ<sup>P</sup>, which is why Equation (3) with ϕ<sup>P</sup> = 0.159 and m = 2.56 should be used instead of Equation (1).
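The evaluation that leads to cut fractions close to 1 can be reproduced with a minimal Python sketch. It combines the rewritten Gibson–Ashby relation ϕeff = (E/E<sub>S</sub>)<sup>1/2</sup> with Equation (21); the percolation-corrected Equation (3) is not reproduced in this section and is therefore not implemented. The function names and the numerical inputs are illustrative assumptions, not measured data.

```python
import math


def effective_solid_fraction(e_over_es: float) -> float:
    """phi_eff from the Gibson-Ashby relation E/E_S = phi_eff**2."""
    return math.sqrt(e_over_es)


def load_bearing_fraction(e_over_es: float, phi: float) -> float:
    """alpha = phi_eff / phi, cf. Equation (21)."""
    return effective_solid_fraction(e_over_es) / phi


def cut_fraction(e_over_es: float, phi: float) -> float:
    """zeta = 1 - alpha, cf. Equation (21)."""
    return 1.0 - load_bearing_fraction(e_over_es, phi)


# Illustrative inputs only: a moderately stiff and a very compliant sample.
zeta_moderate = cut_fraction(0.01, 0.25)
zeta_compliant = cut_fraction(1.0e-6, 0.26)
```

For a vanishing modulus, ϕeff goes to zero and ζ approaches 1, which is the behavior attributed above to the missing percolation threshold.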

Combining the data from the studies by Liu et al. (2016) and Liu and Jin (2017) with Equation (3), plotted as suggested by Soyarslan et al. (2018), leads to **Figure 11A**. In this plot, symbols and colors correspond to those in **Figure 4** of the study by Liu and Jin (2017). Soyarslan et al. (2018) selected the experimental data to include only as-prepared samples, ignoring samples where the ligament size was varied by annealing and using only the data from specimens with maximum connectivity (Jin et al., 2018). After including the data of the annealed specimens, a large scatter appears for each data set of constant solid fraction. This cannot be captured by Equation (3) as it uses the solid fraction ϕ as the sole parameter for the characterization of the structure. However, with the cut fraction as an additional parameter, we have the degree of freedom that is needed for analyzing the data of Liu et al. (2016) and Liu and Jin (2017), depending on the fraction of dangling ligaments.

From the statistical analysis of the skeletonized NPG structures presented by Mangipudi et al. (2016), most junctions show a three-fold coordination, while a few percent with higher coordination numbers can also be detected. Assuming a Gibson–Ashby structure would limit the maximum coordination to 3, whereas a diamond structure can be adjusted to a similar statistical distribution, including some four-fold coordinated junctions, by cutting off ligaments.

Based on the assumption that a fully connected NPG material is described with a diamond structure, we can now analyze the data for macroscopic Young's modulus taken from Liu et al. (2016) and Liu and Jin (2017) and interpret the decay of the modulus for a given solid fraction in terms of cut fraction ζ. The values of ζ are determined by calibrating the modulus data to fit onto the master curve, Equation (10) (see also **Figure 2A**). For the reference value, the average macroscopic Young's modulus of the data set with maximum solid fraction ϕ ∼ 0.46 is calibrated to match E0,dia/E0,bcc = 0.239. On the x-axis, ζtot = 0.5 corresponds to z = 4 for diamond. It should be noted that the starting point can be set to any non-integer value in general when more precise information on the topology of the fully connected structure is available.
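The calibration procedure described above can be sketched in Python as a two-step operation: the measured moduli are scaled so that the reference data set matches E0,dia/E0,bcc = 0.239, and the master curve is then inverted numerically to obtain ζtot and, from it, the cut fraction ζ with respect to diamond. The sketch rests on several assumptions: the master curve is passed in as a monotonically decreasing function, `placeholder_master_curve` is a made-up linear stand-in and not Equation (10), the relation between ζtot and ζ is taken to be linear, and all function names and input values are hypothetical.

```python
from typing import Callable, List

E0_DIA = 0.239       # E0,dia/E0,bcc quoted in the text
ZETA_TOT_DIA = 0.5   # total cut fraction of fully connected diamond (z = 4)
Z0_OVER_ZBCC = 0.5   # z_dia / z_bcc


def calibrate(moduli: List[float], reference_moduli: List[float]) -> List[float]:
    """Scale moduli so that the mean of the reference set equals E0_DIA."""
    scale = E0_DIA / (sum(reference_moduli) / len(reference_moduli))
    return [m * scale for m in moduli]


def invert_master_curve(target: float, master_curve: Callable[[float], float],
                        lo: float = ZETA_TOT_DIA, hi: float = 1.0,
                        tol: float = 1e-10) -> float:
    """Bisection for zeta_tot with master_curve(zeta_tot) = target.

    Requires a master curve that decreases monotonically on [lo, hi].
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if master_curve(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def cut_fraction_from_total(zeta_tot: float, z0_over_zbcc: float = Z0_OVER_ZBCC) -> float:
    """Cut fraction relative to the reference structure (assumed linear relation)."""
    return 1.0 - (1.0 - zeta_tot) / z0_over_zbcc


def placeholder_master_curve(zeta_tot: float) -> float:
    # Stand-in for illustration only; NOT Equation (10).
    # Linear decay from E0_DIA at zeta_tot = 0.5 to zero at zeta_tot = 0.822.
    return max(0.0, E0_DIA * (0.822 - zeta_tot) / (0.822 - 0.5))


# Made-up moduli in arbitrary units; the reference set defines the scale.
scaled = calibrate([2.0, 1.0], reference_moduli=[2.0, 2.0])
zeta_tot = [invert_master_curve(e, placeholder_master_curve) for e in scaled]
zeta = [cut_fraction_from_total(zt) for zt in zeta_tot]
```

With the actual master curve in place of the stand-in, the same inversion yields one cut fraction per measured modulus. Bisection is used here only because it needs no derivative of the master curve.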

Relative to the value of E0,dia/E0,bcc = 0.239, the experimental data yield cut fractions, which are shown in **Figure 11B** for each data set at constant solid fraction depending on the ligament size, approaching the limits ζ<sup>c</sup> = 0.822 and zmin ≈ 1.5 for the lowest modulus data. The determined cut fractions ζ qualitatively agree very well with the results for 1 − α presented by Liu and Jin (2017). However, we determine different fractions of load-bearing ligaments. While Liu et al. report that < 10% of the ligaments remain for bearing load for ϕ ∼ 0.25, we have ≳ 40% load-bearing ligaments (< 60% cut fraction with respect to diamond). This is due to the percolation threshold that represents an upper limit for the cut fraction. On average, we determine the following values for the average coordination number: z ∼ 2.5–3.0 (ϕ ∼ 0.33–0.35) and z ≲ 2 (ϕ ∼ 0.26).

#### Determination of Yield Strength

Liu et al. inserted the effective solid fractions in the Gibson–Ashby scaling law for the yield strength given in Equation (2) to determine the yield strength of the solid fraction in each sample from the macroscopic yield strength (Liu et al., 2016). The findings of our work have two implications in this context: (i) the effective solid fraction is higher due to the percolation threshold, and (ii) the scaling of macroscopic Young's modulus and yield strength due to cutting of ligaments both follow the same relationship given by Equation (10), instead of Equations (1) and (2) with two different exponents 2 and 1.5, respectively. Whether this has a significant impact on the determined yield strength of the solid fraction can be investigated using the scaling laws, as applied by Liu et al. (2016) and Liu and Jin (2017): σy/σyS = 0.3 ϕeff<sup>3/2</sup> and E/E<sub>S</sub> = ϕeff<sup>2</sup>. By solving E/E<sub>S</sub> with respect to ϕeff and inserting the result in σy/σyS, the dependence of the yield strength on the macroscopic Young's modulus is obtained in the form σy/σyS ∼ (E/E<sub>S</sub>)<sup>3/4</sup>.
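Written out, the elimination of the effective solid fraction from the two scaling laws quoted above reads as follows; the prefactor 0.3 is the one quoted for the yield strength relation, and no further assumptions enter:

$$\varphi_{\text{eff}} = \left(E/E_{S}\right)^{1/2} \quad\Rightarrow\quad \sigma_{y}/\sigma_{yS} = 0.3\,\varphi_{\text{eff}}^{3/2} = 0.3\left(E/E_{S}\right)^{3/4}.$$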

According to section Young's Modulus, we have in fact σy/σyS ∼ (E/E<sub>S</sub>) for a set of samples with constant solid fraction. This clearly shows that the yield strength would decrease faster with the exponent 1 instead of 0.75. A quantitative comparison is given in **Figure 12A**, where ϕ = 0.48 is assumed for the fully connected structure and ϕeff = αϕ with α = 0.74 (Liu and Jin, 2017). As expected, the data confirm that the results do not depend on the chosen structure (diamond or Gibson–Ashby). Also, the linear fits in the log-log diagram confirm the exponents, as derived in the previous paragraph.
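The difference between the two exponents can be made tangible with a small Python sketch that evaluates both relations and their slopes in a log-log representation. The anchoring is an assumption made for illustration: both relations are taken to coincide for the fully connected structure with ϕ = 0.48, where E/E<sub>S</sub> = ϕ<sup>2</sup>. The function names are hypothetical, and the sketch does not reproduce the data of **Figure 12A**.

```python
import math

PHI_FULL = 0.48  # solid fraction assumed in the text for the fully connected structure


def gibson_ashby_strength(e_over_es: float) -> float:
    """sigma_y/sigma_yS = 0.3 (E/E_S)**(3/4), the chain used by Liu et al. (2016)."""
    return 0.3 * e_over_es ** 0.75


def linear_strength(e_over_es: float, phi_full: float = PHI_FULL) -> float:
    """sigma_y/sigma_yS proportional to E/E_S.

    Assumption: anchored so that both relations agree at E/E_S = phi_full**2.
    """
    e_full = phi_full ** 2
    return gibson_ashby_strength(e_full) * e_over_es / e_full


def loglog_slope(f, x1: float, x2: float) -> float:
    """Slope of f between x1 and x2 in a log-log diagram."""
    return (math.log(f(x2)) - math.log(f(x1))) / (math.log(x2) - math.log(x1))


e_full = PHI_FULL ** 2
slope_ga = loglog_slope(gibson_ashby_strength, 0.01, 0.1)
slope_lin = loglog_slope(linear_strength, 0.01, 0.1)
# Ratio of the two relations after the modulus has dropped by a factor of 16.
ratio = gibson_ashby_strength(e_full / 16) / linear_strength(e_full / 16)
```

Under this anchoring, the ratio of the two relations grows as the modulus drops, scaling with the modulus reduction to the power of −1/4, so a sixteen-fold reduction in modulus gives a factor of 2.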

We can furthermore conclude from **Figure 12A** that both curves converge for large values of macroscopic Young's modulus and yield strength (i.e., low cut fractions) while for lower values (or for higher cut fractions), the yield strength is overpredicted by Liu et al. (2016). By translating the quantitative behavior of the two curves into the diagram presented in **Figure 12B**, we obtain a very interesting result. The blue and the black curves both correspond to the fits of the data as entered in **Figure 8** of the study by Liu et al. (2016), representing the yield strength as determined from NPG along with data collected by the authors from literature on FIB-machined Au pillars, respectively. It can be seen that an extrapolation of both curves for small characteristic sizes tends toward the theoretical shear strength. However, for larger characteristic sizes, the curves diverge.

The green curve results from translating the strength data of Liu et al. with the help of the data presented in **Figure 12A** to the correct scaling for increasing cut fraction, according to Equation (10). This lowers the strength values for larger characteristic sizes more than for small characteristic sizes and, within the experimental scatter, this closes the gap between the data from Au nanoligaments and FIB-machined Au pillars. The better agreement of the experimental results for larger ligament sizes strongly supports the theoretical findings that (i) the macroscopic properties can be modeled as a multiplicative decomposition of two terms, where one term depends only on the solid fraction and the second term depends on the cut fraction, and (ii) the master curves for Young's modulus and yield strength show the same dependence on the cut fraction. Despite this promising outcome, the experimental validation presented here is only indirect, whereas a direct validation would be much more desirable. To this end, artificial structures as generated in this work (for example see **Data Sheet 1**, **Supplementary Figure 3**) could be translated into specimens using additive manufacturing or 3D laser lithography technology. Elastic-plastic compression testing of specimens with varying initial structure and cut fraction would deliver direct proof for the findings presented in this work.

# CONCLUSIONS

This work addresses a number of fundamental questions regarding topological description and its incorporation in the structure-properties relationships of materials characterized by a highly porous three-dimensional structure. The findings are relevant for nanoporous metals and open-pore foams, morphologies of ordered block copolymers and polymer blends during spinodal decomposition, and architectured mechanical meta-materials consisting of struts or beams that undergo bending deformation.

Generalizing the highly efficient finite element beam models allows the generation of data for structures of different topologies, ranging from a highly coordinated bcc structure down to a Gibson–Ashby structure with a coordination number of three. What is common for all these structures is that they deform under bending of the ligaments. By random cutting of a fraction of ligaments in the RVE, selected structures were continuously modified with respect to their average coordination number from the value of the fully connected structure to the percolation-cluster transition. The macroscopic responses, such as Young's modulus, yield strength, and Poisson's ratio, were computed for each RVE. Together with the cut fraction, average coordination number, and statistical information about the local coordination within the structure, a database was created consisting of more than 100 different structures.

It is shown that the macroscopic Young's modulus, yield strength, and Poisson's ratio can be expressed in the form of a multiplicative decomposition, where the first term depends on the solid fraction and the second on the cut fraction. The dependence on the cut fraction can be represented by a master curve, covering a large range of structures from the highest coordination number of 8 of the bcc reference structure down to 1.5, which is the average coordination number close to the percolation-cluster transition. Any average coordination number in between can be reached by the random cutting of a corresponding fraction of ligaments, as the average coordination number decreases linearly with increasing cut fraction.

As a striking result, all data for macroscopic Young's modulus and yield strength are covered by a single master curve, irrespective of whether perfectly ordered structures or more realistic randomized structures with nodal masses are considered. This leads to the important conclusion that the loss of strength due to pinching-off of ligaments is proportional to the decline in Young's modulus. Experimental support for this unexpected finding came from re-analyzing the data of Liu et al. (2016) and Liu and Jin (2017). The analysis shows that the gap between the strength data of Au nanoligaments and from FIB-machined micropillars can be neatly closed by incorporating this scaling behavior.

For the elastic Poisson's ratio, a second master curve was constructed that follows an elliptic-type shape with a maximum value for the fully connected structure and very low values with increasing cut fraction. In this case, the data show a divergence from the master curve for high cut fractions, which is probably caused by the beginning formation of clusters that do not follow the deformation pattern of the more connected parts of the structure. Beyond what is known for fully connected structures (Huber et al., 2014; Roschning and Huber, 2016), it turns out that the randomization needs to be calibrated for the correct value of the cut fraction (or average coordination number), because both structural parameters jointly define the elastic Poisson's ratio.

Based on the fingerprints of the different topologies, a scaling of the genus density could be identified that again combines all data in a single master curve with the average coordination number being the independent variable. The characteristic length that fulfils this condition normalizes the genus to the number of junctions of the fully connected structure. It is shown that linear relationships exist between all topological parameters under investigation, which are the total cut fraction, the average coordination number, and the scaled genus density. For proper measurement of each of these parameters, knowledge about the fully connected structure is required. This important detail significantly complicates the experimental measurement in each case.

A solution to this central problem was found in machine learning. Feeding statistical information about the local coordination numbers of detectable junctions and, optionally, the estimated genus density into an artificial neural network allows the determination of the average coordination number without knowledge of the fully connected structure. It could be shown that incorporating an estimate for the scaled genus density improves the accuracy by a factor of 3.5. Having determined the average coordination number, this parameter serves as a common key for the calculation of the cut fraction, the scaled genus density, and the mechanical properties with reference to a chosen fully connected structure.

Analyzing the data from the study by Liu and Jin (2017), the cut fractions for NPG samples with different solid fractions and degrees of coarsening were determined, assuming a diamond structure as a fully connected reference. It turns out that the fraction of load-bearing ligaments, ≳ 40%, is much larger than the < 10% reported by Liu and Jin (2017), which is due to the incorporation of the percolation threshold. At the same time, the corresponding average coordination number falls slightly below 2, which corresponds to a scaled genus density of 0.

# AUTHOR CONTRIBUTIONS

NH conceptualized and designed the study. NH developed the Python scripts for object-oriented job creation and submission and organized the database. NH carried out the data-mining and machine-learning work. NH wrote and revised the manuscript.

# FUNDING

Support from Deutsche Forschungsgemeinschaft (SFB 986 M<sup>3</sup> project B4) is gratefully acknowledged.

# ACKNOWLEDGMENTS

Jingsi Jiao is acknowledged for carrying out preliminary studies on RVEs with random cutting of ligaments. The author thanks Jörg Weissmüller (SFB 986 M<sup>3</sup> project B2) for the intense discussions and valuable feedback throughout the development of this work. Particularly, section Relationship Between Scaled Genus Density and Average Coordination Number on the scaled genus density and its connection to the average coordination number would not exist without this highly inspiring interaction. Claudia Richert and Erica T. Lilleodden (SFB 986 M<sup>3</sup> project B4) are acknowledged for discussions and careful reading of the manuscript, which have helped to strengthen the work.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmats.2018.00069/full#supplementary-material

# REFERENCES

probe small-scale mechanical behavior. Mater. Res. Lett. 4, 27–36. doi: 10.1080/21663831.2015.1094679


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Huber. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.