# ROBUST MONITORING, DIAGNOSTIC METHODS AND TOOLS FOR ENGINEERED SYSTEMS

EDITED BY: Eleni N. Chatzi, Manolis N. Chatzis and Costas Papadimitriou

PUBLISHED IN: Frontiers in Built Environment

#### Frontiers eBook Copyright Statement

The copyright in the text of individual articles in this eBook is the property of their respective authors or their respective institutions or funders. The copyright in graphics and images within each article may be subject to copyright of other parties. In both cases this is subject to a license granted to Frontiers. The compilation of articles constituting this eBook is the property of Frontiers.

Each article within this eBook, and the eBook itself, are published under the most recent version of the Creative Commons CC-BY licence. The version current at the date of publication of this eBook is CC-BY 4.0. If the CC-BY licence is updated, the licence granted by Frontiers is automatically updated to the new version.

When exercising any right under the CC-BY licence, Frontiers must be attributed as the original publisher of the article or eBook, as applicable.

Authors have the responsibility of ensuring that any graphics or other materials which are the property of others may be included in the CC-BY licence, but this should be checked before relying on the CC-BY licence to reproduce those materials. Any copyright notices relating to those materials must be complied with.

Copyright and source acknowledgement notices may not be removed and must be displayed in any copy, derivative work or partial copy which includes the elements in question.

All copyright, and all rights therein, are protected by national and international copyright laws. The above represents a summary only. For further information please read Frontiers' Conditions for Website Use and Copyright Statement, and the applicable CC-BY licence.

ISSN 1664-8714 ISBN 978-2-88966-088-9 DOI 10.3389/978-2-88966-088-9


## ROBUST MONITORING, DIAGNOSTIC METHODS AND TOOLS FOR ENGINEERED SYSTEMS

Topic Editors: Eleni N. Chatzi, ETH Zürich, Switzerland; Manolis N. Chatzis, University of Oxford, United Kingdom; Costas Papadimitriou, University of Thessaly, Volos, Greece

Citation: Chatzi, E. N., Chatzis, M. N., Papadimitriou, C., eds. (2020). Robust Monitoring, Diagnostic Methods and Tools for Engineered Systems. Lausanne: Frontiers Media SA. doi: 10.3389/978-2-88966-088-9

# Table of Contents


Manolis N. Chatzis and Eleni N. Chatzi

*Gaussian Process Time-Series Models for Structures under Operational Variability*

David Avendaño-Valencia, Eleni N. Chatzi, Ki Young Koo and James M. W. Brownjohn

*Comparing Structural Identification Methodologies for Fatigue Life Prediction of a Highway Bridge*

Sai G. S. Pai, Alain Nussbaumer and Ian F. C. Smith


Gregory Patrick Gislason, Qipei Mei and Mustafa Gül

*Technology Leveraging for Infrastructure Asset Management: Challenges and Opportunities*

A. Emin Aktan, Ivan Bartoli and S. Gokhan Karaman


Ignace Mugabo, Andre R. Barbosa and Mariapaola Riggio


Kyriaki Gkoktsi, Agathoklis Giaralis, Roman P. Klis, Vasilis Dertimanis and Eleni N. Chatzi

*System Identification-Enhanced Visualization Tool for Infrastructure Monitoring and Maintenance*

Premjeet Singh and Ayan Sadhu

# Editorial: Robust Monitoring, Diagnostic Methods and Tools for Engineered Systems

#### Manolis N. Chatzis <sup>1</sup> \*, Eleni N. Chatzi <sup>2</sup> and Costas Papadimitriou<sup>3</sup>

<sup>1</sup> Department of Engineering Science, Mathematical, Physical and Life Sciences Division, University of Oxford, Oxford, United Kingdom, <sup>2</sup> Department of Civil Environmental and Geomatic Engineering, ETH Zürich, Zurich, Switzerland, <sup>3</sup> Department of Mechanical Engineering, University of Thessaly, Volos, Greece

Keywords: monitoring, damage identification, data-driven models, diagnostic tools, uncertainty quantification, structural health monitoring

**Editorial on the Research Topic**

### **Robust Monitoring, Diagnostic Methods and Tools for Engineered Systems**

Complex engineered systems manifest across all engineering fields. Such systems are further characterized by uncertainties linked to assumptions and limited information on material constitutive laws, description of loads, the influence of operational and environmental factors, energy dissipation mechanisms, motion constraints, or large displacements of system components. The propagation of these uncertainties adversely affects simulation accuracy and, consequently, the design, operation, and maintenance decisions for meeting desirable system performance and safety requirements.

#### Edited by:

Nizar Bel Hadj Ali, École Nationale D'Ingénieurs de Gabès, Tunisia

#### Reviewed by:

Eliz-Mari Lourens, Delft University of Technology, Netherlands

\*Correspondence: Manolis N. Chatzis manolis.chatzis@eng.ox.ac.uk

#### Specialty section:

This article was submitted to Structural Sensing, a section of the journal Frontiers in Built Environment

Received: 29 May 2020; Accepted: 08 July 2020; Published: 25 August 2020

#### Citation:

Chatzis MN, Chatzi EN and Papadimitriou C (2020) Editorial: Robust Monitoring, Diagnostic Methods and Tools for Engineered Systems. Front. Built Environ. 6:125. doi: 10.3389/fbuil.2020.00125

Structural Health Monitoring exploits measurements from operating or tested systems to develop robust diagnostic tools and procedures that improve the condition assessment of complex engineering systems under uncertainty. Researchers are pushing the boundaries of uncertainty quantification tools and of diagnostic and prognostic methods, either to improve the accuracy of predictions or to achieve robust results from sensor information that is less accurate but better tailored to functionality requirements. The works in this special issue address both of these directions.

In Mugabo et al. and Mugabo et al., an experimental campaign on a three-story timber building, the "Albina Yard," is performed using a set of accelerometers, and the dataset is made available to the scientific community. The authors further demonstrate how Operational Modal Analysis methods succeed in identifying the modal properties of this hybrid timber building under ambient excitation. Comparison of the findings with a finite element representation of the building led to the interesting conclusion that secondary elements, such as an exterior wall, and non-structural elements can have a significant effect on the modal properties, and therefore the dynamics, of such buildings. Such a fusion with a system model is often critical to the assessment. In practice, however, engineers need to resort to model assumptions and simplifications which, as discussed in Song et al., can result in bias. Song et al. account for this bias by identifying not only the structural parameters of the assumed model but also the stochastic properties of the modeling error through a hierarchical Bayesian framework. This allows the effects of the bias to be removed and more reliable estimates of the modal properties of the simplified model to be obtained. The method is demonstrated by identifying the properties of a shear-type building using data from a building with a rocking foundation.

The drive for energy-efficient sensors for continuous monitoring of field applications has brought forth challenges related to the acquired data. In Horner et al. the authors discuss the effect of missing observations on estimating the parameters of regression models and suggest a novel methodology for doing so efficiently. Experimental data from a two-bay steel frame and simulated data are used to illustrate that the method operates robustly despite a significant amount of missing data. In Gkoktsi et al. the challenges presented by measurements obtained at a lower sampling frequency, for example as a result of compression at the wireless nodes of field sensors, are addressed. The authors suggest two methods to cope with sub-Nyquist and non-uniformly sampled time histories and demonstrate the reconstruction of the original signals in the frequency and time domain. Experimental data from a monitored highway bridge and an on-shore wind turbine are used to demonstrate the ability of the methods to robustly reconstruct the signals for output-only identification.

A challenge in the monitoring of field structures often lies in the influence of environmental and operational conditions, which can undermine the commonly used assumption of time-invariance. In Avendaño-Valencia et al. the authors address this challenge for the effect of variable wind speeds on wind turbines. Using Gaussian processes, the coefficients of auto-regressive models representing the structure are updated for variable wind speeds. The authors demonstrate the capacity of the method in terms of estimating the fatigue life of a wind turbine. Similarly, in Gislason et al. the authors rely on autoregressive time-series models for identifying damage in buildings. This is achieved by coupling ARMAX models with a sensor clustering concept, for use with ambient vibration sensors such as accelerometers. Using simulated examples of multi-story buildings, the authors demonstrate that changes in the properties of such time-series models can be used to detect structural damage.

Strain measurements are revealed as a valuable tool for condition assessment and estimation of reserve capacity. In this context, Kliewer and Glisic employ a series of long-gauge Fiber Bragg Grating sensors for damage detection in beam-type structures by means of a so-called normalized curvature ratio. The method is demonstrated to robustly detect damage along beam-type structures, or changes in the support conditions, in analytical examples, in an experimental test, and on field data from a highway bridge. In separate work exploiting strains, Pai et al. investigate the fatigue life of the Venoge Bridge, relying on strain sensors that have been in place on the bridge since 1995. The authors investigate different methods for updating a finite element model of the bridge. A proposed modified Bayesian updating scheme, which explicitly includes model bias, and a model falsification framework (EDMF) are implemented and cross-assessed for updating the model parameters, which in turn allows the remaining fatigue life of the structure to be estimated. The authors further suggest that EDMF offers the additional advantage of compatibility with engineering practice.

A mechanical engineering application is the focus of Matthaiou et al., where the authors target condition monitoring for gas turbines. The method presented by the authors is data-driven, using a machine-learning approach based on novelty detection and focusing on training data that correspond mainly to healthy cases. The framework is demonstrated on experimental vibration data from engines operating on different types of fuel, proving the diagnostic capability of the method. Remaining in the context of diagnostics, a modification of the Unscented Kalman Filter for non-smooth systems, termed the Discontinuous UKF, is presented in Chatzis and Chatzi. Non-smooth systems arise from the mathematical representations of phenomena related to damage, such as sliding, impacts, or plasticity. The authors demonstrate in numerical examples how the Discontinuous modification allows the properties of such systems to be identified and damage to be detected in a robust and online manner. In Abdessalem et al. the authors present a novel combination of two Bayesian tools, Gaussian Processes (GPs) and the Approximate Bayesian Computation (ABC) algorithm, for kernel selection and parameter estimation in machine learning applications. The method is demonstrated on simulated and actual datasets.

The previously mentioned papers present a series of tools that deliver information on the condition of an asset and, in some cases, further allow its remaining lifespan to be estimated. A common issue is how such information can be utilized by a managing authority in the decision-making process. This is discussed in Aktan et al., where the authors present an overview of how sensor information can be exploited by managing authorities and a roadmap for facilitating such a transition in asset management through appropriate training. Providing a further link to construction practice, Singh and Sadhu deliver a dynamic, web-based Building Information Modeling (BIM) framework, which incorporates online visualization of data, real-time system identification, and decision-making. A steel bridge located in London, Ontario is utilized as a case study, where BIM and SHM are integrated in a unified fashion.

Despite the obvious hurdles posed by uncertainties in the monitoring and diagnostics of engineered systems, the works featured in this Special Issue clearly demonstrate that adoption of a data-driven attitude toward structural assessment is not only the way forward, but also mature enough to be put into practice.

### AUTHOR CONTRIBUTIONS

MC prepared the original draft with contributions from EC. EC and CP revised and MC submitted the final version. All authors contributed to the article and approved the submitted version.

### ACKNOWLEDGMENTS

The editors would like to thank all contributors to this special issue, as well as the reviewers and editorial team of Frontiers.

**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Chatzis, Chatzi and Papadimitriou. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# **Normalized Curvature Ratio for Damage Detection in Beam-Like Structures**

*Kaitlyn Kliewer\* and Branko Glisic*

*Department of Civil and Environmental Engineering, Princeton University, Princeton, NJ, United States*

Fiber Optic Sensors (FOS) offer numerous advantages for structural health monitoring. In addition to being durable, lightweight, and capable of multiplexing, they offer the ability to monitor strain in both static and dynamic mode. FOS also allow for instrumentation of large areas of a structure with long-gage sensors, which helps enable global monitoring of the structure. Drawing upon these benefits, the Normalized Curvature Ratio (NCR), a curvature-based damage detection method, has been developed. This method utilizes a series of long-gage Fiber Bragg Grating (FBG) strain sensors for damage detection of a structure through dynamic strain measurements and curvature analysis. The main assumption is that the ratios between cross-sectional curvature amplitudes under free vibration remain unchanged provided that the state of the structure is unchanged. The theoretical development of this method is presented along with an analytical study of a simply supported beam with two damage cases: a loss of flexural stiffness in the span and a change in rotational stiffness of the support. Validation of the method is then performed through two implementations: first, a small-scale laboratory test with a simply supported aluminum beam subjected to a change in the rotational stiffness of the support; second, an application to an existing in-service highway overpass with over 5 years of data collection of dynamic strain events. The advantages and limitations of the method are identified and discussed. This research shows encouraging results and the potential for the NCR to be used as a simple metric for damage detection.

#### *Edited by:*

*Eleni N. Chatzi, ETH Zurich, Switzerland*

#### *Reviewed by:*

*Vasilis K. Dertimanis, ETH Zurich, Switzerland; Harsh Nandan, SC Solutions, United States*

### *\*Correspondence:*

*Kaitlyn Kliewer kkliewer@princeton.edu*

#### *Specialty section:*

*This article was submitted to Structural Sensing, a section of the journal Frontiers in Built Environment*

*Received: 29 March 2017; Accepted: 02 August 2017; Published: 30 August 2017*

#### *Citation:*

*Kliewer K and Glisic B (2017) Normalized Curvature Ratio for Damage Detection in Beam-Like Structures. Front. Built Environ. 3:50. doi: 10.3389/fbuil.2017.00050*

**Keywords: structural health monitoring, curvature, dynamic strain, FBG sensor, damage sensitive feature**

## **1. INTRODUCTION**

American infrastructure recently received a D+ rating from the American Society of Civil Engineers in the 2017 Infrastructure Report (ASCE, 2017). In the US, there are over 600,000 bridges, and more than 25% of those bridges are structurally deficient or functionally obsolete, according to the US Federal Highway Administration (USDOT, 2015). The average age of these structurally deficient bridges is over 65 years, which well exceeds the average service life of 50 years for those structures (Davis et al., 2013). In an effort to monitor the state of bridges, the Federal Highway Administration currently mandates periodic inspection of all bridges, which is typically done through visual inspection (National Bridge Inspection Standards (NBIS), 1996; Phares et al., 2004). However, this has been found to be inefficient and unreliable, as it is prone to human error. Phares et al. examined the accuracy and reliability of these routine bridge inspections and found that 56% of average condition ratings are incorrect with a probability of 95% (Phares et al., 2004). Because civil infrastructure, such as bridges, roads, dams, and buildings, plays a crucial role in the socio-economic life and development of a country, there is a need for reliable methods to assess the condition of structures. Structural health monitoring (SHM) provides the ability to address this challenge and potentially improve the lifespan and reduce the cost of repairs of these structures. However, implementation of SHM has its own challenges relating to the selection of damage-sensitive features and to data analysis. Bridge managers and engineers are somewhat reluctant to apply SHM in cases where damage-sensitive features have little engineering meaning or where the data analysis is complex. In addition, in spite of technological advancements during the last decade, live load monitoring still represents a challenge, and thus correlating live loads with a damage-sensitive feature is in many cases impossible.

To address the above challenges, the objective of this research is to create a simple dynamic SHM method based on curvature change under free vibration, through the use of the normalized curvature ratio (NCR) as a damage-sensitive feature. The NCR is a parameter that was identified as simple to implement in SHM and independent of live loads. It uses the curvature values at discrete locations measured using strain sensors. The method is developed through analytical case studies, which demonstrate the potential of the NCR as a damage-sensitive feature. To assess its performance and limitations, the method has been applied to both a small-scale laboratory specimen and an in-service bridge, both instrumented with long-gage Fiber Bragg Grating (FBG) strain sensors. With sufficient sensors and sensitivity, the NCR method has the potential for Level II SHM, which includes both the determination that damage is present and the determination of the geometric location of the damage. However, due to the limited number of sensors available for the experimental tests presented in this paper, the analysis of the NCR method here is limited to Level I SHM.

The field of vibration-based structural health monitoring methods is currently a vast area of research, with contributions to the field beginning in the 1970s. Extensive literature reviews of vibration-based methods have been performed by Doebling et al. (1998), who review methods published prior to 1996; Sohn et al. (2003), who review vibration-based methods published between 1996 and 2001; Carden and Fanning (2004), who focus on papers published after 1996; and Fan and Qiao (2011), who review vibration-based methods for beam-type structures. A common approach for vibration-based monitoring methods is to detect structural changes through natural frequency (Doebling et al., 1996; Salawu, 1997) and mode shape-based analysis (Shi et al., 2000; Zonta et al., 2003, 2008). However, Nandan and Singh (2014) found that modal frequencies can be heavily affected by changes in the thermal environment and that these temperature influences can mask the changes in modal frequencies due to damage. In contrast, Pandey et al. (1991) found that curvature-based methods may be a more sensitive indicator of damage in beam-like structures. Since this finding, research has focused on curvature mode shape damage detection methods (Wahab and De Roeck, 1999; Quaranta et al., 2016; Yang et al., 2016) and modal strain energy methods (Shi et al., 1998). Accelerometers are commonly used sensors for these vibration-based methods, including curvature-based methods. However, strain sensors are better suited to curvature-based methods, as curvature can be determined directly from strain measurements, whereas methods using numerically calculated curvature were found to have unacceptably high errors (Chance et al., 1994; Wahab and De Roeck, 1999).

Due to ease of instrumentation and their low cost, accelerometers are very common for dynamic structural monitoring, with a wide range of applications from long-span bridges to wind turbines. However, traditional accelerometer technology has limitations: the sensors are difficult to multiplex, are limited to point measurements, are sensitive to electromagnetic interference, and have limited application in hostile environments (Antunes et al., 2012). In addition, determination of curvature from acceleration requires differentiation, which is prone to errors. When using strain sensors, because the curvature is linearly correlated with the strain, the curvature can be calculated directly from the strain measurements, which eliminates the need for numerical differentiation and reduces errors. There are many long-gage fiber optic strain sensors currently available, such as those developed by Pozzi et al. (2008); however, this research focuses on the use of Fiber Bragg Grating (FBG) strain sensors. FBG sensors overcome many of the disadvantages associated with traditional accelerometers: they offer long-gage sensor possibilities as well as static and dynamic monitoring abilities, are durable and lightweight, are immune to electromagnetic interference, and offer multiplexing capabilities (Glisic and Inaudi, 2007). This research focuses on long-gage sensors as opposed to point (short-gage) sensors, as they are not influenced by local inhomogeneity of the monitored material (e.g., concrete) and increase the chance of detecting damage due to their larger spatial coverage.
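The linear relation between strain and curvature mentioned above can be illustrated with a minimal sketch. It assumes plane sections remain plane (Bernoulli–Euler bending) and a hypothetical layout of two parallel sensors at different depths of the same cross-section. The function names, sign convention (sagging curvature positive, tension positive) and numerical values are illustrative assumptions, not taken from the paper.

```python
def curvature_from_parallel_strains(eps_top, eps_bottom, depth):
    """Curvature [1/m] from two parallel strain readings a distance `depth` [m] apart."""
    return (eps_bottom - eps_top) / depth


def curvature_from_single_strain(eps, distance_from_neutral_axis):
    """Curvature [1/m] from one strain reading at a known distance [m] from the neutral axis."""
    return eps / distance_from_neutral_axis


# Hypothetical readings: sagging curvature of 2e-4 1/m, sensors 0.5 m apart,
# neutral axis 0.2 m above the bottom sensor (all values are assumptions).
kappa_true = 2.0e-4
eps_bottom = kappa_true * 0.2      # tension below the neutral axis
eps_top = -kappa_true * 0.3        # compression above the neutral axis

kappa_pair = curvature_from_parallel_strains(eps_top, eps_bottom, 0.5)
kappa_single = curvature_from_single_strain(eps_bottom, 0.2)
print(kappa_pair, kappa_single)
```

The two-sensor form does not require the neutral axis position to be known, which is one possible reason to prefer it when the cross-section may change over time.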

The theoretical development of the NCR method is briefly presented in Section 2, followed by an application of the method in two different analytical studies described in Section 3. Sections 4 and 5 present the application of the method to a small-scale laboratory specimen using a simply supported beam and to an existing in-service structure, respectively. Last, the conclusions are presented in Section 6.

## **2. METHOD DEVELOPMENT**

This research focuses on the creation of a curvature-based method applicable to beam-like structures under free vibration that can be approximated as a Bernoulli–Euler beam. The curvature modes under free vibration are not expected to change unless the structure experiences unusual behavior. This creates the basis for the main assumption of the proposed method: the ratios between curvature amplitudes at different locations of the beam should remain constant under free vibration unless there is a change in the state of the structure. While the sensors can detect damage directly if it occurs at the location of a sensor, direct damage detection is not considered in this study, as it is less challenging and already addressed in the literature (e.g., see Hubbell and Glisic, 2013). A summary of the derivation of the equations critical for understanding the NCR method is presented in the following sections. Elementary theoretical equations for the vibrations of continuous structures are only briefly presented; a fuller explanation of the theory and more detailed derivations can be found in Leissa and Qatu (2011).

### **2.1. Dynamic Behavior of Bernoulli–Euler Beam**

The equation of motion for a plane Bernoulli–Euler beam under transverse free vibration with a small amplitude can be described by the following equation:

$$\frac{\partial^2}{\partial x^2}\left[EI(x)\frac{\partial^2 y}{\partial x^2}\right] + \rho A(x)\frac{\partial^2 y}{\partial t^2} = 0 \tag{1}$$

where *EI*(*x*) is the flexural rigidity, *ρ* is the density per unit volume, *A*(*x*) is the area of the cross-section, *x* is the coordinate in longitudinal direction of the beam, and *y* is the deflection of the center-line of the beam. In order to solve this equation, a solution in the following form is assumed:

$$y(x, t) = Y(x)\Phi(t) \tag{2}$$

which allows for the solution of the modal displacement of a beam. The solution for the displacement, assuming a uniform beam where *EI*(*x*) = *EI* = *constant* and *ρ*(*x*) = *ρ* = *constant*, is:

$$Y_n(x) = C_1 \sin(\alpha_n x) + C_2 \cos(\alpha_n x) + C_3 \sinh(\alpha_n x) + C_4 \cosh(\alpha_n x) \tag{3}$$

where *α<sub>n</sub>* is related to the *n*th eigenfrequency of the beam, *ω<sub>n</sub>*, and can be described by the following equation:

$$\omega_n^2 = \frac{EI\alpha_n^4}{\rho A} \tag{4}$$

*C*<sub>1</sub>, ..., *C*<sub>4</sub> are constants determined by the boundary and continuity conditions of the beam. Because the curvature of a beam at a point is equal to the second derivative of the deflection at that point, equation (3) can be differentiated twice with respect to *x* to obtain a generic equation for the curvature of the beam:

$$\kappa_n(x) = -C_1 \alpha_n^2 \sin(\alpha_n x) - C_2 \alpha_n^2 \cos(\alpha_n x) + C_3 \alpha_n^2 \sinh(\alpha_n x) + C_4 \alpha_n^2 \cosh(\alpha_n x). \tag{5}$$
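As a quick sanity check of equation (5), the following sketch compares it with a central finite-difference second derivative of the modal deflection in equation (3). The wavenumber, coefficients, evaluation point and step size are arbitrary illustrative values, not data from the paper.

```python
import math


def deflection(x, alpha, coeffs):
    """Modal deflection Y(x) of equation (3)."""
    c1, c2, c3, c4 = coeffs
    return (c1 * math.sin(alpha * x) + c2 * math.cos(alpha * x)
            + c3 * math.sinh(alpha * x) + c4 * math.cosh(alpha * x))


def curvature(x, alpha, coeffs):
    """Modal curvature kappa(x) of equation (5)."""
    c1, c2, c3, c4 = coeffs
    return alpha**2 * (-c1 * math.sin(alpha * x) - c2 * math.cos(alpha * x)
                       + c3 * math.sinh(alpha * x) + c4 * math.cosh(alpha * x))


# Arbitrary illustrative values (assumptions).
alpha = 1.3
coeffs = (0.7, -0.2, 0.05, 0.1)
x, h = 0.9, 1e-4

# Central finite-difference approximation of the second derivative of Y(x).
fd_value = (deflection(x + h, alpha, coeffs)
            - 2.0 * deflection(x, alpha, coeffs)
            + deflection(x - h, alpha, coeffs)) / h**2

print(f"equation (5):      {curvature(x, alpha, coeffs):.8f}")
print(f"finite difference: {fd_value:.8f}")
```

The small residual between the two values comes from truncation and round-off in the finite difference, which also hints at why numerically differentiated curvature is more error-prone than curvature obtained directly from strain.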

For simplicity of presentation, this paper focuses on the application to a simply supported beam. However, the method can easily be extended to any beam-like structure by following the same logic as for a simply supported beam. In many real-life applications, the free vibration of a structure is dominated by the first mode, or it may be possible to filter out the higher modes of vibration. For a simply supported beam, in order to determine the constants in equation (5), the following set of equations can be used to describe the boundary conditions:

$$y(0, t) = 0 \to Y(0) = 0 \tag{6}$$

$$\frac{\partial^2 y}{\partial x^2}(0, t) = 0 \to Y''(0) = 0 \tag{7}$$

$$y(L, t) = 0 \to Y(L) = 0 \tag{8}$$

$$\frac{\partial^2 y}{\partial x^2}(L, t) = 0 \to Y''(L) = 0 \tag{9}$$

By substituting equations (6)–(9) into equation (3), a solution for the coefficients *C*<sub>1</sub>–*C*<sub>4</sub> can be obtained, and thus the curvature distribution along the beam can be determined at any moment in time for an intact, non-damaged simply supported beam. Derivations for two typical damage scenarios, reduction of the cross-section and partial fixation of a support (see **Figure 1**), are presented in the following text. These two cases were studied in order to assess the theoretical sensitivity of the method.
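For the intact simply supported beam, the boundary conditions (6)–(9) lead to the standard result *C*<sub>2</sub> = *C*<sub>3</sub> = *C*<sub>4</sub> = 0 and *α<sub>n</sub>* = *nπ*/*L*. The sketch below uses this result to illustrate the main assumption of the method: the ratio of curvatures at two locations does not depend on the vibration amplitude or on time. The beam properties, sensor positions and the exponentially decaying cosine used for Φ(*t*) are illustrative assumptions, not values from the paper.

```python
import math

# Hypothetical beam properties (assumptions), SI units.
L = 10.0        # span [m]
EI = 2.0e7      # flexural rigidity [N m^2]
RHO_A = 500.0   # mass per unit length [kg/m]


def alpha(n):
    """Wavenumber of mode n for a simply supported beam."""
    return n * math.pi / L


def omega(n):
    """Natural frequency [rad/s] of mode n from equation (4)."""
    return alpha(n) ** 2 * math.sqrt(EI / RHO_A)


def modal_curvature(n, x):
    """Equation (5) with C1 = 1 and C2 = C3 = C4 = 0."""
    return -alpha(n) ** 2 * math.sin(alpha(n) * x)


def curvature(x, t, amplitude, n=1, decay=0.02):
    """Curvature kappa(x) * Phi(t) in free vibration; the decay model is an assumption."""
    w = omega(n)
    phi = amplitude * math.exp(-decay * w * t) * math.cos(w * t)
    return phi * modal_curvature(n, x)


x_a, x_b = 2.5, 5.0   # hypothetical sensor locations [m]
ratios = [curvature(x_a, t, a) / curvature(x_b, t, a)
          for a in (1e-3, 5e-3)
          for t in (0.0, 0.01, 0.05)]
expected = math.sin(math.pi * x_a / L) / math.sin(math.pi * x_b / L)
print(ratios, expected)
```

The time instants are chosen away from the zero crossings of Φ(*t*), where the ratio of two near-zero readings would be ill-conditioned; in practice this is one reason to work with curvature amplitudes rather than instantaneous values.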

#### 2.1.1. Equations of Motion: Beam Damaged Mid-Span

A beam with a loss of stiffness due to a reduction in the cross-section (e.g., due to a crack, corrosion, or loss of composite action between steel and concrete in steel–concrete composite structures) can be represented as illustrated in **Figure 1A**. The beam can be discretized into three segments, where the first and last segments have the full uniform stiffness, *EI*<sub>1</sub>, and the middle segment has the reduced stiffness, *EI*<sub>2</sub>. For this system, a series of equations based on equation (3) is needed to describe the motion of the beam. The equation for the displacement of the beam for each of the three segments is as follows:

$$Y(x) = \begin{cases} C_1 \sin(\alpha x) + C_2 \cos(\alpha x) + C_3 \sinh(\alpha x) + C_4 \cosh(\alpha x), & \text{for } x < L_1, \\ C_5 \sin(\alpha x) + C_6 \cos(\alpha x) + C_7 \sinh(\alpha x) + C_8 \cosh(\alpha x), & \text{for } L_1 \le x \le L_2, \\ C_9 \sin(\alpha x) + C_{10} \cos(\alpha x) + C_{11} \sinh(\alpha x) + C_{12} \cosh(\alpha x), & \text{for } x > L_2. \end{cases} \tag{10}$$

A solution to these equations can be determined using the continuity conditions in equations (11)–(18) together with the boundary conditions for the beam. For a simply supported beam, the boundary conditions at *x* = 0 and *x* = *L* are the same as those provided in equations (6)–(9). The continuity conditions at the junctions of the beam segments are as follows:

$$Y_1(L_1) = Y_2(L_1) \tag{11}$$

$$\frac{dY_1}{dx_1}(L_1) = \frac{dY_2}{dx_2}(L_1) \tag{12}$$

$$I_1 \frac{d^2 Y_1}{dx_1^2}(L_1) = I_2 \frac{d^2 Y_2}{dx_2^2}(L_1) \tag{13}$$

$$I_1 \frac{d^3 Y_1}{dx_1^3}(L_1) = I_2 \frac{d^3 Y_2}{dx_2^3}(L_1) \tag{14}$$

$$Y_2(L_2) = Y_3(L_2) \tag{15}$$

$$\frac{dY_2}{dx_2}(L_2) = \frac{dY_3}{dx_3}(L_2) \tag{16}$$

$$I_2 \frac{d^2 Y_2}{dx_2^2}(L_2) = I_1 \frac{d^2 Y_3}{dx_3^2}(L_2) \tag{17}$$

$$I_2 \frac{d^3 Y_2}{dx_2^3}(L_2) = I_1 \frac{d^3 Y_3}{dx_3^3}(L_2) \tag{18}$$

From the equations of motion, the resulting curvature of the beam as a function of the loss of stiffness (*I*<sub>2</sub>/*I*<sub>1</sub>) can be determined once the coefficients *C*<sub>1</sub>–*C*<sub>12</sub> are known.

#### 2.1.2. Equations of Motion: Beam Damaged at Support

Another typical form of damage or unusual structural behavior may occur when there is a change in the boundary conditions of the structure. Such a change can result from malfunction of the support mechanism due to various causes, such as corrosion, dislocation, and fatigue cracks. This damage may lead to a change in the rotational stiffness at the support of the beam. The rotational stiffness can be represented by a rotational spring with a stiffness *k*<sub>θ</sub> at the location of the support; an example of such a beam is illustrated in **Figure 1B**. Using this new boundary condition, a relationship between the curvature of the beam and the rotational stiffness of the support can be determined using the same boundary conditions as equations (6)–(8) in addition to the following boundary condition:

$$EI \frac{\partial^2 y}{\partial x^2}(L, t) = -k_\theta \theta(L) \to \frac{Y''(L)}{Y'(L)} = \frac{-k_\theta}{EI} \tag{19}$$

Using equations (6), (7), (8), and (19) and substituting them into equation (3), a solution for the coefficients *C*<sub>1</sub>–*C*<sub>4</sub> and *α* can be obtained, where the relationship between the rotational stiffness, *k*<sub>θ</sub>, and *α* is described by the following equation:

$$0 = -2\sin(\alpha L) + \frac{k\_{\theta}}{\alpha EI} \left[ \cos(\alpha L) - \frac{\sin(\alpha L)\cosh(\alpha L)}{\sinh(\alpha L)} \right]. \tag{20}$$

Using these equations, the curvature mode shape can be determined for a beam with a pin support at one end and a pin support with a rotational stiffness of *k*<sub>θ</sub> at the other.
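As a check on equation (20), the derivation can be sketched as follows. The sketch assumes that equations (6)–(8), which are not restated in this section, impose zero displacement at both supports and zero curvature at the pinned end *x* = 0, and that the mode shape has the four-term form used in each segment of equation (10).

$$
\begin{aligned}
Y(0) = 0,\; Y''(0) = 0 &\;\Rightarrow\; C_2 = C_4 = 0, \\
Y(L) = 0 &\;\Rightarrow\; C_3 = -C_1 \frac{\sin(\alpha L)}{\sinh(\alpha L)}, \\
Y''(L) &= -2\alpha^2 C_1 \sin(\alpha L), \\
Y'(L) &= \alpha C_1 \left[ \cos(\alpha L) - \frac{\sin(\alpha L)\cosh(\alpha L)}{\sinh(\alpha L)} \right].
\end{aligned}
$$

Substituting these two expressions into $EI\,Y''(L) = -k_\theta Y'(L)$ and dividing by $\alpha^2 EI\, C_1$ recovers equation (20). Two limits serve as a sanity check: for $k_\theta \to 0$ the equation reduces to $\sin(\alpha L) = 0$ (pinned–pinned, $\alpha L = \pi$ in the first mode), and for $k_\theta \to \infty$ the bracketed term must vanish, which gives $\tan(\alpha L) = \tanh(\alpha L)$ (pinned–fixed).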

### **2.2. Normalized Curvature Ratio (NCR) Method**

Based on equation (5), for any beam under free vibration in a single mode (in most real structures, free vibration occurs primarily in the first mode), the ratio of the curvature of the beam at one location to the curvature at another location should remain constant. This is true regardless of the boundary conditions of the system and is independent of the amplitude of the motion. We define this ratio between the curvatures at two cross-sections as the normalized curvature ratio (NCR) by the following equation:

$$\text{NCR}\_{i,j} = \kappa\_i / \kappa\_j \tag{21}$$

where *κ*<sub>i</sub> is the curvature of the beam at sensor location *i* and *κ*<sub>j</sub> is the curvature of the beam at sensor location *j*. Experimentally, the peak curvature values for each sensor are used, such as those shown in **Figure 2**. In order to obtain NCR<sub>i,j</sub>, the peak curvatures at sensor *i* can be plotted against the peak curvatures at sensor *j*. A linear regression can be fit to this relationship, and the slope of this regression provides the NCR for these two sensor locations. If there is a change occurring in the structure (e.g., in the boundary conditions of the structure or a reduction of the cross-section along the span of the beam), this change is expected to be reflected as a change in the NCR. Evaluation of the NCR is particularly well suited for a structure instrumented with a series of parallel strain sensors installed at discrete locations along the length of the beam. Parallel strain sensors can be placed at the desired locations to calculate the NCRs and, because strain is linearly related to the curvature in a beam, the curvature at the desired locations can be obtained while introducing minimal uncertainty into the results.
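The regression step described above can be sketched as follows. This is a minimal illustration, not the authors' processing code: the function name `ncr_from_peaks`, the use of an ordinary least-squares line with an intercept, and the synthetic peak values are all assumptions made for the example.

```python
import math

def ncr_from_peaks(kappa_i, kappa_j):
    """Slope and standard error of the least-squares line kappa_i = a + NCR * kappa_j."""
    n = len(kappa_i)
    if n != len(kappa_j) or n < 3:
        raise ValueError("need two equal-length series with at least 3 peaks")
    mean_i = sum(kappa_i) / n
    mean_j = sum(kappa_j) / n
    sxx = sum((x - mean_j) ** 2 for x in kappa_j)
    sxy = sum((x - mean_j) * (y - mean_i) for x, y in zip(kappa_j, kappa_i))
    slope = sxy / sxx
    intercept = mean_i - slope * mean_j
    # Residual sum of squares gives the uncertainty of the fitted slope
    sse = sum((y - intercept - slope * x) ** 2 for x, y in zip(kappa_j, kappa_i))
    std_err = math.sqrt(sse / (n - 2) / sxx)
    return slope, std_err

# Hypothetical decaying peak curvatures (1/m) at two sensors during free vibration
peaks_j = [2.0e-3 * 0.9 ** k for k in range(10)]
peaks_i = [0.62 * p for p in peaks_j]   # noise-free data with a ratio of 0.62
ncr, ncr_se = ncr_from_peaks(peaks_i, peaks_j)
print(f"NCR_ij = {ncr:.3f} +/- {ncr_se:.1e}")
```

With measured data the peaks scatter around the line, and the standard error of the slope is one possible measure of the uncertainty attached to each NCR.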

An overview of this SHM method based on NCR is schematically presented in **Figure 3**. The initial stage of the method involves the development of a model for the structural system. Either an analytical or a numerical model can be used for the NCR method; in this paper an analytical model is used. If there is a preexisting sensor network installed on the structure, it may be possible to utilize the existing network as opposed to installing a new system. The effectiveness of the existing network may be evaluated using the structural model that was developed. If there is no sensor network in place, the model of the structural system can be used to perform an analysis to determine the optimal sensor placement for the structure. Once the structural response is obtained, the free vibration response time series must be extracted. An example is provided in Section 5.2 when analyzing the results from the highway overpass. If reference data for the structure are available, the damage-sensitive parameter, the NCR, can be compared to the parameters from the reference point. This comparison allows for an evaluation of the change in structural performance over time. However, a reference point is not always available for a structure. In these cases, the NCR can be compared with the theoretical values determined using the structural model. If there is a statistically significant difference between the measured value and the reference value, it may be an indication of unusual structural behavior. Since the measurements are collected over very short periods, temperature compensation is not necessary, and thermal strain can be neglected in the calculations (Sigurdardottir and Glisic, 2013). Thus, a benefit of the NCR method is that it does not require any correction of the data for temperature changes.

### **3. ANALYTICAL STUDY**

In order to assess the sensitivity of the method presented above, an analytical study was performed for the two typical damage types presented in the previous section. Both analyses use a simply supported aluminum beam with a length of 2 m, a height of 1 cm, and a width of 25 cm. Additionally, the results presented focus on the strain and curvature values calculated at 4 locations evenly spaced along the length of the beam (0.4, 0.8, 1.2, and 1.6 m). This was done to simulate the beam being instrumented with a series of 4 strain sensors, similar to the method used in the laboratory tests (see Section 4). An overview of the application of the method is presented along with the results from the study.

### **3.1. Beam with Reduced Stiffness**

An analysis of the NCR for a beam with a reduced flexural stiffness was performed by modeling an increasing loss of stiffness of the cross-section at different locations on the beam. The height of the cross-section was varied at two damage locations, 0.5 and 1 m from the left support. The curvature of this beam was determined using equations (10)–(18).

The effect of the reduced stiffness on the curvature of the beam can be seen in **Figures 4A,B**, which show the curvature mode shape for the two damage locations. The NCRs were calculated for each case and are shown in **Figures 4C,D**. For the case with the damage located at 0.5 m, a difference in the NCR values is observed as the cross-section is reduced. However, these changes in the curvature ratios are not significant, except in the case of very severe damage. It is important to note that when the damage is located in the middle of the beam, there is a minimal impact on the curvature ratios of the structure, as the vibration of the beam remains symmetric. This indicates that a given configuration of sensors is not equally sensitive to damage occurring at different locations. In general, problems with sensitivity to damage at specific locations (e.g., in the middle of the beam) may be resolved by instrumenting the structure with sensors at these identified locations (i.e., where the NCR method is not sensitive to damage), which will allow for direct detection of damage; however, direct damage detection is outside the scope of this study. Hence, because not all sensor placements are optimal and some may have regions where they are insensitive to damage, it is important to perform an analysis on the structure to determine the optimal sensor placement prior to installation.

### **3.2. Beam with Change in Support Conditions**

Using the same beam, a case study was performed by modeling a damaged support as a rotational spring at the right support of the beam. The rotational stiffness of the boundary condition was varied from a pinned support, idealized as a rotational stiffness of 0, to a fully fixed support, idealized as having infinite stiffness. Using equations (5) and (20), the curvature mode of the beam under the various support conditions could be determined. The results for the varying curvature of the beam are shown in **Figure 5**. This figure shows that as the rotational stiffness of the support increases, there is a global change in the curvature of the beam. Additionally, the inflection point of the curvature shifts to the left. Knowing the curvature of the beam under the changing support condition, the NCRs were determined based on the locations selected for the strain sensors. The results for the NCRs are shown in **Figure 6**.
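A minimal numerical sketch of this case study is given below. It solves equation (20) for the first-mode wavenumber by bisection and evaluates the curvature mode at the four sensor locations. Several ingredients are assumptions made for illustration: the Young's modulus of 69 GPa, the closed-form curvature mode (which follows from a pin at *x* = 0 and zero deflection at *x* = *L*), the choice of the second sensor as the reference for the ratios, and all function names.

```python
import math

L = 2.0                          # span (m), from the analytical study
E = 69e9                         # Young's modulus of aluminum (Pa), assumed value
I = 0.25 * 0.01 ** 3 / 12        # second moment of area, 25 cm x 1 cm section (m^4)
EI = E * I
SENSORS = [0.4, 0.8, 1.2, 1.6]   # sensor positions (m)

def char_eq(a, k_theta):
    """Right-hand side of equation (20) for wavenumber a (1/m)."""
    s, c = math.sin(a * L), math.cos(a * L)
    coth = math.cosh(a * L) / math.sinh(a * L)
    return -2.0 * s + k_theta / (a * EI) * (c - s * coth)

def first_alpha(k_theta, tol=1e-12):
    """Bisection for the first root; a*L lies between pi (pinned) and about 3.9266 (fixed)."""
    lo, hi = 3.0 / L, 3.93 / L
    f_lo = char_eq(lo, k_theta)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = char_eq(mid, k_theta)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

def curvature_mode(x, a):
    """Curvature mode shape (arbitrary scale) for a pin at x = 0 and zero deflection at x = L."""
    return math.sin(a * x) + math.sin(a * L) / math.sinh(a * L) * math.sinh(a * x)

def ncrs(k_theta):
    """Curvature at each sensor divided by the curvature at the second sensor."""
    a = first_alpha(k_theta)
    kappa = [curvature_mode(x, a) for x in SENSORS]
    return [k / kappa[1] for k in kappa]

for k_theta in [0.0, 1e2, 1e3, 1e4, 1e6]:   # rotational stiffness (N*m/rad)
    print(k_theta, [round(r, 3) for r in ncrs(k_theta)])
```

For a stiffness of zero the ratios are symmetric about mid-span, and as the stiffness grows the ratio for the sensor nearest the spring changes the most, which is consistent with the leftward shift of the inflection point described above.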

This application of the NCR method to a beam with a damaged support in an analytical study shows very good results, with a clear change in the NCR values that is dependent on the stiffness of the support. Therefore, this case is further tested in the laboratory with a small-scale experiment.

**FIGURE 4** | Results from two analytical beam models with loss of stiffness in the cross-section: the curvature **(A)** and corresponding NCRs **(C)** for analytical beam with damage located at the quarter-span and the curvature **(B)** and corresponding NCRs **(D)** for analytical beam with damage located at the mid-span; **(E)** Damage classification and corresponding reduction in stiffness (EI).

### **4. LABORATORY TESTS: SIMPLY SUPPORTED BEAM**

Basic laboratory tests were performed and agreed with the findings of the first analytical study, confirming the relatively low sensitivity of the NCR method to a reduction of stiffness in the span. The tests were performed on a cantilevered beam to amplify the magnitude of curvature, while damage was simulated by varying the stiffness at a predetermined location. Since these results simply confirmed the low sensitivity of the method, and given the figure limitations of this paper, the focus of this section is on a beam with a support of varying rotational stiffness, as the analytical study showed this application to be more promising. Additionally, preliminary experimental tests and analysis performed on the simply supported beam demonstrated the potential of this method (Kliewer and Glisic, 2015).

### **4.1. Experimental Setup**

Small-scale laboratory tests were performed using a simply supported aluminum beam with a span of 1.71 m and dimensions of 25.4 cm wide by 0.95 cm high, as shown in **Figure 7**. The aluminum beam was instrumented with a total of 5 fiber Bragg grating (FBG) strain sensors installed along the top surface of the beam. The FBG sensors are not placed symmetrically about the center line of the beam, as shown in **Figure 7**. The sensors have a gage length of 10 cm in order to simulate long-gage fiber optic sensors on a full-scale structure, and the sensors are spaced 10 cm apart. A series of dynamic tests were performed in which the aluminum beam was displaced at the midspan and released in order to induce free vibration. A change in the boundary condition of the beam was simulated by altering the right roller support. The stiffness of the roller was gradually increased by placing clamps at the location of the roller support. A total of 15 trials were run for each of the 4 support conditions: a normally behaving roller and 3 conditions with increasing rotational stiffness. A sampling rate of 250 Hz was used to record the strain data from the FBG sensors.

### **4.2. Results**

The typical strain response observed in the beam is shown in **Figure 8A**, where the initial sensor response is used as the reference period, followed by the loading of the beam and finally the free vibration. The NCR method uses only the strain response from the free vibration time period. Using the strain response from the sensors on the beam, the curvature can be determined, such as the example shown in **Figure 8B**. This is done for each of the sensor locations and for each time step. The FBG sensors are installed along the top surface of the aluminum beam and the height, *h*, of the beam is known. The strain along the bottom of the beam can be assumed to have the same magnitude and opposite sign to the strain along the top of the beam. The curvature is related to the strain at each location by the following equation, where *κ*, *r*, *ε*<sub>t</sub>, *ε*<sub>b</sub>, and *ε* are the curvature, the radius of curvature, the strain at the top of the beam, the strain at the bottom of the beam, and the strain measured by the sensor, respectively.

$$\kappa = \frac{\varepsilon_t - \varepsilon_b}{h} = \frac{2\varepsilon}{h} \tag{22}$$

**(B)** schematic of beam dimensions and sensor layout.

Once the curvature response is determined, the peak curvature values are then extracted and used for the remaining analysis of the results, as shown in **Figure 8C**. Additionally, using the dynamic response of the beam from the strain sensors, it is possible to determine the natural frequency of the beam at each sensor location.
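The conversion in equation (22) and the peak extraction can be sketched as follows. The strain record here is a synthetic decaying sinusoid rather than measured data, and the helper names and the simple local-maximum rule are assumptions made for the example.

```python
import math

def curvature_from_strain(strain_top, h):
    """Equation (22): kappa = 2 * eps / h, assuming equal and opposite strain at the bottom."""
    return [2.0 * eps / h for eps in strain_top]

def peak_values(series):
    """Local maxima of a sampled series (greater than the left neighbour, not less than the right)."""
    return [series[k] for k in range(1, len(series) - 1)
            if series[k - 1] < series[k] >= series[k + 1]]

# Hypothetical free-vibration record: 5 Hz decaying sinusoid sampled at 250 Hz for 2 s
fs = 250.0        # sampling rate (Hz), as in the laboratory tests
h = 0.0095        # beam height (m), as in the laboratory tests
t = [k / fs for k in range(500)]
strain = [100e-6 * math.exp(-0.5 * tk) * math.sin(2 * math.pi * 5.0 * tk) for tk in t]

kappa = curvature_from_strain(strain, h)   # curvature time series (1/m)
peaks = peak_values(kappa)                 # one value per vibration cycle
print(len(peaks), peaks[:3])
```

Noisy measurements would typically need filtering or a minimum peak spacing before this rule is applied; the bare local-maximum test is used here only to keep the sketch short.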

### **4.3. Experimental NCR**

Using the peak curvature values shown in **Figure 8C**, the NCR was calculated for the beam for each test performed. In order to obtain NCR<sub>i,j</sub>, the curvature at sensor *i* was plotted against the curvature at sensor *j* for each boundary condition of the beam. A linear regression was fit to this relationship, and the slope of this regression provides the NCR for these two sensor locations. The NCRs for each support condition were determined along with the associated uncertainty and are provided in **Figure 9**.

A Welch's *t*-test was used to assess the statistical difference between each altered support state and the normal roller. In all cases, the p-value is well below 0.001, which indicates that every damage state is statistically different from the normal state. In **Figure 9**, there is a clear progression of each of the NCR values as the rotational stiffness of the joint is increased. Similar to the observations made in the analytical case studies, some sensor locations are significantly more sensitive to changes in the rotational stiffness of the support than others. Again, this highlights the importance of planning the placement of the sensors.
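A minimal sketch of such a comparison is shown below, using only the Python standard library. The NCR samples are invented for illustration and are not the measured values; the function names are hypothetical, and the p-value is obtained by numerically integrating the Student-*t* density, which is one of several possible approaches.

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    va, vb = variance(sample_a) / na, variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    dof = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, dof

def two_sided_p(t, dof, steps=20000):
    """Two-sided p-value by Simpson integration of the Student-t density over [0, |t|]."""
    coef = math.gamma((dof + 1) / 2) / (math.sqrt(dof * math.pi) * math.gamma(dof / 2))
    pdf = lambda x: coef * (1 + x * x / dof) ** (-(dof + 1) / 2)
    hstep = abs(t) / steps                      # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(abs(t))
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * pdf(k * hstep)
    return max(0.0, 1.0 - 2.0 * total * hstep / 3.0)

# Hypothetical NCR values from repeated trials (not the measured data)
ncr_normal = [0.640, 0.642, 0.639, 0.641, 0.643, 0.640, 0.638, 0.641]
ncr_altered = [0.655, 0.657, 0.654, 0.658, 0.656, 0.655, 0.659, 0.656]
t_stat, dof = welch_t(ncr_normal, ncr_altered)
p_value = two_sided_p(t_stat, dof)
print(f"t = {t_stat:.2f}, dof = {dof:.1f}, p = {p_value:.2e}")
```

Welch's form of the test does not assume equal variances in the two groups, which suits a comparison between support states whose scatter may differ.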

### **4.4. Rotational Stiffness Analysis**

In the tests, it was not possible to directly measure the change in stiffness of the support due to limitations of the testing setup. However, the change of the stiffness could be determined from the measurements. Using the peak curvature values and the known boundary condition at the left support, where the support is a pin and the curvature equals zero, a line can be fit to these points using the general curvature mode provided in equation (5). When the curvature mode equation is fit to the undamaged support case, the inflection point is found to be located at the same location as the roller support on the beam as expected, as seen in **Figure 10A**. However, when the curvature mode equation is fit to one of the beams with an altered support state, there is a clear shift in the inflection point of the curvature mode shape to the left of the support, as shown in **Figure 10B**. This indicates there is a stiffening of the joint and the support now has some moment carrying capacity.

The theoretical relationship between the inflection point and the rotational stiffness of the support was determined for this beam using the methods presented in Section 2.1.2. For the experimental tests, four different support conditions were analyzed: the undamaged condition, case 1 with minor damage to the support, case 2 with moderate damage to the support, and case 3 with major damage to the support. For each of the support conditions analyzed, the experimental inflection point was determined through the curvature fitting process. These inflection points were then used to quantify the stiffness of the support based on the theoretical model of the beam, as shown in **Figure 11**. For the undamaged case, the inflection point is located at approximately the same location as the support and corresponds to minimal rotational stiffness, as anticipated. As the stiffness of the support was increased using clamps located at the beam support, there is an increasing shift of the location of the curvature inflection point, which corresponds to a higher theoretical rotational stiffness of the support. After successful laboratory testing, the method was applied to a real structure previously instrumented with long-gage FBG sensors, as shown in Section 5.
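The mapping from a fitted inflection point to a rotational stiffness can be sketched as follows. The sketch takes the inflection point to be the location where the first-mode curvature crosses zero, finds the wavenumber that places that crossing at the given location, and then solves equation (20) for *k*<sub>θ</sub>. The Young's modulus of 69 GPa, the inflection locations in the usage loop, and the function names are assumptions made for illustration.

```python
import math

BETA_FIXED = 3.92660231   # first root of tan(b) = tanh(b): pinned-fixed limit of a*L

def zero_curvature_residual(a, x_star, L):
    """Curvature mode sin(a x) + [sin(a L)/sinh(a L)] sinh(a x) at x_star, scaled by 1/sinh(a x_star)."""
    return math.sin(a * x_star) / math.sinh(a * x_star) + math.sin(a * L) / math.sinh(a * L)

def stiffness_from_inflection(x_star, L, EI, tol=1e-13):
    """Rotational stiffness (N*m/rad) whose first mode has zero curvature at x_star."""
    if x_star >= L:
        return 0.0    # zero curvature at the support itself: ideal pin
    lo, hi = math.pi / L, BETA_FIXED / L
    f_lo = zero_curvature_residual(lo, x_star, L)
    if f_lo * zero_curvature_residual(hi, x_star, L) > 0:
        raise ValueError("x_star lies outside the range spanned by pinned and fixed supports")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = zero_curvature_residual(mid, x_star, L)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    a = 0.5 * (lo + hi)
    s, c = math.sin(a * L), math.cos(a * L)
    coth = math.cosh(a * L) / math.sinh(a * L)
    return 2.0 * a * EI * s / (c - s * coth)   # equation (20) solved for k_theta

L_span = 1.71                                   # span of the laboratory beam (m)
EI = 69e9 * 0.254 * 0.0095 ** 3 / 12            # E = 69 GPa is an assumed value
for x_star in [1.71, 1.65, 1.55, 1.40]:         # hypothetical zero-curvature locations (m)
    print(x_star, round(stiffness_from_inflection(x_star, L_span, EI), 1))
```

The further the zero-curvature point moves away from the support, the larger the inferred stiffness, up to the pinned–fixed limit beyond which no finite stiffness reproduces the shift.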

### **5. REAL STRUCTURE: HIGHWAY OVERPASS**

### **5.1. Description of Structure and Monitoring System**

The NCR method was implemented on a real bridge instrumented with an SHM system in 2011 (within the frame of an earlier project). Preliminary results of the case study (Kliewer and Glisic, 2017) were upgraded with more refined data processing and improved uncertainty calculation, and are presented in the following text.

The bridge is located in the United States, and its design is representative of a highway overpass type that is very common there. Because of this, it provides the opportunity to test SHM methods on a typical structure.

The bridge contains multiple spans and consists of built-up steel girders of varying sizes and a concrete deck. The structure is skewed at the north end, providing a unique structural behavior as all girders differ in length. In this research, only the southbound span of the structure was instrumented. Two of the eight girders on the span, girder 2 and girder 5, were instrumented with sensors. On both girders, FBG strain sensors were installed in three locations: the midspan and the quarter spans, as shown in **Figure 12**. At each location, strain sensors were installed in parallel topology on the top flange and the bottom flange, for a total of 6 sensors on each girder. Additionally, a temperature sensor was installed with each strain sensor.

Since the installation of the monitoring system on the structure, periodic data collection sessions have been carried out several times a year for almost 6 years. During the data collection sessions, the structure remains in service and the measurements consist of the strain response of the structure caused by the traffic loading. A total of 28 measurement sessions occurred from June 2011 to January 2017. The strain response of the structure was recorded for approximately 1 h in each session with a sampling rate of 250 Hz. This paper focuses only on girder 5; however, similar methods were applied to analyze the response from girder 2. A typical strain response for girder 5 of the structure is shown in **Figure 13**. The strain response from the sensors was filtered using a fourth-order Butterworth low-pass filter to remove the higher-frequency noise. The figure shows several peak strain responses on the structure, which are the result of passing heavy vehicles, followed by periods of free vibration. It is these periods of free vibration that are used in the NCR analysis.

### **5.2. Results**

From the strain response from the FBG sensors, the curvature at the locations of the sensors can be calculated using the following equation:

$$
\kappa = \frac{\varepsilon\_t - \varepsilon\_b}{h}.\tag{23}
$$

where *κ* is the curvature, *ε*<sub>t</sub> is the strain measured at the top of the cross-section, *ε*<sub>b</sub> is the strain measured at the bottom of the cross-section, and *h* is the vertical distance between the sensors. Because the measurements are obtained under existing traffic loading on the overpass, the typical dynamic strain response recorded does not contain pure free vibration, owing to the high traffic volumes. However, time periods of approximately free vibration can be extracted from the full time history. As an example, the strain response shown in the red box in **Figure 13** corresponds to approximately free vibration (the high strain before the box is the passage of a heavy vehicle that excites the bridge). The NCRs were calculated for each monitoring session along with the associated uncertainty of the values, calculated from the uncertainty of the linear regression inherent to the NCR method. These NCRs are shown in **Figure 14**, spanning from June 2011 to December 2016. Since the sensors were installed on the existing structure with an unknown damage condition, the theoretical NCRs were determined assuming perfect conditions as a means of comparison. This was done by approximating the structure as simply supported and using the equations presented in Section 2.1. These theoretical values are also shown in **Figure 14**. Overall, the NCRs showed no significant change over time, which indicates that no significant change in the structural performance could be detected.

There is reasonably good agreement between the theoretical NCR and the *NCR*<sub>1,2</sub> calculated using sensors at locations 5.1 and 5.2. Similarly, there is reasonable agreement between the theoretical NCR and the *NCR*<sub>1,3</sub> (calculated using sensors at locations 5.1 and 5.3). The final *NCR*<sub>3,2</sub>, calculated using sensor locations 5.3 (the last quarter span) and 5.2 (the midspan), does not agree as strongly with the theoretical value, as there are sessions in which the theoretical value falls outside the uncertainty bounds of the results obtained from the FBG sensors. This may indicate the existence of unusual behavior around or at location 5.2 or 5.3, and it is consistent with the indication of unusual behavior in the structure noted by Sigurdardottir and Glisic (2013) when observing the behavior of the neutral axis of the structure. An analysis similar to the analysis presented in Section 4.3 showed that the behavior of girder 5 is not consistent with a malfunction of the supports. However, it was determined that potential delamination between the steel and concrete would reduce the stiffness (EI) of the cross-section by 61%. The percent loss in stiffness was determined using the cross-sectional properties of the composite section provided in the engineering design drawings. The stiffness of the section with full composite action was compared to that with a loss of composite action between the concrete deck and steel girder, i.e., to the simple sum of the stiffnesses of the two components. The 61% reduction in stiffness would correspond to severe to very severe non-symmetric damage, as per the analysis in Section 3.1, which might be theoretically detectable using the NCR (see **Figure 4**).
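The stiffness comparison described above can be written compactly as follows. The symbols are introduced here for illustration only and are not taken from the design drawings: $E_s I_s$ and $E_c I_c$ are the flexural stiffnesses of the steel girder and the concrete deck about their own centroidal axes, $A_s$ and $A_c$ are their areas, $n = E_s/E_c$ is the modular ratio, and $d$ is the distance between the two centroids. The expression for the composite section assumes uncracked concrete and a fully effective deck width.

$$
(EI)_{\text{comp}} = E_s I_s + E_c I_c + E_s \frac{A_s \,(A_c/n)}{A_s + A_c/n}\, d^2, \qquad
(EI)_{\text{non}} = E_s I_s + E_c I_c
$$

$$
\text{loss of stiffness} = 1 - \frac{(EI)_{\text{non}}}{(EI)_{\text{comp}}}
$$

The third term in $(EI)_{\text{comp}}$ is the parallel-axis contribution that exists only while the deck and girder act together, which is why losing composite action can remove a large share of the stiffness even though neither component is itself damaged.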

A study was also performed by Domaneschi et al. (2017), in which a damage detection method was implemented using dynamic curvature data obtained from the same source, the series of FBG strain sensors on the highway overpass explored in this paper. The results of that study reached similar conclusions regarding the condition of the highway overpass, which in part validates the NCR method. However, the method presented by Domaneschi et al. requires a finite element model (FEM), whereas the NCR method does not, which makes the latter easier to implement and a more efficient damage detection method.

### **6. CONCLUSION**

This paper presents a simple SHM method based on dynamic curvature analysis. The method uses the normalized curvature ratio (NCR) as a damage-sensitive feature. The method was initially presented through an analytical study of a simply supported beam with two types of damage: reduction of cross-sectional stiffness and malfunction of a support. This study illustrated the simplicity of the method and its potential for application in a real structure. However, the study also identified limitations. First, the sensitivity of the method depends on the layout of the strain sensors on the structure and their position relative to the damage. Additionally, the study indicates that the method is sensitive to malfunction of a support, while it features relatively low sensitivity in detecting a reduced flexural stiffness occurring in the span of the beam. Finally, the NCR method is based on the determination of curvatures and thus it is limited to beam-like structures subjected to bending. The method is not effective for purely axially loaded structures or structures in pure shear deformation. The analysis was then performed on a small-scale laboratory specimen subjected to a change in stiffness in a cross-section (not presented due to space limitations) and in the support. These results were consistent with the analytical study. They confirmed the low sensitivity to reduction of cross-sectional stiffness, and demonstrated the ability to use the NCRs as a damage-sensitive feature for detecting stiffening of the support, as well as the ability to quantify the rotational stiffness of the support based on the dynamic strain measurements. Finally, the method was applied to a real, in-service highway overpass that was instrumented with a series of FBG strain sensors and periodically monitored. From the dynamic strain measurements on the structure, the NCRs were successfully calculated and compared with the theoretical NCR values. The comparison indicated the existence of unusual behavior that is consistent with previous works based on analysis of the neutral axis. It also indicated that, although the method features low sensitivity to reduction of cross-sectional stiffness, it can successfully be applied to composite structures, as delamination between the steel and concrete significantly reduces the cross-sectional stiffness. An additional advantage of this method is the use of the free vibration response of a structure. This means the method is independent of the magnitude of the load applied, does not require temperature compensation, and allows the structure to remain unperturbed (in service, with no restriction to traffic) during the data acquisition.

### **AUTHOR CONTRIBUTIONS**

KK had substantial contributions to the creation of methods and analysis and interpretation of data; she drafted the paper. BG advised and supervised the work, revised the draft of the paper, and approved the final version to be submitted. Both authors are accountable for all aspects of the work.

### **ACKNOWLEDGMENTS**

The authors would like to thank the following individuals for their assistance with this research: Corrie Kavanaugh, Dorotea Sigurdardottir, Dennis Smith, and Joe Vocaturo. The project on the US202/NJ23 highway overpass in Wayne has been realized with the important support, great help and kind collaboration of several professionals and companies. The authors would like to thank SMARTEC SA, Switzerland; Drexel University, in particular Professor Emin Aktan, Professor Frank Moon (now at Rutgers University), and graduate student Jeff Weidner (now Assistant Professor at University of Texas, El Paso); New Jersey Department of Transportation (NJDOT), and in particular Nat Kasbekar and Eddy Germain; Long-Term Bridge Performance (LTBP) Program of Federal Highway Administration; PB Americas, Inc., Lawrenceville, NJ, in particular Mr. Michael S Morales, LTBP Site Coordinator; Rutgers University, in particular Professors Ali Maher and Nenad Gucunski; All IBS partners; and Kevin the lift operator. The authors would also like to thank Yao Yao who helped with the sensor installation.

### **FUNDING**

This material is based upon work supported by NSF GRFP Grant No. 1148900, NSF CMMI-1362723, and USDOT-RITA DTRT12-G-UTC16. Any opinions, findings, and conclusions or recommendations expressed herein are those of the authors and do not necessarily reflect the views of the funding agencies.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer, VD, and handling editor declared their shared affiliation, and the handling editor states that the process nevertheless met the standards of a fair and objective review.

*Copyright © 2017 Kliewer and Glisic. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# **Automatic Kernel Selection for Gaussian Processes Regression with Approximate Bayesian Computation and Sequential Monte Carlo**

#### *Anis Ben Abdessalem, Nikolaos Dervilis\*, David J. Wagg and Keith Worden*

*Dynamics Research Group, Department of Mechanical Engineering, University of Sheffield, Sheffield, United Kingdom*

The current work introduces a novel combination of two Bayesian tools, Gaussian Processes (GPs) and the Approximate Bayesian Computation (ABC) algorithm, for kernel selection and parameter estimation in machine learning applications. The combined methodology that this research article proposes and investigates offers the possibility to use different metrics and summary statistics of the kernels used for Bayesian regression. The presented work moves a step toward an online, robust, consistent, and automated mechanism to formulate optimal kernels (or even mean functions) and their hyperparameters simultaneously, offering confidence evaluation when these tools are used for mathematical or engineering problems such as structural health monitoring (SHM) and system identification (SI).

#### *Edited by:*

*Eleni N. Chatzi, ETH Zurich, Switzerland*

#### *Reviewed by:*

*Donghyeon Ryu, New Mexico Institute of Mining and Technology, United States Luis David Avendaño Valencia, ETH Zurich, Switzerland*

> *\*Correspondence: Nikolaos Dervilis n.dervilis@sheffield.ac.uk*

#### *Specialty section:*

*This article was submitted to Structural Sensing, a section of the journal Frontiers in Built Environment*

> *Received: 01 June 2017 Accepted: 08 August 2017 Published: 30 August 2017*

#### *Citation:*

*Abdessalem AB, Dervilis N, Wagg DJ and Worden K (2017) Automatic Kernel Selection for Gaussian Processes Regression with Approximate Bayesian Computation and Sequential Monte Carlo. Front. Built Environ. 3:52. doi: 10.3389/fbuil.2017.00052*

**Keywords: kernel selection, hyperparameter estimation, approximate Bayesian computation, sequential Monte Carlo, Gaussian processes**

### **1. INTRODUCTION AND MOTIVATION**

Regression analysis or classification using a Bayesian formulation, specifically Gaussian Processes (GPs) or relevance vector machines (RVMs), is becoming very popular and attractive because it incorporates uncertainty and bypasses unattractive features of methods like neural networks. Neural networks, for example, are a very powerful tool for regression, but achieving the right tuning can be difficult and demanding. The hard questions that have to be asked when multi-layer perceptrons (MLPs) are implemented are: which is the right architecture? How many nodes? What transfer functions? What momentum or learning rate? How many times should they be run for different initial conditions?

The use of Gaussian processes is a current research area of increasing interest, not only for regression but also for classification purposes (Dervilis et al., 2015). Gaussian processes (GPs) are a stochastic non-parametric Bayesian approach to regression and classification problems. These Gaussian processes are computationally very efficient, and non-linear learning is relatively easy. Gaussian process regression takes into account all possible functions that fit the training data vector and gives a predictive distribution around a single prediction for a given input vector. A mean prediction and confidence intervals on this prediction can be calculated from the predictive distribution. Due to its simplicity and desirable computational performance, the GP has been applied in numerous domains, particularly in structural health monitoring (Cross, 2012; Dervilis et al., 2016; Worden and Cross, 2018) and in civil and structural engineering, to construct surrogate models that can mimic the real behavior of large-scale complex systems/structures and then make predictions. In Su et al. (2017), a GP model has been coupled with Monte Carlo simulations to perform a reliability analysis of complex engineering structures. An application of GPs to control an existing building can be found in Ahn et al. (2015). In Wan et al. (2014), a surrogate model based on a GP has been established to deal with uncertainty quantification for modal frequencies. An interesting application of GPs to finite element model updating for civil structures is presented in Wan and Ren (2015).

The initial and basic step in applying Gaussian process regression is to obtain a mean and a covariance function. These functions are specified separately, and each consists of a functional form and a set of parameters called hyperparameters. Once the mean and covariance functions are specified, one can infer the model hyperparameters by maximizing the log-marginal likelihood (equivalently, minimizing its negative). The software used for the implementation of GP regression was provided by Rasmussen and Williams (2006).

However, as mentioned, a covariance or kernel function has to be defined, and this raises a new question: how does one choose the kernel function for a GP? One could also ask why the experts who write and provide GP code do not include a default mechanism for choosing the kernel, instead of leaving it as a free choice for the user.

The answer is that the choice of covariance function or kernel determines, in the authors' opinion, almost all the generalization properties of a GP. However, a GP is often used as a black box model, and the user might not be an expert, or might not have a deep understanding of the data, the physics, or the modeling challenge. For a user who is not in a position to choose a proper covariance function as an expert would, this work adds an important practical and sophisticated approach for choosing a sensible kernel.

The article starts with an introduction to GPs and to the approximate Bayesian computation algorithm based on Sequential Monte Carlo (ABC-SMC), together with the selection of the different hyperparameters required for its implementation. Then, starting in Section 4 (Simple Demonstration Example), the application of the ABC algorithm is illustrated and investigated through two examples using simulated and real data; these form the core of the article. Finally, the article closes with some conclusions about the strengths of the method and a discussion of future directions.

### **2. GAUSSIAN PROCESSES (GP)**

Rasmussen and Williams (2006) define a Gaussian process (GP) as "a collection of random variables, any finite number of which have a joint Gaussian distribution." In recent years, GPs have gained a lot of attention in the area of regression (or classification) analysis as they offer fast and simple computation properties (Dervilis, 2013). The core of the algorithm comes from Rasmussen and Williams (2006).

### **2.1. Algorithm Theory**

The initial step in applying Gaussian process regression is to define a prior mean *m*({*x*}) and covariance function *k*({*x*}, {*x′*}), as GPs are completely specified by them; {*x*} represents the input vector. For any real process *f*({*x*}) one can define:

$$m(\{x\}) = E[f(\{x\})] \tag{1}$$

$$k(\{\mathbf{x}\}, \{\mathbf{x'}\}) = E[(f(\{\mathbf{x}\}) - m(\{\mathbf{x}\}))(f(\{\mathbf{x'}\}) - m(\{\mathbf{x'}\}))] \tag{2}$$

where *E* represents the expectation. Often, for notational simplicity and for lack of prior knowledge about the overall trend of the data, the prior mean function is set to zero. The Gaussian process can then be defined as

$$f(\{\mathbf{x}\}) \sim GP(\mathbf{0}, k(\{\mathbf{x}\}, \{\mathbf{x'}\})).\tag{3}$$

Assuming a zero-mean function, the covariance function could be described as

$$\begin{split} \text{cov}(f(\{\mathbf{x}\}_p), f(\{\mathbf{x}\}_q)) &= k(\{\mathbf{x}\}_p, \{\mathbf{x}\}_q) \\ &= \sigma^2 \exp\left(-\frac{1}{2} \left| \{\mathbf{x}\}_p - \{\mathbf{x}\}_q \right|^2\right). \end{split} \tag{4}$$

This is the squared-exponential covariance function (although it is not the only option). An important advantage of the previous equation is that the covariance is written as a function of the inputs only. For *σ* = 1, the squared-exponential covariance takes nearly unit values between variables whose inputs are very close, and it decreases as the distance between the inputs increases.
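As a minimal sketch of equation (4), the covariance matrix for a handful of scalar inputs can be assembled as follows. The function name `se_kernel`, the optional `length` argument (set to 1 to match equation (4)), and the example inputs are illustrative assumptions, not part of the original implementation.

```python
import numpy as np

def se_kernel(xp, xq, sigma=1.0, length=1.0):
    """Squared-exponential covariance between two sets of scalar inputs.

    With length=1 this reduces to equation (4); the length scale is an
    illustrative extra argument (it appears later in equation (7)).
    """
    xp = np.asarray(xp, dtype=float).reshape(-1, 1)
    xq = np.asarray(xq, dtype=float).reshape(1, -1)
    sq_dist = (xp - xq) ** 2
    return sigma**2 * np.exp(-0.5 * sq_dist / length**2)

# Hypothetical inputs: two close points, one moderately far, one far away.
x = np.array([0.0, 0.1, 1.0, 5.0])
K = se_kernel(x, x)
print(np.round(K, 4))
```

The first row of `K` decays from one toward zero as the inputs move apart, and the matrix is symmetric, as expected from a function of the inputs only.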

Assuming now that one has a set of training outputs {*f*} and a set of test outputs {*f*}<sub>∗</sub>, one has the prior:

$$
\begin{bmatrix}
\{f\} \\
\{f\}_*
\end{bmatrix} \sim \mathcal{N}\left(\mathbf{0}, \begin{bmatrix}
K(X,X) & K(X,X_*) \\
K(X_*,X) & K(X_*,X_*)
\end{bmatrix}\right) \tag{5}
$$

where the capital letters represent matrices. A zero-mean prior has been used for simplicity, and *K*(*X*, *X*) is a matrix whose (*i*, *j*)th element is equal to *k*(*x<sub>i</sub>*, *x<sub>j</sub>*). For a single test point, *K*(*X*, *X*<sub>∗</sub>) is a column vector whose *i*th element is equal to *k*(*x<sub>i</sub>*, *x*<sub>∗</sub>), and *K*(*X*<sub>∗</sub>, *X*) is its transpose. The covariance matrix must be symmetrical about the main diagonal.

As the prior has been generated by the mean and covariance functions, in order to specify the posterior distribution over the functions, one needs to limit the prior distribution in such a way that it includes only these functions that agree with actual data points. An obvious way to do that is by generating functions from the prior and selecting only the ones that agree with the actual points. Of course, this is not a realistic way of doing it as it would consume a lot of computational power. In a probabilistic manner, the operation can be done easily via conditioning the joint prior on the observations and this will give (for more details see Bishop (1995), Nabney (2002), and Rasmussen and Williams (2006)):

$$\begin{aligned} \{f\}_* \,|\, [X_*], [X], \{f\} \sim \mathcal{N}\Big( & K([X_*], [X]) K([X], [X])^{-1} \{f\}, \\ & K([X_*], [X_*]) - K([X_*], [X]) K([X], [X])^{-1} K([X], [X_*]) \Big). \end{aligned} \tag{6}$$

Function values {*f*}<sub>∗</sub> can be generated by evaluating the mean and covariance matrix from equation (6) and sampling from the resulting joint posterior distribution.
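The conditioning step in equation (6) can be sketched in a few lines. This is an illustrative, noise-free example that assumes a unit squared-exponential kernel; the helper names (`se_kernel`, `gp_posterior`), the small `jitter` term added for numerical stability, and the toy data are assumptions made here and are not taken from the original software.

```python
import numpy as np

def se_kernel(a, b, sigma_f=1.0, length=1.0):
    """Squared-exponential covariance for scalar inputs."""
    a = np.asarray(a, dtype=float).reshape(-1, 1)
    b = np.asarray(b, dtype=float).reshape(1, -1)
    return sigma_f**2 * np.exp(-0.5 * (a - b) ** 2 / length**2)

def gp_posterior(X, f, X_star, jitter=1e-10):
    """Posterior mean and covariance of f_* given (X, f), as in equation (6)."""
    K = se_kernel(X, X) + jitter * np.eye(len(X))   # K([X], [X])
    K_s = se_kernel(X_star, X)                      # K([X_*], [X])
    K_ss = se_kernel(X_star, X_star)                # K([X_*], [X_*])
    L = np.linalg.cholesky(K)                       # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, f))  # K^{-1} f
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    cov = K_ss - v.T @ v                            # K_ss - K_s K^{-1} K_s^T
    return mean, cov

# Toy noise-free data (hypothetical).
X = np.array([0.0, 1.0, 2.0, 3.0])
f = np.sin(X)
X_star = np.array([1.0, 1.5])   # one training input, one new input
mean, cov = gp_posterior(X, f, X_star)
print(mean, np.diag(cov))
```

At the test input that coincides with a training input, the posterior mean reproduces the observed value and the posterior variance collapses to (numerically) zero, which is the "functions that agree with the data" behavior described above. The Cholesky factorization is used instead of an explicit inverse purely as a numerical design choice.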

The covariance functions used in this study are usually controlled by some hyperparameters in order to obtain a better control over the types of functions that are considered for the inference. One of the most commonly employed kernels for GPs is the squared-exponential covariance function, which can take the following form:

$$k_y(\mathbf{x}_p, \mathbf{x}_q) = \sigma_f^2 \exp\left(-\frac{1}{2l^2}(\mathbf{x}_p - \mathbf{x}_q)^2\right) + \sigma_n^2 \delta_{pq} \tag{7}$$

where *k<sub>y</sub>* is the covariance for the noisy target set *y* (i.e., *y* = *f*({*x*}) + *ε*, where {*x*} is the input vector and *ε* is the noise). The length scale *l* (which determines how far one needs to move in input space for the function values to become uncorrelated), the signal variance *σ<sub>f</sub>*<sup>2</sup>, and the noise variance *σ<sub>n</sub>*<sup>2</sup> are free parameters that can be varied. These free parameters are called *hyperparameters*.

The tool that is usually applied for choosing the optimal hyperparameters for GP regression is the maximization of the marginal likelihood *p*({*y*}|[*X*], {*θ*}) with respect to the hyperparameters {*θ*}:

$$\begin{aligned} \log p(\{y\}|[X], \{\theta\}) &= -\frac{1}{2} \{y\}^T [K_y]^{-1} \{y\} \\ &- \frac{1}{2} \log |[K_y]| - \frac{n}{2} \log 2\pi \end{aligned} \tag{8}$$

where [*K<sub>y</sub>*] = [*K<sub>f</sub>*] + *σ<sub>n</sub>*<sup>2</sup>*I* is the covariance matrix of the noisy target set {*y*} and [*K<sub>f</sub>*] is the noise-free covariance matrix. In order to optimize these hyperparameters by maximizing the log-marginal likelihood, the partial derivatives below are used within a gradient-based optimizer (here [*K*] denotes [*K<sub>y</sub>*]):

$$\begin{split} \frac{\partial}{\partial \theta_j} \log p(\{y\}|[X], \{\theta\}) &= \frac{1}{2} \{y\}^{T} [K]^{-1} \frac{\partial [K]}{\partial \theta_j} [K]^{-1} \{y\} \\ &- \frac{1}{2} \text{tr}\left( [K]^{-1} \frac{\partial [K]}{\partial \theta_j} \right) \\ &= \frac{1}{2} \text{tr}\left( (\alpha \alpha^{T} - [K]^{-1}) \frac{\partial [K]}{\partial \theta_j} \right) \end{split} \tag{9}$$

where {*α*} = [*K*]<sup>−1</sup>{*y*}. Of course, this solution is not a trivial procedure, and for specific details, readers are referred to Rasmussen and Williams (2006).
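To make equations (8) and (9) concrete, the following sketch evaluates the log-marginal likelihood for the kernel of equation (7) and its derivative with respect to the length scale only. The function names, the toy data, and the hyperparameter values are illustrative assumptions; this is not the Rasmussen and Williams (2006) code.

```python
import numpy as np

def se_kernel(a, b, sigma_f, length):
    """Noise-free squared-exponential covariance for scalar inputs."""
    d = a.reshape(-1, 1) - b.reshape(1, -1)
    return sigma_f**2 * np.exp(-0.5 * d**2 / length**2)

def lml_and_grad_length(X, y, sigma_f, length, sigma_n):
    """Log-marginal likelihood (8) and its derivative w.r.t. the length scale (9)."""
    n = y.size
    K_f = se_kernel(X, X, sigma_f, length)
    K_y = K_f + sigma_n**2 * np.eye(n)
    L = np.linalg.cholesky(K_y)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))      # K_y^{-1} y
    # log|K_y| = 2 * sum(log(diag(L))), so the second term of (8) is -sum(log diag L).
    lml = -0.5 * y @ alpha - np.sum(np.log(np.diag(L))) - 0.5 * n * np.log(2 * np.pi)
    # dK/dl for the squared-exponential term: K_f * r^2 / l^3.
    sq = (X.reshape(-1, 1) - X.reshape(1, -1)) ** 2
    dK = K_f * sq / length**3
    K_inv = np.linalg.inv(K_y)
    grad = 0.5 * np.trace((np.outer(alpha, alpha) - K_inv) @ dK)
    return lml, grad

# Toy data (hypothetical): noisy samples of a smooth function.
rng = np.random.default_rng(0)
X = np.linspace(0.0, 10.0, 30)
y = -2.0 * X + X * np.sin(X) + rng.normal(0.0, 1.0, size=X.size)
lml, grad = lml_and_grad_length(X, y, sigma_f=5.0, length=1.5, sigma_n=1.0)
print(lml, grad)
```

A gradient-based optimizer would step the hyperparameters in the direction of increasing `lml`; comparing `grad` against a finite-difference estimate is a simple way to check an implementation of equation (9).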

### **3. APPROXIMATE BAYESIAN COMPUTATION (ABC)**

As stated in the previous section, GPs require the selection of a kernel, which is of great interest for either SI or SHM because it may affect not only the mean prediction and its accuracy but also the confidence bounds of the prediction. This creates a model selection and comparison problem, especially when several competing models (kernels in our case, possibly extended to the mean function) are consistent with the selection criterion and could potentially explain the data reasonably well (this will be expanded later in the section Discussion).

In reality, selecting the most likely model or kernel among a family of competing models (big or small) may be quite challenging, especially with black box methods where deep understanding of the physics is not obvious.

Several methods have been proposed in the literature, ranging from Markov chain Monte Carlo (MCMC) variants to evolutionary algorithms like genetic algorithms or particle swarm. The reader can refer to the following references: Schwarz et al. (1978), Bishop (1995), Green (1995), Kullback (1997), Akaike (1998), Doucet et al. (2000, 2001), Au and Beck (2001), Nabney (2002), Lawrence (2003), Marjoram et al. (2003), Ching et al. (2006), Rasmussen and Williams (2006), Skilling (2006), Gretton et al. (2007), Beaumont et al. (2009), Toni et al. (2009), Toni and Stumpf (2010), Barnes et al. (2011), Worden et al. (2011), Neath and Cavanaugh (2012), Turner and Van Zandt (2012), Filippi et al. (2013), Hensman et al. (2013), Wilson and Adams (2013), Chiachio et al. (2014), Ben Abdessalem et al. (2016), and the references therein, where many varied examples illustrating the use of the Bayesian method are investigated. As GPs are an elegant Bayesian method, it is natural to adopt a Bayesian approach for kernel selection and hyperparameter estimation, as this also gives some uncertainty evaluation around the kernel parameters.

In this contribution, the approximate Bayesian computation (ABC) algorithm is used for the first time to deal with kernel selection and hyperparameter estimation. ABC offers a series of advantages over MCMC (or reversible jump MCMC (RJMCMC) in this context (Green, 1995)). ABC is as general as a Bayesian method can be: there is no need to evaluate any extra criterion to discriminate between competing kernels, and the inference can be carried out with any suitable metric of similarity between the observed and modeled data, bypassing issues associated with intractable likelihood functions and Gaussian assumptions, which are not always valid.

Another major advantage offered by the ABC algorithm is its independence of the dimensionality of the competing models, as ABC is able to jump between the different kernel hyperparameter spaces without any need for a specific mapping function to match dimensions; this is a critical advantage when dealing with large numbers of kernels with different dimensions. In practice, the ABC algorithm compares the competing models simultaneously and progressively eliminates the least likely models, to converge to the most appropriate ones. For a much deeper evaluation of ABC, the reader is referred to Toni et al. (2009) and Ben Abdessalem et al. (2016, 2017).

### **3.1. Quick Overview of ABC Algorithm**

For a deep and detailed analysis of the algorithm, the reader is redirected to Schwarz et al. (1978), Bishop (1995), Green (1995), Kullback (1997), Akaike (1998), Doucet et al. (2000, 2001), Au and Beck (2001), Nabney (2002), Lawrence (2003), Marjoram et al. (2003), Ching et al. (2006), Rasmussen and Williams (2006), Skilling (2006), Gretton et al. (2007), Beaumont et al. (2009), Toni et al. (2009), Toni and Stumpf (2010), Barnes et al. (2011), Worden et al. (2011), Neath and Cavanaugh (2012), Turner and Van Zandt (2012), Filippi et al. (2013), Hensman et al. (2013), Chiachio et al. (2014), and Ben Abdessalem et al. (2016) as the purpose of this work is not to repeat the great advantages and theory behind ABC-SMC, but for the readers' convenience, a brief introduction is given.

In the ABC algorithm, the objective is to obtain a "proper" and computationally efficient approximation to the posterior distribution:

$$
\pi(\xi|u^*, \mathcal{M}) \propto f(u^*|\xi, \mathcal{M})\pi(\xi|\mathcal{M})\tag{10}
$$

where *M* is the model (or kernel function) based on a set of parameters {*ξ*}, *π*(*ξ*|*M*) denotes the prior distribution over the parameter space, and *f*(*u*\*|*ξ*, *M*) is the likelihood of the observed data *u*\* for a given parameter set {*ξ*}.

To overcome the issue of intractable likelihood functions, the ABC algorithm relies on systematic comparisons between observed and simulated data. The main idea consists of comparing the simulated data, *u*, with the observed data, *u*\*, and accepting simulations if a suitable distance measure between them, ∆(*u*, *u*\*), is less than a threshold *ε* specified by the user (for more information see Toni and Stumpf (2010) and Ben Abdessalem et al. (2016, 2017)). The ABC algorithm, as a result, gives a sample from the approximate posterior of the form

$$\begin{split} \pi(\xi|u^{*},\mathcal{M}) \approx \pi_{\varepsilon}(\xi|u^{*},\mathcal{M}) \propto \int f(u|\xi,\mathcal{M})\, \mathbb{I}\left(\Delta(u,u^{*}) \leq \varepsilon\right) \\ \quad \times \pi(\xi|\mathcal{M})\, \text{d}u \end{split} \tag{11}$$

where 𝕀(*a*) is an indicator function returning unity if the condition *a* is satisfied and zero otherwise; when *ε* is small enough, *π<sub>ε</sub>*(*ξ*|*u*\*, *M*) is a good approximation to the true posterior distribution.
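The accept/reject idea behind equation (11) can be illustrated with a plain ABC rejection sampler on a deliberately simple problem: inferring the mean of a Gaussian with known unit variance. The uniform prior, the use of the sample mean as summary statistic, the tolerance value, and all variable names are assumptions chosen for illustration; this is not the kernel-selection setup used later in the article.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Observed" data u*: 50 draws from N(2, 1) (hypothetical example).
u_obs = rng.normal(2.0, 1.0, size=50)

n_draws = 20000   # number of prior draws
eps = 0.2         # user-defined tolerance threshold

# 1) Sample parameters xi from the prior, here Uniform(-5, 5).
xi = rng.uniform(-5.0, 5.0, size=n_draws)

# 2) Simulate data u for every xi (each row is one simulated data set).
u_sim = rng.normal(xi[:, None], 1.0, size=(n_draws, u_obs.size))

# 3) Distance Delta(u, u*): absolute difference of the sample means.
dist = np.abs(u_sim.mean(axis=1) - u_obs.mean())

# 4) Keep only the parameters whose simulation falls within the tolerance.
accepted = xi[dist <= eps]

print(f"accepted {accepted.size} of {n_draws} draws")
print(f"approximate posterior mean of xi: {accepted.mean():.3f}")
```

No likelihood is ever evaluated: only forward simulations and a distance are needed. Shrinking `eps` tightens the approximation to the posterior at the price of a lower acceptance rate, which is the inefficiency that the sequential (SMC) variant is designed to reduce.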

In this work, the ABC-SMC algorithm presented in Toni and Stumpf (2010) will be used to make Bayesian inference for kernel selection and parameter estimation. Generally speaking, the algorithm works as a particle filter (Schwarz et al., 1978; Bishop, 1995; Green, 1995; Kullback, 1997; Akaike, 1998; Doucet et al., 2000, 2001; Au and Beck, 2001; Nabney, 2002; Lawrence, 2003; Marjoram et al., 2003; Ching et al., 2006; Rasmussen and Williams, 2006; Skilling, 2006; Gretton et al., 2007; Beaumont et al., 2009; Chatzi and Smyth, 2009, 2013; Toni et al., 2009; Toni and Stumpf, 2010; Barnes et al., 2011; Worden et al., 2011; Neath and Cavanaugh, 2012; Turner and Van Zandt, 2012; Filippi et al., 2013; Hensman et al., 2013; Chiachio et al., 2014; Ben Abdessalem et al., 2016) and is based on the sequential importance sampling (SIS) algorithm, which is a Monte Carlo (MC) method that constitutes the basis for most sequential MC filters developed over the last decades (see Schwarz et al. (1978), Bishop (1995), Green (1995), Kullback (1997), Akaike (1998), Doucet et al. (2000, 2001), Au and Beck (2001), Nabney (2002), Lawrence (2003), Marjoram et al. (2003), Ching et al. (2006), Rasmussen and Williams (2006), Skilling (2006), Gretton et al. (2007), Beaumont et al. (2009), Toni et al. (2009), Toni and Stumpf (2010), Barnes et al. (2011), Worden et al. (2011), Neath and Cavanaugh (2012), Turner and Van Zandt (2012), Filippi et al. (2013), Hensman et al. (2013), Chiachio et al. (2014), and Ben Abdessalem et al. (2016)). The key idea of ABC-SMC is to approximate the posterior density function by a set of random samples with associated weights. The algorithm passes through a number of intermediate posterior distributions before converging to the approximate posterior distribution that satisfies a convergence criterion defined by the user. In a nutshell, starting from the first iteration, one can choose an arbitrarily large tolerance threshold *ε*<sub>1</sub> to avoid a low acceptance rate and computational inefficiency. One samples directly from the prior distributions *π*(*m*) and *π*({*ξ*}), evaluates the distance ∆(*u*\*, *u*), and then compares this distance to *ε*<sub>1</sub> in order to accept or reject the (*m*, {*ξ*}) selection. This process is repeated until *N* particles distributed over the competing models are accepted. One then assigns equal weights to the accepted particles for each model. For the next iterations (*t* > 1), the tolerance thresholds are set such that *ε*<sub>1</sub> > *ε*<sub>2</sub> > . . . > *ε<sub>t</sub>*. The choice of the final tolerance, denoted here by *ε<sub>t</sub>*, depends mainly on the goals of the practitioner.
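The population-by-population procedure described above can be sketched on a toy model-selection problem. The code below is a simplified illustration in the spirit of ABC-SMC, not the exact algorithm of Toni and Stumpf (2010): the two competing models (a Gaussian with unknown mean, and a Gaussian with unknown mean and standard deviation), the uniform priors, the uniform perturbation kernel, the fixed tolerance schedule, and the estimate of model probabilities as the fraction of accepted particles are all assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
u_obs = rng.normal(2.0, 1.0, size=50)          # observed data u* (hypothetical)

# Uniform prior bounds per model: model 0 has one parameter, model 1 has two.
PRIORS = [np.array([[-5.0, 5.0]]),
          np.array([[-5.0, 5.0], [0.1, 5.0]])]

def simulate(m, xi):
    scale = 1.0 if m == 0 else xi[1]
    return rng.normal(xi[0], scale, size=u_obs.size)

def distance(u):
    return abs(u.mean() - u_obs.mean()) + abs(u.std() - u_obs.std())

def in_prior(m, xi):
    return bool(np.all((xi >= PRIORS[m][:, 0]) & (xi <= PRIORS[m][:, 1])))

def abc_smc(eps_schedule, n_particles=200, step=0.3):
    prev, probs = None, None
    for eps in eps_schedule:
        parts, wts, n_acc = {0: [], 1: []}, {0: [], 1: []}, 0
        while n_acc < n_particles:
            m = int(rng.integers(2))             # equal prior model probabilities
            if prev is None:                     # first population: sample the prior
                xi, w = rng.uniform(PRIORS[m][:, 0], PRIORS[m][:, 1]), 1.0
            else:
                old, old_w = prev[m]
                if len(old) == 0:                # model already eliminated
                    continue
                j = rng.choice(len(old), p=old_w)
                xi = old[j] + rng.uniform(-step, step, size=old[j].shape)
                if not in_prior(m, xi):
                    continue
                # Uniform prior and uniform kernel: the importance weight is
                # proportional to 1 / sum of old weights that could reach xi.
                reach = np.all(np.abs(old - xi) <= step, axis=1)
                w = 1.0 / np.sum(old_w[reach])
            if distance(simulate(m, xi)) <= eps:
                parts[m].append(xi)
                wts[m].append(w)
                n_acc += 1
        prev = {}
        for m in (0, 1):                         # normalize weights within each model
            p = np.array(parts[m]).reshape(len(parts[m]), PRIORS[m].shape[0])
            w = np.array(wts[m])
            prev[m] = (p, w / w.sum() if w.size else w)
        probs = [len(parts[m]) / n_particles for m in (0, 1)]
        print(f"eps={eps:.2f}  P(M0)={probs[0]:.2f}  P(M1)={probs[1]:.2f}")
    return prev, probs

populations, model_probs = abc_smc([2.0, 1.0, 0.5])
```

Because a model is sampled first and its parameters are then drawn from that model's own previous population, no mapping between parameter spaces of different dimension is needed, which is the property emphasized in the text.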

### **4. SIMPLE DEMONSTRATION EXAMPLE**

In the next two sections, two illustrations of the ABC-SMC algorithm applied to kernel selection for GPs are presented. For the ABC-SMC implementation, one sets the prior probabilities of each model to be equal. A population of *N* = 1,000 particles is used here, and the marginal likelihood given by equation (8) is used as a metric to measure the level of agreement between the training and simulated data. Furthermore, the sequence of tolerances *ε*<sub>1</sub>, *ε*<sub>2</sub>, . . ., *ε<sub>t</sub>* is selected in an adaptive way instead of following a predefined sequence. For the first iteration (population in the ABC jargon), one chooses a high tolerance on the absolute log-marginal likelihood |log *p*(*y*|*X*, *θ*)| (set to 1,000 in the present examples). For the subsequent iterations, one selects *ε<sub>t</sub>* according to the distribution of {∆ = |log *p*(*y*|*X*, *θ<sub>i</sub>*)|; *i* = 1, . . ., *N*}. For the next iteration, *t* = 2, the tolerance *ε*<sub>2</sub> is set to the 30th percentile of the ∆ values obtained from the previous population. Finally, convergence is declared when the difference between two consecutive tolerance values is less than a threshold value defined by the user.
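The adaptive tolerance rule and the stopping criterion just described can be written compactly as below. The function names and the default stopping threshold `tol` are illustrative assumptions; the 30th percentile is the value quoted in the text.

```python
import numpy as np

def next_tolerance(distances, q=30.0):
    """Next tolerance: the q-th percentile of the distances of the accepted particles."""
    return float(np.percentile(distances, q))

def has_converged(eps_prev, eps_new, tol=1e-2):
    """Stop when two consecutive tolerances differ by less than a user threshold."""
    return abs(eps_prev - eps_new) < tol

# Hypothetical distances from a previous population (values 1, 2, ..., 100).
distances = np.arange(1, 101, dtype=float)
eps_1 = 1000.0                       # initial tolerance used in the examples
eps_2 = next_tolerance(distances)
print(eps_2, has_converged(eps_1, eps_2))
```

Tying the schedule to the distribution of distances removes the need to hand-tune a sequence of tolerances, at the cost of one extra user choice (the percentile).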

Once the required hyperparameters are defined for the ABC-SMC, one can go forward in order to determine the GP kernel which best follows the data.

The first example is a simulated numerical example given by the form:

$$y = f(x) + \varepsilon = -2x + x\sin(x) + \varepsilon,\quad \varepsilon \sim \mathcal{N}(0, 1). \tag{12}$$

This simple example, based on simulated training data with input *x* ranging from 0 to 10, is shown in **Figure 1** and is used for demonstration purposes. For this study, three of the most common kernels, the Squared-Exponential (SE) kernel, the Rational Quadratic (RQ) kernel, and the Matern (Ma) 5/2 kernel, were set to compete. It should be clear that the ABC algorithm is not restricted by the number of competing kernels or by the number of their hyperparameters. Furthermore, further increasing the number of kernel models would add nothing to the present demonstration of the application of ABC to GPs.
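Training data of the kind described by equation (12) can be generated as follows. The number of training points, the regular input grid, and the random seed are assumptions made for illustration, since the text only specifies the input range and the noise level.

```python
import numpy as np

rng = np.random.default_rng(42)   # seed chosen arbitrarily for reproducibility

n_train = 100                     # assumed number of training points
x = np.linspace(0.0, 10.0, n_train)

# Noise-free function f(x) = -2x + x sin(x) and noisy targets y, equation (12).
f_true = -2.0 * x + x * np.sin(x)
y = f_true + rng.normal(0.0, 1.0, size=n_train)

print(x[:3], y[:3])
```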

The kernel models are defined as

$$\begin{aligned} M_1: \ k_{\rm SE} &= \sigma^2 \exp\left(-\frac{r^2}{2\ell^2}\right) \\ M_2: \ k_{\rm Ma} &= \sigma^2 \left(1 + \frac{\sqrt{5}r}{\ell} + \frac{5r^2}{3\ell^2}\right) \exp\left(-\frac{\sqrt{5}r}{\ell}\right) \\ M_3: \ k_{\rm RQ} &= \sigma^2 \left(1 + \frac{r^2}{2\alpha\ell^2}\right)^{-\alpha} \\ \text{where} &\ r = ||\mathbf{x} - \mathbf{x}'||. \end{aligned} \tag{13}$$

The SE kernel (as stated in the definition of GPs previously) is the most common and default kernel for GPs or even RVMs. As a kernel, it has some nice properties. It is universal, and it can be integrated against most functions in a straightforward way. Note, however, that every function drawn from its prior is infinitely differentiable, which is a strong smoothness assumption. Furthermore, it has only two parameters: the length scale *ℓ*, which controls the length of the "wiggles" in the function (as a result, the GP generally cannot extrapolate more than about *ℓ* units away from the data), and the variance *σ*<sup>2</sup>, which determines the average distance of the function away from its mean and usually works just as a scale factor.

The RQ kernel can be seen as a sum of SE kernels with different length scales. As a result, GP priors with this kernel produce functions that vary smoothly across many different length scales. The parameter *α* controls the relative weighting of large-scale and small-scale variations. When *α* → ∞, the RQ kernel becomes identical to the SE kernel.
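The three kernels of equation (13) are short enough to write out directly, which also allows the limiting behavior of the RQ kernel to be checked numerically. The sketch below uses the standard squared-distance form of the SE kernel; the function names and the test distances are illustrative assumptions.

```python
import numpy as np

def k_se(r, sigma=1.0, ell=1.0):
    """Squared-exponential kernel as a function of the distance r."""
    return sigma**2 * np.exp(-r**2 / (2.0 * ell**2))

def k_matern52(r, sigma=1.0, ell=1.0):
    """Matern 5/2 kernel."""
    s = np.sqrt(5.0) * r / ell
    return sigma**2 * (1.0 + s + s**2 / 3.0) * np.exp(-s)

def k_rq(r, sigma=1.0, ell=1.0, alpha=1.0):
    """Rational quadratic kernel."""
    return sigma**2 * (1.0 + r**2 / (2.0 * alpha * ell**2)) ** (-alpha)

r = np.linspace(0.0, 3.0, 7)
for alpha in (1.0, 10.0, 1e6):
    gap = np.max(np.abs(k_rq(r, alpha=alpha) - k_se(r)))
    print(f"alpha={alpha:g}: max |k_RQ - k_SE| = {gap:.2e}")
```

The gap between the RQ and SE kernels shrinks as `alpha` grows, which is the numerical counterpart of the limit stated above.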

The Matern kernel is included here as well because it allows the smoothness to be controlled and covers a large variety of kernels, a flexibility that can prove very useful in applications. Most people who put together a GP regression or classification exercise make extensive use of the SE or RQ kernels. Both of these kernels have closed-form solutions (integration) and are a quick and easy choice that will probably work well when one assumes smooth functions and is interpolating.

**Figure 2** shows the model posterior probabilities over the different populations and the associated tolerance threshold when the ABC algorithm is running. One can easily observe that for high tolerance thresholds, there is no strong evidence that any one kernel model is more favorable, but between populations 9 and 11, the algorithm tends to favor the simplest, smoothest SE covariance. In a nutshell, the algorithm tries at first to move toward the simplest model, which is the SE one (something that is not so trivial in the next example). This means that the more complex models are simply penalized. At population 12, the ABC gives higher evidence to the SE covariance, which remains the simplest one, and it ends up by finding the true model at population 17 with strong evidence.

From population 12–17, the algorithm refines the model parameter estimates associated to the selected kernel. **Figure 3** shows the histograms of the model hyperparameters from the last population.

**Figures 4**–**6** show the training data and the model prediction with the 95% confidence interval for all different kernels. One observes a good agreement between the observed and predicted data. In the next real application example, one is able to follow a more interesting and complex behavior on how the ABC algorithm chooses the right kernel model by favoring the simplest model at the beginning but choosing the more complex one at the end.

To summarize, why does the algorithm choose the SE kernel over the RQ kernel, for example? First of all, one has to notice that both of them give very similar results in **Figures 4** and **5**. This is the beauty of the methodology followed via ABC-SMC: it recognizes that both are similar, so there is no need to choose RQ, as it is more complicated than SE. If the simple model is good enough, it is kept, as there is no need to add complexity either mathematically or computationally. Another noticeable point is that in **Figure 6**, where the Ma kernel is evaluated, there are many "wiggles" and no outliers, whereas with 95% confidence intervals one expects a percentage of outliers to be present, as happens in **Figures 4** and **5**.

### **5. REAL DATA APPLICATION**

All three kernels described earlier are very useful, but only if the data are all of the same type with a similar feature space. In real applications, though, if one wants to perform regression and construct a kernel over different feature/data types, one can multiply kernels together. This is the standard way to combine kernels. In simple probabilistic language, kernel multiplication can be considered as an "and" operation. In the same spirit, adding kernels can be considered as an "or" operation.

So the motivation here (as one could repeat the exact same exercise as before with different kernels) is that the model/data structure one needs is not described by any single known kernel, independently of how many different kernels one tries. For demonstration purposes, the following real-data toy example is used. One can construct kernel combinations with different properties in different ways, so as to include as much high-level structure as possible, and check at the same time which of the "modified" models is the best.

For the purposes of this example of composite covariance matrix, two competing models are considered

$$\begin{aligned} M_1 &: \ k_{\rm SE} + k_{\rm PER} \times k_{\rm SE} + k_{\rm RQ} + k_{\rm SE} \\ M_2 &: \ k_{\rm SE} + k_{\rm PER} \times k_{\rm Ma} + k_{\rm Ma} \end{aligned} \tag{14}$$

where PER is the periodic kernel, which allows one to model functions that repeat themselves exactly. The period *p* determines the distance between repetitions of the function, and the length scale is identical to SE kernel. The PER kernel is given by

$$k_{\rm PER} = \sigma^2 \exp\left(-\frac{2\sin^2\left(\pi r / p\right)}{\ell^2}\right). \tag{15}$$
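The composite models of equation (14) are built from sums and products of the basic kernels. The sketch below assumes the standard form of the periodic kernel, with the period dividing the distance inside the sine, and assumes that every term carries its own variance and length scale. The dictionary keys and all hyperparameter values are hypothetical. Under this parametrization the two models have 12 and 9 kernel hyperparameters, which agrees with the counts quoted in the text, although the exact parametrization used in the article is not spelled out.

```python
import numpy as np

def k_se(r, sigma, ell):
    return sigma**2 * np.exp(-r**2 / (2.0 * ell**2))

def k_ma(r, sigma, ell):
    s = np.sqrt(5.0) * r / ell
    return sigma**2 * (1.0 + s + s**2 / 3.0) * np.exp(-s)

def k_rq(r, sigma, ell, alpha):
    return sigma**2 * (1.0 + r**2 / (2.0 * alpha * ell**2)) ** (-alpha)

def k_per(r, sigma, ell, p):
    """Periodic kernel: repeats exactly every p units of distance."""
    return sigma**2 * np.exp(-2.0 * np.sin(np.pi * r / p) ** 2 / ell**2)

def k_model1(r, th):
    """M1: SE + PER x SE + RQ + SE."""
    return (k_se(r, *th["se_trend"]) + k_per(r, *th["per"]) * k_se(r, *th["se_decay"])
            + k_rq(r, *th["rq"]) + k_se(r, *th["se_short"]))

def k_model2(r, th):
    """M2: SE + PER x Ma + Ma."""
    return (k_se(r, *th["se_trend"]) + k_per(r, *th["per"]) * k_ma(r, *th["ma_decay"])
            + k_ma(r, *th["ma_short"]))

# Hypothetical hyperparameter values, one tuple per kernel term.
th1 = {"se_trend": (1.0, 50.0), "per": (1.0, 1.0, 1.0), "se_decay": (1.0, 90.0),
       "rq": (1.0, 1.0, 1.0), "se_short": (0.5, 0.1)}
th2 = {"se_trend": (1.0, 50.0), "per": (1.0, 1.0, 1.0), "ma_decay": (1.0, 90.0),
       "ma_short": (0.5, 0.1)}

print(sum(len(v) for v in th1.values()), sum(len(v) for v in th2.values()))
print(k_model1(0.0, th1), k_model2(0.0, th2))
```

Multiplying the periodic kernel by a slowly decaying kernel lets the seasonal pattern drift over time instead of repeating exactly, which is the kind of high-level structure such combinations are meant to capture.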

The data that were used here consist of CO<sub>2</sub> concentrations from the Mauna Loa Observatory, and the reader can find more details in Keeling et al. (1976), Thoning et al. (1989), Etheridge et al. (1996), and Tans (2012) (see **Figure 7**).

Keeling and Whorf (Keeling and Whorf, 2005; Rasmussen and Williams, 2006; Wilson and Adams, 2013) recorded monthly average atmospheric CO<sub>2</sub> concentrations at the Mauna Loa Observatory in Hawaii. The months between around 1960 and 1998 are used for training (see **Figure 7**), and the remaining months until year 2020 (including GP extrapolation) are used for testing (see **Figure 12**).

A very similar dataset was used in Keeling and Whorf (2005), Rasmussen and Williams (2006), and Wilson and Adams (2013) and is often utilized in GPs' tutorials to demonstrate how GPs are performing as flexible black box modeling tools (even during extrapolation). This data set is great as a toy example as one can notice a long-term rising trend including some seasonal variability and some irregularities. The current work goes toward a fully automated algorithm and investigation for data pattern recognition and robust GP modeling. In all procedures (as before), Gaussian noise is assumed, so that marginalization (or in simple terms integration) over the unknown functions can be performed in a closed form.

*M*<sub>1</sub> and *M*<sub>2</sub> in this example are composed of 12 and 9 hyperparameters {*θ*} (as seen in **Figure 9**), respectively. **Figure 11** shows the prediction and the 95% confidence bounds obtained according to Rasmussen and Williams (2006), while **Figure 12** shows the prediction and the 95% confidence bounds obtained by propagating the uncertainty in the hyperparameters.

On running ABC, **Figure 8** shows the model posterior probabilities over the different populations and the associated tolerance threshold. One can easily observe that for high tolerance thresholds, there is no strong evidence that either kernel model is more favorable. Between populations 2 and 17, the algorithm tends to favor the simplest covariance. In a nutshell, the algorithm tries at first to converge toward the simplest model, which is Model Two. This means that the complex model with the higher number of parameters (Model One) is penalized. For instance, this is quite obvious at population 9, where the probability associated with Model Two is much higher than that of Model One. However, as the tolerance threshold is decreased further, Model Two is no longer able to give predictions with adequate accuracy and, in turn, the algorithm moves to favor the more complex Model One. At population 19, the algorithm gives higher evidence to Model One. The algorithm ends up by finding the best model at population 23 with strong evidence and eliminates Model Two, which is no longer able to explain the data.

In the subsequent iterations, the algorithm refines the model parameter estimates associated with the best model. **Figure 9** shows the histograms of the Model One kernel parameters. By comparing the log-marginal likelihood values obtained with a gradient-based optimizer and with the ABC-SMC algorithm over the populations, one clearly sees from **Figure 10** how the ABC-SMC algorithm converges to a better optimum. This illustrates the ability of the ABC-SMC algorithm to explore the hyperparameter space more thoroughly, mainly when one has to deal with high-dimensional problems.

**Figures 11** and **12** show the training data and the model prediction with the 95% confidence interval for all different kernels. **Figure 11** is obtained according to Rasmussen and Williams (2006), while **Figure 12** is obtained by propagating the uncertainty on the hyperparameter estimates, and the kernel was chosen automatically and not by trial and error (important difference). One can see a good agreement between both predictions.

### **6. DISCUSSION AND CONCLUSION**

It is evident from the last example that it was difficult at the beginning to favor one kernel model over the other. This means that both kernels could be candidates that can explain and fit the data. As the algorithm progresses, though, and the threshold tightens, the ABC jumps to the more complex model in order to capture the trend and the behavior of the data, abandoning the simplest combined kernel once its properties prove insufficient. It is clear that the method presented here gives the end user a systematic and consistent way of choosing kernels for machine learning applications and simultaneously estimating the parameters that accompany them. Given these distributions of the hyperparameters, one can even give confidence intervals estimated from the obtained posterior distribution of the kernel hyperparameters: by randomly generating a large number of samples and simulating the kernel model responses, a pointwise confidence interval can be obtained.
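One way to read the last step is sketched below: for each hyperparameter sample, a GP prediction is simulated, and pointwise percentiles are then taken across the simulated responses. The synthetic data, the squared-exponential kernel, and in particular the hyperparameter samples (drawn here from made-up Gaussians as a stand-in for the final ABC-SMC population) are assumptions made for illustration; they are not results from the article.

```python
import numpy as np

rng = np.random.default_rng(1)

def se_kernel(a, b, sigma_f, ell):
    d = a.reshape(-1, 1) - b.reshape(1, -1)
    return sigma_f**2 * np.exp(-0.5 * d**2 / ell**2)

def gp_predict(X, y, X_star, sigma_f, ell, sigma_n):
    """Predictive mean and variance of noisy targets at X_star."""
    K_y = se_kernel(X, X, sigma_f, ell) + sigma_n**2 * np.eye(X.size)
    K_s = se_kernel(X_star, X, sigma_f, ell)
    mean = K_s @ np.linalg.solve(K_y, y)
    var = sigma_f**2 - np.sum(K_s * np.linalg.solve(K_y, K_s.T).T, axis=1)
    return mean, np.maximum(var, 0.0) + sigma_n**2

# Toy training data (hypothetical).
X = np.linspace(0.0, 10.0, 40)
y = -2.0 * X + X * np.sin(X) + rng.normal(0.0, 1.0, size=X.size)
X_star = np.linspace(0.0, 10.0, 25)

# Stand-in for posterior samples of (sigma_f, ell, sigma_n).
n_samples = 500
theta_samples = np.abs(np.column_stack([
    rng.normal(10.0, 1.0, n_samples),   # sigma_f
    rng.normal(2.0, 0.2, n_samples),    # length scale
    rng.normal(1.0, 0.1, n_samples),    # sigma_n
]))

# Simulate one model response per hyperparameter sample.
draws = []
for sigma_f, ell, sigma_n in theta_samples:
    mean, var = gp_predict(X, y, X_star, sigma_f, ell, sigma_n)
    draws.append(rng.normal(mean, np.sqrt(var)))
draws = np.array(draws)

# Pointwise 95% interval across the simulated responses.
lower, upper = np.percentile(draws, [2.5, 97.5], axis=0)
print(lower[:3], upper[:3])
```

The resulting band reflects both the predictive uncertainty of the GP and the spread of the hyperparameter samples, rather than conditioning on a single optimized hyperparameter vector.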

One small comment can now generate a huge discussion that is outside the remit of this paper but can give the reader food for thought. Why might someone need the uncertainty around the hyperparameters? Does it give any more information for a GP or an RVM, for example?

The answer is yes and no. It is very evident that kernel selection (or even the mean function) controls all the generalization properties of the algorithm, but for semi- or non-parametric tools like GPs, the uncertainty of the hyperparameters might not add anything to the physical mechanism of this Bayesian tool. However, one can argue that it can potentially be used for the evaluation of the training set. GPs or RVMs do not over-fit in the sense of a classical neural network, nor are they trapped in local minima, as they are closed-form solutions obtained by integrating out the parameters and, as a result, have no actual classic error or cost function. But they are "optimized" for a specific training set, and the uncertainty arising from fitting the best kernel and the best hyperparameter values can be used as a "metric" to evaluate whether something is wrong with the defined training set and, furthermore, to check whether even different kernel models struggle to understand the data, which would mean that the training set is not representative when projected to a validation/test set. Also, if one moves to dynamic models like NARX-GPs, the current work can not only find the best number of lags, by treating them as different competing models, but also provide an elegant uncertainty evaluation of choosing specific lags to represent the dynamic regression algorithm.

To summarize, the presented work moves toward a compact, consistent, and automatic mechanism, via a Bayesian formulation of the ABC, for finding an optimal kernel and its hyperparameters simultaneously. As can be seen in example one, the difference between kernels is not significant, and this is the reason that the simplest kernel is chosen. In the authors' opinion, this can generate an argument like a "no free lunch theorem": for certain types of engineering problems (non-linear systems, for example), the computational cost of reaching a solution, averaged over all different models in the same problem, could simply be the same for any "optimized" solution algorithm or kernel model, leaving one with the question: is there a best model with a best solution that offers a clear "short cut"?

### **AUTHOR CONTRIBUTIONS**

All authors listed have made a substantial, direct, and intellectual contribution to the work and approved it for publication.

### **FUNDING**

This work was supported by the U.K. Engineering and Physical Sciences Research Council (EPSRC) under Grant EP/J016942/1 and Grant EP/K003836/2.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer, LA, and handling editor declared their shared affiliation, and the handling editor states that the process nevertheless met the standards of a fair and objective review.

*Copyright © 2017 Abdessalem, Dervilis, Wagg and Worden. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# **Vibration Monitoring of Gas Turbine Engines: Machine-Learning Approaches and Their Challenges**

*Ioannis Matthaiou\*, Bhupendra Khandelwal and Ifigeneia Antoniadou*

*Department of Mechanical Engineering, The University of Sheffield, Sheffield, United Kingdom*

In this study, condition monitoring strategies are examined for gas turbine engines using vibration data. The focus is on data-driven approaches; for this reason, a novelty detection framework is considered for the development of reliable data-driven models that can describe the underlying relationships of the processes taking place during an engine's operation. From a data analysis perspective, the high dimensionality of the extracted features and the data complexity are two problems that need to be dealt with throughout analyses of this type. The latter refers to the fact that the healthy engine state data can be non-stationary. To address this, the implementation of the wavelet transform is examined to get a set of features from vibration signals that describe the non-stationary parts. The problem of high dimensionality of the features is addressed by "compressing" them using the kernel principal component analysis so that more meaningful, lower-dimensional features can be used to train the pattern recognition algorithms. For feature discrimination, a novelty detection scheme that is based on the one-class support vector machine (OCSVM) algorithm is chosen for investigation. The main advantage, when compared to other pattern recognition algorithms, is that the learning problem is cast as a quadratic program. The developed condition monitoring strategy can be applied for detecting excessive vibration levels that can lead to engine component failure. Here, we demonstrate its performance on vibration data from an experimental gas turbine engine operating under different conditions. Engine vibration data that are designated as belonging to the engine's "normal" condition correspond to fuel and air-to-fuel ratio combinations in which the engine experienced low levels of vibration. Results demonstrate that such novelty detection schemes can achieve a satisfactory validation accuracy through appropriate selection of two parameters of the OCSVM, the kernel width γ and the optimization penalty parameter ν. This selection was made by searching along a fixed grid space of values and choosing the combination that provided the highest cross-validation accuracy. Nevertheless, there exist challenges that are discussed along with suggestions for future work that can be used to enhance similar novelty detection schemes.

**Keywords: engine condition monitoring, vibration analysis, novelty detection, pattern recognition, one-class support vector machine, wavelets, kernel principal component analysis**

#### *Edited by:*

*Eleni N. Chatzi, ETH Zurich, Switzerland*

#### *Reviewed by:*

*Dimitrios Giagopoulos, University of Western Macedonia, Greece Wei Song, University of Alabama, United States*

> *\*Correspondence: Ioannis Matthaiou imatthaiou1@sheffield.ac.uk*

#### *Specialty section:*

*This article was submitted to Structural Sensing, a section of the journal Frontiers in Built Environment*

> *Received: 15 March 2017 Accepted: 29 August 2017 Published: 20 September 2017*

#### *Citation:*

*Matthaiou I, Khandelwal B and Antoniadou I (2017) Vibration Monitoring of Gas Turbine Engines: Machine-Learning Approaches and Their Challenges. Front. Built Environ. 3:54. doi: 10.3389/fbuil.2017.00054*

### **INTRODUCTION**

Vibration measurements are commonly considered to be a sound indicator of a machine's overall health state (global monitoring). The general principle behind using vibration data is that when faults start to develop, the system dynamics change, which results in different vibration patterns from those observed at the healthy state of the system monitored. In recent years, gas turbine engine manufacturers have turned their attention to increasing the reliability and availability of their fleet using data-driven vibration-based condition monitoring approaches (King et al., 2009). These methods are generally preferred, for online monitoring strategies, over a physics-based modeling approach, where a generic theoretical model is developed under several assumptions. In the case of data-driven condition monitoring approaches, a model based on engine data can be constructed so that inherent linear and non-linear relationships (depending on the method) that are specific to the system being monitored can be captured. For this reason, engine manufacturers see the need to implement such approaches during pass-off tests, where it is necessary to identify possible defects at an early stage, before complete component failure occurs.

Due to the complex processes taking place in a gas turbine engine, and since modes of failure of such systems are rarely observed in practice, the novelty detection paradigm is normally adopted for developing a data-driven model (Tarassenko et al., 2009), since in this case only data coming from the healthy state of the system are needed for training. On the other hand, conventional multi-class classification approaches are not as easy to implement, since it is not possible to have data and/or understanding (labels) from all classes of failure. The main concept of a novelty detection method is described in Pimentel et al. (2014): training data from one class are used to construct a data-driven model describing the distribution they belong to. Data that do not belong to this class are novel/outliers. In a gas turbine engine context, a model of "normal" engine condition (class *N*) is developed, since data are only available from this class. This model is then used to determine whether new unseen data points are classed as normal or "novel" (class *A*), by comparing them with the distribution learned from class *N* data. Such a model must be sensitive enough to identify, at a very early stage, potential precursors of localized component malfunctioning that can lead to total engine failure. The costs of a run-to-break maintenance strategy (i.e., decommissioning equipment after failure for replacement) are exceptionally high, but most importantly safety requirements are crucial, and thus, robust alarming mechanisms are required in such systems.

Novelty detection approaches exploit machine learning and statistics. In this study, we will use a non-parametric approach that is specific to the engine being monitored and relies solely on the data for developing the model. The novelty detection field comprises a large portion of the machine-learning discipline and therefore, only a few examples of literature, specific to the application of engine condition monitoring using machine learning, will be mentioned here. Some of the earliest works in this field were made possible through collaboration between Oxford University and Rolls Royce (Hayton et al., 2000). The authors in that paper have used data from vibrations to train a one-class support vector machine (OCSVM). The so-called tracked orders (defined as the vibration amplitudes centered at the fundamental of engine shaft speed and its harmonics) were used as training features for the OCSVM. The OCSVM has also been implemented to detect the impending combustion instability in industrial combustor systems using combustion pressure measurements and combustion high-speed images as input training data (Clifton et al., 2007). The method has also been extended in Clifton et al. (2014) to calibrate the novelty scores of the OCSVM into conditional probabilities.

The choice of the kernel function used in the OCSVM influences its classification accuracy significantly. Since a kernel defines the similarity between two points, its choice is mainly dependent on the data. However, the kernel width is a more important factor than the particular kernel function choice since it can be selected in a manner that ensures that the data are described in the best way possible (Scholkopf and Smola, 2001). Although kernel methods are considered as a good way of injecting domain specific knowledge in an algorithm like the OCSVM, the kernel function choice and its parameters' tuning is not so straightforward. In this study, the authors follow a relatively simple approach to determine both the kernel function parameter and the optimization penalty parameter for the OCSVM. The kernel function parameter that was varied is the radial basis function (RBF) kernel width γ, together with the optimization penalty parameter ν. In general, γ controls the complexity of describing the training examples, while ν defines the upper bound on the fraction of training data points that are outside the boundary defined for class *N* data. Using these two parameters, a compromise can be made between good model generalization capability and good description of the data (training data set) to obtain accurate and reliable predictions.

The novelty detection scheme that is presented in the following sections has been developed for a gas turbine engine that operates on a range of alternative fuels on different air-to-fuel ratios. This engine is being used to study the influence of such operating parameters on its performance (e.g., exhaust emissions), and thus, it is important to enable the early detection of impending faults that might take place during these tests. Since we apply novelty detection on a global system basis, the whole frequency spectrum of vibration must be used for monitoring, rather than specific frequency bands that correspond to engine components. As will be shown later, large vibration amplitudes can be expected in any region along the spectrum.

### **EXPERIMENTAL SETUP AND DATA DESCRIPTION**

The experimental data used in this work were taken from a larger project that aimed to characterize different alternative fuels from an engine performance perspective, e.g., fuel consumption and exhaust emissions. Alternative fuels that are composed of conventional kerosene-based fuel Jet-A1 and bio jet fuels have shown promising results in terms of reducing greenhouse gas emissions and other performance indicators. Several research programs studied alternative fuels for aviation quite extensively, as reviewed in Blakey et al. (2011). The facility that was used to test the different alternative fuels under different engine air-to-fuel ratios houses a Honeywell GTCP85-129, which is an auxiliary power unit of turboshaft gas turbine engine type. Thus, the operating principle of this engine follows a typical Brayton cycle. As shown in the schematic diagram of the engine in **Figure 1**, the engine draws ambient air from the inlet (1 atm) through the centrifugal compressor C1, which raises the air pressure by accelerating the fluid and passing it through a divergent section. The fluid pressure is further increased across a second centrifugal compressor C2, before being mixed with fuel in the combustion chamber (CC) and ignited to add energy into the system (in the form of heat) at constant pressure. The high temperature and pressure gases are expanded across the turbine, which drives the two compressors, a 32 kW generator G that provides aircraft electrical power and the engine accessories (EA), e.g., fuel pumps, through a speed reduction gearbox.

The bleed valve (BV) of the engine allows the extraction of high temperature, compressed air (~232°C at 338 kPa of absolute pressure) to be passed to the aircraft cabin and to provide pneumatic power to start the main engines. This allows the engine to be tested on different operating modes as the air-to-fuel mass flow that goes into the CC can be changed with the BV position. When the BV opens, a decrease in turbine speed will take place if there is no addition of fuel to compensate for the lost work. The energy loss arises from the decrease in work done *w*<sub>*c*2</sub> to the engine's working fluid as it passes through the second compression stage. The amount of lost work is proportional to the extracted bleed air mass *m*<sub>bleed</sub> and can be expressed as *w*<sub>*c*2</sub> = *m*<sub>bleed</sub> *c*<sub>*p*</sub> *dT*, with *c*<sub>*p*</sub> representing the heat capacity of the working fluid and *dT* the temperature differential across the second compression stage. Since the shaft speed must remain constant at 4,356 *±* 10.5 rad/s, the fuel flow controller achieves this by regulating the pressure in the fuel line, thereby injecting a different fuel mass flow into the CC.

Increasing the fuel mass flow that goes into the CC to maintain constant shaft speed, without a subsequent increase in air mass flow rate, raises the exhaust gas temperature, as shown in **Table 1**. This can be explained by the fact that when there is a deficiency of oxygen required for complete combustion of the incoming sprayed fuel, more droplets of fuel are carried further downstream of the CC, until they eventually burn. This gradual burning of fuel along the combustion section causes the associated flame to propagate further toward the dilution zone. Hence, inadequate cooling of the gas stream takes place, which causes higher combustor exit and, in turn, exhaust gas temperatures. This also implies that there is an upper and lower limit for the exhaust gas temperature, which is monitored and controlled by the electronic temperature controller.

**TABLE 1** | Averaged engine operating parameters for three operating modes on Jet-A1 fuel.

Three operating modes have been considered by setting the BV to three positions. These modes are typical for an auxiliary power unit and each corresponds to a specific turbine load and air-to-fuel ratio. The turbine load is thus solely dependent upon the bleed load, whilst the shaft load (amount of work required to drive the generator and EA) is kept constant in all three operating modes. Using the conventional kerosene jet fuel, Jet-A1, the average values of key engine parameters change across the three operating modes as shown in **Table 1**. In Mode 1, the engine BV is fully closed, so there is no additional load on the turbine, while Mode 2 is a mid-power setting and is used when the main engines are switched off and there is a requirement to operate the aircraft's hydraulic systems. During Mode 3, the engine BV is fully opened, which corresponds to the highest level of turbine load and exhaust gas temperature. This operating mode is selected when pneumatic power is required to start the aircraft main engines, by providing sufficient air at high pressure to rotate the turbine blades until self-sustaining power operation is reached.

A piezoelectric accelerometer with a sensitivity of 10 mV/g was placed on the engine support structure, sampling at 2 kHz (*f*<sub>*s*</sub> = 2 kHz). Each test lasted 110 s. The fuels that were considered are blends of Jet-A1 and a bio jet fuel [hydroprocessed esters and fatty acids (HEFA)]. The specific energy density of HEFA is 44 MJ/kg, and thus, it can release the same amount of energy for a given quantity of fuel as that of Jet-A1. The mass fractions of bio jet fuel blended with Jet-A1 in this study are as follows: 0, 2, 10, 15, 25, 30, 50, 75, 85, 95, and 100%. Additional blends of fuels were also considered for comparison: 50% liquid natural gas (LNG) + 50% Jet-A1, 100% LNG and 11% Toluene + 89% Banner Solvent.

**Figures 2** and **3** show examples of the normalized time- and frequency-domain accelerations, respectively. The normalization was done by dividing each time- and frequency-domain acceleration amplitude by its corresponding maximum value, i.e., unit normalized, so that all amplitudes, corresponding to the different datasets, vary within the same range [0, 1]. In the time domain, it is shown that there are certain engine conditions, e.g., 85% Jet-A1 + 15% HEFA, in which the vibration responses of the engine operating under steady state display strong non-stationary trends, whereas for conditions such as 50% Jet-A1 + 50% HEFA, the vibration responses contain periodic characteristics, as can be more clearly seen in the frequency-domain plots. Note that the actual recorded acceleration time for each engine condition was 110 s, but, for reasons of clarity, only 2 s are shown in the plots. **Figure 3** shows that with condition 85% Jet-A1 + 15% HEFA, the engine experiences the highest overall amplitude level across the whole spectrum on Modes 1 and 3, while for Mode 2, the engine operating under condition 50% Jet-A1 + 50% HEFA exhibits the highest vibration levels throughout the whole frequency spectrum. The above demonstrates that the change in air-to-fuel ratio changes the statistical properties of the datasets and consequently the frequency-domain response of the engine for the different fuel blends. For Modes 1 and 3, with condition 50% Jet-A1 + 50% HEFA, a strong frequency component at 100 Hz is present. Strong periodicity is also present for 100% LNG, at the same frequency. Therefore, looking at the data, we can distinguish two main groups, i.e., those that contain some strong periodic patterns and those that do not share this characteristic and may therefore be non-stationary, if appropriate evaluation of their time-domain statistics confirms that.
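As an illustration of the unit normalization described above, the following minimal sketch computes an amplitude spectrum and divides it by its maximum so that it lies in [0, 1]. The sampling frequency and record length are taken from the text; the signal itself is a synthetic stand-in (a 100 Hz tone in noise), and `unit_normalised_spectrum` is a name introduced here for illustration, not part of the original study.

```python
import numpy as np

FS = 2000.0        # sampling frequency in Hz (from the text)
DURATION = 110.0   # record length in s (from the text)

def unit_normalised_spectrum(signal, fs):
    """Return the frequency axis and the amplitude spectrum scaled to [0, 1]."""
    amplitude = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    return freqs, amplitude / amplitude.max()

# Synthetic stand-in for a measured record: a 100 Hz tone plus white noise.
rng = np.random.default_rng(0)
t = np.arange(int(FS * DURATION)) / FS
acceleration = np.sin(2 * np.pi * 100.0 * t) + 0.5 * rng.standard_normal(t.size)

freqs, spectrum = unit_normalised_spectrum(acceleration, FS)
peak_frequency = freqs[np.argmax(spectrum)]  # frequency of the dominant component
```

For a record with a strong periodic component, `peak_frequency` locates that component, while the normalized spectrum allows records of very different absolute level to be compared on the same axis.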

It is hard to provide a theoretical explanation of the physical context behind the vibration responses acquired, without a valid physics-based model that can predict the engine's vibration response as an output of a system where, apart from the dynamics context, complex thermochemical and other physical processes take place. At the same time, the nature of the modeling/monitoring problem, if approached from a physics-based perspective, suggests that model validation would be a significant challenge. Choosing a data-driven strategy overcomes this challenge, since the system examined (engine in operation) is treated as a black box.

### **DATA ANALYSIS METHODS**

As mentioned in the Section "INTRODUCTION," this study follows a machine-learning framework for the condition monitoring of engines using vibration data. This means that, to develop a methodology that can be used to detect novel engine patterns from vibration data, three subsequent steps should be taken, following the data acquisition stage. Those are, namely, data preprocessing, feature extraction, and development of a learning model of normal engine behavior (Tarassenko et al., 2009).

### **Preprocessing of Raw Vibration Data**

To improve the ability of the novelty detection scheme to determine whether a data point belongs to class *N* or *A*, while removing the influence of absolute values, a preprocessing method was applied prior to feature extraction. As has been shown in Clifton et al. (2006), this step has a major effect on the novelty detection system since it enables a better discriminating capability between the two different classes. Scaling and normalization are also important for most condition monitoring systems for the removal of any undesirable environmental or operational effects in the analyzed data (He et al., 2009). Scaling has also been considered as a preprocessing method for improving the performance of one-class classifiers (Juszczak et al., 2002): it is good practice, when working with machine-learning algorithms, to scale the data being analyzed, since features with large absolute value ranges will tend to dominate those with smaller value ranges (Hsu et al., 2016). In this study, the aim is to enhance the difference in vibration amplitude for classes *N* and *A*, and therefore, the data are chosen to be scaled across the different conditions tested (not across time).

First, a *D*-dimensional matrix *X* = {*x*<sub>1</sub>, *. . .*, *x*<sub>*N*</sub>} of class *N* was constructed. An index *i* = 1, *. . .*, *N* is used to denote the different conditions that were included in this matrix, i.e., the various fuel blends on the three modes of operation. A separate matrix *Z* = {*z*<sub>1</sub>, *. . .*, *z*<sub>*L*</sub>}, containing data from both classes (25% of engine conditions are from class *A*), was also constructed. This prior labeling of the two classes was performed by assembling a matrix with all the raw data (prior to preprocessing) and reducing its dimensions to 2 using principal component analysis (PCA), in order to visualize it. The observed data points in the two-dimensional space of PCA that were far from the rest of the data were assigned the class *A* label, while all the others were given the class *N* label. For instance, the condition 85% Jet-A1 + 15% HEFA at Mode 1 was given the former label.

The scaled version of matrix *X* was obtained as follows:

$$\hat{\mathbf{x}}_i = \left[\mathbf{x}_i - \bar{\mathbf{x}}\right] / \boldsymbol{\sigma}_x,\tag{1}$$

where the mean vector is defined as $\bar{\mathbf{x}} = \frac{1}{N}\sum_{i=1}^{N} \mathbf{x}_i$ and the variance vector as $\boldsymbol{\sigma}_x = \frac{1}{N}\sum_{i=1}^{N} (\mathbf{x}_i - \bar{\mathbf{x}})^2$. Now, the scaled version of matrix *Z*, with an index *j* = 1, *. . .*, *L* denoting the different conditions in the matrix, containing data from both classes, was obtained as follows:

$$\hat{\mathbf{z}}_j = \left[\mathbf{z}_j - \bar{\mathbf{x}}\right] / \boldsymbol{\sigma}_x.\tag{2}$$
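A minimal sketch of the scaling in Eqs 1 and 2 is given below. It assumes that each row of a matrix holds one engine condition and each column one sample in time, so that the statistics are taken across conditions as described in the text; the function names `fit_scaler` and `apply_scaler` and the random data are hypothetical. The sketch follows the definition in the text, where the scaling vector is the mean squared deviation; many implementations divide by its square root (the standard deviation) instead.

```python
import numpy as np

def fit_scaler(X):
    """Mean and scaling vectors computed from class-N data only (rows = conditions)."""
    mean = X.mean(axis=0)
    sigma = ((X - mean) ** 2).mean(axis=0)  # (1/N) * sum_i (x_i - mean)^2, as in the text
    return mean, sigma

def apply_scaler(A, mean, sigma):
    """Scale any matrix with the class-N statistics (Eq. 1 for X, Eq. 2 for Z)."""
    return (A - mean) / sigma

# Hypothetical data: 12 class-N conditions and 4 mixed conditions, 6 samples each.
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(12, 6))
Z = rng.normal(loc=5.0, scale=2.0, size=(4, 6))

mean, sigma = fit_scaler(X)
X_scaled = apply_scaler(X, mean, sigma)
Z_scaled = apply_scaler(Z, mean, sigma)  # Z reuses the statistics of X
```

Reusing the class *N* statistics for *Z* means that unseen data are expressed relative to the normal engine condition, so that departures from it are emphasized rather than scaled away.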

### **Feature Extraction of Preprocessed Raw Vibration Data**

The process of feature extraction follows after the data preprocessing stage. The wavelet packet transform (WPT) is chosen for this purpose. All the coefficients from the time-scale transformations are used as inputs to an algorithm that is suitable for linear or nonlinear dimensionality reduction, the kernel principal component analysis (KPCA). This procedure of data transformation using wavelet bases and projection onto a set of lower-dimensional axes is advantageous in cases when there is no knowledge about the characteristic frequencies of the mechanical system being monitored.

#### Wavelet Coefficients

The objective of this stage is to obtain a set of discriminating features from the preprocessed raw vibration data, so that the learning model will then be able to easily separate the two classes of engine conditions. It was previously shown in **Figure 3** that there is a certain degree of dissimilarity between the engine conditions with regards to their amplitudes in the frequency spectrum. Hence, to capture both time- and frequency-domain information from the data, it is necessary to use time–frequency methods. The wavelet transform allows one to include time information for the frequency components. Non-stationary events can, therefore, be analyzed using the wavelet transform. It is expected that the data can be more effectively described than with Fourier-based methods, where any non-stationary regions of the stochastic signal are not localized in time. Choosing a time–frequency approach, such as the wavelet transform, might be the best option for the type of data processed in this study. The simplest time–frequency analysis method, the short-time Fourier Transform, will not be an optimal option as the window size is fixed. Hence, there exist resolution limitations, determined by the uncertainty principle, which could hinder the analysis of potentially non-stationary parts of the signal.

The wavelet transform solves the problem of fixed window size, by using short windows to analyze high frequency components (good time localization) and large windows for low frequency components (good frequency localization). An example of wavelet transforms applied for condition monitoring applications was presented in Fan and Zuo (2006). Several other frequency methods exist for monitoring applications, e.g., the Empirical Mode Decomposition, as presented in Antoniadou et al. (2015), which can offer similar benefits to the wavelet transform. However, the wavelet transform is chosen in this work because it is very easy to implement and is a proven concept that is mathematically well grounded. The wavelet transform was originally developed for constructing a map of dilation and translation parameters. The dilation represents the scales *s ≈* 1/frequency and the translation τ refers to the time-shift operation. Consider the *n*th engine condition χ<sub>*n*</sub>(*t*), with *t* = {0, *. . .*, 110} s. The corresponding wavelet coefficients can be calculated as follows:

$$c(s,\tau) = \int \chi_{n}(t)\,\psi_{s,\tau}(t)\,dt.\tag{3}$$

The function ψ<sub>*s*,τ</sub> represents a family of high-frequency, short-duration and low-frequency, long-duration functions derived from a prototype function ψ. In mathematical terms, it is defined as follows:

$$
\psi_{s,\tau}(t) = \frac{1}{\sqrt{|s|}}\, \psi\left(\frac{t-\tau}{s}\right), \quad s > 0,\tag{4}
$$

when *s <* 1 the prototype function has a shorter duration in time, while when *s >* 1 the prototype function becomes larger in time, corresponding to high and low frequency characteristics, respectively.
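Eqs 3 and 4 can be evaluated numerically as in the sketch below. The prototype function is assumed here to be the Mexican hat wavelet, which is a choice made purely for illustration since the text does not fix the prototype at this point; the integral is approximated by a Riemann sum, and the signal is a synthetic transient centered at *t* = 3 s.

```python
import numpy as np

def mexican_hat(t):
    """Assumed prototype function psi (unnormalised Mexican hat)."""
    return (1.0 - t**2) * np.exp(-0.5 * t**2)

def wavelet(t, s, tau, prototype=mexican_hat):
    """Dilated and translated wavelet of Eq. 4."""
    return prototype((t - tau) / s) / np.sqrt(abs(s))

def coefficient(signal, t, s, tau):
    """Wavelet coefficient of Eq. 3, approximated by a Riemann sum."""
    dt = t[1] - t[0]
    return np.sum(signal * wavelet(t, s, tau)) * dt

t = np.linspace(0.0, 10.0, 20001)
burst = np.exp(-0.5 * ((t - 3.0) / 0.1) ** 2)  # short transient at t = 3 s

c_on = coefficient(burst, t, s=0.2, tau=3.0)   # wavelet aligned with the transient
c_off = coefficient(burst, t, s=0.2, tau=7.0)  # wavelet far from the transient

# The 1/sqrt(|s|) factor keeps the wavelet energy the same at every scale.
dt = t[1] - t[0]
energy_short = np.sum(wavelet(t, 0.5, 5.0) ** 2) * dt
energy_long = np.sum(wavelet(t, 1.0, 5.0) ** 2) * dt
```

The coefficient is large only when the translation τ places the wavelet on the transient, which is the time localization that a Fourier-based description lacks.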

In Mallat (1999), the discrete version of Eq. 3, namely, the discrete wavelet transform (DWT), was developed as an efficient alternative to the continuous wavelet transform. In particular, it was proven that using a scale *j* and translation *k* that take only values of powers of 2, instead of intermediate ones, a satisfactory time–frequency resolution can still be obtained. This is called the dyadic grid of wavelet coefficients, and the function presented in Eq. 4 becomes a set of orthogonal wavelet functions:

$$
\psi_{j,k}(t) = 2^{j/2}\, \psi \left( 2^j t - k \right), \tag{5}
$$

such that redundancy is eliminated using this set of orthogonal wavelet bases, as described in more detail in Farrar and Worden (2012).

In practice, the DWT coefficients are obtained by convolving χ<sub>*n*</sub>(*t*) with a set of half-band (containing half of the frequency content of the signal) low- and high-pass filters (Mallat, 1989). This yields the corresponding low- and high-pass sub-bands of the signal. Subsequently, the low-pass sub-band is further decomposed with the same scheme after decimating it by 2 (half the samples can be eliminated per the Nyquist criterion), while the high-pass sub-band is not analyzed further. The signal after the first level of decomposition will have twice the frequency resolution of the original signal, since it has half the number of points. This iterative procedure is known as two-channel sub-band coding (Mallat, 1999) and provides one with an efficient way of computing the wavelet coefficients using conjugate quadrature mirror filters. Because of the poor frequency resolution of the DWT at high frequencies, the WPT was chosen for feature transformation. The difference between the DWT and the WPT lies in the fact that the latter decomposes the higher-frequency sub-band further. The schematic diagram of the WPT up to 2 levels of decomposition is shown in **Figure 4**. First, the signal χ<sub>*n*</sub>(*t*) is convolved with a half-band low-pass filter *h*(*k*) and a high-pass filter *g*(*k*). This gives the wavelet coefficient vector *c*<sub>1,1</sub>, which captures the lower-frequency content [0, *f*<sub>*s*</sub>/4] Hz, and the wavelet coefficient vector *c*<sub>2,1</sub>, which captures the higher-frequency content (*f*<sub>*s*</sub>/4, *f*<sub>*s*</sub>/2) Hz. After *j* levels of decomposition, the coefficients from the output of each filter are assembled in a matrix *c*<sub>*n*</sub>, corresponding to the *n*th engine condition χ<sub>*n*</sub>. Note that each coefficient vector has half the number of samples of χ<sub>*n*</sub>(*t*) in the first level of decomposition. In this study, four levels of decomposition were considered as an intermediate value. The above process was repeated for the rest of the *N −* 1 engine conditions to get the matrix of coefficients *C* = {*c*<sub>1</sub>, *. . .*, *c*<sub>*N*</sub>}.
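The two-channel scheme described above, applied to both sub-bands at every level, can be sketched as follows. The Haar filter pair is assumed here only because it is the simplest orthogonal half-band pair; the text does not state which wavelet was used in the study. The function names are hypothetical, the input is a random stand-in signal, and the sub-bands are returned in natural (filter-bank) order rather than sorted by frequency.

```python
import numpy as np

def haar_split(x):
    """One two-channel step: half-band low/high-pass filtering and decimation by 2."""
    even, odd = x[0::2], x[1::2]
    low = (even + odd) / np.sqrt(2.0)
    high = (even - odd) / np.sqrt(2.0)
    return low, high

def wavelet_packet(x, levels):
    """Decompose both sub-bands at each level, giving 2**levels coefficient vectors."""
    x = np.asarray(x, dtype=float)
    if x.size % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    nodes = [x]
    for _ in range(levels):
        nodes = [band for node in nodes for band in haar_split(node)]
    return nodes

rng = np.random.default_rng(0)
signal = rng.standard_normal(1024)         # stand-in for one scaled vibration record
nodes = wavelet_packet(signal, levels=4)   # four levels, as in the study
features = np.concatenate(nodes)           # coefficient vector for this condition
```

Four levels give 16 sub-bands, and the concatenated coefficient vector has the same length as the input, which is why a further dimensionality reduction step is needed afterwards.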

#### Low-Dimensional Features

The wavelet coefficients matrix *C* is a *D*-dimensional matrix, i.e., it has the same dimensions as the original dataset. Hence, lower-dimensional features are necessary to prevent overfitting, which is associated with higher dimensions of features. In this study, PCA was initially used for visualization purposes, e.g., to observe possible clusters of the data points for matrix *X*. Its non-linear equivalent, the KPCA, is used for dimensionality reduction so that non-linear relationships between the features can be captured.

Principal component analysis is a method that can be used to obtain a new set of orthogonal axes that show the highest variance in the data. Hence, *C* was projected onto 2 orthogonal axes, from its original dimension *D*. In PCA, the eigenvalues λ<sub>*k*</sub> and eigenvectors *u*<sub>*k*</sub> of the covariance matrix *S*<sub>*C*</sub> of *C* are obtained by solving the following eigenvalue problem:

$$S\_C \ u\_k = \lambda\_k \ u\_k,\tag{6}$$

where *k* = 1, . . ., *D*. The eigenvector *u*<sub>1</sub>, corresponding to the largest eigenvalue λ<sub>1</sub>, is the first principal component, and so on. The two-dimensional representation of *C*, i.e., *Y* (an *N* × 2 matrix), can be calculated through linear projection using the first two eigenvectors:

$$\mathbf{Y} = \mathbf{C} \, \mathbf{u}\_{k=1,2}.\tag{7}$$
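As a rough illustration of Eqs 6 and 7, the sketch below computes the two leading eigenpairs of a covariance matrix by power iteration with deflation and projects the centred data onto them. The small matrix `C`, the helper names, and the choice of power iteration are assumptions made for the example; they are not taken from the study, where any standard eigen-solver would serve.

```python
import math

def centre(C):
    """Subtract the column means from the rows of C (N samples x D features)."""
    n, d = len(C), len(C[0])
    mean = [sum(row[j] for row in C) / n for j in range(d)]
    return [[row[j] - mean[j] for j in range(d)] for row in C]

def covariance(X):
    """Sample covariance matrix of centred data X."""
    n, d = len(X), len(X[0])
    return [[sum(r[a] * r[b] for r in X) / (n - 1) for b in range(d)] for a in range(d)]

def matvec(S, u):
    return [sum(s * v for s, v in zip(row, u)) for row in S]

def top_eigenpairs(S, k, iters=500):
    """Leading k eigenpairs of a symmetric positive semi-definite matrix."""
    S = [row[:] for row in S]
    pairs = []
    for _ in range(k):
        u = [1.0 + 0.1 * a for a in range(len(S))]  # deterministic start vector
        for _ in range(iters):
            v = matvec(S, u)
            norm = math.sqrt(sum(x * x for x in v))
            u = [x / norm for x in v]
        lam = sum(a * b for a, b in zip(u, matvec(S, u)))
        pairs.append((lam, u))
        # Deflation: remove the component that was just found.
        S = [[S[a][b] - lam * u[a] * u[b] for b in range(len(S))] for a in range(len(S))]
    return pairs

# Toy data (hypothetical): 10 samples, 3 features with very different variances.
C = [[2.0 * i, float((7 * i) % 5), 0.1 * (i % 2)] for i in range(10)]
X = centre(C)
S_C = covariance(X)
pairs = top_eigenpairs(S_C, k=2)
U = [u for _, u in pairs]
# Eq. 7: project each (centred) sample onto the first two eigenvectors.
Y = [[sum(x * u for x, u in zip(row, vec)) for vec in U] for row in X]
```

The result `Y` has one row per sample and two columns, which is the form used for the visual inspection of clusters mentioned above.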

KPCA was introduced by Schölkopf et al. (1998). This method generalizes PCA, because the scalar products in the covariance matrix *S*<sub>*C*</sub> are replaced by a kernel function. In KPCA, the scalar product between the mappings ϕ of two data points, e.g., the *n*th and the *m*th wavelet coefficient vectors *c*<sub>*n*</sub> and *c*<sub>*m*</sub>, respectively, is evaluated with the RBF kernel function as follows:

$$k(\mathbf{c}\_n, \mathbf{c}\_m) = \exp\left(-\frac{\left\| \mathbf{c}\_n - \mathbf{c}\_m \right\|^2}{2 \sigma\_{\text{KPCA}}^2}\right). \tag{8}$$

Using the above mapping, standard PCA can be performed in this new feature space *F*, which implicitly corresponds to a non-linear principal component analysis in the original space. Hence, the covariance matrix is formed from the mapped data points, whose scalar products are evaluated with the RBF kernel:

$$S\_{\phi} = \frac{1}{N} \sum\_{i=1}^{N} \phi(\mathbf{c}\_{i})^{T} \phi(\mathbf{c}\_{i}). \tag{9}$$

However, the above matrix cannot be used directly to solve an eigenvalue problem as in Eq. 6, because of its high dimension. Hence, after some algebraic manipulation, the eigenvalues *ℓ*<sub>*d*</sub> and eigenvectors *u*<sub>*d*</sub> can be computed for the kernel matrix *K* (of size *N × N*), instead of the covariance matrix (of size *F × F*). Therefore, in KPCA, we are required to find a solution to the following eigenvalue problem instead:

$$
\mathbf{K} \mathbf{u}\_d = \ell\_d \mathbf{u}\_d,\tag{10}
$$

where *d* = 1, . . ., *N*. Since *F > N*, the number of non-zero eigenvalues cannot exceed the number of engine operating conditions *N* (Bishop, 2006). Using the eigenvectors of the kernel matrix, it is possible to obtain the new projections *Y* = {*y*<sub>1</sub>, . . ., *y*<sub>*N*</sub>} of the mapped data points of wavelet coefficients ϕ(*c*<sub>*i*</sub>) on a non-linear surface of dimensionality *d* that can vary from 1 up to *N*.
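To make the construction of the *N × N* kernel matrix concrete, the following sketch builds a Gaussian RBF kernel matrix with width σ and centres it in feature space. The centring step is the standard one from the KPCA literature and is assumed here, since the text does not spell it out; the data and function names are hypothetical.

```python
import math

def rbf_kernel_matrix(C, sigma):
    """K[n][m] = exp(-||c_n - c_m||^2 / (2 sigma^2)) for all pairs of rows of C."""
    n = len(C)
    K = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            sq = sum((p - q) ** 2 for p, q in zip(C[a], C[b]))
            K[a][b] = math.exp(-sq / (2.0 * sigma ** 2))
    return K

def centre_kernel(K):
    """Centre the mapped data: K~ = K - 1K - K1 + 1K1, with 1 an N x N matrix of 1/N.

    K is symmetric, so its row means and column means coincide.
    """
    n = len(K)
    row_mean = [sum(K[a]) / n for a in range(n)]
    total_mean = sum(row_mean) / n
    return [[K[a][b] - row_mean[a] - row_mean[b] + total_mean for b in range(n)]
            for a in range(n)]

# Hypothetical coefficient vectors: N = 5 conditions, D = 3 features.
C_demo = [[0.0, 0.1, 0.2], [1.0, 0.9, 1.1], [0.2, 0.0, 0.1], [2.0, 2.1, 1.9], [1.1, 1.0, 1.0]]
K = rbf_kernel_matrix(C_demo, sigma=1.0)
K_centred = centre_kernel(K)
```

The eigenvectors of `K_centred` would then be obtained with any symmetric eigen-solver to give the projections; note that the matrix size depends only on the number of conditions *N*, not on the dimension of the feature space.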

### **Learning Model for Novelty Detection**

Support vector machines as a tool for classification offer the flexibility of an artificial neural network, while overcoming its pitfalls. Using a kernel function to expand the original input space into a higher-dimensional one, in which a linear decision hyperplane is sought, is closely related to adding more layers to an artificial neural network. Therefore, the algorithm can be adapted to better match the characteristics of our data, in a manner that enhances the prediction accuracy. Given that the OCSVM is formulated as a quadratic optimization problem, it is guaranteed to find the optimal position of the linear decision hyperplane (Schölkopf et al., 2001; Shawe-Taylor and Cristianini, 2004). On the other hand, minimizing the mean squared error of an artificial neural network with the gradient descent algorithm may yield only a local optimum.

As training data, we use the matrix obtained from KPCA, i.e., *Y*, whereas lower-dimensional representations of the testing data (from the matrix *Z*) are obtained by following the same feature transformation, selection, etc. The OCSVM methodology allows the use of the RBF kernel function, which maps the data points in *Y* in a similar way to that in KPCA. However, the formulation of the RBF kernel in the LIBSVM toolbox (Chang and Lin, 2011) is slightly different. Given two data points *y*<sub>*n*</sub> and *y*<sub>*m*</sub>, the RBF kernel implemented in the OCSVM is defined as follows:

$$k(\mathbf{y}\_n, \mathbf{y}\_m) = \left. e^{-\gamma \left\| \mathbf{y}\_n - \mathbf{y}\_m \right\|^2} \right. \tag{11}$$

After the training data are mapped *via* the RBF kernel, the origin in this new feature space is treated as the only member of class *A* data. Then, a hyperplane is defined such that the mapped training data are separated from the origin with maximum margin. The hyperplane in the mapped feature space is located at *w* · ϕ(*y*) − ρ = 0, where ρ is the overall margin variable. To separate all mapped data points from the origin, the following quadratic program needs to be solved:

$$\begin{aligned} \min\_{\mathbf{w}, \rho, \boldsymbol{\xi}} \quad & \frac{1}{2} \mathbf{w}^T \mathbf{w} + \frac{1}{\nu N} \sum\_{i=1}^{N} \xi\_i - \rho \\ \text{subject to} \quad & \mathbf{w} \cdot \phi(\mathbf{y}\_i) \ge \rho - \xi\_i, \quad \xi\_i \ge 0, \quad i = 1, \ldots, N, \end{aligned} \tag{12}$$

where *w* is the normal vector to the hyperplane and ξ<sub>*i*</sub> are slack variables that quantify the misclassification error of each data point separately, according to its distance from the corresponding boundary. The parameter ν, mentioned previously, penalizes misclassifications and is bounded, ν ∈ (0, 1]. The decision on whether an unseen data point *y*<sup>∗</sup>, i.e., from matrix *Z*, belongs to either of the two classes of engine conditions can be made using the following function:

$$g(\mathbf{y}^\*) = \text{sgn}\left[\mathbf{w} \cdot \phi(\mathbf{y}^\*) - \rho\right].\tag{13}$$

For a data point from class *A*, *g*(*y*<sup>∗</sup>) > 0; otherwise, *g*(*y*<sup>∗</sup>) ≤ 0. Note that, for practical reasons, the optimization problem in Eq. 12 is solved by introducing Lagrange multipliers. One of the main reasons is that this enables the optimization to be written in terms of dot products. This gives rise to the "kernel trick," which enables the problem to be generalized to the non-linear case by using suitable kernel functions, such as the RBF kernel that is used in this study.
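The role of the kernel trick in the decision function can be illustrated as follows. In the dual formulation, the product *w* · ϕ(*y*<sup>∗</sup>) is expressed as a weighted sum of kernel evaluations against the support vectors, so ϕ never needs to be computed explicitly. The support vectors, multipliers `alphas`, offset `rho` and kernel width `gamma` below are made-up values for illustration; in practice they would come from solving the quadratic program with a toolbox such as LIBSVM.

```python
import math

def rbf(a, b, gamma):
    """RBF kernel in the form exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))

def decision(y_star, support_vectors, alphas, rho, gamma):
    """Dual-form decision: sign of sum_i alpha_i * k(y_i, y*) - rho."""
    score = sum(al * rbf(sv, y_star, gamma)
                for sv, al in zip(support_vectors, alphas)) - rho
    return (1 if score > 0 else -1), score

# Hypothetical trained model: three support vectors with equal multipliers.
support_vectors = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
alphas = [1.0 / 3.0] * 3
rho, gamma = 0.5, 1.0

label_near, score_near = decision((0.3, 0.3), support_vectors, alphas, rho, gamma)
label_far, score_far = decision((5.0, 5.0), support_vectors, alphas, rho, gamma)
```

A point close to the support vectors falls on the positive side of the hyperplane, whereas a distant point has a kernel sum near zero and falls on the negative side.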

### **RESULTS AND DISCUSSION**

In this work, the RBF kernel was used to map the data points of the OCSVM to an infinite-dimensional feature space, where linear separation of the two classes can be achieved. Applying an OCSVM to our problem makes a wide range of kernel function formulations available. The RBF kernel is one of the most popular, since it implies general smoothness properties for a dataset, an assumption that is commonly accepted in many real-world applications, as discussed in more detail in Schölkopf and Smola (2001). An OCSVM with an RBF kernel has two parameters that need to be determined to adapt the algorithm to the characteristics of the vibration signals expected in this study: the kernel width γ and the optimization penalty ν. By observing the variation in validation accuracy α<sup>ν</sup> of the OCSVM on a fine grid of values of γ and ν, it was possible to determine the combination of those two values that maximizes α<sup>ν</sup>. The values of γ and ν were chosen in steps of powers of 2, as suggested by the practical study of Hsu et al. (2016). The validation accuracy was calculated using a 10-fold cross-validation scheme to prevent overfitting the data. As discussed in more detail in Bishop (2006), cross-validation is used when the supply of training data is small. In such cases, there are not enough data to split into separate training and validation datasets for investigating the robustness and accuracy of the model. In our study, the number of engine operating conditions is relatively small compared with the number of dimensions in the feature matrix. Therefore, cross-validation is a possible solution to the problem of insufficient training data. In this scheme, the data are first divided into 10 equal-sized subsets. Each subset is used in turn to test the classification performance of the model trained on the other nine subsets, so that each data point in the vibration training dataset is predicted once. Hence, the cross-validation accuracy is the percentage of correct classifications among the vibration training data.
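The cross-validated grid search described above can be sketched as below. To keep the example self-contained, the OCSVM is replaced by a deliberately simple stand-in detector (a kernel-mean score with a ν-dependent threshold); this stand-in, the synthetic data and all function names are assumptions for illustration, and only the fold and grid bookkeeping is meant to mirror the procedure in the text.

```python
import math

def rbf(a, b, gamma):
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))

def fit_stand_in(train, gamma, nu):
    """Stand-in one-class model (NOT an OCSVM): mean kernel similarity to the
    training set, thresholded so that roughly a fraction nu of it is rejected."""
    scores = sorted(sum(rbf(t, s, gamma) for s in train) / len(train) for t in train)
    threshold = scores[min(int(nu * len(train)), len(train) - 1)]
    return lambda y: sum(rbf(y, s, gamma) for s in train) / len(train) >= threshold

def kfold_indices(n, k):
    """Interleaved split of range(n) into k folds; each index is in exactly one fold."""
    return [list(range(i, n, k)) for i in range(k)]

def cv_accuracy(data, gamma, nu, k=10):
    """Fraction of held-out normal-condition points that the model accepts."""
    correct = 0
    for fold in kfold_indices(len(data), k):
        held_out = set(fold)
        train = [d for i, d in enumerate(data) if i not in held_out]
        accept = fit_stand_in(train, gamma, nu)
        correct += sum(1 for i in fold if accept(data[i]))
    return correct / len(data)

def grid_search(data, gammas, nus, k=10):
    """Return (best accuracy, gamma, nu) over the full grid."""
    return max((cv_accuracy(data, g, v, k), g, v) for g in gammas for v in nus)

# Synthetic "normal condition" features (hypothetical): 30 points near a circle.
data = [[math.cos(0.7 * i) * (1 + 0.05 * (i % 3)),
         math.sin(0.7 * i) * (1 + 0.05 * (i % 3))] for i in range(30)]
gammas = [2.0 ** e for e in range(-5, 6, 2)]  # powers of 2
nus = [0.05, 0.1, 0.2]
best = grid_search(data, gammas, nus)
```

Because every point is held out exactly once, the accuracy is computed over the whole training set, which is what makes the scheme attractive when few operating conditions are available.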

In **Figure 5**, we present two exemplar results of the variation of the cross-validation accuracy over a grid of γ and ν values. These results correspond to the cross-validation accuracies obtained by training the OCSVM with the wavelet coefficients dataset after being "compressed" with PCA (right plot) and KPCA (left plot). The cross-validation accuracy was evaluated with ν in the range 0.001 to 0.8 in steps of 0.002, and γ in the range 2<sup>−25</sup> to 2<sup>25</sup> in steps of 2. The choice of this grid for ν was based on the fact that this parameter is bounded, as it represents the upper bound on the fraction of training data that lie on the wrong side of the hyperplane [see more details in Schölkopf et al. (2001)]. In the case of γ, there are no upper and lower limits; therefore, a relatively wider range was selected. In both cases, the steps were determined such that computational costs were kept reasonable. Generally, the choice of grid followed a trial-and-error procedure for the given vibration dataset, to determine suitable boundaries and step size. As can be observed from the contour plots, the grid search allows us to obtain a high validation accuracy when an appropriate combination of γ and ν is chosen. For our dataset, this combination is found mostly at relatively low values of γ. As the value of γ decreases, the pairwise distances between the training data points become less important. Therefore, the decision boundary of the OCSVM becomes more constrained and its shape less flexible, because it gives less weight to these distances. Note that the examples in **Figure 5** were produced with *d* = 100 for *Y* and *D* = 100 for *Y* (see Low-Dimensional Features), with the decomposition level of the WPT *j* = 4 and (for KPCA only) a kernel width γKPCA = 1. Clearly, using KPCA with the RBF kernel, a maximum cross-validation accuracy of around 95% can be obtained, while with standard PCA the classification accuracy of the OCSVM is relatively poor, i.e., around 60%. Hence, there is an advantage in using KPCA over standard PCA for the specific dataset used in this study. This is expected, since KPCA finds non-linear relationships that exist between the data features.

The grid search method for finding "suitable" values for γ and ν offers an advantage when other parameters, e.g., the KPCA kernel width σKPCA, cannot be determined easily. It can be demonstrated that α<sup>ν</sup> can be increased significantly in comparison to a fixed set of default values. The LIBSVM toolbox suggests the default values ν = 0.5 and γ = *d*<sup>−1</sup>. In **Figure 6**, the validation accuracy is shown for different values of the KPCA kernel width σKPCA and number of principal components *d*, for the cases when γ and ν were selected by grid search and when they were given their fixed default values. It is clear from those two plots that the OCSVM parameters γ and ν can be "tuned" such that the validation accuracy is maximized, regardless of the choice of *d* and σKPCA. This observation illustrates the strength of kernel-based methods in general, since the kernel width can have a great influence in describing the training data. Most of the time, choosing this parameter is all that is necessary to obtain a suitable adaptation of our algorithms (Shawe-Taylor and Cristianini, 2004). As can be seen, by choosing different ν and γ combinations each time (according to the grid search procedure), the maximum achievable validation accuracy is always close to 100%. This is a major improvement over the corresponding accuracy obtained using the fixed set of values. Moreover, this demonstrates that it is not so challenging to "tune" a support vector machine, since there are only two parameters that need to be found, and this can be done using the grid search procedure. In contrast, an artificial neural network requires its architecture and the learning rate of gradient descent, among other parameters, to be specified beforehand, which makes the problem of "tuning" the algorithm much more difficult. Nevertheless, the strongest point of a support vector machine is its ability to obtain a globally optimal solution for any specified values of γ and ν, such that its generalization capability is always maximized.

As was shown previously in **Figure 5**, the chosen γ value (from the grid search) was very small. This is true for every case examined, e.g., for different *d* values. For this reason, it can be said that the algorithm generalizes better with a less complex decision boundary. However, the "tuning" of the OCSVM proves to be challenging because the prediction accuracy (using the test data set) is lower than expected, i.e., less than 50%. Most of the errors occurred for data points wrongly accepted as coming from class *A*, whereas in reality they belonged to class *N*. Plausible reasons for the unsatisfactory performance of the OCSVM on the test data set are discussed below:


### **CONCLUSION**

In this study, we have followed a novelty detection scheme for condition monitoring of engines using advanced machine-learning methods, chosen as appropriate for the kind of data analyzed. This resulted in a better description of the main challenges that can be faced when following a data-driven strategy for monitoring engine vibration data. The novelty detection scheme was chosen over a classification approach due to the lack of training data for the various states of an engine's operation, a situation commonly faced in real-life applications. The following steps were examined as fundamental methods for the analysis of the data. A model of normality based on OCSVMs, trained to distinguish normal from novel engine conditions, was developed using data from the engine operating under conditions in which it experienced low vibration amplitudes. This novelty detection method was chosen because the pattern recognition problem is based on building a kernel, which offers a versatility that can support the analysis of more complex data. According to the analysis presented in the study, the penalizing parameter ν and the kernel width γ of the OCSVM heavily influence the validation accuracy. Using a fine grid search for selecting the parameters ν and γ, it is possible to achieve close to 100% validation accuracy, as demonstrated in the results. This is a significant advantage when there is no methodology in place for selecting other parameters, such as the number of principal components used in KPCA. This also outlines one of the strengths of kernel-based methods, namely their adaptability to a given data set. In particular, the RBF kernel proved very effective in describing the data from the engine when an appropriate value of its kernel width γ was chosen.

The limitations of novelty detection approaches in general, and of the one discussed in this study in particular, concern the following points: the training vibration data that can be obtained from engines and the limitations of the specific algorithms examined. For the latter, the selection of ν and γ was discussed, and an independent test data set that included 25% of conditions from novel engine behavior was used to calculate classification accuracies using the ν and γ selected from the grid search. Even though the validation results were exceptionally good and the model did not seem to overfit the data, as the decision boundary was smooth and the number of support vectors relatively small, the classification accuracy using the test data set was unsatisfactory. The largest errors occurred when data points from the healthy engine conditions were incorrectly predicted as being novel. A few possible reasons as to why this can happen were mentioned in the previous part of the study.

To improve the novelty detection scheme presented in this study, some further work is required to train the OCSVM appropriately. For instance, instead of selecting ν and γ using a grid search approach, it is possible to use methods that calculate those parameters in a more principled way using simple geometry. Also, the wavelet transform features extracted from the data might have resulted in a large scattering of data points in the feature space, because there is a high variability in the signals from each engine condition. One way to solve this problem is to examine a new set of features that can provide better clustering of the data points from the healthy engine conditions, so that a smaller and tighter decision boundary can be formed in the feature space. Another suggestion would be the development of new machine-learning algorithms that do not rely on the quality of the training data but can rather adaptively classify the different states/operating conditions of the engine examined.

### **AUTHOR CONTRIBUTIONS**

IM conducted the machine-learning analysis and is the first author of the study. IA supervised the work (conception and review). BK facilitated the conduct of the experiments and the acquisition of the data analyzed. All the authors are accountable for the content of the work.

### **ACKNOWLEDGMENTS**

The authors would like to thank the people from the Low Carbon Combustion Center at The University of Sheffield for conducting the gas turbine engine experiments and for kindly providing the engine vibration data used in this study.

### **FUNDING**

IM is a PhD student funded by a scholarship from the Department of Mechanical Engineering at The University of Sheffield. All the authors gratefully acknowledge funding received from the Engineering and Physical Sciences Research Council (EPSRC) grant EP/N018427/1.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2017 Matthaiou, Khandelwal and Antoniadou. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# **A Discontinuous Unscented Kalman Filter for Non-Smooth Dynamic Problems**

#### *Manolis N. Chatzis <sup>1</sup> \* and Eleni N. Chatzi <sup>2</sup>*

*<sup>1</sup>Department of Engineering Science, The University of Oxford, Oxford, United Kingdom, <sup>2</sup>Department of Civil, Environmental and Geomatic Engineering, ETH Zurich, Zurich, Switzerland*

For a number of applications, including real-time damage diagnostics as well as control, online methods, i.e., methods which may be implemented on-the-fly, are necessary. Within a system identification context, this implies adoption of filtering algorithms, typically of the Kalman or Bayesian class. For engineered structures, damage or deterioration may often manifest in relation to phenomena such as fracture, plasticity, impact, or friction. Despite their different nature, these phenomena share a common denominator: switching behavior upon occurrence of discrete events. Such events include, for example, crack initiation, transitions between elastic and plastic response, or between stick and slide modes. Typically, the state-space equations of such models are non-differentiable at such events, rendering the corresponding systems non-smooth. Identification of non-smooth systems poses greater difficulties than smooth problems of similar computational complexity. Up to a certain extent, this may be attributed to the varying identifiability of such systems, which violates a basic requirement of online Bayesian identification algorithms, thus affecting their convergence for non-smooth problems. Herein, a treatment of this problem is proposed by the authors, termed the Discontinuous *D*-modification, where unidentifiable parameters are acknowledged and temporarily excluded from the problem formulation. In this work, the *D*-modification is illustrated for the case of the Unscented Kalman Filter (*UKF*), resulting in a method termed *DUKF*, which demonstrates superior performance to the conventional, and widely adopted, alternative.

**Keywords: identifiability, non-smooth systems, Kalman filters, UKF, system identification and structural health monitoring, Bayesian methods**

#### *Edited by:*

*Dryver R. Huston, University of Vermont, United States*

#### *Reviewed by:*

*Hamed Ebrahimian, California Institute of Technology, United States; Feng-Liang Zhang, Tongji University, China; Prakash Kripakaran, University of Exeter, United Kingdom*

#### *\*Correspondence:*

*Manolis N. Chatzis, manolis.chatzis@eng.ox.ac.uk*

#### *Specialty section:*

*This article was submitted to Structural Sensing, a section of the journal Frontiers in Built Environment*

> *Received: 28 June 2017; Accepted: 14 September 2017; Published: 19 October 2017*

#### *Citation:*

*Chatzis MN and Chatzi EN (2017) A Discontinuous Unscented Kalman Filter for Non-Smooth Dynamic Problems. Front. Built Environ. 3:56. doi: 10.3389/fbuil.2017.00056*

## **1. INTRODUCTION**

The increasing availability of dense and heterogeneous sensor information has allowed for condition assessment and robust diagnostics of linear and non-linear engineered systems across diverse domains, including the civil, mechanical, and aerospace fields (Kumar and Crassidis, 2007; Worden et al., 2008; Farrar and Worden, 2012). Of particular interest for a number of specialized implementations, including that of robust diagnostics and control, are systems that extend beyond the linear range, commonly attained in response to extreme or unusual loads. Such loads may induce behavior that is non-linear and potentially non-smooth, as in the case of plasticity (Smyth et al., 1999; Ebrahimian et al., 2017), impact (Wriggers, 1991), fracture (Kakouris and Triantafyllou, 2017), and sliding (Giannakopoulos, 1989). The adequate modeling of such systems may be achieved by application of offline (Papadimitriou and Papadioti, 2013; Au and Zhang, 2016; Zhang and Au, 2016) or of (near) "real-time" identification schemes, the latter typically relying on adoption of Bayesian-type filters (Kalman, 1963; Ljung and Glad, 1994). The task of online identification is a non-trivial one, especially when conjoined with identification of system parameters, which are often not known *a priori* or are highly uncertain (Astroza et al., 2017).

When both unmeasured system states and system parameters are to be estimated, a so-called problem of joint state and parameter identification is posed, often expressed as a non-linear system identification case (Chatzi and Smyth, 2014). In previous years, online non-linear estimation was for the most part achieved by means of the Extended Kalman Filter (*EKF*) (Mariani and Corigliano, 2005; Ding et al., 2014; Ebrahimian et al., 2015). However, more recently, methods which avoid linearization of the state equations, such as the Ensemble Kalman Filter (*EnKF*) (Huang et al., 2017), the Unscented Kalman Filter (*UKF*) (Julier and Uhlmann, 1997; Chatzi et al., 2010; Omrani et al., 2013; Al-Hussein and Haldar, 2015), and the Particle Filter (Chatzi and Smyth, 2012; Eftekhar Azam et al., 2012), have gained in popularity due to their flexibility in treating non-linear dynamics. Of the aforementioned techniques, the UKF in particular employs a reduced number of particles, termed the Sigma-Points, maintaining a rapid and online estimation. This is the main driver behind its selection as the method of choice in this paper.

Kalman-type filters, and Bayesian filters in general, rest on the fundamental assumption that the dynamic states and the time-invariant parameters of a system are fully observable (Kalman, 1963; Hermann and Krener, 1977; Diop and Fliess, 1991) and identifiable (Walter, 1982). While this holds true for smooth systems, the same does not apply to their non-smooth counterparts, i.e., systems that are described by non-differentiable state-space equations (Chatzis et al., 2014; Olivier and Smyth, 2017a). Nonetheless, the simulation and tracking of non-smooth systems is essential for numerous engineering problems, since these are by default tied to manifestations of damage and failure.

Previous work of the authors (Chatzis et al., 2014) overviews the classification of systems in accordance with their observability and identifiability properties. Violation of these properties forms a salient obstacle for Bayesian-type online identification algorithms, which are expected to diverge for unobservable states or parameters (Liu et al., 1996), an effect that is naturally amplified for non-linear systems. The methods presented in Chatzis et al. (2014) may be implemented to infer the observability and identifiability of a system's states and parameters, and the way in which these evolve during a system's response under specific loads. This information may then be fed into the online estimators (filters) in real time to ensure convergence, thereby improving estimation.

In this work, and following original developments introduced in Chatzis et al. (2017), a modified version of the standard *UKF* is proposed. The key to the formulation lies in adoption of a modular state-space formulation, evaluation of the observability within each time step of the analysis, and selection of an appropriate subspace of the full state vector to be used by the *UKF*. The method is termed the Discontinuous *D*-modification to the *UKF*, i.e., the *DUKF*. Examples are presented illustrating the performance of the method for models used in the context of plasticity, which are, however, general enough to be applied in several other applications of non-smooth problems. The examples reveal a consistently superior performance of the *D*-modification, further highlighting the effects of the special observability properties of non-smooth problems. The proposed alternative opens up the way for robust online tracking and control of a variety of engineered systems including rocking (Chatzis and Smyth, 2012a,b, 2013; Greenbaum et al., 2015), energy (Alavi et al., 2015), and biological systems (Villaverde et al., 2016; Villaverde and Banga, 2017).

### **2. NON-SMOOTH DYNAMICAL SYSTEMS**

A non-linear system with state variables **x***t*, time-invariant parameters *θ*, known input vector **u**, and measurement vector **y** can in general be described by the following system of equations:

$$\dot{\mathbf{x}}\_t = E(\mathbf{x}\_t, \boldsymbol{\theta}, \mathbf{u}), \qquad \dot{\boldsymbol{\theta}} = \mathbf{0}, \qquad \mathbf{y} = G(\mathbf{x}\_t, \boldsymbol{\theta}, \mathbf{u}) \tag{1}$$

where *E* and *G* designate the non-linear state-space and measurement functions, respectively. For uncertain systems, i.e., systems whose time-invariant parameters are uncertain or unknown, the above problem may be recast into one of joint state and parameter identification. In this case, the state-space and measurement equations of formulation (1) may be written in an augmented form by introducing the state vector **x** = [**x***t*, *θ*]:

$$
\dot{\mathbf{x}} = e(\mathbf{x}, \mathbf{u}), \qquad \mathbf{y} = g(\mathbf{x}, \mathbf{u}) \tag{2}
$$

In the latter representation, one treats both the dynamic states and the parameters of the system as states of the augmented system. A dynamical system is further characterized as analytic or smooth when the state-space equations (2) are continuous and infinitely differentiable. Very often, however, the state-space equations of physical models are not analytic, due to discontinuities either in the state-space equations or in their derivatives. It should also be highlighted that smoothness requires that the equations are infinitely differentiable at least at all the realizations of the states encountered along the trajectory of the system. In this paper, we deal with models for which the state-space equations are continuous, but not differentiable, and can be separated into smooth, i.e., continuous and infinitely differentiable, branches of the form:

$$\dot{\mathbf{x}} = \begin{cases} e\_1(\mathbf{x}), \mathbf{x} \in \mathcal{R}\_1^n \\ \vdots \\ e\_i(\mathbf{x}), \mathbf{x} \in \mathcal{R}\_i^n \\ \vdots \\ e\_l(\mathbf{x}), \mathbf{x} \in \mathcal{R}\_l^n \end{cases} \tag{3}$$

where *e*<sub>*i*</sub>(**x**) is an analytic set of functions within ℛ<sub>*i*</sub><sup>*n*</sup>. It should be noted that at a specific time instance the state is described by a given realization, corresponding to a single branch of equation (3).
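A minimal sketch of such a branched state-space model is given below, assuming a single-degree-of-freedom oscillator with a piecewise-linear (non-hysteretic) restoring force and an augmented state [displacement, velocity, *k*1, *k*2]. The model, the constants `M`, `C_DAMP`, `X_Y` and all function names are illustrative assumptions and are not the plasticity models used later in the paper.

```python
M, C_DAMP, X_Y = 1.0, 0.1, 1.0  # assumed mass, damping and branch threshold

def branch_index(x):
    """Select the active smooth branch from the displacement x[0]."""
    return 0 if abs(x[0]) <= X_Y else 1

def e_branch0(x, u):
    """Branch 0 (|disp| <= X_Y): restoring force k1 * disp; k2 does not appear."""
    disp, vel, k1, k2 = x
    return [vel, (u - C_DAMP * vel - k1 * disp) / M, 0.0, 0.0]

def e_branch1(x, u):
    """Branch 1 (|disp| > X_Y): force continues with slope k2 beyond the threshold."""
    disp, vel, k1, k2 = x
    sign = 1.0 if disp > 0 else -1.0
    force = sign * (k1 * X_Y + k2 * (abs(disp) - X_Y))
    return [vel, (u - C_DAMP * vel - force) / M, 0.0, 0.0]

BRANCHES = [e_branch0, e_branch1]

def state_derivative(x, u):
    """Right-hand side of the augmented system; parameter derivatives are zero."""
    return BRANCHES[branch_index(x)](x, u)

# The right-hand side is continuous at the threshold, but its slope with
# respect to displacement jumps from -k1/M to -k2/M.
acc_inside = state_derivative([1.0, 0.0, 4.0, 1.0], 0.0)[1]
acc_outside = state_derivative([1.0 + 1e-9, 0.0, 4.0, 1.0], 0.0)[1]
```

Note that *k*2 is absent from branch 0 in this toy model, so no measurement taken while that branch is active can carry information about it; this is the kind of branch-dependent identifiability discussed in the following sections.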

Very often the study of the behavior of dynamic systems results in a discretized description of the problem where nodes are connected with each other with components, as shown in **Figure 1**. Often the nodes correspond to point masses with the components corresponding to elements that resist the relative displacements and velocities of the masses, or in the case of Finite Element analysis the nodes are the finite element nodes and the components correspond to the finite elements (Zienkiewicz and Taylor, 2005).

The investigated system may be expressed as a combination of individual components *C<sub>j</sub>*, *j* = 1, . . ., *N<sub>c</sub>*, where *N<sub>c</sub>* is the overall number of components. For each component *C<sub>j</sub>*, a subset **x**<sub>*C<sub>j</sub>*</sub> = [**x**<sub>*C<sub>j</sub>t*</sub>, *θ*<sub>*C<sub>j</sub>*</sub>] of the overall states of the system **x** acts as input. The outputs of *C<sub>j</sub>*, **P**<sub>*C<sub>j</sub>*</sub>, are connected to the inputs according to a set of equations of the form:

$$
\dot{\mathbf{P}}\_{C\_j} = E\_{C\_j}(\mathbf{x}\_{C\_j}, \dot{\mathbf{x}}\_{C\_j}) \tag{4}
$$

where *E*<sub>*C<sub>j</sub>*</sub> are generally non-smooth equations, which can further be separated into smooth branches, e.g., for the *k*th branch of model *C<sub>j</sub>*:

$$
\dot{\mathbf{P}}\_{C\_j} = e\_{C\_j^k}(\mathbf{x}\_{C\_j}, \dot{\mathbf{x}}\_{C\_j}) \tag{5}
$$

where *e*<sub>*C<sub>j</sub><sup>k</sup>*</sub>(**x**<sub>*C<sub>j</sub>*</sub>) is an analytic set of functions. As the system evolves dynamically over time, it is expected to shift between the individual branches of a component *C<sub>j</sub>*. This transition between branches will be referred to as a dynamic event, and the corresponding time instance as the time of the event. A set of transition equations *g*<sub>*C<sub>j</sub><sup>k→l</sup>*</sub>(**x**) = 0 describes the transition from branch *C<sub>j</sub><sup>k</sup>* to *C<sub>j</sub><sup>l</sup>*. Having determined the active smooth branch of a component, the smooth branch of the overall system from equation (3) can be easily chosen.
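One simple way to locate dynamic events in sampled data is to monitor the sign of a transition function between consecutive time steps. The sketch below assumes a hypothetical transition function |displacement| − *x*<sub>*y*</sub> and linear interpolation between samples; neither is prescribed by the text, and the trajectory is synthetic.

```python
import math

def transition_function(x, x_y=1.0):
    """Hypothetical g(x) = |displacement| - x_y; a zero crossing marks a branch change."""
    return abs(x[0]) - x_y

def find_events(times, states, g=transition_function):
    """Return (event_time, new_branch) pairs; each zero of g is located by
    linear interpolation between the two samples that bracket it."""
    events = []
    for k in range(1, len(times)):
        g0, g1 = g(states[k - 1]), g(states[k])
        if (g0 <= 0.0) != (g1 <= 0.0):
            frac = g0 / (g0 - g1)
            t_event = times[k - 1] + frac * (times[k] - times[k - 1])
            events.append((t_event, 1 if g1 > 0.0 else 0))
    return events

# Synthetic trajectory: displacement 2 sin(t) sampled every 0.01 s up to t = 3.14 s.
times = [0.01 * k for k in range(315)]
states = [[2.0 * math.sin(t), 2.0 * math.cos(t)] for t in times]
events = find_events(times, states)
```

For this trajectory the displacement exceeds the threshold near *t* = π/6 and returns below it near *t* = 5π/6, so two events are reported, the first entering branch 1 and the second returning to branch 0.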

### **2.1. Observability of Non-Smooth Dynamical Systems**

The observability of non-smooth systems and the points of differentiation to their smooth counterparts have been discussed in Chatzis et al. (2014, 2017). It should be noted that the notions of observability and identifiability used in these papers and in this work refer to the ability to distinguish the states and parameters from their neighbors at a specific time instance. The method proposed in Chatzis et al. (2014) relies on the study of the observability of each of the smooth subsystems of equation (3), resulting in a characterization of each system branch as either observable, when all associated states are observable, and hence the parameters are identifiable, or as unobservable, when not all states are observable, and hence not all parameters are necessarily identifiable. In general, the separation of an analytic system's states into an observable and an unobservable set requires a non-linear transformation (Persis and Isidori, 2000). However, for the systems examined herein, it is further assumed that for each of the subsystems *i* of equation (3), we can further separate the state vector **x** into its observable and a minimum number of unobservable components, denoted as **x**<sup>*o<sub>i</sub>*</sup> and **x**<sup>*u<sub>i</sub>*</sup>, in a straightforward manner.

If the union of the observable states from all branches is a strict subset of the state vector $\mathbf{x}$ ($\cup\_{i=1}^{l} \mathbf{x}^{oi} \subset \mathbf{x}$), i.e., does not contain at least one of the components of $\mathbf{x}$, then it may be concluded that these excluded states are unobservable and may not be adequately tracked via a System Identification algorithm. If, on the other hand, the union of the observable components results in the full state vector, $\cup\_{i=1}^{l} \mathbf{x}^{oi} = \mathbf{x}$, then each component of $\mathbf{x}$ could potentially be identified within the smooth branch in which it is observable. Hence, if the response of the system includes at least one branch for which a parameter is identifiable, then a system identification algorithm could potentially succeed in identifying the value of that parameter. In this paper, the latter case is studied, i.e., systems for which the parameters of the model may be inferred via an appropriate system identification method.

For a component $C\_j$ whose smooth branch *k* is defined by equation (5), it is of further interest to carry out an observability analysis in which all of the dynamic states and their derivatives are assumed to be measured inputs and $\mathbf{P}\_{C\_j}$ are the measured outputs, i.e., $\mathbf{u} = [\mathbf{x}\_{C\_j}^{t}, \dot{\mathbf{x}}\_{C\_j}^{t}]$ and $\mathbf{y} = \mathbf{y}\_I = [\mathbf{P}\_{C\_j}]$. By studying the observability of this system, the parameters $\theta\_{C\_j}$ can be separated into identifiable parameters and a minimum set of unidentifiable parameters, $[\theta\_{C\_j}^{o}, \theta\_{C\_j}^{u}]|\_{y\_I}$. Equation (5) can then be expressed only in terms of $[\mathbf{x}\_{C\_j}^{t}, \dot{\mathbf{x}}\_{C\_j}^{t}, \theta\_{C\_j}^{o}]$:

$$\dot{\mathbf{P}}\_{C\_j} = e\_{C\_j^k}\left(\mathbf{x}\_{C\_j}^{t}, \dot{\mathbf{x}}\_{C\_j}^{t}, \theta\_{C\_j}^{o}\right) \tag{6}$$

Due to the absence of $\theta\_{C\_j}^{u}|\_{y\_I}$ from equation (6), for any measurement setup which does not directly involve measurement of $\theta\_{C\_j}^{u}|\_{y\_I}$, those parameters directly contribute to the unidentifiable states $\mathbf{x}^{ui}$ of the corresponding smooth branch of equation (3). If a parameter is shared between different components, it contributes to the unidentifiable $\mathbf{x}^{ui}$ only if it belongs to $\theta\_{C\_j}^{u}|\_{y\_I}$ for all of them. It should, however, be noted that whether $[\mathbf{x}\_{C\_j}^{t}, \theta\_{C\_j}^{o}|\_{y\_I}]$ contribute to $\mathbf{x}^{ui}$ depends on the observability of the system under the actual measurement setup used.

Hence, this component analysis may often pinpoint part of the unidentifiable parameters, although it ought to further be paired with an observability analysis, as discussed in Chatzis et al. (2014).

### **3. UNSCENTED KALMAN FILTER**

The UKF addresses non-linear systems by approximating the state as a Gaussian random variable (GRV), represented by a set of carefully chosen deterministic points known as sigma points. This section only provides a basic overview of the filter equations; more details can be found in Julier and Uhlmann (1997), Wan and Van Der Merwe (2000), and previous work of the authors: Chatzi and Smyth (2009), Chatzi et al. (2010), and Chatzis et al. (2015).

Consider the general dynamical system described by the following equations (7).

$$\mathbf{x}\_{k} = f(\mathbf{x}\_{k-1}, \mathbf{u}\_{k-1}) + \mathbf{w}\_{k-1}, \qquad \mathbf{y}\_{k} = h(\mathbf{x}\_{k}, \mathbf{u}\_{k}) + \mathbf{v}\_{k} \tag{7}$$

where $\mathbf{w}\_k$ is the process noise and $\mathbf{v}\_k$ is the observation noise, both of which are considered to be white Gaussian noise processes with covariance matrices $\mathbf{Q}\_k$ and $\mathbf{R}\_k$, respectively.

Given the state vector at step $k-1$, and assuming that it has mean $\hat{\mathbf{x}}\_{k-1}$ and covariance $\mathbf{P}\_{k-1}$, the statistics of $\mathbf{x}\_k$ can be calculated using the Unscented Transformation, i.e., by computing the set of sigma points $\chi\_k^i$ with associated weights $W\_i$. The steps of the method are summarized in **Table 1**.

As can be inferred from the steps outlined in **Table 1**, the *UKF* algorithm does not discern between observable/unobservable states and identifiable/unidentifiable parameters. The overall convergence of the method is ensured only when a parameter converges faster during identifiable time steps than it diverges during unidentifiable steps.

**TABLE 1** | The steps of the *UKF* algorithm.

**Time update**

3. Propagation of the sigma points through the system model:

$$\chi\_{k|k-1}^{i} = f\left(\chi\_{k-1}^{i}, \chi\_{k-1}^{w,i}\right), \quad i = 0, \ldots, 2L$$

4. Predicted mean and covariance:

$$\hat{\mathbf{x}}\_{k|k-1} = \sum\_{i=0}^{2L} W\_i^m \chi\_{k|k-1}^{i}, \qquad \mathbf{P}\_{k|k-1} = \sum\_{i=0}^{2L} W\_i^c \left[\chi\_{k|k-1}^{i} - \hat{\mathbf{x}}\_{k|k-1}\right]\left[\chi\_{k|k-1}^{i} - \hat{\mathbf{x}}\_{k|k-1}\right]^T$$

**Measurement steps**

5. Measurement mean and covariance matrices:

$$\hat{\mathbf{y}}\_{k|k-1} = \sum\_{i=0}^{2L} W\_i^m Y\_{k|k-1}^{i}, \quad \text{where} \quad Y\_{k|k-1}^{i} = h\left(\chi\_{k|k-1}^{i}, \chi\_{k-1}^{\eta,i}\right)$$

$$\mathbf{P}\_k^{yy} = \sum\_{i=0}^{2L} W\_i^c \left[Y\_{k|k-1}^{i} - \hat{\mathbf{y}}\_{k|k-1}\right]\left[Y\_{k|k-1}^{i} - \hat{\mathbf{y}}\_{k|k-1}\right]^T, \qquad \mathbf{P}\_k^{xy} = \sum\_{i=0}^{2L} W\_i^c \left[\chi\_{k|k-1}^{i} - \hat{\mathbf{x}}\_{k|k-1}\right]\left[Y\_{k|k-1}^{i} - \hat{\mathbf{y}}\_{k|k-1}\right]^T$$

**Kalman updating**

6. Calculation of the Kalman gain:

$$\mathbf{K}\_k = \mathbf{P}\_k^{xy} \left(\mathbf{P}\_k^{yy}\right)^{-1}$$

7. Improvement of the predictions of the state and covariance using the latest observations:

$$\hat{\mathbf{x}}\_k = \hat{\mathbf{x}}\_{k|k-1} + \mathbf{K}\_k \left(\mathbf{y}\_k - \hat{\mathbf{y}}\_{k|k-1}\right), \qquad \mathbf{P}\_k = \mathbf{P}\_{k|k-1} - \mathbf{K}\_k \mathbf{P}\_k^{yy} \mathbf{K}\_k^T$$
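The time-update, measurement, and Kalman-updating steps of **Table 1** can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation: it assumes additive process and measurement noise (covariances `Q` and `R`) instead of the augmented noise sigma points of **Table 1**, and the scaling parameters `alpha`, `beta`, `kappa` and the function names are hypothetical choices.

```python
import numpy as np

def sigma_points(x, P, alpha=1.0, beta=0.0, kappa=1.0):
    """Return the 2L+1 sigma points (as rows) and the mean/covariance weights."""
    L = x.size
    lam = alpha**2 * (L + kappa) - L
    S = np.linalg.cholesky((L + lam) * P)     # S @ S.T = (L + lam) * P
    chi = np.vstack([x, x + S.T, x - S.T])    # x plus/minus the columns of S
    Wm = np.full(2 * L + 1, 0.5 / (L + lam))
    Wc = Wm.copy()
    Wm[0] = lam / (L + lam)
    Wc[0] = Wm[0] + (1.0 - alpha**2 + beta)
    return chi, Wm, Wc

def ukf_step(x, P, y, f, h, Q, R):
    """One UKF step with additive noise (a simplifying assumption)."""
    chi, Wm, Wc = sigma_points(x, P)
    chi_pred = np.array([f(c) for c in chi])          # step 3: propagate
    x_pred = Wm @ chi_pred                            # step 4: predicted mean
    dX = chi_pred - x_pred
    P_pred = dX.T @ (Wc[:, None] * dX) + Q            # step 4: predicted covariance
    Y = np.array([h(c) for c in chi_pred])            # step 5: measurement points
    y_pred = Wm @ Y
    dY = Y - y_pred
    Pyy = dY.T @ (Wc[:, None] * dY) + R
    Pxy = dX.T @ (Wc[:, None] * dY)
    K = Pxy @ np.linalg.inv(Pyy)                      # step 6: Kalman gain
    x_new = x_pred + K @ (y - y_pred)                 # step 7: update
    P_new = P_pred - K @ Pyy @ K.T
    return x_new, P_new
```

For a linear model with `Q = 0` the unscented transformation is exact, so this sketch reproduces the classical Kalman filter update, which offers a convenient sanity check.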

### **4. DISCONTINUOUS UNSCENTED KALMAN FILTER** *DUKF*

As stated in Section 2.1, at a specific time instance where the system lies in a specific smooth branch *i*, only part $\mathbf{x}^{oi}$ of the state vector may be observable. Therefore, the *UKF* algorithm is expected to converge only for that observable part $\mathbf{x}^{oi}$. The predictions furnished during this interval by the *UKF* for the unobservable part $\mathbf{x}^{ui}$, which in this work is assumed to consist of the unidentifiable parameters, are non-optimal, and it is quite likely that during these time intervals the values of $\mathbf{x}^{ui}$ diverge from the real solutions. In fact, the resulting estimates are expected to be inferior to the estimates of these parameters at the start of the interval. Hence, it is argued that during such intervals the optimal choice is to update only the observable part of the state.

To introduce the computational part of the *D*-modification, a row-switching transformation matrix $\mathbf{T}\_i$ is defined such that:

$$\mathbf{T}\_i \mathbf{x} = \begin{Bmatrix} \mathbf{x}^{oi} \\ \mathbf{x}^{ui} \end{Bmatrix} = \mathbf{x}' \tag{8}$$

In other words, pre-multiplying $\mathbf{x}$ with $\mathbf{T}\_i$ results in a rearranged vector $\mathbf{x}'$ where the first $n\_{oi}$ components are observable and the remaining $n\_{ui}$ components are unidentifiable. As the order among the observable components, and likewise among the unidentifiable parameters, is not of importance, any of the $\mathbf{T}\_i$ matrices that satisfy equation (8) may be chosen. In all cases, these are by definition Boolean matrices containing only one non-zero element per row, which is equal to one, and they satisfy the property $\mathbf{T}\_i^{-1} = \mathbf{T}\_i^T$. Any vector $\mathbf{v}$ and matrix $\mathbf{A}$ whose rows (and, for the latter, columns) correspond to the elements of $\mathbf{x}$ may be brought to the order of $\mathbf{x}'$ with the following operations:

$$\mathbf{T}\_i \mathbf{v} = \begin{Bmatrix} \mathbf{v}^{oi} \\ \mathbf{v}^{ui} \end{Bmatrix} \qquad \mathbf{T}\_i \mathbf{A} \ \mathbf{T}\_i^T = \begin{bmatrix} \mathbf{A}^{oo} & \left(\mathbf{A}^{uo}\right)^T \\ \mathbf{A}^{uo} & \mathbf{A}^{uu} \end{bmatrix} \tag{9}$$

while for an $n \times m$ matrix $\mathbf{B}$, whose rows only correspond to the order of the elements in $\mathbf{x}$, the following operation reorders its elements to the order of $\mathbf{x}'$:

$$\mathbf{T}\_i \mathbf{B} = \left[\frac{\mathbf{B}^o}{\mathbf{B}^u}\right] \tag{10}$$
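The reordering operations of equations (8)–(10) can be illustrated with a short NumPy sketch. The helper name `permutation_matrix`, the four-entry example vector, and the choice of which entries are observable are all hypothetical and serve only to show the mechanics.

```python
import numpy as np

def permutation_matrix(observable_idx, n):
    """Boolean row-switching matrix T that places the observable entries first."""
    observable_idx = list(observable_idx)
    rest = [i for i in range(n) if i not in observable_idx]
    order = observable_idx + rest
    T = np.zeros((n, n))
    T[np.arange(n), order] = 1.0      # row r selects original entry order[r]
    return T

x = np.array([10.0, 20.0, 30.0, 40.0])
T = permutation_matrix([0, 2], n=4)   # entries 0 and 2 are assumed observable
x_prime = T @ x                       # equation (8): reordered vector
A = np.arange(16.0).reshape(4, 4)
A_prime = T @ A @ T.T                 # equation (9): rows and columns reordered
B = np.arange(8.0).reshape(4, 2)
B_prime = T @ B                       # equation (10): rows only
```

Because `T` has exactly one unit entry per row and column, `T @ T.T` is the identity, which is the property $\mathbf{T}\_i^{-1} = \mathbf{T}\_i^T$ used to map results back to the original ordering.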

It is now straightforward to separate a vector or matrix into its observable ($o$), unobservable ($u$), and cross ($uo$) components. The *DUKF* follows the *UKF* algorithm for steps 1–5, as shown in **Table 1**. The *DUKF* structure is nonetheless differentiated from the standard *UKF* as follows: the *DUKF* updates only the observable components of the estimates of the mean vector and covariance matrix during the Kalman updating step, $\hat{\mathbf{x}}\_{k|k-1} \rightarrow \hat{\mathbf{x}}\_{k|k}$ and $\mathbf{P}\_{k|k-1} \rightarrow \mathbf{P}\_{k|k}$, using an appropriate Kalman gain matrix defined on the basis of the observable components. The unobservable parts are retained invariant, while the cross terms of the covariance are updated using the Schmidt-Kalman Filter (Schmidt, 1966; Novoselov et al., 2005). The steps of the *DUKF* algorithm are summarized in detail in **Table 2**.

The extra steps entailed by the *D*-modification in comparison to the standard *UKF* are simple operations involving multiplications with the transformation matrix $\mathbf{T}\_i$. Hence, the extra computations are of minimal cost, and the overall computational cost of the *DUKF* is in fact similar to, or lower than, that of the *UKF*. This may be attributed to the fact that the Kalman updating steps involve multiplications of matrices of lower dimension than those of the original method. An additional advantage of the method over the Discontinuous Extended Kalman Filter, *DEKF*, previously introduced by the authors, lies in that it does not require the detection of events, and hence there are no constraints on the algorithm used for the time updating step. This implies that the method can be directly paired with any existing dynamic or finite element software for the time updating step.
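A possible form of the modified Kalman updating step is sketched below. Since the detailed steps of **Table 2** are not reproduced here, the sketch assumes the standard Schmidt-Kalman form: the gain rows of the unobservable components are set to zero and the covariance is updated with the expression valid for an arbitrary gain. The function name `partial_update` and its argument layout are hypothetical.

```python
import numpy as np

def partial_update(x_pred, P_pred, Pxy, Pyy, innovation, T, n_o):
    """Update only the first n_o (observable) entries of the reordered state."""
    xp = T @ x_pred                    # reorder the mean, equation (8)
    Pp = T @ P_pred @ T.T              # reorder the covariance, equation (9)
    Pxy_p = T @ Pxy                    # reorder the cross-covariance, equation (10)
    K = np.zeros_like(Pxy_p)
    K[:n_o] = Pxy_p[:n_o] @ np.linalg.inv(Pyy)   # gain for observable rows only
    xp = xp + K @ innovation
    # Covariance update for a general (sub-optimal) gain: the unobservable
    # block stays invariant and the cross terms follow the Schmidt-Kalman form.
    Pp = Pp - K @ Pxy_p.T - Pxy_p @ K.T + K @ Pyy @ K.T
    return T.T @ xp, T.T @ Pp @ T      # back to the original ordering
```

With this form, the entries of the mean and the diagonal covariance block that belong to the unobservable components are returned unchanged, while the cross-covariance between observable and unobservable components is still updated.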

### **4.1. Estimating the Active Smooth Branch**

An important step of the *DUKF* algorithm is the separation of the states into observable and unobservable, as indicated in **Table 2**. This is straightforward once the smooth branch in which the system lies is known. As discussed earlier, this branch may more easily be constructed by choosing the active smooth branch for each component $C\_j$ of equation (5). If the true value of the states is known, this can be done by evaluating the set of functions $g\_{C\_j}(\mathbf{x})$ that define the transition equations between branches, $g\_{C\_j^{k \rightarrow l}}(\mathbf{x}) = 0$. In this paper, the branch has to be estimated by evaluating a related set of functions $\hat{g}\_{C\_j}(\mathbf{x})$ at $\mathbf{x} = \hat{\mathbf{x}}\_{k|k-1}$.


### **5. APPLICATIONS**

### **5.1. Non-Linear Hysteretic Bouc–Wen Model**

In this example, the hysteretic system illustrated in **Figure 2** comprising a Bouc–Wen type spring of mass-normalized stiffness *k* and linear damping *c* is examined.

The relative displacement *x* of the body with respect to the ground is considered as the measured quantity. The observability of this system was examined in Chatzis et al. (2014). The equations of motion are formulated as:

$$\begin{aligned} \ddot{\mathbf{x}} + k\,r + c\dot{\mathbf{x}} &= -\ddot{\mathbf{x}}\_{\mathbf{g}} \\ \dot{\mathbf{r}} &= \dot{\mathbf{x}} - \beta \, \left| \dot{\mathbf{x}} \right| \left| r \right|^{\nu - 1} \, r - \gamma \dot{\mathbf{x}} \left| r \right|^{\nu} \end{aligned} \tag{11}$$

where *k* is the stiffness of the spring, *c* the damping coefficient, and *β*, *γ*, and *ν* are the parameters of the Bouc–Wen model. The term $\dot{r}$ can be rewritten as $\dot{r} = \dot{x} - \dot{x}\_s$, where $x\_s$ is the displacement of the slider and $\dot{x}\_s = \beta |\dot{x}| |r|^{\nu-1} r + \gamma \dot{x} |r|^{\nu}$. Hence, *r* can be thought of as the displacement of the elastic spring. As stated in Chatzis et al. (2014), the dynamic equations of motion of the system can be separated into four smooth branches:

$$\begin{aligned} (A) &: \dot{r} = \dot{x} - \beta \dot{x} \, r^{\nu} - \gamma \dot{x} \, r^{\nu} \,, \text{for } \dot{x} > 0 \& \; r > 0\\ (B) &: \dot{r} = \dot{x} + \beta \dot{x} \, r^{\nu} - \gamma \dot{x} \, r^{\nu} \,, \text{for } \dot{x} < 0 \& \; r > 0\\ (C) &: \dot{r} = \dot{x} + \beta \dot{x} \, (-r)^{\nu} - \gamma \dot{x} \, (-r)^{\nu} \,, \text{for } \dot{x} > 0 \& \; r < 0\\ (D) &: \dot{r} = \dot{x} - \beta \dot{x} \, (-r)^{\nu} - \gamma \dot{x} \, (-r)^{\nu} \,, \text{for } \dot{x} < 0 \& \; r < 0 \end{aligned} \tag{12}$$

Within these branches the system is not fully observable, but the equations may be rewritten in the form:

$$\begin{aligned} (A) &: \dot{r} = \dot{x} - \Delta\_1 \dot{x} \, r^\nu, \text{for } \dot{x} > 0 \& \; r > 0\\ (B) &: \dot{r} = \dot{x} + \Delta\_2 \dot{x} \, r^\nu, \text{for } \dot{x} < 0 \& \; r > 0\\ (C) &: \dot{r} = \dot{x} + \Delta\_2 \dot{x} \, (-r)^\nu, \text{for } \dot{x} > 0 \& \; r < 0\\ (D) &: \dot{r} = \dot{x} - \Delta\_1 \dot{x} \, (-r)^\nu, \text{for } \dot{x} < 0 \& \; r < 0 \end{aligned} \tag{13}$$

where $\Delta\_1 = \beta + \gamma$ and $\Delta\_2 = \beta - \gamma$. The augmented state vector is hence defined as $[x, \dot{x}, r, k, c, \nu, \Delta\_1, \Delta\_2]$. In this new representation, within each branch all of the states $(x, \dot{x}, r, k, c, \nu)$ are observable, while only one of the parameters $\Delta\_1$ and $\Delta\_2$ is identifiable, depending on $\operatorname{sign}(\dot{x} r)$. When $\dot{x} r \geq 0$, i.e., within branches (A, D), $\Delta\_1$ is identifiable, while when $\dot{x} r < 0$, i.e., within branches (B, C), $\Delta\_2$ is identifiable.
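The equivalence between the original rate equation (11) and the branch-wise form of equation (13) can be checked numerically. The sketch below is illustrative only; the function names are hypothetical, and the values $\beta = 4000$ and $\gamma = 2000$ are chosen so that $\Delta\_1 = 6000$ and $\Delta\_2 = 2000$.

```python
def r_dot_full(x_dot, r, beta, gamma, nu):
    """Equation (11): rate of the hysteretic displacement r."""
    return (x_dot
            - beta * abs(x_dot) * abs(r) ** (nu - 1) * r
            - gamma * x_dot * abs(r) ** nu)

def r_dot_branch(x_dot, r, delta1, delta2, nu):
    """Equation (13): the same rate written per smooth branch."""
    if x_dot * r >= 0:                               # branches (A) and (D)
        return x_dot - delta1 * x_dot * abs(r) ** nu
    return x_dot + delta2 * x_dot * abs(r) ** nu     # branches (B) and (C)

def identifiable_parameter(x_dot, r):
    """Name of the Delta parameter that is identifiable in the active branch."""
    return "Delta1" if x_dot * r >= 0 else "Delta2"

beta, gamma, nu = 4000.0, 2000.0, 2.0
delta1, delta2 = beta + gamma, beta - gamma
samples = [(0.5, 0.01), (-0.5, 0.01), (0.5, -0.01), (-0.5, -0.01)]
differences = [
    abs(r_dot_full(v, r, beta, gamma, nu) - r_dot_branch(v, r, delta1, delta2, nu))
    for v, r in samples
]
```

Each entry of `samples` falls in one of the four branches (A) to (D), so `differences` compares the two formulations once per branch.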

The previous result can also be demonstrated in terms of the Bouc–Wen spring, which can be considered to be the non-smooth component $C\_1$, for which equations (13) correspond to the form of equation (5) with $\mathbf{P}\_{C\_1} = r$ and $\mathbf{x}\_{C\_1}^{t} = [\dot{x}]$. For such a component, $\theta\_{C\_1}^{o} = \Delta\_1$, $\theta\_{C\_1}^{u} = \Delta\_2$ when $\dot{x} r \geq 0$, and $\theta\_{C\_1}^{o} = \Delta\_2$, $\theta\_{C\_1}^{u} = \Delta\_1$ when $\dot{x} r < 0$. For the given measurement setup, the observability analysis of the overall system shows that there are no further unidentifiable parameters or unobservable states. To complete the description of the method, the transformation matrices are defined:

1. $\dot{x} r \geq 0$:

$$\mathbf{T}\_1 = \mathbb{I}\_{8 \times 8} \tag{14}$$

2. else:

$$\mathbf{T}\_2 = \begin{bmatrix} \mathbb{I}\_{6 \times 6} & \{\mathbf{0}\}\_{6 \times 2} \\ \{\mathbf{0}\}\_{2 \times 6} & \begin{matrix} 0 & 1 \\ 1 & 0 \end{matrix} \end{bmatrix} \tag{15}$$

A system with mass-normalized stiffness and damping terms $k = 1000$ [1/s<sup>2</sup>] and $c = 2 \sqrt{k} \cdot 5\%$ [1/s], respectively, and Bouc–Wen parameters $\nu = 2$, $\Delta\_1 = 6000$, $\Delta\_2 = 2000$, initially at rest, is subjected to the input ground motion shown in **Figure 3**. The measured signal is assumed to be the displacement of the system *x*.

#### 5.1.1. The Effect of Noise

For the parametric analysis that follows, the initial guess is assumed as $k\_0 = k$, $c\_0 = c$, $\nu = 3$, $\Delta\_1/2000 = 2$, and $\Delta\_2/2000 = 2$. Initially, the effect of different realizations of the noise vectors on the convergence of the algorithms is studied. To that end, 1000 different sets of random process and measurement noise vectors are generated, corresponding to a noise-to-signal *RMS* ratio of 5%. The noisy inputs and outputs are then used in the two methods, *DUKF* and *UKF*, which assume corresponding covariance matrices for the process and measurement noise, and the mean error of the final estimates of the Bouc–Wen (BW) parameters is calculated. The cumulative distribution of the mean BW parameter errors is shown in **Figure 4A**. Subsequently, the effect of different noise-to-signal *RMS* ratios for the actual noise and for the values assumed in the algorithm is investigated. To that end, *RMS* ratios in the range [1%, 7%] with an increment of 1% are considered, with different values used for the signals that contaminate the input and the measurement vector and for the assumed covariances of the measurement and process noise used in the models. This creates a total of $7^4$ cases, for each of which the mean BW parameter error is calculated using both the conventional and the proposed method. The results are presented in **Figure 4B**.
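The contamination of a signal at a prescribed noise-to-signal *RMS* ratio, and the size of the resulting parametric grid, can be sketched as follows. The sinusoidal `clean` signal is a hypothetical stand-in for a simulated response, and the helper name `add_noise` is an assumption made for illustration.

```python
import numpy as np

def add_noise(signal, ratio, rng):
    """Add zero-mean white Gaussian noise with RMS equal to ratio * RMS(signal)."""
    rms = np.sqrt(np.mean(signal**2))
    return signal + rng.normal(0.0, ratio * rms, size=signal.shape)

rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 20001)
clean = np.sin(2.0 * np.pi * t)            # hypothetical stand-in for a response
noisy = add_noise(clean, 0.05, rng)

# Achieved noise-to-signal RMS ratio of this particular realization
achieved = np.sqrt(np.mean((noisy - clean)**2)) / np.sqrt(np.mean(clean**2))

# Four ratios varied independently over 1%, ..., 7%: actual input noise,
# actual measurement noise, assumed process and assumed measurement covariance
levels = [i / 100 for i in range(1, 8)]
n_cases = len(levels) ** 4
```

The achieved ratio fluctuates slightly around the target from one realization to the next, which is why the study considers many realizations rather than a single one.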

As observed in **Figure 4A**, the *DUKF* outperforms the *UKF* for a given noise-to-signal *RMS* ratio and is less affected by the exact realizations of the noise vectors. For the *DUKF*, approximately 80% of the cases result in a mean parameter error of less than 20, versus 60% of the cases for the *UKF*. As can further be deduced from **Figure 4B**, the same qualitative comparison between the two methods holds when diverse noise-to-signal *RMS* ratios are adopted and a mismatch exists between the actual and assumed (in the model) covariances of the process and measurement noise. The use of the *DUKF* thus allows for larger discrepancies between the assumed covariances of the noises and their real properties.

**FIGURE 4** | Predictions of the two *EKF* models for the corresponding parameters of the Bouc–Wen model. **(A)** 5% noise RMS ratio and corresponding assumed covariances. **(B)** Varying the noise RMS ratio and assumed covariances.

### 5.1.2. The Effect of the Initial Estimates

The effect of different initial estimates for the Bouc–Wen parameters on the convergence is explored in the following **Figure 5**. To that end, the initial estimates used for ∆<sup>1</sup> and ∆<sup>2</sup> vary in the range between [1, 7] \* 2000, while *ν* is varied in the range: [1.8, 3]. The mean relative error of the BW parameters is calculated for each case and a color is assigned depending on the value of that error. The error color-bar is shown in **Figure 5** corresponding to mean errors from 0 to 100%.

As observed in **Figure 5**, the relative error is lower for the *DUKF* as compared to the *UKF* for a wide range of initial estimates of the parameters. Essentially the method is more forgiving in terms of the proximity of the initial estimate to the real value, which offers an important advantage as often the initial estimates are not close to the final value.

### 5.1.3. Non-Smoothness and Dimensionality

The previous Bouc–Wen system is extended to four masses connected by Bouc–Wen springs. Each mass is described by a displacement $x\_i$ relative to the ground. The non-linear springs are defined by their stiffness $k\_i$ and the Bouc–Wen parameters $\Delta\_{1\_i}$, $\Delta\_{2\_i}$, and $\nu\_i$, and the linear dampers by the coefficients $c\_i$, $i = 1, \ldots, 4$, as shown in **Figure 6**.

The state-space equations of the system may be assembled after noting that the equation of evolution of the elastic displacement of spring *i >* 1 becomes:

$$\begin{aligned} \dot{r}\_i &= \dot{x}\_i - \dot{x}\_{i-1} - \Delta\_{1\_i} \left( \dot{x}\_i - \dot{x}\_{i-1} \right) |r\_i|^{\nu\_i}, \text{ for } \left( \dot{x}\_i - \dot{x}\_{i-1} \right) r\_i \ge 0 \\ \dot{r}\_i &= \dot{x}\_i - \dot{x}\_{i-1} + \Delta\_{2\_i} \left( \dot{x}\_i - \dot{x}\_{i-1} \right) |r\_i|^{\nu\_i}, \text{ for } \left( \dot{x}\_i - \dot{x}\_{i-1} \right) r\_i < 0 \end{aligned} \tag{16}$$

If this system is excited by a ground acceleration $\ddot{x}\_g$ and the displacements of the four masses $[x\_1, x\_2, x\_3, x\_4]$ are measured, then it may be demonstrated that for each of the four Bouc–Wen components of the system, $C\_1, \ldots, C\_4$, the parameters $\Delta\_{1\_i}$ and $\Delta\_{2\_i}$ become unidentifiable when $(\dot{x}\_i - \dot{x}\_{i-1}) r\_i < 0$ or $(\dot{x}\_i - \dot{x}\_{i-1}) r\_i \geq 0$, respectively. This follows from the identifiability properties of each Bouc–Wen component $C\_i$, after noting that $\mathbf{P}\_{C\_i} = r\_i$ and $\mathbf{x}\_{C\_i}^{t} = [\dot{x}\_i - \dot{x}\_{i-1}]$. When $(\dot{x}\_i - \dot{x}\_{i-1}) r\_i \geq 0$, $\theta\_{C\_i}^{o} = \Delta\_{1\_i}$ and $\theta\_{C\_i}^{u} = \Delta\_{2\_i}$; otherwise $\theta\_{C\_i}^{o} = \Delta\_{2\_i}$ and $\theta\_{C\_i}^{u} = \Delta\_{1\_i}$. The remaining parameters and dynamic states are identifiable and observable, respectively, according to the observability analysis of the overall system. The overall transformation matrix $\mathbf{T}\_i$ can hence be assembled at any time instance.
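The per-spring branch check that drives the assembly of the transformation matrix can be sketched as follows. The parameter naming (`Delta1_1`, `Delta2_1`, ...), the ordering of the parameter list, and the convention that the first spring is attached to the ground (so that its relative velocity equals $\dot{x}\_1$) are assumptions made for this illustration.

```python
def unidentifiable_parameters(x_dot, r):
    """For each spring, name the Delta parameter that is currently unidentifiable."""
    unident = []
    prev = 0.0                           # assumed: spring 1 is attached to the ground
    for i, (xd, ri) in enumerate(zip(x_dot, r), start=1):
        rel = xd - prev                  # relative velocity across spring i
        unident.append(f"Delta2_{i}" if rel * ri >= 0 else f"Delta1_{i}")
        prev = xd
    return unident

def permutation_order(names, unident):
    """Indices that move the unidentifiable entries to the end of the vector."""
    keep = [i for i, n in enumerate(names) if n not in unident]
    move = [i for i, n in enumerate(names) if n in unident]
    return keep + move

names = [f"Delta{p}_{i}" for i in range(1, 5) for p in (1, 2)]
x_dot = [0.1, 0.3, 0.2, 0.2]             # hypothetical mass velocities
r = [0.01, -0.02, 0.03, 0.0]             # hypothetical elastic displacements
unident = unidentifiable_parameters(x_dot, r)
order = permutation_order(names, unident)
```

The list `order` gives the column index of the unit entry in each row of the permutation matrix restricted to the Bouc–Wen parameters; exactly four parameters, one per spring, end up in the unidentifiable block at any time.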

A system with parameters $k\_i = [1000, 900, 800, 700]$ [1/s<sup>2</sup>], $c\_i = 2 \sqrt{k\_i} \cdot 5/100$ [1/s], $\Delta\_{1\_i} = [6, 7, 8, 9] \cdot 2000$, $\Delta\_{2\_i} = [1, 1, 2, 6] \cdot 1000$, and $\nu\_i = [2, 2, 2, 2]$ is subjected to the time history of **Figure 3A**. The obtained displacements are shown in **Figure 7**.

The measurements are contaminated with noise of noise-to-signal *RMS* ratio of 1%. The corresponding covariances are assumed in both models for the process and measurement noises. The initial estimates used for both models are $k\_i = 1000$, $c\_i = 2 \sqrt{1000} \cdot 10/100$, $\nu\_i = 2.5$, and $\Delta\_{1\_i} = \Delta\_{2\_i} = 3$ for $i = 1, \ldots, 4$. The results of the identification using the *UKF* and *DUKF* are shown in **Figures 8** and **9**.

As observed in **Figure 8**, both methods provide fairly good estimates of the elastic parameters of the system, $k\_i$ and $c\_i$. However, when it comes to the non-linear (Bouc–Wen) parameters, the *DUKF* provides substantially improved estimates compared with the *UKF*. This can be seen from the fact that, while for the *DUKF* the final ratio of the estimated over the real values of the parameters is close to unity, for the *UKF* this ratio deviates substantially from unity, indicating large estimation errors. This can be attributed to the multiple (four in this case) unidentifiable parameters at any time window. As the number of unidentifiable parameters increases, it becomes more likely that the estimates of the overall system diverge.

It should hence be noted that non-smooth high dimensional systems suffer from the effects of dimensionality, but also from the additional effect of the increased number of unidentifiable parameters and hence sources of divergence. The former can be improved using techniques applied to smooth systems for dimensionality (Olivier and Smyth, 2017b), while the latter is treated through the *D*– modification suggested here. It should be noted that the two treatments, which aim at tackling different problems, can be combined.

### **5.2. 2DOF Elasto-Plastic System**

In this example, the behavior of a shear system of two masses with displacements $x\_1$ and $x\_2$ is studied, as shown in **Figure 10**. The masses are connected to each other and to the ground by linear damping elements with mass-normalized damping coefficients $c\_1$, $c\_2$ and by elastoplastic springs with mass-normalized stiffness $k\_1$, $k\_2$ and yield forces $F\_{y\_1}$ and $F\_{y\_2}$.

The equations of motion describing the system when subjected to a ground acceleration ¨*x<sup>g</sup>* become:

$$\begin{aligned} \ddot{\mathbf{x}}\_1 + \left(\boldsymbol{c}\_1 + \boldsymbol{c}\_2\right) \dot{\mathbf{x}}\_1 - \boldsymbol{c}\_2 \dot{\mathbf{x}}\_2 + k\_1 \boldsymbol{x}\_{\mathrm{el}\_1} - k\_2 \,\boldsymbol{x}\_{\mathrm{el}\_2} &= -\ddot{\mathbf{x}}\_{\mathrm{g}}\\ \ddot{\mathbf{x}}\_2 + \left(\boldsymbol{c}\_2\right) \dot{\mathbf{x}}\_2 - \boldsymbol{c}\_2 \dot{\mathbf{x}}\_1 + k\_2 \,\boldsymbol{x}\_{\mathrm{el}\_2} &= -\ddot{\mathbf{x}}\_{\mathrm{g}} \end{aligned} \tag{17}$$

where *xel<sup>i</sup>* is the elastic elongation of the elastoplastic spring *i* whose evolution over time is defined as:

$$\dot{x}\_{el\_1} = \begin{cases} \dot{x}\_1, & \text{in the elastic branch} \\ 0, & \text{in the plastic branch} \end{cases} \tag{18}$$

$$\dot{x}\_{el\_2} = \begin{cases} \dot{x}\_2 - \dot{x}\_1, & \text{in the elastic branch} \\ 0, & \text{in the plastic branch} \end{cases} \tag{19}$$

The following equations define the transition conditions between the elastic and plastic branches for spring *i*, where $\dot{x}\_{s\_i}$ denotes the total velocity of spring *i*:

$$\begin{aligned} \| k\_i x\_{el\_i} \| &= F\_{y\_i}, \text{ elastic} \rightarrow \text{plastic} \\ \dot{x}\_{s\_i} &= 0, \text{ plastic} \rightarrow \text{elastic} \end{aligned} \tag{20}$$

The previous transition equations ensure that the force of elastoplastic spring *i* always satisfies the condition $\| k\_i x\_{el\_i} \| \leq F\_{y\_i}$.

For the purpose of identification, the augmented state vector includes $x\_1, x\_2, \dot{x}\_1, \dot{x}\_2, kx\_{el\_1}, kx\_{el\_2}, c\_1, c\_2, k\_1, k\_2, F\_{y\_1}, F\_{y\_2}$. The dynamic states $kx\_{el\_i}$ correspond to the product $k\_i x\_{el\_i}$ and are used instead of the elastic displacements, as this allows the states to be separated into observable and unidentifiable within all branches without having to use a non-linear transformation (Chatzis et al., 2017). In terms of the identifiability of the system, it can easily be shown, as in Chatzis et al. (2017), that all the dynamic states together with $c\_1$ and $c\_2$ are always identifiable. However, only one of $[k\_i, F\_{y\_i}]$ is identifiable, depending on whether spring *i* lies in an elastic or plastic branch at that specific time instance. This follows from studying the identifiability of either of the two elastoplastic spring components $C\_i$, where $\mathbf{P}\_{C\_i} = kx\_{el\_i}$ and $\mathbf{x}\_{C\_i}^{t} = [\dot{x}\_i - \dot{x}\_{i-1}]$. The equation of component $C\_i$ then becomes:

$$\begin{aligned} \dot{\mathbf{P}}\_{C\_i} &= k\_i \left( \dot{x}\_i - \dot{x}\_{i-1} \right), \text{ in the elastic branch} \\ \dot{\mathbf{P}}\_{C\_i} &= 0, \text{ in the plastic branch} \end{aligned} \tag{21}$$

and as a result $\theta\_{C\_i}^{u} = F\_{y\_i}$ in the elastic branch and $\theta\_{C\_i}^{u} = k\_i$ in the plastic branch. $F\_{y\_i}$ becomes identifiable when component $C\_i$ enters the plastic branch. This is because the plasticity constraint, when activated, effectively results in an additional measurement equation: $F\_{y\_i} = \| kx\_{el\_i} \|$.

Hence, the identifiability analysis requires an estimate of whether spring *i* is in an elastic or plastic branch. While this is straightforward for the real system through the use of equation (20), it requires careful consideration when applied to the systems estimated by the *DUKF*. As each of the sigma points is bound by the inequality constraint of equation (20), it is likely that their mean satisfies the condition $\| \widehat{kx}\_{el\_i} \| < F\_{y\_i}$ even if the majority of the sigma points satisfy the equality and hence lie in the plastic branch. The problem arises because the condition in equation (20) indicates whether the spring is elastic or plastic, but cannot quantify "how" elastic or plastic the response of the system is. An indicator of the tendency of the system to behave in an elastic or plastic manner, suggested in this paper, may be obtained by comparing the estimated mean velocities of the elastic and plastic elongation of the springs.

To this end, the inequality of equation (20) is used to determine whether each sigma point lies in an elastic or plastic branch, using the estimated values for the springs prior to applying the measurement update (i.e., $\chi\_{k|k-1}^{j}$ for sigma point *j*). Then, the elastic and plastic velocities of spring *i* for sigma point *j*, $\dot{x}\_{el\_i}^{j}$ and $\dot{x}\_{pl\_i}^{j}$, are:

$$\begin{aligned} \dot{x}\_{el\_i}^{j} &= \dot{x}\_{s\_i}^{j}, \text{ if spring } i \text{ is elastic} \\ \dot{x}\_{el\_i}^{j} &= 0, \text{ if spring } i \text{ is plastic} \end{aligned} \tag{22}$$

where $\dot{x}\_{s\_i}^{j}$ is the total velocity of spring *i* for sigma point *j*, i.e., $\dot{x}\_{s\_1}^{j} = \dot{x}\_1^{j}$ and $\dot{x}\_{s\_2}^{j} = \dot{x}\_2^{j} - \dot{x}\_1^{j}$; the plastic velocity then follows as $\dot{x}\_{pl\_i}^{j} = \dot{x}\_{s\_i}^{j} - \dot{x}\_{el\_i}^{j}$. The mean estimates of the two velocities, $\hat{\dot{x}}\_{el\_i}$ and $\hat{\dot{x}}\_{pl\_i}$, can be calculated using the fourth step of the *DUKF* algorithm. Hence, the following criterion is used to determine whether the system is behaving elastically or plastically:

$$\begin{aligned} \left\| \hat{\dot{x}}\_{el\_i} \right\| &\geq \left\| \hat{\dot{x}}\_{pl\_i} \right\| \longrightarrow \text{ the system behaves elastically} \\ \left\| \hat{\dot{x}}\_{el\_i} \right\| &< \left\| \hat{\dot{x}}\_{pl\_i} \right\| \longrightarrow \text{ the system behaves plastically} \end{aligned} \tag{23}$$
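The criterion of equations (22) and (23) can be sketched for a single spring as follows. The function name, the argument layout (one entry per sigma point), and the use of the mean weights as plain Python lists are assumptions for illustration; a sigma point is classified as elastic simply when its spring force lies strictly inside the yield bound.

```python
def branch_estimate(chi_force, chi_fy, chi_vel, weights):
    """Classify one spring as 'elastic' or 'plastic' from its sigma points.

    chi_force: spring force k*x_el at each sigma point (before the update)
    chi_fy:    yield force at each sigma point
    chi_vel:   total spring velocity at each sigma point
    weights:   mean weights of the sigma points
    """
    v_el = 0.0
    v_pl = 0.0
    for force, fy, vel, w in zip(chi_force, chi_fy, chi_vel, weights):
        elastic = abs(force) < fy              # inside the yield bound
        v_el += w * (vel if elastic else 0.0)  # equation (22)
        v_pl += w * (0.0 if elastic else vel)  # plastic = total - elastic
    # Equation (23): compare the magnitudes of the two mean velocities
    return "elastic" if abs(v_el) >= abs(v_pl) else "plastic"

weights = [0.2] * 5                            # hypothetical equal weights
mostly_plastic = branch_estimate(
    [49.0, 50.0, 50.0, 50.0, 48.0], [50.0] * 5, [1.0] * 5, weights)
all_elastic = branch_estimate(
    [10.0] * 5, [50.0] * 5, [1.0] * 5, weights)
```

In the first call three of the five sigma points sit on the yield bound, so the weighted plastic velocity dominates even though the mean force would still lie below the yield force.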

It is now straightforward to estimate the branch each spring is in and to obtain the corresponding contribution to the transformation matrix $\mathbf{T}$. It should finally be noted that in both the *UKF* and *DUKF* algorithms the following constraint is applied to sigma point *j* if $\| kx\_{el\_i} \| > F\_{y\_i}$ for spring *i*:

$$kx\_{el\_i} = F\_{y\_i} \operatorname{sign}(kx\_{el\_i}) \tag{24}$$

where equation (24) is a return mapping scheme. This is what one would follow in the forward dynamics problem, as the value of *F<sup>y</sup><sup>i</sup>* would be known. However, in this problem it would also be possible to instead modify the value of *F<sup>y</sup><sup>i</sup>* when the plasticity constraint is violated.
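The return mapping of equation (24) amounts to clipping the spring force of a sigma point to the yield bound while preserving its sign. The sketch below also shows the alternative mentioned above, in which the yield force is adjusted instead; both function names are hypothetical, and the second function is only one possible reading of that alternative.

```python
import math

def return_mapping(kx_el, f_y):
    """Equation (24): project the spring force back onto the yield bound."""
    if abs(kx_el) > f_y:
        return math.copysign(f_y, kx_el)
    return kx_el

def adjust_yield_force(kx_el, f_y):
    """Hypothetical alternative: raise the yield force to match the spring force."""
    return max(f_y, abs(kx_el))

clipped_pos = return_mapping(62.0, 50.0)
clipped_neg = return_mapping(-62.0, 50.0)
untouched = return_mapping(30.0, 50.0)
raised = adjust_yield_force(62.0, 50.0)
```

Either option restores the condition $\| kx\_{el\_i} \| \leq F\_{y\_i}$ for the sigma point; the first changes the dynamic state, whereas the second changes the parameter estimate.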

A system with properties $k\_1 = 1000$ [1/s<sup>2</sup>], $k\_2 = 800$ [1/s<sup>2</sup>], $c\_1 = 2 \sqrt{k\_1} \cdot 0.05$ [1/s], $c\_2 = 2 \sqrt{k\_2} \cdot 0.05$ [1/s], $F\_{y\_1} = 50$ [m/s<sup>2</sup>], and $F\_{y\_2} = 30$ [m/s<sup>2</sup>] is subjected to the excitation of **Figure 11A**. The resulting displacements of both masses are measured, as shown in **Figure 11B**.

The two springs exhibit an elastoplastic response as shown in the force displacement responses plotted in **Figure 12**. The maximum total displacements, *x*<sup>1</sup> and *x*<sup>2</sup> *− x*<sup>1</sup> correspond to 1.7 and 2 times the yield displacements, respectively.

The input and measurements are contaminated with white noise signals corresponding to 5% noise-to-signal RMS ratios. As there is a substantial drift in the measured signals, their RMS is calculated after they are passed through a high-pass filter with a cutoff frequency of 0.5 Hz. The initial estimates for the stiffness and the damping of both springs are given a significant offset, as they are assumed to be twice their actual values. The initial estimates of the yield forces, $F_{y_1}$ and $F_{y_2}$, are varied in the following ranges: $F_{y_1} \in [5, 80]$ and $F_{y_2} \in [5, 60]$. These are then used in both the *UKF* and *DUKF*, where the assumed covariances of the process and measurement noises match the 5% noise-to-signal RMS ratio.

After the algorithms are run, the mean relative error of the estimated parameters with respect to the real values is calculated and plotted in **Figure 13**. There, the upper row of sub-figures corresponds to the results of the *DUKF* and the lower row to those of the *UKF*, each column of sub-figures corresponds to the error of the parameter indicated by the title, and in each sub-figure the horizontal and vertical axes correspond to different initial estimates of $F_{y_1}$ and $F_{y_2}$. The mean relative error is indicated by the color bar of the figure.
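The noise contamination described above may be sketched as follows. This is only an illustration under stated assumptions: the paper does not specify the high-pass filter, so a crude FFT-based filter (`highpass_fft`) stands in for it, and all function names are hypothetical.

```python
import numpy as np

def rms(x):
    """Root-mean-square value of a signal."""
    return float(np.sqrt(np.mean(np.square(x))))

def highpass_fft(x, fs, cutoff_hz):
    """Crude high-pass filter (assumption): zero every FFT bin below the cutoff."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spectrum[freqs < cutoff_hz] = 0.0
    return np.fft.irfft(spectrum, n=len(x))

def add_white_noise(x, fs, ratio=0.05, cutoff_hz=0.5, rng=None):
    """Add white noise whose RMS is `ratio` times the RMS of the high-passed signal."""
    rng = np.random.default_rng() if rng is None else rng
    reference_rms = rms(highpass_fft(x, fs, cutoff_hz))
    noise = rng.standard_normal(len(x))
    noise *= ratio * reference_rms / rms(noise)  # rescale to the target RMS
    return x + noise, noise

# Drifting test signal: a linear trend plus a 2 Hz oscillation, sampled at 100 Hz.
t = np.arange(0.0, 20.0, 0.01)
signal = 0.5 * t + np.sin(2.0 * np.pi * 2.0 * t)
noisy, noise = add_white_noise(signal, fs=100.0, rng=np.random.default_rng(0))
```

Computing the reference RMS on the filtered signal keeps the drift from inflating the noise level, which is the motivation given in the text.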

As observed in **Figure 13**, the *DUKF* in general results in reduced errors compared to the *UKF* for a wide range of initial estimates. The method results in a large improvement in the estimates of the stiffness of the two springs, $k_1$ and $k_2$. This is expected, as these parameters become unidentifiable when the corresponding spring is in the plastic branch. Equally, there appears to be a clear improvement in the estimates of the plastic forces $F_{y_1}$ and $F_{y_2}$ when the initial estimates are in the range $F_{y_1} \in (0, 70)$ and $F_{y_2} \in (5, 50)$. For values of $F_{y_1} > 70$ and $F_{y_2} > 50$, the *DUKF* does not change the initial estimate of the corresponding parameter, as the algorithm estimates that the system is always elastic. This can be understood from **Figure 14**, which plots the forces seen by elastic springs of stiffness $k_1$ and $k_2$ for the real displacements of the system.

In both cases, there are only a few points in **Figures 14A,B** where even the force of a linear spring, for the displacements of the system, would exceed values of $F_{y_1} > 70$ and $F_{y_2} > 50$, respectively. As a result, it is reasonable for the *DUKF*, given the properties of the estimator selected and the way the plasticity constraint is applied, to conclude that the corresponding spring always remains elastic when the initial estimates of $F_{y_i}$ are in the aforementioned range. While this is disadvantageous in terms of identifying the real value of the parameters when high initial estimates of $F_{y_1}$ and $F_{y_2}$ are used, the advantage of the *DUKF* is that the algorithm has not changed the estimates of the corresponding covariance terms, indicating that these parameters were not identified. Additionally, it appears that even in such cases the *DUKF* is capable of providing good estimates for the remaining parameters.

In contrast, the *UKF* would proceed to evolve the initial estimates of $F_{y_1}$ and $F_{y_2}$ in all scenarios. The algorithm may hence reduce the initial estimate of $F_{y_1}$ or $F_{y_2}$ even during periods of unidentifiability, and this non-optimal change could happen to result in more favorable estimates for the values of $F_{y_1}$ and $F_{y_2}$. However, it is equally or more probable that the algorithm will change the overall estimates of the parameters to less favorable values during unidentifiable windows, thus resulting in divergence of all parameters. This is depicted in the behavior of the *UKF* in **Figure 13** for initial estimates of $F_{y_1} > 70$ and $F_{y_2} > 50$, where, even when the algorithm happens to do better for the estimates of the corresponding $F_{y_i}$, the remaining parameters behave less optimally. Additionally, in that region the final estimates of the covariance terms corresponding to $F_{y_i}$ are substantially lower than those of the *DUKF*, even when the algorithm does not converge. The behavior of both algorithms for the case of initial estimates $F_{y_1} = 70$, $F_{y_2} = 60$ is shown in **Figures 15** and **16** for the estimated/real values of the parameters versus time and the estimated versus real plastic displacements $x_{pl_i}$, respectively.

It should be noted that in **Figure 15** the *DUKF* practically does not alter the initial guess of $F_{y_i}$, as described above. In this simulation the *UKF* happens to evolve the estimate of $F_{y_1}$ in a coincidentally favorable way, while the opposite happens for $F_{y_2}$. However, as observed in **Figure 16A**, the *UKF* has difficulty in tracking the plastic displacements for this case. On the contrary, the *DUKF* is shown in **Figure 16B** to result in excellent predictions despite its inability to track the values of $F_{y_1}$ and $F_{y_2}$. Finally, it should be noted that the *UKF* appears relatively certain of the values of $F_{y_1}$ and $F_{y_2}$, despite the fact that the latter is incorrect, with corresponding covariance terms smaller than $2 \times 10^{-3}$. In contrast, the *DUKF* has practically not changed the covariance from the initial guess, which is of the order of $10^{2}$, indicating that the method is uncertain of its estimation of these two values.

Hence, this investigation leads to a result similar to that of the *DEKF* in Chatzis et al. (2017) for elasto-plastic springs: it appears more favorable to use initial estimates of $F_{y_i}$ smaller than the real value of those parameters, or at least smaller than the maximum force seen by an elastic spring for the displacements of the system. Of course, this is not possible *a priori*, as those values are not known. Nonetheless, the *DUKF* can inform the user of the inability of the algorithm to identify the corresponding value of $F_{y_i}$ through the lack of change in the parameters and the corresponding large covariance terms. Additionally, the elastic displacements $x_{el_i}$ and total displacements $x_i$ of the system are calculated by the *DUKF* with high precision, which allows the user to be alerted to the presence of permanent displacements.

### **6. DISCUSSION AND CONCLUSION**

This paper suggests the use of a Discontinuous (*D*-) modification of the *UKF* for non-smooth systems. Non-smooth systems include certain parameters whose identifiability property changes over time. To alleviate the divergence exhibited by standard filtering algorithms during time periods of unidentifiability, the Discontinuous modification suggests keeping such parameters invariant during those intervals. This paper, therefore, introduces a Discontinuous Unscented Kalman Filter (*DUKF*).

The method is implemented as a minimally invasive modification allowing, like the original *UKF* algorithm, straightforward implementation with any existing software that employs filtering algorithms to update the states of a system over time. The *D*-modification makes use of a transformation matrix at any time instance, based on the estimated active smooth branch of the system and the resulting identifiable and unidentifiable parameters. The proposed modification does not increase the computational cost of the original method; in fact, it leads to inversions and operations on matrices of lower dimension.
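The idea of restricting an update to the currently identifiable states through a transformation matrix can be illustrated with a generic sketch. The code below is not the authors' *DUKF*: it uses a linear Kalman measurement update instead of the unscented transform, and the selection matrix `T`, the flag vector `active` and the function name `reduced_update` are assumptions made purely for illustration. States that are not selected, together with their variances, are left untouched.

```python
import numpy as np

def reduced_update(x, P, H, R, y, active):
    """Linear Kalman measurement update restricted to the states flagged in `active`.

    Illustrative only: the selected r states are updated using the r x r
    covariance block, while every other state and its variance stay as they were.
    """
    idx = np.flatnonzero(active)
    T = np.eye(len(x))[idx]                # r x n selection (transformation) matrix
    P_r = T @ P @ T.T                      # reduced covariance, r x r
    H_r = H @ T.T                          # measurement matrix seen by the reduced state
    S = H_r @ P_r @ H_r.T + R              # innovation covariance
    K_r = P_r @ H_r.T @ np.linalg.inv(S)   # reduced gain
    innovation = y - H @ x
    x_new = x + T.T @ (K_r @ innovation)   # map the correction back to the full state
    P_new = P.copy()
    P_new[np.ix_(idx, idx)] = (np.eye(len(idx)) - K_r @ H_r) @ P_r
    return x_new, P_new

x0 = np.array([1.0, 2.0, 3.0])
P0 = np.diag([1.0, 4.0, 9.0])
H = np.array([[1.0, 0.0, 1.0]])
R = np.array([[0.5]])
x1, P1 = reduced_update(x0, P0, H, R, y=np.array([5.0]), active=[True, True, False])
```

In this example the third state is flagged as unidentifiable, so both its value and its variance remain at their prior values after the update.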

The examples provided demonstrate the use of the algorithm with two different types of non-smooth systems: systems where the identifiability condition varies between different subspaces of the state vector, and systems for which the non-smoothness is a result of an inequality constraint. Several non-smooth systems can be described as combinations of these two cases. The examples illustrate the robustness of the *D*-modification and the overall improved performance of the *DUKF* versus the *UKF* for problems of increased complexity. The effect of different sources of complexity was studied, such as the noise in the input and measured data, the assumed noise covariances in the models, different initial conditions and dimensionality. The *DUKF* was shown to improve the estimates provided in all of these cases, and resulted in more consistent behavior than that of the standard *UKF* for non-smooth systems.

This paper, together with former work of the authors on the *EKF* (Chatzis et al., 2017), demonstrates that non-smoothness affects the convergence of non-linear Kalman Filters and, in general, of online Bayesian methods, further illustrating that the *D*-modification is a viable treatment across algorithmic implementations of this class. The *D*-modification tackles the problems associated with non-smoothness and may be paired with existing modifications aimed at improving the performance of the algorithms for smooth problems, substantially expanding the ability of the resulting algorithms to handle problems of increased complexity.

### **AUTHOR CONTRIBUTIONS**

Both authors verify to have contributed to this original research paper, which has not been submitted for publication elsewhere.

### **FUNDING**

MC would like to acknowledge the Marie Curie FP7 Career Integration Grant No. 618359 within the seventh European Union Framework Programme, for the support of this research. EC would like to gratefully acknowledge the support of the Albert Lück Stiftung. The authors would also like to acknowledge the use of the University of Oxford Advanced Research Computing (ARC) facility in carrying out this work: http://dx.doi.org/10.5281/zenodo.2255.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2017 Chatzis and Chatzi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# **Gaussian Process Time-Series Models for Structures under Operational Variability**

*Luis David Avendaño-Valencia<sup>1</sup>\*, Eleni N. Chatzi<sup>1</sup>, Ki Young Koo<sup>2</sup> and James M. W. Brownjohn<sup>2</sup>*

*<sup>1</sup>Department of Civil, Environmental and Geomatic Engineering, Institute for Structural Mechanics, ETH Zürich, Zürich, Switzerland, <sup>2</sup>College of Engineering, Mathematics and Physical Sciences, University of Exeter, Exeter, United Kingdom*

A wide range of vibrating structures are characterized by variable structural dynamics resulting from changes in environmental and operational conditions, posing challenges in their identification and associated condition assessment. To tackle this issue, the present contribution introduces a stochastic modeling methodology *via* Gaussian Process (GP) time-series models. In the presently introduced approach, the vibration response is represented by means of a random coefficient time-series model, whose coefficients comply with a GP regression on the environmental and operational parameters. The approach may be implemented in conjunction with any type of linear-in-the-parameters time-series model, ranging from simple AR models to more complex non-linear or non-stationary time-series models. The obtained GP time-series modeling approach provides an effective and compact global representation of the vibrational response of a structure under a wide span of environmental and operational conditions. The effectiveness of the postulated GP time-series models is demonstrated through two case studies: the first involves the identification of the vertical vibration response of the Humber bridge, evaluated over a period of three years; the second considers the long-term simulated vibration response of a wind turbine featuring non-stationary dynamics stemming from the rotor speed. In both cases, the variation of the average wind speed is the main driver of uncertainty, while, through application of the proposed GP time-series models, it is possible to track the resulting variation in modal quantities.

#### **Keywords: time-series models, uncertainty, metamodels, random coefficient, gaussian process**

#### *Edited by:*

*Branko Glisic, Princeton University, United States*

#### *Reviewed by:*

*James-Alexandre Goulet, École Polytechnique de Montréal, Canada Osman Eser Ozbulut, University of Virginia, United States*

#### *\*Correspondence:*

*Luis David Avendaño-Valencia avendano@ibk.baug.ethz.ch*

#### *Specialty section:*

*This article was submitted to Structural Sensing, a section of the journal Frontiers in Built Environment*

*Received: 03 July 2017 Accepted: 30 October 2017 Published: 08 December 2017*

#### *Citation:*

*Avendaño-Valencia LD, Chatzi EN, Koo KY and Brownjohn JMW (2017) Gaussian Process Time-Series Models for Structures under Operational Variability. Front. Built Environ. 3:69. doi: 10.3389/fbuil.2017.00069*

### **1. INTRODUCTION**

Several types of vibrating structures by default operate in constantly varying environmental and operational conditions, which inevitably results in variability of the induced structural dynamics. This is the case for wind turbines, bridges, high rise buildings and more. This issue poses a practical challenge related to the identification and analysis of the vibrational response of these structures, as well as for the health monitoring, fatigue assessment and control of the induced vibrations.

In order to construct a robust model of the dynamic response of the structure, it is not only necessary to accurately model the *short-term* response of the structure, but it is further necessary to effectively capture the *long-term* trends underlying the induced dynamics. This issue, in the particular case of *data-based time-series models*, has been extensively researched in recent years, resulting in the formulation of different strategies, including *projection methods*, and *deterministic* or *stochastic* functional dependence models. *Projection methods*, also referred to as *data normalization methods*, aim at projecting characteristic quantities associated with the time-series model representing the short-term response of the structure, into a subspace where the influence of *Environmental and Operational Parameters* (EOPs) may be easily removed (Yan et al., 2005; Sohn, 2007; Deraemaeker et al., 2008).

On the other hand, *deterministic functional dependence models* aim at capturing the long-term variability in the dynamics by assuming a deterministic functional relationship from EOPs to the characteristic quantities of the time-series model describing the dynamics of the response. Typically, such a deterministic functional relationship is captured *via* a functional series expansion. Methods falling into this class include the regression/interpolation methods discussed in Worden et al. (2002) and Sohn (2007), as well as the *Functionally Pooled* (FP) time-series models explained in Kopsaftopoulos et al. (2018) and Sakellariou and Fassois (2016). These methods are particularly effective when a direct relationship exists between measurable input EOPs and the characteristic quantities of the time-series models. Nonetheless, uncertainty in the EOPs and in the dynamic response introduces variability in the characteristic quantities of the basic time-series model, which may not be effectively captured by means of a deterministic relationship. Instead, random or stochastic functions may be more effective in capturing the uncertainty on the characteristic quantities of the time-series model.

In this sense, a third class of methods, referred to as *stochastic functional dependence models*, aims at capturing the long-term variability by assuming that the characteristic quantities of the basic time-series model are stochastic variables depending on the EOPs. Toward this end, recent works have postulated either *Random Coefficient* (RC) (Avendaño-Valencia and Fassois, 2015, 2017a,b; Avendaño-Valencia et al., 2015a) or *Polynomial Chaos Expansions* (PCE) (Spiridonakos and Chatzi, 2014, 2015; Avendaño-Valencia et al., 2015b; Spiridonakos et al., 2016) to represent the variability of the characteristic quantities of a model. In particular, RC time-series models represent the variability in the dynamics as randomness in the parameters of the time-series model. Then, apart from the selection of the specific time-series model, a further user-defined choice pertains to the definition of an appropriate distribution model for its respective coefficients, which in several cases can become very complex. On the other hand, the PCE approach exploits the probabilistic knowledge of the EOPs to build the most effective functional representation of the time-series parameters. However, it also assumes that the randomness in the model parameters originates solely from the randomness of the EOPs. Therefore, other sources of uncertainty may be misrepresented.

In this regard, this work provides a framework for the global (short- and long-term) identification of the dynamic response of a structure, of unknown properties or with a given *a priori* numerical model, under variable operational and environmental conditions. The short-term dynamics are represented *via* a linear-in-the-parameters regressive time-series model (which may assume the form of an AutoRegressive, AutoRegressive with eXogenous input or similar model), while a *Gaussian Process* (GP) regression represents the stochastic dependence of the parameters of the basic time-series model on the EOPs, which, in turn, describes the long-term variability in the dynamics of the structural response. Contrary to deterministic functional dependence models and PCE-based methods, where the EOPs are considered as the sole source of variability in the time-series model parameters, the appraised GP approach is further capable of capturing and quantifying the additional uncertainty stemming from other unmeasurable sources. Likewise, the obtained GP time-series model is entirely linear-in-the-parameters, which facilitates the identification, parameter estimation and posterior model-based analysis. The issue of model identification is addressed by the Maximum Likelihood principle, which is solved by means of an Expectation-Maximization method adapted to the particular structure of the Gaussian Process time-series model. In addition, *Gaussian Process Principal Component Regression* (GP-PCR) is introduced as an optional improvement to the basic GP time-series model, aiming at reducing the number of variables in the representation and at improving the numerical stability of the parameter estimation and optimization.

The methods discussed here are demonstrated on two dedicated case-studies. The first one pertains to the identification of actual data corresponding to the vertical acceleration response in the Humber bridge measured over 21 non-consecutive days in the period from May 19, 2011, to March 24, 2013. The second one corresponds to the identification of the long-term vibration response of a wind turbine, employing simulations obtained *via* the FAST wind turbine aeroelastic simulation code, which are characterized by non-stationary dynamics and long-term variability induced by variable wind speed.

The remainder of this paper is organized as follows: Section 2 initially provides a summary of traditional linear-in-the-parameters time-series models, pointing out their limitations in long-term identification, and offering their natural extension *via* GP regression. In addition, principal component regression is introduced as an alternative to reduce the number of identified parameters. Subsequently, Section 3 is devoted to the identification of the GP time-series model based on a set of dynamic responses, including the estimation of the parameters of individual realizations, the estimation of the hyper-parameters of the representation and the assessment and validation of the obtained model. Finally, Section 4 provides the two aforementioned case studies, i.e., the Humber bridge and wind turbine simulated vibrational response, while Section 5 concludes the study.

### **2. MODELS OF THE DYNAMIC RESPONSE OF STRUCTURES**

### **2.1. Traditional Linear-in-the-Parameters Regressive Time-Series Models**

Consider the dynamic response of a structure $y[t] \in \mathbb{R}$ defined over the normalized discrete time $t = 1, 2, \ldots, N$ and sampled with a sampling rate $f_s$. The response is assumed to obey the linear-in-the-parameters regressive time-series model:

$$y[t] = \boldsymbol{\phi}^T(\boldsymbol{z}[t]) \cdot \boldsymbol{\theta} + w[t], \qquad w[t] \sim \text{NID}(0, \sigma_w^2) \tag{1}$$

where $\boldsymbol{\phi}(\boldsymbol{z}[t]) \in \mathbb{R}^{n}$ is the *regression vector*, $\boldsymbol{z}[t] \in \mathbb{R}^{n_z}$ is the vector of *regressed variables*, $\boldsymbol{\theta} \in \mathbb{R}^{n}$ is the *parameter vector* and $w[t]$ is a *Normally and Identically Distributed* (NID) innovations sequence of mean zero and variance $\sigma_w^2$. The vector of regressed variables $\boldsymbol{z}[t]$ may contain previous values of the dynamic response $y[t]$, excitation inputs $\boldsymbol{x}[t]$ and innovations $w[t]$, according to the model type. Accordingly, equation (1) corresponds to the case of either an *output-only* or a *Multiple Input Single Output* (MISO) model. However, the *Multiple Input Multiple Output* (MIMO) case can be cast into the presently discussed framework through the proper rearrangement of the regression and parameter vectors, as well as the redefinition of the dynamic response and innovations as column vectors. In addition, linear state space representations may be considered when observation errors are an important component of the dynamic response measurements.

The linear-in-the-parameters regressive time-series model of equation (1) encompasses a large group of time-series models, which differ in the specific form of the regression vector and the vector of regressed variables. A few important examples are summarized next:

*•* AutoRegressive (AR) models: the simplest case corresponds to the AR model, for which the regression vector is of the form (Ljung, 1999, Ch. 4):

$$\boldsymbol{\phi}(\boldsymbol{z}[t]) = \boldsymbol{z}[t] = \begin{bmatrix} y[t-1] & y[t-2] & \cdots & y[t-n_a] \end{bmatrix}^T \tag{2}$$

where $n_a$ is the order of the AR model.

*•* AutoRegressive Moving Average (ARMA) models: ARMA models further include previous values of the innovations in the vector of regressed variables, and thus (Ljung, 1999):

$$\begin{aligned} \boldsymbol{\phi}(\boldsymbol{z}[t]) &= \boldsymbol{z}[t] \\ &= \begin{bmatrix} y[t-1] & \cdots & y[t-n_a] & w[t-1] & \cdots & w[t-n_c] \end{bmatrix}^T \end{aligned} \tag{3}$$

where $n_a$ and $n_c$ are the orders of the AR and MA parts of the model.

*•* AutoRegressive with eXogenous variable (ARX) models: ARX models combine previous values of the dynamic response and the excitation vector $\boldsymbol{x}[t] \in \mathbb{R}^{n_x}$, and thus (Ljung, 1999, Ch. 4):

$$\begin{aligned} \boldsymbol{\phi}(\boldsymbol{z}[t]) &= \boldsymbol{z}[t] \\ &= \begin{bmatrix} y[t-1] & \cdots & y[t-n_a] & \boldsymbol{x}^T[t-1] & \cdots & \boldsymbol{x}^T[t-n_b] \end{bmatrix}^T \end{aligned} \tag{4}$$

where $n_a$ and $n_b$ are the orders of the AR and exogenous parts of the model.

*•* Linear Parameter Varying AR (LPV-AR) models: LPV-AR models are a class of time-dependent AR models, which correspond to a generalization of the simple AR model, where the parameters of the AR model are dependent on an external *scheduling variable β*[*t*] determining the values of these parameters at time *t*. The regression vector in the case of LPV-AR models is of the form (Avendaño-Valencia and Fassois, 2017b):

$$\boldsymbol{z}[t] = \begin{bmatrix} y[t-1] & y[t-2] & \cdots & y[t-n_a] \end{bmatrix}^T \tag{5a}$$

$$\boldsymbol{\phi}(\boldsymbol{z}[t]) = \boldsymbol{z}[t] \otimes \begin{bmatrix} g_{b_1}(\beta[t]) & g_{b_2}(\beta[t]) & \cdots & g_{b_p}(\beta[t]) \end{bmatrix}^T \tag{5b}$$

where $n_a$ is the AR order, $\otimes$ denotes the Kronecker product, $g_{b_j}(\beta[t])$ is the $j$-th functional expansion basis, and $p$ is the order of the functional expansion basis. The closely related *Functional Series TAR* (FS-TAR) models form a special case of the LPV-AR model where the scheduling variable is time, i.e., $\beta[t] \equiv t$ (Poulimenos and Fassois, 2006).

*•* Non-linear AR (NAR) models: NAR models correspond to the non-linear counterpart of AR models, where the dynamic response is regressed on non-linear functions of its previous values, so that the regression vector assumes the form (Spiridonakos and Chatzi, 2015):

$$\mathbf{z}[t] = \begin{bmatrix} \mathbf{y}[t-1] & \mathbf{y}[t-2] & \cdots & \mathbf{y}[t-n\_a] \end{bmatrix}^T \tag{6a}$$

$$\boldsymbol{\phi}(\boldsymbol{z}[t]) = \begin{bmatrix} \boldsymbol{g}\_1(\boldsymbol{z}[t]) & \boldsymbol{g}\_2(\boldsymbol{z}[t]) & \cdots & \boldsymbol{g}\_n(\boldsymbol{z}[t]) \end{bmatrix}^T \tag{6b}$$

where $n_a$ is the AR order, and $g_j(\boldsymbol{z}[t])$ is the $j$-th non-linear term of the vector of regressed variables.
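The regression vectors listed above can be assembled numerically in a few lines. The sketch below covers the AR case of equation (2) and the LPV-AR case of equation (5). The function names are hypothetical, the code uses zero-based indexing, and the monomial basis $1, \beta, \beta^2, \ldots$ is an assumption chosen only for illustration.

```python
import numpy as np

def ar_regressors(y, na):
    """Build the regression matrix of an AR(na) model, one column per time step.

    The column for time t holds [y[t-1], ..., y[t-na]], following equation (2).
    Returns the na x (N - na) matrix and the aligned targets y[na:].
    """
    N = len(y)
    Phi = np.array([y[t - na:t][::-1] for t in range(na, N)]).T
    return Phi, y[na:]

def lpv_ar_regressors(y, beta, na, p):
    """Build LPV-AR regressors z[t] kron g(beta[t]), following equation (5b).

    The basis g is assumed to be monomial: [1, beta, ..., beta^(p-1)].
    """
    Z, targets = ar_regressors(y, na)
    columns = []
    for j, t in enumerate(range(na, len(y))):
        g = beta[t] ** np.arange(p)
        columns.append(np.kron(Z[:, j], g))
    return np.array(columns).T, targets

y = np.arange(1.0, 7.0)       # toy response: 1, 2, ..., 6
beta = np.full(6, 2.0)        # constant scheduling variable
Phi_ar, targets = ar_regressors(y, na=2)
Phi_lpv, _ = lpv_ar_regressors(y, beta, na=2, p=3)
```

With $n_a = 2$ and $p = 3$, each LPV-AR regression vector has $n_a \cdot p = 6$ entries, which shows how the Kronecker product expands the parameter vector.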

Equation (1) may be expressed alternatively as follows:

$$w[t] = y[t] - \boldsymbol{\phi}^T(\boldsymbol{z}[t]) \cdot \boldsymbol{\theta} = y[t] - y[t|t-1] \tag{7}$$

where $y[t|t-1] := \mathrm{E}\{y[t] \mid \boldsymbol{\theta}, \boldsymbol{\phi}(\boldsymbol{z}[t])\}$ is the *one-step-ahead prediction* of the dynamic response, with associated variance $\mathrm{E}\{(y[t_1] - y[t_1|t_1-1]) \cdot (y[t_2] - y[t_2|t_2-1])\} = \sigma_w^2 \cdot \delta[t_1 - t_2]$, where $t_1$, $t_2$ are two analysis instants, and $\delta[t]$ denotes the Kronecker delta. Hence, under the NID assumption of the innovations $w[t]$, the conditional probability of the dynamic response $y[t]$ given the parameter vector $\boldsymbol{\theta}$ and the regression vector $\boldsymbol{\phi}(\boldsymbol{z}[t])$ is Gaussian with mean $y[t|t-1]$ and variance $\sigma_w^2$, or more specifically:

$$p(y[t] \mid \boldsymbol{\theta}, \boldsymbol{\phi}(\boldsymbol{z}[t])) = \mathcal{N}_{y[t]}(y[t|t-1], \sigma_w^2), \qquad y[t|t-1] = \boldsymbol{\phi}^T(\boldsymbol{z}[t]) \cdot \boldsymbol{\theta} \tag{8}$$

where $\mathcal{N}_x(x_o, \sigma_x^2)$ denotes a Gaussian distribution for the random variable $x$ with mean $x_o$ and variance $\sigma_x^2$. Moreover, by virtue of the NID nature of the innovations, it follows that the probability for an entire vibration response realization of length $N$, aggregated in the vector $\boldsymbol{y} = [y[1] \; y[2] \; \cdots \; y[N]]^T$, is:

$$p(\boldsymbol{y} \mid \boldsymbol{\theta}, \boldsymbol{\Phi}) = \prod_{t=1}^{N} p(y[t] \mid \boldsymbol{\theta}, \boldsymbol{\phi}(\boldsymbol{z}[t])) = \prod_{t=1}^{N} \mathcal{N}_{y[t]}(y[t|t-1], \sigma_w^2) \tag{9}$$

where $\boldsymbol{\Phi} = [\boldsymbol{\phi}(\boldsymbol{z}[1]) \; \boldsymbol{\phi}(\boldsymbol{z}[2]) \; \cdots \; \boldsymbol{\phi}(\boldsymbol{z}[N])] \in \mathbb{R}^{n \times N}$ is the $N$-sample long *regression matrix*. Then, by introduction of equation (8), and by virtue of the properties of exponential functions, it follows that (see Appendix A.1):

$$p(\boldsymbol{y} \mid \boldsymbol{\theta}, \boldsymbol{\Phi}) = \mathcal{N}_{\boldsymbol{y}}(\boldsymbol{\Phi}^T \cdot \boldsymbol{\theta}, \sigma_w^2 \cdot \boldsymbol{I}_N) \tag{10}$$

where $\boldsymbol{I}_N$ indicates the $N$-size identity matrix. The conditional PDF $p(\boldsymbol{y} \mid \boldsymbol{\theta}, \boldsymbol{\Phi})$, seen as a function of the parameter vector $\boldsymbol{\theta}$, determines the *likelihood of the parameter vector* $\mathcal{L}(\boldsymbol{\theta} \mid \boldsymbol{y}, \boldsymbol{\Phi})$. Furthermore, after assuming that the coefficient vector $\boldsymbol{\theta}$ is a deterministic variable, *Maximum Likelihood* (ML) estimates may be obtained by determining the values that maximize the likelihood $\mathcal{L}(\boldsymbol{\theta} \mid \boldsymbol{y}, \boldsymbol{\Phi})$ (Ljung, 1999, Sec. 7.4).
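For the Gaussian likelihood of equation (10), maximizing over the parameter vector is equivalent to ordinary least squares, and the ML estimate of the innovations variance is the mean squared residual. The sketch below illustrates this on a simulated AR(2) response. The function name `ml_estimate`, the chosen AR coefficients and the noise level are assumptions made for the example.

```python
import numpy as np

def ml_estimate(Phi, y):
    """ML estimates of theta and sigma_w^2 under the Gaussian model of equation (10).

    Phi is the n x N regression matrix and y the N-vector of responses.
    """
    theta, *_ = np.linalg.lstsq(Phi.T, y, rcond=None)   # least-squares solution
    residuals = y - Phi.T @ theta                        # one-step-ahead errors
    sigma2 = float(np.mean(residuals ** 2))              # ML variance (divides by N)
    return theta, sigma2

# Simulate a stable AR(2) response with known parameters (illustrative values).
rng = np.random.default_rng(0)
theta_true = np.array([1.5, -0.8])
N = 5000
y = np.zeros(N)
w = 0.1 * rng.standard_normal(N)
for t in range(2, N):
    y[t] = theta_true @ y[t - 2:t][::-1] + w[t]

# Regression matrix with columns [y[t-1], y[t-2]], as in equation (2).
Phi = np.array([y[t - 2:t][::-1] for t in range(2, N)]).T
theta_hat, sigma2_hat = ml_estimate(Phi, y[2:])
print(theta_hat, sigma2_hat)
```

The estimate is conditional on the first $n_a$ samples, which serve as initial conditions and are not themselves predicted.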

### **2.2. Limitations of the Traditional Linear Regressive Models**

Although the linear-in-the-parameters regressive time-series model shown in equation (1) is useful for representing diverse classes of time-series, including stationary, non-stationary and non-linear, it lacks the flexibility to effectively represent variable dynamics stemming from variable *Environmental and Operational Conditions* (EOCs). For instance, it is known that the elasticity modulus of a material may change with temperature, and in turn, a change in this variable would modify the natural frequencies and damping ratios associated with the dynamic response of the structure. Hence, the model in equation (1) with a fixed parameter vector *θ*, would fail to effectively represent the dynamic response of the structure over a long period of analysis.

Instead, it may be considered that during a given period of time, say $T = N / f_s$, the EOCs remain more or less constant, and consequently, the physical parameters of the structure would also remain constant. Under these conditions, the linear regression model of equation (1) is a valid representation of the dynamic response of the structure for such an analysis period, while the parameter vector $\boldsymbol{\theta}$ would change according to the EOCs. Two main questions may be identified in this regard: the first is how to select the length of the period over which the structural parameters are considered to remain more or less constant; the second is how to represent the variability in the parameters as a function of the EOCs. The period of pseudo-constant dynamics may be selected empirically by means of stationarity tests, as described for example in Kay (2008), Basu et al. (2009), and Borgnat et al. (2010). On the other hand, the representation of the variability in the dynamics of the structure as an effect of the EOCs is the main problem addressed in this work, for which a Gaussian Process Regression approach is postulated, as shown in the remainder of this work.
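The segment-wise view described above can be sketched by splitting a long record into windows of fixed duration and fitting one parameter vector per window. In the sketch below the window length is simply given as an input, whereas in practice it would be chosen through stationarity tests; the function name and the use of a plain AR model are assumptions for illustration.

```python
import numpy as np

def segment_ar_estimates(y, fs, window_s, na):
    """Fit an AR(na) model by least squares in each non-overlapping window.

    y: response record, fs: sampling rate in Hz, window_s: window length in seconds.
    Returns an array with one row of AR parameters per complete window.
    """
    n = int(round(window_s * fs))          # samples per window
    thetas = []
    for start in range(0, len(y) - n + 1, n):
        seg = y[start:start + n]
        Z = np.array([seg[t - na:t][::-1] for t in range(na, n)])  # (n - na) x na
        theta, *_ = np.linalg.lstsq(Z, seg[na:], rcond=None)
        thetas.append(theta)
    return np.array(thetas)

rng = np.random.default_rng(1)
record = rng.standard_normal(1000)         # placeholder record, 10 s at 100 Hz
thetas = segment_ar_estimates(record, fs=100.0, window_s=2.0, na=2)
print(thetas.shape)
```

Each row of the result is one realization of the parameter vector, which is the kind of data a model of the parameter variability with the EOCs would then be fitted to.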

### **2.3. Gaussian Process Time-Series Model**

The key assumption in the *Gaussian Process* (GP) regression approach is that the parameter vector of the time-series model follows a Gaussian distribution. Therefore, the linear-in-theparameters regressive time-series model of equation (1), is complemented as follows:

$$y[t] = \boldsymbol{\phi}^T(\boldsymbol{z}[t]) \cdot \boldsymbol{\theta} + w[t], \qquad w[t] \sim \text{NID}(0, \sigma_w^2) \tag{11a}$$

$$\boldsymbol{\theta} = \boldsymbol{\theta}_o(\boldsymbol{\xi}, \boldsymbol{M}) + \boldsymbol{u}, \qquad \boldsymbol{u} \sim \text{NID}(\boldsymbol{0}_{n \times 1}, \boldsymbol{\Sigma}_{\theta}) \tag{11b}$$

where *ξ* ∈ ℝ<sup>*m*</sup> is the *Environmental and Operational Parameter* (EOP) vector determining the EOCs in the analysis period, *θ<sub>o</sub>*(*ξ*, *M*) = E{*θ* | *ξ*, *M*} is the mean parameter vector, i.e., the expected value of the parameter vector given the EOP vector *ξ* and the matrix of projection coefficients *M* ∈ ℝ<sup>*n*×*p*</sup>, and *u* is an NID random vector with zero mean and covariance **Σ**<sub>*θ*</sub>. The model is completed by the following functional series expansion of the mean parameter vector:

$$\boldsymbol{\theta}_o(\boldsymbol{\xi}, \boldsymbol{M}) = \sum_{j=1}^{p} \boldsymbol{\mu}_j \cdot g_{b_j}(\boldsymbol{\xi}) = \boldsymbol{M} \cdot \boldsymbol{g}(\boldsymbol{\xi}) \tag{12}$$

where *p* is the order of the functional series expansion and:

$$\mathbf{M} = \begin{bmatrix} \mu\_1 & \cdots & \mu\_p \end{bmatrix} \in \mathbb{R}^{n \times p} \quad \mathbf{g}(\boldsymbol{\xi}) = \begin{bmatrix} g\_{b\_1}(\boldsymbol{\xi}) \\ \vdots \\ g\_{b\_p}(\boldsymbol{\xi}) \end{bmatrix} \in \mathbb{R}^{p \times 1} \tag{13}$$

are the matrix of projection coefficients and the functional basis vector containing the basis functions with indices *b* = [*b*<sub>1</sub> ··· *b<sub>p</sub>*]<sup>*T*</sup> (of size *p* × 1), with *b<sub>j</sub>* ∈ ℕ. Note that any type of non-linear function could have been used to describe the parameter variation; however, a linear-in-the-parameters structure has again been selected in equation (12) to facilitate the estimation and analysis of the model. A time-series model obeying equation (11) shall be referred to as a *Gaussian Process* (GP) time-series model. It is characterized by a set of deterministic parameters, referred to as *hyper-parameters*, *P* = {*M*, **Σ**<sub>*θ*</sub>, *σ<sub>w</sub>*<sup>2</sup>}, consisting of the matrix of projection coefficients, the parameter covariance matrix and the innovations variance.
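The generative structure of equations (11) and (12) can be illustrated with a short simulation. The following is a minimal sketch rather than the authors' implementation: it assumes a scalar EOP, a hypothetical monomial basis *g*(*ξ*) = [1, *ξ*, …]<sup>*T*</sup>, and an autoregressive regression vector *ϕ*(*z*[*t*]) = [−*y*[*t*−1] ··· −*y*[*t*−*n*]]<sup>*T*</sup>. All function names and numerical values are illustrative.

```python
import numpy as np

def basis(xi, p):
    """Hypothetical basis vector g(xi) = [1, xi, ..., xi^(p-1)] (an assumption)."""
    return float(xi) ** np.arange(p)

def sample_theta(xi, M, Sigma_theta, rng):
    """Draw theta = theta_o(xi, M) + u with u ~ N(0, Sigma_theta), eqs. (11b), (12)."""
    theta_o = M @ basis(xi, M.shape[1])
    return rng.multivariate_normal(theta_o, Sigma_theta)

def simulate_response(theta, N, sigma_w, rng):
    """Simulate y[t] = phi^T theta + w[t] with the assumed AR regression vector."""
    n = len(theta)
    y = np.zeros(N + n)                        # n leading zeros as initial conditions
    w = sigma_w * rng.standard_normal(N + n)
    for t in range(n, N + n):
        phi = -y[t - n:t][::-1]                # [-y[t-1], ..., -y[t-n]]
        y[t] = phi @ theta + w[t]
    return y[n:]

rng = np.random.default_rng(0)
M = np.array([[-1.2, -0.3],                    # illustrative projection coefficients
              [0.8, 0.05]])
Sigma_theta = np.diag([1e-4, 1e-4])            # illustrative parameter covariance
theta = sample_theta(0.5, M, Sigma_theta, rng) # parameters for the EOP value xi = 0.5
y = simulate_response(theta, N=500, sigma_w=1.0, rng=rng)
```

Each call to `sample_theta` plays the role of one analysis period: the EOP value fixes the mean parameter vector, and the random term *u* accounts for the variability not explained by the EOP.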

For a GP time-series model, the instantaneous value of the dynamic response *y*[*t*] and the parameter vector *θ* are jointly distributed Gaussian variables, with the joint PDF conditioned on the regression vector, the EOP vector and the hyper-parameters shown next:

$$p(y[t], \boldsymbol{\theta} \mid \boldsymbol{\phi}[t], \boldsymbol{\xi}, \mathcal{P}) = p(y[t] \mid \boldsymbol{\phi}[t], \boldsymbol{\theta}, \boldsymbol{\xi}, \mathcal{P}) \cdot p(\boldsymbol{\theta} \mid \boldsymbol{\phi}[t], \boldsymbol{\xi}, \mathcal{P}) \tag{14}$$

where, from equation (11), it follows that:

$$p(y[t] \mid \boldsymbol{\theta}, \boldsymbol{\phi}[t], \boldsymbol{\xi}, \mathcal{P}) = \mathcal{N}_{y[t]}(y[t|t-1], \sigma_w^2) \tag{15a}$$

$$p(\boldsymbol{\theta} \mid \boldsymbol{\phi}[t], \boldsymbol{\xi}, \mathcal{P}) = \mathcal{N}_{\boldsymbol{\theta}}(\boldsymbol{\theta}_o(\boldsymbol{\xi}, \boldsymbol{M}), \boldsymbol{\Sigma}_{\theta}) \tag{15b}$$

Similarly, when the dynamic response over the complete period of analysis of length *N*, namely *y*, is considered, the respective joint conditional PDF takes the form:

$$\begin{split} p(\boldsymbol{y},\boldsymbol{\theta} \mid \boldsymbol{\Phi},\boldsymbol{\xi},\mathcal{P}) &= \prod\_{t=1}^{N} p(\boldsymbol{y}[t],\boldsymbol{\theta} \mid \boldsymbol{\phi}[t],\boldsymbol{\xi},\mathcal{P}) \\ &= \prod\_{t=1}^{N} p(\boldsymbol{y}[t] \mid \boldsymbol{\theta},\boldsymbol{\phi}[t],\boldsymbol{\xi},\mathcal{P}) \cdot p(\boldsymbol{\theta} \mid \boldsymbol{\phi}[t],\boldsymbol{\xi},\mathcal{P}) \end{split} \tag{16}$$

which, under the Gaussianity of both *p*(*y*[*t*]|*θ*, *ϕ*[t], *ξ*, *P*) and *p*(*θ* **|***ϕ*[*t*], *ξ*, *P*), becomes:

$$p(\boldsymbol{y}, \boldsymbol{\theta} \mid \boldsymbol{\Phi}, \boldsymbol{\xi}, \mathcal{P}) = \underbrace{\mathcal{N}_{\boldsymbol{y}}(\boldsymbol{\Phi}^T \cdot \boldsymbol{\theta}, \sigma_w^2 \cdot \boldsymbol{I}_N)}_{p(\boldsymbol{y} \mid \boldsymbol{\theta}, \boldsymbol{\Phi}, \boldsymbol{\xi}, \mathcal{P})} \cdot \underbrace{\mathcal{N}_{\boldsymbol{\theta}}(\boldsymbol{\theta}_o(\boldsymbol{\xi}, \boldsymbol{M}), N^{-1} \cdot \boldsymbol{\Sigma}_{\theta})}_{p(\boldsymbol{\theta} \mid \boldsymbol{\Phi}, \boldsymbol{\xi}, \mathcal{P})} \tag{17}$$

In addition, according to the conditional density axiom, the joint PDF may be decomposed as follows (Rasmussen and Williams, 2006, p. 9):

$$\begin{split} p(\boldsymbol{y}, \boldsymbol{\theta} \mid \boldsymbol{\Phi}, \boldsymbol{\xi}, \mathcal{P}) &= p(\boldsymbol{y} \mid \boldsymbol{\theta}, \boldsymbol{\Phi}, \boldsymbol{\xi}, \mathcal{P}) \cdot p(\boldsymbol{\theta} \mid \boldsymbol{\Phi}, \boldsymbol{\xi}, \mathcal{P}) \\ &= p(\boldsymbol{\theta} \mid \boldsymbol{y}, \boldsymbol{\Phi}, \boldsymbol{\xi}, \mathcal{P}) \cdot p(\boldsymbol{y} \mid \boldsymbol{\Phi}, \boldsymbol{\xi}, \mathcal{P}) \end{split} \tag{18}$$

where *p*(*θ* | *y*, **Φ**, *ξ*, *P*) is the *posterior* PDF of the parameter vector after observing the dynamic response, and *p*(*y* | **Φ**, *ξ*, *P*) is the *marginal probability of the dynamic response*. The latter is a function of the hyper-parameters and is referred to as the *marginal likelihood* of the hyper-parameters, *L*(*P* | *y*, **Φ**, *ξ*). It is obtained by marginalizing the joint conditional PDF *p*(*y*, *θ* | **Φ**, *ξ*, *P*) with respect to all the possible values of *θ* (Rasmussen and Williams, 2006, p. 9). Given that both distributions *p*(*y* | *θ*, **Φ**, *ξ*, *P*) and *p*(*θ* | **Φ**, *ξ*, *P*) are Gaussian, the posterior parameter PDF and the marginal likelihood are Gaussian as well, of the form (Rasmussen and Williams, 2006, p. 9):

$$p(\boldsymbol{\theta} \mid \boldsymbol{y}, \boldsymbol{\Phi}, \boldsymbol{\xi}, \mathcal{P}) = \mathcal{N}_{\boldsymbol{\theta}}(\hat{\boldsymbol{\theta}}, \boldsymbol{P}_{\theta}) \tag{19a}$$

$$\mathcal{L}(\mathcal{P} \mid \boldsymbol{y}, \boldsymbol{\Phi}, \boldsymbol{\xi}) = p(\boldsymbol{y} \mid \boldsymbol{\Phi}, \boldsymbol{\xi}, \mathcal{P}) = \mathcal{N}_{\boldsymbol{y}}(\boldsymbol{\Phi}^T \cdot \boldsymbol{\theta}_o(\boldsymbol{\xi}, \boldsymbol{M}), \boldsymbol{\Sigma}_{\varepsilon}) \tag{19b}$$

where

$$\hat{\boldsymbol{\theta}} = \mathbb{E}\{\boldsymbol{\theta} \mid \boldsymbol{y}, \boldsymbol{\Phi}, \boldsymbol{\xi}, \mathcal{P}\} = \boldsymbol{\theta}_o(\boldsymbol{\xi}, \boldsymbol{M}) + \boldsymbol{K} \cdot (\boldsymbol{y} - \boldsymbol{\Phi}^T \cdot \boldsymbol{\theta}_o(\boldsymbol{\xi}, \boldsymbol{M})) \tag{20a}$$

$$\boldsymbol{P}_{\theta} = \mathbb{E}\{ (\boldsymbol{\theta} - \hat{\boldsymbol{\theta}}) \cdot (\boldsymbol{\theta} - \hat{\boldsymbol{\theta}})^{T} \mid \boldsymbol{y}, \boldsymbol{\Phi}, \boldsymbol{\xi}, \mathcal{P} \} = (\boldsymbol{I}_{n} - \boldsymbol{K} \cdot \boldsymbol{\Phi}^{T}) \cdot \boldsymbol{\Sigma}_{\theta} \tag{20b}$$

$$\boldsymbol{K} = \boldsymbol{\Sigma}_{\theta} \cdot \boldsymbol{\Phi} \cdot \boldsymbol{\Sigma}_{\varepsilon}^{-1} \tag{20c}$$

$$\boldsymbol{\Sigma}_{\varepsilon} = \sigma_w^2 \cdot \boldsymbol{I}_N + \boldsymbol{\Phi}^T \cdot \boldsymbol{\Sigma}_{\theta} \cdot \boldsymbol{\Phi} \tag{20d}$$

In the previous equations, *θ̂* ∈ ℝ<sup>*n*</sup> corresponds to the *posterior parameter mean* with the associated *posterior parameter covariance matrix* *P<sub>θ</sub>* ∈ ℝ<sup>*n*×*n*</sup>. Moreover, the quantities:

$$\tilde{y}[t|t-1] := \boldsymbol{\phi}^T(\boldsymbol{z}[t]) \cdot \boldsymbol{\theta}_o(\boldsymbol{\xi}, \boldsymbol{M}), \quad \varepsilon[t] := y[t] - \tilde{y}[t|t-1] \tag{21a}$$

$$y[t|t-1] := \boldsymbol{\phi}^T(\boldsymbol{z}[t]) \cdot \hat{\boldsymbol{\theta}}, \quad e[t] := y[t] - y[t|t-1] \tag{21b}$$

are referred to as the *prior* and *posterior* one-step-ahead predictions, respectively, with associated prior and posterior prediction errors *ε*[*t*] and *e*[*t*]. In addition, **Σ**<sub>*ε*</sub> ∈ ℝ<sup>*N*×*N*</sup> is the covariance matrix of the prior prediction error. The main difference between the two is that the prior predictions correspond to the best guess of the dynamic response in the absence of knowledge of the actual parameter vector in the period of analysis, while the posterior predictions are the best guess of the dynamic response based on (an estimate of) the actual parameter vector.
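The posterior quantities in equation (20) reduce to a few lines of linear algebra. The sketch below is an illustration under stated assumptions, not the authors' code: the function name `gp_posterior`, the synthetic regressors and all numerical values are hypothetical, and the gain in equation (20c) is computed with a linear solve instead of an explicit inverse.

```python
import numpy as np

def gp_posterior(y, Phi, theta_o, Sigma_theta, sigma_w2):
    """Posterior mean and covariance of theta, equations (20a)-(20d).

    y: (N,) response; Phi: (n, N) regression matrix; theta_o: (n,) prior mean.
    """
    N, n = y.shape[0], theta_o.shape[0]
    Sigma_eps = sigma_w2 * np.eye(N) + Phi.T @ Sigma_theta @ Phi   # (20d)
    # (20c): K = Sigma_theta Phi Sigma_eps^{-1}, computed with a linear solve
    K = np.linalg.solve(Sigma_eps, Phi.T @ Sigma_theta).T
    eps = y - Phi.T @ theta_o                                      # prior error, (21a)
    theta_hat = theta_o + K @ eps                                  # (20a)
    P_theta = (np.eye(n) - K @ Phi.T) @ Sigma_theta                # (20b)
    return theta_hat, P_theta

# Synthetic data (illustrative values only)
rng = np.random.default_rng(0)
n, N = 3, 40
Phi = rng.standard_normal((n, N))
theta_o = np.array([0.5, -0.2, 0.1])
Sigma_theta = np.diag([0.1, 0.2, 0.05])
theta_true = rng.multivariate_normal(theta_o, Sigma_theta)
y = Phi.T @ theta_true + 0.3 * rng.standard_normal(N)
theta_hat, P_theta = gp_posterior(y, Phi, theta_o, Sigma_theta, sigma_w2=0.09)
```

By the matrix inversion lemma, the same result is obtained from the information form *P<sub>θ</sub>* = (**Σ**<sub>*θ*</sub><sup>−1</sup> + **Φ** · **Φ**<sup>*T*</sup>/*σ<sub>w</sub>*<sup>2</sup>)<sup>−1</sup>, which may be cheaper when *n* is much smaller than *N*.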

#### 2.3.1. Remark – Unknown or Non-Measurable Sources of Uncertainty

The GP time-series model may also be used when the EOP vector is either unknown or unmeasurable. In that case, the functional series expansion of the mean parameter vector in equation (12) is limited to a single constant term, so that *θ<sub>o</sub>*(*ξ*, *M*) := *θ<sub>o</sub>* = *µ*<sub>1</sub> · 1, while the parameter PDF reduces to the conventional multivariate Gaussian model, so that *θ* ∼ *N*<sub>*θ*</sub>(*θ<sub>o</sub>*, **Σ**<sub>*θ*</sub>). Such a case has been explored in Avendaño-Valencia et al. (2015a).

### **2.4. Regression on a Reduced Parameter Set** *via* **Principal Component Regression**

A potential difficulty in the adoption of GP time-series models lies in the computational cost due to the large number of parameters that need to be estimated. Moreover, a coefficient vector covariance matrix **Σ**<sub>*θ*</sub> with full structure implies that several of the estimated parameters are redundant and/or unnecessary. A potential solution to this problem is to use a dimensionality reduction scheme for the regression matrix, such as in *Principal Component Regression* (PCR), which is explained next.

To start with, consider the matrix **Φ̄**<sub>*K*</sub> = [**Φ**<sub>1</sub> **Φ**<sub>2</sub> ··· **Φ**<sub>*K*</sub>]<sup>*T*</sup> ∈ ℝ<sup>*n*×(*N*·*K*)</sup> containing the regression matrices associated with the set of dynamic responses *Y<sub>K</sub>*, which possesses the *Singular Value Decomposition* (SVD) **Φ̄**<sub>*K*</sub> = *U* · **Λ** · *V*<sup>*T*</sup>, where *U* ∈ ℝ<sup>*n*×*m*</sup> and *V* ∈ ℝ<sup>(*N*·*K*)×*m*</sup> are orthogonal matrices, and **Λ** ∈ ℝ<sup>*m*×*m*</sup>, with *m* = min{(*N*·*K* − 1), *n*} designating the rank of **Φ̄**<sub>*K*</sub>, is a diagonal singular value matrix with entries *λ*<sub>1</sub> ≥ *λ*<sub>2</sub> ≥ ··· ≥ *λ<sub>m</sub>*. Moreover, consider the vector *d* = [*d*<sub>1</sub> *d*<sub>2</sub> ··· *d<sub>m̃</sub>*]<sup>*T*</sup> of dimension *m̃* ≤ *m* containing the indices of the selected singular values, and the truncated eigenvector matrices *Ũ* ∈ ℝ<sup>*n*×*m̃*</sup> and *Ṽ* ∈ ℝ<sup>(*N*·*K*)×*m̃*</sup> built from the columns of *U* and *V* corresponding to the indices in *d*.

Hence, the regression vector *ϕ*(*z*[*t*]) may be projected onto the column eigenspace, so that:

$$
\tilde{\mathbf{U}} \cdot \tilde{\boldsymbol{\phi}}(\boldsymbol{z}[t]) = \boldsymbol{\phi}(\boldsymbol{z}[t]) \quad \rightsquigarrow \quad \tilde{\boldsymbol{\phi}}(\boldsymbol{z}[t]) = \tilde{\mathbf{U}}^T \cdot \boldsymbol{\phi}(\boldsymbol{z}[t]) \tag{22}
$$

Then, upon substitution of the previous result into the original regression model, the alternative *Principal Component Gaussian Process Regression* (PC-GP) time-series model is obtained:

$$y[t] = \tilde{\boldsymbol{\phi}}^T(\boldsymbol{z}[t]) \cdot \boldsymbol{\vartheta} + w[t], \qquad w[t] \sim \text{NID}(0, \sigma_w^2) \tag{23a}$$

$$\boldsymbol{\vartheta} = \boldsymbol{\vartheta}_o(\boldsymbol{\xi}, \tilde{\boldsymbol{M}}) + \tilde{\boldsymbol{u}}, \qquad \tilde{\boldsymbol{u}} \sim \text{NID}(\boldsymbol{0}_{\tilde{m} \times 1}, \boldsymbol{\Sigma}_{\vartheta}) \tag{23b}$$

where *ϑ* = *Ũ*<sup>*T*</sup> · *θ* ∈ ℝ<sup>*m̃*</sup> is a reduced-dimensionality parameter vector. Note that the matrix *U* and the scaled singular values **Λ**<sup>2</sup>/*N* correspond to the matrix of principal vectors and the matrix of principal values of the Principal Component Analysis (PCA) of the covariance matrix estimate **Φ̄** · **Φ̄**<sup>*T*</sup>/*N*, hence the name *Principal Component Regression* (PCR) (Bair et al., 2006; Hastie et al., 2009, Section 3.5).

Two main advantages are obtained through the use of the PC-GP method (Hastie et al., 2009): (i) the dimension of the parameter vector is reduced from *n* to *m̃*; (ii) since the regression is built on orthogonal regressors (contained in the matrix *Ũ*), the entries of the reduced parameter vector *ϑ* are also uncorrelated, and thus the corresponding covariance matrix **Σ**<sub>*ϑ*</sub> is diagonal. Consequently, only the diagonal elements of the matrix need to be estimated. Additionally, the original parameter vector may be retrieved *via* the operation:

$$\boldsymbol{\theta} = \tilde{\boldsymbol{U}} \cdot \boldsymbol{\vartheta} \tag{24}$$
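The projection in equation (22) and the recovery in equation (24) can be sketched with a thin SVD. The example is illustrative only: it assumes that the retained indices *d* are simply those of the *m̃* largest singular values, and the helper name `pcr_basis` and the synthetic regressors are hypothetical.

```python
import numpy as np

def pcr_basis(Phi_bar, m_tilde):
    """Truncated left singular vectors of the stacked regression matrix.

    Phi_bar: (n, N*K) matrix; m_tilde: number of retained components.
    Assumption: the retained indices d are the m_tilde largest singular values.
    """
    U, lam, _ = np.linalg.svd(Phi_bar, full_matrices=False)  # lam sorted descending
    return U[:, :m_tilde], lam

rng = np.random.default_rng(0)
n, N, K = 6, 50, 4
# Illustrative regressors lying in a 3-dimensional subspace of R^n
B = rng.standard_normal((n, 3))
Phi_bar = B @ rng.standard_normal((3, N * K))
U_tilde, lam = pcr_basis(Phi_bar, m_tilde=3)

Phi_tilde = U_tilde.T @ Phi_bar        # projected regressors, equation (22)
vartheta = rng.standard_normal(3)      # a reduced parameter vector
theta = U_tilde @ vartheta             # recovered full vector, equation (24)
```

Because the synthetic regressors span only three directions, the remaining singular values are numerically zero and the reduced model reproduces the predictions of the full one, **Φ̄**<sup>*T*</sup> · *θ* = **Φ̃**<sup>*T*</sup> · *ϑ*.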

### **3. IDENTIFICATION OF THE GP TIME-SERIES MODEL**

The identification of a GP time-series model may be stated as the problem of determining the hyper-parameters *P* and the structural parameters (consisting of the model and basis orders) that best fit a given set of *K* dynamic responses, say *Y<sub>K</sub>* = {*y*<sub>1</sub>, *y*<sub>2</sub>, …, *y<sub>K</sub>*}. Depending on the model type, a corresponding set of excitation inputs *X<sub>K</sub>* = {*x*<sub>1</sub>, *x*<sub>2</sub>, …, *x<sub>K</sub>*} and EOP vectors **Ξ**<sub>*K*</sub> = {*ξ*<sub>1</sub>, *ξ*<sub>2</sub>, …, *ξ<sub>K</sub>*} are also provided. In addition, it may be of interest to determine the parameter vector associated with each realization, i.e., *θ<sub>k</sub>* for all *k* = 1, …, *K*. These three topics are analyzed next.

### **3.1. Estimation of the Parameter Vectors of Individual Realizations**

*Maximum A Posteriori* (MAP) estimates of the parameter vector of the GP time-series model for a single realization *y<sub>k</sub>* are obtained by evaluating the values that maximize the posterior PDF *p*(*θ<sub>k</sub>* | *y<sub>k</sub>*, **Φ**<sub>*k*</sub>, *ξ<sub>k</sub>*, *P*) in equation (19a). Given that the posterior distribution is Gaussian, the MAP estimates correspond to the posterior mean (equal to the mode of the Gaussian distribution), which may be computed *via* equation (20) for given values of *P*.

### **3.2. Estimation of the Hyperparameters**

### 3.2.1. Maximum Likelihood Estimation of the Hyperparameters

Maximum likelihood (ML) estimates of the hyperparameters are obtained *via* optimization of the marginalized hyper-parameter likelihood for the complete set of data. Accordingly, ML estimates are obtained from the optimization problem (Rasmussen and Williams, 2006, ch. 5; Shumway and Stoffer, 2011, ch. 6):

$$\hat{\mathcal{P}} = \arg\max\_{\mathcal{P}} \sum\_{k=1}^{K} \ln \mathcal{L}(\mathcal{P}|\mathbf{y}\_k, \Phi\_k, \xi\_k) \tag{25a}$$

$$\sum_{k=1}^{K} \ln \mathcal{L}(\mathcal{P} \mid \boldsymbol{y}_k, \boldsymbol{\Phi}_k, \boldsymbol{\xi}_k) = -\frac{N \cdot K}{2} \ln 2\pi - \frac{1}{2} \sum_{k=1}^{K} \left( \ln |\boldsymbol{\Sigma}_{\varepsilon_k}| + \boldsymbol{\varepsilon}_k^T \cdot \boldsymbol{\Sigma}_{\varepsilon_k}^{-1} \cdot \boldsymbol{\varepsilon}_k \right) \tag{25b}$$

where *ε<sub>k</sub>* = [*ε<sub>k</sub>*[1] *ε<sub>k</sub>*[2] ··· *ε<sub>k</sub>*[*N*]]<sup>*T*</sup> (of size *N* × 1) is the vector of prior prediction errors, with *ε<sub>k</sub>*[*t*] defined in equation (21a). Although it is possible to analytically solve the ML optimization problem in equation (25a) for some of the hyper-parameters (in particular for a constant parameter mean), the problem becomes intractable for other quantities. Alternatively, the Expectation-Maximization (EM) algorithm constitutes a powerful tool to solve this optimization problem.
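For given hyper-parameter values, the log marginal likelihood in equation (25b) can be evaluated directly. The sketch below is a hedged illustration: the function name and argument layout are assumptions, and the log-determinant is obtained with `numpy.linalg.slogdet` to avoid overflow for long records.

```python
import numpy as np

def log_marginal_likelihood(Y, Phis, theta_os, Sigma_theta, sigma_w2):
    """Sum of log marginal likelihoods over K realizations, equation (25b).

    Y: sequence of (N,) responses; Phis: sequence of (n, N) regression matrices;
    theta_os: sequence of (n,) mean parameter vectors theta_o(xi_k, M).
    """
    total = 0.0
    for y, Phi, theta_o in zip(Y, Phis, theta_os):
        N = y.shape[0]
        Sigma_eps = sigma_w2 * np.eye(N) + Phi.T @ Sigma_theta @ Phi   # (20d)
        eps = y - Phi.T @ theta_o                                      # (21a)
        _, logdet = np.linalg.slogdet(Sigma_eps)
        quad = eps @ np.linalg.solve(Sigma_eps, eps)
        total += -0.5 * (N * np.log(2.0 * np.pi) + logdet + quad)
    return total
```

When **Σ**<sub>*θ*</sub> = **0** the expression collapses to the ordinary Gaussian log-likelihood with independent errors of variance *σ<sub>w</sub>*<sup>2</sup>, which gives a simple sanity check.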

### 3.2.2. Expectation-Maximization Algorithm for Efficient Computation of the ML Estimates

The *Expectation-Maximization* (EM) algorithm attempts to maximize the conditional expectation of the logarithm of the joint conditional PDF *p*(*y<sub>k</sub>*, *θ<sub>k</sub>* | **Φ**<sub>*k*</sub>, *ξ<sub>k</sub>*, *P*), with respect to the available data (Shumway and Stoffer, 2011, ch. 6). Formally expressed, the EM algorithm aims at maximizing the expected log-likelihood:

$$Q(\mathcal{P} \mid \mathcal{P}^{(-)}) = \mathbb{E}_{\boldsymbol{\theta} \mid \mathcal{D}, \mathcal{P}^{(-)}} \left\{ \sum_{k=1}^{K} \ln p(\boldsymbol{y}_k, \boldsymbol{\theta}_k \mid \boldsymbol{\Phi}_k, \boldsymbol{\xi}_k, \mathcal{P}) \right\} \tag{26}$$

where E<sub>*θ*|*D*,*P*<sup>(−)</sup></sub>{·} denotes the conditional expectation of the argument with respect to the space of *θ* given the data *D* = {*y<sub>k</sub>*, **Φ**<sub>*k*</sub>, *ξ<sub>k</sub>*}, ∀ *k* = 1, …, *K*, and the hyper-parameters *P*<sup>(−)</sup>, and where:

$$\begin{split} &\sum_{k=1}^{K} \ln p(\boldsymbol{y}_k, \boldsymbol{\theta}_k \mid \boldsymbol{\Phi}_k, \boldsymbol{\xi}_k, \mathcal{P}) \\ &= -\frac{1}{2} \sum_{k=1}^{K} \Big( N \cdot \ln \sigma_w^2 + \sigma_w^{-2} (\boldsymbol{y}_k - \boldsymbol{\Phi}_k^T \cdot \boldsymbol{\theta}_k)^T \cdot (\boldsymbol{y}_k - \boldsymbol{\Phi}_k^T \cdot \boldsymbol{\theta}_k) \\ &\qquad + \ln |\boldsymbol{\Sigma}_{\theta}| + (\boldsymbol{\theta}_k - \boldsymbol{\theta}_o(\boldsymbol{\xi}_k, \boldsymbol{M}))^T \cdot \boldsymbol{\Sigma}_{\theta}^{-1} \cdot (\boldsymbol{\theta}_k - \boldsymbol{\theta}_o(\boldsymbol{\xi}_k, \boldsymbol{M})) \Big) \end{split} \tag{27}$$

Thus, after evaluating the expectation, the expected log-likelihood of the GP time-series model becomes:

$$\begin{split} Q(\mathcal{P} \mid \mathcal{P}^{(-)}) &= Q_1(\mathcal{P} \mid \mathcal{P}^{(-)}) + Q_2(\mathcal{P} \mid \mathcal{P}^{(-)}) \\ Q_1(\mathcal{P} \mid \mathcal{P}^{(-)}) &= -\frac{K \cdot N}{2} \ln \sigma_w^2 - \frac{1}{2\sigma_w^2} \sum_{k=1}^{K} \left( (\boldsymbol{e}_k^{(-)})^T \cdot \boldsymbol{e}_k^{(-)} + \text{tr}\left(\boldsymbol{\Phi}_k^T \cdot \boldsymbol{P}_k^{(-)} \cdot \boldsymbol{\Phi}_k\right) \right) \\ Q_2(\mathcal{P} \mid \mathcal{P}^{(-)}) &= -\frac{K}{2} \ln |\boldsymbol{\Sigma}_{\theta}| - \frac{1}{2} \sum_{k=1}^{K} \left( (\boldsymbol{\delta}_k^{(-)})^T \cdot \boldsymbol{\Sigma}_{\theta}^{-1} \cdot \boldsymbol{\delta}_k^{(-)} + \text{tr}\left(\boldsymbol{\Sigma}_{\theta}^{-1} \cdot \boldsymbol{P}_k^{(-)}\right) \right) \end{split} \tag{28}$$

where tr(*·*) denotes the trace operation, and:

$$\boldsymbol{e}_k^{(-)} = \boldsymbol{y}_k - \boldsymbol{\Phi}_k^T \cdot \hat{\boldsymbol{\theta}}_k^{(-)}, \quad \boldsymbol{e}_k^{(-)} \in \mathbb{R}^N \tag{29a}$$

$$\boldsymbol{\delta}_k^{(-)} = \hat{\boldsymbol{\theta}}_k^{(-)} - \boldsymbol{\theta}_o\left(\boldsymbol{\xi}_k, \boldsymbol{M}^{(-)}\right), \quad \boldsymbol{\delta}_k^{(-)} \in \mathbb{R}^n \tag{29b}$$

and *θ̂<sub>k</sub>*<sup>(−)</sup> and *P<sub>k</sub>*<sup>(−)</sup> are the MAP estimates of the mean and covariance of the coefficient vector given the hyperparameter values *P*<sup>(−)</sup> = {*M*<sup>(−)</sup>, **Σ**<sub>*θ*</sub><sup>(−)</sup>, *σ<sub>w</sub>*<sup>2(−)</sup>}, obtained with equation (20).

The Expectation-Maximization algorithm operates by selecting some initial hyperparameter values *P*<sup>(0)</sup>, and then, at each iteration *i* = 1, 2, …, the following two steps are performed:

### 3.2.3. Expectation Step (E-Step)

The expected log-likelihood is evaluated based on the previous hyper-parameter values, *P*<sup>(*i*−1)</sup>. This translates into the evaluation of the mean and covariance matrix of the posterior coefficient PDF, by applying equation (20) to all the available dynamic response realizations *k* = 1, …, *K*.

### 3.2.4. Maximization Step (M-Step)

Updated hyper-parameter values are obtained by computing the values that maximize the expected log-likelihood function, that is:

$$\mathcal{P}^{(i)} = \underset{\mathcal{P}}{\text{arg}\,\text{max}}\ Q\left(\mathcal{P}|\mathcal{P}^{(i-1)}\right) \tag{30}$$

which leads to the update equations:

$$\boldsymbol{M}^{(i)} = \left(\sum_{k=1}^{K} \hat{\boldsymbol{\theta}}_{k}^{(i-1)} \cdot \boldsymbol{g}^{T}(\boldsymbol{\xi}_{k})\right) \cdot \left(\sum_{k=1}^{K} \boldsymbol{g}(\boldsymbol{\xi}_{k}) \cdot \boldsymbol{g}^{T}(\boldsymbol{\xi}_{k})\right)^{-1} \tag{31a}$$

$$\boldsymbol{\Sigma}_{\theta}^{(i)} = \frac{1}{K} \sum_{k=1}^{K} \left( \boldsymbol{\delta}_{k}^{(i-1)} \cdot (\boldsymbol{\delta}_{k}^{(i-1)})^T + \boldsymbol{P}_{k}^{(i-1)} \right) \tag{31b}$$

$$\sigma_w^{2(i)} = \frac{1}{K \cdot N} \sum_{k=1}^{K} \left( (\boldsymbol{e}_{k}^{(i-1)})^{T} \cdot \boldsymbol{e}_{k}^{(i-1)} + \text{tr}\left( \boldsymbol{\Phi}_{k}^{T} \cdot \boldsymbol{P}_{k}^{(i-1)} \cdot \boldsymbol{\Phi}_{k} \right) \right) \tag{31c}$$

Moreover, if the parameter vector is constant, then the update equation for the mean parameter vector reduces to:

$$\boldsymbol{\theta}_o^{(i)} = \frac{1}{K} \sum_{k=1}^{K} \hat{\boldsymbol{\theta}}_k^{(i-1)} \tag{32}$$

In addition, in the case of the PC-GP time-series model, the diagonal structure of the coefficient covariance matrix leads to the simplified update equation:

$$\sigma_{\vartheta_j}^{2(i)} = \frac{1}{K} \sum_{k=1}^{K} \left( \left( \vartheta_{j}^{(i-1)} - \vartheta_{j,o}^{(i-1)} \right)^{2} + \left[ \boldsymbol{P}_{k}^{(i-1)} \right]_{j,j} \right), \qquad \forall j = 1, 2, \dots, \tilde{m} \tag{33}$$

where [*M*]<sub>*a*,*b*</sub> is the entry of matrix *M* on row *a* and column *b*.

The E and M steps are iterated until a specified number of iterations, say *N*<sub>iter</sub>, is reached, or until convergence, which may be assessed by checking whether the norm of the difference between the current and previous values of the marginal likelihood and of the hyperparameter estimates is lower than a pre-specified threshold. The latter translates into monitoring whether any of the following conditions is true:

$$\Delta_{\mathcal{P}} \ge |\mathcal{P}^{(i)} - \mathcal{P}^{(i-1)}| \tag{34a}$$

$$\Delta_{\mathcal{L}} \ge \left| \ln \mathcal{L}(\mathcal{P}^{(i)} \mid Y_K) - \ln \mathcal{L}(\mathcal{P}^{(i-1)} \mid Y_K) \right| \tag{34b}$$

$$i \ge N_{\text{iter}} \tag{34c}$$

where ∆<sub>*P*</sub> and ∆<sub>*L*</sub> are thresholds on the absolute difference of the hyperparameter and marginal likelihood updates, respectively.

The EM algorithm has been shown not to decrease the marginal likelihood (equation (25b)) at any iteration and to converge to a local maximum of the marginal likelihood located in the neighborhood of the given initial values (Shumway and Stoffer, 2011). In order to facilitate convergence toward the global maximum, it is essential to provide a suitable set of initial hyperparameter values. These may be derived from an initial set of estimates of the coefficient vectors *θ<sub>k</sub>* obtained with traditional least squares or maximum likelihood methods, as explained for example in Ljung (1999).
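The E and M steps described above can be combined into a compact loop. The following is a minimal sketch under stated assumptions, not the authors' implementation: the regressors are synthetic Gaussian draws, the basis is a hypothetical *g*(*ξ*) = [1, *ξ*]<sup>*T*</sup>, the vector *δ<sub>k</sub>* is formed with the previous *M* as written in equation (29b), and only a hyper-parameter change criterion in the spirit of equation (34a) is monitored.

```python
import numpy as np

def em_gp(Y, Phis, G, M, Sigma, s2, n_iter=100, tol=1e-6):
    """EM iterations for the hyper-parameters {M, Sigma_theta, sigma_w^2}.

    Y: (K, N) responses; Phis: (K, n, N) regression matrices;
    G: (K, p) matrix whose k-th row is g(xi_k); M, Sigma, s2: initial values.
    """
    K, N = Y.shape
    n = M.shape[0]
    GtG_inv = np.linalg.inv(G.T @ G)
    for _ in range(n_iter):
        thetas = np.empty((K, n))
        S_acc = np.zeros((n, n))
        s2_acc = 0.0
        # E-step: posterior mean and covariance of each theta_k, equation (20)
        for k in range(K):
            Phi, theta_o = Phis[k], M @ G[k]
            Sigma_eps = s2 * np.eye(N) + Phi.T @ Sigma @ Phi
            Kg = np.linalg.solve(Sigma_eps, Phi.T @ Sigma).T
            thetas[k] = theta_o + Kg @ (Y[k] - Phi.T @ theta_o)
            P = (np.eye(n) - Kg @ Phi.T) @ Sigma
            delta = thetas[k] - theta_o                  # (29b), uses previous M
            e = Y[k] - Phi.T @ thetas[k]                 # (29a)
            S_acc += np.outer(delta, delta) + P
            s2_acc += e @ e + np.trace(Phi.T @ P @ Phi)
        # M-step: update equations (31a)-(31c)
        M_new = thetas.T @ G @ GtG_inv
        Sigma_new = S_acc / K
        s2_new = s2_acc / (K * N)
        # Stopping rule in the spirit of (34a): largest hyper-parameter change
        change = max(np.abs(M_new - M).max(), np.abs(Sigma_new - Sigma).max(),
                     abs(s2_new - s2))
        M, Sigma, s2 = M_new, Sigma_new, s2_new
        if change < tol:
            break
    return M, Sigma, s2

# Synthetic example (all values illustrative)
rng = np.random.default_rng(0)
K, N, n = 200, 50, 2
xi = rng.uniform(0.0, 1.0, K)
G = np.column_stack([np.ones(K), xi])            # hypothetical basis g(xi) = [1, xi]
M_true = np.array([[1.0, 0.5], [-0.5, 2.0]])
Sigma_true = np.diag([0.04, 0.09])
Phis = rng.standard_normal((K, n, N))
thetas_true = G @ M_true.T + rng.multivariate_normal(np.zeros(n), Sigma_true, K)
Y = np.einsum('knt,kn->kt', Phis, thetas_true) + 0.5 * rng.standard_normal((K, N))
M_hat, Sigma_hat, s2_hat = em_gp(Y, Phis, G, np.zeros((n, 2)), np.eye(n), 1.0)
```

In practice the loop would be started from least-squares estimates instead of the deliberately poor initial values used here, and the marginal likelihood criterion of equation (34b) could be monitored as well.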

### **3.3. Model Assessment and Validation**

Once the GP time-series model has been estimated, it is important to determine the performance of the model. Likewise, it may be of interest to compare it with other model structures and determine which one is best for the data. For that purpose, the main tool for evaluating the performance is the marginal likelihood shown in equation (25b). However, precise evaluation of the marginal likelihood may be non-trivial, in particular because the prior prediction error covariance matrix is unknown. Instead, it may be preferable to evaluate the *Residual Sum of Squares* normalized by the *Series Sum of Squares* (RSS/SSS) based on the prior estimation residuals *ε̂*[*t*], as follows:

$$\begin{split} \text{RSS/SSS}_{(prior)} &= \left(\sum_{k=1}^{K} \sum_{t=1}^{N} \hat{\varepsilon}_{k}^{2}[t]\right) \bigg/ \left(\sum_{k=1}^{K} \sum_{t=1}^{N} y_{k}^{2}[t]\right), \\ \hat{\varepsilon}_{k}[t] &= y_{k}[t] - \boldsymbol{\phi}_{k}^{T}(\boldsymbol{z}[t]) \cdot \boldsymbol{\theta}_{o}(\boldsymbol{\xi}_{k}, \hat{\boldsymbol{M}}) \end{split} \tag{35}$$

where *M̂* corresponds to the estimate of the matrix of projection coefficients.

Alternatively, a validation error can be evaluated in order to assess the representation effectiveness and generalization ability of the GP time-series model. In this sense, consider the *validation set* *Y<sub>L</sub>*<sup>(*v*)</sup> = {*y*<sub>1</sub><sup>(*v*)</sup>, ···, *y<sub>L</sub>*<sup>(*v*)</sup>} consisting of *L* dynamic responses, whose elements are independent from the set *Y<sub>K</sub>* used for estimation (training) of the model. Then, the prior RSS/SSS in equation (35) may be evaluated based on the validation set, where the prior estimation residuals are replaced by the validation error *ε̂<sub>l</sub>*<sup>(*v*)</sup>[*t*], so that:

$$\begin{split} \text{RSS/SSS}_{(prior)}^{(v)} &= \left( \sum_{l=1}^{L} \sum_{t=1}^{N} \left( \hat{\varepsilon}_{l}^{(v)}[t] \right)^{2} \right) \Bigg/ \left( \sum_{l=1}^{L} \sum_{t=1}^{N} \left( y_{l}^{(v)}[t] \right)^{2} \right), \\ \hat{\varepsilon}_{l}^{(v)}[t] &= y_{l}^{(v)}[t] - \boldsymbol{\phi}^{T}(\boldsymbol{z}_{l}^{(v)}[t]) \cdot \boldsymbol{\theta}_{o}(\boldsymbol{\xi}_{l}^{(v)}, \hat{\boldsymbol{M}}) \end{split} \tag{36}$$

The validation RSS/SSS may be associated with the empirical risk (Vapnik, 2000, ch. 1) for the loss function *L*(*y<sub>k</sub>*[*t*], *y<sub>k</sub>*[*t*|*t*−1]) = *ε̂<sub>k</sub>*<sup>2</sup>[*t*] · (∑<sub>*k*=1</sub><sup>*K*</sup> ∑<sub>*t*=1</sub><sup>*N*</sup> *y<sub>k</sub>*<sup>2</sup>[*t*])<sup>−1</sup>.
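The criteria in equations (35) and (36) share the same computation and differ only in the data set on which they are evaluated. A minimal sketch follows; the function names and array layout are assumptions made for illustration.

```python
import numpy as np

def rss_sss(Y, Y_pred):
    """Residual sum of squares over series sum of squares, as in eqs. (35)-(36).

    Y, Y_pred: arrays of shape (K, N) with responses and one-step-ahead predictions.
    """
    Y, Y_pred = np.asarray(Y, dtype=float), np.asarray(Y_pred, dtype=float)
    return np.sum((Y - Y_pred) ** 2) / np.sum(Y ** 2)

def prior_predictions(Phis, G, M_hat):
    """Prior predictions phi^T theta_o(xi_k, M_hat) for every realization.

    Phis: (K, n, N) regression matrices; G: (K, p) rows g(xi_k); M_hat: (n, p).
    """
    theta_o = G @ M_hat.T                              # (K, n) mean parameter vectors
    return np.einsum('knt,kn->kt', Phis, theta_o)
```

Calling `rss_sss(Y_val, prior_predictions(Phis_val, G_val, M_hat))` on held-out records (hypothetical arrays `Y_val`, `Phis_val`, `G_val`) gives the validation criterion of equation (36); a value of 0 indicates perfect prediction, while a value of 1 is obtained when the prediction is identically zero.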

### **4. CASE STUDIES**

### **4.1. Long-term Identification of the Acceleration Response in the Humber Bridge**

#### 4.1.1. Data Description and Preprocessing

The Humber Bridge is a long-span suspension bridge joining the small towns of Hessle (north) and Barton (south) in the UK. The main span of the bridge measures 1,410 m and is built on aerodynamic steel box girders and inclined hangers, supported by two reinforced concrete towers rising 155.5 m above the caisson foundations. The bridge is exposed to prevailing southwesterly cyclonic winds that can reach hurricane force (exceeding 32.7 m/s), with atmospheric temperatures ranging from −10 to 30°C. Further details of the structure of the Humber Bridge may be found in Rahbari et al. (2015). The bridge has been instrumented with various sensors, including GPS antennas, accelerometers, inclinometers and extensometers. In addition, various environmental variables, including wind speed and temperature at different locations of the bridge, are also measured. The monitoring campaign ran from January 11, 2011, to December 2, 2013. In the present study, the vertical acceleration response signal measured at the midspan on the east side of the deck is selected for analysis, while the wind speed is used as EOP.

The vertical acceleration response signal is originally sampled at 20 Hz. For the present study, however, it is down-sampled to 2 Hz in order to focus the analysis on the main structural frequencies, located under 1 Hz, while at the same time reducing the model complexity. Acceleration and wind speed signal segments of 250 s (*N* = 500 samples) are extracted every 30 min, thus resulting in a maximum of 48 segments per day. In the analysis presented here, the average wind speed on each analysis period is considered as the only EOP for the construction of a GP-AR model of the vertical vibration of the bridge. In order to reduce the parameter uncertainty and the computational cost, only the vibration records corresponding to the main wind direction (about 90 ± 20°) are utilized in the construction of the model. Note, however, that a more comprehensive representation of the vibration response of the bridge may be achieved by also considering the wind direction as an EOP in the GP-AR model. This issue shall be addressed in future work. After the selection of the vibration records corresponding to the main wind direction and the removal of artifacts due to problems of the measuring system, a total of 7,000 signal segments are finally obtained for the construction of the model.

**Figure 1** provides a histogram of the average wind speed corresponding to the selected vibration signals. In addition, **Figure 2** displays a typical daily variation of the power spectral density (PSD) of the vertical acceleration response and the average wind speed (irrespective of the wind direction). The obtained PSDs demonstrate that the main natural frequencies remain relatively stable, although the amplitude of the vibration exhibits large variations even during a single day. In particular, it is evident that during the low wind speed period observed in the first 4 h of the analyzed period, the power of the vibration is much lower than in later hours, when the wind speed increases.

### 4.1.2. Model Identification

#### *4.1.2.1. Modeling of the Short-term Response*

An AutoRegressive (AR) model structure is selected to represent the acceleration response on short, 250 s, intervals. The order of the AR model is selected by evaluating different model structures with orders in the range *n<sub>a</sub>* = [1, …, 100]. A subset of 7 days of data is selected to determine the model order. The prior RSS/SSS and Bayesian Information Criterion (BIC) curves, as well as the frequency stabilization plot, displaying the natural frequencies and damping ratios associated with the estimated AR models for increasing orders, are shown in **Figure 3**. It may be observed that the RSS/SSS tends to favor large AR orders, while the BIC exhibits several local minima, with a global minimum at *n<sub>a</sub>* = 72. The frequency stabilization plot seems to confirm these findings, by demonstrating that the main frequency peaks found in the PSD are accommodated by AR models with order around *n<sub>a</sub>* = 72. Thus, according to the frequency stabilization plot and the BIC curve, the selected model order for subsequent analysis is *n<sub>a</sub>* = 72.
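The order-selection procedure can be sketched with ordinary least squares. The example below is a hedged illustration on synthetic data: the exact BIC expression used by the authors is not stated in this section, so the common form BIC(*n<sub>a</sub>*) = *N*<sub>eff</sub> · ln(RSS/*N*<sub>eff</sub>) + *n<sub>a</sub>* · ln *N*<sub>eff</sub> is assumed, and the function names are hypothetical.

```python
import numpy as np

def fit_ar(y, na, start):
    """Least-squares AR(na) fit of y[t] = -sum_i a_i y[t-i] + w[t] for t >= start."""
    N = len(y)
    X = np.column_stack([-y[start - i:N - i] for i in range(1, na + 1)])
    target = y[start:]
    a, *_ = np.linalg.lstsq(X, target, rcond=None)
    return a, target - X @ a

def select_order(y, max_order):
    """Return the order minimizing the (assumed) BIC, together with the BIC curve."""
    n_eff = len(y) - max_order                 # same samples used for every order
    bic = []
    for na in range(1, max_order + 1):
        _, res = fit_ar(y, na, start=max_order)
        bic.append(n_eff * np.log(res @ res / n_eff) + na * np.log(n_eff))
    return int(np.argmin(bic)) + 1, np.array(bic)

# Synthetic AR(2) record (illustrative coefficients a = [-1.5, 0.9])
rng = np.random.default_rng(0)
y = np.zeros(4000)
w = rng.standard_normal(4000)
for t in range(2, 4000):
    y[t] = 1.5 * y[t - 1] - 0.9 * y[t - 2] + w[t]
a_hat, _ = fit_ar(y, 2, start=2)
best_order, bic_curve = select_order(y, max_order=10)
```

Fitting every candidate order on the same set of samples keeps the residual sums of squares comparable across orders, which matters when the BIC differences between neighboring orders are small.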

#### *4.1.2.2. Modeling of the Long-term Response*

The long-term acceleration response of the bridge is represented by means of a GP-AR model where the 250 s average wind speed is used as the EOP *ξ*. For this purpose, the average wind speed is normalized from the range [0, 30] m/s to the range [0, 1] by setting *ξ* = AWS/30, where AWS stands for the 250 s Average Wind Speed. The functional basis used for the expansion of the parameter vector of the model corresponds to the class of Hermite orthogonal polynomials, which satisfy the recurrence relation:

$$\begin{split} g_{j+1}(\xi) &= H_j(\xi) = \xi \cdot H_{j-1}(\xi) - (j-1) \cdot H_{j-2}(\xi), \quad \forall j \ge 2 \\ g_1(\xi) &= H_0(\xi) = 1, \qquad g_2(\xi) = H_1(\xi) = \xi \end{split} \tag{37}$$
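The recurrence in equation (37) can be evaluated directly to build the basis vector *g*(*ξ*). The following is a small sketch; the function name `hermite_basis` is hypothetical.

```python
import numpy as np

def hermite_basis(xi, p):
    """Basis vector g(xi) = [H_0(xi), ..., H_{p-1}(xi)] from recurrence (37)."""
    g = np.empty(p)
    g[0] = 1.0                                      # g_1 = H_0
    if p > 1:
        g[1] = xi                                   # g_2 = H_1
    for j in range(2, p):
        g[j] = xi * g[j - 1] - (j - 1) * g[j - 2]   # g_{j+1} = H_j
    return g

# Basis of order p = 4 for an average wind speed of 12 m/s, normalized as AWS/30
g = hermite_basis(12.0 / 30.0, 4)
```

The recurrence generates *H*<sub>2</sub>(*ξ*) = *ξ*<sup>2</sup> − 1 and *H*<sub>3</sub>(*ξ*) = *ξ*<sup>3</sup> − 3*ξ*, i.e., the Hermite polynomials in their probabilists' form.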

Thus, GP-AR(72) models are estimated using the EM algorithm for increasing basis orders in the range *p* = 1, …, 6. For the application of the EM algorithm and the validation of the performance of the obtained models, the whole dataset consisting of 7,000 segments of 250 s is separated into training and validation subsets. The training subset is composed of the initial 3,000 segments, while the validation subset is composed of the remaining 4,000 segments. It should be noted that the GP-AR(72) model with *p* = 1 corresponds to the case where the EOP variable is ignored and the parameter vector is assumed to be Gaussian with constant mean and covariance matrix. The settings used for the application of the EM algorithm are shown in **Table 1**. Note that the threshold ∆<sub>RSS/SSS</sub> is utilized here instead of the ∆<sub>*L*</sub> suggested in equation (34), since the RSS/SSS is used to evaluate the convergence of the optimization procedure. **Figure 4** shows the RSS/SSS based on prior and posterior residuals, for both the training and validation subsets, obtained with the GP-AR(72) model with basis orders *p* = 1, …, 6. The obtained results demonstrate that a GP-AR(72) model with basis order *p* = 4 provides the best fit in all cases. Furthermore, the validation error is only slightly higher than the training error, thus demonstrating the generalization capability of the model.

The prior and posterior estimates of the parameter vector as a function of the AWS are shown in **Figure 5**. The dependency of the parameter estimates on the AWS is evident and is consistent across both the prior and posterior parameter estimates. Nonetheless, in some cases the posterior estimates tend to deviate from the prior estimates at higher AWS values. This effect may be due to the reduced number of signal segments available for higher wind speeds in the construction of the model, as evident in the histogram of wind speeds shown in **Figure 1**.

*samples, overlap 5,792 samples*; **(B)** 2 min average wind speed.

#### 4.1.3. Model-Based Analysis

Once the GP-AR(72)<sub>*p*=4</sub> model has been identified, it is possible to analyze the dynamic characteristics of the acceleration response of the bridge as a function of the average wind speed. In particular, the Power Spectral Density, the natural frequencies and the damping ratios are extracted from the identified GP-AR(72)<sub>*p*=4</sub> model. Each of these quantities is evaluated as follows:

$$\text{Characteristic polynomial}: A(z, \xi) = 1 + \sum\_{i=1}^{n\_a} \hat{a}\_i(\xi) \cdot z^{-i} \tag{38a}$$

$$\text{PSD}: \ P\_{y}(f, \xi) = \frac{\sigma\_w^2}{\left| A(e^{j2\pi f}, \xi) \right|^2} \tag{38b}$$

$$\text{Poles}: \ \{ \lambda\_i(\xi) \in \mathbb{C}, i = 1, \ldots, n\_a : A(\lambda, \xi) = 0 \}\tag{38c}$$

$$\text{Natural frequencies}: f\_{n,i}(\xi) = \frac{f\_s}{2\pi} \cdot |\ln \lambda\_i(\xi)| \tag{38d}$$

$$\text{Damping ratios}: \ \zeta\_i(\xi) = -\cos(\arg(\ln \lambda\_i(\xi))) \tag{38e}$$
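For a fixed value of *ξ*, the quantities in equations (38a) to (38e) can be computed from the AR coefficients as in the following sketch. The function names and the test coefficients are hypothetical: the AR(2) example is built from an assumed mode at 0.3 Hz with 2% damping sampled at 2 Hz, and the frequency in the PSD is assumed to be expressed in Hz and normalized by the sampling frequency.

```python
import numpy as np

def ar_modal_quantities(a, fs):
    """Poles, natural frequencies (Hz) and damping ratios of
    A(z) = 1 + sum_i a_i z^-i, keeping one pole per conjugate pair."""
    # A(z) = 0 is equivalent to z^na + a_1 z^(na-1) + ... + a_na = 0
    poles = np.roots(np.concatenate(([1.0], np.asarray(a, dtype=float))))
    poles = poles[np.imag(poles) > 0]
    log_poles = np.log(poles)
    fn = fs / (2.0 * np.pi) * np.abs(log_poles)       # equation (38d)
    zeta = -np.cos(np.angle(log_poles))               # equation (38e)
    order = np.argsort(fn)
    return poles[order], fn[order], zeta[order]

def ar_psd(a, sigma2_w, freqs, fs):
    """AR power spectral density, equation (38b), at frequencies in Hz."""
    z_inv = np.exp(-2j * np.pi * np.asarray(freqs, dtype=float) / fs)
    powers = np.arange(1, len(a) + 1)
    A = 1.0 + (z_inv[:, None] ** powers) @ np.asarray(a, dtype=float)
    return sigma2_w / np.abs(A) ** 2

# Hypothetical single-mode example: fn = 0.3 Hz, zeta = 0.02, fs = 2 Hz
fs, fn_true, zeta_true = 2.0, 0.3, 0.02
s = 2.0 * np.pi * fn_true * (-zeta_true + 1j * np.sqrt(1.0 - zeta_true**2))
lam = np.exp(s / fs)
a = np.array([-2.0 * lam.real, np.abs(lam) ** 2])     # AR(2) coefficients

_, fn_est, zeta_est = ar_modal_quantities(a, fs)
freqs = np.linspace(0.0, 1.0, 1001)
psd = ar_psd(a, sigma2_w=1.0, freqs=freqs, fs=fs)
```

Repeating the computation over a grid of *ξ* values, with the coefficients evaluated from the GP mean at each *ξ*, would give curves of the kind described in the text.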

**Figure 6** shows the GP-AR(72)<sub>*p*=4</sub> model-based PSD, natural frequencies and damping ratios obtained for average wind speeds from 0 to 25 m/s, with frequencies and damping ratios limited to the ranges [0, 500] mHz and [0, 10]%, respectively. The model-based PSD and modal quantities demonstrate that both the amplitude and frequency of the vibration are directly influenced by the average wind speed.

A more detailed analysis of the first six natural frequencies and damping ratios obtained from prior and posterior parameter estimates is presented in **Figure 7**. The posterior estimates are calculated for the complete set of vibration segments based on the estimated GP-AR model. The dispersion of the modal quantities tends to grow markedly with increasing wind speed, which may in part be due to the effect of wind and turbulence on the structure. In addition, the difference between prior and posterior estimates of the modal quantities tends to increase when the wind speed is larger than 20 m/s. This effect may be due to the smaller number of samples acquired at higher wind speeds, which may lead to increased uncertainty in the parameter estimates. The modal analysis results obtained with the GP-AR model may be contrasted with those previously reported in Diana et al. (1992) and Brownjohn et al. (1994, 2010). In particular, it appears that the modes displayed in **Figure 7** correspond to the first, second, third, and eighth vertical modes (*fn*,1, *fn*,2, *fn*,3, and *fn*,5, respectively), and the first torsional mode of the bridge (*fn*,4). Nonetheless, the predicted variation in the first torsional mode appears to differ from that found with the GP-AR model.

**FIGURE 3** | Selection of the order of the AR model. **(A)** Prior RSS/SSS and BIC curves for increasing model orders *n<sup>a</sup>* = 1, 2,*. . .*, 100; **(B)** frequency stabilization plot of the AR model for increasing model orders with the Welch PSD estimate–*Hamming window, length 512 samples, 256 samples overlap*.

**TABLE 1** | Settings of the EM algorithm.

Confirmation of the modal analysis results will be sought in future work, where a vector AR model would be used to represent the two vertical and the horizontal vibration responses of the bridge. However, the relatively simpler model utilized in this analysis can be used to track and assess the variability in the modes of the bridge when the wind is blowing from the main direction.

### **4.2. Simulated Vibration of Operating Wind Turbine Blades**

### 4.2.1. Data Description and Preprocessing

This application focuses on the identification and analysis of the vibration acceleration signals obtained *via* simulations of a fully operational wind turbine. For a PCE-based treatment of actual tower measurements obtained from an operating wind turbine, the interested reader is referred to Bogoevska et al. (2017). The analyzed wind turbine is the NREL 5 MW reference offshore wind turbine, fully described in Jonkman et al. (2009). The simulation is performed by means of the FAST wind turbine aeroelastic code, which uses a turbulent wind excitation simulated with TurbSim (Jonkman and Buhl, 2005). Acceleration signals are measured in the flapwise direction at different locations along the span of one of the blades of the wind turbine, as depicted in **Figure 8**. From these, the acceleration signals measured at the tip of the blade (node 6) are used for further analysis.

ratios as a function of AWS.

Simulations of both turbulent wind and vibration response are computed for a period of 10 min (600 s) with a sampling rate of 200 Hz. Moreover, the instantaneous rotor azimuth is also extracted, to be used as the scheduling variable in the model for the representation of the short-term response. The obtained acceleration signals are subsequently downsampled to 12.5 Hz for further analysis and processing. An antialiasing low-pass filter is applied before downsampling. The filter is a 100th-order FIR filter with a cutoff frequency of 5 Hz, selected in order to preserve the structural modes, which lie below 4 Hz. The filter is applied in a forward–backward fashion, using the MATLAB command filtfilt, to compensate for the phase delay.
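The preprocessing chain described above can be approximated in NumPy as in the sketch below. This is not the MATLAB filtfilt implementation: the Hamming-windowed sinc design, the absence of edge padding and all function names are assumptions made for illustration only.

```python
import numpy as np

def lowpass_fir(num_taps, cutoff_hz, fs_hz):
    """Hamming-windowed sinc low-pass FIR filter with unit gain at DC."""
    n = np.arange(num_taps) - (num_taps - 1) / 2.0
    fc = cutoff_hz / fs_hz                     # cutoff in cycles per sample
    h = 2.0 * fc * np.sinc(2.0 * fc * n) * np.hamming(num_taps)
    return h / np.sum(h)

def zero_phase_filter(h, x):
    """Filter forward, then filter the time-reversed output (no edge padding)."""
    forward = np.convolve(x, h)[: len(x)]
    backward = np.convolve(forward[::-1], h)[: len(x)]
    return backward[::-1]

def decimate(x, fs_in=200.0, fs_out=12.5, order=100, cutoff_hz=5.0):
    """Low-pass filter with zero phase, then keep every 16th sample."""
    factor = int(round(fs_in / fs_out))        # 200 Hz / 12.5 Hz = 16
    h = lowpass_fir(order + 1, cutoff_hz, fs_in)
    return zero_phase_filter(h, x)[::factor]

# Synthetic 20 s test signals sampled at 200 Hz
fs = 200.0
t = np.arange(4000) / fs
slow = np.sin(2.0 * np.pi * 1.0 * t)           # 1 Hz, inside the passband
fast = np.sin(2.0 * np.pi * 50.0 * t)          # 50 Hz, in the stopband
h = lowpass_fir(101, 5.0, fs)
slow_filtered = zero_phase_filter(h, slow)
fast_filtered = zero_phase_filter(h, fast)
slow_decimated = decimate(slow)
```

Because the second pass runs on the reversed signal, the delay of the first pass is cancelled and the magnitude response is applied twice. The first and last 100 samples are affected by edge transients in this simplified version.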

**FIGURE 7** | Distribution of selected natural frequencies and damping ratios obtained from the GP-AR(72)*p*=4 model. Solid line, modal quantities from prior parameter estimates; dots, modal quantities from posterior parameter estimates; shaded areas, 50 and 90% confidence intervals derived from the parameter mean and covariance matrix. Left column: natural frequencies in mHz; right column: damping ratios.

A set of 100 simulations is obtained under different 10 min average wind speeds in the range from 5 to 26 m/s. For that purpose, the Latin hypercube sampling algorithm is used to create a set of 100 random average wind speeds within the specified range. Then, a turbulent wind speed time-series is simulated for each 10 min average wind speed, which is subsequently used as input to the FAST simulation software.
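In one dimension, Latin hypercube sampling amounts to drawing one value from each of 100 equal-width strata of the wind speed range. A minimal sketch follows; the function name `latin_hypercube_1d` and the seed are hypothetical, and the actual implementation used for the simulations is not specified in the text.

```python
import numpy as np

def latin_hypercube_1d(n_samples, low, high, rng):
    """One uniform draw per equal-width stratum of [low, high], shuffled."""
    edges = np.linspace(low, high, n_samples + 1)
    samples = rng.uniform(edges[:-1], edges[1:])   # one draw in each stratum
    return rng.permutation(samples)

rng = np.random.default_rng(seed=0)                # arbitrary seed
mean_wind_speeds = latin_hypercube_1d(100, 5.0, 26.0, rng)
```

Each entry of `mean_wind_speeds` would then define the 10 min average wind speed of one turbulent wind simulation.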

It is noted that the present simulated data differ in important respects from those published in the previous work (Avendaño-Valencia and Fassois, 2017a). The differences are summarized as follows: (i) The turbulent wind excitation is simulated over an 8 *×* 8 grid, in comparison with the 2 *×* 2 grid used in the previous work. Therefore, the excitation is richer while the vibration response may be more complex. (ii) The sampling rate used for simulation is increased from 25 to 200 Hz so as to improve the convergence of the numerical integration algorithm. (iii) The analysis is performed for a sensor at the blade tip instead of the blade root. A consequence of these choices is that the structure of the obtained models may differ significantly from that reported in Avendaño-Valencia and Fassois (2017a).

### 4.2.2. Model Identification

### *4.2.2.1. Modeling of the Short-term Response*

The blade vibration signal is represented *via* a *Linear Parameter Varying AR* (LPV-AR) model, which uses the instantaneous rotor azimuth as the scheduling variable (Avendaño-Valencia and Fassois, 2017a). The identification of the LPV-AR model follows the procedure described in Avendaño-Valencia and Fassois (2017a), with the selection of the model structure guided by the BIC. The obtained BIC curves are shown in **Figure 9**, from which the model order *n<sup>a</sup>* = 17 and basis order *p<sup>a</sup>* = 5 are selected.

### *4.2.2.2. Modeling of the Long-term Response*

A GP-LPV-AR model is constructed for the representation of the long-term response of the blade. For this purpose, the EOP variable *ξ* is defined as the 10 minute average wind speed (lying in the range [0, 30] m/s) normalized to the range [0, 1]. The parameter vector of the model is expanded on the Hermite polynomials defined in equation (37).

GP-LPV-AR(17)<sub>5</sub> models with GP basis orders in the range *p* = 0, 1, *. . .* , 6 are estimated using the EM algorithm. The settings of the EM algorithm are similar to those used in the previous example and summarized in **Table 1**; however, in the present case the thresholds for termination of the optimization are defined as ∆*<sup>p</sup>* = ∆RSS/SSS = 10<sup>−6</sup>. **Figure 10A** shows the RSS/SSS based on the prior and posterior prediction errors for increasing orders of the functional basis expansion of the GP. The plot shows markedly different behaviors of the prior and posterior errors. In particular, it is evident that the prior prediction error depends on the order of the functional expansion, while the posterior error does not show a significant dependence. This behavior may be expected, since the prior estimates are based solely on the model predictions, while the posterior estimates are adjusted to the observed vibration response. For that same reason, the prior error is a better tool to evaluate the model capabilities. In the present case, according to the prior RSS/SSS curve, a basis order *p* = 5 is selected.

**FIGURE 8** | Location of the sensors in the blade of the wind turbine. Acceleration signals are measured on the flapwise direction of the blade (normal to the surface of the page).

**FIGURE 9** | Selection of the LPV-AR model structure: BIC curves for increasing model orders *n<sup>a</sup>* = 1, 2,*. . .*, 40 and for different basis orders *p<sup>a</sup>* = 1, 3,*. . .*, 9: **(A)** whole range; **(B)** detail in the range *n<sup>a</sup>* = [10, 25].

Similarly, a PC-GP-LPV-AR model (based on a Principal Component representation of the regression matrix of the LPV-AR model, as explained in Section 2.4) is used for the long-term identification of the response of the wind turbine. The PC-GP-LPV-AR model is also estimated by means of the EM algorithm, using the same settings applied to the previous GP-LPV-AR model. The Principal Component representation of the regression matrix is carried out using all the components (without dimensionality reduction). However, the covariance matrix of the model parameters is diagonal in the present case, so that fewer hyper-parameters have to be estimated. The prior and posterior RSS/SSS curves obtained with the PC-GP-LPV-AR model are shown in **Figure 10B**. In this case, the prior and posterior error curves are almost the same, while the overall error is slightly higher than that obtained with the GP-LPV-AR model.

### 4.2.3. Model-Based Analysis

The dynamics of the wind turbine are analyzed based on the identified GP-LPV-AR(17)<sub>5,5</sub> model. For that purpose, the analysis is based on the "frozen" Power Spectral Density, natural frequencies and damping ratios, which are calculated by means of the following equations:

"Frozen" characteristic polynomial :

$$A(z,\beta,\xi) = 1 + \sum\_{i=1}^{n\_a} \hat{a}\_i(\beta,\xi) \cdot z^{-i} \tag{39a}$$

$$\text{PSD}: \ P\_{y}(f, \beta, \xi) = \frac{\sigma\_{w}^{2}}{\left| A(e^{j2\pi f}, \beta, \xi) \right|^{2}} \tag{39b}$$

$$\text{Poles}: \ \{ \lambda\_i(\beta, \xi) \in \mathbb{C}, i = 1, \ldots, n\_a : A(\lambda, \beta, \xi) = 0 \} \tag{39c}$$

$$\text{Natural frequencies}: f\_{n,i}(\beta, \xi) = \frac{f\_s}{2\pi} \cdot |\ln \lambda\_i(\beta, \xi)| \tag{39d}$$

$$\text{Damping ratios}: \ \zeta\_i(\beta, \xi) = -\cos(\arg(\ln \lambda\_i(\beta, \xi))) \tag{39e}$$

For the present application, these quantities are functions of two variables, namely the instantaneous rotor angle *β* and the 10 minute average wind speed *ξ*. Accordingly, the "frozen" PSD, natural frequencies and damping ratios are calculated for a single period of rotation of the blades and for the whole range of wind speeds (3–25 m/s). **Figure 11** shows the obtained "frozen" PSDs for different wind speeds in this range. The figure illustrates the variability of the characteristics of the dynamic response of the wind turbine both as the rotor azimuth changes and as the wind speed increases. In particular, for all wind speeds a marked increase in the power of the vibration response is evident at about two-thirds of a full rotation. This event may be associated with the blade passing in front of the tower. Moreover, as the wind speed increases, the overall power of the vibration response increases as well. The total power difference when the wind speed changes from 5 to 25 m/s is about 20 dB, that is, one order of magnitude in amplitude.
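As a quick check of the quoted level difference, and assuming that the PSD levels are expressed in decibels as ten times the base-10 logarithm of power, a 20 dB difference between the powers at 25 and 5 m/s (denoted here, as an illustrative notation, by *P*<sub>25</sub> and *P*<sub>5</sub>) corresponds to:

$$10 \log\_{10} \frac{P\_{25}}{P\_{5}} \approx 20 \ \text{dB} \quad \Rightarrow \quad \frac{P\_{25}}{P\_{5}} \approx 10^{2}, \qquad \sqrt{\frac{P\_{25}}{P\_{5}}} \approx 10$$

That is, roughly a hundredfold increase in power, or a tenfold increase in vibration amplitude.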

Average values of the "frozen" natural frequencies and damping ratios and their respective confidence intervals may be drawn from the obtained GP-LPV-AR model. The procedure is similar to that performed in the previous application example. Three modes are selected for this analysis, namely the pair of modes located around 1 Hz, and the mode located around 2 Hz. The selected average "frozen" natural frequencies and damping ratios are shown in **Figure 12**. The obtained curves clearly show the dependency of both natural frequency and damping ratio on the wind speed. In particular, for modes 1 and 3 the damping ratio increases with the wind speed, while for mode 2 the damping ratio tends to decrease as the wind speed increases. In addition, the natural frequencies tend to increase with the wind speed as well. The confidence intervals are in general well bounded, and in particular it is clear that the estimates of the damping ratios are reliable.

### **5. CONCLUDING REMARKS**

This work has been devoted to a Gaussian Process time-series modeling framework for the representation of the long-term dynamics of structures operating under variable environmental and operational conditions. The model definitions plus an identification method based on the Expectation-Maximization algorithm have been presented. In addition, an optional parameter reduction technique based on Principal Component Regression has been introduced as a method to reduce the number of parameters to be estimated in the representation. The resulting GP time-series methods provide an appealing alternative for the representation of the complex dynamics of vibrating structures operating under variable environmental and operational conditions.

A potential limitation of the GP time-series modeling methodology, as presented in this work, is that the innovations variance of the time-series model is assumed to be constant. The adoption of a constant innovations variance may hinder the representation of changing power in the vibrational response of the structure according to environmental and operational conditions. A solution for this limitation is to define a GP to represent the dependence of the innovations variance on the EOPs, in the same manner as already explained for the coefficients of the time-series model. Toward this end, it would be further necessary to modify the EM algorithm for the estimation of the parameters of the innovations variance GP.

The proposed GP time-series modeling method offers a promising tool for assimilation in damage diagnosis algorithms. For this purpose, a key element lies in formalizing the selection of the conditions used for training of the model, namely, specification of the range of environmental and operational conditions under which the GP time-series model is able to lead to a robust decision. An exploratory study on this issue can be found in Avendaño-Valencia and Chatzi (2017), and will be extended as future work.

### **AUTHOR CONTRIBUTIONS**

LA-V has developed the mathematical techniques involved in this work and performed the simulations shown in the case studies. He has further carried out the major part of the writing and organization of the text. EC has collaborated on the method's development, writing, and proof-reading of the document and has cross-checked the mathematical techniques. JB has provided the vibration data of the Humber bridge and has further contributed to the scientific writing. KK contributed to the revised version of the work by providing the complete bridge vibration data used in the application example on the Humber bridge. In addition, he participated in the revision of the manuscript pre-prints and in the approval of the finalized version.

### **FUNDING**

The authors gratefully acknowledge the support of the ETH Zurich Postdoctoral Fellowship FEL-45 14-2 "*A data-driven computational framework for damage identification and life-cycle management of wind turbine facilities*." In addition, EC would further like to acknowledge the ERC Starting Grant award (ERC-2015-StG 679843) on the topic of "*Smart Monitoring, Inspection and Life-Cycle Assessment of Wind Turbines*."




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2017 Avendaño-Valencia, Chatzi, Koo and Brownjohn. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## **APPENDIX**

### **A. Demonstrations**

### A.1. Demonstration of the PDF in Equation (10)

This section aims to demonstrate the PDF of the dynamic response vector *y* = [ *y*[1] *· · · y*[*N*] ]*<sup>T</sup>* given the parameter vector *θ* and the regression matrix **Φ** = [ *ϕ*(*z*[1]) *· · · ϕ*(*z*[*N*])]*<sup>T</sup>*. For that purpose, consider the PDF:

$$p(\boldsymbol{y} \mid \boldsymbol{\theta}, \boldsymbol{\Phi}) = \prod\_{t=1}^{N} p(y[t] \mid \boldsymbol{\theta}, \boldsymbol{\phi}(\boldsymbol{z}[t])) \tag{A1}$$

substituting equation (8) and displaying the Gaussian distribution as an exponential function, yields:

$$\begin{split} p(\boldsymbol{y} \mid \boldsymbol{\theta}, \boldsymbol{\Phi}) &= \prod\_{t=1}^{N} \mathcal{N}\_{y[t]}(y[t|t-1], \sigma\_{w}^{2}) \\ &= \prod\_{t=1}^{N} \left( 2\pi \sigma\_{w}^{2} \right)^{-1/2} \cdot \exp \left( -\frac{1}{2\sigma\_{w}^{2}} \left( y[t] - \boldsymbol{\phi}^{T}(\boldsymbol{z}[t]) \cdot \boldsymbol{\theta} \right)^{2} \right) \end{split} \tag{A2}$$

applying the product operator, yields:

$$\begin{split} p(\boldsymbol{y} \mid \boldsymbol{\theta}, \boldsymbol{\Phi}) &= \left( 2\pi \sigma\_{w}^{2} \right)^{-N/2} \cdot \exp \left( -\frac{1}{2\sigma\_{w}^{2}} \sum\_{t=1}^{N} \left( y[t] - \boldsymbol{\phi}^{T}(\boldsymbol{z}[t]) \cdot \boldsymbol{\theta} \right)^{2} \right) \\ &= \left( 2\pi \right)^{-N/2} \cdot \left| \sigma\_{w}^{2} \cdot \boldsymbol{I}\_{N} \right|^{-1/2} \cdot \exp \left( -\frac{1}{2\sigma\_{w}^{2}} \sum\_{t=1}^{N} \left( y[t] - \boldsymbol{\phi}^{T}(\boldsymbol{z}[t]) \cdot \boldsymbol{\theta} \right)^{2} \right) \end{split} \tag{A3}$$

Then, operating in the sum inside of the exponential function, leads to:

∑*N t*=1 (*y*[*t*] *− ϕ T* (*z*[*t*]) *· θ*) 2 = ∑*N t*=1 (*y*[*t*] *− ϕ T* (*z*[*t*]) *· θ*) *·* (*y*[*t*] *− ϕ T* (*z*[*t*]) *· θ*) = *y*[1] *− ϕ T* (*z*[1]) *· θ y*[2] *− ϕ T* (*z*[2]) *· θ* . . . *y*[*N*] *− ϕ T* (*z*[*N*]) *· θ T · y*[1] *− ϕ T* (*z*[1]) *· θ y*[2] *− ϕ T* (*z*[2]) *· θ* . . . *y*[*N*] *− ϕ T* (*z*[*N*]) *· θ* = *y*[1] *y*[2] . . . *y*[*N*] *− ϕ T* (*z*[1]) *ϕ T* (*z*[2]) . . . *ϕ T* (*z*[*N*]) *· θ T · y*[1] *y*[2] . . . *y*[*N*] *− ϕ T* (*z*[1]) *ϕ T* (*z*[2]) . . . *ϕ T* (*z*[*N*]) *·θ* (A4)

and thus:

$$\sum\_{t=1}^{N} \left( y[t] - \boldsymbol{\phi}^{T}(\boldsymbol{z}[t]) \cdot \boldsymbol{\theta} \right)^{2} = \left( \boldsymbol{y} - \boldsymbol{\Phi}^{T} \cdot \boldsymbol{\theta} \right)^{T} \cdot \left( \boldsymbol{y} - \boldsymbol{\Phi}^{T} \cdot \boldsymbol{\theta} \right) \tag{A5}$$

Then, after putting everything together, the following result is obtained:

$$\begin{split} p(\boldsymbol{y} \mid \boldsymbol{\theta}, \boldsymbol{\Phi}) &= \left(2\pi\right)^{-N/2} \cdot \left| \sigma\_{w}^{2} \cdot \boldsymbol{I}\_{N} \right|^{-1/2} \\ &\quad \cdot \exp\left(-\frac{1}{2\sigma\_{w}^{2}} \left(\boldsymbol{y} - \boldsymbol{\Phi}^{T} \cdot \boldsymbol{\theta}\right)^{T} \cdot \left(\boldsymbol{y} - \boldsymbol{\Phi}^{T} \cdot \boldsymbol{\theta}\right)\right) \\ &= \mathcal{N}\_{\boldsymbol{y}}(\boldsymbol{\Phi}^{T} \cdot \boldsymbol{\theta}, \sigma\_{w}^{2} \cdot \boldsymbol{I}\_{N}) \\ \end{split} \tag{A6}$$
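The equivalence derived above can be checked numerically. The sketch below compares the logarithm of the product of univariate Gaussian densities with the logarithm of the joint Gaussian density with covariance equal to the innovations variance times the identity. The regressors are stacked row-wise in a matrix named `Phi_rows`; this name, the random test data and the function names are all hypothetical.

```python
import numpy as np

def loglik_pointwise(y, Phi_rows, theta, sigma2_w):
    """Sum of the log-densities of N independent univariate Gaussians."""
    residual = y - Phi_rows @ theta
    return np.sum(-0.5 * np.log(2.0 * np.pi * sigma2_w)
                  - residual**2 / (2.0 * sigma2_w))

def loglik_joint(y, Phi_rows, theta, sigma2_w):
    """Log-density of one N-variate Gaussian with covariance sigma2_w * I_N."""
    n = len(y)
    residual = y - Phi_rows @ theta
    cov = sigma2_w * np.eye(n)
    _, logdet = np.linalg.slogdet(cov)
    quad = residual @ np.linalg.solve(cov, residual)
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + quad)

# Random test data, for illustration only
rng = np.random.default_rng(seed=1)
Phi_rows = rng.normal(size=(50, 3))        # row t holds phi^T(z[t])
theta = np.array([0.5, -0.2, 0.1])
sigma2_w = 0.3
y = Phi_rows @ theta + np.sqrt(sigma2_w) * rng.normal(size=50)

ll_pointwise = loglik_pointwise(y, Phi_rows, theta, sigma2_w)
ll_joint = loglik_joint(y, Phi_rows, theta, sigma2_w)
```

Both values agree to within floating-point precision.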

# **Comparing Structural Identification Methodologies for Fatigue Life Prediction of a Highway Bridge**

*Sai G. S. Pai<sup>1,2</sup>\*, Alain Nussbaumer<sup>3</sup> and Ian F. C. Smith<sup>1,2</sup>*

*<sup>1</sup>Applied Computing and Mechanics Laboratory (IMAC), School of Architecture, Civil and Environmental Engineering (ENAC), Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland, <sup>2</sup>ETH Zurich, Future Cities Laboratory, Singapore, Singapore, <sup>3</sup>Resilient Steel Structures Laboratory (RESSLAB), School of Architecture, Civil and Environmental Engineering (ENAC), Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland*

Accurate measurement-data interpretation leads to increased understanding of structural behavior and enhanced asset-management decision making. In this paper, four data-interpretation methodologies are compared: residual minimization; traditional Bayesian model updating; modified Bayesian model updating (with an *L∞*-norm-based Gaussian likelihood function); and error-domain model falsification (EDMF), a method that rejects models that have unlikely differences between predictions and measurements. In the modified Bayesian model updating methodology, a correction is used in the likelihood function to account for the effect of a finite number of measurements on posterior probability–density functions. The application of these data-interpretation methodologies for condition assessment and fatigue life prediction is illustrated on a highway steel–concrete composite bridge having four spans with a total length of 219 m. A detailed 3D finite-element plate and beam model of the bridge and weigh-in-motion data are used to obtain the time–stress response at a fatigue critical location along the bridge span. The time–stress response, presented as a histogram, is compared to measured strain responses either to update prior knowledge of model parameters using residual minimization and Bayesian methodologies or to obtain candidate model instances using the EDMF methodology. It is concluded that the EDMF and modified Bayesian model updating methodologies provide robust prediction of fatigue life compared with residual minimization and traditional Bayesian model updating in the presence of correlated non-Gaussian uncertainty. EDMF has additional advantages due to ease of understanding and applicability for practicing engineers, thus enabling incremental asset-management decision making over long service lives. Finally, parallel implementations of EDMF using grid sampling have lower computation times than implementations using adaptive sampling.

**Keywords: model-based data interpretation, Bayesian model updating, model falsification, fatigue life evaluation, full-scale structures**

*Edited by:*

*Eleni N. Chatzi, ETH Zurich, Switzerland*

#### *Reviewed by:*

*Dimitrios Giagopoulos, University of Western Macedonia, Greece Jian Li, University of Kansas, United States*

> *\*Correspondence: Sai G. S. Pai sai.pai@epfl.ch*

#### *Specialty section:*

*This article was submitted to Structural Sensing, a section of the journal Frontiers in Built Environment*

*Received: 17 July 2017 Accepted: 28 November 2017 Published: 08 January 2018*

#### *Citation:*

*Pai SGS, Nussbaumer A and Smith IFC (2018) Comparing Structural Identification Methodologies for Fatigue Life Prediction of a Highway Bridge. Front. Built Environ. 3:73. doi: 10.3389/fbuil.2017.00073*

## **1. INTRODUCTION**

In this paper, four data-interpretation methodologies for model updating are compared to evaluate their applicability in predicting the remaining fatigue life of a full-scale bridge. The deficit between demand and supply of civil infrastructure is increasing annually from an estimated US\$ 1 trillion in 2014 (World Economic Forum, 2014). Performance-based asset management of existing infrastructure for decisions such as repair and retrofit for life extension helps reduce this deficit. Replacement of all aging infrastructure is expensive, unsustainable, and often not necessary. Models that are used for design of civil infrastructure are justifiably conservative. Therefore, most structures possess reserve capacity and can last well beyond their design working lives (referred to as service lives in this paper) (Brühwiler, 2012; Smith, 2016). The challenge lies in quantifying this reserve capacity to enable asset-management decision making such as repair, retrofit, and extension.

Asset-management decision making may be improved through a better understanding of structural behavior. This can be achieved through monitoring of civil infrastructure enabled by recent advances in sensing technology (Lynch and Loh, 2006; Taylor et al., 2016) and availability of affordable computational tools (Frangopol and Soliman, 2016). However, analytical models of civil infrastructure systems possess large modeling uncertainty, including significant systematic errors and unknown correlations between measurement locations (Jiang and Mahadevan, 2008). These conditions have led to recent studies of uncertainties and development of data-interpretation methodologies that are robust to incomplete knowledge (Goulet and Smith, 2013). Moreover, civil infrastructure is in service for decades and is subjected to changing load and environmental conditions. Therefore, data-interpretation methodologies should support engineers for iterative asset-management decisions as new information becomes available throughout infrastructure lives.

Structural identification involves interpreting measurement data to update knowledge of parameters governing structural response in the presence of uncertainties from numerous sources. Methodologies for structural identification have been studied extensively (Worden et al., 2007; Beck, 2010; Cross et al., 2013; Moon and Catbas, 2013). However, every civil structure is unique due to its form, function, and utility and this requires explicit consideration of uncertainties in decision making. Most data-interpretation methodologies assume that the uncertainty associated with the structural system is defined by a zero-mean independent Gaussian distribution. However, this assumption is rarely satisfied for civil infrastructure (Pasquier et al., 2014). Lack of knowledge of uncertainty related to aspects such as geometry of structural elements and model bias means that most sources can only be estimated as bounds. There are other sources of uncertainties, such as support conditions, that are systematic in nature and their magnitudes may change the correlation between uncertainties at measurement locations. Misevaluation of these uncertainties has led to incorrect updated probability distributions (Goulet and Smith, 2013; Simoen et al., 2013; Pasquier and Smith, 2015). Such inaccuracy can result in misinformed asset-management decisions.

The success of data-interpretation methodologies is best measured on full-scale examples. Brownjohn et al. (2011) have noted difficulties in transfer of technology from the laboratory to the field. Laboratories, by design, are intended to reduce uncertainty and thus they provide little similitude with structural identification of real structures. Unfortunately, there are many studies and theoretical proposals found in the literature (Ben-Haim and Hemez, 2012) that have not involved testing with full-scale systems. Yan and Katafygiotis (2015) have presented a novel approach for Bayesian model updating and highlighted the difficulties in implementing the procedure in engineering practice. They assert that the number of parameters to be identified and the large uncertainty associated with complex systems may lead to an unidentifiable system, necessitating model reduction techniques. Kuok and Yuen (2016) have studied modal identification of the Ting Kau Bridge, which is monitored with more than 230 sensors of various types. They employ a Bayesian framework for parameter estimation and model class selection. Their study shows that the identification results obtained are influenced by monitoring conditions such as wind. Behmanesh and Moaveni (2016) have carried out hierarchical Bayesian model updating of a footbridge that is subjected to varying temperature conditions. They consider the effect of parameter uncertainty, parameter variability due to ambient or environmental conditions and modeling error uncertainty for continuous monitoring. The results from their study show the importance of including modeling errors for response prediction. There is a continuing need to evaluate applicability of model updating methodologies to full-scale systems under practical conditions.

Detailed numerical models have been used to capture the physical conditions affecting the response of a full-scale system. Use of these models in data-interpretation methodologies was recognized to be computationally expensive (Chang et al., 2000). Surrogate models have been proposed to improve computation times. Surrogate models that replaced finite-element models include polynomial regression (Hemez et al., 2002), multivariate regression splines (Friedman, 1991), and Kriging estimates (Simpson et al., 2001) as reviewed by Rutherford et al. (2006). Worden and Cross (2018) presented the utility of using surrogate models to predict bridge response under the influence of environmental conditions such as temperature. Support vector machines have been used for predicting correlation between modal frequencies and temperature (Ni et al., 2005), fatigue truck load model (Lu et al., 2016), and to obtain bridge scour information (Chou and Pham, 2017). A back propagation neural network model was used by Ni et al. (2009) to model the correlation between modal frequencies and temperature of the Ting Kau bridge. In this paper, neural network models have been used to obtain the structural response for both identification and prediction.

Most research so far has focused on parameter identification primarily for the purpose of damage detection. Few researchers have aimed to predict structural response for asset-management decision making. Li et al. (2016) have predicted von Mises stress in a test structure. They have employed a Bayesian framework to arrive at a posterior distribution of model parameters, which they then utilized to predict von Mises stress at an unmeasured location. Their study has found that there is a large uncertainty associated with prediction. Therefore, the forward problem of prediction requires rigorous treatment of uncertainty associated with the system. This research exemplifies the need for uncertainty quantification utilizing engineering knowledge to enable robust prediction of structural response for the purpose of asset-management decision making.

Pasquier and Smith (2015) compared a model-falsification-based methodology and Bayesian model updating under various uncertainty conditions for prediction utilizing a simple beam. Their results showed that the model falsification methodology provided accurate prediction in the presence of non-Gaussian sources of uncertainty, model bias, and other sources of systematic uncertainty. Based on this observation, Pasquier et al. (2014) and Pasquier et al. (2016) utilized model falsification for reserve capacity estimation of a full-scale bridge. However, no research was found that compares several data-interpretation methodologies for reserve capacity estimation on a full-scale case study.

This paper compares four data-interpretation methodologies, residual minimization (Alvin, 1997), traditional Bayesian model updating (BMU) (Beck and Katafygiotis, 1998), error-domain model falsification (EDMF) (Goulet and Smith, 2013), and a modified formulation of BMU. These methodologies are briefly explained in Section 2. The objective of this comparison is to verify the applicability of these methodologies for use in practice for the purpose of reserve capacity estimation. They are compared based on their ability to provide robust identification and prediction for a full-scale structure in presence of systematic uncertainty and incomplete correlation information. Also, these methodologies have been evaluated based on their compatibility with introduction of new information, ease of understanding for use in practice, and computation demand. Using updated information, the remaining fatigue life of the bridge is predicted under two traffic loading scenarios observed using a weigh-in-motion (WIM) station and one simulated future loading scenario.

### **2. BACKGROUND—METHODOLOGIES FOR DATA-INTERPRETATION**

In this section, a brief explanation of four data-interpretation methodologies, residual minimization, traditional BMU, EDMF, and modified BMU is presented. Residual minimization is a deterministic methodology, while EDMF and BMU are probabilistic methodologies that can incorporate multiple sources of uncertainty associated with the system. These methodologies differ in the assumptions that are made to represent the uncertainty associated with the system.

### **2.1. Residual Minimization**

In residual minimization, a structural model is calibrated by determining model parameter values that minimize the error between model prediction and measurements. Sanayei et al. (2011) presented a manual model updating example where model predictions are manually compared to measurements and the model is calibrated based on engineering knowledge to minimize an objective function. A typical objective function for residual minimization is shown in equation (1):

$$\hat{\theta} = \operatorname*{argmin}_{\theta} \sum_{i=1}^{n_y} \left( g(x_i, \theta) - \hat{y}_i \right)^2. \tag{1}$$

In equation (1), *θ̂* is the optimum model parameter set obtained using measurements, and *g*(*x<sub>i</sub>*, *θ*) − *ŷ<sub>i</sub>* is the residual between the model response, *g*(*x<sub>i</sub>*, *θ*), and the measurement, *ŷ<sub>i</sub>*, at measurement location *i*. Residual minimization does not account for the inherent model bias in civil infrastructure due to application of safe design models. It also does not take into account uncertainties arising from systematic or environmental sources and the correlation between uncertainties. The simplicity of the methodology makes it popular for use in practice, although the identification results obtained are not always accurate (Beven, 2000).
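As an illustration of equation (1), the following minimal sketch minimizes the sum of squared residuals by random search over uniform parameter bounds. The linear function `model_response`, the parameter bounds, and the synthetic measurements are hypothetical stand-ins and are not taken from the case study.

```python
import random

def model_response(x, theta):
    """Hypothetical stand-in for g(x_i, theta): a linear response in x."""
    slope, offset = theta
    return slope * x + offset

def objective(theta, locations, measurements):
    """Sum of squared residuals, as in equation (1)."""
    return sum((model_response(x, theta) - y) ** 2
               for x, y in zip(locations, measurements))

def residual_minimization(locations, measurements, bounds, n_samples=20000, seed=0):
    """Random search: keep the sampled parameter set with the smallest objective."""
    rng = random.Random(seed)
    best_theta, best_value = None, float("inf")
    for _ in range(n_samples):
        theta = tuple(rng.uniform(lo, hi) for lo, hi in bounds)
        value = objective(theta, locations, measurements)
        if value < best_value:
            best_theta, best_value = theta, value
    return best_theta, best_value

# Synthetic, noise-free measurements generated from assumed "true" parameters.
locations = [0.0, 1.0, 2.0, 3.0]
true_theta = (2.0, 1.0)
measurements = [model_response(x, true_theta) for x in locations]

theta_hat, best_value = residual_minimization(
    locations, measurements, bounds=[(0.0, 4.0), (0.0, 2.0)])
print(theta_hat, best_value)
```

The result is a single parameter set with no accompanying uncertainty estimate, which is the limitation discussed above.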

### **2.2. Traditional Bayesian Model Updating**

Bayesian model updating is a popular methodology for structural identification. In this methodology, prior knowledge of model parameters is updated using information obtained through monitoring of a structure. If *g*(*θ*) is the model of a structure with parameters *θ*, then the prior probability distribution function (PDF) of the model parameters, *P*(*θ*) is updated as shown in equation (2),

$$P(\theta|\mathbf{y}) = \frac{L(\mathbf{y}|\theta) \cdot P(\theta)}{L(\mathbf{y})} \tag{2}$$

where *P*(*θ*|*y*) is the posterior or updated PDF of model parameters, *L*(*y*|*θ*) is the likelihood function, and *L*(*y*) is the normalizing constant. The likelihood function, *L*(*y*|*θ*), indicates the plausibility of observing data *y* for a given realization of *θ*.

In traditional BMU methodology (Beck and Katafygiotis, 1998), an *L*<sub>2</sub>-norm-based Gaussian likelihood function, as shown in equation (3), is used to update prior information of model parameters:

$$L(\boldsymbol{y}|\boldsymbol{\theta}) = (2\pi)^{-n_m/2} |\boldsymbol{\Sigma}|^{-1/2} \exp\left[ -\frac{1}{2} \left( \varepsilon_0(\boldsymbol{\theta}) - \boldsymbol{U_c} \right)^T \boldsymbol{\Sigma}^{-1} \left( \varepsilon_0(\boldsymbol{\theta}) - \boldsymbol{U_c} \right) \right]. \tag{3}$$

In equation (3), Σ is the correlation matrix defined by the correlation coefficients between measurement locations, *ε*<sub>0</sub>(*θ*) is a vector of residuals between observation and model response, and *U<sub>c</sub>* is a vector containing the mean of uncertainty at each measurement location.
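In practice, a likelihood of this form is usually evaluated in logarithmic form to avoid numerical underflow. The sketch below covers only the special case of a diagonal Σ (independent uncertainties), which is an assumption made here to keep the example free of matrix algebra; the function name and the numerical inputs are illustrative.

```python
import math

def gaussian_log_likelihood(residuals, means, std_devs):
    """Log of a Gaussian likelihood with diagonal covariance.

    residuals: residual between observation and model response per location
    means:     mean of the uncertainty per location
    std_devs:  standard deviation of the uncertainty per location
    """
    n_m = len(residuals)
    # log of (2*pi)^(-n_m/2) * |Sigma|^(-1/2) for Sigma = diag(std_devs**2)
    log_norm = -0.5 * n_m * math.log(2.0 * math.pi) - sum(math.log(s) for s in std_devs)
    # Quadratic form (eps - U)^T Sigma^-1 (eps - U) for a diagonal Sigma
    quad = sum(((e - u) / s) ** 2 for e, u, s in zip(residuals, means, std_devs))
    return log_norm - 0.5 * quad

# Illustrative values for three measurement locations.
print(gaussian_log_likelihood([0.4, -0.2, 0.1], [0.0, 0.0, 0.0], [0.5, 0.5, 0.5]))
```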

In traditional BMU, the uncertainty associated with the system is assumed to have an independent zero-mean Gaussian distribution, implying that model bias and correlations are not considered. A prominent approach to account for model bias is to model it as a Gaussian process with variance *σ*<sup>2</sup> (Kennedy and O'Hagan, 2001), which is assigned a non-informative prior and whose posterior distribution is identified along with other model parameters. Brynjarsdóttir and O'Hagan (2014) used an informed prior for *σ*<sup>2</sup> to include available information about the model error. Although these approaches provided reliable estimates of model parameters in a few cases, they failed to provide reliable solutions for extrapolation (prediction at an unmeasured location). In the context of civil infrastructure, some researchers have considered modeling uncertainty for updating response prediction (Papadimitriou et al., 2001), fatigue reliability assessment (Kwon et al., 2010), and damage assessment (Simoen et al., 2015). In the above studies, modeling uncertainty at all measurement locations is assumed to be the same, which is rarely the case in the presence of systematic bias. Also, Bayesian methodology may provide accurate identification of parameters at measured locations, but the information obtained from measurements cannot be extrapolated to predict structural responses at other locations in the presence of systematic bias (Behmanesh et al., 2015; Pasquier and Smith, 2015).

### **2.3. Error-Domain Model Falsification**

Another methodology for model updating is EDMF (Goulet and Smith, 2013). In this methodology, information from measurements is used to falsify parameter values, thereby obtaining a candidate set from an initial set of possible parameter values. This methodology is based on the assertion by Popper (1959) that models cannot be validated by data; they can only be falsified. Conservative and simplified models used to design civil infrastructure possess model bias, model fidelity uncertainty, and uncertainties from simplification of loading conditions, geometrical properties, material properties, and boundary conditions. Most of these uncertainties can be estimated only using engineering heuristics and cannot be described using a zero-mean Gaussian distribution. In EDMF, engineering knowledge is utilized to quantify uncertainties from various sources, which are then combined with measurement uncertainty to obtain a robust falsification criterion.

Consider a structure with modeling and measurement uncertainty at a measurement location *i*, *ϵ<sub>mod,i</sub>* and *ϵ<sub>meas,i</sub>*, respectively. If a structure is represented by a physics-based model, *g*(*θ*), then the true response of the structure at a measurement location, *Q<sub>i</sub>*, is given by,

$$Q_i = g_i\left(\theta^*\right) + \epsilon_{mod,i} \tag{4}$$

where *g<sub>i</sub>*(*θ*\*) is the model response at measurement location *i* for the real values of the model parameters, *θ*\*, and *ϵ<sub>mod,i</sub>* is the modeling uncertainty at the measurement location. Similarly, if the structure is monitored, then the true response of the structure at a measurement location, *Q<sub>i</sub>*, is given by,

$$Q_i = y_i + \epsilon_{meas,i} \tag{5}$$

where *y<sub>i</sub>* is the measured response of the structure and *ϵ<sub>meas,i</sub>* is the measurement uncertainty at the measurement location. Equating equations (4) and (5), the following relationship between model response and measurement can be obtained,

$$g_i\left(\theta^*\right) - y_i = \epsilon_{meas,i} - \epsilon_{mod,i} \tag{6}$$

where the residual between model response and measurement at a sensor location is equal to the combined model and measurement uncertainty. In design decision making, an important consideration is to first fix a target reliability for design. Therefore, in using EDMF for asset-management decision making, a target reliability of identification, *ϕ* ∈ [0, 1], is first established (Goulet and Smith, 2013). Using the target reliability of identification, the criteria for falsification in EDMF, thresholds *T<sub>high,i</sub>* and *T<sub>low,i</sub>*, are computed using equation (7):

$$
\phi^{1/m} = \int_{T_{low,i}}^{T_{high,i}} f_{U_{c,i}}(\epsilon_{c,i}) \, d\epsilon_{c,i}. \tag{7}
$$

In equation (7), *f<sub>U<sub>c,i</sub></sub>*(*ϵ<sub>c,i</sub>*) is the combined uncertainty PDF at measurement location *i* and *ϕ* is the target reliability of identification. The combined uncertainty PDF is calculated by combining uncertainty from various sources, such as geometric simplifications, modeling assumptions, and sensor resolutions, using Monte Carlo sampling. For example, if the target reliability of identification, *ϕ*, is set to 0.95, then 1 million samples are generated from the combined uncertainty distribution using Monte Carlo sampling. From these samples, the smallest range that contains 95% of the samples is calculated. The bounds of this range correspond to the threshold bounds, *T<sub>high,i</sub>* and *T<sub>low,i</sub>*. In equation (7), the exponent 1/*m* is the Šidák correction (Šidák, 1967), which accounts for the number of measurements, *m*. For example, if the desired target reliability of identification is 0.95 using two measurements, then the threshold bounds at each location are computed to contain approximately 97.5% (0.95<sup>1/2</sup> ≈ 0.975) of the generated samples. Once the bounds, *T<sub>high,i</sub>* and *T<sub>low,i</sub>*, are computed, the user generates model responses for various instances of model parameters, *θ*. If the residual between model prediction and measurement does not lie within the thresholds, then the model instance is falsified, as shown in equation (8):

$$T_{low,i} \le g_i\left(\theta\right) - y_i \le T_{high,i} \qquad \forall i \in \{1, \ldots, n_m\}. \tag{8}$$

Using equation (8), if the response of a model instance does not lie within the established thresholds for any measurement location, then that model instance is falsified (Goulet et al., 2010, 2013b; Goulet and Smith, 2013). The remaining model instances from the initial set, whose responses for all measurement locations lie within the thresholds are accepted to form the candidate model set. These candidate models are then utilized to carry out model prediction with reduced uncertainty (Pasquier and Smith, 2015).
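The threshold computation of equation (7) can be sketched as follows. The two uncertainty sources (a biased uniform model error and a Gaussian measurement error), their numerical values, and the reduced sample count of 100,000 are hypothetical choices made only to keep the example small; the sign convention follows the right-hand side of equation (6).

```python
import math
import random

def sidak_level(phi, n_measurements):
    """Per-location probability phi**(1/m) given by the Sidak correction."""
    return phi ** (1.0 / n_measurements)

def combined_uncertainty_samples(n_samples, seed=0):
    """Monte Carlo samples of the combined uncertainty (hypothetical sources)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        model_error = rng.uniform(-0.5, 1.5)     # assumed biased model error
        measurement_error = rng.gauss(0.0, 0.2)  # assumed sensor noise
        samples.append(measurement_error - model_error)
    return samples

def shortest_interval(samples, coverage):
    """Narrowest interval containing at least the given fraction of samples."""
    ordered = sorted(samples)
    n = len(ordered)
    k = min(n, max(1, math.ceil(coverage * n)))  # samples inside the interval
    start = min(range(n - k + 1), key=lambda i: ordered[i + k - 1] - ordered[i])
    return ordered[start], ordered[start + k - 1]

level = sidak_level(0.95, 2)  # two measurement locations
samples = combined_uncertainty_samples(100_000)
t_low, t_high = shortest_interval(samples, level)
print(level, t_low, t_high)
```

Because the assumed model error is not centered on zero, the resulting thresholds are asymmetric about zero, which a zero-mean Gaussian description would not capture.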

The EDMF methodology has been developed and applied to fourteen full-scale systems since 1998 (Smith, 2016). Recent applications include model identification (Goulet et al., 2013b), leak detection (Goulet et al., 2013a; Moser et al., 2015), wind simulation (Vernay et al., 2015), prediction (Pasquier and Smith, 2016), fatigue life evaluation (Pasquier et al., 2014, 2016), and measurement system design (Goulet and Smith, 2012a,b; Papadopoulou et al., 2016).

### **2.4. Modified Bayesian Model Updating**

The other methodology considered for comparison in this paper is the modified BMU. In this methodology, prior knowledge of model parameters is updated using measurements as shown in equation (2). However, a box-car-shaped *L∞*-norm-based Gaussian likelihood function is used to include information gained through measurements. A generalized Gaussian distribution is defined as,

$$f(x, \kappa) = \frac{\kappa^{1-1/\kappa}}{2\sigma_\kappa \Gamma\left(1/\kappa\right)} \exp\left( -\frac{1}{\kappa} \left[\frac{|x - x_0|}{\sigma_\kappa}\right]^\kappa \right) \tag{9}$$

where *f*(*x*, *κ*) is the generalized Gaussian PDF of random variable *x*, based on the *L<sub>κ</sub>*-norm with mean, *x*<sub>0</sub>, and SD, *σ<sub>κ</sub>*. For *κ* → ∞, *f*(*x*, *κ*) tends to a box-car shape. Parameters, *x*<sub>0</sub> and *σ<sub>κ</sub>*, of the likelihood function are determined using threshold bounds from equation (7) as shown in equations (10) and (11):

$$x_0 = \frac{T_{high,i} + T_{low,i}}{2} \tag{10}$$

$$
\sigma_{\kappa} = T_{high,i} - x_0. \tag{11}
$$

The *L∞*-norm-based Gaussian likelihood function is approximated using *κ* = 200. If the residuals between model response and measurements at all locations lie within the thresholds, then that model instance is attributed a higher likelihood of occurrence, while model instances that would be falsified in EDMF are attributed a low likelihood. The application of modified BMU was compared with traditional BMU and EDMF, using an illustrative example, by Pai and Smith (2017).
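The box-car behavior for large *κ* can be checked numerically with the sketch below, which evaluates the logarithm of the generalized Gaussian PDF. It assumes that *x*<sub>0</sub> is the midpoint of the two thresholds and *σ<sub>κ</sub>* their half-width, so that the box-car spans the threshold interval; the function names and threshold values are illustrative.

```python
import math

def boxcar_parameters(t_low, t_high):
    """Center and half-width of the box-car (assumed midpoint convention)."""
    x0 = 0.5 * (t_high + t_low)
    sigma = t_high - x0
    return x0, sigma

def generalized_gaussian_logpdf(x, x0, sigma, kappa):
    """Log of the generalized Gaussian PDF with exponent kappa."""
    log_norm = ((1.0 - 1.0 / kappa) * math.log(kappa)
                - math.log(2.0 * sigma) - math.lgamma(1.0 / kappa))
    z = abs(x - x0) / sigma
    if z == 0.0:
        return log_norm
    log_power = kappa * math.log(z)  # log of z**kappa
    if log_power > 700.0:            # z**kappa would overflow a float
        return -math.inf
    return log_norm - math.exp(log_power) / kappa

x0, sigma = boxcar_parameters(t_low=-1.0, t_high=3.0)
inside = generalized_gaussian_logpdf(2.5, x0, sigma, kappa=200)
outside = generalized_gaussian_logpdf(3.5, x0, sigma, kappa=200)
print(math.exp(inside), outside)
```

With *κ* = 2 the same function reduces to the ordinary Gaussian PDF, which offers a quick consistency check on the normalizing constant.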

In this paper, these four methodologies are compared considering a range of uncertainty sources and computational demands of simulating the behavior of a full-scale structure. Results obtained from model updating are utilized for fatigue life evaluation of a full-scale bridge under two traffic loading scenarios.

### **3. CASE STUDY**

### **3.1. Structure Description**

The case considered here is inspired by a steel–concrete composite highway twin bridge in the town of Echandens, Switzerland, called the Venoge bridge. The bridge has four spans of length 52, 60.4, 55, and 52 m, and a total length of 219.4 m. In 1995, the bridge was extended from 2 *×* 2 lanes to 2 *×* 3 lanes by adding an additional lane in each traffic direction. The bridge is part of the European route E62 and, on average, 7,008 heavy vehicles cross the bridge weekly in one direction with an average weight of 22 tons. According to Eurocode, the term heavy vehicles refers to vehicles with weight greater than 10 tons (Eurocode, 1991). Most of these heavy vehicles drive on the slow lane on the extended part of the bridge as shown in **Figure 1**.

Each half of the twin bridge is composed of a concrete deck supported over four steel girders. The extended lane and the old bridge in one traffic direction are supported by two steel girders each. The concrete bridge deck and steel girders under the extended lane are modeled using SHELL182 elements in ANSYS. The steel girders supporting the old bridge are modeled using BEAM188 elements. The finite-element model is used for a linear elastic analysis in which the deck is assumed to be homogeneous and un-cracked on supports under fatigue loads. The bridge has four spans as shown schematically in **Figure 2**. The bridge is monitored using ten strain gages, installed in 1995, located at two sections along the span as shown in the figure. These sensors are located on the interior girder supporting the extended lane of the bridge. Supports Sup 0, Sup 1, and Sup 2, supporting the extended lane of the bridge, are modeled using spring elements with parametrized stiffness. Sup 3, Sup 4, and supports under the old bridge are modeled using rigid spring elements.

Sensors, A1 to A4 and B1 to B4, are used for updating parameters of the model. Using this updated knowledge, the response at sensor locations, A5 and B5, is predicted for validation of the results obtained using the data-interpretation methodologies. The updated model parameters are then used to predict the remaining fatigue life of a cover plate detail on the bridge. This cover plate detail is located near sensors A1 and A3, shown in **Figure 2**. In this study, the minimum remaining fatigue life of the bridge for this critical detail is evaluated using in-service traffic and strain measurement data. The number of sensors and their location on the bridge is sub-optimal. The measurement system was installed in 1995 for another objective than the one being studied in this paper.

### **3.2. Measurement and Traffic Load Data**

**Figure 2** shows the position of the sensors on the inner girder of the extended section of the bridge. Data from eight sensors, A1 to A4 and B1 to B4, are used for updating knowledge of the bridge behavior. The position of sensors B1 and B3 on the bridge is shown in **Figure 3**. Four sensors are located close to the location of the critical detail and four other sensors are located at the end of the first span, 1 m from the support, Sup 1. Data from these eight sensors, recorded from November 18 to 24, 2013, is used for identification of the model parameters. However, as the data available is a time-history, it has to be processed to acquire a response that can be utilized for data-interpretation. A comparable structural response is the equivalent stress range. The equivalent stress range calculated using in-service strain measurements is considered as measured response at sensor locations, *yi*. The computation of equivalent stress range is explained in Section 3.3.

The traffic load on the bridge from November 18 to 24, 2013, in the direction Lausanne–Geneva, is obtained from a weigh-in-motion (WIM) station located only 1 km from the bridge, at Denges, without any exits in between. The WIM station provides traffic load in terms of time of passage (T), vehicle speed (V), number of axles (N), total length (TL), gross total weight (GTW), axle weight (AW), and distance between axles (AD). Using this traffic data, a train of axle loads is generated for the 1-week duration from November 18 to November 24, 2013. This axle train is used as a moving point load on the bridge to obtain the equivalent stress range at each sensor location.

### **3.3. Computation of Equivalent Stress Range and Remaining Fatigue Life**

**FIGURE 3** | Sensors B1 and B3 on the Venoge bridge (Credit: IMAC, EPFL).

Each sensor shown in **Figure 2** provides a time-history of strain for vehicles passing over the bridge. This time-history of strain is used to compute the stress range histogram using the rainflow algorithm (Matsuishi and Endo, 1968). In the stress range histogram, stress range values below 2 MPa are not considered due to their low effect on fatigue damage of the bridge. The equivalent stress range, ∆*σ<sub>e</sub>* (Dowling, 1971), is computed using equation (12),

$$
\Delta \sigma_e = \left[ \frac{\sum n_i \Delta \sigma_i^m}{\sum n_i} \right]^{1/m} \tag{12}
$$

where *n<sub>i</sub>* is the number of cycles that takes place at stress range level ∆*σ<sub>i</sub>* and *m* is the slope coefficient of the S–N curve. The equivalent stress range is calculated using a single slope S–N curve, which is a conservative assumption.
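Equation (12) can be evaluated directly from a stress range histogram, as in the sketch below. The slope coefficient *m* = 3 is an assumed value that is common for steel details but is not stated in the text; the histogram values are illustrative, and the 2 MPa cut-off follows the description above.

```python
def equivalent_stress_range(histogram, m=3.0, cutoff=2.0):
    """Equivalent stress range from (stress_range_MPa, cycle_count) pairs.

    Stress ranges below `cutoff` (MPa) are discarded before averaging.
    """
    kept = [(s, n) for s, n in histogram if s >= cutoff]
    total_cycles = sum(n for _, n in kept)
    if total_cycles == 0:
        return 0.0
    weighted = sum(n * s ** m for s, n in kept)
    return (weighted / total_cycles) ** (1.0 / m)

# Illustrative histogram: (stress range in MPa, number of cycles).
histogram = [(1.0, 5000), (5.0, 800), (10.0, 150), (20.0, 10)]
print(equivalent_stress_range(histogram))
```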

Similarly, the equivalent stress range is computed for each model instance using the finite-element model and traffic load on the bridge. The finite-element model is used to generate an influence line for stress at each sensor location for a given set of parameter values. The train of axle loads is passed over influence line of each sensor and processed using the rainflow algorithm to obtain stress range histograms. The equivalent stress range is computed from these histograms using equation (12) for all sensors. This step is repeated to obtain the equivalent stress range at each sensor location for various model parameter values.

The model updated using traffic and strain data is used to predict the remaining fatigue life of a cover plate detail located close to sensors A1 and A3, as shown in **Figure 2**. The remaining fatigue life of the cover plate detail is computed using the damage index. The damage index, *D<sub>period</sub>*, is calculated using Miner's rule (Miner, 1945), as shown in equation (13):

$$D_{period} = \left[\sum \frac{n_i}{C \cdot \Delta \sigma_i^{-m}}\right] \tag{13}$$

where *C* is a constant depending on the category of the critical detail, *m* is the slope coefficient, and *n<sub>i</sub>* is the number of cycles that takes place at stress range level ∆*σ<sub>i</sub>*. The cover plate welded attachment close to sensors A1 and A3 is classified as FAT36 according to SIA263/1 (2013). Based on the detail classification, the characteristic value for the constant *C* is utilized in computing the damage index. The remaining fatigue life of the bridge, *RFL*, is calculated using equation (14):

$$RFL = \frac{R_{year}}{D_{period}} \tag{14}$$

where *Ryear* is the fraction of traffic simulation period over a year. Thus, if traffic is simulated over a 1-week period, then *Ryear* is taken as 1/52.
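Equations (13) and (14) can be combined in a short calculation. In the sketch below, the value of *C* is an assumption derived from the usual definition of a detail category as the stress range sustained for two million cycles (here 36 MPa with *m* = 3); the text does not state the value used, and the weekly histogram is illustrative.

```python
def damage_index(histogram, C, m=3.0):
    """Miner's rule: sum of n_i / N_i with N_i = C * stress_range**(-m)."""
    return sum(n / (C * s ** (-m)) for s, n in histogram)

def remaining_fatigue_life(damage, period_fraction_of_year):
    """Remaining fatigue life in years, as in equation (14)."""
    return period_fraction_of_year / damage

# Assumed constant for a FAT36 detail: 2e6 cycles at 36 MPa, slope m = 3.
C_FAT36 = 2.0e6 * 36.0 ** 3

# Illustrative one-week histogram: (stress range in MPa, number of cycles).
weekly_histogram = [(5.0, 4000), (10.0, 600), (20.0, 40)]
d_week = damage_index(weekly_histogram, C_FAT36)
print(remaining_fatigue_life(d_week, 1.0 / 52.0))
```

The reciprocal form makes the sensitivity clear: halving the weekly damage doubles the predicted remaining fatigue life.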

### **3.4. Model Class and Sources of Uncertainty**

The bridge response, i.e., the equivalent stress range at the sensor locations, is affected by several factors, which are not known completely. In the finite-element model, unknown parameters are quantified as random variables with a uniform distribution. Not all parameters of the finite-element model affect the structural response significantly. The relative importance of these parameters to structural response is estimated using a sensitivity analysis. Equivalent stress range at each sensor location is calculated for numerous values of model parameters. The dataset containing the model response and parameters is used to fit a linear regression model for each sensor location. The parameters of the regression model are indicative of the importance of the structural parameters to response at each sensor location, which is used to calculate the relative importance. A list of these parameters is shown in **Table 1** along with their relative importance to the structural response of the bridge at various sensor locations.

The parameters that significantly affect the structural response, based on their relative importance, are *E<sub>c</sub>* and *K<sub>deckX</sub>*. These parameters constitute the model class, and knowledge regarding them is updated using data from strain gages and the WIM station. The parameters not included in the model class for identification are called secondary parameters. They contribute to the secondary parameter uncertainty at each sensor location, which is estimated using the finite-element model. The secondary parameter uncertainty at each sensor location is shown in **Table 2**.

#### **TABLE 1** | Parametric uncertainty.

**TABLE 2** | Secondary parametric uncertainty, surrogate modeling uncertainty, and model bias at each sensor location.

*All distributions are uniform.*

Probabilistic data-interpretation methodologies, such as those discussed in Section 2, require evaluation of a structural model for various realizations of model parameters, which are described as random variables. In this paper, a finite-element model of the bridge is used as the structural model with two parameters, *E<sub>c</sub>* and *K<sub>deckX</sub>*, comprising the model class to be identified. Each realization of the model parameters is a set of values for *E<sub>c</sub>* and *K<sub>deckX</sub>* for which the bridge response is evaluated. Using the finite-element model and a realization of the model parameters, an influence line for stress at each sensor location is obtained. This influence line is then used to obtain the equivalent stress range at each sensor location. The computation of influence lines for all sensors for one set of model parameters takes around 4 h and 30 min, using an Intel(R) Xeon(R) CPU E5-2670 v3 @2.30GHz processor. The long computation time does not allow for efficient sampling of the parameter space to obtain optimum results. Therefore, to reduce the computation cost, surrogate models are developed to predict the equivalent stress range at each sensor location and the remaining fatigue life of the critical detail. The equivalent stress range predicted using the surrogate models for various parameter values is taken to be the model response, *g<sub>i</sub>*(*θ*), in model updating.

**TABLE 3** | Other sources of uncertainty.

120 parameter values of *Ec*, *KdeckX* are generated using Latin hypercube sampling and input into the finite-element model to obtain the equivalent stress range at sensor locations and remaining fatigue life of the critical detail. The parameter values and the corresponding structural response obtained are used as a data set to train the surrogate models. Here, a neural network is used to map the function between the inputs and outputs. Neural network models (Farrar and Worden, 2012) have multiple layers that map the inputs to the outputs using linear or non-linear transfer functions. The neural network used here is a feedforward neural net with 4 hidden layers, trained using the Levenberg–Marquardt algorithm (Beale et al., 2015). The neural network models were then cross-validated with 15% of the data points, which were not used for training the net. The cross-validation results are used to obtain the surrogate modeling uncertainty. As the number of data points used for cross-validation is small, the residual between surrogate and finite-element model prediction is assumed to have a uniform distribution. The surrogate modeling uncertainty estimated for each sensor location using cross-validation is shown in **Table 2**. The neural network models developed are used in the subsequent sections for prediction of equivalent stress range at sensor locations and remaining fatigue life of the bridge. The model bias at each sensor location is also shown in **Table 2**. The model bias, estimated using heuristics, is assumed to be higher at sensor locations closer to the supports than for those at mid-span.
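The sampling and hold-out steps can be sketched as follows. The parameter bounds are hypothetical placeholders rather than the values of Table 1, and the sketch stops short of training a neural network, which would require a numerical library; it only shows how 120 Latin hypercube samples might be drawn and how 15% of them might be set aside for cross-validation.

```python
import random

def latin_hypercube(n_samples, bounds, seed=0):
    """One sample per equal-width stratum in every dimension, randomly paired."""
    rng = random.Random(seed)
    columns = []
    for lo, hi in bounds:
        # Draw one point inside each of the n_samples strata of [lo, hi].
        points = [lo + (hi - lo) * (i + rng.random()) / n_samples
                  for i in range(n_samples)]
        rng.shuffle(points)  # decouple the strata across dimensions
        columns.append(points)
    return list(zip(*columns))

# Hypothetical bounds: E_c in GPa and K_deckX in log(N/mm).
bounds = [(20.0, 40.0), (4.0, 7.0)]
samples = latin_hypercube(120, bounds)

# Hold out 15% of the points for cross-validation of the surrogate.
rng = random.Random(1)
indices = list(range(len(samples)))
rng.shuffle(indices)
n_holdout = round(0.15 * len(samples))
holdout = [samples[i] for i in indices[:n_holdout]]
training = [samples[i] for i in indices[n_holdout:]]
print(len(training), len(holdout))
```

Each training point would then be passed through the finite-element model, and the residuals on the held-out points would give the surrogate modeling uncertainty.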

Structural response of the bridge under in-service traffic loading is also affected by additional uncertainty sources such as model bias, influence line calculation, transversal position of vehicles on the bridge, measurement uncertainty associated with strain gages, and WIM station. Most of these uncertainty sources cannot be computed numerically and are estimated using engineering knowledge. The uncertainty distribution assumed for these uncertainty sources is provided in **Table 3**. The uncertainty from these sources is assumed to be the same for all measurement locations.

The uncertainties from these sources are combined using Monte Carlo sampling to determine the combined uncertainty PDF. The falsification thresholds for EDMF and the likelihood functions for traditional and modified BMU are determined based on the combined uncertainty PDF using equations (7), (3), and (9), respectively. Equivalent stress ranges at measurement locations obtained using strain gages and the falsification thresholds obtained using equation (7) are shown in **Table 4**.

**TABLE 4** | Equivalent stress ranges at measurement locations and falsification thresholds for EDMF.


### **3.5. Structural Identification**

In this section, the updated distribution of model parameters obtained using the data-interpretation methodologies is presented.

For residual minimization, samples from the prior distribution of model parameters, *E<sub>c</sub>* and *K<sub>deckX</sub>*, are generated through Monte Carlo sampling. For each parameter set, the equivalent stress range at each sensor location is predicted using the surrogate models, and the parameter set that minimizes the objective function in equation (1) is taken as the optimum.

For EDMF, an initial set of model parameters is generated as a grid. Each model instance is input into the surrogate models developed for equivalent stress range, explained in Section 3.4. Using these surrogate models, the equivalent stress range at each measurement location is obtained and compared with the equivalent stress range obtained using measurement. If the residual between model response and measurement for each location lies within the threshold bounds, *Thigh,i* and *Tlow,i*, computed using equation (7) then the model instance is accepted. All such accepted model instances form the candidate model set, while the remaining model instances are falsified.
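The grid-based falsification described above can be sketched as follows. The function `model` is a hypothetical two-sensor response standing in for the surrogate models, and the grid, measurements, and thresholds are illustrative values.

```python
from itertools import product

def model(theta):
    """Hypothetical two-sensor response standing in for the surrogate models."""
    a, b = theta
    return [a + b, a - b]

def is_candidate(predictions, measurements, thresholds):
    """True if the residual lies within the thresholds at every location."""
    return all(t_low <= g - y <= t_high
               for g, y, (t_low, t_high) in zip(predictions, measurements, thresholds))

def falsify(grid, measurements, thresholds):
    """Return the candidate model set; all other instances are falsified."""
    return [theta for theta in grid
            if is_candidate(model(theta), measurements, thresholds)]

# Initial model set: a regular grid over two parameters.
grid = list(product([0.5 * i for i in range(9)], [0.5 * j for j in range(5)]))
measurements = [3.0, 1.0]                  # illustrative measured responses
thresholds = [(-0.5, 0.5), (-0.5, 0.5)]    # illustrative (T_low, T_high) pairs
candidates = falsify(grid, measurements, thresholds)
print(len(grid), len(candidates))
```

Every retained instance is treated as equally likely, so the output is a set of parameter values rather than a single best estimate.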

In modified and traditional BMU, the posterior PDF is sampled using Markov chain Monte Carlo (MCMC) sampling. The difference between the two Bayesian methodologies is the likelihood function employed. Traditional BMU employs a zero-mean Gaussian likelihood function, as described in equation (3), while modified BMU utilizes an *L∞*-norm-based Gaussian likelihood function, as described in equation (9), to update the model parameters.
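The text does not state which MCMC algorithm was used. As one possible illustration, the sketch below implements a random-walk Metropolis sampler for a generic log-posterior; the one-parameter posterior (uniform prior with a Gaussian likelihood), the step size, and the chain length are assumptions chosen only for demonstration.

```python
import math
import random

def metropolis(log_posterior, start, step_sizes, n_steps, seed=0):
    """Random-walk Metropolis sampler returning the full chain of states."""
    rng = random.Random(seed)
    current = list(start)
    current_lp = log_posterior(current)
    chain = []
    for _ in range(n_steps):
        proposal = [c + rng.gauss(0.0, s) for c, s in zip(current, step_sizes)]
        proposal_lp = log_posterior(proposal)
        log_ratio = proposal_lp - current_lp
        # Accept with probability min(1, posterior ratio).
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            current, current_lp = proposal, proposal_lp
        chain.append(tuple(current))
    return chain

def log_posterior(theta):
    """Hypothetical posterior: uniform prior on [0, 4], Gaussian likelihood."""
    (x,) = theta
    if not 0.0 <= x <= 4.0:
        return -math.inf
    return -0.5 * ((x - 2.0) / 0.3) ** 2

chain = metropolis(log_posterior, start=[1.0], step_sizes=[0.5], n_steps=20000)
posterior_samples = [x for (x,) in chain[2000:]]  # discard burn-in
print(sum(posterior_samples) / len(posterior_samples))
```

Replacing the likelihood term in `log_posterior` is the only change needed to switch between the Gaussian and box-car formulations.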

The candidate model set obtained using EDMF and samples of the joint posterior PDF of primary parameters obtained using modified BMU and traditional BMU are shown in **Figure 4**.

In **Figure 4A**, each candidate model instance obtained using EDMF is assumed to have an equal probability of occurrence. **Figure 4B** shows the samples of joint posterior PDF obtained using modified BMU. The sampled region is similar to the candidate model set region obtained using EDMF. This is because of the *L∞*-norm-based Gaussian likelihood function used in updating the probability distribution of model parameters. The modified likelihood function has a box-car shape that attributes a constant probability, *p*, to model instances whose residual when compared to measurements at each sensor location lies within the threshold bounds, *Thigh,i* and *Tlow,i* computed using equation (7).

In EDMF, these model instances form the candidate model set. Model instances whose residuals lie outside the threshold bounds for any measurement location are attributed a probability close to zero, which is analogous to falsified model instances in EDMF. Therefore, EDMF and modified BMU provide a similar joint posterior PDF.

Traditional BMU, which assumes a zero-mean Gaussian distribution for the uncertainty associated with the system, provides an informed posterior PDF. The maximum likelihood estimates obtained using traditional BMU for the parameters *E<sub>c</sub>* and *K<sub>deckX</sub>* are 30 GPa and 5.5 log N/mm, respectively. Samples of the joint posterior PDF obtained using traditional BMU are shown in **Figure 4C**. Using residual minimization, the updated parameter values of *E<sub>c</sub>* and *K<sub>deckX</sub>* are 20 GPa and 4 log N/mm, respectively.

In subsequent sections, the updated model parameters are used to predict the equivalent stress range at two sensor locations and the remaining fatigue life of the bridge at a critical detail. The remaining fatigue life of the bridge is predicted under two scenarios of observed traffic loading, to enable informed decision making regarding intervention for assessment, retrofit, and replacement.

### **3.6. Equivalent Stress Range Prediction**

In this section, the updated model parameters from Section 3.5 are used to predict the equivalent stress range at two sensor locations. The first location is of sensor A5, which is located on the upper flange of the bridge girder as shown in **Figure 2**. The second location is of sensor B5, which is located on the upper flange of the bridge girder as shown in **Figure 2**. Measurements from sensors A5 and B5 were not used in model updating. The comparison between equivalent stress range obtained using measurements and predicted using the updated model parameters is shown in **Figure 5**.

In **Figure 5A**, the equivalent stress range predicted for sensor A5 is shown. The equivalent stress range obtained using strain data from sensor A5 is 2.5 MPa. The equivalent stress range predicted using the prior distribution of model parameters ranges from 0.1 to 24 MPa. Using updated knowledge of bridge behavior as obtained using the three probabilistic data-interpretation methodologies, the prediction range is reduced. Utilizing the updated model parameters obtained using residual minimization, the equivalent stress range predicted is 14.6 MPa, which is biased from the value obtained using measurements. The 95th percentile bounds of equivalent stress range predicted using traditional BMU are 0.1 and 10.1 MPa. Modified BMU and EDMF provide wider and similar bounds ranging from 0.2 to 22 MPa. In this case, all three probabilistic methodologies provide robust prediction of the equivalent stress range at the sensor location as the predicted bounds include the value obtained using measurements.

**Figure 5B** shows the equivalent stress range predicted for sensor B5. The results obtained using the four data-interpretation methodologies show a trend similar to that observed for sensor A5. Residual minimization provides a biased prediction of the equivalent stress range at the location of sensor B5. The three probabilistic methodologies provide reduced prediction ranges compared with the initial model set prediction. Moreover, the prediction bounds obtained using all three probabilistic methodologies include the equivalent stress range obtained using measurements.

Structural identification for the purpose of damage detection is limited to validation of structural response under uncertainty conditions that are similar to those used for model updating. In this scenario, all three probabilistic methodologies provide robust identification and prediction as shown in **Figure 5**. In the next section, structural response prediction under uncertainty conditions that are different from those present during identification is presented.

### **3.7. Remaining Fatigue Life Prediction**

Most studies involving structural identification are carried out with the objective of model updating for damage detection. The results obtained are generally validated through prediction as demonstrated in Section 3.6. However, in such a scenario, the uncertainties associated with identification and prediction are similar. Under similar uncertainty conditions, all three probabilistic data-interpretation methodologies provide robust predictions, as shown in **Figure 5**. For the purpose of asset-management decision making, the structural response to be predicted is generally not the response used for identification. Moreover, the loading conditions under which structural response needs to be predicted are likely to be different from those used for identification. Under these conditions, the uncertainties associated with modeling are different during identification and prediction.

**TABLE 5** | Sources of relative prediction uncertainty.

traditional BMU, the bias in uncertainty is assumed to be zero.


*All uncertainty sources are assumed to have a uniform distribution.*

In this section, the updated knowledge of bridge response is used to predict the fatigue life of the cover plate detail, which is located close to sensors A1 and A3, shown in **Figure 2**. The fatigue life prediction is carried out under three loading scenarios. In the first scenario, the remaining fatigue life of the detail is predicted under the loading duration utilized for identification, from November 18 to 24, 2013 (period 1). In the second scenario, traffic loading observed during another 1-week period in 2013 (period 2) is utilized for predicting the remaining fatigue life. In the third scenario, traffic loading is simulated for a week assuming a 2% annual increase in traffic weight over the next 20 years. The relative combined uncertainty associated with identification of model parameters and prediction of remaining fatigue life is shown in **Figure 6**.

The identification uncertainty, shown in **Figure 6**, is obtained through combination of uncertainty sources specified in **Tables 2** and **3**. The prediction uncertainty, shown in **Figure 6**, is obtained through a combination of all modeling uncertainty sources specified in **Table 5**.


Model bias and surrogate modeling uncertainty are different for identification and prediction. The two primary parameters, *E<sup>c</sup>* and *KdeckX*, included in the model class for identification are important in developing the surrogate model for equivalent stress range and remaining fatigue life. However, their relationship to the structural responses is different as equivalent stress range and remaining fatigue life are inversely related to each other. The surrogate model developed using neural networks for remaining fatigue life is biased. Along with the model bias, this leads to a biased combined uncertainty PDF for prediction. This difference is taken into account by adding the prediction uncertainty to the remaining fatigue life predicted using the surrogate models.
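The inverse relationship between equivalent stress range and fatigue life noted above can be illustrated with a single-slope S–N relation. This is a minimal sketch under stated assumptions, not the fatigue verification used in the paper: the function name `cycles_to_failure`, the slope `m = 3`, and the reference of 2 million cycles at the detail category stress range follow a common convention for steel details and are introduced here only for illustration.

```python
def cycles_to_failure(stress_range_mpa, detail_category_mpa, m=3.0, n_ref=2e6):
    """Cycles to failure from a single-slope S-N curve (illustrative assumption).

    N = n_ref * (detail_category / stress_range) ** m
    """
    return n_ref * (detail_category_mpa / stress_range_mpa) ** m


# Hypothetical detail category of 71 MPa, used only to show the trend.
n_base = cycles_to_failure(71.0, 71.0)     # 2,000,000 cycles
n_double = cycles_to_failure(142.0, 71.0)  # 250,000 cycles
```

With a slope of 3, doubling the stress range divides the number of cycles to failure by eight, which is why a modest bias in predicted stress range can translate into a large bias in predicted fatigue life.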

The Venoge bridge, built in 1995, was designed for a service life of 100 years (Eurocode, 1990). As the assessment here is based on traffic data from 2013, the remaining fatigue life of the bridge based on design values is 82 years in 2013. The remaining fatigue life of the bridge predicted using the updated model parameters for traffic loading during periods 1 and 2 is shown in **Figures 7A,B**, respectively.

In **Figure 7**, the remaining fatigue life predicted using updated model parameters obtained through the three probabilistic methodologies and residual minimization is shown. These results are compared with the remaining fatigue life obtained using strain measurements from sensor A1. **Figures 7A,B** show the remaining fatigue life predicted under traffic loading from period 1 and period 2, respectively. The prediction uncertainty associated with each case is different as new surrogate models are developed for the respective traffic load duration.

In **Figure 7A**, the remaining fatigue life obtained using measurements from period 1 is 261 years, while the minimum remaining fatigue life predicted using the prior distribution of model parameters is 92 years. The remaining fatigue life predicted using the updated model parameters obtained using residual minimization is 168 years, which is biased from the value obtained through measurements by 36%. Using traditional BMU, the 95th percentile bounds of predicted remaining fatigue life are 101 and 213 years, which do not contain the remaining fatigue life obtained using measurements. The bias between the MLE of the predicted distribution and the value obtained through measurements is 40%. The bounds of predicted remaining fatigue life obtained using EDMF and modified BMU include the value calculated using measurements. Using EDMF and modified BMU, the minimum remaining fatigue life predicted using updated model parameters is 127 and 126 years, respectively. Therefore, using measurement data, the minimum remaining fatigue life prediction was improved by 54% compared to the expected service life of 82 years in 2013.

In **Figure 7B**, the remaining fatigue life obtained using measurements from sensor A1 for traffic loading during period 2 is 302 years. The minimum remaining fatigue life predicted using the prior distribution of model parameters is 99 years. The remaining fatigue life predicted using the updated model parameters obtained using residual minimization is 183 years, which is biased from the value obtained through measurements by 39%. The MLE of remaining fatigue life predicted using traditional BMU is biased from the value obtained through measurements by 45%. Moreover, the 95th percentile bounds on prediction obtained using traditional BMU do not include the value obtained using measurements. Using EDMF and modified BMU, the minimum remaining fatigue life predicted using updated model parameters is increased to 136 and 135 years, respectively. Therefore, using measurement data, the minimum remaining fatigue life prediction was improved by 65% compared to the expected service life of 82 years in 2013.

For both scenarios, traditional BMU provides a mean value that is biased from the value obtained through measurements. Moreover, the 95th percentile bounds obtained in both cases do not include the value obtained through measurements. Residual minimization also provides biased values for both scenarios considered. EDMF and modified BMU provide robust prediction bounds under both traffic loading scenarios. They help improve the minimum remaining fatigue life prediction by 54 and 65% in the two scenarios considered. Also, EDMF and modified BMU provide similar results, an observation previously made in **Figures 4** and **5**.
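The percentages quoted for periods 1 and 2 can be checked with simple arithmetic. The sketch below uses only the values reported in the text; the helper names are hypothetical, and it assumes that the reported improvements are computed from the smaller of the EDMF and modified BMU lower bounds (126 and 135 years), which reproduces the published figures.

```python
def relative_bias(predicted, measured):
    """Relative deviation of a prediction from the measurement-based value."""
    return abs(measured - predicted) / measured


def relative_improvement(updated, baseline):
    """Relative gain of an updated estimate over a baseline estimate."""
    return (updated - baseline) / baseline


# Residual minimization versus measurement-based fatigue life (years)
bias_rm_period1 = relative_bias(168, 261)  # about 0.356
bias_rm_period2 = relative_bias(183, 302)  # about 0.394

# Minimum updated fatigue life versus design-based remaining life of 82 years
gain_period1 = relative_improvement(126, 82)  # about 0.537
gain_period2 = relative_improvement(135, 82)  # about 0.646
```

Rounded to whole percentages, these values give the 36%, 39%, 54%, and 65% reported above.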

The objective of measurement data-interpretation is to predict structural behavior for future loading scenarios in order to decide on repair, replacement, and retrofit actions. Due to recent trends in transportation, it is likely that freight traffic on highways will increase in the future. Therefore, a 2% annual increase in the weight of vehicles is assumed over a 20-year period to simulate traffic loading for a 1-week period in the year 2033. Using this traffic loading, the remaining fatigue life of the bridge is predicted as shown in **Figure 8**.

In **Figure 8**, the minimum remaining fatigue life of the bridge in 2033, predicted using the prior distribution of model parameters, is 38 years. The remaining fatigue life predicted using the updated model parameters obtained using residual minimization, traditional BMU (lower 95th percentile bound), EDMF, and modified BMU is 70, 40, 52, and 53 years, respectively. However, as noted in **Figure 7**, traditional BMU and residual minimization provided biased results from the value obtained using measurements, while EDMF and modified BMU provided robust and accurate bounds. Therefore, the lower bound of remaining fatigue life obtained using EDMF and modified BMU is a robust metric for future decision making. Using measurements, the minimum fatigue life increased from 38 to 52 years, a 37% improvement. Moreover, based on initial design, not accounting for increased loading, the remaining fatigue life of the bridge in 2033 is 62 years, which is greater than the value predicted after model updating. This implies that the bridge, if subjected to increased traffic loading, will require repair action sooner than expected during initial design, an estimate that can be improved through data-interpretation. In the next section, the applicability of these data-interpretation methodologies in practice, along with the computation time required, will be discussed.
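The numbers behind the future-loading scenario can be reproduced as follows. This is a sketch under one explicit assumption: the 2% annual increase is taken to compound yearly, which the text does not state (a linear increase would give a factor of 1.40 instead). The constant and function names are hypothetical.

```python
BUILT_YEAR = 1995
DESIGN_SERVICE_LIFE = 100  # years


def design_remaining_life(year):
    """Remaining life implied by the design service life alone."""
    return BUILT_YEAR + DESIGN_SERVICE_LIFE - year


# Assumption: 2% growth in traffic weight compounds annually over 20 years.
weight_growth_factor = 1.02 ** 20  # about 1.486

# Minimum fatigue life in 2033: prior (38 years) versus EDMF (52 years)
improvement_2033 = (52 - 38) / 38  # about 0.368

remaining_2013 = design_remaining_life(2013)  # 82
remaining_2033 = design_remaining_life(2033)  # 62
```

The design-based value of 62 years in 2033 exceeds the 52 years obtained after model updating, which is the comparison the paragraph above relies on.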

### **3.8. Applicability in Practice and Computation Time**

Application of data-interpretation in practice requires a methodology to satisfy four criteria. First, it should provide accurate identification of model parameters and second, accurate prediction of structural response for reserve capacity estimation. Third, it should be able to incorporate engineering knowledge within the framework. Fourth, the methodology should be easy to understand and use, to enable iterative asset-management decision making.

In Sections 3.5, 3.6, and 3.7, results obtained using the three probabilistic data-interpretation methodologies and residual minimization are compared. The objective of the comparison in these sections was to elucidate the accuracy of these methodologies under uncertainty conditions in a real environment and the utility of their solutions in practice.

From the perspective of incorporating engineering knowledge into the data-interpretation framework, traditional BMU utilizes a zero-mean Gaussian likelihood function for model updating. Information about model bias is not incorporated in traditional BMU. Moreover, traditional BMU involves the assumption that uncertainties at different measurement locations are independent, which is not compatible with a closed system such as civil infrastructure. Residual minimization also cannot include non-parametrized model bias. In traditional BMU and residual minimization, as new sources of uncertainty are identified over the service life of a structure, they can be incorporated into the framework explicitly only as parameters to be identified. Inverse problems such as structural identification have exponential (big-*O*) complexity with respect to the number of parameters to be identified. Therefore, an increase in the number of parameters to be identified exponentially increases the computation time, which is clearly not desirable.
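The two assumptions of traditional BMU discussed above can be made explicit by writing out a generic zero-mean, independent Gaussian likelihood. The notation is an assumption introduced here for illustration and is not taken from the paper: $y_i$ is the measurement at location $i$, $g_i(\boldsymbol{\theta})$ the corresponding model prediction for parameters $\boldsymbol{\theta}$, $\sigma_i$ the assumed standard deviation of the combined uncertainty, and $n_y$ the number of measurement locations.

$$
p(\mathbf{y}\mid\boldsymbol{\theta}) = \prod_{i=1}^{n_y} \frac{1}{\sqrt{2\pi}\,\sigma_i}\exp\!\left(-\frac{\left(y_i - g_i(\boldsymbol{\theta})\right)^2}{2\sigma_i^2}\right)
$$

The product over locations encodes the independence assumption, and the absence of a mean shift inside the exponent encodes the zero-bias assumption. Accounting for either effect in this form requires additional terms whose values become parameters to be identified.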

EDMF and modified BMU provide accurate identification and prediction as shown in **Figures 5** and **7**. Both methodologies utilize engineering knowledge to determine the combined uncertainty and model bias associated with the system. Then, this information is translated into falsification criteria for EDMF and into an *L∞*-norm-based Gaussian likelihood function for modified BMU. As new information regarding uncertainty sources becomes available, it can be incorporated into the combined uncertainty, thereby not increasing the number of parameters to be identified unless required. This helps in limiting the problem dimension and preventing an exponential increase in computation time. **Figure 9** shows the computation time required with increasing number of samples for four combinations of the probabilistic data-interpretation methodologies.

In **Figure 9**, comparisons of computation time are provided when the data-interpretation methodologies are utilized in a series and a parallel computation framework. Section 3.4 contains a description of the hardware used in computation. Modified and traditional BMU utilize MCMC sampling to obtain samples of the posterior PDF, while EDMF and residual minimization utilize grid sampling to obtain the updated model parameter values. MCMC sampling is a one-step memory process requiring that samples are generated sequentially. Therefore, MCMC sampling cannot be implemented efficiently in a parallel computation framework. In this case, utilization of multiple cores for computation does not make a significant difference. In grid sampling, samples are generated independently of one another and thus, calculations can be shared more efficiently within parallel configurations, thereby significantly reducing computation time. Other parallel implementations of MCMC sampling are discussed in Section 4.
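The sequential nature of MCMC sampling is visible in a minimal random-walk Metropolis sampler. The sketch below is a generic textbook sampler for a one-dimensional target, not the sampler used in the paper; the function names, step size, and standard normal target are illustrative assumptions.

```python
import numpy as np


def metropolis(log_post, theta0, n_samples, step, seed=0):
    """Random-walk Metropolis sampler for a scalar parameter."""
    rng = np.random.default_rng(seed)
    chain = np.empty(n_samples)
    theta, lp = theta0, log_post(theta0)
    accepted = 0
    for i in range(n_samples):
        # Each proposal is centered on the current state, so iteration i
        # cannot start before iteration i - 1 has finished.
        proposal = theta + step * rng.standard_normal()
        lp_proposal = log_post(proposal)
        if np.log(rng.uniform()) < lp_proposal - lp:
            theta, lp = proposal, lp_proposal
            accepted += 1
        chain[i] = theta
    return chain, accepted / n_samples


def gaussian_log_post(theta):
    """Unnormalized log-density of a standard normal target."""
    return -0.5 * theta ** 2


chain, acceptance_rate = metropolis(gaussian_log_post, 0.0, 5000, 1.0)
```

Because every proposal depends on the previous state, a single chain cannot be split across cores in the way that independent grid evaluations can.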

As shown in **Figure 9**, modified BMU takes the maximum time to obtain the complete posterior PDF using MCMC sampling. Due to the steep ascent of the box-car-shaped likelihood function before reaching the high-probability plateau, many samples are evaluated and rejected for each accepted sample. This increases the computation time needed to obtain the joint posterior PDF. Traditional BMU, which also utilizes MCMC sampling, takes less time than modified BMU as the likelihood function utilized has a gradual slope, which is more favorable for exploring the parameter space. The number of samples rejected before accepting a sample is lower; therefore, the total computation time is lower. EDMF requires more computation time than traditional BMU when grid sampling is run in series. However, once the grid sampling is parallelized with sections of the grid passed to 24 cores for computation, the time taken is reduced. The reduction in computation time is dependent on the number of cores available. In the comparison shown in **Figure 9**, the parallel computing setup uses 24 cores, thereby reducing the computation time for grid sampling by a factor of 24 when computations are completely independent, as is the case for grid sampling. Residual minimization, when applied using Monte Carlo sampling, has the same computation time as EDMF using single or multiple cores. A drawback of grid sampling is that, as the number of model parameters increases, the number of parameter combinations to be evaluated increases exponentially. Therefore, optimal model-class selection is very important to limit computational cost without compromising the accuracy of updated model predictions.
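The exponential growth of a full grid and the effect of sharing independent evaluations among cores can be sketched as follows. The function names and the assumption of a constant time per model evaluation are illustrative and ignore communication overhead.

```python
def grid_evaluations(points_per_parameter, n_parameters):
    """Number of model evaluations in a full-factorial grid."""
    return points_per_parameter ** n_parameters


def ideal_parallel_time(n_evaluations, seconds_per_evaluation, n_cores):
    """Wall time when independent evaluations are shared evenly among cores."""
    batches = -(-n_evaluations // n_cores)  # ceiling division
    return batches * seconds_per_evaluation


two_parameters = grid_evaluations(10, 2)   # 100 evaluations
four_parameters = grid_evaluations(10, 4)  # 10,000 evaluations

serial_time = ideal_parallel_time(two_parameters, 1.0, 1)     # 100.0 s
parallel_time = ideal_parallel_time(two_parameters, 1.0, 24)  # 5.0 s
```

In this toy case the speed-up is 20 rather than 24 because 100 evaluations do not divide evenly among 24 cores; the full factor is reached only when the grid size is a multiple of the core count.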

The fourth criterion is compatibility of the data-interpretation methodology with the knowledge and procedures of practicing engineers. Methodologies should be transparent when updating knowledge of model parameters. MCMC sampling, utilized in traditional and modified BMU, is a black-box algorithm and thus provides low transparency to engineers for understanding the process involved in accepting or rejecting samples. Also, sampling metrics such as burn-in samples, step size, and the number of samples required to obtain a stable solution can be determined only through trial and error. Therefore, many iterations of the sampling process are required to determine these metrics and to converge efficiently on posterior PDFs. EDMF typically utilizes grid sampling, wherein a grid of initial model instances is generated. From this initial grid, only model instances whose responses are comparable to measurements within certain bounds of uncertainty (*Thigh* and *Tlow*) are accepted as candidate model instances. The falsification criterion is based on simple accept-reject decision making. Engineers are able to better understand EDMF, thus increasing robustness of solutions when information changes over service lives of structures.
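The accept-reject logic described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' implementation: the function `edmf_candidates`, the toy two-parameter model, and the threshold values are hypothetical, and a model instance is kept only if its residual lies between the lower and upper thresholds at every measurement location.

```python
import itertools

import numpy as np


def edmf_candidates(grid, predict, measurements, t_low, t_high):
    """Keep model instances whose residuals lie within the thresholds everywhere."""
    candidates = []
    for theta in grid:
        residuals = predict(theta) - measurements
        if np.all((residuals >= t_low) & (residuals <= t_high)):
            candidates.append(theta)  # not falsified
    return candidates


def toy_model(theta):
    """Hypothetical model with two parameters and two measured responses."""
    a, b = theta
    return np.array([a + b, 2.0 * a - b])


# 5 x 5 grid of initial model instances
grid = list(itertools.product(np.linspace(0, 4, 5), np.linspace(0, 4, 5)))
measurements = np.array([3.0, 3.0])  # generated by (a, b) = (2, 1)
t_low = np.array([-0.5, -0.5])
t_high = np.array([0.5, 0.5])

candidates = edmf_candidates(grid, toy_model, measurements, t_low, t_high)
```

Every decision is a comparison against fixed bounds, so each rejected instance can be traced to the measurement location that falsified it.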

Currently, residual minimization is the most commonly used methodology. However, as shown in **Figures 5** and **7**, it does not always provide accurate solutions for estimating reserve capacity. EDMF provides accurate identification and prediction utilizing a simple accept-reject criterion for determining the updated model parameters and requires lower computational resources than other probabilistic methodologies when implemented in a parallel framework.

### **4. DISCUSSION**

Decision making for infrastructure asset management is a complex task, which can be aided through a better understanding of structural behavior using measurements. It is important that the data-interpretation methodology provides accurate predictions while being easy to use and incorporating new information and engineering knowledge over the service life of a structure. Comparison between four data-interpretation methodologies using the Venoge bridge revealed the applicability of these methodologies for reserve capacity estimation. In addition to structural identification (inverse task), the remaining fatigue life of the Venoge bridge (forward task) was predicted under three traffic loading scenarios using updated model parameters. The critical detail analyzed to determine the remaining fatigue life of the bridge is situated in the extended part of the bridge, built in 1995. The design remaining fatigue life of this bridge is 82 years in 2013, which after model updating is estimated to be 126 years.

Residual minimization is a deterministic data-interpretation methodology, which is commonly used in practice due to its ease of understanding. Using updated model parameters obtained through residual minimization, the equivalent stress range at sensors A5 and B5 and remaining fatigue life at a critical detail under two traffic loading scenarios were predicted and compared to results obtained using measurements. The prediction was not accurate in any of the four prediction cases considered. Residual minimization does not always provide robust identification in the presence of unknown correlated uncertainty with model bias and systematic uncertainty, an observation previously noted by Goulet and Smith (2013) for a cantilever beam.

Traditional BMU is a probabilistic methodology that utilizes a zero-mean independent Gaussian likelihood function for model updating. Using updated model parameters obtained through traditional BMU, accurate prediction was observed for only two out of four cases, when compared with results obtained using measurements. Traditional BMU may provide biased predictions when model bias and correlation between uncertainties at measurement locations are not accounted for in model updating. This has been previously observed by Pasquier and Smith (2015) using an idealized simple beam. This paper makes similar observations for a full-scale bridge. Improvement in prediction accuracy may be achieved by parametrizing model bias and correlation between uncertainties at various measurement locations. However, identification of the additional parameters increases the dimensionality of the structural identification problem, thereby increasing computational cost, and such strategies usually involve assumptions of constant bias at all measurement locations. Also, with few sparse measurements, as in the case presented in this paper, a model class with many parameters may lead to unidentifiability (Reuland et al., 2017).

EDMF and modified BMU, which are robust to variations in correlation assumptions, provided accurate identification and prediction for the four prediction cases when compared with results obtained using measurements. Moreover, results obtained using EDMF and modified BMU are similar, implying that EDMF can be understood as an analogous and discrete approach to Bayesian model updating, based on the philosophy of model falsification. This result was previously observed using an idealized simple beam by Pai and Smith (2017). A drawback of applying EDMF for model updating is its sensitivity to the presence of outliers in measurements. Therefore, before application of EDMF, it is important to detect outliers in measurements and clean the data.

Predictions obtained using updated model parameters are subject to prediction uncertainty. This prediction uncertainty arises due to the difference between the model used in identification (to predict model response comparable to measurements) and the model used for prediction. In the case studied here, as shown in **Figure 6**, the identification and prediction uncertainty are different. The prediction uncertainty, similar to identification uncertainty, is computed using engineering knowledge and numerical evaluation of the finite-element and surrogate models. A different prediction uncertainty model, than the one used in this paper, may be employed based on engineering knowledge. However, as prediction results obtained using traditional BMU, modified BMU and EDMF are subjected to the same uncertainty model, the bias between predictions from various methods will not change.

The posterior distributions of model parameters and predictions, such as equivalent stress range and remaining fatigue life, that are obtained through model updating have large uncertainty ranges. The information contained in measurements has not reduced the uncertainty associated with the model parameters significantly. The use of in-service strain and traffic measurements provides engineers with the possibility to gain information without disrupting bridge traffic. However, such measurements are associated with uncertainties such as traffic load position on the bridge and axle weight of the traffic moving on the bridge. Moreover, the strain gages are clustered at two cross-sections of the bridge, so the information obtained from these sensors has high redundancy. Better positioning of sensors based on modern measurement system design strategies could help improve the amount of information that is acquired with the sensors. In addition, conducting load tests with knowledge of weight and position of the trucks may help in acquiring additional information. Parameter estimation and prediction can be further refined by improving knowledge of materials through non-destructive testing, improved modeling of the bridge and a more detailed fatigue evaluation with more appropriate S–N curves.

Engineering knowledge of uncertainty sources cannot be included in residual minimization or traditional BMU without increasing the number of parameters to be identified, which increases computation time exponentially. EDMF and modified BMU account for estimation of uncertainty from many sources using engineering knowledge by incorporating it into a combined uncertainty PDF, which is then used to determine either the falsification criteria or the likelihood function. EDMF has an additional advantage as it utilizes a simple accept–reject criterion for model updating, which is easy to understand and implement. Traditional BMU and modified BMU, when applied using MCMC sampling, cannot be implemented efficiently in a parallel computation framework. An alternative may be transitional MCMC sampling proposed by Ching and Chen (2007). Angelikopoulos et al. (2012) have implemented transitional MCMC sampling in a parallel computation framework, wherein independent Markov Chains are generated by individual cores. Usage of transitional MCMC could help in sampling a complex parameter space when the joint posterior distribution of model parameters is multimodal. However, the use of black-box search algorithms decreases the understandability of the methods for use in practice. Comparison of the methodologies is summarized in **Table 6**.

**TABLE 6** | Comparison of criteria for the data-interpretation methodologies that are studied in this paper.

*The checks indicate satisfactory performance while the crosses indicate unacceptable performance. The roman numerals used for the last two criteria are rankings where I is the best.*

As summarized in **Table 6**, EDMF fulfils all the criteria required of a data-interpretation methodology for use in practice. In addition, grid sampling used in EDMF can be implemented in a parallel computation framework, thereby reducing computation cost. EDMF and other data-interpretation methodologies were also used to predict the remaining fatigue life of the Venoge bridge under a future traffic loading scenario. In this scenario, the traffic weight was assumed to increase 2% annually over the next 20 years. The minimum remaining fatigue life predicted after model updating using EDMF, under increased traffic loading, was lower than the service life. Robust prediction of remaining fatigue life for such future scenarios enables use of data-interpretation methodologies in scheduling inspections and deciding on asset-management actions.

### **5. CONCLUSION**

In this paper, four data-interpretation methodologies are applied to evaluate the fatigue life of a highway bridge under monitored traffic loading. Comparisons are made in terms of parameter identification and accuracy of predictions with respect to measured structural response. Applications of the four methodologies to the Venoge bridge lead to the following conclusions:

*•* Measurements of service behavior improve the accuracy of remaining fatigue life calculations. The minimum remaining fatigue life of the Venoge bridge is improved by 54% using in-service measurement from eight strain gages and observed traffic load from a WIM station.


### **AUTHOR CONTRIBUTIONS**

SP elaborated the application of various data-interpretation methodologies to a full-scale case study for fatigue life evaluation. AN assisted in the elaboration of the case study and fatigue life evaluation. IS was actively involved in developing and adapting the data-interpretation methodologies. All authors reviewed and accepted the final version.

### **ACKNOWLEDGMENTS**

The authors acknowledge Y. Reuland for fruitful discussions.

### **FUNDING**

This work was funded by the Swiss National Science Foundation under contract no. 200020-169026 and Singapore-ETH Centre (SEC) under contract no. FI 370074011-370074016.

### **REFERENCES**

Miner, M. A. (1945). Cumulative damage in fatigue. *J. Appl. Mech.* 12, A159–A164.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2018 Pai, Nussbaumer and Smith. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Modeling Error Estimation and Response Prediction of a 10-Story Building Model Through a Hierarchical Bayesian Model Updating Framework

#### Mingming Song<sup>1</sup>, Iman Behmanesh<sup>2</sup>, Babak Moaveni<sup>1</sup>\* and Costas Papadimitriou<sup>3</sup>

<sup>1</sup> Department of Civil and Environmental Engineering, Tufts University, Medford, MA, United States, <sup>2</sup> WSP USA, New York, NY, United States, <sup>3</sup> Department of Mechanical Engineering, University of Thessaly, Volos, Greece

#### Edited by:

Ian F. C. Smith, École Polytechnique Fédérale de Lausanne, Switzerland

#### Reviewed by:

Feng-Liang Zhang, Tongji University, China; Matteo Pozzi, Carnegie Mellon University, United States; Yves Reuland, École Polytechnique Fédérale de Lausanne, Switzerland

#### \*Correspondence:

Babak Moaveni, babak.moaveni@tufts.edu

#### Specialty section:

This article was submitted to Structural Sensing, a section of the journal Frontiers in Built Environment

Received: 17 October 2018; Accepted: 11 January 2019; Published: 31 January 2019

#### Citation:

Song M, Behmanesh I, Moaveni B and Papadimitriou C (2019) Modeling Error Estimation and Response Prediction of a 10-Story Building Model Through a Hierarchical Bayesian Model Updating Framework. Front. Built Environ. 5:7. doi: 10.3389/fbuil.2019.00007

In this paper, a hierarchical Bayesian model updating approach is proposed for calibration of model parameters, estimation of modeling error, and response prediction of dynamic structural systems. The approach is especially suitable for civil structural systems where modeling errors are usually significant. The proposed framework is demonstrated through a numerical case study, namely a 10-story building model. The "measured data" include the numerically simulated modal parameters of a frame model which represents the true structure. A simplified shear building model with significant modeling errors is then considered for model updating with stiffness of different structural components (substructures) chosen as updating parameters. In the proposed hierarchical Bayesian framework, updating parameters are assumed to follow a known distribution model (normal distribution is considered here) and are characterized by the distribution parameters (mean vector and covariance matrix). The error function, which is defined as the misfit between model-predicted and identified modal parameters, is also assumed to follow a normal distribution with unknown parameters. The hierarchical Bayesian approach is applied to estimate the stiffness parameter distributions with mean and covariance matrix referred to as hyperparameters, as well as the modeling error which is quantified by the mean and covariance of the error function. The joint posterior probability distribution of all updating parameters is derived from the likelihood function and the prior distributions. A Metropolis-Hastings within Gibbs sampler is implemented to evaluate the joint posterior distribution numerically. Two cases of model updating are studied, with the first case assuming a zero mean for the error function and the second case considering a non-zero error mean. The response time history of the building to a ground motion is predicted using the calibrated shear building model for both cases and compared with the exact response (simulated). Good agreement between predictions and measurements is observed for both cases, with better accuracy in the second case. This verifies the proposed hierarchical Bayesian approach for model calibration and response prediction and underlines the importance of considering and propagating the uncertainties of structural parameters and, more importantly, modeling errors.

Keywords: hierarchical Bayesian model updating, modeling error estimation, uncertainty quantification and propagation, probabilistic response prediction, Metropolis-Hastings within Gibbs sampler

## INTRODUCTION

Finite element (FE) model updating is one of the most common methods for response prediction and performance assessment of structural systems (Mottershead and Friswell, 1993; Friswell and Mottershead, 2013). In the deterministic formulation, model updating includes an optimization process to obtain model parameter values (e.g., geometry, mass, stiffness) that minimize the misfit between model-predicted and experimentally measured data features. Data features of interest include acceleration or strain response time history, or modal parameters such as natural frequencies and mode shapes. Several applications of deterministic model updating have been reported for response prediction and performance assessment of real-world structures with relative success (Capecchi and Vestroni, 1993; Levin and Lieven, 1998; Friswell et al., 2001; Bakir et al., 2008; Fang et al., 2008; Perera and Ruiz, 2008; Jafarkhani and Masri, 2011). Brownjohn et al. applied model updating for dynamic assessment of a cable-stayed bridge (Brownjohn and Xia, 2000) and a highway bridge (Brownjohn et al., 2003). Teughels et al. performed damage detection of a highway bridge through FE model updating (Teughels et al., 2002, 2003; Teughels and De Roeck, 2004). More applications to real-world bridges can be found in these studies (Zhang et al., 2001; Jaishi and Ren, 2006; Jaishi et al., 2007; Reynders et al., 2007). Moaveni et al. employed model updating for progressive damage identification of a 2/3-scale reinforced concrete (RC) frame (Moaveni et al., 2012). Song et al. performed damage identification of a two-story RC building and compared it with lidar measurement (Song et al., 2017). However, deterministic approaches have their shortcomings. For example, they are unable to quantify the uncertainty of updating results and are only valid when unique optimal solutions exist, i.e., the inverse problem is identifiable.
The uncertainty quantification issue and identifiability problem can be addressed in a probabilistic formulation of model updating such as Bayesian model updating. Beck et al. derived the framework for Bayesian model updating and presented some numerical applications (Beck and Katafygiotis, 1998; Katafygiotis and Beck, 1998; Beck et al., 2001; Beck and Au, 2002). Yuen et al. applied Bayesian model updating for damage identification of the numerical ASCE-IASC benchmark structure (Yuen et al., 2004). Behmanesh and Moaveni performed probabilistic identification of the simulated damage on a footbridge through Bayesian inference (Behmanesh and Moaveni, 2015). More applications of Bayesian model updating to numerical and experimental case studies can be found in the literature (Sohn and Law, 1997; Ching and Beck, 2004; Muto and Beck, 2008; Ntotsios et al., 2009).

In the application of model updating to real-world civil structures, three major sources of uncertainty must be considered: (1) measurement noise and identification error (e.g., in extraction of modal parameters), (2) variability in effective model parameters due to the changing in-service ambient and environmental conditions (change in effective mass, damping, stiffness due to temperature, humidity, wind load, and occupancy, etc.), and (3) modeling errors (e.g., linearity assumption, boundary conditions, and discretization). Although the classical Bayesian model updating approaches often consider the effects of measurement noise and identification error, the second and third sources of uncertainty are not explicitly accounted for. The second source of uncertainty is relatively unique to large-scale civil structures, and referred to as inherent variability. In past studies (Alampalli, 2000; Clinton et al., 2006; Moser and Moaveni, 2011), identified natural frequencies of different structural systems are reported to be significantly affected by temperature, humidity, and weather conditions. Furthermore, different levels of ambient loading such as wind and traffic load cause changes in effective structural stiffness. The proposed hierarchical Bayesian model updating framework is capable of accounting for these sources of uncertainty by estimating the probability distributions of updating parameters characterized by hyperparameters (Behmanesh et al., 2015; Behmanesh and Moaveni, 2016).

Simplifying assumptions cannot be avoided when modeling complex civil structures and they often lead to significant modeling errors. The classical Bayesian model updating framework cannot explicitly quantify the modeling errors since all three sources of uncertainty mentioned above are lumped into one term. However, the classical formulation is useful for model class selection among competing model forms (Ching and Chen, 2007; Song et al., 2018). The error-domain model falsification algorithm has been shown to be capable of falsifying model instances/classes in view of their compatibility with measurements by avoiding assumptions on the exact distribution of modeling errors and residual dependency (Goulet and Smith, 2013; Goulet et al., 2013; Pasquier and Smith, 2015). Comparisons between error-domain model falsification and Bayesian model updating approaches regarding prediction accuracy and robustness were recently made (Reuland et al., 2017; Pai et al., 2018). In the proposed hierarchical Bayesian framework, the influence of modeling errors is quantified by fitting and estimating the probability distribution of error functions characterized by the distribution parameters, e.g., mean and covariance in a normal distribution. The estimated error mean reflects the modeling bias which causes a shift in model predictions, while the covariance matrix accounts for the effect of measurement noise and identification error, as well as the uncertainty due to modeling errors.

In this paper, the proposed hierarchical Bayesian model updating approach is implemented for probabilistic response prediction of a numerical 10-story building model. A frame model which represents the considered true structure is used to simulate the measurements. A simplified shear building model is created and used for model updating to represent significant modeling errors. The stiffnesses of different stories in the shear building model (substructures) are selected as the updating parameters and are assumed to follow normal distributions which are characterized by stiffness mean and covariance. The error function is defined as the difference between identified modal parameters and their model-predicted counterparts and is also assumed to follow a normal distribution. The hierarchical Bayesian approach is implemented to estimate the stiffness mean and covariance (referred to as hyperparameters) as well as the modeling errors. The mean of the error function is assumed to be zero in the first case of model updating. However, significant bias is observed in the predicted natural frequencies, which prompts a second case of model updating with non-zero error mean. Finally, displacement and acceleration time histories are predicted using the calibrated models and are compared with measured data for both cases.

## HIERARCHICAL BAYESIAN MODEL UPDATING FRAMEWORK

### Formulation of Hierarchical Bayesian Approach

The probability distribution of updating structural parameters θ (e.g., stiffness of different building components) is assumed to be normal, which is characterized by the mean vector and covariance matrix referred to as hyperparameters, θ ∼ N(µ<sub>θ</sub>, Σ<sub>θ</sub>). The error function, which is defined as the misfit between measured data (or data features such as identified modal parameters) and their model-predicted counterparts, is also assumed to follow a normal distribution with mean µ<sub>e</sub> and covariance matrix Σ<sub>e</sub>. The proposed framework allows estimation of the posterior probability distribution of updating parameters and hyperparameters, namely µ<sub>θ</sub>, Σ<sub>θ</sub>, µ<sub>e</sub>, and Σ<sub>e</sub>. **Figure 1** shows the graphical representation of the proposed hierarchical Bayesian framework. The influence of changing ambient and environmental conditions on structural stiffness is accounted for by hyperparameters µ<sub>θ</sub> and Σ<sub>θ</sub>. The effect of modeling assumptions on the error function can vary across different types of structures, structural components, and material. The modeling errors are assumed to follow a joint normal distribution in this study. The mean of error function µ<sub>e</sub> represents a modeling bias, and covariance of error function Σ<sub>e</sub> includes the contribution of measurement noise, identification error and modeling error, with modeling error generally having the largest influence.

In model updating applications, modeling bias (mean of modeling error) is commonly assumed to be zero. This assumption can be verified once the updating is completed by evaluating the error mean: if the error mean is negligible, the assumption was justified. Under this assumption, the error function can be written as:

$$e\_t = \begin{bmatrix} e\_{\lambda\_t} \\ e\_{\Phi\_t} \end{bmatrix} \sim N(0, \Sigma\_e) \tag{1}$$

in which e<sub>t</sub> is the error function for dataset t and consists of two parts: eigen-frequency error e<sub>λ<sub>t</sub></sub> and mode shape error e<sub>Φ<sub>t</sub></sub>, defined as:

$$e\_{\lambda\_{tm}} = \frac{\tilde{\lambda}\_{tm} - \lambda\_m(\theta\_t)}{\lambda\_m(\theta\_t)}\tag{2}$$

$$e\_{\Phi\_{tm}} = \frac{\tilde{\Phi}\_{tm}}{\left\|\tilde{\Phi}\_{tm}\right\|} - a\_{tm} \frac{\Gamma \Phi\_m(\theta\_t)}{\left\|\Gamma \Phi\_m(\theta\_t)\right\|} \tag{3}$$

Subscript m denotes the mode number, λ<sub>m</sub>(θ<sub>t</sub>) and Φ<sub>m</sub>(θ<sub>t</sub>) are the model-predicted eigen-frequency (λ<sub>m</sub>(θ<sub>t</sub>) = (2πf<sub>tm</sub>(θ<sub>t</sub>))<sup>2</sup>, in which f<sub>tm</sub>(θ<sub>t</sub>) is the natural frequency in Hz) and mode shape in dataset t, and λ̃<sub>tm</sub> and Φ̃<sub>tm</sub> are their identified counterparts. The natural frequencies and mode shapes extracted from the vibration measurement are referred to as identified modal parameters in this paper. Γ is a Boolean matrix which maps corresponding degrees of freedom (DOFs) between Φ<sub>m</sub>(θ<sub>t</sub>) and Φ̃<sub>tm</sub>. a<sub>tm</sub> is a scaling factor defined as:

$$a\_{tm} = \frac{\tilde{\Phi}\_{tm}^T \Gamma \Phi\_m(\theta\_t)}{\left\|\tilde{\Phi}\_{tm}\right\| \left\|\Gamma \Phi\_m(\theta\_t)\right\|}\tag{4}$$
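As an illustration only, the error function of Equations (2)-(4) could be evaluated for a single mode as in the following minimal Python sketch. The function name `modal_error`, the 4-DOF mode shape, and the sensor layout are hypothetical and are not taken from this study; the example simply shows that the scaling factor removes differences in sign and scale between identified and model-predicted mode shapes.

```python
import numpy as np

def modal_error(lam_id, phi_id, lam_model, phi_model, gamma):
    """Error function of Equations (2)-(4) for one mode of one dataset.

    lam_id, lam_model : identified and model-predicted eigen-frequencies
    phi_id            : identified mode shape at the sensor DOFs
    phi_model         : model-predicted mode shape at all model DOFs
    gamma             : Boolean matrix selecting the sensor DOFs
    """
    e_lam = (lam_id - lam_model) / lam_model          # Equation (2)
    g_phi = gamma @ phi_model
    n_id, n_mod = np.linalg.norm(phi_id), np.linalg.norm(g_phi)
    a = (phi_id @ g_phi) / (n_id * n_mod)             # Equation (4)
    e_phi = phi_id / n_id - a * g_phi / n_mod         # Equation (3)
    return e_lam, e_phi

# Hypothetical 4-DOF model observed at DOFs 0 and 2
gamma = np.zeros((2, 4))
gamma[0, 0] = gamma[1, 2] = 1.0
phi_model = np.array([0.2, 0.5, 0.8, 1.0])
phi_id = -3.0 * gamma @ phi_model   # same shape, different sign and scale
e_lam, e_phi = modal_error(102.0, phi_id, 100.0, phi_model, gamma)
print(e_lam, e_phi)
```

Because the identified shape here is an exact multiple of the model shape, the mode shape error vanishes and only the 2% eigen-frequency misfit remains.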

The assumption of µ<sub>e</sub> = 0 considers negligible modeling bias, and in the case of significant modeling bias a nonzero µ<sub>e</sub> should be considered. Due to the compensation effect between µ<sub>θ</sub> and µ<sub>e</sub>, these two terms cannot be estimated simultaneously. Therefore, µ<sub>e</sub> is not updated through a Bayesian inference but is evaluated from the obtained results, as demonstrated in section Case 2: Model Updating With µ<sub>e</sub> ≠ 0. The covariance matrix Σ<sub>e</sub> is assumed to be a diagonal matrix which neglects the correlation between different error function components.

$$\Sigma\_e = \begin{bmatrix} \ddots & & \\ & \sigma\_{e\_i}^2 & \\ & & \ddots \end{bmatrix} \tag{5}$$

Note that a full matrix can also be estimated in this framework, but this would increase the computational burden of the updating process. Based on the authors' past experience, the use of a diagonal covariance matrix is reasonable in many applications. However, this is not true for all applications, and errors in frequency and mode shape components of the same mode can be correlated. In the case of error function dependency, the estimated diagonal covariance matrix is an approximation of the full matrix.

The posterior probability density function (PDF) is proportional to the multiplication of the likelihood function and prior PDFs which are assumed to be independent (Gelman et al., 2013), as shown below:

$$\begin{split} &p\left(\theta\_t, \mu\_\theta, \Sigma\_\theta, \Sigma\_e \,\middle|\, \tilde{\lambda}\_t, \tilde{\Phi}\_t\right) \\ &\propto p\left(\tilde{\lambda}\_t, \tilde{\Phi}\_t \,\middle|\, \theta\_t, \mu\_\theta, \Sigma\_\theta, \Sigma\_e\right) p\left(\theta\_t, \mu\_\theta, \Sigma\_\theta, \Sigma\_e\right) \end{split} \tag{6}$$

$$\propto p\left(\tilde{\lambda}\_t, \tilde{\Phi}\_t \,\middle|\, \theta\_t, \Sigma\_e\right) p\left(\theta\_t \,\middle|\, \mu\_\theta, \Sigma\_\theta, \Sigma\_e\right) p\left(\mu\_\theta, \Sigma\_\theta, \Sigma\_e\right) \tag{7}$$

$$\propto p\left(\tilde{\lambda}\_t, \tilde{\Phi}\_t \,\middle|\, \theta\_t, \Sigma\_e\right) p\left(\theta\_t \,\middle|\, \mu\_\theta, \Sigma\_\theta\right) p\left(\mu\_\theta\right) p\left(\Sigma\_\theta\right) p\left(\Sigma\_e\right) \tag{8}$$

Equation (7) is derived based on the fact that the identified modal parameters only depend on the structural stiffness θ<sub>t</sub> and the error function; therefore, the condition on hyperparameters µ<sub>θ</sub> and Σ<sub>θ</sub> can be discarded from Equation (6). In addition, structural stiffness is only dependent on its hyperparameters; therefore, the condition on Σ<sub>e</sub> can be dropped, and by assuming µ<sub>θ</sub>, Σ<sub>θ</sub>, and Σ<sub>e</sub> are independent in their joint prior distribution, Equation (8) can be obtained. When multiple datasets are available and considered, the joint posterior PDF can be derived by assuming different datasets are independent:

$$p\left(\Theta, \mu\_\theta, \Sigma\_\theta, \Sigma\_e \,\middle|\, \tilde{\lambda}, \tilde{\Phi}\right) \propto p\left(\mu\_\theta\right) p\left(\Sigma\_\theta\right) p\left(\Sigma\_e\right) \prod\_{t=1}^{N\_t} p\left(\tilde{\lambda}\_t, \tilde{\Phi}\_t \,\middle|\, \theta\_t, \Sigma\_e\right) p\left(\theta\_t \,\middle|\, \mu\_\theta, \Sigma\_\theta\right) \tag{9}$$

where Θ = [θ<sub>1</sub> … θ<sub>t</sub> … θ<sub>N<sub>t</sub></sub>], λ̃ = [λ̃<sub>1</sub> … λ̃<sub>t</sub> … λ̃<sub>N<sub>t</sub></sub>], Φ̃ = [Φ̃<sub>1</sub> … Φ̃<sub>t</sub> … Φ̃<sub>N<sub>t</sub></sub>], and N<sub>t</sub> denotes the number of datasets.

In this study, a uniform prior PDF is assumed for µ<sub>θ</sub> and "conjugate priors" (Gelman et al., 2013) are used for Σ<sub>θ</sub> and Σ<sub>e</sub> (σ<sup>2</sup><sub>e<sub>i</sub></sub>) to simplify the formulation as shown below.

$$p(\mu\_{\theta}) \propto 1\tag{10}$$

$$
\Sigma\_{\theta} \sim \text{Inverse-Wishart}(\Sigma\_{\theta0}, \nu\_1) \tag{11}
$$

$$
\sigma\_{e\_i}^2 \sim \text{Inverse-}\chi^2(\nu\_2, \sigma\_{e0}^2) \tag{12}
$$

In the above equations, ν<sub>1</sub>, ν<sub>2</sub>, Σ<sub>θ0</sub>, and σ<sup>2</sup><sub>e0</sub> are the parameters of the prior PDFs. The selection of these parameters can influence the final posterior distribution and should be made based on prior knowledge and engineering expertise. For the considered inverse-Wishart and inverse-χ<sup>2</sup> distributions, smaller values of ν<sub>1</sub> and ν<sub>2</sub> would "flatten/widen" the prior PDFs indicating larger prior uncertainties, and Σ<sub>θ0</sub> and σ<sup>2</sup><sub>e0</sub> would reflect the mode of distributions.

The joint posterior PDF is derived by substituting the likelihood function and conjugate prior PDFs into Equation (9) as shown below.

$$\begin{aligned} &p\left(\Theta, \mu\_\theta, \Sigma\_\theta, \Sigma\_e \,\middle|\, \tilde{\lambda}, \tilde{\Phi}\right) \\ &\propto \left|\Sigma\_\theta\right|^{-\frac{N\_t + \nu\_1 + N\_p + 1}{2}} \prod\_{i=1}^{N\_e} \left(\sigma\_{e\_i}^2\right)^{-\frac{N\_t + \nu\_2 + 2}{2}} \\ &\quad \exp\left[\sum\_{t=1}^{N\_t} \left(J\_{e\_t} + J\_{\theta\_t}\right) - \frac{1}{2} \text{tr}\left(\Sigma\_{\theta 0} \Sigma\_\theta^{-1}\right) - \sum\_{i=1}^{N\_e} \frac{\nu\_2 \sigma\_{e0}^2}{2 \sigma\_{e\_i}^2}\right] \end{aligned} \tag{13}$$

$$J\_{e\_t} = -\frac{1}{2} e\_t^T \Sigma\_e^{-1} e\_t \tag{14}$$

$$J\_{\theta\_t} = -\frac{1}{2} (\theta\_t - \mu\_\theta)^T \Sigma\_\theta^{-1} (\theta\_t - \mu\_\theta) \tag{15}$$

Here N<sub>p</sub> is the dimension of stiffness parameters θ, N<sub>e</sub> is the dimension of error function e<sub>t</sub> and is equal to (1 + N<sub>s</sub>)N<sub>m</sub>, and N<sub>m</sub> and N<sub>s</sub> denote the number of available modes and the number of components (sensors) of the identified mode shape, respectively.

### Metropolis-Hastings Within Gibbs Sampler

The derived joint posterior PDF in Equation (13) is only known up to a normalizing constant, and it is often difficult to evaluate it analytically. Gibbs sampler, which belongs to the class of Markov Chain Monte Carlo (MCMC) methods, has been shown to be capable of sampling and evaluating Equation (13) efficiently. Gibbs sampler requires the derivation of posterior conditional PDFs which are listed below:

$$p\left(\theta\_t \,\middle|\, \mu\_\theta, \Sigma\_\theta, \Sigma\_e, \tilde{\lambda}\_t, \tilde{\Phi}\_t\right) \propto \exp\left(J\_{e\_t} + J\_{\theta\_t}\right) \tag{16}$$

$$p\left(\mu\_\theta \,\middle|\, \Theta, \Sigma\_\theta, \Sigma\_e, \tilde{\lambda}, \tilde{\Phi}\right) = N\left(\frac{1}{N\_t}\sum\_{t=1}^{N\_t}\theta\_t, \frac{1}{N\_t}\Sigma\_\theta\right) \tag{17}$$

$$p\left(\Sigma\_\theta \,\middle|\, \Theta, \mu\_\theta, \Sigma\_e, \tilde{\lambda}, \tilde{\Phi}\right) = \text{Inverse-Wishart}\left(\Sigma\_{\theta 0} + S, \nu\_1 + N\_t\right) \tag{18}$$

$$p\left(\sigma\_{e\_i}^2 \,\middle|\, \Theta, \mu\_\theta, \Sigma\_\theta, \tilde{\lambda}, \tilde{\Phi}\right) = \text{Inverse-}\chi^2\left(\nu\_2 + N\_t, \frac{\nu\_2 \sigma\_{e0}^2 + N\_t V\_i}{\nu\_2 + N\_t}\right) \tag{19}$$

$$S = \sum\_{t=1}^{N\_t} \left(\theta\_t - \mu\_\theta\right) \left(\theta\_t - \mu\_\theta\right)^T \tag{20}$$

$$V\_i = \frac{1}{N\_t} \sum\_{t=1}^{N\_t} e\_{t\_i}^2 \tag{21}$$

It can be seen that the posterior conditional PDFs for µ<sub>θ</sub>, Σ<sub>θ</sub>, and Σ<sub>e</sub> are standard distributions due to the use of conjugate priors; therefore, samples can be easily generated for these parameters. However, the conditional PDF for θ<sub>t</sub> is only known up to a scaling constant and therefore must be evaluated numerically. In this study, the Metropolis-Hastings (MH) algorithm (Metropolis et al., 1953; Hastings, 1970) is employed to sample θ<sub>t</sub>. The presented sampling algorithm is called MH within Gibbs sampler. The Gibbs sampler samples the parameters recursively based on the conditional PDFs, and in each loop, one sample is generated containing the values of all updating parameters.
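The sampling loop described above can be sketched on a deliberately small toy problem. The following Python example is an assumption-laden illustration, not the implementation used in this study: the forward model is a hypothetical 2-DOF spring chain with unit masses and a single stiffness parameter θ (so N<sub>p</sub> = 1 and the inverse-Wishart conditional reduces to a scaled inverse-χ<sup>2</sup> draw), only eigen-frequency errors are used, and the prior parameters, proposal step size, and chain length are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
C = np.array([(3 - np.sqrt(5)) / 2, (3 + np.sqrt(5)) / 2])  # eigenvalues of [[2,-1],[-1,1]]

def errors(theta, lam_id):
    """Equation (2) for a toy 2-DOF chain, unit masses, both springs equal to theta."""
    lam = np.outer(theta, C)                  # model eigen-frequencies, shape (Nt, 2)
    return (lam_id - lam) / lam

def log_cond_theta(theta, lam_id, mu, sig2_th, sig2_e):
    """Logarithm of Equation (16), J_e + J_theta, for every dataset at once."""
    out = np.full(theta.shape, -np.inf)
    ok = theta > 0                            # reject non-physical stiffness
    e = errors(theta[ok], lam_id[ok])
    out[ok] = -0.5 * np.sum(e**2 / sig2_e, axis=1) - 0.5 * (theta[ok] - mu)**2 / sig2_th
    return out

# Synthetic "identified" eigen-frequencies: variable stiffness plus 0.5% noise
Nt = 40
theta_true = rng.normal(2.0, 0.1, Nt)
lam_id = np.outer(theta_true, C) * (1 + 0.005 * rng.standard_normal((Nt, 2)))

nu1, sig2_th0, nu2, sig2_e0 = 3, 0.01, 1, 1e-6     # illustrative prior parameters
theta = lam_id[:, 0] / C[0]                         # start from a deterministic fit
mu, sig2_th, sig2_e = theta.mean(), 0.01, np.full(2, 1e-4)
chain = []

for it in range(3000):
    # MH step for theta_t (Equation 16); datasets are conditionally independent
    prop = theta + 0.01 * rng.standard_normal(Nt)
    log_r = (log_cond_theta(prop, lam_id, mu, sig2_th, sig2_e)
             - log_cond_theta(theta, lam_id, mu, sig2_th, sig2_e))
    accept = np.log(rng.uniform(size=Nt)) < log_r
    theta = np.where(accept, prop, theta)
    # Gibbs steps using the standard conditional distributions
    mu = rng.normal(theta.mean(), np.sqrt(sig2_th / Nt))                 # Equation (17)
    S = np.sum((theta - mu)**2)                                          # Equation (20)
    sig2_th = (sig2_th0 + S) / rng.chisquare(nu1 + Nt)                   # Equation (18), 1-D case
    V = np.mean(errors(theta, lam_id)**2, axis=0)                        # Equation (21)
    sig2_e = (nu2 * sig2_e0 + Nt * V) / rng.chisquare(nu2 + Nt, size=2)  # Equation (19)
    chain.append((mu, np.sqrt(sig2_th), *np.sqrt(sig2_e)))

post = np.array(chain[1000:])                       # discard burn-in samples
mu_hat, sig_th_hat = post[:, 0].mean(), post[:, 1].mean()
print(mu_hat, sig_th_hat)
```

In this toy setting the posterior mean of µ<sub>θ</sub> should land close to the sample mean of the simulated stiffness values. With more than one stiffness parameter, the scalar draw for the stiffness variance would be replaced by an inverse-Wishart draw, and a forward FE model would replace the closed-form eigenvalues.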

### Propagation of Uncertainties in Model-Predicted Response

After the joint posterior PDF is evaluated using the Gibbs sampler, the calibrated model can be used and assessed for prediction of structural dynamic behavior by propagating the uncertainties of inherent variability and modeling errors estimated in the hierarchical Bayesian framework. The parameter estimation uncertainties are not considered in this study, as this type of uncertainty becomes negligible when using a larger amount of data. For prediction of natural frequencies and mode shapes, the definitions of error function in Equations (2, 3) are used:

$$
\lambda\_{tm}^{\text{pre}} = \lambda\_m(\theta\_t) + \lambda\_m(\theta\_t)e\_{\lambda\_{tm}} \tag{22}
$$

$$\frac{\Phi\_{tm}^{\text{pre}}}{\|\Phi\_{tm}^{\text{pre}}\|} = a\_{tm} \frac{\Gamma \Phi\_m(\theta\_t)}{\|\Gamma \Phi\_m(\theta\_t)\|} + e\_{\Phi\_{tm}} \tag{23}$$

where θ<sub>t</sub> refers to the stiffness parameters of the calibrated model, which follow the normal distribution N(µ̂<sub>θ</sub>, Σ̂<sub>θ</sub>) in which µ̂<sub>θ</sub> and Σ̂<sub>θ</sub> refer to the maximum a posteriori (MAP) values estimated through the Gibbs sampler. The error function e<sub>t</sub> (which consists of two parts e<sub>λ<sub>tm</sub></sub> and e<sub>Φ<sub>tm</sub></sub>) follows the normal distribution N(µ̂<sub>e</sub>, Σ̂<sub>e</sub>).

Accurate prediction of response time history is critical for assessment of structural performance by using metrics such as the maximum inter-story drift of buildings during an earthquake. The modal superposition method is employed to predict response time history. These predictions propagate uncertainties due to stiffness variability and modeling errors using the model-predicted modal parameters in Equations (22, 23). The equation of motion in modal coordinates (Chopra and Chopra, 2007) is:

$$
\ddot{q}\_m(t) + 2\zeta\_m \omega\_m \dot{q}\_m(t) + \omega\_m^2 q\_m(t) = \frac{P\_m(t)}{M\_m} \tag{24}
$$

where q<sub>m</sub>(t) is the modal displacement of mode m, ω<sub>m</sub> is the circular natural frequency in rad/s with ω<sub>m</sub> = √(λ<sup>pre</sup><sub>tm</sub>), and ζ<sub>m</sub> is the damping ratio. P<sub>m</sub>(t) is the generalized force function, P<sub>m</sub>(t) = Φ<sub>m</sub><sup>T</sup>P(t), in which P(t) is the input force vector. M<sub>m</sub> is the generalized mass of mode m with M<sub>m</sub> = Φ<sub>m</sub><sup>T</sup>MΦ<sub>m</sub>. The response time history in physical coordinates can be transformed from modal displacement as shown below:

$$\wp^{\text{pre}}(t) = \sum\_{m}^{N\_m} \Phi\_m^{\text{pre}} q\_m(t) \tag{25}$$

In Equation (25), N<sub>m</sub> denotes the total number of modes used in the model calibration process, which means that only contributions of N<sub>m</sub> modes are included in the response predictions as the calibrated model is only sufficient for providing reliable predictions of these modes by propagating all uncertainties considered. Note that error function e<sub>t</sub> is only evaluated at locations with measurement/sensors, which are usually sparse. To extend the error function to DOFs which are not measured, the maximum component of σ̂<sub>e</sub>(φ<sub>i,j</sub>) is assumed for unmeasured DOFs with µ<sub>e<sub>i</sub></sub> = 0. This is a relatively conservative approach for the extension of error function. The velocity and acceleration prediction can be derived from Equation (25) by replacing q<sub>m</sub> with q̇<sub>m</sub> or q̈<sub>m</sub>. Note that the model-predicted displacement, velocity and acceleration responses are all relative to the ground.

TABLE 1 | Geometry and material property of the 10-story frame model.

Std, standard deviation.
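The modal superposition procedure of Equations (24, 25) can be sketched as follows. This is a minimal illustration under stated assumptions: the time integration uses the average-acceleration Newmark scheme (a swapped-in, standard choice, not necessarily the integrator used in this study), the structure is a hypothetical 2-story shear building with illustrative mass and stiffness, the sinusoidal ground acceleration is a stand-in for a recorded motion, and the generalized force is written for base excitation as P(t) = −M·1·a<sub>g</sub>(t).

```python
import numpy as np

def sdof_newmark(omega, zeta, p, dt):
    """Average-acceleration Newmark solution of q'' + 2*zeta*omega*q' + omega^2*q = p(t)."""
    n = len(p)
    q, v, a = np.zeros(n), np.zeros(n), np.zeros(n)
    c, k = 2 * zeta * omega, omega**2
    a[0] = p[0]                                   # at-rest initial conditions
    k_eff = k + 2 * c / dt + 4 / dt**2
    for i in range(n - 1):
        rhs = (p[i + 1] + 4 / dt**2 * q[i] + 4 / dt * v[i] + a[i]
               + c * (2 / dt * q[i] + v[i]))
        q[i + 1] = rhs / k_eff
        v[i + 1] = 2 / dt * (q[i + 1] - q[i]) - v[i]
        a[i + 1] = 4 / dt**2 * (q[i + 1] - q[i]) - 4 / dt * v[i] - a[i]
    return q, v, a

# Hypothetical 2-story shear building under ground acceleration
m, k = 4.0e3, 5.0e6                               # kg, N/m (illustrative values)
M = m * np.eye(2)
K = k * np.array([[2.0, -1.0], [-1.0, 1.0]])
lam, Phi = np.linalg.eigh(K / m)                  # valid because M = m*I
dt = 0.005
t = np.arange(0, 10, dt)
ag = 0.5 * np.sin(2 * np.pi * 1.5 * t)            # stand-in for a recorded ground motion

u = np.zeros((len(t), 2))                         # displacement relative to the ground
for j in range(2):
    phi = Phi[:, j]
    Mm = phi @ M @ phi                            # generalized mass
    Pm = -(phi @ M @ np.ones(2)) * ag             # generalized force for base excitation
    q, _, _ = sdof_newmark(np.sqrt(lam[j]), 0.02, Pm / Mm, dt)   # Equation (24)
    u += np.outer(q, phi)                         # Equation (25)
print(np.abs(u).max(axis=0))
```

In a probabilistic prediction, this deterministic solve would be repeated for each sample of the predicted modal parameters and damping ratios, so that the spread of the resulting time histories reflects the propagated uncertainties.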

### Ten-Story Building Model and Simulated Data

The proposed hierarchical Bayesian approach is applied to a numerical model of a 10-story building for validation. The identified modal parameters of the building are simulated using a frame model as shown in **Figure 2A**. Foundation rocking is modeled by two rotational springs with stiffness k<sub>r</sub> = 2 × 10<sup>5</sup> kN-m/rad. The building is assumed to be 30 m (10 × 3 m) tall and 10 m wide with a total weight of 40 metric tons (each floor mass of 4 tons). No variability of the mass is considered in this study. The cross-section and Young's modulus of the columns and floor slabs are reported in **Table 1**. The larger values of Young's modulus for lower stories represent the larger effective Young's modulus of reinforced concrete at lower stories. To account for the inherent variability of the structural stiffness, the Young's moduli of all members are assumed to follow normal distributions with means and standard deviations shown in **Table 1**. The stiffnesses of different structural members are independent except for the two columns on the same story, which are assumed to have the same stiffness. The small rectangles with arrows in **Figure 2A** refer to the considered locations of accelerometers on the building and direction of measurements.

Based on the assumed normal distribution of structural stiffness in the frame (exact) model, 100 sets of modal parameters (natural frequency and mode shape) are simulated, which represent the "measured" data. The modal parameters are polluted with white noise of 0.5% in coefficient of variation to account for the measurement noise and identification error. It is assumed that only the first three modes are identified and their histograms are shown in **Figure 3**. The mode shapes of the first three modes are shown in **Figure 4**, with mean stiffness assigned for all structural members in this graph. Note that 5 accelerometers are considered in the building; therefore, only mode shape components at these stories are available in the following model updating process.

## MODEL UPDATING RESULTS

### Case 1: Model Updating With µ<sub>e</sub> = 0

To consider the effects of modeling errors, a 10-story shear building model (instead of a frame model) as shown in **Figure 2B** is used in the model updating process. In this model, the foundation rocking is ignored by using a fixed boundary condition, the floor motion is constrained to the horizontal direction, and the slabs are assumed to be rigid. The structural columns are grouped into three substructures (stories 1–3, stories 4–6, and stories 7–10) as shown in **Figure 2B**, and the updating parameters θ<sub>1</sub>, θ<sub>2</sub>, and θ<sub>3</sub> are the Young's moduli of columns in these substructures (the same Young's modulus is assumed for all columns in each group). It is assumed that the material distribution along the height of the building is known and can be divided into three groups. The substructuring strategy is utilized to limit the number of updating parameters. However, this strategy introduces additional modeling error due to the smearing effect of grouping.
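A forward model of this kind can be sketched in a few lines. The Python example below is only an illustration of a shear building eigen-analysis with three column groups: the story height (3 m) and floor mass (4 tons) follow the description in the text, whereas the column second moment of area `I_col`, the trial Young's moduli, and the function name `shear_building_modes` are hypothetical, since the section properties of Table 1 are not reproduced here. Each story is assumed to have two fixed-fixed columns, giving a lateral story stiffness of 2 × 12EI/h<sup>3</sup>.

```python
import numpy as np

def shear_building_modes(theta, h=3.0, I_col=6.75e-4, m_floor=4.0e3):
    """Eigen-frequencies and mode shapes of a 10-story shear building with rigid floors.

    theta   : Young's moduli (Pa) of the column groups (stories 1-3, 4-6, 7-10)
    h       : story height in m
    I_col   : column second moment of area in m^4 (hypothetical value)
    m_floor : floor mass in kg
    """
    group = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2, 2])       # story -> substructure index
    k = 2 * 12 * np.asarray(theta, dtype=float)[group] * I_col / h**3
    n = len(group)
    K = np.zeros((n, n))
    for i in range(n):
        K[i, i] += k[i]                                     # story below floor i
        if i + 1 < n:
            K[i, i] += k[i + 1]                             # story above floor i
            K[i, i + 1] = K[i + 1, i] = -k[i + 1]
    lam, Phi = np.linalg.eigh(K / m_floor)                  # equal floor masses: M = m*I
    return lam, Phi

lam, Phi = shear_building_modes([30e9, 25e9, 20e9])         # illustrative moduli
freq_hz = np.sqrt(lam[:3]) / (2 * np.pi)                    # first three natural frequencies
print(freq_hz)
```

Within the updating loop, a function of this type would supply λ<sub>m</sub>(θ<sub>t</sub>) and Φ<sub>m</sub>(θ<sub>t</sub>) for each proposed value of the three substructure stiffness parameters.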

The proposed hierarchical Bayesian model updating approach is applied to estimate the stiffness of the three substructures θ = [θ<sub>1</sub>, θ<sub>2</sub>, θ<sub>3</sub>]<sup>T</sup>, their hyperparameters µ<sub>θ</sub> and Σ<sub>θ</sub>, and the covariance of error function Σ<sub>e</sub> (the mean of error function µ<sub>e</sub> is assumed to be zero) using the simulated noisy modal parameters. After a tuning process, the parameters of prior PDFs in Equations (11, 12) are selected as:

$$\nu\_1 = 3, \Sigma\_{\theta 0} = \begin{bmatrix} 1^2 \\ & 1^2 \\ & & 1^2 \end{bmatrix}, \nu\_2 = 1, \sigma\_{e0}^2 = 1 \times 10^{-6} \tag{26}$$

MH within Gibbs sampler is employed to generate samples from the posterior conditional PDFs in Equations (16–19). In total, 20,000 samples are generated and the first 5,000 samples are discarded as burn-in period to remove the transitional samples. Sample mean and standard deviation of µ<sub>θ</sub> and Σ<sub>θ</sub> are plotted in **Figures 5**, **6**, which show that the samples have converged and the number of samples is adequate for estimating these statistics. The sample histograms for µ<sub>θ</sub>, σ<sub>θ<sub>i</sub></sub>, and σ<sub>e</sub>(λ<sub>1–3</sub>) are shown in **Figure 7**. The black lines denote the kernel PDFs which are normalized to have the same height as the highest bins of the histograms and black dots denote the MAPs. The MAPs are estimated as the peaks of kernel PDFs which are preferred over selecting the sample with highest posterior probability to reduce the estimation uncertainty. Alternatively, the average values could be used but the MAPs are preferred as they represent the most probable values of parameters and are more appropriate for asymmetric distributions. It can be seen that samples of µ<sub>θ</sub> seem to follow a normal distribution, samples of σ<sub>θ<sub>i</sub></sub> roughly follow an Inverse-Wishart distribution, and samples of σ<sub>e</sub>(λ<sub>1–3</sub>) approximately follow an Inverse-χ<sup>2</sup> distribution with a tail on the right side. These are expected due to the choice of conjugate priors used in section Formulation of Hierarchical Bayesian Approach.
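The estimation of a MAP value as the peak of a kernel PDF can be illustrated with a short Python sketch. The Gaussian kernel, the Silverman rule-of-thumb bandwidth, the grid resolution, and the function name `kde_map` are assumptions made for this example; the kernel settings used in this study are not stated. The samples are a synthetic right-skewed stand-in, not posterior samples from this study.

```python
import numpy as np

def kde_map(samples, n_grid=512):
    """MAP estimate as the peak of a Gaussian kernel density (Silverman bandwidth)."""
    x = np.asarray(samples, dtype=float)
    h = 1.06 * x.std(ddof=1) * len(x) ** (-0.2)
    grid = np.linspace(x.min(), x.max(), n_grid)
    # Unnormalized density: the peak location does not depend on the normalizing constant
    dens = np.exp(-0.5 * ((grid[:, None] - x[None, :]) / h) ** 2).sum(axis=1)
    return grid[np.argmax(dens)]

rng = np.random.default_rng(0)
# Right-skewed stand-in for posterior samples of an error standard deviation
samples = np.sqrt(1.0 / rng.chisquare(10, size=5000))
map_est = kde_map(samples)
print(map_est, samples.mean())
```

For a distribution with a tail on the right side, the kernel peak falls below the sample mean, which is the reason the MAP is the more representative point estimate for asymmetric posteriors.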

The estimated MAPs of µ<sub>θ</sub>, Σ<sub>θ</sub> (σ<sub>θ<sub>i</sub></sub>, ρ<sub>ij</sub>) and Σ<sub>e</sub> (σ<sub>e<sub>i</sub></sub>) are summarized in **Table 2**, together with their nominal values from the frame model. In **Table 2**, λ<sub>i</sub> refers to the eigen-frequency of mode i and φ<sub>j,i</sub> refers to component j of mode shape i. It can be observed that µ̂<sub>θ<sub>i</sub></sub> and σ̂<sub>θ<sub>i</sub></sub> are underestimated compared to their nominal values due to the significant modeling errors introduced in the shear building model. Although the mean and standard deviation of stiffness are underestimated, the correlation coefficients ρ̂<sub>ij</sub> are accurately estimated to be close to zero. Note that the overall stiffness variability in each of the three column groups is less than the individual stiffness uncertainty shown in **Table 1** due to (1) the compensation effect of independent element stiffness, and more importantly (2) modeling errors in the shear building model, including the neglect of rocking behavior at the base and the rigid assumption of floors. The implemented grouping strategy introduces additional modeling errors. Grouping or sub-structuring is a common strategy to reduce the number of updating parameters and avoid unidentifiability/ill-conditioning.

ρ̂, µ<sub>e</sub>, and σ̂<sub>e</sub> are normalized and therefore unitless terms. µ<sub>e</sub> values are in percentage (×10<sup>−2</sup>) while σ̂<sub>e</sub> terms are per thousand (×10<sup>−3</sup>) as indicated in left column.

FIGURE 9 | Comparison of natural frequency predictions with their identified counterparts using the calibrated model of Case 2 (µ<sub>e</sub> ≠ 0).

The updated structural parameters (Young's moduli) represent the effective stiffness of different substructures. In the presence of large modeling errors, the updated values will compensate for non-updated model parameters and modeling errors; therefore, they may not correspond to the physical Young's modulus of the material used. In linear time-invariant applications, calibrated models can still provide good response prediction even outside the calibration range of response amplitude. However, the calibrated model should be used cautiously for prediction of local response quantities with little sensitivity to the error metrics used, such as modal parameter errors. The available measurements can provide information about the accuracy or bias of the calibrated model on the error metrics used (natural frequencies and mode shapes here). But they do not provide accurate estimation of the expected bias for localized quantities such as strains and stresses at locations with large modeling errors (e.g., base of the building in this application). Significant variability is observed for the covariance of error function, and σ̂<sub>e</sub>(λ<sub>1</sub>) and σ̂<sub>e</sub>(λ<sub>3</sub>) are estimated to be much larger than their nominal values (added white noise level of 0.5%) due to the modeling errors and the assumption of µ<sub>e</sub> = 0 in Case 1. The nominal value of µ<sub>e</sub> in **Table 2** is computed based on the error function definition in Equations (2, 3) with Young's modulus the same as the mean values in **Table 1**.

The calibrated model is then used to predict the modal parameters using the formulation detailed in section Propagation of Uncertainties in Model-Predicted Response. A total of 1,000 natural frequency predictions are generated from the calibrated model and compared with their identified counterparts (simulated from the frame model) as shown in **Figure 8**. It can be observed that the range of identified values is fully covered by predictions. However, predictions have significantly larger variability than measured data. An evident difference is observed between the centers of two clouds (black dots vs. gray circles) which indicates error bias. Therefore, it is concluded that the assumption of µ<sup>e</sup> = 0 is not an optimal choice in this case, and a non-zero µ<sup>e</sup> is preferred. This observation prompts the second case of model updating referred to as Case 2 in the following section.

### Case 2: Model Updating With µ<sub>e</sub> ≠ 0

In this case of model updating, a non-zero mean is considered for the error function. Although µ<sub>e</sub> cannot be updated simultaneously with µ<sub>θ</sub> due to compensation effects, it can be evaluated from the observed bias in error function as shown below:

$$\mu\_e = \frac{1}{N\_t} \sum\_{t=1}^{N\_t} \hat{e}\_t \tag{27}$$

in which ê<sub>t</sub> is the error function evaluated based on Equations (2, 3) using θ̂<sub>t</sub> estimated in Case 1. A total of 100 different values of θ̂<sub>t</sub> are estimated for 100 sets of modal parameters from the joint posterior distribution in Equation (13), and µ<sub>e</sub> is computed as the mean of 100 evaluations of ê<sub>t</sub>. The evaluated µ<sub>e</sub> is reported in **Table 2**. It can be seen that the largest values in µ<sub>e</sub> correspond to the natural frequencies of modes 1 and 3, which exhibit the largest bias in the predictions as shown in **Figure 8**. Note that the estimated bias does not provide physical interpretation of modeling errors, but it can potentially indicate the extent of such error.

The hierarchical Bayesian model updating is repeated with the evaluated value of µ<sub>e</sub> from Equation (27). Note that in Case 2, µ<sub>e</sub> is not updated through a Bayesian inference but is obtained using the updating results of Case 1. The model updating follows the same process with minor modifications in Equations (14, 21):

$$J\_{e\_t} = -\frac{1}{2} (e\_t - \mu\_e)^T \Sigma\_e^{-1} (e\_t - \mu\_e) \tag{28}$$

$$V\_i = \frac{1}{N\_t} \sum\_{t=1}^{N\_t} (e\_{t\_i} - \mu\_{e\_i})^2 \tag{29}$$
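The effect of the bias correction on the error variance can be seen in a short numerical sketch. The array `e_hat` below is a synthetic stand-in for error functions evaluated at the Case 1 estimates (its bias and scatter values are hypothetical), and the example only illustrates Equations (21, 27, 29): when µ<sub>e</sub> is the sample mean, the zero-mean statistic of Equation (21) exceeds the centered statistic of Equation (29) by exactly the squared bias.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in: N_t = 100 datasets, N_e = 3 error components with different biases
e_hat = rng.normal(loc=[0.03, 0.0, -0.02], scale=0.005, size=(100, 3))

mu_e = e_hat.mean(axis=0)                        # Equation (27)
V_case1 = np.mean(e_hat**2, axis=0)              # Equation (21), zero error mean assumed
V_case2 = np.mean((e_hat - mu_e)**2, axis=0)     # Equation (29), bias removed

print(np.sqrt(V_case1))                          # inflated by the unmodeled bias
print(np.sqrt(V_case2))                          # close to the 0.005 scatter
print(np.allclose(V_case1, V_case2 + mu_e**2))   # → True
```

This identity indicates why a significant unmodeled bias tends to inflate the estimated error standard deviation when µ<sub>e</sub> = 0 is assumed.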

The estimated MAPs of µ<sub>θ</sub>, Σ<sub>θ</sub> (σ<sub>θ<sub>i</sub></sub>, ρ<sub>ij</sub>), and Σ<sub>e</sub> (σ<sub>e<sub>i</sub></sub>) with µ<sub>e</sub> ≠ 0 are reported in **Table 2**. It can be observed that µ̂<sub>θ</sub> and Σ̂<sub>θ</sub> remain almost the same for the two cases, which is expected because the hyperparameter estimation is based on the measured data and the underlying model. The inclusion of a constant µ<sub>e</sub> only shifts the center of the error function distribution and therefore would not affect the hyperparameters. However, values of Σ̂<sub>e</sub> components are generally reduced, especially for σ̂<sub>e</sub>(λ<sub>1</sub>) and σ̂<sub>e</sub>(λ<sub>3</sub>), making Σ̂<sub>e</sub> components much closer to their nominal values which are 0.005 due to the added white noise level of 0.5%. A similar comparison of natural frequency predictions with their identified counterparts using the calibrated model of Case 2 is shown in **Figure 9**. It can be seen that significantly improved predictions are achieved in this case compared to **Figure 8**. No observable bias exists between the two clouds (black dots vs. gray circles), and similar variability is observed. This demonstrates the importance of accounting for modeling bias in the proposed hierarchical Bayesian framework to achieve more accurate predictions.

### Response Time History Prediction

The calibrated models from Case 1 and Case 2 are used to predict the response time history to an earthquake ground motion using the modal superposition method described in section Propagation of Uncertainties in Model-Predicted Response. The input is the ground motion recorded at the Antrodoco station during the 2009 L'Aquila, Italy earthquake, as shown in **Figure 10**. The response predictions include only the contribution of the first three modes because the shear building model is calibrated using only these modes. The building is assumed to have modal damping ratios of 2% for all modes. To account for the estimation uncertainty of damping ratios, the identified damping ratios of the first three modes are assumed to be 2% with a coefficient of variation of 30%. Therefore, the response time history predictions include the uncertainty of the estimated inherent variability of stiffness (µ̂<sub>θ</sub>, Σ̂<sub>θ</sub>), the uncertainty of the error function (µ̂<sub>e</sub>, Σ̂<sub>e</sub>), and the uncertainty of the damping ratios. To verify the accuracy of the response predictions, the model predictions are compared with the response of the exact (frame) model. The damping of the frame model is assumed to be exact (2% damping ratios for all modes) and the contributions of all modes are included in the simulation. However, the exact model simulations consider the variability of the stiffness parameters using their exact probability distributions. It is worth noting that under large amplitude seismic excitations, buildings experience non-linear hysteretic behavior (Astorga et al., 2018). Therefore, when dealing with large amplitude excitations, a non-linear model of the structural system is recommended. The proposed hierarchical Bayesian method can then be applied for non-linear model calibration, where hysteretic material properties can be considered as updating parameters.
However, in this study the considered ground motion is deliberately selected to have a small peak ground acceleration of around 0.02 g, so the assumption of a linear elastic regime with low damping is realistic. Furthermore, certain building codes allow the use of linear FE models for simplified and approximate analysis of buildings under seismic loads, such as the equivalent linear procedure and the response spectrum procedure. These two linear methods are routinely used in practice to predict structural responses during seismic events.

TABLE 3 | Statistics of maximum roof displacement, maximum roof acceleration, and maximum inter-story drift of the 10th story for measurement and predictions.

Std, standard deviation.

A total of 200 independent predictions (using the calibrated model) and simulations (using the exact model) are performed, and a 95% confidence interval is generated by: (1) sorting the 200 values at each time instant in increasing order; (2) selecting the 6th and 195th values as the lower and upper bounds of the confidence interval. Note that only 5 sensors (stories 2, 4, 6, 8, and 10) are considered in the model updating process; therefore, the estimated error function only includes information for these stories. The error function is extended to unmeasured DOFs using the strategy detailed in section Propagation of Uncertainties in Model-Predicted Response to predict the response of the unmeasured DOFs. The comparisons of displacement and acceleration time history predictions at the roof with their simulated counterparts are shown in **Figures 11**, **12**. A good agreement is observed between model predictions and simulations for both cases, while the predictions in Case 2 are more accurate (smaller bias and uncertainty). Note that in general, acceleration predictions have larger uncertainty than displacement predictions since a larger number of modes contribute to the acceleration response and higher modes often have larger modeling errors. In this study, accelerations are predicted using only the first three modes while the simulations of the true response include contributions of all modes. **Figures 13**, **14** show the model-predicted responses at the 7th floor, which does not have a sensor. Again, a good agreement can be seen between the displacement and acceleration time history predictions and the simulated response. Similarly, Case 2 predictions provide a tighter fit to the simulated response. In general, the predictions at DOFs without sensors are more conservative (larger variance) due to the conservative assumption made in the extension of error functions detailed in section Propagation of Uncertainties in Model-Predicted Response. The statistics of maximum roof displacement, maximum roof acceleration and maximum inter-story drift of the 10th story for measurement and predictions of Case 1 and Case 2 are summarized in **Table 3**.
It can be seen that, although Case 1 provides relatively satisfactory results, Case 2 delivers significantly more accurate mean values and standard deviations.
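The two-step interval construction described above (sort the 200 values at each time instant, then take the 6th and 195th) can be sketched as follows. The response samples are synthetic and the function name `empirical_interval` is hypothetical; only the rank-based selection mirrors the text.

```python
import numpy as np


def empirical_interval(samples, lower_rank=6, upper_rank=195):
    """Rank-based bounds at each time instant.

    samples: array of shape (n_runs, n_times), one row per prediction.
    Ranks are 1-based, matching "6th and 195th values" for 200 runs.
    """
    ordered = np.sort(samples, axis=0)  # ascending order at each time instant
    return ordered[lower_rank - 1], ordered[upper_rank - 1]


# Synthetic stand-in for 200 predicted time histories (illustration only).
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 50)
runs = np.sin(2.0 * np.pi * t) + 0.1 * rng.standard_normal((200, t.size))

lower, upper = empirical_interval(runs)

# Fraction of samples falling inside the bounds (inclusive).
coverage = np.mean((runs >= lower) & (runs <= upper))
print(f"coverage = {coverage:.3f}")
```

Because 190 of the 200 sorted values lie between the 6th and 195th ranks inclusive, the bounds enclose 95% of the samples at every time instant.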

### SUMMARY AND CONCLUSIONS

In this paper, a hierarchical Bayesian model updating approach is implemented for modeling error estimation and response prediction of a 10-story building model using modal parameters. The identified modal parameters are simulated from a frame model which represents the true structure. A shear building model with significant modeling errors is created for model updating, and the stiffnesses of three defined substructures are selected as updating parameters. The hierarchical Bayesian approach is employed to estimate the stiffness mean and covariance, as well as the modeling errors of the shear building model. A Metropolis-Hastings within Gibbs sampler is implemented to numerically evaluate the joint posterior distribution of the updating parameters. The mean of the error function µ<sub>e</sub> is first assumed to be zero in Case 1. Evident bias is observed in the natural frequency predictions, which prompts a second case of model updating (Case 2) with µ<sub>e</sub> evaluated from the observed bias. The natural frequency predictions for Case 2 show no bias and similar variability to the identified values. Displacement and acceleration time history predictions are obtained for both cases and for all measured and unmeasured DOFs. Good agreement is observed between predictions and measurements for both cases and for all DOFs. The predictions are improved significantly for Case 2 when considering non-zero modeling bias. These observations validate the proposed hierarchical Bayesian approach for model calibration, modeling error estimation, and response prediction by considering and propagating the uncertainties of structural stiffness and modeling errors, and demonstrate the effects of accounting for modeling bias in response predictions. In applying the proposed hierarchical Bayesian method, it is recommended that model updating initially be performed with the assumption of zero mean error (similar to Case 1).
If evident prediction error bias is observed from the calibrated model, then a second case can be applied to remove the bias and improve the predictions. The main novelties of the proposed approach include the following.

(I) The "inherent variability" of the updating structural parameters (stiffness) due to changing ambient and environmental conditions is quantified and estimated by the hyperparameters µ<sub>θ</sub> and Σ<sub>θ</sub>. As expected, the parameter estimation uncertainties decrease with additional data, but the estimated inherent variabilities converge to a constant value similar to the estimation variability obtained from a frequentist approach. This is not the case for traditional Bayesian approaches.


### AUTHOR CONTRIBUTIONS

MS is the lead author; he completed most of the study and wrote the paper. IB provided technical feedback on implementing the framework and sampling. BM is MS's supervisor and provided guidance throughout the study as well as polishing the text. CP also provided feedback on technical details of the updating framework.

### ACKNOWLEDGMENTS

Partial support of this study by the National Science Foundation Grant 1254338 is gratefully acknowledged. The opinions, findings, and conclusions expressed in this paper are those of the authors and do not necessarily represent the views of the sponsors and organizations involved in this project.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Song, Behmanesh, Moaveni and Papadimitriou. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Rapid and Automated Damage Detection in Buildings Through ARMAX Analysis of Wind Induced Vibrations

Gregory Patrick Gislason, Qipei Mei and Mustafa Gül\*

Department of Civil and Environmental Engineering, University of Alberta, Edmonton, AB, Canada

After a seismic event, it is imperative that damaged critical structural members within a building are identified and analyzed as soon as possible to ensure proper remedial measures can be taken. Failure to detect damage or correctly analyze the severity of damage within the building could have catastrophic consequences. When a reinforced concrete building is subjected to a damaging event, the current standard method for identifying and analyzing structural damage involves extensive surface-level visual inspections, which often result in inconclusive and inconsistent damage analysis. Structural Health Monitoring (SHM) is a rapidly developing field which is vastly improving the way damage is assessed within buildings and other major infrastructure. In this paper, an automated SHM Damage Detection Model (DDM) specifically tailored for buildings is developed that uses time series analysis along with sensor clustering techniques to detect damage in a building from its vibration response due to ambient wind loading. The specific time series analysis methodology used throughout this paper is an Auto-Regressive Moving Average model with eXogenous inputs (ARMAX). To validate the ARMAX DDM, a detailed wind simulation model that applies forces based on actual wind behavior is created along with a numerical damage model applicable to reinforced concrete buildings. To evaluate the effectiveness of the proposed DDM in locating and quantifying damage at a story level precision, two buildings are modeled in SAP2000. The results from the numerical modeling demonstrated the effectiveness of the ARMAX DDM at accurately locating and quantifying the degree of damage from wind induced floor vibrations at a story level precision. The limitations of the DDM in its current state and recommendations for future work are discussed to conclude the paper.

Keywords: ARMAX model, wind induced vibration, damage detection, time series analysis, shear type building

### INTRODUCTION

When a building undergoes a seismic event, the typical method for locating and analyzing any potential structural damage involves lengthy surface level visual inspections by structural engineers where each critical member is classified in a damage category based on the engineer's judgement. Such an arbitrary inspection method often leads to inconclusive and inconsistent damage analysis.

#### Edited by:

Eleni N. Chatzi, ETH Zürich, Switzerland

#### Reviewed by:

Luis David Avendaño Valencia, ETH Zürich, Switzerland Suparno Mukhopadhyay, Indian Institute of Technology Kanpur, India Harsh Nandan, SC Solutions, United States

> \*Correspondence: Mustafa Gül mustafa.gul@ualberta.ca

#### Specialty section:

This article was submitted to Structural Sensing, a section of the journal Frontiers in Built Environment

Received: 23 August 2018 Accepted: 05 February 2019 Published: 26 February 2019

#### Citation:

Gislason GP, Mei Q and Gül M (2019) Rapid and Automated Damage Detection in Buildings Through ARMAX Analysis of Wind Induced Vibrations. Front. Built Environ. 5:16. doi: 10.3389/fbuil.2019.00016

To overcome the issues with visual inspections, vibration-based structural health monitoring (SHM) has been pursued and has seen substantial progress owing to the rapid development of advanced technologies in computer science and electrical engineering; it is now more convenient and cheaper to acquire large amounts of data. Despite this abundance of data, detecting damage reliably remains a major challenge.

Among all the vibration-based SHM methods, nonparametric methods and statistical pattern recognition techniques, such as Time Series Modeling (Sohn et al., 2001; Nair et al., 2006; Gul and Catbas, 2009, 2011) have gained significant momentum in the field of SHM due to their ability to deal with massive data and their capability to improve reliability by accounting for the variations in the recorded data.

Time series analysis is used to analyze time dependent data sets to understand their statistical characteristics. In their infancy, time series models were not used for structural analysis purposes. They were initially used in a variety of fields, such as population modeling, electrical engineering, long term weather predictions, and stock price prediction. In the following papers, the coefficients of time series models are used as damage sensitive features, and damage is found by comparing the coefficients of the undamaged and damaged models. Bodeux and Golinval (2000) introduced the application of Vector Autoregressive Moving-Average (VARMA) models for both system identification and damage detection. Their approach utilized a prediction error method which assumed a zero mean Gaussian white noise. The method was tested on the "Steel-Quake" benchmark proposed in the framework of COST Action F3 "Structural Dynamics." The tests showed a good correlation for the modal parameters and for detecting damage based on the modal parameter uncertainties; however, the location of the damage was not properly identified. Gul and Catbas (2011) implemented a novel damage detection process which combined time series modeling with a novel sensor clustering technique. The authors created ARX models for different sensor clusters by using the free response of the structure, and each sensor cluster output was treated as an input for the ARX model. The methodology was shown to successfully identify and locate damage on both numerical and experimental vibration data even when noise is considered. Nair et al. (2006) introduced a new damage sensitive feature (DSF) using the first three auto-regressive (AR) terms from the auto-regressive moving average (ARMA) series that is modeled from vibrations.
The authors found that the mean values of the DSF for the damaged and undamaged signals were different, so a statistical summarization, i.e., a t-test, was implemented to obtain a confident damage decision. Numerical and experimental vibration data from the ASCE benchmark were used to validate the method, and the results showed that both minor and major damage could be precisely detected and located. de Lautour and Omenzetter (2006) analyzed the vibrations of a multi-story building due to ground motion to detect seismic damage within the building. Their simple numerical 3-story structure was subjected to random ground motion and the resulting vibrations at each story were fit to an AR time series model. The AR coefficients were then used as the inputs for an Artificial Neural Network (ANN). The ANN was trained to detect any changes in the AR coefficients from before and after damage to identify and quantify the damage at each story. The results from their numerical case study showed that their methodology could successfully detect damage in a simple numerical structure even in the presence of noise and changes in operating conditions. Ji et al. (2011) conducted a series of full scale tests at the E-Defense shaking table facilities to simulate realistic seismic damage in a high-rise steel building. In conducting these full scale tests, the authors could evaluate the effectiveness of vibration-based damage diagnosis methodologies using real life vibration data. The vibration data from each floor were fit by the frequency response curve-fitting method and the ARX method. As the seismic damage increased, the natural frequencies of the structure decreased as expected. The mode shapes, however, did not change as the damage was distributed evenly over the height of the structure.
Note that these results only apply to steel high-rise structures, and different results would be expected for a different type of structure, such as a concrete moment frame or shear wall structure. Bao et al. (2013) proposed a damage detection technique for subsea pipelines which could account for various loading conditions. The authors first partitioned and normalized the acceleration data, then used auto-correlation functions and partial auto-correlation functions to compute the ARMA model inputs and their orders, respectively. The AR parameters served as the damage feature vector, and the damage indicators, used for damage detection and localization, were based on the Mahalanobis distance between the ARMA models. A finite element model of a subsea pipeline under ambient excitations was numerically simulated to verify the authors' methodology, and the results show that it can successfully detect and locate damage even with noise effects. Roy et al. (2015) proposed a set of four ARX-model-based DSFs for damage detection and localization when no input excitation data are available. This was done by treating one of the output responses in a multi-degree-of-freedom (MDOF) system as an input and the rest as outputs. The damage features are based on ARX model coefficients, the Kolmogorov-Smirnov test statistical distance, and the model residual error. The authors' methodology was tested on both numerical and experimental structures and the results show that the DSFs could both localize and quantify the stiffness degradation; however, in cases with multiple damage locations, one of the DSFs was unable to clearly quantify the amount of stiffness degradation. Lakshmi and Rama Mohan Rao (2014) created a novel output-only damage detection technique based on time series analysis which accounted for environmental variability and measurement noise.
The authors applied Principal Component Analysis to reduce the size of the data, thereby improving computational efficiency. The data are fitted with AR and ARX models, and the probability density functions of damage features are obtained by assessing the variance in prediction errors. The authors tested their methodology on a numerical simply supported beam and an experimental three-story framed bookshelf benchmark structure. Results from the experiments indicate that the method can detect and locate damage; however, the measurement of the severity of damage should be further examined.

This article presents an automated SHM system based on Time Series Analysis (TSA) and sensor clustering capable of rapidly providing engineers with the location and degree of damage at a story level precision from the building's vibration due to ambient wind forces. The method presented in this paper is developed based on previous studies of the authors (Mei and Gül, 2014; Do, 2015), and aims to complement lengthy visual inspections and arbitrary scaling constants to provide a more efficient, consistent and accurate damage assessment.

The novelty of the paper is to utilize structural responses under wind loading to rapidly detect damage in a building at a story level precision with severity information. When relating this damage detection methodology to the objectives presented by Rytter (1993), it satisfies the first three steps.

### METHODOLOGY

### Background to Time Series Models

This section provides a brief discussion about the Auto-Regressive Moving Average model with eXogenous inputs (ARMAX). More discussions about time series model theories can be found in the following literature (Sohn and Farrar, 2001; Lu and Gao, 2005; Omenzetter and Brownjohn, 2006).

ARMAX modeling is the specific time series model used in this paper. Its general form is given in Equation (1).

$$\begin{aligned} y\left(t\right) &+ a\_1 y\left(t-\Delta t\right) + \dots + a\_{n\_a} y\left(t-n\_a\Delta t\right) \\ &= b\_0 u\left(t\right) + b\_1 u\left(t-\Delta t\right) + \dots + b\_{n\_b} u\left(t-n\_b\Delta t\right) \\ &\quad + e\left(t\right) + d\_1 e\left(t-\Delta t\right) + \dots + d\_{n\_c} e\left(t-n\_c\Delta t\right) \end{aligned} \tag{1}$$

In Equation (1), y(t) is the output, u(t) is the input of the model, e(t) is the error term, and a<sub>i</sub>, b<sub>i</sub>, d<sub>i</sub> are the parameters of the model. The model orders are given by n<sub>a</sub>, n<sub>b</sub>, and n<sub>c</sub>. A compact form of the ARMAX equation can be written as Equation (2).

$$A\left(q\right) y\left(t\right) = B\left(q\right) u\left(t\right) + D\left(q\right) e\left(t\right) \tag{2}$$

The terms A(q), B(q) and D(q) are polynomials in the delay operator q<sup>−j</sup>, as shown in Equation (3).

$$\begin{aligned} A\left(q\right) &= 1 + a\_1 q^{-1} + \dots + a\_{n\_a} q^{-n\_a} \\ B\left(q\right) &= b\_1 q^{-1} + b\_2 q^{-2} + \dots + b\_{n\_b} q^{-n\_b} \\ D\left(q\right) &= 1 + d\_1 q^{-1} + \dots + d\_{n\_c} q^{-n\_c} \end{aligned} \tag{3}$$

Equation (3) makes the meaning of the delay operator clear: a sequence x(t) multiplied by q<sup>−j</sup> is equal to x(t − jΔt). From the general form of the ARMAX model (Equation 2), different time series models can be created by changing the orders of A(q), B(q), and D(q). For example, an Auto-Regressive (AR) process has a non-zero n<sub>a</sub> while n<sub>b</sub> and n<sub>c</sub> are set to zero. The Moving Average (MA) process sets n<sub>a</sub> and n<sub>b</sub> to zero and n<sub>c</sub> to a non-zero value. The ARX model is obtained by setting n<sub>c</sub> to zero. As previously stated, the focus of this paper will be solely on ARMAX modeling of the transformed equations of motion as described below.
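The relation between the ARMAX form and its special cases can be made concrete with a small simulation. The sketch below is an illustration only: it follows the polynomial convention of Equation (3), in which B(q) starts at q<sup>−1</sup>, and the function name `simulate_armax` and all coefficient values are hypothetical.

```python
import numpy as np


def simulate_armax(a, b, d, u, e):
    """Simulate A(q) y(t) = B(q) u(t) + D(q) e(t) sample by sample.

    a = [a_1, ..., a_na], b = [b_1, ..., b_nb], d = [d_1, ..., d_nc].
    An empty list sets the corresponding order to zero.
    Samples before the start of the record are treated as zero.
    """
    n = len(e)
    y = np.zeros(n)
    for t in range(n):
        value = e[t]
        for j, a_j in enumerate(a, start=1):
            if t - j >= 0:
                value -= a_j * y[t - j]  # auto-regressive part, moved to the right side
        for j, b_j in enumerate(b, start=1):
            if t - j >= 0:
                value += b_j * u[t - j]  # exogenous input part
        for j, d_j in enumerate(d, start=1):
            if t - j >= 0:
                value += d_j * e[t - j]  # moving-average part
        y[t] = value
    return y


rng = np.random.default_rng(3)
e = rng.standard_normal(500)  # error sequence
u = rng.standard_normal(500)  # input sequence

y_armax = simulate_armax([-0.5], [1.0], [0.4], u, e)  # n_a = n_b = n_c = 1
y_arx = simulate_armax([-0.5], [1.0], [], u, e)       # n_c = 0
y_ar = simulate_armax([-0.5], [], [], u, e)           # n_b = n_c = 0
y_ma = simulate_armax([], [], [0.4], u, e)            # n_a = n_b = 0
```

The four calls differ only in which coefficient lists are empty, which mirrors how the AR, MA, and ARX models are obtained by setting model orders to zero.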

### ARMAX Models for Different Sensor Clusters

The equation of motion, which governs the dynamic responses (accelerations, velocities and displacements) of structures, is described herein. Equation (4) below represents the general equation of motion for an N degree of freedom system.

$$\mathbf{M}\ddot{\mathbf{x}}\left(t\right) + \mathbf{C}\dot{\mathbf{x}}\left(t\right) + \mathbf{K}\mathbf{x}\left(t\right) = f\left(t\right)\tag{4}$$

In Equation (4), **M**, **C** and **K** represent the N × N mass, damping and stiffness matrices of the system. The vectors ẍ(t), ẋ(t), and x(t) represent the acceleration, velocity and displacement at time t. The external forcing vector is denoted by f(t), which is the wind force in this paper. The vibration of a structure is strongly dependent on time, the prior state of the structure, and external inputs. By modeling the vibration data as a time series sequence, statistical characteristics of the time series which represent the behavior of the structure can be extracted. This vibration data can be gathered by installing a pair of bi-axial sensors in perpendicular directions at each story. The focus of this research centers on the change in stiffness which represents damage within the lateral resisting members of a building structure.

Equations (5–12) outline the steps by which the equation of motion (EOM) can be transformed so that it can be represented as an ARMAX model. For clarity, one story (represented as a single degree of freedom) is considered as the i<sup>th</sup> row of Equation (4), which is shown in Equation (5) below.

$$\begin{aligned} \left(m\_{i1}\ddot{\mathbf{x}}\_1(t) + \dots + m\_{iN}\ddot{\mathbf{x}}\_N(t)\right) &+ \left(c\_{i1}\dot{\mathbf{x}}\_1(t) + \dots + c\_{iN}\dot{\mathbf{x}}\_N(t)\right) \\ &+ \left(k\_{i1}\mathbf{x}\_1(t) + \dots + k\_{iN}\mathbf{x}\_N(t)\right) = f\_i(t) \end{aligned} \tag{5}$$

Rearranging Equation (5) to isolate the acceleration on the lefthand side results in Equation (6).

$$\begin{aligned} \ddot{\mathbf{x}}\_{i} &= \frac{f\_{i}}{m\_{ii}} \\ &- \frac{m\_{i,1}\ddot{\mathbf{x}}\_{1} + \dots + m\_{i,i-1}\ddot{\mathbf{x}}\_{i-1} + m\_{i,i+1}\ddot{\mathbf{x}}\_{i+1} + \dots + m\_{i,N}\ddot{\mathbf{x}}\_{N}}{m\_{ii}} \\ &- \frac{c\_{i,1}\dot{\mathbf{x}}\_{1} + c\_{i,2}\dot{\mathbf{x}}\_{2} + \dots + c\_{i,N}\dot{\mathbf{x}}\_{N}}{m\_{ii}} \\ &- \frac{k\_{i,1}\mathbf{x}\_{1} + k\_{i,2}\mathbf{x}\_{2} + \dots + k\_{i,N}\mathbf{x}\_{N}}{m\_{ii}} \end{aligned} \tag{6}$$

In shear type building modeling, it can be assumed that the mass of each degree of freedom is entirely lumped at the center of the degree of freedom. Any mass values which are not on the diagonal are assumed to be zero and can be removed. For simplicity, the damping terms in the equation can be removed due to their minuscule contribution to the equation's balance. As such, Equation (6) can be simplified to Equation (7) below.

$$\ddot{\mathbf{x}}\_{i} = \frac{f\_{i}}{m\_{ii}} - \frac{k\_{i,1}\mathbf{x}\_{1} + k\_{i,2}\mathbf{x}\_{2} + \dots + k\_{i,N}\mathbf{x}\_{N}}{m\_{ii}} \tag{7}$$

Taking the second derivative of Equation (7) results in Equation (8) below.

$$\frac{d^{2}\ddot{\mathbf{x}}\_{i}}{dt^{2}} = \frac{\ddot{f}\_{i}}{m\_{ii}} - \frac{k\_{i,1}\ddot{\mathbf{x}}\_{1} + k\_{i,2}\ddot{\mathbf{x}}\_{2} + \dots + k\_{i,N}\ddot{\mathbf{x}}\_{N}}{m\_{ii}} \tag{8}$$

The goal of taking the second derivative of Equation (7) is to create an equation in which the right-hand side depends only on acceleration values. Measuring the displacements and velocities of a structure under light ambient wind loading may result in measurement errors due to the minuscule values involved. By applying the forward difference technique (Levy and Lessman, 1961) as shown in Equation (9), the left side of Equation (8) can be transformed to create a new equation based solely on acceleration values, as shown in Equation (10).

$$
\begin{split}
\frac{d\ddot{\mathbf{x}}\_{i}}{dt} &= \frac{\ddot{\mathbf{x}}\_{i}(t + \Delta t) - \ddot{\mathbf{x}}\_{i}(t)}{\Delta t} \\
\frac{d^{2}\ddot{\mathbf{x}}\_{i}}{dt^{2}} &= \frac{1}{\Delta t}\left[\frac{\ddot{\mathbf{x}}\_{i}(t + 2\Delta t) - \ddot{\mathbf{x}}\_{i}(t + \Delta t)}{\Delta t} - \frac{\ddot{\mathbf{x}}\_{i}(t + \Delta t) - \ddot{\mathbf{x}}\_{i}(t)}{\Delta t}\right]
\end{split}
\tag{9}
$$

$$
\begin{split}
&\frac{1}{\Delta t}\left[\frac{\ddot{\mathbf{x}}\_{i}(t + 2\Delta t) - \ddot{\mathbf{x}}\_{i}(t + \Delta t)}{\Delta t} - \frac{\ddot{\mathbf{x}}\_{i}(t + \Delta t) - \ddot{\mathbf{x}}\_{i}(t)}{\Delta t}\right] \\
&\quad = \frac{\ddot{f}\_{i}(t)}{m\_{ii}} - \frac{k\_{i,1}\ddot{\mathbf{x}}\_{1}(t) + k\_{i,2}\ddot{\mathbf{x}}\_{2}(t) + \dots + k\_{i,N}\ddot{\mathbf{x}}\_{N}(t)}{m\_{ii}}
\end{split}
\tag{10}
$$

One issue with the newly transformed Equation (10) is that the acceleration ẍ<sub>i</sub>(t) exists on both sides of the equation, which could lead to trivial solutions. To eliminate this possibility, a new sequence y<sub>i</sub>(t) is introduced to represent the left-hand components in Equation (10), where y<sub>i</sub>(t) = ẍ<sub>i</sub>(t + Δt) − ẍ<sub>i</sub>(t). The final transformation of the equation of motion is shown in Equation (11).

$$\frac{y\_i(t + \Delta t) - y\_i(t)}{\Delta t^2} = \frac{\ddot{f}\_i(t)}{m\_{ii}} - \frac{k\_{i,1}\ddot{\mathbf{x}}\_1(t) + k\_{i,2}\ddot{\mathbf{x}}\_2(t) + \dots + k\_{i,N}\ddot{\mathbf{x}}\_N(t)}{m\_{ii}} \tag{11}$$
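The left-hand side of Equation (11) is a forward-difference estimate of the second time derivative of the acceleration. A quick numerical check, assuming a hypothetical single-harmonic response x(t) = sin(ωt) and an arbitrary sampling interval, shows how the sequence y(t) is built from sampled accelerations and how closely the difference quotient follows the exact derivative.

```python
import numpy as np

omega = 2.0 * np.pi  # rad/s, hypothetical 1 Hz response
dt = 1e-3            # s, hypothetical sampling interval
t = np.arange(0.0, 2.0, dt)

# For x(t) = sin(omega t): acceleration and its exact second time derivative.
acc = -omega**2 * np.sin(omega * t)
fourth_exact = omega**4 * np.sin(omega * t)

# y(t) = acc(t + dt) - acc(t), the sequence introduced before Equation (11).
y = acc[1:] - acc[:-1]

# Left-hand side of Equation (11): (y(t + dt) - y(t)) / dt^2, defined on t[:-2].
lhs = (y[1:] - y[:-1]) / dt**2

# Maximum error relative to the amplitude of the exact derivative.
rel_err = np.max(np.abs(lhs - fourth_exact[:-2])) / omega**4
print(f"relative error = {rel_err:.4f}")
```

The forward difference is first-order accurate, so the relative error scales with ω·Δt; a shorter sampling interval reduces it proportionally.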

This newly transformed equation can be represented as an ARMAX function (Equation 1) provided that y<sub>i</sub>(t) and ẍ<sub>i</sub>(t) are considered the output and input terms, respectively. The error term in the ARMAX model represents damping, excitation force, ambient noise, and numerical errors arising from the numerical approximation of the derivative. As stated in Do (2015), an order of 1 for both the n<sub>a</sub> and n<sub>b</sub> terms and an order of 3 for the n<sub>c</sub> term was found to be sufficient to account for these influences. The ARMAX model for the i<sup>th</sup> row of the equation of motion of a multi-DOF system can be expressed as in Equation (12) below. The parameters can be estimated using the least squares criterion (Mei and Gül, 2016).

$$\begin{aligned} y\_i\left(t+\Delta t\right) + a^i y\_i\left(t\right) &= b^i\_1\ddot{\mathbf{x}}\_1\left(t\right) + b^i\_2\ddot{\mathbf{x}}\_2\left(t\right) + \dots + b^i\_N\ddot{\mathbf{x}}\_N\left(t\right) \\ &\quad + e\left(t\right) + d^i\_1 e\left(t-\Delta t\right) + d^i\_2 e\left(t-2\Delta t\right) \end{aligned} \tag{12}$$
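To illustrate how coefficients of the form appearing in Equation (12) can be recovered by least squares, the sketch below fits a simplified model to synthetic data. It is an assumption-laden illustration, not the estimation procedure of the paper: the moving-average terms are dropped (so the model reduces to ARX with white noise), the input channels are random stand-ins for story accelerations, and all coefficient values and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000

# Synthetic stand-ins for the accelerations of three adjacent stories.
acc = rng.standard_normal((n, 3))

# Hypothetical "true" coefficients used to generate the output sequence.
a_true = 0.3
b_true = np.array([1.5, -3.0, 1.5])
noise = 0.01 * rng.standard_normal(n)

# y(t + dt) + a * y(t) = b_1 acc_1(t) + b_2 acc_2(t) + b_3 acc_3(t) + e(t)
y = np.zeros(n + 1)
for t in range(n):
    y[t + 1] = -a_true * y[t] + acc[t] @ b_true + noise[t]

# Linear regression: y(t + dt) = [-y(t), acc_1(t), acc_2(t), acc_3(t)] @ theta
Phi = np.column_stack([-y[:-1], acc])
theta, *_ = np.linalg.lstsq(Phi, y[1:], rcond=None)
a_hat, b_hat = theta[0], theta[1:]

print("a estimate:", a_hat)
print("b estimates:", b_hat)
```

With colored noise, as in a full ARMAX model, ordinary least squares on this regression would generally be biased, which is one reason the moving-average terms are retained in the model of the paper.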

### Sensor Clustering

Due to the nature of shear structures, it can be assumed that the signal of a DOF can only affect the DOFs located directly above or below. With this assumption, the time series models can be constructed in a more concise way where each model only incorporates the neighboring DOFs. These models are referred to as a sensor cluster.

Based on the ARMAX model built for the equation of motion of a DOF, the vibration at one sensor, which is considered the reference channel, is used to fit the left side of the equation. The vibration data from the neighboring sensors represent the right side of the equation. For an N-DOF structure, there are N ARMAX models, each with the reference channel as output and inputs only from the adjacent channels.

The ARMAX model is solely reliant on the sensor clusters, and not the readings of each individual sensor. This sensor clustering technique, which was previously developed by Gul and Catbas (2011), greatly reduces the complexity of the equation of motion for an N DOF.

Consider a four-story shear building to explain this sensor clustering technique. The first sensor cluster created to build the ARMAX model incorporates the first and second stories, and the first story is chosen as the reference channel. The reference channel of the second cluster is the second story, and the two neighboring stories (first and third) are included. The third sensor cluster has the third story as its reference channel and includes the two adjacent stories: the second and the fourth. The final sensor cluster incorporates both the third and fourth stories, with the fourth story being the reference channel.
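The clustering rule described for the four-story example generalizes directly to N stories. A minimal sketch, with the hypothetical helper name `sensor_clusters` and 1-based story numbering:

```python
def sensor_clusters(n_stories):
    """Map each reference story to the stories in its sensor cluster.

    Each cluster holds the reference story and its immediate neighbors
    (the story below and the story above, where they exist).
    """
    clusters = {}
    for ref in range(1, n_stories + 1):
        members = [s for s in (ref - 1, ref, ref + 1) if 1 <= s <= n_stories]
        clusters[ref] = members
    return clusters


for ref, members in sensor_clusters(4).items():
    print(f"reference story {ref}: cluster {members}")
```

Interior stories always yield three-member clusters, while the first and top stories yield two-member clusters, so the number of coefficients per model does not grow with the height of the building.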

### Building Damage Features

Among the property changes of a shear structure, mass changes are often related to the loading of the structure and are not considered as damage in most cases. To isolate stiffness changes from mass changes, the damage features proposed by Do (2015) are used in this paper. This section briefly describes the definition of the damage features.

The B(q) terms in the ARMAX model (Equation 12) represent the terms $k\_{ij}/m\_{ii}$ in the equations of motion of each sensor cluster. The baseline case matrix is defined in Equation (13) and the matrix representing the unknown case (i.e., damaged case) is represented by Equation (14).

$$\begin{aligned} b\_{j,baseline}^{i} = \begin{bmatrix} b\_1^1 & b\_2^1 & \dots & b\_n^1 \\ b\_1^2 & b\_2^2 & \dots & b\_n^2 \\ \vdots & \vdots & \ddots & \vdots \\ b\_1^n & b\_2^n & \dots & b\_n^n \end{bmatrix} \cong \begin{bmatrix} \frac{k\_{11}}{m\_{11}} & \frac{k\_{12}}{m\_{11}} & \dots & \frac{k\_{1n}}{m\_{11}} \\ \frac{k\_{21}}{m\_{22}} & \frac{k\_{22}}{m\_{22}} & \dots & \frac{k\_{2n}}{m\_{22}} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{k\_{n1}}{m\_{nn}} & \frac{k\_{n2}}{m\_{nn}} & \dots & \frac{k\_{nn}}{m\_{nn}} \end{bmatrix} \tag{13}$$

$$d\_{j,damaged}^{i} = \begin{bmatrix} d\_1^1 & d\_2^1 & \dots & d\_n^1 \\ d\_1^2 & d\_2^2 & \dots & d\_n^2 \\ \vdots & \vdots & \ddots & \vdots \\ d\_1^n & d\_2^n & \dots & d\_n^n \end{bmatrix} \cong \begin{bmatrix} \frac{k'\_{11}}{m'\_{11}} & \frac{k'\_{12}}{m'\_{11}} & \dots & \frac{k'\_{1n}}{m'\_{11}} \\ \frac{k'\_{21}}{m'\_{22}} & \frac{k'\_{22}}{m'\_{22}} & \dots & \frac{k'\_{2n}}{m'\_{22}} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{k'\_{n1}}{m'\_{nn}} & \frac{k'\_{n2}}{m'\_{nn}} & \dots & \frac{k'\_{nn}}{m'\_{nn}} \end{bmatrix} \tag{14}$$

During seismic events, reinforced concrete members will often undergo a reduction in stiffness. As such, this paper focuses only on the loss of stiffness in a structure to determine damage, and the mass is assumed not to have changed significantly during the seismic event. Therefore, the denominators in Equation (14) can be changed from $m'\_{ii}$ to $m\_{ii}$ to produce a new matrix as shown in Equation (15), where the stiffness terms are the only ones which change between the baseline case and the unknown case.

$$d\_{j,damaged}^{i} \cong \begin{bmatrix} \frac{k'\_{11}}{m\_{11}} & \frac{k'\_{12}}{m\_{11}} & \dots & \frac{k'\_{1n}}{m\_{11}}\\ \frac{k'\_{21}}{m\_{22}} & \frac{k'\_{22}}{m\_{22}} & \dots & \frac{k'\_{2n}}{m\_{22}}\\ \vdots & \vdots & \ddots & \vdots\\ \frac{k'\_{n1}}{m\_{nn}} & \frac{k'\_{n2}}{m\_{nn}} & \dots & \frac{k'\_{nn}}{m\_{nn}} \end{bmatrix} \tag{15}$$

The Stiffness Damage Feature (SDF) is presented in Equation (16) as follows.

$$\text{SDFs} = \frac{d\_{j,damaged}^{i} - b\_{j,baseline}^{i}}{b\_{j,baseline}^{i}} \times 100\% \quad i:\text{sensor clusters};\tag{16}$$
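As a minimal sketch of Equation (16), the element-wise percentage change between the baseline and damaged coefficient matrices could be computed as follows. The function name and the 2-DOF numbers are hypothetical and chosen only for illustration; entries that are zero in the baseline (outside a sensor cluster) are reported as zero here, which is an assumption of this sketch.

```python
def stiffness_damage_features(baseline, damaged):
    """Element-wise percentage change of the B coefficients (Equation 16)."""
    sdf = []
    for b_row, d_row in zip(baseline, damaged):
        row = []
        for b, d in zip(b_row, d_row):
            # Assumption: zero baseline entries lie outside the cluster, so report 0.
            row.append(0.0 if b == 0 else (d - b) / b * 100.0)
        sdf.append(row)
    return sdf


# Hypothetical k_ij / m_ii coefficients for a 2-DOF system (not from the paper).
baseline = [[200.0, -100.0], [-100.0, 100.0]]
damaged = [[180.0, -80.0], [-80.0, 80.0]]
sdf = stiffness_damage_features(baseline, damaged)
print(sdf)
```

A negative SDF entry indicates a loss of stiffness relative to the baseline.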

### CASE STUDIES: NUMERICAL ANALYSIS

To verify the validity of the ARMAX damage detection model, two different structures were modeled using SAP2000. Each structure was subjected to a variety of damage cases, and the undamaged and damaged models' acceleration responses to ambient wind forces were analyzed and the SDFs were calculated. Those SDFs were then directly compared to the expected SDF results which were obtained from extracting the stiffness matrices from SAP2000.

Of the two structures modeled, one was a steel moment frame and the other was a reinforced concrete (RC) frame, where shear deformation from lateral loading is most prevalent. The ARMAX DDM assumes that the structures can be approximated as shear type structures and therefore flexural deflections are not considered.

Each structure is presented with damage cases which range from minor damage cases (only one story damaged) to severe damage cases (>70% of stories damaged). The entire procedure for the numerical analysis can be summarized through **Figure 1**.

### Wind Speed Simulation Model

The ARMAX DDM previously outlined requires acceleration readings at every story to properly function. As previously stated, the acceleration responses can be gathered by installing one bi-axial sensor per story. These accelerations are created by a lateral wind force acting on the building. The following sections describe the procedure to generate the wind forces.

#### Wind Simulation at Reference Elevation

When simulating a wind speed function, a common technique involves breaking the wind down into two components: the low frequency component (LFC), which represents the average hourly wind speed, and the high frequency component (HFC), which considers the wind speeds at shorter time periods ranging from 10 to 300 s (Welfonder et al., 1997; Nichita et al., 2002; Bayem et al., 2008). This can be represented as follows:

$$U\_r\left(t\right) = \nu\_{LFC}\left(t\right) + \nu\_{HFC}(t)\tag{17}$$

This paper utilizes the method proposed by Fernandez and Alonso (2017) to create a wind speed model at a reference story elevation. The method considers both wind components as stochastic variables, which greatly simplifies the wind speed simulation process and correlates closely with real-life measurements.

### Wind Speeds at Other Elevations

When generating the wind speed functions for elevations other than the reference story elevation, two factors must be considered: the mean wind speed at the given elevation and the correlation with regards to the neighboring story wind speeds.

In general, wind speeds increase at higher heights. In this paper, the power law is used to represent mean wind speed profiles at other elevations, as it has been shown to give an accurate approximation for elevations below 200 m (Holmes, 2015).

$$U(z) = U\_r \times \left(\frac{z}{z\_r}\right)^{\alpha} \tag{18}$$

The exponent α is an empirically derived landscape coefficient that ranges from 0.10 for smooth, flat terrain to 0.40 for cities with high rise buildings (Bañuelos-Ruedas et al., 2010). The wind force example used for the two damage models had an exponent value of 0.30.
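The power law of Equation (18) can be illustrated with a short sketch. The 4 m story height and the 4 m/s reference speed below are assumptions made for this example; only the exponent of 0.30 is taken from the text.

```python
def power_law_speed(u_ref, z, z_ref, alpha=0.30):
    """Mean wind speed at elevation z from the speed u_ref at elevation z_ref (Equation 18)."""
    return u_ref * (z / z_ref) ** alpha


# Hypothetical building: 4 stories, 4 m story height, reference at the first story.
story_height = 4.0
u_first_story = 4.0  # m/s, assumed reference mean speed
profile = [
    power_law_speed(u_first_story, story * story_height, story_height)
    for story in range(1, 5)
]
for story, speed in enumerate(profile, start=1):
    print(f"story {story}: {speed:.2f} m/s")
```

The profile grows monotonically with height, and a larger exponent (rougher terrain) steepens it.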

Correlation is defined as the real number in the range [-1, 1] that measures how two variables (i.e., wind speeds) at different elevations evolve with each other. The Pearson correlation equation (Pearson, 1895), which is used in this paper to measure correlation of wind speeds at different elevations, is defined in Equation (19).

$$\rho\_{\text{xy}} = \frac{\sigma\_{\text{xy}}^2}{\sigma\_{\text{x}}\sigma\_{\text{y}}} \tag{19}$$

The correlation of real-life wind speeds will not be equal to one. The correlation generally ranges from 0.50 to 0.80 depending on site characteristics and wind speeds. The correlation is simulated based on Kim et al. (2009), which can best reflect real-life measurements.

$$C\_{12}\left(r\_{y},r\_{z},n\right) = e^{\left(-r^{\*} \times n^{\*}\right)}\tag{20}$$

$$r^{\*} = \frac{\sqrt{\left(k\_{y}r\_{y}\right)^{2} + \left(k\_{z}r\_{z}\right)^{2}}}{L\_{x}\left(z\_{m}\right)}$$

$$n^{\*} = \sqrt{1 + \left(\frac{nL\_{x}\left(z\_{m}\right)}{k\_{2}U\left(z\_{m}\right)}\right)^{2}}$$

$$z\_{m} = \sqrt{z\_{1} \times z\_{2}}$$

$$r\_{z} = z\_{2} - z\_{1}$$

where $k\_{y} = 0.5, \ k\_{z} = 0.5, \ k\_{2} = 0.06$

In Equation (20) listed above, the only inputs required are the vertical and horizontal distances between two points ($r\_z$ and $r\_y$, respectively) and the frequency at which wind speeds are taken (i.e., $T\_{SAMPLE}$ = 10 s, n = 0.1 Hz). With the power law and correlation effects accounted for, a wind speed model was generated in the following section which accounts for any elevation as it relates to the wind speed created in the previous section.
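Equation (20) can be evaluated directly once the length scale and mean wind speed at the geometric mean height are known. The sketch below assumes these two quantities are supplied as plain numbers; the values of 50 m and 5 m/s are placeholders for illustration and are not taken from the paper or from Kim et al. (2009).

```python
import math


def geometric_mean_height(z1, z2):
    """z_m = sqrt(z1 * z2), the height at which the length scale and mean speed are evaluated."""
    return math.sqrt(z1 * z2)


def wind_correlation(r_y, r_z, n, length_scale, mean_speed,
                     k_y=0.5, k_z=0.5, k_2=0.06):
    """Correlation between wind speeds at two points, following Equation (20)."""
    r_star = math.sqrt((k_y * r_y) ** 2 + (k_z * r_z) ** 2) / length_scale
    n_star = math.sqrt(1.0 + (n * length_scale / (k_2 * mean_speed)) ** 2)
    return math.exp(-r_star * n_star)


# Placeholder site values (assumptions): length scale 50 m, mean speed 5 m/s.
z_m = geometric_mean_height(4.0, 7.25)
c_adjacent = wind_correlation(r_y=0.0, r_z=3.25, n=0.1,
                              length_scale=50.0, mean_speed=5.0)
print(f"z_m = {z_m:.2f} m, correlation = {c_adjacent:.3f}")
```

The correlation equals one for coincident points and decays as the separation or the sampling frequency increases.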

### Wind Force Generation Model at a Given Elevation

The first step in creating a wind speed at a given elevation was to generate the wind speed at the reference elevation (first story) as shown previously, as that reference elevation speed is the baseline for the second story wind speed. With the baseline wind speed generated, each story's wind speed was built in ascending order by first increasing each story's wind speed relative to the story below using the power law. Following that increase, a correlation generator was developed to model real life wind behavior.

According to Kim et al. (2009), the predicted correlation between wind speeds at 3.25 m height difference is 0.695. To simulate the correct correlation, a correlation generator was developed to induce some randomness by either increasing or decreasing the wind speed from its original value. The randomized numbers were bounded by a normal distribution with varying limits to create wind speed trials with varying correlation values. An iterative program was created which simulates several wind speed trials with different limits and then checks which trial yielded the optimal correlation value.
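One possible reading of this iterative procedure is sketched below. It is an assumption-laden illustration rather than the authors' program: zero-mean Gaussian perturbations with several candidate standard deviations (`scales`, a hypothetical parameter) are added to the reference series, and the trial whose Pearson correlation is closest to the target is kept.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def generate_correlated_speed(reference, target_corr, scales, rng):
    """Try several perturbation levels and keep the trial closest to target_corr."""
    n = len(reference)
    mean = sum(reference) / n
    std = math.sqrt(sum((u - mean) ** 2 for u in reference) / n)
    best_series, best_corr = None, None
    for scale in scales:
        # Assumption: perturbations are Gaussian with std = scale * std(reference).
        trial = [u + rng.gauss(0.0, scale * std) for u in reference]
        corr = pearson(reference, trial)
        if best_corr is None or abs(corr - target_corr) < abs(best_corr - target_corr):
            best_series, best_corr = trial, corr
    return best_series, best_corr


rng = random.Random(0)
reference = [4.0 + rng.gauss(0.0, 0.5) for _ in range(2000)]  # synthetic stand-in
scales = [0.1 * i for i in range(1, 31)]
upper, achieved = generate_correlated_speed(reference, 0.695, scales, rng)
print(f"achieved correlation: {achieved:.3f}")
```

In a full simulation the reference series would first be scaled by the power law, and the selected trial would then serve as the reference for the next story.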

With the second story wind speed generated, the wind speed is then generated for the third floor using the same correlation generator procedure with the second story as the new reference elevation speed and with a different correlation value. This process is repeated for each story until each floor has a wind speed which corresponds to the power law mean speed and the appropriate correlation. Afterwards, the simulation is refined further to account for turbulence at one-second wind speed samples.

An example final version of wind speeds at 10 separate stories is shown in **Figure 2A**, which represents wind speeds with an average starting hourly wind speed of 4 m/s (∼14 km/hr) at the first story.

The major factors that can affect the wind pressure include density of surrounding buildings, relative heights of surrounding buildings, surface roughness and angle of wind. A parametric model considering all the factors above proposed by Grosso (1992) was introduced to simulate pressure coefficients along the building. These pressure coefficients were used in conjunction with the calculated wind speeds to generate a story by story wind force which can be utilized during the damage detection model. A sample of windward and leeward distributed forces (6 m/s average wind speed) acting on a four story 16 m tall building are presented in **Figures 2B,C**. The windward and leeward forces were applied at the windward and leeward sides of each story's floor slab as uniform distributed loads in the numerical building models. The frequency spectrum of wind force and structural response are shown in **Figure 3**.

### Numerical Damage Modeling Technique

As the proposed methodology is based on its ability to detect damage in numerical building models, it is imperative that the damage properly reflects real life behavior. One of the most commonly used damage analysis techniques to determine the degree of damage in a structure is the stiffness degradation method, which compares the initial loading stiffness slope of an undamaged structural member to the reloading stiffness slope after the member/structure is subjected to a seismic event. This stiffness degradation model will be utilized as it directly relates to the focus of the ARMAX DDM, which determines the change in stiffness at a story by story level. To properly reflect damage, both the concrete and steel properties were modified as follows.

### Concrete Damage

According to Guo et al. (2016), it was assumed that any stiffness reduction can be attributed to the degradation of the initial reloading modulus of concrete as shown in Equation (21). This assumption holds true because when the steel bars are unloaded and reloaded, their reloading modulus generally will not change drastically due to the elastic nature of steel, whereas the formation of cracks in concrete due to a seismic event would greatly reduce the reloading modulus. This damage model assumes that the concrete has undergone non-linear damage due to the concrete strain passing the strain at peak strength (∼0.22%). Note that although the concrete has undergone non-linear damage, the ambient wind forces acting on the reinforced concrete afterwards would be low enough that the "re-loaded" concrete behaves in a linear fashion.

$$DR\_{Concrete} = 1 - \frac{E\_{New}}{E\_{original}}\tag{21}$$

Chang and Mander (1994) studied the effects of dynamic and cyclic loading on concrete and developed a set of equations which relate the original stiffness ($E\_{original}$) to any reloading damaged stiffness ($E\_{New}$) while also calculating the new stress and strain capacities. This set of equations was adapted to create new concrete capacity curves in which the only inputs required are the original concrete compressive strength, initial flexural stiffness and the target Damage Ratio (DR).

The range of Damage Ratios spans from minor damage (0.40) to critical damage (0.65). Minor damage refers to the point in which cracks become noticeable in the concrete. Critical damage refers to the point just before complete failure of the concrete with zero force capacity. These Damage Ratio limits and corresponding degrees of damage were determined previously by Toussi and Yao (1983).
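Equation (21) can be inverted to obtain the reduced modulus that corresponds to a target Damage Ratio. The sketch below shows only this inversion, not the Chang and Mander (1994) capacity curves; the 30,000 MPa modulus is an assumed value for illustration.

```python
def damage_ratio(e_original, e_new):
    """Damage Ratio from the original and reloading moduli (Equation 21)."""
    return 1.0 - e_new / e_original


def damaged_modulus(e_original, target_dr):
    """Reloading modulus that yields the target Damage Ratio (Equation 21 rearranged)."""
    if not 0.0 <= target_dr < 1.0:
        raise ValueError("Damage Ratio must lie in [0, 1)")
    return (1.0 - target_dr) * e_original


e_original = 30000.0  # MPa, assumed undamaged modulus (not from the paper)
for dr in (0.40, 0.65):  # minor and critical damage limits quoted in the text
    print(f"DR = {dr:.2f}: E_new = {damaged_modulus(e_original, dr):.0f} MPa")
```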

For illustrative purposes, the stiffness, ultimate strength and ultimate strain capacity of the undamaged and damaged 40 MPa concrete are presented in **Table 1**. It is assumed that the damaged concrete has lost all tensile capacity due to cracking.

**Figure 4** is presented below for better visualization and understanding of how the damaged concrete compressive curves compare to the undamaged concrete. Past a strain value of 0.37%, it is assumed that the concrete will have completely failed (Toussi and Yao, 1983). In this paper, concrete damage is introduced by changing the material characteristics, i.e., modulus of elasticity and peak compressive strength, of the damaged columns according to DR within the SAP2000 model.

TABLE 1 | Undamaged and damaged concrete material properties.


### Steel Damage

As the steel reinforcing bars undergo cyclic loading, the unloading and reloading modulus of elasticity remains relatively unchanged. What does change, however, is the ultimate strength of the steel, as the constant cyclic loading has a fatigue loading effect. As such, the DR of the reinforcing steel bars can be calculated from the ratio of the new ultimate strength of the steel to its undamaged ultimate capacity, as illustrated in Equation (22) below.

$$DR\_{Rebar} = 1 - \frac{\sigma\_{Ult.(New)}}{\sigma\_{Ult.(Original)}} \tag{22}$$

In this paper, steel members in SAP2000 are replaced with aluminum members to simulate damage.

### Parameters

In this paper, a 4 story steel structure and a 10 story reinforced concrete structure are simulated. As an example, the procedure to calculate parameters of the 4 story steel structure is shown herein. The four story building is simplified as a 4-DOF system where the stiffness values $k\_1$ to $k\_4$ are the lateral force resisting stiffnesses at each floor and the mass is assumed to be lumped in the floor of each story. Each numerical building model is treated as a strong-beam weak-column structure and therefore the beams and slabs were treated as perfectly rigid. The stiffness and mass matrices of the four stories are shown in Equations (23, 24), respectively.

$$K = \begin{bmatrix} K\_{11} & K\_{12} & K\_{13} & K\_{14} \\ K\_{21} & K\_{22} & K\_{23} & K\_{24} \\ K\_{31} & K\_{32} & K\_{33} & K\_{34} \\ K\_{41} & K\_{42} & K\_{43} & K\_{44} \end{bmatrix} = \begin{bmatrix} k\_1 + k\_2 & -k\_2 & 0 & 0 \\ -k\_2 & k\_2 + k\_3 & -k\_3 & 0 \\ 0 & -k\_3 & k\_3 + k\_4 & -k\_4 \\ 0 & 0 & -k\_4 & k\_4 \end{bmatrix} \tag{23}$$

$$M = \begin{bmatrix} m\_{11} & 0 & 0 & 0 \\ 0 & m\_{22} & 0 & 0 \\ 0 & 0 & m\_{33} & 0 \\ 0 & 0 & 0 & m\_{44} \end{bmatrix} \tag{24}$$
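The tridiagonal pattern of Equation (23) generalizes to any number of stories. The following sketch assembles the stiffness matrix from the story stiffnesses and divides each row by the corresponding lumped mass, which gives the $k\_{ij}/m\_{ii}$ terms that the ARMAX coefficients approximate. The function names and numerical values are illustrative assumptions.

```python
def shear_stiffness_matrix(k):
    """Assemble the shear-building stiffness matrix from story stiffnesses k[0..n-1]."""
    n = len(k)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Diagonal: stiffness of the story below plus the story above (if any).
        K[i][i] = k[i] + (k[i + 1] if i + 1 < n else 0.0)
        if i + 1 < n:
            # Off-diagonals couple adjacent floors only.
            K[i][i + 1] = -k[i + 1]
            K[i + 1][i] = -k[i + 1]
    return K


def mass_normalised(K, masses):
    """Divide row i of K by the lumped mass m_ii."""
    return [[entry / m for entry in row] for row, m in zip(K, masses)]


# Hypothetical story stiffnesses (N/m) and lumped masses (kg).
K = shear_stiffness_matrix([1.0e6, 1.0e6, 1.0e6, 1.0e6])
B = mass_normalised(K, [500.0, 500.0, 500.0, 500.0])
print(B[0])
```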

With the stiffness and mass matrices set up as shown, the stiffness damage feature (SDF) matrix was represented as follows. Note that the equation to calculate each SDF is shown in Equation (16).

$$SDF = \begin{bmatrix} \text{SDF}\_{11} & \text{SDF}\_{12} & 0 & 0\\ \text{SDF}\_{21} & \text{SDF}\_{22} & \text{SDF}\_{23} & 0\\ 0 & \text{SDF}\_{32} & \text{SDF}\_{33} & \text{SDF}\_{34}\\ 0 & 0 & \text{SDF}\_{43} & \text{SDF}\_{44} \end{bmatrix} \tag{25}$$

This methodology also applies to the 10 story structure, with the only difference being that the stiffness, mass and SDF matrices are represented as 10 × 10 matrices as opposed to 4 × 4 matrices. With the general SDF matrix set up, the overall loss in stiffness at each story can be calculated as in Equation (26). Note that "last story" refers to the highest story of the building and the calculation of $\Delta K\_1$ requires that $K\_1 = K\_2$.

$$\begin{aligned} \Delta K\_1 &= (2 \times SDF\_{11}) - \Delta K\_2\\ \Delta K\_i &= \frac{SDF\_{i-1,i} + SDF\_{i,i-1}}{2}, \; i = 2, 3, \dots, n-1\\ \Delta K\_n &= \frac{SDF\_{n-1,n} + SDF\_{n,n-1} + SDF\_{n,n}}{3} \end{aligned} \tag{26}$$
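Equation (26) can be sketched as a small function operating on an n × n SDF matrix stored as nested lists. The example input uses the expected SDF values quoted for Damage Case S1 later in the paper (−17.13% for the fourth story entries and −8.56% for the combined third and fourth story entry); the function name is an assumption of this sketch.

```python
def story_stiffness_changes(sdf):
    """Story-level stiffness changes from an n x n SDF matrix (Equation 26)."""
    n = len(sdf)
    if n < 2:
        raise ValueError("at least two stories are required")
    delta = [0.0] * n
    # Stories 2 .. n-1 (zero-based 1 .. n-2): average the two off-diagonal entries.
    for i in range(1, n - 1):
        delta[i] = (sdf[i - 1][i] + sdf[i][i - 1]) / 2.0
    # Top story: average the two off-diagonal entries and the diagonal entry.
    delta[n - 1] = (sdf[n - 2][n - 1] + sdf[n - 1][n - 2] + sdf[n - 1][n - 1]) / 3.0
    # First story: relies on the assumption K_1 = K_2.
    delta[0] = 2.0 * sdf[0][0] - delta[1]
    return delta


sdf_s1 = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, -8.56, -17.13],
    [0.0, 0.0, -17.13, -17.13],
]
delta_s1 = story_stiffness_changes(sdf_s1)
print(delta_s1)
```

With these expected values the only non-zero story change is at the fourth story, which matches the location of the replaced column.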



Theoretically, the change in stiffness at each story (aside from the first) can be obtained from a single SDF value; however, averaging two SDF values instead mitigates the experimental errors.

To better simulate real life scenarios in which the collected data are usually corrupted with measurement error, white noise with a mean of 0 and a standard deviation of 5% of the original signal's standard deviation is added to each story's acceleration response during the baseline and damaged cases. The SDF results presented in the following section represent the average SDF values after performing 10 trials with the noisy data.
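A minimal sketch of this noise model is given below. The sinusoidal test signal and the function names are assumptions for illustration; only the zero mean and the 5% standard deviation ratio come from the text.

```python
import math
import random


def std(values):
    """Population standard deviation."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def add_white_noise(signal, noise_ratio=0.05, rng=None):
    """Add zero-mean Gaussian noise whose std is noise_ratio times the signal std."""
    rng = rng or random.Random()
    sigma = noise_ratio * std(signal)
    return [s + rng.gauss(0.0, sigma) for s in signal]


# Synthetic acceleration record (assumed): 10 s of a 2 Hz sine sampled at 100 Hz.
clean = [math.sin(2.0 * math.pi * 2.0 * i / 100.0) for i in range(1000)]
noisy = add_white_noise(clean, noise_ratio=0.05, rng=random.Random(1))
noise = [a - b for a, b in zip(noisy, clean)]
print(f"noise std / signal std = {std(noise) / std(clean):.3f}")
```

Repeating the call with different seeds would give the independent noisy trials whose SDF values are then averaged.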

For each structure, the story accelerations were measured at the center of each floor slab. Throughout the numerical modeling simulations, the average starting hourly wind speeds on the first story ranged from 2 m/s (7.2 km/hr) to 8 m/s (28.8 km/hr).

The damage in each numerical model was represented as a uniform change in the material properties throughout an entire column. This model is slightly simplified, as it is expected in moment frames that the top and bottom of each column would be the most damaged due to the peak moment forces location.

For the RC building model, the building reinforcement is designed as per the Concrete Design Handbook, 4th Edition, with the loads being calculated using the 2015 National Building Code of Canada (Cement Association of Canada and Canadian Standards Association, 2016). The structures are assumed to be conventional office buildings in Vancouver on Soil Type D. The building reinforcement was verified through SAP2000's automated moment frame design calculations.

The structural response due to wind for the RC model was calculated using Newmark's direct integration method (γ = 0.50, β = 0.25) and incorporated proportional damping with a constant 7% damping coefficient for the baseline state of the structure and a 5% damping coefficient for the unknown state of the structure (Newmark, 1982). The concrete compressive curves were modeled using Mander's curve.
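For readers unfamiliar with Newmark's method, the sketch below integrates a single-degree-of-freedom oscillator. It assumes the constant-average-acceleration variant (γ = 0.5, β = 0.25 in the common textbook convention) and uses arbitrary illustrative properties; it is not the multi-story SAP2000 analysis used in the paper.

```python
def newmark_sdof(m, c, k, force, dt, gamma=0.5, beta=0.25):
    """Newmark time stepping for m*a + c*v + k*u = force(t), starting at rest."""
    n = len(force)
    u, v, a = [0.0] * n, [0.0] * n, [0.0] * n
    a[0] = (force[0] - c * v[0] - k * u[0]) / m
    k_eff = k + gamma / (beta * dt) * c + m / (beta * dt ** 2)
    for i in range(n - 1):
        # Effective load collects the known state at step i.
        p_eff = (force[i + 1]
                 + m * (u[i] / (beta * dt ** 2) + v[i] / (beta * dt)
                        + (1.0 / (2.0 * beta) - 1.0) * a[i])
                 + c * (gamma / (beta * dt) * u[i] + (gamma / beta - 1.0) * v[i]
                        + dt * (gamma / (2.0 * beta) - 1.0) * a[i]))
        u[i + 1] = p_eff / k_eff
        a[i + 1] = ((u[i + 1] - u[i]) / (beta * dt ** 2) - v[i] / (beta * dt)
                    - (1.0 / (2.0 * beta) - 1.0) * a[i])
        v[i + 1] = v[i] + dt * ((1.0 - gamma) * a[i] + gamma * a[i + 1])
    return u, v, a


# Illustrative oscillator (assumed): m = 1, k = 100, 5% damping, unit step load.
m, k = 1.0, 100.0
c = 2.0 * 0.05 * (k * m) ** 0.5
u, v, a = newmark_sdof(m, c, k, force=[1.0] * 2001, dt=0.01)
print(f"displacement after 20 s: {u[-1]:.5f} (static value {1.0 / k:.5f})")
```

Under a step load the damped response settles at the static displacement, which provides a quick sanity check of the integrator.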

The material specifications for the structural steel, concrete and rebar are presented in **Table 2**.

### CASE STUDIES: RESULTS AND DISCUSSION

To verify the validity of the ARMAX damage detection model, two numerical building models are presented below. The wind was sampled at 100 Hz and the total time period for one state of each structure is 10 s. Each structure was subjected to a variety of damage cases, the undamaged and damaged models' acceleration responses to ambient wind forces were analyzed, and the SDFs were calculated. As noted above, 5% artificial noise was added to all the responses obtained from the models. Those SDFs were then directly compared to the expected SDF results which were obtained from extracting the stiffness matrix from SAP2000. The ARMAX DDM assumes that the structures can be approximated as shear type structures and therefore flexural deflections are not considered. Each structure is presented with damage cases which range from minor damage cases (only one story damaged) to severe damage cases (>70% of stories damaged).

### Case Study I: 4 Story Steel Structure

It was imperative that the finite element (FE) modeling parameters were properly calibrated to simulate real life structural behavior. As such, the first structure considered was a replica of an experimental four story steel structure which was built by Do (2015). The FE model replica was subjected to damage cases identical to those tested in the experiments to verify that the FE model parameters used throughout this paper properly reflect real life damage from previous experiments. The purpose of the steel structure was not to detect seismic damage but to ensure that the numerical modeling parameters reflected real life behavior.

The 3D view and plan view of the structural model are shown in **Figure 5**. As each steel angle column is identical in material properties and dimensions, they are all considered to have identical stiffness values.

To validate that the FE model can be replicated to match previous experiments, the structure was excited by two pairs of Multiple Impulse Forces (MIF) located at the two corners of the first and third floors. This forcing function was created by randomly generating an impulse force from a normal distribution every 0.1 s.
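A forcing function of this kind could be generated as in the following sketch. The 0.001 s time step matches the response recording interval stated in the text, whereas the impulse standard deviation `sigma`, the single-sample impulse shape and the function name are assumptions.

```python
import random


def multiple_impulse_force(duration, dt=0.001, impulse_interval=0.1,
                           sigma=1.0, rng=None):
    """Force history that is zero except for a random impulse every impulse_interval."""
    rng = rng or random.Random()
    n_samples = int(round(duration / dt))
    step = int(round(impulse_interval / dt))
    force = [0.0] * n_samples
    for i in range(0, n_samples, step):
        # Assumption: each impulse occupies one sample with a normally distributed amplitude.
        force[i] = rng.gauss(0.0, sigma)
    return force


force = multiple_impulse_force(duration=1.0, rng=random.Random(3))
impulse_indices = [i for i, f in enumerate(force) if f != 0.0]
print(f"{len(impulse_indices)} impulses in {len(force)} samples")
```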

The acceleration response of the structure from the MIF was recorded at 0.001 s intervals. For the steel structure, the response calculated by FE modeling was a linear modal response using a constant damping of 2%.

The original accelerations for the first sensor cluster are presented in **Figure 6**. The output is generally about half the magnitude of the input data. This makes sense because the output is the difference of accelerations. Such a small difference in order of magnitude is unlikely to cause ill-conditioning of the matrix while estimating the parameters. For this sensor cluster, the number of predicted points is 499 and the number of unknowns is 5, i.e., ($a\_1$, $b\_1^1$, $b\_2^1$, $d\_1^1$, and $d\_2^1$).

### Damage Case S1—Single Story Damage (4th Story)

The first damage case involved replacing one of the steel angle columns with an identically sized aluminum angle column at the fourth story. The location of the damaged column is at A1 as shown in **Figure 5B**.

By replacing a 200 GPa steel column with a 63 GPa aluminum column at location A1 (the intersection of gridline A and gridline 1), the Damage Ratio of the single column was 1 – (63/200) = 0.685. Every other column in the structure was unchanged and therefore can be assumed to have a Damage Ratio of 0. The overall loss in stiffness on the fourth story can be calculated as [((3 × 0) – (1 × 0.685))/4] = −17.13%, which would be reflected in $\text{SDF}\_{34}$, $\text{SDF}\_{43}$, and $\text{SDF}\_{44}$. $\text{SDF}\_{33}$, which represents the change in combined stiffness of the third and fourth stories, can be calculated as [((7 × 0) – (1 × 0.685))/8] = −8.56%. Note that the denominator represents the total number of columns that are included in each respective SDF.
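The hand calculation above amounts to averaging the column Damage Ratios over all columns that contribute to a given SDF entry. A short sketch, with a hypothetical helper name, reproduces the two quoted values:

```python
def expected_sdf(column_damage_ratios):
    """Expected SDF (%) from the Damage Ratio of every contributing column (0 if undamaged)."""
    return -100.0 * sum(column_damage_ratios) / len(column_damage_ratios)


dr_aluminum = 1.0 - 63.0 / 200.0  # steel (200 GPa) replaced by aluminum (63 GPa)

# Fourth story: one damaged column out of four.
sdf_fourth_story = expected_sdf([dr_aluminum, 0.0, 0.0, 0.0])
# Third and fourth stories combined: one damaged column out of eight.
sdf_combined = expected_sdf([dr_aluminum] + [0.0] * 7)

print(f"fourth story: {sdf_fourth_story:.2f}%")       # about -17.13%
print(f"third + fourth stories: {sdf_combined:.2f}%")  # about -8.56%
```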

To validate this calculation method, each expected SDF is confirmed through extracting the stiffness matrix of the finite element (FE) models. The extracted FE results (also referred to as the "expected" results) and the ARMAX analysis results (one case with no noise and one with 5% noise added) are presented in **Table 3** below. Throughout the damage cases, the SDF results represent the average of 10 trials.

The 5% noise effect did not have a significant impact on the SDF values from the ARMAX analysis. With the SDF matrix set up, the overall loss in stiffness in each story was calculated as shown in Equation (26) using the 5% noise SDF values. The calculated change in stiffness at each story from the ARMAX DDM is presented in Equation (27). For brevity, these calculations will not be shown for any other damage case.

TABLE 3 | SDF results (DC S1).


The overall change in stiffness of each story based on the 5% noise SDF values from the 10 trials is presented in **Table 4**. The bracketed values in the ARMAX column represent the standard deviation of the 10 trials, with a lower standard deviation value signifying more stable results.

The ARMAX analysis successfully located and quantified the damage in the fourth story while no substantial change was estimated in all other stories. The low standard deviation values for each story (average value of 1.49) illustrate the stability of the results through the 10 trials even with added noise.

### Damage Case S2—Two Story Damage (1st and 2nd Stories)

The second damage case involved replacing two steel columns (A1 and B2 in **Figure 5B**) at the first story and one steel column (A1) at the second story with identically sized aluminum columns.

TABLE 4 | Story stiffness change (DC S1).

Similar to Damage Case S1, the damage ratio of each individual "damaged column" is 0.685. $\text{SDF}\_{11}$, which represents the change in stiffness of the combined first and second stories, was calculated as [((5 × 0) – (3 × 0.685))/8] = −25.69%. The change in stiffness of the second story, as shown in $\text{SDF}\_{12}$ and $\text{SDF}\_{21}$, was calculated as [((3 × 0) – (1 × 0.685))/4] = −17.13%, and $\text{SDF}\_{22}$ was calculated as [((7 × 0) – (1 × 0.685))/8] = −8.56%. For brevity, these calculations will not be shown for any further steel damage cases as the same process can be used for every damage case. In the results tables, each expected damage case result was obtained by extracting the FE matrix; the hand calculations were only used as a second verification.

#### TABLE 6 | Story stiffness change (DC S3).

The overall change in stiffness at each story from both the expected results and the 5% noise ARMAX SDF is presented in **Table 5** as per Equation (26).

The ARMAX DDM successfully located the damage on the first and second stories while also measuring no substantial change in stiffness in the undamaged stories. The degree of damage on the first story was underestimated by 4.81%; however, the degree of damage on the second story was very close to the expected value. The low standard deviation values from the 10 trials illustrate the negligible impact that the 5% noise had on the ARMAX DDM.

#### Damage Case S3—Three Story Damage (1st, 2nd, and 3rd Stories)

The final damage case for the experimental steel structure represents a more severe case in which there is damage on the first (A1, A2, and B1 in **Figure 5B**), second (B2) and third story (A1 and B1) with a total of six steel columns being replaced by aluminum columns. The overall stiffness loss values for each story are presented in **Table 6**.

The ARMAX DDM successfully located the damage at each story with excellent correlation to the expected degree of damage and relatively small differences between each trial.

The ARMAX analysis results from the numerical modeling produced results very similar to the results which were measured through previous tests on the experimental structure built by Do (2015). In each damage case, the ARMAX results successfully located and determined the degree of damage at each story without yielding significant false negative or positive results. In some cases, however, the ARMAX model underestimated the severity of damage to some extent.

### Case Study II: 10 Story Concrete Structure

A 10 story structure with a 4 × 4 column layout as shown in **Figure 7A** was simulated. The 3D FE model as well as plan view of the model are presented in **Figures 7B,C**. Each column had identical rebar detailing and identical undamaged stiffness properties.

### Damage Case C1—Two Story Damage (2nd and 5th Stories)

The first damage case incorporated moderate damage to eight columns, four columns (A2, B2, C2, and D2) with a DR of 0.50 and four columns (A4, B4, C4, and D4) with a DR of 0.55, at both the second and fifth stories.

Equation (26) was used once again to calculate the story stiffness change at each level and the results are presented in **Table 7** along with the standard deviation from the 10 trials.

The ARMAX DDM successfully located the damage in the second and fifth story. The severity of damage at each story was very close to the expected values with minimal standard deviations. Although there were some false positive SDF values that were higher than in the previous structures, it did not result in any issues as the highest false positive story stiffness change was calculated as −4.63%.

### Damage Case C2—Five Story Damage (1st, 3rd, 4th, 7th, and 9th Stories)

The second damage case simulated a building which has undergone moderate to severe damage throughout, with damage being applied to columns on five stories. The first story had five columns (A3, B2, B4, C4, and D2) damaged with DRs ranging from 0.50 to 0.65. The third story had five columns (A1, A2, B1, C3, and D3) damaged as well, with two columns having a DR of 0.55 and three columns having a DR of 0.60. The fourth story incorporated damage in seven different columns (A2, A3, C1, C2, C3, and D4) with DRs ranging from 0.45 to 0.65. The seventh story had three columns (A3, B3, and B4) damaged: two with a DR of 0.55 and one with a DR of 0.50. The ninth story had four columns (B2, B3, C2, and C3) damaged, each with a DR of 0.40.

Similar to Damage Case C1, the noise continues to not have a major influence on the damage detection model. **Table 8** presents the overall change in stiffness calculated at each story.

The ARMAX DDM was successful in locating which five stories were damaged without calculating significant false negative or false positive results at the undamaged locations. Like the previous building model, the ARMAX DDM slightly underestimated the severity of damage when the number of damaged stories was increased.

### Damage Case C3—Seven Story Damage (1st, 2nd, 3rd, 4th, 6th, 7th, and 8th Stories)

The final damage case tested represented a building that is in a critical state with damaged columns at seven different stories. The most severe damage was incorporated on the four lowest stories, with the first, second, third, and fourth stories having ten (A1, B1, B3, B4, C1, C2, C3, C4, D2, and D4), nine (A1, A2, A4, B2, B3, C3, C4, D1, and D2), nine (A2, A3, B3, B4, C1, C2, C3, D3, and D4) and six (A3, A4, B2, B3, D1, and D2) columns damaged, respectively. The sixth, seventh and eighth stories each had seven columns damaged. To be more specific, the damaged columns for the sixth story are at A2, A3, A4, B2, B3, C3, and C4. The damaged columns for the seventh story are at A1, A2, A3, D1, D2, D3, and D4. The damaged columns for the eighth story are at A2, A4, B3, B4, C1, C2, and C3.

The overall stiffness changes at each story are presented in **Table 9**.

The ARMAX DDM yielded excellent results by successfully locating the damage at each of the seven damaged stories. The degree of damage was calculated with excellent precision in the first four stories; however, the model slightly underestimated the degree of damage in the three higher stories.

### Discussion of Results

Overall, the ARMAX DDM was shown to effectively locate the damaged stories in both models with no significant errors. For most of the damage cases, the ARMAX DDM accurately estimated the degree of damage; however, the DDM had slightly less accurate results in the building model with more columns. This was expected, as the ARMAX DDM relies on approximating buildings as simplified shear type structures and ignoring the flexural deformation, so the ARMAX DDM generated nearly identical results to the FE models when the structures themselves were simplified. Through rigorous numerical testing, the ARMAX DDM was proven to be an effective and consistent method for locating and quantifying damage.

#### TABLE 7 | Story stiffness change (DC C1).


#### TABLE 8 | Story stiffness change (DC C2).


#### TABLE 9 | Story stiffness change (DC C3).


### CONCLUSIONS AND FUTURE WORK

In this paper, a new building damage detection model was proposed and developed using ARMAX analysis on the acceleration responses due to ambient wind loading. Through rigorous numerical modeling, it was demonstrated that damage can be identified at a story level precision and the degree of damage can be accurately quantified based on floor accelerations due to wind forces.

Within the detailed description of the methodology, the ARMAX model, used in conjunction with a sensor clustering concept to analyze the dynamic responses of a structure, was explored. By assuming the mass of a building can be lumped at the floors and incorporating mathematical approximations, the ARMAX time series model was transformed to represent the general equation of motion. Using a sensor clustering technique, the ARMAX DDM was able to create a baseline case and a damaged case of a structure, and those two cases were then compared to create a stiffness damage feature capable of locating and quantifying damage at story-level precision.

With an accurate account of the analysis model, forcing function and numerical damage model, the second part of the paper involved verifying the capability of the ARMAX DDM using numerical analysis. To illustrate the effectiveness, two separate building models were shown. The first was a previously built, experimental steel structure to which multiple impulse force loading was applied. The results from the ARMAX DDM effectively demonstrated that the parameters used in FE modeling accurately reflected real-life experimental behavior. The second structure was a 10 story reinforced concrete frame with a 4 × 4 column layout. Damage was successfully located and quantified in the minor, moderate and severe damage cases. Note, however, that the model slightly underestimated the degree of damage in some stories in the moderate and severe damage cases. The level of underestimation, however, was small enough to not warrant any major concerns. Overall, the ARMAX DDM was proven to be an effective and consistent method for locating and quantifying damage at a story level precision.

The ARMAX DDM has provided accurate results in multiple building damage scenarios; however, there are still limitations worth mentioning and recommendations for future work. One limitation of this paper is that, although the method was validated through various numerical model tests, no experimental structures have been tested using wind-induced vibrations. The numerical damage detection model incorporated a uniform change in material properties in only the columns, with the rigid beams and slabs being unaffected. It is recommended that tests be done that simulate more realistic structural damage. This may include incorporating severe material property changes in the tops and bottoms of the columns while affecting the middle elevation less. This could also include not treating the beams and slabs as rigid members and instead applying damage to them and including plastic hinge effects, i.e., not assuming the structure to be shear type. Although the damage model was shown to be effective when replacing a steel column with an aluminum one, it is recommended that the ARMAX DDM be tested on a more realistic damage case for steel structures. It is also recommended that timber buildings be tested. Further investigation should also look into adding 10% noise instead of the 5% used. In addition, as the dynamics of the system become faster relative to the sampling rate, the first-order forward difference used to approximate the derivative may no longer be suitable; a higher-order approximation or up-sampling techniques could then be applied.
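The last limitation can be illustrated with a minimal sketch. It is not part of the ARMAX DDM itself: the test signal (a unit-amplitude sinusoid), its frequency and the sampling rates are assumed values chosen only for illustration, and `derivative_errors` is a hypothetical helper. The sketch compares the first-order forward difference with a second-order central difference as the sampling rate becomes coarse relative to the signal frequency.

```python
import numpy as np

def derivative_errors(freq_hz, fs_hz, duration_s=2.0):
    """Max abs error of forward and central differences for d/dt sin(2*pi*f*t)."""
    dt = 1.0 / fs_hz
    t = np.arange(0.0, duration_s, dt)
    omega = 2.0 * np.pi * freq_hz
    x = np.sin(omega * t)
    exact = omega * np.cos(omega * t)          # analytical derivative

    forward = (x[1:] - x[:-1]) / dt            # first order, estimate at t[:-1]
    central = (x[2:] - x[:-2]) / (2.0 * dt)    # second order, estimate at t[1:-1]

    err_forward = float(np.max(np.abs(forward - exact[:-1])))
    err_central = float(np.max(np.abs(central - exact[1:-1])))
    return err_forward, err_central

if __name__ == "__main__":
    # Assumed 5 Hz response sampled at progressively coarser rates.
    for fs in (1000.0, 200.0, 50.0):
        ef, ec = derivative_errors(5.0, fs)
        print(f"fs = {fs:6.0f} Hz  forward error = {ef:.4f}  central error = {ec:.4f}")
```

Under these assumptions the forward-difference error grows roughly in proportion to the sampling interval, while the central-difference error grows with its square, which is why a higher-order scheme or up-sampling helps when the response is fast relative to the sampling rate.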

### AUTHOR CONTRIBUTIONS

GG used the ARMAX model for wind induced vibration, implemented the numerical analysis for verification and drafted the paper. QM developed part of the methodology and drafted the paper. MG, the supervisor of the other two authors, came up with the idea and concept, guided the direction of research, checked the proposed approach in this paper and revised the manuscript, and coordinated the funding for the project.

### FUNDING

This research was partly supported by the Natural Sciences and Engineering Research Council of Canada through the Discovery Grants.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Gislason, Mei and Gül. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Technology Leveraging for Infrastructure Asset Management: Challenges and Opportunities

A. Emin Aktan, Ivan Bartoli and S. Gokhan Karaman\*

Department of Civil, Architectural, and Environmental Engineering, College of Engineering, Drexel University, Philadelphia, PA, United States

Transportation and other infrastructure systems, particularly in dense urban regions, are intertwined, interdependent, multi-scale, multi-domain and complex, and their behavior cannot be predicted even when element behaviors are known. Such systems should be managed just like financial assets, leveraging measurement-based, objective and reliable metrics for documenting their value, performance and condition, and based on their lifecycle and disutility risk for each distinct limit-state of performance, as discussed in the following. In this paper the writers offer a perspective on asset management of civil infrastructures with a focus on highway bridges, and describe the tools considered necessary for rectifying the current shortcomings, which arise mainly from subjective and incomplete performance and condition evaluation practice. The adoption of sensing systems, which allow measurements of displacement, acceleration, strain and tilt that can be collected wirelessly, has the potential to provide the objective metrics needed for optimal asset management. The authors caution, however, that such a transition (from asset management based on visual inspection to data-driven asset management based on objective metrics) can be truly achieved only if combined with the proper training of a new generation of infrastructure inspectors and stakeholders. The paper attempts to provide a roadmap for such a transition in asset management and describes the critical concepts that should be incorporated in training a new generation of civil engineers in charge of maintaining our transportation assets.

#### Edited by:

Eleni N. Chatzi, ETH Zürich, Switzerland

#### Reviewed by:

Branko Glisic, Princeton University, United States; Ke Chen, The University of Hong Kong, Hong Kong; Fernando Moreu, University of New Mexico, United States

> \*Correspondence: S. Gokhan Karaman ks3285@drexel.edu

#### Specialty section:

This article was submitted to Structural Sensing, a section of the journal Frontiers in Built Environment

Received: 16 November 2018 Accepted: 29 April 2019 Published: 16 May 2019

#### Citation:

Aktan AE, Bartoli I and Karaman SG (2019) Technology Leveraging for Infrastructure Asset Management: Challenges and Opportunities. Front. Built Environ. 5:61. doi: 10.3389/fbuil.2019.00061

Keywords: bridge asset management, bridge inspection, sensing technologies, sensor selection, high resolution imaging, wireless sensing

### 1. INFRASTRUCTURES AND ASSET MANAGEMENT IN THE US

Infrastructures, such as transportation, water, power, fuel and communication, are complex, multi-scale and multi-domain systems (with natural-human-engineered elements) providing critical services. They are key for the livability, sustainability and resilience of our communities. The need for infrastructure systems or their expansion is influenced by the actual infrastructure service needs of a region, the economy, financing, and also politics and policy. Infrastructures in the US may fall under public, semi-public, private or hybrid (public-private partnership, PPP) ownership, mainly based on history, policy and financing mechanisms. They are operated and preserved with many possible organizational constructs that are also influenced by their financing and revenue mechanisms (**Figure 1**).

In the last decades there has been increasing recognition that all infrastructure systems should be managed similarly to financial assets, which are invested and preserved by principles, such as diversification, time horizon and risk tolerance in addition to leveraging statistics and scenario analysis. Infrastructure asset management is the integrated, multidisciplinary set of strategies for sustaining physical assets, such as water treatment and distribution facilities, sewer lines, roads, dams, utility grids, bridges, railways, manufacturing plants and pipelines.

**Figure 1** depicts some of the most critical parameters and systems that influence, and therefore should be considered in, infrastructure management. Most infrastructures in the US are many decades, and in some cases centuries, old: consider parts of the water distribution systems remaining from the time of the Colonies dating back to the 1700's, the railroads to the 1800's, the Interstate Highway System to the 1960's and the Internet (starting as ARPANET) to the 1980's. It is therefore natural for these infrastructure systems to remain under the influence of the history, culture and legal frameworks that defined their ownership during their early development.

Privately financed and operated infrastructures regulated as utilities (power, communication, internet, light rail, airlines, toll roads and bridges, clean water systems, and sewers in some regions) generally dedicate funds and adopt maintenance management policies for preservation over the long-term, as their owners stand to lose revenue in the case of service disruptions. Meanwhile, public infrastructures, such as streets, roads, parks and transit, that depend on taxpayer funds from state and especially local governments face a different challenge. Local governments are often starved for resources (consider that the financial health of many cities and counties in the US remains challenged just by insufficiently funded pension obligations going back many decades), and the short election cycles for elected leaders do not offer incentives for proactive long-term planning for the preservation of public assets. As a result, aged local infrastructures are often managed in a day-by-day triage mode, and local government organizations are seldom evaluated on the long-term performance of their infrastructure assets. In most cases, the federal government may provide the bulk of major rehabilitation and replacement costs of public infrastructures, but the cost of routine preventive maintenance is considered the responsibility of the local government. Therefore, many local governments prefer to defer maintenance and wait until assets require major rehabilitation or replacement, in which case federal funds may become available to finance most of the cost.

### 2. CHALLENGES TO ASSET MANAGEMENT IN THE US

In the United States, there has been increasing Congressional awareness of checking the performance of federal, state and local agencies receiving federal infrastructure funds. For example, the 1993 Government Performance and Results Act (GPRA) required government agencies to pay increased attention to the outputs and outcomes that are expected from federal programs. The 1995 National Performance Review (NPR) ushered in a broader definition for performance management, which corresponds with evaluating progress toward achieving pre-defined objectives. NPR fostered an examination of the relationship between the outcomes and the investment. In 2001 the U.S. Government Accountability Office (GAO) emphasized that spending should be tied to outcomes (GAO-01-834). Shaw (2003) evaluated the performance measures of operational effectiveness for highways.

In 2010 the National Performance Management Advisory Commission (NPMAC) indicated that the relation between expenditures and predetermined outputs as organizational objectives needs to be realized. The 2012 MAP-21 Act (Moving Ahead for Progress in the 21st Century Act) of the US Congress required state agencies to focus on monitoring performance and outcomes, and required each State to develop a risk-based transportation asset management plan (TAMP) for the National Highway System (NHS) to improve or preserve the condition of the assets and the performance of the system. MAP-21 specifically requested the Secretary of the Department of Transportation (DOT) to ensure that all states implement performance measurement in order to adequately monitor the condition of interstate highway infrastructure and the national highway system.

MAP-21 defines asset management as a strategic and systematic process of operating, maintaining, and improving physical assets, with a focus on engineering and economic analysis based upon quality information, to identify a structured sequence of maintenance, preservation, repair, rehabilitation, and replacement actions that will achieve and sustain a desired state of good repair over the lifecycle of the assets at minimum practicable cost. According to Federal Highway Administration (FHWA), a State asset management plan shall, as a minimum, be in a form that the Secretary determines to be appropriate and include:


Although the 2012 MAP-21 Act's requirement for each state to develop a risk-based TAMP is an excellent and desirable development, a number of challenges remain:


infrastructure elements and systems, discussed further in the following;

We lack objective, measurement-based metrics that would reveal the Demands, Capacity, Disutility Probability, and consequences of disutility for evaluating risk in a definitive and quantitative manner. A new engineering education paradigm culminating in an infrastructure engineering and management degree is urgently called for; this, as well as other shortcomings of aged infrastructure management in the USA in the 21st Century, has been discussed by many experts in addition to the writers (Aktan et al., 2016a,b).

### 3. OBJECTIVES

This paper has been written in response to a request by the writers' colleagues for a contribution to a Frontiers collection of papers on robust monitoring, diagnostic methods and tools for engineered systems. Rather than directly delving into the technical specifics of "monitoring," the writers opted to link the monitoring problem to the much broader infrastructure asset management concern. Unless monitoring of critical engineered systems is encouraged by policy, and the drivers and objectives of monitoring are crystal clear, the value of investments in technology leveraging remains questionable, as has been the case for many applications to date. It is therefore important to assert that the discussions in this paper are focused on the broader problem of technology leveraging for asset management in general and highway bridge asset management in the US in particular.

A further clarification is required regarding the precise meaning of "monitoring technology." Writers use the term "technology" to stand for "sensing," imaging and non-destructive probing in relation to data from field experiments; information technology; modeling and simulation; and risk-based optimum decision-making for asset management.

The objectives of this paper follow from the discussions above and include the following:


### 4. INFRASTRUCTURE PERFORMANCE

Infrastructure performance may be defined through the analysis of a multi-dimensional Capacity/Demand relationship, as illustrated in **Table 1**. The probability of critical Demands exceeding the corresponding Capacities of an infrastructure system should be the basis for evaluating infrastructure performance at each of the four distinct performance limit-states: Utility and Functionality; Serviceability and Durability; Life Safety and Stability of Failure; and Resilience. The Return Periods for peak demands for most infrastructure components typically vary as shown in the first row, and the Performance Criteria for each limit-state of performance are narrated in the last row. Today, engineers and infrastructure owners often do not account for the interdependencies among these four limit-states and often assume that safety and resilience are separate problems from functionality and durability.
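The idea of a probability of Demand exceeding Capacity can be made concrete with a minimal sketch. It is not the writers' procedure: it assumes, purely for illustration, that Capacity and Demand are independent and normally distributed, and the function names and numerical values are hypothetical. Under that assumption the exceedance probability has a closed form, which the sketch cross-checks by Monte Carlo sampling.

```python
import math
import numpy as np

def exceedance_probability(mu_c, sigma_c, mu_d, sigma_d):
    """Closed-form P(Demand > Capacity) for independent normal variables.

    Returns the probability and the reliability index beta.
    """
    beta = (mu_c - mu_d) / math.hypot(sigma_c, sigma_d)
    # Standard normal CDF evaluated at -beta, written with erfc.
    return 0.5 * math.erfc(beta / math.sqrt(2.0)), beta

def exceedance_probability_mc(mu_c, sigma_c, mu_d, sigma_d, n=1_000_000, seed=0):
    """Monte Carlo estimate of the same probability (same assumptions)."""
    rng = np.random.default_rng(seed)
    capacity = rng.normal(mu_c, sigma_c, n)
    demand = rng.normal(mu_d, sigma_d, n)
    return float(np.mean(demand > capacity))

if __name__ == "__main__":
    # Hypothetical values in arbitrary consistent units.
    pf, beta = exceedance_probability(100.0, 10.0, 60.0, 10.0)
    pf_mc = exceedance_probability_mc(100.0, 10.0, 60.0, 10.0)
    print(f"beta = {beta:.3f}, closed form = {pf:.5f}, Monte Carlo = {pf_mc:.5f}")
```

In practice the distributions would have to come from measured data for each limit-state, which is exactly the information the text argues is currently missing.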

If a common facility is designed and constructed by adhering to building codes, such as the International Building Code (IBC) in the USA (or the Eurocodes in the EU), the probability of collapse is assumed to be ∼10<sup>-6</sup> for a 475-year seismic event or other demands governing the safety limit-state. However, we lack the data to confirm or dispute the actual performance. In Japan, a country with the most stringent seismic codes and enforcement, the 1995 Kobe earthquake is reported to have destroyed about 50,000 buildings. In the case of highway bridges designed to American Association of State Highway and Transportation Officials (AASHTO) Standards, there are data implying that the probability of failure is 1/10,000 based on an assumed lifecycle of 75 years (Bektas and Albughdadi, 2018).
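As a short worked example of what a return period means, the sketch below converts between a return period and the probability of at least one exceedance during an exposure time. It assumes independent years with a constant annual exceedance probability of 1/T, which is a common simplification and not something stated in the paper; the function names are hypothetical. Under that assumption, a 475-year event corresponds to roughly a 10% chance of exceedance in 50 years.

```python
def prob_exceedance(return_period_years, exposure_years):
    """P(at least one exceedance in the exposure time), independent years assumed."""
    annual_p = 1.0 / return_period_years
    return 1.0 - (1.0 - annual_p) ** exposure_years

def return_period(prob, exposure_years):
    """Return period giving the stated exceedance probability over the exposure time."""
    return 1.0 / (1.0 - (1.0 - prob) ** (1.0 / exposure_years))

if __name__ == "__main__":
    print(f"475-year event over 50 years: {prob_exceedance(475.0, 50.0):.3f}")
    print(f"10% in 50 years -> return period: {return_period(0.10, 50.0):.0f} years")
```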

Aside from the safety limit-state, most public infrastructures especially in the older urban regions in the USA fail to perform in the Utility and Functionality as well as the Serviceability and Durability limit-states much earlier than anticipated in their service-lives. The Second Row of **Table 1** implies that asset management should actually integrate the management of operations, maintenance and preservation, structural safety and stability and resilience on the same platform. However, most engineers, public infrastructure owners and managers have fragmented the lifecycle asset management challenge by delegating operations, preservation, safety, and resilience to different and disconnected jurisdictions, bureaus, and organizations.

In some cases, such as the tall building stock in San Francisco, doubts may arise about potential performance at the safety and resilience limit-states after several decades of a building boom (Fueller et al., 2018). In most cases, however, it takes an actual natural or man-made hazard to reveal the actual performance of systems at the safety and resilience limit-states. Our current civil and structural engineering practice, based on code prescriptions and subjective visual inspections (Moore et al., 2001), is grossly insufficient given the changing nature and increasing frequency of hazards and the associated risks due to infrastructure failures and disutility. Further, many civil engineers are not even aware of the uncertainty in predicting the performance of a constructed system that they design by code provisions.

The desirable approach to managing public infrastructures, especially in dense urban regions such as the NE Corridor in the USA, is to demand quantitative descriptions and objective, measurement-based metrics for the Loads, Demands, and Capacities of infrastructures, and data for evaluating whether the disutility probability is acceptable at each and every limit-state of performance. The acceptable disutility probability would naturally depend on the affected human population and the economic consequences. For example, we may tolerate less than perfect roughness indices and congestion on some highways at certain times of the day and/or the week, and a bridge deck that requires maintenance just 5 years after construction due to wear and deterioration. On the other hand, we cannot continue to practice infrastructure management with a subjective pseudo-reality of how constructed systems perform and how the engineered, social and natural elements of complex infrastructure systems interact. Infrastructure management has to be based on a rational, objective set of metrics defining the performance expected of infrastructures and their organizations. This requires the ability to measure and monitor performance, as per W. Edwards Deming's teachings, since we cannot manage what we cannot measure.

Fortunately, the information technology revolution of the last several decades offers the tools needed for measuring and monitoring performance.

### 5. THE INFORMATION TECHNOLOGY REVOLUTION

Information Technology (IT) has been defined as the study or use of systems (especially computers and telecommunications) for storing, retrieving, and sending information. The Intel 4004, introduced in 1971, is considered the first commercial 4-bit IC microprocessor, and microprocessors have advanced continuously since. The 4004 was only capable of 60,000 instructions per second, but its successors, including the Intel 8086/8088 family, brought ever-growing speed and power to computers. Today, thanks to smartphones and tablets, we enjoy ubiquitous computing, data and image capture, and communication. Cloud computing and storage have become a significant industry, removing many of the limits to computing, archiving and retrieval of data and information. Software is available to the public at very low cost for almost any conceivable purpose, including all levels of games, K-12 education, productivity, finance, engineering, arts, architecture, and sciences, amongst others.

Along with the advances in IT, parallel advances in experimental technology have become available: sensors, actuators, data acquisition systems, controllers, and pumps; wide-area high-definition digital imaging; a variety of NDT probes; and, more recently, wireless sensor networks and SCADA (supervisory control and data acquisition) systems. Most of this hardware and the associated data acquisition and control software (e.g., NI's LabVIEW) have been used in laboratories and in some cases in the field on actual infrastructures. Naturally, a variety of sensors have been and are being used in airplane, auto, HVAC, and elevator systems and in defense applications. More recently, coupled multi-level and broad-area real-time imaging, sensing, computing, communication and actuation-control systems have been demonstrated for homeland security purposes and for infrastructure management.

The explosion of IT has created immense amounts of data. To put things in perspective, the size of the Internet doubles about every 2 years. For the beginning of 2016, the Counter expects around 7.7 Zettabytes (ZB) of data distributed worldwide on Internet servers (1 ZB = 10<sup>21</sup> bytes = 1 million petabytes = 1 billion terabytes = 1 trillion gigabytes). Along with this data explosion, privacy and cyber-security have become significant societal concerns. A Wall Street Journal article describes the major role "big data" is playing in the US economy (Stoll, 2018).

It follows that, in spite of the great abundance of data, the challenge of organizing, synchronizing, visualizing and interpreting this data into information, followed by knowledge and wisdom for transforming infrastructure management, remains open. Moreover, we do not yet clearly know the scope of useful data on organizations and assets that we need for infrastructure management, or how we can collect this data. What constitutes useful data (and images) for objective measurement of infrastructure performance at various limit-states, especially at the service and safety limit-states, and how we may capture, fuse and interpret this data will be discussed later. One thing is certain: the IT explosion, if properly leveraged, offers a great opportunity for rationalizing and optimizing infrastructure management! However, the path to improving infrastructure management requires being able to manage data and understanding the path from data to wisdom.

### 5.1. Data-Information-Knowledge-Wisdom

In order to identify the most critical data that we need and how to capture this data for prudent management decisions, first we need to understand the distinction and hierarchy of data, information, knowledge and wisdom.

**Figure 2** describes the stages of identifying and understanding complex systems behavior by transforming data to information by understanding any existing relationships between data (e.g., by correlation analyses), followed by understanding the patterns embedded in information to lead to knowledge. Finally, by understanding the physical, chemical and mathematical principles embedded in knowledge we may acquire the wisdom that is essential for generalizing knowledge and developing prudent decision-making tools, such as scenario generation and simulation. While data collection is only the first step of this process, we also have to appreciate that there are additional manners of acquiring knowledge discussed further in relation to **Figure 3**.
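As an illustration of the first stage (data to information through correlation analysis), the following minimal sketch uses entirely synthetic data. The variable names, the assumed thermal sensitivity of the strain reading and the noise levels are hypothetical and are not taken from any monitored structure; the point is only that a correlation coefficient separates a related pair of measurements from an unrelated one.

```python
import numpy as np

def pearson_correlation(x, y):
    """Pearson correlation coefficient between two equally long series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])

# Synthetic "monitoring" records (all values are assumptions for illustration).
rng = np.random.default_rng(42)
temperature_c = rng.uniform(-5.0, 35.0, size=500)
# Strain assumed to follow temperature linearly, plus measurement noise.
strain_ue = 11.0 * temperature_c + rng.normal(0.0, 20.0, size=500)
# A record generated independently of the strain.
traffic_count = rng.poisson(40, size=500).astype(float)

r_temperature = pearson_correlation(temperature_c, strain_ue)
r_traffic = pearson_correlation(traffic_count, strain_ue)

if __name__ == "__main__":
    print(f"strain vs temperature: r = {r_temperature:.3f}")
    print(f"strain vs traffic:     r = {r_traffic:.3f}")
```

A strong correlation of this kind is information, not yet knowledge: explaining it through thermal expansion and boundary conditions is the step toward the physical principles the paragraph above describes.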

### 5.2. Knowledge Classification and Acquisition

**Figure 3** (Knowledge-Management-Tools, 2017) illustrates a commonly accepted knowledge classification: Tacit vs. Explicit (or mechanistic). Tacit knowledge on a system needs to be accumulated before the fundamental principles, critical parameters and the mechanisms that describe and help model the system for simulation may be established as "explicit or mechanistic knowledge" about the system.

The attributes of tacit knowledge are described in **Figure 3**, indicating that data and IT are often not sufficient for its accumulation especially when we deal with complex systems, such as infrastructures. Some additional sources include: (a) collection and verification of existing heuristics, (b) reason and logic, (c) mathematical proof, (d) trial and error, (e) intuition, (f) experience gained through apprenticeship under the mentorship of an expert, (g) observation and empiricism, and most importantly, (h) the scientific method. Once tacit knowledge about infrastructures which are complex and large natural-human-engineered socio-technical and interdependent system-of-systems is accumulated, this may then be transformed into explicit knowledge through standards, codes, algorithms and numerical-statistical models.

### 6. TECHNOLOGY INTEGRATION

We now return to the challenge of objective data-driven asset management of infrastructures. What type of data is needed; how it may be collected; and, how this data would help complement the available tacit knowledge and help culminate in explicit knowledge are some of the fundamental questions. In addition to IT, a technology leveraging toolbox would include:


These technology tools are naturally integrated by following the scientific method or structural system identification. **Figure 4** offers a schematic of the structural system-identification method, which was formalized and reported in a book by an ASCE Committee (Catbas et al., 2013) and also discussed by Aktan and Brownjohn (2013). However, the roots of system-identification, as well as the innovations leading to the field experimental tools including sensing, go back many decades. We need to acknowledge many contemporary colleagues, and especially giants from earlier generations, who have contributed to the concepts and tools that have made system-identification of operating buildings, bridges and other constructed systems in the field possible. Many of these colleagues are acknowledged in the Summary and Acknowledgments (section 8); the writers are aware that they are inevitably failing to include many worthy contributors, for which they apologize.

A successful culmination of the process in **Figure 4** (more than one cycle may be required) may lead to a digital twin of the structural-foundation-soil system within the resolution of a mixed macro-micro level representation of a system. The digital twin may potentially serve as a birth-certificate for a new system, and as a basis for condition evaluation and NDT and SHM applications for long-term condition monitoring. These are critical for preventive maintenance as well as for evaluating the condition of a system following a hazard, such as an accident leading to damage.

A closer scrutiny of **Figure 4** reveals the range of disciplines and specializations that are needed for an application to an operating infrastructure component or system. The process is successful only if each of the Six Steps is overseen by the same "project manager" with experience and domain knowledge associated with each and every one of the Steps, preferably a structural engineer who also possesses the domain knowledge and heuristics that have been accumulated about the structural system being identified.

Many researchers specialize in only analytical modeling, or only experiment, or only computation and parameter identification, or only risk and reliability theory. They may simply obtain the products from each Step executed by others and then try to integrate these. Such a disconnected approach often fails to produce a reliable, high-fidelity digital twin, i.e., one that distributes its loads through the same paths as the actual system and that deforms and displaces like the actual system, sharing its kinematics. The maximum demands computed under different loading schemes during the simulations may not match reality and, most importantly, the corresponding capacity distribution may be inaccurate. It is therefore critical to have the same experienced structural engineer participate in, oversee and integrate each Step of the process in **Figure 4**, so that the contributions of the different experts are properly integrated.

### 7. SENSING TECHNOLOGIES

Having introduced the fundamentals of infrastructure performance; why prudent asset management requires objective metrics and measurement data on performance; and the challenges in integrating and leveraging technology; we now review the current practice in sensing and imaging. We will first focus on sensing as the most fundamental element in field experimentation and monitoring. For a crash course on sensors, sensorwiki (2018) is an excellent resource.

A National Academy Report (National Research Council, 1995) offered definitions and issues related to sensing in manufacturing and for structural monitoring and control in aerospace, space, defense and homeland security applications. The definitions and glossary in this book may be used for sensors for civil engineering applications. **Table 2** extracted from this book lists some of the critical sensor characteristics for static and dynamic applications.

Electronic sensing in civil engineering goes back to the 1930's (Treatise on Photoelasticity, published in 1930 by Cambridge Press). The bonded wire resistance strain gage for aluminum or steel was invented at MIT in 1938. This was the predecessor of the current foil gages manufactured and sold by Vishay, Micro Measurements (recently purchased by Vishay), or HBM, which also offer strain gages for concrete. Weldable versions of these gages are manufactured by and available from HITEC. Tokyo Sokki Kenkyujo (TML) is a sensor manufacturer in Japan that also offers specialized strain gages, such as for post-yield measurements. TML products may be purchased in the USA from Texas Measurements. Vibrating wire versions of strain transducers, with various gage-lengths and installation methods, have been commercially available for decades and are distributed by several companies, such as Telemac in France, Geokon in the US and Roctest in Canada, to name a few. Similarly, fiber optic sensors have been commercially available since the 1990's, with pioneers such as FISO in Canada, Smartec in Switzerland, Omnisens in Switzerland, Blue Road Research and Micron-Optics in the US, and Ando in Japan. Obviously the availability of sensors is no longer a concern; the art is in designing a field experiment by selecting, calibrating, positioning, installing and integrating the outputs of the best sensors for each measurement need and constraint in an optimum manner.

The invention of the strain gage enabled the design and development of many transducers capable of measuring deformations, displacements and forces by leveraging strain gages. Vishay, HBM, and TML offer many of these. For example, a clip gage is a raised arch wired to accommodate a full strain-gage bridge which amplifies and measures the strains as the arch is extended or contracted at the base. By using such a micro-structural system, the strains of which are measured by strain gages, we may correlate the strains to the elongation or contraction between two points on a member. A TML clip gage is illustrated in **Figure 5**. The PI displacement transducer has a simple structure: a combination of strain gauges and an arch-shaped spring plate, the former attached to the latter. Six models designed for gauge lengths of 50–300 mm are available. This transducer is used to measure the crack opening displacement occurring within each gauge length on the surface of concrete or to measure local deformations between elements of various structures. Many other types of clip gages have been used in research for as long as strain gages have been available.

Linear Variable Differential Transformers (LVDTs) were invented in the 1940s for displacement measurements requiring higher resolution and sensitivity than what is offered by strain-gage based transducers, such as the clip gage. LVDTs operate by leveraging inductance change and are used in the aerospace industry. LVDTs of various sizes and sensitivities are manufactured by Honeywell, Intertechnology (Celesco), TML, and others. Some LVDTs use a spring-loaded thin steel wire which enables them to measure displacement between distant points.

The vibrating wire strain gage and other vibrating-wire transducers for temperature, tilt, displacement and pressure or force have been used for geotechnical measurements. Geokon Inc. is one of the oldest and most extensive US manufacturers of vibrating wire transducers for geotechnical and structural applications. Roctest, established in 1967 and based in Canada, is another sensor manufacturer offering a variety of vibrating wire as well as fiber-optic based transducers.

Another class of transducers that is used in field experimentation is the accelerometer. PCB (recently purchased by MTS) offers a wide range of products suitable for testing constructed systems. Brüel and Kjær, Kistler Group, and TOYO Corporation also offer various accelerometers with a wide range of specifications [a listing of accelerometer manufacturers is available at sens2b-sensors (2018)]. Geophones, which measure velocity, have also been adapted from mining and used in some bridge tests.

Tiltmeters are designed to measure very small changes from the vertical level, either on the ground or in structures. Tiltmeters are used for monitoring dams, the small movements of potential landslides, the orientation and volume of hydraulic fractures, and the response of structures to various influences, such as loading and foundation settlement. Tiltmeters may be purely mechanical or incorporate vibrating-wire or electrolytic sensors for electronic measurement. A sensitive instrument can detect changes of as little as one arc second. Tuff tiltmeters from Jewell and vibrating wire tiltmeters from Geokon were used by the writer in the past for monitoring bridge superstructure and substructure rotations.
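To give a sense of scale for a resolution of one arc second, the conversion to radians and the corresponding differential movement over a baseline can be worked out as follows. The symbols $\theta$, $b$ and $\delta$ and the 10 m baseline are illustrative assumptions introduced here, not values from a specific instrument or test.

$$
\theta = 1'' = \frac{\pi}{180 \times 3600}\ \text{rad} \approx 4.85 \times 10^{-6}\ \text{rad}, \qquad \delta \approx \theta\, b = 4.85 \times 10^{-6} \times 10\ \text{m} \approx 0.05\ \text{mm}
$$

In other words, under the small-angle approximation, a one arc second rotation corresponds to roughly 0.05 mm of relative vertical movement between two points 10 m apart.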

The writers started to explore sensors that may be suitable for monitoring bridges following the Big Bayou Canot rail accident (2018) on September 22, 1993. FHWA and Ohio DOT supported an investigation of available sensors that could be used for bridge monitoring, which were purchased and extensively studied both in the laboratory and in the field over several years. These accumulated experiences led to a 2002 FHWA Report "Development of a model health monitoring guide for major bridges" (Aktan et al., 2003)<sup>1</sup>.

### 7.1. Sensor Selection and Calibration Fundamentals

There are many criteria in selecting a sensor for measuring a physical, chemical or electrical quantity in the field. First we must understand that the reading of any sensor in the field will be impacted by many phenomena, and sometimes the sensor reading may prove more sensitive to a cause or mechanism that is unknown as opposed to what the designer of an experiment may think is being measured. In general, we may classify common transducer measurement errors as:


The five most common sources of measurement error are shown in the following expression:

$$
\epsilon_{reading} = \epsilon_1 + \epsilon_2 + \epsilon_3 + \epsilon_4 + \epsilon_5 \tag{1}
$$


We should recognize that understanding the magnitudes of these errors and mitigating them is not trivial. In fact, the errors will depend on the sensor, its transduction, the structural-mechanical system of the transducer together with its installation assembly, power, A/D conversion, environmental conditions and their changes, bandwidth and duration of data acquisition and, most importantly, whether the experiment with sensor types and distribution or density is designed properly. It follows that performing field measurements and checking and validating data reliability and quality requires significant tacit knowledge.

<sup>1</sup>http://www.di3.drexel.edu/w2/files/FHWA\_Report\_7\_18\_03.pdf

The single most effective manner for assuring data quality is to perform calibration of individual sensors, followed by the system of sensors and DAQ in the laboratory (**Figure 6**). Individual sensor and sensor-DAQ system calibrations are discussed in the FHWA Report on Model Health Monitoring Guide (2002) referenced earlier.
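As a minimal sketch of what an individual sensor calibration can look like, assuming a linear sensor response and a reference standard that applies known values, a least-squares line can be fitted to paired raw and reference readings. The function name `fit_linear_calibration` and the bench data are hypothetical illustrations; they are not taken from the FHWA report.

```python
def fit_linear_calibration(raw, reference):
    """Least-squares fit of reference ~ gain * raw + offset.

    Returns (gain, offset, max_abs_residual), where the residual is the
    largest disagreement between the fitted line and the reference values.
    """
    n = len(raw)
    if n != len(reference) or n < 2:
        raise ValueError("need at least two paired readings")
    mean_x = sum(raw) / n
    mean_y = sum(reference) / n
    sxx = sum((x - mean_x) ** 2 for x in raw)
    if sxx == 0:
        raise ValueError("raw readings must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(raw, reference))
    gain = sxy / sxx
    offset = mean_y - gain * mean_x
    max_resid = max(abs(y - (gain * x + offset)) for x, y in zip(raw, reference))
    return gain, offset, max_resid


# Hypothetical bench data: sensor output (V) against a reference displacement (mm).
raw_volts = [0.0, 1.0, 2.0, 3.0, 4.0]
ref_mm = [0.5, 2.5, 4.5, 6.5, 8.5]

gain, offset, max_resid = fit_linear_calibration(raw_volts, ref_mm)
print(f"gain = {gain:.3f} mm/V, offset = {offset:.3f} mm, max residual = {max_resid:.2e} mm")
```

The maximum residual gives a quick indication of nonlinearity or scatter; repeating the same fit with the sensor connected through the full DAQ chain would correspond to the system-level calibration mentioned above.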

A valid question becomes: "How can we leverage field experiments for measuring the behavior and performance of constructed systems given the challenges in obtaining reliable measurements in the field?" This is a very important question and requires federal and state governments to invest in training at field measurement laboratories (e.g., instrumented monitoring of operating bridges) to demonstrate sensors, calibration, installation, measurement system design fundamentals and data quality assurance. These laboratories (Virtual Non-destructive Evaluation Library for Highway Structures, 2018) <sup>2</sup> may serve as best-practice examples for training motivated engineers to obtain reliable field measurements and to earn certificates and even professional degrees.

### 7.2. Wireless Sensing

One of the purposes of this paper has been to introduce the products of a recent research project performed under FHWA Exploratory Advanced Research Program (EARP) funding. In this project, given the challenges in developing new reliable wireless sensors for field measurements, the objective was to explore whether it is possible to transform the most reliable and proven sensors into wireless operation.

The sensors that were selected were: (a) Resistive strain gages or rosettes; (b) Clip Gage—as a 4-arm strain gage transducer; (c) Vibrating wire gage and other vibrating-wire transducers; (d) Displacement transducers employing wire potentiometers; (e) Electrolytic tilt-meters; (f) Seismic accelerometers (see **Figure 7**). Each type of sensor was successfully untethered from power, data acquisition, and communication cables by locating in-situ power, conditioning, data acquisition and communication IC boards in a small box, and the sensor readings were transmitted wirelessly to any computer. All the sensors previously described have been extensively tested in the laboratory and tested in the field and have been presented to a committee of experts (representatives from NASA, Volpe Center, PennDOT, FHWA, Minnesota DOT) as part of a Technology Readiness Level (TRL) assessment on June 6th, 2017.

The TRL panel agreed that the technology had reached a TRL between 4—"Components validated in laboratory environment"—and 5 "Integrated components demonstrated in a laboratory environment." Since then, the researchers have continued the development of the wireless sensing platform and followed the TRL feedback received to identify scenarios and use case development, and to develop a set of requirements for the identified use cases based on stakeholder needs.

With the above set of sensors or transducers, it is possible to measure strains, deformations or distortions, displacements, rotations, and accelerations over a wide frequency bandwidth. For instance, the vibrating wire gages offer a highly robust measurement of temperature and of distortion that is stress-related (not temperature-related), measured at a 1 Hz frequency. These sensors (in conjunction with imaging) offer an excellent capability for measuring structural and environmental responses of any highway structure at an appropriate frequency. Typically, for operational monitoring, acceleration data are obtained at <500 Hz; strain, displacement and rotation response data are collected at <200 Hz; and data on environmental conditions and their effects on bridge responses are obtained with vibrating-wire sensors at 1 Hz.
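The sampling rates quoted above translate directly into data volume, which in turn affects wireless bandwidth and battery life. The following sketch estimates the raw payload per day for a deployment; the sampling rates are the upper bounds stated in the text, while the channel counts and the 3-byte (24-bit) sample size are assumptions chosen only for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_bytes(sample_rate_hz, channels, bytes_per_sample=3):
    """Raw payload per day for one sensor group, ignoring packet overhead."""
    return sample_rate_hz * channels * bytes_per_sample * SECONDS_PER_DAY


# Rates follow the text; channel counts are hypothetical.
plan = {
    "acceleration": {"rate_hz": 500, "channels": 4},
    "strain_displacement_rotation": {"rate_hz": 200, "channels": 8},
    "vibrating_wire": {"rate_hz": 1, "channels": 6},
}

budget = {name: daily_bytes(g["rate_hz"], g["channels"]) for name, g in plan.items()}
total_mb = sum(budget.values()) / 1e6

for name, nbytes in budget.items():
    print(f"{name}: {nbytes / 1e6:.1f} MB/day")
print(f"total: {total_mb:.1f} MB/day")
```

Under these assumptions the dynamic channels dominate the budget by more than two orders of magnitude over the 1 Hz vibrating-wire channels, which is one reason the data capture time window matters for battery-powered operation.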

Untethered sensors offer further opportunities for rapid deployment (minutes/hours) and field monitoring of structures for weeks, since standard batteries used as power sources last for weeks, depending on the bandwidth and data capture time window. Installation assemblies leveraging industrial magnets are being designed to enable rapid and easy field installation even for structures with challenging access constraints. This capability promises quantitative measurements of structural strains, distortions, displacements, rotations and accelerations of a bridge during operations, augmenting the current visual inspection and subjective assessments to include objective "pulse and blood-pressure" measurements under traffic and even special loads. Such measurements may be conducted for several hours as the inspectors prepare for the inspection of a bridge. Normally, operational monitoring would be recommended for 24 h for some critical bridges to understand and include the impact of daily environmental changes on the structure's responses.

As an example, the traffic video in **Figure 8** that is stored during operational monitoring is synchronized with displacement time histories (or any other response recorded) of sensors installed on selected girders of Spans 3 and 4 of a viaduct structure supporting. A screenshot of one significant displacement event (peak displacement near 0.6 in) is shown in **Figure 8**. In the video, the sensor layout and locations are displayed on the left-hand side on a 3D model of the bridge. The right-hand side shows the video of traffic and the corresponding real-time displacement recordings. This image-data integration helps users comprehend the effect of the traffic (for instance, a large truck traveling westbound that had just crossed Span 3 at 6:50:05 am and that caused the displacement responses recorded at critical locations by wireless sensors 1 and 5 at the mid-span of Spans 3 and 4). Such synchronized image and data combinations are useful for objectively documenting vehicles that may place significant demands on a viaduct.
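The kind of image-data synchronization described here can be sketched as locating the peak response in a time history and mapping its timestamp to a video frame. This is only an illustrative sketch, not the software used in the project: the function names, the sample values, the video start time and the frame rate are all hypothetical, and the sensor and camera clocks are assumed to be already synchronized.

```python
def peak_event(times_s, values):
    """Return (time, value) at the largest absolute response."""
    if not values or len(times_s) != len(values):
        raise ValueError("times and values must be non-empty and equal in length")
    i_max = max(range(len(values)), key=lambda i: abs(values[i]))
    return times_s[i_max], values[i_max]


def frame_index(event_time_s, video_start_s, fps):
    """Index of the video frame on screen at event_time_s (shared clock assumed)."""
    if event_time_s < video_start_s:
        raise ValueError("event precedes the start of the video")
    return int((event_time_s - video_start_s) * fps)


# Hypothetical record around 6:50:05 am, in seconds since midnight (24605 s).
times_s = [24604.0, 24604.5, 24605.0, 24605.5, 24606.0]
disp_in = [-0.08, -0.31, -0.60, -0.27, -0.05]  # displacement, inches
video_start_s = 24600.0                         # assumed video start time
fps = 30                                        # assumed frame rate

t_peak, d_peak = peak_event(times_s, disp_in)
frame = frame_index(t_peak, video_start_s, fps)
print(f"peak {d_peak} in at t = {t_peak} s -> video frame {frame}")
```

With the frame index in hand, the corresponding image can be stored alongside the response record to document which vehicle produced the event.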

Responses from wireless sensors are consistent with the responses measured by traditional wired displacement gages (sensors 3 and 13) that require significantly longer installation time (and cost) for routing several hundred feet of power and communication cables to a data acquisition system and a portable generator. Cables and especially their connections are also a major source of vulnerability and errors in measurements, and their elimination augments the performance of the wireless sensing system in measuring the responses caused by traffic loading as well as by temperature changes during the period of testing.

<sup>2</sup>A virtual version of such a laboratory is available at www.di3.drexel.edu/view\_project.php?p=907

### 7.3. Future Opportunities Afforded by Rapidly Deployed Wireless Sensing

Sensors may serve to provide quantitative information, situational awareness and insight on structural behavior during the many types of bridge inspection listed below (Hearn, 2007). While each type of inspection in **Table 3** would greatly benefit from practically deployed wireless sensing, the value of sensing in the case of Special and Damage Inspections would be especially important and critical for reducing the uncertainty of visual inspection. Special and/or damage inspections are used for bridge reconstruction, especially in the case of accelerated construction; bridges that exhibit unexpected damage and/or tilting and deformations; fatigue crack monitoring; load testing for load capacity rating, etc. There is no question that wireless technology offers significant advantages in feasibility over the standard wired sensors in these applications. We anticipate that wireless sensors, especially those obtained by transforming off-the-shelf sensors that have been proven robust and reliable over decades of use, can impact the state of practice in these applications immediately, as opposed to routine measurements and structural-identification applications, which are typically longer-term investments for the owner agencies.

To reiterate, wireless sensing may offer great value to a bridge owner in the case of:


And more specifically, especially if baseline measurements at commissioning are available:


Untethered sensors, unlike tethered sensors, offer opportunities for rapid deployment (installation in minutes/hours) and allow field monitoring of structures for weeks (since batteries were verified to last for weeks and up to several months depending on the bandwidth and data capture time windows). Installation assemblies leveraging industrial magnets were designed to enable even faster and easier field installation, especially for structures with challenging access constraints. Eliminating long cables and their connections helps remove significant sources of error and uncertainty in measurements under field conditions.

Rapid deployment capability promises quantitative measurements of structural strains, distortions, displacements, rotations and accelerations of a bridge during operations, to augment visual inspection and subjective condition ratings by including objective deformation measurements under normal traffic and even special proof-level loads.

Enhancing visual inspection with rapid wireless sensing measurements appears to be a most promising application. A follow-up question may regard the merits of recording bridge "pulses" during operations every 2 years. A structural system-identification of a bridge type may offer foundational knowledge for the interpretation of measured responses over the long term. This idea was in fact suggested two decades ago (Hunt et al., 1998).

The Long-Term Bridge Performance (LTBP) research program administered by the FHWA was based on a similar concept. The insight and knowledge from the structural system-identification of reference structures (field laboratories serving as benchmarks and also as training facilities) and the bridge population represented by the reference structure would enable an excellent understanding of the health of a bridge population.

Naturally, rapidly deployable wireless sensors offer many other uses. Load testing for bridge load rating, construction monitoring especially in the case of accelerated bridge construction, transport and erection, foundation reuse, permit loading, and, structural health monitoring all require reliable field measurements for short and long durations. The single most important prerequisite is to train and educate a sufficient number of technicians and engineers who will have an understanding of what the measurements physically mean, and exactly what load or other environmental event is causing the measurements.

When we stop thinking about bridges as simple beams, and understand the complexity of the behavior of site-soil-foundations-substructures-approaches-bearings-and-superstructures as an integrated system, we will be able to understand bridge safety and reliability much better than what is implicit in the code, and to incorporate fabrication and construction quality and maintenance.

### 7.4. Potential of High Resolution Imaging

While companies, such as Google are completing the mapping of a large portion of the built environment and users can access street view images of locations across the globe, bridge inspection often still relies on traditional means of documenting the condition of infrastructures by photographs, technical drawings and other information considered relevant by an inspector. Images collected by bridge inspectors are very powerful in documenting local regions of concern but they represent spatially isolated data points and often miss the relationship of these to the global system. The limits in the field of view of human vision in understanding the broader patterns indicating possible condition changes along a large system are well-established by photographers.

Technology has recently become available to achieve high resolution maps of bridge decks by stitching high resolution images collected by cameras installed on road vehicles moving at traffic speed (Hiasa et al., 2016). Such rapid data collection is not only useful because it keeps the inspector protected from the risks posed by moving traffic and reduces the need for traffic control, but it is particularly valuable because it provides an overall visual documentation of the actual condition of the entire deck surface. Further, high-resolution RGB images can be integrated with information extracted from properly timed infrared imaging to reveal the possible existence and extent of delamination hidden under the surface (Hiasa et al., 2016).

An example of the product extracted from a rapid high speed survey is shown in **Figure 9** where a complete HD imaging of a bridge surface is shown together with zoomed views of the HD images at locations of interest and the location of possible delaminations detected by high speed IR imaging. The study of a full-bridge deck surface image could reveal patterns that even an inspector walking on a bridge cannot easily discern. Such an image would also help design a more in-depth NDT application if needed. Periodically collected images complemented with crack mapping could be really powerful ways to document the progressive deterioration of the deck.

### 7.5. A Holistic Technology Integration and Leveraging Strategy for Bridge Asset Management

Given that this paper was written in response to a call for "robust monitoring, diagnostic methods and tools for engineered systems," the writers believe that it is important to consider how all of the critical experimental, information, modeling and simulation, and decision tools may come together in relation to typical highway bridge assets. **Table 4** presents a logical hierarchy for field technology applications, starting from (a) Inspection; (b) Measurement of Geometry and Material Properties; (c) Evaluation of Condition and Performance; (d) Diagnosis, Prognosis, Risk Evaluation and Options for Intervention Designs; and (e) Health and Performance Monitoring based Asset Management. Each column of **Table 4** would be applied from left to right, and from Cell 1 to Cell 4 in sequence as needed and justified.

It is possible to argue that the expertise and cost requirements of integrated technology leveraging in **Table 4** may be currently overwhelming for many infrastructure owners. However, in the case of major infrastructure components, such as long-span bridges, viaducts and tunnels that cost billions of dollars and that are critical for the economic vitality of major urban regions, it is difficult not to justify moving toward the technology-based asset management suggested in **Table 4**. In the case of typical highway bridges, **Table 4** may be applied to a selected population, which may provide an excellent asset management procedure for the entire population.

### 8. SUMMARY AND ACKNOWLEDGMENTS

Critical infrastructure systems like water, power, communication and transportation are key for the livability, sustainability, and resilience of urban regions. Together with the natural and the built environments, society, economy and government services, infrastructure services serve as the foundation of our cities. Meanwhile, stresses due to urbanization and climate change are challenging the performance of urban infrastructure systems and their services.

Infrastructure systems are intertwined, interdependent, multiscale, multi-domain and complex, where system behavior cannot be predicted even when element behaviors are known. The complexity is compounded in dense urban regions where infrastructures are bundled in close proximity, where the failure of one impacts all others.

There is now recognition that infrastructure systems need to be managed just like financial assets, yet we are a long way from measurement-based, objective and reliable metrics for documenting their value, performance and condition; as well as how changes in their condition may impact their performance. In this paper writers strive to offer a perspective for meaningful asset management of infrastructures and describe the tools that are needed for rectifying the current shortcomings mainly arising from subjective and incomplete performance and condition evaluation.

By defining "Infrastructure Performance" and "Technology Leveraging" for performance and condition evaluation in terms of rational indices, the paper describes how asset management can be based on objective data in addition to tacit knowledge and how data may then be transformed into explicit knowledge. It would take decades before such a transformation may be completed; however, there has to be a foundation for proper technology leveraging. In this section the writers wish to acknowledge a few of their colleagues who have made significant contributions to the state of the art in the structural system-identification (St-Id) concept, experimental, analytical and computational arts, reliability and decision theory, lifecycle cost analysis and asset management concepts.

In the case of the structural system-identification concept as applied to constructed systems, we recognize Agbabian et al. (1991), Shinozuka and Ghanem (1995), Ibáñez (1973), Hart and Yao (1977), Beck and Jennings (1980), Yun and Shinozuka (1980), Shinozuka et al. (1982), and Yao and Natke (1994) as significant contributions. Actual field experiments on full-scale constructed systems were pioneered by the late Professors Hudson (1964) from CALTECH and Clough from UC Berkeley, who collaborated in developing rotary-weight shakers for dynamic testing of buildings and dams in the 1960s. Such devices were also used for destructive tests of a decommissioned structure (Galambos and Mayes, 1978). Proper implementation of St-Id requires careful modeling of constructed systems (Catbas et al., 2013), adequate field testing capabilities (Aktan et al., 2016b), data interpretation (Law et al., 2014; Smith, 2016) and robust parameter identification strategies (Rafael and Smith, 2003; Posenato et al., 2008; Goulet et al., 2010).

TABLE 4 | Integration of technology tools for bridge asset management: risk based decision making, info-technology, modeling/simulation, and experimental arts.

During the 1970s, mechanical engineers interested in experimental structural dynamics developed the art of modal analysis, and civil engineers started exploring how modal analysis theory may be applied to structural identification of constructed systems (the first IMAC conference was held in 1982 in Orlando, FL). Brown and Allemang at the University of Cincinnati offered a review of the history of modal analysis at IMAC in 2007 (Brown and Allemang, 2007)<sup>3</sup>.

In the 1990s, engineers from different disciplines embarked on an exploration of health monitoring as a research area. The First International Workshop on Structural Health Monitoring (IWSHM) was held in 1997 at Stanford University, organized by Professor Fu-Kuo Chang, following the first Non-Destructive Testing in Civil Engineering Conference held in Berlin in 1995 and organized by BAM (Schickert, 1997). These milestones of technology applications for constructed systems were followed by remarkable research efforts, summarized in reports which captured the goals and the potential of SHM for civil and other structures (Farrar, 2001; Chang et al., 2003; Brownjohn, 2006; Farrar and Worden, 2006). Sensing systems capable of being deployed for long periods of time were demonstrated on constructed systems (Glisic et al., 2013; Sigurdardottir and Glisic, 2013; Leung et al., 2015).

<sup>3</sup>http://www.sandv.com/downloads/0701alle.pdf

While in-service or decommissioned constructed systems have been tested by perturbing with many methods (shakers, loaded trucks, pull-release by cranes, rock-anchors and actuators, implosion, etc.), in the case of long-span bridges and high-rise buildings perhaps the only practical approach is measuring their ambient vibrations caused by operations and ambient environmental loads, such as wind. This approach is also referred to as operational modal analysis (Abdel-Ghaffar and Housner, 1977, 1978; Peeters and Roeck, 2001; Ko et al., 2002; Grimmelsman et al., 2007; Conte et al., 2008; Siringoringo and Fujino, 2008; Pakzad and Fenves, 2009; Brownjohn et al., 2010). In the case of ambient monitoring of buildings, exemplary research by Kareem et al. (1999) should be mentioned.
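To illustrate the basic idea behind extracting a natural frequency from ambient vibration data, the sketch below applies simple spectral peak picking to a synthetic record. Peak picking is only one elementary technique and is not claimed to be the method used in the cited studies; the signal, its two mode frequencies (2.5 Hz and 7.0 Hz), the noise level and the sampling rate are all assumed values. A direct DFT is used to keep the example dependency-free; practical work would normally use an FFT library and averaged spectra.

```python
import cmath
import math
import random


def periodogram(signal, sample_rate_hz):
    """One-sided periodogram from a direct DFT (O(N^2), fine for a short sketch)."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the DC offset
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        freqs.append(k * sample_rate_hz / n)
        power.append(abs(coeff) ** 2 / n)
    return freqs, power


def dominant_frequency(signal, sample_rate_hz):
    """Frequency of the largest spectral peak (simple peak picking)."""
    freqs, power = periodogram(signal, sample_rate_hz)
    k_max = max(range(len(power)), key=lambda k: power[k])
    return freqs[k_max]


# Synthetic stand-in for an ambient acceleration record: two hypothetical
# modes plus broadband noise from a seeded generator for repeatability.
rng = random.Random(0)
fs, n = 64.0, 256
record = [
    math.sin(2 * math.pi * 2.5 * i / fs)
    + 0.4 * math.sin(2 * math.pi * 7.0 * i / fs)
    + 0.1 * rng.gauss(0.0, 1.0)
    for i in range(n)
]

f_peak = dominant_frequency(record, fs)
print(f"dominant frequency: {f_peak:.2f} Hz")
```

The frequency resolution here is the sampling rate divided by the record length (0.25 Hz), which is why long records are needed to separate closely spaced modes of long-span bridges and tall buildings.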

In the last 20 years, exploring structural health monitoring applications by leveraging wireless accelerometers became a trend. Among the many research groups who have advanced SHM for bridges leveraging untethered sensing systems we recognize Straser et al. (2001), Lynch et al. (2004), Lynch et al. (2005), Lynch and Koh (2006), Lynch (2006), Kim et al. (2007), Pakzad et al. (2008), Jang et al. (2010), Jo et al. (2010), Meyer et al. (2010), Feltrin et al. (2011), Spencer et al. (2016), Zhang et al. (2016), O'Connor et al. (2017), Dragos and Smarsly (2017), Moreu et al. (2017), Noel et al. (2017).

Finally, we should acknowledge the significant body of research dedicated to structural reliability, lifecycle cost analysis and management of bridges championed by many researchers, such as Yanev (2001), Frangopol and Liu (2007), Thoft-Christensen (2012) and Yuan et al. (2017) amongst many others. Without these major contributions it would not have been possible for the writers to make their contributions to structural identification for health monitoring and asset management of bridges.

### 9. CONCLUSIONS AND RECOMMENDATIONS

The discussions offered in this white paper are intended to start a conversation on making infrastructure asset management decisions, in general and for bridges and other highway structures in particular, based on objective data on structural performance and condition obtained from objective indices measured in the field. We cannot eliminate visual bridge inspections that are conducted based on NBIS (National Bridge Inspection Standards) every 2 years, but we may make them more effective by leveraging technology and even extend the inspection interval to 5 years, as it is in the EU and the Far East for many common bridges.

One possible way to augment current bridge inspection and condition rating practice is by making objective measurements that would describe the "pulses or health-signs" of a bridge during or before an inspection. Sensors for making measurements of strains, displacements, rotations and accelerations at any point of a bridge have been available for decades. These sensors have now been transformed to wireless and practical deployment. It should be possible to train a sufficient number of expert bridge inspectors who are capable of making such measurements. On the other hand we have to caution that sensing and imaging is not as simple as buying and installing a sensor. There are too many examples of failed technology applications that did not lead to any meaningful and actionable information. This has become a major impediment for infrastructure owners and managers buying in to sensing and measurements.

This paper detailed the theoretical background and the training that is essential for educating a new breed of civil engineers who can properly leverage technology. The challenges that have to be overcome for completing a transformation of infrastructure asset management are:

	- a. Comprehend the elements and mechanisms through which bridges actually carry their intrinsic and live loads;
	- b. 3D FEM modeling and analysis of typical bridges;
	- c. What are the pulses of a bridge (critical locations and strains, displacements, tilts, and accelerations at these locations along with the causes of these responses);
	- d. What measurements mean in relation to the serviceability, durability and safety performances of the bridge in terms of its live load and intrinsic stresses and the changes in these due to changes in environmental and live load effects.

In spite of the challenges in the training of a new generation of bridge inspectors and engineers, infrastructure capital needs and increasing backlog in infrastructure renewal are forcing us to bring infrastructure asset management to a rational objective platform.

There is a lot that FHWA can do to facilitate objective data-driven asset management. For example, the reference bridges that will serve the LTBP program research may also serve as field laboratories for demonstrations and training of DOT engineers and bridge inspectors. By instrumenting these bridges using wireless sensors, FHWA may greatly simplify taking measurements for LTBP data collection, gaining considerable advantage and cost reduction. FHWA can also support best-practice demonstrations and model standards for technology leveraging that can be adopted by AASHTO as Standards or Recommendations.

Writers fully recognize that it may take many years if not decades until civil engineering education and practice is reformed and civil engineers can take an effective lead role in guiding government, infrastructure owners, regulators and stakeholders in cost-effective and reliable asset management of infrastructures as complex systems. However, we cannot delay the reform if we are interested in livability, sustainability, and resilience of our urban regions.

### AUTHOR CONTRIBUTIONS

AA drafted the entire paper and is the primary contributor to the manuscript. IB has led a Research Project on the development of wireless sensing units, the results of which are presented in the paper and contributed in drafting several Figures. SK was a key individual in the improvement of the wireless sensors. He has been critical in performing the quality control of the paper and in formatting the document following the guidelines of Frontiers.

### ACKNOWLEDGMENTS

Writers are grateful to the National Science Foundation; Dr. Steven Chase (former FHWA Chief Science Officer), Dr. Hamid Ghasemi (Former Program Manager of the LTBP Program) and current FHWA EARP Manager David Kuehn, and Program Managers Dr. Jalinoos, Mr. Faridazar, and Dr. Azari for their support and guidance of their past and ongoing research projects involving technology leveraging for infrastructure asset management. Writers would also like to acknowledge their research partners Dr. Moon, Dr. Sumitro, and Mr. Matsumoto as well as graduate students Shi Ye, Qiang Mao, Mustafa Furkan, and Xiangang Lai who contributed to their ongoing research projects (NSF project 0855023 and FHWA project DTFH61-13-C-00021-Drexel Univ-SF26-1).

### REFERENCES


Stoll, J. D. (2018). Facebook's Hard Fall Shows the Pitfalls of Big Data. Available online at: https://www.wsj.com/articles/facebooks-hard-fall-showsthe-pitfalls-of-big-data-1532750496?mod=hp\_lead\_pos3 (accessed July 18, 2018).


Zhang, Y., Kurata, M., and Lynch, J. P. (2016). Long-term modal analysis of wireless structural monitoring data from a suspension bridge under varying environmental and operational conditions: System design and automated modal analysis. J. Eng. Mech. 143:04016124. doi: 10.1061/(ASCE)EM.1943-7889.0001198

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Aktan, Bartoli and Karaman. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Ambient Vibration Measurement Data of a Four-Story Mass Timber Building

#### Ignace Mugabo<sup>1,2</sup>, Andre R. Barbosa<sup>1</sup> \*, Mariapaola Riggio<sup>2</sup> and James Batti<sup>1</sup>

<sup>1</sup> School of Civil and Construction Engineering, Oregon State University, Corvallis, OR, United States, <sup>2</sup> Department of Wood Science and Engineering, Oregon State University, Corvallis, OR, United States

Keywords: ambient vibration testing, cross-laminated timber, light-frame construction, mass timber construction, operational modal analysis

### INTRODUCTION

Ambient vibration from wind, traffic, and human activities results in low levels of building motion. Data collected under ambient vibration have therefore been used to extract dynamic characteristics of a wide array of civil engineering structures. Among these, some dynamic characterization tests have been performed on timber building structures, but only a few focused on mass timber structures using different types of engineered wood products for the gravity and/or lateral load resisting systems (Worth et al., 2012; Reynolds et al., 2015, 2016). Since there is growing interest in the design and construction of mass timber structures for multi-story buildings, it is important to provide dynamic data obtained from actual constructed projects.

#### Edited by:

Eleni N. Chatzi, ETH Zürich, Switzerland

#### Reviewed by:

Hilmi Luş, Boğaziçi University, Turkey; Shinta Yoshitomi, Ritsumeikan University, Japan; Irwanda Laory, University of Warwick, United Kingdom

\*Correspondence: Andre R. Barbosa Andre.Barbosa@oregonstate.edu

#### Specialty section:

This article was submitted to Structural Sensing, a section of the journal Frontiers in Built Environment

Received: 25 January 2019 Accepted: 07 May 2019 Published: 22 May 2019

#### Citation:

Mugabo I, Barbosa AR, Riggio M and Batti J (2019) Ambient Vibration Measurement Data of a Four-Story Mass Timber Building. Front. Built Environ. 5:67. doi: 10.3389/fbuil.2019.00067

The dynamic features of buildings have led researchers and design professionals to gain knowledge of how structures behave under ambient excitations, offering guidance on serviceability performance and insights into the structural performance of structures under high winds and seismic loading. Serviceability performance is of particular relevance for mass timber buildings when compared to conventional concrete and steel buildings, due to the lack of available data. Dynamic data for this type of building are especially necessary to understand the influence of overall mass, stiffness, and mass distribution on the modal parameters of the structure (frequencies, damping, and mode shapes). The availability of these data can contribute to the effective design of mass timber buildings for ambient and wind-induced vibrations and the consequent occupant comfort and safety.

In the past decade, cross-laminated timber (CLT) has been adopted for structural assemblies of some North American multi-story buildings. One of these projects is the "Albina Yard," a four-story building located in Portland, OR, which is the first multi-story construction built using U.S.-manufactured CLT floor systems. There is currently a lack of open-access test data on existing buildings using mass timber technology. This study addresses this research need. Ambient vibration data were collected on the "Albina Yard," providing one of the few dynamic datasets for a U.S. building constructed using CLT members. An application and discussion of these data using two operational modal analysis (OMA) methods is presented in Mugabo et al. (2019)<sup>1</sup>. In the cited study, dynamic characteristics of the Albina Yard building were obtained using the Enhanced Frequency Domain Decomposition (EFDD; Brincker et al., 2001) and the Stochastic Subspace Identification (SSI; Brincker and Andersen, 2006) methods. However, other OMA approaches could be used and their results compared to those presented in Mugabo et al. (2019)<sup>1</sup>. In addition, the collected data can be used for benchmarking future finite element models that estimate building performance under extreme loading. For this reason, the complete dynamic dataset for the Albina Yard building is provided to the wider research community for further investigation and dynamic data analyses.

<sup>1</sup>Mugabo, I., Barbosa, A. R., and Riggio, M. (2019). Dynamic characterization and vibration analysis of a four-story mass timber building. Submitted to Frontiers in Built Environment.

This data report aims at guiding interested parties to the dataset and providing essential information for understanding the data. Value and use of the data are listed below:


### DESCRIPTION OF DATASET

The "Albina Yard" building has a plan area of approximately 27.20 × 13.95 m per floor and a total height of approximately 15.39 m over four stories. The gravity load bearing system consists of Douglas-fir (DF, Pseudotsuga menziesii) cross-laminated timber (CLT) floors supported on DF glued laminated (glulam) timber frames. Light-frame double-sheathed shear walls are used for the lateral force resisting system. The building envelope consists mainly of window glazing on the East and West façades and metal cladding walls on the North and South faces. **Figure 1A** presents a view of the building's eastern side. Drawings of the tested building can be found in the dataset described below. More information on the structural details is presented in Mugabo et al. (2019)<sup>1</sup>.

The dynamic data collection took place in January 2017, shortly after the building was commissioned. Data were collected on a weekend day to avoid interference with occupants' activities and to minimize the input from human-induced vibrations. The building was tested under ambient conditions (vibration from external environmental sources, such as road traffic and wind). Approximately 2 h of ambient acceleration data were collected using 16 uni-axial accelerometers (**Figure 1B**) and one tri-axial accelerometer. The uni-axial accelerometers have ∼1,000 mV/g sensitivity, an acceleration measurement range of ±5 g peak, a frequency range of 0.06–450 Hz, and a broadband resolution of 0.000003 g root mean square (RMS). The tri-axial accelerometer has ∼100 mV/g sensitivity, an acceleration measurement range of ±50 g peak, a frequency range of 0.6–5,000 Hz, and a broadband resolution of 0.0002 g RMS. For the tri-axial accelerometer, only the measurements corresponding to the horizontal directions were recorded. The accelerometers were connected through coaxial cables (with a resistivity of 50 Ohms) to a portable data acquisition system (National Instruments, NI cDAQ-9174), which transmitted the data to a laptop computer running NI LabVIEW SignalExpress 2014 software (National Instruments, 2013). As-recorded raw data, without any filtering, are included in the presented dataset.
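The nominal sensitivities quoted above are enough to sketch how a channel's voltage maps to acceleration. The snippet below is a minimal illustration, assuming the raw channels are stored as voltages; the dataset's metadata file states the actual units, and the function names and the 2.5 mV reading are hypothetical.

```python
# Nominal sensitivities quoted in the text (mV per g). Per-sensor calibration
# values, if available, would replace these.
UNIAXIAL_MV_PER_G = 1000.0
TRIAXIAL_MV_PER_G = 100.0


def volts_to_g(volts, sensitivity_mv_per_g):
    """Convert a sensor output voltage (V) to acceleration (g)."""
    return volts * 1000.0 / sensitivity_mv_per_g


def full_scale_volts(range_g, sensitivity_mv_per_g):
    """Output voltage (V) at the peak of the measurement range."""
    return range_g * sensitivity_mv_per_g / 1000.0


# Hypothetical raw reading of 2.5 mV on a uni-axial channel.
example_g = volts_to_g(0.0025, UNIAXIAL_MV_PER_G)

# Both sensor types reach the same output voltage at full range.
uniaxial_full_scale = full_scale_volts(5.0, UNIAXIAL_MV_PER_G)
triaxial_full_scale = full_scale_volts(50.0, TRIAXIAL_MV_PER_G)
```

With the nominal values, both sensor types produce about ±5 V at the peak of their measurement range, so the tri-axial sensor trades a tenfold coarser sensitivity for a tenfold wider range.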

Overall, on each floor level, vibration measurements were taken at the northwestern corner (between farthest beam to column connection and drywall), center point (utility room closet), and/or southeastern corner of the building (between farthest beam to column connection and wall). The utility room is not at the geometric center of mass of the floor plan of the building; however, it is estimated to be the closest accessible location to the planar center of mass. At each of these three locations, acceleration values were taken in the EW and NS directions of the building plan.

Data collection was conducted in two setups: setup 1 and setup 2. In setup 1, the accelerometers were attached on the underside of the roof, fourth floor, and third floor levels. During setup 2, four accelerometers were moved from the fourth level to the second level. The remaining accelerometers shared the same locations for both setups. It is worth noting that the northwestern corner of the second floor level was not instrumented because it was not accessible during the testing period. For each setup, the data were collected for ∼1 h, with a sampling frequency of 2,048 Hz. **Figure 2** is provided to highlight the quality of the data based on two channels, labeled channels 2 and 6, taken during setup 1. Channels 2 and 6 measured the NS accelerations of the roof's northwestern and southeastern corners, respectively.
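The reported sampling frequency and record length fix the basic bookkeeping for any spectral analysis of these records. The sketch below works through it; the 1 h duration is the approximate figure from the text, and the 0.01 Hz line spacing is a hypothetical target chosen only for illustration.

```python
FS_HZ = 2048        # sampling frequency reported in the text
DURATION_S = 3600   # roughly 1 h per setup (approximate in the text)


def samples_per_channel(fs_hz, duration_s):
    """Number of samples recorded on one channel."""
    return int(fs_hz * duration_s)


def nyquist_hz(fs_hz):
    """Highest frequency representable at this sampling rate."""
    return fs_hz / 2.0


def segment_length(fs_hz, target_df_hz):
    """Samples per segment needed for a spectral line spacing of target_df_hz."""
    return int(round(fs_hz / target_df_hz))


n_total = samples_per_channel(FS_HZ, DURATION_S)
n_seg = segment_length(FS_HZ, 0.01)   # hypothetical 0.01 Hz spacing
n_blocks = n_total // n_seg           # non-overlapping segments available
```

Under these assumptions each channel holds about 7.4 million samples per setup, and a 0.01 Hz spacing would leave 36 non-overlapping 100 s segments for averaging.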

The dataset has the following digital object identifier<sup>2</sup> and is accessible for download by the general public. The dataset is named "Albina Yard Building Structural Monitoring and Behavior Dataset" and is located on the Open Science Framework repository. Two main folders can be found at this repository: one for the collected data, and the other for the supporting setup information. The collected data folder, denoted as "Ambient Vibration Acceleration Data," contains two zipped files, one for each setup. The "Supporting Information" folder contains floor and elevation sketches showing the locations of the accelerometers, and a file with coordinates of the accelerometers' locations in relation to a column located at the SW corner of the building. The "Supporting Information" folder also contains six figures that present power spectral densities of all the channels' acceleration data taken during setup 1. The dataset contains a metadata file that provides units and abbreviations, a general data description, the folder hierarchy, and line-by-line information on the data files.

### POTENTIAL USES

The dataset can be used for a variety of purposes related to the field of operational modal analysis (OMA). A previous operational modal analysis of these data was conducted by Mugabo et al. (2019)<sup>1</sup> using two OMA methods: the enhanced frequency domain decomposition and the stochastic subspace identification. The data can also be analyzed using the many other available OMA methods (Reynders, 2012). Tasks that may interest researchers using these data include studying optimal sensor locations and noise induced by mechanical vibration. In addition, these data can be used to support finite element model development and updating, since building information is also provided and included in Mugabo et al. (2019)<sup>1</sup>.
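As a starting point for such analyses, the sketch below estimates a power spectral density by averaging Hann-windowed periodograms and picks the dominant peak. It is a plain averaged periodogram, not the EFDD or SSI method used in the cited study, and it runs on a synthetic signal: the 64 Hz sampling rate, the 3.0 Hz and 7.5 Hz tones, and the noise level are hypothetical and do not represent the building.

```python
import cmath
import math
import random


def averaged_periodogram(x, fs, nseg):
    """One-sided PSD from non-overlapping Hann-windowed segments (nseg even)."""
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * i / nseg) for i in range(nseg)]
    scale = 1.0 / (fs * sum(w * w for w in window))
    n_blocks = len(x) // nseg
    half = nseg // 2
    # DFT twiddle factors, computed once and reused for every bin.
    twiddle = [cmath.exp(-2j * math.pi * i / nseg) for i in range(nseg)]
    psd = [0.0] * (half + 1)
    for b in range(n_blocks):
        seg = x[b * nseg:(b + 1) * nseg]
        mean = sum(seg) / nseg
        seg = [(s - mean) * w for s, w in zip(seg, window)]
        for k in range(half + 1):
            acc = sum(s * twiddle[(k * i) % nseg] for i, s in enumerate(seg))
            psd[k] += abs(acc) ** 2 * scale / n_blocks
    # Fold negative frequencies into the one-sided estimate (not DC or Nyquist).
    for k in range(1, half):
        psd[k] *= 2.0
    freqs = [k * fs / nseg for k in range(half + 1)]
    return freqs, psd


# Synthetic stand-in for one acceleration channel (all values hypothetical).
random.seed(0)
fs = 64.0
signal = [
    math.sin(2.0 * math.pi * 3.0 * i / fs)
    + 0.5 * math.sin(2.0 * math.pi * 7.5 * i / fs)
    + random.gauss(0.0, 0.2)
    for i in range(2048)
]
freqs, psd = averaged_periodogram(signal, fs, 256)
peak_hz = freqs[max(range(len(psd)), key=psd.__getitem__)]
```

The naive DFT keeps the example dependency-free but scales poorly; for the full 2,048 Hz records an FFT-based routine from a numerical library would be the practical choice.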

### AUTHOR CONTRIBUTIONS

The data described in this report are related to a series of experiments devised within the framework of a multiscale research project, SMART-CLT, aimed at establishing a holistic performance-monitoring protocol for mass timber buildings. The project was led by MR, with co-PI AB as main expert on dynamic monitoring. IM is, at the time of writing, a Ph.D. candidate in a dual major in Civil Engineering and Wood Science at Oregon State University. IM is co-advised by AB and MR. The experimental study conducted at Albina Yard will also be part of his Ph.D. thesis titled Multiscale Dynamic Monitoring and Behavior of Cross-Laminated Timber Elements and Systems. JB participated in the design of the experimental setup, troubleshooting, and data collection. All authors participated in the design of the dynamic monitoring plan and sensor setup, and in the data acquisition phase.

### FUNDING

Funding for this study was provided by the U.S. Department of Agriculture Agricultural Research Service (USDA ARS) Agreement No. 58-0202-5-001 through the TallWood Design Institute at Oregon State University. This study was also funded by the McIntire Stennis project (contract number 1009740) provided by the National Institute of Food and Agriculture, U.S. Department of Agriculture. The first author received a Graduate Teaching Assistantship from the School of Civil and Construction Engineering at Oregon State during the year in which data collection was conducted.

<sup>2</sup>https://doi.org/10.17605/OSF.IO/34UB6

### ACKNOWLEDGMENTS

The authors would like to acknowledge the individuals who helped with the data collection, namely Dr. Rajendra Soti, Dr. Leonardo Rodrigues, and Evan Schmidt. The authors would like to thank the building owner (reworks Inc.) and Lever Architecture for providing access to the building.

### REFERENCES

Proc. Inst. Civ. Eng. Constr. Mater. 168, 121–131. doi: 10.1680/coma.14.00047

Worth, M., Gaul, A., Jager, S., Omenzetter, P., and Morris, H. (2012). "Dynamic performance assessment of a multistorey timber building via ambient and forced vibration testing, continuous seismic monitoring and finite element model updating," in World Conference on Timber Engineering 2012. Auckland, New Zealand.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Mugabo, Barbosa, Riggio and Batti. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Dynamic Characterization and Vibration Analysis of a Four-Story Mass Timber Building

#### Ignace Mugabo<sup>1,2</sup>, Andre R. Barbosa<sup>1</sup> \* and Mariapaola Riggio<sup>2</sup>

*<sup>1</sup> School of Civil and Construction Engineering, Oregon State University, Corvallis, OR, United States, <sup>2</sup> Department of Wood Science and Engineering, Oregon State University, Corvallis, OR, United States*

Mass timber construction has been gaining momentum in multi-story residential and commercial construction sectors in North America. As taller mass timber buildings are being planned and constructed, *in-situ* dynamic tests of this type of construction can be performed to further validate their design and use. As part of this larger effort, an *in-situ* dynamic characterization testing campaign based on ambient vibration measurements was conducted on a recently constructed four-story mass timber building located in Portland, Oregon. The building features cross-laminated timber (CLT) floors, a glued laminated timber (GLT) framing gravity system, and light-frame shear walls with steel HSS hold-downs that compose the lateral resisting system of the building. Ambient vibration acceleration data were collected using 18 accelerometers that were wired to a portable data acquisition system in two experimental setups. Approximately 2 h of bi-directional horizontal acceleration data were recorded. In this paper, two operational modal analysis methods are used for estimating the modal parameters (frequency, damping, and mode shapes) based on the data collected. In addition, a multi-stage linear Finite Element (FE) model updating procedure is presented for this building type, and the FE estimates of frequencies and mode shapes are compared to estimates from the collected data. The calibrated FE model provides confidence in the operational modal results and presents a comprehensive modal characterization of the building. At ambient levels of excitation, the developed FE model suggests that the stiffness of the non-structural elements, such as the exterior wall cladding and glazing, affects the modal response of the building considerably. Lessons learnt from this unique, first-of-a-kind four-story structure constructed in the United States and implications for taller mass timber buildings are summarized and provide valuable insight for the design and assessment of this building type under future dynamic excitation events.

Keywords: cross-laminated timber, enhanced frequency domain decomposition, finite element modeling, light-framed shear walls, mass timber building, operational modal analysis, stochastic subspace identification

## INTRODUCTION

The last decade has been marked by a rise in interest in and use of mass timber construction in North America (Pei et al., 2016). This rise is driven by a range of innovative wooden structural products such as cross-laminated timber (CLT) (Gagnon and Popovski, 2011) and mass plywood panel (MPP) (Freres, 2018), and by more traditional wooden products such as glued laminated timber (GLT). These products are typically used in structural systems in conjunction with other wooden and non-wooden structural members. One example of such a combination is the use of CLT walls with light-frame shear walls (Nguyen et al., 2018). In the past, light-frame shear wall systems have been used extensively in the residential industry, typically for one- to two-story homes, but also in the construction of multi-story timber structures up to five stories high. The use of mass timber structural products along with light-frame shear wall systems presents a new opportunity to expand light-frame construction to a larger variety of occupancy types and to greater building heights. This new opportunity calls for a better understanding of the lateral dynamic behavior of this combined mass timber/light-frame structural system, especially in actual constructed facilities.

#### Edited by:

*Costas Papadimitriou, University of Thessaly, Greece*

#### Reviewed by:

*Ertugrul Taciroglu, University of California, Los Angeles, United States Agathoklis Giaralis, City University of London, United Kingdom*

\*Correspondence: *Andre R. Barbosa Andre.Barbosa@oregonstate.edu*

#### Specialty section:

*This article was submitted to Structural Sensing, a section of the journal Frontiers in Built Environment*

Received: *19 January 2019* Accepted: *18 June 2019* Published: *04 July 2019*

#### Citation:

*Mugabo I, Barbosa AR and Riggio M (2019) Dynamic Characterization and Vibration Analysis of a Four-Story Mass Timber Building. Front. Built Environ. 5:86. doi: 10.3389/fbuil.2019.00086*

The dynamic behavior of a structure can be evaluated given three types of external excitations: (1) free vibration, with the structure subjected to initial input(s) only; (2) forced vibration, with the structure subjected to continuous input(s); and (3) ambient vibration, with the structure responding to ambient loads such as wind, traffic, and/or human activities. Ambient vibration testing offers a means to evaluate dynamic parameters without causing excitation-induced discomfort to occupants and without the risk of excitation-induced damage to the structure. When using ambient vibration testing, output-only methods known as operational modal analysis (OMA) are typically used to identify the natural frequencies, damping ratios, and mode shapes of the structural system. Several OMA methods have been developed over the past decades. Among the widely used methods are the Enhanced Frequency Domain Decomposition (EFDD) (Brincker et al., 2001) and the Stochastic Subspace Identification (SSI) (Brincker and Andersen, 2006). Damping ratio estimates from ambient vibration testing using the above-mentioned OMA methods have shown considerable uncertainty (Magalhães et al., 2010; Moaveni et al., 2014). Magalhães et al. (2010) simulated the effects of non-proportional damping and closely spaced natural frequencies on the damping estimates of the EFDD and SSI methods, and results indicated that the SSI method was more accurate than the EFDD method in evaluating the damping ratios of highly complex and non-proportionally damped simulated data. That study also evaluated the variability in damping ratios of three large civil engineering structures tested under ambient vibration using the SSI method. Large variability in damping ratios was observed, with a standard deviation as high as 52% of the mean damping ratio. Similar uncertainty was observed in the Moaveni et al. (2014) study.
In an effort to draw comprehensive conclusions on modal damping ratios, Satake et al. (2003) compiled natural periods and damping data of 284 structures with heights ranging mostly between 50 and 150 m. High correlations were observed between: (1) height of buildings and fundamental translational periods, (2) fundamental translational and torsional periods, and (3) fundamental periods and higher mode periods. Results in Satake et al. (2003) suggested that natural periods could be well approximated as a function of height. In general, reinforced concrete buildings had damping ratios above 2%, while steel-framed buildings had damping ratios below 2%. It was also noted that the first mode damping ratios were inversely proportional to the height of the building. Office buildings, which tend to have fewer non-structural walls than apartment and hotel buildings, exhibited slightly lower first mode damping ratios than the apartment and hotel buildings.

The variations in modal parameters due to environmental loads (temperature, rain, wind), seismic ground shaking of varying intensity, and seismic retrofits have been extensively studied in the past. Clinton et al. (2006) reported on the observed changes in natural frequencies of two buildings located at the California Institute of Technology. Over a span of 36 years, one of the buildings, the Millikan Library, experienced a decrease of 22 and 12% in the East-West and the North-South fundamental frequencies, respectively. The permanent reductions in frequencies were attributed to several moderate strong motions that the building experienced over the 36-year span. Other factors such as heavy rain and strong winds also produced temporary changes in natural frequencies. Results indicated that natural frequencies increased up to 3% following heavy rain events. In another study conducted by Nayeri et al. (2008), ambient vibration measurements on a 15-story steel frame structure were collected over a 50-day period. Changes in natural frequencies were mostly small, with coefficients of variation (CVs) on the order of 1 to 2%, while CVs of the damping ratios were in the 20 to 70% range. Diurnal natural frequency variations ranging from 1 to 4% were observed, resulting from changes in temperature during the day. During and following the 1989 Loma Prieta earthquake, Çelebi et al. (1993) collected strong and ambient motion measurements on five San Francisco Bay Area buildings that exhibited no visible damage following the earthquake. For each of the five buildings, the fundamental frequencies obtained during the strong motion responses were lower than those obtained during ambient vibration testing. The ratios of ambient to strong motion fundamental frequency ranged from 1.14 to 1.47.
The difference in fundamental frequencies could be the result of several factors such as: (1) soil-structure interaction, (2) non-linear structural behavior, (3) slip of steel connections, and (4) interactions between structural and non-structural elements. Michel et al. (2009) compared weak earthquake and ambient vibration recordings of a 13-story permanently monitored reinforced concrete building in Grenoble, France. Decreases of up to 3% in natural frequencies were observed using ground motion measurements compared to ambient vibrations. With respect to the effect of seismic retrofits, Ivanović et al. (2000) described changes in natural frequencies of a severely damaged seven-story reinforced concrete building following the 1994 Northridge earthquake and its aftershocks. Ambient vibration measurements following (1) the main event and (2) one of the main aftershocks indicated that the natural frequencies of the building increased up to 10% during the second data collection, most likely due to the additional wooden braces added near structurally damaged areas. Soyoz et al. (2013) evaluated the effects of retrofitting a non-ductile reinforced concrete building. The various steps of the retrofitting effort included (1) the removal of infill masonry walls, (2) the addition of column jackets, and (3) the addition of structural walls. The removal of infill masonry walls decreased the fundamental frequency by 11%, and the subsequent retrofit increased the fundamental frequency by 96% in relation to the fundamental frequency after infill removal.
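To give a feel for what the reported frequency changes imply, the sketch below converts a percent change in natural frequency into an implied stiffness ratio. It assumes a single-degree-of-freedom idealization in which frequency scales with the square root of stiffness over mass and the mass stays unchanged. That assumption is not made in the cited studies, and retrofits or added finishes also change mass, so the numbers are only rough indicators. The function name is hypothetical.

```python
def implied_stiffness_ratio(freq_change_percent):
    """Stiffness after / stiffness before for a percent change in frequency.

    Assumes f is proportional to sqrt(k / m) with the mass m unchanged.
    """
    ratio = 1.0 + freq_change_percent / 100.0
    return ratio ** 2


# Percent changes in fundamental frequency quoted in the text.
cases = {
    "Millikan Library, East-West": -22.0,
    "Millikan Library, North-South": -12.0,
    "Infill masonry wall removal": -11.0,
    "Retrofit after infill removal": 96.0,
}

implied = {label: implied_stiffness_ratio(pct) for label, pct in cases.items()}

for label, k_ratio in implied.items():
    print(f"{label}: implied stiffness ratio = {k_ratio:.2f}")
```

Under this idealization, the 22% frequency drop corresponds to a stiffness ratio of about 0.61, and the 96% increase after the retrofit to a ratio of about 3.84.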

Several studies have evaluated the ambient dynamic behavior of light-frame shear wall buildings (Ellis and Bougard, 2001; Camelo et al., 2002; Steiger et al., 2016; Hafeez et al., 2018). Ellis and Bougard (2001) performed dynamic testing of a six-story light-frame timber building, measuring the fundamental frequencies and evaluating the lateral stiffness at several stages of construction. As construction progressed, the fundamental frequencies of the building increased as non-structural components such as staircases, interior plasters, and brickwork were added. The effects of non-structural components on the behavior of structures under ambient vibrations have been observed in several studies, independently of the lateral resisting system used (Clinton et al., 2006; Li et al., 2011; Asgarian and McClure, 2012; Devin and Fanning, 2012; Assi et al., 2016). Li et al. (2011) developed linear elastic finite element models of four tall buildings, some of which featured non-structural components such as an aluminum façade and infill walls. A stiffness increase of as much as 60% from the non-structural components was observed. Devin and Fanning (2012) conducted ambient vibration testing on a four-story reinforced concrete frame structure with a lateral force resisting system consisting of a set of reinforced concrete cores. A linear elastic FE model of the structural system produced a natural frequency 24% lower than the values estimated from ambient vibrations. This difference was attributed to the stiffness contribution of non-structural components. The differences in fundamental frequencies of light-frame shear wall structures tested under different levels of excitation have also been acknowledged by other studies (Kharrazi and Ventura, 2006; Hafeez et al., 2018).
Notably, Kharrazi and Ventura (2006) suggested a simple equation relating the fundamental frequencies of light-frame low-rise structures obtained from ambient vibration to the ones obtained from forced vibrations dynamic characterization testing. Hafeez et al. (2018) evaluated fundamental periods and modal damping of 47 wood-frame buildings under ambient vibrations. The authors provided a fundamental period relationship based on the Rayleigh quotient (Chopra, 2012) and extended an equation developed in Kharrazi and Ventura (2006) for estimating fundamental frequencies of light-frame shear wall structures.

Extensive research has focused on modeling of wooden structural systems (Tarabia and Itani, 1997; Folz and Filiatrault, 2004a,b; Collins et al., 2005a,b). Noteworthy examples of interest to this paper are the studies by Folz and Filiatrault (2004a,b), in which a numerical modeling approach for predicting the dynamic response of light-frame shear wall building systems was developed and validated. This modeling approach considers rigid horizontal diaphragms and non-linear lateral load resisting shear wall models, which correspond to shear spring elements connecting adjacent horizontal diaphragms or horizontal diaphragms to the foundation. The modeling approach was validated in Folz and Filiatrault (2004b). For this validation effort, two construction phases were chosen as comparison points: first, after the two-story structure was sheathed with the OSB material, and second, after the interior gypsum wallboards and the exterior stucco were added to the structure. The model predictions were in good agreement with the test results, with maximum relative displacement values differing on average by 10% from the values obtained in the experimental campaign. Similarly to the studies described in the previous paragraph, the results from the testing and modeling also indicated that the natural frequency of the building increased from 3.28 to 6.95 Hz after the addition of gypsum wallboards and stucco.
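The idea of rigid diaphragms connected by shear springs can be illustrated with a much simpler stand-in: a linear, one-direction, two-story shear building. The sketch below is not the Folz and Filiatrault model, which is non-linear and includes in-plane diaphragm rotation. The masses and story stiffnesses are hypothetical, and the uniform stiffness gain is chosen only so that the fundamental frequency scales by the reported ratio of 6.95/3.28.

```python
import math


def two_story_frequencies(m1, m2, k1, k2):
    """Natural frequencies (Hz) of a linear two-story shear building.

    M = diag(m1, m2), K = [[k1 + k2, -k2], [-k2, k2]].
    det(K - lam * M) = 0 gives
    m1*m2*lam^2 - (m1*k2 + m2*(k1 + k2))*lam + k1*k2 = 0.
    """
    a = m1 * m2
    b = -(m1 * k2 + m2 * (k1 + k2))
    c = k1 * k2
    root = math.sqrt(b * b - 4.0 * a * c)
    lambdas = [(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)]
    return [math.sqrt(lam) / (2.0 * math.pi) for lam in lambdas]


# Hypothetical floor mass (kg) and story stiffness (N/m).
m = 20000.0
k = 3.0e7

bare = two_story_frequencies(m, m, k, k)

# Uniform stiffness gain that reproduces the reported frequency ratio,
# assuming the added finishes do not change the mass.
gain = (6.95 / 3.28) ** 2
finished = two_story_frequencies(m, m, gain * k, gain * k)
```

In this idealization a uniform stiffness gain of roughly 4.5 is needed to raise the fundamental frequency by the reported ratio; since wallboard and stucco also add mass, the actual stiffness contribution of the finishes would be larger than this estimate.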

The main objective of this paper is to provide a benchmark dataset on the dynamic characterization of an as-built hybrid mass timber construction: the first building constructed in the United States using US-manufactured cross-laminated timber (CLT). The building, known as "Albina Yard," is an example of a hybrid structure, exhibiting a mass timber gravity system, while its lateral force resisting system consists of light-frame shear walls. While extensive research has gone into characterizing the structural properties of mass timber members and subsystems, few research studies have analyzed the dynamic behavior of buildings encompassing mass timber structural products, and specifically CLT (Reynolds et al., 2014, 2015, 2016; Hu et al., 2016). The limited number of currently built mass timber buildings, especially in North America, makes this endeavor more challenging, while it provides motivation for characterizing the as-built modal properties of this type of structure. This study helps fill this gap in knowledge and also improves the general understanding of the impact of drift-sensitive non-structural components (NSCs) on natural frequencies. In this study, output-only modal analysis methods are used to determine the modal parameters (natural frequencies, mode shapes, and damping ratios) of the Albina Yard from an in-situ ambient vibration testing campaign conducted on the building shortly after its completion, in January of 2017. Ambient vibration testing was performed using 18 accelerometers in two experimental setups and a portable data acquisition system that recorded approximately 2 h of horizontal acceleration data. Two OMA methods were used for estimating the modal parameters. A finite element (FE) model that includes the structural components and NSCs of the building is created for correlation with the results obtained in the OMA study.
Based on the FE model, a parametric study that includes both structural and NSC parameters is conducted to clarify the roles that structural components and NSCs play in the dynamic behavior of the tested structure under ambient vibration. Finally, results from the output-only test and the model are compared to the approximate fundamental period code equations commonly used by practicing engineers in the United States (American Society of Civil Engineers, 2017).
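For orientation, the approximate fundamental period in ASCE 7 takes the form Ta = Ct · hn^x, with hn the structural height. The sketch below evaluates it for the roughly 15.39 m height reported for this building. The coefficients Ct = 0.0488 (metric) and x = 0.75 are quoted from memory for the "all other structural systems" category, and that category is itself an assumption here; both should be checked against the standard before use.

```python
def approximate_period_s(height_m, ct=0.0488, x=0.75):
    """Approximate fundamental period Ta = Ct * hn**x, with hn in meters.

    Default Ct and x are assumed values for the "all other structural
    systems" category (metric form) and should be verified in the standard.
    """
    return ct * height_m ** x


HEIGHT_M = 15.39  # total building height reported in the text

ta = approximate_period_s(HEIGHT_M)
fa = 1.0 / ta  # corresponding frequency in Hz

print(f"Approximate period: {ta:.3f} s, frequency: {fa:.2f} Hz")
```

Under these assumptions the code equation gives a period of roughly 0.38 s. Because such equations are intended for design-level response, they would typically be expected to give a longer period than one identified from ambient vibrations, for the excitation-level reasons discussed earlier in this section.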

### STRUCTURAL DYNAMIC TESTING AND CHARACTERIZATION METHOD

### Building Description

The "Albina Yard" is a four-story mass timber building located in Northeast Portland, Oregon (**Figure 1**) whose construction was completed in 2017. The building has a generally rectangular shape with open floor plans, two staircases near its South face, and an elevator shaft near its geometric center in plan, as shown in **Figure 2**. For reference, **Figure 3** shows an elevation section of the building in the East-West direction through the middle of the building. The building has a footprint approximately 26 m long (25.45 m to 27.20 m, depending on the story level) by 13.94 m wide, with a total height of approximately 15.39 m above grade level. The first story is dedicated to retail space, while the upper stories are designed to be used as office space. The building envelope consists mainly of window glazing on the East and West façades and metal cladding walls on the North and South faces, with some small window and exterior door openings on the South façade.

The gravity load bearing system is composed of Douglas-fir (Pseudotsuga menziesii) glued laminated timber (GLT) columns and beams that support Douglas-fir three-ply cross-laminated timber (CLT) floors. The GLT columns have two cross-sections: 222 × 229 mm (GL 8 ¾" × 9") and 222 × 305 mm (GL 8 ¾" × 12"). The first column cross-section is used around the perimeter of the building, while the second is used at interior load bearing locations. The primary GLT beams are distributed in plan following the gridlines shown in **Figure 2** and include two cross-section types: 171 × 457 mm (GL 6 ¾" × 18") and 171 × 610 mm (GL 6 ¾" × 24"). The first type corresponds to the exterior beams spanning in the East-West and North-South directions, while the second type is used for the primary interior beams running in the East-West direction. The beam spans are generally 6.10 m in the East-West direction, with the exception of one bay of the second floor level that is 2.78 m. In the North-South direction, the spans are 5.60 m and 7.43 m. In **Figure 2**, the gridlines for some secondary GLT girders spanning in the North-South direction are omitted for clarity. These omitted girders are located halfway between the gridlines shown in the North-South direction. Therefore, the typical spacing of beam axes for beams running in the North-South direction is 3.05 m, which defines the primary span direction and span length of the CLT floors. The three-ply CLT floor panels, with an approximate thickness of 104 mm (4.1"), are specified as ANSI PRG 320 Grade V2 (American National Standards Institute/APA – The Engineered Wood Association, 2018). The CLT floors are topped with a 25.4 mm layer of non-structural lightweight concrete (Gyp-Crete).

The lateral load resisting system consists of double-sheathed plywood shear walls and a diaphragm provided by the CLT floors. The shear walls feature two types of hold-downs: (1) hollow structural sections, HSS 127 × 127 mm × 6.4 mm (HSS 5" × 5" × ¼"), at the first and second stories, and (2) 150 × 150 mm (6" × 6" nominal size) solid sawn lumber posts at the third and fourth stories. The sheathed plywood shear walls are located in the middle of the building and toward its South face, close to the elevator shaft and the staircases.

### Testing Description: Instrumentation, Setup, and Procedures

The testing campaign was executed shortly after commissioning, so the testing was carried out on a weekend day to avoid interfering with occupants' activities and to minimize the input from human-induced vibrations. The building was tested under ambient conditions (road traffic, wind, etc.). For this ambient vibration testing campaign, 16 uniaxial accelerometers and one (1) tri-axial accelerometer were used. The uniaxial accelerometers were PCB model 393B04 and the tri-axial accelerometer was a PCB model W356A12. **Figure 4** shows the two types of accelerometers and the data acquisition system used during the testing campaign. Further details on the accelerometer specifications are presented in Mugabo et al. (2019).

The accelerometers were distributed across the building and were attached to the underside of the CLT floors using glued metal brackets. The channels and the positive directions of the measured accelerations are indicated by the labels 1 to 18 shown in **Figure 2**. Channels 1 to 12 and 15 to 18 were connected to PCB 393B04 accelerometers, while channels 13 and 14 were the X and Y components of the PCB W356A12 accelerometer. **Figure 3** shows the vertical locations of the accelerometers throughout the building; however, it does not correctly represent the N-S direction locations (in/out-of-plane position) of the accelerometers.

Due to time constraints and the limited number of accelerometers available for this in-situ testing, the test was phased into two setups. The first phase, Setup-1, included six (6) accelerometers attached to the underside of the roof, as well as at the fourth- and third-floor levels. The second phase, Setup-2, included six (6) accelerometers on the underside of the roof and third-floor level, two (2) accelerometers on the underside of the fourth-floor level, and four (4) on the underside of the second-floor level. It is worth noting that the northwestern corner of the second-floor level was not instrumented because it was not accessible during the testing period. For each setup, the data were collected for approximately 1 h, with a sampling frequency of 2,048 Hz. Once the ambient vibration data were collected, the PCB W356A12 accelerometer channels were deemed not sensitive enough for the application at hand.

### Data Post-processing and Analysis: Procedures and Methods

Data were analyzed using operational modal analysis (OMA) techniques. The two OMA methods used to estimate the modal features are EFDD and SSI, following an approach similar to that used by Magalhães et al. (2007) and Moaveni et al. (2014); both methods are available in the software used (ARTeMIS Modal, 2017). More detailed explanations of the methods can be found in Brincker et al. (2001) for the EFDD method and in Brincker and Andersen (2006) for the SSI method.

Before applying the two methods, however, the collected data were first inspected using power spectral densities (PSDs), computed on 1-min windows with the pwelch function from MATLAB's Signal Processing Toolbox (MathWorks, 2018), to identify high-noise signals or malfunctioning accelerometers and to eliminate corrupted data from the analysis. For the data analysis using the EFDD and SSI methods, a set of processing schemes was defined to focus on different sections of the frequency spectrum of interest. The processing schemes used are listed in **Table 1**. An upper limit of 20.48 Hz was considered adequate for capturing the first few natural frequencies of interest, and various Butterworth filter windows were used to focus on different sections of the spectrum of interest. The decimation frequencies of 10.24 and 5.12 Hz were used to focus on the lower natural frequencies.

accelerometers 9–12 are shown in yellow squares for setup 1 and black triangles for setup 2. The dimensions are presented in meters.

TABLE 1 | Description of processing schemes.

\**Harmonic peak reduction was used.*

The processing steps were performed on the combined sets of data and on each of the two separate sets of data (Setup-1 only and Setup-2 only). As a result, 30 different data analysis runs were performed. It is worth noting that harmonic peaks were observed in the frequency range from 12 Hz to the decimation frequency of 20.48 Hz. To extract modal features in this frequency range, a harmonic peak reduction algorithm integrated in ARTeMIS, based on an orthogonal projection in the SSI process, was used (Gres et al., 2019).
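
The PSD screening step described in this section relied on MATLAB's pwelch. As an illustration of the underlying estimate only, the following is a minimal pure-Python sketch of Welch's method (Hann window, 50% overlap, mean removal per segment). It is not the authors' script: the function name `welch_psd`, the 64 Hz sampling rate, the segment length and the synthetic 3 Hz tone are all hypothetical choices made to keep the example small.

```python
import cmath
import math


def welch_psd(x, fs, nperseg):
    """One-sided Welch PSD: Hann window, 50% overlap, mean removed per segment.

    nperseg must be even. Returns (frequencies in Hz, PSD in units**2 / Hz).
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / nperseg) for n in range(nperseg)]
    scale = fs * sum(w * w for w in window)
    n_freq = nperseg // 2 + 1
    psd = [0.0] * n_freq
    n_seg = 0
    for start in range(0, len(x) - nperseg + 1, nperseg // 2):
        seg = x[start:start + nperseg]
        mean = sum(seg) / nperseg
        seg = [(s - mean) * w for s, w in zip(seg, window)]
        for k in range(n_freq):
            # Direct DFT of the windowed segment at bin k (adequate for short segments).
            dft = sum(s * cmath.exp(-2j * math.pi * k * n / nperseg)
                      for n, s in enumerate(seg))
            psd[k] += abs(dft) ** 2 / scale
        n_seg += 1
    # Average over segments; double interior bins to fold in negative frequencies.
    psd = [p / n_seg * (2.0 if 0 < k < nperseg // 2 else 1.0) for k, p in enumerate(psd)]
    freqs = [k * fs / nperseg for k in range(n_freq)]
    return freqs, psd


fs = 64.0  # hypothetical sampling rate in Hz (the test itself used 2,048 Hz)
signal = [math.sin(2 * math.pi * 3.0 * n / fs) for n in range(1024)]  # synthetic 3 Hz tone
freqs, psd = welch_psd(signal, fs, nperseg=128)
peak_freq = freqs[max(range(len(psd)), key=psd.__getitem__)]
print(f"PSD peak at {peak_freq:.2f} Hz")
```

In a screening workflow of the kind described, one such PSD would be computed per channel and per 1-min window, and channels whose spectra are dominated by broadband noise or show no structural peaks would be flagged for exclusion.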

### Structural Modeling

A SAP2000 (CSI, 2017) linear elastic finite element (FE) model was developed for correlation with the OMA results. The model was developed to benchmark the experimental results and define a modeling strategy that can be applied to mass timber buildings with light-frame shear walls dynamically tested under ambient vibrations. To validate the identified natural frequencies at the ambient level of excitation, a detailed model of the structure comprising structural and non-structural members was required. Such a detailed model is needed because, at ambient levels, the non-structural members contribute significantly to the lateral stiffness of the structure. To avoid difficulties that can arise from starting with a refined model, it was necessary to start with a simplified structural model and subsequently add non-structural components that are assumed to contribute to the lateral stiffness of the building. This multi-stage modeling approach is graphically illustrated in **Figure 5**. The first phase, phase 1, included the gravity load supporting system and the light-frame shear walls. In phase 2, the non-structural wall components, namely the gypsum wallboard (gwb) layers, were added to the FE model. Following the addition of the components in phase 2, phase 3 included the exterior metal façade walls and window glazing. In phase 4, the staircase members were added to the FE model. Phase 1 through phase 4 included all the structural and non-structural building components that were considered to affect the lateral stiffness of the building at ambient levels. A correlation phase, phase 5, was added to adjust the model results to the identified modal parameters.

The gravity load resisting system, included in phase 1 of the model, consisted of the GLT beams and columns and the CLT floors. First, the GLT beams and columns were modeled as isotropic materials with an elastic modulus of 12,410 MPa, as specified in the National Design Specification (NDS) Supplement (American Wood Council, 2015). It is worth noting that the NDS provides design values that represent lower bounds and do not necessarily indicate the expected (median) values. The column base joints were modeled as fixed restraints to replicate the fixity of the column bases at ambient levels of loading. CLT floors were modeled as isotropic thin shell diaphragms with their nominal thickness and were assigned a modulus of elasticity of 12,410 MPa (1,800 ksi). In effect, this nominal value corresponds to assuming that the CLT diaphragm in this case study is essentially rigid. **Table 2** provides a summary of the stiffness properties of all structural components used in the finite element modeling scheme.

For the light-frame shear walls, the lateral stiffness of each shear wall section was modeled as two equivalent braces. The relation between the cross-brace stiffness and shear wall stiffness is provided following recommendations from the Applied Technology Council (2017), which provides initial lateral stiffness values, K0, for different configurations of light-frame shear walls, including those sheathed on two sides. A lateral stiffness, K0, of 1,596 N/mm per meter (2,780 lb. per in. per ft.) was assigned, given the plywood shear wall size and detailing pattern.
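
The unit conversion and the equivalent-brace idea above can be sketched as follows. The conversion reproduces the quoted K0 to within rounding. The brace relation is the standard truss result that a diagonal with axial stiffness EA/L contributes (EA/L)·cos²θ of lateral stiffness; whether the authors used exactly this relation is not stated, and the 3 m × 3 m wall panel is a hypothetical example.

```python
import math

LBF_TO_N = 4.4482216152605  # newtons per pound-force
IN_TO_MM = 25.4
FT_TO_M = 0.3048


def lb_per_in_per_ft_to_n_per_mm_per_m(k_us):
    """Convert a unit-length lateral stiffness from lb/in per ft to N/mm per m."""
    return k_us * LBF_TO_N / IN_TO_MM / FT_TO_M


def brace_axial_stiffness(k_wall, length_mm, height_mm, n_braces=2):
    """Axial stiffness EA/L (N/mm) required of each diagonal brace so that
    n_braces together reproduce the lateral stiffness k_wall (N/mm) of a panel."""
    cos_theta = length_mm / math.hypot(length_mm, height_mm)
    return k_wall / (n_braces * cos_theta ** 2)


k0 = lb_per_in_per_ft_to_n_per_mm_per_m(2780)  # N/mm per m of wall length
wall_length_m, wall_height_m = 3.0, 3.0        # hypothetical wall panel
k_wall = k0 * wall_length_m                    # N/mm for the whole panel
k_brace = brace_axial_stiffness(k_wall, wall_length_m * 1000, wall_height_m * 1000)
print(f"K0 = {k0:.0f} N/mm per m, brace EA/L = {k_brace:.0f} N/mm")
```

For a square panel the two diagonals sit at 45°, so each brace needs an axial stiffness equal to the lateral stiffness of the panel.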

All the wooden materials, including the GLT beams and columns, CLT floors, and wood posts, were assigned a density of 500 kg/m<sup>3</sup> (American Wood Council, 2015). To characterize the masses of the structure and the office supplies, masses were added at the floor and roof levels. The applied masses included the mass of the concrete screed (referred to as Gyp-Crete), carpet, office chairs and tables, books and roofing materials. **Table 3** summarizes the added floor and roof masses according to the building details and estimates of the furniture observed in the office spaces.

Phase 2 of the modeling approach consisted of updating the light-frame shear wall stiffness to include the stiffness contribution of the gypsum wallboard (gwb) layers (Applied Technology Council, 2017). The same process used for the light-frame shear walls was applied to the gwb wall layers. The reported unit-length lateral stiffness value amounts to 247 N/mm per m (430 lb./in. per ft.). Some wall sections had pairs of gwb layers on each side, and the lateral stiffness of these walls was therefore updated to reflect the number of gwb layers.
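
One simple way to picture the phase 2 update is to treat the plywood sheathing and the gwb layers as springs acting in parallel. The sketch below does that, under the assumptions that the 247 N/mm per m value applies to each gwb layer and that contributions add linearly; neither assumption is confirmed in the text, and the 3 m wall length is hypothetical.

```python
K_PLYWOOD = 1596.0  # N/mm per m, double-sheathed plywood shear wall (from the text)
K_GWB = 247.0       # N/mm per m, assumed here to apply per gwb layer


def wall_stiffness(length_m, n_gwb_layers):
    """Lateral stiffness (N/mm) of a wall segment, treating the plywood
    sheathing and each gwb layer as springs acting in parallel."""
    return (K_PLYWOOD + n_gwb_layers * K_GWB) * length_m


for layers in (0, 2, 4):
    print(f"{layers} gwb layers: {wall_stiffness(3.0, layers):.0f} N/mm")
```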

In the third modeling phase (Phase 3), the exterior walls were added to the model as isotropic shell elements. For the sheet metal façade, the lateral stiffness values of the shell elements were estimated by adding the stiffness of the sheet metal layer and the gwb layer to represent the sheet metal wall assembly. It was assumed that the sheet metal façade acts primarily in shear. The in-plane shear modulus of the sheet metal proved difficult to estimate from the construction details. Therefore, the shear modulus for the sheet metal was estimated from a previous study that evaluated the lateral stiffness of steel-clad wood framed (SCWF) walls (Aguilera, 2014). Aguilera (2014) evaluated the shear modulus and strength of SCWF wall assemblies typically seen in post-frame buildings. That study considered 17 SCWF shear walls, 4,880 mm (16') wide and 3,660 mm (12') high, which were tested under a monotonic loading regime in a cantilever wall setup. Seven different SCWF wall types were tested, differing in criteria such as shape of corrugation, girt spacing, and fastener configuration. The mean shear modulus of all the SCWF shear walls tested was used to model the sheet metal façade stiffness and was added to the stiffness of the exterior walls' gwb layer.

For the façade glazing, a lateral stiffness value of 410 N/mm per m (715 lb. per in. per ft.) of glass façade length was used. This stiffness value was estimated from a glazing in-plane stiffness test conducted by Cruz et al. (2010) on a glass section 1,200 mm (47.25") high and 1,600 mm (63") wide, fastened to a timber frame around its edges. The column-to-column distances were assigned as the widths of the individual exterior wall shell elements, and the story heights were assigned as their heights.

The stairs were added as isotropic shell elements for the landings, stair treads and handrails. The stair landings and the handrails consist of 3-ply CLT panels. The stiffness assigned to the CLT panels was derived using the Composite Theory Method (k-Method) as presented in the CLT Handbook (Gagnon and Popovski, 2011). The stair treads are made of plywood with a thickness of 28.5 mm (1–1/8"). A modulus of elasticity (MOE) of 7,450 MPa (1,080 ksi) was assumed on the basis of the MOE of Douglas-fir plywood sheathing products presented in the Wood Handbook (Forest Products Laboratory, 2010).

After the NSCs described in phase 2 through phase 4 were added to the model, a model correlation phase was introduced. The correlation phase was added mainly because of a disagreement observed between the experimentally estimated and the modeled torsional fundamental frequency. The correlation phase included reevaluating the floor mass distributions and adjusting the stiffness contributions of the exterior sheet metal façade walls and the light-frame shear walls. The correlation phase is further discussed in section Model Correlation.

#### TABLE 2 | Summary of stiffness properties.

*<sup>a</sup>American Wood Council (2015).*

*<sup>b</sup>American Institute of Steel Construction (2017).*

*<sup>c</sup>Applied Technology Council (2017).*

*<sup>d</sup>Aguilera (2014).*

*<sup>e</sup>Cruz et al. (2010).*

*<sup>f</sup>Gagnon and Popovski (2011).*

*<sup>g</sup>Forest Products Laboratory (2010).*

Lastly, a parametric study that varied structural and non-structural model parameters was conducted to examine the impact of several factors on the fundamental frequencies of the structure. The structural factors considered were the global mass of the structure, the lateral stiffness of the light-frame shear walls, the lateral stiffness of the GLT members and the CLT diaphragm stiffness. Along with the structural factors, some non-structural factors were considered in the parametric study, including the properties assigned to the sheet metal façade, window glazing, and staircases. A 25% deviation from the modeled values was applied to each of these factors. For the mass parameter, a unit area mass was added to or subtracted from the total floor and roof areas.

### RESULTS

### Operational Modal Analysis: EFDD and SSI

Operational modal analysis (OMA) using EFDD resulted in the identification of several modal features of the vibration modes. **Figure 6A** shows the SVDs obtained using processing scheme 7 and Setup-1 and Setup-2 (**Table 1**). The plot shows the first three SVDs with the high-pass filter at 0.5 Hz and a decimation frequency of 20.48 Hz. Three well-defined peaks are discernable in the frequency range from 0–5 Hz for SVD1 to SVD3. For frequencies above 5 Hz, two peaks can be observed between 5–10 Hz.

**Figure 6B** shows the stabilization plot of the state-space models for data processed using processing scheme 8 (**Table 1**). A maximum model order of 14 was selected (marked with a thick horizontal line) with the expectation that fewer than 7 structural modes would be identified in the frequency range of 0 to 20.48 Hz. The vertical dots in the plot indicate the stable modes identified from data collected in Setup-2 only when using processing scheme 8.

TABLE 3 | Summary of mass estimates for floors and roof.


*<sup>a</sup>From Boise Cascade (2016) technical note on weights of building materials.*

*<sup>b</sup>From Homasote*<sup>1</sup>

*<sup>c</sup>Estimates based on local observation and engineering judgement.*

*<sup>d</sup>From Empire*<sup>2</sup> *West Title Agency web link.*

*<sup>e</sup>From GAF*<sup>3</sup> *web link.*

*<sup>f</sup> From Owens Corning*<sup>4</sup> *web link.*

*<sup>g</sup>From Daikin*<sup>5</sup> *web link.*

*<sup>h</sup>From Glass*<sup>6</sup> *Association of North America.*

The modes are extracted in the frequency range of 0–5 Hz, two of which are closely spaced. One additional mode is extracted in the 5–10 Hz frequency range.

**Figure 7** summarizes the natural frequencies, damping ratios and mode shapes identified using the EFDD and the SSI. The values in bold show the average natural frequencies and damping ratios resulting from all the processing schemes outlined in section Data Post-Processing and Analysis: Procedures and Methods. The values in parentheses indicate the minimum and maximum values of the natural frequencies and damping ratios identified across the different processing schemes. Four modes of vibration were identified using both OMA methods. Three of these modes represent the fundamental modes (NS lateral direction, torsion, and EW lateral direction), and the fourth mode represents the second mode in the NS direction of the building. There are slight variations in the average natural frequencies extracted from the two OMA methods. The largest natural frequency variation occurs in the second lateral NS direction mode, amounting to 0.12 Hz (or 1.4% of the SSI extracted average natural frequency). These variations in natural frequencies are less significant than the variations in damping ratios. For instance, the average damping ratio of the fundamental EW mode obtained by EFDD was 1.38%, while the SSI damping ratio for this mode was 5.66%. Several studies have explored damping ratio variations in closely spaced modes (Magalhães et al., 2010) and identifiability factors such as length of data recorded, amplitude of excitation, spatial density of sensors, and measurement noise (Moaveni et al., 2014). The identified torsional and first EW direction modes are indeed closely spaced. This factor could help explain the large variations in damping ratios.

FIGURE 7 | OMA identified natural frequencies, damping ratios and mode shapes. The values in bold represent the mean natural frequencies and damping ratios from all the post-processing schemes. The parentheses indicate the minimum and maximum natural frequencies and damping ratios as a result of using the post-processing schemes as described in section Data Post-Processing and Analysis: Procedures and Methods.

<sup>1</sup>Homasote.com 440 Sound Barrier. Available at: http://www.homasote.com/products/440-soundbarrier.com

<sup>2</sup>Empire West Title Agency Average Weight of Common Household Furniture. ewtaz.com. Available at: http://www.ewtaz.com/images/uploads/average-weightfurniture-2.pdf

<sup>3</sup>GAF EverGuard TPO Membrane Data Sheet. Available at: https://www.gaf.com/en-us/document-library/documents/commercialroofingsystems/everguardtpo/everguardtpo60membrane/everguard\_tpo\_60\_mil\_membrane\_data\_sheet.pdf

<sup>4</sup>Owens Corning FOAMULAR Extruded Polystyrene (XPS) Insulation Technical Bulletin. Available at: https://dcpd6wotaa0mb.cloudfront.net/mdms/dms/EIS/10015702/10015702-ASTM-C578-Types-and-Physical-Properties-for-FOAMULAR-Tech.-Bulletin.pdf?v=1343093874000

<sup>5</sup>Daikin Air Conditioning Technical Data. Available at: http://www.daikintech.co.uk/Data/VRV-Outdoor/RXYQ/2014/RYYQ-T7Y1B/RYYQ-T7Y1B\_Databook.pdf

<sup>6</sup>Glass Association of North America Approximate Weight of Architectural Flat Glass. Available at: http://www.syracuseglass.com/E-DOCS/general/EDOCS/Approximate%20Weight%20of%20Architectural%20Flat%20Glass.pdf

### Finite Element Model Results

**Figure 8** shows the changes in the computed natural frequencies as the components of phase 1 through phase 4 are added, in comparison with the SSI identified natural frequencies. The FE model natural frequencies are normalized to the respective SSI identified average natural frequencies. **Figure 8** also shows the effects of the correlation phase (phase 5), which is discussed in section Model Correlation. The natural frequencies computed from phase 1 were significantly lower than the natural frequencies from the ambient vibration testing. For instance, in the EW and NS directions, they corresponded to 41 and 55%, respectively, of the natural frequencies obtained through SSI. The first torsional natural frequency amounted to 25% of the torsional natural frequency obtained by SSI. After the addition of the gwb layers of the shear walls (phase 2), the fundamental frequencies increased to 48 and 63% in the EW and NS directions, respectively. The first torsional natural frequency, however, increased only marginally in phase 2, reaching 27% of the experimental torsional natural frequency. The large difference in the torsional fundamental frequency in phase 1 and phase 2 is consistent with the concept that the light-frame shear walls, which are located around the center and south-to-center of the building, contribute less torsional stiffness to the overall structural system.

After the addition of the sheet metal façade and the window glazing in phase 3, the fundamental natural frequency in the NS direction increased to 76% of the ambient testing fundamental frequency. A significant increase was observed in the FE torsional fundamental frequency, which rose to 52% of the SSI identified torsional natural frequency. This observation confirms that the exterior non-structural walls contribute significantly to the torsional stiffness of the building.

Phase 4, which featured the addition of the staircases, increased the EW and NS fundamental frequencies to 91 and 87%, respectively. With the staircases added, the FE model torsional fundamental frequency reached 53% of the SSI identified natural frequency.
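
To give a rough sense of what these frequency ratios mean in stiffness terms, the sketch below squares the FE-to-SSI ratios quoted above. It assumes a single-degree-of-freedom idealization per mode with unchanged mass, so that frequency scales with the square root of stiffness; this is a simplification introduced here for illustration, not an analysis performed in the study. The EW value after phase 3 is not reported in the text and is left empty.

```python
# FE-to-SSI frequency ratios quoted in the text for phases 1 to 4.
# None marks a value that the text does not report.
freq_ratio = {
    "EW": [0.41, 0.48, None, 0.91],
    "NS": [0.55, 0.63, 0.76, 0.87],
    "torsion": [0.25, 0.27, 0.52, 0.53],
}


def implied_stiffness_ratio(r):
    """With mass held fixed, f is proportional to sqrt(k), so a frequency
    ratio r corresponds to an effective stiffness ratio of r**2."""
    return None if r is None else r ** 2


implied = {mode: [implied_stiffness_ratio(r) for r in ratios]
           for mode, ratios in freq_ratio.items()}

for mode, values in implied.items():
    row = ", ".join("n/a" if v is None else f"{v:.2f}" for v in values)
    print(f"{mode:8s} phases 1-4: {row}")
```

Under this idealization, the phase 1 torsional frequency at 25% of the identified value corresponds to roughly 6% of the effective torsional stiffness, which illustrates how much of the ambient-level torsional stiffness the bare structural model leaves out.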

The mode shapes resulting from the two OMA methods and the FE model were compared for consistency using the Modal Assurance Criterion (Pastor et al., 2012). The Modal Assurance Criterion (MAC) is given by:

$$\text{MAC}(\phi_i, \phi_j) = \frac{\left| \phi_i^T \phi_j \right|^2}{\left( \phi_i^T \phi_i \right) \left( \phi_j^T \phi_j \right)} \tag{1}$$

where φ<sub>i</sub> is the modal vector at frequency i and φ<sub>j</sub> is the modal vector at frequency j.
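
Equation (1) can be evaluated directly for real-valued mode shape vectors, as in the minimal sketch below. The three four-sensor vectors are hypothetical and chosen only to show the two limiting cases; identified mode shapes may be complex-valued, in which case the transpose would be replaced by the conjugate transpose.

```python
def _dot(a, b):
    """Inner product of two equal-length real vectors."""
    return sum(x * y for x, y in zip(a, b))


def mac(phi_i, phi_j):
    """Modal Assurance Criterion (Eq. 1) between two real-valued mode shapes."""
    if len(phi_i) != len(phi_j):
        raise ValueError("mode shape vectors must have the same length")
    return _dot(phi_i, phi_j) ** 2 / (_dot(phi_i, phi_i) * _dot(phi_j, phi_j))


# Hypothetical four-sensor mode shapes, for illustration only.
phi_a = [0.25, 0.50, 0.75, 1.00]
phi_b = [-0.5, -1.0, -1.5, -2.0]    # same shape as phi_a, scaled and sign-flipped
phi_c = [1.00, -0.75, 0.50, -0.25]  # orthogonal to phi_a

mac_same = mac(phi_a, phi_b)
mac_orthogonal = mac(phi_a, phi_c)
print(mac_same, mac_orthogonal)
```

Because the criterion is normalized by both vector norms, it is insensitive to scaling and sign, so values near 1 indicate consistent shapes and values near 0 indicate unrelated shapes.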

The diagonal MAC values resulting from the OMA methods and the FE model are presented in **Figure 9A**. The diagonal MAC values between the EFDD and SSI identified modal vectors show high levels of consistency (above 0.9), except for the first EW direction mode, which shows a markedly low MAC value. To investigate the possible reasons for the low MAC value in the fundamental EW mode, shorter segments of the collected data were analyzed for MAC consistency. A lack of consistency in MAC values across subsets of the collected data could indicate limitations in identifiability. Two 5-min segments of collected data, one from setup 1 and one from setup 2, were analyzed separately. The fundamental EW direction MAC value becomes 0.69 when the 5-min setup 1-only data are used, as shown in **Figure 9B**. The EW direction mode is not identifiable using the setup 2-only 5-min segment. When the data for the two 5-min segments are combined, the MAC value becomes 0.74. Yet when the full data set is considered, the EW direction MAC value is 0.12. Thus, while the fundamental EW direction MAC value is higher for certain portions of the data, the entire data set yields a much lower value. This points to limitations in consistently extracting the EW mode shape using both OMA methods, which is likely due to the EW mode's proximity to the torsional mode.

The diagonal MAC values of the FE model modes and SSI identified modes show an increasing trend for torsional and EW direction fundamental modes as the additional model components are added (see **Figure 9A**). The NS direction mode MAC value decreases as additional stiffness members are added and equates to 0.92 after phase 4.

### Model Correlation

While the modeled lateral natural frequencies were converging toward the SSI identified natural frequencies after phase 4, the torsional natural frequency amounted only to 53% of the identified torsional fundamental frequency. Phase 5 was introduced to correlate the FE model torsional fundamental frequency to the experimentally identified values. The difference in torsional natural frequencies between the FE model and the SSI method could be attributed to two factors: a mischaracterization of floor mass distribution and/or a difference in the lateral stiffness contribution of structural and non-structural building components.

The masses of the building components were estimated by considering the main members without the masses of their connecting assemblies. This suggests that the building's total mass is underestimated, given that most of the member connections are made of steel, a significantly denser material than wood, and that weights at the real moisture condition are expected to be larger than the nominal values assumed per the NDS. The locations where the masses are most likely to be underestimated are the exterior walls. The masses that were not estimated include the steel furring, the connections of the façade to the structural system, and the window glazing framing.

The second factor that contributes to the mismatch observed in the torsional natural frequency is the stiffness distribution along the horizontal planes. The torsional stiffness is proportional to the square of the eccentricity between the axis of stiffness and the center of mass. For this reason, the stiffness distribution could play an important role in adjusting the torsional stiffness and, consequently, the torsional frequency. For the Albina Yard, the stiffness distribution was evaluated by adjusting the stiffness values of the light-frame shear walls and the non-structural lateral stiffness contributors specified in phase 1 through phase 4. The modeled exterior wall stiffness values are a likely source of bias because they were taken from tests on other walls, whose stiffness may differ considerably from that of the structure's exterior walls. In the case of the sheet metal façade, the shear modulus is a function of both warping and slip in the connections, in addition to the thickness of the metal profile. As a result of these factors, the sheet metal shear modulus can be orders of magnitude lower than that of continuous profiles of similar thickness, as stated by Luttrell (2004). Due to the lack of connection details for the exterior walls, it was decided to adjust the stiffness values with the goal of matching the torsional natural frequency while keeping the lateral natural frequencies close to their original values. The stiffness distribution adjustment entailed increasing the stiffness of some lateral resisting components while decreasing the stiffness of others, to maintain the balance in natural frequencies in the two orthogonal directions and in-plane torsion. It was observed that an increase in the exterior walls' stiffness, coupled with a reduction in the stiffness of the staircases and light-frame shear walls, led to an increase in the torsional natural frequency while keeping the lateral natural frequencies relatively close to the experimental values.
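
The sensitivity of torsional stiffness to where the lateral stiffness sits in plan can be illustrated with the usual rigid-diaphragm sum of wall stiffness times squared lever arm. The sketch below is a generic illustration under that assumption, not the calibration procedure used in the study; the wall positions (2 m and 13 m from the center of rigidity) and the unit stiffness are hypothetical.

```python
import math


def torsional_stiffness(walls):
    """Sum k * d**2 over walls, where k is the in-plane lateral stiffness and d
    is the perpendicular distance of the wall line from the center of rigidity."""
    return sum(k * d ** 2 for k, d in walls)


k = 1.0  # arbitrary stiffness unit; only ratios matter here
core_layout = [(k, -2.0), (k, 2.0)]         # hypothetical walls near the core
perimeter_layout = [(k, -13.0), (k, 13.0)]  # same walls moved to the end façades

ratio_k = torsional_stiffness(perimeter_layout) / torsional_stiffness(core_layout)
ratio_f = math.sqrt(ratio_k)  # torsional frequency scales with sqrt of torsional stiffness
print(f"torsional stiffness ratio = {ratio_k:.2f}, frequency ratio = {ratio_f:.2f}")
```

The same lateral stiffness therefore contributes far more to torsional stiffness at the perimeter than near the core, which is consistent with stiffening the exterior walls while softening the centrally located shear walls and staircases to raise the torsional frequency without greatly changing the lateral ones.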

The two calibration approaches involved the addition of masses to the exterior walls and the increase in exterior wall stiffness coupled with the reduction of other stiffness contributors. The combination of these two approaches led to the natural frequencies converging for both the lateral and torsional modes. The resulting FE natural frequencies are 2.79 Hz in the NS direction, 4.05 Hz for the torsional mode and 4.44 Hz in the EW direction. These natural frequencies result from increasing the masses of the exterior walls (glazing and metal façade) by a factor of two (2). The stiffness values of the metal façade and the glazing were multiplied by factors of 10.25 (G = 1,586 MPa) and 2.8 (G = 320 MPa), respectively, relative to the initially modeled values. The increase in the stiffness of the exterior walls was coupled with reductions in the stiffness values of the light-frame shear walls and the staircases by factors of two (2) and four (4), respectively.

### Parametric Study

The parametric study considered the effects of a 25% change in the parameters described in section Structural Modeling on the fundamental frequencies of the FE model. **Figure 10** presents the results of the parametric study for the NS direction, torsional, and EW direction fundamental frequencies. The results in this figure are normalized to the respective fundamental frequency identified through the SSI method. **Figure 10A** shows that the total mass exerts the most influence on the NS fundamental frequency, followed by the sheet metal façade. The total mass of the structure has more influence on the natural frequencies than the stiffness parameters because it is one of the two factors in the fundamental frequency equation, $f = \frac{1}{2\pi}\sqrt{\frac{k}{m}}$. Each stiffness parameter considered in this parametric study contributes only part of the total lateral stiffness of the system, k. Consequently, a change in the mass has a larger effect on the natural frequency than the same relative change in any single stiffness parameter.
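
The argument above can be checked numerically with the single-degree-of-freedom relation between frequency, stiffness and mass. In the sketch below, the 25% variations follow the text, while the 30% stiffness share assigned to a single component is a hypothetical number used only to show why one stiffness parameter moves the frequency less than the total mass does.

```python
import math


def frequency_ratio(k_scale=1.0, m_scale=1.0):
    """Ratio f_new / f_old for f = sqrt(k / m) / (2 * pi)."""
    return math.sqrt(k_scale / m_scale)


def partial_stiffness_ratio(share, scale):
    """Frequency ratio when only a component holding `share` of the total
    stiffness is scaled by `scale`; the rest of the stiffness is unchanged."""
    return frequency_ratio(k_scale=1.0 + share * (scale - 1.0))


mass_up = frequency_ratio(m_scale=1.25)    # 25% more mass
mass_down = frequency_ratio(m_scale=0.75)  # 25% less mass
one_component_up = partial_stiffness_ratio(share=0.30, scale=1.25)  # hypothetical 30% share

print(f"mass +25%: {mass_up:.3f}, mass -25%: {mass_down:.3f}, "
      f"one component +25%: {one_component_up:.3f}")
```

A 25% change in total mass shifts the frequency by roughly 11 to 15%, whereas the same relative change in a component carrying 30% of the stiffness shifts it by under 4% in this idealization.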

The stiffness values of the window glazing, light-frame shear walls, GLT members, and CLT floors influence the NS direction fundamental frequency to a lesser degree than the total mass and the sheet metal façade stiffness. The light-frame shear walls contribute less to the NS direction fundamental frequency than the sheet metal façade and the window glazing, which are considered to be non-structural building components.

**Figure 10B** shows the effects of the seven considered parameters on the torsional fundamental frequency. As with the fundamental frequency in the NS direction, the total mass of the building has the most influence on the torsional fundamental frequency. The sheet metal façade, although often considered a non-structural building component, has a greater effect on the torsional fundamental frequency through its stiffness than the light-frame shear walls. The stiffness values of the window glazing, the light-frame shear walls and the glulam members have a lower impact on the torsional fundamental frequency than the mass and the sheet metal façade stiffness. The torsional natural frequency is least affected by changes in the stiffness of the staircases and the CLT floors.

frequencies for (A) *f* = 2.86 Hz, (B) *f* = 4.29 Hz, and (C) *f* = 4.20 Hz.

**Figure 10C** shows the effects of the parameters on the EW direction fundamental frequency. The total mass is the most influential factor, as observed for the two other fundamental frequencies. The sheet metal façade stiffness has a noticeable effect on the fundamental frequency in the EW direction and exerts more influence than the light-frame shear walls. For the EW direction, the exterior walls have more impact on the fundamental frequency than the staircases, light-frame shear walls, GLT members, and CLT floors.

Among the structural and non-structural stiffness parameters, the sheet metal façade and the glazing exert the most influence in the fundamental frequencies. Based on this observation, it can be suggested that at the ambient level of excitations, the exterior walls have the most impact in the structure's response to ambient excitations. As expected, the fundamental frequencies are much less sensitive to the change of the CLT floors in-plane stiffness compared to the other stiffness parameters considered. While it is expected that the in-plane shear modulus of CLT is smaller compared to the modulus of elasticity considered in this study, based on this sensitivity study, the in-plane shear modulus stiffness is not expected to have a major impact on the fundamental frequencies.

### DISCUSSION

### Comparison Between EFDD and SSI

Four modes were identified with both of the OMA methods, which provides confidence in the results. The identified modes compared well with each other in terms of natural frequency (see **Figure 7**) and mode shapes (**Figure 8A**). The closely spaced modes posed challenges for modal identifiability. While the OMA methods were consistent in extracting the natural frequencies of the closely spaced modes, their mode shapes proved difficult to differentiate. When two 5-min segments of recorded data from setups 1 and 2 were analyzed, the diagonal MAC value between the EFDD and SSI identified closely spaced modes varied significantly, while the MAC value for the NS direction fundamental mode was consistently high regardless of recorded data length and changes in sensor locations. The lack of consistency in the EW direction mode shapes is likely due primarily to the proximity of this mode to the torsional mode, but also to other identifiability factors such as the presence of measurement noise and the limited number of sensor locations.

In contrast to the small difference in natural frequency values observed across setups and methods, a large difference was observed in the extracted damping ratios. However, such differences are to be expected and have been discussed in other studies (e.g., Magalhães et al., 2010; Moaveni et al., 2014; Yu et al., 2017). This is most likely due to larger estimation variance and bias for damping ratios compared to natural frequencies (Pintelon et al., 2007; Reynders et al., 2008).

### Comparison Between FE Model and Dynamic Testing Identification Results

After model calibration, the model natural frequencies matched the experimental natural frequencies in the three fundamental modes (NS lateral, EW lateral, torsional directions) and one higher mode (second NS lateral).

The calibrated FE model suggests that non-structural building components play a significant role in the measured ambient vibration excitations. When comparing fundamental frequencies in the NS direction between phase 1 and phase 5 (see **Figure 8**), an increase from 1.57 to 2.79 Hz is observed. This increase in fundamental frequency, as a result of the non-structural components, shows a similar trend to the increases observed by Folz and Filiatrault (2004b) in a laboratory setting, where the fundamental frequency increased from 3.28 to 6.95 Hz after the addition of non-structural components.

The correlation phase results also suggest that, at ambient levels of excitation, the exterior wall stiffness contributes more to the lateral and torsional natural frequencies than the stiffness of the light-frame shear walls, the interior gravity frames and the staircases. The parametric study, conducted to investigate the effects of different structural and non-structural parameters on the three fundamental natural frequencies, showed that the overall mass of the building has the most influence on the fundamental frequencies. For a 25% reduction in the total building mass, the model normalized torsional fundamental frequency increased from a ratio of 0.96 to 1.15 of the torsional fundamental frequency identified with the SSI method (see **Figure 10B**). This change is equivalent to a 20% natural frequency increase in relation to the torsional natural frequency of the correlated FE model. In the NS and EW directions, the 25% decrease in total mass resulted in 21 and 17% increases in the fundamental frequencies, respectively. This finding supports, for FE modeling, the importance of estimating the structure's dead loads and weights, as well as of a good approximation of the masses acting on the building at the time of the experimental testing. Experimental studies such as Assi et al. (2016) observed a similar effect, with as much as a 21.7% reduction in the identified natural frequency after the addition of some non-structural components to a six-story building featuring reinforced concrete shear walls and moment resisting frames as the lateral resisting structural system. It is also worth noting that a more accurate estimate of the wood specific gravity could be obtained by adjusting the reference values, such as those provided by NDS (American Wood Council, 2015), to actual moisture content data measured at the site.

The parametric phase also indicated that the exterior sheet metal walls exert the most influence among all the lateral stiffness contributors, including the light-frame shear walls. Most often, researchers and design practitioners have a limited amount of information on the structural properties of façade elements. Thus, structural modeling often excludes the stiffness contribution of non-structural components. The FE modeling results point to the potential benefits of including the stiffness of non-structural components to improve the understanding of the dynamic behavior of tall mass timber structures under service lateral loads such as wind loads.

### Comparison of Identified Fundamental Frequencies to Code Approximate Fundamental Frequency Equation

ASCE 7-16 (American Society of Civil Engineers, 2017) provides guidelines for estimating the fundamental period of a building based on its height or the number of stories. The approximate fundamental period calculation is used as a part of the Equivalent Lateral Force (ELF) procedure, a common procedure for analyzing seismic loads on structures. The fundamental frequencies in the lateral directions as well as the torsional fundamental frequency from the ambient vibration results are compared to the fundamental frequencies derived from the fundamental period equation:

$$T_a = C_t h_n^x \tag{2}$$

where h<sub>n</sub> is the structural height, and C<sub>t</sub> and x are parameters that correspond to 0.048 and 0.75, respectively, for light-frame structures. The resulting approximate fundamental period is equal to 0.38 s and corresponds to a natural frequency of 2.63 Hz. The approximate fundamental frequency is therefore close to the identified fundamental frequency in the NS direction. In the EW direction and in torsion, the approximate period equation does not provide a good estimate of the measured frequencies. It is worth noting, however, that under seismic loading, the stiffness contribution of the non-structural elements is expected to be smaller.
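The comparison can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, the height is back-calculated from the reported period rather than taken from the building drawings (its unit depends on the unit system assumed by the coefficient, which this section does not state), and the identified frequencies are assumed to be the values listed in the Figure 10 caption.

```python
def approximate_period(c_t, h_n, x):
    """Approximate fundamental period T_a = C_t * h_n**x, in seconds."""
    return c_t * h_n ** x


def implied_height(t_a, c_t, x):
    """Invert T_a = C_t * h_n**x for the height h_n."""
    return (t_a / c_t) ** (1.0 / x)


C_T, X = 0.048, 0.75   # parameter values quoted in the text
T_A = 0.38             # approximate period reported in the text, in seconds

f_a = 1.0 / T_A        # approximate fundamental frequency, about 2.63 Hz
h_n = implied_height(T_A, C_T, X)  # back-calculated, not a measured height

# Identified frequencies in Hz (assumption: values from the Figure 10 caption).
identified = {"NS": 2.86, "torsional": 4.29, "EW": 4.20}
ratios = {mode: f_a / f for mode, f in identified.items()}

for mode, ratio in ratios.items():
    print(f"{mode}: code estimate / identified = {ratio:.2f}")
```

A ratio close to one indicates agreement; under these assumptions the NS ratio is much closer to one than the torsional and EW ratios, in line with the discussion above.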

### CONCLUSION AND RECOMMENDATIONS

An ambient vibration test was conducted on a four-story mass timber commercial building. This experimental study provides a unique benchmark dataset on the first construction using CLT produced in the U.S. Two OMA methods were used to identify natural frequencies, damping ratios, and mode shapes of the building. Four structural modes were identified and compared to those obtained using a correlated model that was developed using a phased finite element model updating approach. The four structural modes identified are the first three fundamental structural modes (EW, NS and torsional directions) and one higher lateral structural mode (NS direction). Factors limiting the identifiability of the higher modes include the use of an insufficient number of sensors, non-ideal sensor locations, the presence of closely spaced structural modes, and the limited energy of excitation. It is well known that these factors can increase the bias and variance of the modal identification.

The presence of mechanical sources of excitation that are commonly found in buildings that are under operation can further interfere with the determination of modal properties. The use of different types of accelerometers led to additional identifiability limitations due to issues such as scaling and sensor sensitivity. Despite the variability in modal parameters arising from each method, use of both OMA methods was useful in improving confidence in the results.

A parametric study assessing the contribution of seven structural and non-structural factors to the fundamental frequencies was conducted. The mass of the building was identified as the factor that most affected all three fundamental frequencies. Therefore, a careful estimation of the building mass and its distribution in plan is crucial for accurate dynamic modeling of ambient responses. While the stiffness of non-structural members is often not considered for high amplitude levels of lateral excitation, such as extreme seismic or wind loading conditions, the correlation between the identified modal features and the FE model highlights that exterior non-structural walls play a major role in the response to ambient excitations, which is important when assessing serviceability and occupant comfort. The effects of non-structural members on the lateral response of tall mass-timber structures will need to be further investigated as new heights of timber buildings are reached and serviceability limit states may govern design.

### AUTHOR CONTRIBUTIONS

AB and MR conceived the study. AB and IM performed the experimental design. IM, AB, and MR carried out the in-situ data collection. IM performed the data processing and finite element modeling with guidance from AB and MR. IM took the lead on drafting the article with critical input from AB and MR through the writing process. AB and MR contributed to the interpretation of results and carried out the editing process.

### FUNDING

Funding for this study was provided by the U.S. Department of Agriculture Agricultural Research Service (USDA ARS) Agreement No. 58-0202-5-001 through the TallWood Design Institute at Oregon State University. This study was also funded by the McIntire Stennis project (contract number 1009740) provided by the National Institute of Food and Agriculture, U.S. Department of Agriculture. The first author received a Graduate Teaching Assistantship from the Civil Engineering Department of Oregon State University during the year in which data collection was conducted.

### ACKNOWLEDGMENTS

The authors would like to acknowledge the individuals who helped with the data collection namely Rajendra Soti, Leonardo Rodrigues, James Batti, and Evan Schmidt; others who provided valuable technical guidance include Reid Zimmerman, Eric McDonnell, and Palle Andersen. Authors would like to thank the building owner, reworks Inc., and Lever Architecture for providing access to the building.

### REFERENCES

before and after their completion and occumpancy," in Proceedings of the WCTE 2016 World Conference on Timber Engineering (Vienna).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Mugabo, Barbosa and Riggio. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Parameter Estimation of Autoregressive-Exogenous and Autoregressive Models Subject to Missing Data Using Expectation Maximization

#### Matthew Horner, Shamim N. Pakzad\* and Nur Sila Gulgec

Department of Civil and Environmental Engineering, Lehigh University, Bethlehem, PA, United States

#### Edited by:

Eleni N. Chatzi, ETH Zürich, Switzerland

#### Reviewed by:

Eliz-Mari Lourens, Delft University of Technology, Netherlands Audrey Olivier, Johns Hopkins University, United States

> \*Correspondence: Shamim N. Pakzad pakzad@lehigh.edu

#### Specialty section:

This article was submitted to Structural Sensing, a section of the journal Frontiers in Built Environment

Received: 31 July 2018 Accepted: 29 August 2019 Published: 13 September 2019

#### Citation:

Horner M, Pakzad SN and Gulgec NS (2019) Parameter Estimation of Autoregressive-Exogenous and Autoregressive Models Subject to Missing Data Using Expectation Maximization. Front. Built Environ. 5:109. doi: 10.3389/fbuil.2019.00109

Missing observations may present several problems for statistical analyses on datasets if they are not accounted for. This paper concerns a model-based missing data analysis procedure to estimate the parameters of regression models fit to datasets with missing observations. Both autoregressive-exogenous (ARX) and autoregressive (AR) models are considered. These models are both used to simulate datasets, and are fit to existing structural vibration data, after which observations are removed. A missing data analysis is performed using maximum-likelihood estimation, the expectation maximization (EM) algorithm, and the Kalman filter to fill in missing observations and regression parameters, and compare them to estimates for the complete datasets. Regression parameters from these fits to structural vibration data can thereby be used as damage-sensitive features. Favorable conditions for accurate parameter estimation are found to include lower percentages of missing data, parameters of similar magnitude with one another, and selected model orders similar to those true to the dataset. Favorable conditions for dataset reconstruction are found to include random and periodic missing data patterns, lower percentages of missing data, and proper model order selection. The algorithm is particularly robust to varied noise levels.

Keywords: regression analysis, vibration, damage assessment, probability, estimation, structural dynamics, data analysis

### INTRODUCTION

A fundamental task in a variety of fields is extracting useful statistical information from time series data. Working with complete datasets, there are different tools toward this end. However, in certain applications, the datasets are faced with the possibility of missing measurements. For example, these may result from network communication disruptions, malfunctioning sensing equipment, improper sampling protocol, or observation patterns inherent to the data collection schemes (Little and Rubin, 2002; Matarazzo and Pakzad, 2015).

Missing data may present several problems for statistical analyses conducted and decisions made as a result of those analyses. If missing value indicators are not present in a data analysis package, inferences about the system being sampled can be biased. Similar biased inferences may result if missing observations are ignored, particularly if an observation's missingness is a function of its value, for example when observations are uncharacteristically orders-of-magnitude atypical. Additionally, simply from a cost perspective, it is not desirable to spend time and resources collecting data that eventually goes unused.

It follows then that researchers should seek out data analysis methods that maximize the utility of their entire dataset, incorporating the fact that it may contain missing observations. In Little and Rubin (2002), missing data methods are grouped into four non-mutually exclusive categories. Procedures based on completely recorded units encompass strategies like those described above, which essentially ignore incomplete observations, and may result in serious biases, particularly with large quantities of missing data. Weighting procedures modify response design weights in an attempt to account for missing data as if it were part of the sample design. Imputation-based procedures fill in missing values, and then analyze the complete estimated sample with standard methods. Finally, model-based procedures define a model for the observed data on which to base statistical inferences. The work presented in this paper represents a particular model-based procedure where regression models are fit to observed datasets.

Regressions represent a broad class of models that may be fit to time series data, and in our case lead to statistical inferences about that data. Model-based missing data procedures are used in conjunction with these fits to estimate their parameters. These parameters may then further be used to predict future system responses, or as indicators of changes to the system over time. In either case, accurate parameter estimation is paramount for correct system behavior prediction and assurance that any changes in regression parameters are due to system changes, as opposed to biased estimation.

This paper concerns parameter estimation of two particular types of regression models. The autoregressive-exogenous, or ARX(n, m), model assumes that the current system output is a function of the previous n system outputs and previous m system inputs. The autoregressive, or AR(n), model assumes that the current system output is only a function of the n previous system outputs. Both models are used to generate simulated datasets, from which samples are then removed. The algorithm presented in Section Parameter Estimation Algorithm is used for regression parameter estimation and dataset reconstruction, and the estimated parameters and measurements are compared with references. In all cases, we do not explore loss of the entire dataset, as the algorithm requires at least a portion of the data to run. This algorithm joins a relatively short list of those dedicated to regression models, and more specifically ARX models, subject to missing observations. Important modifications to the state vector of the underlying state-space model are also presented.

In this paper, a specific real-world example relating to structural health monitoring (SHM) is presented, an application which has not been previously explored with the modified algorithm outside of preliminary work by the authors in Horner and Pakzad (2016a,b).

Structural vibration data has popularly been used in the field for system identification (Juang and Pappa, 1985; James III et al., 1993; Andersen, 1997; Huang, 2001; Pakzad et al., 2011; Dorvash and Pakzad, 2012; Chang and Pakzad, 2013, 2014; Dorvash et al., 2013; Cara et al., 2014; Matarazzo and Pakzad, 2016c; Nagarajaiah and Chen, 2016), finite element model updating (Shahidi and Pakzad, 2014a,b; Yousefianmoghadam et al., 2016; Nozari et al., 2017; Song et al., 2017), and damage-sensitive feature extraction (Sohn et al., 2001; Gul and Catbas, 2009; Kullaa, 2009; Dorvash et al., 2015; Shahidi et al., 2015), with the ultimate goal of inferring information about the current condition of the monitored structure. Regarding regression models, He and De Roeck (1997) shows their utility for describing structural vibration responses, and Shahidi et al. (2015) and Yao and Pakzad (2012) provide structural damage-sensitive features created using the parameters of regression models.

In this paper, acceleration time series data is collected from a two-bay structural steel frame. Parameters are estimated for ARX models using only portions of these datasets. This work extends and generalizes that introduced in Horner and Pakzad (2016b), which included a specific missing data pattern and alternative parameter estimation algorithm, and Horner and Pakzad (2016a), which used the same algorithm, but was specific to randomly missing data.

This paper is organized as follows. Section Model and Method Review provides a literature review on the relevant regression models and estimation of their parameters in the presence of missing data, as well as general likelihood model-based missing data procedures. Section Parameter Estimation Algorithm presents the proposed algorithm and highlights its differences from previous work. Section Experimental Setup and Simulation outlines the data collection and simulation schemes, with Section Results and Discussion presenting the validation results of regression model estimation with incomplete experimental datasets. Finally, Section Conclusions outlines the current conclusions of this work and suggestions on future research directions.

### MODEL AND METHOD REVIEW

### Regression Parameter Estimation

The ARX(n, m) model is defined:

$$\begin{aligned} y(k) &= a_1 y(k-1) + a_2 y(k-2) + \dots + a_n y(k-n) \\ &\quad + b_1 u(k-1) + b_2 u(k-2) + \dots + b_m u(k-m) + v(k) \end{aligned} \tag{1}$$

where y is the model output; u is the model input; a<sub>i</sub> is the ith autoregressive (AR) parameter; b<sub>i</sub> is the ith exogenous (X) parameter; and v is the noise. In solving the problem of parameter identification, a model is also assumed for the ARX input u. In this paper, an AR(p) model is used:

$$u(k) = c_1 u(k-1) + c_2 u(k-2) + \cdots + c_p u(k-p) + w(k) \tag{2}$$

where c<sub>i</sub> is the ith assumed input AR parameter and w is the noise. In this paper, both noise terms are assumed to be Gaussian white noise, with E[v<sup>2</sup>(k)] = λ<sub>1</sub> and E[w<sup>2</sup>(k)] = λ<sub>2</sub>, uncorrelated with one another.
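To make the two model definitions concrete, the sketch below simulates an AR(p) input and the ARX(n, m) output it drives, then marks a random subset of the outputs as missing. It is a minimal illustration rather than the authors' simulation code: the function names, coefficient values, zero initial conditions, and the 20% missing rate are all assumptions chosen for the example.

```python
import numpy as np


def simulate_ar(c, lam, n_samples, rng):
    """Simulate u(k) = c_1 u(k-1) + ... + c_p u(k-p) + w(k), Var[w] = lam."""
    p = len(c)
    u = np.zeros(n_samples + p)  # p leading zeros act as initial conditions
    w = rng.normal(0.0, np.sqrt(lam), n_samples + p)
    for k in range(p, n_samples + p):
        # u[k - p:k][::-1] is [u(k-1), ..., u(k-p)]
        u[k] = np.dot(c, u[k - p:k][::-1]) + w[k]
    return u[p:]


def simulate_arx(a, b, u, lam, rng):
    """Simulate y(k) = sum_i a_i y(k-i) + sum_j b_j u(k-j) + v(k), Var[v] = lam."""
    n, m = len(a), len(b)
    lag = max(n, m)
    u_pad = np.concatenate([np.zeros(lag), u])  # zero inputs before k = 0
    y_pad = np.zeros(len(u_pad))                # zero outputs before k = 0
    v = rng.normal(0.0, np.sqrt(lam), len(u_pad))
    for k in range(lag, len(u_pad)):
        y_pad[k] = (np.dot(a, y_pad[k - n:k][::-1])
                    + np.dot(b, u_pad[k - m:k][::-1])
                    + v[k])
    return y_pad[lag:]


rng = np.random.default_rng(0)
u = simulate_ar([0.5, -0.3], lam=1.0, n_samples=500, rng=rng)       # AR(2) input
y = simulate_arx([1.5, -0.7], [1.0, 0.5], u, lam=0.1, rng=rng)      # ARX(2, 2) output

# Remove roughly 20% of the outputs at random and flag them with NaN.
missing = rng.random(len(y)) < 0.2
y_observed = np.where(missing, np.nan, y)
```

The coefficient sets used here were chosen so that both recursions are stable; other values may produce diverging series.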

Existing literature concerning parameter identification of ARX models includes Isaksson (1993), which presents an algorithm similar to that in this paper, using both the expectation maximization (EM) algorithm and Kalman filter for parameter identification with simulated datasets. Wallin and Isaksson (2000) present an iterative algorithm using least squares and a bias correction for parameter identification that does not require an assumed input model. Wallin and Isaksson (2002) investigate periodic missing data patterns and multiple optima that may result when input data is missing. Wallin and Hansson (2014) proposes an algorithm separate from EM for a wide class of models, including ARX. Ding et al. (2011) use gradient-based parameter identification methods with structural vibration data. Finally, Naranjo (2007) discusses general state space models with exogenous variables and missing data.

The algorithm proposed in this paper can also identify the parameters of AR models alone, of the form in Equation (2). The literature is more extensive on parameter estimation of these models. Papers with a similar approach that utilize maximum likelihood (ML) and EM-based approaches include McGiffin and Murthy (1980, 1981), which provide simulation results for a variety of parameter estimation methods. Kolenikov (2003) and Little and Rubin (2002) investigate parameter estimation for AR(1) models. Sargan and Drettakis (1974) present a ML approach that treats missing observations as additional parameters with respect to which the likelihood is maximized. Broersen and Box (2006) perform ML parameter estimation on AR, MA, and ARMA models. Zgheib et al. (2006) present a pseudo-linear recursive least squares algorithm in conjunction with the Kalman filter for reconstruction and AR parameter identification. Finally, Shumway and Stoffer (2000) present an overview of state space models and modifications with missing data. There are several important distinctions between the proposed algorithm and those presented in previous works; these will be presented as they arise in Section Parameter Estimation Algorithm. Additionally, this paper includes both real and simulated datasets for ARX models.

In each case presented in this paper's results, the model order must be selected prior to parameter estimation and dataset reconstruction. While the paper does explore the effects of selecting a different model order than that used to generate the dataset (see Section Improper Model Order Selection— Simulated), evaluation of strategies for selecting the correct or most-likely model order are beyond the scope of this work. To this end, there are several studies, including Grossmann et al. (2009) and Matarazzo and Pakzad (2016b), which discuss that the model order decreases significantly with an increase in missing data. Additionally, Sadeghi Eshkevari and Pakzad (2019) emphasize that randomness in missing data results in lower model order selection.

### Maximum Likelihood, Expectation Maximization, and Kalman Filter

A review of maximum likelihood estimation (MLE) and the expectation maximization (EM) algorithm is provided in Little and Rubin (2002). The idea behind the former is to find the values of some statistical parameters that maximize a likelihood function associated with the sampled data. This likelihood function is proportional to the probability density function of the data (often its logarithm, the "loglikelihood," is maximized instead). EM is an iterative strategy for MLE in incomplete data problems. It formalizes an intuitive procedure for handling missingness: estimate the missing data values, estimate the parameters, re-estimate the data values assuming the parameters are correct, and iterate until convergence.

The idea of MLE is used extensively in the literature. Specifically for regression models, MLE without EM is discussed in Wallin and Hansson (2014) for several regression model classes, in McGiffin and Murthy (1980, 1981) and Sargan and Drettakis (1974) for AR models, in Dunsmuir and Robinson (1981) and Jones (1980) for autoregressive-moving-average (ARMA) models, and in Wallin and Isaksson (2002) for ARX models.

The EM algorithm was first formally defined in Dempster et al. (1977), which outlines several important characteristics, namely, that it is applicable to a wide array of topics, that successive iterations always increase the likelihood, and that convergence implies a stationary point of the likelihood. Shumway and Stoffer (1982) and Digalakis et al. (1993) describe the algorithm's utility with stochastic state-space models in conjunction with the Kalman filter. Mader et al. (2014) introduce a numerically efficient implementation of EM. The algorithm similar to that of this paper presented in Isaksson (1993) utilizes EM for ARX parameter identification. In the context of structural vibrations with missing data, EM is used in mobile sensing for system identification in Matarazzo and Pakzad (2014, 2015, 2016a,b).

Recall that the idea behind EM involves estimating missing values, or more generally, sufficient statistics of the missing observations so as to determine the model parameters (i.e., the "E" step). With the exception of Dempster et al. (1977), the EM papers above all utilize the Kalman filter (Kalman, 1960) to this end, as do Shi and Fang (2010) for randomly missing data. This recursive algorithm produces state variable estimates by prediction at each time step, then updates the estimates as a new measurement is taken. The estimates are then more "precise" than the measurements, which are naturally corrupted with noise. ARMA model parameters are identified with the Kalman filter and EM by Harvey and Phillips (1979), Jones (1980), and Harvey and Pierse (1984). AR models are considered using a Kalman filter formulation in McGiffin and Murthy (1981) and Zgheib et al. (2006), while Wallin and Isaksson (2002) identify multiple optima in ARX models with Kalman parameter identification. In Section Parameter Estimation Algorithm, the algorithm introduced in this paper is presented, incorporating the Kalman filter in the expectation step of the EM algorithm for MLE of ARX and AR parameters.

### PARAMETER ESTIMATION ALGORITHM

### Underlying State-Space Model

The algorithm proposed in this paper can be used for parameter estimation of either ARX or AR models. We introduce the algorithm here for general ARX(n, m) models and provide guidance for its adaptation to AR(n) models. Two important differences from the formulation in Isaksson (1993) are presented here. In Section Experimental Setup and Simulation, the real data considered for testing the proposed algorithm constitutes structural vibrations. In this context, y and u in Equations (1, 2) represent structural responses at two distinct locations. Note here that u does not constitute an "input" to the system; nevertheless, we will use the term "input" when referring to the u data in this paper. Using alternative terminology, such an ARX model represents a transmissibility function between different system responses.

For the remainder of this paper, m = n is assumed for the ARX model orders (this is not required, but it eases notation clarity). Define z(k), A<sub>i</sub>, and e(k) for ARX models, as:

$$z(k) = \begin{bmatrix} y(k) \\ u(k) \end{bmatrix} \qquad A_i = \begin{bmatrix} a_i & b_i \\ 0 & c_i \end{bmatrix} \qquad e(k) = \begin{bmatrix} v(k) \\ w(k) \end{bmatrix} \qquad E\left[ e(k) e^T(k) \right] = \Lambda = \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}$$

assuming the two noise variance terms are uncorrelated. For AR models:

$$z(k) := y(k) \qquad A_i := a_i \qquad E\left[ e(k) e^T(k) \right] = \Lambda = \lambda_1$$

Equations (1, 2) may be rewritten as:

$$z(k) = A_1 z(k-1) + A_2 z(k-2) + \cdots + A_l z(k-l) + e(k) \tag{3}$$

where l is the largest of n (or m) and p.

The Kalman filter (Section Kalman Filter) is employed for reconstruction of missing observations; models for use with this filter must be in state-space form. The choice of a state vector for this model is not trivial, as is shown in the EM algorithm parameter estimates of Section EM Algorithm. The state vector x(k) is given below:

$$x(k) = \begin{bmatrix} z^T(k) & z^T(k-1) & \cdots & z^T(k-n) \end{bmatrix}^T$$

and the state-space equations may then be written as:

$$x(k) = F x(k-1) + e(k) \tag{4}$$

$$z(k) = H x(k) \tag{5}$$

with the following state matrix F and output matrix H (note that all matrix entries shown below are 2 × 2 in size for ARX models):

$$F = \begin{bmatrix} A_1 & A_2 & \cdots & A_{n-1} & A_n & 0 \\ I & 0 & \cdots & 0 & 0 & 0 \\ 0 & I & \cdots & 0 & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\ 0 & 0 & \cdots & I & 0 & 0 \\ 0 & 0 & \cdots & 0 & I & 0 \end{bmatrix}$$

$$H = \begin{bmatrix} I & 0 & \cdots & 0 & 0 & 0 \end{bmatrix}$$

In Isaksson (1993), the H matrix is [A1 A2 . . . An−1 An] where it is composed of unknown model parameters. The choice of state vector here, which differs by a lag of one time step, simplifies the H matrix by fixing it to the above. This removes the uncertainty in H and reduces the complexity of the system identification computations for the case where the kth measurement is included in the kth state vector.

The last pieces required to use the Kalman filter are the covariance matrices of the process and measurement noise. Since the kth measurement is included in the kth state vector, the measurement equation noise term is not included [see Equation (5)], and its covariance matrix R = [0]. Due to the identity elements along a lower diagonal of F, only the first (AR) or first two (ARX) terms along the diagonal of the process covariance matrix Q are nonzero, and are equal to Λ.

### Kalman Filter

For complete datasets, the matrices of Equation (5) and the noise covariances are fed to the Kalman filter directly with the data to find filtered state estimates, and the parameter estimation algorithm continues with EM (see Section EM Algorithm). The Kalman filter "predictive" equations are presented below:

$$\hat{\mathbf{x}}^{-}(k) = F\hat{\mathbf{x}}(k-1)\tag{6}$$

$$P^{-}\left(k\right) = FP\left(k-1\right)F^{T} + Q\tag{7}$$

where ∧ denotes an estimate; <sup>−</sup> denotes that the value is a priori; x and P quantities without a <sup>−</sup> are a posteriori; and P(k) is the error covariance at time k. The Kalman filter "corrective" equations update the a priori estimates and are given in Equations (8–10):

$$K\left(k\right) = P^-\left(k\right)H^T \left(HP^-\left(k\right)H^T + R\right)^{-1} \tag{8}$$

$$\hat{\mathbf{x}}(k) = \hat{\mathbf{x}}^{-}(k) + K(k) \left( z^{\prime}(k) - H\hat{\mathbf{x}}^{-}(k) \right) \tag{9}$$

$$P(k) = \left(I - K\left(k\right)H\right)P^-\left(k\right) \tag{10}$$

where K(k) is the Kalman gain at time k. In addition to the state and output matrices, the Kalman filter requires initial state and error covariance estimates, selection of which is discussed in Section Results and Discussion.
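For an AR model the measurement is scalar and H = [1 0 ... 0], so Equations (6–10) reduce to a few list operations. The sketch below is an illustration under that assumption, with made-up coefficients and noise variance; `kalman_step` is a hypothetical helper name, not part of the paper.

```python
def mat_mul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def kalman_step(x_prev, P_prev, z, F, Q, r=0.0):
    """One predict/correct cycle for a scalar measurement with H = [1 0 ... 0]."""
    n = len(x_prev)
    # Predict: Equations (6) and (7).
    x_pred = [sum(F[i][j] * x_prev[j] for j in range(n)) for i in range(n)]
    P_pred = mat_mul(mat_mul(F, P_prev), transpose(F))
    P_pred = [[P_pred[i][j] + Q[i][j] for j in range(n)] for i in range(n)]
    # Correct: with this H, H P^- H^T is the scalar P^-[0][0].
    s = P_pred[0][0] + r
    K = [P_pred[i][0] / s for i in range(n)]                     # Equation (8)
    innovation = z - x_pred[0]
    x_new = [x_pred[i] + K[i] * innovation for i in range(n)]    # Equation (9)
    P_new = [[P_pred[i][j] - K[i] * P_pred[0][j] for j in range(n)]
             for i in range(n)]                                  # Equation (10)
    return x_new, P_new


# Illustrative AR(2) model: state is [y(k), y(k-1), y(k-2)].
F = [[1.2, -0.5, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
Q = [[0.01, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]   # only Q[0][0] is nonzero
P0 = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
x_new, P_new = kalman_step([1.0, 1.0, 1.0], P0, z=0.9, F=F, Q=Q)
```

With R = [0], the gain on the measured component is one, so the first state entry reproduces the measurement and the first row of the a posteriori covariance vanishes.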

To produce state estimates at time steps with missing measurements, the filter must also be told not to "trust" measurements at time steps with missing observations, be they zero values, NaN indicators, or values that are atypical by orders of magnitude. The output matrix H and measurement noise covariance R are thus indexed to each time step k, and only the entries of these matrices pertaining to non-missing observations are sent to the filter. For example, in the case of a missing input at time step k, a matrix D(k) is defined to indicate the output observation as the only trusted measurement. For the ARX model of Equation (5), the measurement vector z(k) has length two, and D(k) is then the first row of a 2 × 2 identity matrix. For any D(k), H(k) and R(k) are then defined as:

$$H(k) = D\left(k\right)H\tag{11}$$

$$\boldsymbol{R}(k) = \boldsymbol{D}(k)\boldsymbol{R}\boldsymbol{D}^T(k)\tag{12}$$

Recall, however, that the covariance matrix R = [0], thus always evaluating Equation (12) to [0] as well.
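Equations (11, 12) amount to keeping the rows of H (and the rows and columns of R) that belong to observed entries of z(k). The sketch below illustrates this for an ARX model with one lag; the function name `select_observed` and the convention of returning empty matrices when nothing is observed (so that only the prediction step is run) are assumptions made for illustration.

```python
def mat_mul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def select_observed(observed, H, R):
    """Build D(k), H(k) = D(k) H and R(k) = D(k) R D(k)^T.

    observed: list of booleans, one per entry of z(k); True means trusted.
    """
    d = len(observed)
    # D(k) keeps the identity rows that correspond to observed entries.
    D = [[1.0 if c == r else 0.0 for c in range(d)]
         for r in range(d) if observed[r]]
    if not D:
        # Assumed convention: nothing observed, so the correction is skipped.
        return D, [], []
    Hk = mat_mul(D, H)
    Rk = mat_mul(mat_mul(D, R), transpose(D))
    return D, Hk, Rk


# ARX model with one lag: z(k) = [y(k), u(k)], state of length four.
H = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
R = [[0.0, 0.0], [0.0, 0.0]]
# Output observed, input missing at this time step.
D_k, H_k, R_k = select_observed([True, False], H, R)
```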

### EM Algorithm

The EM algorithm is used to produce MLEs of the unknown parameters for the system, including the error variance terms λ1 and λ2, and all regression parameters. The basis for this algorithm lies in definition of the log-likelihood equation of the reconstructed dataset; formerly-missing observations are contained in the states estimated using the Kalman filter. First, the ARX model presented in Equations (1–3) is given new notation:

$$y(k) = \boldsymbol{\phi}\_1^T(k)\boldsymbol{\theta}\_1 + v(k) \tag{13}$$

$$u(k) = \boldsymbol{\phi}\_2^T(k)\,\boldsymbol{\theta}\_2 + w(k) \tag{14}$$

where:

$$\begin{aligned} \boldsymbol{\phi}\_{1}\left(k\right) &= \begin{bmatrix} y(k-1) \\ u(k-1) \\ \vdots \\ y(k-n) \\ u(k-n) \end{bmatrix} & \quad \boldsymbol{\theta}\_{1} &= \begin{bmatrix} a\_{1} \\ b\_{1} \\ \vdots \\ a\_{n} \\ b\_{n} \end{bmatrix} \\ \boldsymbol{\phi}\_{2}\left(k\right) &= \begin{bmatrix} u(k-1) \\ \vdots \\ u(k-p) \end{bmatrix} & \quad \boldsymbol{\theta}\_{2} &= \begin{bmatrix} c\_{1} \\ \vdots \\ c\_{p} \end{bmatrix} \end{aligned}$$

The response is uncorrelated to the excitation for positive time lags. In other words, the response at one time is uncorrelated to the excitation at any time after that. Therefore, model output and input can be considered as independent Gaussian processes. Using the notation above, the joint density of all N observations, denoted f, may be written as:

$$f = \prod\_{k=1}^{N} \left( \left\{ \frac{1}{\sqrt{2\pi\lambda\_{1}}} \exp\left[\frac{-\left(y\left(k\right) - \boldsymbol{\phi}\_{1}^{T}(k)\boldsymbol{\theta}\_{1}\right)^{2}}{2\lambda\_{1}}\right] \right\} \left\{ \frac{1}{\sqrt{2\pi\lambda\_{2}}} \exp\left[\frac{-\left(u\left(k\right) - \boldsymbol{\phi}\_{2}^{T}(k)\boldsymbol{\theta}\_{2}\right)^{2}}{2\lambda\_{2}}\right] \right\} \right) \tag{15}$$

Taking the natural logarithm of this equation yields the loglikelihood criterion:

$$\begin{aligned} L(\boldsymbol{\theta}, \boldsymbol{\Lambda}) &= C - \frac{N}{2} \log \left( \lambda\_1 \right) - \frac{N}{2} \log \left( \lambda\_2 \right) \\ &- \frac{1}{2\lambda\_1} \sum\_{k=1}^N \left( y(k) - \boldsymbol{\phi}\_1^T(k) \boldsymbol{\theta}\_1 \right)^2 \\ &- \frac{1}{2\lambda\_2} \sum\_{k=1}^N \left( u(k) - \boldsymbol{\phi}\_2^T(k) \boldsymbol{\theta}\_2 \right)^2 \end{aligned} \tag{16}$$

Equation (16) contains the constant term C which does not affect maximization. Differentiation of this equation with respect to each of the noise variances and setting the results equal to zero gives the MLEs for the noise variance terms:

$$\begin{split} \lambda\_{1} &= \frac{1}{N} \sum\_{k=1}^{N} E\_{N} \left[ \left( y(k) - \boldsymbol{\phi}\_{1}^{T}(k) \boldsymbol{\theta}\_{1} \right)^{2} \right] \\ &= \frac{1}{N} \left( \sum\_{k=1}^{N} E\_{N} \left[ y^{2}(k) \right] - 2 \sum\_{k=1}^{N} E\_{N} \left[ \boldsymbol{\phi}\_{1}^{T}(k) y(k) \right] \boldsymbol{\theta}\_{1} \right. \\ &\quad \left. + \sum\_{k=1}^{N} \boldsymbol{\theta}\_{1}^{T} E\_{N} \left[ \boldsymbol{\phi}\_{1}(k) \boldsymbol{\phi}\_{1}^{T}(k) \right] \boldsymbol{\theta}\_{1} \right) \end{split} \tag{17}$$

and:

$$\begin{split} \lambda\_{2} &= \frac{1}{N} \sum\_{k=1}^{N} E\_{N} \left[ \left( u(k) - \boldsymbol{\phi}\_{2}^{T}(k) \boldsymbol{\theta}\_{2} \right)^{2} \right] \\ &= \frac{1}{N} \left( \sum\_{k=1}^{N} E\_{N} \left[ u^{2}(k) \right] - 2 \sum\_{k=1}^{N} E\_{N} \left[ \boldsymbol{\phi}\_{2}^{T}(k) u(k) \right] \boldsymbol{\theta}\_{2} \right. \\ &\quad \left. + \sum\_{k=1}^{N} \boldsymbol{\theta}\_{2}^{T} E\_{N} \left[ \boldsymbol{\phi}\_{2}(k) \boldsymbol{\phi}\_{2}^{T}(k) \right] \boldsymbol{\theta}\_{2} \right) \end{split} \tag{18}$$

where EN denotes the conditional expectation based on observations until sample N, not the entire dataset. Substituting Equations (17, 18) into Equation (16) yields a different form of the log-likelihood:

$$\begin{aligned} L(\boldsymbol{\theta}, \boldsymbol{\Lambda}) &= C - \frac{N}{2} \log \left( \frac{1}{N} \sum\_{k=1}^{N} E\_N \left[ \left( y(k) - \boldsymbol{\phi}\_1^T(k) \boldsymbol{\theta}\_1 \right)^2 \right] \right) \\ &- \frac{N}{2} \log \left( \frac{1}{N} \sum\_{k=1}^{N} E\_N \left[ \left( u(k) - \boldsymbol{\phi}\_2^T(k) \boldsymbol{\theta}\_2 \right)^2 \right] \right) \end{aligned} \tag{19}$$

Maximizing this log-likelihood is equivalent to minimizing the quantities in the log terms, so these parts of the equation are differentiated with respect to the θ parameter vectors and set to zero to yield their MLEs:

$$\boldsymbol{\theta}\_1^{(l+1)} = \left(\sum\_{k=1}^N E\_N \left[\boldsymbol{\phi}\_1(k)\boldsymbol{\phi}\_1^T(k)\right]\right)^{-1} \sum\_{k=1}^N E\_N \left[\boldsymbol{\phi}\_1(k) y(k)\right] \tag{20}$$

$$\boldsymbol{\theta}\_{2}^{(l+1)} = \left(\sum\_{k=1}^{N} E\_{N} \left[\boldsymbol{\phi}\_{2}(k)\boldsymbol{\phi}\_{2}^{T}(k)\right]\right)^{-1} \sum\_{k=1}^{N} E\_{N} \left[\boldsymbol{\phi}\_{2}(k) u(k)\right] \tag{21}$$

Thus, the EM algorithm is defined, with the expectation step encompassing the dataset completion (Kalman filtering) and Equations (16–18), and the maximization step Equations (20, 21). The algorithm iterates until some convergence criterion is reached (for the purposes of examples presented later in this paper, that criterion is achieved when the change in either variance parameter becomes <1 × 10<sup>−11</sup> from one iteration to the next; this threshold was selected to be within the computer's machine epsilon threshold, as well as for practicality from a speed perspective). A flowchart of the proposed algorithm's key steps is shown in **Figure 1**.

One iteration of the EM algorithm requires calculation of each expectation contained in Equations (17, 18, 20, 21). As the Kalman filter provides the state estimates (which in turn correspond to the y and u values contained in the ϕ vectors) as well as the error covariance matrices, the appropriate quantities must simply be selected for the applicable summations. The key is to notice that the kth state vector x(k) (of length (2l + 2), for ARX) contains the kth y measurement as its first term, the kth u measurement as its second, the kth ϕ1 vector as terms 3 through (2n + 2), and the kth ϕ2 vector as even terms 4 through (2p + 2). The relevant covariance terms are in the corresponding rows and columns within the P(k) matrix. The required expectations are now defined in Equations (22–27), with subscripts denoting the relevant parts of the state vector and error covariance matrices:

$$E\_N\left[ y^2(k) \right] = \hat{x}\_{1}^2(k) + P\_{1,1}(k) \tag{22}$$

$$E\_N\left[ u^2(k) \right] = \hat{x}\_2^2(k) + P\_{2,2}(k) \tag{23}$$

$$E\_N\left[\boldsymbol{\phi}\_{1}(k) y(k)\right] = \hat{\mathbf{x}}\_{3:(2n+2)}\left(k\right)\hat{x}\_{1}(k) + P\_{3:(2n+2),1}(k) \tag{24}$$

$$E\_N\left[\boldsymbol{\phi}\_{2}(k) u(k)\right] = \hat{\mathbf{x}}\_{4:2:(2n+2)}\left(k\right)\hat{x}\_{2}(k) + P\_{4:2:(2n+2),2}(k) \tag{25}$$

$$\begin{aligned} E\_N\left[\boldsymbol{\phi}\_{1}(k)\boldsymbol{\phi}\_{1}^{T}(k)\right] &= \hat{\mathbf{x}}\_{3:(2n+2)}\left(k\right)\hat{\mathbf{x}}\_{3:(2n+2)}^{T}(k) \\ &+ P\_{3:(2n+2),\,3:(2n+2)}(k) \end{aligned} \tag{26}$$

$$\begin{aligned} E\_N\left[\boldsymbol{\phi}\_{2}(k)\boldsymbol{\phi}\_{2}^{T}(k)\right] &= \hat{\mathbf{x}}\_{4:2:(2n+2)}\left(k\right)\hat{\mathbf{x}}\_{4:2:(2n+2)}^{T}(k) \\ &+ P\_{4:2:(2n+2),\,4:2:(2n+2)}(k) \end{aligned} \tag{27}$$

It is in these expectation definitions that the choice of state vector becomes critical, and where the first distinction is made between this work and that of Isaksson (1993). Without including y(k) and ϕ1(k), and u(k) and ϕ2(k), in the same state vectors, not all of the error covariance terms required for evaluation of Equations (22–27) would be obtained within the Kalman filter portion of the algorithm. Isaksson (1993) appears to only include all terms in the k + 1 state vector.

For AR models, there are modifications to the EM algorithm. Vectors with subscript 2, and thereby Equations (14, 18, 21, 23, 25, and 27) are no longer needed. ϕ1(k) and θ1 are redefined as:

$$\boldsymbol{\phi}\_1(k) = \begin{bmatrix} y(k-1) \\ y(k-2) \\ \vdots \\ y(k-n) \end{bmatrix} \qquad \boldsymbol{\theta}\_1 = \begin{bmatrix} a\_1 \\ a\_2 \\ \vdots \\ a\_n \end{bmatrix}$$

and Equations (15, 19, 24, 26) are rewritten as Equations (28–31), respectively, below:

$$f = \prod\_{k=1}^{N} \left\{ \frac{1}{\sqrt{2\pi\lambda\_1}} \exp\left[\frac{-\left(y\left(k\right) - \boldsymbol{\phi}\_1^T(k)\boldsymbol{\theta}\_1\right)^2}{2\lambda\_1} \right] \right\} \tag{28}$$

$$L(\boldsymbol{\theta}, \boldsymbol{\Lambda}) = C - \frac{N}{2} \log \left( \frac{1}{N} \sum\_{k=1}^{N} E\_N \left[ \left( y(k) - \boldsymbol{\phi}\_1^T(k) \, \boldsymbol{\theta}\_1 \right)^2 \right] \right) \tag{29}$$

$$E\_N\left[\boldsymbol{\phi}\_1^T(k) y(k)\right] = \hat{\mathbf{x}}\_{2:(n+1)}\left(k\right)\hat{x}\_1(k) + P\_{1,2:(n+1)}(k) \tag{30}$$

$$E\_N\left[\boldsymbol{\phi}\_1(k)\boldsymbol{\phi}\_1^T(k)\right] = \hat{\mathbf{x}}\_{2:(n+1)}\left(k\right)\hat{\mathbf{x}}\_{2:(n+1)}^T(k) + P\_{2:(n+1),2:(n+1)}(k) \tag{31}$$
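To make the AR bookkeeping concrete, the sketch below accumulates the expectations of Equations (22, 30, 31) from filtered states and covariances, then applies the parameter update of Equation (20) and the variance estimate of Equation (17). It is a plain-Python illustration under stated assumptions, not the authors' code: the names `solve_linear` and `em_update_ar` are hypothetical, the states use zero-based indexing with x[0] = y(k), and the demonstration data are noise-free with zero covariances so that the update should recover the generating coefficients.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def em_update_ar(x_hats, Ps, n):
    """One parameter and variance update for an AR(n) model.

    Each x_hat is [y(k), y(k-1), ..., y(k-n)]; each P is its error covariance.
    """
    N = len(x_hats)
    S_pp = [[0.0] * n for _ in range(n)]   # sum of E_N[phi phi^T], Equation (31)
    S_py = [0.0] * n                       # sum of E_N[phi y], Equation (30)
    S_yy = 0.0                             # sum of E_N[y^2], Equation (22)
    for x, P in zip(x_hats, Ps):
        S_yy += x[0] ** 2 + P[0][0]
        for i in range(n):
            S_py[i] += x[i + 1] * x[0] + P[i + 1][0]
            for j in range(n):
                S_pp[i][j] += x[i + 1] * x[j + 1] + P[i + 1][j + 1]
    theta = solve_linear(S_pp, S_py)       # Equation (20)
    quad = sum(theta[i] * S_pp[i][j] * theta[j] for i in range(n) for j in range(n))
    lam = (S_yy - 2.0 * sum(s * t for s, t in zip(S_py, theta)) + quad) / N  # Equation (17)
    return theta, lam


# Noise-free AR(2) record with made-up coefficients 1.2 and -0.5.
y = [1.0, 1.0]
for k in range(2, 30):
    y.append(1.2 * y[k - 1] - 0.5 * y[k - 2])
x_hats = [[y[k], y[k - 1], y[k - 2]] for k in range(2, 30)]
Ps = [[[0.0] * 3 for _ in range(3)] for _ in x_hats]
theta, lam = em_update_ar(x_hats, Ps, n=2)
```

In the full algorithm these sums would be recomputed after every Kalman filter pass, with nonzero covariance entries wherever observations were missing.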

The second key difference between the proposed algorithm and that presented in Isaksson (1993) is simply that it stops here; we have not employed the Rauch-Tung-Striebel (RTS) smoother in this paper, as simulations conducted with it did not show a difference in model utility. However, the formulation is kept consistent for the case where smoother operations can be applicable. Please note that the proposed filtering can be implemented in real-time, while the suggested smoothing operations need offline computations. In some applications, the advantage of real-time filtering might be more desirable than the accuracy of the results; implementation of the proposed algorithm for online EM is thus the subject of future work. The ability of this algorithm to identify the correct parameters of incomplete datasets is evaluated in the following sections.

## EXPERIMENTAL SETUP AND SIMULATION

The proposed algorithm is tested on both experimentally collected and simulated datasets. When practical, the experimental datasets are used. There are a variety of variables that may affect the convergence behavior of the algorithm. We investigate the effects of varying percentage of missing data; pattern of missing data; improper model order selection; variable AR, X, and input AR model order; noise level; and model type (ARX vs. AR) on the speed of convergence and accuracy of both parameter and completed dataset estimates. In all instances, some portion of the dataset was left intact to allow the algorithm to be run; at most, we considered 80% missingness in a packet-loss scenario. The majority of investigation results are presented in Section Results and Discussion as box plots. Each box represents the results of 200 simulations according to the prescribed conditions. In the case of randomly missing data, indices of missingness were selected with MATLAB's random number generation scheme.

Missing data indices were kept the same across sets of 200 iterations. This ensured that the variability in a given box plot was associated only with the initial parameter guesses. The initial guesses were always randomly generated, but constrained so as to ensure a stationary model. For example, all ARX parameters were limited within (−1/max(n, m), 1/max(n, m)). Initial noise variance estimates were set at 0.01, as this seemed to obtain good results across the board in all simulations. The initial state vector was simply the first (2l + 2) measurements, and the initial Kalman error covariance matrix was the identity matrix.
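The initialization described above can be sketched as follows. This is an assumption-laden illustration rather than the authors' MATLAB code: the function name `initial_guesses` is hypothetical, Python's `random` module stands in for MATLAB's generator, and applying the same bound to the input AR parameters is an assumption, since the text states the bound only for "all ARX parameters".

```python
import random


def initial_guesses(n, m, p, seed=None):
    """Draw bounded random starting parameters and fixed starting variances."""
    rng = random.Random(seed)
    bound = 1.0 / max(n, m)   # limit stated in the text for the ARX parameters

    def draw(count):
        return [rng.uniform(-bound, bound) for _ in range(count)]

    return {
        "a": draw(n),          # AR parameters
        "b": draw(m),          # X parameters
        "c": draw(p),          # input AR parameters (same bound assumed)
        "lambda1": 0.01,       # initial noise variance estimates from the text
        "lambda2": 0.01,
    }


guess = initial_guesses(4, 4, 4, seed=1)
```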

The quantity typically selected for evaluation of parameter estimate accuracy is the root mean square of the normalized error (RMSNE). The RMSE (note the lack of normalization) is described in Chai and Draxler (2014); we elect to include normalization as we look at both parameter convergence, and adequacy of dataset reconstruction. Equation (32) defines the RMSNE:

$$\text{RMSNE} = \sqrt{\frac{1}{M} \sum\_{f=1}^{M} \left(\frac{p\_m - p\_f}{p\_f}\right)^2} \tag{32}$$

where p<sub>m</sub> is the estimated point (be it a parameter or a data point) from the missing data analysis; p<sub>f</sub> is the corresponding full data estimate (or true value); and M is the number of values in a given estimate vector. In calculating RMSNE values when p<sub>f</sub> is near-zero, large values may result; for this reason, we have elected not to include points in RMSNE calculations when the actual values are below a tolerance, in this case 0.001.
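Equation (32) with the near-zero exclusion can be sketched in a few lines. Two details below are assumptions rather than statements from the text: the tolerance is applied to the magnitude of the reference value, and M is taken as the number of points retained after the exclusion.

```python
import math


def rmsne(estimates, references, tol=1e-3):
    """Root mean square of the normalized error, skipping near-zero references."""
    kept = [(pm, pf) for pm, pf in zip(estimates, references) if abs(pf) >= tol]
    if not kept:
        raise ValueError("no reference values at or above the tolerance")
    return math.sqrt(sum(((pm - pf) / pf) ** 2 for pm, pf in kept) / len(kept))


# The third pair is dropped because its reference value is below the tolerance.
example = rmsne([1.1, 2.0, 0.5], [1.0, 2.0, 0.0001])
```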

### Structural Vibration Data

The real dataset used in this paper consists of the responses from two of the sensors in one of 39 tests outlined in Nigro and Pakzad (2014) on a two-bay, structural steel frame. The collected data represents structural acceleration responses to an impulse load. **Figure 2A** shows the (highlighted) frame within the laboratory, with **Figure 2B** presenting an elevation drawing with the sensors of concern and impulse loading location. The responses from sensor L5 were considered as the "output" or y values of the ARX model, with L4 constituting the "input" u values. These sensor designations are used so as to ease the cross-referencing of this work with that in Nigro and Pakzad (2014) and Shahidi et al. (2015). The test lasted 2 s and each wireless sensor had 500 Hz sampling rates, yielding 1,000 total measurements apiece. The amplitude-limited impulse excitation was selected due to the assumption of linear behavior for the frame. The impulse also does not impose a specific frequency onto the frame. This represents an advantageous similarity to ambient vibration (Shahidi et al., 2015), the most likely condition during which monitoring of a real structure would occur.

TABLE 1 | ARX parameters for generation of simulated datasets.


### Simulated Data

The simulated dataset used in this paper was selected to be similar to that obtained from the experimental setup above, but more easily controlled. We began by fitting an ARX(4, 4) with input AR(4) model to the experimental datasets [this model order was selected due to its good results in Shahidi et al. (2015)]. Then, the MLE parameters were simply used as the "true" parameters with which to generate entirely new "output" (and input, for ARX) datasets. Normally-distributed randomly-generated noise could also be added with controllable mean and variance. In all cases, the initial output and input values for the generated datasets were set to unity. With the exception of the noise variances, the same parameters were used for generation of simulated datasets at all times. These (rounded) are provided in **Table 1**.
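A dataset of this kind can be generated directly from Equations (13, 14). The sketch below is illustrative only: `simulate_arx` is a hypothetical name, the coefficients in the example are made up (the values of **Table 1** are not reproduced here), and zero-mean noise is assumed.

```python
import random


def simulate_arx(a, b, c, N, lambda1=0.0, lambda2=0.0, seed=None):
    """Generate N samples of output y and input u from an ARX model.

    a, b: AR and X coefficients of the output equation (lag 1 first);
    c: input AR coefficients; lambda1, lambda2: noise variances.
    """
    rng = random.Random(seed)
    lag = max(len(a), len(b), len(c))
    y = [1.0] * lag   # initial values set to unity, as described in the text
    u = [1.0] * lag
    for k in range(lag, N):
        v = rng.gauss(0.0, lambda1 ** 0.5)
        w = rng.gauss(0.0, lambda2 ** 0.5)
        u_k = sum(c[i] * u[k - 1 - i] for i in range(len(c))) + w
        y_k = (sum(a[i] * y[k - 1 - i] for i in range(len(a)))
               + sum(b[i] * u[k - 1 - i] for i in range(len(b))) + v)
        u.append(u_k)
        y.append(y_k)
    return y, u


# Noise-free first-order example with made-up coefficients.
y_sim, u_sim = simulate_arx(a=[0.5], b=[0.2], c=[0.3], N=4)
```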

## RESULTS AND DISCUSSION

### Variable Missing Data Pattern—Experimental

Three missing data patterns were investigated for experimental data with ARX models. **Figure 3** displays the results of variable random missing data percentages. Missing data indices were randomly generated in MATLAB. This may represent intermittent, random sensor network communication disruptions in the real world. **Figure 4** displays the results of variable block missing data percentages. This represents the case of a packet-loss scenario in the real world. The location within the dataset of the block of missing data was confined to the middle of each dataset. **Figure 5** displays the results of variable repetitive patterns of missing data. This represents a more predictable sensor communication disruption pattern. In each case, missing data was simulated for both output and input data.
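The three patterns can be expressed as index generators. The sketch below is a hypothetical illustration, not the MATLAB code used in the study: the function names, the exact centering of the block, and the phase of the repetitive pattern are all assumptions.

```python
import random


def random_missing(N, fraction, seed=None):
    """Indices of randomly scattered missing observations."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(N), round(fraction * N)))


def block_missing(N, fraction):
    """One contiguous block of missing observations in the middle of the record."""
    count = round(fraction * N)
    start = (N - count) // 2
    return list(range(start, start + count))


def periodic_missing(N, period):
    """Every period-th observation missing (period = 5 gives 20% missingness)."""
    return list(range(period - 1, N, period))


rand_idx = random_missing(1000, 0.2, seed=0)
block_idx = block_missing(1000, 0.2)
periodic_idx = periodic_missing(1000, 5)
```

Fixing the seed mirrors the practice described earlier of keeping the missing indices the same across a set of simulations.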

In each of **Figures 3**–**5**, the horizontal axis labels indicate the pattern (or percentage) of missing indices along the length of the dataset. In all cases, an ARX(4, 4) with input AR(4) model was considered. For each missing data pattern or percentage, 200 simulations were run. The (A) portions of each figure represent box plots of the parameter RMSNE values for these simulations, and the (B) portions the same for the data RMSNE values. In each parameter figure, the lightest box plots indicate AR parameters, the medium-hued X parameters, and the darkest input AR parameters. In each data figure, the lighter box plots indicate output data, and the darker input data. The theoretically "correct" parameters are not known for the experimental dataset; for comparison in RMSNE calculations, we use the results of EM parameter estimation on the complete dataset (without Kalman filtering).

The results of the above simulations are as expected. We are primarily concerned with parameter estimation, so the absence of a great difference in the accuracy of the parameter estimates across different missing data patterns is as intended for a robust parameter estimation algorithm. It is worth noting that the X parameters tend to have significantly higher RMSNE values across all missing data percentages than the AR or input AR parameters. This is likely due in large part to the X parameters being significantly smaller in magnitude than the AR or input AR parameters.

Of some concern is the drastic increase in data reconstruction error for block missingness compared to random or patterned missingness, and particularly the variability in this error at high missingness percentages. This phenomenon occurs due to the MLE model causing the Kalman estimates to go to zero at a swifter rate than the actual system. This implies that the algorithm, in its current state, works adequately for parameter estimation with a variety of missing data patterns, but is not particularly appropriate for dataset reconstruction in the case of packet loss, or prediction. This effect additionally begins to affect the parameter estimate reliability and convergence rate beyond 60 percent missingness. However, this is not the only factor at play, since there may be other regression models (or different orders) that simply fit the dataset better.

It may appear that cases of patterned missingness, as opposed to block or random, have less variability in the accuracy of the estimated parameters and reconstructed datasets, as well as number of iterations to convergence. However, if missing every fifth observation is thought of as missing 20 percent of observations (and furthermore, every fourth as 25 percent), we appear to see similar variation as at these percentages for random and block missingness. Finally, note the termination percentages for the **Figures 3**, **4** horizontal axes. In this section, and beyond, the criterion for these termination procedures was that the algorithm became impractically slow to converge. This suggests that packet loss scenarios, with their caveat on reconstruction, may still be handled with the algorithm for parameter estimation, and at higher percentages of missing data than random or patterned-missingness.

### Improper Model Order Selection—Simulated

The purpose of this section is to evaluate the effectiveness of the algorithm for parameter identification and dataset completion when a model is generated with a different series of model orders than that used for the fit in the algorithm. In each simulation, the procedure described in Section Simulated Data was used for model generation [i.e., it was ARX(4, 4) with AR(4)], all with 20% missing data. Cases of overparameterization were explored in these simulations, and would invalidate typical RMSNE calculations for the parameters. Therefore, a different means of presenting parameter convergence is shown in **Figure 6**. The horizontal axis of this plot shows the number of iterations required for converged parameter estimates; thus, each vertical column of parameter estimates corresponds to the results of the same simulation. From left to right in the figure, the vertical columns represent 2nd, 6th, 3rd, 5th, and 4th order ARX model parameters. The vertical axis shows the signed logarithm of the parameter magnitude (SLPM), defined in Equation (33):

$$\text{SLPM} \, = \, \text{sgn}(p\_i) \times \log\left(|p\_i|\right) \tag{33}$$

where p<sub>i</sub> represents a particular parameter value. The logarithm of this quantity is selected to properly display parameters of very different magnitudes on the same plot, and the sign function is selected to include both positive and negative values on the plot. In **Figure 6**, the large symbols at the right end of the horizontal lines represent the true parameter values of the ARX(4, 4) and input AR(4) models used for dataset generation (note then that these are NOT representative of convergence behavior in the axis-limit-number of iterations). Symbols within the figure correspond to those of the fits. Symbols occurring within the plots, but without corresponding border symbols (and indicated), were thus determined from higher-than-fourth-order models. **Figure 7**, alternatively, uses the same conventions as above for data value RMSNE calculations.
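Equation (33) translates directly into code. In the sketch below the base of the logarithm is an assumption (base 10 is used as the default because the text does not state it), and raising an error for a zero parameter is likewise an illustrative choice.

```python
import math


def slpm(p, base=10.0):
    """Signed logarithm of the parameter magnitude, Equation (33)."""
    if p == 0:
        raise ValueError("SLPM is undefined for a zero parameter")
    sign = 1.0 if p > 0 else -1.0
    return sign * math.log(abs(p), base)


values = [slpm(100.0), slpm(-100.0), slpm(0.01)]
```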

From **Figure 6** it is clear that parameter estimates are very accurate when correct model order is selected, and less accurate when lower model orders are selected. Larger model orders (converging in 35 and 40 iterations) have non-zero parameter estimates at all time lags, and fail to identify the proper parameter estimates. In **Figure 7**, however, the results of data RMSNE calculations are presented in a similar format to **Figures 3B**, **4B**, **5B**. For input data, it can be seen that there does not appear to be an appreciable difference in measurement estimates for higher order models compared to lower, despite different parameter estimates. However, output data is more accurately reconstructed in over-parameterized systems. In real systems, a "correct" model order is not accessible. Together, these observations regarding parameter and dataset estimates highlight the importance of consistent, as opposed to necessarily "correct" model order selection as a system is monitored over time. This is particularly important if regression parameters are being monitored as indicators of system change. However, these observations also suggest that dataset reconstruction may occur with safe (i.e., large) estimates of system model orders when they are unknown.

### Variable Model Order—Experimental

The aim of this section is to investigate the convergence behavior of the algorithm with variable model orders. Here, we have only considered one simulation for each model order grouping (each of which corresponds to a data point on **Figures 8**, **9**). In **Figures 8** (output data) and **9** (input data), the size of a plotted point is inversely proportional to the RMSNE of the converged Kalman estimated data values. The shade distribution of the plotted points can be likened to a histogram of the iterations to convergence when compared with other points, with four distinct shades corresponding to "bins" of size 60. So, for example, the (2, 2, 2) point has the lightest shade, and corresponds to a set of model orders that converged in the algorithm in 60 iterations or less. Finally, the numbers above and/or below select points on the figures correspond to the maximum, minimum, and quartile iteration or RMSNE values obtained in the analysis. For example, the (4, 2, 4) point converged in 210 iterations, tied for the most of any model order grouping. As another example, point (2, 6, 4) resulted in one of the lowest RMSNE values of any model order considered. In all cases, 20% of data was randomly missing from the experimental dataset.

In many ways, aspects of this figure correspond well with previous observations. We noted in Section Variable Missing Data Pattern—Experimental that the X parameters were lower in magnitude than the AR or input AR. This implies that the ARX model may in fact be dominated by the AR portion for these particular datasets; sure enough, in **Figures 8**, **9**, we tend to see the largest accuracy with high-order AR and input AR model orders, respectively. Furthermore, while convergence is relatively quick, lower accuracy at lower model orders is observed across the board, which likely confirms the earlier notion that structural vibrations may be described specifically with high-order AR processes.

### Variable Noise—Simulated

The aim of this section is to investigate the effect of the signal-to-noise ratio on parameter identification. Datasets were simulated according to the procedure in Section Simulated Data, and the noise levels in **Figure 10** were normally distributed with variances equal to the indicated percentages of the initial signal values. In all cases, 20% of the data was randomly missing, and box plot shade conventions are the same as in Section Variable Missing Data Pattern—Experimental.

It can be seen from **Figure 10** that, with the exception of very low noise, increasing percentages of noise do not seem to significantly affect parameter estimates. This is due to the characteristics of the Kalman filter mentioned in Section Maximum Likelihood, Expectation Maximization, and Kalman Filter. Furthermore, the number of iterations does not typically increase significantly as noise levels do, though its variability does to some extent. Finally, it may be noted that data estimate errors increase exponentially with the noise from low levels, then stabilize at higher noise variance. This effect is expectedly more drastic for the output data, as it is affected by both its own noise and that in the input.

### Model Type: ARX vs. AR—Simulated

Finally, the performance of the algorithm for parameter identification of ARX models is compared to that of AR models. Datasets are simulated according to the procedure of Section Simulated Data, which in the case of AR does not require X or input AR parameters (third and fourth columns of **Table 1**). Also of note is the increasing percentages of missing data across the horizontal axes of **Figures 11**, **12**. In all cases, no noise was included in the simulation.

In addition to that shown in the figure, the algorithm was also run for the simulated case where no observations were missing. Both the mean and standard deviation of RMSNE values for input AR parameters and output data are zero (it is for this reason that the results are not included in **Figure 11**). This result is reached in an average of 18 iterations with a standard deviation of 0.483.

Similarly, for the zero missingness ARX model input data, mean and standard deviation of RMSNE values are 7.57e–16 and 8.23e–16, respectively. For output data, mean and standard deviation are 1e–16 and 1e–15, respectively. Mean and standard deviation of X parameter RMSNE values are 1.29e–11 and 1.46e–11, respectively; for input AR parameter RMSNE values, 6.81e–15 and 6.8e–14, respectively; and for AR parameter RMSNE values, 1.76e–13 and 2.11e–13, respectively.

If the AR parameters are compared between **Figures 11**, **12**, it can be seen that they are estimated more accurately for the purely AR model, as opposed to ARX. This may be a function of the model type selected for this particular system. Similar to experimental datasets, the X parameters are again evaluated with generally a lower degree of accuracy. It is worth noting that if ARX identification in **Figure 12** is compared to that in **Figure 3** (random missingness), simulated datasets' parameters are generally estimated at a higher degree of accuracy when compared to real datasets, at similar percentages of missingness. However, in the case of simulated datasets, the algorithm becomes impractically slow at higher percentages of missing data, and there is generally more variability in the estimates.

## CONCLUSIONS

In this paper, an algorithm was proposed to identify parameters of both ARX and AR models fit to datasets with missing observations. This algorithm joins a relatively short list of those dedicated to regression models, and more specifically ARX models, subject to missing observations, and presents important modifications to the state vector considered in the presented state-space model. The EM algorithm in conjunction with the Kalman filter is used for MLE of parameters and reconstruction. The effects of varying percentage of missing data; pattern of missing data; improper model order selection; variable AR, X, and input AR model order; noise level; and model type (ARX vs. AR) on the speed of convergence and accuracy of both parameter and completed dataset estimates were investigated. Favorable conditions for accurate parameter estimation include lower percentages of missing data, parameters of similar magnitude with one another, and selected model orders similar to those true to the dataset. Favorable conditions for dataset reconstruction include random and periodic missing data patterns, lower percentages of missing data, and proper model order selection. The algorithm is particularly robust to varied noise levels, an effect of the use of the Kalman filter.

Future research in relation to this work should further formalize the relationships that are apparent in this work. Boxplots in this work were concerned with evaluating the effect of changing initial parameter guesses; future work should explore the effects of consistent initial guesses, but variable missingness data indices. Additionally, the effect of the parameter values themselves on estimation accuracy should be further explored. An investigation should also be conducted into features of the converged ML statistical parameters that may be extracted to further evaluate the effectiveness of the algorithm, and different convergence criteria should be explored. Finally, alternative missing data patterns may be explored, including removing data from only one of the output or input at a time.

### AUTHOR CONTRIBUTIONS

Simulations, writing, and revision were carried out by both MH and SP, with additional writing and revision contributions from NG.

### FUNDING

Research funding is partially provided by the National Science Foundation through Grant No. CMMI-1351537 by the Hazard Mitigation and Structural Engineering program, and by a grant from the Commonwealth of Pennsylvania, Department of Community and Economic Development, through the Pennsylvania Infrastructure Technology Alliance (PITA).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Horner, Pakzad and Gulgec. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Output-Only Vibration-Based Monitoring of Civil Infrastructure via Sub-Nyquist/Compressive Measurements Supporting Reduced Wireless Data Transmission

Kyriaki Gkoktsi <sup>1</sup> , Agathoklis Giaralis <sup>1</sup> \*, Roman P. Klis <sup>2</sup> , Vasilis Dertimanis <sup>2</sup> and Eleni N. Chatzi <sup>2</sup>

<sup>1</sup> Department of Civil Engineering, City, University of London, London, United Kingdom, <sup>2</sup> Department of Civil, Environmental and Geomatic Engineering, ETH Zürich, Zurich, Switzerland

#### Edited by:

Dryver R. Huston, University of Vermont, United States

#### Reviewed by:

Fernando Moreu, University of New Mexico, United States Ivan Bartoli, Drexel University, United States

\*Correspondence: Agathoklis Giaralis agathoklis.giaralis.1@city.ac.uk

#### Specialty section:

This article was submitted to Structural Sensing, a section of the journal Frontiers in Built Environment

Received: 30 March 2019 Accepted: 03 September 2019 Published: 24 September 2019

#### Citation:

Gkoktsi K, Giaralis A, Klis RP, Dertimanis V and Chatzi EN (2019) Output-Only Vibration-Based Monitoring of Civil Infrastructure via Sub-Nyquist/Compressive Measurements Supporting Reduced Wireless Data Transmission. Front. Built Environ. 5:111. doi: 10.3389/fbuil.2019.00111

The consideration of wireless acceleration sensors is highly promising for cost-effective output-only system identification in the context of operational modal analysis (OMA) of large-scale civil structures as they alleviate the need for wiring. However, practical monitoring implementations for OMA using wireless units suffer a number of drawbacks related to wireless transmission of densely sampled acceleration time-series including the energy self-sustainability of the sensing nodes. In this work, two recently proposed approaches for output-only modal identification addressing the above issues through balancing monitoring accuracy with data transmission costs are comparatively studied and numerically assessed using field recorded acceleration datasets from two different structures: (i) an operating on-shore wind turbine, (ii) an open to traffic highway bridge. One approach utilizes non-uniform-in-time deterministic multi-coset sampling at sub-Nyquist rates to capture structural response acceleration time-series under ambient excitation assuming stationary signal conditions. In this approach, a power spectrum blind sampling technique is used to estimate the response acceleration power spectral density matrix from the low-rate sampled measurements and is coupled with the Frequency Domain Decomposition method of OMA. The other is a spectro-temporal compressive sensing approach which recovers response acceleration signals through time-series reconstruction in the time domain from sub-Nyquist non-uniform-in-time randomly sampled measurements. Prior knowledge of signal structure in the spectral domain is exploited through smart on-sensor operations and sensor/server communication. The benefits and limitations of the considered approaches are discussed and demonstrated by processing the field recorded datasets for different levels of signal compression and by estimating battery lifetime gains at a single sensor achieved by reduced data transmission. It is concluded that the two approaches are readily applicable in OMA of large-scale structures and can be used complementarily depending on the requirements of any particular acceleration monitoring campaign: time-series extraction for further interrogation vs. solely modal properties estimation.

Keywords: vibration-based modal identification, multi-coset sampling, spectro-temporal compressive sensing, blind power spectrum sampling, operational modal analysis, wireless sensors

## INTRODUCTION

In recent years, monitoring schemes have proved their worth in terms of the potential for smart operation and maintenance of civil structures (Farrar and Worden, 2012). When further coupled with the concept of the Value of Information, monitoring of structures serves as a direct tool for informing decision support on the life cycle management of structural assets. In this context, operational modal analysis (OMA) offers an efficient monitoring modality for large-scale structural systems, as it is typically enabled by low-cost acceleration sensors, allowing for long-term or permanent system supervision (Limongelli et al., 2016). Due to the difficulty involved in the measurement of operating loads, OMA relies on output-only information for extraction of dynamic properties of systems that are typically subjected to low-amplitude operational loads (e.g., due to wind, traffic, etc.) (Brincker and Ventura, 2015). For the common case of linear systems, such properties include the modal characteristics of the structure (e.g., natural frequencies, damping and mode shapes). Since no explicit loading information is assumed available, output-only techniques commonly assume ambient conditions corresponding to a flat spectrum over a wide range of frequencies (i.e., a white noise excitation assumption). Such techniques are shown to perform well, even under the challenge of varying environmental and operational conditions (Reynders et al., 2013; Shi et al., 2016; Avendaño-Valencia et al., 2017), while the extracted modal structural properties may then be exploited for a variety of tasks including condition assessment, design verification, structural health monitoring (SHM) and, ultimately, residual life prediction of civil structures (Straub et al., 2017). Still, OMA has been mostly demonstrated for use with tethered sensing configurations.

Wireless sensor networks (WSNs) offer a low-cost and easily deployable alternative to tethered (wired) acceleration sensors that is particularly attractive for large structures featuring locations of reduced accessibility (Lynch, 2007). The "smart" feature of most such wireless platforms, allowing for local processing at the wireless sensor (node) level, has been exploited for decentralized autonomous monitoring solutions (Nagayama et al., 2009). Nonetheless, WSNs have so far not enjoyed widespread adoption into practice, largely owing to their limited wireless transmission bandwidth and the maintenance costs related to frequent battery replacement (Klis and Chatzi, 2016a; Gkoktsi and Giaralis, 2019).

In order to extend the self-sustainability of the nodes, and to reduce the power allocated for transmission, compressive sensing (CS) techniques have been explored, with Bao et al. (2010) and O'Connor et al. (2013) leading developments in this field, with applications on actual road bridges. CS samples at random, non-uniform-in-time rates, resulting in equivalent sampling below the Nyquist rate. In a nutshell, CS asserts that a discrete-time finite length signal (e.g., an analog response acceleration signal uniformly sampled in time) can be recovered, with high probability, from a relatively small number of randomly acquired samples/measurements in time, by solving an underdetermined system of linear equations (Candès and Tao, 2006; Donoho, 2006). Importantly, the number of random (compressed) measurements required for a faithful signal recovery depends on the "sparsity" of the acquired signal with respect to some pre-specified vector basis, such as the discrete Fourier transform (DFT) basis used for representation of vectors in the Fourier/frequency domain. Specifically, a K-sparse/compressible signal features K expansion coefficients on a given basis with values larger than a relatively low threshold; the smaller K is, the sparser the signal is, and thus the fewer random measurements are required for its sparse recovery (i.e., estimation of the K significant expansion coefficients) within the CS framework. In this regard, all algorithms for sparse signal recovery necessitate an assumption of signal sparsity (Vaswani and Zhan, 2016), which is a priori unknown and is adversely affected by signal noise. Further, much research work has been devoted to constructing sparsifying bases or, more generally, sparsifying dictionaries tailored for different applications, such as image denoising (Razaviyayn et al., 2014), video sensing (Eslani et al., 2014), and ultrasonic non-destructive damage detection (Fuentes et al., 2019).

In this setting, O'Connor et al. (2014) were the first to deploy customized CS-based wireless sensors in a long-term monitoring field application for an overpass in MI, USA. By randomly triggering in time conventional ADC units, non-uniform-in-time compressed acceleration response signals were acquired. Sparse recovery assuming a DFT expansion basis, as well as an empirically specified level of sparsity, was applied to the compressed data to estimate the response acceleration power spectral density (PSD) matrix. The latter matrix was used in conjunction with the standard frequency domain decomposition (FDD) algorithm to extract mode shapes and natural frequencies within OMA. Yang and Nagarajaiah (2015) and Park et al. (2014) contributed two significantly different approaches for mode shape estimation from CS-based non-uniform-in-time random sampling of structural vibration time-histories at sub-Nyquist rates. In Yang and Nagarajaiah (2015) mode shape estimation relies on modal structural responses obtained by application of blind source separation directly to the compressed measurements of structural response signals. Sparse signal recovery (reconstruction) in the time-domain is next applied to each compressed modal response vector to retrieve the underlying structural natural frequencies and modal damping ratios. In Park et al. (2014) mode shapes are obtained by means of a singular value decomposition-based algorithm applied directly to response acceleration compressed measurements, without taking any reconstruction step, under the assumption of noiseless undamped free vibration structural response signals (i.e., multitone signal model).

The standard approach to CS-based signal recovery relies on l1-norm minimization and solution of the so-called Basis Pursuit De-Noising problem (BPDN). This approach is typically adopted in CS implementations in OMA applications using wireless sensors (O'Connor et al., 2014; Zou et al., 2015). In enhancing the BPDN approach, Klis and Chatzi (2016b) utilize its re-weighted variant, known as the re-weighted Basis Pursuit De-Noising problem (rwBPDN), also known as the l1-analysis problem (Becker et al., 2011). The resulting Spectro-temporal CS (STCS) scheme leads to improved time-domain signal recovery with respect to the standard BPDN approach. Furthermore, in order to alleviate the heuristic a priori assumption on sparsity Klis and Chatzi (2016a) initially exploit the concept of a leading node, i.e., a (typically) tethered node which is permanently logging at higher sampling rates, and forms part of a Hybrid Sensor Network (HSN). The latter forms a compilation of primarily wireless and a minimal number of tethered sensors. In this way the signal support, on the basis of which the recovery is performed, is narrowed, which results in reduced transmission costs. In later work (Klis and Chatzi, 2016b) the leading node requirement is removed, with wireless sensors transmitting partial temporal information and a selected part of the spectral information in a two way communication with the server (central node). Sparse signal recovery is again enabled via solution of the rwBPDN problem. The recovered signals may then serve as input for any standard output-only modal identification algorithm for OMA. As signal reconstruction is a main target of this approach, it is particularly attractive for use with time domain-based identification methods, such as approaches based on Auto-regressive models, subspace identification algorithms, or real-time tracking using Kalman filters. 
The proposed scheme has been validated on synthetic simulations generated for a benchmark four-story frame structure of the American Society of Civil Engineers, as well as data from operating structures.

Aiming to circumvent the signal sparsity requirement for the identification of modal characteristics (natural frequencies and modal shapes) from sub-Nyquist sampled response acceleration data, Gkoktsi and Giaralis (2017), Gkoktsi and Giaralis (2019) developed an alternative to the former CS-based approaches. The latter approach couples the sub-Nyquist non-uniform-in-time deterministic multi-coset sampling strategy (Venkataramani and Bresler, 2001), with a Power Spectrum Blind Sampling (PSBS) technique (Ariananda and Leus, 2012; Tausiesakul and González-Prelcic, 2013) extended to the multi-channel case by Gkoktsi et al. (2015) to estimate the response acceleration PSD matrix (second order statistics) from correlation sequences of the sub-Nyquist measurements. Ultimately, the considered PSBS approach derives structural modal properties by application of the FDD algorithm to the estimated PSD matrix without response acceleration signal recovery in the time domain and without making an a priori assumption on signal sparsity in the DFT or in any other domain (Gkoktsi and Giaralis, 2017). In doing so, measured response signals are assumed as stationary correlated stochastic processes in alignment with OMA theory (Brincker and Ventura, 2015). In this respect, the PSBS approach does not return the time-histories of acceleration response signals but, on the positive side, it is purely signal agnostic in terms of signal structure in the frequency domain and, therefore, indifferent to signal sparsity attributes and/or to additive noise. In this regard, it was shown that the PSBS approach achieves quality mode shape extraction and robust modal strain-based damage detection from response acceleration measurements corrupted by additive noise (Gkoktsi et al., 2016) at rates as low as 80% below Nyquist leading to significant energy consumption gains in wireless sensors (Gkoktsi and Giaralis, 2019).

Previously, the PSBS approach has been compared to the standard BPDN CS-based approach in terms of quality mode shape extraction (Gkoktsi and Giaralis, 2017). Herein, we comparatively assess the PSBS-based scheme (Gkoktsi and Giaralis, 2017, 2019) and the STCS approach (Klis and Chatzi, 2016a,b) in an effort to demonstrate their readiness levels for output-only modal identification supported by low-power wireless sensors, which is the first step toward cost-effective SHM. The strengths and limitations of each approach are elaborated upon, with a comparison in terms of data compression, potential for modal identification, and, for the case of STCS, time-domain signal recovery. In order to validate the presented tools, field data from large scale structures are utilized, namely acceleration response time-histories from an operating on-shore wind turbine as well as from a highway overpass open to traffic. For the second structure, estimates are provided of the battery lifetime gains at the sensor node level achieved through power consumption savings from reduced wireless data transmission.

## METHODOLOGICAL FRAMEWORK

### Spectro-Temporal Compressive Sensing (STCS) Approach via rwBPDN

Spectro-Temporal Compressive Sensing (STCS) relies on the formulation of the missing data estimation problem (Candès and Romberg, 2007; Becker et al., 2011; Candès and Plan, 2011). Let us assume a signal recorded by sensor i, comprising N samples. The missing data estimation problem is formulated as:

$$\mathbf{y}\_i = \mathbf{S} \mathbf{x}\_i \tag{1}$$

where $\mathbf{y}\_i = [y\_{1i}, y\_{2i}, \ldots, y\_{Mi}]^{\mathrm{T}} \in \mathbb{R}^{M}$ is the measured signal of partial (incomplete) observations, of dimension $M < N$, and $\mathbf{S} \in \mathbb{R}^{M \times N}$ is a zero-one selection matrix, known a priori. The goal of the missing data estimation problem is to recover the original full signal $\mathbf{x}\_i$ given the incomplete observations $\mathbf{y}\_i$ and the selection matrix $\mathbf{S}$.
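As a minimal illustration of equation (1), the sketch below builds a zero-one selection matrix from a list of retained sample indices and applies it to a short signal. The function names and the toy values are hypothetical and serve only to make the roles of the selection matrix, the full signal and the partial observations concrete.

```python
def selection_matrix(kept, n_samples):
    """Return the M x N zero-one matrix S that keeps the samples listed in `kept`."""
    return [[1 if col == row_idx else 0 for col in range(n_samples)]
            for row_idx in kept]


def apply_matrix(matrix, vector):
    """Plain matrix-vector product, written out to keep the sketch dependency-free."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


# Toy full signal x_i with N = 8 samples (illustrative values only).
x_full = [0.3, -1.2, 0.8, 2.1, -0.5, 0.0, 1.7, -0.9]
kept = [0, 3, 4, 7]                      # indices of the M = 4 observed samples
S = selection_matrix(kept, len(x_full))
y_obs = apply_matrix(S, x_full)          # y_i = S x_i
```

Each row of `S` contains a single one, so `y_obs` simply lists the retained entries of `x_full` in order.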

As demonstrated in Klis and Chatzi (2016b), structural response signals admit a sparse representation via a Discrete Fourier Transform (DFT) orthonormal basis, $\mathbf{A} \in \mathbb{C}^{N \times N}$, according to the following equation:

$$\mathbf{x}\_{i} = \mathbf{A}\mathbf{c}\_{i}, \text{ where } \mathbf{A}\_{i,l} = \frac{1}{\sqrt{N}} e^{-j2\pi i \frac{l}{N}} \tag{2}$$

where $\mathbf{c}\_i = [c\_{1i}, c\_{2i}, \ldots, c\_{Ni}]^{\mathrm{T}} \in \mathbb{R}^{N}$ is the coefficient vector. Via substitution of equation (2) into equation (1), the observations vector $\mathbf{y}\_i$ may be recovered on the basis of a finite number of sparse coefficients $\mathbf{c}\_i$, as follows:

$$\mathbf{y}\_i = \mathbf{S} \mathbf{A} \mathbf{c}\_i \tag{3}$$

When the vector of observations $\mathbf{y}\_i$ is incomplete, as is the case for missing data, equation (3) comprises an ill-conditioned problem with multiple solutions for the coefficients vector $\mathbf{c}\_i$. Within the compressive sensing context, however, we are interested in the solution rendering the sparsest vector $\mathbf{c}\_i$ for which equation (1) is fulfilled. This is recovered via solution of a non-deterministic polynomial time-Hard (NP-Hard) combinatorial problem:

$$\underset{\mathbf{c}\_i}{\text{arg min }} \|\mathbf{c}\_i\|\_0 \quad \text{subject to} \quad \mathbf{y}\_i = \mathbf{S} \mathbf{A} \mathbf{c}\_i \tag{4}$$

As the formulation of equation (4) is non-trivial to solve, Candès and Romberg (2007) concluded that solving a relaxed $\|\cdot\|\_1$ convex optimization problem, known as a basis pursuit problem, is equivalent to solving the original $\|\cdot\|\_0$ NP-Hard combinatorial problem, given that an appropriate condition, the so-called Restricted Isometry Property (RIP), holds. The basis pursuit problem is expressed as:

$$\underset{\mathbf{c}\_i}{\text{arg}\,\text{min}}\,\|\mathbf{c}\_i\|\_1 \quad \text{subject to} \quad \mathbf{y}\_i = \mathbf{S} \mathbf{A} \mathbf{c}\_i \tag{5}$$

The formulation of equation (5) offers a significant reduction in terms of computational toll, while guaranteeing that the original response signal $\mathbf{x}\_i$ may be fully recovered, provided that the K-RIP condition is satisfied by matrix $\mathbf{SA}$. For each positive integer K, we may define the isometry constant $\delta\_K$ of matrix $\mathbf{SA}$ as the smallest number ensuring that the following K-RIP condition holds for all K-sparse vectors:

$$(1 - \delta\_K) \left\| \mathbf{c}\_i \right\|\_2^2 \le \left\| \mathbf{S} \mathbf{A} \mathbf{c}\_i \right\|\_2^2 \le (1 + \delta\_K) \left\| \mathbf{c}\_i \right\|\_2^2 \tag{6}$$

According to Candès et al. (2006), if $\delta\_{2K} < \sqrt{2} - 1$ then a solution to (5) will also satisfy the original problem (4) with a good quality of approximation.

However, observations stemming from acquisition via sensor nodes, as is typical for SHM measurements, will be corrupted with noise. In this case, the original problem becomes

$$\mathbf{y}\_i = \mathbf{S} \mathbf{A} \mathbf{c}\_i + \mathbf{z}\_i, \ \mathbf{z}\_i \sim \mathcal{N}(\mathbf{0}, \sigma^2) \tag{7}$$

The solution to this modified problem is obtained via a convex optimization problem, known as the Basis Pursuit De-Noising problem (BPDN):

$$\underset{\mathbf{c}\_{i}}{\arg\min} \|\mathbf{c}\_{i}\|\_{1} \text{ subject to } \left\|\mathbf{y}\_{i} - \mathbf{S} \mathbf{A} \mathbf{c}\_{i}\right\|\_{2} \leq \epsilon \tag{8}$$

Solution of the BPDN problem has been the main approach utilized so far in the context of CS for SHM implementations (O'Connor et al., 2014; Wang and Hong, 2015; Wang et al., 2015). Klis and Chatzi (2016b) have advanced this framework by utilizing, in place of the classical BPDN formulation, its re-weighted variant, known as the re-weighted Basis Pursuit De-Noising problem (rwBPDN) (Becker et al., 2011):

$$\underset{\mathbf{c}\_{i}}{\text{arg min}} \; \|\mathbf{W}\mathbf{c}\_{i}\|\_{1} \; \text{subject to } \left\|\mathbf{y}\_{i} - \mathbf{S}\mathbf{A}\mathbf{c}\_{i}\right\|\_{2} \leq \epsilon \tag{9}$$

Key to this formulation is the weighting matrix $\mathbf{W} = \mathrm{diag}([w\_1, w\_2, \ldots, w\_N])$, which ensures a desired structure of the coefficient vector $\mathbf{c}\_i$.
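The effect of the weights can be illustrated with a small, self-contained sketch. It does not solve the constrained problem of equation (9) and it is not the NESTA solver adopted in the source; instead it applies plain iterative soft-thresholding (ISTA) to the related penalized form, minimizing 0.5‖y − SAc‖² + λ‖Wc‖₁. The toy signal, the value of λ, the weight values and all function names are assumptions made purely for illustration.

```python
import cmath
import math
import random


def soft_threshold(z, t):
    """Complex soft-thresholding: shrink the magnitude of z by t, keep its phase."""
    mag = abs(z)
    return z * (1.0 - t / mag) if mag > t else 0j


def weighted_l1_ista(y, kept, n, weights, lam=0.1, iters=300):
    """Approximately minimise 0.5*||y - S A c||_2^2 + lam*||W c||_1 by ISTA."""
    # Rows of Phi = S A, with A the orthonormal DFT matrix of equation (2).
    phi = [[cmath.exp(-2j * math.pi * t * l / n) / math.sqrt(n) for l in range(n)]
           for t in kept]
    c = [0j] * n
    for _ in range(iters):
        # Residual r = y - Phi c on the observed samples.
        resid = [y_m - sum(p * c_l for p, c_l in zip(row, c))
                 for y_m, row in zip(y, phi)]
        # Unit gradient step (valid since ||Phi||_2 <= 1), then weighted shrinkage.
        c = [soft_threshold(
                 c[l] + sum(row[l].conjugate() * r for row, r in zip(phi, resid)),
                 lam * weights[l])
             for l in range(n)]
    return c


# Toy frame: two tones, N = 64 samples, M = 24 randomly retained samples.
N, M = 64, 24
x_true = [math.cos(2 * math.pi * 5 * t / N) + 0.5 * math.cos(2 * math.pi * 12 * t / N)
          for t in range(N)]
kept = sorted(random.Random(0).sample(range(N), M))
y_obs = [x_true[t] for t in kept]

# Assumed prior support information: small weights on the expected spectral bins.
support = {5, N - 5, 12, N - 12}
weights = [0.05 if l in support else 1.0 for l in range(N)]

c_hat = weighted_l1_ista(y_obs, kept, N, weights)
# Back-projection to the time domain, x_hat = A c_hat (real part for a real signal).
x_hat = [sum(cmath.exp(-2j * math.pi * t * l / N) / math.sqrt(N) * c_hat[l]
             for l in range(N)).real for t in range(N)]
rel_err = math.sqrt(sum((a - b) ** 2 for a, b in zip(x_true, x_hat))
                    / sum(a * a for a in x_true))
```

Small weights on bins believed to carry signal penalize those coefficients less, which steers the solution toward the expected spectral structure; this is the role the weighting matrix plays in the re-weighted formulation.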

The Spectro-Temporal Compressive Sensing (STCS) framework relies on solution of the rwBPDN problem. **Figure 1** illustrates the steps of the STCS framework, with the actions performed locally at the node level aggregated on the left, and actions performed globally at the base station (server) level assembled on the right. As a first step, the support is determined at each node. The support of a vector is defined as the subset of non-zero components, $\mathrm{supp}(\mathbf{Y}\_i) = \{\omega : \mathbf{Y}\_i(\omega) \neq 0\}$. For practical implementations, this is eventually defined in terms of exceedance of a user-specified threshold $\epsilon$: $\mathrm{supp}(\mathbf{Y}\_i, \epsilon) = \{\omega : |\mathbf{Y}\_i(\omega)| > \epsilon\}$. The selection of the support is executed for windowed frames $\mathbf{x}\_{ij}$, extracted from the original signal $\mathbf{x}\_i$. The formulated support $\mathbf{U}\_{ij}$, as well as its complementary set $\mathbf{U}\_{ij}^{c}$, allow the decomposition of the spectral representation of the signal, as expressed via the coefficient vector $\mathbf{c}\_i$, into a "noisy" and a "clean" component, $\mathbf{c}\_{ij} = \mathbf{U}\_{ij}\mathbf{c}\_{ij} + \mathbf{U}\_{ij}^{c}\mathbf{c}\_{ij}$. The locally defined support components are eventually transmitted to the server, where the weighting matrix $\mathbf{W}\_{ij}$ is formulated and the K-sparsity of the signal is determined as $K\_{ij} = \sum(\mathbf{U}\_{ij})$.
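The thresholded support selection described above can be sketched as follows. This is a simplified illustration under stated assumptions: a rectangular window, an orthonormal DFT, and a hypothetical threshold value; the function names are not taken from the source.

```python
import cmath
import math


def dft_orthonormal(frame):
    """Orthonormal DFT of a real or complex frame (direct O(N^2) evaluation)."""
    n = len(frame)
    return [sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(frame))
            / math.sqrt(n) for k in range(n)]


def spectral_support(frame, eps):
    """Return (mask, K): 0/1 support mask over DFT bins and its sparsity level K."""
    spectrum = dft_orthonormal(frame)
    mask = [1 if abs(v) > eps else 0 for v in spectrum]
    return mask, sum(mask)


# Toy frame: a single tone at bin 3 of a 32-sample frame (illustrative only).
frame = [math.cos(2 * math.pi * 3 * t / 32) for t in range(32)]
mask, K = spectral_support(frame, eps=1.0)
support_bins = [k for k, u in enumerate(mask) if u]
```

For this real-valued tone the mask flags the two mirrored bins carrying the tone, so the sparsity level counted from the mask is two.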

In a next step, the number of time-domain samples $M\_{ij}$ required for signal recovery is defined, on the basis of which the server randomly selects and transmits $\mathbf{y}\_{ij}$ sub-vectors, from the j-th data frame of the i-th sensor, using a uniform distribution. For more details on this process the interested reader is referred to Klis and Chatzi (2016b). Upon transmission of the necessary time domain samples, along with the corresponding $\mathbf{W}\_{ij}$ weighting matrix, to the server, the coefficient vector is recovered as $\hat{\mathbf{c}}\_{ij} = \arg\min\_{\mathbf{c}\_{ij}} \|\mathbf{W}\_{ij}\mathbf{c}\_{ij}\|\_1$ subject to $\|\mathbf{y}\_{ij} - \mathbf{S}\_{ij}\mathbf{A}\mathbf{c}\_{ij}\|\_2 \leq \sigma\_{u,ij}$. As a last step, the coefficient vector of the j-th data frame is projected back to the time domain:

$$
\hat{\mathbf{x}}\_{ij} = \mathbf{A}\hat{\mathbf{c}}\_{ij} \tag{10}
$$

Once this process has been executed for each data frame, the estimate of the full time domain signal is attained as $\hat{\mathbf{x}}\_i = [\hat{\mathbf{x}}\_{i1}, \hat{\mathbf{x}}\_{i2}, \ldots, \hat{\mathbf{x}}\_{iD}]^{\mathrm{T}}$ from all D sensors in the network. The solver adopted for solution of the involved rwBPDN problem is the NESTA algorithm (Nesterov, 2005; Becker et al., 2011).

### The Multi-Channel Power Spectrum Blind Sampling (PSBS) Approach

The multi-channel PSBS approach for operational modal analysis developed by Gkoktsi and Giaralis (2017), Gkoktsi and Giaralis (2019) comprises the three stages delineated in **Figure 2**. The first stage involves low-rate (sub-Nyquist) deterministic periodic non-uniform-in-time multi-coset sampling at all D acceleration sensing nodes. In the next stage, the low-rate (compressed) measurements from all sensors are wirelessly transmitted to a server (base station) and centrally processed to obtain their cross-correlation vectors. These vectors are used to estimate (recover) the response acceleration PSD matrix by solving an overdetermined system of linear equations without invoking any signal sparsity assumption. Lastly, in stage III, the FDD algorithm is applied to the recovered PSD matrix to obtain natural frequencies and mode shapes. Notably, this centralized 3-stage forward-only approach minimizes processing and memory requirements at the node (local) level as well as wireless data communication payload within the WSN, leading to low-complexity and low-energy consumption sensor nodes.

Elaborating on the mathematical details of the PSBS approach, let x(t) be a continuous in time t real-valued wide-sense-stationary random signal (or stochastic process) represented by the PSD function $P\_x(\omega)$ band-limited to 2π/T in the frequency domain ω. The multi-coset sampling strategy is adopted (e.g., Venkataramani and Bresler, 2001) in stage I of the approach to sample x(t) at a rate lower than the Nyquist sampling rate 1/T (in Hz) as follows. Firstly, the uniform grid of Nyquist samples x(nT), n = 0, 1, 2, . . . is divided into consecutive non-overlapping blocks of $\overline{N}$ samples each. Then, from each such block, only $\overline{M}$ ($< \overline{N}$) samples are acquired whose position is specified by the sampling pattern sequence with elements in ascending order

$$\mathbf{n} = [n\_0 \ n\_1 \ \cdots \ n\_{\overline{M}-1}] \tag{11}$$

taken as time independent. The resulting sampling is periodic with period $\overline{N}$; non-uniform in time (excluding the special case in which **n** contains all possible even or odd numbers and $\overline{N}$ is even); deterministic since the position of the $\overline{M}$ cosets is defined a priori through the sequence **n** applied to all $\overline{N}$-length blocks; and sub-Nyquist with average sampling rate $\overline{M}/(\overline{N}T)$ (in Hz) (always below the Nyquist rate 1/T since $\overline{M} < \overline{N}$). Notably, the multi-coset sampling rate is associated with the compression ratio (CR) $\overline{M}/\overline{N}$ ($0 \leq \overline{M}/\overline{N} \leq 1$), attaining lower values for higher signal compression levels. For illustration, **Figure 3** demonstrates multi-coset sampling with pattern **n** = [0, 2, 5] applied to a discrete-time signal partitioned into blocks of length $\overline{N} = 8$. This particular sampling acquires $\overline{M} = 3$ cosets of samples by taking the 1st (red), the 3rd (cyan), and the 6th (green) Nyquist sample from every block, achieving a CR of $\overline{M}/\overline{N} = 3/8$ = 37.5%, that is, an average sampling rate 62.5% below Nyquist.
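The sampling pattern of the Figure 3 example can be reproduced with a few lines of code. The sketch below is illustrative only: it indexes each block from its first sample, following the verbal description of the figure, and the function name and the integer stand-in signal are hypothetical.

```python
def multicoset_sample(x, pattern, block_len):
    """Split x into blocks of `block_len` Nyquist samples and keep, from every
    block, the samples at the offsets listed in `pattern` (one coset per offset)."""
    n_blocks = len(x) // block_len           # P complete blocks
    return [[x[k * block_len + offset] for k in range(n_blocks)]
            for offset in pattern]


nyquist_samples = list(range(24))            # stand-in for x(nT), n = 0..23
pattern = [0, 2, 5]                          # sampling pattern of the example
cosets = multicoset_sample(nyquist_samples, pattern, block_len=8)
compression_ratio = len(pattern) / 8         # CR = 3/8
```

With three blocks of eight samples, each of the three cosets holds one sample per block, so nine of the 24 Nyquist samples are retained.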

Mathematically, the samples of the m-th coset can be written as the output of the filtering operation

$$y^{m}[k] = \sum\_{s=1-\overline{N}}^{0} g\_{m}[s] \, x[k\overline{N} - s], \quad k = 0, 1, \ldots, P - 1 \tag{12}$$

where P is the total number of the $\overline{N}$-length blocks and the filter coefficients are given as

$$g\_{m}[s] = \begin{cases} 1, & s = n\_{m} \\ 0, & s \neq n\_{m} \end{cases} \tag{13}$$

in which $s = [1-\overline{N}, 2-\overline{N}, \ldots, 0]$ is arranged in descending order.

Consider, next, an array of D sensors and $\overline{M}$ cosets as shown in **Figure 2**. The cross-correlation sequences of response acceleration signals sampled at Nyquist rate, $x\_i[\nu]$, from all i = 1, 2, . . . , D sensors are theoretically defined as

$$r\_{x\_i x\_j}[\ell] = \mathbb{E}\_{x}\left\{ x\_{i}[\nu] \, x\_{j}[\nu-\ell] \right\}, \quad i,j = 1,2,\ldots,D; \ \ell \in \mathbb{Z} \tag{14}$$

where E is the mathematical expectation operator. It is herein assumed that the sequences in Equation (14) take on negligible values outside the range −L ≤ ℓ ≤ L. Further, the cross-correlation sequences of the compressed measurements $y\_i^m[k]$ from all $m = 0, 1, \ldots, \overline{M} - 1$ cosets and i = 1, 2, . . . , D sensors are written as

$$r\_{y\_i^a, y\_j^b}[\ell] = \mathbb{E}\left\{ y\_i^a[k] \, y\_j^b[k-\ell] \right\}, \quad i, j = 1, 2, \ldots, D; \quad a, b = 0, 1, \ldots, \overline{M} - 1; \quad -L \le \ell \le L \tag{15}$$

It can be shown (Gkoktsi and Giaralis, 2019) that the sequences in Equations (14) and (15) are related by the following key equation

$$\mathbf{r}\_{y\_i y\_j} = \mathbf{R}\_{g} \, \mathbf{r}\_{x\_i x\_j} \tag{16}$$

where $\mathbf{r}\_{x\_i x\_j} \in \mathbb{R}^{\overline{N}(2L+1) \times D}$ is a matrix collecting all cross-correlation sequences of response acceleration signals $r\_{x\_i x\_j}[\ell]$, $\mathbf{r}\_{y\_i y\_j} \in \mathbb{R}^{\overline{M}^2(2L+1) \times D}$ is a matrix collecting all cross-correlation sequences of the compressed measurements $r\_{y\_i^a, y\_j^b}[\ell]$, and $\mathbf{R}\_g \in \mathbb{R}^{\overline{M}^2(2L+1) \times \overline{N}(2L+1)}$ is a sparse pattern correlation matrix populated by the multi-coset sampling pattern cross-correlations

$$r\_{g\_a, g\_b}[\tau] = \sum\_{s=1-\overline{N}}^{0} g\_a[s] \, g\_b[s-\tau] = \delta[\tau - (n\_a - n\_b)] \tag{17}$$

where δ[u] = 1 for u = 0 and δ[u] = 0 for u ≠ 0. Note that Equation (16) defines an overdetermined system of linear equations which can be solved for $\mathbf{r}\_{x\_i x\_j}$ without any sparsity assumptions, provided that $\mathbf{R}\_g$ is full column rank. The latter is satisfied for $\overline{M}^2 \geq \overline{N}$ (Ariananda and Leus, 2012).
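As a small worked check, the sketch below lists the lags at which the pattern cross-correlations of Equation (17) are non-zero for the example pattern [0, 2, 5], and evaluates the smallest number of cosets compatible with the stated inequality for a given block length. It only checks the inequality quoted in the text and does not verify the rank of the pattern correlation matrix itself; the helper names are hypothetical.

```python
import math


def pattern_lags(pattern):
    """Distinct lags n_a - n_b at which the pattern cross-correlations are non-zero."""
    return sorted({n_a - n_b for n_a in pattern for n_b in pattern})


def min_cosets(block_len):
    """Smallest integer M with M**2 >= block_len, i.e. ceil(sqrt(block_len))."""
    return math.isqrt(block_len - 1) + 1


lags = pattern_lags([0, 2, 5])     # lags covered by the example pattern
m_min = min_cosets(8)              # fewest cosets allowed for blocks of 8 samples
lowest_cr = m_min / 8              # corresponding lowest compression ratio
```

For blocks of eight samples the inequality requires at least three cosets, which is the configuration used in the Figure 3 example.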

From the practical/computational viewpoint, the unbiased estimator

$$\hat{r}\_{y\_i^a, y\_j^b}[\ell] = \frac{1}{P - |\ell|} \sum\_{l = \max\{0, \ell\}}^{P - 1 + \min\{0, \ell\}} y\_i^a[l] \, y\_j^b[l - \ell] \tag{18}$$

can be readily employed to approximate the sequences in Equation (15) using all compressed measurements wirelessly transmitted to a central unit (server) from all D sensors as indicated in **Figure 2**. These estimates are collected in the matrix $\hat{\mathbf{r}}\_{y\_i y\_j} \in \mathbb{R}^{\overline{M}^2(2L+1) \times D}$. Next, an estimate of the response acceleration PSD matrix at discrete frequencies with frequency discretization step (resolution)

$$
\Delta \omega = \frac{2\pi}{(2L+1)\overline{N}}\tag{19}
$$

is computed at the server through the formula (Gkoktsi et al., 2015; Gkoktsi and Giaralis, 2019)

$$\hat{\mathbf{G}}\_{x\_i x\_j} = \mathbf{F}\_{(2L+1)\overline{N}} \left(\mathbf{R}\_{g}^{\mathrm{T}} \overline{\mathbf{W}}^{-1} \mathbf{R}\_{g}\right)^{-1} \mathbf{R}\_{g}^{\mathrm{T}} \overline{\mathbf{W}}^{-1} \hat{\mathbf{r}}\_{y\_i y\_j}, \tag{20}$$

where $\mathbf{F}\_{(2L+1)\overline{N}} \in \mathbb{C}^{\overline{N}(2L+1) \times \overline{N}(2L+1)}$ is the standard DFT matrix. In the last equation, $\overline{\mathbf{W}}$ is a weighting matrix, and the superscript "−1" denotes matrix inversion. Ultimately, in stage III, the PSD matrix in Equation (20) is treated by the standard FDD algorithm to estimate R structural mode shapes, $\hat{\boldsymbol{\phi}}\_r$, and natural frequencies, $f\_r$, r = 1, 2, . . . , R.
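The unbiased estimator of Equation (18), which feeds the PSD estimate above, can be written compactly as shown in the sketch below. The function name and the toy coset sequences are hypothetical; the two inputs are assumed to have the same length P.

```python
def xcorr_unbiased(y_a, y_b, lag):
    """Unbiased cross-correlation estimate of two equal-length coset sequences
    at a given integer lag, following the summation limits of Equation (18)."""
    p = len(y_a)
    lo = max(0, lag)
    hi = p - 1 + min(0, lag)
    total = sum(y_a[l] * y_b[l - lag] for l in range(lo, hi + 1))
    return total / (p - abs(lag))


# Toy coset sequences with P = 4 samples each (illustrative values only).
y_a = [1.0, 2.0, 3.0, 4.0]
y_b = [1.0, 1.0, 1.0, 1.0]
r_zero = xcorr_unbiased(y_a, y_b, 0)
r_plus = xcorr_unbiased(y_a, y_b, 1)
r_minus = xcorr_unbiased(y_a, y_b, -1)
```

Dividing by P − |ℓ| rather than P compensates for the smaller number of overlapping products available at non-zero lags.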

The critical parameters regulating CR in the PSBS approach briefly reviewed herein are the number of cosets, M, and the length $\overline{N}$ of the blocks in **Figure 3**, subject to the two constraints $M < \overline{N}$ and $M^2 \geq \overline{N}$. Once the values of M and $\overline{N}$ are fixed, the weighting matrix **W** in Equation (20) and the sampling pattern **n** are determined by solving a constrained least-squares optimization problem, as detailed in Tausiesakul and González-Prelcic (2013), relying on the criterion $\hat{\mathbf{r}}\_{x\_i x\_j} = \arg\min\_{\mathbf{r}\_{x\_i x\_j}} \|\hat{\mathbf{r}}\_{y\_i y\_j} - \mathbf{R}\_g \mathbf{r}\_{x\_i x\_j}\|\_{\mathbf{W}}^2$, where $\|\mathbf{a}\|\_{\mathbf{W}}^2 = \mathbf{a}^{\mathrm{T}}\mathbf{W}\mathbf{a}$ is the weighted version of the Euclidean norm.
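The weighted least-squares criterion above has a closed-form minimizer given by the weighted normal equations. The sketch below, which assumes NumPy is available, solves a small synthetic problem of this type. The matrices `R`, `W` and the vector `r_true` are random, hypothetical stand-ins rather than the actual PSBS quantities, and `W` simply denotes whichever positive-definite matrix appears inside the weighted norm; how it relates to the inverse appearing in Equation (20) is not resolved here.

```python
import numpy as np


def weighted_least_squares(R, y, W):
    """Minimise (y - R r)^T W (y - R r) over r via the weighted normal equations."""
    RtW = R.T @ W
    # Solve (R^T W R) r = R^T W y; requires R to have full column rank
    return np.linalg.solve(RtW @ R, RtW @ y)


rng = np.random.default_rng(0)
R = rng.standard_normal((12, 4))              # hypothetical tall (overdetermined) system matrix
r_true = np.array([1.0, -2.0, 0.5, 3.0])      # hypothetical unknown vector
y = R @ r_true                                # noise-free synthetic measurements
W = np.diag(rng.uniform(0.5, 2.0, size=12))   # hypothetical positive diagonal weights

r_hat = weighted_least_squares(R, y, W)
```

With noise-free data the recovered vector coincides with `r_true` for any positive-definite weighting; the weights only matter once the measurements are noisy.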

### ASSESSMENT FOR SIGNAL RECOVERY IN TIME AND IN FREQUENCY DOMAIN

The effectiveness and applicability of both considered approaches for vibration-based modal identification rely on the accuracy of their respective information recovery operations from compressed measurements, that is, time-domain signal reconstruction/recovery in the STCS-rwBPDN approach and frequency-domain PSD estimation/recovery in the PSBS-based approach. In this section, the performance of the above recovery operations is numerically assessed using field-recorded response acceleration data from an operational Wind Turbine (WT) in Lübbenau, Germany (Klis and Chatzi, 2016a). The considered structure was instrumented with wired high-accuracy MEMS accelerometers located at the cross-section of the WT tower at 80 and 100 m height. The instrumentation layout of the WT is shown in **Figure 4**. Acceleration data were conventionally acquired at a uniform-in-time sampling rate of 200 Hz for ∼10 min every half an hour over a period of 29 days. For the purposes of this work, a small, arbitrarily chosen subset of the recorded acceleration signals is compressed at different compression levels and processed via the STCS-rwBPDN and PSBS approaches. Reconstructed time-domain signals and PSD estimates are recovered from the compressed data by application of the STCS-rwBPDN and PSBS approaches, respectively, while time-histories and non-compressed PSDs of the as-recorded data serve as a basis of comparison. It is expected that the assessment of PSBS information recovery in the frequency domain will be most critical, given that the level of signal stationarity of the considered data-set is relatively low while the PSBS approach relies, theoretically, on the signal stationarity assumption.

The numerical assessment of both methods is performed using an acceleration time-series recorded on 29/12/2013 (at 15:44) along the North direction (i.e., y-axis in **Figure 4**) from the middle sensor in **Figure 4** located at 80 m height. The considered acceleration recording was uniformly acquired in time and consists of N = 172,420 samples. Firstly, baseline adjustment is applied to the raw time-series to remove the mean value and other spurious low-frequency trends. Then, the time-series is band-limited within the frequency range of [0.10, 25.00] Hz through filtering using a fourth-order Butterworth band-pass filter.

### STCS-rwBPDN Approach for Time Domain Signal Recovery

The time-domain reconstruction performance of the STCS-rwBPDN approach is assessed for two compression ratios, CR = {30, 45%}. For a given WT acceleration response, the underlying signal support **U** is first computed to define the signal's sparsity level, K (i.e., number of components in the spectral domain), as well as the variance of the noisy component, i.e., the complementary set of the support (remaining part of the spectral representation). As elaborated upon in the work of Klis and Chatzi (2016a,b), this is used to prescribe error bounds on the reconstructed signal. For the two considered CRs, the obtained signal reconstruction estimates are illustrated in **Figure 5** for an acceleration time-window of 400 samples. Comparing the two panels of **Figure 5**, it becomes evident that the increase in the number of transmitted samples results in narrowing the estimated maximal error bounds. **Figure 5** also demonstrates the potential of the STCS-rwBPDN approach when applied in windows of non-stationary response signals, albeit necessitating higher compression rates than the conventional stationary case. The delivered error bounds allow for attributing some level of confidence to the undertaken signal reconstruction operation, which offers a benefit over the alternative (plain) BPDN approach adopted by O'Connor et al. (2014).

### PSBS Approach for Frequency Domain PSD Signal Recovery

The efficacy of the PSBS approach is numerically evaluated herein in terms of recovering quality PSD estimates from computer-simulated compressed acceleration data. These compressed data are derived through the application of the multi-coset sampling pattern [section The Multi-Channel Power Spectrum Blind Sampling (PSBS) Approach] to the corrected (i.e., baseline-adjusted and band-pass filtered), full-length (N = 172,420 samples) acceleration time-series shown in **Figure 6A**. The Welch periodogram (i.e., conventional PSD estimator) of the full-length time-series before and after band-pass filtering is further plotted in **Figure 6B**. The two PSDs are normalized to their peak value to facilitate a comparison and have been computed using 4,096 (= 2<sup>12</sup>) points in the frequency domain, 8 overlapping segments with 50% overlap, and windowing with a Hanning window function. It is observed that the peak PSD amplitude occurs at ∼1.4 Hz, which is the dominant resonant frequency of the monitored WT. Further, it is seen that the most important signal information lies in frequencies below 5 Hz, as PSD values above 5 Hz are negligible.

Given that the PSBS approach anticipates signal stationarity, it is deemed essential to undertake a data qualification test to appraise the stationarity level of the recorded time-series considered. To this end, the corrected data in **Figure 6A** are divided into seven time-frames of 2 min duration and the standard nonparametric reverse arrangement method (RAM) is used to test statistically the stationarity hypothesis (Bendat and Piersol, 2010). The outcome of RAM application to a representative 2 min long segment of the considered time-series is presented in **Figure 6C**, demonstrating that the stationarity hypothesis is confirmed within a 95% confidence interval. The stationarity hypothesis is confirmed at a similar confidence level for the rest of the 2 min-long segments of the data in **Figure 6A**. Therefore, the recorded WT data can be treated as wide-sense stationary at a high confidence level, rendering the application of PSBS meaningful and appropriate.
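For readers unfamiliar with the reverse arrangement test, the sketch below shows one common way of applying it: the signal is split into short segments, the mean-square value of each segment is computed, and the number of reverse arrangements in that sequence is compared against its expected value under the stationarity hypothesis using the large-sample normal approximation. The number of segments (20), the 1.96 threshold for the 5% significance level, and all function names are illustrative assumptions, not the settings used to produce **Figure 6C**.

```python
import math


def reverse_arrangements(values):
    """Count pairs (i, j) with i < j and values[i] > values[j]."""
    n = len(values)
    return sum(1 for i in range(n - 1) for j in range(i + 1, n)
               if values[i] > values[j])


def ram_z_score(values):
    """Standardised reverse-arrangement count (normal approximation)."""
    n = len(values)
    mean = n * (n - 1) / 4.0
    var = n * (2 * n + 5) * (n - 1) / 72.0
    return (reverse_arrangements(values) - mean) / math.sqrt(var)


def segment_mean_squares(x, n_segments):
    """Mean-square value of each of n_segments equal-length segments of x."""
    seg = len(x) // n_segments
    return [sum(v * v for v in x[k * seg:(k + 1) * seg]) / seg
            for k in range(n_segments)]


def is_stationary(x, n_segments=20, z_crit=1.96):
    """Accept the stationarity hypothesis if |z| stays below z_crit."""
    return abs(ram_z_score(segment_mean_squares(x, n_segments))) < z_crit


# Hypothetical signal whose amplitude grows steadily: clearly non-stationary
growing = [(1.0 + t / 200.0) * math.sin(0.3 * t) for t in range(2000)]
z_growing = ram_z_score(segment_mean_squares(growing, 20))
```

A strongly negative or positive z-score indicates an upward or downward trend in signal power, respectively, and leads to rejection of the stationarity hypothesis.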

Next, the WT time-series in **Figure 6A** is multi-coset sampled at three different CRs, 11, 21, and 31%, using the settings listed in **Table 1**, and PSBS is applied to the compressed/sub-Nyquist multi-coset samples to obtain single-channel PSD function estimates using the PSD recovery formula in Equation (20). Specifically, for the case of CR = 31%, the full-length acceleration data-series (N = 172,420 samples) is divided into P = 10,776 (= ⌊N/$\overline{N}$⌋) blocks of length $\overline{N}$ = 16 each, and from each block M = 5 samples are selected according to the sampling pattern **n** = [0, 1, 2, 5, 8]. These selected samples are collected in the compressed measurement sequences y<sub>m</sub>[k] (m = 0, 1, 2, 3, 4; k = 1, 2, . . . , P) in Equation, resulting in the acquisition and transmission of M × P = 53,880 compressed samples (i.e., 69% fewer samples compared to the original signal). The estimator $\hat{r}\_{y\_i^a, y\_j^b}[\ell]$ in Equation (18) is then computed for i = j = 1 (i.e., single-sensor trivial case) and ℓ ∈ [−40, 40], assuming a support correlation parameter L = 40. The latter consideration enables recovery of the PSD function $\hat{\mathbf{G}}\_{xx} \in \mathbb{C}^{1296 \times 1}$ in Equation (20) with frequency discretization step (resolution) Δω = 4.85 · 10<sup>−3</sup> rad/s in Equation (19). Similar computational steps are taken for the cases of CR = 21 and 11%, based on the relevant parameters in **Table 1**, which involve the acquisition of 79% (i.e., 35,368 samples) and 89% (i.e., 18,858 samples) fewer samples compared to the uniformly-sampled full-length signal, respectively.
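The bookkeeping of the CR = 31% case can be reproduced with a few lines of Python. The sketch below applies the sampling pattern quoted above to a placeholder sequence (sample indices instead of real accelerations) and recomputes the number of blocks, the number of retained samples, the size of the recovered PSD vector, and the frequency step of Equation (19). The function name `multicoset_sample` is hypothetical.

```python
import math


def multicoset_sample(x, block_len, pattern):
    """Keep, from every complete block of block_len samples, the samples
    whose within-block positions are listed in pattern.

    Returns one sub-sequence (coset) per pattern entry.
    """
    n_blocks = len(x) // block_len  # an incomplete trailing block is discarded
    return [[x[k * block_len + n] for k in range(n_blocks)] for n in pattern]


N = 172_420                  # full-length record
block_len = 16               # block length (N-bar in the text)
pattern = [0, 1, 2, 5, 8]    # sampling pattern n, M = 5 cosets
L = 40                       # correlation support parameter

x = list(range(N))           # placeholder data: sample indices only
cosets = multicoset_sample(x, block_len, pattern)

P = len(cosets[0])                      # number of blocks
n_kept = sum(len(c) for c in cosets)    # transmitted samples, M * P
cr = len(pattern) / block_len           # compression ratio M / N-bar
n_freq = (2 * L + 1) * block_len        # length of the recovered PSD vector
delta_omega = 2 * math.pi / n_freq      # Equation (19)
```

Running the sketch gives P = 10,776 blocks, 53,880 retained samples, CR = 31.25%, a 1,296-point PSD vector, and Δω ≈ 4.85 · 10<sup>−3</sup>, in agreement with the values reported above.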

**Figure 7** plots PSD functions recovered from CR = {11, 31%} (solid red curves) in logarithmic and in linear scale. These functions are normalized to their peak attained value to facilitate comparison against the standard Welch periodogram of the filtered full-length recorded time-series superposed on **Figure 7** (dotted blue curves) and normalized in the same manner. It is qualitatively observed that, for the low CR value (high data compression level), the PSBS-based recovered PSDs lie close to the PSD of the WT data for frequencies up to 5 Hz, which take on non-negligible values and therefore contain dependable information for modal identification purposes under ambient/operational excitation. As expected, better quality point-wise matching between the PSBS-based PSDs and the "exact" (non-compressed) PSD of the WT data is achieved even beyond the 0–5 Hz frequency range for the highest CR value considered (i.e., lower data compression level).

FIGURE 5 | Effect of the increase of transmission level CR on the estimated error bounds: CR = 30% (Left), CR = 45% (Right).

TABLE 1 | Adopted multi-coset sampling and PSBS settings.

To discuss further the level of accuracy of the proposed PSBS approach for modal properties identification, **Table 2** reports the location of the three largest PSD ordinates obtained by simple peak-picking from the functions recovered via PSBS, $\hat{f}\_{r,\text{PSBS}}$, as well as from the standard Welch periodogram applied to the non-compressed WT acceleration data (CR = 100%), $f\_{r,\text{Welch}}$. Further, the percentage difference error

$$\frac{df\_r}{f\_r} = \frac{\left| \hat{f}\_{r, \text{PSBS}} - f\_{r, \text{Welch}} \right|}{f\_{r, \text{Welch}}} \quad ; \quad r = 1, 2, 3, 4 \tag{21}$$

is also reported as a measure of the quality of the recovered PSDs via the PSBS method. It is seen that the location of the two most prominent PSD peaks (r = 2, 3) is retrieved with <1% error by the PSBS approach even at an average sampling rate 89% below the sampling rate of the original data series (CR = 11%). However, the accuracy drops as CR reduces (compression level increases) for the least prominent peak, r = 1, corresponding to an inadequately excited vibration mode whose detection is inherently a challenging task.

TABLE 2 | Frequency locations of peak PSD ordinates through peak-picking and percentage difference errors for the PSBS approach at various CRs and for Welch periodogram applied on the full-length signal (non-compressed) data-series.

### ASSESSMENT FOR MODE SHAPE EXTRACTION UNDER OPERATIONAL LOADING CONDITIONS

In this section, the effectiveness of the STCS-rwBPDN and PSBS approaches for extracting modes of vibration and natural frequencies is numerically assessed within the OMA context. To this aim, response acceleration time-histories field-recorded in a typical highway overpass open to traffic are used. The considered bridge is the Bärenbohlstrasse overpass in Zürich, Switzerland. The structure is 30.90 m long and fairly symmetric along its longitudinal direction. It consists of a solid prestressed slab with two equal-length spans of 14.75 m each. The deck is supported in all directions at mid-span and in one of the two abutments, while it is only supported in the vertical and transverse directions at the second abutment. The deck was instrumented with a network of 18 conventional wired accelerometers recording vertical acceleration time-histories at 200 Hz sampling rate for ∼10 min per hour from 12th July 2013 to 26th July 2014. The layout of the sensor network deployment is shown in **Figure 8**; more details about the bridge and the monitoring campaign can be found in Klis et al. (2016).

Herein, the first 2 min of a dataset of 18 vertical acceleration time-series, simultaneously recorded on 19/06/2014 between 15:08:54 and 15:17:51 by the 18 sensors of the network, are used. The dataset is pre-processed in the same way as the WT data-series case-study in section STCS-rwBPDN Approach for Time Domain Signal Recovery. That is, the time-series are baseline adjusted and band-limited within the [0.15, 50] Hz frequency range using a band-pass 4th-order Butterworth filter. For illustration, **Figure 9** plots the band-pass filtered acceleration response signal recorded at sensor #13, together with its magnitude Fourier spectrum. Further, representative results from RAM application to a 1 min long segment of this record are provided, demonstrating a very high level of data stationarity (i.e., much higher than the WT data-set; see **Figure 6C** for comparison).

Next, the first 2 min of the 18 time-series are downsampled at 100 Hz and treated by the STCS-rwBPDN and PSBS approaches. Starting from the STCS-rwBPDN approach, the considered dataset is first partitioned into R = 29 windows (frames) of N<sub>R</sub> = 400 samples, and each window is projected into the spectral (Fourier) domain as indicatively shown in **Figure 10A** for an arbitrarily chosen data-frame of the sensor #1 time-series. Following the STCS-rwBPDN methodology in section Spectro-Temporal Compressive Sensing (STCS) Approach via rwBPDN, the spectral (Fourier) coefficients per data frame are then thresholded with a value $\epsilon\_{ij} = \epsilon\_l \|\mathbf{c}\_{ij}\|\_1 / N\_R$, j = 1 . . . R, which pertains to ε<sub>l</sub> = 1.5, yielding the spectral domain elements illustrated in **Figure 10A**. The selected support elements are further used to form a weighting matrix W<sub>ij</sub> per data frame. Considering next two different CRs, {36%, 11%}, the compressed samples y<sub>i</sub> (denoted with a cross in **Figure 10B**) are selected and used to retrieve the reconstructed time-domain sequence plotted in **Figure 10B** by a broken line. The standard Natural eXcitation Technique (NeXT) combined with the Eigensystem Realization Algorithm (ERA) is then used to extract estimates of the bridge deck mode shapes, $\hat{\boldsymbol{\phi}}\_r$ (r = 1, 2, . . . ), and natural frequencies, $\hat{f}\_r$ (r = 1, 2, . . . ), within an OMA context (Brincker and Ventura, 2015).
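The per-frame support selection described above can be sketched as follows, reading the threshold as ε<sub>l</sub> times the ℓ1-norm of the frame's Fourier coefficients divided by the frame length. This is a simplified illustration under that reading: a naive DFT is used to keep the example dependency-free, the function names are hypothetical, and the test frame is a synthetic cosine rather than bridge data. The subsequent rwBPDN reconstruction step is not shown.

```python
import cmath
import math


def split_frames(x, frame_len):
    """Partition x into consecutive non-overlapping frames of frame_len samples."""
    return [x[s:s + frame_len] for s in range(0, len(x) - frame_len + 1, frame_len)]


def dft(frame):
    """Naive O(n^2) discrete Fourier transform of a real-valued frame."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def frame_support(frame, eps_l=1.5):
    """Indices of Fourier coefficients whose magnitude exceeds
    eps_l * ||c||_1 / n, together with the threshold itself."""
    mags = [abs(c) for c in dft(frame)]
    threshold = eps_l * sum(mags) / len(frame)
    return [k for k, m in enumerate(mags) if m > threshold], threshold


# Synthetic example: two frames of a cosine sitting exactly on DFT bin 5
frame_len = 64
signal = [math.cos(2 * math.pi * 5 * t / frame_len) for t in range(2 * frame_len)]
frames = split_frames(signal, frame_len)
support, threshold = frame_support(frames[0])
```

For this synthetic frame only the two conjugate-symmetric bins carrying the cosine (5 and 59) survive the thresholding.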

Moreover, the same dataset is treated by the PSBS approach using the same procedure and settings as in the WT case-study in **Table 1** to recover the PSD matrices $\hat{\mathbf{G}}\_{x\_i x\_j} \in \mathbb{C}^{\overline{N}(2L+1) \times 18}$ in Equation (20) for CR = {31, 21, 11%}. The standard FDD algorithm of OMA (Brincker and Ventura, 2015) is applied to these matrices to find estimates of mode shapes, $\hat{\boldsymbol{\phi}}\_r$ (r = 1, 2, . . . ), and natural frequencies, $\hat{f}\_r$ (r = 1, 2, . . . ), of the monitored bridge. Indicatively, the first row of panels of **Figure 11** plots the first four estimated mode shapes of the bridge for CR = 11% using the PSBS with FDD approach.

The quality/accuracy of modal properties extracted through the STCS-rwBPDN and PSBS approaches is assessed by comparison to natural frequencies, f<sub>r</sub> (r = 1, 2, . . . ), and mode shapes, **φ**<sub>r</sub> (r = 1, 2, . . . ), obtained by application of the standard FDD to the full-length (non-compressive) dataset of recorded acceleration time-histories, which are treated as the "exact" ones. The second row of panels of **Figure 11** plots the first four exact mode shapes and further reports the corresponding natural frequencies. **Table 3** reports percentage difference errors of the first four natural frequencies of the bridge deck obtained through coupling the STCS-rwBPDN with NeXT-ERA as well as the PSBS with FDD for different CRs with respect to the exact (non-compressive) ones as per Equation (21). It further collects the corresponding Modal Assurance Criterion (MAC) values defined by (Brincker and Ventura, 2015)

$$MAC = \frac{\left|\phi\_r^{\mathrm{T}}\hat{\phi}\_r\right|^2}{\|\phi\_r\|\_2^2 \left\|\hat{\phi}\_r\right\|\_2^2} \quad r = 1, 2, 3, 4 \tag{22}$$

to quantitatively compare the mode shapes extracted through the two compressive/sub-Nyquist approaches considered for different CRs with the exact mode shapes. As a commonly used criterion, estimated mode shapes with MAC > 0.90 are regarded to be acceptably close to the exact ones.
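A minimal sketch of the MAC computation in Equation (22) is given below for real-valued mode shape vectors; complex-valued mode shapes would require a conjugate transpose, which is not handled here. The function name and the sample vectors are hypothetical.

```python
def mac(phi, phi_hat):
    """Modal Assurance Criterion of Equation (22) for real-valued mode shapes."""
    if len(phi) != len(phi_hat):
        raise ValueError("mode shapes must have the same number of entries")
    dot = sum(a * b for a, b in zip(phi, phi_hat))
    norm_sq_phi = sum(a * a for a in phi)
    norm_sq_hat = sum(b * b for b in phi_hat)
    return dot ** 2 / (norm_sq_phi * norm_sq_hat)


# Hypothetical vectors: MAC is insensitive to scaling and sign of a mode shape
mac_scaled = mac([1.0, 2.0, 3.0], [-2.0, -4.0, -6.0])  # same shape, rescaled
mac_orthogonal = mac([1.0, 0.0], [0.0, 1.0])           # unrelated shapes
mac_partial = mac([1.0, 1.0], [1.0, 0.0])              # partially correlated

acceptable = mac_scaled > 0.90  # acceptance criterion quoted in the text
```

Because MAC is invariant to scaling, estimated and exact mode shapes need not be normalized in the same way before comparison.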

By examining the numerical data in **Table 3**, it is seen that both the STCS-rwBPDN and PSBS approaches are quite accurate in identifying natural frequencies, as the error df/f is well below 1% even for the lowest CR = 11% value, corresponding to an average sampling rate 89% below the Nyquist rate, which equals 100 Hz for the dataset considered. Further, the PSBS approach yields accurate mode shapes across the board for CR as low as 21%. For the lowest CR (= 11%), PSBS is still capable of extracting the 1st and 3rd modes with acceptable accuracy, as can be visually appreciated by comparing the mode shapes in **Figures 11A,C** with **Figures 11E,G**, respectively. However, PSBS fails to pass the MAC > 0.90 criterion for the 2nd and 4th modes for CR = 11%, as highlighted in boldface font in **Table 3** (compare also **Figures 11B,D** with **Figures 11F,H**). This is readily attributed to the fact that the 2nd and 4th modes are significantly less excited than the 1st and 3rd modes, as evidenced by the amplitude of the respective Fourier coefficients in **Figure 9B**. A CR higher than 11% (i.e., a larger number of measurements) is required for the PSBS approach to accurately probe into the least excited 2nd and 4th modes. By contrast, the STCS-rwBPDN approach provides good estimates for all four modes even at CR = 11%.

To highlight the practical merit of considering reduced CR values, estimates of daily energy consumption and battery lifetime savings for a single wireless sensor node of the WSN in **Figure 8** are further reported in **Table 4**. The estimates account only for the data transmission power requirement, as this is by far the most energy-demanding sensor operation (e.g., Lynch, 2007). The reported estimates are based on the assumption that each sensor acquires 2 min long acceleration signals at the Nyquist rate (CR = 100%) with F<sub>s</sub> = 100 Hz (T<sub>s</sub> = 0.01 s) under operational conditions every 1 h (i.e., a dataset of Q = 24 signals is collected daily per sensor, with each signal comprising 12,000 measurements for CR = 100%). Power consumption during transmission of P<sub>t</sub> = 103.8 mW is taken based on the specification of a typical wireless sensor used for SHM: the WiseNode v4 developed by Novakovic et al. (2009). **Table 4** reports transmission time, energy consumption, and battery lifetime for the three different CRs previously considered in **Table 3**. For illustration, computations pertaining to the case of CR = 11% are presented in detail, for which only 12,000 × 0.11 = 1,320 compressed measurements per hour are wirelessly transmitted.

TABLE 3 | Quantitative assessment of the accuracy of natural frequencies and mode shapes for the bridge case-study obtained by the PSBS with FDD and by the STCS-rwBPDN with NeXT-ERA approaches for different CRs vis-à-vis standard non-compressive FDD approach.


Assuming that ADC units with 16 bits (i.e., 2 bytes) resolution are used, I<sub>FWD</sub> = 2,640 bytes of data package information are generated per compressed sequence, taking t = (I<sub>FWD</sub>/I<sub>t1</sub>) × t<sub>1</sub> = 7.54 s to be wirelessly transmitted to the server, where I<sub>t1</sub> = 7 bytes is the information carried within one data package and t<sub>1</sub> = 0.02 s is the time required for package transmission (Novakovic et al., 2009). Thus, t<sub>tot</sub> = Q × 7.54 = 181 s (or 0.05 h) are required for the daily transmission of all compressed acceleration response data, consuming E<sub>tot</sub> = P<sub>t</sub> × t<sub>tot</sub> = 18.79 J of energy per day. It is further assumed that the sensor energy supply of 3 V comes from two AA-sized batteries with nominal voltage V<sub>n</sub> = 1.5 V and capacity C<sub>n</sub> = 3,000 mAh, providing energy E<sub>b</sub> = 64,800 J. A continuous discharge current is taken to occur in the batteries, resulting in ξ = 1% annual energy loss. Then, sensor battery lifetime, given by

$$\mathrm{T}\_b = \frac{E\_b}{E\_{tot} + E\_b \times \xi / 365},\tag{23}$$

is estimated as T<sub>b</sub> = 104.8 months.
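The arithmetic of the CR = 11% case can be retraced with the short sketch below, which uses only the figures quoted in the text (packet size and time, transmission power, battery energy, and annual loss). Function and parameter names are hypothetical. The lifetime is returned in days because Equation (23), with E<sub>tot</sub> taken as a daily energy, yields days; the conversion to months depends on the month length assumed, which is not stated in the text, so the result only approximately matches the reported 104.8 months.

```python
def daily_transmission(cr, n_samples=12_000, bytes_per_sample=2,
                       signals_per_day=24, packet_bytes=7,
                       packet_time_s=0.02, power_w=0.1038):
    """Daily transmission time (s) and energy (J) for one sensor node."""
    n_compressed = round(n_samples * cr)           # samples sent per signal
    payload_bytes = n_compressed * bytes_per_sample
    t_signal = payload_bytes / packet_bytes * packet_time_s
    t_total = signals_per_day * t_signal
    return t_total, power_w * t_total


def battery_lifetime_days(e_tot, e_battery=64_800.0, annual_loss=0.01):
    """Equation (23): lifetime in days given the daily energy drain e_tot (J)."""
    return e_battery / (e_tot + e_battery * annual_loss / 365.0)


t_tot, e_tot = daily_transmission(0.11)
lifetime_days = battery_lifetime_days(e_tot)
lifetime_months = lifetime_days / 30.0  # assumption: 30-day months
```

With these inputs the sketch gives about 181 s of daily transmission, about 18.79 J of daily energy, and a lifetime of roughly 3,150 days, i.e., on the order of 104 to 105 months depending on the month length assumed.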

Similar calculations are performed to estimate E<sub>tot</sub> and T<sub>b</sub> for CR = 21, 31% as well as CR = 100% (non-compressive transmission) shown in **Table 4**. The latter case is the one most widely considered in the literature in comparative studies on energy savings quantification in wireless sensors (e.g., O'Connor et al., 2013, 2014; Klis and Chatzi, 2016b). In this respect, **Table 4** reports energy reduction and battery gain ratio for all CR < 100% examined with respect to CR = 100% given as

$$\text{ER} = \frac{E\_{tot}|\_{CR=100} - E\_{tot}|\_{CR}}{E\_{tot}|\_{CR=100}} \times 100 \text{ \% and}$$

$$T\_b = \frac{T\_b|\_{CR}}{T\_b|\_{CR=100}},\tag{24}$$

respectively. Evidently, battery life expectancy increases dramatically as CR reduces: it more than triples for CR = 31% compared to non-compressive sampling/transmission, while it increases more than 8 times for CR = 11%. In this respect, through collective consideration of the data in **Tables 3**, **4**, it can be concluded that the considered approaches effectively support more sustainable wireless monitoring systems with reduced maintenance costs and environmental impact associated with sensor battery replacement, which can be scheduled at much longer intervals without significant deterioration of the accuracy of the estimated modal properties compared to Nyquist data sampling.
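The energy reduction and battery gain ratios of Equation (24) can be cross-checked against the statements above with the following sketch, which repeats the transmission-only energy model of the worked CR = 11% example using the parameter values quoted in the text. All names are hypothetical, and the output is meant as a plausibility check rather than a reproduction of **Table 4**.

```python
def lifetime_and_energy(cr, n_samples=12_000, bytes_per_sample=2,
                        signals_per_day=24, packet_bytes=7, packet_time_s=0.02,
                        power_w=0.1038, e_battery=64_800.0, annual_loss=0.01):
    """Daily transmission energy (J) and battery lifetime (days) for a given CR."""
    payload_bytes = round(n_samples * cr) * bytes_per_sample
    t_total = signals_per_day * payload_bytes / packet_bytes * packet_time_s
    e_tot = power_w * t_total
    t_b = e_battery / (e_tot + e_battery * annual_loss / 365.0)  # Equation (23)
    return e_tot, t_b


e_ref, t_ref = lifetime_and_energy(1.00)  # non-compressive reference case

energy_reduction = {}  # ER in percent, first expression of Equation (24)
battery_gain = {}      # lifetime ratio, second expression of Equation (24)
for cr in (0.31, 0.21, 0.11):
    e_cr, t_cr = lifetime_and_energy(cr)
    energy_reduction[cr] = (e_ref - e_cr) / e_ref * 100.0
    battery_gain[cr] = t_cr / t_ref
```

Under this model the energy reduction equals 100% minus CR (69, 79, and 89%), while the battery gain ratio comes out at roughly 3.2 for CR = 31% and 8.4 for CR = 11%, consistent with the "more than triples" and "more than 8 times" statements. The gain is smaller than the inverse of CR because the self-discharge term in Equation (23) does not scale with the transmitted data.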

Note, however, that in a practical setting, the choice of CR (and corresponding battery lifetime gains) is normally dictated by the sought level of accuracy in extracting modal properties according to the monitoring purpose and objectives. If accurate modal properties estimation in the absolute sense is desired (e.g., for updating computational models of as-built structures, or for designing/assessing the performance of vibration control devices, such as tuned mass dampers, to reduce dynamic response of structures), higher CR values should be adopted (e.g., CR > 31%). In such cases, battery lifetime gains may be relatively low but, at the same time, these gains may be a less important practical consideration. Conversely, lower CR values (e.g., CR = 21% or lower) may be adopted in monitoring campaigns for which extending sensor battery lifetime becomes a priority over modal extraction accuracy. One such example is long-term/permanent structural monitoring deployments aiming to detect structural damage (e.g., due to natural deterioration or after an extreme event), in which case reducing battery replacement frequency, and thus maintenance costs, becomes important and may be a main criterion for installing a monitoring system in the first place (e.g., O'Connor et al., 2014), while accuracy of modal properties in the absolute sense is less important since changes to modal properties (as a function of environmental conditions) are of interest.

In every case, the data furnished in **Tables 3**, **4** are only indicative and should be used/interpreted comparatively rather than in an absolute sense. Indeed, the recorded measurements considered to derive modal properties have been obtained by a wired sensor network and, therefore, do not account for the influence of errors that may be encountered in WSNs, such as loss of synchronization. Moreover, power consumption (and thus battery lifetime gains) varies from sensor to sensor in a WSN and is dependent on several factors including environmental conditions, distance from sensors to base station, data communication protocols, etc. In this regard, field deployment of WSNs operating on PSBS and STCS-rwBPDN is required to appraise the quality of modal properties and battery gains in an absolute application-specific sense; such consideration falls outside the scope of this work.

### CONCLUDING REMARKS

The applicability and performance of two recently proposed approaches for output-only modal identification supporting low energy consumption wireless sensors have been comparatively demonstrated through numerical assessment using field-recorded acceleration data. Both approaches aim to reduce wireless data transmission payloads by considering compressed structural acceleration responses acquired non-uniformly in time at sub-Nyquist average sampling rates. The first approach, STCS-rwBPDN, aims to recover acceleration time-histories in the time domain from low-rate randomly acquired measurements using the rwBPDN algorithm of compressive sensing. The accuracy and efficiency of this operation require knowledge of the signal support in the frequency domain prior to transmission of compressed measurements from sensor nodes to a central server where time-domain reconstruction takes place. This knowledge is gained through sampling and interrogation of full-length acceleration data at the sensing nodes as well as sensors/server exchange of pertinent information. In this regard, STCS-rwBPDN can recover the time trace of response acceleration signals in a deterministic setting and, therefore, can be coupled at the post-processing back-end with any standard OMA technique for modal properties extraction. Nevertheless, this flexibility comes at the cost of a relatively sophisticated wireless data communication strategy as well as the necessity to sample signals at the Nyquist frequency or above at the sensors' front-end. The second approach, PSBS, is effectively a spectral estimation technique aiming to recover second-order statistics (i.e., correlation or PSD functions) of response acceleration signals treated as stationary random processes and acquired through low-rate deterministic non-uniform-in-time multi-coset sampling.
Compared to STCS-rwBPDN, the main practical advantage of the PSBS-based approach is its simplicity of wireless communication within WSNs as well as minimal on-sensor data interrogation. Low-rate multi-coset samples can be acquired using some pre-specified sampling pattern at each sensor and communicated as-recorded directly to a server. This high level of data transmission simplicity is made possible by the inherent signal-agnostic attribute of PSBS, which does not require any prior knowledge about signal spectral support. However, the PSBS approach cannot retrieve time traces of the acquired signals, and this limits the system identification methodologies that can be applied at the back end of the approach to those relying only on second-order signal statistics for modal properties extraction. One such method is the FDD, which was herein coupled with PSBS to deliver mode shapes and natural frequencies directly from the low-rate multi-coset sampled response accelerations.

The validation of the two approaches was carried out by considering field-recorded acceleration data obtained from conventional wired sensor deployments in an operating onshore wind turbine and in a highway overpass (bridge) open to traffic. The recorded data have been compressed to different levels (CRs) and processed by both approaches. PSBS successfully captured salient frequency domain information for the dataset of the wind turbine for CR as low as 11% (i.e., using 89% fewer measurements than the conventionally sampled dataset). This demonstrates the potential of the method to treat real-life data that may deviate from perfect stationary signal conditions. Similarly, STCS-rwBPDN was shown to recover faithfully time traces of the wind turbine data set at the same low CR level (11%). In view of these results, it is concluded that both methods are equally promising for SHM of wind turbines using low-rate acceleration measurements. Moreover, STCS-rwBPDN coupled with standard NeXT-ERA was able to retrieve quality estimates of mode shapes and natural frequencies of the bridge case-study, again at CR = 11%. PSBS was also able to capture with high accuracy the same mode shapes for CR = 21%, while only the two most significant ones were retrieved satisfactorily at CR = 11%.

In view of the herein reported numerical results, it is concluded that both considered approaches are capable of accurate output-only system identification of large-scale civil infrastructure while being, to a good extent, complementary. Moreover, it is noted that both approaches can be used to acquire alternative types of signals to acceleration data, such as tilt measurements and dynamic strains considered in SHM. It is thus envisioned that smart sensing nodes may incorporate both these approaches for reducing data transmission payloads in WSNs, which will allow operators to switch between the two depending on their monitoring needs at any given time: time-series recovery at, perhaps, some increased data transmission requirements and more intense on-sensor processing, or modal properties recovery at minimum wireless data exchange and with minimum on-sensor data interrogation.

Still, note that all datasets considered in this work pertain to wired sensors and, therefore, are free from errors that are more common in WSNs, such as missing data or gaps in data due to data loss in wireless transmission, loss of synchronization among sensors, etc. The extent of such errors and their potential impact on the quality of monitoring (e.g., accuracy of extracted mode shapes and natural frequencies) is application-specific, depending on factors such as the technology and quality of the sensor nodes used, the topology of the WSN, the nature and scale of the monitored structure, the environmental conditions, etc. Moreover, the same factors influence sensor energy consumption and ultimately battery lifetime. In this regard, consideration of long-term real-life field deployments of WSNs operating on the examined approaches is further warranted to verify the accuracy and battery life prolongation of the approaches for full-fledged monitoring of large-scale civil engineering structures. This consideration is left for future work.

### DATA AVAILABILITY

The datasets generated for this study are available on request to the corresponding author.

### AUTHOR CONTRIBUTIONS

AG and EC conceived the study. KG performed all data processing related to the PSBS approach. RK and VD performed all data processing related to the STCS approach. KG produced a first draft of the article which was finalized by critical input by AG and EC on introduction, results interpretation, and concluding remarks.

### FUNDING

KG acknowledges the partial financial support received through a Ph.D. studentship by City, University of London. EC would further like to acknowledge the support of the Albert Lück Foundation and the ERC Starting Grant WINDMIL (#679843) on the topic of Smart Monitoring, Inspection and Life-Cycle Assessment of Wind Turbines.

### ACKNOWLEDGMENTS

Sections The Multi-Channel Power Spectrum Blind Sampling (PSBS) Approach, STCS-rwBPDN Approach for Time Domain Signal Recovery, and part of section Assessment for Mode Shape Extraction Under Operational Loading Conditions of this paper form part of the Ph.D. thesis of KG (Gkoktsi, 2018) supervised by AG.


**Conflict of Interest Statement:** KG is currently employed by AKT-II.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Gkoktsi, Giaralis, Klis, Dertimanis and Chatzi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# System Identification-Enhanced Visualization Tool for Infrastructure Monitoring and Maintenance

Premjeet Singh and Ayan Sadhu\*

Department of Civil and Environmental Engineering, Western University, London, ON, Canada

Today's complex infrastructure requires robust and autonomous condition assessment as it continues to age under increasing operational loads and extreme climatic events. Structural Health Monitoring (SHM) has recently gained significant interest in the inspection and maintenance of large-scale structures. However, the large amount of raw data produced by the data loggers of these SHM systems requires appropriate tools to visualize and diagnose the data systematically. Building Information Modeling (BIM) is a powerful data management tool that can be utilized as a base platform to analyze and visualize long-term SHM data. Current BIM-based approaches are capable of facilitating the design, production, and construction management of structures. BIM models in such approaches serve as static information containing as-built data. The objective of this paper is to take one step forward from static toward dynamic BIM by representing and visualizing real-time SHM data. The proposed framework features online visualization of data, real-time system identification, and efficient decision-making. In this paper, a steel bridge located in London, Ontario (Canada), is used as a case study in which BIM and SHM are integrated in a unified fashion. The proposed framework attempts to improve the visualization of SHM data and assists infrastructure owners in real-time tracking of critical transport infrastructure.

Keywords: structural health monitoring, building information modeling, visualization, long-term monitoring, system identification, Autodesk, TVF-EMD, bridge management systems

### INTRODUCTION

Civil infrastructure such as bridges, buildings, dams, wind turbines, and pipelines is prone to deterioration as it ages. Keeping track of its usage, performance, and integrity helps maintain public safety and improves the satisfaction of infrastructure owners and end-users. In the past, numerous catastrophic failures occurred worldwide; most of these tragedies were due to progressive deterioration of structures over the years (Mirza and Shafqat Ali, 2017; Cawley, 2018), which creates an immediate need for systematic monitoring of structures based on their current condition. Structural Health Monitoring (SHM) offers attractive strategies to retain public safety, undertake rapid infrastructure management, and recover a structure from its critical state with ease (Durager et al., 2013; Newhook and Edalatmanesh, 2013). Changes in structural performance can be identified by detailed SHM assessments (Okasha and Frangopol, 2012; Miao et al., 2018).

Edited by:

Eleni N. Chatzi, ETH Zürich, Switzerland

#### Reviewed by:

Saeed Eftekhar Azam, University of Nebraska System, United States Yi-Qing Ni, Hong Kong Polytechnic University, Hong Kong

> \*Correspondence: Ayan Sadhu asadhu@uwo.ca

#### Specialty section:

This article was submitted to Structural Sensing, a section of the journal Frontiers in Built Environment

Received: 04 December 2019 Accepted: 01 May 2020 Published: 28 May 2020

#### Citation:

Singh P and Sadhu A (2020) System Identification-Enhanced Visualization Tool for Infrastructure Monitoring and Maintenance. Front. Built Environ. 6:76. doi: 10.3389/fbuil.2020.00076

SHM offers robust diagnostic and prognostic tools that can detect critical responses of a structure and evaluate any unusual symptoms, serviceability, and safety concerns (Carden and Fanning, 2004; Wu and Jahanshahi, 2018). Most conventional inspection methods require visual inspection by maintenance engineers. Recently developed SHM techniques utilize the measured data acquired by sophisticated sensors (Ellenberg et al., 2014; Sankarasrinivasan et al., 2015; Chen et al., 2016; Na and Baek, 2017; Sadhu, 2017; Sony et al., 2019), which, using continuously monitored data, can improve the accuracy of damage detection compared with visual inspection (Abe, 1998). An SHM system, with the aid of long-term monitored data, can evaluate the structural integrity and perform accurate damage assessment (Aktan and Grimmelsman, 1999; Somwanshi and Gawalwad, 2016; Sadhu et al., 2017). Most of these techniques (Farrar and Worden, 2013) are data-driven in nature, where either modal (e.g., natural frequency, damping, and mode shapes) or physical (e.g., stiffness and mass) parameters are estimated or tracked based on the measured data. Any discrete or progressive changes in these parameters are considered potential condition indicators for damage identification.

**Figure 1** shows a typical representation of an SHM system where the sensors are connected to a data acquisition (DAQ) system, and send the acquired raw data to a central unit or a computer. The following post-processing phase includes sorting and de-noising of the data, and determines the vital information, including critical deflections and modal parameters such as frequency, damping ratio, and mode shapes, etc. Once such parameters are studied and documented from the measured structural response over a long period, automated alerts can be set up using the appropriate thresholds for safe and reliable use of the public infrastructure. However, the interpretation of long-term data collected from continuous monitoring can be overwhelming due to the processing of an enormous amount of data. Automated processing and visualization of data facilitate accurate decision making in a timely manner. Building Information Modeling (BIM) is a digital representation of physical and functional characteristics of a structure (Ren et al., 2019), which is utilized here for structural monitoring and maintenance.
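The threshold-based alerting described above can be illustrated with a short sketch. It is not part of the original study: the function name `frequency_alerts`, the 5% tolerance, and the sample values are hypothetical and chosen only to show the idea of flagging a tracked modal frequency that drifts from its baseline.

```python
def frequency_alerts(baseline_hz, tracked_hz, tolerance=0.05):
    """Return indices of records whose identified frequency deviates
    from the baseline by more than `tolerance` (relative change).

    The 5% default is an illustrative assumption, not a recommended value.
    """
    alerts = []
    for index, value in enumerate(tracked_hz):
        relative_change = abs(value - baseline_hz) / baseline_hz
        if relative_change > tolerance:
            alerts.append(index)
    return alerts


# Hypothetical long-term record of one identified frequency (Hz).
history = [2.50, 2.49, 2.51, 2.30, 2.48]
print(frequency_alerts(2.50, history))
```

In practice the tolerance would have to be set from the observed long-term variability of each parameter, since environmental effects alone can shift identified frequencies.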

BIM is not only a computer-aided design tool but also a three-dimensional (3D) modeling and information management tool that can help stakeholders and infrastructure owners monitor projects remotely. Traditional BIM aims at the design and life-cycle analysis of a new building and its construction (Arayici and Aouad, 2010; Grilo and Jardim-Goncalves, 2010; Liu et al., 2014; Singh and Sadhu, 2019). BIM is capable of integrating various engineering aspects through 3D spatial representation. BIM is not limited to being a software environment; it also serves as a visualization tool that provides a better understanding of the project and helps designers convey design information and ideas to the project owners (Ivson et al., 2019). Because all the information about each component is held in a single model, end-users can access it at any time during the life-cycle of the structure. This study uses this capability as a big-data inventory and explores how it can provide a real-time representation of SHM data to end-users. During long-term monitoring of a structure, raw and preprocessed data can add up to hundreds of gigabytes, which makes data retrieval prone to errors (Alampalli et al., 2016; Cremona and Santos, 2018; Almasri et al., 2019). Damage detection can be visualized in the model by assimilating the sensor data within the BIM model. Conventionally, BIM uses a static data source to assess the structure. Linking the sensor data to BIM can therefore extend the BIM model from static to dynamic, as it enables real-time data retrieval and interpretation of the current performance of the structure.

Recently, there have been several efforts to develop BIM-based SHM strategies. For example, Zhang and Bai (2015) created a low-cost structural condition assessment device that used the BIM computing environment for automated health management of structures. Chen et al. (2014) developed a dynamic BIM framework by building a prototype to insert real-time data into the BIM model. The dynamic BIM model developed in that study represented real-time building information by connecting the sensor data with the BIM model. A geothermal bridge deck de-icing system monitored with embedded sensors was used as a case study. Delgado et al. (2017) formulated a standard data model to include and visualize SHM data directly in BIM models. A case study was conducted on a pre-stressed concrete girder bridge featuring a fiber-optic based SHM system. The goal of accurately representing the SHM sensory system, including damage-sensitive features in the object properties, was achieved by Grosso et al. (2017). The authors demonstrated the linking of data to sensor representations within the BIM model.

The viability of bridge information modeling with different modules of bridge management systems was explored by Marzouk and Hisham (2011). Huston et al. (2016) worked on the integration of BIM and decision-making systems with SHM, involving the collection, storage, transmission, and processing of information obtained from sensor data and design documents. An extended Industry Foundation Classes (IFC) schema, referred to as IFC Monitor, was formulated by Theiler et al. (2017) to facilitate the documentation of SHM systems, since the current schema was unable to support the full description of modeling-related information. The automatic generation of parametric building models of SHM systems and efficient integration with other data sets was enabled by Delgado et al. (2018). Recently, Boddupalli et al. (2019) developed a data visualization tool for systematic decision making using the computing environment of BIM as a primary platform.

A significant limitation of these studies is the lack of a single standardized neutral exchange format for sharing information among the various software tools. The problem arises when attempting to extract data from sensors that use many different protocols. The handling of large volumes of data requires high-performance hardware. Lack of interoperability is another challenge in the seamless integration of an SHM system with the BIM platform. Many software packages are commercially available for the modeling and development of structures. However, computational tools such as add-ins or plug-ins are developed in a standalone fashion, which makes them inefficient at addressing the complications arising from multiple data sources. The existing BIM-based SHM tools lack interoperability and information sharing with other software and technology (Grilo and Jardim-Goncalves, 2010; Cemesova et al., 2015; Karan and Irizarry, 2015; Tomasi et al., 2015). Moreover, system identification and the evolution of structural parameters over time are not available in the existing visualization tools in the literature, which forms the main focus of the proposed research.

After a basic introduction of BIM and SHM techniques and identification of the limitations in the current literature, the proposed method is discussed in the next section. The proposed framework is finally illustrated using a case study consisting of visualization of the bridge SHM data followed by results and conclusions.

### PROPOSED FRAMEWORK

This section provides an overview of the methodology implemented to visualize SHM information within BIM through Autodesk REVIT®. The proposed framework utilizes the relative merits of SHM and BIM to develop a visualization tool for monitoring of large-scale infrastructure. This study uses REVIT and the MATLAB (MathWorks, 2018) online portal to integrate the sensor information with condition data and diagnostic results. Virtual sensors are used to visualize the monitoring-related information in the BIM environment. The accelerometers used for vibration data collection are modeled in REVIT as a new class of family, as shown in **Figure 2**. Sensor metadata is used to create a sensor family and can be accessed by highlighting any sensor in the BIM model. The dynamic behavior of the structure is analyzed using the sensor data in MATLAB.

### Virtual Sensors

Virtual sensors, used in this study to mimic the sensors installed on a real structure, are created as a new REVIT family using the IFC standard of data exchange, as shown in **Figure 3**. IFC is used by building-model based applications to exchange data with each other, and it constitutes a specification that can describe model data related to all phases of the life-cycle of a project (Augenbroe et al., 2004; Rio et al., 2013). The IFC model represents tangible building elements such as doors, walls, ceilings, and beams, as well as more abstract entities such as time, schedule, space, cost, and organization. There are different IFC classes for each element, and the sensors are included in the IFCBuildingControls domain module. There are two classes associated with sensors: IFCSensor and IFCSensorType. As the sensors are defined in the BIM environment, sensor information can be accessed using the properties box of each sensor. The link to MATLAB is also provided in the properties box. By clicking the MATLAB link, the user is taken to the MATLAB online portal, where system identification scripts can be run. Subsequently, the system identification results can be analyzed for decision-making.
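As a rough illustration of the information attached to one virtual sensor, the properties box can be thought of as a small metadata record. The sketch below is an assumption made for illustration only: the class name `VirtualSensor`, its field names, and the example values (including the placeholder URL and file path) are hypothetical and do not reproduce an actual REVIT family or IFC schema.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class VirtualSensor:
    """Metadata shown in the properties box of one virtual sensor.

    Field names are hypothetical; they mirror the kinds of properties
    the text lists, not an actual REVIT or IFC schema.
    """
    serial: str
    location: str
    sampling_frequency_hz: float
    datasheet_link: str  # path or URL of the raw DAQ text file
    matlab_link: str     # URL of the online system identification script


# Placeholder values for illustration only.
sensor = VirtualSensor(
    serial="ACC-01",
    location="centerline",
    sampling_frequency_hz=200.0,
    datasheet_link="data/acc01_test1.txt",
    matlab_link="https://example.com/tvf_emd_script",
)

for name, value in asdict(sensor).items():
    print(f"{name}: {value}")
```

Keeping the record immutable (`frozen=True`) reflects the idea that the metadata describes a fixed physical installation, while the linked data file is what changes over time.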

### System Identification

The data and system identification information of SHM is embedded within the BIM software such that long-term health monitoring information can be visualized. A wide range of system identification methods (Dessi and Camerlengo, 2015; Perez-ramirez et al., 2016; Pappalardo and Guida, 2018; Barbosh et al., 2018; Mao et al., 2019) were developed by researchers to estimate modal parameters from the measured vibration data. Most of these techniques are suitable where all critical locations of the structure are instrumented. For the visualization purpose, each sensor installed in a real structure corresponds to a virtual sensor in the BIM model. Therefore, while visualizing a particular virtual sensor, each sensor creates a time history that requires a system identification method capable of using only a single-channel measurement. In this study, a newer time-frequency method, time-varying filter-based empirical mode decomposition (TVF-EMD) (Li et al., 2017; Lazhari and Sadhu, 2019), is used to conduct system identification using single-channel measurement.

Consider a linear, damped, discrete lumped-mass structural system with n degrees-of-freedom (DOFs), subjected to a random input force, u(t):

$$\mathbf{M}\ddot{\mathbf{y}}(t) + \mathbf{C}\dot{\mathbf{y}}(t) + \mathbf{K}\mathbf{y}(t) = \mathbf{u}(t)$$

FIGURE 3 | IFC sensor data.

where **M**, **C**, and **K** are the mass, damping, and stiffness matrices, respectively, and y(t) is the displacement response vector at the available DOFs. A state-space model can be used to find the solution of the dynamical system given above:

$$
\begin{aligned}
\overline{\mathbf{p}} &= \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} \\
\dot{\overline{\mathbf{p}}} &= \mathbf{A}\overline{\mathbf{p}} + \mathbf{B}\mathbf{u} \\
\mathbf{p} &= \widehat{\mathbf{C}}\,\overline{\mathbf{p}} + \mathbf{D}\mathbf{u}
\end{aligned}
$$

where **A** is the state matrix, **B** is the input matrix, $\widehat{\mathbf{C}}$ is the output matrix, and **D** is the transmission matrix. Under excitation u(t), the resulting solution can be written as an expansion in vibration modes:

$$\mathbf{y} = \boldsymbol{\varphi}\boldsymbol{\eta}$$

where **y** and **η** are the response and modal response matrices, respectively, and **ϕ** (of size m × n) is the mode shape (modal transformation) matrix. n and m are the number of modal responses and measurements, respectively. The measurement at the k-th DOF (k = 1, 2, ..., m) from the above equation can be expressed as:

$$y_{k}(t) = \sum_{j=1}^{n} \varphi_{kj}\,\eta_{j}(t)$$

TVF-EMD is capable of eliminating mode-mixing or end-effects in the presence of closely spaced modes or measurement noise. This method performs local cut-off filtering, in which a signal is filtered into local high-pass and low-pass frequency components and decomposed into narrowband signal components called Intrinsic Mode Functions (IMFs). By performing TVF-EMD of the k-th measurement $y_k(t)$ in terms of its IMFs (i.e., $i_{kj}$), one can get (Lazhari and Sadhu, 2019):

$$y_{k}(t) = \sum_{j=1}^{n} i_{kj}(t)$$

By comparing the above two equations, we get:

$$
\varphi_{kj}\,\eta_{j}(t) = i_{kj}(t)
$$

By taking the ratio of above equations for k-th and l-th DOF, the normalized mode shape ordinates at k-th DOF w.r.t. l-th DOF can be found as:

$$\frac{i_{kj}(t)}{i_{lj}(t)} = \frac{\varphi_{kj}\,\eta_{j}(t)}{\varphi_{lj}\,\eta_{j}(t)} = \frac{\varphi_{kj}}{\varphi_{lj}} = \widehat{\varphi}_{kl}$$
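The ratio above can be checked numerically with a small sketch. This is an illustration under stated assumptions, not the authors' implementation: the IMFs are synthesized directly from an assumed modal response instead of being produced by TVF-EMD, the function name `mode_shape_ratio` and the mode shape ordinates 0.6 and 1.0 are hypothetical, and a least-squares estimate over the whole record is used in place of a pointwise ratio, which would be ill-conditioned near zero crossings.

```python
import math


def mode_shape_ratio(imf_k, imf_l):
    """Least-squares estimate of phi_kj / phi_lj from the j-th IMFs
    of channels k and l (equal-length sequences)."""
    numerator = sum(a * b for a, b in zip(imf_k, imf_l))
    denominator = sum(b * b for b in imf_l)
    return numerator / denominator


fs = 200.0  # sampling frequency in Hz
time = [n / fs for n in range(400)]
# Hypothetical modal response: lightly damped 2.5 Hz oscillation.
eta = [math.exp(-0.05 * t) * math.sin(2 * math.pi * 2.5 * t) for t in time]

phi_k, phi_l = 0.6, 1.0  # assumed mode shape ordinates
imf_k = [phi_k * e for e in eta]
imf_l = [phi_l * e for e in eta]

print(round(mode_shape_ratio(imf_k, imf_l), 6))
```

Because both synthetic IMFs share the same modal response, the estimate recovers the assumed ratio of the two ordinates; with measured data, noise and imperfect decomposition would make the estimate approximate.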

The details of the TVF-EMD method can be found in Lazhari and Sadhu (2019) and are not repeated here. TVF-EMD uses the root-mean-square (RMS) value of the resulting IMFs to extract the modal responses. However, an energy level above the average RMS value is not, by itself, sufficient to differentiate the actual structural frequencies from background noise. To automate the identification step, the following criterion is proposed: when the difference between the respective Fourier peaks in an IMF exceeds a specified percentage (say, 70%) of the higher peak value, the IMF is taken to represent a structural modal response rather than a mixed modal response.
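The peak-difference check described above can be sketched as follows. The sketch rests on an interpretation that the source does not spell out: the "respective Fourier peaks" are taken to be the two largest local maxima of the IMF's Fourier amplitude spectrum. The function names and the toy spectra are hypothetical.

```python
def local_peaks(amplitudes):
    """Amplitudes of interior samples strictly larger than both neighbours."""
    return [
        amplitudes[i]
        for i in range(1, len(amplitudes) - 1)
        if amplitudes[i - 1] < amplitudes[i] > amplitudes[i + 1]
    ]


def is_mono_component(amplitudes, min_drop=0.70):
    """Apply the peak-difference check to one IMF's Fourier amplitudes.

    Assumption: the check compares the two largest local peaks.
    """
    peaks = sorted(local_peaks(amplitudes), reverse=True)
    if not peaks:
        return False  # no peak at all: nothing to identify
    if len(peaks) == 1:
        return True   # a single peak is trivially dominant
    highest, second = peaks[0], peaks[1]
    return (highest - second) > min_drop * highest


# Toy spectra: one dominant peak versus two comparable peaks.
clean_imf = [0.0, 0.1, 1.0, 0.1, 0.05, 0.2, 0.05, 0.0]
mixed_imf = [0.0, 0.1, 1.0, 0.1, 0.05, 0.6, 0.05, 0.0]
print(is_mono_component(clean_imf), is_mono_component(mixed_imf))
```

The 70% value is the example figure given in the text; the `min_drop` parameter is exposed so that other percentages can be tried.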

### Proposed Visualization and Decision-Making

The proposed framework offers three advantages: online visualization of data, real-time system identification, and decision making by tracking the system identification results obtained from the measured data. **Figure 4** shows the proposed framework that can automate system identification and visualization of SHM data in the BIM environment. First, a parametric 3D model of the structure is developed in Autodesk REVIT. Since virtual sensors are not predefined elements in the REVIT library, they are manually created using a new REVIT family and IFC attributes. Meanwhile, physical sensors connected to a DAQ system record the SHM data for structural condition assessment. Accelerometers are thus used to collect the SHM data, and virtual sensors are created in the BIM environment using IFC (as shown in **Figure 3**) to mimic the physical sensors on site. The data file from each physical sensor is associated with the respective virtual sensor in REVIT. System identification is performed using the TVF-EMD algorithm, which is integrated with REVIT through an online MATLAB portal linked via the "Properties" box of the virtual sensor in the BIM model. Owing to its capability of analyzing the data of the single sensor associated with a virtual sensor, TVF-EMD is adopted to undertake system identification from single-channel data. It is automated and can be implemented in real-time for condition assessment of structures within the BIM platform. A case study is presented next to demonstrate the implementation of the developed framework.

## CASE STUDY

The proposed framework is validated using a 300-foot pony-truss bridge in London, Ontario, shown in **Figure 5A**. This section demonstrates the application of the visualization tool developed in this study. Bridge vibrations were monitored while different numbers of vehicles traveled over the bridge. A virtual model of the bridge was developed in REVIT, and the sensor data was integrated with the virtual sensors. System identification results from the SHM data were shown in a user-friendly format integrated with the visualization platform of REVIT.

### BIM Model

For the framework presented in this study, a structural model is developed that closely represents the real bridge. The data attributes define the physical, geometrical, and abstract properties of the structure. REVIT is used as a BIM tool to visualize the bridge virtually. With the help of 2D drawings provided by the City engineers, a 3D virtual model of the bridge with generic parameters and properties is developed using REVIT, as shown in **Figure 6**.

This model is used to define the real-time dynamic behavior of the bridge and to visualize long-term monitored SHM data. The sensors fed the vibration data to the DAQ system, which was connected to a computer. The raw data file generated by the DAQ system was used to perform system identification and served as a link to connect the BIM model with the MATLAB online portal. Virtual sensors, which are not pre-defined in Autodesk REVIT, were manually created in the BIM model. A new REVIT family was used to create the virtual accelerometer, and the IFC exchange format was used to define the virtual sensor attributes, as shown in **Figures 2**, **3**, respectively. **Figure 6** shows the virtual sensors placed in the BIM model of the bridge. Properties related to the sensors used for this particular study were defined in REVIT, as shown in **Figure 7**. Upon selecting a particular virtual sensor on the bridge, its properties box shows all the data associated with that specific sensor, including sampling frequency, raw datasheet location, sensor serial number and location, and the MATLAB link for system identification.

### Instrumentation

The bridge was instrumented with accelerometers to evaluate its modal parameters and to analyze and predict its structural health. Nine high-sensitivity sensors were placed along the walkway of the bridge and set up to measure uniaxial vertical vibration. The sensors used for the testing had a sensitivity of 10 V/g. A sampling frequency of 200 Hz was used. Sensors were placed at distances of 10, 20, 50, and 100 feet on both sides of the centerline of the bridge, as shown in **Figure 5A**. The data were collected through the DAQ system, which was connected to the sensors using BNC cables and to a laptop using a USB cable, as shown in **Figures 5B,C**. Test details regarding the number and class of vehicles during each test run are tabulated in **Table 1**. The duration of each test was between 30 s and 5 min. Tests 4, 5, and 6 include the free vibration response recorded while a single subject jumped near the center of the bridge.

TABLE 1 | Vehicle count of the test data.

6 3 Jumps of a single subject

FIGURE 8 | DAQ file containing raw and unprocessed data.
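The instrumentation settings reported in this section (200 Hz sampling, 10 V/g sensitivity, records of 30 s to 5 min) imply some simple quantities that are useful when reading the raw data, and the sketch below works them out. The function names are hypothetical, and the voltage conversion assumes that the DAQ file stores raw sensor voltage with no additional gain, which the source does not state.

```python
SAMPLING_HZ = 200.0         # sampling frequency reported for the tests
SENSITIVITY_V_PER_G = 10.0  # accelerometer sensitivity reported for the tests


def record_summary(duration_s, sampling_hz=SAMPLING_HZ):
    """Sample count, Nyquist frequency, and Fourier frequency resolution."""
    return {
        "samples": int(round(duration_s * sampling_hz)),
        "nyquist_hz": sampling_hz / 2.0,
        "resolution_hz": 1.0 / duration_s,
    }


def volts_to_g(voltage, sensitivity=SENSITIVITY_V_PER_G):
    """Convert an accelerometer output voltage to acceleration in g."""
    return voltage / sensitivity


print(record_summary(30.0))   # shortest test duration
print(record_summary(300.0))  # longest test duration
print(volts_to_g(0.25))       # hypothetical reading of 0.25 V
```

With a 200 Hz sampling rate, frequencies up to 100 Hz can be resolved without aliasing, and longer records give a finer frequency resolution.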

### Implementation of Proposed Visualization Tool

The data collected from the bridge and the MATLAB scripts (MathWorks, 2018) were linked with the virtual sensors that were modeled in REVIT, as shown in **Figures 2**, **7**. When a sensor is selected, its related properties are shown in the properties box, including serial number, date, time, sensor location, sampling frequency, datasheet link, and MATLAB link. The properties box for a highlighted sensor is shown in **Figure 8**. After clicking on the MATLAB link for a particular sensor, the user is taken to the MATLAB online portal, which performs the system identification using the datasheet assigned to the specific sensor. The DAQ file, containing the raw and unprocessed data collected by the sensors, is saved as a text file and is shown in **Figure 8**.



This file is linked with the virtual sensor of the BIM model of the bridge and can be accessed by clicking on the datasheet link in the properties box of the virtual sensor. This text file is also uploaded to the MATLAB online compiler along with the scripts of the TVF-EMD method. When a sensor is clicked in the BIM model, it is highlighted and its properties box appears in the REVIT window, containing all necessary sensor information, as shown in **Figure 8**. By clicking on the datasheet link in the properties box, the user is taken to the raw data file linked to that particular sensor, which contains the unprocessed data. By clicking the MATLAB link, highlighted in **Figure 9**, the user is taken to the MATLAB online portal. In the portal, by executing the MATLAB scripts, system identification results can be generated from a single-channel measurement through the TVF-EMD method.

The framework presented in this study is used to perform modal identification using a single sensor measurement. The time history of the physical response of the bridge is shown in **Figure 10**. The method used in this study successfully extracts the mono-component modal responses. The resulting IMFs (i.e., extracted modal responses) are separated by the TVF-EMD algorithm. The resulting mono-component responses and the identified structural frequencies are discussed below.

TABLE 3 | Frequencies (in Hz) identified from the different sensors.


### Free Vibration

Free vibration tests were conducted to estimate the natural frequencies of the bridge. To achieve this, the bridge was excited by jumping of a subject at the center of the bridge. This test has another significance of mimicking the pedestrian activity (walking or running) on the bridge. As shown in **Figure 11A** and **Table 2**, around four Fourier peaks can be observed between 0 and 20 Hz, indicating four natural frequencies of the bridge in this range, which are consistent with traffic-induced vibration.
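Reading natural frequencies from Fourier peaks in the 0 to 20 Hz band can be illustrated with a synthetic free-decay signal. The sketch is not the procedure used in the study: the two frequencies (3 Hz and 8 Hz), the decay rate, the 20% peak floor, and the function names are assumptions chosen for illustration, and a plain discrete Fourier transform is written out so that the example has no external dependencies.

```python
import cmath
import math


def dft_magnitudes(signal, fs, f_max):
    """DFT magnitudes for frequency bins from 0 Hz up to f_max (Hz)."""
    n = len(signal)
    k_max = int(f_max * n / fs)
    freqs, mags = [], []
    for k in range(k_max + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(signal)
        )
        freqs.append(k * fs / n)
        mags.append(abs(coeff))
    return freqs, mags


def pick_peaks(freqs, mags, min_fraction=0.2):
    """Frequencies of local maxima above min_fraction of the largest magnitude."""
    floor = min_fraction * max(mags)
    return [
        freqs[i]
        for i in range(1, len(mags) - 1)
        if mags[i - 1] < mags[i] > mags[i + 1] and mags[i] >= floor
    ]


fs = 200.0
time = [i / fs for i in range(400)]  # 2 s record
# Hypothetical free-decay response with two modes at 3 Hz and 8 Hz.
signal = [
    math.exp(-t) * (math.sin(2 * math.pi * 3.0 * t)
                    + 0.7 * math.sin(2 * math.pi * 8.0 * t))
    for t in time
]
freqs, mags = dft_magnitudes(signal, fs, f_max=20.0)
print(pick_peaks(freqs, mags))
```

The frequencies in this example are placeholders and are not the identified frequencies of the bridge, which are reported in the tables of the paper.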

### Traffic-Induced Vibration

Test runs are selected for analysis so as to represent live traffic conditions. All the test runs except 4, 5, and 6 include the structural response generated by vehicles passing over the bridge. Tests 1, 2, and 3 are selected for further analysis as their vehicle counts are 11, 22, and 31, respectively, which represent a wide range of vehicles passing over the bridge. **Figure 11B** shows the processed data from the first three tests, which cover most of the range of vehicle count. As seen in the figure, the Fourier amplitudes increase with the number of vehicles on the bridge.

TVF-EMD is used to obtain the bridge frequencies from the single-channel vibration response generated by a bus driving over the bridge. The resulting IMFs are separated by the TVF-EMD method, and the mono-component responses and the identified structural frequencies are shown in **Figure 12**. **Table 3** contains the modal frequencies obtained from different sensors using the proposed framework.

## CONCLUSIONS

This study investigated the potential of BIM in data management and maintenance of infrastructure using a web-based workflow. The use of different data formats can be avoided since the process is web-based and features real-time integration of sensor data with the BIM model. The proposed framework enhances software interoperability and frequent communication, which are required on civil infrastructure projects. The extension of the BIM model from static to dynamic enables a real-time link between data-driven SHM techniques and BIM software. The web-based approach can be utilized to identify the modal frequencies from the sensor data using the TVF-EMD method adopted in this study. In this way, system identification is integrated within the BIM model, which can be beneficial for better interpretation of SHM data. By linking the real sensory data with the virtual sensor, this study extends the BIM model from static to dynamic and provides an effective management and visualization tool for engineers and project owners at large by providing them with updated, monitored information.

This paper attempts to integrate system identification within the framework of BIM. The authors used a single-sensor based modal identification technique, which enabled visualization of the bridge frequencies at a given sensor location for different periods of time. State-of-the-art SHM methods (Farrar and Worden, 2013) use modal (frequencies, damping, and mode shapes) or physical (stiffness or mass) parameters as the condition indicators. In this study, the authors limited their focus to the dynamic representation of frequency estimation. The inclusion of other parameters (such as mode shapes) within the BIM model requires further advances and is reserved for future research.

The proposed tool will allow a bridge engineer to virtually monitor a bridge and visualize both raw data and system identification results from different periods in a systematic manner, which will save time and reduce the human errors associated with manual inspection of large data sets. Unlike conventional SHM data management, the developed BIM model enables real-time digital representation of SHM information throughout the life-cycle of infrastructure, enhancing the quality and assessment of infrastructure. By linking a single-sensor based system identification within the BIM model, it is possible to diagnose the bridge frequencies using the data from different periods of time from a selected node. Future research is reserved to utilize Augmented or Virtual Reality and to automate the digital representation of SHM data from multiple sensors simultaneously.

### DATA AVAILABILITY STATEMENT

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation, to any qualified researcher.

## AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

### FUNDING

The proposed research was funded by the Natural Sciences and Engineering Research Council (NSERC) of Canada through the last author's Discovery Grant.

### ACKNOWLEDGMENTS

We would like to thank the engineers of the City of London, Jane Fullick and Karl Grabowski, for their valuable collaboration and assistance during the data collection on the City bridge, and members of AS research team for their active participation in the bridge instrumentation.

### REFERENCES


**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Singh and Sadhu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
