Monitoring Neutral Axis Position Using Monthly Sample Residuals as Estimated From a Data Mining Model

Structural Health Monitoring (SHM) has enabled the condition of large structures, like bridges, to be evaluated in real time. In order to monitor behavioral changes, it is essential to identify parameters of the structure that are sensitive enough to capture damage as it develops while being stable enough during ambient behavior of the structure. Research has shown that monitoring the neutral axis (N.A.) position satisfies the first criterion of sensitivity; however, monitoring N.A. location is challenging because its position is affected by the loads applied to the structure. The motivation behind this research comes from the greater than expected impact of various load characteristics on observed N.A. location. This paper develops an indirect way to estimate the characteristics of vehicular loads (magnitude and lateral position of the load) and uses a data mining approach to predict the expected location of the N.A. Instead of monitoring the behavior of the N.A., in the proposed method the residuals between the monitored and predicted N.A. location are monitored. Using actual SHM data collected from a cable-stayed bridge, over a 2-year period, the paper presents the steps to be followed for creating a data mining model to predict N.A. location, the use of monthly sample residuals of N.A. to capture behavioral changes, the ability of the method to distinguish between changes in the load characteristics from behavioral changes of the structure (e.g. change in response due to cracking, bearings becoming frozen, cables losing tension, etc.), and the high sensitivity of the method that allows capturing of minor changes.


INTRODUCTION
The objective of Structural Health Monitoring (SHM) is to provide a diagnosis of the condition of a structure in real time or near real time. This allows behavioral changes of the structure to be captured and monitored, potentially reducing the cost of maintenance. SHM systems can also be used for early detection of defects that could affect the behavior of a structure, and thereby prevent possible catastrophic failures.
The Neutral Axis (N.A.) of a structural bending element, like a beam, is the intersection of the cross-section in the undeformed geometry with the neutral surface of the bending element, along which no change in length occurs during the deformation of the beam (Shames, 1975). In symmetrical pure bending, the N.A. passes through the area centroid of the cross-section of the beam. When axial loads are applied, the N.A. location is offset from the centroid, while unsymmetrical bending rotates it around the centroid (Boresi and Schmidt, 2003).
On bridges, on which this research is focused, the location of the N.A. in pure bending has been shown to be independent of temperature (Anastasopoulos et al., 2019). For pure uniaxial bending, the N.A. has been shown to be independent of the magnitude of the load (Sigurdardottir and Glisic, 2013), while in wide deck bridges it has been shown that the applied loads affect N.A. location due to biaxial bending effects (O'Brien and Keogh, 1999). In composite structures, N.A. location has been used as a parameter to identify the presence of composite action (Chakraborty and DeWolf, 2006; Yarnold et al., 2018) and also to detect the loss of composite action. For steel girder bridges with composite concrete decks, the N.A. is ideally located in the steel, thereby avoiding the development of tensile cracks in the deck. In several studies, the N.A. location has been successfully used to detect minor damage (Stroh et al., 2010; Sigurdardottir and Glisic, 2014; Tang and Ren, 2017; Xia et al., 2018; Anastasopoulos et al., 2019) and severe damage (Gangone et al., 2014; Anastasopoulos et al., 2019).
In long-term monitoring of N.A. location, researchers have observed that its position has varied over time, sometimes by a significant amount, and that variation followed a Gaussian distribution around a mean value (Rauert et al., 2011; Oshima and Kado, 2012; Sigurdardottir and Glisic, 2013). Researchers have connected this variability to the uncertainty of the applied

Research Objectives
The objectives of the research were to 1) identify the factors that affect girder N.A. location when a bridge deck is subjected to traffic loads, 2) use data collected from an SHM system to estimate the effect of these factors, 3) train a model to predict N.A. location, and 4) monitor the differences between the predictions and the actual location as calculated from the measured data. In contrast with prior research, the factors affecting N.A. location were considered in the monitoring process. By doing this, the monitoring process becomes less sensitive to truck weight and transverse truck position, so that N.A. movements caused by the structure itself are easier to isolate. Evaluating N.A. location movements with this added sensitivity enables structural condition changes to be identified in their early stages of development. The work done herein also represents one of the first known attempts to continuously monitor the N.A. behavior of bending elements in a long-span bridge (a span length greater than 250 ft (76 m)).

Cable-Stayed Bridge Description
The Charles W. Cullen Bridge, or as it is more commonly known, the Indian River Inlet Bridge (IRIB), is a three-span cable-stayed bridge located in Sussex County, Delaware (Shenton et al., 2017).

Structural Health Monitoring System
The Center for Innovative Bridge Engineering at the University of Delaware (UD), in cooperation with the Delaware Department of Transportation (DelDOT), planned and installed a comprehensive SHM system on the bridge during construction. This system consists of more than 120 individual sensors, distributed throughout the structure, collecting different types of data to evaluate the condition of the bridge (Figure 2). Strain sensors are installed at different locations along the two longitudinal edge girders of the bridge, in pairs, one located at the top ($\varepsilon_{TOP}$) and one at the bottom ($\varepsilon_{BOT}$) of the girder (Figure 3). Having two strain measurements on the cross-section, and assuming that the strains developed on the girder vary linearly through the depth, one can calculate the distance y from the bottom of the girder to the N.A. (Figure 3).
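The linear-strain assumption can be illustrated with a short sketch. Under that assumption the zero-strain point lies at a height of h · ε_BOT / (ε_BOT − ε_TOP) above the bottom gauge, where h is the vertical spacing of the two gauges. The function name, the `bottom_offset` parameter, the sign convention (tension positive), and the 60 in spacing used in the example are all assumptions made for illustration; they are not taken from the source.

```python
def neutral_axis_height(eps_top, eps_bot, sensor_spacing, bottom_offset=0.0):
    """Height of the zero-strain point above the bottom of the girder.

    Assumes a linear strain profile between the two gauges.
    eps_top, eps_bot : strains at the top and bottom gauges (same units).
    sensor_spacing   : vertical distance between the gauges (hypothetical value).
    bottom_offset    : height of the bottom gauge above the girder bottom
                       (hypothetical parameter; 0 if the gauge sits at the bottom).
    """
    denom = eps_bot - eps_top
    if denom == 0:
        # Equal strains mean pure axial strain: no unique zero-strain point.
        raise ValueError("top and bottom strains are equal")
    return bottom_offset + sensor_spacing * eps_bot / denom


# Example: 30 microstrain tension at the bottom, 10 microstrain compression
# at the top, gauges assumed 60 in apart.
y = neutral_axis_height(eps_top=-10.0, eps_bot=30.0, sensor_spacing=60.0)
```

Because the denominator is the difference of the two readings, the result becomes ill-conditioned when both strains are small.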
This research focused on the N.A. position of the west girder at two locations: the middle of the Main Span (MS) and the location on the backspan that controls the load rating of the bridge, the Controlling Location (CL) (Figure 1). These locations were selected because they experience the highest live-load strains.

DIAGNOSTIC LOAD TESTS CONDUCTED IN THE IRIB
Non-destructive diagnostic load tests of the IRIB have been conducted periodically to evaluate its condition under controlled conditions and for known truck loads (Al-Khateeb et al., 2019; Chajes et al., 2020). The first load test was conducted after the completion of construction (April 2012), followed by tests after 6 months, 1 year, and 2 years, and repeated every 2 years after that. The tests are conducted at night to reduce thermal and solar effects and to reduce impacts on local traffic. Different truck configurations are used, including single-, four-, and six-truck passes. Normal traffic is restricted during the load passes. A 10-15 s window is captured before each pass, with the bridge free of traffic, which is used to zero the initial measurements. The N.A. location was calculated for all of the load passes at the two locations (MS and CL) on the bridge. Strain-time histories were post-processed using a 0.32 s moving average to eliminate some of the inherent noise. The 0.32 s average amounts to a 5-point average for the 12.5 Hz sample rate used by the SHM system to capture live load traffic effects. It was found to eliminate most of the low-level noise and vibration of the live load strains and still yield an accurate measurement of the peak strain. To be consistent, the same averaging window was used for the load test results. Due to remaining noise, and the imperfection in replicating the passes, variability still exists when comparing similar passes. To further minimize the effect of this variability on the calculation of the N.A., the location was calculated only when the magnitudes of both the top and bottom strain measurements were greater than 10 με, because small strains can yield large errors in the calculation of the N.A., as is evident from inspection of Eq. 1. Furthermore, the limit of 10 με is consistent with the general agreement among different studies on the N.A. (Sigurdardottir and Glisic, 2015).
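The smoothing and the 10 με screening described above can be sketched as follows. This is a minimal illustration, not the authors' processing code: whether the moving average is centered or trailing is not stated in the source (a centered window is assumed here), and the 60 in gauge spacing and all function names are hypothetical. The last lines show why small strains are screened out: the same 1 με reading error moves the computed zero-strain height far more at small strains than at large ones.

```python
def moving_average(samples, window=5):
    """Centered moving average; near the ends, average the samples that exist."""
    half = window // 2
    smoothed = []
    for i in range(len(samples)):
        lo = max(0, i - half)
        hi = min(len(samples), i + half + 1)
        smoothed.append(sum(samples[lo:hi]) / (hi - lo))
    return smoothed


def exceeds_limit(eps_top, eps_bot, limit=10.0):
    """True when both gauge readings exceed the limit in magnitude (microstrain)."""
    return abs(eps_top) > limit and abs(eps_bot) > limit


def zero_strain_height(eps_top, eps_bot, spacing):
    """Zero-strain height above the bottom gauge for a linear strain profile."""
    return spacing * eps_bot / (eps_bot - eps_top)


SPACING = 60.0  # hypothetical gauge spacing in inches, not taken from the source

# Effect of a 1 microstrain error in the top reading, at small and at large strains.
shift_small = abs(zero_strain_height(-1.0, 6.0, SPACING)
                  - zero_strain_height(-2.0, 6.0, SPACING))
shift_large = abs(zero_strain_height(-19.0, 60.0, SPACING)
                  - zero_strain_height(-20.0, 60.0, SPACING))
```

With these illustrative numbers, `shift_small` is several inches while `shift_large` is a fraction of an inch.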

Static Loads
The examination of the data started with the truck "pass" providing the best conditions for a consistent estimate of the N.A. location: a static load resulting from six side-by-side trucks stopped at the middle of the main span. Figure 4A shows a time history plot of the N.A. location as the six trucks cross the bridge and stop at mid-span. One can see that under these ideal conditions (a large static load) the N.A. location at midspan is quite stable when the vehicle is stationary, varying only slightly around a mean value. Another interesting feature of the behavior is the drop of the N.A. at the beginning, which represents the time that the trucks are approaching mid-span and then stop.

FIGURE 2 | Layout of sensors in the SHM system (Shenton et al., 2017).

Moving (Pseudo-static) Loads
For the moving passes, trucks crossed the bridge at a crawl speed of approximately 5 mph. These applied loads are pseudo-static. Having seen the variability of N.A. over the different passes and the presence of minimums, it was decided to compare the peak values of the different passes over the load tests at the controlling location (Figure 5). Some of the previous observations were confirmed from this graph; however, there are also several new observations:
• The results agree with the well-understood correlation between the N.A. location and the magnitude of the applied load. Basic concrete beam behavior in bending results in an upward movement of the neutral axis with increased bending moment. What is notable here is that, for a bending element with significant dead loads, the effects of live loads on the neutral axis position are so significant (and therefore useful in detecting structural condition changes). In the six-truck passes, the N.A. is noticeably higher than it is for the four-truck passes, and the six- and four-truck passes are both significantly higher than any of the single-truck passes.

LIVE LOAD TRAFFIC DATA COLLECTED FROM THE IRIB
The data from the controlled load tests is good for drawing preliminary conclusions about the behavior of the N.A. and how it changes when the loads are changing. However, more refinement is needed to use the N.A. location to accurately monitor the behavior of the bridge. The variability of the N.A. position for the same truck passes over different tests shows the effect of the noise and the difficulty it would present in using N.A. location measured from a load test as an indication of changes in the bridge. Greater amounts of data would provide more reliable feedback on bridge condition, since statistical parameters could be used to eliminate noise effects. This larger amount of data can be gathered from the ambient traffic data. The continuous data collected by the SHM system were used to capture trucks crossing the bridge and monitor the behavior of the N.A. based on indirect estimations of the vehicle weight and lateral position on the bridge. Data is collected continuously, 24/7, at a rate of 12.5 Hz, by the SHM system. This "high frequency" data is used to capture trucks and other heavy vehicles as they cross the bridge. Unfortunately, the weight and lane in which these vehicles travel are unknowns (there is no weigh-in-motion system on, or immediately adjacent to, the bridge). As seen from the load tests, weight and lateral position of a vehicle affect the location of the N.A., so if these are not considered when monitoring N.A., changes in their distribution could change the distribution of N.A. position and falsely trigger alerts. Therefore, a first objective was to find an indirect way to estimate the weight and lateral position of a vehicle from the live load traffic data. The weight is estimated by the sum of the strains measured on the bottom of the east and west edge girders at the same longitudinal position on the bridge, since these strains are well correlated. 
$$W_{est} = \varepsilon^{W}_{BOT} + \varepsilon^{E}_{BOT}$$
where $W_{est}$ represents the weight-proportional strain, and $\varepsilon^{W}_{BOT}$ and $\varepsilon^{E}_{BOT}$ are the strains on the bottom of the west and east girders, respectively. The lateral position is estimated using the ratio of the strain measured on the bottom of the girder of interest divided by the sum of east and west bottom strains. For the west girder,
$$P = \frac{\varepsilon^{W}_{BOT}}{\varepsilon^{W}_{BOT} + \varepsilon^{E}_{BOT}}$$
where 0 ≤ P ≤ 1. This is a measure of the transverse load distribution to the two edge girders; the closer P is to one, the closer the vehicle is to the girder of interest. Finally, to exclude the effect of the longitudinal position of the load, only the point in time when the largest strain in the bottom sensor occurs is used in the calculation (this was usually also the point at which the top sensor developed its largest strain).
Data collected for the 22 months between March 2015 and December 2016 were used in the analysis. For this period, a 2-min moving average was first used to calculate the strain due to slowly changing loads, such as thermal effects. This moving average was subtracted from the full measurements to yield the strains developed due to vehicles and any other live loads. Since the focus of the N.A. calculation was on traffic loads, it was necessary to understand what variability in the measurement could be caused by system noise and also by dynamic loads that could not be measured, such as wind loads. Wind loads for a long-span bridge like the IRIB can have a large impact even for moderate winds. To reduce the effect of these two factors (noise and unknown dynamic loads), the live load data were smoothed, again using the 0.32 s moving average that was used in the load tests. With the data free of thermal effects and slightly smoothed, a truck pass event was assumed each time both the top and the bottom sensors on the girder of interest exceeded the 10 με limit. These two measurements were collected together with the strain measured on the bottom of the opposite edge girder, to which no strain limit was applied. Using these three values, the N.A. location and the values related to the truck magnitude and the lateral position were calculated for the west girder.
Based on the load tests, in which the trucks were weighed before use, developing 10 με in both sensors (top and bottom) of the girder of interest requires a minimum vehicle weight of 36.5 kips.
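The event extraction described in this section can be sketched as below. The sketch takes already detrended and smoothed strain records for one candidate vehicle crossing, picks the sample at which the west bottom strain is largest, applies the 10 με limit to the two west-girder gauges, and returns the load proxy (sum of the bottom strains) and the lateral-position proxy (west bottom strain divided by that sum). Function names, the dictionary layout, and the sample values are hypothetical.

```python
def load_proxies(west_bot, east_bot):
    """Return (w_est, p): summed bottom strain and the west girder's share of it."""
    w_est = west_bot + east_bot
    if w_est == 0:
        raise ValueError("bottom strains sum to zero; share is undefined")
    return w_est, west_bot / w_est


def extract_event(west_top, west_bot, east_bot, limit=10.0):
    """Evaluate one candidate crossing at the peak west bottom strain.

    Inputs are equal-length lists of microstrain, assumed to be already
    detrended (thermal effects removed) and smoothed.
    Returns a dict with the peak index, w_est and p, or None when the
    west top or bottom strain does not exceed the limit at that sample.
    """
    k = max(range(len(west_bot)), key=lambda i: abs(west_bot[i]))
    if abs(west_top[k]) <= limit or abs(west_bot[k]) <= limit:
        return None
    w_est, p = load_proxies(west_bot[k], east_bot[k])
    return {"index": k, "w_est": w_est, "p": p}


# Hypothetical crossing: the peak occurs at the third sample.
event = extract_event(
    west_top=[-1.0, -7.0, -12.0, -4.0],
    west_bot=[2.0, 18.0, 30.0, 12.0],
    east_bot=[1.0, 6.0, 10.0, 5.0],
)
```

Note that no limit is applied to the east bottom strain, matching the description above, so P stays between 0 and 1 only when both bottom strains have the same sign.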

Characteristics of the Collected Traffic Data
Over the 22 months there were 17,418 events (passage of a heavy load as prescribed by the 10 με threshold) captured based on the measured strains at the controlling location (CL) of the west girder. The number of events is equivalent to about 26 per day, which is reasonable since the bridge is in a remote beach community and not a heavily traveled truck corridor. The three strain measurements that were used to calculate the N.A. position at the controlling location, were: West girder top sensor (S_W21), West girder bottom sensor (S_W22), and East girder bottom sensor (S_E22).
The distributions and quantiles for the N.A. location, magnitude of the load, and lateral position are shown in Figure 6, as computed and plotted using JMP Pro (SAS Institute). The distribution of the N.A. location ranges from 39.7 to 52.8 in (Figure 6A). The sum of the strains at the bottom of the two girders, i.e., the indirect estimate of the magnitude of the load, starts at 14.41 με and reaches a high of 97.12 με, with 80% of the values falling between 31 and 49 με (Figure 6B). Finally, from the distribution for the lateral position, it can be seen that the majority of the events that were captured resulted from trucks in the lanes close to the west girder, i.e., positions closer to 1.0 than 0.0 (Figure 6C). This is explained by the criterion set for an event to be captured: both top and bottom sensors in the girder of interest must exceed 10 με. A truck passing in a lane far from the west girder must therefore be very heavy to be captured. Some trucks that meet this condition appear as the outliers at the bottom of the lateral position distribution (Figure 6C).
The data collected from the traffic loads confirmed the findings of the diagnostic load tests. As was seen from the load tests, as the magnitude of the applied load increases the N.A. location moves higher in the cross-section, which can be seen in the traffic data (Figure 7A). In the figure, blue dots indicate trucks with a lateral position greater than 0.5 (closer to the west girder), while the red dots represent trucks with a lateral position smaller than 0.5 (closer to the east girder). Trucks developing a combined bottom strain greater than 70 με were captured even when they were passing on the opposite side of the bridge. Trucks far from the west girder produce strains that move the N.A. higher in the cross-section, just as observed in the load tests. That correlation of the N.A. position with the lateral position of the applied load was seen in all the traffic data collected (Figure 7B).

Training of Data Mining Model to Predict N.A. Position
Knowing that the N.A. changes based on the magnitude and the position of the applied loads, a way to estimate the N.A. location based on these two parameters is proposed. If successful, changes in bridge condition could be indicated by the difference between the measured N.A. location and the N.A. location predicted from the characteristics of the loads, i.e.,
$$r_i = y_{m,i} - y_{p,i}$$
where $y_{m,i}$ and $y_{p,i}$ are the measured and predicted N.A. locations for data sample i, and $r_i$ is the ith residual. A positive residual means that the measured N.A. position is higher than the predicted one, i.e., that the N.A. has moved higher in the cross-section, while a negative residual means that it has moved lower in the cross-section. Movements of the N.A. can be related to different types of problems. For example, the development of a crack at the bottom of a girder would move the N.A. higher. While not an issue for the IRIB, the loss of composite action of a concrete-steel composite cross-section would move the N.A. lower in the cross-section. For that reason, knowing the direction in which the N.A. is moving is important.
To do this, a data mining model was trained using the data collected from the first 3 months (March 2015 to May 2015) to predict the N.A. location based on the magnitude ($W_{est}$) and the lateral position (P) of the load as described earlier (Figure 8). The data mining model was trained using Weka, a program developed at the University of Waikato in New Zealand, and the technique selected was the M5P tree (Wang and Witten, 1996), which combines a conventional decision tree with the possibility of linear regression functions at the nodes. Even though the data were not free of noise, the large amount of data averaged out the effect of noise on the training process, producing an essentially noise-free model. The trained model was then fed with the remaining 19 months of data to validate it. For each point, the value predicted by the model was compared with the actual value as calculated using the top and bottom strain measurements.
Four different models were used to predict N.A. location: 1) a model based on the average of the 3 months of training data, 2) a model based on the strain developed at the bottom of the west girder, 3) a model based on the magnitude of the applied load ($W_{est}$), and 4) a model based on both magnitude ($W_{est}$) and lateral position (P) of the applied load. The distributions of the residuals for the four models are shown in Figure 9.
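The comparison between a constant-mean model and a model that uses both load proxies can be illustrated with a small, self-contained sketch. Two substitutions are made and should be read as assumptions: an ordinary least-squares fit stands in for the M5P tree trained in Weka, and the events are synthetic, generated from hypothetical coefficients rather than from bridge data. The residual is computed as measured minus predicted, as defined earlier.

```python
import random
import statistics


def fit_two_predictor_ols(w, p, y):
    """Least-squares fit y ~ b0 + b1*w + b2*p via the 2x2 normal equations."""
    mw, mp, my = statistics.fmean(w), statistics.fmean(p), statistics.fmean(y)
    sww = sum((a - mw) ** 2 for a in w)
    spp = sum((b - mp) ** 2 for b in p)
    swp = sum((a - mw) * (b - mp) for a, b in zip(w, p))
    swy = sum((a - mw) * (c - my) for a, c in zip(w, y))
    spy = sum((b - mp) * (c - my) for b, c in zip(p, y))
    det = sww * spp - swp ** 2
    b1 = (swy * spp - spy * swp) / det
    b2 = (spy * sww - swy * swp) / det
    return my - b1 * mw - b2 * mp, b1, b2


def residuals(measured, predicted):
    """Residual per event: measured minus predicted N.A. location."""
    return [m - q for m, q in zip(measured, predicted)]


def synthetic_events(n, rng):
    """Synthetic (w_est, p, na) triples; all coefficients are hypothetical."""
    w = [rng.uniform(15.0, 97.0) for _ in range(n)]
    p = [rng.uniform(0.3, 1.0) for _ in range(n)]
    y = [40.0 + 0.1 * a - 4.0 * b + rng.gauss(0.0, 0.9) for a, b in zip(w, p)]
    return w, p, y


rng = random.Random(0)
w_tr, p_tr, y_tr = synthetic_events(2000, rng)   # "training" period
w_va, p_va, y_va = synthetic_events(2000, rng)   # "validation" period

b0, b1, b2 = fit_two_predictor_ols(w_tr, p_tr, y_tr)
mean_train = statistics.fmean(y_tr)

res_const = residuals(y_va, [mean_train] * len(y_va))
res_full = residuals(y_va, [b0 + b1 * a + b2 * b for a, b in zip(w_va, p_va)])
sd_const = statistics.stdev(res_const)
sd_full = statistics.stdev(res_full)
```

On this synthetic data, `sd_full` is smaller than `sd_const`, which mirrors the qualitative trend reported for the four models; the numerical values carry no meaning for the bridge.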
As the data mining model improves, the standard deviation of the residuals decreases, leading to sharper distribution curves. One can see that the model with both magnitude and lateral position has the smallest standard deviation and a slightly sharper distribution of residuals. The improvement of the model is even clearer from the box-and-whisker plots, where quantile values are closer to zero and the outliers have been reduced both in number and magnitude. Of special interest is the improvement in the maximum residual captured by each of the models. In the approach where the N.A. position is considered constant (see Figure 9A), the maximum residual captured was 7 in., compared to the 3.5 in. seen in the proposed method. A smaller standard deviation means that the control limits are tighter, and the system should be more sensitive in capturing changes. There are fewer outliers because, in the other models, most outliers arise from loads whose characteristics differ from those of the average loads, such as very heavy trucks or different lane patterns. The reduced number of outliers means that the system

Effect of Noise on Accuracy of Measurements
The effect of low-level vibrations and system noise on the predicted N.A. location has been investigated. To do that, data were collected between 12:00 and 4:00 am for 50 randomly selected nights throughout the year. Night data were used due to the low traffic volume at night on the bridge. The data were processed to eliminate the thermal effects and smoothed using the same procedure used for the traffic data. Any data following a pattern that indicated a truck passing were removed. This way, the data only reflected the effect of system noise and undefined loads such as wind. The standard deviation of the
By comparing the residuals from the Monte Carlo (MC) simulation with the residuals of the actual traffic data collected on the bridge (Figure 10), one can see a very good correlation, which means that the residuals could be explained for the most part by the expected variability in the data. The good correlation is primarily shown in the quantile tables, where the differences are, most of the time, less than 0.1 in. Even the standard deviations of the residuals are very close: σ = 0.93 for the actual traffic and σ = 0.88 for the MC residuals. Overall, the simulation shows that the residuals could be explained by the uncertainty in the strain measurements, while the small differences are probably due to the imperfect data used to train the model. Using more data for training should reduce these differences. Another reason for these differences could be the variability in the longitudinal position of the load (the variability could have offset the peak point of the strains).
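How strain noise propagates into the computed N.A. location can be illustrated with a generic Monte Carlo sketch. This is not a reconstruction of the simulation used in the study: the noise standard deviation, the gauge spacing, the Gaussian noise model, and the strain pairs below are all hypothetical, and the N.A. is computed from a linear strain profile as described earlier.

```python
import random
import statistics


def zero_strain_height(eps_top, eps_bot, spacing):
    """Zero-strain height above the bottom gauge for a linear strain profile."""
    return spacing * eps_bot / (eps_bot - eps_top)


def na_spread(eps_top, eps_bot, spacing, noise_sd, n, seed=0):
    """Standard deviation of the computed N.A. height when independent
    Gaussian noise is added to both strain readings (assumed noise model)."""
    rng = random.Random(seed)
    heights = [
        zero_strain_height(
            eps_top + rng.gauss(0.0, noise_sd),
            eps_bot + rng.gauss(0.0, noise_sd),
            spacing,
        )
        for _ in range(n)
    ]
    return statistics.stdev(heights)


SPACING = 60.0   # hypothetical gauge spacing (in)
NOISE_SD = 0.5   # hypothetical strain noise (microstrain)

# Same true N.A. height in both cases, but different strain magnitudes.
spread_light = na_spread(-10.0, 30.0, SPACING, NOISE_SD, n=5000)
spread_heavy = na_spread(-20.0, 60.0, SPACING, NOISE_SD, n=5000)
```

In this sketch the lighter event shows a wider scatter in N.A. height than the heavier one for the same strain noise, which is the qualitative behavior the residual comparison relies on.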

Change in Distribution of Truck Weights
The importance of taking into account the parameters affecting N.A. position can be seen in the results of another Monte Carlo simulation. Here, a change in the distribution of truck weights has been investigated. To study this, the distribution of the loads captured from the real data was used, but the mean value was increased by 50%. By generating random loads from the revised distribution and feeding them into the data mining model, the N.A. position was predicted. Comparing Figures 11A,B, one can see how the distribution of the N.A. location would be affected by the change in the characteristics of the loads. The increased load caused a significant change in the mean value of the N.A. of almost 0.6 in and, most importantly, generated values that are higher in the cross-section than had been seen before. If the N.A. were monitored without considering the truck weights, this change could have been misconstrued as a change in behavior of the structure, such as damage, and could have triggered a false alarm. However, when the load characteristics (magnitude and lateral position) are taken into account, the distribution of the residuals (Figures 11C,D) remains around zero, with the distribution getting sharper in Figure 11D because the effect of noise is smaller for the larger loads. Figure 11 demonstrates the advantage of this method: it is independent of the applied loads. The method will trigger an alarm when actual behavioral changes occur, and not when the traffic patterns change, for example due to heavier trucks.
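The argument can be reproduced in miniature with synthetic numbers. In the sketch below a hypothetical linear function stands in for the trained model, loads are drawn from an assumed Gaussian distribution, and the mean load is then raised by 50%. None of the coefficients or distribution parameters come from the study, so the size of the shift is illustrative only.

```python
import random
import statistics


def predicted_na(w_est):
    """Hypothetical stand-in for the trained model: N.A. height (in) from load proxy."""
    return 40.0 + 0.05 * w_est


def simulate(mean_load, n, rng, load_sd=8.0, noise_sd=0.9):
    """Return (mean measured N.A., mean residual) for n synthetic events."""
    loads = [rng.gauss(mean_load, load_sd) for _ in range(n)]
    measured = [predicted_na(w) + rng.gauss(0.0, noise_sd) for w in loads]
    resid = [m - predicted_na(w) for m, w in zip(measured, loads)]
    return statistics.fmean(measured), statistics.fmean(resid)


rng = random.Random(1)
base_na, base_res = simulate(40.0, 5000, rng)     # baseline traffic
heavy_na, heavy_res = simulate(60.0, 5000, rng)   # mean load increased by 50%

na_shift = heavy_na - base_na   # the raw N.A. distribution moves with the load
```

The mean of the raw N.A. values shifts with the heavier traffic, while the mean residual stays close to zero in both cases, because the structure itself (here, the function `predicted_na`) has not changed.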

MONITORING THE DISTRIBUTION OF RESIDUALS ON A MONTHLY BASIS
The use of data mining techniques to find the correlations of N.A. location with magnitude and lateral position of the load reduced the variability of the residuals of the N.A. However, noise is still a problem that can hide small changes. The MC simulation showed that a residual of even 4 in. could be explained by noise, making it difficult to capture small changes from individual values. To minimize the noise effects, changes in the distribution of N.A. residuals on a monthly basis were monitored instead of individual values.

Monthly Distributions
The monthly sample size of about 800 events is large enough to yield consistent distributions and support reliable conclusions. While this means a possible delay of a month in detecting a change in behavior, such a delay is far more desirable than the possible 2-year delay that could occur with regularly scheduled inspections. Figure 12 presents an example of 1) the histogram of the monthly distribution (June 2015) and its fit to the normal distribution, and 2) the cumulative distribution function. The seven colored data points represent the points of interest that will be monitored on a monthly basis for capturing changes. These points show the proportion of the monthly data with residuals smaller than −3, −2, −1, 0, 1, 2, and 3 in (Figure 12B). From these 19 months (Figure 12C) it can be observed that, each month:
• Essentially 100% of the data are between −3 and 3 in. The observations outside this range captured in some months amount to much less than 0.5% and, for that reason, are not shown on the graph.
• The negative values represent from 42 to 56% of the residuals, varying around 50%.
• Residuals less than −2 in are 1-3% of the data, exactly like residuals greater than 2 in (in the cumulative graph, this is represented by values from 97 to 99% for residuals less than 2 in).
• The cumulative frequencies range from 9 to 20% for residuals smaller than −1 in and from 81 to 88% for residuals smaller than 1 in.
As expected, the distributions are shown to be consistent over these 19 months in Figure 12C. This consistency is even clearer in the next section where small changes are simulated in the N.A. position.
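The seven monitored points are simply the empirical cumulative frequencies of a month's residuals at fixed thresholds. A minimal sketch of that calculation follows; the function name and the eight sample residuals are hypothetical and serve only to show the arithmetic.

```python
THRESHOLDS = (-3, -2, -1, 0, 1, 2, 3)   # residual thresholds in inches


def cumulative_fractions(residuals, thresholds=THRESHOLDS):
    """Fraction of residuals strictly smaller than each threshold."""
    n = len(residuals)
    if n == 0:
        raise ValueError("no residuals supplied")
    return {t: sum(r < t for r in residuals) / n for t in thresholds}


# Hypothetical residuals (in) for a very small "month" of eight events.
sample = [-2.5, -1.2, -0.4, -0.1, 0.2, 0.6, 1.4, 2.2]
fractions = cumulative_fractions(sample)
```

Applied to each month in turn, this yields one value per threshold per month, which can then be compared against the ranges observed over the reference months.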

Sensitivity of Distributions
MC simulations were conducted to evaluate the sensitivity of the distributions to changes in N.A. behavior. The N.A. that was expected based on the specific loads was moved by a fixed value (a different value for each scenario) and, based on the new "changed" position, S_W21 was calculated. Three different scenarios were simulated, with results shown in Table 2: 1) N.A. offset upwards by 0.5 in, 2) N.A. offset downwards by 0.5 in, and 3) N.A. offset downwards by 0.25 in.
One can see the range that each of the cumulative frequencies took during the three scenarios and compare it with the actual values measured by the system. It can be seen that for a 0.5 in. offset of N.A., in both directions, the distribution of residuals changes significantly, and it can be captured from one month of data. On the other hand, for a 0.25 in. change, a second month might be needed to clarify the overlap of some of the frequency windows. However, if data of multiple months are on the edges of the range window, this would be an indicator of behavioral change.
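A rough feel for why a 0.5 in offset is visible in a single month can be obtained from an idealized calculation. The sketch below is not the simulation described above (which moved the N.A. and recalculated S_W21). It assumes instead that monthly residuals are Gaussian with the standard deviation of 0.93 in reported for the traffic residuals, shifts their mean by the offset, and compares the expected fraction of negative residuals with the 42-56% range observed over the 19 months. The Gaussian assumption and the function names are illustrative.

```python
from statistics import NormalDist

SIGMA = 0.93                      # residual standard deviation reported for traffic data (in)
BAND_BELOW_ZERO = (0.42, 0.56)    # observed monthly range for the share of negative residuals


def expected_fraction_below(threshold, offset, sigma=SIGMA):
    """Expected share of residuals below `threshold` when the mean residual
    equals `offset`, under the assumed Gaussian model."""
    return NormalDist(mu=offset, sigma=sigma).cdf(threshold)


def outside_band(fraction, band=BAND_BELOW_ZERO):
    """True when a cumulative frequency falls outside the reference range."""
    return not (band[0] <= fraction <= band[1])


no_change = expected_fraction_below(0.0, offset=0.0)
upward_half_inch = expected_fraction_below(0.0, offset=0.5)
downward_half_inch = expected_fraction_below(0.0, offset=-0.5)
```

Under this idealization, both 0.5 in offsets push the expected share of negative residuals outside the reference range. For smaller offsets, month-to-month sampling scatter, which this calculation ignores, can blur the distinction.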

CONCLUSION
Using the location of the N.A. has shown great potential for identifying changes in the condition of a structure. For that reason, multiple researchers have investigated using it as a parameter to be monitored. However, it has been found that the location of the N.A. is affected by various parameters. Both controlled load tests and the response to ambient traffic loads have been used to show that N.A. location depends on the magnitude of the vehicle loads and the lateral position of the applied load (travel lane). This effect was significantly higher than what one would expect from a dead-load-driven structure like the IRIB. Prior studies have shown that monitoring the N.A. distribution can be used to identify composite behavior of a cross section. It is not necessarily suitable for a general damage detection process, however, since changes in the characteristics of the applied load (truck patterns) can change the distribution of N.A. location, which by itself may therefore not be a good indicator of possible damage.
A new methodology that addresses these shortcomings has been presented. The method uses N.A. locations computed based on live load girder strains. From the strains, an indirect estimate of the vehicle weight and transverse position, properties that affect the N.A. position, can be determined.
Using these estimates, a data mining model was trained to predict N.A. locations, which can then be compared to the values computed using measured top and bottom strains. By doing this, it was shown that changes in the residuals between the predictions and the actual values can be separated from changes caused by the characteristics of the traffic loads. Using distributions of the residuals, a method for significantly increasing the sensitivity to N.A. location changes was demonstrated, allowing one to capture changes as small as 0.25 in (0.635 cm), compared to the 7 in (17.78 cm) variability shown when using the N.A. location itself. With the added sensitivity, early stages of damage development can be captured. The amount of data needed for reliable detection depends on the frequency and weight of trucks crossing the bridge. The amount of data should be small enough to inform the owner in a timely manner about changes in behavior, but large enough to be reliable. In the study conducted herein on the IRIB (which has low volumes of truck traffic), one month of data was found to be sufficient.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusion of this article will be made available by the authors, without undue reservation.

AUTHOR CONTRIBUTIONS
CA, HS, and MC contributed to the conception and design of the study; HS was in charge of developing the SHM system and oversaw all of the load tests; CA performed the data analysis; CA, HS, and MC wrote portions of the first draft of the manuscript. All authors contributed to manuscript revision and read and approved the submitted version.

FUNDING
The project was supported by funds from both the Delaware Department of Transportation and the Federal Highway Administration under grants BRDG422145-09001448, BRDG422158-TASK 30A-1717, BRDG422161-TASK 30B-1717, and BRDG422162-TASK 30C-1717.