ECG-based estimation of respiration-induced autonomic modulation of AV nodal conduction during atrial fibrillation

Introduction: Information about autonomic nervous system (ANS) activity may offer insights about atrial fibrillation (AF) progression and support personalized AF treatment but is not easily accessible from the ECG. In this study, we propose a new approach for ECG-based assessment of respiratory modulation in atrioventricular (AV) nodal refractory period and conduction delay. Methods: A 1-dimensional convolutional neural network (1D-CNN) was trained to estimate respiratory modulation of AV nodal conduction properties from 1-minute segments of RR series, respiration signals, and atrial fibrillatory rates (AFR) using synthetic data that replicates clinical ECG-derived data. The synthetic data were generated using a network model of the AV node and 4 million unique model parameter sets. The 1D-CNN was then used to analyze respiratory modulation in clinical deep breathing test data of 28 patients in AF, where an ECG-derived respiration signal was extracted using a novel approach based on periodic component analysis. Results: We demonstrated using synthetic data that the 1D-CNN can estimate the respiratory modulation from RR series alone with a Pearson sample correlation of r = 0.805 and that the addition of either respiration signal (r = 0.830), AFR (r = 0.837), or both (r = 0.855) improves the estimation. Discussion: Initial results from analysis of ECG data suggest that our proposed estimate of respiration-induced autonomic modulation, a resp , is reproducible and sufficiently sensitive to monitor changes and detect individual differences. However, further studies are needed to verify the reproducibility, sensitivity, and clinical significance of a resp .


Introduction
Atrial fibrillation (AF) is the most common supraventricular tachyarrhythmia (Hindricks, G., Potpara, T., Dagres, N., Arbelo, E., Bax, J. J., Blomström-Lundqvist, C., et al. (2020)).Characteristic for AF is an uncoordinated atrial electrical activation that results in an increased and irregular ventricular activity.Atrial fibrillation poses a significant burden to This study aims to develop and evaluate a method to extract respiratory modulation in the AV node conduction properties from ECG data in AF.We present a novel approach to extract respiration signals from several ECG leads based on the periodic component analysis (Sameni, R., Jutten, C., and Shamsollahi, M. B. (2008)).In addition, we present a novel extension to our previously proposed AV node network model accounting for respiratory modulation of AV nodal refractory period and conduction delay.Furthermore, we predict the magnitude of respiratory modulation using a 1-dimensional convolutional neural network that was trained on synthetic 1-minute segments of RR series, respiration signals, and average atrial fibrillatory rate which replicate clinical data.The trained network was used to analyze data from 28 AF patients performing a deep breathing task.

Materials and methods
First, the clinical deep breathing test data from patients in atrial fibrillation is described in Sec.2.1.In Sec.2.2, the extraction of RR series and atrial fibrillatory rate (AFR) from ECG are described.Moreover, Sec.2.2 covers the extraction of ECG-derived respiration (EDR) signals using a novel approach based on periodic component analysis.A description of the extended AV network model accounting for respiratory modulation is given in Sec.2.3, as well as a description of how the simulated datasets are generated.In Sec.2.4, the architecture of a 1-dimensional convolutional neural network (1D-CNN) that is used to predict the magnitude of respiratory modulation from ECG recordings is described together with the training and testing of the neural network.Finally, the CNN is used to predict the respiratory modulation in clinical data and the predictions are analyzed.

ECG data
The dataset of the clinical deep breathing test consisted of 12-lead ECG recordings with a sampling rate of 500 Hz from individuals with AF participating in the SCAPIS study (Bergström, G., Berglund, G., Blomberg, A., Brandberg, J., Engström, G., Engvall, J., et al. (2015)).The participants in the SCAPIS study were from the Swedish general population aged 50-64 years.A subset of the SCAPIS cohort (5136 participants) performed a deep breathing test (Engström, G., Hamrefors, V., Fedorowski, A., Persson, A., Johansson, M. E., Ostenfeld, E., et al. (2022)).Of this subset, 28 participants with complete data were in AF at the time of recording (Abdollahpur, M., Engström, G., Platonov, P. G., and Sandberg, F. (2022)).The clinical characteristics of that subset are listed in Table 1.The deep breathing test started with the participants resting in a supine position while breathing normally for five minutes.Then for the deep breathing, the participants were guided by a nurse to inhale for five seconds and exhale for five seconds, for a total of six breathing cycles corresponding to 1 minute.
2.2 ECG data processing 2.2.1 Extraction of RR series ECG preprocessing and QRS complex detection were performed using the CardioLund ECG parser (www.cardiolund.com).The CardioLund ECG parser classified QRS complexes based on their QRS morphology.Only QRS complexes with dominant QRS morphology were considered in the computation of the RR series.
The RR series were computed from intervals between R-peaks taken from consecutive QRS complexes with dominant QRS morphology, and the time of each RR interval was set to the time of the first R-peak in each interval.The resulting non-uniformly sampled RR series were interpolated to a uniform sampling rate of 4 Hz using piecewise cubic Hermite polynomials as implemented in MATLAB ('pchip', version R2023a, RRID:SCR_001622).

Estimation of atrial fibrillatory rate
The AFR was used to obtain information about the atrial arrival process.Briefly, the estimation of the AFR involved the extraction of an f-wave signal by means of spatiotemporal QRST-cancellation (Stridh, M., and Sörnmo, L. (2001)) and estimation of an f-wave frequency trend by fitting two complex exponential functions to the extracted f-wave signal from ECG lead V1 as proposed in (Henriksson, M., Petrėnas, A., Marozas, V., Sandberg, F., and Sörnmo, L. (2018)).
The two exponential functions were characterized by a fundamental frequency f and its second harmonic, respectively; f was fitted within the range f W elch max ± 1.5 Hz, where f W elch max denotes the maximum of the Welch periodogram of ECG lead V1 in the range 4 − 12 Hz.The results for the deep breathing data have been previously presented in (Abdollahpur, M., Engström, G., Platonov, P. G., and Sandberg, F. (2022)).The estimated AFR signal has a sampling rate of 50 Hz.

Extraction of lead-specific EDR signals
All steps of the extraction algorithm that are described in the following were applied to 1-minute segments of the lead-specific EDR signals taken from a 1-minute running window.The lead-specific EDR signals were computed with the slope range method (Kontaxis, S., Lázaro, J., Corino, V. D., Sandberg, F., Bailón, R., Laguna, P., et al. (2020)) for the eight ECG leads V1-V6, I, and II.Only eight out of 12 ECG leads were used, because the information in the leads III, aVF, aVL, and aVR can also be derived from lead I and II.The slope range method uses the peak-to-peak difference in the first derivative of the QRS complex to quantify the variations in the QRS morphology that are assumed to reflect the respiratory activity and are caused for example by periodic changes in electrode positions relative to the heart.
Only QRS complexes with dominant QRS morphology (cf.Sec.2.2.1) were considered when applying the slope-range method.Further, a QRS complex was excluded as an outlier from analysis if the slope range value of any of the leads was outside the mean ± 3 std of the slope range values of that lead.The lead-specific non-uniformly sampled EDR signals were interpolated to a uniform sampling rate of 4 Hz using the modified Akima algorithm as implemented in MATLAB ('makima', version R2023a, RRID:SCR_001622).A matrix containing the resampled lead-specific EDR signals T of dimension 8 × N was constructed, where N = 240 corresponds to the length of the 1-minute segment.To remove baseline-wander in X ′ , a Butterworth highpass filter of order 4 with a cut-off frequency of 0.08 Hz was applied separately for each lead x ′ .The filtered X ′ was normalized to zero-mean and signals shorter than 1 minute were zero-padded to create X containing 1-minute segments.A set S seg was created containing all X i , where i = 1, ..., I denotes all I possible choices of 1-minute segments of the lead-specific EDR signals from one recording.

Extraction of joint-lead EDR signals
The joint-lead EDR signal was extracted from X using a modified version of the periodic component analysis (πCA) (Sameni, R., Jutten, C., and Shamsollahi, M. B. (2008)), summarized in Alg. 1.The matrix X was whitened for its elements to be uncorrelated and to have unit variance.The whitened lead-specific EDR signals Z were computed as (1) where D is the diagonal matrix of eigenvalues of the covariance matrix C X = E{XX T }, and the columns of the matrix E are the unit-norm eigenvectors of C X .
The outputs of the πCA are a joint-lead EDR signal s of dimension 1×N and its corresponding lag τ .The assumption of the πCA is that s = w T Z is a linear mixture of the whitened lead-specific EDR signals.The aim is to find a solution for s with a maximal periodic structure.The periodic structure of s is characterized by ϵ(w, τ, Z), which quantifies non-periodicity (Sameni, R., Jutten, C., and Shamsollahi, M. B. (2008)) and is defined as Algorithm 1 Extraction of joint-lead EDR signals for all X i in S seg do X i is whitened according to Eq. 1 to obtain Z i for all τ j ∈ [10, 40] do obtain w j by solving the generalized eigenvalue problem of matrix pair (C z (τ j ), C z (0)) compute ϵ(w j , τ j , Z i ) according to Eq. 2 end for end for compute τ * = min τj Sseg ϵ(w j , τ j , Z i ) for all Z i in S seg do where s(n) is the n:th element of s.We solved the generalized eigenvalue problem (GEP) of the lag-dependent matrix pair (C z (τ ), C z (0)) to obtain a full matrix V whose columns correspond to the right eigenvectors and a diagonal matrix U of generalized eigenvalues so that C z (τ )V = C z (0)VU (Sameni, R., Jutten, C., and Shamsollahi, M. B. (2008)).Here, for some lag τ is a modified lagged covariance matrix, which is always symmetric, unlike the time lagged covariance matrix C z (τ ) = E n {z(n)z(n − τ ) T }, where z(n) is the n:th column of Z and E n {•} indicates averaging over n.The weight vector w = [w 1 , ..., w 8 ] T that minimizes ϵ(w, τ, Z) is obtained as the first column of V (Sameni, R., Jutten, C., and Shamsollahi, M. B. (2008)).In the present study, ϵ(w, τ, Z) is also used to quantify signal quality, where a lower value of ϵ(w, τ, Z) corresponds to a more periodic signal assumed to have a higher SNR.
As τ is unknown, ϵ(w, τ, Z) was minimized for all integer values of τ between 10 and 40, corresponding to respiration rates between 0.1 and 0.4 Hz.To improve the robustness of the πCA for signals with low quality, a τ * was determined in an intermediate step that corresponds to a global minimum of ϵ(w, τ, Z) over all 1-minute segments in S seg .It was assumed that there were no significant transient changes in respiration frequency in the clinical data and we determined two different τ * for each subject; one for normal breathing and one for deep breathing.Then, for each 1-minute segment separately, a τ resp was estimated as the local minimum of ϵ(w, τ, Z) closest to τ * .The respiration frequency estimate fresp = f s /τ resp results from the estimate τresp and the sampling rate f s = 4 Hz and is in the range fresp ∈ [0.1, 0.4]Hz corresponding to the limits set by τ .Finally, the weight vector w resp for the respiration signal extraction was obtained by solving the GEP of the matrix pair (C z (τ resp ), C z (0)).The extracted s = w T resp Z was normalized to unit variance.An ambiguity of πCA is that the sign of s is undetermined.The sign of the joint-lead EDR signal was selected as s * = s • sign ( w resp ), where w resp denotes the sum of the elements in the vector w resp .This was done under the assumption that all lead-specific EDR signals are in phase.

Estimates from clinical data
The joint-lead EDR signal extraction from Sec. 2.2.4 was applied to all 1-minute segments X in S seg for each patient and recording.Segments X were excluded from further analysis if they do not satisfy the following three criteria, for which a valid QRS complex has a dominant QRS morphology and is not classified as outlier based on its slope range values: i) the maximum distance between valid QRS complexes is 2 seconds; ii) the minimum number of valid QRS complexes in a 1-minute segment is 48; iii) the minimum number of valid QRS complexes in a 1-minute segment is 80% of the normal-to-normal average heart rate of the 1-minute segment.After exclusion, several sets of non-overlapping 1-minute segments could be created from the remaining X.Out of these, the set S * seg that resulted in the smallest sum of ϵ(w resp , τ resp , Z) was chosen, and used to produce joint-lead EDR signals X Clin Resp of dimension 1×N as described in Sec.2.2.4.In addition, the corresponding 1-minute RR series X Clin RR of dimension 1×N was extracted from the RR series obtained in Section 2.2.1.We estimated the mean arrival rate of atrial impulses to the AV node μ as 1000/AF R, where AF R is the average AFR-trend within each of the selected 1-minute windows as described in Sec.2.2.2.To match the dimensions of X Clin RR and X Clin Resp , μ was then repeated N times to produce X Clin AF R of dimension 1×N .From the clinical data, a maximum of five non-overlapping 1-minute long segments in normal breathing and one segment in deep breathing was obtained for X Clin RR , X Clin Resp and X Clin AF R .

Network model of the human atrioventricular node
The atrioventricular node is modeled by a network of 21 nodes (cf.Fig. 1).The presented AV node model was initially proposed in (Wallman, M., and Sandberg, F. (2018)), updated with minor modifications in (Karlsson, M., Sandberg, F., Ulimoen, S. R., and Wallman, M. ( 2021)), and extended using constant scaling factors A R and A D for the refractory period and conduction delay to account for the effect of changes in the autonomic tone in (Plappert, F., Wallman, M., Abdollahpur, M., Platonov, P. G., Östenson, S., and Sandberg, F. ( 2022)).The slow pathway (SP) and fast pathway (FP) are described by two chains of 10 nodes each, which are only connected at their last nodes.Impulses enter the AV node model simultaneously at the first node of each pathway.Within the pathways and between their last nodes, the impulses are conducted bidirectionally to allow for retrograde conduction.The last nodes of the two pathways are connected to an additional coupling node (CN), through which the impulses leave the model.
Each node represents a section of the AV node and is characterized by an individual refractory period R P (∆t k , A P (t), θ P R ) and conduction delay D P (∆t k , A P (t), θ P D ) defined as (3) where P ∈ {SP, F P, CN } denotes the pathway.The refractory period and conduction delay are defined by fixed model parameters for the refractory period θ P R and conduction delay θ P D as well as model states for the diastolic interval ∆t k and respiratory modulation A P (t).Each pathway has a separate set of fixed model parameters for the refractory period θ P R = [R P min , ∆R P , τ P R ] and conduction delay θ P D = [D P min , ∆D P , τ P D ], where R P min is the minimum refractory period, ∆R P is the maximum prolongation of the refractory period, τ P R is a time constant, D P min is the minimum conduction delay, ∆D P is the maximum prolongation of the conduction delay and τ P D is a time constant.For clarity, the notation of R P (•, A P (t), •) and R P (•, A P (t), •) are specified with dots when the replaced parameters or model states are currently not discussed.
The scaling factor A P (t) accounts for the effect of changes in autonomic tone on the refractory period R P •, A P (t), • and the conduction delay D P •, A P (t), • .The time-varying scaling factor A P (t) is common between the SP and FP, defined as with a constant respiratory frequency f resp and peak-to-peak amplitude a resp .The scaling factor of the refractory period and conduction delay of the CN is described by A CN = 1 and not modulated by respiration.
The electrical excitation propagation through the AV node is modeled as a series of impulses that can either be conducted or blocked by a node.An impulse is conducted to all adjacent nodes, if the interval ∆t k between the k:th impulse arrival time t k and the end of the (k-1):th refractory period, computed as is positive.Then, the time delay between the arrival of an impulse at a node and its transmission to all adjacent nodes is given by the conduction delay of the current node are updated according to Eqs. (3)(4)( 6).Details about how the impulses are processed chronologically and node by node, using a priority queue of nodes and sorted by impulse arrival time can be found in (Wallman, M., and Sandberg, F. ( 2018)).
The input to the AV node mode is a series of atrial impulses during AF, with inter-arrival times modeled according to a Pearson Type IV distribution (Climent, A. M., Atienza, F., Millet, J., and Guillem, M. S. ( 2011)).The AA series is generated with a point process with independent inter-arrival times and is completely characterized by the four parameters of the Pearson Type IV distribution, namely the mean µ, standard deviation σ, skewness γ and kurtosis κ.
The output of the AV node model is a series of ventricular impulses, where t V q denotes the time of the q:th ventricular impulse.As the refractory period R P (∆t k , •, •) and conduction delay D P (∆t k , •, •) are history-dependent, the first 1 000 ventricular impulses leaving the AV node model are excluded from analysis to avoid transient effects.

Simulation of AV nodal conduction
For the training and validation, a dataset with 2 million unique parameter sets was generated.This dataset was divided into 20 datasets with 100 000 parameter sets each, where a dataset was either used for training or validation of one of ten realizations of the convolutional neural network (CNN) that is described in Section 2.4.1.Simulations were performed with each parameter set using the AV node model described in Section 2.3.1.For each simulation, a series of 11 000 AA intervals was generated using the Pearson Type IV distribution, defined by the four parameters µ, σ, γ, and κ.The parameter µ was randomly drawn from U[100, 250] ms and σ was randomly drawn from U[15, 30] ms.The parameters γ and κ were kept fixed at 1 and 6, respectively, since they cannot be estimated from the f-waves of the ECG (Plappert, F., Wallman, M., Abdollahpur, M., Platonov, P. G., Östenson, S., and Sandberg, F. ( 2022)).Negative AA intervals were excluded from the impulse series.The model parameters for the refractory period θ P R and conduction delay θ P D of the SP and FP were drawn from bounded uniform distributions and the model parameters of the CN were kept fixed according to Table 2.The given ranges were in line with our previous work (Plappert, F., Wallman, M., Abdollahpur, M., Platonov, P. G., Östenson, S., and Sandberg, F. ( 2022)).The model parameters for the respiratory modulation and simulated respiration signal that are used in Sec.2.3.3 were also drawn from bounded uniform distributions, with a resp randomly drawn from U[−0.1, 0.5], f resp randomly drawn from U[0.1, 0.4] Hz and η randomly drawn from U[0.2, 4].For testing, another dataset with 2 million unique parameter sets was generated using the same ranges listed above, except for a resp , which was randomly drawn from U[0, 0.4].
When sampling, initially a value for a resp was drawn from a uniform distribution.To exclude non-physiological parameter sets from the dataset, we resampled the rest of the parameters until the following five selection criteria were met: 1) the slow pathway in every parameter set must have a higher conduction delay D SP (∆t k , •, •) > D F P (∆t k , •, •) and lower refractory period R SP (∆t k , •, •) < R F P (∆t k , •, •) than the fast pathway for all ∆t k ; 2) the resulting average RR interval has to fall within the range of 300 ms to 1000 ms, which corresponds to heart rates between 60 bpm and 200 bpm; 3) the resulting root mean square of successive RR interval differences (RR rmssd) has to be above 100 ms; 4) the resulting sample entropy of the RR series has to be above 1; 5) the relative contribution of the respiration frequency in the frequency spectrum of the RR series with zero-mean F RR (f resp )/ f F RR (f ) has to be below 7% to exclude RR series with visible periodicity.Note that the frequency spectrum is computed from the RR series with 240 samples and the sampling rate of 4 Hz.
Similar to the clinical data described in Section 2.2.1, RR series were computed from intervals between the simulated ventricular impulses, and the time of each RR interval sample was set to the time of the first ventricular impulse.The resulting non-uniformly sampled RR series were interpolated to a uniform sampling rate of 4 Hz using piecewise cubic hermite interpolating polynomials as implemented in MATLAB ('pchip', version R2023a, RRID:SCR_001622).The simulated RR series were cut into 1-minute segments of length N = 240, resulting in RR series X Sim RR of dimension 1×N .For each RR series, µ was repeated N times to form a vector X Sim AF R of dimension 1×N , corresponding to the mean atrial arrival rate.

Modelling respiratory signals
For the modeling of the respiratory signals resembling joint-lead EDR signals (cf.Sec.2.2.4), we start with the underlying assumption that respiration can be described according to m(t) = sin(2πtf resp ), i.e., by a sine wave oscillating at the respiratory frequency f resp .Eight identical lead-specific EDR signals m ′ p (t) with p = 1, ..., 8 were created, composed of non-uniform samples of m(t) at the times of the ventricular impulses t V q generated by the AV node model.To emulate lead-specific EDR signals, Gaussian noise with standard deviation η was added to all samples of m ′ p (t), making them non-identical.Next, m ′ p (t) were processed in five steps to mimic the processing steps for the clinical data (cf.Sec.2.2.3 and 2.2.4): 1) using the same criteria as for the outlier exclusion in the clinical data, all samples in m ′ p (t) for the same ventricular impulse were excluded as outliers, if the value in one of the eight leads was outside the mean±3 std, computed for each lead within a 1-minute running window; 2) as for the clinical lead-specific EDR signals, the simulated signals m ′ p (t) were interpolated to a uniform sampling rate of 4 Hz using the modified Akima algorithm as implemented in MATLAB ('makima', version R2023a, RRID:SCR_001622), resulting in m ′ p (n); 3) m ′ p (n) were cut into 1-minute segments of length N = 240 and had the dimension 8×N ; 4) the resampled and cut signals are filtered with a Butterworth highpass filter of order 4 with the cut-off frequency 0.08 Hz to remove baseline-wander; 5) a joint-lead EDR signal X Sim Resp with dimension 1×N was extracted from m ′ p (n) using the periodic component analysis described in Section 2.2.4.

Architecture of 1-dimensional convolutional neural network
To predict the peak-to-peak amplitude of the respiratory modulation, a resp , a 1D-CNN architecture was used as illustrated in Figure 2. The CNN architecture consists of five convolution layers, where each layer l was composed of 100 1D-CNN filters with kernel size k C = 3, stride s C = 1 and dilation factor d C = 2 l−1 , followed by a rectified linear unit (RELU) and a batch normalization layer.After the five convolution layers, the data passed through a global average pooling layer and dense layer, the output of which is a prediction âresp .To assess the performance of the CNN with or without the RR series, respiration signal, and mean µ of the AA series, seven versions of the CNN were trained.
The respective CNNs and their input data are given as follows: the CNN C RR was trained on the input data with the format

Training the convolutional neural network
For each CNN version, i.e., C RR , C Resp , C AF R , C RR,Resp , C RR,AF R , C Resp,AF R and C RR,Resp,AF R , described in Sec.2.4.1, ten realizations were trained with unique training and validation datasets, X Sim,T rain and X Sim,V al , respectively, containing 100 000 parameter sets each, as described in Sec.2.3.2.The CNNs were trained to estimate the a resp and the weights of the CNN were updated during backpropagation based on the root-mean-square error (RMSE)

Prediction of respiratory modulation in simulated data
The performance of the CNN on simulated data was assessed for and C RR,Resp,AF R , using the testing dataset X Sim,T est described in Sec.2.3.2.The total performance on X Sim,T est was assessed using the RMSE, Pearson correlation, and coefficient of determination R 2 between the true a resp and predicted âresp .
In addition, the performance was assessed over a range of respiration frequencies f resp and characteristics of nonperiodicity in the respiration signal ϵ(w, τ, Z), here denoted ϵ.To produce local RMSE estimates σ(f ′ resp , ϵ ′ ) for specific values f ′ resp and ϵ ′ , the following three steps were applied: 1) a squared difference (a resp − âresp ) 2 was computed for each of the 2 million parameter sets in X Sim,T est ; 2) a weighted average of the 2 million squared differences was computed using a Gaussian kernel centered at f ′ resp and ϵ ′ with the standard deviation of 0.015 Hz and 0.075 for the f resp and ϵ, respectively; 3) the square root of the weighted average resulted in σ(f ′ resp , ϵ ′ ).In the present study, all versions of the CNN were trained and tested using 1-minute segments, with one exception: An additional CNN C 2.5min RR,Resp,AF R was trained and tested using X * = [X Sim,2.5 RR ; X Sim,2.5 Resp ; X Sim,2.5 AF R ] containing 2.5minute-long segments to investigate the impact of segment length on the RMSE.For C 2.5min RR,Resp,AF R , ten realizations were trained with additional unique training and validation datasets, X * Sim,T rain and X * Sim,V al , respectively, containing 100 000 parameter sets each.Apart from the different segment lengths, the additional datasets were generated as described in Sec.2.3.2.

Prediction of respiratory modulation in clinical data
The CNN C RR,Resp,AF R was used for predicting a resp in the clinical deep breathing test data, described in 2.1.The clinical predictions were used to investigate differences in âresp between deep breathing and normal breathing using Monte Carlo sampling.Using these samples, the probabilities of the following three scenarios were computed for each patient: 1) the highest âresp was achieved for deep breathing, 2) the lowest âresp was achieved for deep breathing and 3) the highest and lowest âresp did not correspond to deep breathing.To draw the samples for each 1-minute segment in X Clin,T est , the prediction âresp was determined using the CNN C RR,Resp,AF R , while the f ′ resp and ϵ ′ were estimated by the fresp and ϵ(w, τ, Z) described in Sec.2.2.4.Next, values of âresp were resampled 100 000 times for each 1-minute segment in S * seg .The samples were drawn from Gaussian distributions with âresp as mean and σ(f ′ resp , ϵ ′ ) described in Sec.2.4.3 as standard deviation.

Analysis of clinical data
The length of the interpolated RR series varied between patients depending on the duration of the recordings; during normal breathing, the length of the RR series was in the range between 288 s and 328 s; during deep breathing, the length of the RR series was in the range between 57 s and 72 s.Statistics quantifying the clinical dataset are shown in Table 3.In accordance with the exclusion criteria defined in Sec.2.2.5, 98 out of 120 non-overlapping 1-minute segments remained in the normal breathing data and 22 out of 28 1-minute segments remained in the deep breathing data.Typical examples of a clinical RR series X Clin RR and joint-lead respiration signal X Clin Resp during normal breathing and deep breathing, respectively, are shown in Fig. 3.The characteristics of these signals, listed in Table 4 are within 1 standard deviation of the population mean (cf.Table 3).Fluctuations in the clinical RR series matching the respiration frequencies were not clearly visible and F RR (f resp )/ f F RR (f ) was always below 7%.The respiration signals estimated from clinical data had ϵ(w, τ, Z) ranging between 0.198 and 1.485.The clinical value pairs of ϵ(w, τ, Z) and respiration frequency fresp are shown in Fig. 4.There was a statistically significant weak negative correlation between

Simulated RR series and respiration signals
The statistics quantifying X Sim,T rain , X Sim,V al and X Sim,T est are shown in Table 3 together with X Clin,T est .The simulated datasets were created according to the description in Sec.2.3 and compared to the clinical data using the unpaired t-test.It should be noted that although there are significant differences between the characteristics of the clinical and simulated data, the distributions of the simulated data cover the distribution of the clinical data.The heart rate was on average slightly faster and more regular in X Sim than in X Clin , as indicated by the differences in RR mean, RR rmssd, and RR sample entropy.Further, the RR series in X Sim showed on average more visible fluctuations matching the respiration frequency compared to the RR series in X Clin , as indicated by the difference in F RR (f resp )/ f F RR (f ).The AF R was on average slightly lower in X Sim than in X Clin , whereas f resp was slightly higher.In normal breathing, ϵ in X Clin was comparable to X Sim ; however, in deep breathing, ϵ was lower in X Clin than in X Sim .
Examples of a simulated RR series X Sim RR and joint-lead respiration signal X Sim Resp resembling clinical signals during normal breathing and deep breathing, respectively, are shown in Fig. 3.The signals were chosen based on similarities to the clinical signals in the RR series characteristics and respiration signal morphology.The characteristics of these signals are listed in Table 4. Note, that while the peak-to-peak amplitude of respiratory modulation a resp is high during normal breathing and low during deep breathing in this example, a general conclusion about the a resp values of the clinical signals can not be drawn from this comparison and is not intended.When emulating lead-specific EDR signals and adding Gaussian noise with standard deviation η, the simulated data showed a strong correlation between η and ϵ (ρ = 0.89, p < 10 −5 ).The examples in Fig. 3 are representative of this correlation with the η and ϵ listed in Table 4, where X Sim Resp in Fig. 3D was generated with a higher η and showed a higher ϵ compared to X Sim Resp in Fig. 3H.

Accuracy of convolutional neural network
All CNNs C RR , C Resp , C AF R , C RR,Resp , C RR,AF R , C Resp,AF R and C RR,Resp,AF R , described in Sec.2.4.1 and trained according to Sec. 2.4.2, were tested using X Sim,T est described in Sec.2.3.2.The resulting distribution of predicted âresp over true a resp for C RR,Resp,AF R is shown in Fig. 5.The corresponding distribution for a prediction using linear regression L RR,Resp,AF R based on the same data X = [X Sim RR ; X Sim Resp ; X Sim AF R ] is also displayed in Fig. 5 for comparison.The RMSE, Pearson sample correlation and R 2 are listed for the seven CNN versions and L RR,Resp,AF R resp and ϵ ′ were computed according to Sec. 2.4.3 and illustrated in Fig. 6.It can be seen in all four contour plots that the RMSE is dependent on f ′ resp and ϵ ′ .The CNNs produce more accurate predictions for data with a high f ′ resp and low ϵ ′ , however, the RMSE is more sensitive to changes in f ′ resp .Adding X Sim AF R to the input improves the RMSE for all values of f resp and ϵ.While the addition of X Sim Resp to the input improves the RMSE for most f ′ resp and ϵ ′ , it worsens the RMSE for high ϵ ′ and low f ′ resp as indicated in Fig. 6.Within the indicated region, the accuracy of âresp is higher without X Sim Resp in the input data.The accuracy of the CNN improves with longer input data, indicated by the fact that the RMSE of C 2.5min RR,Resp,AF R was 0.050.The RMSE, Pearson sample correlation and R 2 is listed for C 2.5min RR,Resp,AF R in Table 5.The RMSE improved for all values of ϵ ′ and f ′ resp , whereas the local RMSE improved especially at lower f ′ resp (data not shown).

Prediction of respiratory modulation in clinical data
The CNN C RR,Resp,AF R was used to obtain âresp from the clinical data in X = [X Clin RR ; X Clin Resp ; X Clin AF R ].The resulting âresp for 1-minute segments during normal breathing and deep breathing are shown in Figure 7.There was high interpatient variability in âresp in the study population and no clear relation was found between âresp during normal breathing and deep breathing.No significant correlation was found between a change in respiration frequency f DB resp − f N B resp and a change in respiratory modulation âDB resp − âNB resp .The vertical lines around âresp in Fig. 7A correspond to ±σ(f resp , ϵ), described in Sec.2.4.3 and is used for the Monte Carlo sampling described in Sec.2.4.4.For 20 subjects, âresp was available for at least one segment during normal breathing and one segment during deep breathing (cf.exclusion criteria in Sec.2.2.5).For those 20 subjects, Monte Carlo sampling was used to investigate whether âresp is larger during deep breathing than during normal breathing as described in Sec.2.4.4.As illustrated in Figure 7B: it was most likely for 5 patients that the highest a resp was achieved for deep breathing; it was most likely for 5 patients that the lowest a resp was achieved for deep breathing; and it was most likely for 10 patients that neither the highest nor lowest a resp corresponded to deep breathing.

Discussion
The aim of this study was to develop and evaluate a method to extract respiratory modulation in the AV node conduction properties from ECG data during AF.To achieve this we extended our AV node model (Plappert, F., Wallman, M., Abdollahpur, M., Platonov, P. G., Östenson, S., and Sandberg, F. (2022)) to account for respiratory modulation by multiplying a time-varying scaling factor to the AV nodal refractory period and conduction delay.We trained a 1D-CNN on simulated 1-minute segments of RR series, respiration signals, and mean arrival rate of atrial impulses which replicate clinical data to predict a resp which quantifies the peak-to-peak amplitude of respiratory modulation.We evaluated the network on simulated data and the results indicated that a resp can be estimated with an RMSE of 0.066, corresponding to a sixth of the expected range for a resp between 0 and 0.4.
The clinical predictions of âresp in Fig. 7 obtained using the trained CNN show a large interpatient variability in âresp during deep breathing as well as during normal breathing.If the magnitude of respiratory modulation in AF would be respiration frequency dependent as it is in NSR (Angelone, A., and Coulter, N. A. (1964), Bernardi, L., Porta, C., Gabutti, A., Spicuzza, L., and Sleight, P. (2001), Russo, M. A., Santarelli, D. M., and O'Rourke, D. ( 2017)), then we would expect to see a clear increase in âresp during deep breathing.Results from the Monte Carlo sampling showed that âresp increased in response to deep breathing in 5 patients, decreased in 5 patients, and remained the same in 10 patients.This interpatient variability cannot be explained by differences in respiration rates, since there was no correlation between a change in respiration frequency and a change in respiratory modulation.
In our previous model formulation, we accounted for the ANS-induced changes by introducing constant scaling factors for the refractory period and conduction delay (Plappert, F., Wallman, M., Abdollahpur, M., Platonov, P. G., Östenson, S., and Sandberg, F. (2022)).With the scaling of AV nodal conduction properties, it was shown that the incorporation of ANS-induced changes in the model allowed better replication of several statistical properties of clinical RR series obtained from tilt tests.In the present study, this approach was further developed by using a time-varying scaling factor A P (t) to account for respiratory modulation in AV nodal conduction properties based on the assumption that some degree of respiratory modulation generally influences RR series characteristics during AF.We model respiratory modulation as a joint increase in AV nodal refractoriness and conduction delay in response to exhalation and a joint decrease in AV nodal refractoriness and conduction delay in response to inhalation.It is known that respiration modulates the parasympathetic activity, where inspiration decreases vagal activity and expiration increases vagal activity (Katona, P. G., Poitras, J. W., Barnett, G. O., and Terry, B. S. (1970), Russo, M. A., Santarelli, D. M., and O'Rourke, D. (2017)).Many electrophysiological (EP) studies have demonstrated that an increase in parasympathetic activity causes an increase in AV nodal conduction delay; studies in dogs reported an increased conduction delay with vagal  1990), Goldberger, J. J., Kadish, A. H., Johnson, D., and Qi, X. (1999)) and acetylcholine administration (Priola, D. V., Curtis, M. B., Anagnostelis, C., and Martinez, E. (1983)).Furthermore, an increase in parasympathetic activity with vagal stimulation in dogs has been demonstrated to increase the AV nodal refractory period (Spear, J. F., and Moore, E. N. (1973), Nayebpour, M., Talajic, M., Villemaire, C., and Nattel, S. (1990), Goldberger, J. J., Kadish, A. H., Johnson, D., and Qi, X. (1999)).For a decrease in parasympathetic activity with atropine, EP studies demonstrate that the AV nodal conduction delay decreases in dogs (Irisawa, H., Caldwell, W. M., and Wilson, M. F. (1971)) and humans (Lister, J. W., Stein, E., Kosowsky, B. D., Lau, S. H., and Damato, A. N. (1965), Akhtar, M., Damato, A. N., Caracta, A. R., Batsford, W. P., Josephson, M. E., and Lau, S. H. ( 1974)), and the AV nodal refractory period also decreases in humans (Akhtar, M., Damato, A. N., Caracta, A. R., Batsford, W. P., Josephson, M. E., and Lau, S. H. (1974)).
The assumption that some degree of respiratory modulation generally influences the RR series characteristics during AF is also indicated by the fact that some AF patients display clear fluctuations in their RR series matching their respiration frequency (Rawles, J. M., Pai, G. R., andReid, S. R. (1989), Chandler, S. T., andTrewby, P. N. (1994), Nagayoshi, H., Janota, T., Hnatkova, K., Camm, A. J., and Malik, M. (1997)).Such fluctuations could also be seen in simulated RR series for some AV node model parameter sets.During model development, we noticed that an increase in a resp leads to an increase in the relative contribution of the respiration frequency in the frequency spectrum of the RR series with zero-mean F RR (f resp )/ f F RR (f ) and an increase in the sample entropy of the RR series.We also noticed that an increase in f resp leads to a decrease in F RR (f resp )/ f F RR (f ) and an increase in the sample entropy of the RR series.When averaging over several realizations of RR series (data not shown), F RR (f resp )/ f F RR (f ) could be clearly seen for most of the parameter sets but is usually masked in individual RR series segments by the irregularity of the RR series.Using cross-spectral analysis, no simple linear relationship has been found between respiration signal and RR series in AF patients, but a linear relationship was shown in NSR (Pitzalis, M. V., Massari, F., Forleo, C., Fioretti, A., Colombo, R., Balducci, C., et al. (1999)).A possible reason for this is that the relationship between the RR series and respiratory modulation in AV nodal conduction properties during AF is complex and non-linear, emphasizing the need for a model-based approach.Besides some indications of fluctuations in the RR series, for most of the patients reported in (Rawles, J. M., Pai, G. R., andReid, S. R. (1989), Chandler, S. T., andTrewby, P. N. (1994), Nagayoshi, H., Janota, T., Hnatkova, K., Camm, A. J., and Malik, M. (1997), Pitzalis, M. V., Massari, F., Forleo, C., Fioretti, A., Colombo, R., Balducci, C., et al. (1999), Pacchia, C. F., Kline, G. P., Hamdan, M. H., Clark, K. G., Clark, M. G., and Smith, M. L. ( 2011)) and also for the clinical data used in this study, no fluctuations in the RR series matching their respiration frequency were found.To match F RR (f resp )/ f F RR (f ) in the clinical data which was always below 7%, parameter sets with a higher relative peak spectral energy were excluded from the simulated data (criterion 5 in Sec.2.3.2).The RR series characteristics of the simulated data differed significantly from both the normal breathing and deep breathing data (cf.Table 3).Simulated data with RR series characteristics more similar to the clinical data could be generated by imposing stricter exclusion criteria, e.g., increasing the lower bounds for irregularity and variability set by criteria 3 and 4 in Sec.2.3.2.However, the simulated data still included signals resembling the clinical data, and the wider range of characteristics likely improved the CNN training by facilitating generalization across a broader range of RR-series.Nevertheless, it is assumed that by the sheer size of the simulated datasets and the conservative model parameter ranges, there will be simulated RR series in the dataset that resemble the clinical data.
The lead-specific respiration signals were computed using the slope range method which was designed for ECG data during AF (Kontaxis, S., Lázaro, J., Corino, V. D., Sandberg, F., Bailón, R., Laguna, P., et al. (2020)) and found to be one of the best performing and simplest methods for lead-specific respiration signal extraction (Varon, C., Morales, J., Lázaro, J., Orini, M., Deviaene, M., Kontaxis, S., et al. (2020)).The result of the lead-specific respiration signal extraction can be improved when combining respiration signals from multiple ECG leads with a joint-lead respiration signal.Previously, the principal component analysis (PCA) has been used to extract joint-lead respiration signals from the clinical data used in this study (Abdollahpur, M., Engström, G., Platonov, P. G., and Sandberg, F. (2022)).However, the principal components were sensitive to high variance noise as the PCA is based on second-order statistics.To address this issue, we developed a novel approach for robust fusion of lead-specific respiration signals based on the πCA (Sameni, R., Jutten, C., and Shamsollahi, M. B. (2008)).Under the assumption that the respiration signal has a periodic structure where the respiration frequency and volume between breaths are constant, the πCA is more suitable for the extraction of joint-lead respiration signals compared to other blind-source separation methods, such as the PCA and basic independent component analysis (ICA).This is because the πCA finds the linear mixture of lead-specific respiration signals with maximal periodic structure, whereas the PCA and basic ICA are based on second-order and fourth-order statistics, respectively.We assume that the respiration frequency and volume between breaths do not vary much in 1-minute segments, making the πCA a suitable approach for the extraction of short joint-lead respiration signals.However, considering that the CNN C 2.5min RR,Resp,AF R performs better when using 2.5-minute segments instead of 1-minute segments, another method may be required for the extraction of longer joint-lead respiration signals.
The comparison between the CNN C RR,Resp,AF R and the linear regression shown in Fig. 5 demonstrate that the relation between the RR series, respiration signal and mean atrial arrival rate to a resp is not simple and not linear.In this study, we only investigate the performance of one basic CNN architecture.While some variations on this were tested during the neural network development, no extensive investigation has been performed and there may be better neural network architectures for this task.The CNN requires the RR series for the prediction of a resp and the mean atrial arrival rate always improved the prediction.In this evaluation, however, µ was set to the correct value; we did not account for estimation errors that are most likely present in real data since AFR provides a crude estimate of the atrial arrival rate.Moreover, the addition of the respiration signal only improves the prediction when of sufficient quality as quantified by ϵ.The linear dependence between η and ϵ support our assumption of ϵ as a marker of respiration signal quality (cf.Sec.3.2).Whereas the addition of the respiration signal and mean atrial arrival rate can improve the prediction of âresp , a method based on RR series only is less sensitive to noise in the recordings.Potentially, the RR series could be extracted from pulse watch data, allowing for continuous monitoring of a resp in a wide range of applications.
The performance of the CNN is dependent on f resp and ϵ (cf.Fig. 6), where f resp appears to have a larger impact on the performance than ϵ.The marker of respiration signal quality ϵ was not used as an exclusion criterion for 1-minute segments, because the addition of X Sim Resp to the input only slightly improved the accuracy of the âresp prediction and the influence of ϵ on the RMSE compared to f resp was small.Instead, ϵ was used to choose the best combination of non-overlapping 1-minute segments.Interestingly, the performance of the CNNs C RR , C AF R , C RR,AF R still show a slight dependence on ϵ although this parameter quantifies the non-periodicity and signal quality of X Sim Resp (cf.Fig. 6).This suggests that ϵ carries information about the RR interval series, and may indicate that the distribution of AV node model parameters varies over different ϵ and that different subsets of model parameters result in different local RMSEs.One possible explanation why the impact of f resp on the performance is prominent may be that there are fewer respiratory cycles in the 1-minute segment at lower f resp .When using 2.5-minute segments in the input data, the performance of the CNN C 2.5min RR,Resp,AF R improved overall, especially at lower f resp .The segment length was set to 1 minute in this study due to the recording length of 1 minute during deep breathing.
There are several limitations of the present study.We assume for simplicity that the variations in AV nodal refractoriness are similar to the variations in AV nodal conduction delay.We also assume that the variations in AV nodal refractoriness and conduction delay are similar between SP and FP.Moreover, the model does not include phase shifts between the RR series and respiration signal for different respiration frequencies (Angelone, A., and Coulter, N. A. (1964)), or effects of respiration volume (Grossman, P., and Taylor, E. W. (2007)).Hence, a different scaling for the refractory period and conduction delay, a different scaling for the SP and FP, a phase shift between the RR series and respiration signal, and an inclusion of respiration volume might form interesting directions for future model improvements.We did not account for respiratory modulation in the AA series, because the modulation is small during AF (Abdollahpur, M., Engström, G., Platonov, P. G., andSandberg, F. (2022), Celotto, C., Sánchez, C., Mountris, K. A., Abdollahpur, M., Sandberg, F., Laguna, P., et al. (2020)).When choosing the bounded uniform distribution of a resp for the training and testing dataset, we made a tradeoff between bias and variance.The reason why a resp was randomly drawn from U[−0.1, 0.5] in the training data and randomly drawn from U[0, 0.4] in the testing data of the CNN is to reduce the bias in the âresp prediction (cf.Fig. 5).Without extending the range of a resp in the training data, the sample mean of the âresp diverged more from a resp at values close to 0 and 0.4.However, the accuracy of the CNNs decreased by extending the range of a resp in the training data.While plenty of simulated data with a resp ground truth can be generated using the AV node model, there was no a resp ground truth available for the clinical dataset and its size was limited.

Conclusion
We presented an extended AV node model that accounts for respiratory modulation in conduction delay and refractory period.We trained a 1D-CNN to predict the respiratory modulation in the AV node with simulated RR series, respiration signal, and mean atrial arrival rate which replicate clinical ECG-derived data.Using simulated data, we demonstrated that the respiratory modulation can be predicted using the 1D-CNN from RR series alone and that the prediction is improved when adding a respiration signal and AFR.Results from analysis of clinical ECG data of 20 patients with sufficient signal quality suggest that the respiratory modulation decreased in response to deep breathing for five patients, increased for five patients, and remained similar for ten patients, indicating a large inter-patient variability.

Figure 1 :
Figure 1: A schematic representation of the AV node model.Retrograde conduction was possible within the AV node model.For simplicity, only a subset of the ten nodes in each pathway is shown.Note that the atrioventricular node used different θ P R and θ P D for the three different pathways, the same time-varying A P (t) for SP and FP and a constant A CN = 1 for CN.

Figure 2 :
Figure 2: The CNN was composed of five 1D convolution layers with 100 filters each.The convolution layers had a kernel size k C , stride s C and dilation factor d C .Training datasets X Sim,T rain , validation datasets X Sim,V al , and testing datasets X Sim,T est were constructed from the simulated data X Sim RR , X Sim Resp and X Sim AF R .A testing dataset X Clin,T est was constructed from the clinical data X Clin RR , X Clin Resp and X Clin AF R .

Figure 3 :
Figure 3: Two examples of clinical RR series (A+E), simulated RR series (B+F), clinical respiration signals (C+G), and simulated respiration signals (D+H) during normal breathing (A-D) and deep breathing (E-H).

Figure 5 :
Figure 5: Binned scatter plot of predicted âresp versus true a resp for the CNN C RR,Resp,AF R and linear regression L RR,Resp,AF R , where both were based on the same input data X = [X Sim RR ; X Sim Resp ; X Sim AF R ].The black dotted line shows where âresp is equal to a resp .The white dotted line shows the sample mean of the âresp prediction.

Figure 6 :
Figure 6: Contour plot of local RMSE estimates over a range of f ′ resp and ϵ ′ for C RR (A), C RR,AF R (B), C RR,Resp (C) and C RR,Resp,AF R (D).Except for the grey region, the CNNs in C and D have a higher accuracy than the CNNs in A and B, respectively.

Figure 7 :
Figure 7: (A) Black dots correspond to the predicted âresp of 1-minute segments during normal breathing (NB) and red squares correspond to âresp of 1-minute segments during deep breathing (DB).The vertical lines correspond to ±σ(f resp , ϵ), where the local RMSE σ(f resp , ϵ) is taken from Figure 6D.(B) Probabilities of a resp being higher in DB than in NB (yellow, arrow-up), similar in DB and NB (red, tilde), and lower in DB than in NB (blue, arrow-down).

Table 1 :
Clinical characteristics of study population.
40 then add τ j to S τ end if end if end for set τ resp as value in S τ closest to τ * obtain w resp by solving the generalized eigenvalue problem of matrix pair (C z

Table 2 :
AV Node model parameters used for simulated data.

Table 3 :
Characteristics of clinical and simulated data.The training data is divided into 20 datasets with equal size to train the 10 realizations of the CNN with unique X Sim,T rain and X Sim,V al .The variables AF R, f resp , and a resp characterize estimates in the clinical data and model parameters in the simulated data.

Table 4 :
Characteristics of the clinical and simulated examples are shown in Fig. 3.

Table 5 :
RMSE, Pearson sample correlation and R 2 of the seven CNN versions and linear regression L RR,Resp,AF R using 1-minute segments, and C 2.5min RR,Resp,AF R using 2.5-minute segments.

Table 5 .
The C RR,Resp,AF R resulted in the lowest RMSE and highest correlation and R 2 .The CNNs C AF R , C Resp and C Resp,AF R without RR series in the input data performed poorly.The C RR predicted âresp with an RMSE of 0.074, where the addition of X Sim Resp or X Sim AF R to the input improved the accuracy of the âresp prediction slightly.For C RR , C RR,AF R , C RR,Resp and C RR,Resp,AF R , the local RMSE of âresp for specific f ′