Impulsive time series modeling with application to luteinizing hormone data

This work considers the estimation of impulsive time series pertaining to biomedical systems and, in particular, to endocrine ones. We assume a signal model in the form of the output of a continuous linear time-invariant system driven by a sequence of instantaneous impulses, which concept is utilized here, in particular, for modeling of the male reproductive hormone axis. An estimation method to identify the impulsive sequence and the continuous system dynamics from sampled measurements of the output is proposed. Hinging on thorough mathematical analysis, the method improves upon a previously developed least-squares algorithm by resolving the trade-off between model fit and input sparsity, thus removing the need for manual tuning of user-defined estimation algorithm parameters. Experiments with synthetic data and Markov chain Monte-Carlo estimation demonstrate the viability of the proposed method, but also indicate that measurement noise renders the estimation problem ill-posed, as multiple estimates along a curve in the parameter space yield similar fits to data. The method is furthermore applied to clinical luteinizing hormone data collected from healthy males and, for comparability, one female, with similar results. Comparison between the estimated and theoretical elimination rates, as well as simulation of the estimated models, demonstrate the efficacy of the method. The sensitivity of the impulse distribution to the estimated elimination rates is investigated on a subject-specific data subset, revealing that the input sequence and elimination rate estimates can be interdependent. The dose-dependent effect of a selective gonadotropin releasing hormone receptor antagonist on the frequency and weights of the estimated impulses is also analyzed; a significant impact of the medication on the impulse weights is confirmed. To demonstrate the feasibility of the estimation approach for other hormones with pulsatile secretion, the modeling of cortisol data sets collected from three female adolescents was performed.

In the noise-free case, the estimation utilizes how the solutions to the optimization problem (5) divide the b 1 -b 2 -parameter space into distinct regions. In this setup we allow negative impulse weight estimates, and use how the signs of the impulses depend on the parameter values to navigate to the true parameter values b * 1 , b * 2 . The following proposition provides an initial division of the space. PROPOSITION 1. Runvik and Medvedev (2022) Consider LS problem (5) with the initial condition x 2 (t 1 ) = 0. Assume that the noise-free measurements Y are produced by (2) and that A further refinement of this division is derived by considering the estimated impulse weights in a simplified case with three measurements of the response of a single impulse. Each such triplet gives rise to a curve in the b 1 -b 2 -space, defining the boundary between regions of positive and negative impulse weight estimates. By assuming that properties of this type of curve also hold in the general setting (which appears to be true in numerical simulations), and assuming sufficient sparsity of the impulses in relation to the sampling rate, the maximum and minimum of all possible curves generated by any triplet of measurements define two curves in the parameter space, respectively representing the boundaries of the regions with positive and negative impulse weights. For the positive region, this is the curve γ P that is sketched in Fig. 1. Importantly, the boundary of the negative region meet γ P only the point (b * 1 , b * 2 ), which enables the determination of these parameters by locating the point where the boundaries meet.

Estimation under uncertainty
When additive measurement noise is present, the parameter space is no longer divided into the welldefined regions outlined above, instead impulses with both positive and negative weights would be typically be present in most of the parameter space. To robustify the estimation, a non-negativity constraint on the impulses is therefore added in the optimization. This produces the alternative characterization of γ P based on zero loss given in Section 2.1.1 and motivates the estimation procedure using (6) and (7).

LEAST SQUARES IMPLEMENTATION
The methods used to solve optimization problem (6), and thus enabling the estimation of γ P , are presented here. It can be seen in numerical experiments that this problem generally tends to be non-convex, but with only few (typically one or two) local minima located in a small interval. Assuming that no prior information is available to determine a good starting guess for b 2 , we use a refined gridding strategy for this problem in the experiments with synthetic data in Section 3.1. Algorithm 1 is then modified so thatb 2 is first calculated over a coarse grid of b 1 and b 2 , giving estimatesb 2 b (n) 1,c , n = 1, . . . N c . Then a finer grid over b 1 is used 1 is the point of the coarse grid closest to b (n) 1,f and δ is a positive parameter. In both grids, a finite difference approximation over the points in the b 2 -grid is utilized to approximate the derivative of the residual sum. The distance between the points of the coarse grid is set to 0.02 (in both directions); the corresponding value for the fine grid set to 0.002. δ = 0.06 is furthermore chosen.
In the estimation from clinical data, where the LH level is not reduced by medication in Section 3.2, gridding is used without the refinement described above. In Section 3.2.1, the distance between the points of the grid is 0.002 in the b 1 -direction (with the range defined by (8)) and 5 × 10 −4 in the b 2 -direction (in the range [0.0015, 0.04]) while 5 × 10 −4 is used in both directions in Section 3.2.2. In the estimation with cortisol data, the distance between grid points is 5 × 10 −4 in the b 1 -direction and 2 × 10 −4 in the b 2 -direction. The parameter d min is set to 5% of the maximal estimated impulse weight in Section 3.2.1, 3.2.3 and 3.3 and 2% is Section 3.2.2 (the lower value of is used as the noise level appears lower in the limited data-set considered in that experiment), while d Π is set to 5% of the mean estimated impulse weight and Π = K/2 in all experiments.
Finally, for the data sets where the LH level is reduced by the GnRH-receptor antagonist, the estimated value of b 2 for each individual in the unmedicated case is used for initializing an interior point optimization with the Matlab function Fmincon, where the derivative in the objective function again is approximated with a finite difference. This strategy makes the estimation more robust compared to the gridding in this instance, when the medication decreases the signal-to-noise ratio.

Empirical covariance
In the AM algorithm, a Gaussian zero-mean proposal distribution is used with a covariance which depends on the iteration index i and the states of the Markov chain (i.e. the parameter vector) Θ i according to where c 0 is a fixed covariance matrix, i 0 is a positive integer, s d and ϵ are positive scaling parameters, I denotes the identity matrix and c(Θ 1 , . . . Θ i ) is the empirical covariance matrix where the bar denotes the mean of the states over indices 1, . . . i. In the present work, the parameter values c 0 = 0.1I, i 0 = 2, s d = 0.02 and ϵ = 0.01s d were chosen.

Superfluous impulse removal
The method for identifying and removing superfluous impulses from the posterior samples of the AM algorithm is given here. For each sample, Algorithm 1 is performed. Here N is the assumed number of impulses, δ > 0 is a time relaxation parameter and d min > 0 is the impulse weight threshold. The parameter values δ = 0.1, d min = 0.1 were used in the numerical experiments. if n < N and ∃ t i , t i+1 s.t. t i < τ n < τ n+1 < t i+1 + δ then 3: Merge impulses n, n + 1 according to Algorithm 1 in Mattsson and Medvedev (2015)

Gelman-Rubin statistic
Convergence of the AM algorithm is verified using the Gelman-Rubin statistic (Gelman and Rubin, 1992). This verification is based on running multiple Markov chains for the same estimation, and then comparing the within-chain and between-chain variability of the estimates. In particular, unbiased estimates of the variances of each parameter are calculated in two ways; using the variance between chains and through averaging over all chains. The square root of the ratio between these estimates produces the statistic R, which approaches 1 as the chains converge. In our work, we use the mean value of R over all estimated parameters to verify convergence.

SYNTHETIC DATA EXPERIMENT
The probability distributions and the fixed parameters used to generate the synthetic data used in Section 3.1 are summarized in Table S1. For both the least squares and the AM estimation, b 1 and b 2 were restricted to the ranges b * Constraints restricting the impulse times to be within the time horizon and the impulse weights to satisfy 0 ≤ d i ≤ 8 (i.e. twice the maximal amplitude allowed for the generating model) were also used for the AM algorithm. Five impulses were assumed in the AM algorithm, see the discussion in Section 2.2 and 3. Table S1. Experiment parameters for synthetic data generation. U [·,·] is the uniform distribution, ∆τ i denotes the time separation between the impulses and t end − τ end defines the time between the last impulse and the end of the data set. Further information on measurements and data sets that were removed in the estimation experiments in Section 3.2 are given in this section.
In Fig. S1, typical LH profiles for removed and included subjects are illustrated.
In Fig. S2 the distribution of LH measurements from a 32-year-old is displayed. The highlighted outlier was removed from this data set as it is clearly outside the distribution of the remaining measurements for this individual.  Figure S1. Subsets of the LH concentration profiles for a 68-year-old (top) and 24-year-old (bottom). The dataset corresponding to the upper plot was removed due to the deviating behaviour.

SENSITIVITY ASSESSMENT FOR GNRH-RECEPTOR ANTAGONIST AND AGE EFFECTS
In Section 3.2.3, the value of b 1 is set to 0.5. To assess the sensitivity with respect to this choice, we compare the results when b 1 = 0.35, b 1 = 0.5 and b 1 = 0.65 in this Section. Estimation results are summarized in Table S2. Some differences can be observed, but the results appear not to be particularly sensitive to the choice of b 1 overall.  Figure S2. Logarithm of LH concentration measurements from 32-year-old subject. The outlier marked in red was removed from the data set. Table S2. Estimation results for the experiments described in Section 3.2.3 for different values of b 1 . The first two rows contain the results of repeated one way analysis of variance of the average time separation between impulses and the average impulse weight with respect to different doses of Ganirelix. The last two rows contain the results from fitting affine curves to the same estimates as functions of the ages of the subjects. The p-values there are with respect to the null hypothesis of no age dependence, i.e. a zero slope.