Oscillatory Biomedical Signals: Frontiers in Mathematical Models and Statistical Analysis

Herein we describe new frontiers in mathematical modeling and statistical analysis of oscillatory biomedical signals, motivated by our recent studies of network formation in the human brain during the early stages of life and by studies, conducted forty years ago, of cardiorespiratory patterns during sleep in infants and animal models. These frontiers involve new nonlinear-type time–frequency analysis of signals with multiple oscillatory components, as well as efficient particle filters for joint state and parameter estimation, together with uncertainty quantification, in hidden Markov models and empirical Bayes inference.


INTRODUCTION
The 2017 Nobel Prize in Physiology or Medicine was awarded to Jeffrey Hall, Michael Rosbash and Michael Young for their discoveries of molecular mechanisms controlling the circadian rhythm. The statistical properties of these methods have been relatively unexplored and we are currently investigating them; new methods to handle emerging scientific problems may be developed along the way.

During the past three years the second author (Lai) has been developing a new Markov chain Monte Carlo (MCMC) procedure called "MCMC with sequential substitutions" (MCMC-SS) for joint state and parameter estimation in hidden Markov models. The basic idea is to approximate an intractable distribution of interest (the target distribution) by the empirical distribution of $N$ representative atoms, chosen sequentially by an MCMC procedure, so that the empirical distribution approximates the target distribution after a large number of iterations, as explained below.

Lai's work in this area began with the landmark paper of Gordon et al. (1993) on the development of sequential Monte Carlo (SMC), also called particle filters, for the estimation of latent states in a hidden Markov model (HMM). Liu (2008) contains a collection of techniques that have been developed since then, with examples of applications in computational biology and engineering, and Chan and Lai (2013) provide a general theory of particle filters. Let $X = \{X_t, t \ge 1\}$ be a Markov chain and let $Y_1, Y_2, \dots$ be conditionally independent given $X$, such that $X_t \sim p_t(\cdot \mid X_{t-1})$ and $Y_t \sim g_t(\cdot \mid X_t)$, in which $p_t$ and $g_t$ are density functions with respect to measures $\nu_X$ and $\nu_Y$, respectively. The density function $p_T$ of $X_{0:T} = (X_0, \dots, X_T)$ conditional on $Y_{1:T} = (Y_1, \dots, Y_T)$ is proportional to $p_0(x_0) \prod_{t=1}^{T} p_t(x_t \mid x_{t-1})\, g_t(Y_t \mid x_t)$, where $p_0$ denotes the initial density of $X_0$.

Efficient Particle Filters for Joint State and Parameter Estimation in HMM
This conditional distribution is often difficult to sample from, and its normalizing constant is difficult to compute, for high-dimensional or complicated state spaces; particle filters use sequential Monte Carlo, which combines importance sampling and resampling, to circumvent this difficulty. The particle filter computes $E[\psi(X_{0:T}) \mid Y_1, \dots, Y_T]$ by the recursive Monte Carlo scheme summarized in Algorithm 1. Let $X^m_{0:t-1}$ denote the sample path of the $m$th particle (trajectory), $1 \le m \le M$. The scheme uses importance sampling from a proposal density $q_t$ and updates not only the particles $X^m_{0:t-1}$ but also the associated weights $w^m_{t-1}$ and ancestors $A^m_{t-1}$ of $X^m_{0:t}$; it is initialized with $A^m_0 = m$ and $w^m_0 = 1$.

Algorithm 1 (SMC with $M$ particles)
1. Initialization: $A^i_0 = i$, $w^i_0 = 1$ for $i = 1, \dots, M$.
2. Importance sampling at stage $t \in \{1, \dots, T\}$: generate conditionally independent $X^i_t$ from $q_t(\cdot \mid X^i_{0:t-1})$, set $X^i_{0:t} = (X^i_{0:t-1}, X^i_t)$, and update the weights $w^i_t$, resampling (with ancestors recorded in $A^i_t$) when the weights degenerate.

The SMC estimate of $\psi_T := E[\psi(X_{0:T}) \mid Y_{1:T}]$ is the weighted average $\hat\psi_T = \sum_{m=1}^{M} w^m_T \psi(X^m_{0:T}) \big/ \sum_{m=1}^{M} w^m_T$. Using martingale theory, Chan and Lai (2013) provide a comprehensive theory of the SMC estimate $\hat\psi_T$, including asymptotic normality, $\sqrt{M}(\hat\psi_T - \psi_T) \Rightarrow N(0, \sigma^2)$, together with consistent estimation of the standard error: letting $\bar w_t = M^{-1} \sum_{i=1}^{M} w^i_t$, $\sigma^2$ can be consistently estimated from the weighted particles and their ancestral structure. Chan and Lai (2013, Lemmas 1 and 4) use a martingale representation of $\hat\psi_T - \psi_T$ to derive this result. Let $w_t(x_{0:t}) = p_t(x_t \mid x_{t-1})\, g_t(Y_t \mid x_t) / q_t(x_t \mid x_{0:t-1})$, in which the $Y_t$ can be treated as constants since the particle filter targets the conditional distribution of $X_{0:T}$ given the observations $Y_1, \dots, Y_T$, and let $E_q$ denote expectation under which $X_t \mid X_{0:t-1}$ has the conditional density function $q_t(\cdot \mid X_{0:t-1})$ for $1 \le t \le T$. Letting $\Psi_0 = \psi_T$ and defining $\Psi_t(X_{0:t})$ by successive conditional expectations under $E_q$, they obtain a martingale difference decomposition of $\hat\psi_T - \psi_T$. An alternative approach to joint state and parameter estimation is to put a prior distribution on the unknown parameter vector and to use Markov chain Monte Carlo (MCMC) to estimate the posterior distribution. Andrieu et al.
(2010) carried out this method for time-homogeneous Markov chains $X_t \sim p_\theta(\cdot \mid X_{t-1})$ for $t \ge 1$ and $X_0 \sim p_\theta(\cdot)$, with latent states $X_t$ and observations $Y_t \sim g_\theta(\cdot \mid X_t)$, in which $\theta$ is an unknown parameter with a prior density function $\pi(\cdot)$ with respect to some measure $\nu_\theta$ on the parameter space $\Theta$. The posterior density of $(\theta, X_{0:T})$ given $Y_{1:T}$ is proportional to $\pi(\theta)\, p_\theta(x_0) \prod_{t=1}^{T} p_\theta(x_t \mid x_{t-1})\, g_\theta(Y_t \mid x_t)$. The method targets, heuristically, the posterior distribution of $(\theta, X_{0:t})$ given $Y_{1:t}$ ($1 \le t \le T$) as follows. It involves $N$ $\theta$-particles, which we call "atoms", and attaches to each atom $\theta$ a particle filter that propagates and resamples $M$ particles (state trajectories $X^m_{0:t}$) generated by SMC (as in Algorithm 1 with the given $\theta$). It carries out Metropolis–Hastings (MH) iterations to determine whether a candidate atom is accepted (as in Step (c) of Algorithm 2).
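As a concrete illustration of this idea (not taken from the paper), the following minimal Python sketch runs a particle marginal Metropolis–Hastings chain for a toy time-homogeneous model $X_t = \theta X_{t-1} + N(0,1)$, $Y_t = X_t + N(0,1)$. The model, the flat prior, the random-walk proposal, and all function names and tuning constants are illustrative assumptions.

```python
import numpy as np

def pf_loglik(theta, y, M=500, rng=None):
    """Bootstrap filter returning an estimate of log p_theta(Y_1:T) for the
    toy model X_t = theta*X_{t-1} + N(0,1), Y_t = X_t + N(0,1)."""
    rng = rng or np.random.default_rng()
    x = rng.normal(size=M)
    ll = 0.0
    for yt in y:
        x = theta * x + rng.normal(size=M)                # propagate
        logw = -0.5 * (yt - x) ** 2 - 0.5 * np.log(2 * np.pi)
        c = logw.max()
        w = np.exp(logw - c)
        ll += c + np.log(w.mean())                        # log p_hat(y_t | past)
        x = x[rng.choice(M, size=M, p=w / w.sum())]       # multinomial resampling
    return ll

def pmmh(y, n_iter=200, step=0.1, seed=1):
    """Particle marginal Metropolis-Hastings: MH on theta with the particle
    filter's likelihood estimate plugged into the acceptance ratio
    (flat prior on theta, an illustrative assumption)."""
    rng = np.random.default_rng(seed)
    theta = 0.5
    ll = pf_loglik(theta, y, rng=rng)
    chain = []
    for _ in range(n_iter):
        prop = theta + step * rng.normal()                # random-walk proposal
        ll_prop = pf_loglik(prop, y, rng=rng)
        if np.log(rng.uniform()) < ll_prop - ll:          # accept/reject
            theta, ll = prop, ll_prop
        chain.append(theta)
    return np.array(chain)
```

Keeping the current likelihood estimate fixed between iterations (rather than re-estimating it) is what makes the plug-in acceptance ratio exact in the pseudo-marginal sense.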

For the $N$ atoms $\theta^1_t, \dots, \theta^N_t$ and their corresponding importance weights at time $t$ generated in this way, if the degeneracy criterion of Chopin (2002) is satisfied, bootstrap resampling of the weighted parameter-particle set is carried out to replace it by an unweighted set; no convergence theory as $k \to \infty$ is provided, however.
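For readers who prefer code, Algorithm 1 can be sketched as the following bootstrap particle filter in Python. The linear-Gaussian toy model, the ESS $< M/2$ resampling trigger (a common instance of the degeneracy criterion), and all names and constants are illustrative assumptions, not part of the paper.

```python
import numpy as np

def bootstrap_particle_filter(y, M=1000, phi=0.95, sq=0.5, sr=1.0, seed=0):
    """Bootstrap particle filter for the toy model
    X_t = phi*X_{t-1} + N(0, sq^2),  Y_t = X_t + N(0, sr^2).
    The proposal q_t is the state transition p_t, so the incremental
    weight reduces to the observation likelihood g_t(Y_t | X_t)."""
    rng = np.random.default_rng(seed)
    T = len(y)
    x = rng.normal(0.0, 1.0, size=M)           # particles at t = 0
    w = np.full(M, 1.0 / M)                    # normalized weights
    means = []
    for t in range(T):
        # propagate: sample X_t^m ~ p_t(. | X_{t-1}^m)
        x = phi * x + sq * rng.normal(size=M)
        # reweight by the observation density g_t(Y_t | X_t^m)
        logw = np.log(w) - 0.5 * ((y[t] - x) / sr) ** 2
        logw -= logw.max()
        w = np.exp(logw)
        w /= w.sum()
        # resample when the effective sample size degenerates
        ess = 1.0 / np.sum(w ** 2)
        if ess < M / 2:
            idx = rng.choice(M, size=M, p=w)   # ancestor indices A_t^m
            x, w = x[idx], np.full(M, 1.0 / M)
        means.append(np.sum(w * x))            # SMC estimate of E[X_t | Y_1:t]
    return np.array(means)
```

The weighted average computed at each stage is the SMC estimate of the filtered mean, the special case $\psi(X_{0:t}) = X_t$.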

Although MCMC methods with MH iterations are widely used computational tools for Bayesian inference on $\theta \in \Theta$ with a prior density with respect to some measure $\nu_\theta$, they do not come with the convergence guarantees of independent sampling. If $\theta_1, \dots, \theta_N$ could be sampled directly from the target density $p$, then under the assumption $E_p(\psi^2(\theta)) < \infty$ the estimate $\hat\mu = N^{-1} \sum_{i=1}^{N} \psi(\theta_i)$ of $\mu = E_p(\psi(\theta))$ has estimated standard error $\hat\sigma_N / \sqrt{N}$, and $\hat\mu \pm \zeta_{1-\alpha/2}\, \hat\sigma_N / \sqrt{N}$ is an approximate $(1-\alpha)$-level confidence interval, where $\zeta_q$ is the $q$th quantile of the standard normal distribution. This follows from the classical central limit theorem and is very useful for determining $N$ to ensure that $\hat\mu$ is within some prescribed tolerance limit of $\mu$. Lai et al. (2021) develop a procedure that is asymptotically equivalent to this oracle procedure, which assumes a known target density $p$, and which they call MCMC with sequential substitutions (MCMC-SS).

Algorithm 3 (Updating procedure SS($\Theta_{b,k}$, $w^b_k$) for MCMC-SS)
1. Sample $\tilde\theta$ from $q(\cdot \mid \gamma_{b,k-1})$ as candidate atom.
2. Let $\theta^b_{\nu+1,k-1} = \tilde\theta$ and compute the importance weights $w^b_{i,k}$ assigned to the atoms in the $b$th block.

The MCMC-SS estimate averages over iterations $k > \kappa$, in which $\kappa$ represents an initial burn-in period that is asymptotically negligible since $\kappa = o(K)$. Applications include image reconstruction and the study of brain network development.
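The CLT-based determination of $N$ described above can be made concrete in a few lines. The function name and defaults below are illustrative; the computation is simply the smallest $N$ with $\zeta_{1-\alpha/2}\,\sigma/\sqrt{N} \le$ tolerance.

```python
from math import ceil
from statistics import NormalDist

def atoms_needed(sigma, tol, conf=0.95):
    """Smallest N such that z * sigma / sqrt(N) <= tol, where z is the
    (1+conf)/2 standard normal quantile: the CLT-based guarantee that the
    Monte Carlo estimate of mu = E_p[psi(theta)] is within tol of mu with
    probability roughly conf."""
    z = NormalDist().inv_cdf(0.5 + conf / 2.0)
    return ceil((z * sigma / tol) ** 2)
```

For example, a standard deviation of 1 and a tolerance of 0.01 at 95% confidence call for roughly 38,000 atoms, which quantifies how quickly the required $N$ grows as the tolerance shrinks.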
(i) Let $G_{b,k}$ be the joint distribution of $\theta^b_{1,k}, \dots, \theta^b_{\nu,k}$ and let $Q^\nu$ be the probability measure on $\Theta^\nu$ whose density has $\nu$ independent components, each with density $q(\cdot \mid \gamma_f)$ with respect to $m$, where $\gamma_f = \arg\min_{\gamma \in \Gamma} I(q_\gamma \,\|\, f)$ and $I(q \,\|\, f) = E_f\{\log(f(\theta)/q(\theta))\}$ is the Kullback–Leibler divergence (or relative entropy) of $q$ from the target density $f$ in Algorithm 3. Then there exist positive constants $a$ and $c$ such that $\|G_{b,k} - Q^\nu\|_V \le c e^{-ak}$ for $1 \le k \le K$, where $\|\cdot\|_V$ denotes the weighted total variation norm associated with the weight function $V$. Hence, after $k \asymp \log B$ iterations, the distribution of the atoms in each block is close to $Q^\nu$.

(ii) Let $N = B(K - \kappa)$ be the total number of atoms used to define the MCMC-SS estimate $\hat\psi$ of $\mu = E_p(\psi(\theta))$. Then, as $K \to \infty$ and $B \to \infty$ such that $B = O(K)$, $\sqrt{N}(\hat\psi - \mu)$ converges in distribution to $N(0, \sigma^2)$, where $\sigma^2 = \operatorname{Var}_p(\psi(\theta))$ and can be consistently estimated from the weighted atoms.

As shown by Lai et al. (2021), with probability approaching 1 for large $k$, the candidate atom $\tilde\theta$ in Algorithm 3 substitutes some existing atom in $\Theta_{b,k-1}$. Hence, as in the case of a known target density $p$ from which $\tilde\theta$ is sampled, the newly sampled atom features in the weighted average $\hat\psi_{b,k}$. The reason we need the weighted average, with "importance sampling weights" $w^b_{i,k}$, is that for large $k$ the conditional distribution of $\Theta_{b,k}$ given $\Theta_{b,k-1}$ behaves like the $\nu$-fold product measure $Q^\nu$ on $\Theta^\nu$. This shows that importance sampling (likelihood ratio) weights $w^b_{i,k}$ are needed to convert $Q$ to $P$, and it suggests the asymptotic optimality of $\hat\psi$, the overall average of the $B(K-\kappa)$ estimates $\hat\psi_{b,k}$, similar to $\hat\mu$ described above for the case of known $p$. Each random variable generated in the MCMC-SS scheme asymptotically contributes weight $(N\nu)^{-1}$ to (a) the estimate $\hat\psi$ of $\mu$ and (b) the asymptotic variance of $\hat\psi$.
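The role of likelihood-ratio weights in converting draws from $Q$ into an estimate under $P$ is the same as in self-normalized importance sampling, which the following sketch illustrates on a toy pair of densities. The function names, the target $N(1,1)$, and the proposal $N(0,2^2)$ are assumptions for illustration only.

```python
import numpy as np

def snis_estimate(psi, logp, logq, sampler_q, n, seed=0):
    """Self-normalized importance sampling: draw atoms from the proposal q
    and reweight by likelihood ratios w_i = p(theta_i)/q(theta_i) to
    estimate mu = E_p[psi(theta)], mirroring the role of the weights
    w_{i,k}^b in MCMC-SS.  Unnormalized log-densities suffice because the
    normalization cancels in the weighted average."""
    rng = np.random.default_rng(seed)
    th = sampler_q(rng, n)                 # atoms sampled under Q
    logw = logp(th) - logq(th)             # log likelihood ratios
    logw -= logw.max()                     # stabilize before exponentiating
    w = np.exp(logw)
    w /= w.sum()                           # self-normalized weights
    return np.sum(w * psi(th))             # weighted average estimates E_p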
Theorem 2 shows that there is in fact considerable flexibility in the choice of the factors $K$ (the number of iterations) and $B$ (the number of blocks) in $N = B(K - \kappa)$, which determines the scaling factor in the central limit theorem, although the theorem highlights the case $B = O(K)$ to emphasize that $K$ should not be chosen too small relative to $B$. Lai et al. (2021) give an application to uncertainty quantification in the following image reconstruction problem. Cotter et al. (2013) propose to use MCMC methods "whenever the target measure has density with respect to a Gaussian process or Gaussian random field reference measure". A wide range of applications involving such a framework considers Bayesian inference on a latent random field $\{u(x) : x \in D\}$ generated by some stochastic partial differential equation (SPDE), in which $D$ is a connected subset of $\mathbb{R}^d$, based on data generated by some nonlinear function of the random field. It is shown that, after discretization and truncation to fit into this framework, the Radon–Nikodym derivative of the target measure $P$ with respect to the reference measure $Q$ has the form $(dP/dQ)(u) \propto \exp(-l(u))$ for some real-valued function $l$, which Cotter et al. (2013) call the "potential" in their substantive applications.
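For targets of this form, one of the MCMC methods advocated by Cotter et al. (2013) is the preconditioned Crank–Nicolson (pCN) proposal, whose acceptance ratio involves only the potential $l$ because the proposal preserves the Gaussian reference measure. The sketch below is for a finite-dimensional discretization with a standard Gaussian reference; the step size, dimension, and function names are illustrative assumptions.

```python
import numpy as np

def pcn(l, dim, beta=0.2, n_iter=1000, seed=0):
    """Preconditioned Crank-Nicolson MCMC for a target with density
    dP/dQ(u) proportional to exp(-l(u)) with respect to a standard
    Gaussian reference Q on R^dim (a discretized random field).  The
    proposal u' = sqrt(1 - beta^2) * u + beta * xi, xi ~ N(0, I),
    preserves Q, so the acceptance probability is min(1, exp(l(u) - l(u')))."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=dim)
    samples = []
    for _ in range(n_iter):
        xi = rng.normal(size=dim)
        u_prop = np.sqrt(1.0 - beta ** 2) * u + beta * xi
        if np.log(rng.uniform()) < l(u) - l(u_prop):  # accept/reject on potential
            u = u_prop
        samples.append(u.copy())
    return np.array(samples)
```

A key property, emphasized by Cotter et al. (2013), is that the acceptance rate of pCN does not collapse as the discretization mesh is refined, unlike a naive random-walk proposal.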

The advantage of using a zero-mean Gaussian random field reference measure $Q$ is that it is specified by the covariance operator $C$, whose eigenvalues $\lambda_i$ and orthonormal eigenfunctions $\varphi_i$ yield the Karhunen–Loève expansion $u(x) = \sum_{i=1}^{\infty} \sqrt{\lambda_i}\, \xi_i \varphi_i(x)$, with $\xi_i$ i.i.d. standard normal. A random truncation $\tau$ with a sieve prior is used to convert the infinite-dimensional expansion to a finite sum. In addition, a discrete approximation of the random field $u(x)$ is used, with $x$ taken over a mesh of width $\delta$ in each coordinate.
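As a small illustration of the truncated Karhunen–Loève construction, the following sketch samples Brownian motion on $[0,1]$, whose eigenpairs are known in closed form; the truncation level, mesh, and function name are illustrative choices, not from the paper.

```python
import numpy as np

def kl_brownian(n_terms, mesh, rng):
    """Truncated Karhunen-Loeve expansion of Brownian motion on [0, 1]:
    u(x) = sum_j sqrt(lambda_j) * xi_j * phi_j(x), with xi_j iid N(0, 1),
    lambda_j = ((j - 1/2) pi)^(-2), phi_j(x) = sqrt(2) sin((j - 1/2) pi x).
    Truncating at n_terms and evaluating on a mesh gives the finite-sum,
    discretized approximation of the random field."""
    j = np.arange(1, n_terms + 1)
    lam = 1.0 / (((j - 0.5) * np.pi) ** 2)              # eigenvalues of C
    xi = rng.normal(size=n_terms)                        # KL coefficients
    phi = np.sqrt(2.0) * np.sin(np.outer(mesh, (j - 0.5) * np.pi))
    return phi @ (np.sqrt(lam) * xi)                     # field on the mesh
```

With a few hundred terms the truncated field already reproduces the variance structure $\mathrm{Var}\,u(x) = x$ of Brownian motion to within a fraction of a percent, which is why finite truncation is adequate in practice.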
The MCMC-SS procedure works with a parametric family of proposal densities $q(\theta \mid \gamma)$, in which $\gamma \in \Gamma$ indexes the family and $\Gamma$ is a convex subset of $\mathbb{R}^d$.

ACKNOWLEDGMENTS
We thank Lai's research assistant Chenru Liu at Stanford University for her valuable contributions to the preparation, timely submission and revision of the paper. We also thank the reviewers, whose innovative comments helped improve the paper.