Toward a New Paradigm in the Analysis of Asteroseismic Lightcurves

This paper aims at being a provocative guide to the future of asteroseismology from the perspective of the analysis of time series, where the fundamentals of harmonic analysis are subjected to stress tests. In this context, we give an annotated summary of our research over the last decades on harmonic analysis of A-F stars. We discuss and explore the consequences of our findings, which may extend to any kind of pulsators. As well, we analyse the impact of this reconsideration on future asteroseismic studies, which would entail a paradigm shift. This includes a discussion on the presence of fractal behavior in δ Sct stars, and how this can be used to develop a stopping criterion of the pre-whitening process, as an alternative to SNR (or significance) criterion. Drilling a scientific paradigm has its natural resilience, hence the path described here is being arduous, although fruitful at the same time.


INTRODUCTION
Nowadays, the paradigm of the asteroseismic analysis establishes that the lightcurve of any pulsating star is the result of the convolution of the actual harmonic components (associated to stellar oscillations with an observation window Deeming, 1975), and the response function of the instrument. This implies the assumption that the necessary conditions to obtain the correct solution by applying the usual harmonic analysis techniques are guaranteed.
Before the space era, many of the spurious peaks in the power spectra were attributed to the day/night window from a single observatory. Here, spurious peaks are significant and nonsignificant peaks in the frequency spectrum that do not correspond to any oscillation mode 1 . This problem was partially overcome by performing multi-site observations where telescopes were distributed in different longitudes along the globe. The Delta Scuti Network 2 ( DSN Breger and Pamyatnykh, 1998) was the first to be established (1983), which observed numerous δ Sct stars through 37 campaigns until 2008. This network delivered, among many other data, the first asteroseismic analysis of the δ Scuti star FG Vir, and showed how non-radial modes are grouped around the radial ones .
However, such observations neither could be as long as desired nor solved the problem of scintillation due to the atmosphere, which limited the amplitudes of the detected modes. The solution was to observe from space (e.g., MOST, CoRoT, Kepler;Baglin et al., 2006;Matthews, 2007;Gilliland et al., 2010, respectively). The ultra-high quality of the data and the long continuous observations guarantees-in principle-that spurious peaks in the powerspectra were marginal (or at least fully controlled). Indeed, those space observations convoluted the signal from the star with an observation window which is not a box (due to gaps and outliers). Applying Deeming's method we should obtain a signal without any interference. However, other obstacles came up. An illustrative example is the periodic character of the window associated with the passage of the CoRoT satellite through the South Atlantic Anomaly (SAA) 3 . This was the origin of periodic gaps in the satellite's lightcurves, which potentially could affect the Fourier analysis as explained above. This was solved by the community by filling the gaps with linear interpolations (see section 2.1). In any case, an iterative spectral deconvolution (pre-whitening) was necessary (see section 2.2). That process, in turn, even in the best conditions, could still introduce new frequencies in the signal, responsible of the observed plateau of frequencies over the minimum noise level (see e.g., Poretti et al., 2009;Chapellier et al., 2011). Moreover, to distinguish the peaks coming from oscillations of those coming from non-linear interactions caused by the highest-amplitude frequencies, several techniques have been applied (e.g., Garrido and Rodríguez, 1996;Balona, 2012).
In addition to all these difficulties, new data from the satellites revealed the existence of the so-called hybrid pulsators, i.e., A-F exhibiting both g and p modes, respectively (see e.g., Grigahcène et al., 2010;Catanzaro et al., 2011;Balona et al., 2015, to name a few), which has called into question the theory of excitation mechanism of these stars .
This work intends to synthesize decades of research of our team devoted to try to understand the pulsation content of A-F, intermediate-mass stars following a holistic approach, in which both observations (and their analysis) and numerical modeling have been studied. This include non-adiabatic pulsations, their interaction with convection (Moya et al., 2004;Dupret et al., 2005;Grigahcène et al., 2005), as well as rotation-pulsation effects (e.g., Suárez et al., 2005Suárez et al., , 2006Suárez et al., , 2007Suárez and Goupil, 2008, to name a few). The present paper intentionally focuses on the harmonic analysis, since we have found that a major revision, at least for this kind of pulsators, was necessary.

DO SPACE SATELLITES SOLVED THE PROBLEM OF SPURIOUS PEAKS IN CLASSICAL PULSATORS?
Analysis of CoRoT's data revealed an unexpected large number of frequencies in δ Sct stars (Poretti et al., 2009) together with ubiquitous correlated noise in the residuals of their lightcurves' fitting. Balona and Dziembowski (2011) evidenced a large disparity between the expected distribution of stars with maximum amplitudes below 100 ppm (∼7%) and the abundance of constant stars (∼50%). In addition, the detection limit of CoRoT and Kepler satellites is far below 100 ppm. As an example, a grass level of 4 ppm is estimated for HD50844 (Balona, 2014), which means that frequencies exceeding 16 ppm might be detected. These results suggest that there are no pulsations below the detection limit.
Since we could be confident with the quality and precision of space data, we wondered at that time if the only possible cause of those yet fully unexplained effects were the systematics. Among which, gaps in lightcurves are one of the most delicate, and as we show in this section, their treatment cannot be avoided.

To Fill or Not to Fill. That's Not the Question Anymore
Gaps in lightcurves can be caused by different reasons: instrumental, e.g., Kepler's data downlink to Earth every 32 d introducing gaps of 1 day in length (see Kepler Data Characteristics Handbook, Van Cleve et al., 2016), or environmental, e.g., CoRoT satellite's passage through the SAA. These make the spectral window function different from a sinc and it can be hard to perform the spectral deconvolution to obtain the original frequency content of the signal.
Considering thus the importance of gaps, our team decided to make a considerable effort to tackle this problem with the objective of better preserving the frequency content. In Pascual-Granado et al. (2015b) we evaluated the most widespread treatment of gaps for CoRoT data, originally implemented in the correction pipeline, i.e., linear interpolation. We showed that it does not preserve the original frequency content. To be more precise, we showed that this happens for any gapfilling method using analytic functions. Indeed, under certain circumstances, analytic interpolation may mimic the spectral window caused by zero-filling (see Appendix A in Pascual-Granado et al., 2018, for more details), i.e., no interpolation at all as often is the case of Kepler/K2 data (see Figure 1, left panel).
While Fourier analysis can be especially suitable for purely harmonic signals, where Fourier coefficients provide a sparse representation, it can be inappropriate for non-analytic signals, e.g., stochastic or auto-regressive (AR) processes.
Our approach to gap-filling thus was to assume that the observed signal contains analytic (harmonics) and non-analytic components. This can be done by fitting the lightcurves with auto-regressive, moving-average (ARMA) models. These models, their variations, e.g., auto-regressive, integrated, moving-average (ARIMA), auto-regressive, fractionally-integrated, movingaverage (ARFIMA), etc., or their continuous versions, can represent almost any kind of signal, whether it is analytic or not. ARMA models have been used extensively in statistics but they are less popular in astronomy (Scargle, 1981). In addition, contrary to what some authors claim , these models are not only extensible to situations involving combinations of deterministic and stochastic behaviors but are also able to represent strictly periodic variations (Roth and Zhugzhda, 2010).
The only possible concern in using ARMA models might be that stationarity is assumed. To overcome this problem, we developed MIARMA algorithm (Pascual-Granado et al., 2015b), which makes use of local ARMA models (see examples of AR global models in Fahlman and Ulrych, 1982) of data segments bracketing the gaps. ARIMA models could also solve this problem but we preferred ARMA for the sake of simplicity.
MIARMA performs an optimal interpolation of gaps compared to polynomial interpolation, often used in the field (Figure 1, left panel). We thus conclude that : (1) gap-filling yields substantial improvements when the number of data points before and after the gap is of the size of the gaps, i.e., duty-cycle ≥60% (conservative criterion); (2) the method to fill the gaps must preserve the original frequency content. That is, to fill or not to fill is not the question anymore.

Pre-whitening
Since regular sampling is recovered by gap-filling, the spectral window is just the sinc function related with the observation time span (boxcar function). In this case FFT is a more efficient estimate of the powerspectrum. Then, the spectral deconvolution problem mentioned above could be directly solved. We might wonder if the pre-whitening procedure would then be necessary, since sidelobes located next to each frequency might be easily identified with the sinc. Although this claim is plausible in theory, things are quite different in practice. For instance, δ Sct stars exhibit a very high density of significant frequencies and thereby, sidelobes may be confused with actual frequencies due to the leakage of power from near frequencies, making them appear significantly higher above the noise.
It is necessary thus to check the consistency of the prewhitening process between gapped (non-regular sampling) and filled (regular sampling) lightcurves. We performed such a test in the pre-whitening frequency sets obtained for 15 δ Sct stars observed by CoRoT . In that paper we found significantly different frequency distributions, which indicates that pre-whitening is not an unbiased procedure. This put into question the feasibility of this widespread technique for the extraction of pulsation content from stellar lightcurves.
Some alternatives to pre-whitening or the clean method for the deconvolution of the spectral window have been proposed in the literature. One promising solution is the use of the so-called direct deconvolution which was applied successfully in Scargle et al. (2017) for the estimated galaxy transform derived from redshift surveys.

REVISITING THE GROUNDS FOR HARMONIC ANALYSIS APPLICATION
In the light of the issues mentioned in the precedent section, it is reasonable to raise the following question.
3.1. Is the Signal Underlying the Lightcurves Mathematically Eligible for Periodogram Analyses? Wiener (1930) established the most general form of the Fourier theorem. It states, among other things, that even some functions which are neither periodic nor absolutely integrable (see an example of this in Priestley, 1981) can be represented by a Fourier-Stieltjes Transform (FT). Thus, the signal underlying a given observed data sample could be most probably represented in Fourier terms. The only condition is that the corresponding function is of the "steady-state" type, i.e., it does not become arbitrarily large or small as t → ±∞. However, the frequency content that can be obtained through a FT is not necessarily equivalent to the estimates obtained through periodogram analysis of our data sample.
Periodogram analyses like the classical Schuster (Schuster, 1898) or the most widespread Lomb-scargle periodogram (Scargle, 1982) for unevenly spaced data, rely on the assumption that the frequency content estimated converges to the original frequency content of the underlying signal. Otherwise, this discrete transform has no physical sense. Nevertheless, as pointed out by Wiener (1930): "The periodogram of a function contains but a small amount of the information which the complete graph of the original function is able to yield. Not only we deliberately discard all phase relations, but a large part of the original function-often the most interesting and important part-is thrown away as the aperiodic residue." It is thus well-justified to check for the eligibility of the signal underlying the lightcurve. Indeed, this should be a prerequisite for a data analysis procedure to qualify as scientific methodology to try to answer this question. In order to do that we need to define first the eligibility criteria. It is worth recalling that DFT is formally similar to FT but it is not the same mathematical concept (Deeming, 1975), it has not even the same units. So the question might be rephrased in this way: when do DFT of a time series provide an analogous description to the FT of the underlying function?
The usual test to check the convergence of the Fourier expansion of a function (and almost the only one that can be performed in practice) is the Dirichlet criterion. It is known that any analytic function always satisfies that criterion, hence the importance of the analyticity for Fourier analysis. Intuitively, analytic functions are the smoothest of all the families since all their derivatives are continuous everywhere. On the opposite side we have the Weierstrass function, which is continuous everywhere but differentiable nowhere. This socalled pathological function has a Fourier expansion, although it cannot be described with a DFT since it requires an infinite sum of frequency components. It worth here evoking Charles Hermitte's words in 1893: "I turn away with fear and horror from the lamentable plague of continuous functions which do not have derivatives."

Connectivity of Measure Points: An Old Concept for a Paradigm-Changing Test
In Pascual-Granado et al. (2015a) we introduced a new parameter known as connectivity to characterize the differentiability of the underlying signal in order to evaluate the question in the previous section. Indeed, the correspondence between the frequency content obtained by means of a periodogram analysis and the original frequency content (i.e., obtained by means of harmonic analysis) has not so far been addressed 4 . This connectivity concept, which comes from the analysis of continuous functions, can also be applied to signals underlying time series 5 .
The connectivity parameter can be understood as the deviation of a fitted analytic model on the observed value of the underlying signal. The arrow of time is taken into account here by fitting the analytic model both in forward and backward directions to find their difference.
Connectivities C n can also be defined implicitly through numerical derivatives D n in this way: where y n+1 , y n−1 are values of the sequence y n corresponding to the sampled underlying signal y(t), t is the sampling rate, ξ n is the non-differentiable component of the sequence and ǫ n accounts for the numerical error. An additional condition is introduced in order to make this implicit definition of connectivities self-consistent. The first term in the definition C(ξ n ) is zero for ξ n = 0. When this happens the connectivities reduce to the uncorrelated normal stochastic sequence ǫ n and the equation of D n is just the typical point derivative expression. When ξ n is non-negligible connectivities show a correlation with the signal (see e.g., right panel of Figure 1). Therefore, correlation with the original time series means that an analytic model is insufficient to account for the underlying signal.
In practice, it is the quadratic sum of the connectivity (global connectivity coefficient) for each data point in the time series what provides a measure of the non-differentiability of the underlying signal in a similar way as the Wiener nondifferentiability coefficient does.
We found that the global connectivity coefficient is significantly >0 for some δ Sct stars observed by CoRoT and Kepler. This implies that, at least in those cases, the underlying signal is a non-differentiable function and, therefore, not mathematically eligible for periodogram analyses. This is what we called an inconsistency in the application of harmonic analysis to the lightcurves of pulsating stars.

TOWARD A NEW PARADIGM
Considering that non-analyticity might play an important role in the measurement of stellar lightcurves, this new frame should contain mathematical techniques able to deal with such nonanalytic objects. One of the first attempts to define fractals as mathematical objects was by mean of recursive continuous but non-differentiable functions, like the Weierstrass function studied by Pascual-Granado (2011). In this context, the fractal analysis, i.e., the study of them by mean of tools derived from the self-affine functions framework (see details in De Franciscis et al., 2018, and references therein), of time series comes up naturally. Indeed, it has also been used to study the rotational periods and flicker noise amplitudes of solar-like stars (De Freitas et al., 2013;de Freitas et al., 2016), and the magnetic activity of M dwarfs (de Freitas et al., 2017), although it is the first time that such an analysis was performed to classical pulsators, for which nonfractal fingerprints was expected to be found. Here fingerprint is any evidence that the studied signal is self-affine, i.e., it exhibits similar structures in different flux and time scales (e.g., the typical zig-zags observed in the stellar energy spectrum).
In this section we explain how we conceive the fractal analysis as key factor in the new paradigm that we propose.

Fractal Analysis in Delta Scuti Stars
We considered that most of the physical phenomena proposed to explain the complex oscillation spectrum of A-F type stars (see section 1) might be related with a specific fractal fingerprint. This hypothesis was confirmed by our study on the lightcurves of 15 CoRoT δ Sct stars (De Franciscis et al., 2018), for which different fractal fingerprints were found. Those fingerprints might provide clues for a better description of non-harmonic components of the lightcurve, such as turbulence, convection, or magnetic activity. However, such a decomposition of physical phenomena required to accurately isolate the harmonic component. To do we studied the fractal analysis on the pre-whitening process.
The pre-whitening process applies a Fourier technique to determine the highest-amplitude frequency. A χ 2 minimization fits the free parameters in a sinusoidal function (frequency, amplitude, and phase). The fit then is subtracted to the original signal and the residuals are used to get the following highest frequency, and so forth. This iterative process usually stops when the SNR ratio drops below 4, which is an arbitrary conservative threshold (Breger et al., 1993).

Decompose and Identify Physical Sources With Fractal Analysis
We applied a Coarse Graining Spectral Analysis (hereafter, CGSA) test to a sample of CoRoT δ Sct, γ Dor and hybrids stars (Franciscis et al., 2019). This test can discriminate the stochastic fractal power spectra from the harmonic one in a time series. Here, the CGSA test accounts for the proportion of fractal signature in the signal within the residuals at each prewhitened step. We applied the test to each and every one of the pre-whitening steps (see Figure 2). As expected, all the stars begin with a low fractal index, i.e., CGSA close to 0, since the principal components of the signal correspond to the harmonic function. The more frequencies are pre-whitened, the closer to a pure fractal the residuals will be. Two distinct behaviors are observed. For some stars, the fraction of CGSA grows until it reaches a maximum and then it decreases (see GSC00144-0301, HD172189, HD181555, and HD51359 in lower panel of Figure 2). This indicates that the pre-whitening extracts properly the harmonic signal up to critical step from which new spurious harmonic components are injected. Here, the stop criterion for the pre-whitening cascade is obvious. In the second case, the CGSA fraction grows until it reaches a kind of saturation (see upper panel in Figure 2). Here, the harmonic signal is rapidly extracted up to the asymptotic point.
The number of frequencies found here is significantly different compared to the literature (see Franciscis et al., 2019, for a full table showing the number of frequencies in each case). For example, for HD50870 the CGSA algorithm stop at 400 frequencies (CGSA∼ 0.84), in contrast to the 734 frequencies found in the literature following standard pre-whitening process. On the other hand, for HD174966, the number of frequencies found using CGSA criterion is higher than the standard prewhitening (400 vs. 185).
Also, the change of sign of the CGSA slope found in some stars (e.g., HD181555) might be a consequence of the presence of another kind of signal hidden in the lightcurve; maybe not only white noise but e.g., rotation modulation effects, binarity or modes not coherently excited (Antoci et al., 2011).
In general, we found the variation of the CGSA slope as a robust (and physical) stop criterion for the pre-whitening cascade, in contrast to the ad-hoc SNR criterion used up to now.

DISCUSSION
We have gathered here the main results of our research in the quest for better understanding the intermediate-mass pulsating stars during the last decades. These are in line with most of the results in the field, where the number of important open questions about δ Sct stars has increased with time at a faster rate than those that have been resolved. This has lead us to suspect that we are on the verge of a paradigm shift.
We have found that ultra-high quality data from space do not solve the problem of finding spurious peaks in the powerspectra of δ Sct stars. The team have been developing a self-consistent method to extract non-linearities, considered spurious in these stars (Lares-Martiz et al., inpress). Another part of the problem has been to systematically neglect the gaps in space photometric lightcurves, or to fill them with a linear interpolation. We have developed the MIARMA algorithm, based on auto-regressive, moving -average models, to maximize the recovery of the frequency content in harmonic analysis. This algorithm has allowed us to improve the asteroseismic studies based on the global behavior of the modes, as attested by different publications on the subject, even in the search for planetary transits (see e.g., Caceres et al., 2019;Handler et al., 2019;Stuhr et al., 2019). Thanks to this improvement our group has confirmed the presence of a low-order large separation and its proportionality to the stellar mean density (Suárez et al., 2014;García Hernández et al., 2015). This makes this seismic parameter a valuable observable 6 , thanks to which we have been able to determine the surface gravity of δ Sct stars . All the above has allowed us to use machine learning to find some hidden relations between seismic variables in a sample of δ Sct stars observed by the CoRoT satellite. This constitutes a first step in the search for scale relationships similar to those found in solar type stars .
Gaussian Process modeling have proved to be a promising technique for the study of stochastic oscillations under a nonwhite noise background, e.g., solar-like and red giant stars Brewer and Stello (2009), White (2010), Foreman-Mackey et al. (2017. This technique can be more suitable than traditional frequency analysis methods in these cases since gaussian models fit properly stochastic processes showing a power-law distribution. Despite all these results, further progress on this field will require to understand the γ Dor/δ Sct-hybrid phenomenon, as well as the forest of low-amplitude, high-frequency modes in certain observed spectra. Our group has addressed this issue by studying the Connectivity of the measure points. We have found non-analytic lightcurves among CoRoT data, for which harmonic analysis cannot ensure to provide the correct oscillation spectrum. What does this mean? Is harmonic analysis wrong? Or, is it properly applied? This puts into question the whole concept of the current paradigm (see section 1).
We addressed this problem using fractal analysis of time series. First, we have detected δ Sct stars different fractal footprints in the lightcurves of δ Sct stars. Moreover, we have demonstrated that the fractal character of the lightcurve varies with the pre-whitening iteration step. This has allowed us to find a new stop criterion that does not depend on any arbitrary criteria, such as the signal-to-noise ratio. A question naturally arises: how does fractality relates to the connectivity of the lightcurves? (work in progress). More interestingly, this opens up the possibility to better describe/understand the non-harmonic components, like, turbulence, convection, activity, etc. We foresee this will have a significant impact on the asteroseismic studies of A-F stars in the near future, where a huge amount of data from TESS (Ricker et al., 2009) and PLATO (Rauer et al., 2014) will be available, together with MOST, CoRoT, and Kepler legacy data. Currently we analyse the impact of using photometric time series from different instruments (namely Kepler and TESS) on the results, for which we expect to find a similar fractal behavior.
This also opens up new questions: Can we translate these results to all pulsating stars? Should we review some physical processes that are currently being studied with harmonic analysis? Could this new paradigm affect to other disciplines in time domain astrophysics, like exo-planetary science?

DATA AVAILABILITY STATEMENT
The datasets generated for this study are available on request to the corresponding author.

AUTHOR CONTRIBUTIONS
JS, RG, JP-G, AG, SF, ML-M, and JR contributed equally to the paper. FUNDING SF, JP-G, JR, ML-M, and RG acknowledges financial support from the State Agency for Research of the Spanish MCIU through the Center of Excellence Severo Ochoa award for the Instituto de Astrofísica de Andalucía (SEV-2017-0709) and Spanish public funds for research under projects ESP2015-65712-C5-5-R. JS and AG acknowledge funding support from Spanish public funds for research under projects ESP2017-87676-C5-2-R. JS also acknowledges funding support from project RYC-2012-09913 under the Ramón y Cajal program of the Spanish MINECO. AG acknowledges support from Universidad de Granada under project E-FQM-041-UGR18 from the Programa Operativo FEDER 2014-2020 programme by Junta de Andalucía regional Government.