A Proof of Concept Study of Function-Based Statistical Analysis of fNIRS Data: Syntax Comprehension in Children with Specific Language Impairment Compared to Typically-Developing Controls

Functional near infrared spectroscopy (fNIRS) is a neuroimaging technology that enables investigators to indirectly monitor brain activity in vivo through relative changes in the concentration of oxygenated and deoxygenated hemoglobin. One of the key features of fNIRS is its superior temporal resolution, with dense measurements over very short periods of time (100 ms increments). Unfortunately, most statistical analysis approaches in the existing literature have not fully utilized the high temporal resolution of fNIRS. For example, many analysis procedures are based on linearity assumptions that only extract partial information, thereby neglecting the overall dynamic trends in fNIRS trajectories. The main goal of this article is to assess the ability of a functional data analysis (FDA) approach for detecting significant differences in hemodynamic responses recorded by fNIRS. Children with and without SLI wore two, 3 × 5 fNIRS caps situated over the bilateral parasylvian areas as they completed a language comprehension task. FDA was used to decompose the high dimensional hemodynamic curves into the mean function and a few eigenfunctions to represent the overall trend and variation structures over time. Compared to the most popular GLM, we did not assume any parametric structure and let the data speak for itself. This analysis identified significant differences between the case and control groups in the oxygenated hemodynamic mean trends in the bilateral inferior frontal and left inferior posterior parietal brain regions. We also detected significant group differences in the deoxygenated hemodynamic mean trends in the right inferior posterior parietal cortex and left temporal parietal junction. These findings, using dramatically different approaches, experimental designs, data sets, and foci, were consistent with several other reports, confirming group differences in the importance of these two areas for syntax comprehension. The proposed FDA was consistent with the temporal characteristics of fNIRS, thus providing an alternative methodology for fNIRS analyses.

Functional near infrared spectroscopy (fNIRS) is a neuroimaging technology that enables investigators to indirectly monitor brain activity in vivo through relative changes in the concentration of oxygenated and deoxygenated hemoglobin. One of the key features of fNIRS is its superior temporal resolution, with dense measurements over very short periods of time (100 ms increments). Unfortunately, most statistical analysis approaches in the existing literature have not fully utilized the high temporal resolution of fNIRS. For example, many analysis procedures are based on linearity assumptions that only extract partial information, thereby neglecting the overall dynamic trends in fNIRS trajectories. The main goal of this article is to assess the ability of a functional data analysis (FDA) approach for detecting significant differences in hemodynamic responses recorded by fNIRS. Children with and without SLI wore two, 3×5 fNIRS caps situated over the bilateral parasylvian areas as they completed a language comprehension task. FDA was used to decompose the high dimensional hemodynamic curves into the mean function and a few eigenfunctions to represent the overall trend and variation structures over time. Compared to the most popular GLM, we did not assume any parametric structure and let the data speak for itself. This analysis identified significant differences between the case and control groups in the oxygenated hemodynamic mean trends in the bilateral inferior frontal and left inferior posterior parietal brain regions. We also detected significant group differences in the deoxygenated hemodynamic mean trends in the right inferior posterior parietal cortex and left temporal parietal junction. These findings, using dramatically different approaches, experimental designs, data sets, and foci, were consistent with several other reports, confirming group differences in the importance of these two areas for syntax comprehension. The proposed FDA was consistent with the temporal characteristics of fNIRS, thus providing an alternative methodology for fNIRS analyses.

INTRODUCTION
Functional near infrared spectroscopy (fNIRS) is a non-invasive method for measuring near-infrared light absorption through the skull, enabling researchers to speculate a close proxy to neural activation that results from relative changes of the cerebrovascular alterations in oxygenated and deoxygenated hemoglobin concentrations in cortical structures (Villringer and Dirnagl, 1994;Boas et al., 2014;Tak and Ye, 2014). Since light between 650 and 950 nm is weakly absorbed by biological chromophores (Hoge et al., 2005), the relatively deep penetration of NIR light makes it an effective research tool in neuroimaging studies. Compared to other imaging technologies such as functional magnetic resonance imaging (fMRI) and positron emission tomography (PET), fNIRS has a few advantages such as low cost, high flexibility, portability, and the ability to accommodate young children and patients with psychological issues (Arenth et al., 2007;Ye et al., 2009). fNIRS offers superior temporal resolution with dense measurements over time and provides data for a wide range of functional contrasts such as oxygenated ( HbO), deoxygenated ( HbD), and total hemoglobin ( HbT) simultaneously as participants perform functional tasks in naturalistic environments (Kozel et al., 2009;Ye et al., 2009;Hall et al., 2013;Tak and Ye, 2014). Despite the extensive study of fNIRS data, little has been done to study the mean and variation trends of hemodynamic curves as individuals complete language processing tasks. Indeed, analysis approaches that truly utilize the superior temporal characteristics of fNIRS are rare in the existing literature. Even rarer are studies of concomitant behavioral and neural differences between children with specific language impairment (SLI) and typically developing control children as they complete language comprehension tasks.
In this article, we introduce a functional data analysis (FDA) methodology with a goal of addressing several challenging questions: (1) how to best utilize the superior temporal resolution of fNIRS; (2) how to model its hemodynamic trends for syntax-related stimuli; (3) how to connect light optodes with brain regions without anatomy information; (4) how to speculate the differences in brain activities between case and control in reaction to the same stimuli. FDA is a nonparametric data-driven statistical technique that does not make any parametric assumptions such as linearity or normality. Our main objective was to model the overall hemodynamic trends from a functional perspective as opposed to individual discrete points that are considered using existing analysis approaches. Although the modeling goal of FDA conforms to the temporal hemodynamic signals of the fNIRS context (Barati et al., 2013), it has seldom been applied in the fNIRS literature. Tak and Ye (2014) reviewed currently existing statistical models in fNIRS data. The most well-known and widely used method was the GLM (Schroeter et al., 2004;Plichta et al., 2007), which has been integrated into numerous fNIRS analysis tools (Shimada and Hiraki, 2006;Koh et al., 2007;Abdelnour and Huppert, 2009;Huppert et al., 2009;Strangman et al., 2009;Ye et al., 2009;Custo et al., 2010;Penny et al., 2011). As a multivariate statistical model, GLM works well, but FDA differs in important ways. First, GLM is a traditional parametric model that assumes a linear combination structure. Assuming a parametric form would likely be misleading if the underlying data did not satisfy the main linear assumptions. Therefore, nonparametric modeling without any assumptions should be more flexible. Second, as a multivariate model, GLM does not utilize the time course of the data and hence can not capture the overall trends of the hemoglobin concentration in the dynamic or functional sense (Barati et al., 2013). Third, GLM does not provide a relevant hypothesis test approach to compare the differences in the overall hemodynamic trends between case and control groups due to its model structure restrictions.
Comparing which brain regions are significantly involved in a task performed by two groups requires formal hypothesis testing. Unfortunately, many of the current statistical approaches used to perform hypothesis tests for fNIRS data may not be optimal in the functional sense. Simple statistics such as t-test have been performed to statistically compare single-value differences between different groups (Aldrich et al., 1994;Germon et al., 1994Germon et al., , 1999Young et al., 2000;Hoshi et al., 2001;Isobe et al., 2001;Kennan et al., 2002;Schroeter et al., 2002;Hoshi, 2003;Matsuo et al., 2003;Tachtsidis et al., 2004;Tsujimoto et al., 2004;Shibuya-Tayoshi et al., 2007;Kim et al., 2010). Multi-way ANOVA has also been employed in fNIRS studies (Fallgatter and Strik, 1998;Bartocci et al., 2000;Fallgatter and Strik, 2000;Herrmann et al., 2003;Hoshi, 2003;Suto et al., 2004;Folley and Park, 2005;Kameyama et al., 2006;Arenth et al., 2007;Irani et al., 2007). Although these methods were able to evaluate differences in hemoglobin observations, information was lost because only partial measurements were considered. Using FDA to compare the overall temporal mean and variation trends of hemodynamic response functions rather than simply defining a magnitude may be more informative and robust, especially in a context in which optical signal attenuation or motion artifacts cause noise (Ye et al., 2009).
When repeated measurements are recorded over a dense grid of time points, often by machine, they are typically termed as functional or longitudinal data, with one observed curve per subject. Formally, FDA models each hemodynamic response curve as a continuum function over time, thus capturing the overall dynamic trajectories of the function over time, even though the measurements are collected discretely (Ramsay and Silverman, 2002;Ferraty and Vieu, 2006;Ramsay, 2006;Barati et al., 2013). Although some experimental errors are generally unavoidable, nonparametric kernel smoothing captures the underlying mean function and hence greatly reduces the effects of noise. The functional principal component analysis (FPCA) based on the Karhunen-Loeve theorems decomposes the high dimensional auto-covariance matrix extracted from fNIRS data to a few important orthogonal eigenfunctions. The first few eigenfunctions explaining the majority of variation are likely induced by cognitive related tasks, with the remaining eigenfunctions explaining only a very small percentage of variation that may be caused by nuisance factors such as breathing, vasomotor, measurement error, movement artifacts, and other unaccounted activities (Akgül et al., 2006). To perform comprehensive comparisons on the hemodynamic curves between case and control groups, we tested the equality of mean functions and eigenfunctions and eigenvalues of the auto-covariance functions using two-sample FPCA approaches. Bootstrap sampling was used to determine the threshold of the significance of the tests because the distributions of the test statistics were unknown (Benko et al., 2009). Importantly, FDA is inherently nonparametric and does not assume any parametric structure or distributions within the hemodynamic curve data.
Some researchers have investigated the functional relationship between fNIRS and fMRI and their correlation over time (Mandeville et al., 1999;Siegel et al., 2003;Fujiwara et al., 2004;Okamoto et al., 2004;Steinbrink et al., 2006). Although many common properties exist between fNIRS and fMRI, functional curve based modeling, which is mature in fMRI research (Grodzinsky, 2000;Ben Schachar et al., 2003;Müller et al., 2003;Ben-Shachar et al., 2004;Weismer et al., 2005;Binder et al., 2009;Seghier et al., 2010;Seghier, 2013), has rarely been used for fNIRS stand-alone experiments. The progress achieved in fMRI analyses paves the way for improvements on fNIRS approaches. We believe that the FDA approach could promote breakthroughs in fNIRS research, similar to the way it did for fMRI.
To test the potential of FDA to analyze fNIRS data, we used fNIRS to asssess differences in neural activation between children (case: children with SLI; control: age-matched, typicallydeveloping children) as they engaged a language comprehension task that is known to favor the children in the control group. SLI is a developmental language disorder of unknown origin that is characterized by significant deficits in the acquisition and use of spoken and written language in the absence of hearing, intellectual, emotional, or acquired neurological impairments (Bishop, 2014;Leonard, 2014). This disorder affects approximately 7% of the school-age population (Tomblin et al., 1997). If FDA is a promising statistical approach for fNIRS, it should reveal group differences in parasylvian (language related) neural regions as children perform the task.

Participants
Thirty children (15 children with SLI and 15 age-matched, typically developing control children) between the ages of 8 and 12 participated in the study. There were eight males in each group. The children in the SLI group met the standard classification criteria of performance on multiple language measures that was one or more standard deviations below the mean. The typically-developing controls performed above one standard deviation from the mean on multiple language measures. All the children in both groups were right-handed, monolingual English speakers. All the children in the SLI group were receiving special education services in the public schools. In addition, we provided independent testing to insure that the children in the SLI group met our identification criteria.

Sentence Comprehension Task
The children completed a language comprehension task in which they listened to a sentence and then selected a picture (from three choices) that depicted the agent (actor) in the sentence. There were 60 total sentences with 15 sentences representing each of four sentence types: subject-verb-object ("The ring had moved the square behind the very bright cold bed"), subject relatives ("The watch that had hugged the truck behind the kite was bright"), passives ("The shoe was hugged by the clock under the very cold box"), and object relatives ("The book that the shirt had hugged under the kite was new"). The sentences were controlled for length, vocabulary complexity, and vocabulary imageability (Montgomery et al., 2015). Similar to Dick et al. (2004), noun animacy and noun affordance cues were removed, making the sentences semantically implausible. This was done so that the children's decisions about the agent of the sentence would be based primarily on syntactic knowledge or word order rather than semantic plausibility. Children saw three pictures on a computer screen as they listened to each sentence. They were asked to point to the picture of the agent of the sentence (the thing doing the action) as quickly as possible after hearing each sentence. All children completed eight training items before fNIRS scanning began. See Montgomery et al. (2015) for a complete description of the stimuli.

Functional Near Infrared Spectroscopy Procedures
Data was collected with the Hitachi ETG-4000 (Hitachi Medical Co., Japan) with 44 channels divided across two 3 × 5 probe caps. The channels were determined by bilateral placement of the optode caps such that the middle detector in the lowest row of optodes was placed over T3 or T4. The measurement patches covered the majority of the right and left parasylvian regions including inferior frontal cortex, inferior parietal lobule (including the temporal parietal junction and inferior posterior parietal cortex), and superolateral temporal cortex. The channel locations are depicted in Figure 1.
The fNIRS scan began with a 30 s rest period in which children were instructed to focus on a "+" in the middle of the computer screen and to "relax" their mind. After the first rest FIGURE 1 | Display of the 44 channels divided across two 3 × 5 probe caps. The channels 1-22 belong to the left brain hemisphere and the channels 23-44 belong to the right brain hemisphere.
period, children listened to 60, 12-word sentences representing four different syntax types (15 subject-verb-object sentences, 15 subject relative clause sentences, 15 passive sentences, and 15 object relative clause sentences). E-prime software was used to present the stimuli in a pseudo-random order and to record the accuracy and speed of the children's responses. The sentences were presented in three blocks of 20 items, presented in a psudorandom order, with each item being separated by a jittered rest interval that varied between 2 and 6 s. Each block was separated by a 25 s rest period. The stimuli onsets for each participant were consistently predefined and each participant was given 8 s to think and respond.
Throughout the fNIRS scan, near-infrared light from the source optodes travels approximately 1-1.5 cm into the cortex where it is absorbed by oxygen molecules attached to hemoglobin in the blood in the brain (Dehghani and Delpy, 2000). The amount of light that is not absorbed is measured by the detecting optodes. The relative changes in the concentration of oxygenated hemoglobin ( HbO), deoxygenated hemoglobin ( HbD), and total hemoglobin ( HbT) were estimated according to changes in the optical properties of the light using the Beers-Lambert conversion (see Plichta et al., 2007 for a detailed description). A length of 8521 and a frequency of 10 Hz time series was collected within a duration of 851 sec for each channel of each participant. Figure 2A displays one example of original HbO time series at channel 31 (mainly overlapped in the right inferior frontal cortex) for a child in the SLI group.

Data Preprocessing
There were a total of 3960 individual time series collected from three hemoglobin categories ( HbO, HbD, and HbT), 44 channels, and 30 participants (15 cases and 15 controls). Each time series contained 8521 measurement units consisting of 4800 intermittent task measurement units and 3721 rest measurement units. The active periods represented 15 stimuli segments for each of the four syntax types. The following preprocessing steps were designed to extract the most important information from such a large amount of data.
The first step of data preprocessing was to group channels based on regions of interest (ROIs). The global alignments of the channel positions between individuals were difficult because fNIRS has the shortcoming of weak spatial anatomical representation. The ROIs for the current project were derived a priori based on previous findings in both the fMRI and fNIRS literature demonstrating changes in cortical activation during language processing tasks. Four areas within the parasylvian region, inferior frontal cortex (Broca's area), superolateral temporal cortex, the temporal parietal junction and posterior inferior parietal cortex (Angular Gyrus) are frequently implicated in verbal tasks (Rossi et al., 2012;Scherer et al., 2012;Petrides, 2013). A Polemus system was used for 3D digitization of head size and optode location following testing. This provided standardized Montreal Neurological Institute coordinates and anatomical labels that related to each participant individually. We determined the corresponding channel for each monitored brain region based on the largest percentage of overlapping rate between the channel and the brain ROI for each participant.
The second step of data preprocessing was to extract only stimulus-related active units from the original time series and focus only on the segments associated with cognitive activity during the target stimulus comprehension tasks. There were 60 such windowed segments, each lasting 8 sec (corresponding to 80 units), and hence, a total 4800 units were extracted. As an example, Figure 2B displays the stimulus-relevant HbO extracted from the original time series at channel 31 (mainly overlapping the right inferior frontal cortex) for a child (Sue) in the SLI group. This process was repeated for all individuals.
Since the observations were collected very densely, we used the average of the 10 units per second as the modeling target, illustrated in Figure 2C. Comparing Figure 2B and Figure 2C, notice that the two signals look almost the same, except Figure 2B has length 4800 but Figure 2C is only of length 480 (1/10 of original length). If there were any differences caused by averaging the 10 dense units per second (Figure 2C), it would be smoother and would capture the trend even better by removing more noise or errors from averaging. The third step of data preprocessing related to selecting the hemoglobin categories. It is not clear whether neuronal activation is best represented by HbO, HbD, or HbT. Researchers may expect the deoxygenated hemoglobin to show opposite trends to that of HbO because the HbO and HbD often complement each other (Cui et al., 2010). However, comparing Figure 3A with Figure 3B for one example of the same channel for the same person, note that the deoxygenated hemodynamic trends are flatter than the oxygenated hemodynamic trends, and there does not appear to be opposite trends in most time segments. This suggests that the oxygenated hemoglobin contains a more rubust signal than the deoxygenated hemoglobin. In this article, we mainly focused on modeling HbO and HbD because the results of HbT (the sum of HbO and HbD) were highly correlated with the other two.
The forth step of data preprocessing involved extracting the syntax-relevant time course by locating the time onsets of the 15 questions for each syntax type. This yielded four different time courses, each with 120 units. As an example, Figures 3, 4 display one example of the four syntax-relevant HbO hemodynamic curves extracted from the original time series at channel 31 (mainly overlapped in the right inferior frontal cortex) for a child in the SLI group named Sue.
Factors such as breathing, vasomotor response, measurement error, movement artifacts, and other unaccounted activities (Akgül et al., 2006), may cause noise in fNIRS data. These four preprocessing steps enabled us to extract the most important signals and remove unavoidable confounding factors. Comparing Figure 2A with Figure 4, notice that it is harder to recognize patterns from Figure 2A due to many complex and sharp fluctuations and strands. On the contrary, the patterns are smoother and clearer in Figure 4. After these preprocessing steps, our data were ready for the statistical models and hypothesis tests.

Functional Data Analysis Structure
Let Y ikc , i = 1, . . . , n; k = 1, . . . , T; c = 1, 2, denote the relative changes in the concentrations of oxygenated  or deoxygenated hemoglobin of the ith subject measured at discrete time point t k for the cth group. Here c = 1 denotes the case group and c = 2 denotes the control group, n is the number of subjects per group, and T is the total time points measured for each subject. These observed densely collected curves with noise can be modeled as independent realizations of a stochastic process with smooth trajectories. Let X 1c (t), . . . , X nc (t) denote random smooth trajectories of the underlying stochastic process in L 2 (T ), t ∈ T , where T is the time interval. Then we can reconstruct the smooth functions X ′ i s from the original densely collected noisy observations Y ′ i s as (Müller, 2008) Y ikc = X ic (t) + ε ikc , i = 1, . . . , n; c = 1, 2; k = 1, . . . , T; where ε ikc are the experimental errors and assumed to be independent, with E(ε ikc ) = 0 and Var(ε ikc ) = σ 2 kc . For each group c, the mean function of X ic (t) is µ c (t) = E(X ic (t)) and auto-covariance function of X ic (t) is for s, t ∈ T . Here µ c (t) is interpreted as the mean function of oxygenated or deoxygenated hemodynamic curves for group c. Throughout this paper, it is assumed that µ c (t) is a smooth function of t, and G c (s, t) is a positive definite and bivariate smooth function of s and t, for s, t ∈ T . The "smooth" refers to twice continuously differentiable. The idea of model Equation (1) is that the observed noisy curve over time is described by an underlying smooth function plus noise.
In order to model the auto-covariance function, functional PCA interprets G c (s, t) as the kernel of a linear integral operator on the space L 2 (T ) of square-integrable functions on T , mapping An eigenfunction v of the auto-covariance operator A G c is a solution of the equation (A G c v)(t) = λv(t), with eigenvalue λ. For each c, we assume that the operator A G c has a sequence of smooth orthonormal eigenfunctions v lc satisfying T v kc (t)v lc (t)dt = δ kl (here δ kl is the Kronecker symbol), with ordered eigenvalues λ 1c ≥ λ 2c ≥ . . . ≥ 0. By Mercer's Theorem, applying a spectral decomposition on the function G c yields Since the eigenfunctions v lc 's form a complete orthonormal sequence on L 2 (T ), the generalized Fourier expansion (Karhunen − Loeve Theorem (Karhunen, 1946) or functional principal component expansion) on X ic yields where the sum is defined in the sense of L 2 convergence and are uncorrelated random variables with E(ζ ilc ) = 0, and var(ζ ilc ) = λ lc , subject to the L 2 convergence, i.e., ζ lc are frequently referred to as the lth functional principal component score or the lth dominant modes of random effects. By way of Equation (4), the dynamic trends of random function X ic (t) can be modeled by the mean trend function µ c (t), the eigenfunction v lc , and the distribution of functional principal component scores ζ ilc . The first L principal components were used to approximate Equation (4) to capture the most important variations, remove the noise effects, and estimate the main signals of the trajectories of X c (t) effectively (Ramsay and Silverman, 2002).

Parameter Estimates
Using the observed data set D = {Y ikc , i = 1, . . . , n; k = 1, . . . , T; c = 1, 2}, we were able to estimate all unknown parametersμ c (t),Ĝ c (s, t), andσ 2 kc from Equations (1) and (5). The smooth function X ic (t k ) andσ 2 kc of each discrete noisy observation (t ik , Y ikc ) were estimated by model Equation (1) via nonparametric kernel smoothing. Then the unbiased estimator of µ c (t) was easily obtained from the sample mean of X ic (t).
Once the estimatorμ c (t) was obtained, we computed the sample estimate of auto-covariance matrix bŷ The estimate of eigenfunctions were obtained by the corresponding spectral decomposition onĜ c (s, t). To be more specific,λ qc are eigenvalues ofĜ c , given by TĜ c (s, t)v lc (s)ds =λ lcvlc (t).
Andv lc is the eigenfunction corresponding toλ lc , satisfying Tv 2 lc (t)dt = 1 and Tv kcvlc (t)dt = 0 if k = l. The signs ofv lc were not uniquely determined. In order to ensure the closeness of v lc from two groups of c = 1, 2, we allowed the signs ofv lc to be chosen arbitrarily as long as <v l1 ,v l2 > ≥ 0 for l = 1, . . . , L. G c also presents an empirical version of the expansion (Equation 3) where I is the indicator function used to only keep the terms with positive eigenvalues. From the percentage of variation explained by the first few eigenfunctions, the first L largest eigenvaluesλ 1c , . . . ,λ Lc were chosen. The positive definiteness of the estimated auto-covariance matrixĜ c (s, t) was not always guaranteed, which might be a problem in practical applications. Onceλ lc andv lc were obtained, we checked whether or not λ lc > 0 (Müller, 2008). Ifλ lc was negative, then we dropped this negative eigenvalue and its corresponding eigenfunction, and reconstituted the estimate from remaining eigenvalues and eigenfunction estimates. Once eigenvaluesλ 1c ≥ . . . ≥λ Lc and orthonormal eigenfunctionsv 1 , . . . ,v L were obtained, the fitting of individual trajectories required estimation of functional principal component scores. By the discretization on the Equation (5), pluggingμ c andv lc into a Riemann sum approximation of the integral, we havê setting t 0 = 0 (Müller, 2008). We assured that n −1 ζ ilc = 0, n −1 ζ ilcζiwc = 0 for l = w; l, w = 1, . . . , L, and n −1 ζ 2 ilc = λ lc . This approximation method by sum worked well because our observations were collected densely and consistently for all subjects without missing values.

Nonparametric Kernel Smoothing
The nonparametric regression kernel smoothing was a traditional approach to capture the curve trends without making assumptions about the error distributions. The goal of smoothing was to model the underlying function by estimating X(t) = E(Y|t) from the original discrete measurement and removing the noisy observations caused by measurement errors. To define a kernel smoother, we need a bandwidth h and a kernel function K.
The Nadaraya-Watson Estimator (NW), a basic framework for kernel estimators (Nadaraya, 1964;Watson, 1964;Cai, 2001;Racine and Li, 2004;Bailey and Addison, 2010;Demir and Toktamiş, 2010;Kato, 2012;Simonoff, 2012), was defined by where K h (t) = 1/hK(t/h). The kernel function K(t) was a nonnegative symmetric real valued integrable function satisfying ∞ −∞ K(t)dt = 1, ∞ −∞ tK(t)dt = 0, and ∞ −∞ t 2 K(t)dt > 0. The Epanechnikov kernel K(t) = 3/4(1 − t 2 )I(|t| < 1) was used. The bandwidth h controled the number of points that neighbored each t i and hence determined the weight of each point contributing to the estimator. The choice of bandwidth was crucial in changing the result because it served as a smoothing parameter and determined the trade-off between the variance and bias of the resulting nonparametric regression estimates. Typically, smaller h decreases the bias but increases the estimation variance. We chose the optimal bandwidth that minimized the Generalized Cross Validation (GCV).
Once the smooth trajectory of each X i , i = 1, . . . , n was estimated from the NW nonparametric kernel smoother with the optimal bandwidth, we estimated the meanμ c (t) for each group directly from the sample mean, which was a consistent and unbiased estimator.

Hypothesis Tests
The main goal of this article was to determine whether FDA applied to fNIRS data would reveal significant differences in the hemodynamic function curves between the case and control groups as they processed syntax-related stimuli. We examined eight parasylvian brain regions: left and right inferior frontal cortex, the temporal parietal junction, inferior posterior parietal cortex, and superolateral temporal cortex. Statistically, we used formal hypothesis tests to judge the extent to which the distributions of the random functions X 1c , . . . , X nc differed for case and control groups. By way of the empirical Karhunen-Loeve decompositions Equation (4), we approximated the functions of X ic (t) as ilcvlc (t), c = 1, 2; i = 1, . . . , n. (9) As a result, the possible differences of the hemodynamic signals between the case and control group could be tested from the following three steps. The first test was whether or not significant differences existed for the overall mean trends between case and control group for each syntax type at each brain area of interest: If H 01 failed to be rejected, it would mean that the overall mean trends of hemodynamic curves were similar between the case and control groups. The second test was whether or not significant differences existed for the variation trends between case and control groups for each syntax type at each brain area of interest: If H 0,2l failed to be rejected, it would mean that the l th variation mode had similar trends between the case and control groups. The third test was whether or not significant differences existed for the variance of principal component scores for each syntax type at each brain area of interest: If H 0,3l failed to be rejected, it would mean that distribution of the l th principal component scores were similar between the case and control group.
The first two tests, H 01 and H 0,2l were challenging because they were based on high dimensional curves, and both the test statistics and the distribution were unknown. The most traditional approach involves judging the similarity of two curves by measuring how far the norm of the differences of the two vectors is away from zero. Define the following measures (Benko et al., 2009): The three null-hypotheses would be rejected respectively, if D 1 ≥ 1;1−α ; D 2,l ≥ 2,l;1−α ; D 3,l ≥ 3,l;1−α , where 1;1−α , 2,l;1−α , and 3,l;1−α denotes the α-level critical values of the distributions of We decided to use s as the primary test because Ds were equal to s under the null hypotheses and the values of Ds were shifted by the difference in the true means, eigenfunctions, and eigenvalues under the alternative hypotheses. However, because the true population mean, eigenvalues and eigenfunctions were unknown, above s can not be accessed directly. Therefore, we used the bootstrap sampling to determine the threshold (Benko et al., 2009) whereμ * 1 (t),v * l1 (t),λ * l1 (t), as well asμ * 2 (t),v * l2 (t),λ * l2 (t) were estimated from each independent bootstrap samples X * 11 (t), . . . , X * n1 (t) and X * 12 (t), . . . , X * n2 (t), respectively. We performed 1000 nonparametric bootstrap samples for both case and control group and we repeated the nonparametric kernel smoothing for each sample. Finally the 1 − α percentiles were used to determine the thresholds of the tests.

Real NIRS Data Analysis
Behaviorally, the case (SLI) group identified the agents of subjectverb-object and subject relative clause sentences as well as their age-matched, typically developing controls. However, the children in the case group were significantly less accurate than the children in the control group on the passive and object relative clause sentences.
The goal of statistical modeling was to determine whether there were significant differences in the hemodynamic trends between the case and control groups. Additionally, we speculated which brain regions were associated with children's syntax comprehension ability from the significant group differences. For each hemodynamic category ( HbO and HbD), we performed 32 tests to consider all combinations of four different syntax types and eight different brain regions.
Using the FDA approaches described in Sections 2.5 and 2.6, we first estimated the mean functionμ c (t), eigenfunctionsv lc (t), and eigenvaluesλ lc for each group, with c = 1 corresponding to case group and c = 2 for control group. During the analysis, we kept the first two eigenfunctions (i.e., L = 2) because they explained 90% of the overall variations, and the remaining eigenfunctions explained only a very small percentage of the variations.
With respect to potential group differences in mean trends of HbO, H 01 was rejected at the significance level of 0.1 at two brain regions: right inferior frontal cortex brain region for subject-verb-object, subject relative clause, and object relative clause sentences, and at the left inferior posterior parietal cortex brain region for object relative clause and passive sentences. Therefore, we concluded that the right inferior frontal cortex and left inferior posterior parietal cortex were associated with the control group's syntax comprehension processing ability to a greater extent than the SLI group's. Figure 5 displays the estimated mean trajectoriesμ c (t) of HbO in these two brain regions with corresponding significant syntax types. A close inspection of Figure 5 reveals that the mean trajectories of case and control groups have different dynamic trends (different shape and magnitude) for each syntax type, with opposite fluctuate oscillations at some time segments but similar directions at other time segments. The mean trajectories of the control group were always above those of the case group in these two brain regions. In the right inferior frontal cortex brain region, the mean oxygenated hemodynamic trajectories of the control group were always above zero, while those of the case group were below zero. In the left inferior posterior partietal cortex, the mean oxygenated hemodynamic trajectories of both case and control groups were below zero.
The hypothesis test H 01 for HbD was rejected at the right inferior posterior parietal cortex (at 0.05 significance level) and the left temporal parietal junction (at 0.1 significance level) for all four syntax types. We concluded that there were significant differences (related to both shape and magnitude) in the mean trajectories of HbD at these two brain regions between case and control group, and these two brain regions were also associated with children's syntax comprehension ability. A close inspection of Figure 6 reveals that the mean trajectories of HbD for the control group mainly fluctuate around zero but those of the case group are around −0.2 for all the eight scenarios. Using the same range of y-axis as the oxygenated hemodynamic trajectories of HbO in Figure 5, the overall mean trends of the deoxygenated hemodynamic trajectories HbD were very flat, especially those of the case group. So, we decreased the range of the y-axis in Figure 6 to the half of that of Figure 5 so that the significant oscillations were more apparent.
None of the 32 hypothesis tests related to the variation trends (H 0,2l , l = 1, 2) could be rejected for either HbO or HbD at any of the eight brain regions or for any of the four syntax type types. Thus, there were no significant differences in the eigenfunction (i.e., variation trends) in HbD and HbD between the case and control groups. Hypothesis test H 0,3l , l = 1, 2, related to the eigenvalues, was rejected at a few brain regions and syntax types. It indicated that the percentages of variation explained by the first two eigenfunctions (i.e., the distributions of the first two principle component scores) were significantly different between case and control groups. Table 1 summarizes the details of percentage of variation for all significant brain regions and syntax types. Among all these significant differences, the left inferior posterior parietal cortex brain region for HbD achieved the maximum for all four syntax types, with the first eigenfunction of the case group (v 11 (t)) explaining 96−98% of the total variation of the case group vs. 59−70% of the total variation of the control group (v 12 (t)). Similarly, the second eigenfunction (v 21 (t)) explained 0.7−1.0% of the total variation of the case group vs. 15−34% of the total variation of the control group (v 22 (t)). Additionally, we also noticed that the superolateral temporal cortex brain regions for HbO showed opposite directions in the percentage of variation explained by the first two eigenfuncitons as compared to other brain regions. Specifically, the first eigenfunction of the case group (v 11 (t)) explained a greater percentage of total variation than the first eigenfunction of the control group (v 12 (t)) for almost all scenarios, except the HbO at left superolateral temporal cortex for passive and subject-verb-objects sentences, and right superolateral temporal cortex for object relative clause sentences. Also, we observed that the second eigenfunction of the case group (v 21 (t)) explained a much smaller percentage of total variation than the second eigenfunction of the control group (v 22 (t)) for almost all scenarios with the exception of the HbO at left superolateral temporal cortex for subject-verbobject sentences and right superolateral temporal cortex for object relative clause sentences.

DISCUSSION
The primary goal of this article was to determine whether significant group differences in the hemodynamic trajectories existed for two groups with known language differences. To achieve this goal, we designed a syntax comprehension task in which 15 children with SLI and 15 age-matched, typicallydeveloping controls pointed to pictures representing the agent (actor) after hearing four types of sentences (subject-verb-object sentences, subject relative clause sentences, passive sentences, and object relative clause sentences). We administered the 60 questions in a pseudo-random order to 30 participants during the NIRS data collection. We performed three formal hypothesis tests to formally assess the group differences between the case and control groups, and determined the threshold by a bootstrap approach for high dimensional objects when both test statistics and distributions were unknown (Benko et al., 2009). The FDA approach is different from the widly used traditional approaches in existing NIRS literature (e.g., GLM and t-test). In functional data analyis, the modeling is performed in the functional sense that treats the entire curve as the modeling target and fully utilizes the superior temporal resolution of fNIRS data. But GLM extracts multivariate discrete points and does not utilize the dynamic trajectories of the fNIRS curve. As a nonparametric data-driven approach, FDA does not assume any linear structure or normality distribution such as that within the GLM model (Shimada and Hiraki, 2006;Koh et al., 2007;Abdelnour and Huppert, 2009;Custo et al., 2010;Penny et al., 2011;Tak and Ye, 2014). Unlike simple t-tests (Aldrich et al., 1994;Germon et al., 1994Germon et al., , 1999Young et al., 2000;Hoshi et al., 2001;Isobe et al., 2001;Kennan et al., 2002;Schroeter et al., 2002;Hoshi, 2003;Matsuo et al., 2003;Tachtsidis et al., 2004;Tsujimoto et al., 2004;Shibuya-Tayoshi et al., 2007;Kim et al., 2010), FDA tests the trajectory differences of two entire curves for two groups and captures not only the differences in magnitude but also in shape. Thus, our approach was inclusive of all observed stimulus-relevant data information and was not restricted to the magnitudee differences as t-test does.
We successfully detected significant group differences in the oxygenated hemodynamic mean trends in two brain regions, right inferior frontal cortex and left inferior posterior parietal cortex. The mean oxygenated hemodynamic trajectories between case and control groups showed different trends (different shape and magnitude) in these two brain regions, with some segments showing opposite fluctuating oscillations but other segments having similar directions. In the right inferior frontal cortex, the mean oxygenated hemodynamic trajectories of the control group were always above zero, while those of the case group were below zero. In the left inferior posterior partietal cortex, the mean oxygenated hemodynamic trajectories of both case and control groups were below zero. We also detected significant group differences in deoxygenated hemodynamic mean trends in the right inferior posterior partietal cortex and left temporal parietal junction. The mean deoxygenated hemodynamic trajectories of the control group mainly fluctuated around the zero line while that of case group were all below −0.2. Some of these significant findings from our quantitative functional NIRS analysis were consistent with the results of a few other studies that had dramatically different approaches, experiments, data sets, and foci. For example, the left inferior posterior parietal cortex (Angular Gyrus) brain region has been reported to be highly engaged in semantic processing during language comprehension (Geschwind, 1965;Joseph, 1982;Démonet et al., 1992;Vandenberghe et al., 1996;Vigneau et al., 2006;Houdé et al., 2010;Price, 2010), including some reports got by MRI (Binder et al., 2009;Seghier et al., 2010;Seghier, 2013). Further, differences between children with and without SLI in the extent of activation of this area has been noted in studies of listening to nonwords and words (Weismer et al., 2005). A number of MRI studies have noted group differences between children with SLI and their age-matched controls in the size of right hemisphere parasylvian areas (Plante et al., 1991).
There were no significant differences in the eigenfunctions, but the percentage of total variation explained by each eigenfunction significantly differenced in the left inferior posterior partietal cortex, left temporal parietal junction, and both left and right superior temporal cortex. The finding of significant group differences in the percentage of variation explained by the first two eigenfunctions may be of particular interest. Recall that the first two orthogonal eigenfunctions derived from the fNIRS high dimensional auto covariance matrix were likely related to the cognitive processes involved in performing our syntax comprehension task. The significant group differences in the percentage of total variation explained by the eigenfunctions may relate to group differences in information processing functions that have been associated with attention, semantic processing, and syntactic processing in the left inferior posterior parietal cortex, the left temporal parietal junction, and the left superolateral temporal cortex. Further research on larger samples of participants are needed to fully understand the meaning of these results.
In future work, we will compare the significant differences between left and right hemispheres. Unlike the comparisons between case and control groups, the left and right brain samples are not independent requiring a different approach. We will also explore more detailed functional properties in the rest periods. Although there are several hypothesis tests involved, we will leave the multiple correction for the future for a few reasons.
First, there are only 15 subjects within each group, which is much less than the dimension of the curves (length of 120 after preprocessing and length of 8521 before preprocessing). As a result, power is limited due to the difficulties of collecting children with SLI. Therefore, we do not want to diminish our findings due to a large number of multiple corrections. We believe that our methods will yield better results after the sample size is large enough and will investigate the multiple correction when we have an appropriate sample size. Second, the multiple tests involved here are not independent. Instead, they form close correlations, as HbO and HbD and the four syntax types are highly correlated. Therefore, many multiple correction approaches will not be appropriate and likely will mislead the results. For example, the test of equal mean hemodynamic trends between case and control (H 1,0 ) was rejected, whether we considered each of the syntax types (with 120 length) individually or we tested the stimuli of the four syntax types simultaneously (with 480 length). However, if we used multiple correction, say Bonferroni correction, then each syntax test would only have an α/4 significance level, which makes the individual syntax period impossible to be rejected given the current sample size. In that case, none of the individual syntax types would show significant differences, but the whole stimuli curve with four syntax types would be significant between case and control. This would result in conflicting conclusions.
In summary, this proof of concept study was conducted to explore a more advanced statistical analysis approach to the analysis of the time course of hemodynamic data collected with fNIRS. This approach enables us to compare which brain regions are significantly involved in syntax comprehension ability in the two groups. FDA strategies were used to decompose the high dimensional HbO and HbD time curves into mean curves and eigenfunctions to represent overall trends and variation structures (Ramsay and Silverman, 2002;Ferraty and Vieu, 2006;Ramsay, 2006;Barati et al., 2013). After detailed comparisons and hypothesis tests, we revealed greater brain activity for the control group than the case group for all four syntax types. In addition, different percentages of variation for the case and control groups were explained by the first two eigenfunctions, suggesting that the two groups used different cognitive processing strategies while performing the tasks. The approach of FDA proposed in this paper has promise as an analysis method that captures the overall mean trends and variation trends of hemoglobin concentration over time within and between groups without assuming any structure.

ETHICS STATEMENT
This study was approved by the Utah State University Institutional Review Board. All participants (adults and children) and the parents or guardians of all children signed consent forms that were approved by the IRB. The participant's name "Sue" was a pseudonym.

AUTHOR CONTRIBUTIONS
GF conceived the statistical modeling, programmed and performed the data analysis and figures, and wrote the first version of the manuscript; NW collected and processed the data; JB assisted with experimental design and programmed the task; JM and JE created the experimental task; RG conceived the research, helped design the experimental task, supervised all aspects of data collection and data processing, edited the manuscript, and wrote the experimental design and discussion sections.