
ORIGINAL RESEARCH article

Front. Psychiatry, 16 January 2026

Sec. Autism

Volume 16 - 2025 | https://doi.org/10.3389/fpsyt.2025.1703302

Cortical hemodynamic responses and deep learning models of emotional face processing in preschool children with autism spectrum disorder: a fNIRS study

Liping Qi1*†, Jing-Wen Ni2†, Guijun Dong3, Tao Sun1†, Jian-Wei Zhang1*†
  • 1School of Control Science and Engineering, Dalian University of Technology, Dalian, China
  • 2School of Computer Science, Dalian University of Technology, Dalian, China
  • 3Quzhou University, Quzhou, China

Purpose: The purpose of the present study was to characterize cortical hemodynamic responses during emotional face processing in preschool children with autism spectrum disorder (ASD) using functional near-infrared spectroscopy (fNIRS), and to develop machine learning frameworks for emotion recognition based on these hemodynamic signals.

Methods: Fifty-three preschoolers with ASD (41 males, 12 females; aged 3–7 years, mean age 5.20 ± 1.23 years) viewed dynamic video and static image stimuli displaying angry and happy facial expressions as well as neutral flowers, with their brain activity concurrently recorded using whole-brain fNIRS. A convolutional neural network-long short-term memory (CNN-LSTM) model was proposed to decode spatiotemporal neural patterns of angry/happy emotion recognition.

Results: fNIRS analysis revealed significantly enhanced activation in the bilateral dorsolateral prefrontal cortex (DLPFC) and frontal pole during dynamic versus static stimulus processing. Angry expressions elicited the most pronounced neural responses, engaging a distributed cortical network involving the DLPFC, ventrolateral prefrontal cortex, and primary visual areas. The CNN-LSTM architecture achieved 86.2% accuracy in dynamic angry/happy emotion classification.

Conclusion: This study provides evidence of altered cortical hemodynamics during dynamic emotional facial processing and demonstrates the feasibility of CNN-LSTM models for the objective assessment of emotional facial processing potential in preschool children with ASD.

Introduction

Autism spectrum disorder (ASD) is a complex neurodevelopmental condition characterized by deficits in social communication and restricted, repetitive behaviors (1). Globally, its rising prevalence presents significant challenges for individuals with ASD and their families (2, 3). A hallmark of ASD is persistent difficulty in interpreting nonverbal cues, particularly facial expressions, which undermines social interaction across developmental stages (4). While facial expressions are fundamental to emotional communication, individuals with ASD often struggle to recognize them as communicative signals, leading to impaired social engagement. Compared to neurotypical peers, children with ASD show lower emotion identification accuracy, compromising their social adaptability (5).

Notably, despite early childhood being a critical window for neuroplasticity (6), most research focuses on adolescent and adult ASD populations. The preschool period (3–7 years) represents a pivotal stage where targeted interventions may promote neural reorganization. This study addresses this gap by using functional near-infrared spectroscopy (fNIRS), a neuroimaging modality ideal for pediatric research. Unlike fMRI, fNIRS requires no prolonged immobility, a limitation that restricts fMRI use in young children, and offers superior temporal resolution (7). Its non-invasive, portable, and motion-tolerant design enables hemodynamic measurements in challenging populations, driving growing interest in using fNIRS to study neural mechanisms in young children with ASD (8, 9).

The experimental paradigm uses dynamic and static facial expressions to enhance ecological validity, addressing limitations of traditional static stimulus designs. Behavioral studies show dynamic stimuli improve emotion discrimination in ASD by enhancing biological motion perception, yielding higher recognition accuracy than static images (10). Most prior face recognition research in ASD has involved older individuals with verbal tasks (4), yet age moderates facial affect recognition, with the performance gap between ASD and neurotypical groups widening over development. In the present study, we used fNIRS to measure cortical responses to happy/angry facial expressions in 3–7-year-old preschool children with ASD.

Recent advances in artificial intelligence (AI) have facilitated computational emotion analysis (11), whereas brain imaging techniques enable noninvasive detection of neural signatures for emotion classification in individuals with ASD (12, 13). Neural activity offers higher specificity for emotional categories than behavioral measures, containing discriminative features for emotion recognition (14). The proliferation of deep learning (DL) algorithms has rendered automatic feature learning via backpropagation increasingly viable, streamlining workflows and mitigating computational overhead relative to conventional machine learning (ML) paradigms. Eastmond et al. reported in a systematic review that 26 out of 32 DL-driven approaches outperformed traditional ML models in fNIRS research (15). Within the DL landscape, convolutional neural networks (CNNs), recurrent neural networks (RNNs), and long short-term memory networks (LSTMs) are extensively adopted for fNIRS-based classification, each tailored to distinct signal characteristics (15). LSTMs are well characterized for capturing temporal variability in sequential data, which is critical for fNIRS, as hemodynamic responses exhibit inherent time-varying dynamics. In contrast, CNNs excel at extracting local spatial features from multi-channel fNIRS data but lack the capacity to model long-range inter-regional interactions or temporal dependencies (16). In the present study, we proposed a cascaded CNN-LSTM architecture that accommodates the spatio-temporal nature of fNIRS data. Therefore, the purposes of the present study were to: (1) compare neural activation patterns between dynamic and static facial expressions using fNIRS hemodynamics; (2) identify emotion-specific neural signatures in preschoolers with ASD; and (3) develop a deep learning framework for emotion recognition from fNIRS data. By identifying neural correlates of facial emotion processing deficits and exploring the integration of fNIRS and machine learning, this study aims to advance objective assessment of emotion recognition for early ASD intervention, an unmet need in clinical psychiatry, where subjective evaluation tools remain the dominant approach. The objective assessment tools developed herein may facilitate early identification of emotion processing impairments in ASD, thereby enabling timely and tailored intervention strategies to improve long-term outcomes for affected children.

Methods

Participants

A total of 53 children with autism spectrum disorder (41 males, 12 females; aged 3–7 years, mean age 5.20 ± 1.23 years) were recruited from Lixin Minkang Hospital, Shandong Province, China. Age distribution is presented in Figure 1. ASD diagnosis was confirmed using the Autism Diagnostic Observation Schedule (ADOS) and expert clinical assessment. Participants were excluded if they had a history of neurological or neurodevelopmental disorders other than ASD, or an IQ below 65. Inclusion criteria required no prior antipsychotic medication use and typical intellectual functioning as assessed by the Chinese Wechsler Intelligence Scale for Children. All participants underwent resting-state and emotional face processing tasks. The study was approved by the hospital research ethics board and university ethics committee (ID: 2022027). Informed assent was obtained from the children, and written informed consent was obtained from their parents.

Figure 1
Pie chart illustrating the distribution of years and percentages: 5 years at 40%, 6 years at 23%, 7 years at 17%, 4 years at 13%, and 3 years at 7%.

Figure 1. The age distribution of 53 preschoolers with ASD in the emotion recognition test.


Procedure

Stimuli consisted of grayscale images of happy/angry faces (8 males, 8 females) and flowers (control stimuli), presented in static or dynamic form (Figure 2). Happy and angry faces were selected because they are the two most commonly experienced emotions in young children, making them widely used in child-focused studies (17, 18). Faces were selected from the Chinese Affective Picture System (19), which includes validated expressions with >80% recognition accuracy. Images were cropped using an oval mask to remove hair, ears, and shoulders, and set against a uniform light grey background. Control stimuli included static flower images and dynamic blooming sequences. Dynamic stimuli were generated using WinMorph software, morphing faces from neutral to smiling expressions over 10 frames. These were compiled into 480-ms videos at 50 fps, with the final frame repeated for 9 frames (24 frames total). Static stimuli displayed only the final frame for 480 ms. Stimuli were presented in an ABXBAX alternating block design (A=dynamic, B=static, X=baseline), with each condition comprising four 13.5-s blocks (20). Each block included 8 stimuli (1,500 ms inter-stimulus interval), preceded by a 1-s white fixation cross. A star probe (1,500 ms) appeared randomly within blocks, prompting participants to press a key for attention maintenance. Baseline blocks (13.5 s, fixation cross) were interspersed, with 16-s fixation periods at the run’s start and end.

Figure 2
Diagram depicting a timing sequence for displaying various static and dynamic images. Two rows are shown. The top row includes images labeled “Happy” and “Angry,” with static and dynamic variations lasting 13.5 seconds each, interspersed with dynamic image of “Flower.” The bottom row features similar static and dynamic labels with a 270-second cycle, transitioning every 480 milliseconds. Faces expressing different emotions are depicted beside each sequence.

Figure 2. Emotional facial processing task. Dynamic (10-frame videos) and static (single-image final frames) emotional expressions (happy/angry) were presented for 480 ms each. The task used an ABXBAX randomized block design (A=dynamic, B=static, X=baseline) with four 13.5-second blocks per condition.

fNIRS signals were recorded using a multichannel system (Nirsmart; Huichuang Medical Equipment Co., Ltd., Beijing, China) during emotional face processing tasks. Two wavelengths (730 and 850 nm) were used. Thirty-eight channels were configured with 20 light emitters and 16 detectors, set at a 3-cm inter-probe distance, as shown in Figure 3. A customized probe cap was designed to maintain sensor positioning, guided by the international 10–20 system: anterior sensors were placed around FP1/FP2, posterior sensors near PO7/PO8, left-lateral sensors around T3, and right-lateral sensors around T4. During probe placement, the acquisition software provided real-time signal quality evaluation and visualization for each channel. Regions of interest (ROIs) included the frontopolar area (FPA), dorsolateral prefrontal cortex (DLPFC), ventrolateral prefrontal cortex (VLPFC), pre-motor/supplementary motor cortex (PM&SMA), primary somatosensory cortex (S1), and primary visual cortex (V1) in both hemispheres. The sampling frequency was 10 Hz. To enhance compliance in children with ASD, each participant underwent a one-month pre-adaptation protocol: wearing a similar probe cap for 10 minutes daily prior to the study.

Figure 3
Diagram of a human brain with three transparent views displaying numbered regions in different colors. Each color corresponds to a specific brain area: red for the frontal pole area (FPA), yellow for the dorsolateral prefrontal cortex (DLPFC), green for the ventrolateral prefrontal cortex (VLPFC), cyan for the premotor and supplementary motor area (PM&SMA), blue for the primary somatosensory cortex (S1), and pink for the primary visual cortex (V1). A legend at the bottom associates colors with their respective regions.

Figure 3. The configuration of fNIRS channels used in the present study. Twenty emitters and 16 detectors were arranged at an inter-probe distance of 3 cm, resulting in 38 channels per set. Estimated fNIRS channel locations are shown in MNI space. Six regions of interest in the prefrontal cortex and motor cortex are indicated with colors and channel numbers. FPA, frontopolar area; DLPFC, dorsolateral prefrontal cortex; VLPFC, ventrolateral prefrontal cortex; PM&SMA, pre-motor/supplementary motor cortex; S1, primary somatosensory cortex; V1, primary visual cortex.

fNIRS signal processing

Optical density signals were converted into hemoglobin concentration changes using the modified Beer-Lambert law, with the differential pathlength factor adjusted based on participant age to account for age-dependent cerebral tissue properties. The SPM-fNIRS toolbox was used to analyze the fNIRS data (21, 22). Motion artifacts, a common challenge in fNIRS data due to head movements, were mitigated using a moving window and spline interpolation method. The moving window length was set to 1 second, the threshold factor to 3, and the smoothing factor to 5. Physiological noise such as cardiac pulsation and respiration was reduced using band-stop filters targeting 0.12–0.35 Hz and 0.7–2.0 Hz. Additionally, a discrete cosine transform-based high-pass filter suppressed very low-frequency drifts below 0.01 Hz, such as blood pressure fluctuations. Temporal autocorrelation in hemodynamic signals was addressed via pre-whitening, modeled as a first-order autoregressive process combined with white noise.
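For orientation, the sketch below illustrates the order and frequency bands of these filtering steps. It is only an approximation of the pipeline, assuming SciPy and a 10-Hz sampling rate; it uses zero-phase Butterworth filters in place of the SPM-fNIRS toolbox routines (including a Butterworth high-pass rather than the DCT-based drift removal), and it omits motion-artifact correction and pre-whitening.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 10.0  # fNIRS sampling rate (Hz)

def bandstop(data, low, high, fs=FS, order=4):
    """Zero-phase Butterworth band-stop filter applied along the time axis."""
    nyq = 0.5 * fs
    b, a = butter(order, [low / nyq, high / nyq], btype="bandstop")
    return filtfilt(b, a, data, axis=-1)

def highpass(data, cutoff=0.01, fs=FS, order=2):
    """High-pass filter to suppress very low-frequency drifts (< 0.01 Hz)."""
    nyq = 0.5 * fs
    b, a = butter(order, cutoff / nyq, btype="highpass")
    return filtfilt(b, a, data, axis=-1)

# hbo: (n_channels, n_samples) oxygenated-hemoglobin time series (placeholder data here)
hbo = np.random.randn(38, 4050)
hbo = bandstop(hbo, 0.12, 0.35)   # respiration band
hbo = bandstop(hbo, 0.7, 2.0)     # cardiac band
hbo = highpass(hbo)               # slow drifts below 0.01 Hz
```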

First-level analyses employed a general linear model (GLM) to fit hemodynamic responses, contrasting task periods with baseline intervals (fixation cross presentation). This GLM estimated channel-specific hemodynamic response parameters, deriving participant-specific effects of interest from channel-wise signal dynamics. Statistical parametric mapping (SPM) was used to generate individual-level brain maps highlighting task-induced hemodynamic changes, using the canonical hemodynamic response function (HRF) implemented in SPM. Group-level analysis entered these maps into a two-way ANOVA with a 3×2 factorial design, with face category (anger, happiness, flowers) and stimulus mode (dynamic, static) as factors. Family-wise error rate (FWER) correction was applied through permutation testing with 10,000 iterations, and statistical significance was defined as p < 0.05.
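As a reading aid, the following sketch shows how a single-condition regressor can be built by convolving the block boxcar with a canonical double-gamma HRF and fit by ordinary least squares. It is a simplification of the actual first-level analysis (which used the SPM-fNIRS toolbox with pre-whitening); the HRF shape parameters are the standard SPM-style defaults, and the block onsets and the channel time series are placeholders, not values from the study.

```python
import numpy as np
from scipy.stats import gamma

FS = 10.0  # sampling rate (Hz)

def canonical_hrf(fs=FS, duration=30.0):
    """Simplified double-gamma canonical HRF (SPM-style shape parameters 6 and 16)."""
    t = np.arange(0, duration, 1.0 / fs)
    hrf = gamma.pdf(t, 6) - gamma.pdf(t, 16) / 6.0
    return hrf / hrf.max()

def design_matrix(block_onsets_s, block_len_s, n_samples, fs=FS):
    """Boxcar regressor for one condition convolved with the HRF, plus an intercept."""
    boxcar = np.zeros(n_samples)
    for onset in block_onsets_s:
        start = int(onset * fs)
        boxcar[start:start + int(block_len_s * fs)] = 1.0
    regressor = np.convolve(boxcar, canonical_hrf(fs))[:n_samples]
    return np.column_stack([regressor, np.ones(n_samples)])

def fit_channel(y, X):
    """OLS beta estimates for one channel (pre-whitening omitted in this sketch)."""
    betas, *_ = np.linalg.lstsq(X, y, rcond=None)
    return betas

# Example: four 13.5-s blocks of one condition within a 270-s run (hypothetical onsets).
X = design_matrix([16.0, 70.0, 124.0, 178.0], 13.5, n_samples=2700)
y = np.random.randn(2700)   # placeholder channel time series
print(fit_channel(y, X))
```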

Emotion recognition modeling

To optimize deep learning model performance, fNIRS signals were scaled to the [0, 1] range using min-max normalization. Physiological signal outliers were removed using a 3-standard-deviation threshold. Processed data from each participant were categorized into six task conditions based on emotional type (anger, happiness, neutral) and presentation mode (dynamic, static): dynamic anger, static anger, dynamic happiness, static happiness, dynamic neutral, and static neutral. Each condition included three 13.5-second blocks, each yielding a task matrix of 38 × 405 (number of fNIRS channels × time steps). Contrast-based differences were then calculated for four pairs: dynamic anger vs. dynamic neutral, dynamic happiness vs. dynamic neutral, static anger vs. static neutral, and static happiness vs. static neutral. These contrast-based differences served as the training samples for the classifiers, enhancing the model’s discriminative power by highlighting discrepancies in neural responses between emotional and neutral (baseline) states. Classification tasks for dynamic emotions and static emotions (anger vs. happiness) were then trained separately. Labels were assigned based on whether the data groups carried “anger” or “happiness” annotations (0=anger, 1=happiness), and a random training/test set split (4:1) was implemented.
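A minimal sketch of this data-preparation step is given below, assuming NumPy and scikit-learn. The block counts and the random arrays are placeholders standing in for real recordings, and the 3-standard-deviation outlier step is omitted for brevity.

```python
import numpy as np
from sklearn.model_selection import train_test_split

def minmax(x):
    """Scale a block to the [0, 1] range (min-max normalization)."""
    return (x - x.min()) / (x.max() - x.min() + 1e-8)

def contrasts(emotion_blocks, neutral_blocks):
    """Contrast-based differences between emotional blocks and neutral-condition blocks."""
    return [minmax(e) - minmax(n) for e, n in zip(emotion_blocks, neutral_blocks)]

# Placeholder blocks: three blocks per condition, each 38 channels x 405 time steps.
rng = np.random.default_rng(0)
dyn_anger   = [rng.standard_normal((38, 405)) for _ in range(3)]
dyn_happy   = [rng.standard_normal((38, 405)) for _ in range(3)]
dyn_neutral = [rng.standard_normal((38, 405)) for _ in range(3)]

X = np.array(contrasts(dyn_anger, dyn_neutral) + contrasts(dyn_happy, dyn_neutral))
y = np.array([0] * 3 + [1] * 3)   # 0 = anger, 1 = happiness

# 4:1 train/test split with a fixed random seed, as described above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)
```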

A CNN-LSTM architecture was developed to decode emotional states from fNIRS contrast signals. fNIRS records cerebral blood oxygenation changes via multiple channels placed on the scalp, resulting in data that contain both spatial and temporal information. Specifically, the proposed model first extracts spatial features through CNN layers and then employs LSTM layers to model temporal dynamics, forming a hierarchical feature learning framework. The architecture includes an input layer, a Conv1D layer, a sigmoid activation layer, a global max pooling layer, a dropout layer with a small dropout probability, a bidirectional LSTM layer, a ReLU activation layer, and a fully connected classification layer, as shown in Figure 4. The CNN component captures local spatial patterns from multi-channel signals, such as synergistic activation features across brain regions. The model converts the input data matrix (time steps, number of channels) into a (number of channels, time steps) matrix, enabling 1D-CNN kernels to slide along the channel dimension. This transformation focuses convolution operations on spatial correlations between channels rather than the temporal dimension, facilitating better learning of fNIRS spatial characteristics. 1×1 convolution kernels independently extract features from each channel while preserving the spatial positional relationships between channels. Weight sharing is then used to learn collaborative patterns of adjacent channels, endowing the model with local spatial feature extraction capability. The LSTM component deciphers the dynamic evolution of hemodynamic responses over time, such as delayed neural activity under dynamic facial stimuli. The Adam algorithm was selected as the optimizer, and the learning rate was adjusted according to the complexity of the task. To handle the high-dimensional, sparse characteristics of fNIRS signals, global pooling was used to avoid excessive compression of key information. Additionally, bidirectional LSTM layers were introduced to strengthen the capture of contextual temporal correlations, which is crucial for analyzing asymmetric brain-region activation patterns under dynamic stimuli. The structural parameters of the CNN-LSTM model are shown in Table 1.
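A minimal Keras sketch of such a cascaded architecture is shown below, following the layer order in Figure 4 (Conv1D → pooling → dropout → bidirectional LSTM → fully connected). The filter count, kernel size, pooling size, dropout rate, LSTM units, and learning rate are illustrative assumptions, not the authors’ settings; the actual structural parameters are listed in Table 1.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

N_STEPS, N_CHANNELS = 405, 38   # time steps x fNIRS channels, as described above

def build_cnn_lstm():
    """Illustrative CNN-LSTM for binary anger/happiness classification."""
    model = models.Sequential([
        layers.Input(shape=(N_STEPS, N_CHANNELS)),
        layers.Conv1D(32, kernel_size=1, activation="sigmoid"),  # 1x1 kernels mix channels at each step
        layers.MaxPooling1D(pool_size=3),                        # temporal downsampling
        layers.Dropout(0.1),                                     # small dropout probability
        layers.Bidirectional(layers.LSTM(64)),                   # forward/backward temporal context
        layers.Dense(32, activation="relu"),
        layers.Dense(1, activation="sigmoid"),                   # 0 = anger, 1 = happiness
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_cnn_lstm()
model.summary()
```

In this arrangement the 1×1 convolution mixes information across channels at every time step while the bidirectional LSTM runs over the time axis; the authors’ exact input transposition and layer sizes should be taken from Table 1 rather than from this sketch.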

Figure 4
Diagram depicting a neural network architecture. It includes an input layer with multiple channels over time, followed by a Conv1D layer, MaxPool1D layer, Dropout layer, BiLSTM layer, and a fully connected (FC) layer. The final step leads to the output. Arrows indicate data flow through the layers.

Figure 4. Framework of the proposed CNN-LSTM model.

Table 1

Table 1. Structural parameters of the CNN-LSTM hybrid model.

The data were partitioned into training and test sets at a 4:1 ratio, with a fixed random seed to ensure consistency. In the training stage, five-fold cross-validation was used to evaluate the generalization performance of the model. The model parameters that performed best on the validation set were saved in each fold, and the final test results were averaged across the five folds to reduce the influence of random fluctuations.
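A sketch of such a five-fold evaluation loop with a fixed seed is shown below, reusing the hypothetical build_cnn_lstm() from the previous listing. The epoch count and batch size are assumptions for illustration; early stopping and the per-model settings reported in the Results are not reproduced here.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

def cross_validate(build_fn, X, y, n_splits=5, seed=42):
    """Five-fold cross-validation with a fixed seed; returns mean and per-fold accuracy."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    fold_acc = []
    for train_idx, val_idx in skf.split(X, y):
        model = build_fn()                       # fresh model for every fold
        model.fit(X[train_idx], y[train_idx],
                  validation_data=(X[val_idx], y[val_idx]),
                  epochs=50, batch_size=4, verbose=0)
        _, acc = model.evaluate(X[val_idx], y[val_idx], verbose=0)
        fold_acc.append(acc)
    return float(np.mean(fold_acc)), fold_acc

# Usage (requires at least n_splits samples per class):
# mean_acc, per_fold = cross_validate(build_cnn_lstm, X_train, y_train)
```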

Results

A significant main effect of stimulus modality (dynamic vs. static) on cortical activation patterns was observed (F = 7.73, p < 0.05). Simple main effect analysis showed that dynamic presentations elicited significantly higher cortical activation in ASD preschoolers. Bilateral DLPFC activation was significantly enhanced during dynamic angry face processing relative to static conditions (Figure 5A, dynamic > static, p < 0.05, FWER-corrected for multiple comparisons). The right DLPFC showed higher activation during dynamic happy face processing compared to static stimuli (Figure 5B, dynamic > static, p < 0.05, FWER-corrected). Significantly increased activation in the FPA was observed during dynamic neutral flower presentation versus static images (Figure 5C).

Figure 5
Brain imaging results show areas of activation comparing dynamic versus static stimuli. Panel a shows increased activity in the DLPFC for angry expressions, panel b for happy expressions, and panel c for flowers. Color scale indicates T values from seven to fifteen or nine.

Figure 5. Cortical activation patterns in ASD children during dynamic vs. static emotional face processing (dynamic > static, p < 0.05, FWER-corrected for multiple comparisons). (A) Dynamic angry faces induced significantly enhanced bilateral dorsolateral prefrontal cortex (DLPFC) activation compared with static angry faces. (B) Dynamic happy faces elicited stronger right DLPFC activation than static happy faces. (C) Dynamic neutral flower stimuli elicited increased right frontal pole area (FPA) activation compared to static flower images.

The two-way ANOVA also revealed a significant effect of emotional face category (angry, happy, and neutral) on cortical activation patterns (F = 4.65, p < 0.05). Cortical activation patterns during emotional face processing are shown in Figure 6. Simple main effects analysis showed that, in the static stimulus conditions, children with ASD showed significantly higher activation in the right DLPFC during static angry face processing compared to static flower images (Figure 6A). Static angry faces elicited higher right DLPFC activation than static happy faces (Figure 6B). Static happy faces also activated the right DLPFC relative to static neutral flowers (Figure 6C). In the dynamic stimulus conditions, children with ASD showed significantly enhanced activation in the bilateral FPA, DLPFC, VLPFC, right inferior parietal lobule (IPL), and left V1 during dynamic angry face processing compared to dynamic flower videos (Figure 6D). Dynamic angry face processing elicited significantly stronger bilateral DLPFC activation than dynamic happy faces (Figure 6E). No significant differences in cortical activation were observed between dynamic happy faces and neutral flowers (Figure 6F).

Figure 6
Brain scan images comparing neural activity under static and dynamic conditions, focusing on responses to Angry versus Flowers and Angry versus Happy expressions. Highlighted regions include the dorsolateral prefrontal cortex (DLPFC), frontal pole area (FPA), ventrolateral prefrontal cortex (VLPFC), and inferior parietal lobule (IPL), marked with varying T values demonstrating activation intensity. Each sub-image presents left, top, and right views of the brain. Color scales indicate T value intensity, with a range of seven to twelve or three to ten depending on the condition.

Figure 6. Cortical hemodynamic responses in ASD children to static/dynamic facial stimuli (p < 0.05, FWER-corrected for multiple comparisons). (A) Static angry faces induced significantly stronger right DLPFC activation than static flower images. (B) Static angry faces showed significantly higher activation in the right DLPFC compared to static happy faces. (C) Static happy faces also activated the right DLPFC, though with weaker activation than static angry faces. (D) Dynamic angry face processing engaged a distributed set of regions of interest, including the bilateral frontopolar area (FPA), DLPFC, ventrolateral prefrontal cortex (VLPFC), right inferior parietal lobule (IPL), and left primary visual cortex (V1). (E) Dynamic angry faces elicited stronger bilateral DLPFC activation than dynamic happy faces. (F) No significant cortical activation differences were observed between dynamic happy faces and dynamic neutral flowers.

Model classification

The training performance of each dataset on different machine learning algorithms, including Naive Bayes, support vector machine (SVM), and the deep learning models, is shown in Table 2. The maximum number of training epochs for each model was 500, with an early-stopping mechanism applied after 20–30 consecutive epochs without improvement on the validation set. The model parameters with the highest accuracy on the validation set were saved and evaluated on the test set. The final classification performance of each model was calculated as the average accuracy across all test sets. The evaluation metrics included accuracy, recall, precision, and F1-score. For the proposed CNN-LSTM model, classification of dynamic facial expression stimuli achieved higher accuracy than static stimuli, with the highest accuracy reaching 86.2%.
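For reference, a sketch of this evaluation step might look as follows. The early-stopping patience (25 epochs) and the 0.5 decision threshold are assumptions chosen within the ranges stated above, and the metric computations use scikit-learn rather than the authors’ own scripts.

```python
import tensorflow as tf
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Early stopping: up to 500 epochs, halting after ~25 epochs without validation improvement.
early_stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=25,
                                               restore_best_weights=True)

def evaluate_binary(model, X_test, y_test, threshold=0.5):
    """Accuracy, precision, recall, and F1-score for the anger/happiness classifier."""
    y_pred = (model.predict(X_test, verbose=0) > threshold).astype(int).ravel()
    return {
        "accuracy":  accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "recall":    recall_score(y_test, y_pred),
        "f1":        f1_score(y_test, y_pred),
    }
```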

Table 2

Table 2. Comparison of anger-happiness classification performance across models.

Discussion

The aims of this study were to characterize cortical hemodynamic responses during emotional facial processing in preschool children with ASD using fNIRS and to develop a deep learning framework for automatic emotion recognition based on hemodynamic signals. The main findings of the present study were: 1) fNIRS analysis revealed that dynamic facial stimuli elicited significantly greater activation in the prefrontal cortex compared to static stimuli; 2) angry expressions elicited stronger neural responses than happy expressions; and 3) the proposed CNN-LSTM model achieved the highest accuracy (86.2%) in dynamic emotion binary classification, significantly outperforming traditional machine learning methods.

In the present study, dynamic stimuli elicited significantly higher activation in the DLPFC or FPA than static stimuli. This modality effect aligns with previous findings that dynamic facial cues promote more integrated emotional processing in ASD across childhood and adulthood (17). In the field of neuroimaging, numerous studies have revealed different response patterns to dynamic and static facial emotional stimuli, with dynamic stimuli inducing stronger functional activity in the prefrontal and visual cortices (23). Given that facial expressions are typically dynamic in natural settings, these findings suggest that dynamic representations may be more realistic and more effective in stimulating neuronal activity. The present findings extend prior fMRI research by providing fNIRS evidence of this dynamic-stimulus advantage in preschool-aged children with ASD.

Across both static and dynamic conditions, DLPFC activation was significantly higher for angry versus happy faces. This indicates that young children with ASD exhibit more pronounced hemodynamic responses to anger, aligning with prior research showing heightened neural sensitivity to negative expressions in ASD populations (24–26). Vandewouw et al. (17) used fMRI to demonstrate that, compared with neutral flower stimuli, adolescents with ASD showed enhanced frontal and occipital activation during dynamic processing of angry/happy faces, consistent with our fNIRS findings. The DLPFC mediates top-down inhibition of negative emotional responses by regulating amygdala activity (27). For anger, this heightened activation may reflect the emotion’s strong salience, which preferentially captures attentional resources and engages distributed neural networks to process perceived social threats in children with ASD (28, 29).

Under dynamic emotional stimuli, viewing dynamic angry expressions compared with neutral flower videos induced significant activation in multiple brain regions in children with ASD, including the bilateral FPA, DLPFC, VLPFC, right IPL, and left V1 (Figure 6D). Studies have found that the FPA is more activated by facial than by non-facial stimuli, indicating that the FPA is an important part of the facial emotion processing network, participating in emotion generation and regulation (30). Previous research has shown that the FPA plays a key role in tracking dynamically changing emotions (31), which is consistent with the FPA activation observed in this study under dynamic stimuli. Leung et al. (32) used magnetoencephalography to show that participants with ASD exhibited higher activation levels in the inferior frontal gyrus (including the FPA), anterior cingulate cortex, supramarginal and angular gyri, and superior and middle frontal regions when processing angry faces compared with neutral faces. The IPL, part of the attention network, may reflect visuospatial processing in children with ASD (30). Left occipital V1 activation suggests enhanced visual processing of salient dynamic angry stimuli, consistent with greater neural responses to negative versus positive expressions (33).

Previous studies using an Xception CNN-based facial emotion recognition system for high-functioning adults with ASD reported a training accuracy of 71% (34). The CNN-LSTM model proposed in this study demonstrated advantages in emotion classification tasks, achieving an accuracy of 86.2% in dynamic facial expression classification (anger vs. happiness). LSTMs are well characterized in the literature for their proficiency in capturing temporal variability within sequential data, a critical capability for fNIRS analysis, as hemodynamic responses to sensory stimuli inherently exhibit time-varying dynamics (15). Specifically, the bidirectional LSTM layers employed in the present study capture temporal dependencies in both directions, reflecting differences in hemodynamic changes during emotional responses. During training, the CNN-LSTM model adopts a phased parameter optimization strategy: the CNN component prioritizes learning spatially invariant features, and the LSTM component focuses on temporal pattern mining. The cascaded CNN-LSTM architecture preserves the spatial feature extraction capability of convolutional operations while leveraging recurrent networks to model long-range temporal dependencies. Moreover, compared to pure LSTM networks, which are sensitive to noise and may misclassify noise as temporal features, the local aggregation and max-pooling mechanisms of the 1D-CNN can filter random noise in fNIRS signals, improving feature robustness.

The combination of CNN and LSTM significantly enhances classification robustness in complex emotional states, aligning with neuroimaging evidence that dynamic stimuli elicit more extensive cortical activation networks in ASD children. This convergence between model performance and neural mechanisms provides empirical support for using dynamic stimuli to optimize feature extraction in neuroimaging-based emotion recognition. Notably, the model’s ability to distinguish between angry and happy expressions, particularly under dynamic stimulus conditions, corresponds to our observation of enhanced activation in the prefrontal-parietal-visual brain regions in response to dynamic angry faces. This linkage between “classification performance” and “neural activation patterns” facilitates the identification of which emotional processing domains are more vulnerable in children with ASD, providing insights to guide the design of targeted interventions, for example, prioritizing threat emotion regulation training for ASD children who exhibit exaggerated neural responses to angry stimuli.

Limitations

This study has several limitations. First, the present findings center on ASD-specific internal response characteristics that inform the development of machine learning frameworks for emotion recognition, rather than advancing definitive assertions about unique neural responses relative to neurotypical populations. Accordingly, we emphasize the necessity of follow-up studies incorporating neurotypical control groups to validate the clinical utility of our emotion recognition models and clarify their specificity to ASD; this validation step is critical for enhancing the reliability of objective assessment instruments for early intervention in ASD. Second, the sample included only preschoolers with high-functioning ASD, which means the findings may not generalize to children with low-functioning ASD or other age groups. Third, the emotional stimuli were limited to “happy” (positive) and “angry” (negative) expressions, excluding other emotions such as sadness or fear; this narrow focus restricts our understanding of the full range of emotional processing in this population.

Conclusions

This study integrates functional near-infrared spectroscopy (fNIRS) and deep learning to characterize hemodynamic responses during emotional face processing in preschool children with ASD under dynamic and static facial stimuli. Dynamic facial expressions elicited significantly higher prefrontal cortex activation than static stimuli, in line with neuroimaging evidence that dynamic stimuli engage extensive cortical activation in ASD. In particular, compared to neutral flowers, dynamic angry faces promoted coordinated activation across prefrontal-parietal-visual regions, which may indicate enhanced neural processing of social threat cues. The proposed CNN-LSTM model achieved 86.2% accuracy in dynamic anger/happiness classification, outperforming traditional methods by integrating spatial and temporal fNIRS features. These findings validate the utility of dynamic facial stimuli for probing emotional processing in ASD and demonstrate the potential of deep learning for fNIRS-based emotion recognition, which may provide new directions for neurophysiology-informed intervention design.

We note that the current findings should be interpreted with caution due to the absence of a neurotypical control group. Accordingly, we outline future research directions that involve recruiting well-matched neurotypical control groups to conduct direct comparative analyses, with the goal of validating the clinical utility of the developed fNIRS-deep learning assessment tools.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement

The studies involving humans were approved by Lixin Minkang hospital research ethics board and university ethics committee (ID: 2022027). The studies were conducted in accordance with the local legislation and institutional requirements. Written informed consent for participation in this study was provided by the participants’ legal guardians/next of kin.

Author contributions

LQ: Conceptualization, Investigation, Methodology, Project administration, Resources, Software, Supervision, Visualization, Writing – original draft, Writing – review & editing. J-WN: Data curation, Formal Analysis, Software, Validation, Writing – original draft. GD: Conceptualization, Methodology, Project administration, Resources, Writing – review & editing. ST: Data curation, Formal Analysis, Investigation, Validation, Writing – review & editing. J-WZ: Conceptualization, Funding acquisition, Methodology, Software, Validation, Writing – review & editing.

Funding

The author(s) declared that financial support was received for this work and/or its publication. This work was supported by the Fundamental Research Funds for the Central Universities of China (grant number: DUT24LAB119).

Acknowledgments

We would like to thank all the children and parents who participated in this study.

Conflict of interest

The authors declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declared that generative AI was not used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

1. American Psychiatric Association. Diagnostic and statistical manual of mental disorders, 5th edition (DSM-5). Arlington: American Psychiatric Publishing (2013).


2. Issac A, Halemani K, Shetty A, Thimmappa L, Vijay VR, Koni K, et al. The global prevalence of autism spectrum disorder in children: a systematic review and meta-analysis. Osong Public Health Res Perspect. (2025) 16:3–27. doi: 10.24171/j.phrp.2024.0286


3. Lord C, Elsabbagh M, Baird G, and Veenstra-Vanderweele J. Autism spectrum disorder. Lancet. (2018) 392:508–20. doi: 10.1016/S0140-6736(18)31129-2


4. Leung FYN, Sin J, Dawson C, Ong JH, Zhao C, Veic A, et al. Emotion recognition across visual and auditory modalities in autism spectrum disorder: A systematic review and meta-analysis. Dev Rev. (2022) 63:101000. doi: 10.1016/j.dr.2021.101000


5. Griffiths S, Jarrold C, Penton-Voak IS, Woods AT, Skinner AL, and Munafò MR. Impaired recognition of basic emotions from facial expressions in young people with autism spectrum disorder: assessing the importance of expression intensity. J Autism Dev Disord. (2019) 49:2768–78. doi: 10.1007/s10803-017-3091-7


6. Uljarevic M and Hamilton A. Recognition of emotions in autism: A formal meta-analysis. J Autism Dev Disord. (2013) 43:1517–26. doi: 10.1007/s10803-012-1695-5


7. Pinti P, Tachtsidis I, Hamilton A, Hirsch J, Aichelburg C, Gilbert S, et al. The present and future use of functional near-infrared spectroscopy (fNIRS) for cognitive neuroscience. Ann New York Acad Sci. (2020) 1464:5–29. doi: 10.1111/nyas.13948


8. Blanco B, Lloyd-Fox S, Begum-Ali J, Pirazzoli L, Goodwin A, Mason L, et al. Cortical responses to social stimuli in infants at elevated likelihood of ASD and/or ADHD: A prospective cross-condition fNIRS study. Cortex. (2023) 169:18–34. doi: 10.1016/j.cortex.2023.07.010


9. Piatti A, van der Paelt S, Warreyn P, and Roeyers H. Neural correlates of response to joint attention in 2-to-5-year-olds in relation to ASD and social-communicative abilities: An fNIRS and behavioral study. Autism Res. (2024) 17:1106–25. doi: 10.1002/aur.3149


10. Zane E, Yang ZJ, Pozzan L, Guha T, Narayanan S, and Grossman RB. Motion-capture patterns of voluntarily mimicked dynamic facial expressions in children and adolescents with and without ASD. J Autism Dev Disord. (2019) 49:1062–79. doi: 10.1007/s10803-018-3811-7


11. Khare SK, Blanes-Vidal V, Nadimi ES, and Acharya UR. Emotion recognition and artificial intelligence: A systematic review, (2014-2023) and research recommendations. Inf Fusion. (2024) 102:102019. doi: 10.1016/j.inffus.2023.102019


12. Dapretto M, Davies MS, Pfeifer JH, Scott AA, Sigman M, Bookheimer SY, et al. Understanding emotions in others: mirror neuron dysfunction in children with autism spectrum disorders. Nat Neurosci. (2006) 9:28–30. doi: 10.1038/nn1611


13. Harms MB, Martin A, and Wallace GL. Facial emotion recognition in autism spectrum disorders: A review of behavioral and neuroimaging studies. Neuropsychol Rev. (2010) 20:290–322. doi: 10.1007/s11065-010-9138-6


14. Kleinhans NM, Richards T, Weaver K, Johnson LC, Greenson J, Dawson G, et al. Association between amygdala response to emotional faces and social anxiety in autism spectrum disorders. Neuropsychologia. (2010) 48:3665–70. doi: 10.1016/j.neuropsychologia.2010.07.022


15. Eastmond C, Subedi A, De S, and Intes X. Deep learning in fNIRS: a review. Neurophotonics. (2022) 9:041411. doi: 10.1117/1.NPh.9.4.041411


16. Kwon J and Im CH. Subject-independent functional near-infrared spectroscopy-based brain-computer interfaces based on convolutional neural networks. Front Hum Neurosci. (2021) 15:646915. doi: 10.3389/fnhum.2021.646915


17. Vandewouw MM, Choi EJ, Hammill C, Lerch JP, Anagnostou E, and Taylor MJ. Changing faces: dynamic emotional face processing in autism spectrum disorder across childhood and adulthood. Biol Psychiatry-Cognitive Neurosci Neuroimaging. (2021) 6:825–36. doi: 10.1016/j.bpsc.2020.09.006


18. Todd RM, Lee W, Evans JW, Lewis MD, and Taylor MJ. Withholding response in the face of a smile: Age-related differences in prefrontal sensitivity to Nogo cues following happy and angry faces. Dev Cogn Neurosci. (2012) 2:340–50. doi: 10.1016/j.dcn.2012.01.004


19. Lu B, Hui M, Yu-Xia H, and Luo J-Y. The development of native Chinese affective picture system–A pretest in 46 college students. Chin Ment Health J. (2005) 19:719–22.


20. Arsalidou M, Morris D, and Taylor MJ. Converging evidence for the advantage of dynamic facial expressions. Brain Topography. (2011) 24:149–63. doi: 10.1007/s10548-011-0171-4


21. Ye JC, Tak S, Jang KE, Jung J, and Jang J. NIRS-SPM: Statistical parametric mapping for near-infrared spectroscopy. Neuroimage. (2009) 44:428–47. doi: 10.1016/j.neuroimage.2008.08.036


22. Tak S, Uga M, Flandin G, Dan I, and Penny WD. Sensor space group analysis for fNIRS data. J Neurosci Methods. (2016) 264:103–12. doi: 10.1016/j.jneumeth.2016.03.003


23. Johnston P, Mayes A, Hughes M, and Young AW. Brain networks subserving the evaluation of static and dynamic facial expressions. Cortex. (2013) 49:2462–72. doi: 10.1016/j.cortex.2013.01.002


24. Enticott PG, Kennedy HA, Johnston PJ, Rinehart NJ, Tonge BJ, Taffe JR, et al. Emotion recognition of static and dynamic faces in autism spectrum disorder. Cogn Emotion. (2014) 28:1110–8. doi: 10.1080/02699931.2013.867832


25. Jelili S, Halayem S, Taamallah A, Ennaifer S, Rajhi O, Moussa M, et al. Impaired recognition of static and dynamic facial emotions in children with autism spectrum disorder using stimuli of varying intensities, different genders, and age ranges faces. Front Psychiatry. (2021) 12:693310. doi: 10.3389/fpsyt.2021.693310


26. Vandewouw MM, Choi E, Hammill C, Arnold P, Schachar R, Lerch JP, et al. Emotional face processing across neurodevelopmental disorders: a dynamic faces study in children with autism spectrum disorder, attention deficit hyperactivity disorder and obsessive-compulsive disorder. Trans Psychiatry. (2020) 10:375. doi: 10.1038/s41398-020-01063-2


27. Herrington JD, Riley ME, Grupe DW, and Schultz RT. Successful face recognition is associated with increased prefrontal cortex activation in autism spectrum disorder. J Autism Dev Disord. (2015) 45:902–10. doi: 10.1007/s10803-014-2233-4


28. Golan O, Gordon I, Fichman K, and Keinan G. Specific patterns of emotion recognition from faces in children with ASD: results of a cross-modal matching paradigm. J Autism Dev Disord. (2018) 48:844–52. doi: 10.1007/s10803-017-3389-5


29. Leung RC, Pang EW, Brian JA, and Taylor MJ. Happy and angry faces elicit atypical neural activation in children with autism spectrum disorder. Biol Psychiatry-Cognitive Neurosci Neuroimaging. (2019) 4:1021–30. doi: 10.1016/j.bpsc.2019.03.013


30. DeRamus TP, Black BS, Pennick MR, and Kana RK. Enhanced parietal cortex activation during location detection in children with autism. J Neurodev Disord. (2014) 6:37. doi: 10.1186/1866-1955-6-37


31. Goodkind MS, Sollberger M, Gyurak A, Rosen HJ, Rankin KP, Miller B, et al. Tracking emotional valence: The role of the orbitofrontal cortex. Hum Brain Mapp. (2012) 33:753–62. doi: 10.1002/hbm.21251


32. Leung RC, Pang EW, Cassel D, Brian JA, Smith ML, and Taylor MJ. Early neural activation during facial affect processing in adolescents with Autism Spectrum Disorder. Neuroimage-Clinical. (2015) 7:203–12. doi: 10.1016/j.nicl.2014.11.009


33. Brotman MA, Rich BA, Guyer AE, Lunsford JR, Horsey SE, Reising MM, et al. Amygdala activation during emotion processing of neutral faces in children with severe mood dysregulation versus ADHD or bipolar disorder. Am J Psychiatry. (2010) 167:61–9. doi: 10.1176/appi.ajp.2009.09010043


34. Abu-Nowar H, Sait A, Al-Hadhrami T, Al-Sarem M, and Qasem SN. SENSES-ASD: a social-emotional nurturing and skill enhancement system for autism spectrum disorder. Peerj Comput Sci. (2024) 10:e1792. doi: 10.7717/peerj-cs.1792


Keywords: preschool children, autism spectrum disorder, functional near-infrared spectroscopy (fNIRS), cerebral hemodynamic response, convolutional neural network-long short-term memory model

Citation: Qi L, Ni J-W, Dong G, Sun T and Zhang J-W (2026) Cortical hemodynamic responses and deep learning models of emotional face processing in preschool children with autism spectrum disorder: a fNIRS study. Front. Psychiatry 16:1703302. doi: 10.3389/fpsyt.2025.1703302

Received: 11 September 2025; Revised: 10 November 2025; Accepted: 01 December 2025;
Published: 16 January 2026.

Edited by:

Lang Chen, Santa Clara University, United States

Reviewed by:

Aron T. Hill, Deakin University, Australia
Xiaojue Zhou, University of California, Irvine, United States

Copyright © 2026 Qi, Ni, Dong, Sun and Zhang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Liping Qi, lipingqi@dlut.edu.cn; Jian-Wei Zhang, jwzhang@dlut.edu.cn

ORCID: Liping Qi, orcid.org/0000-0002-3109-2353
Jing-Wen Ni, orcid.org/0009-0007-8979-7181
Tao Sun, orcid.org/0000-0002-6618-1081
Jian-Wei Zhang, orcid.org/0000-0002-3051-135X

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.