Machine learning algorithms to the early diagnosis of fetal alcohol spectrum disorders

Ramos-Triguero, Anna; Navarro-Tapia, Elisabet; Vieiros, Melina; Mirahi, Afrooz; Astals Vizcaino, Marta; Almela, Lucas; Martínez, Leopoldo; García-Algar, Óscar; Andreu-Fernández, Vicente

doi:10.3389/fnins.2024.1400933

ORIGINAL RESEARCH article

Front. Neurosci., 06 May 2024

Sec. Translational Neuroscience

Volume 18 - 2024 | https://doi.org/10.3389/fnins.2024.1400933

Anna Ramos-Triguero^1,2^†

Elisabet Navarro-Tapia^3,4^†

Melina Vieiros^1,3^

Afrooz Mirahi^1,5^

Marta Astals Vizcaino^1,2^

Lucas Almela^2^

Leopoldo Martínez^3,6^

Óscar García-Algar^1,5^‡

Vicente Andreu-Fernández^1,7^*‡

¹Grup de Recerca Infancia i Entorn (GRIE), Institut d'investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Barcelona, Spain
²Department de Cirurgia i Especialitats Mèdico-Quirúrgiques, Universitat de Barcelona, Barcelona, Spain
³Instituto de Investigación Hospital Universitario La Paz (IdiPAZ), Hospital Universitario La Paz, Madrid, Spain
⁴Faculty of Health Sciences, Valencian International University (VIU), Valencia, Spain
⁵Department of Neonatology, Instituto Clínic de Ginecología, Obstetricia y Neonatología (ICGON), Hospital Clínic-Maternitat, BCNatal, Barcelona, Spain
⁶Department of Pediatric Surgery, Hospital Universitario La Paz, Madrid, Spain
⁷Biosanitary Research Institute, Valencian International University (VIU), Valencia, Spain

Introduction: Fetal alcohol spectrum disorders (FASD) include a variety of physical and neurocognitive disorders caused by prenatal alcohol exposure. Although their overall prevalence is around 0.77%, FASD remain underdiagnosed and poorly recognized, partly because their diagnosis is complex and shares symptoms with other conditions such as autism spectrum disorder, depression, or hyperactivity disorders.

Methods: This study included 73 control participants and 158 patients diagnosed with FASD. Variables were selected based on the 2016 Institute of Medicine (IOM) classification and included sociodemographic, clinical, and psychological characteristics. Statistical analysis included the Kruskal-Wallis test for quantitative factors, the Chi-square test for qualitative variables, and Machine Learning (ML) algorithms for predictions.

Results: This study explores the application of ML in diagnosing FASD and its subtypes: Fetal Alcohol Syndrome (FAS), partial FAS (pFAS), and Alcohol-Related Neurodevelopmental Disorder (ARND). ML constructed a profile for FASD based on sociodemographic, clinical, and psychological data from children with FASD compared to a control group. The Random Forest (RF) model was the most efficient for predicting FASD, achieving the highest metrics in accuracy (0.92), precision (0.96), sensitivity (0.92), F1 Score (0.94), specificity (0.92), and AUC (0.92). For FAS, the XGBoost model obtained the highest accuracy (0.94), precision (0.91), sensitivity (0.91), F1 Score (0.91), specificity (0.96), and AUC (0.93). In the case of pFAS, the RF model showed its effectiveness, with high levels of accuracy (0.90), precision (0.86), sensitivity (0.96), F1 Score (0.91), specificity (0.83), and AUC (0.90). For ARND, the RF model obtained the best levels of accuracy (0.87), precision (0.76), sensitivity (0.93), F1 Score (0.84), specificity (0.83), and AUC (0.88). Our study identified key variables for efficient FASD screening, including traditional clinical characteristics such as maternal alcohol consumption, lip-philtrum, microcephaly, height and weight impairment, as well as neuropsychological variables such as the Working Memory Index (WMI), aggressive behavior, IQ, somatic complaints, and depressive problems.

Discussion: Our findings emphasize the importance of ML analyses for early diagnoses of FASD, allowing a better understanding of FASD subtypes to potentially improve clinical practice and avoid misdiagnosis.

1 Introduction

Fetal alcohol spectrum disorder (FASD) is a range of neurodevelopmental impairments produced by prenatal alcohol exposure (PAE) (Hoyme et al., 2005; Popova et al., 2023). Epidemiological studies estimate a global prevalence of 0.77% (Lange et al., 2017), with regional variations observed, particularly in Europe and North America where prevalence ranges from 2.0 to 5.0% (Wozniak et al., 2019). Despite its prevalence, FASD remains underdiagnosed because of the wide variety of associated symptoms and the difficulty of attributing some of them specifically to FASD, since they can overlap with alternative diagnoses such as attention deficit hyperactivity disorder (ADHD). In addition, social stigma can pose significant challenges for affected individuals, their families and healthcare systems.

Individuals diagnosed with FASD face a wide range of neurocognitive impairments and social challenges that persist throughout their lives (Kelly et al., 2000; Champagne et al., 2023). Primary disabilities associated with FASD include impairments in adaptive functioning, memory, attention, abstract thinking, judgement, and cause-effect reasoning (Maya-Enero et al., 2021). Secondary disabilities, which result from the interaction of primary disabilities with environmental factors, can adversely affect an individual's ability to actively and positively participate in their lives and can lead to academic failure, low self-esteem, housing instability, and depression (Pei et al., 2011; Leenaars et al., 2012). The complex interplay between these cognitive impairments and social difficulties highlights the need for early comprehensive diagnostic strategies to adequately support affected individuals.

The diagnostic criteria for FASD are multifaceted and include four domains: PAE, facial features, growth, and neurodevelopment (Hoyme et al., 2016). These domains define a spectrum that ranges from the most severe form of FASD, fetal alcohol syndrome (FAS), to alcohol-related birth defects (ARBD), with partial FAS (pFAS) and alcohol-related neurodevelopmental disorder (ARND) as intermediate categories. Several guidelines are commonly used, including those from the Institute of Medicine (IOM), the Canadian Guidelines, the Centers for Disease Control (CDC), and the University of Washington's 4-digit code (Bastons-Compta et al., 2016; Maya-Enero et al., 2021). The IOM criteria, which include craniofacial anomalies, growth retardation, mental disabilities and developmental disorders, are currently recommended for diagnosis (Hoyme et al., 2016). However, this approach has limitations, such as difficult physical assessments, extensive neuropsychological assessments and underreporting of alcohol use during pregnancy. Obtaining a confirmed history of alcohol use during pregnancy is hampered by various factors, such as change of custody or maternal death. Consequently, a significant proportion of individuals with FASD remain undiagnosed or receive delayed diagnoses, exacerbating their difficulties and limiting their access to early interventions and support services (Jańczewska et al., 2019).

The search for novel strategies and methodologies for early diagnosis is one of the most promising fields of research in FASD. Timely identification of affected individuals is crucial for implementing personalized interventions and mitigating the long-term impact of the disorder on cognitive, social, and behavioral outcomes. In this context, emerging technologies, such as machine learning, offer promising avenues for improving diagnostic accuracy and efficiency (Rodrigues et al., 2023).

Machine learning (ML) algorithms have demonstrated impressive capabilities in analyzing complex datasets and extracting meaningful patterns in other diseases such as autism spectrum disorder (ASD) or ADHD (Eslami et al., 2021; Bahathiq et al., 2022; Ehrig et al., 2023). By harnessing the power of computational algorithms, researchers can integrate diverse data sources, including physical and cognitive variables, to develop predictive models for FASD diagnosis (Blanck-Lubarsch et al., 2022; Ehrig et al., 2023). Such models have the potential to augment existing diagnostic frameworks, enabling clinicians to make more informed decisions and speeding up the diagnostic process.

In the present study, supervised classification ML algorithms were employed to construct a predictive diagnosis model of FASD and its subtypes. The model was trained using sociodemographic, clinical and psychological variables. ML provides a powerful tool for prediction and feature importance determination, especially when data patterns may be too complex for conventional statistical methods. The algorithms investigated include Logistic Regression (LR), Linear Discriminant Analysis (LDA), Support Vector Machine (SVM), K-nearest Neighbors (KNN), Random Forest (RF) and eXtreme Gradient Boosting (XGB).

This study aims to develop ML algorithms that use physical and neurocognitive data from children with FASD. The algorithms will identify a distinctive FASD profile in the dataset to enhance FASD diagnosis compared with current methods. This research aims to provide more accurate diagnostic tools for the assessment of FASD, which could revolutionize clinical practice, thereby facilitating the initiation of early therapies and improving the quality of life of people affected by this silent disease.

2 Material and methods

2.1 Study design and participant information

This is a multicentre pilot study. The study included all patients from the Catalan Institute for Fostering and Adoption (ICAA) database who agreed to participate. The total study cohort comprised 231 patients: 73 control patients and 158 patients diagnosed with FASD. The study, registered at clinicaltrials.gov (NCT02558933), integrated cohorts from previous investigations (PI13/01135; OG085818; PI16/00566; PI19/01853) that included participants enrolled between March 2017 and November 2023. The study was conducted at the Hospital del Mar Medical Research Institute of Barcelona and Hospital Clinic of Barcelona, and all procedures adhered to ethical standards outlined in the Declaration of Helsinki and Spanish data privacy regulations. Consent was obtained from the caregiver or legal representative of patients due to their incapacity to provide informed consent, as approved by the Comité Ético de Investigación Clínica Parc de Salut MAR (No. HCB/2021/0459).

The minimum sample size was calculated using G*Power software (Faul et al., 2007) with the following parameters: bilateral contrast, alpha 0.1, beta cut-off of 0.2, corresponding to a power of 0.8 (Gupta et al., 2016), estimated proportion of replacements required (20%), precision 0.1 (90%) and an estimate of 50% of the population affected. A minimum of 70 samples in two independent groups (non-FASD and FASD groups, or non-FASD vs. each FASD subtype) was required.

2.2 FASD diagnosis and clinical evaluation

To diagnose FASD, all adopted children in the EEC who were included in this study, including those with verified prenatal alcohol exposure (PAE), underwent independent examination using standardized dysmorphology exams (Hoyme et al., 2016). The diagnostic category for each child was identified based on the 1996 IOM standards (reviewed in 2016) (Hoyme et al., 2005, 2016), which consist of five diagnostic criteria: (1) confirmed prenatal alcohol exposure; (2) evidence of a characteristic pattern of minor facial abnormalities, typified by a thin upper lip, smooth philtrum and short palpebral fissures; (3) growth retardation, defined as height or weight ≤ 10th percentile; (4) evidence of deficient brain growth or subrogated data; and (5) behavioral or cognitive affected domains (1 or 2) related to prenatal alcohol exposure. For a diagnosis of complete FAS, criteria 2, 3, 4, and 5 were required, with or without confirmed prenatal alcohol exposure. For partial FAS, the requirement was either criteria 1, 2, and at least one affected domain under criterion 5 (confirmed prenatal alcohol exposure), or criteria 2, 5, and either 3 or 4 (no confirmed prenatal alcohol exposure). The diagnosis of alcohol-related birth defects (ARBD) required criterion 1 plus a minimum of one structural defect involving the heart, skeleton, kidney, eye, or ear, or minor abnormalities such as railroad track ears, midface hypoplasia or hockey stick palmar creases. The diagnosis of alcohol-related neurodevelopmental disorders (ARND) required criteria 1 and 5.
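The decision rules above can be sketched as a small function. This is an illustrative reading of the text, not the authors' diagnostic procedure: the `Findings` fields and the `classify` function are hypothetical names, each criterion is reduced to a single yes/no flag, the order in which the rules are checked is an assumption, and ARBD is left out because the cohort described in this study contains only FAS, pFAS and ARND cases.

```python
from dataclasses import dataclass


@dataclass
class Findings:
    """Hypothetical yes/no summary of the five criteria for one child."""
    pae_confirmed: bool           # criterion 1
    facial_pattern: bool          # criterion 2
    growth_retardation: bool      # criterion 3
    deficient_brain_growth: bool  # criterion 4
    neurobehavioral: bool         # criterion 5


def classify(f: Findings) -> str:
    """Apply the rules in an assumed order of precedence: FAS, pFAS, ARND."""
    # Complete FAS: criteria 2, 3, 4 and 5, with or without confirmed PAE.
    if (f.facial_pattern and f.growth_retardation
            and f.deficient_brain_growth and f.neurobehavioral):
        return "FAS"
    # Partial FAS with confirmed PAE: criteria 1, 2 and 5.
    if f.pae_confirmed and f.facial_pattern and f.neurobehavioral:
        return "pFAS"
    # Partial FAS without confirmed PAE: criteria 2 and 5, plus 3 or 4.
    if (not f.pae_confirmed and f.facial_pattern and f.neurobehavioral
            and (f.growth_retardation or f.deficient_brain_growth)):
        return "pFAS"
    # ARND: criteria 1 and 5.
    if f.pae_confirmed and f.neurobehavioral:
        return "ARND"
    return "non-FASD"


print(classify(Findings(True, False, False, False, True)))
```

Encoding the rules this way makes the role of confirmed exposure explicit: it is optional for FAS, changes the requirements for pFAS, and is mandatory for ARND.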

The variables selected for our study were based on the 2016 IOM classification for FASD diagnosis (Hoyme et al., 2016). Selected sociodemographic variables are related to maternal alcohol consumption during pregnancy (Hoyme et al., 2016), origin and ethnicity (Oh et al., 2023a). Parent feeling variables were included in our selection to determine whether parental perceptions play a role in the diagnosis of FASD predicted by ML algorithms. The clinical variables selected were growth deficits (Astley et al., 2016; Hoyme et al., 2016; Treit et al., 2016), craniofacial dysmorphology (Smith et al., 2014; Hoyme et al., 2016), birth malformations (Dylag et al., 2023), neurodevelopmental disorders (Geier and Geier, 2022) and other physical features and medical history (Brennan and Giles, 2014; del Campo and Jones, 2017; Ninh et al., 2019). Lastly, for the neuropsychological domains, we selected variables significant for FASD diagnosis, including motor cognition (Bakoyiannis et al., 2014), language (Hendricks et al., 2019), academic achievement (Glass et al., 2017), memory (Rasmussen, 2005), attention (Young et al., 2016), executive functioning including impulse control and hyperactivity (Peadon and Elliott, 2010), affect regulation (Temple et al., 2019) and adaptive behavior, social skills, or social communication (Temple et al., 2019; Hammond et al., 2022).

2.3 Neurocognitive assessment

Cognitive assessment used the Wechsler Intelligence Scale for Children, Fifth Edition (WISC-V) (Weiss et al., 2019), evaluating full-scale Intelligence Quotient (IQ) and the Verbal Comprehension (VCI), Visual Spatial (VSI), Fluid Reasoning (FRI), Working Memory (WMI), and Processing Speed (PSI) indices.

Adults' cognitive functioning was evaluated using the Wechsler Adult Intelligence Scale (WAIS), the Fourth Edition (WAIS-IV) (Wechsler, 2008). Preschoolers' cognitive abilities were assessed using the Wechsler Preschool and Primary Scale of Intelligence (WPPSI-IV) (Raiford and Coalson, 2014). Additionally, the Adult Self-Report (ASR/18-59) (Achenbach and Rescorla, 2003) collected self-reported data on behavioral concerns in adults, while the Child Behavior Checklist (CBCL) (Achenbach, 2004) gathered parental reports on children aged 6–18. All assessments adhered to unified criteria and professionals received standardized training, ensuring consistency and reliability. Data were recorded in a confidential database, maintaining accuracy and confidentiality throughout the research process.

2.4 Statistical analysis

Statistical analysis was performed using SPSS v22 and R. Graphs were produced using GraphPad Prism 8.0 software. Descriptive analysis was used to characterize the samples. Categorical variables were presented as counts and percentages, while continuous variables were presented as means and standard deviations. Relationships between sociodemographic, clinical and neuropsychological features were examined using the Kruskal–Wallis test with Dunn's correction for multiple comparisons for quantitative factors and the chi-square test for qualitative variables. A significance level of p < 0.05 was applied to all analyses.

In addition to the aforementioned statistical tests, machine learning (ML) algorithms were employed to create a predictive model, using the statistical software R (version 3.3.0 or later).

2.5 Machine learning models

This study employed several ML algorithms to predict FASD and its subtypes (FAS, pFAS and ARND): LR, LDA, linear SVM, polynomial SVM, KNN, RF and XGB (Zhang et al., 2019). The data first underwent a preparation phase, in which the “mice” function from VIM package was used to impute missing values with the predictive mean matching method (pmm) (Kowarik and Templ, 2016). This process was repeated 5 times, as per the default setting. Only 1% of the data were missing, and single imputation is considered appropriate when <5% of the data are missing (Graham, 2009). The dataset was subsequently scaled using the “scale” function in base R.
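As an illustration of the scaling step, the sketch below standardizes each column by subtracting its mean and dividing by its sample standard deviation (denominator n − 1), which is what R's “scale” does with its default arguments. The study itself used R; the function name `scale_columns` and the toy values are hypothetical, and the sketch assumes no column is constant.

```python
from statistics import mean, stdev


def scale_columns(rows):
    """Center each column on its mean and divide by its sample SD (n - 1)."""
    columns = list(zip(*rows))
    centers = [mean(col) for col in columns]
    spreads = [stdev(col) for col in columns]  # assumes no constant column
    return [
        [(value - m) / s for value, m, s in zip(row, centers, spreads)]
        for row in rows
    ]


# Toy data: two variables measured on very different scales.
toy = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
scaled = scale_columns(toy)
print(scaled)
```

After scaling, both columns have mean 0 and unit standard deviation, so distance-based methods such as KNN and SVM are not dominated by whichever variable has the largest raw range.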

LR is a binary classification algorithm, which uses a logistic function to predict class probability. Coefficients are obtained using maximum likelihood estimation (Hosmer et al., 2013). LR is easy to implement and performs well on linearly separable classes. However, it may overfit with many features and struggles with complex relationships. LDA projects data into a lower-dimensional space, maximizing class separability and minimizing variance within a class, finding a linear combination of features that characterizes a group (Gardner-Lubbe, 2021). SVM finds the separating hyperplane that maximizes the margin, that is, the distance between the hyperplane and the nearest samples of each class (Huang et al., 2018). In our study, linear SVM and polynomial SVM are differentiated. Linear SVM classifies linearly separable data and is more computationally efficient. Polynomial SVM classifies non-linearly separable data, transforms input space into a higher-dimensional space, finds more complex relationships, and is computationally intensive (López et al., 2022). KNN predicts class by calculating the Euclidean distance to all training points and selecting the K most similar instances (the neighbors). It handles multiclass classification and learns complex decision boundaries (Zhang, 2016). However, it performs poorly on high-dimensional datasets because the distance to all training points must be recalculated for every prediction. Ensemble methods like RF and XGB are decision tree-based algorithms. RF combines multiple independently trained decision trees, uses bagging to create subsets of the original dataset, and then aggregates the results (Denisko and Hoffman, 2018). On the other hand, XGB trains decision trees sequentially, with each new tree correcting errors made by the previous one (Li et al., 2022). Each ML algorithm used in this study has its own strengths and weaknesses, leading to varied results.

The range of ML algorithms compared spans from traditional predictive models like LR to more complex ensemble methods like RF and XGB, which are capable of handling high-dimensional data. By comparing different models, our study aimed to find the most effective model for predicting FASD and its subtypes. This diversity in approaches enhances the robustness and comprehensiveness of the study.
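To make the KNN description concrete, the following minimal sketch classifies a query point by majority vote among its K nearest training points under Euclidean distance. It is not the caret implementation used in the study; `knn_predict`, the two-feature toy data and the labels are hypothetical.

```python
from collections import Counter
from math import dist


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training points closest to query."""
    # The distance to every training point is computed for each prediction.
    ranked = sorted(zip(train_x, train_y), key=lambda pair: dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy, already-scaled data: two well-separated groups.
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["control", "control", "control", "FASD", "FASD", "FASD"]

print(knn_predict(train_x, train_y, (0.5, 0.5)))
print(knn_predict(train_x, train_y, (5.5, 5.5)))
```

Because every prediction scans the whole training set, cost grows with both sample size and the number of variables, which is the limitation noted above.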

For the analysis, a total of 66 variables were selected, encompassing five sociodemographic parameters, 35 clinical features, six intelligence scores and 20 behavioral domains (Tables 1–4). Prior to model construction, a hold-out method was applied to split the data into training and test sets using the “createDataPartition” function from the caret package in R (Kuhn, 2008). Sixty-seven percent of the data was allocated to the training set and the remaining 33% to the test set. This function employs stratified random sampling, which preserves the class proportions of the full dataset in both sets and so reduces bias in the data distribution.
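A stratified hold-out split of this kind can be sketched as follows. This is not the caret code: `stratified_split` is a hypothetical helper, and rounding the training count up within each class is an assumption. With the cohort sizes given in this study (73 controls, 158 FASD) that assumption yields 155 training and 76 test samples, the same sizes reported in Section 3.2.

```python
import math
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.67, seed=0):
    """Sample train_fraction of each class for training; the rest is the test set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train = []
    for indices in by_class.values():
        # Assumption: the per-class training count is rounded up.
        n_train = math.ceil(len(indices) * train_fraction)
        train.extend(rng.sample(indices, n_train))

    train_set = set(train)
    test = [i for i in range(len(labels)) if i not in train_set]
    return sorted(train), test


labels = ["control"] * 73 + ["FASD"] * 158
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 155 76
```

Sampling within each class keeps the control-to-FASD ratio nearly identical in both sets, so test metrics are not distorted by an unlucky draw.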

Table 1. Sociodemographic data of children population (n = 231).

In addition to the hold-out method, a resampling method involving four-fold cross-validation with three repeats was adopted. This was implemented using the “trainControl” function from the caret package (Kuhn, 2008). The models were trained with hyperparameters set to default using the “train” function, which gathers and simplifies numerous R algorithms for the development of predictive models (Kuhn, 2008). The models employed included LR, using the “glm” method with the binomial family, and LDA, implemented with the “lda” method, which uses “moment” as the default estimator of mean and variance. Linear SVM and polynomial SVM were fitted using the “svmLinear” and “svmPoly” methods, respectively. Both have a tuning parameter C, which controls how strictly margin violations are penalized, set to 1 by default. KNN was fitted with the “knn” method, also from the caret package, which tunes k automatically. In addition, RF was fitted using the “rf” method, with 500 trees by default. The XGB model used the “xgbTree” method, with a default maximum of 100 iterations.
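The resampling scheme (four folds, three repeats) can be illustrated with a short index generator. This is a simplified sketch rather than the caret procedure: `repeated_kfold` is a hypothetical name, the folds here are not stratified by outcome, and the training-set size of 155 is taken from Section 3.2.

```python
import random


def repeated_kfold(n_samples, n_folds=4, n_repeats=3, seed=0):
    """Yield (repeat, fold, training indices, validation indices) tuples."""
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # a fresh shuffle for every repeat
        folds = [order[i::n_folds] for i in range(n_folds)]
        for held_out in range(n_folds):
            validation = folds[held_out]
            training = [
                i for j, fold in enumerate(folds) if j != held_out for i in fold
            ]
            yield repeat, held_out, training, validation


splits = list(repeated_kfold(155))
print(len(splits))  # 12
```

Each model is therefore fitted and validated 12 times on the training data, and the averaged validation performance guides model selection before the held-out test set is touched.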

The “predict” function from stats package was used to predict classes for the test group. To allow comparisons, the “confusionMatrix” function from caret package was used to obtain the counts of true positives, true negatives, false positives and false negatives. These counts provided measures including accuracy, precision, sensitivity, F1 score and specificity. ROC-AUC was obtained using the “roc” function from pROC package (Robin et al., 2011). Training and test datasets were consistent across FASD and its subgroups, ensuring a fair and valid comparison.
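The metrics listed above follow directly from the four confusion-matrix counts, as the sketch below shows. The counts used are hypothetical: they were chosen so that a 76-sample test set (assumed to contain 52 FASD and 24 control cases) reproduces, after rounding, the FASD metrics reported for the RF model, and they are not taken from the authors' confusion matrix. The function name `classification_metrics` is also illustrative.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute standard binary classification metrics from confusion counts."""
    sensitivity = tp / (tp + fn)  # recall for the positive (FASD) class
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
        "specificity": specificity,
    }


# Hypothetical counts, not reported in the study.
metrics = classification_metrics(tp=48, fn=4, tn=22, fp=2)
print({name: round(value, 2) for name, value in metrics.items()})
```

With these counts the rounded values are accuracy 0.92, precision 0.96, sensitivity 0.92, F1 0.94 and specificity 0.92, which shows how a handful of misclassified test cases translates into the reported figures.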

Feature importance of the models was determined by calculating the Root Mean Square Error (RMSE) loss after permutation. It was obtained with the “explain” function from DALEX package, with the “classification” model type passed as an argument (Biecek, 2018). Plots were generated from the object returned by the “variable_importance” function from caret package (Kuhn, 2008).
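Permutation-based importance can be sketched in a few lines: one variable at a time is shuffled, the model is re-scored, and the RMSE after shuffling is averaged over repeated permutations. This is a generic illustration and not the DALEX implementation; `permutation_losses`, the toy model and the toy data are hypothetical, and the default of 50 permutations follows the figure captions.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean square error between observed labels and predictions."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def permutation_losses(predict, rows, y_true, n_permutations=50, seed=0):
    """Return the baseline RMSE and the mean RMSE after shuffling each column."""
    rng = random.Random(seed)
    baseline = rmse(y_true, [predict(row) for row in rows])
    mean_losses = []
    for col in range(len(rows[0])):
        total = 0.0
        for _ in range(n_permutations):
            shuffled = [row[col] for row in rows]
            rng.shuffle(shuffled)
            permuted = [row[:col] + [v] + row[col + 1:] for row, v in zip(rows, shuffled)]
            total += rmse(y_true, [predict(row) for row in permuted])
        mean_losses.append(total / n_permutations)
    return baseline, mean_losses


# Toy model that only looks at the first variable.
def toy_model(row):
    return 1.0 if row[0] > 0.5 else 0.0


rows = [[0, 5], [1, 3], [0, 8], [1, 1], [0, 2], [1, 9]]
y = [0, 1, 0, 1, 0, 1]
baseline, losses = permutation_losses(toy_model, rows, y)
print(baseline, losses)
```

Shuffling a variable the model relies on raises the loss, while shuffling an ignored variable leaves it at the baseline, so a larger post-permutation RMSE indicates a more important variable.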

3 Results

3.1 FASD profile

The study initially included 273 patients. However, 42 were excluded: 28 lacked psychological evaluations and 14 refused participation. Of the remaining 231 subjects, 73 were diagnosed as non-FASD (controls), and 158 were diagnosed with FASD, comprising 33 with FAS, 81 with pFAS, and 44 with ARND. A database was compiled with sociodemographic and psychological characteristics of both FASD patients (and their subtypes) and non-FASD participants (Figure 1).

Figure 1. Study flowchart. Flowchart of FASD and non-FASD diagnosis and machine learning prediction.

Sociodemographic characteristics of the population were collected from FASD and non-FASD patients (Table 1). The chi-square test revealed no significant differences between groups. Significant differences were found in physical characteristics between children diagnosed with FASD and non-FASD (Table 2, Supplementary Table 1). Prematurity (p-value = 0.011) was higher in FAS children compared to non-FASD children (p-value = 0.018; Supplementary Table 1). Growth retardation (p-value <0.001) was also higher in FAS and ARND children compared to non-FASD children (p-value <0.001; p-value = 0.040). Birth complications (p-value = 0.047), such as perinatal asphyxia or abnormal heart rate, were more prevalent in ARND children compared to non-FASD children (p-value = 0.052). As expected, maternal alcohol consumption confirmation also showed significant differences (p-value <0.001) in all groups compared to non-FASD patients.

Table 2. Clinical and physical features of children population (n = 231).

FASD patients showed lower height (p < 0.0001) by 81%, 16%, and 25% for FAS, pFAS, and ARND respectively, compared to non-FASD (Table 2, Supplementary Table 1). Weight alterations (p < 0.0001) increased significantly in FAS and pFAS. Microcephaly (p-value <0.0001), shorter palpebral fissures (p-value = 0.01) and lip-philtrum affectation (p-value <0.001) were more prevalent in FAS and pFAS. Facial anomalies (p-value = 0.025) were significantly higher in pFAS (p-value = 0.009) compared to non-FASD group. In particular, children with affected eyes (p-value = 0.013) and upper limbs (p = 0.001) were predominantly from FAS groups compared to non-FASD (p-value = 0.037 and p-value = 0.001, respectively). Moreover, significant disparities in lower limbs (p-value = 0.034) were observed, primarily in ARND compared to non-FASD group.

Among FASD groups (Table 2, Supplementary Table 1), significant differences included prematurity (p-value = 0.009), eyes (p-value = 0.011) and upper limbs affectation (p-value = 0.011) between FAS and ARND. FAS exhibited higher growth retardation levels compared to pFAS (p-value = 0.014). Maternal alcohol consumption, short palpebral fissures and lip-philtrum affectation showed increased levels in FAS (p-value = 0.042, p-value = 0.003 and p-value <0.001, respectively) and pFAS patients (p-value = 0.023, p-value = 0.018 and p-value <0.001, respectively) compared to ARND, confirming that this group does not exhibit physical characteristics. Microcephaly varied among all FASD groups, showing 87% of the cases in FAS, 46% in pFAS and 18% in ARND. No significant differences were noted in other physical or clinical characteristics, except for a trend toward greater cardiac damage in FASD (p-value = 0.06).

Significant differences were observed in psychological intelligence parameters between FASD and non-FASD patients. Evaluating cognitive performance with WISC V, children diagnosed with FASD, specifically FAS and pFAS groups (Table 3, Supplementary Table 2), exhibited lower scores on VCI (p-value = 0.001), VSI (p-value = 0.005), FRI (p-value = 0.001), WMI (p-value <0.001), PSI (p-value = 0.003) and IQ (p-value <0.001). No significant differences were found for the WAIS-IV and WPPSI-IV tests.

Table 3. Intelligence scale data of children population (n = 231).

Regarding behavioral parameters, the CBCL 6–18 test showed impairments in several cognitive domains in FASD patients compared to their non-FASD counterparts (Table 4, Supplementary Table 3). Increased levels of thought problems (p-value = 0.035), rule-breaking behavior (p-value = 0.002), externalizing problems (p-value = 0.045), total problems (p-value = 0.008), anxiety problems (p-value = 0.009), obsessive compulsive problems (OCP; p-value = 0.049) and stress problems (p-value = 0.001) were observed in ARND compared to non-FASD. Significantly increased levels of attention problems were observed in FAS (p-value = 0.021) and ARND (p-value = 0.020) compared to non-FASD. Within the FASD group, significant differences were found in thought problems (p-value = 0.007), anxiety problems (p-value = 0.006) and OCP (p-value = 0.003), with significantly increased levels in the ARND group compared to pFAS. Furthermore, ARND showed high levels of rule-breaking behavior compared to FAS and pFAS subgroups (p-value = 0.002 and p-value = 0.002), externalizing problems (p-value = 0.003 and p-value = 0.002), total problems (p-value = 0.006 and p-value = 0.021), oppositional defiant problems (p-value = 0.002 and p-value = 0.018), conduct problems (p-value = 0.016 and p-value = 0.009) and stress problems (p-value <0.001 and p-value = 0.015), respectively. Moreover, the results display significantly increased levels of aggressive behavior (p-value = 0.01) in the ARND group compared to FAS.

Table 4. Behavior data of children population (n = 231).

Finally, in the adult behavioral test ASR 18–59 (Table 4, Supplementary Table 3), significant differences in attention problems (p-value = 0.032) were observed, with ARND showing higher levels than the pFAS subgroup (p-value = 0.003).

3.2 Machine learning predictive modeling

Predictive models for FASD diagnosis were developed using ML, considering the sociodemographic, clinical, and psychological variables previously discussed. The dataset consisted of 231 samples, with 155 samples used for model training and the remaining 76 samples saved for testing and final model evaluation. A variety of ML algorithms were employed, including LR, SVML, LDA, SVMP, KNN, RF and XGB. These models were trained using four-fold cross-validation on the training dataset.

Figure 2 shows the key performance metrics associated with the predictive power of each model. Among the models, the ensemble algorithms (RF and XGB) outperformed the others. Notably, the RF model achieved the highest accuracy (0.92), precision (0.96), sensitivity (0.92), F1 score (0.94), specificity (0.92), and AUC (0.92), establishing it as the most effective model for predicting FASD diagnosis. Other models such as LR, LDA, SVMP, and kNN showed lower performance on these metrics (Figure 2). Consequently, we selected the RF model for our prediction tasks due to its superior discriminative ability.

Figure 2. Model performance of machine learning algorithms for FASD prediction. LR, Logistic Regression; SVML, Support Vector Machine Linear Kernel; LDA, Linear Discriminant Analysis; SVMP, Support Vector Machine Polynomial Kernel; KNN, k-Nearest Neighbor; RF, Random Forest; XGB, Gradient-Boosted Trees.

To understand the decision-making mechanism of the RF model, we examined the significance of the variables within this algorithm. The features were ranked according to their importance, with maternal alcohol consumption being the most significant (0.48), followed by lip-philtrum (0.27), microcephaly (0.19), height affectation (0.17), Working Memory Index (0.16), aggressive behavior (0.16), Intelligence Quotient (0.15), somatic complaints (0.15), weight affectation (0.15), and depressive problems (0.15; Figure 3). These findings offer crucial insights into the primary attributes associated with FASD conditions and their respective significance in the predictive model.

Figure 3. Mean variable-importance of RF model for FASD prediction. Mean variable importance was calculated by using 50 permutations and the root-mean-squared-error-loss-function for the RF model. RF, Random Forest.

Another aim of our study is to construct individualized models for each category of FASD. This methodology will allow us to uncover distinct attributes and trends that might remain concealed when all FASD types are examined collectively.

Focusing our analysis on FAS prediction in comparison to non-FASD, we employed the previous ML algorithms. The XGB model outperformed the others (Figure 4), achieving the highest accuracy (0.94), precision (0.91), sensitivity (0.91), F1 Score (0.91), specificity (0.96), and AUC (0.93), thereby proving to be the most effective model for FAS diagnosis prediction.

Figure 4. Model performance of machine learning algorithms for FAS prediction. LR, Logistic Regression; SVML, Support Vector Machine Linear Kernel; LDA, Linear Discriminant Analysis; SVMP, Support Vector Machine Polynomial Kernel; KNN, k-Nearest Neighbor; RF, Random Forest; XGB, Gradient-Boosted Trees.

In Figure 5, the features are ranked by importance, with Height (0.32) and Weight (0.28) being the most influential, followed by Fluid Reasoning Index (0.11), Internalizing Problems (0.08), Total Problems (0.1), and Processing Speed Index (0.1). This highlights that FAS prediction is mainly determined by failure to thrive.

Figure 5. Mean variable-importance of XGB model for FAS prediction. Mean variable importance was calculated by using 50 permutations and the root-mean-squared-error-loss-function for the XGB model. XGB, eXtreme Gradient Boosting.

In the subsequent stage of the research, the focus shifted to the prediction of pFAS compared to non-FASD. Upon evaluating all ML models (Figure 6), the RF model emerged as the most proficient, achieving the highest metrics in accuracy (0.90), precision (0.86), sensitivity (0.96), F1 Score (0.91), specificity (0.83), and AUC (0.90). This underscores its effectiveness in predicting pFAS diagnosis.

Figure 6. Model performance of machine learning algorithms for pFAS prediction. LR, Logistic Regression; SVML, Support Vector Machine Linear Kernel; LDA, Linear Discriminant Analysis; SVMP, Support Vector Machine Polynomial Kernel; KNN, k-Nearest Neighbor; RF, Random Forest; XGB, Gradient-Boosted Trees.

Lip-philtrum (0.36) and Maternal Alcohol Consumption (0.27) were the most impactful features for pFAS prediction, followed by Intelligence Quotient (0.21), Microcephaly (0.18), Processing Speed Index (0.16), Verbal Comprehension Index (0.15), Attention Problems (0.15) and Thought Problems (0.15; Figure 7).

Figure 7. Mean variable-importance of RF model for pFAS prediction. Mean variable importance was calculated by using 50 permutations and the root-mean-squared-error-loss-function for the RF model. RF, Random Forest.

The final phase of the study involved the analysis of ARND prediction. As observed for pFAS, the RF model demonstrated superior performance for ARND prediction (Figure 8), obtaining the highest accuracy (0.87), precision (0.76), sensitivity (0.93), F1 Score (0.84), specificity (0.83), and AUC (0.88).

Figure 8. Model performance of machine learning algorithms for ARND prediction. LR, Logistic Regression; SVML, Support Vector Machine linear kernel; LDA, Linear Discriminant Analysis; SVMP, Support Vector Machine polynomial kernel; KNN, k-nearest neighbor; RF, Random Forest; XGB, gradient-boosted trees.

Figure 9 displays Maternal Alcohol Consumption (0.64) as the most influential feature for ARND prediction, followed by Total Problems (0.11) and Attention Problems (0.10). These results confirm the importance of PAE confirmation for diagnosis.

Figure 9. Mean variable-importance of RF model for ARND prediction. Mean variable importance was calculated by using 50 permutations and the root-mean-squared-error-loss-function for the RF model.

4 Discussion

The comprehensive analysis of sociodemographic, clinical, physical, and psychological characteristics in our study provides invaluable insights into the complex nature of FASD and underscores the importance of these features in diagnostic assessment and intervention planning. The variables have been selected based on IOM criteria (Hoyme et al., 2016).

Our study showed that FASD patients share a common profile of maternal alcohol consumption, low height, and lip-philtrum affectation (Hoyme et al., 2016). The FAS profile shows impaired intelligence domains on the WISC-V test, as previously reported (Bastons-Compta et al., 2016). Prematurity, growth retardation, weight affectation, short palpebral fissures, eye and upper limb affectation, and attention problems stand out in the FAS profile compared to non-FASD and are part of the specific diagnosis (Hoyme et al., 2016; Wang et al., 2020). These findings also agree with Maschke et al. (2021), who observed that facial abnormalities correlate with the child's cognitive performance in FRI and WMI in FASD patients. The pFAS profile differs from full FAS, particularly in growth problems and physical traits such as microcephaly and upper limb impairment, since pFAS does not meet all the requirements of full FAS (Hoyme et al., 2016). The ARND profile showed birth diseases, such as perinatal asphyxia or abnormal heart rate, and lower limb affectation. Furthermore, behavioral affectations included thought problems, attention problems, rule-breaking behavior, externalizing, anxiety, obsessive-compulsive, and stress problems. ARND lacks certain physical impairments observed in FAS and pFAS, such as weight and height deficits, microcephaly, short philtrum, and eye and upper limb impairments (Hoyme et al., 2016). These results therefore suggest that FAS and pFAS patients may need educational support therapies and intervention for growth retardation, whereas the ARND group may need a combination of cognitive behavioral therapy, attention training, and psychotherapy to address its range of psychological and behavioral problems.

FASD shares similarities with other neurocognitive disorders such as autism and ADHD, but is distinguished by its association with PAE, which causes a distinct pattern of neurodevelopmental impairments (Rommelse et al., 2010; May and Gossage, 2011). Autism is characterized by elevated FRI and VSI, while ADHD is linked to deficiencies in FRI (Tamm and Juranek, 2012; Happé, 2021). Studies of ASD observed that VCI correlated negatively with communication symptoms, and WMI correlated positively with social symptoms (Rabiee et al., 2019). Similarly, individuals with ADHD exhibit deficits in attention domains, including PSI, WMI, and social cognition (Onandia-Hinchado et al., 2021), which is consistent with observations in FASD. Understanding these differences is crucial for accurate identification, intervention, and support for individuals affected by FASD.

ML has been effectively applied in the medical field to diagnose neurological disorders, including ASD (Vakadkar et al., 2021; Bahathiq et al., 2022; Briguglio et al., 2023) and ADHD (Slobodin et al., 2020; Mikolas et al., 2022; Briguglio et al., 2023; Kim et al., 2023). These studies have demonstrated the potential of ML to increase diagnostic accuracy, reduce time to diagnosis and improve reproducibility. For ASD, ML models have been used to identify key traits using sociodemographic, behavioral characteristics, or magnetic resonance imaging (MRI) results, thereby improving and automating the diagnostic process (Vakadkar et al., 2021; Bahathiq et al., 2022; Briguglio et al., 2023). Similarly, ML classifiers for ADHD have been developed based on clinical and psychological data (i.e. attention, impulsiveness, sleep, and emotional disorders) (Slobodin et al., 2020; Mikolas et al., 2022; Kim et al., 2023).

The implementation of ML in FASD diagnosis is crucial due to the complexity and heterogeneity of the disorder. Previous ML studies predicted FAS risk in pregnant drinkers using questionnaires (Oh et al., 2023b) assessing drinking timing, race, ethnicity, type of alcoholic beverage, prenatal care, and pregnancy complications. However, this approach has inherent limitations: mothers may misreport their consumption, and biological mothers often cannot be assessed after adoption. Traditional diagnostic methods for FASD are often challenging due to multiple factors, such as unconfirmed maternal alcohol consumption or the lack of facial dysmorphology or growth impairments, leading to misdiagnosis or delayed diagnosis (Chasnoff et al., 2015). In recent years, research exploring the use of ML algorithms for the early diagnosis of FASD has shown promising results (Suttie et al., 2024). Ehrig et al. (2023) used physical characteristics (such as body length and head circumference at birth) and neuropsychological parameters (IQ, behavior, memory) as predictor variables, achieving good levels of accuracy (0.85), precision (0.87), sensitivity (0.91), and AUC (0.93). Goh et al. (2016) trained their model using CBCL scales, IQ, and physical examination, obtaining a sensitivity of 64%–81% and a specificity of 78%–80%. Zhang et al. (2019) developed a comprehensive ML framework using eye movements, psychometric tests, and brain imaging to predict FASD. Rodriguez et al. (2021) used magnetic resonance imaging to detect PAE. Duarte et al. (2021) trained their model with the NEPSY-II test, saccadic eye movements, and diffusion tensor imaging. Furthermore, Lussier et al. (2018) used methylation signatures for FASD classification. Fu et al. (2022) devised a transfer learning approach leveraging extensive facial recognition datasets. Using similar inputs, Blanck-Lubarsch et al. (2022) formulated an automated classification algorithm with 3D facial scans.
Our ML model has achieved better accuracy (0.92), precision (0.96), sensitivity (0.92), F1 score (0.94), specificity (0.92), and AUC (0.92) than previous ML algorithms for FASD diagnosis, helping to avoid misdiagnosis in the clinical setting.

This ML prediction aims to be specific to FASD, thereby distinguishing it from other developmental disorders. Unique FASD indicators like PAE confirmation, distinctive facial features, and microcephaly, together with psychometric data, enhance FASD detection. Models incorporating clinical and neuropsychological variables have also been developed for other pathologies, such as ADHD and ASD (Lange et al., 2019; Ehrig et al., 2023). These models have successfully identified these specific pathologies based on a combination of specific variables. Studies by Lange et al. (2019) and Ehrig et al. (2023) used specific parameters that predict FASD compared to ADHD or ASD. These parameters include gestational age; length, weight, and head circumference affectation at birth; low IQ; socially intrusive behavior; rule-breaking behavior; and attention problems. These studies further support the accuracy of ML in predicting FASD, thereby mitigating the risk of misdiagnosis as other neuropathologies.

Based on sociodemographic, clinical, and psychological data from children with FASD, the present study developed a common diagnostic model for FASD, with the RF algorithm as the best predictor. We identified important variables for efficient FASD screening, including classic clinical characteristics for diagnosis like maternal alcohol consumption, lip-philtrum, microcephaly, and height and weight impairment. Other significant variables include the WMI, aggressive behavior, IQ, somatic complaints, and depressive problems. Impairments in WMI and IQ and aggressive behavior are often observed in FASD patients and are considered significant factors in the diagnostic process (Maya-Enero et al., 2021). In addition, our study establishes the domains related to somatic complaints and depressive problems, often reported in FASD patients (Mattson et al., 2011), as key diagnostic indicators.

ML models were also built for each FASD subtype, identifying specific patterns and highlighting the important variables for precise prediction. Our findings provide a detailed analysis for each specific type of FASD, offering clinicians more precise information for diagnosis and treatment planning.

The ML algorithm that best predicted FAS was XGB, and the most important features were traditional physical traits, such as height and weight affectations. Additionally, neuropsychological variables, including FRI, internalizing problems, and total problems, play a crucial role in the prediction of FAS, and their association with FASD has been previously reported (Fagerlund et al., 2011; Popova et al., 2019; Maschke et al., 2021). Furthermore, studies of autism and ADHD found that internalizing problems are also increased, leading to long-term anxiety behavior in adulthood (So et al., 2021; Andersen et al., 2023). These findings suggest that patients primarily affected in these domains may be more likely to exhibit FAS. Interestingly, maternal alcohol consumption, while a significant predictor for pFAS and ARND, does not appear to be a determinant for FAS prediction.

On the other hand, the RF model was the best at predicting pFAS, with classical FASD characteristics such as lip-philtrum affectation, confirmed maternal alcohol consumption, IQ, and microcephaly being the most important variables. However, our study showed that neuropsychological variables like PSI, VCI scores, attention problems, and thought problems also had an impact on pFAS prediction. These results therefore suggest that these neuropsychological variables, previously used to diagnose ADHD (Mikolas et al., 2022), together with classical FASD characteristics, may be relevant in predicting pFAS.

Lastly, ARND prediction was best performed by the RF algorithm, with maternal alcohol consumption being the most predictive variable. Nevertheless, total problems and attention problems also had some impact on ARND prediction. Previous ML studies also determined that these domains, along with other impairments in CBCL, are key factors for bipolar disorder prediction (Uchida et al., 2022).

Conducting a separate machine learning analysis for each type of FASD, once prenatal alcohol exposure has been confirmed, is potentially beneficial for clinical practice. It allows for a better understanding of FASD subtypes and can contribute to more accurate diagnosis and targeted treatment strategies.

5 Conclusions

Our study has made significant progress in applying ML to the diagnosis of FASD. ML algorithms effectively diagnose FASD and its subtypes: FAS, pFAS, and ARND. Key variables for efficient FASD screening include classical clinical characteristics (maternal alcohol consumption, lip-philtrum, microcephaly, height and weight impairment) and neuropsychological variables (WMI, aggressive behavior, IQ, somatic complaints, and depressive problems). The best ML algorithm for predicting FAS was XGB, with height and weight affectations and neuropsychological variables like IQ, internalizing problems, and total problems being the most important features. For pFAS, the RF model was the best predictor, with lip-philtrum affectation, confirmed maternal alcohol consumption, IQ, microcephaly, PSI, VCI, attention problems, and thought problems being the most significant variables. For ARND, the RF algorithm was the best performer, with maternal alcohol consumption, total problems, and attention problems being the most predictive variables. ML improves diagnostic accuracy and enhances understanding of FASD subtypes, leading to early intervention strategies, targeted therapeutic approaches, and ultimately mitigation of the secondary disabilities of FASD. This could help the social and health systems that serve affected individuals and their families, and support consensus on the diagnostic criteria. All of this emphasizes the need for public policies to invest in ML integration into diagnostic strategies in order to improve clinical outcomes for individuals with FASD. ML models will contribute to the development of more informed public health policies focused on this vulnerable population.

6 Limitations

The absence of an ARBD subgroup in our dataset restricts the comprehensiveness of our findings about this FASD subtype. Moreover, self-reported data could introduce bias in variables associated with personal perceptions. External validation on independent datasets is also needed to ensure the robustness of our ML models. Another limitation is related to the confirmation of maternal alcohol consumption, due to incomplete medical records in some adoptees. Additionally, the stress from diagnostic assessments could potentially affect children's performance in neuropsychological tests. Future work could enhance our model by integrating additional data such as magnetic resonance imaging (Rodriguez et al., 2021), the NEPSY-II neuropsychological test (Duarte et al., 2021), and eye movements (Zhang et al., 2019) for a more refined FASD diagnosis.

Despite these limitations, our study advances the application of ML in FASD diagnosis, providing a foundation for future research and contributing to the development of more accurate diagnostic tools.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement

The studies involving humans were approved by Comité Ético de Investigación Clínica Parc de Salut MAR (No. HCB/2021/0459). The studies were conducted in accordance with the local legislation and institutional requirements. Written informed consent for participation in this study was provided by the participants' legal guardians/next of kin.

Author contributions

AR-T: Formal analysis, Investigation, Methodology, Resources, Writing – original draft, Writing – review & editing, Data curation, Software. EN-T: Data curation, Formal analysis, Investigation, Methodology, Writing – review & editing, Supervision. MV: Conceptualization, Investigation, Resources, Writing – review & editing, Data curation, Formal analysis, Methodology, Visualization. AM: Data curation, Investigation, Methodology, Resources, Visualization, Writing – review & editing. MA: Resources, Data curation, Formal analysis, Methodology, Validation, Writing – review & editing. LA: Data curation, Writing – review & editing, Resources. LM: Supervision, Visualization, Validation, Writing – review & editing, Conceptualization, Funding acquisition, Project administration. ÓG-A: Conceptualization, Funding acquisition, Project administration, Supervision, Validation, Writing – review & editing, Investigation, Resources. VA-F: Project administration, Supervision, Writing – original draft, Investigation, Methodology, Resources, Visualization, Writing – review & editing, Conceptualization, Formal Analysis, Funding acquisition.

Funding

The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. This study has been funded by Instituto de Salud Carlos III (ISCIII) through the projects PI19/01853, PI21/01415, and PI23/01220 and co-funded by the European Union. Projects RD21/0012/0017 and RD21/0012/0023 were financed by Instituto de Salud Carlos III (ISCIII) and Unión Europea NextGenerationEU/Mecanismo para la Recuperación y la Resiliencia (MRR)/Plan de Recuperación, Transformación y Resiliencia (PRTR). This research was also funded by Fundación Mutua Madrileña (AP183662023). This study has also been carried out thanks to the support of the Departament de Recerca i Universitats de la Generalitat de Catalunya al Grup de Recerca Infància i Entorn (GRIE) (2021 SGR 01290). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Acknowledgments

The authors acknowledge Visual TEAF, who contributed to patient recruitment and made this research possible.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnins.2024.1400933/full#supplementary-material

References

Achenbach, T., and Rescorla, L. (2003). Manual for the ASEBA Adult Forms and Profiles: For Ages 18-59: Adult Self-Report and Adult Behavior Checklist. Burlington, NJ: Burlington.

Achenbach, T. M. (2004). “Manual for the child behavior checklist and revised child behavior profile,” in Encyclopedia of Psychology, Vol. 2, 2008–2009.

Andersen, P. N., Orm, S., Fossum, I. N., Øie, M. G., and Skogli, E. W. (2023). Adolescence internalizing problems as a mediator between autism diagnosis in childhood and quality of life in emerging adults with and without autism: a 10-year longitudinal study. BMC Psychiatry 23, 1–11. doi: 10.1186/s12888-023-04635-w

Astley, S. J., Bledsoe, J. M., and Davies, J. K. (2016). The essential role of growth deficiency in the diagnosis of fetal alcohol spectrum disorder. Adv. Pediatr. Res. 3:9. doi: 10.12715/apr.2016.3.9

Bahathiq, R. A., Banjar, H., Bamaga, A. K., and Jarraya, S. K. (2022). Machine learning for autism spectrum disorder diagnosis using structural magnetic resonance imaging: promising but challenging. Front. Neuroinform. 16:949926. doi: 10.3389/fninf.2022.949926

Bakoyiannis, I., Gkioka, E., Pergialiotis, V., Mastroleon, I., Prodromidou, A., Vlachos, G. D., et al. (2014). Fetal alcohol spectrum disorders and cognitive functions of young children. Rev. Neurosci. 25, 631–639. doi: 10.1515/revneuro-2014-0029

Bastons-Compta, A., Astals, M., and Garcia-Algar, Ó. (2016). Foetal Alcohol Spectrum Disorder (FASD) diagnostic guidelines: a neuropsychological diagnostic criteria review proposal. Clin. Neuropsychol. Open Access 1, 1–2. doi: 10.4172/2472-095X.1000e104

Blanck-Lubarsch, M., Dirksen, D., Feldmann, R., Bormann, E., and Hohoff, A. (2022). Simplifying diagnosis of fetal alcohol syndrome using machine learning methods. Front. Pediatr. 9:707566. doi: 10.3389/fped.2021.707566

Brennan, D., and Giles, S. (2014). Ocular involvement in fetal alcohol spectrum disorder: a review. Curr. Pharm. Des. 20, 5377–5387. doi: 10.2174/1381612820666140205144114

Briguglio, M., Turriziani, L., Currò, A., Gagliano, A., Di Rosa, G., Caccamo, D., et al. (2023). A machine learning approach to the diagnosis of autism spectrum disorder and multi-systemic developmental disorder based on retrospective data and ADOS-2 score. Brain Sci. 13:883. doi: 10.3390/brainsci13060883

Champagne, M., McCrossin, J., Pei, J., and Reynolds, J. N. (2023). A tornado in the family: fetal alcohol spectrum disorder and aggression during childhood and adolescence: a scoping review. Front. Neurosci. 17:1176695. doi: 10.3389/fnins.2023.1176695

Chasnoff, I. J., Wells, A. M., and King, L. (2015). Misdiagnosis and missed diagnoses in foster and adopted children with prenatal alcohol exposure. Pediatrics 135, 264–270. doi: 10.1542/peds.2014-2171

del Campo, M., and Jones, K. L. (2017). A review of the physical features of the fetal alcohol spectrum disorders. Eur. J. Med. Genet. 60, 55–64. doi: 10.1016/j.ejmg.2016.10.004

Denisko, D., and Hoffman, M. M. (2018). Classification and interaction in random forests. Proc. Natl. Acad. Sci. USA. 115:1690. doi: 10.1073/pnas.1800256115

Duarte, V., Leger, P., Contreras, S., and Fukuda, H. (2021). Using artificial neural network to detect fetal alcohol spectrum disorder in children. Appl. Sci. 11:5961. doi: 10.3390/app11135961

Dylag, K. A., Anunziata, F., Bandoli, G., and Chambers, C. (2023). Birth defects associated with prenatal alcohol exposure—a review. Children 10:811. doi: 10.3390/children10050811

Ehrig, L., Wagner, A. C., Wolter, H., Correll, C. U., Geisel, O., Konigorski, S., et al. (2023). FASDetect as a machine learning-based screening app for FASD in youth with ADHD. NPJ Digit Med. 6:130. doi: 10.1038/s41746-023-00864-1

Eslami, T., Almuqhim, F., Raiker, J. S., and Saeed, F. (2021). Machine learning methods for diagnosing autism spectrum disorder and attention-deficit/hyperactivity disorder using functional and structural MRI: a survey. Front. Neuroinform. 14:575999. doi: 10.3389/fninf.2020.575999

Fagerlund, Å., Autti-Rämö, I., Hoyme, H. E., Mattson, S. N., and Korkman, M. (2011). Risk factors for behavioural problems in foetal alcohol spectrum disorders. Acta Paediatr. 100:1481. doi: 10.1111/j.1651-2227.2011.02354.x

Faul, F., Erdfelder, E., Lang, A. G., and Buchner, A. (2007). G*Power 3: a flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behav. Res. Methods 39, 175–191. doi: 10.3758/BF03193146

Fu, Z., Jiao, J., Suttie, M., and Noble, J. A. (2022). Facial anatomical landmark detection using regularized transfer learning with application to fetal alcohol syndrome recognition. IEEE J. Biomed. Health Inform. 26, 1591–1601. doi: 10.1109/JBHI.2021.3110680

Gardner-Lubbe, S. (2021). Linear discriminant analysis for multiple functional data analysis. J. Appl. Stat. 48:1917. doi: 10.1080/02664763.2020.1780569

Geier, D. A., and Geier, M. R. (2022). Fetal alcohol syndrome and the risk of neurodevelopmental disorders: a longitudinal cohort study. Brain Dev. 44, 706–714. doi: 10.1016/j.braindev.2022.08.002

Glass, L., Moore, E. M., Akshoomoff, N., Jones, K. L., Riley, E. P., Mattson, S. N., et al. (2017). Academic difficulties in children with prenatal alcohol exposure: presence, profile, and neural correlates. Alcohol. Clin. Exp. Res. 41:1024. doi: 10.1111/acer.13366

Goh, P. K., Doyle, L. R., Glass, L., Jones, K. L., Riley, E. P., Coles, C. D., et al. (2016). A decision tree to identify children affected by prenatal alcohol exposure. J. Pediatr. 177, 121–127.e1. doi: 10.1016/j.jpeds.2016.06.047

Graham, J. W. (2009). Missing data analysis: making it work in the real world. Annu. Rev. Psychol. 60, 549–576. doi: 10.1146/annurev.psych.58.110405.085530

Gupta, K. K., Attri, J. P., Singh, A., Kaur, H., and Kaur, G. (2016). Basic concepts for sample size calculation: critical step for any clinical trials! Saudi J. Anaesth. 10:328. doi: 10.4103/1658-354X.174918

Hammond, L., Joly, V., Kapasi, A., Kryska, K., Andrew, G., Oberlander, T. F., et al. (2022). Adaptive behavior, sleep, and physical activity in adolescents with fetal alcohol spectrum disorder. Res. Dev. Disabil. 131:104366. doi: 10.1016/j.ridd.2022.104366

Happé, F. (2021). “Fluid intelligence,” in Encyclopedia of Autism Spectrum Disorders 2055–2055 (Cham: Springer). doi: 10.1007/978-3-319-91280-6_1731

Hendricks, G., Malcolm-Smith, S., Adnams, C., Stein, D. J., and Donald, K. A. M. (2019). Effects of prenatal alcohol exposure on language, speech and communication outcomes: a review longitudinal studies. Acta Neuropsychiatr. 31:74. doi: 10.1017/neu.2018.28

Hosmer, D. W. Jr., Lemeshow, S., and Sturdivant, R. X. (2013). Applied Logistic Regression, Vol. 398. John Wiley & Sons. doi: 10.1002/9781118548387

Hoyme, H. E., Kalberg, W. O., Elliott, A. J., Blankenship, J., Buckley, D., Marais, A. S., et al. (2016). Updated clinical guidelines for diagnosing fetal alcohol spectrum disorders. Pediatrics 138:e20154256. doi: 10.1542/peds.2015-4256

Hoyme, H. E., May, P. A., Kalberg, W. O., Kodituwakku, P., Gossage, J. P., Trujillo, P. M., et al. (2005). A practical clinical approach to diagnosis of fetal alcohol spectrum disorders: clarification of the 1996 institute of medicine criteria. Pediatrics 115, 39–47. doi: 10.1542/peds.2004-0259

Huang, S., Nianguang, C. A. I., Penzuti Pacheco, P., Narandes, S., Wang, Y., Wayne, X. U., et al. (2018). Applications of support vector machine (SVM) learning in cancer genomics. Cancer Genomics Proteomics 15:41. doi: 10.21873/cgp.20063

Jańczewska, I., Wierzba, J., Cichoń-Kotek, M., and Jańczewska, A. (2019). Fetal alcohol spectrum disorders – diagnostic difficulties in the neonatal period and new diagnostic approaches. Dev. Period Med. 23:60. doi: 10.34763/devperiodmed.20192301.6066

Kelly, S. J., Day, N., and Streissguth, A. P. (2000). Effects of prenatal alcohol exposure on social behavior in humans and other species. Neurotoxicol. Teratol. 22:143. doi: 10.1016/S0892-0362(99)00073-2

Kim, W. P., Kim, H. J., Pack, S. P., Lim, J. H., Cho, C. H., Lee, H. J., et al. (2023). Machine learning-based prediction of attention-deficit/hyperactivity disorder and sleep problems with wearable data in children. JAMA Netw. Open 6:E233502. doi: 10.1001/jamanetworkopen.2023.3502

Kowarik, A., and Templ, M. (2016). Imputation with the R package VIM. J. Stat. Softw. 74, 1–16. doi: 10.18637/jss.v074.i07

Kuhn, M. (2008). Building predictive models in R using the caret package. J. Stat. Softw. 28, 1–26. doi: 10.18637/jss.v028.i05

Lange, S., Probst, C., Gmel, G., Rehm, J., Burd, L., Popova, S., et al. (2017). Global prevalence of fetal alcohol spectrum disorder among children and youth: a systematic review and meta-analysis. JAMA Pediatr. 171, 948–956. doi: 10.1001/jamapediatrics.2017.1919

Lange, S., Shield, K., Rehm, J., Anagnostou, E., and Popova, S. (2019). Fetal alcohol spectrum disorder: neurodevelopmentally and behaviorally indistinguishable from other neurodevelopmental disorders. BMC Psychiatry 19, 1–10. doi: 10.1186/s12888-019-2289-y

Biecek, P. (2018). DALEX: explainers for complex predictive models in R. J. Mach. Learn. Res. 19, 1–5. Available online at: https://jmlr.org/papers/v19/18-416.html

Leenaars, L. S., Denys, K., Henneveld, D., and Rasmussen, C. (2012). The impact of fetal alcohol spectrum disorders on families: evaluation of a family intervention program. Community Ment. Health J. 48, 431–435. doi: 10.1007/s10597-011-9425-6

Li, K., Yao, S., Zhang, Z., Cao, B., Wilson, C. M., Kalos, D., et al. (2022). Efficient gradient boosting for prognostic biomarker discovery. Bioinformatics 38:1631. doi: 10.1093/bioinformatics/btab869

López, O. A. M., López, A. M., and Crossa, J., (eds) (2022). “Support vector machines and support vector regression,” in Multivariate Statistical Machine Learning Methods for Genomic Prediction (Cham: Springer), 337–378. doi: 10.1007/978-3-030-89010-0_9

Lussier, A. A., Morin, A. M., MacIsaac, J. L., Salmon, J., Weinberg, J., Reynolds, J. N., et al. (2018). DNA methylation as a predictor of fetal alcohol spectrum disorder. Clin. Epigenetics 10, 1–14. doi: 10.1186/s13148-018-0439-6

Maschke, J., Roetner, J., Goecke, T. W., Fasching, P. A., Beckmann, M. W., Kratz, O., et al. (2021). Prenatal alcohol exposure and the facial phenotype in adolescents: a study based on meconium ethyl glucuronide. Brain Sci. 11, 1–20. doi: 10.3390/brainsci11020154

Mattson, S. N., Crocker, N., and Nguyen, T. T. (2011). Fetal alcohol spectrum disorders: neuropsychological and behavioral features. Neuropsychol. Rev. 21, 81. doi: 10.1007/s11065-011-9167-9

May, P. A., and Gossage, J. P. (2011). Maternal risk factors for fetal alcohol spectrum disorders: not as simple as it might seem. Alcohol Res. Health 34:15. doi: 10.1111/acer.15193

Maya-Enero, S., Ramis-Fernández, S. M., Astals-Vizcaino, M., and García-Algar, Ó. (2021). Neurocognitive and behavioral profile of fetal alcohol spectrum disorder. An Pediatr. 95, 208.e1–e9. doi: 10.1016/j.anpede.2020.12.012

Mikolas, P., Vahid, A., Bernardoni, F., Süß, M., Martini, J., Beste, C., et al. (2022). Training a machine learning classifier to identify ADHD based on real-world clinical data from medical records. Sci. Rep. 12:12934. doi: 10.1038/s41598-022-17126-x

Ninh, V. K., El Hajj, E. C., Mouton, A. J., and Gardner, J. D. (2019). Prenatal alcohol exposure causes adverse cardiac extracellular matrix changes and dysfunction in neonatal mice. Cardiovasc. Toxicol. 19:389. doi: 10.1007/s12012-018-09503-8

Oh, S. S., Kang, B., Park, J., Kim, S. M., Park, E. C., Lee, S. H., et al. (2023a). Racial/ethnic disparity in association between fetal alcohol syndrome and alcohol intake during pregnancy: multisite retrospective cohort study. JMIR Public Health Surveill. 9:e45358. doi: 10.2196/45358

Oh, S. S., Kuang, I., Jeong, H., Song, J. Y., Ren, B., Moon, J. Y., et al. (2023b). Predicting fetal alcohol spectrum disorders using machine learning techniques: multisite retrospective cohort study. J. Med. Internet Res. 25:e45041. doi: 10.2196/45041

Onandia-Hinchado, I., Pardo-Palenzuela, N., and Diaz-Orueta, U. (2021). Cognitive characterization of adult attention deficit hyperactivity disorder by domains: a systematic review. J. Neural Transm. 128, 893–937. doi: 10.1007/s00702-021-02302-6

Peadon, E., and Elliott, E. J. (2010). Distinguishing between attention-deficit hyperactivity and fetal alcohol spectrum disorders in children: clinical guidelines. Neuropsychiatr. Dis. Treat. 6:509. doi: 10.2147/NDT.S7256

Pei, J., Denys, K., Hughes, J., and Rasmussen, C. (2011). Mental health issues in fetal alcohol spectrum disorder. J. Ment. Health 20, 473–483. doi: 10.3109/09638237.2011.577113

Popova, S., Charness, M. E., Burd, L., Crawford, A., Hoyme, H. E., Mukherjee, R. A. S., et al. (2023). Fetal alcohol spectrum disorders. Nat. Rev. Dis. Primers 9:11. doi: 10.1038/s41572-023-00420-x

Popova, S., Lange, S., Poznyak, V., Chudley, A. E., Shield, K. D., Reynolds, J. N., et al. (2019). Population-based prevalence of fetal alcohol spectrum disorder in Canada. BMC Public Health 19:845. doi: 10.1186/s12889-019-7213-3

Rabiee, A., Samadi, S. A., Vasaghi-Gharamaleki, B., Hosseini, S., Seyedin, S., Keyhani, M., et al. (2019). The cognitive profile of people with high-functioning autism spectrum disorders. Behav. Sci. 9. doi: 10.3390/bs9020020

Raiford, S. E., and Coalson, D. (2014). Essentials of WPPSI-IV Assessment, 1st Edn. Wiley. Available online at: https://www.perlego.com/book/1001613/essentials-of-wppsiiv-assessment-pdf (accessed October 14, 2022).

Rasmussen, C. (2005). Executive functioning and working memory in fetal alcohol spectrum disorder. Alcohol. Clin. Exp. Res. 29, 1359–1367. doi: 10.1097/01.alc.0000175040.91007.d0

Robin, X., Turck, N., Hainard, A., Tiberti, N., Lisacek, F., Sanchez, J. C., et al. (2011). pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics 12, 1–8. doi: 10.1186/1471-2105-12-77

Rodrigues, P. M., Madeiro, J. P., and Marques, J. A. L. (2023). Enhancing health and public health through machine learning: decision support for smarter choices. Bioengineering 10:792. doi: 10.3390/bioengineering10070792

Rodriguez, C. I., Vergara, V. M., Davies, S., Calhoun, V. D., Savage, D. D., Hamilton, D. A., et al. (2021). Detection of prenatal alcohol exposure using machine learning classification of resting-state functional network connectivity data. Alcohol 93:25. doi: 10.1016/j.alcohol.2021.03.001

Rommelse, N. N. J., Franke, B., Geurts, H. M., Hartman, C. A., and Buitelaar, J. K. (2010). Shared heritability of attention-deficit/hyperactivity disorder and autism spectrum disorder. Eur. Child Adolesc. Psychiatry 19:281. doi: 10.1007/s00787-010-0092-x

Slobodin, O., Yahav, I., and Berger, I. (2020). A machine-based prediction model of ADHD using CPT data. Front. Hum. Neurosci. 14:560021. doi: 10.3389/fnhum.2020.560021

Smith, S. M., Garic, A., Berres, M. E., and Flentke, G. R. (2014). Genomic factors that shape craniofacial outcome and neural crest vulnerability in FASD. Front. Genet. 5:100524. doi: 10.3389/fgene.2014.00224

So, F. K., Chavira, D., and Lee, S. S. (2021). ADHD and ODD dimensions: time varying prediction of internalizing problems from childhood to adolescence. J. Atten. Disord. 26, 932–941. doi: 10.1177/10870547211050947

Suttie, M., Kable, J., Mahnke, A. H., and Bandoli, G. (2024). Machine learning approaches to the identification of children affected by prenatal alcohol exposure: a narrative review. Alcohol Clin. Exp. Res. 48, 585–595. doi: 10.1111/acer.15271

Tamm, L., and Juranek, J. (2012). Fluid reasoning deficits in children with ADHD: evidence from fMRI. Brain Res. 1465, 48–56. doi: 10.1016/j.brainres.2012.05.021

Temple, V. K., Cook, J. L., Unsworth, K., Rajani, H., and Mela, M. (2019). Mental health and affect regulation impairment in fetal alcohol spectrum disorder (FASD): results from the Canadian National FASD Database. Alcohol Alcohol. 54, 545–550. doi: 10.1093/alcalc/agz049

Treit, S., Zhou, D., Chudley, A. E., Andrew, G., Rasmussen, C., Nikkel, S. M., et al. (2016). Relationships between head circumference, brain volume and cognition in children with prenatal alcohol exposure. PLoS ONE 11:150370. doi: 10.1371/journal.pone.0150370

Uchida, M., Bukhari, Q., DiSalvo, M., Green, A., Serra, G., Hutt Vater, C., et al. (2022). Can machine learning identify childhood characteristics that predict future development of bipolar disorder a decade later? J. Psychiatr. Res. 156:261. doi: 10.1016/j.jpsychires.2022.09.051

Vakadkar, K., Purkayastha, D., and Krishnan, D. (2021). Detection of autism spectrum disorder in children using machine learning techniques. SN Comput. Sci. 2, 1–9. doi: 10.1007/s42979-021-00776-5

Wang, R., Martin, C. D., Lei, A. L., Hausknecht, K. A., Ishiwari, K., Richards, J. B., et al. (2020). Prenatal ethanol exposure leads to attention deficits in both male and female rats. Front. Neurosci. 14:506255. doi: 10.3389/fnins.2020.00012

Wechsler, D. (2008). WAIS-IV Administration and Scoring Manual [WWW Document]. PsychCorp. Available online at: https://books.google.es/books/about/WAIS_IV_Administration_and_Scoring_Manua.html?id=Bf-DswEACAAJ&redir_esc=y (accessed February 24, 2024).

Weiss, L. G., Saklofske, D. H., Holdnack, J. A., and Prifitera, A. (2019). WISC-V Assessment and Interpretation: Clinical Use and Interpretation, 2nd Edn. Academic Press.

Wozniak, J. R., Riley, E. P., and Charness, M. E. (2019). Clinical presentation, diagnosis, and management of fetal alcohol spectrum disorder. Lancet Neurol. 18, 760–770. doi: 10.1016/S1474-4422(19)30150-4

Young, S., Absoud, M., Blackburn, C., Branney, P., Colley, B., Farrag, E., et al. (2016). Guidelines for identification and treatment of individuals with attention deficit/hyperactivity disorder and associated fetal alcohol spectrum disorders based upon expert consensus. BMC Psychiatry 16, 1–14. doi: 10.1186/s12888-016-1027-y

Zhang, C., Paolozza, A., Tseng, P. H., Reynolds, J. N., Munoz, D. P., Itti, L., et al. (2019). Detection of children/youth with fetal alcohol spectrum disorder through eye movement, psychometric, and neuroimaging data. Front. Neurol. 10:80. doi: 10.3389/fneur.2019.00080

Zhang, Z. (2016). Introduction to machine learning: k-nearest neighbors. Ann. Transl. Med. 4:218. doi: 10.21037/atm.2016.03.37

Keywords: fetal alcohol spectrum disorders, machine learning, eXtreme Gradient Boosting (XGB), Random Forest (RF), neurodevelopment, PAE, early diagnosis

Citation: Ramos-Triguero A, Navarro-Tapia E, Vieiros M, Mirahi A, Astals Vizcaino M, Almela L, Martínez L, García-Algar Ó and Andreu-Fernández V (2024) Machine learning algorithms to the early diagnosis of fetal alcohol spectrum disorders. Front. Neurosci. 18:1400933. doi: 10.3389/fnins.2024.1400933

Received: 14 March 2024; Accepted: 15 April 2024;
Published: 06 May 2024.

Edited by:

Giorgia Coratti, Agostino Gemelli University Polyclinic (IRCCS), Italy

Reviewed by:

Vannessa Duarte, Catholic University of the North, Chile
Auberth Venson, State University of Londrina, Brazil
Shana Hayes, Columbia University, United States

Copyright © 2024 Ramos-Triguero, Navarro-Tapia, Vieiros, Mirahi, Astals Vizcaino, Almela, Martínez, García-Algar and Andreu-Fernández. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Vicente Andreu-Fernández, vandreu@universidadviu.com

^†These authors have contributed equally to this work and share first authorship

^‡These authors share last authorship

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.