Machine learning algorithms to the early diagnosis of fetal alcohol spectrum disorders

Introduction Fetal alcohol spectrum disorders (FASD) include a variety of physical and neurocognitive disorders caused by prenatal alcohol exposure. Although their overall prevalence is around 0.77%, FASD remain underdiagnosed and little known, partly because of the complexity of their diagnosis, whose symptoms overlap with those of other pathologies such as autism spectrum, depression or hyperactivity disorders. Methods This study included 73 controls and 158 patients diagnosed with FASD. Variables were selected according to the 2016 IOM classification and included sociodemographic, clinical, and psychological characteristics. Statistical analysis included the Kruskal-Wallis test for quantitative factors, the Chi-square test for qualitative variables, and Machine Learning (ML) algorithms for predictions. Results This study explores the application of ML in diagnosing FASD and its subtypes: Fetal Alcohol Syndrome (FAS), partial FAS (pFAS), and Alcohol-Related Neurodevelopmental Disorder (ARND). ML constructed a profile for FASD based on socio-demographic, clinical, and psychological data from children with FASD compared to a control group. The Random Forest (RF) model was the most efficient for predicting FASD, achieving the highest metrics in accuracy (0.92), precision (0.96), sensitivity (0.92), F1 Score (0.94), specificity (0.92), and AUC (0.92). For FAS, the XGBoost model obtained the highest accuracy (0.94), precision (0.91), sensitivity (0.91), F1 Score (0.91), specificity (0.96), and AUC (0.93). In the case of pFAS, the RF model showed its effectiveness, with high levels of accuracy (0.90), precision (0.86), sensitivity (0.96), F1 Score (0.91), specificity (0.83), and AUC (0.90). For ARND, the RF model obtained the best levels of accuracy (0.87), precision (0.76), sensitivity (0.93), F1 Score (0.84), specificity (0.83), and AUC (0.88). 
Our study identified key variables for efficient FASD screening, including traditional clinical characteristics like maternal alcohol consumption, lip-philtrum, microcephaly, height and weight impairment, as well as neuropsychological variables such as the Working Memory Index (WMI), aggressive behavior, IQ, somatic complaints, and depressive problems. Discussion Our findings emphasize the importance of ML analyses for early diagnoses of FASD, allowing a better understanding of FASD subtypes to potentially improve clinical practice and avoid misdiagnosis.


Introduction
Fetal alcohol spectrum disorder (FASD) is a range of neurodevelopmental impairments produced by prenatal alcohol exposure (PAE) (Hoyme et al., 2005; Popova et al., 2023). Epidemiological studies estimate a global prevalence of 0.77% (Lange et al., 2017), with regional variations, particularly in Europe and North America, where prevalence ranges from 2.0 to 5.0% (Wozniak et al., 2019). Despite its prevalence, FASD remains underdiagnosed because of the wide variety of associated symptoms and the difficulty of attributing some of them specifically to FASD, since they can overlap with alternative diagnoses such as attention deficit hyperactivity disorder (ADHD). In addition, social stigma can pose significant challenges for affected individuals, their families and healthcare systems.
Individuals diagnosed with FASD face a wide range of neurocognitive impairments and social challenges that persist throughout their lives (Kelly et al., 2000; Champagne et al., 2023). Primary disabilities associated with FASD include impairments in adaptive functioning, memory, attention, abstract thinking, judgement, and cause-effect reasoning (Maya-Enero et al., 2021). Secondary disabilities, which result from the interaction of primary disabilities with environmental factors, can adversely affect an individual's ability to actively and positively participate in their lives and can lead to academic failure, low self-esteem, housing instability, and depression (Pei et al., 2011; Leenaars et al., 2012). The complex interplay between these cognitive impairments and social difficulties highlights the need for early comprehensive diagnostic strategies to adequately support affected individuals.
The diagnostic criteria for FASD are multifaceted and include four domains: PAE, facial features, growth, and neurodevelopment (Hoyme et al., 2016). These domains create a spectrum from the most severe condition of FASD, fetal alcohol syndrome (FAS), to alcohol-related birth defects (ARBD), with partial FAS (pFAS) and alcohol-related neurodevelopmental disorder (ARND) as intermediate terms. Several guidelines are commonly used, including those from the Institute of Medicine, the Canadian Guidelines, the Centers for Disease Control (CDC), and the University of Washington's 4-digit code (Bastons-Compta et al., 2016; Maya-Enero et al., 2021). The Institute of Medicine (IOM) criteria, which include craniofacial anomalies, growth retardation, mental disabilities and developmental disorders, are currently recommended for diagnosis (Hoyme et al., 2016). However, this approach has limitations, such as difficult physical assessments, extensive neuropsychological assessments and underreporting of alcohol use during pregnancy. Obtaining a confirmed history of alcohol use during pregnancy is hampered by various factors, such as change of custody or maternal death. Consequently, a significant proportion of individuals with FASD remain undiagnosed or receive delayed diagnoses, exacerbating their difficulties and limiting their access to early interventions and support services (Jańczewska et al., 2019).
The search for novel strategies and methodologies for early diagnosis is one of the most promising fields of research in FASD. Timely identification of affected individuals is crucial for implementing personalized interventions and mitigating the long-term impact of the disorder on cognitive, social, and behavioral outcomes. In this context, emerging technologies, such as machine learning, offer promising avenues for improving diagnostic accuracy and efficiency (Rodrigues et al., 2023).
Machine learning (ML) algorithms have demonstrated impressive capabilities in analyzing complex datasets and extracting meaningful patterns in other disorders such as autism spectrum disorder (ASD) or ADHD (Eslami et al., 2021; Bahathiq et al., 2022; Ehrig et al., 2023). By harnessing the power of computational algorithms, researchers can integrate diverse data sources, including physical and cognitive variables, to develop predictive models for FASD diagnosis (Blanck-Lubarsch et al., 2022; Ehrig et al., 2023). Such models have the potential to augment existing diagnostic frameworks, enabling clinicians to make more informed decisions and speeding up the diagnostic process.
In the present study, supervised classification ML algorithms were employed to construct a predictive diagnosis model of FASD and its subtypes. The model was trained using sociodemographic, clinical and psychological variables. ML provides a powerful tool for prediction and feature importance determination, especially when data patterns may be too complex for conventional statistical methods. The algorithms investigated include Logistic Regression (LR), Linear Discriminant Analysis (LDA), Support Vector Machine (SVM), K-nearest Neighbors (KNN), Random Forest (RF) and eXtreme Gradient Boosting (XGB).
This study aims to develop ML algorithms that use physical and neurocognitive data from children with FASD. The algorithms are intended to identify a distinctive FASD profile in the dataset and thereby enhance FASD diagnosis compared to current methods. This research aims to provide more accurate diagnostic tools for the assessment of FASD, which could improve clinical practice by facilitating the early initiation of therapies and improving the quality of life of people affected by this silent disease.

Material and methods

Study design and participant information
This is a multicentre pilot study. It included all patients from the Catalan Institute for Fostering and Adoption (ICAA) database who agreed to participate. The total study cohort comprised 231 participants: 73 controls and 158 patients diagnosed with FASD. The study, registered at clinicaltrials.gov (NCT02558933), integrated cohorts from previous investigations (PI13/01135; OG085818; PI16/00566; PI19/01853) that included participants enrolled between March 2017 and November 2023. The study was conducted at the Hospital del Mar Medical Research Institute of Barcelona and Hospital Clinic of Barcelona, and all procedures adhered to the ethical standards outlined in the Declaration of Helsinki and Spanish data privacy regulations. Consent was obtained from the caregiver or legal representative of patients due to their incapacity to provide informed consent, as approved by the Comité Ético de Investigación Clínica Parc de Salut MAR (No. HCB/2021/0459).
The minimum sample size was calculated using G*Power software (Faul et al., 2007) with the following parameters: bilateral contrast, alpha of 0.1, beta cut-off of 0.2, corresponding to a power of 0.8 (Gupta et al., 2016), estimated proportion of replacements required of 20%, precision of 0.1 (90%) and an estimated 50% of the population affected. According to this calculation, a minimum of 70 samples in two independent groups (non-FASD and FASD groups, or non-FASD vs. each FASD subtype) was required.

FASD diagnosis and clinical evaluation
To diagnose FASD, all adopted children in the EEC who were included in this study, including those with verified prenatal alcohol exposure (PAE), underwent independent examination using standardized dysmorphology exams (Hoyme et al., 2016). The diagnostic category for each child was identified based on the 1996 IOM standards (reviewed in 2016) (Hoyme et al., 2005, 2016), which consist of five diagnostic characteristics: (1) confirmed prenatal alcohol exposure; (2) evidence of a characteristic pattern of minor facial abnormalities, typified by a thin upper lip, smooth philtrum and short palpebral fissures; (3) growth retardation, defined as height or weight ≤10th percentile; (4) evidence of deficient brain growth or subrogated data; and (5) behavioral or cognitive affected domains (1 or 2) related to prenatal alcohol exposure. For a diagnosis of complete FAS, criteria 2, 3, 4, and 5 (with confirmed or unconfirmed prenatal alcohol exposure) were required. For partial FAS, criteria 1, 2, and at least one of criteria 5 (confirmed prenatal alcohol exposure), or criteria 2, 5, and 3 or 4 (no confirmed prenatal alcohol exposure), were required. The diagnosis of alcohol-related birth defects (ARBD) required criterion 1 plus a minimum of one structural defect involving the heart, skeleton, kidney, eye or ear, or minor abnormalities such as railroad track ears, midface hypoplasia or hockey stick palmar creases. The diagnosis of alcohol-related neurodevelopmental disorder (ARND) required criteria 1 and 5.
Adults' cognitive functioning was evaluated using the Wechsler Adult Intelligence Scale, Fourth Edition (WAIS-IV) (Wechsler, 2008). Preschoolers' cognitive abilities were assessed using the Wechsler Preschool and Primary Scale of Intelligence (WPPSI-IV) (Raiford and Coalson, 2014). Additionally, the Adult Self-Report (ASR/18-59) (Achenbach and Rescorla, 2003) collected self-reported data on behavioral concerns in adults, while the Child Behavior Checklist (CBCL) (Achenbach, 2004) gathered parental reports on children aged 6-18. All assessments adhered to unified criteria and professionals received standardized training, ensuring consistency and reliability. Data were recorded in a confidential database, maintaining accuracy and confidentiality throughout the research process.

Statistical analysis
Statistical analysis was performed using SPSS v22 and R. Graphs were produced using GraphPad Prism 8.0 software. Descriptive analysis was used to characterize the samples. Categorical variables were presented as counts and percentages, while continuous variables were presented as means and standard deviations. Relationships between sociodemographic, clinical and neuropsychological features were examined using the Kruskal-Wallis test with Dunn's correction for multiple comparisons for quantitative factors and the chi-square test for qualitative variables. A significance level of p < 0.05 was applied to all analyses.
In addition to the aforementioned statistical tests, machine learning (ML) algorithms were employed to create a predictive model, using the statistical software R (version 3.3.0 or later).

Machine learning models
This study employed several ML algorithms to predict FASD and its subtypes (FAS, pFAS and ARND): LR, LDA, linear SVM, polynomial SVM, KNN, RF and XGB (Zhang et al., 2019). The data first underwent a preparation phase, in which the "mice" function from the VIM package was used to impute missing values with the predictive mean matching method (pmm) (Kowarik and Templ, 2016). This process was repeated 5 times, as per the default setting. Just 1% of the data were missing, and single imputation is considered appropriate when <5% of the data are missing (Graham, 2009). The dataset was subsequently scaled using the "scale" function in base R.
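As an illustration of the scaling step, the following minimal Python sketch standardizes one numeric column by centering it on its mean and dividing by its sample standard deviation, which is what base R's "scale" does with its default arguments. The function name scale_column and the example values are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def scale_column(values):
    """Center a numeric column on its mean and divide by its sample SD (n - 1)."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (denominator n - 1)
    if s == 0:
        raise ValueError("a constant column cannot be scaled")
    return [(v - m) / s for v in values]


# Hypothetical measurements, for illustration only.
heights_cm = [110.0, 120.0, 130.0]
scaled = scale_column(heights_cm)
```

After scaling, each variable has mean 0 and unit sample variance, so distance-based methods such as KNN and SVM are not dominated by variables measured on larger scales.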
LR is a binary classification algorithm that uses a logistic function to predict class probability. Coefficients are obtained using maximum likelihood estimation (Hosmer et al., 2013). LR is easy to implement and performs well on linearly separable classes. However, it may overfit with many features and struggles with complex relationships. LDA projects data into a lower-dimensional space, maximizing class separability and minimizing within-class variance, by finding a linear combination of features that characterizes each group (Gardner-Lubbe, 2021). SVM finds the separating hyperplane that maximizes the distance (margin) to the observations being classified (Huang et al., 2018). In our study, linear SVM and polynomial SVM are differentiated. Linear SVM classifies linearly separable data and is more computationally efficient. Polynomial SVM classifies non-linearly separable data by transforming the input space into a higher-dimensional space; it captures more complex relationships but is computationally intensive (López et al., 2022). KNN predicts class by calculating the Euclidean distance to all training points and selecting the K most similar instances (the neighbors). It handles multiclass classification and learns complex decision boundaries (Zhang, 2016). However, it performs poorly on high-dimensional datasets because the distance to all neighbors must be recalculated. Ensemble methods like RF and XGB are decision tree-based algorithms. RF combines multiple independently trained decision trees, uses bagging to create subsets of the original dataset, and then aggregates the results (Denisko and Hoffman, 2018). XGB, on the other hand, trains decision trees sequentially, with each new tree correcting errors made by the previous one (Li et al., 2022). Each of the ML algorithms used in this study has its own strengths and weaknesses, leading to varied results. The range of ML algorithms compared spans from traditional predictive models like LR to more complex ensemble methods like RF and XGB, which are capable of handling high-dimensional data. By comparing different models, our study aimed to find the most effective model for predicting FASD and its subtypes. This diversity in approaches enhances the robustness and comprehensiveness of the study.
For the analysis, a total of 66 variables were selected, encompassing five sociodemographic parameters, 35 clinical features, six intelligence scores and 20 behavioral domains (Tables 1-4). Prior to model construction, a hold-out method was applied to split the data into training and test sets using the "createDataPartition" function from the caret package in R (Kuhn, 2008). Sixty-seven percent of the data was allocated to the training set and the remaining 33% to the test set. This function employs a stratified random sampling method, which minimizes the bias of the data distribution and creates balanced data.
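The stratified hold-out split can be sketched in Python as follows. This is not the caret implementation: the function stratified_split is hypothetical, and rounding the per-class training count up is an assumption, chosen here because it reproduces the 155/76 split reported for the 73 controls and 158 FASD patients.

```python
import math
import random


def stratified_split(labels, p=0.67, seed=0):
    """Return (train_idx, test_idx), sampling a fraction p within each class."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train = []
    for idxs in by_class.values():
        # Assumption: the per-class training size is rounded up.
        n_train = math.ceil(len(idxs) * p)
        train.extend(rng.sample(idxs, n_train))

    train_set = set(train)
    test = [i for i in range(len(labels)) if i not in train_set]
    return sorted(train), test


# Class sizes from the study cohort: 73 controls and 158 FASD patients.
labels = ["control"] * 73 + ["FASD"] * 158
train_idx, test_idx = stratified_split(labels)
```

Because sampling is done within each class, the control-to-FASD ratio is nearly identical in the training and test sets, which is the property stratification is meant to guarantee.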
In addition to the hold-out method, a resampling method involving four-fold cross-validation with three repeats was adopted. This was implemented using the "trainControl" function from the caret package (Kuhn, 2008). The models were trained using the "train" function, which gathers and simplifies numerous R algorithms for the development of predictive models (Kuhn, 2008), with hyperparameters set to default. The models employed included LR, using the "glm" method and binomial family, and LDA, implemented with the "lda" method, which has "moment" as the default mean and variance estimator. Linear SVM and polynomial SVM were performed using the "svmLinear" and "svmPoly" methods, respectively. Their tuning parameter C, which controls the classification margin, is equal to 1 by default. KNN, an instance-based learner, was run with the "knn" method, also from the caret package, which tunes the hyperparameter k automatically. In addition, RF was employed using the "rf" method, with 500 trees as default. The XGB model used the "xgbTree" method, with 100 maximum iterations by default.
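The resampling scheme (four folds, three repeats) can be illustrated with the following Python sketch, which only generates the index sets and does not fit any model. The function repeated_kfold is hypothetical; as a simplification, the folds here are not stratified by class.

```python
import random


def repeated_kfold(n_samples, k=4, repeats=3, seed=0):
    """Return a list of (fit_idx, held_out_idx) pairs for repeated k-fold CV."""
    rng = random.Random(seed)
    splits = []
    for _ in range(repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # a fresh random partition for every repeat
        folds = [order[i::k] for i in range(k)]
        for held_out in folds:
            held = set(held_out)
            fit = [i for i in order if i not in held]
            splits.append((fit, held_out))
    return splits


# 155 is the training-set size reported in the study.
splits = repeated_kfold(155)
```

With four folds and three repeats, each model is fitted and evaluated 12 times on the training data, and the resampled performance is the average over those 12 held-out folds.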
The "predict" function from the stats package was used to predict classes for the test group. To enable comparisons, the "confusionMatrix" function from the caret package was used to calculate the numbers of true positives, true negatives, false positives and false negatives. These counts provided measures including accuracy, precision, sensitivity, F1 score and specificity. ROC-AUC was obtained using the "roc" function from the pROC package (Robin et al., 2011). Training and test datasets were consistent across FASD and its subgroups, ensuring a fair and valid comparison.
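The measures derived from the confusion matrix follow the standard definitions, sketched below in Python. The function name and the counts are hypothetical: the paper does not report its confusion matrices, and the counts used here were simply chosen so that the rounded values match those reported for the RF model on a 76-sample test set.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute standard binary classification measures from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)      # positive predictive value
    sensitivity = tp / (tp + fn)    # recall, true positive rate
    specificity = tn / (tn + fp)    # true negative rate
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts (not reported in the paper): 52 FASD and 24 control test cases.
metrics = classification_metrics(tp=48, tn=22, fp=2, fn=4)
rounded = {name: round(value, 2) for name, value in metrics.items()}
```

With a test set this small, a single misclassified case changes specificity by about four percentage points, which is worth keeping in mind when comparing models whose metrics differ in the second decimal.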
Feature importance in the models was determined by calculating the Root Mean Square Error (RMSE) loss after permutation. It was obtained with the "explain" function from the DALEX package, with "classification" given as the model type in the arguments (Biecek, 2018). Plots were generated from the object created by the "variable_importance" function from the caret package (Kuhn, 2008).
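The idea behind permutation-based importance with an RMSE loss can be sketched in Python as follows. This is a generic illustration rather than the DALEX implementation: the function names, the toy model and the toy data are all hypothetical. Each feature is shuffled in turn, and the loss of the unchanged model on the shuffled data is averaged over several permutations.

```python
import math
import random


def rmse(y_true, y_prob):
    """Root mean square error between 0/1 labels and predicted probabilities."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true))


def permutation_loss(predict, rows, y_true, n_permutations=10, seed=0):
    """Return (baseline_loss, mean_loss_after_permuting_each_feature)."""
    rng = random.Random(seed)
    baseline = rmse(y_true, [predict(r) for r in rows])
    mean_losses = []
    for j in range(len(rows[0])):
        losses = []
        for _ in range(n_permutations):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # break the link between feature j and the outcome
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            losses.append(rmse(y_true, [predict(r) for r in shuffled]))
        mean_losses.append(sum(losses) / len(losses))
    return baseline, mean_losses


# Toy data: feature 0 determines the class, feature 1 is unrelated noise.
rows = [[1, 0.3], [0, 0.9], [1, 0.5], [0, 0.1], [1, 0.7], [0, 0.2], [1, 0.4], [0, 0.8]]
y = [1, 0, 1, 0, 1, 0, 1, 0]


def toy_model(row):
    """Hypothetical fitted model that relies only on feature 0."""
    return float(row[0])


baseline, mean_losses = permutation_loss(toy_model, rows, y)
```

A feature the model depends on produces a loss well above the baseline once it is permuted, whereas permuting an irrelevant feature leaves the loss unchanged.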

FASD profile
The study initially included 273 patients. However, 42 were excluded: 28 lacked psychological evaluations and 14 refused participation. Of the remaining 231 subjects, 73 were diagnosed as non-FASD (controls), and 158 were diagnosed with FASD, comprising 33 with FAS, 81 with pFAS, and 44 with ARND. A database was compiled with sociodemographic and psychological characteristics of both FASD patients (and their subtypes) and non-FASD participants (Figure 1). Sociodemographic characteristics of the population were collected from FASD and non-FASD patients (Table 1). The chi-square test revealed no significant differences between groups. Significant differences were found in physical characteristics between children diagnosed with FASD and non-FASD children (Table 2, Supplementary Table 1). Prematurity (p-value = 0.011) was higher in FAS children compared to non-FASD children (p-value = 0.018; Supplementary Table 1). Growth retardation (p < 0.001) was also higher in FAS and ARND children compared to non-FASD children (p-value < 0.001; p-value = 0.040). Birth complications (p-value = 0.047), such as perinatal asphyxia or abnormal heart rate, were more prevalent in ARND children compared to non-FASD children (p-value = 0.052). As expected, confirmation of maternal alcohol consumption also showed significant differences (p-value < 0.001) in all groups compared to non-FASD patients.
Among FASD groups (Table 2, Supplementary Table 1), significant differences between FAS and ARND included prematurity (p-value = 0.009), eyes (p-value = 0.011) and upper limbs affectation (p-value = 0.011). FAS exhibited higher growth retardation levels compared to pFAS (p-value = 0.014). Maternal alcohol consumption, short palpebral fissures and lip-philtrum affectation showed increased levels in FAS (p-value = 0.042, p-value = 0.003 and p-value < 0.001, respectively) and pFAS patients (p-value = 0.023, p-value = 0.018 and p-value < 0.001, respectively) compared to ARND, confirming that the ARND group does not exhibit the characteristic physical features. Microcephaly varied among all FASD groups, being present in 87% of the cases in FAS, 46% in pFAS and 18% in ARND. No significant differences were noted in other physical or clinical characteristics, except for a trend toward greater cardiac damage in FASD (p-value = 0.06).
Finally, in the adult behavioral test ASR 18-59 (Table 4, Supplementary Table 3), significant differences related to attention problems (p-value = 0.032) were observed, with ARND showing higher levels than the pFAS subgroup (p-value = 0.003).

Machine learning predictive modeling
Predictive models for FASD diagnosis were developed using ML, considering the sociodemographic, clinical, and psychological variables previously discussed. The dataset consisted of 231 samples, with 155 samples used for model training and the remaining 76 samples saved for testing and final model evaluation. A variety of ML algorithms were employed, including LR, LDA, linear SVM, polynomial SVM (SVMP), kNN, RF and XGB. These models were trained using four-fold cross-validation on the training dataset.
Figure 2 shows the key performance metrics associated with the predictive power of each model. Among the models, the ensemble algorithms (RF and XGB) outperformed the others. Notably, the RF model achieved the highest accuracy (0.92), precision (0.96), sensitivity (0.92), F1 score (0.94), specificity (0.92), and AUC (0.92), establishing it as the most effective model for predicting FASD diagnosis. Other models such as LR, LDA, SVMP, and kNN showed lower performance on these metrics (Figure 2). Consequently, we selected the RF model for our prediction tasks due to its superior discriminative ability.
To understand the decision-making mechanism of the RF model, we examined the significance of the variables within this algorithm. The features were ranked according to their importance, with maternal alcohol consumption being the most significant (0.48), followed by lip-philtrum (0.27), microcephaly (0.19), height affectation (0.17), Working Memory Index (0.16), aggressive behavior (0.16), Intelligence Quotient (0.15), somatic complaints (0.15), weight affectation (0.15), and depressive problems (0.15; Figure 3). These findings offer crucial insights into the primary attributes associated with FASD conditions and their respective significance in the predictive model.
Another aim of our study was to construct individualized models for each category of FASD. This approach allowed us to uncover distinct attributes and trends that might remain concealed when all FASD types are examined collectively.
Focusing our analysis on FAS prediction in comparison to non-FASD, we employed the same ML algorithms. The XGB model outperformed the others (Figure 4), achieving the highest accuracy (0.94), precision (0.91), sensitivity (0.91), F1 Score (0.91), specificity (0.96), and AUC (0.93), thereby proving to be the most effective model for FAS diagnosis prediction.
In Figure 5, the features are ranked by importance, with Height (0.32) and Weight (0.28) being the most influential, followed by Fluid Reasoning Index (0.11), Total Problems (0.1), Processing Speed Index (0.1), and Internalizing Problems (0.08). This indicates that FAS prediction is mainly determined by failure to thrive.
In the subsequent stage of the research, the focus shifted to the prediction of pFAS compared to non-FASD. Upon evaluating all ML models (Figure 6), the RF model emerged as the most proficient, achieving the highest metrics in accuracy (0.90), precision (0.86), sensitivity (0.96), F1 Score (0.91), specificity (0.83), and AUC (0.90). This underscores its effectiveness in predicting pFAS diagnosis.
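The final phase of the study involved the analysis of ARND prediction. As previously observed, the RF model demonstrated superior performance for ARND prediction (Figure 8), obtaining the best levels of accuracy (0.87), precision (0.76), sensitivity (0.93), F1 Score (0.84), specificity (0.83), and AUC (0.88).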

FIGURE 3
Mean variable importance of the RF model for FASD prediction. Mean variable importance was calculated using permutations and the root-mean-squared-error loss function for the RF model. RF, Random Forest.
Frontiers in Neuroscience frontiersin.org

FIGURE 5
Mean variable importance of the XGB model for FAS prediction. Mean variable importance was calculated using permutations and the root-mean-squared-error loss function for the XGB model. XGB, eXtreme Gradient Boosting.

FIGURE 7
Mean variable importance of the RF model for pFAS prediction. Mean variable importance was calculated using permutations and the root-mean-squared-error loss function for the RF model. RF, Random Forest.

FIGURE 9
Mean variable importance of the RF model for ARND prediction. Mean variable importance was calculated using permutations and the root-mean-squared-error loss function for the RF model. RF, Random Forest.
Figure 9 displays Maternal Alcohol Consumption (0.64) as the most influential feature for ARND prediction, followed by Total Problems (0.11) and Attention Problems (0.10). These results confirm the importance of PAE confirmation for diagnosis.

Discussion
The analysis of sociodemographic, clinical, physical, and psychological characteristics in our study provides insights into the complex nature of FASD and underscores the importance of these features in diagnostic assessment and intervention planning. The variables were selected based on IOM criteria (Hoyme et al., 2016).
Our study showed that FASD patients share a common profile of maternal alcohol consumption, low height and lip-philtrum affectation (Hoyme et al., 2016). The FAS profile shows impaired intelligence domains in the WISC-V test, as previously reported (Bastons-Compta et al., 2016). Prematurity, growth retardation, weight affectation, short palpebral fissures, eyes and upper limbs affectation and attention problems are highlighted in the FAS profile compared to non-FASD, being part of the specific diagnosis (Hoyme et al., 2016; Wang et al., 2020). These findings also agree with the study by Maschke et al. (2021), which observed that facial abnormalities correlate with the child's cognitive performance in FRI and WMI in FASD patients. The pFAS profile differs from full FAS, particularly in growth problems and physical traits like microcephaly and upper limb impairment, since pFAS does not meet all the requirements of full FAS (Hoyme et al., 2016). The ARND profile showed birth complications, such as perinatal asphyxia or abnormal heart rate, and lower limbs affectation. Furthermore, behavioral affectations included thought problems, attention problems, rule-breaking behavior, externalizing, anxiety, obsessive-compulsive and stress problems. ARND lacks certain physical impairments, such as weight and height, microcephaly, short philtrum, and eye and upper limb impairments, observed in FAS and pFAS (Hoyme et al., 2016). These results therefore suggest that FAS and pFAS may need therapies for educational support and intervention for growth retardation, whereas the ARND group may need a combination of cognitive behavioral therapy, attention training, and psychotherapy for the range of psychological and behavioral problems.
FASD shares similarities with other neurocognitive disorders such as autism and ADHD, but is distinguished by its association with PAE, which causes a distinct pattern of neurodevelopmental impairments (Rommelse et al., 2010; May and Gossage, 2011). Autism is characterized by elevated FRI and VSI, while ADHD is linked to deficiencies in FRI (Tamm and Juranek, 2012; Happé, 2021). Studies of ASD observed that VCI correlated negatively with communication symptoms, and WMI correlated positively with social symptoms (Rabiee et al., 2019). Similarly, individuals with ADHD exhibit deficits in attention domains, including PSI, WMI, and social cognition (Onandia-Hinchado et al., 2021), which is consistent with observations in FASD. Understanding these differences is crucial for accurate identification, intervention, and support for individuals affected by FASD.
The implementation of ML in FASD diagnosis is crucial due to the complexity and heterogeneity of the disorder. Previous ML studies predicted FAS risk in pregnant drinkers using questionnaires (Oh et al., 2023b) assessing drinking timing, race, ethnicity, alcoholic beverage, prenatal care and pregnancy complications. However, inherent limitations arise from potential maternal misrepresentation and from the impracticality of assessing biological mothers post-adoption. Traditional diagnostic methods for FASD are often challenging because of multiple factors, such as unconfirmed maternal alcohol consumption or the lack of facial dysmorphology or growth impairments, leading to misdiagnosis or delayed diagnosis (Chasnoff et al., 2015). In recent years, research exploring the potential use of ML algorithms for the early diagnosis of FASD has shown promising results (Suttie et al., 2024). Ehrig et al. used physical characteristics (such as body length and head circumference at birth) and neuropsychological parameters (IQ, behavior, memory) as predictor variables, achieving good levels of accuracy (0.85), precision (0.87), sensitivity (0.91) and AUC (0.93) (Ehrig et al., 2023). Goh et al. (2016) trained their model using CBCL scales, IQ and physical examination, obtaining a sensitivity of 64%-81% and specificity of 78%-80%. Zhang et al. (2019) developed a comprehensive ML framework using eye movements, psychometric tests and brain imaging to predict FASD. Rodriguez et al. (2021)
This ML prediction aims to be specific for FASD, thereby distinguishing it from other developmental disorders. Unique FASD indicators like PAE confirmation, distinctive facial features and microcephaly, together with psychometric data, enhance FASD detection. Models incorporating clinical and neuropsychological variables have been developed for other pathologies, such as ADHD and ASD (Lange et al., 2019; Ehrig et al., 2023). These models have successfully identified these specific pathologies based on a combination of specific variables. Studies from Lange et al. (2019) and Ehrig et al. (2023) used specific parameters that predict FASD compared to ADHD or ASD. These parameters include gestational age, length, weight and head circumference affectation at birth, together with low IQ, socially intrusive behavior, rule-breaking behavior and attention problems. These studies further support the accuracy of ML in predicting FASD, thereby mitigating the risk of misdiagnosis as other neuropathologies.
Based on socio-demographic, clinical, and psychological data from children with FASD, the present study has elaborated a common diagnostic model for FASD, with the RF algorithm as the best predictor. We identified important variables for efficient FASD screening, including classic clinical characteristics for diagnosis like maternal alcohol consumption, lip-philtrum, microcephaly, and height and weight impairment. Other significant variables include the WMI, aggressive behavior, IQ, somatic complaints, and depressive problems. Alterations in WMI, IQ and aggressive behavior are often observed in FASD patients and are considered significant factors in the diagnostic process (Maya-Enero et al., 2021). In addition, our study establishes the domains related to somatic complaints and depressive problems, often reported in FASD patients (Mattson et al., 2011), as key diagnostic indicators.
ML models were also built for each FASD subtype, identifying specific patterns and highlighting the variables that matter most for precise prediction. Our findings provide a detailed analysis for each specific type of FASD, offering clinicians more precise information for diagnosis and treatment planning.
The ML algorithm that best predicted FAS was XGB, and the most important features were traditional physical traits, such as height and weight affectations. Additionally, neuropsychological variables, including FRI, internalizing problems and total problems, play a crucial role in the prediction of FAS, and their association with FASD has been reported previously (Fagerlund et al., 2011; Popova et al., 2019; Maschke et al., 2021). Furthermore, studies of autism and ADHD found that internalizing problems are also increased, leading to long-term anxiety behavior in adulthood (So et al., 2021; Andersen et al., 2023). These findings suggest that patients primarily affected in these domains may be more likely to exhibit FAS. Interestingly, maternal alcohol consumption, while a significant predictor for pFAS and ARND, does not appear to be a determinant for FAS prediction.
In contrast, the RF model was the best at predicting pFAS, with classical FASD characteristics such as lip-philtrum impairment, confirmed maternal alcohol consumption, IQ, and microcephaly being the most important variables. However, our study showed that neuropsychological variables such as PSI and VCI scores, attention problems and thought problems also had an impact on pFAS prediction. These results therefore suggest that these neuropsychological variables, previously used to diagnose ADHD (Mikolas et al., 2022), together with classical FASD characteristics, may be relevant in predicting pFAS.
Lastly, ARND was best predicted by the RF algorithm, with maternal alcohol consumption being the most predictive variable. Nevertheless, total problems and attention problems also had some impact on ARND prediction. Previous ML studies also determined that these neurological domains, along with other impairments in the CBCL, are key factors for bipolar disorder prediction (Uchida et al., 2022).
Conducting a separate machine learning analysis for each type of FASD, once prenatal alcohol exposure has been confirmed, is potentially beneficial for clinical practice. It allows for a better understanding of FASD subtypes and can contribute to more accurate diagnosis and targeted treatment strategies.
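As a purely illustrative sketch of the subtype-specific approach described above, the following Python code fits one Random Forest per subtype against a control group and ranks the features by importance. It assumes scikit-learn and NumPy are available. The data are synthetic, the feature names (`f0` to `f3`) are placeholders, and the "driver" feature assigned to each subtype is arbitrary; none of this reproduces the study's dataset, pipeline, or results.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Placeholder feature names; they do not correspond to the study's variables.
FEATURES = ["f0", "f1", "f2", "f3"]

# Arbitrary choice of which synthetic feature drives each subtype label.
# This mapping is an assumption made only so the demo has a known answer.
SUBTYPE_DRIVER = {"FAS": 0, "pFAS": 1, "ARND": 2}


def make_synthetic(driver, n=300, seed=0):
    """Generate synthetic subtype-vs-control data (NOT the study's data)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, len(FEATURES)))
    # The label depends on one driver feature plus a little random noise.
    y = (X[:, driver] + 0.3 * rng.normal(size=n) > 0).astype(int)
    return X, y


def rank_features(X, y, seed=0):
    """Fit a Random Forest and return (name, importance) pairs, highest first."""
    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    model.fit(X, y)
    pairs = zip(FEATURES, model.feature_importances_)
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# One separate model per subtype, each compared against controls.
rankings = {}
for subtype, driver in SUBTYPE_DRIVER.items():
    X, y = make_synthetic(driver, seed=driver)
    rankings[subtype] = rank_features(X, y)
    top_name, top_score = rankings[subtype][0]
    print(f"{subtype}: top feature = {top_name} (importance {top_score:.2f})")
```

Fitting a separate model per subtype lets each ranking reflect that subtype alone, which is the property the subtype-specific analysis relies on; a single multiclass model would instead yield one shared ranking.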

Conclusions
Our study has made significant progress in applying ML to the diagnosis of FASD. ML algorithms effectively diagnose FASD and its subtypes: FAS, pFAS, and ARND. Key variables for efficient FASD screening include classical clinical characteristics (maternal alcohol consumption, lip-philtrum, microcephaly, height and weight impairment) and neuropsychological variables (WMI, aggressive behavior, IQ, somatic complaints, and depressive problems). The best ML algorithm for predicting FAS was XGB, with height and weight impairment and neuropsychological variables like IQ, internalizing problems, and total problems being the most important features. For pFAS, the RF model was the best predictor, with lip-philtrum impairment, confirmed maternal alcohol consumption, IQ, microcephaly, PSI, VCI, attention problems, and thought problems being the most significant variables. For ARND, the RF algorithm was the best performer, with maternal alcohol consumption, total problems, and attention problems being the most predictive variables. ML improves diagnostic accuracy and enhances understanding of FASD subtypes, which can lead to early intervention strategies, targeted therapeutic approaches, and, ultimately, mitigation of the secondary disabilities of FASD. This could help social and health systems serve affected individuals and their families and support consensus on the diagnostic criteria. All of this emphasizes the need for public policies to invest in integrating ML into diagnostic strategies in order to improve clinical outcomes for individuals with FASD. ML models will contribute to the development of more informed public health policies focused on this vulnerable population.

Limitations
The absence of an ARBD subgroup in our dataset restricts the comprehensiveness of our findings about this FASD subtype. Moreover, self-reported data could introduce bias in variables associated with personal perceptions. External validation on independent datasets is also needed to ensure the robustness of our ML models. Another limitation relates to the confirmation of maternal alcohol consumption, due to incomplete medical records in some adoptees. Additionally, the stress from diagnostic assessments could potentially affect children's performance in neuropsychological tests. Future work could enhance our model by integrating additional data such as magnetic resonance imaging (Rodriguez et al., 2021), the NEPSY-II neuropsychological test (Duarte et al., 2021) and eye movement (Zhang et al., 2019) for a more refined FASD diagnosis.
Despite these limitations, our study advances the application of ML in FASD diagnosis, providing a foundation for future research and contributing to the development of more accurate diagnostic tools.
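To make the external validation mentioned above concrete, the threshold-based metrics reported in this study (accuracy, precision, sensitivity, specificity, and F1 score) could be recomputed on an independent cohort from the true and predicted labels alone. The following is a minimal sketch using only the Python standard library. The function name `binary_metrics` and the example labels are hypothetical and made up for illustration; AUC is omitted because it requires predicted scores rather than hard labels.

```python
import math


def binary_metrics(y_true, y_pred):
    """Compute confusion-matrix metrics for binary labels (1 = case, 0 = control)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num, den):
        # Return NaN instead of raising when a denominator is zero.
        return num / den if den else math.nan

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": ratio(tp, tp + fp),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Made-up labels standing in for an external cohort (not real data).
external_true = [1, 1, 1, 1, 0, 0]
external_pred = [1, 1, 1, 0, 1, 0]
metrics = binary_metrics(external_true, external_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Comparing these values with those obtained on the development data would indicate how much performance drops on unseen patients.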

Funding
The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. This study has been funded by Instituto de Salud Carlos III (ISCIII) through the projects PI19/01853, PI21/01415 and PI23/01220 and co-funded by the European Union. Projects RD21/0012/0017 and RD21/0012/0023 were financed by Instituto de Salud Carlos III (ISCIII) and Unión Europea NextGenerationEU/Mecanismo para la Recuperación y la Resiliencia (MRR)/Plan de Recuperación, Transformación y Resiliencia (PRTR). This research was also funded by Fundación Mutua Madrileña (AP183662023). This study has also been carried out thanks to the support of the Departament de Recerca i Universitats de la Generalitat de Catalunya to the Grup de Recerca Infància i Entorn (GRIE) (2021 SGR 01290). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

FIGURE Study flowchart. Flowchart of FASD and non-FASD diagnosis and machine learning prediction.
TABLE Sociodemographic data of children population (n =). Chi-square test was used to compare the outcomes of the different groups: non-FASD, FAS, pFAS and ARND. Significant differences are bold and were considered when p-value < 0.05. ARND, alcohol-related neurodevelopmental disorder; ADHD, attention-deficit/hyperactivity disorder; CNS, central nervous system; FASD, fetal alcohol spectrum disorders; FAS, fetal alcohol syndrome; pFAS, partial fetal alcohol syndrome.
Chi-square test was used to compare the outcomes of the different groups: non-FASD, FAS, pFAS and ARND. Significant differences were considered when p-value < 0.05. ARND, alcohol-related neurodevelopmental disorder; FASD, fetal alcohol spectrum disorders; FAS, fetal alcohol syndrome; pFAS, partial fetal alcohol syndrome.
Kruskal-Wallis test was used to compare the outcomes of the different groups: non-FASD, FAS, pFAS and ARND. Significant differences are bold and were considered when p-value < 0.05. ARND, alcohol-related neurodevelopmental disorder; FASD, fetal alcohol spectrum disorders; FAS, fetal alcohol syndrome; pFAS, partial fetal alcohol syndrome; SD, standard deviation.