AUTHOR=Song Chao, Jiang Zhong-Quan, Hu Li-Fei, Li Wen-Hao, Liu Xiao-Lin, Wang Yan-Yan, Jin Wen-Yuan, Zhu Zhi-Wei

TITLE=A machine learning-based diagnostic model for children with autism spectrum disorders complicated with intellectual disability

JOURNAL=Frontiers in Psychiatry

VOLUME=Volume 13 - 2022

YEAR=2022

URL=https://www.frontiersin.org/journals/psychiatry/articles/10.3389/fpsyt.2022.993077

DOI=10.3389/fpsyt.2022.993077

ISSN=1664-0640

ABSTRACT=Background: Early detection of children with autism spectrum disorder (ASD) and comorbid intellectual disability (ID) supports individualized intervention. Grassroots hospitals lack suitable evaluation and diagnostic tools. This study explores the applicability of machine learning methods for diagnosing ASD with comorbid ID, compared with a traditional regression model.
Method: A total of 241 children with ASD diagnosed between January 2017 and December 2021 in the Developmental Behavior Department of the Children's Hospital Affiliated with the Medical College of Zhejiang University were included in the analysis. This study trained a traditional logistic regression (LR) diagnostic model, a support vector machine (SVM), and two ensemble learning algorithms (random forest (RF) and XGBoost). Socio-demographic and behavioral observation data were used to distinguish whether children with ASD had comorbid ID. Hyperparameters were tuned using grid search with 10-fold cross-validation. The Boruta method was used to select variables. Model performance was evaluated using discrimination, calibration, and decision curve analysis (DCA).
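The tuning step described above (grid search scored by 10-fold cross-validation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's pipeline: the data are synthetic, the model is a toy one-dimensional nearest-neighbour classifier standing in for LR, SVM, RF, and XGBoost, and the names `k_fold_indices`, `cv_score`, and `grid_search` are hypothetical. Boruta variable selection is not shown.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle indices 0..n_samples-1 and deal them into k disjoint folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def knn_predict(train_x, train_y, x, n_neighbors):
    """Toy stand-in model: majority vote among the nearest training points."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    votes = sum(train_y[i] for i in order[:n_neighbors])
    return 1 if 2 * votes > n_neighbors else 0


def cv_score(xs, ys, n_neighbors, folds):
    """Mean held-out accuracy across the folds for one hyperparameter value."""
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        train_x = [xs[i] for i in train]
        train_y = [ys[i] for i in train]
        hits = sum(
            knn_predict(train_x, train_y, xs[i], n_neighbors) == ys[i]
            for i in fold
        )
        scores.append(hits / len(fold))
    return sum(scores) / len(scores)


def grid_search(xs, ys, grid, k=10, seed=0):
    """Score every candidate on the same folds and return the best one."""
    folds = k_fold_indices(len(xs), k, seed)
    results = {value: cv_score(xs, ys, value, folds) for value in grid}
    best = max(results, key=results.get)
    return best, results


# Synthetic one-feature data (an assumption for illustration only).
rng = random.Random(42)
ys = [i % 2 for i in range(100)]
xs = [rng.gauss(2.0 * y, 1.0) for y in ys]

best_n, cv_results = grid_search(xs, ys, grid=[1, 3, 5, 7])
print(best_n, cv_results)
```

Reusing the same folds for every candidate keeps the comparison between hyperparameter values fair, since all candidates are scored on identical held-out samples.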
Result: Among the 241 children with ASD, 98 (40.66%) had comorbid ID. The four diagnostic models distinguished well whether children with ASD had comorbid ID, and the accuracy of SVM is the highest (0.836); SVM and XGBoost have better accuracy (0.800, 0.838); LR has the best sensitivity (0.939), followed by SVM (0.952). The specificity of SVM, RF, and XGBoost was significantly higher than that of LR (0.355). The AUC of the machine learning (ML) models (SVM, 0.835 [95% CI: 0.747–0.944]; RF, 0.829 [95% CI: 0.738–0.920]; XGBoost, 0.845 [95% CI: 0.734–0.937]) did not differ from that of traditional LR (0.858 [95% CI: 0.770–0.944]). Only SVM showed good calibration. In DCA, LR and SVM showed higher net benefit over a wider range of thresholds.
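For readers unfamiliar with the discrimination measures reported above, the sketch below shows how accuracy, sensitivity, specificity, and AUC are computed from binary labels and predictions. The labels, predictions, and scores are made-up toy values, not the study's data, and the function names are hypothetical.

```python
def diagnostic_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity from binary labels (1 = ID)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example (assumed values for illustration only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = diagnostic_metrics(y_true, y_pred)

auc_value = auc([1, 1, 1, 0, 0, 0], [0.9, 0.8, 0.4, 0.7, 0.3, 0.2])
print(metrics, auc_value)
```

A model can combine high sensitivity with low specificity, as the LR figures above illustrate, which is why the measures are reported separately rather than as accuracy alone.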
Conclusion: Compared with the traditional regression model, ML models based on socio-demographic and behavioral observation data, especially SVM, are better able to distinguish whether children with ASD have comorbid ID.