AUTHOR=Li Dekui, Hu Xiaolong, Peng Yongdong
TITLE=The classification method of donkey breeds based on SNPs data and machine learning
JOURNAL=Frontiers in Genetics
VOLUME=Volume 16 - 2025
YEAR=2025
URL=https://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2025.1496246
DOI=10.3389/fgene.2025.1496246
ISSN=1664-8021
ABSTRACT=A method for accurately classifying donkey breeds has been developed by integrating single nucleotide polymorphism (SNP) data with machine learning algorithms. The approach includes preprocessing donkey genomic sequencing data, addressing class imbalance with the Synthetic Minority Over-sampling Technique (SMOTE), and using an improved Leave-One-Out Cross-Validation (LOOCV) scheme for dataset partitioning. Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Random Forest (RF) models were constructed and evaluated. The results showed that the choice of chromosome significantly influences classifier performance: Chr2 gave the highest classification accuracy with KNN, while Chr19 performed best with the SVM and RF models. After data quality was enhanced and class imbalance was addressed, classification performance improved substantially, with accuracy, precision, recall, and F1 score increasing by up to 15% in certain models, particularly on key chromosomes. This method offers an effective solution for donkey breed classification and provides technical support for the conservation and development of donkey genetic resources.
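
The pipeline summarized in the abstract (oversample minority breeds, partition by leave-one-out, score a classifier) can be illustrated with a small pure-Python sketch. This is not the authors' implementation: the abstract does not specify their "improved" LOOCV, so plain leave-one-out with oversampling applied only to each training fold is an assumption here. The function names, the toy genotypes (coded 0/1/2 at four hypothetical SNPs), and the breed labels "A" and "B" are all invented for illustration, and only a KNN classifier is shown (SVM and RF are omitted).

```python
import random
from collections import Counter


def distance(a, b):
    """Euclidean distance between two genotype vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def smote(X, y, k=2, seed=0):
    """SMOTE-style oversampling (simplified): grow each minority class to the
    majority class size by interpolating between a sample and one of its
    k nearest neighbours of the same class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        members = [x for x, lab in zip(X, y) if lab == label]
        if len(members) < 2:
            continue  # a single sample has no neighbour to interpolate with
        for _ in range(target - n):
            base = rng.choice(members)
            others = [m for m in members if m is not base]
            neighbours = sorted(others, key=lambda m: distance(base, m))[:k]
            mate = rng.choice(neighbours)
            gap = rng.random()  # position along the segment base -> mate
            X_out.append([b + gap * (m - b) for b, m in zip(base, mate)])
            y_out.append(label)
    return X_out, y_out


def knn_predict(X_train, y_train, x, k=3):
    """Majority vote among the k training samples closest to x."""
    nearest = sorted(zip(X_train, y_train), key=lambda p: distance(p[0], x))[:k]
    return Counter(lab for _, lab in nearest).most_common(1)[0][0]


def loocv_accuracy(X, y, k=3):
    """Plain leave-one-out: hold out one real sample, oversample the rest,
    then classify the held-out sample."""
    hits = 0
    for i in range(len(X)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        X_bal, y_bal = smote(X_train, y_train, seed=i)
        hits += knn_predict(X_bal, y_bal, X[i], k) == y[i]
    return hits / len(X)


# Hypothetical toy data: 6 samples of breed "A", 3 of breed "B".
X = [[0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 1, 0],
     [0, 0, 1, 1], [2, 2, 1, 2], [2, 1, 2, 2], [2, 2, 2, 1]]
y = ["A"] * 6 + ["B"] * 3

print("LOOCV accuracy:", loocv_accuracy(X, y))
```

Oversampling inside each fold, rather than once before splitting, keeps synthetic points derived from the held-out sample out of the training set. Whether the paper's improved LOOCV does the same is not stated in the abstract.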