AUTHOR=Humayun Fahad , Khan Fatima , Fawad Nasim , Shamas Shazia , Fazal Sahar , Khan Abbas , Ali Arif , Farhan Ali , Wei Dong-Qing TITLE=Computational Method for Classification of Avian Influenza A Virus Using DNA Sequence Information and Physicochemical Properties JOURNAL=Frontiers in Genetics VOLUME=Volume 12 - 2021 YEAR=2021 URL=https://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2021.599321 DOI=10.3389/fgene.2021.599321 ISSN=1664-8021 ABSTRACT=Accurate and fast subtyping of avian influenza A virus (AIAV) hemagglutinin (HA) and neuraminidase (NA) sequences is important for expanded diagnostic services and is embedded in molecular epidemiological studies. A new approach for classifying AIAV HA and NA gene sequences into subtypes using DNA sequence data and physicochemical properties is proposed. The method requires only unaligned, full-length or partial HA or NA DNA sequences as input. It allows quick and highly accurate assignment of HA sequences to subtypes H1–H16 and NA sequences to subtypes N1–N9. For feature extraction, k-gram, discrete wavelet transform, and multivariate mutual information were used, and different classifiers were trained for prediction. Four classifiers, Naïve Bayes, Support Vector Machine (SVM), K-nearest neighbor (KNN), and Decision Tree, were compared using our feature selection method on the 30% of the original dataset that was held out for testing. Four criteria (Precision, Recall, F1 score, and Accuracy) were used to evaluate the predictions. Decision Tree performed best, with Precision, Recall, F1 score, and Accuracy of 0.9514, 0.9535, 0.9524, and 0.9571, respectively, a large improvement over the other three classifiers.
The results show that the proposed feature selection method, when paired with a Decision Tree classifier, gives the best results for accurate prediction of AIAV subtype.
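
The k-gram feature extraction step named in the abstract can be illustrated with a minimal Python sketch. The function name `kgram_features`, the default k = 3, the frequency normalization, and the skipping of windows that contain ambiguous bases are all assumptions made for illustration; the abstract does not specify these details, and the discrete wavelet transform and multivariate mutual information steps are not shown.

```python
from itertools import product


def kgram_features(seq, k=3, alphabet="ACGT"):
    """Return normalized k-gram frequencies for an unaligned DNA sequence.

    Hypothetical helper: the output vector has len(alphabet) ** k entries,
    ordered lexicographically by k-gram (AAA, AAC, AAG, ...).
    """
    seq = seq.upper()
    kgrams = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kgrams, 0)
    total = 0
    # Slide a window of length k over the sequence, one base at a time.
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # assumption: skip windows with ambiguous bases (e.g. N)
            counts[window] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kgrams)
    # Normalize so that partial and full-length sequences are comparable.
    return [counts[g] / total for g in kgrams]
```

Because the vector length depends only on k and not on the sequence length, full-length and partial sequences map to the same feature space without alignment, which is the property the abstract relies on.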