AUTHOR=Toliyat Amir , Levitan Sarah Ita , Peng Zheng , Etemadpour Ronak TITLE=Asian hate speech detection on Twitter during COVID-19 JOURNAL=Frontiers in Artificial Intelligence VOLUME=Volume 5 - 2022 YEAR=2022 URL=https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2022.932381 DOI=10.3389/frai.2022.932381 ISSN=2624-8212 ABSTRACT=COVID-19 started in Wuhan, China, in late 2019, and after proving highly contagious in Asian countries, it rapidly spread to other countries. The disease caused governments worldwide to declare a public health crisis and take drastic measures to contain its spread. The pandemic affected the lives of millions of people. Many people have lost their lives or jobs, and people are going through a wide range of emotions, such as disbelief, shock, concerns about health, fear about food supplies, anxiety, and panic. These developments led to the spread of racism and hate against Asians in Western countries, especially in the United States. Statistics show that anti-Asian hate crime in 16 of America’s largest cities increased by 149% in 2020 [1]. In this study, we first chose a baseline for Americans’ hate crimes against Asians on Twitter. We then presented an approach to balance the biased dataset and consequently improved the performance of tweet classification. We also downloaded 10 million tweets through Twitter API v2; this study uses a small portion of them, and we will use the entire dataset in the experiments of our future work. In this paper, 3,000 tweets (downloaded using Twitter API v2) are annotated by three Asian annotators and one Asian-American annotator. We used a range of machine learning and deep learning methods to build predictive models.
Our machine learning methods include Random Forest [2, 3], K-nearest neighbors (KNN) [4], Support Vector Machine (SVM) [5, 6], Extreme Gradient Boosting (XGBoost) [7], Logistic Regression [8], Decision Tree [9], and Naive Bayes [8]. Our deep learning models include a basic Long Short-Term Memory (LSTM) network [10], Bidirectional LSTM [11], Bidirectional LSTM with dropout [12], Convolution [13], and Bidirectional Encoder Representations from Transformers (BERT) [14]. We also tuned our dataset by varying the required agreement between annotators and the Fleiss' kappa value. Our final results showed that Logistic Regression performed best among the machine learning methods, with an accuracy of 0.80, and BERT performed best among the deep learning models, with an F1-score of 0.85.
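
The abstract mentions tuning the dataset through annotator agreement and Fleiss' kappa but does not give the procedure. The following is a minimal sketch of how Fleiss' kappa could be computed for tweets labeled by a fixed number of annotators, together with a simple agreement filter. The function names (`fleiss_kappa`, `filter_by_agreement`), the label strings, and the `min_votes` threshold are hypothetical illustrations, not the authors' implementation.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a list of items, each rated by the same number of raters.

    ratings: list of lists, one inner list of labels per item (hypothetical layout).
    categories: all possible labels.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number of ratings")

    counts = [Counter(item) for item in ratings]

    # Observed agreement: share of agreeing rater pairs per item, averaged over items.
    per_item = [
        (sum(c[k] ** 2 for k in categories) - n_raters) / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement: sum of squared overall category proportions.
    proportions = [
        sum(c[k] for c in counts) / (n_items * n_raters) for k in categories
    ]
    p_expected = sum(p * p for p in proportions)

    if p_expected == 1.0:
        return float("nan")  # kappa is undefined when only one category is ever used
    return (p_observed - p_expected) / (1.0 - p_expected)


def filter_by_agreement(ratings, min_votes):
    """Keep items whose most common label received at least min_votes votes.

    This is an assumed reading of "modifying the agreement between annotators".
    """
    return [item for item in ratings if Counter(item).most_common(1)[0][1] >= min_votes]


# Toy example with four annotators per tweet (made-up labels, not the paper's data).
toy = [
    ["hate", "hate", "hate", "hate"],
    ["not", "not", "not", "not"],
    ["hate", "hate", "not", "not"],
    ["hate", "not", "not", "not"],
]
kappa_all = fleiss_kappa(toy, ["hate", "not"])
kept = filter_by_agreement(toy, min_votes=3)
kappa_kept = fleiss_kappa(kept, ["hate", "not"])
```

In this toy setting, raising `min_votes` discards tweets on which annotators split evenly, which raises kappa on the retained subset at the cost of a smaller dataset.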