Codon Usage Bias Analysis of Bluetongue Virus Causing Livestock Infection

Bluetongue virus (BTV) is a double-stranded RNA virus with multiple segments and belongs to the genus Orbivirus within the family Reoviridae. BTV is spread to livestock through its dominant vector, biting midges of genus Culicoides. Although great progress has been made in genomic analyses, it is not fully understood how BTVs adapt to their hosts and evade the host’s immune systems. In this study, we retrieved BTV genome sequences from the National Center for Biotechnology Information (NCBI) database and performed a comprehensive research to explore the codon usage patterns in 50 BTV strains. We used bioinformatic approaches to calculate the relative synonymous codon usage (RSCU), codon adaptation index (CAI), effective number of codons (ENC), and other indices. The results indicated that most of the overpreferred codons had A-endings, which revealed that mutational pressure was the major force shaping codon usage patterns in BTV. However, the influence of natural selection and geographical factors cannot be ignored on viral codon usage bias. Based on the RSCU values, we performed a comparative analysis between BTVs and their hosts, suggesting that BTVs were inclined to evolve their codon usage patterns that were comparable to those of their hosts. Such findings will be conducive to understanding the elements that contribute to viral evolution and adaptation to hosts.


INTRODUCTION
Bluetongue virus (BTV) causes a vector-borne viral disease [bluetongue (BT)], is an economically important virus of ruminants that belongs to the genus Orbivirus of the Reoviridae family, and has a genome that consists of multiple segments of double-stranded RNA. Some infected animals develop the disease known as BT, with reference to the characteristic cyanotic tongue and lip mucosa (Saegerman et al., 2007;Maclachlan et al., 2009). According to electrophoretic analyses, BTV proteins are divided into a large fragment group, medium fragment group, and small fragment group (Rossmann and Tao, 1999;Diprose et al., 2001Diprose et al., , 2002. BTV is transmitted to animals through its primary vector, biting midges of genus Culicoides, however, it is also transmitted directly through the placenta or by sex (De Clercq et al., 2008;Courtejoie et al., 2018b). At present, it is clear that there are 28 serotypes of BTV (Bumbarov et al., 2020), which have been distributed worldwide, and this vectorborne viral disease is listed by the World Organization for Animal Health as an infectious disease.
Bluetongue is a viral disease that causes mild fever or facial edema in domestic ruminants and wild ungulates; livestock can die from BTV infection (Courtejoie et al., 2018b). BTV is widely distributed and has caused serious losses to countries worldwide. It was first reported in South Africa and later named by Huntcheon (Mehlhorn et al., 2007). From 1956, BTV-10 from North Africa entered Portugal and Spain, and it gave rise to the deaths of nearly 180,000 sheep; then, BTV-4 entered the Greek Islands in the period from 1979 to 1980 (Mellor et al., 2008). In 1995, by comparing the L1 gene sequences of five serotypes (BTV1,10,11,13,and 17), studies showed that the nucleotide sequences of BTV1, 11, 13, and 17 were shorter than those of BTV10 (Huang et al., 1995). Phylogenetic analysis revealed that the L1 gene was the most conserved and highly homologous among the 10 gene fragments (Huang et al., 1995). Through analysis and comparison of capsid and outer coat protein nucleotide sequences, it is possible to explore the phylogenetic relationship of BTV serotypes using VP2 genes as a determinant (Gould and Pritchard, 1990). In 1999, sequence and phylogenetic analyses of the VP2 gene of BTV strains from China, Australia, South Africa, and the United States indicated that these viruses were grouped on the basis of serotype (Bonneau et al., 1999). During the years 1998-2005, BTV entered many countries that had never encountered this virus, especially around the Mediterranean basin (Purse et al., 2005). Meanwhile, there was a large pool of various BTV serotypes in Europe because of the incursions of BTV-1, BTV-2, BTV-3, BTV-4, BTV-6, BTV-9, BTV-13, and BTV-16, which constituted serious threats to mammals in Europe (Mellor et al., 2008). In 2006, BTV-8 first entered northern Europe, but the origin of the new serotype BTV-8 is still unclear (Courtejoie et al., 2018a). Most recently, a number of other new strains of BTVs have been detected that potentially signified additional virus serotypes (Savini et al., 2017). To date, some studies have implications for BTV vaccine control strategies (Batten et al., 2013;Courtejoie et al., 2018a). The live attenuated vaccines were available for many years, but were less used later with their potential safety issues (Veronesi et al., 2005). Then, the inactivated vaccines were developed and have shown great safety and efficacy in sheep and cattle (Eschbaumer et al., 2009;Gethmann et al., 2009). Currently, with the development of recombinant DNA technology, the intrinsically safe vaccines have been available and been still under development (Calvo-Pinilla et al., 2014;Marin-Lopez and Ortego, 2016). Although multiple BTV vaccines could limit the severity of viral infection, they could not completely prevent the disease.
Degeneracy of genetic codons provides a chance for evolution to improve translation efficiency while keeping the identical amino acid sequence . After a long period of evolution, the synonymous codons used by different species in the process of translation are very different (Nasrullah et al., 2015;Rahman et al., 2018). In general, 64 codons encode 20 various amino acids and three termination codons; therefore, most of the codons are synonymous in the translation process . Notably, synonymous codons appear with different frequencies while coding for the same amino acid, which is known as codon usage bias (Ikemura, 1981(Ikemura, , 1985. The investigation of molecular evolution shows that codon usage bias is widespread in viruses, prokaryotes, and eukaryotes and even exists among different genes in the same organism (Greenbaum et al., 2008;Butt et al., 2016;Rahman et al., 2018). Codon preference is more obvious in genes with higher expression levels than in those with lower expression levels (Jia and Higgs, 2008), which may be caused by mutational and selection forces (Sharp et al., 1993;. Studies on codon usage have suggested that there are several factors forcing codon usage patterns, such as gene expression level, translation, protein secondary motifs, GC content, and transcriptional factors, among others (Cristina et al., 2015;Wang et al., 2016;Ur Rahman et al., 2017;Rahman et al., 2018). However, the major factors are mutational pressure and natural selection, which are thought to cause codon usage variation in organisms (Michael, 1988;Sharp et al., 2005;Butt et al., 2016).
A number of studies have suggested that when compared with natural selection, mutational pressure is the major force establishing codon usage patterns (Sharp et al., 2010;Cristina et al., 2015). However, mutational pressure is not the only driving factor for various DNA or RNA viruses (Butt et al., 2014;Rahman et al., 2018). Compared with the genomes of prokaryotes and eukaryotes, there are some specific features in viral genomes, for instance, depending on their hosts to replicate, synthesize, and transmit protein. This interaction between virus and host is thought to influence the viral survival, adaptation, evolution, and immune escape from the host's immune system (Shackelton et al., 2006;Moratorio et al., 2013;Butt et al., 2014;Rahman et al., 2018). Accordingly, understanding of codon usage in viral genomes may improve the knowledge of molecular evolution and enhance our insight into the regulation of viral gene expression (Butt et al., 2014;Rahman et al., 2018). Thus, the codon usage pattern is a vital element to reflect the evolutionary process and BTV molecular mechanism in escaping host cell responses.
This study focused on 50 different strains of BTV and performed viral genomic analyses for codon usage patterns using available sequences data. We found that mutational pressure makes an important impact on building codon usage patterns in BTV genomes.

Data Description
In our research, complete genomic sequences of 50 BTVs were retrieved from the National Center for Biotechnology Information (NCBI) 1 . Supplementary Table S2 shows the sequence information. For each strain, the ORFs were obtained by Lasergene SeqBuilder (Singh et al., 2016) and aligned using the MUSCLE program (Goñi et al., 2012). Additionally, codon usage data of BTV's hosts, B. taurus, O. aries, and Culicoides, were acquired from the codon usage database 2 .

Nucleotide Components Analysis
Nucleotide compositional analysis of the 50 BTV genomic sequences was analyzed by online software, CAIcal 3 , and local software, codonW 4 . The whole nucleotide frequencies of four types of nucleotides that occurred at the third codon position (U3, G3, C3, and A3) and G + C nucleotides that occurred at the first (GC1), second (GC2), and third (GC3) positions were calculated. In addition, the mean frequency of GC at the first two positions (GC12) and the ratio of AU/CG were also calculated. In this study, we excluded the three stop codons (UAA, UAG, and UGA), AUG and UGG (no synonymous codon).

Codon Preference Characteristics
To determine the codon usage bias pattern of BTV coding sequences, the relative synonymous codon usage (RSCU) of the virus genome coding region was calculated by the software codonW (Sharp and Li, 1986), and dinucleotide content was calculated by SSE v1.2 editor software (Karniychuk, 2016). Furthermore, another vital index of the codon usage pattern is the effective number of codons (ENC). The formula is as follows: where F i is the average homozygosity evaluated for synonymous family type i; n indicates the number of codons in the sequences; k indicates the types of synonymous codons that encode the same amino acid; and p i indicates the ratio of the i codon to all codon numbers encoding the same amino acid (Wright, 1990).

ENC-Plot Analysis
An ENC plot can clarify the relationship between the ENC and the GC content at the third codon position (GC3). This method can vividly demonstrate the usage bias of gene codons.
To evaluate the correlation, the expected ENC values were calculated for the corresponding GC3 using the method of Singh et al. (2016): where s represents G + C contents at the third codon position (GC3s).

Neutral Evolution Analysis
Neutral evolution analysis or the neutrality plot analysis is used to determine the factors that influence the preference of codon usage (Nasrullah et al., 2015). This analysis was performed to determine and compare the extent of influence of mutation pressure and natural selection on the codon usage patterns of BTV by plotting the GC12 values of the synonymous codons against the GC3 values.

Correspondence Analysis (COA)
Correspondence analysis is a multivariate statistical analysis that is used to detect variable and sample relationships. COA displays sets of rows and columns in a particular data set (Wong et al., 2010). In this study, every ORF corresponds to 59 dimensions (59 codons) and every dimension is equivalent to the RSCU value for each codon (except for the Met, Tyr, and stop codons). This approach helps to reflect directly the trend of strain change. The codonW program was used to perform COA based on the RSCU values, and the R ggplot2 package was used to draw visual graphics.

Correlation Analysis
Correlation analysis was used to measure the correlation between variables. Spearman's rank correlation method was performed to analyze the relationship between the codon usage pattern and nucleotide content of the BTV genome (Wu et al., 2015). All statistical procedures were carried out using the R corrplot package, and the related indicators of codon usage bias were obtained by using codonW.

Nucleotide Contents Analysis in BTV
Codon usage patterns are considered to be largely affected by the nucleotide composition (Jenkins and Holmes, 2003;Wong et al., 2010). The nucleotide contents of the BTV complete coding sequences were measured to evaluate the impact of nucleotide composition on the codon usage pattern. The frequency of each nucleotide was as follows: A (30.42% ± 0.14), U (25.86% ± 0.15), C (17.65% ± 0.20), and G (26.07% ±0.23) ( Figure 1A and Table 1, wilcox.test, P < 0.01). It may indicate that A nucleotides of the BTV codons might be used more frequently. To further explore the nucleotide composition analysis of BTVs, mean values were considered for each codon at the third position of synonymous codons (A3, U3, G3, and C3). The percentages of nucleotide composition at the third codon position were A3 (27.45%), U3 (29.27%), G3 (28.21%), and C3 (15.07%) ( Figure 1C and Table 1, wilcox.test, P < 0.01). The average AU and GC contents were calculated to be 56.29 and 43.71%, respectively, emphasizing that the content of AU was enriched in the BTV coding sequences (wilcox.test, P < 0.01). Moreover, the scope of AU3 values ranged from 53.62 to 55.63%, and the average value was 54.90%, with a standard deviation (SD) of 0.40%. GC nucleotide content at different codon positions is a significant index to show base composition bias. The scopes of GC composition are as follows: 50.20 to 51.10% (mean = 50.69%, SD = 0.22%) at the first position of all codons; 36.80 to 37.60% (mean = 37.18%, SD = 0.22%) at the second position of all codons; and 43.60 to 44.30% (mean = 43.93%, SD = 0.17%) at the first and second positions of all codons. In addition, we also calculated  GC12 represents the G + C content at the first and second positions of codons. GC3 represents the G + C content at the third positions of codons. AU3 represents the A + U content at the third positions of codons. Gravy represents the hydrophobicity of protein. ARO represents the aromaticity of protein. the mean AU (56.28% ± 0.21%), GC (43.72% ± 0.21%), AU3 (56.72% ± 0.51%), and GC3 (43.28% ± 0.51%) contents (Table 1 and Figures 1B,D), showing that A/U nucleotides are preferred at the third codon position. It is indicated that in BTV genomes, the compositional constraint plays a vital key in the total nucleotide compositions and the nucleotide composition at the third codon position.

Relative Aynonymous Codon Usage (RSCU) Analysis
To understand the reason why A/U nucleotides were preferred at the third codon position, RSCU analysis was performed to describe the codon usage bias of BTV. The RSCU values of all synonymous codons were calculated for 50 BTV strains and compared with those of their hosts (  (Kattoor et al., 2015;Kumar et al., 2016;Rahman et al., 2018). Analysis of over-and underrepresented codons emphasized that the RSCU values of the majority of codons ranged from 0.6 to 1.6. Remarkably, the results also showed that a majority of overpreferred codons (RSCU > 1.6) had A-endings, while the most underpreferred codons (RSCU < 0.6) had G-endings (

BTV Codon Usage Is Largely Shaped by Mutation Pressure
We continuously calculated the ENC to evaluate the magnitude of codon usage bias among all BTV coding sequences. The range of  the ENC value is between 20 and 61, and the lower the ENC value is, the stronger the preference of codon usage (Wright, 1990;Comeron and Aguadé, 1998 To evaluate the degree of codon usage patterns among the coding sequences of all the different BTV isolates, an ENC-GC3 plot was produced. This plot is used to determine whether the codon usage pattern of a gene deviates from the equivalent usage of the corresponding synonymous codons (Wright, 1990;Wu et al., 2015). If there is no natural selection, genetic evolution is affected only by mutation pressure. The nucleotide composition of the genome sequence would be the only way to affect its codon usage bias. Therefore, each point will fall on the expected curve or near the expected curve. Conversely, if the points are below the expected curve, the gene expression is subject to natural selection. As Figure 2 shows, all the points lie greatly below the solid curve, which suggests that in addition to the mutation pressure, translation selection also influences the codon usage bias of BTV. Our results are consistent with previous studies (Butt et al., 2014;Wang et al., 2016;Rahman et al., 2018).
To estimate the contribution of mutation bias and natural selection, a neutrality plot was produced for GC12 and GC3 contents. In the plot, the regression coefficient against GC3 is regarded as the mutation-selection equilibrium coefficient and the evolutionary speed of the mutation pressure and natural selection pressure is expressed as the slope of a regression line. Each point represents a species corresponding to the composition of GC12 and GC3 from the neutrality plot. However, if all the points lie along the diagonal distribution, no significant difference exists at the three codon positions, and there is no or weak external selection pressure. Alternatively, if the regression curve tends to be sloped or parallel to the horizontal axis, then the variation correlation between GC12 and GC3 is very low. The result showed that the correlation between GC12 and GC3 is not significant (r = 0.09, P > 0.05), reflecting that both mutation pressure and natural selection shape the codon usage pattern of BTV (Supplementary Figure S1).

The Variation in Codon Usage Among all BTVs
Principal component analysis (PCA) is used to explore the variation in codon usage based on the RSCU values of all BTV isolates. Here, the first two principal components from the PCA were determined to offer two-dimensional visualization of the sample relationships. The results identified that the first principal component accounted for 69.91% and the second principal component accounted for 27.95% of the variance (Supplementary Figure S2). Scattered points in the plot describe the diverse geographical lineages and their connection with each other. FIGURE 4 | CAI of BTV to its hosts. In the plot, the x-axis represents the vector (Culicoides) and the host species (B. taurus and O. aries). The y-axis represents the CAI value. Different colors represent various species: red-B. taurus, blue-O. aries, and orange-Culicoides. To avoid the CAI value between the BTV and each host overlapping, we performed a 50% disturbance in the horizontal direction (width = 0.5, height = 0).
Correspondence analysis was also used on RSCU values of all viral sequences to visualize and explore these data. For large multidimensional variables, COA can reduce the dimensions of the datasets to achieve efficient visualization of numerous variables (Mallows et al., 1985). The results displayed that all BTV strains were collected into clusters (Figure 3). All BTV strains from the United States, Italy, France, and Brazil are assembled in one cluster, while BTV strains from India are grouped in another cluster. However, some BTV strains from China appeared in the different clusters. These results suggested that the geographical locations play an important role in BTV evolutionary process and a synonymous codon usage pattern. Besides, it was also highlighted that each infected country has emerged more than one viral genetic lineage.

Codon Usage Adaptation in BTVs
To investigate the optimal codon usage pattern of BTVs and the adaptation in their hosts, the codon adaptation index (CAI) of all strains was measured by taking the codon usage patterns of B. taurus, O. aries, and Culicoides as a reference. The range of CAI values is between 0 and 1; the higher CAI values, the better adaptation of virus (Butt et al., 2014). In our research, the CAI values of all the BTV isolates were 0.63 ± 0.004, 0.58 ± 0.004, and 0.55 ± 0.003 in reference to B. taurus, O. aries, and Culicoides codon usage patterns, respectively (Figure 4). Furthermore, the significant differences determined in this study were calculated by Mann-Whitney U test, and it was shown that there were significant differences among CAI values (Supplementary Table S2, wilcox.test, P < 0.01). In addition, we calculated the CAI values of BTV in relation to itself and suggested that BTVs were better adapted to their hosts (B. taurus and O. aries) than to their vector (Culicoides) (Supplementary Table S2).
The expected CAI (e-CAI) values were also obtained for the whole BTV strains in reference to B. taurus, O. aries, and Culicoides to discern whether the differences in the CAI value were statistically significant (Puigbo et al., 2008). The e-CAI values of 0.68 (P < 0.05), 0.64 (P < 0.05), and 0.61 (P < 0.05) for B. taurus, O. aries, and Culicoides, respectively, suggested that there was a normal distribution of all the generated sequences. Figure 4 shows that the CAI values for BTVs in relation to Culicoides are significantly different from those in relation to B. taurus and O. aries (wilcox.test, P < 0.01). The result reflects that the selection pressure from hosts may influence the codon usage pattern of BTV and that the translation resources of hosts are more efficient than those of the vector for BTV.

The Main Constraints of the Codon Usage Pattern
In view of two constraints (including mutation pressure and natural selection) of codon usage patterns in BTV, we further analyzed the correlation between the ENC and CAI values to examine the predominant factor. If the correlation coefficient (r) between two indices is close to 1, translational selection is the primary determinant, whereas the mutation pressure may be more preferred than translational selection (Wang et al., 2016). There were significant correlations between ENC and CAI values of BTV coding sequences in reference to B. taurus (r = 0.43, P < 0.01), O. aries (r = 0.52, P < 0.01), and Culicoides (r = −0.32, 0.01 < P < 0.05), indicating that the codon usage pattern of BTV genomes is limited by both natural selection and mutational pressure (Table 3). Furthermore, correlation analysis among T (−0.66), C 3 (0.63), and GC (0.51) with ENC was also performed by Spearman's rank correlation ( Table 4).
The composition of the overall nucleotide and the third codon position nucleotide was used as a reference to evaluate the The numbers in each column represent correlation coefficient "r" values, which are calculated in each correlation analysis. NS, non-significant (P > 0.05). *0.01 < P < 0.05. **P < 0.01. −0.23* The numbers in each column represent correlation coefficient "r" values, which are calculated in each correlation analysis. NS, non-significant (P > 0.05). *0.01 < P < 0.05. **P < 0.01. Gravy represents the hydrophobicity of protein.
ARO represents the aromaticity of protein.

DISCUSSION
This survey of the BTV complete genomes indicates a preference for A/U nucleotide over G/C nucleotide and that impacts the codon usage for translation of viral proteins. This finding is similar to previous research on Crimean-Congo hemorrhagic fever virus (CCHFV) being enriched with A and U (Rahman et al., 2018). However, the biological significance of this condition is still unclear, and therefore it is vital to explore the causes for significantly increased A content and concomitant decreased C content in the viral genomes (van Hemert and Berkhout, 2016). Some previous reports showed that the composition of amino acids was also the key factor in determining the nucleotide contents at the first and second codon positions of viral genomes, while the variation in proteins was forced by functional selection. However, 69% of the alteration at the third codon position always denoted synonymous or silent mutations, which was not affected by functional selection of protein products (van Hemert and Berkhout, 2016). Earlier research suggested that the codon usage pattern of Ebola virus (EBOV) was different from that of its hosts (Schubert and Catherine, 2010;Cristina et al., 2015). Our results are consistent with earlier studies showing that A/U-ended codons are more abundant in the BTV genome than in the host genome (Rabadan et al., 2006;Greenbaum et al., 2008). In addition, some previous reports also indicated that the identical compositions of codon usage patterns between viruses and hosts could increase the translation efficiency of the corresponding amino acids, while the contrary compositions of codon usage patterns could ensure the correct folding of viral proteins (Aragones et al., 2010;Costafreda et al., 2014;Hu et al., 2014;Cristina et al., 2015). These results also reflect that the similar usage of codons between BTV and its common hosts may enhance the ability of viral genes to participate in the translation process. Specifically, the codon usage pattern of BTV genomes may be largely influenced by the selection pressure of its natural hosts, which can be conducive to adaptation to the cellular conditions of its hosts and efficient replication (Wong et al., 2010;Ma et al., 2015). However, the influence of selection from hosts (B. taurus, O. aries) on shaping codon usage patterns of BTV is not similar to the vector (Culicoides). Previous researches on EBOV and Flaviviridae virus suggested that the codon usage patterns are very different with their hosts (Schubert and Catherine, 2010;Cristina et al., 2015).
In this study, a number of systemic analytical approaches were performed to explore the factors shaping the BTV codon usage patterns. To start with, an ENC-GC3 analysis was performed. In BTV genomes, the ENC values were considered to estimate the codon usage bias in the complete viral genomes. The result shows that overall codon usage bias of BTV genomes was low (ENC = 57.9). It has also been found among some other viruses, such as hepatitis C virus (ENC = 52.62) (Hu et al., 2011), Ebola virus (ENC = 57.23) (Cristina et al., 2015), and Crimean-Congo hemorrhagic fever virus (ENC = 52.34) (Rahman et al., 2018). It has been indicated that the low codon usage bias of virus is beneficial for the efficient replication in its host cells and the reduced competition between virus and its host for the protein synthesis. Although ENC values can estimate the codon usage bias of BTV genomes, these values alone cannot be used to reflect the driving force of codon usage bias. An ENC plot of BTV genomes whose codon usage patterns are only constrained by their GC3 compositions will lie on or slightly below the solid line of the expected ENC values. When present, this influence of nucleotide constraints indicates the fundamental effect of mutation pressure. In BTV genomes, we observed AU compositions to be significantly higher than GC compositions. To measure how this genomic content may have impacted the codon usage patterns of BTV, we derived our assumption from ENC-GC3 analysis. It shows that both natural selection and mutational pressure have affected the codon usage patterns in BTV complete genomes.
An earlier study also suggested that mutation pressure was the dominant factor affecting the codon usage pattern of Zaire ebolavirus (ZEBOV). Studies have shown that natural selection is largely limited by nucleotide contents in the first and second sites of codons, although the mutational pressure is generally limited by the nucleotide contents in the third site of codons (Roychoudhury and Mukherjee, 2010;Hu et al., 2014). Therefore, we further performed Spearman's rank correlation analysis between the base contents of BTV and the two principal components (axis 1 and axis 2) and validated that in addition to natural selection, nucleotide contents can also play a role in synonymous codon usage patterns.
Except for natural selection and mutation pressure, other factors including geographic origins and translation selection can also influence the viral codon usage patterns (Chen Y. et al., 2014;Rahman et al., 2018). The results of the COA suggested that all BTV strains are collected into clusters (Figure 3). These results highlight that the geographical location of BTVs plays a significant role in their evolution and codon usage patterns. These results also indicate that there is more than one prevailing genetic lineage in each infected country and promote studies to trace the origin of the existing BTVs. Besides, the results of CAI analysis reflect that the selection pressure from hosts may influence the codon usage pattern of BTV and that the translation resources of hosts are more efficient for BTV than those of the vector. These results supported the role of geographical locations and translational selection on codon usage patterns of BTV genomes. In addition, our study also reveals that the differences among various hosts are associated with the codon usage bias of BTV. It is compatible with previous studies that have shown distinct codon usage patterns between virus and host genes (Schubert and Catherine, 2010;Cristina et al., 2015).

CONCLUSION
According to the available evidence, our findings suggest that analysis of codon usage bias can offer an alternative strategy to explore the evolution of BTVs. Using PCA and COA on RSCU values, the codon usage patterns and trends of BTV strains were obtained. Furthermore, the results may distinguish the different viral groups and expose their evolutionary trends. Further studies of codon usage demonstrated that the evolution of BTV could be regulated mainly by natural selection in addition to mutation pressure. The observed nucleotide composition might also be the driving force shaping the codon usage patterns of BTVs. Additionally, it is suggested that there are similarities of codon usage between BTVs and their hosts. This research not only provides the knowledge about the variation in BTV codon usage patterns but also contributes to analyzing the factors that drive BTV evolution.

DATA AVAILABILITY STATEMENT
All datasets generated for this study are included in the article/Supplementary Material.

AUTHOR CONTRIBUTIONS
DC and ST conceived and designed the experiments. XY and BY performed all the experiments. XY, QF, and BY collected and analyzed the data. SR and PL drafted the manuscript. All authors read and approved the final manuscript.

ACKNOWLEDGMENTS
We thank QF and PL for sample collection. We would like to thank the members of the Bioinformatics Center of Northwest A&F University for their useful input.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb. 2020.00655/full#supplementary-material FIGURE S1 | Neutrality plot analysis (GC12 vs. GC3) for the entire coding sequences of BTVs. GC12 indicates the average value of GC contents at the first and second codon positions (GC1 and GC2), while GC3 refers to the GC contents at the third codon position. The red dotted line is the linear regression of GC12 against GC3, R 2 = 0.001271, P > 0.05. FIGURE S2 | A plot of all BTV complete sequences in PCA. The first axis accounts for 69.91% of the total variation, and the second axis accounts for 27.95% of the total variation.