Molecular Evolutionary Analyses of Euplotes Species Living in Freshwater and Marine Habitats: A Mitogenomic Perspective

Ciliates are the most complex unicellular eukaryotic organisms, which play important roles in various ecosystems. The Euplotes is a dominant genus in the ciliates Euplotida and consists of approximate one hundred species. They distribute widely in environments with various salinity levels including freshwater, brackish, seawater as well as hypersaline. In this study, we obtained four mitochondrial genomes of Euplotes species, using both high throughput sequencing and Sanger sequencing. Combined with two previously reported Euplotes mitochondrial genomes, we analyzed their gene structure, codon usage pattern as well as phylogenetic relationship. We found that gene rearrangement exists in Euplotes and codon usage bias is different among these species. Phylogenetic analyses based on both mitochondrial and nuclear genes further unveiled that Euplotes spp. living in similar salinity levels tend to be clustered together. Moreover, we found that the dN/dS ratios of two mitochondrial genes, cox1 and cox2, are significantly different between marine and freshwater species, indicating the salinity could act as a barrier for the Euplotes species distribution. We also recommended mitochondrial genes to discriminate the species with highly similarity of Euplotes which could not be easily distinguished by nuclear gene marker and morphological characteristics. This study provides novel resources to improve our understanding of Euplotes evolution and also its adaptation to habitats with different salinity levels.


INTRODUCTION
Ciliates play important ecological roles in natural environments and become one of the most prevailing model territory in unicellular eukaryotes (Graziano et al., 2010;Gentekaki et al., 2014). They are more diverse and highly differentiated than multicellular eukaryotes (Graziano et al., 2010) and able to survive in various types of environments (Giuseppe et al., 2011). The genus Euplotes is one of hypotrichous ciliates which contains approximately one hundred species (Pedrini et al., 2017). Recent studies have been focus on Euplotes diversity and phylogeny based on morphological characters and molecular data (Yi et al., 2009;Syberg-Olsen et al., 2016). However, different opinions by taxonomists may result in identification confusion, also lack of enough molecular sequences could lead to misestimating the inner relationship among the increasing taxa. Besides, as species in the genus Euplotes have a wide distribution from freshwater to marine water, more genetic traits are needed to provide the relation between salinity and Euplotes clusters. Mitochondrial genome analyses have been playing a growing important role in phylogenetic and evolutionary studies. Mitochondrial genome contains multiple vital genes and has a higher evolutionary rate than nuclear gene markers (Gray et al., 1999;Zhang et al., 2016). The inheritance of mitochondrial DNA occurs in a unique and heritably separate manner that is spatially different from nuclear DNA, which may provide a new perspective for the phylogenetics of the Euplotes. Moreover, its small size makes it to be more accessible than nuclear genome. Therefore, mitochondrial genomes have been widely used to study biogeography, phylogeny as well as evolutionary biology (Ballard, 2000;Sivasankaran et al., 2017). Until now, mitochondrial genomes from only three species have been sequenced in ciliated species (de Graaf et al., 2009;Serra et al., 2019).
This research aims to revisit the phylogenetic relationship among the representative Euplotes species, as well as to explore the correlation between Euplotes evolution and salinity of their habitats from the perspective of mitochondria. We obtained the mitogenomes of two freshwater species (E. aediculatus and E. otocarinatus) and two seawater species (E. vannus and E. focardii) by Sanger sequencing and high throughput sequencing. Then the structure feature and codon usage were compared among all Euplotes mitogenomes. We also identified the major force influencing codon usage pattern. By constructing phylogenetic trees based on both nuclear and mitochondrial genes, we investigated the correlation between phylogenetic diversity and salinity of habitats. Finally, two mitochondrial genes (cox1 and cox2) were speculated to be related to salinity adaptation. These data provided new clues to explore the genetic basis on how Euplotes adapts to diverse habitats.

Cultivation and DNA Extraction
E. aediculatus and E. vannus were obtained from Qingdao Agricultural University (Qingdao, China) and Ocean University of China (Qingdao, China), respectively. The former was cultured in sterilized freshwater at the room temperature, while the latter was cultured in sterilized seawater with concentration of 30 at the room temperature. When the density of cells reached 600 per milliliter (ml), we used a 15-µm-pore-size nylon filter membrane to harvest them from the medium and washed by sterilized water to reduce bacterial contamination (Kodama and Fujishima, 2005). The cells were starved and incubated with 1 × penicillin-streptomycin antibiotics (Invitrogen, Carlsbad, CA, United States) for 24 h to further eliminate the contamination of bacteria (McGrath et al., 2014;Slabodnick et al., 2017;He et al., 2019). We collected cells by centrifugation at 8,000 rpm/min for 10 min and extracted DNA using the DNeasy Blood & Tissue Kit (Qiagen, Düsseldorf, Germany).

Sequencing and Assembly
We obtained the mitochondrial sequences of E. vannus by Sanger sequencing. Conserved regions between E. minuta and E. crassus were selected to design several pairs of degenerate primers (Supplementary Table 1), then we amplified corresponding regions by degenerate Polymerase Chain Reaction (PCR) (de Graaf et al., 2009;Barth and Berendonk, 2011). The long range PCR was performed to acquire connection between conserved regions. Seqman in DNAstar 7.1 was used to assemble the amplified sequences.
Approximately 2 µg DNA from E. aediculatus was used to construct a DNA library with 500 bp insert size and then sequenced on an Illumina HiSeq 2500 platform. Ultimately, 1858120 read pairs were acquired (PE250). Adaptors and low quality bases were removed by Trimmomatic with default parameters (Bolger et al., 2014). We employed SPAdes to assemble the mitochondrial genome with an option of "careful" (Bankevich et al., 2012;Zhang et al., 2016;Serra et al., 2019) and obtained E. aediculatus mitochondrial contigs by TBLASTN (E value < 1e-5). We further used the same approach to assemble the mitogenomes from two published genomic DNA-seq datasets of E. focardii (SRR3929761) and E. ocatocarinatus (SRR2474557), respectively. The processes of verifying and concatenating the assembled contigs from E. aediculatus were accomplished by PCR.

Annotation
Firstly, we predicted all possible open reading frames (ORFs) from the assembled mitochondrial genomes. If there were multiple ORFs in one position, only the longest one was kept. Besides, ORFs shorter than 150 bp were also removed (Swart et al., 2012). BLASTX was employed to align these ORFs to non-redundant (nr) protein database and Swiss-prot database to detect homologous mitochondrial protein coding sequences. As for the protein coding genes with high divergence, a more sensitive homology detection approach HHpred was used to search against the protein families (Soding et al., 2005), which can detect homology based on HMM-HMMcomparison. The tRNA genes were identified with tRNAscan-SE version 2.0 (Lowe and Chan, 2016). The boundaries of rRNA genes were determined using clustalW by aligning to known Euplotes mitochondrial rRNA genes. The tandem repeat region was determined by Tandem Repeats Finder (Benson, 1999).

AT and GC Skew Profiling
We divided the upstream and downstream sequences of the tandem repeat region into 50 bins, respectively. The cumulative AT skew and cumulative GC skew were calculated by the Frontiers in Marine Science | www.frontiersin.org following formula, where i indicate i-th bin of mitochondrial genome (Grigoriev, 1998).

Codon Usage Bias Analysis
The relative synonymous codon usage (RSCU), effective number of codons (ENC), aromaticity (AROMO) as well as general average hydropathicity (GRAVY) were calculated by CodonW (v1.4.4) 1 . GC12 represents the mean GC content of the first and second sites of codons, and GC3 represents GC content of the third site of codons. The expect ENC value was calculated according to the previous study (Wright, 1990). Principal component analysis (PCA) and correlation analysis were carried out using R studio (Wei et al., 2014).

Phylogenetic Analysis
SSU-rRNA genes of E. vannus and E. aediculatus were obtained by PCR with degenerated primers Euk-A (5 -AACCTGGTTGATCCTGCCAGT-3 ) and EukB (5 -GATCC TTCTGCAGGTTCACCTAC-3 ) (Medlin et al., 1988). SSU-rRNA sequences of E. focardii and E. ocatocarinatus were extracted from the assembled contigs by BLASTN, and SSU-rRNA genes of other species were downloaded from GenBank (Supplementary Table 4). These sequences were aligned and manually edited using BioEdit (Hall, 1999). After curation, a total of 1984 positions of SSU-rRNA genes were kept for subsequent phylogenetic analyses. According to MrModeltest (version 2.4) 2 with Akaike Information Criterion (AIC), GTR + I + G was the best-fit model for SSU-rRNA genes. The datasets were then analyzed using two phylogenetic methods, Maximum Likelihood (ML) and Bayesian Inference (BI), under the bestfit model. ML analyses with 1000 non-parametric bootstrap replicates were carried out using RAxML (Stamatakis, 2014). BI analyses were performed using MrBayes (Ronquist et al., 2012). Four Markov Chain Monte Carlo (MCMC) were run twice with 10 6 generations and sampled every 100th cycle, while the first 2500 trees were discarded as burn-in (Fernandes et al., 2016;Shazib et al., 2016).
In order to construct phylogenetic trees using mitochondrial genes, we downloaded mitochondrial genomes of another eleven species in ciliates and Durinskia baltica (outgroup) from GenBank. GET_HOMOLOGUES was employed to identify orthologs genes among these mitogenomes (Contreras-Moreira and Vinuesa, 2013). The selected genes were aligned using MAFFT (Katoh and Standley, 2013) and then combined by SequenceMatrix (Vaidya et al., 2011). Conserved regions in the alignments were determined by Gblocks and all gaps were removed (Castresana, 2000;He et al., 2019). The best-fit model 1 http://codonw.sourceforge.net/ 2 https://github.com/nylander/MrModeltest2 LG + G + F amino acid replacement matrix according to AIC was determined by the ProtTest 3 (Darriba et al., 2011). The phylogenetic tree based on the concatenated mitochondrial genes was constructed by the same method as that based on SSU-rRNA genes.

Analysis of Selection
The ratio of non-synonymous to synonymous substitutions (dN/dS) is usually used to measure strength and mode of natural selection (Jeffares et al., 2015). If dN/dS > 1, positive Darwinian selection (positive selection) is acting on PCGs, whereas dN/dS < 1 suggests purifying selection, and dN/dS = 1 indicates neutral selection. The ω represents estimate dN/dS value. Branch model test was used to test whether ω between Euplotes G1 and G2 is significantly different. The branch model test based on protein coding genes was carried out by the program CODEML of PAML Version 4.8a (Yang, 2007). We set CodonFreq equal to 1. The one ratio model (icode = 3, model = 0) assumes that ω across all branches is identical. For two ratio model (icode = 3, model = 2), Euplotes G2 was selected as the foreground branch, ω0 was for Euplotes G2 species while ω1 was for Euplotes G1 species. The likelihood ratio test was used to compare and test the significance between two models by jModelTest 2 (Darriba et al., 2012).

Structure Feature of Mitochondrial Genomes
Similar to previous reports, the four new mitochondrial genomes in this study also have a linear structure, with the length ranging from 32.6 kb to 43.6 kb ( Table 1). All Euplotes mitogenomes have considerably high AT content, especially for E. aediculatus (80.2%) and E. otocarinatus (82.0%). In this study, we used the reannotation of E. minuta (Swart et al., 2012) and reannotated E. crassus through the same strategy (Supplementary Table 2). We found that Euplotes mitochondrial genomes have all proteincoding genes (PCGs) that have been reported in other ciliate mitochondria. Regardless of the ORFs with unknown function, gene order among E. minuta, E. crassus, E. vannus as well as E. focardii is almost the same with exception of nad6, which is absent in E. crassus. Here, we classified these four species as group one, termed as Euplotes G1. Likewise, the arrangements of PCGs in mitochondria of the other two Euplotes species are completely consistent, which could be classified as group two (Euplotes G2). However, there are inversion and translocation of PCGs between these two groups. For species in Euplotes G1, cob locates between ccmF and trnM but it interposes between nad3 and trnH in Euplotes G2. In E. octocarinatus, we also found that nad5-ccmf is translocated to the end of nad1a, but the orientation of transcription remains unchanged (Figure 1).
Compared with mitochondrial genomes of Euplotes G1, there are no trnM and trnE but two trnQs (Q1 and Q2 in Figure 1) in Euplotes G2 mitogenomes. The trnQ1 in Euplotes G2 owns the same anticodon (UUG) as trnQ in Euplotes G1, while the anticodon of trnQ2 is CUG (Supplementary Figure 1A). Besides, the sequence similarity between Euplotes G2 trnQ1 and Euplotes G1 trnQ is higher than that between Euplotes G2 trnQ2 and Euplotes G1 trnQ (Supplementary Figure 1B). The interspecific divergence of trnQ1 (between E. aediculatus and E. octocarinatus) is smaller than that of trnQ2 both in sequence and structure (Supplementary Figures 1A,B). The disparity of trnQ2 mainly reflects on both D arm and anticodon arms.

Asymmetric Nucleotide Composition Flanking the Tandem Repeat Regions
Although the sequences of the tandem repeat region are not completely consistent within the existing Euplotes mitochondrial genomes, their AT content is obviously higher than that of the entire mitogenome. We calculated AT skew and GC skew of upstream and downstream by regarding the tandem repeat region as midpoint. The AT skew of the non-overlapping windows of all species in upstream are mostly positive while windows in the downstream are opposite, which means there are more A than T in the upstream and more T than A in the downstream. However,  the GC skew of Euplotes G1 and Euplotes G2 has different tendencies regardless of upstream or downstream, which means there are more G than C in the upstream and more C than G in the downstream of Euplotes G1 but completely different for Euplotes G2 (Supplementary Figure 2). In order to facilitate the observation of their trend, we calculated cumulative AT skew and cumulative GC skew and found turning points near the tandem repeat region (Figures 2A,B).

Codon Usage Bias
The effective number of codons which ranges from 20 to 61 is an index used to measure codon usage bias level (Wright, 1990). A value of 20 indicates that there exists extreme bias -only one codon is used for each amino acid, while a value of 61 indicates no bias that all codons are equally used (Fuglsang, 2006). We found that Euplotes species in G2 have a higher degree of codon usage bias than those in G1 (Table 1). Then, we calculated the relative synonymous codon usage (RSCU) of 61 codons (except two stop codons and ATG) in all Euplotes species and found that Euplotes G1 and G2 exhibit different codon usage patterns but both prefer codons ending with AT. Compared with species belonging to G1, species in G2 seem to use more AT-ending codons ( Figure 3A).

Factors Influencing Codon Usage
To explore the potential factor that shapes the codon usage bias in Euplotes mitogenomes, the neutrality plot analysis and ENC-plot analysis were carried out using all PCGs as well as ORFs in six Euplotes. The correlation between GC12 and GC3 is statistically significant (r = 0.722, p < 2.2e −16 ) and the slope of the regression line is 0.549 (R 2 = 0.518, p < 2.2e −16 ) ( Figure 3B). We further calculated the expected ENC values by GC3 with the assumption that codon usage bias is only subject to mutation bias. Subsequently, we plotted the standard curve by GC3 and the corresponding expected ENC values. As shown in Figure 3C, most genes distributed near the standard curve and the Pearson correlation coefficient between actual ENC values and expected ENC values is 0.877 (p < 2.2e −16 ). It should be mentioned that the ENC values of several genes deviated from the standard curve by more than 20%, most of which have shorter length compared to the remaining genes. However, there is no significant difference on gene length between the genes deviated from the standard values by more than 10% and less than 20% and those deviated from the standard values by less than 10% (Supplementary Figure 3). Principal component analysis was carried out based on 61 synonymous codons. The first principal component (PC1) and the second principal component (PC2) can explain 73.06% and 5.04% of the total variation, respectively. The positions of codon were marked in two-dimensional coordinates and codons ending with AT and GC could be clearly separated ( Figure 4A). Both PC1 and PC2 were significantly correlated with GC3 and GC content but only slightly correlated with protein aromaticity (Figure 4B). There is no correlation between these principal components and protein hydrophily.

Phylogenetic Analysis
We selected conserved genes that are shared in all sequenced ciliate mitochondrial genomes (Supplementary Table 3), including cox1, cox2, nad4, nad7, rpl2, rps12, to construct phylogenetic trees based on the concatenated sequence of these genes. The phylogenetic trees using ML and BI method have similar topologies so we only present the ML tree with support values of both methods labeling on branches. The sequenced ciliates come from two classes and Euplotes belongs to Spirotrichea. Six Euplotes species were clustered into two groups, G1 and G2 (Figure 5A), which is consistent to the classification based on gene order. We further investigated the evolutionary history of the genus Euplotes based on SSU-rRNA genes (Supplementary Table 4). The phylogentic tree can classify species into five clades, with G1 and G2 belonging to two main clades respectively. Unlike phylogenetic tree GC3 refers to GC content of the third codon site, GC12 means GC content of the first and second codon sites. The dark blue line represents the regression line and the gray area represents 95% confidence interval. (C) ENC-plot analysis. We divided genes into three levels (<10%, <20% & >10%, >20%) based on the degree of the actual ENC values deviating from the standard curve.
based on mitochondrial genes, E. vannus is closer to E. crassus rather than E. minuta. Since salinity was often absent in previous studies, we defined the description of freshwater, brackish water and habitats less than 25 salinity to be low salinity while the description of seawater, hypersaline water and habitats more than 25 salinity as high salinity (Syberg-Olsen et al., 2016). Both two main clades contain species habitat in low salinity and high salinity and clustered Euplotes species tend to live in similar salinity level ( Figure 5B)

Selection on Mitochondrial Genes
The branch that represents Euplotes G2 is selected as the foreground branch for all mitochondrial protein coding genes. ω of all genes present are much less than 1, indicating they are under strong purifying selection. The one-ratio model assumes that foreground branch and background branch have the same ω value, while two-ratio model is opposite. The result of likelihoodratio test indicates that most genes fit the one-ratio model, while cox1, cox2 and rps3 seem to better fit to the two-ratio model ( Table 2) (p < 0.05). It is likely that ω of these three genes are significant different between Euplotes G1 and G2, indicating different degree of purifying selection pressure may occur on the speciation of the two clades.

Gene Rearrangement and Putative Origin in Euplotes Mitochondrial Genomes
There is a long tandem repeat region with high level of AT content in Euplotes mitochondrial genomes, which makes it difficult to acquire complete sequences by highthroughput sequencing techniques. Despite the incompleteness of mitogenomes, we found gene rearrangement events present in Euplotes mitochondrion. Genes involved in rearrangement include tRNA genes as well as protein-coding genes. In general, the tRNA genes rearrange more frequently than PCGs and rRNA genes within the mitogenome (Wu et al., 2014). In 2007, a pseudo-tRNA gene, which was absent in other Tetrahymena species, had been detected in Tetrahymena paravorax (Moradian et al., 2007). We noticed that species in Euplotes G1 are from the marine water, while Euplotes G2 species live in freshwater.
Interestingly, E. vannus living in low salinity stress could activated an extra pathway related to the glutamine (Gln, Q) metabolic process (Chen et al., 2019). The extra trnQ in Euplotes G2 mitogenome which has different anticodons might be related to salinity of habitats. Previous study shows that the AT rich region usually plays an important role in regulating the process of transcription and replication (Boore, 1999). In two genera Paramecium and Tetrahymena, the AT-rich region coincides with the DNA replication origin (Moradian et al., 2007). Due to the difference in replication mechanisms of leading strand and lagging strand, there should be a positive GC skew and negative AT skew in the leading strand, a negative GC skew and a positive AT skew in the lagging strand in prokaryote genomes (Charneski et al., 2011). AT skew or GC skew in prokaryotes can switch at the origin and terminus of replication (Fellenberg et al., 2001). Considering that both cumulative AT skew and cumulative GC skew have a turning point near the tandem repeat region, it suggests that this interval may be the replication origin of Euplotes mitochondrion. The origin of replication in Paramecium mitochondria is at one terminus, whereas it is near the center in Tetrahymena (Pritchard and Cummings, 1981;Pritchard et al., 1990;Burger et al., 2000). The fact that the sequence flanking the tandem repeat region transcribes in the opposite directions increases the possibility that this region may be involved in transcription and replication initiation.

Mutation Bias as the Major Driving Force of Codon Usage Bias
It is generally acknowledged that codon usage pattern reflects a balance between mutation bias and natural selection for translational efficiency (Sharp et al., 1993;Liu et al., 2004). The neutrality plot analysis is a classical method to compare the influences of these two factors by plotting the GC12 of the synonymous codons against the GC3. Considering that base composition of three codon positions should be similar without natural selection, the closer the slope of their regression curve is to 1, the greater the influence of mutational bias. In Euplotes mitogenomes, GC12 and GC3 are highly correlated with a slope of 0.549, indicating that mutation bias might be the major driving force of codon usage bias. For ENCplot analysis, if genes distribute near the standard curve, their codon usage is more likely affected by mutation bias. In this study, we found that most genes in the ENC-plot locate near the standard curve and the correlation between actual ENC and expected ENC is highly significant, which further confirm mutation bias as the major driving force. Genes which are deviated from the standard curve (>20% deviation) are mostly shorter than 500 bp (Supplementary Figure 3), it is likely the short length rather than actual codon usage bias lead to the extremely low ENC values.
There are various factors contributing to codon usage bias, including gene function (Chiapello et al., 1998), composition of bases and amino acid, gene length (Duret and Mouchiroud, 1999), gene expression (Liu et al., 2004;Wei et al., 2014), abundance of tRNA (Duret, 2000). Codons ending with AT and GC form two clusters by plotting PC1 against PC2, indicating that GC3 has great impact on codon usage. Although the effect is not as strong as GC3, GC content also affects codon usage. Protein aromaticity is negatively related to codon usage, as protein aromaticity depends on the composition of amino acid and gene function, which is considered to be the result of natural selection (Wei et al., 2014). Therefore, the codon usage bias in Euplotes mitogenomes is mainly

Phylogeny and Habitat Salinity
For aquatic organisms, salinity are one of the most important environmental factors which determine the community composition (Crump et al., 2004;Herlemann et al., 2016). Previous studies suggest that the Euplotes species and subclades in phylogenetic tree based on SSU-rRNA gene are more uniform in term of habitat salinity (Petroni et al., 2002;Yi et al., 2009;Syberg-Olsen et al., 2016). It is still be true when we carried out phylogenetic analysis by SSU-rRNA gene from more species in Euplotes. When constructing the phylogenetic tree by multiple mitochondrial genes, we found the six species in Euplotes were divided into two branches, corresponding to Euplotes G1 (seawater) and Euplotes G2 (freshwater), respectively. These results indicate that salinity might act as a barrier for the distribution of Euplotes species. The salinity barrier seems not be strict as the euryhaline species might be the intermediate stage of immigration from ancestrally marine to freshwater (Zhao et al., 2018).

Mitochondrial Genes Used for Discriminating Closely Related Euplotes Species
There is a long standing species classification problem in the vannus-crassus group (Caprette and Gates, 1994;Zhao et al., 2018). One view point is that E. vannus and E. crassus experienced separated evolution and speciation (Giannì and Piras, 1990), while another is that E. vannus and E. crassus form a highly polymorphic species (Caprette and Gates, 1994 In phylogenetic trees based on mitochondrial genes (both single and concatenated genes), phylogenetic relationship among E. vannus, E. crassus and E. minuta is slightly different from that based on nuclear genes, which may be due to incongruent evolutionary rates between mitochondrial and nuclear genes in these species.

Adaptations From Marine to Freshwater
Immigration from marine to freshwater require specific adaptive mechanism to adjust osmoregulation (Alverson et al., 2007). Evolutionary divergence between marine and freshwater Euplotes might be caused by this kind of physiological adaptations. We found that ω of all mitochondrial genes are far less than 1, indicating strong purifying selection acting on these genes in Euplotes. The result of branch-specific test indicates that cox1, cox2 and rps3 have inconsistent ω value along the branch of Euplotes G1 and G2, which means their evolutionary rate is different. The rps3 encodes ribosomal protein small subunit and it shows a relatively low sequence similarity in the middle of gene and is of great difference in length among Euplotes species. Previous study reported that the mitochondrial rps3 is variable in size and position as well as gene arrangement due to insertions, microsatellite expansions (Sethuraman et al., 2009). These may result in inconsistent ω values of rps3 between Euplotes G1 and G2. Both cox1 and cox2 encode cytochrome c oxidase which play crucial roles in ATP production by participating in oxidative phosphorylation (OXPHOS pathway). The strategy for osmotic adjustment rely on energy-depend mechanism (Harding et al., 2016;Weinisch et al., 2018). The two genes of Euplotes G1 and Euplotes G2 may undergo diversifying evolutionary process for physiological adaptations from marine to freshwater.

CONCLUSION
The current study mainly acquired four mitogenomes of Euplotes and help to provide valuable resources and new insights into the evolution process of Euplotes genus. Mitochondrial genes of Euplotes including PCGs and tRNA genes undergo rearrangement events but the position of putative origin of transcription and replication is conserved. Though the codon usage patterns of mitochondrion are different in Euplotes genus, mutation bias is always the major driving force resulting in the codon usage bias. Interestingly, both gene order and codon usage bias seem to be uniform with habitat salinity level. Phylogenetic analyses based on both mitochondrial and nuclear genes further confirm that species living in similar salinity habitat tend to cluster together. Salinity does present a barrier to the distribution of Euplotes. However, some of the ancestrally marine species could adapt to the freshwater. Physiological adaptations may happen to regulate osmotic pressure regulation participated in this process. The cox1 and cox2 which participate in OXPHOS pathway might be the key genes for physiological adaptations from marine to freshwater. Besides, mitochondrial genes have higher evolutionary rate compared with traditional nuclear gene markers, so they are more effective for discriminating species with highly similarity.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://www.ncbi.nlm. nih.gov/, MT665958; https://www.ncbi.nlm.nih.gov/, MT665959.

AUTHOR CONTRIBUTIONS
MM designed and guided the study. NH, SC, MH, and LH conducted sampling and performed laboratory work. NH and SC performed the evolutionary and phylogenetic analyses. NH drafted the manuscript. MM, SZ, YZ, and QS made further revisions. All authors read and approved this manuscript.

FUNDING
This work was supported by Natural Science Foundation of China (32070432, 31672279, 41702385, 32070461), and the Fundamental Research Funds for the Central Universities.

ACKNOWLEDGMENTS
Our thanks are due to Prof. Weibo Song (Ocean University of China, China) for supplying samples and technical support.

SUPPLEMENTARY MATERIAL
The   Swart et al. (2012). We reannotated E. crassus as the new annotation of E. crassus. In this study, we used new annotation results uniformly.
Supplementary Table 3 | Accession numbers of all sequenced ciliate mitochondrial genomes.
Supplementary Table 4 | Accession numbers of SSU-rRNA genes used to construct phylogenetic tree.