Ten Complete Mitochondrial Genomes of Gymnocharacini (Stethaprioninae, Characiformes). Insights Into Evolutionary Relationships and a Repetitive Element in the Control Region (D-loop)

ORCID: Rubens Pasa orcid.org/0000-0002-3513-4071 Fabiano Bezerra Menegídio orcid.org/0000-0002-4705-8352 Igor Henrique Rodrigues-Oliveira orcid.org/0000-0003-1998-7768 Iuri Batista da Silva orcid.org/0000-0003-2788-5665 Matheus Lewi Cruz Bonaccorsi de Campos orcid.org/0000-0001-7176-9667 Dinaíza Abadia Rocha-Reis orcid.org/0000-0003-1762-8745 John Seymour Heslop-Harrison orcid.org/0000-0002-3105-2167 Trude Schwarzacher orcid.org/0000-0001-8310-5489 Karine Frehner Kavalco orcid.org/0000-0002-4955-2792


INTRODUCTION
Stethaprioninae, a subfamily of characiform fish, comprises small fishes popularly known as tetras. Species in this subfamily were initially assigned to the genus Astyanax, which has a broad geographical distribution ranging from the southern United States to northern Argentina (Ornelas-Garcia et al., 2008). Some species of Astyanax, however, are hard to identify accurately because they lack diagnostic morphological characteristics (Weitzman and Malabarba, 1998). As noted by Weitzman and Malabarba (1998), the genus Astyanax is probably not monophyletic, and this view is reflected in the recent revision by Terán et al. (2020), who reassigned some species of Astyanax to six different genera. The species complexes A. fasciatus and A. scabripinnis, for instance, were placed in the genus Psalidodon, and Astyanax species from the coastal river basins of Brazil (e.g., A. giton) were placed in the genus Deuterodon. The only taxa remaining in the genus Astyanax were the species complex A. bimaculatus and the North American species. The revision by Terán et al. (2020) was anticipated by previous studies on chromosomes and mitochondrial sequences. A molecular analysis of 16 species of Astyanax based on the mitochondrial gene ATPase 8 and chromosomal characteristics recovered four clades (Pazza et al., 2018), and, using a DNA barcoding approach, Rossini et al. (2016) identified five lineages of Astyanax separated by high levels of divergence.
Despite the great diversity and taxonomic issues, at least one species of the genus Astyanax, A. mexicanus (formerly cited as Astyanax fasciatus), has been used as a model to understand eye development and the evolution of complex traits (Borowsky, 2008; Jeffery, 2008). Astyanax mexicanus is represented by surface and troglobite populations that show similar traits for cave dwelling. According to Ornelas-Garcia et al. (2008), populations from Brazil are not the same species as those seen farther north in Middle America and Mexico. Currently, the valid name for tetras from Brazil is Psalidodon fasciatus, a taxon that probably contains cryptic species, given its chromosomal variation, ranging from 2n = 45 to 49 plus B chromosomes, and heterochromatin polymorphism (Pazza et al., 2006; Ferreira-Neto et al., 2012; Kavalco et al., 2013). These chromosomal variants display low molecular divergence (Kavalco et al., 2016). Morphologically, the populations of the São Francisco river basin differ from the specimens from the Alto Paraná and Paraíba do Sul rivers (Melo, 2001), and the original morphological description is so broad that it certainly covers other species outside the complex (Melo and Buckup, 2006). In this case, P. fasciatus should be restricted to the specimens from the São Francisco river basin (original basin of the type species), while the others may be considered either cryptic species of the complex or even other species (Melo and Buckup, 2006). On the other hand, despite a clear genetic structure among populations from the Alto Paraná and São Francisco river basins, morphometric traits appear to be homoplastic (Pazza et al., 2017).
The A. bimaculatus complex is currently represented by Astyanax bimaculatus along with other species, such as A. altiparanae (considered a junior synonym of A. lacustris by some authors), A. lacustris, A. assuncionensis, and A. abramis. With a preference for calm waters, these species mainly inhabit the Alto Paraná, Paraguay, Iguassu, and São Francisco river basins (Domingues et al., 2007). Contrary to what has been observed in P. fasciatus and P. scabripinnis, A. bimaculatus shows a constant diploid number in different populations, 2n = 50 chromosomes, which is considered a symplesiomorphic character in Gymnocharacini (Kavalco et al., 2011; Martinez et al., 2012; Fernandes et al., 2014). The diversity within the group lies in differences in karyotypic formula, fundamental number, and general symmetry of the karyotypes (Kavalco et al., 2011; Fernandes et al., 2014). These cytogenetic data, together with the molecular data, suggest a relatively recent divergence, as well as the monophyletic status of this branch (Pazza et al., 2018).
Alongside P. fasciatus and A. bimaculatus, the third species complex explored in this work, P. scabripinnis, was proposed by Moreira-Filho and Bertollo (1991) based on morphological and chromosomal characteristics of specimens collected in the Paraná and São Francisco river basins. In a review of the P. scabripinnis group, Bertaco and Lucena (2006) pointed out the existence of 15 species, including P. paranae and P. rivularis. The species of this complex are known for their wide karyotypic diversity, with diploid numbers ranging from 2n = 46 to 50 chromosomes (Moreira-Filho and Bertollo, 1991; Fernandes and Martins-Santos, 2005). In a recent study using molecular phylogeography and geometric morphometrics, Rocha et al. (2019) reinforced the validity of P. rivularis and P. paranae as sister species of the complex, inhabiting the São Francisco and Paraná river basins, respectively. However, among the populations from the Alto Paranaíba river, morphometric and mtDNA data indicated the existence of a new species of the complex (Alves R de et al., 2020). This new species, called Psalidodon rioparanaibanus, was collected only in a small tributary of the Paranaíba river, surrounded by populations of P. paranae. Moreover, karyotypic diversity is also present within the P. paranae and P. rivularis groups (Maistro et al., 1998), indicating that, even though they are delimited as individual lineages, these groups still constitute assemblages of cryptic species.
Hundreds of studies describing mitogenomes have been published in the last few years. Although the mitogenome is the most sequenced genome nowadays (Smith, 2015), to date only the mitogenomes of A. mexicanus (Nakatani et al., 2011), P. paranae (Silva et al., 2016), Deuterodon giton (Barreto et al., 2017), P. fasciatus, and A. altiparanae (Calegari et al., 2019) have been published for this group. In an attempt to fill this gap, we focused on the three species complexes of the group (A. bimaculatus, P. scabripinnis, and P. fasciatus) and present here the complete mitochondrial genome sequences of 10 species/cytotypes of Astyanax/Psalidodon. Such data could be very useful in further phylogenetic studies and for understanding the diversity of the group.

MATERIALS AND METHODS
Specimens from the genera Astyanax and Psalidodon were analyzed. We collected the samples at different locations throughout the major Brazilian rivers, and their vouchers are deposited in the ichthyological collection of the Laboratory of Ecological and Evolutionary Genetics at Federal University of Viçosa, campus Rio Paranaíba, Brazil (Supplementary Table 1). After sampling, we brought the living specimens to the laboratory and euthanized them in accordance with the ethical standards of CONCEA, the Brazilian Council for the Control of Animal Experimentation, and CEUA/UFV-Ethics Committee on Animal Use/Federal University of Viçosa (760/2018). We performed the sampling under licenses provided by SISBIO/ICMBIO-Biodiversity Authorization and Information System (1938128) and SISGEN-National System for the Management of Genetic Heritage and Associated Traditional Knowledge (A9FE946).
We extracted total genomic DNA from the liver and heart tissues of six specimens according to the instructions of Invitrogen's PureLink DNA extraction and purification kit. After quality checking with a Qubit fluorometer (Thermo Fisher Scientific), whole-genome sequencing was performed on a NovaSeq 6000 (Illumina, San Diego, CA) at Novogene, UK.
For broader comparisons, we also assembled the mitogenome of two Astyanax/Psalidodon species with raw reads available on ENA (European Nucleotide Archive): P. fasciatus, from the Alto Paraná river basin (SRR8476332) and A. aeneus, from Mexico (SRR1927238). Aiming to validate our methodology, we reassembled the mitogenomes of P. paranae (SRR5461470) and A. mexicanus (SRR2040423). In the Bayesian analysis, we also included the mitochondrial complete sequence of
FIGURE 1 | Phylogenetic tree based on 13 protein-coding genes (PCGs) showing the relationships among the Stethaprioninae fish using Brycon orbignianus as outgroup. The topology was the same in Maximum Likelihood (100 bootstrap replicates) after the test of the best model (General Time Reversible +G) and Bayesian Inference after calculation of best evolutionary models for each segment, using four independent chains with 10-million-generations (The first 25% of the generations were discarded as burn-in). The posterior probabilities/bootstrap are on the branches.
We assembled the mitogenomes from raw reads with NOVOPlasty v3.7 (Dierckxsens et al., 2017) on a parallel cluster computer (64 Gb RAM) using the mitogenome of Psalidodon paranae available on GenBank (SRR5461470) as seed. We chose this approach because it is fast and assembles the mitogenomes de novo from raw data using a single mitochondrial sequence as seed, without the bias of a reference. We annotated the sequences obtained with MitoAnnotator (Iwasaki et al., 2013) at MitoFish (http://mitofish.aori.u-tokyo.ac.jp).
We performed the comparative genomic analysis by BLAST comparison of all the Astyanax/Psalidodon mitochondrial genomes against a reference (Psalidodon paranae), generated with BLAST Ring Image Generator (BRIG) (Alikhan et al., 2011). To assess the repetitive region, we analyzed the mitochondrial sequences with Tandem Repeats Finder (Benson, 1999). We then isolated and duplicated the repeats of the D-loop and aligned them with ClustalW (Thompson et al., 1994) to find the repeat motif.
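The repeat search described above relied on Tandem Repeats Finder. Purely as an illustration of the underlying idea, and not as a reimplementation of that program, the Python sketch below counts back-to-back copies of a known motif while tolerating a few mismatches per copy. The function names, the mismatch threshold, and the toy sequence are hypothetical; the motif is the 35 bp sequence reported in the Data Description.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def count_tandem_copies(seq, motif, max_mismatch=3):
    """Return (start, copies) for the longest run of back-to-back motif copies.

    A window counts as a copy when it differs from the motif at no more than
    max_mismatch positions (hypothetical threshold). start is 0-based;
    (-1, 0) means that no copy was found.
    """
    k = len(motif)
    best_start, best_copies = -1, 0
    for start in range(len(seq) - k + 1):
        copies = 0
        pos = start
        # Extend the run one motif-length window at a time.
        while pos + k <= len(seq) and hamming(seq[pos:pos + k], motif) <= max_mismatch:
            copies += 1
            pos += k
        if copies > best_copies:
            best_start, best_copies = start, copies
    return best_start, best_copies


# 35 bp motif reported in the Data Description.
MOTIF = "TATGTATTAGTACATATTATGCATAATTATACATA"

# Toy sequence (hypothetical): short flanks around three motif copies,
# the second copy carrying a single substitution.
variant = MOTIF[:10] + "C" + MOTIF[11:]
toy = "GGGCCC" + MOTIF + variant + MOTIF + "CCCGGG"

start, copies = count_tandem_copies(toy, MOTIF)
print(start, copies)  # prints: 6 3
```

Allowing a small number of mismatches per copy reflects the observation that the motif is slightly variable in some species; a real D-loop analysis would also need to handle indels, which this sketch ignores.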
To validate the mitogenome as a tool for understanding the phylogenetic relationships among the samples, we aligned the FASTA sequences with ClustalW and calculated the p-distance with MEGA X software (Kumar et al., 2018). We used the 13 protein-coding genes (PCGs), extracted by hand from the FASTA file produced by MitoAnnotator, in Bayesian phylogenetic inference with MrBayes 3.2.7 (Ronquist et al., 2012) after calculating the best evolutionary models for each segment with PartitionFinder 2.1.1 (Lanfear et al., 2016). All PCGs used in the phylogeny were tested for saturation (Xia et al., 2003) in DAMBE v7 (Xia, 2018). Bayesian analyses were performed using four independent chains with 10 million generations, and the effective sample size (ESS) and chain convergence were subsequently verified in the software Tracer 1.7 (Rambaut et al., 2018). The first 25% of the generations were discarded as burn-in. For the Maximum Likelihood analysis, we used the concatenated 13 protein-coding genes (PCGs) after testing for the best model in the software MEGA X (Kumar et al., 2018).
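The p-distance used above is simply the proportion of aligned sites at which two sequences differ. The Python sketch below illustrates the calculation; it is not the MEGA X implementation. It assumes pairwise deletion (sites with a gap or ambiguous base in either sequence are skipped), since the text does not state how gaps were treated. The function names and the toy alignment are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, skip="-N?"):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence carries a character in `skip` are ignored
    (pairwise deletion, an assumption made for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip or b in skip:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def pairwise_p_distances(alignment):
    """Return {(name_a, name_b): p-distance} for every pair of sequences."""
    return {
        (a, b): p_distance(alignment[a], alignment[b])
        for a, b in combinations(alignment, 2)
    }


# Toy alignment (hypothetical sequences, not data from this study).
toy_alignment = {
    "taxon_1": "ACGTACGTAC",
    "taxon_2": "ACGTACGTAT",
    "taxon_3": "ACG-ACGAAT",
}

for pair, dist in pairwise_p_distances(toy_alignment).items():
    print(pair, round(dist, 3))
```

In the toy alignment, taxon_1 and taxon_2 differ at 1 of 10 sites (p = 0.1), while comparisons involving taxon_3 use only the 9 sites without a gap.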

DATA DESCRIPTION
Our results show that gene content and gene order were identical in all mitogenomes (Figure 2), with 13 PCGs, 22 tRNA genes, and 2 rRNA genes, as already described for Characiformes mitogenomes. The same is true for other Astyanax/Psalidodon/Deuterodon species, such as A. mexicanus (Nakatani et al., 2011), P. paranae (Silva et al., 2016), and D. giton (Barreto et al., 2017). All PCGs except ND6 are on the heavy strand, as are all but 8 tRNAs.
All the new sequences are deposited at GenBank (Supplementary Table 2). The length of the mitochondrial sequences ranges from 16,626 bp in Psalidodon fasciatus from the São Francisco river basin to 16,812 bp in Psalidodon rivularis 2n = 50. The average length of the D-loop was 1,061 bp, ranging from 951 bp in Psalidodon fasciatus from the São Francisco river basin to 1,136 bp in Psalidodon rivularis with 2n = 50 chromosomes. No differences could be seen between the deposited mitogenome of P. paranae and our reassembly. On the other hand, our pipeline could extend the D-loop region of A. mexicanus, a problematic region in the deposited sequence (named "almost complete" on the GenBank entry). The difference in the size of the D-loop was due to a repeat of 35 bp present in all D-loops except that of Deuterodon giton (Supplementary Table 3). From the alignment, we obtained a repeated motif (TATGTATTAGTACATATTATGCATAATTATACATA) that is slightly variable in some species.
FIGURE 2 | Comparative mitogenomics analysis of all the 10 Stethaprioninae fish against a reference (Psalidodon paranae), generated by Blast Ring Image Generator (BRIG). Gaps in rings correspond to regions with <50% identity to the reference sequence (BLAST comparison). Colors from the center: dd9998; cb3b54; a31418; e85e25; df9856; 95ac42; 38a67f; 79bdbe; 00a8c3; 016db8; 6c5ab1.
Deepening the knowledge of the mitogenome control region, called the D-loop, can play a fundamental role in understanding the evolutionary history of the Astyanax and Psalidodon genera. In this work, we observed that the size variation among different Astyanax/Psalidodon mitogenomes occurs mainly due to the extension of the D-loop. Neglecting this region in the reconstruction of mitogenomes can result in the loss of valuable information since, in addition to the variation in size, we found a repetitive sequence of 35 bp in nine of the 10 mitogenomes studied (Supplementary Table 3). On the other hand, studies based on D-loop sequences must be aware of this kind of feature, which can bias the analysis.
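The lengths reported in this section allow a quick arithmetic check of how much of the size variation falls in the D-loop. The Python sketch below uses only the values given in the text for the shortest and longest mitogenomes; the variable and function names are illustrative, and the conversion of the difference into 35 bp units is only an approximation, since the exact copy numbers are in Supplementary Table 3 and are not reproduced here.

```python
# Lengths in bp as reported in the text for the shortest and longest mitogenomes.
SHORTEST = {"label": "P. fasciatus (São Francisco)", "mitogenome": 16626, "dloop": 951}
LONGEST = {"label": "P. rivularis 2n = 50", "mitogenome": 16812, "dloop": 1136}
REPEAT_UNIT = 35  # bp, size of the D-loop repeat reported in the text


def length_summary(short, long_, unit):
    """Compare total and D-loop length differences between two mitogenomes."""
    total_diff = long_["mitogenome"] - short["mitogenome"]
    dloop_diff = long_["dloop"] - short["dloop"]
    return {
        "total_diff": total_diff,                    # whole-mitogenome difference
        "dloop_diff": dloop_diff,                    # D-loop difference
        "dloop_share": dloop_diff / total_diff,      # fraction explained by D-loop
        "non_dloop_short": short["mitogenome"] - short["dloop"],
        "non_dloop_long": long_["mitogenome"] - long_["dloop"],
        "repeat_units": dloop_diff / unit,           # difference in 35 bp units
    }


summary = length_summary(SHORTEST, LONGEST, REPEAT_UNIT)
for key, value in summary.items():
    print(key, value)
```

With these figures, 185 of the 186 bp separating the two extremes lie in the D-loop, the remainder of the mitogenome differs in length by a single base pair (15,675 vs. 15,676 bp), and the D-loop difference corresponds to roughly five 35 bp units.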
D. giton was the only species that did not present the repetitive sequence in the D-loop. In addition to D. giton occupying a sister-group position in the recovered phylogeny (Figure 1), Barreto et al. (2017), when describing the mitogenome of the species, observed in their phylogeny of Characidae that Grundulus bogotensis was closer to the Astyanax/Psalidodon group than to D. giton. Therefore, we not only confirm D. giton as a species outside the genus Astyanax, as suggested by the taxonomic review (Terán et al., 2020), but also suggest that the repetitive sequence found in the D-loop may correspond to a synapomorphy absent in the Deuterodon group. However, it is necessary to reassemble the genome of D. giton under our methodology and conduct complementary studies on species of the Deuterodon group to clarify this issue.
The genetic distance among species (Supplementary Table 4) is reflected in both the Maximum Likelihood and the Bayesian trees (Figure 1), which are strongly supported, with high bootstrap values and posterior probabilities, respectively. Since no gene sequence showed saturation, we built the phylogeny with all 13 PCGs. The PartitionFinder analysis resulted in five subsets (Supplementary Table 5), which were used in the Bayesian phylogeny.
The topology of the tree is congruent with those inferred by Rossini et al. (2016) and Pazza et al. (2018), except for the North American clade, which appears here as a sister group of A. altiparanae and A. lacustris. Besides, our study reinforces the taxonomic review by Terán et al. (2020), while disagreeing with Lucena and Soares (2016), who described A. altiparanae as a junior synonym of A. lacustris.
The methodology used to reconstruct the mitochondrial genome proved to be satisfactory and enabled the assessment of the length of this type of genome, as well as the composition and nature of the D-loop, addressing possible gaps in previous methodologies (Silva et al., 2016; Barreto et al., 2017; Calegari et al., 2019). Besides, the study of the complete mitochondrial genome proves to be a tool with the potential to solve taxonomic problems and to help understand the evolutionary relationships in species complexes, such as A. bimaculatus, P. fasciatus, and P. scabripinnis.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://www.ncbi.nlm.nih.gov/genbank/, BK013055 MT428072 MT428067 BK013062 BK013061 MT428071 MT428068 MT428069 MT428070.

ETHICS STATEMENT
The animal study was reviewed and approved by the Brazilian Council for the Control of Animal Experimentation and CEUA/UFV-Ethics Committee on Animal Use/Federal University of Viçosa (760/2018).

AUTHOR CONTRIBUTIONS
All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.