Genetic Diversity of the Endangered Neotropical Cichlid Fish (Gymnogeophagus setequedas) in Brazil

Gymnogeophagus setequedas is a rare and rheophilic species of tribe Geophagini, considered endangered in Brazilian red lists. Its previously known geographical distribution range was the Paraná River basin, in Paraguay, and a tributary of the Itaipu Reservoir in Brazil. Since its description no specimens have been collected in the original known distribution area. However, recent records of G. setequedas in the lower Iguaçu River, in a region considered highly endemic for the ichthyofauna, extended the known geographical distribution and may represent one of the last remnants of the species. The aim of this study was to estimate the genetic diversity and population structure of G. setequedas, using microsatellite markers and mitochondrial haplotypes, in order to test the hypothesis of low genetic diversity in this restricted population. Muscle tissue samples of 86 specimens were obtained from nine locations in the Lower Iguaçu River basin, between upstream of the Iguaçu Falls and downstream of the Salto Caxias Reservoir. Seven microsatellite loci were examined and a total of 120 different alleles were obtained. The mean number of alleles per locus (NA) was 17.429, effective alleles (NE) 6.644, expected heterozygosity (HE) 0.675, observed heterozygosity (HO) 0.592, and inbreeding coefficient (FIS) 0.128. Twelve haplotypes in the D-Loop region were revealed, with values of h (0.7642) and π (0.00729), suggesting a large and stable population with a long evolutionary history. Thus, both molecular markers revealed high levels of genetic diversity and indicated the occurrence of a single G. setequedas population distributed along a stretch of approximately 200 km. The pattern of mismatch distribution was multimodal, which is usually ascribed to populations in demographic equilibrium.
Nevertheless, the construction of a new hydroelectric power plant, already underway between the Salto Caxias Reservoir and Iguaçu Falls, could fragment this population, causing loss of genetic diversity and population decline, and for this reason it is necessary to maintain the Iguaçu River tributaries and downstream area from the Lower Iguaçu Reservoir free of additional dams, to guarantee the survival of this species.


INTRODUCTION
The largest biodiversity asset in the world is located in Brazil (ICMBio, 2017). However, more than 1,170 species in Brazil have been classified as threatened with extinction. Of these, more than 26% are Actinopterygii fish found in freshwaters (ICMBio, 2017), including Gymnogeophagus setequedas Reis et al. (1992), the only one of the 17 species of the genus described so far that is considered threatened (Abilhoa and Duboc, 2004; Pavanelli and Reis, 2008). This has led the Brazilian Environmental Ministry to assign the species the status of Endangered (EN) (decree #445, International Union for Conservation of Nature [IUCN], 2014).
Gymnogeophagus setequedas was described based on specimens collected in tributaries of the Paraná River in Paraguay and Brazil, near the Sete Quedas region, an area currently submerged due to construction of the Itaipu Hydroelectric Power Plant (Reis et al., 1992; Pavanelli and Reis, 2008). However, since its description no specimens have been collected in the original known distribution area (Abilhoa and Duboc, 2004; Agostinho et al., 2004; Pavanelli and Reis, 2008). Nevertheless, 15 specimens of G. setequedas were recently collected in the Lower Iguaçu River, both upstream and downstream of Iguaçu Falls in the Iguaçu National Park (Paiz et al., 2017). According to the authors, finding this species in that region was quite unexpected, as the Iguaçu waterfalls have led to effective geographic isolation of the ichthyofauna of the Iguaçu River (Zawadzki et al., 1999), providing an accentuated degree of endemicity, estimated between 51 and 71% (Abell et al., 2008). In addition, due to its high ecological importance, the Iguaçu River basin is considered an ecoregion, separated from the rest of the Paraná River Basin (Abell et al., 2008).
The Iguaçu River basin covers an area of approximately 72,000 km², representing part of the landscape of the three Paraná plateaus, subdivided into three regions: Upper Iguaçu (1st plateau, Curitiba region), Middle Iguaçu (2nd plateau, Ponta Grossa region), and Lower Iguaçu (3rd plateau, Guarapuava region) (Maack, 2001). The portion of the 3rd plateau that includes the Lower Iguaçu is characterized by the presence of numerous waterfalls, such as Salto Grande (13 m), Salto Santiago (40 m), Salto Osório (30 m), and Iguaçu Falls (Maack, 1981). The region is very attractive for hydroelectric use due to its high gradient, and thus the original rapids and waterfalls have been transformed into a sequence of reservoirs that flooded approximately 656 km², remarkably altering the landscape (Júlio Júnior et al., 1997).
It was believed that G. setequedas preferred lentic environments, as do the other species of the genus (Pavanelli and Reis, 2008). However, it seems that this species behaves differently from its congeners, preferring fast waters. This was corroborated by its recent capture in the Lower Iguaçu River, in stretches without impoundment and with fast waters (Paiz et al., 2017). In addition, this species disappeared after construction of the Itaipu reservoir, being collected only twice, suggesting its dependence on lotic environments (Agostinho et al., 2004). Pavanelli and Reis (2008) consider that this species no longer occurs in the Itaipu reservoir, possibly because it did not succeed in colonizing the environment formed after construction of the reservoir. In Paraguay this species is also considered Endangered (EN) (Liotta, 2010), for the same reasons as in Brazil.
According to Wu et al. (2015), understanding the diversity and genetic structure of endangered species is fundamental to effective environmental conservation and management actions. Genetic diversity is essential if populations are to evolve in response to environmental changes. For instance, due to the effects of anthropogenic disturbances, a small and isolated population is more likely to lose genetic diversity, and consequently to decline, than a large population with high genetic diversity (Frankham et al., 2010; Allendorf et al., 2012).
The genetic diversity status of a species is the starting point for the systematic planning of actions to ensure its survival and reduce its risk of extinction. No studies are known that focus on the biology (Pavanelli and Reis, 2008) or genetic diversity of G. setequedas. In addition, the diploid number has only recently been reported (Paiz et al., 2017). Thus, the aim of this study was to estimate the genetic diversity and population structure of G. setequedas along its recently known area of occurrence, using microsatellite markers and mitochondrial haplotypes (D-loop), thereby presenting the first population-level data for this species threatened with extinction.

Study Area and Sampling
Our study area comprises a stretch of the Lower Iguaçu River basin, from upstream of Iguaçu Falls to downstream of the Salto Caxias Reservoir (Figure 1).
Samples from 86 G. setequedas specimens were collected at nine different points, some of them located in the Iguaçu National Park (PNI): two points in the main channel of the Iguaçu River (IGU 1 and IGU 2 -PNI), near the Iguaçu Falls, and seven tributaries of the Iguaçu River (STO-Santo Antônio, SIL-Silva Jardim -PNI, FLO-Floriano -PNI, GON-Gonçalves Dias -PNI, CAP-Capanema, AND-Andrada, and COT-Cotegipe) (Figure 1 and Table 1). The samples were collected in 2012 (November), 2013 (November and December), and 2014 (January, February, March, April, July, August, September, November, and December). The specimens were captured using nets of different mesh sizes and electrofishing. Samples of muscle and rayed fins were taken from the fish, stored in microtubes containing 100% ethanol, and kept at −20 °C. Specimens were fixed in 10% formalin, preserved in 70% ethanol, and deposited in the fish collection of the Zoology Museum at the Universidade Estadual de Londrina under catalog numbers MZUEL 16332, 16353, 16354, 17094-17096.

DNA Extraction and Quantification
Total DNA was extracted from muscle or rayed fins preserved in 95% EtOH following the phenol/chloroform protocol of Almeida et al. (2001). A NanoDrop™ 1000 was used to determine DNA concentrations, and samples were diluted in ultrapure water to 10 ng/µL for microsatellite markers and 5 ng/µL for mtDNA D-loop markers.
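The dilution step can be illustrated with the usual C1·V1 = C2·V2 relation. The following is only a minimal sketch: the function name, the stock concentration, and the final volume are hypothetical values chosen for illustration, and only the target concentrations (10 and 5 ng/µL) come from the text.

```python
def dilution_volumes(stock_ng_per_ul, target_ng_per_ul, final_volume_ul):
    """Return (stock_ul, water_ul) needed to reach the target concentration.

    Applies C1 * V1 = C2 * V2, where V2 is the final volume.
    """
    if target_ng_per_ul > stock_ng_per_ul:
        raise ValueError("target concentration exceeds stock concentration")
    stock_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_ul, final_volume_ul - stock_ul


if __name__ == "__main__":
    # Hypothetical NanoDrop reading of 100 ng/uL and a 50 uL working solution.
    for target in (10.0, 5.0):  # targets taken from the text
        stock, water = dilution_volumes(100.0, target, 50.0)
        print(f"{target} ng/uL: {stock:.1f} uL DNA + {water:.1f} uL water")
```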

mtDNA (D-Loop) Marker
Part of the control region (D-loop) of the G. setequedas mitochondrial DNA was amplified using PCR. The primers used were L 5′-AGAGCGTCGGTCTTGTAAACC-3′ (Cronin et al., 1993) and H 5′-CTGAAGTAGGAACCAGATG-3′ (Meyer et al., 1990). PCR reactions were performed in a 25 µL final volume containing 1X GoTaq Master Mix (Promega), 1 µM [...] Multiple alignment analysis was carried out using the ClustalW application (Thompson et al., 1994) in BioEdit 7.1.3.0 (Hall, 1999). NCBI's BLAST search (Basic Local Alignment Search Tool, Altschul et al., 1990) was used to confirm the origin of the fragment. An online version of tRNAscan-SE (Lowe and Eddy, 1997), available at http://lowelab.ucsc.edu/tRNAscan-SE, was used to search for possible tRNA sequences. Sequences of the 12 different haplotypes were deposited in GenBank (MG581478 to MG581489).

Genetic Analyses (Microsatellites)
Population Structure
The first step for genetic analyses was to define the number of existing populations. For this we used population analyses based on Bayesian approaches that mainly include "attribution methods." These methods calculate the probability of the different genotypes being observed in each population and assign the individuals to the populations according to the probability that their genotypes belong to them, without any a priori inference. Thus, such analyses allow inferring which population an individual belongs to, regardless of its collection site (Beaumont and Rannala, 2004). In order to evaluate the relationship between samples, we conducted a Bayesian cluster analysis of the population using the STRUCTURE v.2.3.3 program (Pritchard et al., 2000). The number of populations (K) was estimated using the admixture model and correlated allele frequencies among populations, with K ranging from 1 to 10 (Evanno et al., 2005). For each value of K, 20 independent runs were performed, each with 100,000 Markov Chain Monte Carlo (MCMC) iterations discarded as burn-in followed by 1,000,000 MCMC iterations. The best-fit number of groupings was evaluated using ln Pr(X/K) (Pritchard et al., 2000) and the ad hoc ΔK statistic (Evanno et al., 2005) in Structure Harvester v.0.6.7 (Earl and VonHoldt, 2012). Graphs representing the membership coefficient of each sampled individual were plotted using Distruct 1.1 (Rosenberg, 2004). Genetic differentiation estimates were assessed from pairwise ΦST values obtained in ARLEQUIN v.3.5.1.3 (Excoffier and Lischer, 2010). Significance was assessed with 10,000 permutations. Subsequently, P-values corresponding to alpha = 0.05 were adjusted using the Holm-Bonferroni correction for multiple tests (Holm, 1979).
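The ad hoc statistic of Evanno et al. (2005), usually written ΔK, is the absolute second-order rate of change of ln Pr(X/K) across successive K values divided by the standard deviation of ln Pr(X/K) among runs. The sketch below is one common way to compute it from run means; the function name and the toy likelihood values are hypothetical, and it is not a reproduction of Structure Harvester. Because the second difference needs both neighbours, ΔK is undefined for the smallest and largest K tested, so K = 1 can only be supported through ln Pr(X/K) itself.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Compute delta K from {K: [ln Pr(X/K) for each independent run]}.

    Each K needs at least two runs so that a standard deviation exists.
    Only K values with both K-1 and K+1 present receive a delta K.
    """
    delta = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in lnp_by_k or k + 1 not in lnp_by_k:
            continue  # second difference undefined at the edges
        second_diff = abs(
            mean(lnp_by_k[k + 1]) - 2 * mean(lnp_by_k[k]) + mean(lnp_by_k[k - 1])
        )
        sd = stdev(lnp_by_k[k])
        delta[k] = second_diff / sd if sd > 0 else float("inf")
    return delta


if __name__ == "__main__":
    # Toy values only; they are not results from this study.
    toy = {1: [-100.0, -102.0], 2: [-90.0, -92.0], 3: [-95.0, -97.0]}
    print(evanno_delta_k(toy))
```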

Genetic Diversity
Number of alleles per locus (NA), effective number of alleles (NE), and observed and expected heterozygosity (HO, HE) were obtained with POPGEN v. 1.31 software (Yeh et al., 2000). The inbreeding coefficient (FIS) was obtained with the Fstat v2.9.3 program (Goudet, 2001). Deviation from Hardy-Weinberg equilibrium (HWE) and linkage disequilibrium between pairs of loci were tested with GENEPOP v.1.2 (Raymond and Rousset, 1995), with P-values later adjusted by the Bonferroni sequential correction (Rice, 1989). MICRO-CHECKER 2.2.1 software (Van Oosterhout et al., 2004) was used to test for the possible presence of null alleles or other genotyping errors such as allelic dropout and reading errors due to stutter peaks.
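For orientation, the per-locus quantities named above can be sketched from their textbook definitions. This is a simplified illustration under stated assumptions: HE is the uncorrected 1 − Σp², FIS is taken as 1 − HO/HE, and the function name and toy genotypes are hypothetical. The programs cited apply their own estimators and sample-size corrections, so their output need not match this sketch exactly.

```python
from collections import Counter


def locus_diversity(genotypes):
    """Summarise one diploid locus given a list of (allele1, allele2) tuples."""
    alleles = [allele for genotype in genotypes for allele in genotype]
    n_copies = len(alleles)
    freqs = [count / n_copies for count in Counter(alleles).values()]
    homozygosity = sum(p * p for p in freqs)  # expected homozygosity, sum of p^2

    na = len(freqs)                  # number of distinct alleles
    ne = 1.0 / homozygosity          # effective number of alleles
    he = 1.0 - homozygosity          # expected heterozygosity (uncorrected)
    ho = sum(1 for a, b in genotypes if a != b) / len(genotypes)
    fis = 1.0 - ho / he if he > 0 else float("nan")
    return {"NA": na, "NE": ne, "HE": he, "HO": ho, "FIS": fis}


if __name__ == "__main__":
    # Toy genotypes coded as fragment sizes; not data from this study.
    toy = [(180, 180), (180, 184), (184, 184), (180, 184)]
    print(locus_diversity(toy))
```

Averaging such per-locus values over all loci gives the kind of summary reported for the whole sample.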

Gene Flow
The contemporary migration rates over the last few generations and the direction of migration among the samples studied were estimated using the BayesAss v 3.0.3 program (Wilson and Rannala, 2003), with 95% confidence intervals. Ten runs were analyzed using different random starting seed numbers, each with 3,000,000 MCMC iterations, including 999,999 discarded burn-in iterations. After the burn-in, every 2000th iteration was sampled. The delta values (maximum amount by which parameter values are allowed to change between iterations) were 0.15 for allele frequencies, 0.025 and 0.05 for migration rate, and 0.15 for inbreeding value.
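The chain settings above imply roughly how many posterior samples each run contributes. The short sketch below does that arithmetic; the function name is hypothetical, and the exact count stored by BayesAss may differ by one depending on how the program indexes iterations.

```python
def retained_samples(total_iterations, burn_in, sampling_interval):
    """Number of MCMC samples kept after burn-in at a fixed thinning interval."""
    if burn_in >= total_iterations:
        raise ValueError("burn-in must be shorter than the chain")
    return (total_iterations - burn_in) // sampling_interval


if __name__ == "__main__":
    # Settings taken from the text: 3,000,000 iterations, 999,999 burn-in,
    # sampling every 2000th iteration.
    print(retained_samples(3_000_000, 999_999, 2000))
```

Under these settings each run retains on the order of a thousand samples, so ten runs with different seeds mainly serve to check that independent chains converge on similar estimates.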

Demographic Analyses
Signs of a recent population bottleneck were evaluated on microsatellite data using the Bottleneck v.1.2.02 program (Piry et al., 1999), considering deviations from mutation-drift equilibrium. Three tests were used. Two of them indicate bottlenecks through a significant excess of heterozygosity: the "Sign test" (Cornuet and Luikart, 1996) and the "Wilcoxon signed-rank test", each run under the Infinite Alleles Model (IAM), the Stepwise Mutation Model (SMM), and the Two-Phase Model (TPM, with 90% SMM and 10% IAM), with significance at P < 0.05. The third was the "Mode shift test", which indicates bottlenecks through alterations in allele frequency distributions.
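As a rough illustration of the mode-shift idea, the sketch below bins allele frequencies into ten classes and checks whether the rarest class (frequency below 0.1) holds the most alleles, which is the L-shaped pattern expected in the absence of a recent bottleneck. The function names, the strict comparison rule, and the example frequencies are assumptions for illustration; the Bottleneck program's own implementation may differ.

```python
def allele_frequency_classes(frequencies, n_classes=10):
    """Count alleles per frequency class: [0, 0.1), [0.1, 0.2), ..., [0.9, 1.0]."""
    counts = [0] * n_classes
    for p in frequencies:
        if not 0.0 < p <= 1.0:
            raise ValueError("allele frequencies must lie in (0, 1]")
        index = min(int(p * n_classes), n_classes - 1)
        counts[index] += 1
    return counts


def is_l_shaped(frequencies):
    """True when the lowest frequency class is strictly the most abundant."""
    counts = allele_frequency_classes(frequencies)
    return counts[0] > max(counts[1:])


if __name__ == "__main__":
    # Hypothetical allele frequencies pooled over loci; not data from this study.
    many_rare = [0.02, 0.03, 0.05, 0.05, 0.15, 0.25, 0.45]
    few_rare = [0.05, 0.25, 0.25, 0.45]
    print(is_l_shaped(many_rare), is_l_shaped(few_rare))
```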

Population Structure
Bayesian clustering analysis (Structure) applied to microsatellite data indicated that the most probable K (K = number of clusters) was K = 1, based on ln Pr(X/K). The graphic representation of K showed no well-defined groups: the ancestry values were distributed homogeneously among individuals and samples, indicating the occurrence of a single G. setequedas population (Figure 2A).

Genetic Diversity
In the entire sample, a total of 120 different alleles were obtained from seven microsatellite loci. The number of alleles per locus (N A ) was 17.429, effective alleles (N E ) 6.644, expected heterozygosity (H E ) 0.675, observed (H O ) heterozygosity 0.592, and inbreeding coefficient (F IS ) 0.128.
After applying the Bonferroni sequential correction, there were no significant deviations (P < 0.05) from Hardy-Weinberg equilibrium (HWE) at the majority of microsatellite loci; only locus Gbra96 showed a significant deviation. This correction was also applied to the linkage disequilibrium (LD) tests, and a significant value was found only for locus Gbra96. The Micro-Checker program found no null alleles among the samples.
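The sequential Bonferroni correction (Rice, 1989) follows the same step-down logic as Holm's (1979) procedure: P-values are ranked from smallest to largest, the smallest is compared with alpha/m, the next with alpha/(m − 1), and so on, stopping at the first test that fails. A minimal sketch is given below; the function name and the seven P-values are hypothetical and are not the values obtained for the loci in this study.

```python
def sequential_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans marking which tests remain significant."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    significant = [False] * m
    for rank, index in enumerate(order):
        # The k-th smallest P-value is compared with alpha / (m - k + 1).
        if p_values[index] <= alpha / (m - rank):
            significant[index] = True
        else:
            break  # all larger P-values are also declared non-significant
    return significant


if __name__ == "__main__":
    # Hypothetical per-locus HWE P-values for seven loci.
    toy = [0.001, 0.04, 0.03, 0.20, 0.012, 0.5, 0.6]
    print(sequential_bonferroni(toy))
```

In this toy case only the first test survives, even though three further P-values fall below the uncorrected 0.05 threshold.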
From the amplification and sequencing of mtDNA of 82 G. setequedas individuals, a 449 bp fragment from the D-loop region was obtained. Twenty polymorphic sites (17 transitions and three transversions) and four indel sites were found. Twelve different haplotypes were revealed, of which four (H5, H7, H10, and H12) were singletons (Figure 2B). The H2 haplotype was the most frequent, observed in 37 samples from four different locations (IGU1, IGU2, STO, and FLO). Although SIL, FLO, GON, COT, and CAP presented only one haplotype each, these haplotypes are shared with other locations, except for H5, found only in CAP. IGU2 had the highest number of haplotypes (N = 8), followed by STO (N = 6), AND (N = 3), and IGU1 (N = 2). Haplotype (h) and nucleotide (π) diversity values were 0.7642 and 0.00729, respectively (Table 2).
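The two mitochondrial summary statistics can be sketched from their standard definitions: h = n/(n − 1) · (1 − Σp²), where p is the frequency of each haplotype, and π as the mean proportion of differing sites over all pairs of sequences. The code below is an illustrative sketch only; the function names and the four toy sequences are hypothetical, and every differing alignment column (including gaps) is counted as a difference, which may not match how the software used in the study treats indels.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(sequences):
    """h = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(sequences)
    if n < 2:
        raise ValueError("at least two sequences are required")
    counts = Counter(sequences)
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))


def nucleotide_diversity(sequences):
    """Mean number of pairwise differences per site for aligned sequences."""
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to equal length")
    pairs = list(combinations(sequences, 2))
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / (len(pairs) * length)


if __name__ == "__main__":
    toy = ["AAAA", "AAAA", "AAAT", "AATT"]  # toy alignment, not study data
    print(haplotype_diversity(toy), nucleotide_diversity(toy))
```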

Demographic Analyses
The signed-rank test did not produce significant values under any of the mutational models (IAM, SMM, or TPM). In the mode-shift test, the samples showed the typical L-shaped (non-bottleneck) distribution of allele frequencies (Table 3). The mismatch distribution graph showed a multimodal distribution for haplotypes (Figure 3), which is usually ascribed to populations in demographic equilibrium. In the neutrality tests, Tajima's D and Fu's Fs values were negative and not significant (Table 2).
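A mismatch distribution is simply the frequency of each observed number of differences among all pairs of sequences. The sketch below builds that distribution and counts its local peaks as a crude indicator of uni- versus multimodality. The function names, the peak-counting rule, and the toy sequences are assumptions for illustration; formal analyses compare the observed curve with expectations under a demographic model.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(sequences):
    """Return {number of differences: proportion of sequence pairs}."""
    diffs = [
        sum(a != b for a, b in zip(s, t)) for s, t in combinations(sequences, 2)
    ]
    counts = Counter(diffs)
    return {d: counts[d] / len(diffs) for d in range(max(diffs) + 1)}


def count_modes(distribution):
    """Count local maxima, treating runs of equal values as a single point."""
    values = [distribution[d] for d in sorted(distribution)]
    collapsed = [v for i, v in enumerate(values) if i == 0 or v != values[i - 1]]
    padded = [-1.0] + collapsed + [-1.0]
    return sum(
        1
        for i in range(1, len(padded) - 1)
        if padded[i - 1] < padded[i] > padded[i + 1]
    )


if __name__ == "__main__":
    toy = ["AAAA", "AAAA", "AAAT", "AATT"]  # toy alignment, not study data
    dist = mismatch_distribution(toy)
    print(dist, count_modes(dist))
```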

Gene Flow
Bayesian gene flow analysis of the microsatellite data revealed contemporary migration values among the samples within the confidence interval (95%). The proportion of non-immigrants in each sample ranged from 79 to 82.6%. Migration estimates showed similar values among the majority of samples. The lowest migration estimates were from IGU 1 to COT and from IGU 2 to AND, both with 1.9%. On the other hand, the highest migration values were from FLO to IGU2 and to GON, and from COT to CAP, with 2.7%. Among all the samples, FLO showed the highest percentage of migrants for the largest number of sites (Table 4).

Genetic Diversity and Population Structure
The study of molecular markers, such as microsatellites and mtDNA, generates important information on the genetic variation and structure of fish species and is a significant step toward realizing the goal of conserving species in their natural populations (Piorski et al., 2008; Garcez et al., 2011; Abdul-Muneer, 2014). According to Wu et al. (2015), understanding the diversity and genetic structure of endangered species is essential to effective environmental conservation and management action. The long-term persistence of species depends on sufficient genetic diversity to adapt and survive in variable or changing environments (Hughes et al., 2008).
According to DeWoody and Avise (2000), based on a meta-analysis of microsatellite polymorphisms, freshwater fish have, on average, 9.1 ± 6.1 alleles and an expected heterozygosity of 0.54 ± 0.25 per population. Therefore, based on microsatellites, the genetic diversity of the G. setequedas population in terms of allele number (17.14) and expected heterozygosity (0.67) is as expected for freshwater fish. In addition to the high diversity in the nuclear markers, the G. setequedas analyzed showed significant variation in mitochondrial DNA, exhibiting high levels of genetic diversity. According to Freeland (2005), h is considered the haploid equivalent of HE in data on diploids. The similarity between these estimates suggests that the current variation in nuclear and mitochondrial DNA is evenly distributed throughout the G. setequedas population.
A high level of genetic diversity is an important attribute for species and may confer the basis for adaptation to environmental change (Piorski et al., 2008), especially when it comes to endangered species such as G. setequedas (Abilhoa and Duboc, 2004;Pavanelli and Reis, 2008; International Union for Conservation of Nature [IUCN], 2014).

Demographic History
TABLE 2 | Genetic diversity of G. setequedas in the Lower Iguaçu River basin, based on microsatellite markers and mitochondrial haplotypes (D-Loop).
A 449 bp fragment of the D-Loop region, one of the most variable regions of mtDNA (Frankham et al., 2010), revealed 12 haplotypes and high values of π (0.00729) and h (0.750). In addition, analysis of the microsatellite data using the Bottleneck program showed no significant recent bottlenecks. The absence of recent bottlenecks is corroborated by the high haplotype (h > 0.5) and nucleotide diversity (π > 0.5%) values in the mtDNA. According to Grant and Bowen (1998), high haplotypic diversity combined with high nucleotide diversity represents a large and stable population with a long evolutionary history, or secondary contact between different lineages. A stable population with a long evolutionary history seems to be a very plausible possibility for G. setequedas, as the Iguaçu waterfalls have exerted effective geographic isolation on the ichthyofauna of the Iguaçu River (Zawadzki et al., 1999), providing an accentuated degree of endemicity, of more than 70% (Abell et al., 2008). In addition, analysis of the distribution of substitution differences between pairs of haplotypes (mismatch distribution) (Cunha and Solé-Cava, 2012) shows multimodal distributions, which are generally attributed to populations in demographic equilibrium (Rogers and Harpending, 1992). However, negative values in Tajima's D test and Fu's Fs test, even if not significant, could suggest population expansion after an ancient bottleneck (Slatkin and Hudson, 1991; Grant and Bowen, 1998), indicating that all the current haplotypes are closely related and derived from a single main haplotype (H2). Signs of old bottlenecks may be less evident at microsatellite loci, since they tend to recover variation more rapidly than mitochondrial sequences. At the same time, recovery of π after a genetic bottleneck is slower than that of h in mtDNA (McCusker and Bentzen, 2010).
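The sign of Tajima's D follows from comparing two estimators of diversity: the mean number of pairwise differences (k) and the number of segregating sites scaled by a1 = Σ 1/i for i = 1 to n − 1. D is negative when k falls below S/a1, that is, when there is an excess of rare variants, as can occur after population expansion. The sketch below implements the standard Tajima (1989) formula; the function name and the example inputs are hypothetical and are not the values underlying Table 2.

```python
from math import sqrt


def tajimas_d(n, seg_sites, mean_pairwise_diff):
    """Tajima's D from sample size n, segregating sites S, and mean pairwise differences k."""
    if n < 4:
        raise ValueError("the variance term is zero for n < 4")
    if seg_sites < 1:
        raise ValueError("at least one segregating site is required")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (mean_pairwise_diff - seg_sites / a1) / sqrt(variance)


if __name__ == "__main__":
    # Hypothetical inputs: 10 sequences, 5 segregating sites.
    print(tajimas_d(10, 5, 1.0))  # k below S/a1, so D is negative
    print(tajimas_d(10, 5, 3.0))  # k above S/a1, so D is positive
```

Significance is normally judged against simulated or tabulated distributions rather than from the sign alone.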

Gene Flow
The individuals of the Floriano River (FLO) presented the highest rates of migration, and the highest levels of admixture were found in the Iguaçu 2 and Gonçalves Dias samples. The specimens from rivers further upstream in the drainage (FLO, GON, CAP, and COT) appear to have higher levels of admixture among them. This might suggest that entry into the upper tributaries is more likely than into the lower ones. However, the highest rates of migrants to Iguaçu 1 are from the most upstream tributaries, suggesting that populations of G. setequedas maintain satisfactory gene flow in all stretches of the river studied. According to Palstra and Ruzzante (2008), if local populations are small, as is the case in the present study, gene flow is the key factor to prevent the stochastic loss of genetic diversity, besides providing the required alleles to subpopulations under selection that lack favorable genotypes (Kinnison and Hairston, 2007). Although these results allow gene flow between localities to be inferred, according to Wilson and Rannala (2003) a more robust estimate can be reached with a larger sample size per locality. This could be a goal for further studies, but it is difficult to achieve immediately because the species is not abundant and is mainly distributed in a preservation area.
According to Fagan et al. (2002, 2005), riverine populations are forecasted to be particularly vulnerable to fragmentation due to their dendritic structure, which may be exacerbated by unidirectional migration. Natural barriers (rapids and waterfalls) and man-made structures, such as dams, also fragment riverine populations, influencing the dispersal rate and migration pattern (Wofford et al., 2005), even of rheophilic fish species with strong swimming abilities such as G. setequedas (Paiz et al., 2017). The construction of a new hydroelectric power plant (Baixo Iguaçu HPP), already underway between the Salto Caxias Reservoir and Iguaçu Falls, could fragment this population, preventing gene flow. As a consequence, there may be loss of genetic diversity and population decline, especially in the area of the future reservoir. Moreover, this separated population could go extinct, as has already happened with another population of G. setequedas after the construction of the Itaipu Hydroelectric Power Plant. That disappearance was attributed to the lentic waters of the Itaipu Reservoir, which isolated populations of this rheophilic species that previously occurred in tributaries of both river banks, in Paraguay and Brazil, and probably in the Paraná River (Paiz et al., 2017).
TABLE 4 note: The bold values along the diagonal represent non-migrants within a putative source subpopulation.
Frontiers in Genetics | www.frontiersin.org

Conservation Implications
The abundance, dispersal, and population size are reduced in populations structured by habitat fragmentation due to barriers such as dams, thereby increasing the risk of extinction (Gross et al., 2004; Letcher et al., 2007). This fragmentation can lead to the total or partial isolation of a population, conditioning the response of the individuals. Thus, in the recently found population of G. setequedas in the Iguaçu River, a drastic reduction and loss of genetic diversity due to inbreeding must be avoided by preserving the lotic characteristics of the environment.
For instance, due to the effects of anthropogenic disturbance, small and isolated populations are more likely to suffer loss of genetic diversity and population decline than large populations with high genetic diversity (Frankham et al., 2010; Allendorf et al., 2012). According to Frankham (2003), inbreeding reduces reproduction and survival rates, and loss of genetic diversity reduces the ability of populations to evolve to cope with environmental changes, increasing extinction risk.
The type locality and most of the records of G. setequedas are in Paraguay, in tributaries of the right bank of the Paraná River, in the region of influence of the Itaipu reservoir and downstream (Reis et al., 1992). Since the species description, despite several attempts, it has not been possible to collect new specimens from the known geographic range of occurrence (Agostinho et al., 2004; Pavanelli and Reis, 2008). According to Pavanelli and Reis (2008), this species no longer occurs in the Itaipu reservoir or in the floodplain upstream of the reservoir. Despite several collection efforts on the Iguaçu River (Pavanelli and Reis, 2008), mainly in the Lower Iguaçu upstream of the National Park (Baumgartner et al., 2012), this species was not collected. For this reason, the conservation status of G. setequedas was invariably attributed to a threatened category (Abilhoa and Duboc, 2004; Pavanelli and Reis, 2008; International Union for Conservation of Nature [IUCN], 2014; Paiz et al., 2017). However, Paiz et al. (2017) and the present study recently reported the presence of G. setequedas in the Lower Iguaçu in the National Park region. Thus, the population of G. setequedas of the Lower Iguaçu River may be one of the last remnants of this species and, according to Pavanelli and Reis (2008), as G. setequedas is a naturally rare species, it is advisable that any anthropogenic changes in its original ecosystem be discouraged.
The results presented here demonstrate that the population of G. setequedas of the Iguaçu River still maintains satisfactory levels of genetic diversity. However, in terms of conservation management plans, to guarantee the survival of this species, it is necessary to maintain the tributaries of the Iguaçu River and the downstream area from the future reservoir (Baixo Iguaçu Reservoir) without additional dams. Long-term monitoring of genetic diversity and inbreeding could also help conserve this population and provide a basis for future decisions.

ETHICS STATEMENT
This study was carried out in strict accordance with the recommendations provided in the Guide for the Care and Use of Laboratory Animals. Collection was authorized by the System of Authorization and Information on Biodiversity - SISBIO (SISBIO no. 25648-3 and 25648-4), by the Chico Mendes Institute for Biodiversity Conservation (ICMBio 003/2014 and Official SEI no. 63/2016-DIBIO/ICMBio), and by the Environmental Institute of Paraná - IAP (no. 37788 and 43394). The sampling protocol was approved by the Ethics Committee on the Use of Animals - CEUA of the Universidade Estadual do Oeste do Paraná (no. 62/09).

AUTHOR CONTRIBUTIONS
LS-S, DF, MM, and OS designed the research. LA, SP, SM, and MM collected data. LS-S, TK-D, and DF performed the molecular genetic studies. All authors contributed to the writing of the manuscript.