A Multilocus Approach to Understanding Historical and Contemporary Demography of the Keystone Floodplain Species Colossoma macropomum (Teleostei: Characiformes)

We studied the natural populations of a flagship fish species of the Amazon, Colossoma macropomum which in recent years has been suffering from severe exploitation. Our aim was to investigate the existence or not of genetic differentiation across the wide area of its distribution and to investigate changes in its effective population size throughout its evolutionary history. We sampled individuals from 21 locations distributed throughout the Amazon basin. We analyzed 539 individuals for mitochondrial genes (control region and ATPase gene 6/8), generating 1,561 base pairs, and genotyped 604 individuals for 13 microsatellite loci obtaining, on average, 21.4 alleles per locus. Mean HE was 0.78 suggesting moderate levels of genetic variability. AMOVA and other tests used to detect the population structure based on both markers indicate that C. macropomum comprises a single and large panmitic population in the main channel of the Solimões-Amazonas River basin, on the other hand localities in the headwaters of the tributaries Juruá, Purus, Madeira, Tapajós, and localities of black water, showed genetic structure. The greatest genetic differentiation was observed between the Brazilian Amazon basin and the Bolivian sub-basin with restricted genetic flow between the two basins. Demographic analyzes of mitochondrial genes indicated population expansion in the Brazilian and Bolivian Amazon basins during the Pleistocene, and microsatellite data indicated a population reduction during the Holocene. This shows that the historical demography of C. macropomum is highly dynamic. Conservation and management strategies should be designed to respect the existing population structure and minimize the effects of overfishing by limiting fisheries C. macropomum populations.


INTRODUCTION
The Amazon basin holds the largest diversity of fishes in the world. It is estimated that approximately 2,411 fish species occur there (Reis et al., 2016), with 1,089 species being endemic. Aquatic biodiversity of the Amazon basin is thought to be the consequence of diversification of modern fauna that occurred mainly during the Miocene (Lovejoy et al., 2010), driven, to a large extent by the establishment of the current hydroscape. Amazonian rivers also drain three principal geological formations, the Andes and the Guyana and Brazilian Shields, with consequences for the physicochemical properties of the waters draining these geological formations. Thus some of these rivers present physical barriers which limit geneflow between different sections of the river, further acting as agents of divergence (Hoorn et al., 2010). Naturally, all of these forces interact, producing an amazingly diverse ichthyofauna. Part of this ichthyofauna is also exploited as a fisheries resource that represent the production base of an economic sector that contributes more than US$ 200 million per year to the economy of the Brazilian Amazon basin (Barthem and Fabré, 2003). Colossoma macropomum (tambaqui) is on the top of the list of most important commercial species. This species also has an important ecological role, as it is an important disperser of seeds of trees and shrubs of the Amazonian floodplain (Araújo-Lima and Goulding, 1998). For all its commercial importance, this species has suffered over-exploitation of its natural stocks over the last years and today, juveniles account for most of the catch (Barthem and Goulding, 2007). The average size of the fish landed and sold in the main markets in the Amazon suggests that many individuals are captured before reaching sexual maturation, which occurs in females between 50 and 55 cm in length, at an estimated mean age of 3 years (Goulding and Carvalho, 1982; based on the length/age relationship estimated from Bertalanffy's model by . Colossoma macropomum is found throughout almost the entire length of the Amazon River and most of its affluents, as well as in the Orinoco basin (Araújo-Lima and Goulding, 1998). Thus, the species is found in the three main Amazonian water types (white, clear, and black), as well as upstream and downstream of geographic barriers such as rapids (Goulding et al., 2003). Colossoma macropomum is thus an idea candidate for the study of and the understanding of the structuring patterns in the Amazon basin, which are important for the implementation of science-driven conservation measures.
Previous studies have found a high degree of genetic variability of populations of C. macropomum (Santos et al., 2007;Farias et al., 2010), indicating that overfishing has not yet affected genetic diversity of wild populations, nor were signals of population reduction detectable. The authors suggested that the absence of a genetic sign of population reduction was likely related to the large effective population size of the species. Moreover, C. macropomum has migratory behavior and moves through the rivers of the Amazon seasonally for the purposes of feeding and breeding (Araújo-Lima and Ruffino, 2004). The behavior of C. macropomum and the hydrological dynamic of the floodplain habitat it predominantly occupies may partially explain the panmixia reported for this species. This apparent lack of population structuring is found throughout the mainstream of the Amazon basin, with the exception of individuals found upstream of the series of rapids delimiting the Bolivian sub-basin from the Amazon basin . The authors also suggested that populations of C. macropomum from Bolivian sub-basin were largely demographically stable, while the Brazilian Amazon basin populations evidenced a historical population growth from the Pleistocene onward.
Knowledge of changes of effective population sizes of C. macropomum is important for understanding the demography of the species. In addition, robust estimates of population differentiation, are important for implementing conservation and management strategies. Therefore, the aim of the present study was to test the two hypotheses raised in previous studies of C. macropomum using samples from the entire area of distribution of C. macropomum in the Amazon basin, and using both nuclear-encoded microsatellites and mtDNA genes sequences. As the first hypothesis we test if C. macropomum populations are differentiated, considering: (i) samples of the mainstream of the Amazon River, as well as eight of its main tributaries; (ii) samples of all three major water types of the Amazon (white, clear, and black water) based on the classifications of Sioli (1984) and Venticinque et al. (2016); (iii) samples of water upstream and downstream of rapids in the Madeira and Tapajós rivers. In the second hypothesis, historical and contemporary demographic approaches were used to test if C. macropomum underwent changes in the effective population size throughout its evolutionary history in the Amazon basin.

Samples and Data Collection
A total of 637 samples of Colossoma macropomum were collected directly from artisanal fishers at 21 localities within the Brazilian Amazon basin and one locality in the Bolivian subbasin (Figure 1). The adipose fin or fragment of muscle tissue was removed from between 20 (localities within the Brazilian Amazon Basin) and 69 individuals (within the Bolivian subbasin) and then preserved in 100% ethanol for subsequent laboratory analyzes.
Total genomic DNA was extracted using Proteinase K/Phenolchloroform/isoamyl alcohol protocol and precipitated with 70% ethanol (Sambrook et al., 1989). Approximately 50 to 100 ng of genomic DNA was used as a template for PCR reactions. We amplified the mitochondrial DNA control region (mtDNA control region) and the ATPase subunits 6 and 8, using the primers Chara_LDloop and Chara_RDloop; CMF2 and CMR2 (control region) and ATP 8.2_L8331 and CO3.2_H9236 (ATPase genes) listed in Supplementary Table S1. The PCR reactions for the two regions were performed in a final volume of 15 µL containing 1.5 µl of the forward and reverse primer (2 mM), 1.5 µl of buffer (Tris-KCL 200 mM, pH 8.5), 1.5 µL of 25 mM MgCl, 1.5 µL of 25 mM dNTP, 0.3 µL of 5 U/µL Taq polymerase and 6.2 µL ddH 2 O. PCR conditions (for control region and ATPase gene) were as follows: denaturation at 94 • C for 60 s, primer annealing at 50 • C for 30 s, primer extension at 68 • C for 90 s, followed by a final extension at 68 • C for 5 min. The first three steps were repeated 35 times.
Purification of the PCR products was performed using ExoSAP (Exonuclease Enzymes and Shrimp Phosphatase Alkaline). The samples were sequenced using the BigDye terminator v3 kit (ThermoFisher), following the manufacturer's protocol. Due to the size of the control region of C. macropomum (approximately 1,100 bp), each sample was sequenced in two steps, using the CMF2 (forward) and CMR2 (reverse) internal primers (Supplementary Table S1). For the ATPase gene only the primer ATP 8.2 (forward) was used. The precipitated product was resolved in the ABI 3130xl DNA Analysis System sequencer (ThermoFisher), according to the manufacturer's standard protocol.
PCR conditions were as follows: denaturation at 94 • C for 20 s, primer annealing at 60-65 • C (depending on the primer combination) for 20 s, and extension at 68 • C for 30 s, repeated for 30 times, followed by another cycle for annealing the M13 primers with the following conditions: denaturation at 94 • C for 20 s, annealing of the M13 fluorescence-labeled primer at 50 • C for 20 s, and extension at 68 • C for 30 s, repeated for 20 times, with final extension of 30 min at 68 • C.
For the genotyping reaction the PCR products were diluted to between 10-50 µL with ultra-pure water depending on the intensity of PCR products on an agarose gel. For each 1 µl of diluted product, 8.0 µL of Hi-Di formamide (ThermoFisher, Inc.), and 1.0 µL 6-carboxy-X-rhodamine (ROX) size standard from DeWoody et al. (2004) were added. The samples were genotyped in ABI 3130xl automatic sequencer (ThermoFisher, Inc.) and allele sizes (in base pairs) were estimated in GeneMapper TM software version 4.0 (ThermoFisher, Inc.). Matrix of genotypes is available at https://github.com/legalLab/datasets.
The sequences of the control region and subunits 6 and 8 of the ATPase gene were verified, edited and aligned in the program BIOEDIT v7.0.5 (Hall, 1999). The ATPase genes were translated into hypothetical amino acids in the program MEGA 6.0 (Tamura et al., 2013) to verify the presence of any unexpected stop codons. Sequences were deposited in the GenBank under accession numbers MH514288-MH514827 for control region and MH520124-MH520663 for ATPase genes.

Mitochondrial DNA Analyses
The existence of population structure was tested for using the Analysis Molecular Variance (AMOVA) implemented in the program ARLEQUIN v3.5 (Excoffier and Lischer, 2010). We analyzed three datasets: (1) all 21 locations were analyzed as a single hierarchical level; (2) the Guajará-Mirim (Bolivian basin) locality was removed from the data matrix; 3) tributaries vs. locations of the main channel, of Amazon River. Pairwise ST were also estimated, and statistical significance was corrected for multiple comparisons (Rice, 1989). We also tested for population structuring using the Spatial Analysis of Molecular Variance (SAMOVA) (Dupanloup et al., 2002), and the Bayesian Analysis of Genetic Population Structure (BAPS) (Mantel, 1967;Corander and Tang, 2007). Spatial structuring was tested using the Mantel test as implemented in ARLEQUIN v3.51 (Excoffier and Lischer, 2010).
In order to investigate patterns of change in C. macropomum historical effective population sizes, we carried out a Bayesian Skyline plot analyses in the program BEAST v.1.8.4 (Drummond and Rambaut, 2007). We collected 50,000,000 Monte Carlo Markov Chain steps (MCMC), discarde the first 5,000,000 steps as burnin, and subsequently sampled every 1,000th step, retaining 45,000 topologies. The HKY85 (Hasegawa et al., 1985) model of molecular evolution was selected as the best fitting model in the program Modeltest. We estimated a genetic network of mtDNA haplotypes from all samples using Network (http://www.fluxus-engineering.com/) using the median-joining algorithm.
To convert the results of the coalescent analyzes into years and effective number of individuals, we assumed a threeyear generation time (Goulding and Carvalho, 1982;, and a rate of molecular evolution of 2.0 × 10 −8 mutations per site and per year . The Tajima's D (Tajima, 1989) and the Fu's Fs (Fu, 1997) tests were used to examine whether populations are at a mutation-drift equilibrium assuming no selective differences among haplotypes. Both tests were performed using the program ARLEQUIN v3.5 (Excoffier and Lischer, 2010). Demographic history may also be inferred from frequency distribution of pairwise haplotype differences. In populations that are at a demographic equilibrium, the distribution of differences are generally multimodal, while populations that have undergone recent expansion or reduction typically have a unimodal distribution (Slatkin and Hudson, 1991). In order to distinguish population reduction and expansion, we used the results of two tests. The first test evaluates the distribution of the sum of the squares of differences (SSD) between the mismatch distribution observed for each locality and the expected distribution for a null expansion model, where significant values for SSD indicate deviations from the population expansion model (Schneider and Excoffier, 1999). The other test is based on the Harpending inequality index (Hri = r) (Harpending, 1994), which quantifies the variance of the mismatch distribution, assuming that the mismatch distribution is unimodal. These analyzes were performed in the programs DNASP v5.0 (Librado and Rozas, 2009) and ARLEQUIN v3.5 (Excoffier and Lischer, 2010); significance was tested via 10,000 permutations with a P = 0.05 cut-off.

Microsatellite DNA Analyses
The data matrix with allele sizes was verified for the occurrence of null alleles, allelic stutter, and large allele dropout in the program MICRO-CHECKER (Van Oosterhout et al., 2004) The number of alleles (A), observed (H O ) and expected (H E ) heterozygosities, gene diversity (h), nucleotide diversity (π), linkage disequilibrium (LD) between pairs of loci and the Hardy-Weinberg equilibrium (HWE) were calculated. All these parameters were estimated using ARLEQUIN v3.5 (Excoffier and Lischer, 2010). Considering that some of these estimates suffer influence of sample size (Leberg, 2002), we implemented a rarefaction analysis and calculated allelic richness (AR) and private allelic richness (PAR) in the program HP-Rare (Kalinowski, 2005), so that the number of alleles and allele richness estimates could be compared between localities. Additionally, we estimated the inbreeding coefficient (F IS ) for each sample. The effective population size (Ne) for each population was estimated using the LD method (Waples and Do, 2008) as implemented in NeEstimator 2.0 (Do et al., 2014). The Ne estimates are equivalent to the effective number of breeders that produced offspring during a certain period of time and assuming that sample sizes are not representative of the entire generation (Palstra and Fraser, 2012). In all instances, significance levels for tests involving multiple comparisons were adjusted using the sequential Bonferroni correction (Rice, 1989).
The overall genetic structuring was estimated using the analysis of molecular variance (AMOVA- Excoffier et al., 1992) performed in the program Arlequin v3.5 (Excoffier and Lischer, 2010). We also analyzed three datasets: (1) all 21 locations were analyzed as a single hierarchical level; (2) the Guajará-Mirim (Bolivian basin) locality was removed from the data matrix; (3) tributaries vs. locations of the main channel of Amazon River. Genetic differentiation between pairs of populations was estimated using F ST. Additionally, pairwise genetic differentiation between populations was estimated using Hedrick's G ST , (Hedrick, 2005) based on the empirical Bayes (EB) G ST estimator (Kitada et al., 2007) suitable for high gene flow species (Kitada et al., 2017), using the FinePop 1.3.0 package (Kitada et al., 2017) implemented in the R statistical language (R Development Core Team, 2011).
We used SAMOVA (Dupanloup et al., 2002) to infer spatial population structure and STRUCTURE v2.3.4 (Pritchard et al., 2000) to identify biological populations. For STRUCTURE analyses we used the admixture and correlated allelic frequencies models with and without location prior, and we tested one to 20 groups (K = 1-20). The analysis was run with 1,000,000 MCMC step, discarding the first 100,000 steps. The Isolation by Distance (IBD) was tested via a correlation between genetic and geographical distance using the Mantel test implemented in the Software Arlequin (Excoffier and Lischer, 2010). In addition, we also used a multivariate approaches implementing the Discriminant Analysis of Principal Components (DAPC; Jombart et al., 2010) to cluster genotypes using the R package Adegenet (Jombart, 2008).
Coalescent analyses implemented in the program IMa2 (Hey and Nielsen, 2007) were used to partition allele sharing between populations due to ongoing geneflow and ancestral haplotype sharing. We estimated the parameters t (splitting time), m (migration rates), and theta (θ ) where θ = 2 Neµ. We sampled 20,000,000 Monte Carlo Chain Markov Chain Monte Carlo (MCMC) generations after discarding the first 1,000,000 generations as burn-in. Two independent runs were carried out with different starting points, in order to verify convergence. The two independent runs converged and thus were combined and the parameters θ , m, and t were estimated. Then, these were converted into demographic parameters: contemporary effective population size, number of migrants per generation, and time of divergence of the populations in generations. The analyses using IMa2 were performed with pairs of sampling sites located at the geographic extremes along the main channel of the Amazon River and between upstream and downstream localities of principal tributaries. We also estimated geneflow using the program MIGRATE 3.1.6 (Beerli and Felsenstein, 2001), where for diploid data θ = 4 Neµ and M = m/µ migration rate ratio and mutation rate. For the Bayesian analysis, we ran ten short chains, sampling each chain 10,000 times. Then we ran six long chains of 2,000,000 steps, sampling each chain 200,000 times and discarding the first 2,000 samples. The runs were replicated, and the convergence between the chains was evaluated using the Gelman-Rubin statistic implemented in the program. We estimated the historical migration rates (M) between the localities and the relative number of migrants per generation Nm = Mθ/2. To convert the results into biological information, we assumed a 3-year generation time for C. macropomum (previously justified with mtDNA data), and a mutation rate µ = 5 × 10 −4 (mean rate of evolution of microsatellites, Di Rienzo et al., 1994).
To detect, quantify, and date the historical and contemporaneous demographic changes in C. macropomum populations we implemented the coalescent sampler implemented in the program MSVar v1.3 (Beaumont, 1999;Storz et al., 2002). We ran 11 independent parallel chains sampling every 1,000th proposal to collect 20,000 proposals in the MCMC chain in each parallel run. Priors for current and historical population size means and variances were equal, and variances encompassed three orders of magnitude. Prior for mean time of population size change was set at 1,000 with variance encompassing time range from 1,000,000 to 0 generations ago. The runs were evaluated for convergence and were pooled to provide an estimate of current and historical effective population size. Convergence was assessed using the Gelman-Rubin criterion (Gelman and Rubin, 1992) and the test of alternative hypotheses (population decline vs. stable population size) as suggested by Beaumont (1999) was tested using Bayes factors. Calculations and plots were performed in the R statistical programming language (R Development Core Team, 2011) using the packages CODA and ggplot2.
In order to verify reduction in effective population size (Ne) or bottleneck effect, we tested for heterozygosity excess in the program BOTTLENECK (Piry et al., 1999) using three different mutation models: the stepwise mutation model, SMM (Ohta and Kimura, 1973); the two-phase model, TPM (Di Rienzo et al., 1994); and the infinite alleles model, IAM (Estoup et al., 1995). Genetic bottlenecks can also leave a signature in the ratio of number alleles to the allele size range (the M-ratio), where a bottleneck depletes the number of alleles faster than reducing allelic size range of the microsatellite (Garza and Williamson, 2001). We calculated the M-ratio using ARLEQUIN, and considered a reduction in the number of alleles to occur when M < 0.68, as suggested by Garza and Williamson (2001).

Genetic Diversity of C. macropomum
Eight hundred and thirty-nine base pairs from the control region and 732 base pairs from the ATPase6/8 gene were obtained from 539 individuals. Approximately 5% of the samples were resequenced to confirm the sequences obtained. The sequences of the mitochondrial gene fragments were concatenated, resulting in a total of 1,561 base pairs. A total of 444 haplotypes were found, 400 of which were unique. The haplotype network showed numerous reticulations between haplotypes (Figure 2). There was very little clustering among the haplotypes found at most localities, implying in a high degree of gene exchange. High and relatively homogeneous values of haplotype diversity was found, ranging from h = 0.895 in Porto Velho to h = 1.000 at nine of the 21 sampled localities ( Table 1).
A total of 604 individuals were genotyped for 13 microsatellite loci. The data revealed no evidence of allelic stutters or large allele dropouts (genotyping errors) and neither linkage disequilibrium (LD). However, deviation from the Hardy-Weinberg equilibrium (EHW) was observed at the Cm1E3 locus in 17 of the 21 sampled localities and the locus was therefore removed from the population analyses. Genetic variability parameters were quite homogeneous among the individuals from different localities (See Table 1). The expected heterozygosity (H E ) ranged from 0.714 in Guajará-Mirim to 0.797 in Eirunepé (Juruá River). Mean heterozygosity was 0.777 ± 0.395 for all loci and all locations (Supplementary Table S3). Allelic richness varies from 5.32 alleles in Guajará-Mirim (Guaporé River) to 6.53 in Carauari (Juruá River). The endogamy coefficient (F IS ) ranged from 0.023 to 0.144 and was significant for all localities. FIGURE 2 | Haplotype network of Colossoma macropomum haplotypes estimated using Network. Each line represents a single mutation. Circle size correspond to the number of observations, and missing haplotypes remain unfilled. Colors corresponds to observation of a haplotype in one of three main regions (yellow = mainstream of Amazon River, blue = tributaries of the Amazon River, and red = Bolivian sub-basin).

Population Structure
AMOVA of the mtDNA data demonstrated that more than 90% of genetic variance was within sampling sites. When AMOVA was performed without Guajará-Mirim (Bolivian sub-basin), ST was 0.032, which is lower than the ST = 0.062 found in the analysis including all sampling sites. Considering the tributaries vs. locations of the main channel, ST was 0.052. Nonetheless, AMOVA was significant for all three datasets analyzed. The result of the Mantel test was non-significant (r = 0.1587, P = 0.157), demonstrating no correlation between the genetic distances of the sampling sites and their respective geographic distances.
Global AMOVA of microsatellite data resulted in partitioning more than 98% variance within sites (F ST = 0.0111, P = 0.00124). Based in this result, additional AMOVA tests were implemented assuming two main groups: Amazon basin vs. Bolivia basin and, within the Amazon basin, tributaries vs. main channel. The AMOVA results were significant for both analyses (F ST = 0.0192, The matrix of pairwise G ST (Figure 3 Table S5) were congruent and indicated population structuring. Significant values were observed for almost all comparisons involving the Bolivian sub-basin (Guapore River), and the Madeira (Porto Velho, Humaita, Borba), upper Jurua (Eirunepé), upper Purus (Boca do Acre), and upper Tapajós (Itaituba, Jacareacanga) rivers. The Mantel test was significant (r = 0.34260, P < 0.05) only when the Bolivian sub-basin was included in the analysis.

, Supplementary Table S4) and F ST values (Supplementary
SAMOVA analyses indicated the existence of two geographic groups, one group comprised of Guajará-Mirim and another group comprising all remaining localities. At K = 2 (Group 1: Guajará-Mirim; Group 2: other sampling sites in the Brazilian   Sioli (1984) and Venticinque et al. (2016).

Classification of water types based in
Frontiers in Genetics | www.frontiersin.org Amazon basin), F CT was maximized for both the mtDNA and microsatellite datasets, but with significant support only for the mtDNA data. Bayesian analyses implemented in STRUCTURE v2.3.4 identified three biological groups. The highest posterior probability was LnP (K = 3) = −31766.0000. The three populations comprised individuals from the Bolivian sub-basin and the Brazilian Amazon basin. Individuals from the three localities in the Madeira River showed a linear gradient of admixture between these two populations, and a contribution of an additional biological group principally within the Humaitá locality (Figure 4). Results based on DAPC analysis displayed a general pattern of low genetic differentiation (Figure 5), however, as observed in the previous results, some individuals from upper Madeira River and Guaporé drainage are partially differentiated from the other localities.

Gene Flow
Results of the isolation-with-migration analyses using microsatellite data are in Supplementary Table S6. Thus, Mexiana and Tabatinga (on the Amazon River) were paired, and a group denominated the main channel was formed by randomly sampling 30 individuals from among the sampling sites of the main Amazon River channel, which was then analyzed with upper-most tributary localities: Jacareacanga (Tapajós River), Guajará-Mirim (Bolivian subbasin), Boca do Acre (Purus River), and Eirunepé (Juruá River) ( Table 2). The result indicated bidirectional gene flow between all localities. In all cases, the direction of migration from upstream areas of tributaries to the central Amazon basin predominated except in the case of the Jacareacanga locality.
The results of MIGRATE analyses supported substantial levels of geneflow between sites in the main stream of the Amazon, but reduced gene flow levels between localities at tributary headwaters, and of the Madeira River ( Table 3). The genetic parameters estimated for C. macropomum in MIGRATE version 3.1.3 inferred from microsatellites data is reported in Supplementary Table S7.

Population Demography
Using the genetic parameters from IMa2 analyses ( Table 2), the coalescent effective population size did not differ substantially for almost all pair of localities examined. As a whole, effective population sizes were of thousands of individuals, with the exception of Guajará-Mirim.
The Bayesian skyline plot for C. macropomum from the Brazilian Amazon demonstrated a strong sign of population expansion, which began slowly approximately 3,000,000 years ago. Demographic growth accelerated considerably approximately 450,000 years ago, with a weak signature of a recent population decline. From the beginning of the initial growth phase, the population size of this species have increased two orders of magnitude from approximately little more than 750 thousand to 75 million individuals in the coalescent history of the populations sampled (Figure 6). Population from Bolivian sub-basin shows demographic growth beginning at 500,000 years ago, with a signature of recent population stability.
Population expansion was also supported by Harpending's raggedness index, which was significantly small (r = 0.0046, P = 1.0000) considering all samples as well as when considering the two basins separately (Amazon basin: r = 0.00060, P = 1.0000; Bolivian basin: r = 0.0018, P = 0.9990) ( Table 4). These indexes statistically support the inference of population expansion based on the observation of the distribution of mismatch distribution (Harpending, 1994) for all samples of C. macropomum. However, when mismatch distribution was investigated for the basins separately, unimodal distribution was found only within the Amazon basin, whereas multimodal distribution was found for the Bolivian basin (results not shown), which fits a pattern expected under stable population size, although this was not supported by Harpending's raggedness index. The sum of squared deviations (Schneider and Excoffier, 1999) was non-significant for the overall sample (SSD = 0.0020, P = 0.6260) as well as the inference performed for the two basins separately (Amazon basin: SSD = 0.0021, P = 0.6430; Bolivian basin: SSD = 0.0047, P = 0.8420). Thus, these values neither support nor reject the null hypothesis of a demographic population expansion for C. macropomum ( Table 4). The Tajima's D was non-significant for all sampling localities. The same was found for Fu's Fs. When considering the basins separately, Fu's Fs was significantly negative for the Bolivian basin, suggesting a population expansion.
Results based in MSVar analyses show that historically C. macropomum has undergone a pronounced population decline in both the Amazon and Bolivian basins (Figure 7). The mean estimated ancestral effective population size were at approximately 100,000 individuals, declining recently to approximately 5,000 individuals, an approximately 1.5 orders of magnitude decrease. Demographic decrease was strongly supported (BF = 1206) and occurred with 0.9992 probability. Population decline started at approximately 10,000 (Amazon) and 2,500 (Bolivia) years ago. Signs of population reduction in the Bottleneck program were significant for 18 locations under the SMM model, however Mvalue showed no signal of population reduction, with exception of individuals from Tefé (Table 1). Effective population sizes (Ne) were low for majority of the localities. However, the confidence intervals were also "infinite" for all but the Madeira River, Parintins, Eirunepe and Tabatinga localities.  Effective population size Ne = 4 Neµ/4 µx3 (Ne 1 of the locality1, Ne 2 of the locality 2 and Ne A of the ancestor of the two localities); current time in years T = (t x µ) × 3; number of migrants per generation Nm = Mθ/2. Values in parentheses = minimum and maximum of confidence limits. Three year generation time is assumed.

The Role of Rapids in C. macropomum Gene Flow
The use of more variable genetic markers, such as microsatellites, has confirmed some of our earlier findings. Colossoma macropomum is not panmictic throughout its distribution area. Considering the entire sample, AMOVA, SAMOVA, STRUCTURE, ST (DNAmt), and F ST /G ST (microsatellites) analyses, suggested genetic differentiation of the Bolivian sub-basin and Brazilian Amazon basin localities.
Within a given river system, freshwater fishes can either form a large panmictic population or be divided into genetically differentiated groups with sufficient gene flow between groups to maintain the integrity of the meta-population. Gene flow measured indirectly by the number of effective migrants per generation (Nm) for DNAmt and microsatellite data evidenced restricted gene flow between the two basins, but enough to  maintain the exchange of genes, thereby minimizing effects of genetic drift. The microsatellite data demonstrate that migration between the two basins is bidirectional. The results of both the IMa and MIGRATE analyses show that gene flow is greater from Bolivia to the Brazilian Amazon, which is in agreement with the results described by Farias et al. (2010). The Brazilian part of the Amazon basin receives more migrants, probably through the passive downstream transport of larvae and juveniles. The genetic differentiation evidenced between the two basins (Brazilian and Bolivian) is associated with the upper Madeira River rapids, which serve as a natural barrier that restricts, but does not prevent, geneflow between the populations of C. macropomum of the two basins. The origin of the Bolivian sub-basin is related to the elevation of the Fitzcarrald arch at the beginning of the middle Pliocene (4 to 3 Ma), which gradually isolated the Bolivian basin, resulting in considerable changes in the drainage pattern, in which the main rivers north of Bolivia drain into the Amazon River through the Madeira River (Hoorn et al., 1995;Campbell et al., 2001), which is the largest tributary of the southern margin of the Solimões-Amazonas basin (Lundberg et al., 1998). The Bolivian sub-basin includes the main Beni and Mamoré rivers, as well as approximately 60% of the entire drainage area of the Madeira River (upper part of Madeira) and is separated from the Amazon basin by a set of 18 rapids and cataracts located between Guajará-Mirim and Porto Velho (Cella-Ribeiro et al., 2013). The largest of these cataracts, the Teotônio cataract, constituted the greatest barrier to navigation on this river, as well as to the movement of many species of fish (Goulding et al., 2003). The rapids of the upper Madeira River play an important role in the structuring of populations of other Amazonian aquatic species, such as the river turtle Podocnemis expansa (Pearse et al., 2006), river dolphins (Gravena et al., 2014(Gravena et al., , 2015, the black Amazonian flanelmouth characin Prochilodus nigricans (Machado et al., 2017), and the catfish Brachyplatystoma rousseauxii (Batista, 2010;Carvajal-Vallejos et al., 2014). The Teotônio cataract, as well as the Jirau cataract, the second largest of the Madeira River, have been submerged by hydroelectric reservoirs.

Population Genetic Structure in the Amazonas River
The lack of genetic differentiation of C. macropomum in the main channel of Amazonas River in the Brazilian Amazon basin was supported by all population structure analyses based on mitochondrial genes and microsatellites loci. These findings confirm the pattern reported by Santos et al. (2007) and Farias et al. (2010) who used mtDNA only (supported by ST) ), and worked at much smaller geographic scales. However, ST comparisons involving the tributaries were significant even after Bonferroni corrections. Fisher's exact test and Hedrick's G ST analysis with microsatellites markers show a weak population differentiation between localities, and stronger differentiation involving comparisons with tributaries, and also black water sites. Geneflow between the main stream localities and tributary headwaters and black water sites generally is smaller than one effective migrant per generation which also confirms the migration patterns observed in IMa and MIGRATE analyses. Contrary to this pattern, STRUCTURE analysis shows population differentiation only for upper Madeira (localities below the rapids). STRUCTURE program uses a Bayesian approach to investigate the number of biological groups in the dataset. The discordance, between these analyses, could be due to the fact that G ST analyses are based on variance in allelic frequencies between/among groups and Fisher's exact test uses contingency tables to test null hypothesis that the alleles are drawn from the same distribution in all populations. These two analyses are more sensitive to detecting smaller and finer levels of genetic differentiation, while STRUCTURE tests for shared system of mating among individuals of the same group. The algorithm of STRUCTURE will not necessarily detect weak structure (Evanno et al., 2005). The population structure observed in C. macropomum falls within the category of weak to moderate level of population differentiation (according to Wright, 1965), which could be limiting the sensitivity of the analysis to find more refined substructuring. Putman and Carbone (2014) emphasize that analyzes used to infer population differentiation have limitations in detecting or not the population structure. Therefore, being conservative for purposes of management and conservation of this species, we consider that the populations of the tributaries (Juruá, Purus, Madeira, Tapajós rivers, and blackwater localities) are different management units until proven otherwise. In this case, the management of fisheries, and seasonal fishing closures during reproductive period, must be effectively respected to preserve the evolutionary potential for the species sustainability. Encouraging aquaculture of the species could also minimize the impact of harvesting natural stocks, which is in fact already occurring.
On the other hand, in the great corridor of the main channel of the Amazon River, from Mexiana (1) to Tabatinga (20) there seem to be not even a signal of isolation-by-distance between the two localities separated by approximately 2,500 Km. The observed lack of genetic structuring of C. macropomum in the main channel of the Amazon River basin is probably the result of living in a floodplain environment. The life cycle of this species is tied directly to the seasonal flood cycle of the Amazon; during the flood C. macropomum disperses to reproduce and feed in the floodplains and in the flooded forest, while in the dry period fishes become concentrated in lakes and rivers (Araújo-Lima and Goulding, 1998). During the reproduction season, the eggs and larvae are passively transported by the millions to the floodplains, as this species is highly fecund (Araújo-Lima and Goulding, 1998;Araújo-Lima and Ruffino, 2004). This dynamic together with the interlinking of river channels during from flood pulses thus is the primary factor in homogenizing differences among populations of C. macropomum (Junk, 1997) and this is observed in molecular data as well. The lack of population structuring is not a unique feature of C. macropomum; numerous other species occupying the Amazonian floodplain [e.g., Prochilodus nigricans (Machado et al., 2017), Brycon amazonicus (Oliveira et al., 2018), and Paratrygon aiereba (Frederico et al., 2012)] show this pattern as well. Thus, the pattern found in the present study likely stems from events acting at a macro time scale that affected the region, which, together with current water cycles and the migratory movements of C. macropomum, may maintain intra-population homogeneity over generations.

Genetic Diversity: Large or Small?
Considering that H O suffers from sampling effects, we used H E as a minimally biased estimate of diversity (Frankham et al., 2002). The H E for C. macropomum for the microsatellite data were moderate and uniform across the sampling localities, ranging    (2000), the authors summarized microsatelite data from North American and European freshwater species, that in comparison to the Amazon basin are geographically very restricted. At 5.5 million km 2 , the Amazon basin is by far the largest hydrographic basin on the planet, and the size of area occupied by many fish species is comparable to that for marine fishes. The Amazon is in a sense a "sea" that provides an expansive and effectively continuous environment for migratory freshwater fishes such as C. macropomum and the other cited species. Migratory fishes, whether freshwater or marine, are usually r strategists, that is, they have high dispersal capacity, produce a lot of offspring, and in general have a large effective population size, which is turn is reflected in high heterozigosity levels.
In this respect, when compared to freshwater fishes of Europe and North America, the heterozygosities of Amazonian fishes may appear to be high, but this is an illusion. The expected heterozygosities are on par with those expected for fishes occupying large areas and having large census numbers. Within the Amazonian species analyzed, the large predatory catfishes of the genus Brachyplatystoma have lower H E as would be expected by their smaller census sizes. At the opposite end of the spectrum are the relatively small, detrivorous and frigivorous migratory characids (Semaprochilodus, Prochilodus, and Brycon) which have higher H E as would be expected by their much larger census sizes. Colossoma macropomum has an intermediate H E , again a reflection of its frugivorous lifestyle combined with large body size, and thus smaller census size than the other migratory characids but larger census size than the predatory catfishes.

Population Genetic Demography
An analysis of historical demography of C. macropomum suggested population expansion in the Amazon basin, which is the same scenario suggested by Farias et al. (2010). However, the result of the current study suggest a population size reduction in the Holocene (Figure 5). This result is confirmed by the very recent population decline observed in the Skyline plot analyses (Figure 4). Similar pattern are observed for the Bolivian basin, however, very recent population decline is not evidenced in the Skyline plot.
Most studies conducted in the Amazon involving fish species such as Prochilodus nigricans (Machado et al., 2017), Brachyplatystoma rousseauxii (Batista and Alves-Gomes, 2006; Carvajal-Vallejos et al., 2014), and Brachyplatystoma platynemum (Ochoa et al., 2015) indicate recent population expansion, at least as indicated by Fu's Fs test. The difference in effective population size of C. macropomum prior estimated for the Bolivian and Amazon basin broadly corresponded to the relative proportion of potential habitat in the Bolivian basin. This basin accounts for approximately 20% of the total Amazon basin, suggesting that the Bolivian basin had a C. macropomum population approximately 20% smaller than the rest of the Amazon basin during the Holocene.
Glacial and inter-glacial periods of the Pleistocene exerted considerable impact on the climate, which consequently affected the vegetation in South America (Ledru et al., 1996) and also had impact on aquatic and terrestrial fauna. It is during the Late Pleistocene that C. macropomum in the Amazon basin began expanding (Figure 4) associated with the expansion of the várzea-like habitat (Irion and Kalliola, 2010). Therefore, the population expansion of C. macropomum in the Amazon basin likely occurred due to the increase in the availability of habitat for this species starting in the later half of the Pleistocene; however, population growth is no longer observed in the Holocene.
Corroborating the observation for the Holocene, analyses of microsatellite data in MSVar indicate that C. macropomum has undergone a pronounced population decline in both drainages during the Holocene (10,000 years ago-Amazon and 2,500 years ago Bolivia), probably due to climate change related to Last Glacial Maximum and during the mid-Holocene epoch (Wang et al., 2017). Demographic decrease was strongly supported (BF = 1206) and occurred with 0.9992 probability.
The demographic decrease was from approximately 100,000 effective individuals to approximately 5,000 individuals, an approximate 1.5 orders of magnitude decrease. Similar values for current effective population size were inferred using IMa2 ( Table 2). By any measure, the effective population size is small, and much smaller than in the last several thousand years.
DeWoody and Avise (2000) estimate that at equilibrium H E = 0.79 represents 25,000 effective individuals assuming a substitution rate of 10 −4 . Our estimate of substitution rate from MSVar was 10 −3.81 , or just about 17,000 effective individuals are expected at equilibrium. In this sense, the Ne values of C. macropomum are below an equilibrium expectation, which also suggests a current reduction of Ne. In fact, the results of the Bottleneck program indicate decrease in population size in the majority of sampling localities, which is also corroborated by the Ne values ( Table 1). The only major event that may have contributed to population decrease of C. macropomum in the nowadays, within a time window of decades, is the overexploitation of the species and the destruction of its floodplain habitat. Natural stocks of the C. macropomum suffer from overfishing and juveniles currently account for the largest part of the catch (Barthem and Goulding, 2007). Although aquaculture of C. macropomum has grown in recent years, there is strong evidence that the natural population of this species are still depressed because of over-fishing. This can be evidenced in the continuous reduction of the tonnage landed in the port of Manaus and other major Amazonian ports. Araújo-Lima (2002) report that in 1976 C. macropomum reached 16,000 tons/year landed in the port of Manaus, while data from late 1990's indicated less than 4 thousand tons. During this time, the population of Manaus more than doubled. Furthermore, another worrying factor is the mean size of the fish landed, with juveniles representing the majority of the catch (Barthem and Goulding, 2007). The average size of the fish landed in the main markets in the Amazon suggests that most individuals are fished before reaching sexual maturation, which in the case of females occurs between 50 and 55 cm (Isaac and Ruffino, 2000).
In conclusion, naturally exogamous species with large census sizes have considerable genetic diversity and large effective population sizes (Frankham et al., 2002). In this context, C. macropomum has levels of genetic diversity that are on par with expectations for species of similar lifestyle and body size. However, with the historical decline of C. macropomum populations, it is evident that part of the genetic diversity that existed in the past has been lost. Still the remaining diversity is representative of this species's historical genetic diversity and it is this genetic diversity that can secure the recovery and longterm persistence of natural populations of C. macropomum in the Amazon basin.

ETHICS STATEMENT
All field collections were authorized by IBAMA/SISBIO 11325-1, and access to genetic resources was authorized by permit No. 034/2005/IBAMA. Field collection permits are conditional that collection of organisms be undertaken in accordance with the ethical recommendations of the Conselho Federal de Biologia (CFBio; Federal Council of Biologists), Resolution 301 (December 8, 2012).

AUTHOR CONTRIBUTIONS
IF and TH conceived the experiment and obtained funding. MS, IF, and TH conducted fieldwork, collected specimens, analyzed the results, and wrote the manuscript. MS collected molecular data. All authors contributed to and reviewed the manuscript.