ORIGINAL RESEARCH article
Distinguishing Between Nile Tilapia Strains Using a Low-Density Single-Nucleotide Polymorphism Panel
- 1WorldFish, Penang, Malaysia
- 2Department of Aquatic Resources, Ecology and Management, College of Fisheries, Central Luzon State University, Muñoz, Philippines
- 3Freshwater Aquaculture Center, Central Luzon State University, Muñoz, Philippines
- 4Bureau of Fisheries and Aquatic Resources, National Freshwater Fisheries Technology Center, Central Luzon State University Compound, Science City of Muñoz, Philippines
- 5School of Biological Earth and Environmental Sciences, University College Cork, Cork, Ireland
Nile tilapia (Oreochromis niloticus) is among the most important finfish in aquaculture, particularly in Asia. Numerous genetically improved strains of Nile tilapia have been developed and disseminated through formal and informal channels to hatcheries, many of which operate at a relatively small scale in developing countries. The primary objective of this study was to assess the extent to which molecular genetic tools can identify different and interrelated strains of Nile tilapia in Bangladesh and the Philippines, two globally significant producers. A tool was developed using a low-density panel of single-nucleotide polymorphisms (SNPs), genotyping-by-sequencing and discriminant analysis of principal components (DAPC). When applied to 2,057 samples from 205 hatcheries in Bangladesh and the Philippines, for hatcheries where the hatchery-identified strain was one of the sampled core populations used to develop the tool, hatchery-identified and DAPC-assigned hatchery-level strains were in agreement in 74.1% of cases in Bangladesh and 80.6% of cases in the Philippines. The dominant hatchery-identified and DAPC-assigned strains were GIFT, in Bangladesh, and GET-ExCEL—a composite strain partially derived from GIFT—in the Philippines.
By weight, Nile tilapia (Oreochromis niloticus) is the most important finfish species in global aquaculture after carp (Cyprinidae) (Cai et al., 2019). Nile tilapia production has increased substantially over the past 20 years aided by the widespread adoption of technologies to produce monosex all-male tilapia and the development of genetically improved strains (Gupta and Acosta, 2004; ADB, 2005; Ponzoni et al., 2010). Genetic improvement of Nile tilapia was initiated in 1988 with the creation of the “Genetically Improved Farmed Tilapia” (GIFT) strain, initially in the Philippines by ICLARM (now WorldFish) and its partners (Gupta and Acosta, 2004). The GIFT strain has subsequently been disseminated widely in Asia and has genetically contributed to numerous “GIFT-derived” strains—including “Genetically Enhanced Tilapia—Excellent strain that has Comparable advantage over other tilapia strains for Entrepreneurial Livelihood projects” (GET-ExCEL), “Brackishwater Enhanced Selected Tilapia” (BEST), and Molobicus in the Philippines. Genetic improvement programs based on populations wholly descended from the original GIFT population are maintained by WorldFish (GIFT-WF, Malaysia), the Bureau of Fisheries and Aquatic Resources National Freshwater Fisheries Technology Center and its partners (GIFT FeedMix Fortified; GIFTFF, Philippines), and GenoMar (GenoMar Supreme Tilapia, Philippines), among others (Gupta and Acosta, 2004; Eknath and Hulata, 2009; Ponzoni et al., 2010; Ordoñez et al., 2014, 2017). In southeast Asia, other strains, such as Chitralada (Thailand) and “Freshwater Aquaculture Center Selected Tilapia” (FaST, Philippines), have been developed independently of GIFT (Pullin, 1988; Ordoñez et al., 2017).
In Bangladesh, production of Nile tilapia was 380,000 metric tons in 2017–2018, making it the fourth largest tilapia producer globally (DOF, 2018). Nile tilapia is also the second most important farmed fish species after milkfish (Chanos chanos) in the Philippines, with 267,700 metric tons produced in 2017 (Bersales and Bautista, 2018), concentrated in the Central Luzon and Calabarzon regions. Numerous genetically improved Nile tilapia strains are now available in these countries. However, the origins, genetic purity, and level of genetic improvement of strains supplied by hatcheries are not always known.
Genetically improved strains of tilapia known to have been introduced into Bangladesh include Chitralada from Thailand (1974, 1987, 2002, 2010), GIFT from the Philippines (1994, 1996), GenoMar Supreme Tilapia (GST) from the Philippines (2003), GIFT-WF from Malaysia (2005, 2012), GIFU from China (2008), and FaST from the Philippines (2011) (Ponzoni et al., 2010; Hussain et al., 2014). Other undocumented introductions of improved strains have likely occurred. In the Philippines, numerous strains have been developed locally. In addition, GIFT-WF from Malaysia was introduced for strain comparison purposes in 2012 (Battad, 2013) and for direct dissemination to farmers in 2014–2015 (Worldfish, 2015).
Lack of clarity concerning seed origins and distribution makes it difficult to understand the level of adoption and performance of fish strains in farming systems. Corresponding assessment of returns on investment may be inaccurate, and decisions on future actions by policy makers and investors are adversely affected. Numerous tools relying on a small number of genetic markers to identify genetically homogenous inbred crop lines and clonal horticultural varieties have been developed, and the extent of seed misidentification in crops is increasingly recognized in agriculture through the application of these methods (Rabbi et al., 2015; Chen et al., 2016; Floro et al., 2018; Kosmowski et al., 2019; Wineman et al., 2020). However, the development of comparable tools to distinguish between strains of outcrossing aquaculture species is challenging, and information on the extent of misidentification of tilapia strains is currently limited (Baggio et al., 2016; Oponda et al., 2017; Ordoñez et al., 2017; Moses et al., 2020).
The broad objective of the present study was to assess the extent to which molecular genetic tools can identify different and interrelated strains of tilapia and then test, to the extent possible, the actual prevalence of different strains in commercial hatcheries in Bangladesh and the Philippines. The specific aims were to (i) identify single-nucleotide polymorphisms (SNPs) for Nile tilapia, (ii) examine SNP genetic affinities among “core breeding populations” of widely disseminated genetically improved strains, (iii) identify a subset of SNPs that allows core breeding populations of Nile tilapia to be distinguished, and (iv) validate hatchery-identified strains.
Materials and Methods
Sampling Core Breeding Populations and Hatcheries
In 2015, a total of 852 fin-clip samples were obtained from 10 core breeding populations—two “GIFT,” four “GIFT-derived,” and four “non-GIFT” strains available in Bangladesh and/or the Philippines (Table 1) whose complex relationships are summarized in Figure 1. The number of individuals sampled per strain ranged from 21 to 122. Separately, a total of 2,057 fin clip samples of broodstock were obtained from tilapia hatcheries in Bangladesh and the Philippines. Sampled hatcheries provided details of the origins of their broodstock, herein referred to as the “hatchery-identified strain.” With the exception of GIFU (one hatchery in Bangladesh) and GenoMar (two hatcheries in Bangladesh and one in the Philippines), all hatchery-identified strains are represented in Figure 1. The GIFU strain was developed in China, but beyond that, its origins are unclear in the literature (Hasan et al., 2014). The GenoMar strain is wholly descended from the original GIFT population (Rodriguez, 2006; Ponzoni et al., 2010). In total, 1,053 samples were obtained from fish held by 106 private and public hatcheries in Bangladesh (Figure 2A), and 1,004 samples were obtained from 99 private hatcheries in the Philippines (Figure 2B). All fish sampled for this study were handled and biopsied using standard practices routinely employed in commercial tilapia operations. Fish were fin clipped using non-lethal, humane methods in accordance with the Guiding Principles of the Animal Care, Welfare and Ethics Policy of the WorldFish Center (Worldfish, 2004).
Table 1. Sampled genetically improved farmed tilapia (GIFT), GIFT-derived and non-GIFT core breeding populations.
Figure 1. Genetic origins of Oreochromis niloticus in sampled core populations: 1Bartie et al. (2020), 2Eknath et al. (1993), 3Ordoñez et al. (2014), 4Ponzoni et al. (2010), 5Pullin (1988), 6Rezk et al. (2002), 7Rodriguez (2006), 8Scott et al. (1989), 9Tayamen et al. (2002), 10Tayamen et al. (2004), 11Tayamen (2004), 12Bureau of Fisheries and Aquatic Resources National Integrated Fisheries Technology Development Center, 13Center de Cooperation Internationale en Recherche Agronomique pour le Development, 14Bureau of Fisheries and Aquatic Resources National Freshwater Fisheries Technology Center, 15Freshwater Aquaculture Center Central Luzon State University, 16University of the Philippines Visayas, 17International Development Research Center.
Figure 2. Location of hatcheries from which samples were obtained in (A) Bangladesh and (B) the Philippines.
Sequencing and SNP Discovery
A total of 852 samples from core breeding populations were genotyped by Diversity Arrays Technology (DArT), using the DArTseq genotyping-by-sequencing platform, according to the methods detailed in Lind et al. (2017). DNA was extracted by DArT using a Macherey–Nagel (Düren, Germany) NucleoMag 96 Tissue Kit and a NucleoMag SEP Magnetic Separator 744900 to allow automated separation of high-quality DNA on a Tecan (Männedorf, Switzerland) Freedom Evo robotic liquid handler. Samples were genotyped along with 35 duplicates and 517 samples from African populations. In total, 21,195 SNP loci were identified (Supplementary Material 1). Prior to analyses, duplicate and African samples not relevant to the current study were excluded, and quality control procedures were implemented—SNPs with a minor allele frequency of <0.01 (1,852 SNPs), and those for which more than 25% of individuals had missing genotypes (an additional 5,678 SNPs), were excluded. This is herein referred to as the “full DArTseq panel.” The removal of African samples prior to quality control inflated the number of excluded SNPs—as SNPs unique to, or disproportionately expressed in, African populations were removed. All analyses of SNP data were conducted using R (R Core Team, 2018).
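The two quality control filters described above can be sketched as follows. This is an illustrative Python sketch, not the authors' R code: the function names are hypothetical, and genotypes are assumed to be coded as alternate-allele counts (0, 1, 2) with None for a missing call, which may differ from the native DArT encoding.

```python
def minor_allele_frequency(genotypes):
    """Minor allele frequency from alternate-allele counts (0/1/2); None = missing."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    p = sum(called) / (2 * len(called))  # frequency of the alternate allele
    return min(p, 1 - p)


def missing_rate(genotypes):
    """Proportion of individuals with no genotype call at this SNP."""
    return sum(g is None for g in genotypes) / len(genotypes)


def passes_qc(genotypes, min_maf=0.01, max_missing=0.25):
    """Keep a SNP unless MAF < 0.01 or more than 25% of genotypes are missing."""
    return (minor_allele_frequency(genotypes) >= min_maf
            and missing_rate(genotypes) <= max_missing)


def filter_snps(snp_table, min_maf=0.01, max_missing=0.25):
    """snp_table maps SNP name -> list of genotypes (one entry per individual)."""
    return {name: g for name, g in snp_table.items()
            if passes_qc(g, min_maf, max_missing)}
```

Because both thresholds are evaluated on the retained individuals only, removing a set of samples (such as the African populations) before filtering can change which SNPs pass.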
To assess the ability to correctly identify the strain of individuals sampled from the core breeding populations using the full DArTseq panel, a 4-fold cross-validation approach was adopted by (i) masking the strain of 25% of animals, selected at random, from within each core breeding population, (ii) performing DAPC (Jombart et al., 2010) using the full DArTseq panel and the unmasked training set of individuals, (iii) inferring the strain of masked individuals, and (iv) determining the proportion of masked individuals that were correctly assigned to their strain. The DAPC analyses were performed using the dapc function of the adegenet package (Jombart and Ahmed, 2011). dapc default settings were adopted except that n.da and n.pca were both set to the number of principal components achieving the lowest root mean squared error outputted by the xvalDapc function—which implements a stratified cross-validation of DAPC using varying numbers of principal components, while keeping the number of discriminant functions fixed (Di Prinzio et al., 2015; Jombart and Collins, 2015). xvalDapc default settings were also adopted except that training.set was set to 0.75, result was set to “groupMean” and n.rep was set to 100. Strain predictions were then made using the DAPC results and the predict.dapc function (default settings). Masked individuals were determined to be correctly assigned if the core breeding population from which the individual was sampled had the greatest posterior membership probability. This procedure was repeated 10 times to reduce bias due to sample allocation to the training datasets.
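The masking scheme in steps (i) to (iv) can be sketched as below. This is a minimal Python illustration under stated assumptions rather than the authors' R implementation: `fit_predict` is a hypothetical callback standing in for fitting DAPC on the unmasked individuals and predicting the masked ones, and all function names are invented for this example.

```python
import random
from collections import defaultdict


def stratified_folds(strain_by_id, n_folds=4, seed=0):
    """Split individuals into n_folds so each fold masks ~1/n_folds of every strain."""
    rng = random.Random(seed)
    by_strain = defaultdict(list)
    for ind, strain in strain_by_id.items():
        by_strain[strain].append(ind)
    folds = [[] for _ in range(n_folds)]
    for strain in sorted(by_strain):
        members = sorted(by_strain[strain])
        rng.shuffle(members)  # random selection within each core population
        for i, ind in enumerate(members):
            folds[i % n_folds].append(ind)
    return folds


def cross_validate(strain_by_id, fit_predict, n_folds=4, n_repeats=10):
    """Percentage of masked individuals assigned to their strain of origin.

    fit_predict(training_labels, masked_ids) is a hypothetical stand-in for the
    classifier; it must return a dict mapping each masked id to a strain.
    """
    correct = total = 0
    for rep in range(n_repeats):  # repeat with different random allocations
        for masked in stratified_folds(strain_by_id, n_folds, seed=rep):
            masked_set = set(masked)
            training = {i: s for i, s in strain_by_id.items()
                        if i not in masked_set}
            predicted = fit_predict(training, masked)
            correct += sum(predicted[i] == strain_by_id[i] for i in masked)
            total += len(masked)
    return 100.0 * correct / total
```

Stratifying by strain keeps small core populations represented in every training set, which matters when some strains have as few as 21 sampled individuals.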
To investigate genetic affinities among the 10 core populations, unsupervised k-means clustering was undertaken using the glPca, find.clusters, and dapc functions of the adegenet package (Version 2.1.1; Jombart and Ahmed, 2011; Jombart and Collins, 2015). The glPca function was used to undertake principal component analyses (PCA), using default settings with nf set to 500—to ensure that data for all pertinent principal components were retained. The find.clusters function was then used to identify the number of groups that usefully describe the data, by plotting the Bayesian information criterion (BIC) for increasing values of k (Jombart and Collins, 2015). Default settings of find.clusters were adopted but with n.start set to 1,000—to ensure convergence of the k-means algorithm—n.pca set to 500—to include all retained principal components—and max.n.clust set to 40—to evaluate levels of k from 1 to 40. Discriminant analysis of principal components (DAPC, Jombart et al., 2010) was then performed for values of k increasing from 2 to 15 using the dapc function of adegenet. Default settings were adopted except that the number of principal components retained (n.pca) was determined by the optim.a.score function—to avoid overfitting (Di Prinzio et al., 2015)—and n.da was set to 100—so that all discriminant axes were retained.
To reduce costs, simplify implementation, and ease the computational burden of strain identification for the hatchery samples, a subset of informative SNPs was identified. This was achieved by computing the pairwise FST and δ values for each SNP across all possible pairwise combinations of the 10 core breeding populations (45 combinations). FST values were computed as (HT−HS)/HT, where HT is the expected heterozygosity across the total population, and HS is the expected heterozygosity of the individual core breeding populations (Weir and Cockerham, 1996), and δ was computed as |pAi−pAj|, where pAi and pAj are the frequencies of allele A in the ith and jth core breeding populations, respectively (Supplementary Material 2). A cutoff criterion to include the top 75 ranked SNPs for every pairwise combination provided a subset of 1,297 unique SNPs using FST and 1,214 unique SNPs using δ. When these lists of SNPs were combined, 1,387 unique SNPs that met the quality control criteria were identified, and these were defined as the “full list of informative SNPs.”
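The per-SNP statistics and the top-75 cutoff can be illustrated with the sketch below. It is a simplified Python example, not the authors' code: it assumes biallelic SNPs, takes HS as the unweighted mean of the two populations' expected heterozygosities and HT from their unweighted mean allele frequency, and uses hypothetical function names and input layout.

```python
from itertools import combinations


def expected_het(p):
    """Expected heterozygosity of a biallelic SNP with allele frequency p."""
    return 2 * p * (1 - p)


def pairwise_fst(p_i, p_j):
    """FST = (HT - HS) / HT for two populations (unweighted; an assumption)."""
    h_t = expected_het((p_i + p_j) / 2)
    if h_t == 0:  # SNP fixed for the same allele in both populations
        return 0.0
    h_s = (expected_het(p_i) + expected_het(p_j)) / 2
    return (h_t - h_s) / h_t


def delta(p_i, p_j):
    """Absolute allele frequency difference |pAi - pAj|."""
    return abs(p_i - p_j)


def informative_snps(freqs, top_n=75):
    """freqs maps SNP name -> {population: frequency of allele A}.

    Returns the union of the top_n SNPs, by FST and by delta, over every
    pairwise combination of populations.
    """
    pops = sorted(next(iter(freqs.values())))
    selected = set()
    for a, b in combinations(pops, 2):
        for stat in (pairwise_fst, delta):
            ranked = sorted(freqs,
                            key=lambda s: stat(freqs[s][a], freqs[s][b]),
                            reverse=True)
            selected.update(ranked[:top_n])
    return selected
```

With 10 populations, `combinations` yields the 45 pairs mentioned above. Because many SNPs rank highly for more than one pair, the union is far smaller than 45 × 75 per statistic.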
Hatchery samples were genotyped for the full list of informative SNPs using DArTcap, a low-cost targeted genotyping method that applies a selective step after complexity reduction to genotype specific markers from DArTseq representations (Chen et al., 2016). With DArTcap, 1,334 SNPs were expressed. Quality control on DArTcap data was undertaken—SNPs with a minor allele frequency of <0.01 (85 SNPs), and those for which more than 25% of individuals had missing genotypes (an additional 10 SNPs), were excluded. Of the remaining SNPs, 789 corresponded to those from DArTseq (707 were in the full list of informative SNPs) and were used to assign hatchery samples to “DAPC-assigned strains.” These 789 SNPs were referred to as the “reduced subset of informative SNPs.” Using core population data, the 4-fold cross-validation scheme adopted for the full DArTseq panel, detailed above, was repeated for the reduced subset of informative SNPs. DAPC strain assignment was then undertaken for hatchery samples with the predict.dapc function—after completing DAPC using core population data—by assigning individuals to the strain with the greatest posterior membership probability (referred to as “individual-fish level” assignment). Hatcheries were then assigned the strain represented by the most individuals (i.e., the “modal strain,” referred to as “hatchery-level” assignment). In addition, DAPC assignment to “groups” was undertaken by allocating core populations to ancestral groups (Table 1)—GIFT, GIFT-derived, non-GIFT (O. niloticus), and non-GIFT (O. mossambicus)—and repeating the assignment process.
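The two assignment levels can be sketched as follows. This Python example is illustrative only: it assumes the posterior membership probabilities have already been obtained (for instance from predict.dapc) and are supplied as one dictionary per fish, and the function names are hypothetical.

```python
from collections import Counter


def assign_individual(posteriors):
    """Individual-fish level: strain with the greatest posterior probability."""
    return max(posteriors, key=posteriors.get)


def assign_hatchery(fish_posteriors):
    """Hatchery level: the modal strain among the individual assignments.

    Ties are broken by first occurrence, an arbitrary choice made for this
    sketch; the source does not describe how ties were handled.
    """
    calls = [assign_individual(p) for p in fish_posteriors]
    modal_strain, _ = Counter(calls).most_common(1)[0]
    return modal_strain, calls
```

For example, a hatchery whose fish are individually assigned to GIFT-WF, GIFT-WF, and Chitralada receives a hatchery-level assignment of GIFT-WF.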
Core Breeding Populations
The scatterplots of the first two DAPC discriminant functions using the full DArTseq panel (Figure 3A) revealed three distinct clusters of non-hybrid O. niloticus strains—Abbassa and FaST formed two distinct clusters, whereas there was substantial overlap among Chitralada, ExCEL, GIFT-WF, and GIFTFF strains. Individuals from the BEST strain, despite descending from multiple tilapia species (Figure 1), also clustered with Chitralada, ExCEL, GIFT-WF, and GIFTFF. Furthermore, the three strains descended from O. mossambicus formed non-overlapping clusters in plots involving the third discriminant function. Using the reduced subset of informative SNPs, these four clusters—(i) Abbassa, (ii) FaST, (iii) GIFT-WF, GIFTFF, Chitralada, and GET-ExCEL, and (iv) Molobicus, Nile × Moss, and O. mossambicus—were evident in the plot of the first two DAPC discriminant functions.
Figure 3. Scatterplots of the first three discriminant functions from discriminant analysis of principal components (DAPC) using (A) the full DArTseq panel and (B) the reduced subset of informative SNPs. Insets show bar plots of discriminant analysis eigenvalues, and those axes used in a given plot are darkened (so top to bottom axes 1–2, 1–3, 2–3).
Plots of the Bayesian information criterion (BIC) for increasing values of k (Jombart and Collins, 2015) did not reveal an optimal number of clusters to usefully describe the data but indicated that no more than 15 groups would be appropriate (Supplementary Material 3). Using the full DArTseq panel, when two groups (k = 2) were defined using unsupervised k-means clustering (Figure 4), individuals from the non-hybrid O. niloticus strains and BEST clustered together in one group and hybrid and non-hybrid O. mossambicus strains in the other. When three groups were defined (k = 3), FaST formed its own group, and when five groups were defined (k = 5) Abbassa formed its own group. Notably, most individuals from the GIFT-WF and GIFTFF strains were assigned to a single group, even when 15 groups (k = 15) were defined, reflecting the shared origins of these strains (Figure 1 and Table 1). In contrast, Molobicus individuals were assigned to multiple groups, indicating substantial heterogeneity among individuals within the population.
Figure 4. Unsupervised k-means clustering of individuals in core breeding populations performed using discriminant analysis of principal components (DAPC) for differing numbers of groups (k) using the full DArTseq panel. Each of the groups is represented by a different color. Vertical lines represent the cluster membership probability of individuals.
The application of unsupervised k-means clustering to individuals from core breeding populations using the reduced subset of informative SNPs (Figure 5) resulted in more distinct partitioning of O. niloticus strains—reflecting intentional ascertainment bias in the selection of informative SNPs toward those under selection or affected by genetic drift (Bradbury et al., 2011; Grewe et al., 2015; Gilbey et al., 2016). However, using the reduced subset of informative SNPs, individuals from GIFTFF did not form a distinct cluster, clustering primarily with individuals from GIFT-WF or GET-ExCEL. Individuals from the Molobicus core population also clustered with individuals from multiple other core populations.
Figure 5. Unsupervised k-means clustering of individuals in core breeding populations performed using discriminant analysis of principal components (DAPC) for differing numbers of groups (k) using the reduced subset of informative SNPs. Each of the groups is represented by a different color. Vertical lines represent the cluster membership probability of individuals.
Prediction efficiency of DAPC for core breeding populations using the full DArTseq panel was high for all strains, with >93.3% correct assignment (Table 2A). GIFTFF had the lowest prediction efficiency, but the majority of incorrectly assigned individuals in this case (4.2% of 6.7%) were assigned to GIFT-WF which, like GIFTFF, is descended from the original GIFT population (Figure 1). The most notable difference between the prediction efficiency of the full DArTseq panel and the reduced subset of informative SNPs (Table 2B) was in the hybrid (Nile × Moss), for which prediction efficiency with the reduced subset was substantially lower, due to the erroneous assignment of 10.0% of individuals to the Molobicus strain. In addition, 5.6% of O. mossambicus individuals were assigned to the Abbassa strain using the reduced subset of informative SNPs. It is unclear why these strains were more affected than others by the adoption of the reduced subset of informative SNPs, but it is notable that these strains had the lowest number of samples from core populations (21 and 36, respectively).
Table 2. Prediction efficiency expressed as a percentage from discriminant analysis of principal components (DAPC) of core breeding populations using (A) the full DArTseq panel and (B) the reduced subset of informative single-nucleotide polymorphisms (SNPs).
Numerous mismatches between the hatchery-identified strain and the DAPC-inferred strain were evident. This was most apparent at the individual-fish level. In Bangladesh, where the hatchery-identified strain was one of the sampled core populations (i.e., GIFT, Chitralada, or FaST), hatchery-identified and DAPC-inferred strains were in agreement in 67.3% of cases (Table 3A). In comparison, at the hatchery level, hatchery-identified and DAPC-inferred strains were in agreement in 74.1% of cases (Table 3B; see also Figure 6A). At the hatchery level, the most common hatchery-identified strain was GIFT (47% of hatcheries), of which 50.0% were assigned to GIFT-WF and 26.0% to GIFTFF using DAPC. Of the 42% of Bangladeshi hatcheries with an unknown strain, 51.1% were assigned to the Chitralada strain and 44.5% to GIFT-WF or GIFTFF. Overall, 32.1% of Bangladeshi hatcheries were assigned to Chitralada, 26.4% to GIFT-WF, and 33.0% to GIFTFF.
Table 3. Degree of agreement between hatchery-identified strain and DAPC-assigned strain expressed as a percentage.
Figure 6. Histogram of the percentage of DAPC-assigned individuals from a hatchery in the modal DAPC-assigned (A) strain or (B) group. Gray shading indicates hatcheries where the hatchery-identified strain or group matched the DAPC-assigned (i.e., modal) strain or group. Black shading indicates mismatches. Hatcheries for which the hatchery-identified strain was unknown (45 from Bangladesh and 32 from the Philippines) are excluded.
For the Philippines, where the hatchery-identified strain was one of the sampled core populations (i.e., GIFT, BEST, GET-ExCEL, Chitralada, or FaST), hatchery-identified and DAPC-inferred strains were in agreement in 72.1% of cases at the individual-fish level (Table 3A) and 80.6% of cases at the hatchery level (Table 3B; see also Figure 6A). At the hatchery level, the dominant hatchery-identified strains were GET-ExCEL (37% of hatcheries, of which 94.6% were assigned to GET-ExCEL using DAPC) and FaST (24% of hatcheries, of which 70.8% were assigned to FaST using DAPC). Of all the sampled hatcheries in the Philippines, 14.1% were assigned to Chitralada, 54.5% to GET-ExCEL, 24.2% to FaST, and 5.1% to GIFTFF using DAPC (Table 3B).
Since many of the strains were closely related, it was reasoned that some mismatches between hatchery-identified strain and DAPC-inferred strain were potentially due to DAPC misassignment among closely related strains. Accordingly, the analysis was repeated to assign individuals and hatcheries to one of the groups of GIFT, GIFT-derived, non-GIFT O. niloticus and O. mossambicus (Table 4). However, only minor differences in the concordance between hatchery-identified and DAPC-inferred results were evident when data were analyzed as groups, rather than strains. At the individual-fish level, where the hatchery-identified strain was one of GIFT, BEST, GET-ExCEL, Chitralada, or FaST, hatchery-identified and DAPC-inferred groups were in agreement in 66.1 and 82.6% of cases, for Bangladesh and the Philippines, respectively (Table 4A). At the hatchery level, hatchery-identified and DAPC-inferred groups were in agreement in 69.0 and 87.9% of cases, respectively (Table 4B; see also Figure 6B). GIFT was the most prevalent DAPC-assigned group in Bangladesh (50.0%), and GIFT-derived strains (48.5%) were most dominant in the Philippines. The non-GIFT O. niloticus DAPC-assigned group represented similar percentages in both Bangladesh (42.5%) and the Philippines (46.5%).
Table 4. Degree of agreement between hatchery-identified strain and DAPC-assigned group expressed as a percentage, where group is defined as GIFT, GIFT-derived, or non-GIFT O. niloticus or O. mossambicus.
Varying levels of DAPC assignment consistency among sampled individuals within hatcheries were evident, with 100% of individuals assigned to the same DAPC-inferred strain or group in some hatcheries but only 40% being assigned to a common DAPC-inferred strain (Figure 6A) or group (Figure 6B) in others. For hatcheries where the hatchery-identified and DAPC-assigned strain or group did not match, it was not possible to independently verify which was correct. However, where 75% or more of DAPC-assigned individuals from a hatchery were in the modal DAPC-assigned strain (or group), concordance between hatchery-identified and DAPC-inferred strain (or group) was strong—indicating that our SNP panel and assignment method were accurate for hatcheries in which this threshold was met (Figure 6). The percentage of DAPC-assigned individuals from the modal DAPC-assigned strain (or group) is a simple measure of confidence in assignment. At the strain level, 74 hatcheries (27 in Bangladesh and 47 in the Philippines) were above this 75% “confidence threshold,” of which 69 exhibited a match between hatchery-identified strain and DAPC-inferred strain (Figure 6A). At the group level, 85 hatcheries (29 in Bangladesh and 56 in the Philippines) were above the threshold, of which 81 matched. The high proportion of hatcheries with 100% of individuals assigned to the same DAPC-inferred strain (or group) in the Philippines was likely due to the high proportion of hatcheries maintaining the FaST and GET-ExCEL strains. The FaST strain was most easily distinguished from other strains using our subset of informative SNPs and assignment method (Figures 3–5), and a high degree of concordance between hatchery-identified strain and DAPC-assigned strain was evident in the case of GET-ExCEL (Table 3).
Sampled core populations were in many cases interrelated and descended in part, or full, from common founder populations (Figure 1) and, given their known parent selection and mating strategies, are likely to have retained substantial genetic variability. These factors alone make the identification of tilapia strains using molecular markers more complex than for inbred crop lines and clonal horticultural varieties. However, at the core population level, DAPC and our SNP panels were used to assign individuals to populations with a high degree of accuracy, particularly in the case of the full DArTseq panel (Table 2). Furthermore, SNP genetic affinities among core breeding populations (Figures 3–5) broadly reflected the documented ancestry of these populations (Figure 1 and Table 1). For example, among Nile tilapia populations, GIFT/GIFT-derived and non-GIFT populations were readily distinguished, with the notable exception of Chitralada. The close SNP genetic affinity of Chitralada with GIFT and GIFT-derived strains—BEST, ExCEL, GIFT-WF, and GIFTFF—seems incongruous, given their putative ancestry (Figure 1), but has been observed in other studies involving similar populations (Moses et al., 2020).
For the majority of hatcheries, the hatchery-identified strain accorded with the DAPC-inferred strain, using the reduced subset of informative SNPs. However, for hatcheries where the hatchery-identified and DAPC-assigned strain did not match, it was not possible to independently verify which was correct. The existence of unregulated and uncertified broodstock supply chains or deliberate or inadvertent misrepresentation of broodstock origin could explain misidentification of strains by hatcheries. Furthermore, the genetic management of tilapia stocks held by hatcheries is highly variable—not all maintain records of the origin of their stocks; some maintain multiple strains but may not maintain them separately (i.e., some maintain strain admixtures); not all adopt appropriate practices to limit inbreeding, and not all routinely obtain new genetically superior stocks from core breeding populations. With respect to possible DAPC misassignment, many of the hatchery populations sampled for our study diverged from core populations numerous generations prior to sampling. Accordingly, sampled hatchery populations had undoubtedly genetically diverged, to varying extents, from their core breeding population/s of origin due to selection, genetic drift, and strain mixing, with unpredictable consequences for the accuracy of our assignment method. Indeed, in some hatcheries, there was substantial variation in the DAPC assignments among the approximately 10 individuals sampled from a putatively single strain—indicating that DAPC assignment, for individual animals sampled from hatcheries, using our method, is not sufficiently accurate for most purposes. Nevertheless, in our study, there was strong concordance between hatchery-identified strain and DAPC-inferred strain in hatcheries where 75% or more of DAPC-assigned individuals were in the modal DAPC-assigned strain (Figure 6). 
This suggests that our method could be used to accurately assign strain to hatchery populations, in Bangladesh and the Philippines, if only DAPC-assignments from hatcheries that met this 75% “confidence threshold” were accepted. Although increasing the accuracy of assignment, adopting such an approach inevitably results in a substantial proportion of hatcheries being categorized as “unassigned” and does not totally exclude the possibility of false assignments.
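The 75% “confidence threshold” rule can be sketched as below. This is a hypothetical Python illustration of the rule as described, not code from the study: the function names are invented, and the input is assumed to be the list of individual-fish strain assignments for one hatchery.

```python
from collections import Counter


def modal_share(calls):
    """Modal strain and the percentage of individuals assigned to it."""
    strain, n = Counter(calls).most_common(1)[0]
    return strain, 100.0 * n / len(calls)


def hatchery_call(calls, threshold=75.0):
    """Accept the modal strain only if at least `threshold` percent of the
    sampled fish were assigned to it; otherwise report "unassigned"."""
    strain, share = modal_share(calls)
    return strain if share >= threshold else "unassigned"
```

With roughly 10 fish sampled per hatchery, the threshold corresponds to at least 8 of 10 individuals sharing the modal assignment, so a single additional discordant fish can move a hatchery from assigned to unassigned.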
In Bangladesh, it was evident that a disproportionate number of hatcheries with an unknown hatchery-identified strain had a DAPC-inferred strain of Chitralada (Table 3). It is conceivable that hatcheries with stock of unknown origin are more likely to hold local strains descended from early introductions of Thai origin (Figure 1) (Hussain et al., 2014). Alternatively, hatcheries may have been unwilling to identify their strain as Chitralada if their broodstock were sourced through informal channels. However, it is also possible that DAPC incorrectly inferred that fish were from the Chitralada strain, given the close SNP genetic affinities between GIFT/GIFT-derived strains and Chitralada (Figures 3, 4).
In our study, DAPC assignment to groups (i.e., GIFT, GIFT-derived, and non-GIFT), rather than individual strains, only marginally improved the concordance between hatchery-identified and DAPC-inferred results (Figure 6). However, single-nucleotide polymorphisms included in our reduced subset of informative SNPs were selected to maximize the ability to distinguish between sampled core populations of key tilapia strains, not groups. This approach likely resulted in ascertainment bias toward SNPs affected by selection, or genetic drift, subsequent to the divergence of core populations (e.g., GIFT-WF and GIFTFF, Figure 1). Accordingly—if the only objective of the study had been to distinguish between GIFT, GIFT-derived, and non-GIFT groups, ignoring individual strain—an alternative approach to SNP selection should have been adopted to obtain an optimal subset of informative SNPs for this purpose.
Despite their respective limitations, the hatchery surveys and strain assignment using DAPC confirmed the ongoing importance of GIFT and GIFT-derived strains to tilapia aquaculture in Bangladesh and the Philippines (Gupta and Acosta, 2004; ADB, 2005; Ponzoni et al., 2010). In Bangladesh, the dominant hatchery-identified and DAPC-assigned strains were GIFT-WF or GIFTFF, and in the Philippines, GET-ExCEL—a composite strain partially derived from GIFT (Figure 1)—was the most prevalent. Our study also highlighted the prevalence of locally developed strains in the Philippines and absence of such strains in Bangladesh. The Philippines has a long history of tilapia genetic improvement, beginning with the development of GIFT in the 1980s (Figure 1) and, accordingly, has mature, structured, and systematic genetic improvement, dissemination, and extension programs in place. In Bangladesh, the tilapia sector has expanded rapidly in recent decades, and genetic improvement and associated systems are currently less sophisticated. This distinction was possibly reflected in the higher proportion of hatcheries with an unknown strain in Bangladesh (Tables 3B, 4B), with recent and informal introductions likely to be a factor in the inability of hatchery owners to identify the origins of their stock.
In conclusion, this study (i) identified and characterized single-nucleotide polymorphisms (SNPs) for Nile tilapia; (ii) showed that SNP-based genetic affinities among core breeding populations broadly reflect the documented ancestry of these populations, with the notable exception of Chitralada; (iii) identified a subset of SNPs and developed a tool to assign individuals to strains using DArTcap genotyping and DAPC; and (iv) found that, in the majority of the 205 sampled hatcheries in Bangladesh and the Philippines, the hatchery-identified strain accorded with the DAPC-inferred strain (or group). Furthermore, the study verified the importance of GIFT and GIFT-derived strains to tilapia aquaculture in these countries. However, for hatcheries where the hatchery-identified and DAPC-assigned strain (or group) did not match, it was not possible to independently verify which was correct, and the possibility of false DAPC assignment could not be excluded. Accordingly, our SNP panel and assignment method must be implemented in a manner that recognizes its inherent limitations, for example by excluding hatchery-level DAPC assignments that do not meet a predefined “confidence threshold,” to avoid spurious conclusions.
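One way such a “confidence threshold” could be applied is sketched below in Python. This is an assumption-laden illustration rather than the rule used in the study: it supposes that each sampled fish has a vector of DAPC posterior membership probabilities, that hatchery-level support is the mean of those probabilities across fish, and that a hatchery is left unassigned when the best-supported strain falls below a threshold. The function name `assign_hatchery`, the threshold of 0.8, and the two example hatcheries are all hypothetical.

```python
from statistics import mean


def assign_hatchery(posteriors, threshold=0.8):
    """Assign a hatchery to a strain from per-fish posterior probabilities.

    posteriors: list of dicts, one per fish, mapping strain name to the
        posterior membership probability for that fish (assumed input format).
    threshold: hypothetical minimum mean posterior required for assignment.

    Returns (strain, support); strain is None when support < threshold.
    """
    if not posteriors:
        raise ValueError("at least one fish is required")
    strains = set().union(*posteriors)
    # Mean posterior per strain across all fish sampled in the hatchery.
    support = {s: mean(p.get(s, 0.0) for p in posteriors) for s in strains}
    best = max(support, key=support.get)
    if support[best] < threshold:
        return None, support[best]  # excluded: assignment too uncertain
    return best, support[best]


# Made-up posterior probabilities for two hypothetical hatcheries.
hatchery_a = [
    {"GIFT-WF": 0.95, "GET-ExCEL": 0.05},
    {"GIFT-WF": 0.90, "GET-ExCEL": 0.10},
]
hatchery_b = [
    {"GIFT-WF": 0.55, "GET-ExCEL": 0.45},
    {"GIFT-WF": 0.40, "GET-ExCEL": 0.60},
]

for name, fish in (("A", hatchery_a), ("B", hatchery_b)):
    strain, score = assign_hatchery(fish)
    print(f"hatchery {name}: strain={strain}, mean posterior={score:.3f}")
```

In this toy example, hatchery A is assigned to GIFT-WF, whereas hatchery B is left unassigned because no strain reaches the threshold. A stricter threshold excludes more hatcheries but reduces the risk of reporting a false hatchery-level assignment.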
Data Availability Statement
Ethics Statement
Ethical review and approval was not required for the animal study because, as stated in the Materials and Methods, fish sampled for this study were handled and biopsied using standard practices routinely employed in commercial tilapia operations. Fish were fin clipped using non-lethal, humane methods in accordance with the Guiding Principles of the Animal Care, Welfare and Ethics Policy of the WorldFish Center (Worldfish, 2004). Written informed consent for participation was not obtained from the owners because the fish were sampled in 2015 with the full knowledge and consent of the owners.
Author Contributions
CL and JB oversaw the project, identified the subset of informative SNPs, and undertook initial analyses and reporting. BB, RV, and MD contributed to the hatchery sampling design and coordination of hatchery samples. MH undertook the final analyses and produced the first draft of the manuscript. All authors reviewed and contributed to the final version of the manuscript.
Funding
The project was supported by a grant from the Standing Panel on Impact Assessment (SPIA) program titled “Adoption of Genetically Improved Farmed Tilapia (GIFT) 20 years after release to Industry: A GIFT that keeps on giving?” The project was also financially supported by the CGIAR Research Program on Fish Agri-Food Systems (FISH) led by WorldFish (supported by contributors to the CGIAR Trust Fund) and the USAID-funded Aquaculture and Nutrition (AIN) project.
Conflict of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Acknowledgments
In-country partners that assisted in sample collection in the Philippines included the Bureau of Fisheries and Aquatic Resources (BFAR) regional offices in Cagayan Valley, Central Luzon, and Calabarzon; the National Freshwater Fisheries Technology Center (BFAR-NFFTC); the National Integrated Fisheries Technology Development Center (BFAR-NIFTDC); and the Freshwater Aquaculture Center, Central Luzon State University (FAC-CLSU). We thank those who collected data and samples from hatcheries: Ruben A. Reyes, Milagros M. Apaga, Faith S. Tadeo, Roniño C. Del Pilar, Elena Lanuza, Carlon Villota, Johnny Munar, Archievald Valiente, and Maritess Pablo (BFAR-NFFTC); Westly R. Rosario and Nerafe Muyalde (BFAR-NIFTDC); Roberto Miguel V. Sayco, Lourdes D. Noscal, Mary Grace R. Padilla, and Ave Joy C. Ramos (FAC-CLSU); Hugues de Verdal (CIRAD) in the Philippines; Utpal Chakraborty, Madhu Sudan Dey, Abdullah-Al-Mamun, Imran Hossain, Abu Taleb Mollah, and Md. Mamunur Rashid in Bangladesh; and Ram Bhujel in Thailand. Thanks are also extended to all the hatcheries and core population owners who participated in the study, for their willingness to provide data and samples, and to Khairul Rizal Abu-Bakar for laboratory processing of samples.
Supplementary Material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2020.594722/full#supplementary-material
References
ADB (2005). An Impact Evaluation of the Development of Genetically Improved Farmed Tilapia and Their Dissemination in Selected Countries. Mandaluyong: Operations Evaluation Department, Asian Development Bank.
Baggio, R. A., Orélis-Ribeiro, R., and Boeger, W. A. (2016). Identifying Nile tilapia strains and their hybrids farmed in Brazil using microsatellite markers. Pesqui. Agropecu. Bras. 51, 1744–1750. doi: 10.1590/s0100-204x2016001000006
Bartie, K. L., Taslima, K., Bekaert, M., Wehner, S., Syaifudin, M., Taggart, J. B., et al. (2020). Species composition in the Molobicus hybrid tilapia strain. Aquaculture 526:735433. doi: 10.1016/j.aquaculture.2020.735433
Battad, D. L. A. (2013). Nile tilapia super strain: pinoy fishers' new heroes. BAR Digest 15. Available online at: https://www.bar.gov.ph/index.php/digest-home/digest-archives/371-2013-3rd-quarter/5379-julsep2013-nile-tilapia-super-strain
Bradbury, I. R., Hubert, S., Higgins, B., Bowman, S., Paterson, I. G., Snelgrove, P. V. R., et al. (2011). Evaluating SNP ascertainment bias and its impact on population assignment in Atlantic cod, Gadus morhua. Mol. Ecol. Resour. 11, 218–225. doi: 10.1111/j.1755-0998.2010.02949.x
Cai, J., Zhou, X., Yan, X., Lucente, D., and Lagana, C. (2019). Top 10 Species Groups in Global Aquaculture 2017. Rome: Fisheries and Aquaculture Department, Food and Agriculture Organization of the United Nations.
Chen, J., Zavala, C., Ortega, N., Petroli, C., Franco, J., Burgueño, J., et al. (2016). The development of quality control genotyping approaches: a case study using elite maize lines. PLoS ONE 11:e0157236. doi: 10.1371/journal.pone.0157236
Di Prinzio, C. Y., Rossi, C. R., Ciancio, J., Garza, J. C., and Casaux, R. (2015). Disentangling the contributions of ocean ranching and net-pen aquaculture in the successful establishment of Chinook salmon in a Patagonian basin. Environ. Biol. Fishes 98, 1987–1997. doi: 10.1007/s10641-015-0418-0
Eknath, A. E., Tayamen, M. M., Palada-De Vera, M. S., Danting, J. C., Reyes, R. A., Dionisio, E. E., et al. (1993). “Genetic improvement of farmed tilapias: the growth performance of eight strains of Oreochromis niloticus tested in different farm environments. ICLARM Contribution No. 709,” in Genetics in Aquaculture, eds G. A. E. Gall and H. Chen (Amsterdam: Elsevier), 171–188.
Floro, V. O., Labarta, R. A., Becerra López-Lavalle, L. A., Martinez, J. M., and Ovalle, T. M. (2018). Household determinants of the adoption of improved cassava varieties using DNA fingerprinting to identify varieties in farmer fields: a case study in Colombia. J. Agric. Econ. 69, 518–536. doi: 10.1111/1477-9552.12247
Gilbey, J., Cauwelier, E., Coulson, M. W., Stradmeyer, L., Sampayo, J. N., Armstrong, A., et al. (2016). Accuracy of assignment of Atlantic salmon (Salmo salar L.) to rivers and regions in Scotland and northeast England based on single nucleotide polymorphism (SNP) markers. PLoS ONE 11:e0164327. doi: 10.1371/journal.pone.0164327
Grewe, P. M., Feutry, P., Hill, P. L., Gunasekera, R. M., Schaefer, K. M., Itano, D. G., et al. (2015). Evidence of discrete yellowfin tuna (Thunnus albacares) populations demands rethink of management for this globally important resource. Sci. Rep. 5:16916. doi: 10.1038/srep16916
Gupta, M., and Acosta, B. (2004). From drawing board to dining table: the success story of the GIFT project. WorldFish Center Q. 27, 4–14. Available online at: http://pubs.iclarm.net/Naga/naga27-3n4/pdf/article01.pdf
Hamilton, M., Lind, C., Barman, B., Velasco, R., Danting, J., and Benzie, J. (2020). Distinguishing Between Nile Tilapia Strains Using a Low-Density Single Nucleotide Polymorphism (SNPs) Panel: Data. Cambridge, MA: Harvard Dataverse.
Hasan, M., Sarker, B. S., Nazrul, K. S., and Tonny, U. S. (2014). Salinity tolerance level of GIFU tilapia strain (Oreochromis niloticus) at juvenile stage. Int. J. Agric. Sci. 4, 83–89. Available online at: https://www.academia.edu/10458194/Salinity_tolerance_level_of_GIFU_tilapia_strain_Oreochromis_niloticus_at_juvenile_stage
Hussain, M., Barman, B., Karim, M., and Keus, E. (2014). “Progress and the future for tilapia farming and seed production in Bangladesh,” in The 10th International Symposium on Tilapia in Aquaculture (ISTA10) (Jerusalem).
Jombart, T., and Collins, C. (2015). A Tutorial for Discriminant Analysis of Principal Components (DAPC) Using Adegenet 2.0.0. London: Imperial College London, MRC Centre for Outbreak Analysis and Modelling.
Jombart, T., Devillard, S., and Balloux, F. (2010). Discriminant analysis of principal components: a new method for the analysis of genetically structured populations. BMC Genet. 11:94. doi: 10.1186/1471-2156-11-94
Kosmowski, F., Aragaw, A., Kilian, A., Ambel, A., Ilukor, J., Yigezu, B., et al. (2019). Varietal identification in household surveys: results from three household-based methods against the benchmark of DNA fingerprinting in southern Ethiopia. Exp. Agric. 55, 371–385. doi: 10.1017/S0014479718000030
Lind, C. E., Kilian, A., and Benzie, J. A. H. (2017). Development of Diversity Arrays Technology markers as a tool for rapid genomic assessment in Nile tilapia, Oreochromis niloticus. Anim. Genet. 48, 362–364. doi: 10.1111/age.12536
Moses, M., Mtolera, M. S. P., Chauka, L. J., Lopes, F. A., De Koning, D. J., Houston, R. D., et al. (2020). Characterizing the genetic structure of introduced Nile tilapia (Oreochromis niloticus) strains in Tanzania using double digest RAD sequencing. Aquac. Int. 28, 477–492. doi: 10.1007/s10499-019-00472-5
Oponda, C. V. L., Santos, B. S., and Basiao, Z. U. (2017). Morphological differences in five strains of genetically improved Nile tilapia (Oreochromis niloticus) using geometric morphometrics. J. Int. Soc. Southeast Asian Agric. Sci. 23, 44–55.
Ordoñez, J. F. F., Santos, M. D., and Tayamen, M. M. (2014). Tilapia Genetic R&D in the Philippines: Challenges and Prospects for Future Development. Bangkok: Secretariat, Southeast Asian Fisheries Development Center.
Ordoñez, J. F. F., Ventolero, M. F. H., and Santos, M. D. (2017). Maternal mismatches in farmed tilapia strains (Oreochromis spp.) in the Philippines as revealed by mitochondrial COI gene. Mitochondrial DNA Part A 28, 526–535. doi: 10.3109/24701394.2016.1149824
Ponzoni, R. W., Khaw, H. L., and Yee, H. Y. (2010). GIFT: The Story Since Leaving ICLARM (Now Known as the WorldFish Center): Socioeconomic, Access and Benefit Sharing and Dissemination Aspects. Lysaker: Fridtjof Nansen Institute (FNI).
Pullin, R. S. V., (ed.). (1988). “Tilapia genetic resources for aquaculture,” in ICLARM Conference Proceedings (Manila: International Center for Living Aquatic Resources Management), 108. Available online at: https://digitalarchive.worldfishcenter.org/bitstream/handle/20.500.12348/3223/Pub%20CP6%2016.pdf?sequence=1
Rabbi, I. Y., Kulakow, P. A., Manu-Aduening, J. A., Dankyi, A. A., Asibuo, J. Y., Parkes, E. Y., et al. (2015). Tracking crop varieties using genotyping-by-sequencing markers: a case study using cassava (Manihot esculenta Crantz). BMC Genet. 16:115. doi: 10.1186/s12863-015-0273-1
Rezk, M. A., Kamel, E. A., Ramadan, A. A., and Dunham, R. A. (2002). Comparative growth of Egyptian tilapias in response to declining water temperature. Aquaculture 207, 239–247. doi: 10.1016/S0044-8486(01)00748-7
Rodriguez, B. (2006). “Disseminating genetically improved tilapia fingerlings through the GIFT licensing program,” in Public and Private Partnerships in Aquaculture: A Case Study on Tilapia Research and Development, eds B. O. Acosta, R. C. Sevilleja and M. V. Gupta (Penang: The WorldFish Center), 21–23.
Scott, A. G., Penman, D. J., Beardmore, J. A., and Skibinski, D. O. F. (1989). The ‘YY’ supermale in Oreochromis niloticus (L.) and its potential in aquaculture. Aquaculture 78, 237–251. doi: 10.1016/0044-8486(89)90102-6
Tayamen, M., Reyes, R., Danting, M. J., Mendoza, A., Marquez, E., Salguet, A., et al. (2002). Tilapia broodstock development for saline waters in the Philippines. ICLARM Q. 25, 32–36. Available online at: http://pubs.iclarm.net/Naga/Naga25-1/pdf/NAGA%2025no1_inga.pdf
Tayamen, M. M. (2004). “Nationwide dissemination of GET-ExCEL tilapia in the Philippines,” in 6th International Symposium on Tilapia in Aquaculture (ISTA), eds R. B. Bolivar, G. C. Mair, and K. Fitzsimmons (Manila), 74–89.
Tayamen, M. M., Abella, T. A., Reyes, R. A., Danting, M. J. C., Mendoza, A. M., Marquez, E. B., et al. (2004). “Development of tilapia for salinity waters in the Philippines,” in 6th International Symposium on Tilapia in Aquaculture (ISTA), eds R. B. Bolivar, G. C. Mair, and K. Fitzsimmons (Manila).
Wineman, A., Njagi, T., Anderson, C. L., Reynolds, T. W., Alia, D. Y., Wainaina, P., et al. (2020). A case of mistaken identity? Measuring rates of improved seed adoption in Tanzania using DNA fingerprinting. J. Agric. Econ. 71, 719–741. doi: 10.1111/1477-9552.12368
Keywords: aquaculture, Nile tilapia (Oreochromis niloticus), genetic improvement, single nucleotide polymorphism (SNP), discriminant analysis of principal components (DAPC), strain identification
Citation: Hamilton MG, Lind CE, Barman BK, Velasco RR, Danting MJC and Benzie JAH (2020) Distinguishing Between Nile Tilapia Strains Using a Low-Density Single-Nucleotide Polymorphism Panel. Front. Genet. 11:594722. doi: 10.3389/fgene.2020.594722
Received: 14 August 2020; Accepted: 23 October 2020;
Published: 01 December 2020.
Edited by: Nguyen Hong Nguyen, University of the Sunshine Coast, Australia
Reviewed by: Papius Dias Tibihika, National Environment Management Authority (NEMA), Uganda
Costas S. Tsigenopoulos, Hellenic Centre for Marine Research (HCMR), Greece
Copyright © 2020 Hamilton, Lind, Barman, Velasco, Danting and Benzie. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Matthew G. Hamilton, email@example.com
†Present address: Curtis E. Lind, Commonwealth Scientific and Industrial Research Organisation Aquaculture, CSIRO Agriculture and Food, Castray Esplanade, Hobart, TAS, Australia