Edited by: Bernie Carroll, The University of Queensland, Australia
Reviewed by: Zhukuan Cheng, University of Chinese Academy of Sciences, China; Roger Paul Hellens, Queensland University of Technology, Australia
*Correspondence: David N. Kuhn
This article was submitted to Plant Genetics and Genomics, a section of the journal Frontiers in Plant Science
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
Mango (
Mango has been widely cultivated in India and Southeast Asia for thousands of years. In the fifteenth and sixteenth centuries, Portuguese and Spanish traders spread mango to other tropical and subtropical regions of the world (Mukherjee and Litz,
Mango is now grown throughout the tropical and subtropical world in 99 countries, with a total production of 34.3 million tons of fruit per annum (Galán Saúco,
Around the world there are hundreds and possibly thousands of different mango cultivars and selections, most of which are only grown and marketed locally. Relatively few cultivars are traded internationally due to the highly specific requirements for cultivars with favorable color, storage, and shipping traits.
Mango is suggested to have a partial allopolyploid genome based on cytogenetics (Mukherjee,
To date, the development of genetic and genomic resources in mango has been limited and has not greatly contributed to mango breeding around the world. An early, very limited genetic map of mango produced by Kashkush et al. (
The current improved commercial cultivars have typically been selected from open pollinated seedling progeny and then vegetatively propagated to maintain genetic uniformity (Bally et al.,
Major mango breeding/selection programs exist in India, Australia, Brazil, and Israel, and although each program has breeding goals specific to its own industry, they share many productivity and quality goals. Full-sib hybrid populations from two known parents with differing horticultural traits, such as hand pollinated populations, are more effective for breeding progress than half-sib populations from open pollinated maternal parents. Genetic maps that are based on segregating full-sib hybrid populations are a powerful tool to identify linkage between horticultural traits and molecular markers for MAS, as seen in other tree fruit crops (Ogundiwin et al.,
Linking and mapping important mango traits with molecular markers will improve the efficiency of mango breeding. One very distinct trait of mango is polyembryony, in which multiple apomictic embryos develop from the maternal nucellar tissues around the fertilized egg in addition to a single zygotic embryo (Asker and Jerling,
Although polyembryony in mango was originally thought to be controlled by recessive genes (Sturrock,
In this study, we generated a mango consensus genetic map, a valuable tool that can be used to improve the efficiency and overcome the challenges facing mango breeding programs. We used the genetic map to identify markers and regions of the genome that are associated with important horticultural traits such as embryo type, branch habit, bloom, ground skin color, blush intensity, beak shape, and pulp color.
Seven mapping populations were used to make the consensus map (Table
Population | Number of progeny | Source |
Tommy Atkins × Tommy Atkins (TA × TA) (Self-pollinated) | 60 | USDA-ARS, SHRS, USA |
Tommy Atkins × Kensington Pride (TA × KP) | 100 | DAFQ, Australia |
Haden × Tommy Atkins (H × TA) | 225 | Embrapa, Brazil |
Haden × Haden (H × H) (Self-pollinated) | 40 | USDA-ARS, SHRS, USA |
Irwin × Kensington Pride (I × KP) | 180 | DAFQ, Australia |
NMBP1243 × Kensington Pride (NMBP1243 × KP) | 100 | DAFQ, Australia |
Creeper × Kensington Pride (Cr × KP) | 70 | DAFQ, Australia |
SNP containing sequences came from three different sources: Department of Agriculture and Fisheries, Queensland (DAFQ), Australia, SHRS, USA and the Agriculture Research Organization (ARO), Israel (Table
Country | Number of SNP markers | Institution and reference |
Australia | 144 | DAFQ Hoang et al., |
Israel | 384 | ARO Sherman et al., |
US | 526 | USDA-ARS SHRS Kuhn et al., |
Total | 1,054 | – |
DNA for genotyping was isolated from the leaves of individual progeny in the mapping populations as in Kuhn et al. (
All 1,054 SNP assays were produced from SNP containing sequences by Fluidigm (South San Francisco, CA, USA) and assayed on a Fluidigm EP-1 platform.
Perl scripts (available on request) were written to reformat data from all 1,054 markers generated by the Fluidigm EP-1 platform. Data from all mapping populations for all 1,054 markers were appended into a single file. Due to the large size of the combined data file, the initial analysis was performed on a 32 core Linux cluster followed by data reformatting and analyzing with scripts that produced csv files for export to Excel. Off type individuals, i.e., not hybrid progeny of the parents of the population, were identified by multiple occurrence of genotypes that could not have been inherited from the parents and were removed from the dataset. Markers with >5% missing data were also removed from the dataset. In the resulting edited dataset, individual progeny with >5% missing data were then removed. SNP markers that were homozygous for both parents in a population were removed because they would not be informative for finding recombination events. Selection was made for markers with disomic inheritance segregation ratios. SNP markers with segregation ratios differing by more than 20% from the expected disomic genotypic frequency or allelic frequency were removed from the dataset. Such markers had either aberrant segregation ratios based on the parental genotypes or segregation ratios indicative of tetraploid inheritance.
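The marker filters described above can be sketched in a few lines. This is only an illustration, not the authors' Perl scripts: the data layout (dictionaries of genotype strings such as "AA" and "AB", with `None` for a failed call), the function names, and the reading of "more than 20%" as an absolute difference in genotype frequency are all assumptions. Off-type removal and the per-individual missing-data filter are omitted for brevity.

```python
from collections import Counter

MISSING = None  # assumption: a failed genotype call is stored as None


def missing_fraction(calls):
    """Fraction of genotype calls that are missing."""
    return sum(call is MISSING for call in calls) / len(calls)


def expected_frequencies(parent1, parent2):
    """Expected disomic progeny genotype frequencies for two diploid parents.

    Genotypes are two-letter strings such as "AA", "AB", or "BB".
    """
    crosses = Counter("".join(sorted(a + b)) for a in parent1 for b in parent2)
    total = sum(crosses.values())
    return {genotype: count / total for genotype, count in crosses.items()}


def is_informative(parent1, parent2):
    """False when both parents are homozygous, so no recombination is visible."""
    return len(set(parent1)) > 1 or len(set(parent2)) > 1


def is_distorted(calls, parent1, parent2, tolerance=0.20):
    """True if any genotype frequency is more than `tolerance` from expectation."""
    observed = Counter(call for call in calls if call is not MISSING)
    n = sum(observed.values())
    if n == 0:
        return True  # nothing was called, so the ratio cannot be checked
    expected = expected_frequencies(parent1, parent2)
    return any(
        abs(observed[g] / n - expected.get(g, 0.0)) > tolerance
        for g in set(observed) | set(expected)
    )


def filter_markers(genotypes, parents, max_missing=0.05):
    """Keep markers that pass the missing-data, informativeness, and ratio checks.

    genotypes: {marker: [call for each progeny]}
    parents:   {marker: (parent1_genotype, parent2_genotype)}
    """
    kept = {}
    for marker, calls in genotypes.items():
        parent1, parent2 = parents[marker]
        if missing_fraction(calls) > max_missing:
            continue
        if not is_informative(parent1, parent2):
            continue
        if is_distorted(calls, parent1, parent2):
            continue
        kept[marker] = calls
    return kept
```

A marker from an AB × AA cross, for example, is expected to segregate 1:1 (AA:AB), so a marker scored 19 AA to 1 AB in 20 progeny would be dropped by the ratio check.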
Two mapping programs, JoinMap4 (Kyazma B.V.®, Wageningen, Netherlands) and OneMap (Margarido et al.,
The TA × KP population analysis in OneMap produced a map with the most markers per LG (480 markers total were grouped with at least 20 per LG). These individual LGs were used to force the initial marker grouping in JoinMap4. All calculations in JoinMap4 were conducted with default parameter settings for the population, grouping, and Maximum Likelihood (ML) mapping. JoinMap4 has a function that allows ungrouped markers to be added to groups based on an association score, the Strongest Cross Link value (SCL value). Any marker with an SCL value ≥5.0 was added to its SCL group. This was repeated until no remaining ungrouped markers had SCL values ≥5.0. Loci that were marked as identical to another locus were also included in groups. Markers were removed from linkage groups if they prevented mapping in JoinMap4 or if they were >200 cM distance from the next closest marker in the group. The most informative map was from the TA × KP population. This map was then used in JoinMap4 to provide a starting point for the maps in the other populations, which were eventually merged using the map integration functions in JoinMap4 to produce the consensus map.
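The repeated SCL-based expansion of groups amounts to a loop that runs until no ungrouped marker clears the threshold. The sketch below illustrates only that control flow. It does not reproduce how JoinMap4 computes SCL values: `link_score` is a hypothetical stand-in for a pairwise linkage score, and `expand_groups` is an illustrative name.

```python
def expand_groups(groups, ungrouped, link_score, threshold=5.0):
    """Repeatedly add ungrouped markers to the group holding their strongest link.

    groups:     {group_name: set of marker names}
    ungrouped:  set of marker names not yet assigned to a group
    link_score: hypothetical callable (marker_a, marker_b) -> float
    """
    groups = {name: set(members) for name, members in groups.items()}
    ungrouped = set(ungrouped)
    changed = True
    while changed:  # stop once a full pass adds no marker
        changed = False
        for marker in sorted(ungrouped):
            best_group, best_score = None, float("-inf")
            for name, members in groups.items():
                # Strongest link between this marker and any member of the group.
                score = max(
                    (link_score(marker, m) for m in members),
                    default=float("-inf"),
                )
                if score > best_score:
                    best_group, best_score = name, score
            if best_group is not None and best_score >= threshold:
                groups[best_group].add(marker)
                ungrouped.discard(marker)
                changed = True
    return groups, ungrouped
```

Repeating the pass matters because a marker that links only weakly to the original group members may link strongly to a marker added in an earlier pass.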
The resulting TA × KP map contained 600 markers and was used to force the grouping of another population, H × TA. More markers were added to the H × TA groups based on SCL values and identity with other markers. Markers were again removed if they prevented mapping or caused the linkage map to be an unreasonable size, such as 5,000 cM. The TA × KP map was integrated with the resulting H × TA map and this integrated map was used to force grouping in the next population. This procedure was repeated for every population using the newly integrated maps as a starting point for the forced grouping. The order of grouping and population integration into the map was as follows, TA × KP, H × TA, TA Selfs, I × KP, NMBP1243 × KP, Creeper (Cr) × KP, Haden Selfs. After each population was integrated into the map once, TA × KP and H × TA were grouped and integrated for a second time to see if the larger integrated maps could bring in more associated markers and reduce the total length of the maps of each linkage group.
Phenotype data for 14 qualitative traits were available for TA × KP, Cr × KP, and I × KP populations. In all cases KP was the pollen donor as it is polyembryonic. The qualitative traits measured were: stage of fruit ripeness, fruit shape, ground skin color, blush color, blush intensity, bloom, stem end shape, cleavage, beak shape, pulp color, embryo type, flavor, branch habit, and tree vigor (Table
Trait | Score | Description |
Stage of ripeness | 0 | Hard (no give in fruit) |
1 | Rubbery (slight give in fruit under strong thumb pressure) | |
2 | Sprung (flesh deforms by 2–3 mm with moderate thumb pressure) | |
3 | Firm soft (whole fruit deforms with moderate hand pressure) | |
4 | Eating soft (whole fruit deforms with soft hand pressure) | |
Fruit shape | 1 | Long |
2 | Ovate | |
3 | Round | |
Ground skin color | 1 | Green |
2 | Green/yellow | |
3 | Yellow | |
4 | Orange | |
5 | Pink | |
Blush color | 1 | Orange |
2 | Pink | |
3 | Red | |
4 | Burgundy | |
Blush intensity | 1 | No blush |
2 | Blush barely visible | |
3 | Slight blush (similar to Kensington Pride) | |
4 | Medium blush (similar to Haden) | |
5 | Solid blush (similar to Tommy Atkins) | |
Bloom (the efflorescence of the wax covering the fruit) | 1 | Heavy |
2 | Light | |
Stem end shape | 1 | Deep |
2 | Slightly depressed | |
3 | Level | |
4 | Slightly raised | |
5 | Pointed | |
Pulp color | 1 | Orange group 24A |
2 | Yellow orange group 32A | |
3 | Yellow group 15A | |
4 | Yellow group 13B | |
5 | Yellow group 6A | |
Embryo type | 1 | Monoembryonic |
2 | Polyembryonic | |
Flavor | 1 | Unacceptable |
2 | Floridian | |
3 | Indian | |
4 | Other | |
5 | Kensington Pride | |
6 | South East Asian | |
Branch habit | 1 | Upright |
2 | Spreading | |
3 | Intermediate | |
Tree vigor | 1 | Extreme dwarf |
2 | Dwarf | |
3 | Low vigor | |
4 | Medium vigor | |
5 | High vigor | |
Beak shape (prominence of the point at the stylar scar) | 1 | Absent |
2 | Very slight | |
3 | Slight | |
4 | Medium | |
5 | Prominent | |
Cleavage (severity of the groove on the ventral shoulder of the fruit) | 1 | Deep |
2 | Shallow | |
3 | Absent |
Of the 14 traits, the twelve fruit traits were assessed on a sample of ten fruit randomly picked at maturity from each individual genotype within the three mapping populations. Fruit were ripened at 26°C and assessed at the eating ripe stage using the criteria detailed in Table
Traits were associated with the mapped SNP markers using MapQTL6 (Kyazma B.V.®, Wageningen, Netherlands), with Cross Pollinated (CP) as the population type and Interval Mapping (IM) as the association statistic. All calculation parameters were set to MapQTL6 defaults. Global thresholds were calculated as described in MapQTL6 (permutation tests of 10,000 rounds), and only traits that showed higher association probabilities than the global threshold were considered to be significant.
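The idea behind a permutation-based global threshold can be illustrated as follows. This is a simplified sketch and not the MapQTL6 implementation: it swaps interval mapping for a single-marker one-way ANOVA LOD score, and the function names, data layout, and the 5% significance level are assumptions.

```python
import math
import random


def marker_lod(genotypes, phenotypes):
    """Single-marker LOD from a one-way ANOVA: (n / 2) * log10(RSS0 / RSS1)."""
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss0 = sum((y - grand_mean) ** 2 for y in phenotypes)
    by_genotype = {}
    for g, y in zip(genotypes, phenotypes):
        by_genotype.setdefault(g, []).append(y)
    rss1 = 0.0
    for values in by_genotype.values():
        mean = sum(values) / len(values)
        rss1 += sum((y - mean) ** 2 for y in values)
    if rss0 == 0.0:
        return 0.0  # no phenotypic variation to explain
    if rss1 == 0.0:
        return math.inf  # genotype classes explain the trait perfectly
    return (n / 2) * math.log10(rss0 / rss1)


def global_threshold(marker_genotypes, phenotypes, rounds=10_000, alpha=0.05, seed=0):
    """Genome-wide LOD threshold from the permutation distribution of the maximum LOD.

    marker_genotypes: {marker: [genotype for each individual]}
    phenotypes:       [trait score for each individual, same order]
    """
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    maxima = []
    for _ in range(rounds):
        # Shuffling breaks any real genotype-phenotype association.
        rng.shuffle(shuffled)
        maxima.append(max(marker_lod(g, shuffled) for g in marker_genotypes.values()))
    maxima.sort()
    index = min(len(maxima) - 1, math.ceil((1 - alpha) * len(maxima)) - 1)
    return maxima[index]
```

Taking the maximum LOD across all markers in each round is what makes the threshold global: a trait is called significant only if its observed LOD exceeds what chance alone produces anywhere on the map.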
Markers were chosen that segregated in a disomic fashion to produce our genetic map. From the 1,054 SNP markers used to genotype the 775 individuals from the seven mapping populations, 56 were removed due to excess missing data, 25 were removed due to aberrant segregation patterns, 19 had two homozygous parents, and 66 were unmappable across all populations for a combination of these reasons such as missing data in one mapping population and aberrant segregation in another, leaving 888 potentially mappable markers (Table
Total markers | 1,054 |
Aberrant segregation types in all populations | −25 |
Homozygote × Homozygote in all populations | −19 |
Too much missing data in all populations | −56 |
Unmappable in all populations because of a combination of unmappable marker types (e.g., aberrant segregation in one population, missing data in one population, etc.) | −66 |
Final mappable markers in at least one population | 888 |
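The marker accounting can be checked with simple arithmetic. The short sketch below only restates figures already given in the text and tables; the dictionary keys are shortened labels chosen for illustration.

```python
# Progeny per mapping population, as listed in the mapping populations table.
progeny = {
    "TA x TA": 60,
    "TA x KP": 100,
    "H x TA": 225,
    "H x H": 40,
    "I x KP": 180,
    "NMBP1243 x KP": 100,
    "Cr x KP": 70,
}

# Markers removed in all populations, by reason.
total_markers = 1054
removed = {
    "aberrant segregation": 25,
    "both parents homozygous": 19,
    "excess missing data": 56,
    "combination of reasons": 66,
}

total_progeny = sum(progeny.values())
mappable = total_markers - sum(removed.values())
print(total_progeny, mappable)  # prints: 775 888
```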
H × TA | Mi_0299 | 60:26:63:76 |
TA × KP | Mi_0020 | 57:0:43:3 |
Mi_0171 | 50:1:49:3 | |
TA Self pollinated | Contig 1638_A98G | 0:66:0:0 |
Mi_0103 | 50:0:16:0 | |
I × KP | Mi_0200 | 121:3:52:3 |
NMBP1243 × KP | Mi_0425 | 72:4:23:1 |
Contig 6698_C90T | 16:52:32:0 |
To include all markers in the consensus map, we employed the strategy detailed in Section Materials and Methods, using the strengths of both JoinMap4 and OneMap. We produced a consensus map with 726 SNP markers distributed across 20 LGs shown in Figure
Table
1 | 28 | 111.2 | 4.1 | 14.6 | 0.06 |
2 | 31 | 135.6 | 4.5 | 22.8 | 0.05 |
3 | 26 | 79.4 | 3.2 | 19.8 | 0.08 |
4 | 36 | 223.2 | 6.4 | 41.6 | 0.07 |
5 | 31 | 126.3 | 4.2 | 19.4 | 0.18 |
6 | 25 | 80.4 | 3.4 | 17.4 | 0.17 |
7 | 29 | 151.1 | 5.4 | 25.0 | 0.00 |
8 | 42 | 247.8 | 6.0 | 32.9 | 0.00 |
9 | 35 | 143.1 | 4.2 | 25.7 | 0.01 |
10 | 42 | 186.5 | 4.5 | 28.8 | 0.00 |
11 | 26 | 77.2 | 3.1 | 14.4 | 0.00 |
12 | 35 | 148.8 | 4.4 | 26.1 | 0.00 |
13 | 43 | 154.9 | 3.7 | 44.8 | 0.00 |
14 | 27 | 114.9 | 4.4 | 22.6 | 0.02 |
15 | 45 | 166.2 | 3.8 | 18.0 | 0.00 |
16 | 71 | 228.0 | 3.3 | 17.9 | 0.00 |
17 | 56 | 156.7 | 2.8 | 26.7 | 0.00 |
18 | 21 | 76.5 | 3.8 | 21.6 | 0.00 |
19 | 34 | 126.7 | 3.8 | 20.5 | 0.00 |
20 | 43 | 156.1 | 3.7 | 20.1 | 0.02 |
Total | 726 | 2890.6 |
Assuming a haploid genome size of ~439 Mb and 20 chromosomes per haploid genome, the average size of a chromosome would be ~22 Mb. The total size of the map is 2,890 cM. An estimate of the average size of a cM would be ~150 Kb but would be expected to vary greatly within the genome.
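These estimates follow directly from the stated genome size and map length. The sketch below reproduces them; the variable names are illustrative, and the average marker spacing is a derived figure that assumes the 726 markers are spread over 20 LGs with one fewer interval than markers on each LG.

```python
genome_mb = 439.0          # assumed haploid genome size (Mb), from the text
haploid_chromosomes = 20   # chromosomes per haploid genome
map_cm = 2890.6            # total consensus map length (cM)
mapped_markers = 726       # markers on the consensus map

# Average physical size of one chromosome.
mb_per_chromosome = genome_mb / haploid_chromosomes

# Average physical distance represented by one centimorgan.
kb_per_cm = genome_mb * 1000 / map_cm

# Each LG with n markers has n - 1 intervals, so subtract one interval per LG.
intervals = mapped_markers - haploid_chromosomes
avg_spacing_cm = map_cm / intervals

print(f"{mb_per_chromosome:.1f} Mb per chromosome")
print(f"{kb_per_cm:.0f} kb per cM")
print(f"{avg_spacing_cm:.1f} cM between adjacent markers")
```

Because recombination rates vary along chromosomes, the kb-per-cM figure is a genome-wide average only and should not be used to convert a local map distance into a physical distance.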
Qualitative phenotypic data were available for three of the mapping populations (TA × KP, I × KP, and Cr × KP). Interval mapping in MapQTL6 found that seven of the 14 qualitative traits used in the association study had significant LOD scores in at least one of the populations. Table
Embryo type | 8 | Mi_0173 | 46.1 | 4.96 | 8.82 | |
8 | mango_rep_c6716 | 74.8 | 7.70 | |||
8 | Contig1936 | 78.3 | 7.40 | |||
8 | mango_rep_c886 | 80.2 | 7.23 | |||
8 | Mi_0102 | 85.3 | 6.65 | |||
Ground skin color | 17 | Mi_0135 | 0.0 | 5.61 | ||
17 | SSKP009C1_A627T | 0.1 | 5.61 | |||
20 | Mi_0450 | 19.2 | 4.62 | |||
20 | Mi_0145 | 30.8 | 5.83 | |||
20 | mango_rep_c4542 | 33.9 | 6.17 | |||
Blush intensity | 20 | Mi_0341 | 45.6 | 6.65 | ||
20 | SSKP003C1_C682T | 57.6 | 5.99 | |||
20 | Mi_0343 | 67.5 | 5.75 | |||
20 | Mi_0277 | 68.6 | 5.69 | |||
20 | mango_rep_c15051 | 69.6 | 5.62 | |||
20 | mango_rep_c8905 | 70.4 | 5.60 | |||
20 | Mi_0357 | 71.1 | 5.57 | |||
20 | Mi_0330 | 72.4 | 5.49 | |||
20 | Mi_0046 | 73.1 | 5.43 | |||
20 | Contig2601 | 74.0 | 5.33 | |||
Bloom | 13 | Contig1142 | 0.4 | 5.80 | ||
9 | Mi_0417 | 109.2 | 4.86 | |||
9 | Mi_0402 | 122.4 | 8.05 | |||
9 | mango_rep_c9549 | 124.5 | 7.91 | |||
9 | Mi_0142 | 128.8 | 7.14 | |||
9 | Mi_0497 | 129.6 | 7.03 | |||
Beak shape | 11 | mango_c48384 | 17.7 | 6.16 | ||
11 | mango_rep_c52196 | 17.8 | 6.16 | |||
Pulp color | 16 | Mi_0217 | 125.8 | 5.18 | ||
13 | Mi_0029 | 5.6 | 4.36 | |||
Branch habit | 8 | Mi_0192 | 29.6 | 4.90 | ||
16 | Contig3904 | 97.5 | 4.48 | |||
16 | Contig1327 | 100.4 | 4.42 |
Embryo type was the only trait to have significant LOD scores at the same marker (Mi_0173) across two different populations (Figure
Bloom, pulp color, and branch habit traits showed significant association to markers in two different populations. The marker association was on different LGs in each population (Table
MAS provides a means to improve the efficiency of tree breeding. A genetic map provides a means to improve the strength of the association between traits and markers for MAS. We chose to produce a genetic map from SNP markers for several reasons: SNP markers are more abundant than microsatellite markers, easier to identify, easier to score and, as unambiguous markers, are appropriate for international databases as they show no platform bias, which means they can be assayed by any method and produce the same genotype. For mango, ~500,000 SNP markers were identified from RNA sequencing and alignment to a consensus transcriptome (Hoang et al.,
Mango has 40 chromosomes (2n = 40), with a haploid number of 20. The markers we used for the map were inherited in a disomic fashion, leading to an expectation that we would find 20 identifiable LGs. This suggests that if mango is an allopolyploid, the two ancestral genomes are different enough to be distinguished by our markers.
We used a strategy to make the map that took advantage of the strengths of two different mapping programs, JoinMap4 and OneMap. Using OneMap we set the group size and group number parameters to artificially identify 20 LGs with at least 10 markers per LG. We then used these groups to force group formation using JoinMap4 and to identify an SCL value for markers that were not in the group identified by OneMap. Groups were expanded by setting a minimum SCL value for inclusion into the group and recursively applying this rule until all possible markers with an SCL value over a set threshold had been included in the group. At that point, the larger JoinMap group was used to force group formation in the next mapping population until all possible markers were included in the group. We started this process in OneMap with the TA × KP population as the data for this population showed the least segregation distortion, likely due to the accuracy of the parental genotypes.
When either hand-pollination or open-pollination is used to create a population of F1 hybrid individuals, the assumption is that all clones of a cultivar that are potential parents have identical genotypes. Thus, there should be no problem with using multiple trees of a cultivar as a parent, rather than a single tree. However, in both hand-pollinated and open-pollinated populations, there may be genotypic differences among the multiple trees used as parents. These slight genotypic differences may not be easily detectable when using a few diagnostic markers, but may be detected when more markers are applied or when segregation distortion in that population for some markers is observed. The TA × KP population had the least amount of this type of distortion, perhaps due to the genetic identity of all the TA and KP clones used as parents.
In contrast, the I × KP population, although almost twice as large as the TA × KP population (180:100), had off types identified when all 1,054 markers were used as well as significant distortion that may have been due to the use of several Irwin maternal parents that were not completely identical in genotype. The I × KP map had many fewer mapped markers than the TA × KP map and did not contribute new markers to the consensus map that were unique to I × KP.
To be useful for MAS, important agronomic traits must be associated with markers. A map is not necessary to identify markers associated with a trait, but confidence in this association increases as multiple markers near the trait locus on the genetic map also show significant association with the trait. This was the case for seven of 14 of our qualitative traits used for the initial trait association studies.
Mango has its origins in Southeast Asia, primarily in the area from north-western Myanmar, Bangladesh, and north-eastern India. From these origins, two centers of diversity developed: a subtropical group in the Indian sub-continent that is characterized by monoembryonic seed, and a tropical group in the south-east Asia region that is characterized by polyembryonic seed (Mukherjee and Litz,
In our case, embryony type, which is dimorphic (monoembryonic or polyembryonic), showed significant association to a single locus on LG 8 in two of the three mapping populations (Table
We saw significant associations of six other traits to specific loci on the genetic map: bloom, pulp color, branch habit, ground skin color, blush intensity, and beak shape. Bloom, pulp color, and branch habit showed association to markers in two different mapping populations (TA × KP, I × KP), but on different linkage groups in each. These traits may be regulated differently in the different accessions. For example, the bloom trait is the amount of wax efflorescence covering the fruit and it was scored as light (I and KP) and heavy (TA). A potential explanation would be that the heavy phenotype for bloom in TA requires activation of wax biosynthetic genes to increase wax production, while the light phenotype in I and KP activates other pathways that use the same long chain fatty acid precursors and reduce wax production. A similar argument can be made for the pulp color and branch habit traits, which also show association to different loci and LGs in different mapping populations.
Significant association of SNP markers with blush intensity, beak shape, and ground skin color was only observed in TA × KP. For blush intensity, the TA and I parents are scored as a 5, KP is an intermediate 3, and Cr is 1. For beak shape, TA, KP, and I are scored as 4 and Cr as 2. One might expect that the blush trait should map in both TA × KP and I × KP and beak shape should map only in Cr × KP. Our results suggest that these traits are regulated in a more complex manner. For ground skin color, the two markers strongly associated with this trait in TA × KP are found at 0 cM and 0.1 cM on LG 17. The next mapped marker is more than 26 cM distant. These two markers only mapped in TA × KP and thus this region of the linkage group cannot be seen in the other populations.
We observed segregation patterns of markers that fit more closely to tetrasomic inheritance. For example, in the NMBP1243 × KP population, Mi_0055 showed a segregation pattern of 0:25:75:0 (Homozygous Allele1: Heterozygous: Homozygous Allele2: Missing data or null allele). No parental combination of genotypes for diploid parents could produce such a segregation pattern, but as tetraploid parents, XYYY × YYYY, where X is Allele 1 and Y is Allele 2, the expected segregation would be 0:1:3:0, which fits closely with the observed ratio.
We have produced a mango consensus genetic map based on individual maps from seven F1 hybrid populations. The individual maps showed strong agreement which makes the consensus map a powerful tool for comparative mapping and the association of markers and alleles to important horticultural traits. Desirable parents can be selected from germplasm collections based on the presence of favorable alleles for the desired trait and used in either hand-pollination crosses or open-pollination of the maternal parent to increase the efficiency of selection of improved material. The trait-associated SNP markers described here can be used to select progeny containing these favorable alleles by genotyping, which is now reliable, rapid, and inexpensive. Genotyping for these traits at the seedling stage will significantly reduce the expense in field use, maintenance and evaluation of material over years. The map opens the way for MAS in mango breeding.
MAS is an excellent tool for preselection of seedlings more likely to show improved traits, but in many fruit tree crops the required genetic resources are not available. The set of markers and genetic map we developed are valuable resources for mango breeders, helping them identify accessions as potential parents and validate progeny as hybrids. The markers and map are a significant step toward improving the efficiency of both traditional breeding and selection through early identification of progeny with trait- and allele-associated genotypes.
The consensus map and qualitative trait-associated markers presented here are the first for mango and demonstrate the utility of such genomics tools for breeding and selection of improved mango cultivars. However, markers associated with important quantitative traits are also needed to further improve mango breeding efficiency. Recently, we have begun a project to produce a map of the TA × KP population by genotyping by sequencing (GBS). The GBS map should be based on more than 100,000 SNP markers and provide the appropriate resolution for the association of quantitative traits to SNP markers for the TA × KP population and, by extension, to other mango hybrid populations with sufficient amounts of accurate phenotypic data.
DK, IB, ND—mango mapping populations; DK, DI, AS, RO, YC—SNP markers; DK, AG, JR—data reformatting and mapping; DK, IB, ND, DI, AG, JR, RO, YC, AS—conception and design of the work, drafting, and revising the manuscript.
DK, AG, JR were funded by USDA-ARS CRIS #6631-21000-022-00D and the National Mango Board NACA#58-6038-5-001. AS, RO were funded by MOAG Chief scientist grant 203-859. YC was funded by MOAG Chief scientist grant 203-088. ND, IB were funded by QDAF, Australia, #HF10189 and Horticulture Innovation Australia (HIA) #MG12015.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Thanks to Elaini Oliveira dos Santos Alves (UESC, Bahia, Brazil), Carlos Antonio Fernandes Santos, and Francisco Pinheiro Lima Neto (Embrapa Semiarido, Petrolina, Pernambuco, Brazil) for sharing the H × TA mapping population. Thanks to Ashley Johnson, Paola Sanchez, and Barbie Freeman (USDA-ARS-SHRS, USA) for outstanding effort in genotyping all the mapping populations. Special thanks to Leo Ortega and the National Mango Board (USA) for their exceptional support in funding and encouraging this research. We acknowledge the assistance of Cheryldene Maddox (QDAF, Australia) with the maintenance of the mango genepool collection and phenotypic data collection, and Louise Hucks (QDAF, Australia) for laboratory technical assistance.
The Supplementary Material for this article can be found online at: