Multi-Locus Genome-Wide Association Studies of Fiber-Quality Related Traits in Chinese Early-Maturity Upland Cotton

Early-maturity varieties of upland cotton are becoming increasingly important for farmers to improve their economic benefits through double cropping practices and mechanical harvesting production in China. However, fiber qualities of early-maturing varieties are relatively poor compared with those of middle- and late- maturing ones. Therefore, it is crucial for researchers to elucidate the genetic bases controlling fiber-quality related traits in early-maturity cultivars, and to improve synergistically cotton earliness and fiber quality. Here, multi-locus genome-wide association studies (ML-GWAS) were conducted in a panel consisting of 160 early-maturing cotton accessions. Each accession was genotyped by 72,792 high-quality single nucleotide polymorphisms (SNPs) using specific-locus amplified fragment sequencing (SLAF-seq) approach, and fiber quality-related traits under four environmental conditions were measured. Applying at least three ML-GWAS methods, a total of 70 significant quantitative trait nucleotides (QTNs) were identified to be associated with five objective traits, including fiber length (FL), fiber strength (FS), fiber micronaire (FM), fiber uniformity (FU) and fiber elongation (FE). Among these QTNs, D11_21619830, A05_28352019 and D03_34920546 were found to be significantly associated with FL, FS, and FM, respectively, across at least two environments. Among 96 genes located in the three target genomic regions (A05: 27.95 28.75, D03: 34.52 35.32, and D11: 21.22 22.02 Mbp), six genes (Gh_A05G2325, Gh_A05G2329, Gh_A05G2334, Gh_D11G1853, Gh_D11G1876, and Gh_D11G1879) were detected to be highly expressed in fibers relative to other eight tissues by transcriptome sequencing method in 12 cotton tissues. Together, multiple favorable QTN alleles and six candidate key genes were characterized to regulate fiber development in early-maturity cotton. This will lay a solid foundation for breeding novel cotton varieties with earliness and excellent fiber-quality in the future.


INTRODUCTION
Upland cotton (Gossypium hirsutum L.), a tetraploid plant, is the most important natural-fiber crop. It is widely cultivated in the world and supplies more than 95% of the global fiber yield due to its extensive adaptive ability and high productivity (Chen et al., 2007). Upland cotton cultivars can be divided into early-, middle-and late-maturity varieties, according to the duration of growth period. Early-maturity (short-season) cotton is an ecological type with a relatively short growing period Song et al., 2015). It is suited for wheat-cotton, barley-cotton and rape-cotton double cropping patterns in cotton growing areas of Yellow River Region (YRR) and Yangzi River Region (YZRR), and is also fit for single cropping production in the early-maturity areas of Northwest Inland Region (NIR) and the Northern Specific Early-Maturity Region (NSEMR), with the short frost-free period in China Song et al., 2015). Additionally, mechanized harvesting of cotton after good ripening is very common in NIR. The cotton varieties appropriate for mechanical harvesting should have earlier maturing characteristics, especially for early and concentrated boll-opening traits, compared with those suitable for manual harvesting (Bao et al., 2014;Feng et al., 2017). Therefore, the early-maturity upland cotton varieties are becoming more and more important in Chinese cotton production.
Currently, farmers gained increased economic benefits after using new production patterns of double cropping and mechanical harvesting; whereas application of these cultivation measures need early-maturity cotton (Du et al., 2015;Dai et al., 2017;Lu et al., 2017). Owing to their great necessity, a series of early-maturity cotton varieties, such as "Liaomian, " "Zhongmiansuo, " and "Xinluzao, " were developed and released in recent 40 years in China. However, their fiber qualities were relatively poor compared with those of middle-and late-maturity cotton varieties. Therefore, it is crucial to improve fiber quality of early-maturity cotton varieties.
To meet human higher needs for improving textile products, it is also essential for researchers to focus on fiber-quality improvement of early-maturity cotton in future. However, it is difficult to improve fiber quality of early-maturity cotton by means of traditional breeding strategy because of the significant negative correlation between earliness and excellent-quality fiber Fan et al., 2006). The rapid development of genotyping techniques based on simple sequence repeat (SSR) and single nucleotide polymorphism (SNP) markers provided an alternative method to improve the efficiency of crop breeding. Generally, marker-assisted selection (MAS) is a high-efficiency and economical approach for modern breeding, compared with the traditional phenotyping breeding (Lande and Thompson, 1990). Researchers have spent a great amount of time and effort on mapping quantitative trait loci (QTL) by using linkage analysis. Over the last two decades, a number of cotton earlinessrelated QTL have been identified via linkage mapping (Fan et al., 2006;Li et al., 2013;Jia et al., 2016). Compared to the studies evaluating cotton early maturity, far too many investigations have been conducted to identify genetic signatures for fiber quality.
A recent meta-QTL analysis suggested that approximately one thousand QTL for fiber-related traits have been detected in intraspecific upland cotton populations (Said et al., 2015), and a few near-term studies have added new QTL for cotton fiber quality (Shang et al., 2015;Tan et al., 2015;Tang et al., 2015;Fang X. et al., 2017).
A genome-wide association study (GWAS) is a wonderful supplement to QTL mapping, and it has been widely used in upland cotton in recent years (Su et al., 2016a(Su et al., , 2018Fang L. et al., 2017;Huang et al., 2017;Sun et al., 2017;Ma Z. et al., 2018). Although there are a lot of reports on GWAS for cotton earliness and fiber-quality related traits in the past ten years (Zeng et al., 2009;Zhang et al., 2013;Cai et al., 2014;Nie et al., 2016;Su et al., 2016a,b;Sun et al., 2017;Ma Z. et al., 2018), few GWAS investigations have been conducted on fiber-quality related traits in early-maturity upland cotton. In the previous studies, the majority of QTL or quantitative trait nucleotides (QTNs) for fiber quality are mainly derived from germplasms of G. barbadense and late-maturity G. hirsutum, they are not convenient for use in fiber-quality improvement of early-maturity cotton. Therefore, it is needed to identify QTNs and candidate genes associated with fiber quality in the panel consisting of early-maturity upland cotton accessions.
To date, a lot of single-locus GWAS (SL-GWAS) have been reported in upland cotton (Zeng et al., 2009;Zhang et al., 2013;Nie et al., 2016;Su et al., 2016a,b,c;Sun et al., 2017;Ma Z. et al., 2018). The SL-GWAS methods are involved in multiple testing, and Bonferroni correction is frequently adopted to control the false positive rate. However, this correction is very stringent, thus some important loci cannot be detected, especially for large error in the phenotypic measurement in field experiments . To overcome this issue, multi-locus GWAS (ML-GWAS) methodologies have been developed. They include mrMLM , FASTmrMLM (Tamba and Zhang, 2018), ISIS EM-BLASSO , FASTmrEMMA , pLARmEB , and pKWmEB (Ren et al., 2018). Additionally, to decrease the false positive rate, a combination of several ML-GWAS methods have been applied in previous studies (Wu et al., 2016;Misra et al., 2017;Ma L. et al., 2018).
In this study, ML-GWAS for fiber-quality related traits were conducted in a panel composed of 160 early-maturing cotton accessions. The main objective of our study was to discover the favorable QTN allelic variations and some potential candidate genes controlling fiber quality in the early-maturity upland cotton. This investigation will lay a foundation for breeding new cotton varieties with earliness and excellent fiber quality in the future.

Plant Materials
A natural population consisting of 160 Chinese early-maturity upland cotton accessions were generated (Table S1). These accessions were sampled from the germplasm gene bank of the Cotton Research Institute of the Chinese Academy of Agricultural Sciences (CRI-CAAS). The germplasms fell into three groups based on cotton-planting regions in China. Specifically, 81, 58, and 21 accessions were from the YRR, NIR and NSEMR, respectively. All the accessions have relatively short whole growing period (ranging from 100 to 120 days). The field trials at SHZ were performed with drip irrigation under plastic film conditions, whereas the plots at AY were furrow irrigated as needed. The experimental field management measures were full accordance with local agronomic practices.

Phenotyping and Data Analysis
After mature, a total of 20 naturally opened bolls, as a cotton fiber sample, were handly picked from central part of the plants from each accession in each replicate every year. Fiber samples weighing 10∼15 g lint cotton were then measured for fiber property determination using an HVI-MF 100 instrument (User Technologies, Inc., USTER, Switzerland) at the Cotton Fiber Quality Inspection and Testing Center of the Ministry of Agriculture, Anyang, China. The following fiber-quality related traits were evaluated: 50% fiber span length (FL, mm), fiber strength (FS, cN.tex −1 ), fiber micronaire (FM), fiber uniformity (FU, %) and fiber elongation (FE, %). The analysis of variance (ANOVA) for phenotypic data was conducted using the SPSS22.0 software.

SNP Genotyping
Genomic DNA was isolated from young leaf tissue of all accessions using a modified cetyltrimethylammonium bromide (CTAB) method as described by Paterson et al. (1993). Reducedrepresentation DNA sequences of 160 early-maturity cotton accessions have been obtained by specific-locus amplified fragment sequencing (SLAF-seq) approach with coverage of approximate 5.50×. To mine the SNPs with higher quality, the raw reads were mapped to the G. hirsutum L. TM-1 genome v 1.1  using BWA software (Li and Durbin, 2009). The GATK (McKenna et al., 2010), and SAMTools ) packages were used for SNP calling. The filtered SNPs, with a missing rate <10% and a minor allele frequency (MAF) ≥ 0.05, were reserved and used for the subsequent analysis.

Clustering Analysis, Population Structure and Linkage Disequilibrium (LD) Analysis
A neighbor-joining phylogenetic tree among 160 individuals was constructed using the filtered SNPs by the Tassel 5.2 software (Bradbury et al., 2007). The population structure was analyzed using a principal component analysis (PCA) approach with the Tassel 5.2 program (Bradbury et al., 2007). LDs between SNPs were estimated as the squared correlation coefficient (R 2 ) of alleles using the Tassel 5.2 tool (Bradbury et al., 2007). The R 2 -values were calculated within a 0-to 10-cM window.

Genome-Wide Association Study and Allelic Variation Analysis
Six ML-GWAS methods, including the mrMLM, FASTmrMLM, FASTmrEMMA, pLARmEB, pKWmEB, and ISIS EM-BLASSO, were used in this study. The mrMLM is a multi-locus model including markers selected from the rMLM method with a less stringent selection criterion . The FASTmrMLM reduces the running time in mrMLM by more than 50%, and also shows slightly high statistical power in QTN detection, high accuracy in QTN effect estimation and low false positive rate as compared to mrMLM (Tamba and Zhang, 2018). FASTmrEMMA is a fast multi-locus random-SNP-effect EMMA model, which is more powerful in QTN detection and model fit . The pLARmEB integrates least angle regression with empirical Bayes to perform ML-GWAS under polygenic background control . The pKWmEB retains the high power of Kruskal-Wallis test, and provides QTN effect estimates and effectively controls false positive rate (Ren et al., 2018). The ISIS EM-BLASSO has the highest empirical power in QTN detection and the highest accuracy in QTN effect estimation, and it is the fastest, as compared with EMMA and mrMLM . All parameters were set at default values, and the critical thresholds of significant association for all the above six methods were set at LOD = 3.00 Tamba et al., 2017;Wen et al., 2017;Zhang et al., 2017;Ren et al., 2018;Tamba and Zhang, 2018).
The phenotypic-effect value of each allelic variation was calculated by the phenotypic data over the accessions with each type, and box plots of the relative phenotypic data were produced using the R software.

Prediction of Potential Candidate Genes
Putative candidate genes were identified by physical positions of significant trait-associated SNP loci in the G. hirsutum L. reference genomes v1.1 . According to LD decay distance, the interval for the prediction of candidate genes for the significant SNP loci was determined. The genes distributed in these regions were collected. Transcriptome sequencing data from 12 upland cotton tissues (including fiber in 5, 10, 20, 25 DPA (days post anthesis), root, stem, leaf, torus, calycle, cotyledon, petal and pistil) were available on the cotton website (https://cottonfgd.org/). Heat maps of the putative candidate gene expression patterns were drawn using the R package "pheatmap." The biological functions of putative candidate genes were annotated by gene ontology (GO) items on the cotton website (https://cottonfgd. org/).

SNP Genotyping
To gain insight into the genetic bases of fiber-quality related traits, 160 early-maturity upland cotton accessions were performed using SLAF-seq, and a complete set of markers containing 72,792 high-quality SNPs was explored by filtering according to the stringent quality control. These detected markers consisted of 47,594 and 25,198 SNPs in the At and Dt chromosomes respectively, and were unevenly distributed on all the 26 chromosomes of upland cotton. Moreover, the SNP loci with maximal number were identified on chromosome Chr., chromosome, according to the upland cotton reference genome .
A10 (5013), while those with the minimal number were detected on chromosome D04 (1479). The average marker density was about one SNP per 28.10 kb genomic regions. The greatest marker density was found on chromosome A10 with one SNP per 20.12 kb, while the smallest marker density was seen in chromosome D05, with one SNP per 39.93 kb ( Table 1).

Phenotypic Variation
To examine whether significant phenotypic variances exist in the fibers among the 160 upland cotton accessions, the five fiberquality related traits including FL, FS, FU, FM and FE were examined. The results showed that the parameters of fibers from different accessions were quite diverse (  Figure 1). These results indicate that early-maturity cotton varieties had broad variation in fiber-quality related traits under different planting conditions. We observed that the phenotypic values of FL and FS at Anyang (AY) were significantly lower than those at Shihezi (SHZ). By contrast, there were no significant differences in FM and FU between the two locations. The FE values in AY-2014 were also strikingly lower than those in other environments (Table 2, Figure 1). Furthermore, the statistically significant differences (P < 0.001) were observed among genotypes, environments, and the genotype × environment interactions on all the five target traits (Table S2). These data suggest that the  five fiber-quality related traits were significantly influenced by the environmental conditions.

Population Structure and Linkage Disequilibrium Analysis
To understand the phylogenetic relationship of the 160 upland cotton genotypes, a neighbor-joining phylogenetic tree was conducted based on their genetic distances, which derived from the SNP differences in these accessions. The population could be divided into three different groups, designated pop I (YRR, with 54 accessions), pop II (NIR, with 44 accessions) and pop III (YRR, NIR and NSEMR, with 62 accessions), respectively (Figure 2A). Furthermore, we found that there are an intimate genetic relationship between NSEMR accessions and the early varieties from YRR and NIR, which were mainly assigned to pop III, while the recent accessions from YRR and NIR belonged to pop I and pop II, respectively. These findings imply that earlymaturity accessions in YRR and NIR might derive from NSEMR varieties in upland cotton. Next, the population structure of the panel was analyzed using a PCA on the basis of the identified SNPs. Three conceivable subpopulations were separated by PC1 and PC2 (Figure 2B). Similarly, YRR, NIR and mixed group (YRR, NIR and NSEMR) were respectively distinguished via PCA. Based on the results from both the phylogenetic tree and PCA, the panel was separated into three groups (Figures 2A,B).
To examine the LD decay distance in the panel, its decay rate was estimated using the SNPs. The result showed that the genome-wide LD decay rate of the natural population was approximately 400 kb, where the R 2 drops to half of the maximum value ( Figure 2C). Due to the average marker density with one SNP per 28.10 kb (Table 1), we concluded that these markers were sufficiently dense for detecting the associated QTNs.

Identification of Favorable Allelic Variations
To identify favorable alleles of QTNs for target traits, we focused on the above 3 steady QTNs, which exhibited the maximum LOD, -lg(P) value and phenotypic variation. The striking QTN D11_21619830 presented three types of allele (AA, AG and GG), and the accessions with the favorable allele AA (n = 112) showed significantly higher FL than those with the GG (n = 26) and AA (n = 20) alleles ( Figure 3B). Moreover, we found that QTN A05_28352019 had three types of allelic variation AA, AG and GG, respectively, where the average FS of the favorable allele GG (28.89 cN.tex −1 ) was higher than those of the AA (26.26 cN.tex −1 ) and AG (26.98 cN.tex −1 ) ( Figure 3B). Additionally, the peak QTN (D03_34920546) had three allelic variations (AA, AG and GG), and the accessions with the GG variation showed higher FM than those with the alternate AA variation. Considering the most excellent level of FM (3.70∼4.20) for spinning, allele AA of the peak QTN could be regarded as the favorable allele with the mean FM value of 4.28, whereas the corresponding type GG was the unfavorable allele with the mean FM value of 4.89 ( Figure 3B). These findings indicated that the fibers of the accessions with favorable allelic variations   were clearly improved compared to those of the accessions with unfavorable allelic variations.

Prediction of Candidate Genes
The genomic regions (±400 kb around the associated QTNs) of QTN-linked candidate genes were adopted according to the genome-wide LD decay distances (about 400 kb) in this study. Thus, three target regions of the candidate genes were determined as A05: 27.95-28.75, D03: 34.52-35.32, and D11: 21.22-22.02 Mbp, and a total of 29, 32 and 35 genes were presented respectively in the above regions, according to upland cotton reference genome v1.1 ; Table S3). Furthermore, we observed that the expression of 72 genes of them was clearly increased in 12 cotton tissues using RNA-Seq (Figure 4). Among these genes, Gh_A05G2325, Gh_A05G2329, Gh_A05G2334, Gh_D11G1853, Gh_D11G1876, and Gh_D11G1879, were highly expressed in the fiber. Notably, Gh_A05G2334 was dominantly expressed in all the four fiber samples; Gh_D11G1853 was mainly expressed in fibers of 20 and 25 DPA; and Gh_D11G1876 and Gh_A05G2325 was preferentially expressed in fiber of 25 DPA; whereas Gh_A05G2329 and Gh_D11G1879 had the maximum expression level in the fibers of 5 and 10 DPA, respectively. Also, the transcriptional abundances of Gh_D03G1012 and Gh_A05G2335 were slightly higher in fibers than in the other tissues (Figure 4). These results suggest that the six genes (Gh_A05G2325, Gh_A05G2329, Gh_A05G2334, Gh_D11G1853, Gh_D11G1876, and Gh_D11G1879) might play important roles in controlling fiber quality of early-maturity upland cotton. To further understand thoroughly the above six putative candidate genes for target traits, their biological functions were annotated by gene ontology (GO) items. Three genes (Gh_A05G2334, Gh_D11G1876, and Gh_D11G1879) were annotated as transcription factors, such as sequence-specific DNA binding, DNA-binding transcription factor activity and regulation of transcription (Table 4). Gh_A05G2334 encoded the agamous-like MADS-box protein AGL11 which likely plays roles in many aspects of plant growth and development (Rounsley et al., 1995). These results indicate that the putative candidate genes may regulate fiber development by DNA-binding transcription factors in early-maturity upland cotton.

The Origin and Domestication of Chinese Early-Maturity Upland Cotton
To exploit the limited natural resources and increase economic income of cotton producers, it is especially necessary to make use of the double cropping systems and mechanical harvesting in the major cotton growing regions in China. Thus, earlymaturity cotton cultivars are needed. Indeed, early-maturity cotton varieties are attracting much attention from many cotton growers and breeders. Fiber characters are complicated and comprehensive traits regulated by a lot of QTL and influenced easily by many external factors (Ulloa and Meredith, 2000). Its related traits for example FL, FS, and FM are more important for the spinning industry. Previous investigations had shown that FL and FS have significant negative correlations with earliness in cotton. Thus, the early-maturity cotton varieties have much lower fiber quality than late-maturity ones. Sun et al. (2017) reported the association panel including early-, middle-and latematurity cotton varieties have a big phenotypic variation of the FL (22.07∼35.56 mm) and FS (22.69∼36.80 cN.tex −1 ). In this study, FL of the panel ranged from 24.07 to 33.69 mm, with a mean of 28.09 mm; while the FS had a great variation ranging from 22.70 to 40.65 cN.tex −1 . These findings indicate that FL of our association population of the early-maturity cotton has small distribution ranges compared with the previous results.
Although China is one of the largest nations producing and consuming cotton in the world, it is not an upland cotton domestication country (Zhang et al., 2013). The early cotton varieties were primarily developed by using introduced varieties (Zhang et al., 2013). King cultivar from America is the ancestor of the Chinese early-maturity upland cotton. Most of Chinese early-maturity cotton varieties of the early stage, such as "Jinmian1, " "Heishanmian1, " "Liaomian1, " "Zhongmiansuo10, " and "Xinluzao10, " were all derived from "Guannong1, " which had a breeding pedigree from the King cultivar. In this study, the association panel contained the above-mentioned core germplasms, and consisted of more than 80% of the Chinese early-maturity cotton varieties. Thus, it can represent the wide genetic diversity of Chinese early-maturity upland cotton. In the early stage, the Chinese early-maturity cotton varieties were developed by utilizing the core germplasms from NSEMR ("Jinmian1, " "Heishanmian1" and "Liaomian1). On the basis of the clustering of phylogenetic tree and PCA of the study, along with breeding history, the early-maturity cotton could be divided into three groups, designated pop I (the recent accessions from YRR), pop II (the recent accessions from NIR) and pop III (the NSEMR varieties and the early germplasms from YRR and NIR), respectively. These findings suggest that early-maturity accessions in YRR and NIR might derive from the NSEMR early varieties in Chinese upland cotton.

Comparison of Our GWAS Results With QTL or QTNs Detected in Previous Studies
In the recent 30 years, many QTL have been mapped, and some fiber-quality QTL hotspots have been discovered by a comparative meta-analysis (Said et al., 2015). It has been shown that chromosome D11 (c24) has the most prominent cluster  carrying FL, FE and FS QTL hotspots between CIR026 and NAU2407b. A hotspot cluster A07 (c7) carrying FL and FS QTL between E1M7_80 and CG05a has also been found (Said et al., 2015). Another cluster carrying FE, FL and FM QTL hotspots on D01 (c14) between CIR246 and G1012 has been identified; and the region between E5M4_480 and pAR544 harbors a hotspot cluster carrying FS QTL on chromosome D03 (c16) (Said et al., 2015). Additionally, some stable QTL for FS on A07 (Chr.07) have been identified by QTL mapping (Tan et al., 2015;Fang X. et al., 2017). Similarly, a few associated SNP loci with fiber quality have been detected via GWAS in upland cotton (Table S4). Among the identified FL-associated SNPs, most of markers were located on chromosome A10 and D11, such as A10_65694094, A10_65696540, D11_24030081 and D11_24030087. Recent reports have shown a number of cluster_A07 SNPs for FS are distributed in genome region A07: 71.99-72.25 Mbp (Sun et al., 2017;Ma Z. et al., 2018). In addition, we also found the major genomic region (D11:24.03-24.10 Mbp) consisting of nine SNP loci associated with FL, which was previously detected (Su et al., 2016a).
In the current study, we characterized the significant QTNs (D11_21619830, A05_28352019 and D03_34920546) for fiber-quality related traits. These QTNs were detected using several new ML-GWAS methods in at least two environments. Compared with the mapped QTL of the previous studies, the QTN D11_21619830 was located in the region of QTL hotspot clusters for fiber quality. Compared with the associated loci of previous GWAS, these associated QTNs were excluded in the genomic regions of the previous reports. Therefore, these identified SNPs may be novel QTNs controlling fiber quality in our association population of early-maturity cotton.

Superiority of the New Multi-Locus GWAS
Most of previous studies have focused on genetic bases of some complicated traits using general linear model (GLM) and mixed linear model (MLM) based on a single-locus GWAS (SL-GWAS) Zhang et al., 2010). However, both of these models have certain shortcomings. A big false positive incidence is the uppermost disadvantage of GLM because polygenic kinship is not considered (Korte and Farlow, 2013). In MLM, the stringent P threshold (P = 0.05/n, n is the number of SNPs) leads to missing many significant QTNs, particularly small-effect QTNs . To make up for deficiencies of GLM and MLM, some multi-locus GWAS (ML-GWAS) methodologies have been developed, such as mrMLM , FASTmrMLM (Tamba and Zhang, 2018), FASTmrEMMA , ISIS EM-BLASSO , pLARmEB , and pKWmEB (Ren et al., 2018). Compared with the conventional SL-GWAS MLM methods, these ML-GWAS methods are more powerful and have the advantages of accuracy. Thus, we adopted the ML-GWAS methods in this study.
In addition, the significant threshold of these new ML-GWAS methods is set to a LOD score = 3, which is equal to -lg(P) = 3.70 . Although the standards are less stringent in the ML-GWAS methods than in the SL-MLM ones, their false positive rates are effectively reduced Tamba et al., 2017;Wen et al., 2017;Zhang et al., 2017;Ren et al., 2018;Tamba and Zhang, 2018). Thus, the ML-GWAS approaches are considered more effective, practical and alternative. In this study, 70 QTNs significantly associated with five fiber-quality related traits were simultaneously identified in three or more ML-GWAS methods ( Table 3). Further investigation showed that three stably expressed QTNs were commonly detected in multiple environments (Table 3). However, no significantly associated QTN was found when using the Tassel 5.0 in MLM [PCs + K, -lg(P) = -lg(0.05/72792) = 6.16]. These data suggest that the ML-GWAS methods are more powerful and robust when applying to detect the small-effect QTNs for fiber-quality related traits of upland cotton.

CONCLUSION
In this study, a total of 70 significant QTNs were simultaneously detected to be associated with five objective traits by three or more methods. Among these QTNs, D11_21619830, A05_28352019 and D03_34920546, significantly associated with FL, FS, and FM, respectively, were simultaneously presented across at least two environments. Furthermore, favorable allelic variations of the three QTNs and 96 genes contained in the three target genomic range were mined. Among these, six genes highly expressed in the fibers might be candidate genes identified by RNA-Seq method. In summary, many favorable QTN alleles and six candidate genes were identified to modulate fiber development in early-maturity upland cotton. This will lay a solid basis for breeding earliness and excellent fiber-quality cotton varieties in the future.

AVAILABILITY OF SUPPORTING DATA
The sequence read data from the SLAF-seq analysis for the 160 sequenced upland cotton lines are available in the Sequence Read Archive (http://www.ncbi.nlm.nih.gov/ bioproject/PRJNA314284/) (SRP071133 under the accession number PRJNA314284).

AUTHOR CONTRIBUTIONS
JS and CW designed and supervised the research. JS and ML performed multi-locus GWAS. JS and QM conducted the field trials to evaluate the five fiber-quality related traits. JS, CW, and FH wrote and revised the manuscript. All of the authors read and approved the manuscript.