Quantitative Trait Locus Analysis and Identification of Candidate Genes for Micronaire in an Interspecific Backcross Inbred Line Population of Gossypium hirsutum × Gossypium barbadense

Cotton is the most important fiber crop and provides indispensable natural fibers for the textile industry. Micronaire (MIC) is determined by fiber fineness and maturity and is an important component of fiber quality. Gossypium barbadense L. possesses long, strong and fine fibers, while upland cotton (Gossypium hirsutum L.) is high yielding with high MIC and widely cultivated worldwide. To identify quantitative trait loci (QTLs) and candidate genes for MIC in G. barbadense, a population of 250 backcross inbred lines (BILs), developed from an interspecific cross of upland cotton CRI36 × Egyptian cotton (G. barbadense) Hai7124, was evaluated in 9 replicated field tests. Based on a high-density genetic map with 7709 genotyping-by-sequencing (GBS)-based single-nucleotide polymorphism (SNP) markers, 25 MIC QTLs were identified, including 12 previously described QTLs and 13 new QTLs. Importantly, two stable MIC QTLs (qMIC-D03-2 on D03 and qMIC-D08-1 on D08) were identified. Of a total of 338 genes identified within the two QTL regions, eight candidate genes with differential expression between TM-1 and Hai7124 were identified. Our research provides valuable information for improving MIC in cotton breeding.


INTRODUCTION
Cotton (Gossypium spp.) is an important cash crop species worldwide, providing an essential natural resource for the textile industry. Due to its high yield and wide adaptation, upland cotton (Gossypium hirsutum L.) accounts for more than 95% of global cotton production (Lacape et al., 2003;Li X. M. et al., 2016). However, extra-long staple, Pima, Egyptian, or Sea Island cotton (Gossypium barbadense L.) have excellent fiber quality with long, strong and fine fibers, but their low yield and requirements for warm and dry weather conditions limit their cultivation area Said J. I. et al., 2015). In recent years, the goal of cotton breeding in China has shifted to improving fiber quality (including fiber length, strength, and micronaire (MIC)], in addition to high yield (Fang et al., 2017). To date, there have been an increasing number of studies on improving cotton fiber quality traits through interspecific hybridization, especially G. hirsutum × G. barbadense Wang et al., 2020).
Cotton fibers are the longest and fastest growing cells of cotton plants. Each cotton fiber consists of a single cell that grows on the surface of the ovule. The fiber development process includes four main stages: fiber initiation, elongation, secondary wall thickening and maturation (Pang et al., 2010). Cotton fiber quality is a quantitative trait affected by multiple genes (genotype), environmental factors and genotype × environment interactions during fiber development. MIC is mainly determined by the formation characteristics of fiber secondary walls . Its value is determined by measuring the airflow resistance of a certain weight of cotton fiber plug (i.e., µg per inch of single fiber). Textile processing companies and scientific research organizations have adopted MIC as a key parameter of fiber maturity and fineness (Bradow and Davidonis, 2000). MIC is a comprehensive index of fiber fineness and maturity for fiber quality and plays an important role in the fiber spinning process. Because immature fibers have thin cell walls and therefore low MIC (below 3.4), they tend to become weaker and easily break during the spinning process, making low grade yarns. However, mature fibers have thick cell walls and produce a thick yarn. So mature cotton fibers are preferred in spinning (Kim et al., 2013). For mature fibers, MIC reflects the fineness of the fibers in that the higher the MIC, the coarser the mature fibers. Mature fibers with MIC readings between 3.70 and 4.20 are considered premium. However, micronaire readings of 3.4-and -under or 5.0-andhigher will receive discount in pricing. Therefore, it is of great theoretical and applied value to analyze and identify candidate genes regulating MIC at the quantitative trait locus (QTL) level for fiber quality molecular breeding and elucidate the genetic mechanism underlying cotton fiber development.
Quantitative trait locus mapping uses molecular marker technology as a tool based on genetic linkage maps and uses the linkage between linked molecular markers and QTLs to determine the position of candidate genes that control quantitative traits throughout the genome. At present, two commonly used methods include composite interval mapping (CIM) and inclusive composite interval mapping (ICIM) (Meng et al., 2015). Software used for these two methods include WinQTLCart 2.5 for CIM and QTL IciMapping 4.2 for ICIM. Most researchers have used these two software programs separately to carry out QTL mapping research on important cotton traits. For example, CIM was used by Li C. et al. (2016), Zhang et al. (2016) and Liu et al. (2018), and ICIM was used by Liu et al. (2017); Ma et al. (2017) and Liu et al. (2019). However, these two mapping methods can be simultaneously used for locating QTLs to perform a more accurate and comprehensive genetic analysis of traits.
Using one of the two methods, studies have also reported QTLs for MIC. Ali et al. (2018) identified 22 MIC-related QTLs in a RIL population of 180 lines in upland cotton, among which 13 QTLs were detected in two or more environments. Wang B. H. et al. (2017) detected 27 MIC-related QTLs using BC 3 F 2 , BC 3 F 2:3 , and BC 3 F 2:4 populations of an interspecific G. hirsutum × Gossypium mustelinum cross, among which 11 QTLs were located near the same marker in different populations or near linked markers in the same population. In addition, Fan et al. (2018) identified four MIC-related QTLs using a population of 143 RILs of an intra-G. barbadense cross. With the rapid development of genome sequencing technology, genome-wide association study (GWAS) has been successfully applied in the genetic analysis of fiber quality traits, including fiber MIC. Using genome resequencing, Wang M. J. et al. (2017) identified 3 significant single-nucleotide polymorphisms (SNPs) for fiber MIC in a group of 362 diverse upland cotton accessions, and Ma et al. (2018) identified 533 significant SNPs for fiber MIC in a panel of 419 upland cotton accessions. In addition, Huang et al. (2017) used the cotton Illumina 63K SNP array to genotype a collection of 503 upland cotton lines and identified 3 stable QTLs associated with MIC. Through a meta-analysis of numerous QTL reports,  compiled a total of 395 QTLs related to MIC in a QTL database for cotton. 1 Xu et al. (2020) recently performed a metaanalysis and identified a total of 15 meta-QTLs for MIC. These studies provide references for locations of QTLs for MIC.
Although G. barbadense has much longer, stronger and finer fibers than G. hirsutum, whether there exist major QTLs for MIC when crossing with G. hirsutum is currently not well understood. In this study, we used a population of 250 backcross inbred lines (BILs) from a G. barbadense × G. hirsutum cross  and identify QTLs for MIC based on a high-quality genetic map using two QTL mapping methods. The identified QTLs were then subjected to an integrated analysis to identify BILs with low MIC (i.e., fine fibers) and candidate genes for MIC. The results will lay the foundation for subsequent fine mapping of MICrelated genes and molecular marker-assisted selection (MAS) to improve MIC in upland cotton.

Plant Materials
An interspecific BIL population comprising 250 BC 1 F 7 lines was developed from a cross between Egyptian cotton (G. barbadense) Hai7124 and Chinese G. hirsutum CRI36. The parents and 250 BC 1 F 7 lines were planted in accordance with a randomized complete block design in nine environments at five locations, including the South Farm (nc) and the East Farm (dc) at the Institute of Cotton Research, Chinese Academy of Agricultural Sciences (CRI-CAAS), Anyang, Henan, China (Aync, 2015, and Aydc, 2017Weixian, Hebei (Hbwx, 2016); Sanya, Hainan (Hnsy, 2016);andAlar, south Xinjiang (Xjal, 2016, 2017);and Shihezi, Northern Xinjiang (Xjsh, 2017). Crop management practices followed local recommendations for cotton production. The use of two major cotton production regions (the Yellow River Valley and Northwestern Inland Valley) allowed the detection of consistent QTLs for MIC between the two regions. The specific length amplified fragment sequencing (SLAF-seq) strategy was followed for genotyping the BILs using a G. hirsutum reference genome with updates Hu et al., 2019). The details of a genetic linkage map consisting of 7709 markers were described previously by Ma J. et al. (2019).

Phenotypic Measurements and Analysis
Twenty normally mature (opened) bolls from the first and second nodes of middle fruiting branches were sampled in September each year. All seedcotton samples were ginned by a roller gin in the South Farm at CRI-CAAS, Anyang, Henan. Fiber samples were then tested by an HVI 1000 instrument at the Cotton Quality Inspection and Supervision Center of the Ministry of Agriculture, CRI-CAAS, Anyang, Henan. The R software package lme4 was used to estimate the best linear unbiased predictions (BLUPs) and broad-sense heritability (H 2 ) for MIC across the nine environments (Bates et al., 2014). The R software was also used for other statistical analyses including analysis of variance (ANOVA) and principal component analysis (PCA) of MIC for the BIL population across different environments.

Quantitative Trait Locus Analysis
Micronaire values in each of the nine testing environments and their BLUPs across the tests were used for QTL analysis using the ICIM of ADDitive QTL (ICIM-ADD) method in QTL IciMapping 4.2 (Meng et al., 2015) and the CIM method in WinQTLCart 2.5 . The parameters were set to a mapping step of 1 cM, a p value of 0.05 for type I error, and a PIN of 0.01, and 1000 permutations were taken to calculate the logarithm of odds (LODs) threshold. QTLs at the same location in two or more environments with a LOD threshold of >2.5 were considered significant QTLs (Shang et al., 2015). The QTL confidence interval (95%) was set as a mapping distance interval corresponding to a decrease of 1 LOD on either side of the peak (Yu et al., 2012a;Liu et al., 2019). MIC QTLs detected in three or more environments were considered stable QTLs when their confidence intervals overlapped (Yu et al., 2012b;Ma J. et al., 2019). A set of consensus QTLs for MIC was inferred by integrating the information of QTLs detected via the two methods. QTLs were named according to the method of Sun et al. (2013), with a prefix of W, I, and C for a QTL identified by CIM, ICIM and both methods, respectively. MapChart 2.2 was used to visualize the genetic map and QTL bars.

Common Quantitative Trait Loci for Micronaire in the Backcross Inbred Line Population and Previous Reported Studies
To identify new QTLs in this study, QTLs from our results were compared with previously reported QTLs. Previous MIC QTLs were retrieved from CottonGen (Yu et al., 2014) and Cotton QTLdb Release 2.3 (January 24, 2018, see text footnote 1)  and from recent reports by Majeed et al. (2019) and Xu et al. (2020). In addition, MIC QTL data from previous GWAS reports were also obtained. A co-localizing marker or a neighboring marker for a MIC QTL was identified, and then the marker location on the TM-1 genome was determined Hu et al., 2019). The physical intervals of all QTLs were queried via BLAST against the TM-1 genome, and QTLs were co-localized together with the previously identified MICrelated QTLs.

Gene Ontology Enrichment and Candidate Gene Identification
After the physical intervals of stable QTLs were queried via BLAST against the TM-1 genome (Hu et al., 2019), potential candidate genes were determined on the basis of the physical interval for a QTL. The homologous genes of candidate genes from Arabidopsis and the annotations of gene functions were determined from the TM-1 genome. The general pattern of expression of the candidate genes and their SNPs including insertion/deletion (InDel) of TM-1 and Hai7124 were also obtained from Hu et al. (2019) and then analyzed by SnpEff to predict variant impact (Cingolani, 2012). Gene Ontology (GO) enrichment of candidate genes was performed using the micStudio tools. 2 Homologous genes of candidate genes from Arabidopsis were used to determine enriched ontology clusters by Metascape (Zhou et al., 2019). Candidate genes were further used to predict the micro-RNA (miRNA) target genes by psRNATarget 3 (Dai et al., 2018), and the miRNA expression data of fibers at 14 days post-anthesis (DPA) were obtained from the Cotton Omics Database. 4

Micronaire Variation of Parents and the Backcross Inbred Line Population
Across the nine environments, MIC of the BILs ranged from 2.10 to 6.17, with an average of 4.14, and the mean MIC of the Upland CIR36 and Egyptian Hai7124 parents were 3.59 and 4.06, respectively ( Table 1). The MIC of CIR36 was significantly (P < 0.01) greater than that of Hai7124. The values of skewness and kurtosis in each environment showed that MIC followed a normal distribution in the BIL population (Supplementary Figure 1 and Table 1). Furthermore, there was a transgressive  Table 1).
Principal component analysis of the MIC value in this set of BILs showed that the nine environments could be classified into two regions: Region 1 (Northwestern Inland Valley) and Region 2 (Yellow River Valley) (Figure 1), mostly consistent with the official ecological classification of the cotton production regions in China, except for two tests-Anyang, Henan, 2015 (15Aync) and Sanya, Hainan, 2016 (16Hnsy) which were grouped with Region 1. Therefore, the testing environments of the two regions were separately estimated using BLUPs as BLUP-region 1 and BLUPregion 2.
Gene Ontology Enrichment Analysis of qMIC-D03-2 and qMIC-D08-1 Within the chromosomal regions of the two MIC QTLs (qMIC-D03-2 on D03 and qMIC-D08-1 of D08), there were 338 predicted genes, and 218 of them had GO annotations (Supplementary Table 4). Based on the GO analysis on the 218 genes, 161 genes were associated with the biological process category, 34 genes were associated with the cellular component category, and 23 genes were associated with the molecular function category. In these three categories, the oxidation-reduction process, integral component of membrane and ATP binding were the most enriched subcategories ( Figure 3A). Remarkably, negative regulation of catalytic activity was the most significantly enriched process according to the GO functional enrichment analysis ( Figure 3B). For 306 of the 338 putative genes with homologous in Arabidopsis, gene silencing, glutathione metabolism, plant epidermis development and root morphogenesis were found to be the main ontology clusters (Figure 3C). These four significant clusters were selected and converted into three network layouts ( Figure 3D). It was found that root morphogenesis and plant epidermis development cluster identities were linked. Tissue development, root system development and root development were the main terms and were more proportional to the 306 genes. Negative regulation of macromolecule metabolism and negative regulation of both gene expression and of metabolic process terms were the main terms associated with the gene silencing clusters.

Prediction of Candidate Genes Within qMIC-D03-2 and qMIC-D08-1
Because both the G. hirsutum TM-1 and G. barbadense Hai7124 genomes were sequenced and CRI35 is a typical upland cotton cultivar, The expression levels of the 338 genes from TM-1 and Hai7124 in the two QTL regions were determined based on existing RNA sequencing (RNA-seq) data (the National Genomics Data Center: https://bigd.big.ac.cn/ bioproject/; accession number: PRJNA490626) (Hu et al., 2019). The fold change in candidate gene expression was set to 2 as the threshold for significant differential expression between TM-1 and Hai7124 in corresponding tissues including embryos (0, 1, 3, and 5 DPA) and fibers (10, 20, and 25 DPA). As a result, eight candidate genes (three genes for qMIC-D03-2 and five genes for qMIC-D08-1) were found to be differentially expressed between TM-1 and Hai7124 (Figure 4). The following is a detailed in silico analysis of those eight genes.
For the three candidate genes on chromosome D03, GH_D03G1280 and GH_D03G1274 encode a kinase superfamily protein and the NADPH/respiratory burst oxidase protein D, and the expression levels of both genes in TM-1 were higher than that in Hai7124 at 20 DPA. GH_D03G1280 had SNP variants including frameshift variants and synonymous variants between TM-1 and Hai7124. Calcium-dependent NADPH oxidase generates superoxide molecules, a reactive oxygen species (ROS). The third gene, GH_D03G1286 encodes a transducin/WD40 repeat-like superfamily protein and its expression in both TM-1 and Hai7124 during the early stages of fiber development was low until 20 or 25 DPA. This gene had SNP variants between TM-1 and Hai7124 including loss/gain of a stop codon and splice region variants and GAA frameshift variants which might be involved in the change in fiber secondary cell wall synthesis.
GH_D08G2052 encodes a TCP family transcription factor, and its expression was significantly higher in Hai7124 than in TM-1 during fiber elongation from 5 to 20 DPA. Three SNP variants (frameshift variant, loss of a stop codon, and splice region types) were found between TM-1 and Hai7124, which might be involved in the change in fiber elongation. GH_D08G2091 encodes a glutathione S-transferase THETA 1 enzyme and is a homolog of AT5G41210 in Arabidopsis, and its expression was significantly higher in TM-1 than in Hai7124 during fiber development from 10 to 25 DPA, when fast elongation and secondary cell wall synthesis occurred. GH_D08G2127 encodes a receptor-like kinase in flowers and is a homolog of AT2G48010 in Arabidopsis, and the expression level in TM-1 was higher than that in Hai7124 at 20 and 25 DPA. Similar to GH_D03G1274, the protein encoded by the gene is involved in protein phosphorylation.
GH_D08G2099 and GH_D08G2286 encode beta-6 tubulin protein and xyloglucan endo-transglycosylase-related 8, respectively. The expression of these two genes was very low in the early stage of fiber development in both TM-1 and Hai7124 until 20 or 25 DPA. The expression levels of GH_D08G2099 (17 times higher) and GH_D08G2286 in Hai7124 fibers were higher than those in TM-1 fibers at 20 and 25 DPA, respectively. The upstream region of GH_D08G2286 lacked a 3126 bp fragment at −5393 bp in Hai7124 but not in TM-1. A detailed description of sequence variation for all the eight genes is listed in Supplementary Table 5.
Based on predictions of miRNA target genes by psRNATarget, GH_D08G2286 was the target gene of ghr-miR156a, ghr-miR156b, and ghr-miR156d, and the average expression level of these three miRNAs was 289.11 FPKM. GH_D03G1286 was the target gene of ghr-miR164, and the expression level of ghr-miR164 was 2114 FPKM.

Identification of Co-segregating Markers for Micronaire
Stable MIC QTLs are important loci shaping MIC, and the closed linked markers are valuable for MAS. For the two stable MIC QTLs, qMIC-D03-2 and qMIC-D08-1, Markers 150834 and 175863 were the nearest SNPs, respectively (Supplementary Figure 2A). For marker 150834 for qMIC-D03-2, the BILs with the SNP allele genotype (AA) from CRI36 averaged a significantly greater MIC value than did those with the SNP allele genotype (TT) from Hai7124 (4.30 vs. 3.89, P < 0.05). However, for marker 175863 for qMIC-D08-1, the BILs with the SNP allele genotype (AA) from CRI36 averaged a significantly lower MIC value than did those with the SNP allele genotype (GG) from Hai7124 (3.95 vs. 4.25, P < 0.05). The QTL allele for qMIC-D03-2 had a greater additive effect (−0.21) than that from qMIC-D08-1 (−0.15), consistent with the early QTL analysis ( Table 2). MIC for the desirable QTL genotype for D03 without the desirable genotype for D08 (i.e., Q 3 Q 3 q 8 q 8 ) was 4.06, vs. 4.16 for the desirable QTL genotype for D08 without the desirable genotype for D03 (i.e., q 3 q 3 Q 8 Q 8 ). When the desirable alleles from the two QTLs were combined into the same genotype (Q 3 Q 3 Q 8 Q 8 , i.e., TT for qMIC-D03-2 with AA for qMIC-D08-1), MIC was further reduced to 3.73 (significantly lower than that from q 3 q 3 Q 8 Q 8 but not from Q 3 Q 3 q 8 q 8 ), as compared to 4.44 for the genotype without any desirable allele (i.e., q 3 q 3 q 8 q 8 ). The effects from the two QTLs were additive and there appeared no interaction between them (Supplementary Figure 2D). Furthermore, the two homozygous genotypes for each of the two SNP markers (150834 and 175863) had similar fiber length and strength (Supplementary Figures 2B,C), indicating that these two QTLs did not affect fiber length and strength. Therefore, these two SNPs could be used to design portable markers for MAS to improve MIC without affecting fiber length and strength.

DISCUSSION
Micronaire is measured as the air permeability of a compressed lint sample of known mass and is essentially the fiber weight per unit length (µg inch −1 ) for a single fiber. Therefore, lint yield improvement through breeding has been accompanied by the increase of MIC . Therefore, it is not surprising that the MIC of new cultivars has been increased, because of the positive correlation between lint yield and MIC. Sun et al. (2019) showed that a global collection of 719 upland cotton germplasm accessions only had very low percentage of lines with the premium MIC (i.e., 3.70-4.20). As lint with MIC higher than 5.0 will suffer price discounts, breeding for low fiber MIC is becoming increasingly important.

Populations From Parents With Low Fiber Micronaire Could Be Used for Quantitative Trait Locus Analysis
In the nine testing environments, Hai7124 and CRI36 had MIC ranges of 3.12-4.22 and 3.27-4.50, respectively. Both parents had low MIC and were considered degree A MIC in China according to the national standards. The results indicated that both upland cotton CRI36 and Egyptian cotton Hai7124 possessed genomic regions that could decrease MIC. However, the upland parent still had significantly higher MIC. Therefore, it was still valid to carry out the current QTL analysis in the BIL population developed from the two parents. The results further showed that the two parents of different species possessed different genetic loci involved in MIC formation in that both parents had QTL alleles decreasing MIC (8 QTLs in CRI36 vs. 17 QTLs in Hai7124), consistent with many previous QTL studies in cotton (Lacape et al., 2010;Zhang et al., 2014Zhang et al., , 2020Zhu et al., 2020). As such, transgressive segregation in MIC was observed in that the BILs developed from the two parents had MIC ranging from 2.10 to 6.17, with an average BLUP of 4.18. Moreover, a very large proportion of the BIL population (35.6% of the BILs) had a degree A MIC. These lines with QTL introgression for low MIC and the desirable QTL alleles and their linked markers should be useful in MAS for breeding cotton with a premium fiber quality.

Utilization of Best Linear Unbiased Predictions With Different Principal Component Analysis Clusters and the Complementation of the Two Quantitative Trait Loci Mapping Methods
In this study, results showed that different ecological environments had a great influence on MIC. The MIC of the BILs grown in the Northwestern Inland Valley averaged 0.5 lower than these grown in the Yellow River Valley. Hence, the BLUPs of the PCA cluster, which represented the two different ecological environments, could be used to identify QTLs associated with specific ecological environments. These QTLs may be useful in MAS of low MIC for a particular ecological environment. Interestingly, QTL CqMIC-At11-1 was identified in blup region one test and in six individual tests. This indicated that it was necessary to analyze QTLs with BLUPs associated with different ecological environments and that the results were reliable. Therefore, when there is a genotype × environment interaction for a trait of interest in a multi-location experiment, testing environments can be grouped for a BLUP analysis for each group instead of using overall means in QTL mapping, in addition to a separate analysis for each environment. In this study, grouping based on PCA did not completely reflect geographical regions of tests because one test in Anyang, Henan, 2015, of Yellow River Valley andanother test in Sanya, Hainan, 2016 were grouped with the Northwestern Inland Valley (i.e., Xinjiang). In addition to soil type, soil fertility and moisture (Hearn, 1994), and crop management practices, it is known that MIC is greatly influenced by weather conditions including daily temperature (especially night temperature) and relative humidity (Gipson and Joham, 1968;Wanjura and Barker, 1985;Liakatas et al., 1998;Reddy et al., 1999;Bange et al., 2010). We speculate that the dry periods with low temperatures during the boll development stage in the two tests were likely the major cause for decreased MIC, as frequently observed in Xinjiang.
In this study, 13 and 4 specific QTLs were identified by CIM and ICIM, respectively. However, eight common QTLs were identified via both QTL mapping software programs. Both methods can identify common chromosomes with QTLs, and most of the QTLs (67.7%) of the MIC QTLs detected by ICIM were also detected by CIM, while the remaining unique QTLs detected by ICIM differed in mapping positions from these detected by CIM on the same chromosomes. CIM can detect more QTLs on the same chromosomes and may be more QTLs on additional chromosomes. Therefore, CIM is more powerful in detecting QTL, as proposed by Zeng (1993Zeng ( , 1994 when the CIM method was developed. The results demonstrate that both mapping methods are useful and are complementary to one another to detect additional QTL loci. Common QTLs detected by the two methods provide some levels of confidence in mapping results. Therefore, we suggest that the two QTL mapping methods be simultaneously used. Of course, common QTLs especially these with major effects should be focused in further studies.
Another important aspect is if some of the MIC QTLs detected in this study were also overlapped with QTLs for lint yield and fiber length and strength, leading to MIC's correlation with lint yield, fiber length and strength. Overlapped QTL regions for these traits are likely due to linked genes or pleiotropic effects of genes for the traits, which would explain the correlations of MIC with the three traits. A subsequent QTL analysis will be performed to address these questions.

Gene Ontology Enrichment and Candidate Gene Identification
In this study, two methods were used to perform GO analysis of putative genes with the two common QTL regions (qMIC-D03-2 and qMIC-D08-1). The results showed that the oxidationreduction process, integral component of membranes and ATP binding were the most populated subcategories. Root morphogenesis, plant epidermis development, gene silencing and response to hypoxia were the main clusters according to Metascape. The results of the two methods coincided and showed that the following hypothesis governing MIC by the two QTLs: During fiber elongation, fiber cells are hypoxic, giving rise to a response to hypoxia that negatively regulates enzymatic catalytic activities to induce fiber morphogenesis. This was followed by secondary cell wall synthesis and changes in membrane components, eventually leading to a change in MIC. This hypothesis was supported by the finding that immature fiber mutants had reduced ROS levels and reduced energy production in developing fibers compared with mature fibers (Kim et al., 2013).
In this study, eight candidate genes were identified for the two QTL regions. Xyloglucan might negatively affect fiber elongation according to comparisons of xyloglucan contents between G. barbadense and G. hirsutum . GH_D08G2286 encodes xyloglucan endo-transglycosylaserelated 8 (GhXTR8) and has a function similar to that xyloglucan endo-transglycosylase/hydrolase (XTH) proteins, which, when overexpressed in cotton plants, result in 15-20% longer fiber compared with that of wild-type cotton (Lee et al., 2010). GH_D03G1298 encodes a glucuronoxylan 4-Omethyltransferase-like protein (DUF579) that is involved in xyloglucan metabolism and that is located within the qMIC-D03-2 region. The expression of the genes encoding both of these proteins in Hai7124 was higher than that in TM-1 at 25 DPA. DUF579 was also determined to be involved in xylan biosynthesis according to phylogenetic analysis . IRX15 and IRX15-L are homologous genes of DUF579 in Arabidopsis; and characterization of a double knockout line revealed irregular secondary cell wall margins of fiber cells and a lower degree of xylan polymerization compared with that of the wild-type line (Jensen et al., 2011).
GH_D03G1280 (a protein kinase superfamily gene) was also reported to participate in fiber elongation (Li C. et al., 2016). The protein coded by GH_D03G1286 belongs to a WD40 protein superfamily and mainly regulates the formation of trichomes via the R3 MYB-bHLH-WD40 transcriptional complex in Arabidopsis (Gan et al., 2011), but a divergent WD40 protein (GhWDR) interacts with GhMML4_D12 in a process similar to but different from that of the MBW transcriptional complex involved in trichome development (Tian et al., 2020). The protein coded by GH_D08G2052 is a TCP family transcription factor, and GhTCP4 plays an important role in balancing cotton fiber elongation and cell wall synthesis together with miR319 (Cao et al., 2020). GH_D08G2099 encodes a beta-6 tubulin protein that is involved in fiber development. Nineteen beta-tubulin cDNAs were detected in developing cotton ovules and were found to be highly expressed in elongating fiber cells (He et al., 2008). Beta-tubulin was also identified by QTL analysis and was found to control fiber quality (Guo et al., 2021). GH_D03G1274 encodes NADPH/respiratory burst oxidase protein D (RBOHC), and AtRBOHC influences the development of root hairs via the activation of Ca 2+ and K + osmotic pathways in plant root cells (Bai et al., 2014). RBOHC may mediate the progression of ABA-regulated primary root growth by producing ROS in the roots (Ma X. et al., 2019). GH_D08G2091, which encodes a glutathione transferase, also regulates the production of ROS. The products of both GH_D03G1274 and GH_D08G2091 participate in ROS metabolic pathways, and ROS can act as developmental signaling molecules in the process of secondary cell wall differentiation in cotton fibers (Li et al., 2007;Guo et al., 2016). GH_D03G1262 encodes an ARF-GAP domain 1 protein (AGD1), which regulates root hair polarity by coordinating cytoskeleton and membrane trafficking (Yoo and Blancaflor, 2013). To determine which of these 8 genes contribute to MIC within the two QTL regions, further studies are needed.
It is recognized that only one specific candidate gene in each of the two MIC QTL regions will be the one determining a proportion of the genetic differences in MIC between the two parents. Functions of other genes within the two QTL regions are most unlikely associated with MIC and should not be overstated. Although other molecular aspects including quantitative RT-PCR between parents and BILs with contrasting MIC and virusinduced gene silencing can be performed for those 8 genes, further high resolution mapping using a larger interspecific genetic population is required. In addition, the desirable effect (reducing MIC) for qMIC-D03-2 was from the allele contributed from the Egyptian Hai7124 cotton. Therefore, the two QTLs may be specific to interspecific hybrid populations between the two species. A panel of upland cotton germplasm lines would not be useful in validating the QTL effect. Near-isogenic lines will be developed for the two QTL regions for a more in-depth analysis in the future.
In summary, an interspecific BIL population of 250 lines from G. hirsutum × G. barbadense was employed to detect MIC QTLs in nine replicated field tests. Based on a highdensity genetic map with 7709 genotyping-by-sequencing (GBS)based SNP markers, 25 MIC QTLs were identified, including 12 previously described QTLs and 13 new QTLs. Importantly, eight candidate genes within two stable MIC QTL regions were identified with differential expression between upland TM-1 and Egyptian Hai7124. This study provides valuable information for improving MIC in cotton breeding.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/ Supplementary Material.

AUTHOR CONTRIBUTIONS
WP analyzed and summed all the data and wrote the manuscript. JS performed the gene expression and wrote introduction of the manuscript. WW, MW, QC, QQ, CH, HL, XG, HH, and YZ managed and collected the phenotype data. JM, BJ, and LW involved in the analysis of SNP markers. YQ and JY directed the experiments. JZ revised the manuscript. All authors read and approved the final manuscript.