Microbial Community of High Arsenic Groundwater in Agricultural Irrigation Area of Hetao Plain, Inner Mongolia

Microbial communities can play important role in arsenic release in groundwater aquifers. To investigate the microbial communities in high arsenic groundwater aquifers in agricultural irrigation area, 17 groundwater samples with different arsenic concentrations were collected along the agricultural drainage channels of Hangjinhouqi County, Inner Mongolia and examined by illumina MiSeq sequencing approach targeting the V4 region of the 16S rRNA genes. Both principal component analysis and hierarchical clustering results indicated that these samples were divided into two groups (high and low arsenic groups) according to the variation of geochemical characteristics. Arsenic concentrations showed strongly positive correlations with NH4+ and total organic carbon (TOC). Sequencing results revealed that a total of 329–2823 operational taxonomic units (OTUs) were observed at the 97% OTU level. Microbial richness and diversity of high arsenic groundwater samples along the drainage channels were lower than those of low arsenic groundwater samples but higher than those of high arsenic groundwaters from strongly reducing areas. The microbial community structure in groundwater along the drainage channels was different from those in strongly reducing arsenic-rich aquifers of Hetao Plain and other high arsenic groundwater aquifers including Bangladesh, West Bengal, and Vietnam. Acinetobacter and Pseudomonas dominated with high percentages in both high and low arsenic groundwaters. Alishewanella, Psychrobacter, Methylotenera, and Crenothrix showed relatively high abundances in high arsenic groundwater, while Rheinheimera and the unidentified OP3 were predominant populations in low arsenic groundwater. Archaeal populations displayed a low occurrence and mainly dominated by methanogens such as Methanocorpusculum and Methanospirillum. Microbial community compositions were different between high and low arsenic groundwater samples based on the results of principal coordinate analysis and co-inertia analysis. Other geochemical variables including TOC, NH4+, oxidation-reduction potential, and Fe might also affect the microbial composition.

Microbial communities can play important role in arsenic release in groundwater aquifers. To investigate the microbial communities in high arsenic groundwater aquifers in agricultural irrigation area, 17 groundwater samples with different arsenic concentrations were collected along the agricultural drainage channels of Hangjinhouqi County, Inner Mongolia and examined by illumina MiSeq sequencing approach targeting the V4 region of the 16S rRNA genes. Both principal component analysis and hierarchical clustering results indicated that these samples were divided into two groups (high and low arsenic groups) according to the variation of geochemical characteristics. Arsenic concentrations showed strongly positive correlations with NH + 4 and total organic carbon (TOC). Sequencing results revealed that a total of 329-2823 operational taxonomic units (OTUs) were observed at the 97% OTU level. Microbial richness and diversity of high arsenic groundwater samples along the drainage channels were lower than those of low arsenic groundwater samples but higher than those of high arsenic groundwaters from strongly reducing areas. The microbial community structure in groundwater along the drainage channels was different from those in strongly reducing arsenic-rich aquifers of Hetao Plain and other high arsenic groundwater aquifers including Bangladesh, West Bengal, and Vietnam. Acinetobacter and Pseudomonas dominated with high percentages in both high and low arsenic groundwaters. Alishewanella, Psychrobacter, Methylotenera, and Crenothrix showed relatively high abundances in high arsenic groundwater, while Rheinheimera and the unidentified OP3 were predominant populations in low arsenic groundwater. Archaeal populations displayed a low occurrence and mainly dominated by methanogens such as Methanocorpusculum and Methanospirillum. Microbial community compositions were different between high and low arsenic groundwater samples based on the results of principal coordinate analysis and co-inertia analysis. Other geochemical variables including TOC, NH + , oxidation-reduction potential, and Fe might also affect INTRODUCTION Arsenic (As)-contaminated groundwater used for drinking water has adversely impact the health of more than 140 million people all over the world including Bangladesh, Vietnam, India, China, Chile, USA, and Argentina (Berg et al., 2001;Anawar et al., 2011;Nicolli et al., 2012;Ayotte et al., 2014;Guo et al., 2014). Many studies have been conducted in recent years to investigate the geochemistry, biochemistry, and microbial ecology of As in groundwater aquifers (Sarkar et al., 2014;Wang et al., 2014bWang et al., , 2015 and the previous results showed that As release and mobilization were usually controlled by a series of microbially mediated reactions and geochemical processes. As one kind of important geological mediators, microorganisms can utilize As as an electron acceptor/donor or energy source, thereby changing the As speciation and mineralization in groundwater aquifers (Slyemi and Bonnefoy, 2012;Amend et al., 2014;Ghosh et al., 2014). To date, more than 200 strains assigned to 11 phyla of bacteria and one phylum of archaea (Crenarchaeota) have been isolated from different environmental conditions such as groundwater, hot springs, marine sediments, and mine drainage (Amend et al., 2014). Besides, some functional populations such as Fe-reducing bacteria and sulfate reducing bacteria have been previously found in As-rich aquifers (Kirk et al., 2010;Li et al., 2014Li et al., , 2015. Recent studies showed that agricultural activities such as irrigation might play important roles in As release and mobilization. The input of organic matter and fertilizers during the drainage process could change the environmental conditions, which could further affect the microbial communities (Kefala-Agoropoulou et al., 2008;Mayorga et al., 2013). These studies showed the importance of microorganisms in As release and mobilization, however, failed to provide information of in situ microbial community structure which is important in understanding the interaction between microorganism and As geochemistry in groundwater in agricultural area.
The Hetao Plain is one of the most serious and representative As-threatened areas in northwest China, with over 300,000 residents vulnerable to arsenicosis (Guo et al., 2003). Arsenic in groundwater in the Hetao Plain is released primarily through natural processes and anthropogenic activities including agricultural irrigation and mining (Deng, 2008;Guo et al., 2008). Our previous studies have demonstrated microbial community structure and functional microbial populations in high As groundwater from strongly reducing area of the Hetao Plain (Li et al., 2013Wang et al., 2014aWang et al., , 2015. Results showed that diverse microorganisms could be involved in As release and transformation in the Hetao Plain. The predominant populations in strong reducing conditions area of the Hetao Plain included Acinetobacter, Pseudomonas, Psychrobacter, and Alishewanella (Wang et al., 2014a;Li et al., 2015). These populations were reported to be related to As resistance, As reduction and oxidation, and iron reduction. As one of biggest agricultural areas in China, the subsurface hydrology in Hetao Plain has been largely affected by agricultural irrigation and drainage activities. Previous studies showed that drainage and irrigation would change natural biogeochemical processes and substantially affect As release in groundwater aquifers of the Hetao Plain (He, 2010;Guo et al., 2011). However, the effects of agricultural irrigation on in situ microbial diversity and community structure in high As groundwater have yet to be fully understood.
In the present study, we characterized and compared the microbial communities in groundwater along the agricultural drainage channels in Hetao Plain by Illumina sequencing. The objectives of this study were to (1) investigate the As geochemistry and the microbial community structure; (2) compare the microbial communities of high As groundwater in agricultural irrigation areas with those in strongly reducing high As groundwater of other areas at Hetao Plain; and (3) evaluate the key geochemical factors shaping the microbial community structures in high As groundwater aquifers in the agricultural irrigation area.

Site Description
The Hetao Plain (40 • 10 -41 • 20 N, 106 • 10 -109 • 30 E) is one of the Mesozoic-Cenozoic faulted basins located in the western part of Inner Mongolia. It is along the northern bank of the Yellow River, and is bounded to the north by the Yin Mountains and to the west by the Wulanbuhe Desert, and with lacustrine plain in the central part (Figures 1A,B). Groundwater in the Hetao Plain is discharged mainly by vertical evapotranspiration and pumping and recharged by irrigation. Our case study was carried out in Hangjinhouqi County (Figures 1B,C) where endemic arseniasis is most serious in the Hetao Plain, with about 76 thousands local residents exposed under As-rich groundwater (Deng, 2008). In Hangjinhouqi County, four main drainage channels (three SE-NW and one SW-NE orientated) have been used to drain redundant irrigational water and to lower the local groundwater table (Figure 1C).

Groundwater Sampling and Geochemical Analysis
Groundwater samples along the agricultural drainage channels were collected from 17 domestic wells of seven arseniasisaffected villages in Hangjinhouqi County, of which sample IC12 and IC16 had been collected for the study of methanogens in high As groundwater in a previous study . Sampling locations were shown in Figure 1. Wells were pumped to get stabilized oxidation-reduction potential (ORP) values (commonly 10-15 min) prior to groundwater sampling. Microbial samples were collected by filtering of approximately 10 L fresh groundwater through 0.2-µm filters (Millipore), the filters were immediately packed into 50 mL sterile tubes, and then frozen in dry ice. All samples were transported to the laboratory on dry ice and then maintained at −80 • C until further analysis. The geochemical parameters including temperature, pH, conductivity (COND), and ORP, were measured in situ using a multiple parameter water quality meter (Horiba, Japan). H 2 S, NH + 4 , and Fe(II) were determined using a portable spectrophotometer (HACH, DR890) according to the manufacture's protocols. (C) groundwater sampling sites in Hangjinhouqi. 01-17 represented samples IC01-IC17. Green ones were low As samples with As concentrations lower than 10 µg/L. Red ones were high As samples with As concentrations higher than 10 µg/L. All high As samples except 05 and 13 presented ratios of As(III) and total As higher than 0.5. Sampling methods for ions and As speciation analyses were from Deng (2008). Briefly, groundwater samples were filtered through 0.45 µm mixed cellulose ester membranes and then the filtrates for cation analysis were acidified to pH < 2. Water samples were filtered through a disposable syringe joined with an anion exchange cartridge (Supelco, USA) to separate soluble arsenite [As(III)] and arsenate [As(V)] species, which were subsequently measured using liquid chromatography-atomic fluorescence spectrometry (LC-AFS-9700, Haiguang, China). Measurement of anions including NO − 3 and SO 2 4 − was made with ion chromatography (DX-120, Dionex, USA). Major cations were determined with ICP-AES (IRIS Intrepid II XSP). Iron species were determined by the 1,10-phenanthroline-based assay which was described in our previous study (Li et al., 2013). Total organic carbon (TOC) was measured using a TOC analyzer (Vario MICRO cube, Elemental, Germany). All samples were run in triplicate and then averaged.

DNA Extraction, PCR, and Illumina Sequencing
For the analysis of microbial community composition, the filtered membranes were cut into small pieces under aseptic conditions and used for DNA extraction by the FastDNA SPIN Kit for Soil (MP Bio, USA) with a final elution in 80 µL deionized water. The universal forward primer 515 (5 -GTG CCA GCM GCC GCG GTA A-3 ) and reverse primer 806 (5 -GGA CTA CCA GGG TAT CTA AT-3 ) barcoded with a 12-base Golay code, which could cover a wide diversity of both archaea and bacteria (Bates et al., 2011), were used to amplify the microbial 16S rRNA gene V4 region. The 25 µL PCR reaction mix consisted of 2.5 µL of Takara 10× Ex Taq Buffer, 2 µL of dNTP Mix (2.5 mM each), 0.7 µL of Roche bovine serum albumin (20 mg/mL), 0.125 µL of Ex Taq DNA polymerase (Takara, 5 U/µL), 0.5 µL of 10 µM primer 515F, 0.5 µL of 10 µM barcode primer 806R, 1 µL of template DNA, and 17.675 µL of PCR grade water. The PCR amplification condition was: initial denaturation at 95 • C for 3 min, followed by 25 cycles of 95 • C for 30 s, 50 • C for 30 s, and 72 • C for 45 s, and a final extension at 72 • C for 10 min. All samples were run in triplicate and then pooled together. Successful amplifications were confirmed by E-Gel (Invitrogen, USA) electrophoresis. PCR products were purified using the Agencourt AMPure XP PCR purification system (Beckman Coulter, Brea, CA, USA). The purified amplicons were quantified using the Qubit dsDNA HS assay and the size of the amplicons was determined using a Bioanalyzer with Agilent DNA 1000 chips (Agilent Technologies, Santa Clara, CA, USA). Amplicons were pooled (25 ng per sample) and the mixed suspensions were then subjected to paired-end (PE) sequencing by an Illumina MiSeq 2000 instrument at the Yale Center for Genome Analysis.

Data Analysis
Principal component analysis (PCA) between As concentrations and other geochemical parameters were conducted using CANOCO for windows version 4.5. Hierarchical cluster analysis was performed with the "vegan" package in RStudio 1 using unweighted pair-group method with arithmetic means (UPGMA) and Euclidean distance measure. SPSS 20 software was used to analyze the significant differences (p ≤ 0.05) among different geochemical characteristics and microbial diversity.
The PE reads generated from Illumina MiSeq sequencing were assembled using FLASH (Fast Length Adjustment of Short reads) according to index sequence (Magoč and Salzberg, 2011). All sequences with ambiguous base calls were discarded. The PE sequences were also discarded if a mismatch was observed in the assembly. The output data was further analyzed using QIIME (quantitative insights into microbial ecology) v1.7.0. Operational taxonomic units (OTU) picking was performed using a "closed-reference" protocol. The most abundant sequence in each cluster was chosen to represent the cluster. All representative sequences were aligned with PyNAST. Chimeric sequences were identified with Chimera Slayer and excluded. Sequences were clustered against the Greengenes database 2 at 97% identity using uclust. Following assignment, 12,000 successfully assigned sequences from each sample were chosen at random to allow for downstream analyzes and cross-sample comparison. Alpha diversity analysis involving rarefaction curves, Chao1 and Shannon diversity (representing estimated OTU numbers and microbial diversities in the community, respectively) indices and beta diversity analysis were generated using OTU picking result in QIIME software.
Multivariate community analysis including principal coordinate analysis (PCoA) and co-inertia analysis was performed with the ade4 package (Chessel et al., 2004;Dray and Dufour, 2007) within R programming environment 3 using normalized OTU data. Correlated geochemical variables were removed from the analyzed table. Besides, some geochemical variables were made Log 10 transformation to reach normal distribution before analysis. SIMPER (similarity percentage) analysis (Clarke and Warwick, 1994) was conducted to rank the top 10 OTUs that contributed to the dissimilarity index between high As groundwater samples and low As groundwater samples. The DNA sequences were deposited to the Short Read Archive database at NCBI (Accession number: SRR4996298).

Groundwater Geochemistry
Arsenic concentrations of groundwater in Hangjinhouqi County were generally elevated along the agricultural drainage channels in the front of Yin Mountains (Figure 1; Table 1). This was consistent with the results by He (2010) that As in soils or sediments could be leached into groundwater with the irrigation return flow. The 17 groundwater samples were divided into two groups with the distinctive geochemical characteristics according to the PCA (Figure 2A) and Hierarchical clustering results ( Figure 2B). The first group (low As group) contained six groundwater samples which were characterized with low concentrations of As (As < 10 µg/L) and relatively low TOC and NH + 4 . The other group (high As group) included 11 samples with high concentrations of As (As > 10 µg/L), TOC and NH 4 + . Arsenic concentrations of those high As groundwater samples were from 50 to 1089 µg/L and As(III) was the dominant species in most of high As groundwaters, with ratios of As(III) to As Tot ranging from 0.04 to 0.93 ( Table 1). The concentrations of TOC lay in the range 1.9-18.4 mg/L in high As groundwater samples and 0-4.3 mg/L in low As samples. NH + 4 concentrations varied from 1.94 to 6.68 mg/L in high As groundwater samples and from 0 to 3.87 mg/L in low As groundwater samples.
Arsenic  Table 2). These results were different from those in our previous study of samples from strongly reducing areas in Hetao Plain that As concentrations showed negative correlations with the concentrations of SO 2 4 − and positive correlations with the concentrations of Fe(II) . Considering the positive correlation among As, NH + 4 , and TOC, these results suggest that groundwater samples collected along the drainage channels might present distinct biogeochemical characteristics and redox properties. Agricultural activities such as N-fertilizer application and irrigation might also play important roles in As release and mobilization (Busbee et al., 2009;Mayorga et al., 2013). The reasonable explanation might be that the irrigation and drainage processes could leach As and dissolved organic matter from surficial sediments to the underlying aquifer, which caused the biogeochemical changes (He, 2010).

Microbial Richness and Diversity
The microbial community diversities and phylogenetic structure of the 17 groundwater samples were analyzed by Illumina sequencing. A total of 509,554 full length V4 tags were obtained after removal of low quality and chimeric sequences and 24,938 OTUs were obtained at a distance of 0.03. A variety of taxa were observed at the 97% OTU level, with 329-2823 observed OTUs and 0.64-0.84 coverage values ( Table 3). The abundance of OTUs was observed to be higher in the low As groundwater samples than in high As samples ( Table 3). The rarefaction curves generated by QIIME were normalized to the minimum number (17,000) of sequences. The Chao1 and Shannon diversity ranged from 465.26 to 3420.42 and 4.08 to 9.32, respectively (Table 3), showing higher values in low As groundwater samples ( Table 3). These results implied that high As groundwater samples presented lower microbial diversities which was consistent with our result of the 16S rRNA gene clone library (Li et al., 2013). Compared with groundwater samples from strongly reducing areas of Hetao Plain , samples collected along the agricultural  Samples in the shaded part are low arsenic samples (As < 10 µg/L). * Samples were referred in our previous study . drainage channels showed much higher microbial diversities, implying that irrigation and drainage activities might affect local geochemical environment and thus enhance the diversity of microorganisms.

Microbial Compositions in High As Groundwater
Microbial community compositions at phylum level showed no distinct difference between high and low As groundwater samples (Supplementary Figure S2). Illumina results of the 17 groundwater samples indicated that the bacterial community was composed of 62 phyla (percentages lower than 0.01% were not calculated). Of these, only nine phyla dominated each community (Supplementary Figure S2), among which Proteobacteria was the most dominant group (32.02-86.50%), followed by Firmicutes (0.16-18.48%), Actinobacteria (0.34-12.08%), and Nitrospirae (0-22.64%). Within phylum Proteobacteria, Gammaproteobacteria (6-83%), and Betaproteobacteria (5-39%) were the most abundant (Supplementary Figure S2). The Alphaproteobacteria and Deltaproteobacteria accounted for 0-17% and 0-23% of the total Proteobacteria, respectively. The class Epsilonproteobacteria displayed a low rate of occurrence and was only abundant in two high As groundwater samples IC02 and IC14 (Supplementary Figure S2). Archaeal populations accounted only for 0.01-8.69% of the microbial populations. All detected archaeal populations belonged to the phyla Euryarchaeota and Crenarchaeota which represented by the classes of Parvarchaea, Methanomicrobia, and Methanobacteria. Members of the Methanomicrobia and Methanobacteria retrieved at high As groundwater samples were mostly affiliated with methane-producing archaea Samples in the shaded part are low arsenic samples (As < 10 µg/L).
Methanocorpusculum, Methanospirillum and SAGMEG-1 which have previously been detected in marine sediments and sludge (Iino et al., 2009;Barbier et al., 2012). These methanogenic populations might provide even stronger reducing conditions and thereby accelerate As release in groundwater aquifers in the Hetao Plain . However, irrigation and drainage activities in agricultural areas could bring more oxygen and the resulting oxidizing conditions might inhibit the activities of methanogenic populations. At the genus level, the top 10 taxonomic OTUs that contributed to the dissimilarities between high and low As groundwater samples were identified by SIMPER analysis ( Table 4). The average abundances of Alishewanella, Psychrobacter, Pseudomonadaceae, Methylotenera, Comamonadaceae, and Crenothrix were elevated in high As groundwater samples than in low As samples, while Rheinheimera and unidentified OP3 presented higher abundance in low As groundwater samples than high As samples ( Figure 3B; Table 4). Although Acinetobacter and Pseudomonas had been reported to be correlated with As metabolism (Anderson and Cook, 2004;Chang et al., 2009;Freikowski et al., 2010), these two populations appeared with slightly higher percentages in low As groundwater samples but dominated in both high (0.2-28.2% and 0.1-13.6%, respectively) and low (0.2-41.4% and 0.2-37.3%, respectively) As samples in the present study. This might be due to the wide occurrence of these two kinds of bacteria in nature, and some of their isolates are capable of tolerating high concentrations of As. These results were mostly consistent with those of our previous studies in the strongly reducing area conducted with traditional sequencing and 454 pyrosequencing methods (Li et al., 2013. Predominant populations in high As groundwater samples such as Alishewanella and Psychrobacter were previously isolated from industrial effluent or groundwater aquifers, showing the capability of hyper arsenite tolerance or arsenate reduction (Liao et al., 2011;Jain et al., 2014). In some recent studies, species of Alishewanella were also identified as  denitrifiers (Kolekar et al., 2013;Liu et al., 2015). Methylotenera species were reported as facultative methylotrophs and could be involved in the carbon and nitrogen metabolic pathways (Mustakhimov et al., 2013;Chistoserdova, 2014). The sample with the highest concentration of TOC presented the highest abundance of Methylotenera. Crenothrix was documented as one kind of iron oxidizing bacteria in groundwater (Kang and Liu, 2013;Thapa Chhetri et al., 2014). Besides, Crenothrix was also reported as a methane oxidizer with a unique methane monooxygenase (Stoecker et al., 2006). These dominant populations in high As groundwater indicated that As release in groundwater along agricultural drainage channels was correlated with microbial carbon, nitrogen, and iron reactions. Different from our previous studies with samples from the natural reducing area (Li et al., 2013, many oxidizing microbial populations such as Methylotenera and Crenothrix were found in the present study, indicating the possible influence of geochemical difference on microbial composition. The dominant populations in this study were also different from those in high As groundwater aquifer in Bangladesh (such as Hydrogenophaga and Acidovorax; Sutton et al., 2009), West Bengal (such as Agrobacterium-Rhizobium, Ochrobactrum, Anoxybacillus, and Paenibacillus; Sarkar et al., 2015), and Vietnam (Lawati et al., 2012). These community differences might be influenced by different geochemical conditions such as ORP (Sutton et al., 2009) and dissolved organic carbon (Lawati et al., 2012).

Microbial Community Structure in Relation to Geochemistry
PCoA plots were used to visualize the relationships among microbial communities using default beta diversity metrics of unweighted Unifrac. Geochemical parameters including pH, ORP, As, Fe, SO 2 4 − , NH + 4 , and TOC were used to calculate the possible influence on microbial differences. Results showed that microbial communities could be generally divided into two groups according to different As concentrations (Figure 4). A similar result was also obtained with UPGMA cluster tree analysis which was based on Bray-Curtis dissimilarity at the 97% similarity OTU level ( Figure 3A). These results implied that As was an important factor shaping the microbial community structure in groundwater of agricultural irrigation area.
Co-inertia analysis was carried out in order to reveal the relative importance of geochemical vectors that affect microbial community structures. The arrow head and end represented the position of sample according to the geochemical ordination and microbial ordination, respectively ( Figure 5). As shown in Figure 5B, the first two axes explained 81% of the total environmental variability. Results revealed that high As groundwater and low As groundwater samples were significantly different (p value < 0.001) in community structures ( Figure 5A) and geochemical variables had a significant influence on their microbial community compositions. In addition to As concentration, geochemical variables such as TOC, NH + 4 , Fe, and ORP might also affect the composition (Figure 5C). These important geochemical factors shaping the microbial community structure were different from those of strong reducing area in the Hetao Plain which were characterized with high concentrations of Fe(II), H 2 S, and SO 2 4 − .

CONCLUSION
Arsenic concentrations in groundwater samples along agricultural drainage channels of the Hetao Plain were positively correlated with concentrations of TOC and NH + 4 . Illumina sequencing of the 16S rRNA genes demonstrated that a diverse array of microorganisms existed in the high As groundwater aquifers which were mainly composed of Proteobacteria, Firmicutes, Actinobacteria, and Nitrospirae. The dominant microbial populations were distinctly different between the high and low As groundwater samples. The predominant groups in high As samples were Alishewanella, Psychrobacter, Methylotenera, and Crenothrix. Microbial groups such as Rheinheimera and unidentified OP3 were much more predominant in low As samples. These dominant populations in high As groundwater were previously reported to be correlated with microbial carbon, nitrogen, and iron reactions. Oxidizing populations such as Methylotenera and Crenothrix were found in samples along agricultural drainage channels, which was different from our previous studies in samples from other areas under strongly reducing conditions. Statistic results indicated that microbial community structures were different between high and low As groundwater samples, under the impact of As concentrations as well as other geochemical variables including TOC, NH + 4 , ORP, and Fe. Overall, the results of this study indicate that both the geochemistry and microbial community structures in groundwater agricultural areas are different from those in other As-rich aquifers of Hetao Plain, Inner Mongolia and other high As groundwater aquifers such as Bangladesh, West Bengal, and Vietnam. (A) Samples projected on the first two axes; (B) Eigenvalues; (C) Main geochemical vectors that affect sample ordination in the CIA. As concentration was used as a factor. Circular ones represent low As samples (As < 10 µg/L) while rectangular ones represent high As samples (As > 10 µg/L).

AUTHOR CONTRIBUTIONS
YXW and PL designed experiments; YHW carried out experiments and wrote the manuscript; YHW and ZJ analyzed experimental results; AS assisted with sequencing data analysis; SW assisted with Illumina sequencing experiment; HLD, YXW, and PL revised the manuscript; JT and DW collected all samples needed in the study.