An ATAC-seq Dataset Uncovers the Regulatory Landscape During Axolotl Limb Regeneration

Citation: Wei X, Li H, Guo Y, Zhao X, Liu Y, Zou X, Zhou L, Yuan Y, Qin Y, Mao C, Huang G, Yu Y, Deng Q, Feng W, Xu J, Wang M, Liu S, Yang H, Liu L, Liu C and Gu Y (2021) An ATAC-seq Dataset Uncovers the Regulatory Landscape During Axolotl Limb Regeneration. Front. Cell Dev. Biol. 9:651145. doi: 10.3389/fcell.2021.651145 An ATAC-seq Dataset Uncovers the Regulatory Landscape During Axolotl Limb Regeneration


INTRODUCTION
Tissue regenerative potential varies significantly across species, tissues, and ages (Yun, 2015;Iismaa et al., 2018). For example, planarian can reconstruct its whole body from small fragments of the original organism (Pellettieri et al., 2010;Zeng et al., 2018); in contrast, many vertebrate organs, such as the heart, can only regenerate primarily through preexisting proliferating cardiomyocytes, like in adult zebrafish and neonatal mice (Vivien et al., 2016). Since Spallanzani first reported the salamander regeneration in 1760s, scientists have been devoted to decipher the codes of such powerful regenerative capability (Dinsmore, 1991). Using different methods to analyze the cellular and molecular phenomena during salamander limb or tail regeneration, researchers revealed complex processes including clotting, immune activation, apoptosis, and reprogramming (Tanaka, 2016). Within such process, a mass of cells called blastema proliferates from the wounded site and fully regenerates the lost body part (McCusker et al., 2015;Haas and Whited, 2017).
Axolotl (Ambystoma mexicanum) is a species of salamander, which has been used as the model animal to investigate key biological processes such as embryo development, limb regeneration, and central nervous system regeneration for nearly 150 years (Pietsch, 1961;Schreckenberg and Jacobson, 1975;Seyedhassantehrani et al., 2017). Although several studies have focused on bulk transcriptome studies (Monaghan et al., 2009;Campbell et al., 2011;Knapp et al., 2013;Stewart et al., 2013;Wu et al., 2013;Bryant et al., 2017), the axolotl genome was not assembled until 2018 with features of large sizes (32 Gb) and abundant repetitive sequences (Nowoshilow et al., 2018). Interestingly, in axolotl, intron size expands 13-to 25-fold in non-developmentally related orthologous genes and 6-to 11-fold in developmentally related orthologous genes as compared to human, mouse, and frog, thus indicating that a more complex regulatory network in noncoding regions may play an important role in both development and regeneration (Nowoshilow et al., 2018). Since the first axolotl genome assembly, multiple studies have been carried out to investigate the transcriptomic patterns of axolotl limb regeneration at single-cell resolution (Gerber et al., 2018;Leigh et al., 2018;Qin et al., 2020). These studies used single-cell gene expression variations to reflect dynamic cell population changes and cell fate transitions, as well as unique immune responses during regeneration (Tsai et al., 2019;Li et al., 2020;Rodgers et al., 2020). Standing on the shoulder of these studies and looking forward, analysis of the epigenetic regulations, which are responsible for the dynamic gene expression changes, will help scientists to better understand the underlying mechanisms of the regenerative process.
To identify crucial regulatory elements and transcription factors (TFs) that drive or support the regenerative response, the assay for transposase-accessible chromatin using sequencing (ATAC-seq) has been used to profile the chromatin accessibility dynamics in multiple species (Buenrostro et al., 2015). For instance, a genome-wide scan for TF binding motifs in thousands of chromatin regions revealed by ATAC-seq highlighted the role of EGR (early growth response) as a pioneer factor to directly activate regeneration-related genes in Hofstenia (Gehrke et al., 2019). In addition to TFs, enhancers also have great significance in regeneration. The conserved teleost regeneration response enhancers in zebrafish and African killifish (Nothobranchius furzeri) were uncovered by histone H3K27ac chromatin immunoprecipitation sequencing (ChIPseq, a marker for active enhancers), bulk RNA sequencing (RNA-seq), and single-cell RNA sequencing (scRNA-seq) . These studies suggested that epigenetic regulatory elements play fundamental roles in regeneration. However, how the non-coding axolotl genome responds to wounding to regulate gene expression and consequently drive the process of limb regeneration remains to be elucidated.
Here, we present a comprehensive dataset of chromatin accessibility for eight stages of the axolotl limb regeneration process, including homeostatic [uninjured control, 0 h after amputation (0 hpa)], trauma (3 hpa), wounding healing (1 day after amputation, 1 dpa), early-bud blastema (3 dpa), midbud blastema (7 dpa), late-bud blastema (14 dpa), palette stage (22 dpa), and redifferentiated stages (33 dpa) ( Figure 1A). These time points represent the main events during axolotl limb regeneration, making this dataset a valuable platform to understand the complex regulatory network from an overall perspective. We generated 24 samples from the eight stages of limb tissues (three biological replicates per group). Systematic analysis of our dataset identified a total of 342,341 peaks, of which 33,604 showed transient dynamic patterns. We further investigated the occupancy of TFs in clusters with different peaks, which may help to explain the activation and manipulation of these regulatory elements during injury response and regeneration process ( Figure 1B).

Sample Collection
The institutional review board approved all experiments in this study on the ethics committee of BGI (permit BGI-IRB 19059). Axolotl breeding, housing, and tissue isolation were performed as previously described (Li et al., 2020). Briefly, we anesthetized the axolotls with 0.2% Tricaine (ethyl 3-aminobenzoate methane sulfonate) before the amputation surgery. The lower forearm tissues were isolated at eight time points including the homeostatic stage (uninjured control, 0 hpa), trauma (3 hpa), wound healing stage (1 dpa), early-bud blastema (3 dpa), midbud blastema (7 dpa), late-bud blastema (14 dpa), palette (22 dpa), and redifferentiation stage (33 dpa), with three replicates for each stage. These eight time points represent the main phases of axolotl limb regeneration. All tissues were washed with amphibian phosphate-buffered saline for three times before further operation. Tissues were enzymatically digested to cell suspension using 0.2% collagenase type I (BBI, cat. #A004194-0001) and 0.2% collagenase type II (BBI, cat. #A004174-0001) at room temperature for 1 h.
The transposed DNA was purified with a DNA MinElute kit (Qiagen, Germany) and eluted with 20 µL nuclease-free H 2 O. The purified DNA was amplified for eight cycles using a reaction mixture containing 2.5 µL of Tn5 Ad153 N5 primer (20 µM), 2.5 µL of Tn5 Ad153 N7 primer (20 µM), 25 µL of NEB Next High-Fidelity 2× polymerase chain reaction (PCR) Master Mix, with a PCR protocol of 72 • C for 5 min, 98 • C for 30 s, and then eight cycles of 98 • C for 10 s, 63 • C for 30 s,72 • C for 1 min, finally by 72 • C for 10 min and hold at 4 • C. The 300-to 500-bp size PCR product was selected using AMPure XP beads (Agencourt, cat. #A63882) according to the manual. All libraries were further prepared based on BGISEQ-500 sequencing platform with pairend 50-bp read length (Huang et al., 2017).

Preprocessing of the ATAC-seq Datasets
The data of ATAC-seq were trimmed with SOAPnuke (Chen et al., 2018), and reads were aligned to axolotl genome (Nowoshilow et al., 2018) (https://www.axolotl-omics.org/ assemblies) by using Sentieon bwa mem (parameter: -K 100,000,000 -M -t 40) (Li, 2013). Subsequently, we filtered out reads with mapping quality of <30. PCR duplicate reads were discarded by applying Picard's MarkDuplicates (http:// broadinstitute.github.io/picard/) (Picard Toolkit, 2019). We next performed model-based analysis of ChIP-seq (MACS2) to identify the peak regions with options -B, -q 0.01 -nomodel, -f BAM (Zhang et al., 2008). The irreproducible discovery rate (IDR) method was employed to identify reproducible highquality peaks between each two biological replicates (Li et al., 2011). Peak signal can be visualized in IGV by the Broad Institute (http://software.broadinstitute.org/software/igv/). A standard peak list was established by merging reproducible peaks of each two replicates for each time point. The standard peak count matrix was calculated using the intersect function of BedTools (Quinlan and Hall, 2010).

Identification of Dynamic Chromatin Accessible Regions
Reads per million mapped reads (RPM) algorithm was used to normalize the raw count matrix (Wei, 2020). Pearson correlations based on the Log10 RPM matrix were used to calculate the coefficients between different biological replicates across every stage. The RPM matrix for biological replicates was aggregated, and peaks with an average of RPM <1 at all time points were removed. Peaks across time with <50 coefficient of variation were filtered out to form a pseudocluster prior to clustering, which reflects the regions with stable accessibility throughout the regeneration. The remaining peaks were then transformed into normalized data using Z score method, followed by performing time course c-means fuzzy clustering with a cluster membership cutoff of 0.8 (Kumar and Futschik, 2007).
Relative genomic region was determined by overlapping each peak with features defined in the custom's annotated genes. The distance to transcription start sites (TSSs) was calculated according to the distance between the peak center and the nearest TSSs using the distanceToNearest function in GenomicRanges packages (Lawrence et al., 2013).
Functional enrichment of peaks in each cluster with distance to TSSs <10,000 bp was performed by using the clusterProfiler R package (Yu et al., 2012), with a q-value threshold of 0.1 for statistical significance.
The findMotifsGenome.pl script of the HOMER software was employed to perform transcript factor enrichment analysis in regeneration dynamic peaks with default settings (Heinz et al., 2010).

Pseudobulk RNA-seq Analysis
To investigate the correlation between chromatin dynamics and gene expression changes, we took advantage of singlecell RNA-seq data of these eight stages we published previously and calculated the average expression of each gene to construct a pseudobulk gene expression matrix (Li et al., 2020). Correlation analysis was done between the chromatin accessibility of promoters (TSS ± 2 kb) and closest genes' expression.

ATAC-seq Quality Control and Reproducibility of Biological Samples
We inspected our ATAC-seq dataset by regularly used statistics, such as the number of total reads, number of mapped reads, percentage of mapped reads, the number of usable reads, the percentage of final usable reads, and the number of peaks (Supplementary Table 1). We generated more than 1,000 million ATAC-seq reads for each replicate on average. Among these reads, we detected strong enrichment around TSSs (Wei, 2020) (Figure 1C). Moreover, size periodicity of the chromatin accessibility fragments corresponding to integer multiples of nucleosomes (Wei, 2020) demonstrated the reliability of our dataset, this being consistent with previously published ATACseq profiles (Buenrostro et al., 2013) (Supplementary Figure 1).
To assess the reproducibility of chromatin accessibility regions between biological replicates, we used the IDR method to filter peaks that overlapped between replicates in each regeneration stage. Pearson correlations based on the Log 10 RPM matrix were used to calculate the coefficients, showing that a correlation coefficient is higher than 0.9 between each two replicates in each stage, with the exception of replicate 1 from 3 hpa, which was removed for downstream analysis (Figure 1D).

Temporal Dynamics of Chromatin Accessibility During Regeneration
To explore the chromatin accessibility with temporal dynamic features during regeneration, we used normalized ATAC-seq read counts in peaks to perform time-course fuzzy clustering. This approach yielded six separated clusters, which indicate six distinct categories defined by regions: (1) those that become accessible transiently at 22-and 33-dpa stages, cluster 1 (C1, n = 5,930); (2) regions that are close in the intermediate period of regeneration but accessible after 14 dpa, cluster 2 (C2, n = 3,838); (3) regions in which accessibility is established only at 22-dpa stage, cluster 3 (C3, n = 6,797); (4) regions accessible in the control but that exhibit loss of accessibility shortly at 3 hpa and later stages, cluster 4 (C4, n = 2,454); (5) regions that are accessible in specifically at 3 hpa and 14 dpa, cluster 5 (C5, n = 7,727); (6) regions that are stably accessible at the intermediate stages of regeneration, cluster 6 (C6, n = 6,858). This clustering highlighted several characteristics of chromatin reconfiguration during regeneration (Figure 2A). These data collectively demonstrated that the chromatin state is remodeling rapidly in the first few hours following amputation, to prepare for the subsequent regeneration process. The dynamics of chromatin accessibility provides a new perspective to understand the cell fate decision in axolotl limb regeneration process.
Peaks in C1 are highly enriched in Gene Ontology (GO) terms related to axonogenesis. Examples of C1 include a promoter in the Ndnf locus, which is a novel neurotrophic factor derived from neurons that may be useful in the treatment of neuronal degeneration diseases and nerve injuries (Kuang et al., 2010). C2 consists of elements that are highly accessible in the extracellular matrix organization and connective tissue development. For example, Col11a2, a fibril-forming collagen found mainly in the cartilage extracellular matrix, is important for the integrity and development of the skeleton (Lui et al., 1996). We also found some genes associated to limb morphogenesis in C3 such as the Hoxbox gene, Evx2, and Hoxd10 (Herault et al., 1996;Tarchini and Duboule, 2006). GO terms enriched in C4 were related to muscle cell development, whereas those in C5 were related to epidermis development. Interestingly, we observed some immune response-related GO terms in C6, such as T-cell activation and myeloid cell differentiation (Figures 2B,C), which is consistent with the inflammatory process following injury (Supplementary Figure 2).

TF Enrichment of Dynamic cis-Regulatory Elements
Our analysis also indicated that the binding site for TF that bound to C1 is enriched for NeuroG2 ( Figure 2D). NeuroG2 is a TF that can specify a neuronal fate and expressed in neural Where a point is present, a significant enrichment for the go term of biological process (x axis) was found in the ATAC-seq clusters (y axis). Point size represents the gene ratio in the go term, and color represents the adjusted P-value. (D) Enrichment of the indicated TF motifs in each ATAC-seq cluster. The size and color of each point represent the motif enrichment P-value (-log10 P-value). progenitor cells within the developing central and peripheral nervous systems (Dennis et al., 2019). Notably, we also observed major binding events for the pioneer factor PU.1 in C6, this being a master transcriptional regulator in activating many target genes during both myeloid and B-lymphoid development (Turkistany and DeKoter, 2011). In addition, transcript factors from the MyoG, MyoD, and Mef families, which are essential for the development of functional skeletal muscle, were found in C4 (Al-Khalili et al., 2004;Ganassi et al., 2020). Taken together, we provide a high-quality comprehensive dataset to study the regenerative epigenomic dynamics of axolotl limb regeneration.

CONCLUSIONS
To summarize, by applying the state-of-the-art technique ATACseq, we provide the first chromatin accessibility landscape in axolotl regenerative limb tissues from the immediate response stage to the complete recovery stage. These data will be of great importance to the studies of various scientific disciplines such as development, cell reprogramming, and mechanisms underlying regeneration. Further analysis of these datasets by focusing on the differentially regulated regions may help deduce key regulatory elements that are critical for regeneration initiation in the axolotl limb in the future.

ETHICS STATEMENT
The animal study was reviewed and approved by the Ethics Committee of BGI.

AUTHOR CONTRIBUTIONS
XW, YGu, HL, CL, YGuo, and XZh conceived the idea. YGuo and XZh collected samples. XZh and YGuo generated the data. HL, YYua, YQ, CM, JX, MW, YYu, and QD assisted with the experiments. XW analyzed the data with the assistance from YL, XZo, LZ, GH, and WF. XW wrote the manuscript with the input of HL, YGuo, and XZh. YGu and CL supervised the study and revised the manuscript. LL, HY, and SL provided helpful comments on this study. All authors reviewed and approved the final manuscript.

ACKNOWLEDGMENTS
We thank the Center for Digitizing Cells, Institute of SuperCells, BGI-Shenzhen for helpful comments. This work is supported by China National GeneBank for providing sequencing service. We also thank Liang Chen from Hubei Key Laboratory of Cell Homeostasis, College of Life Sciences, Wuhan University and Giacoma Volpe from GIBH for valuable advice.

651145/full#supplementary-material
Supplementary Table 1 | ATAC-seq metadata and mapping statistics. * Mapped reads: total number of reads minus number of unaligned reads. * Usable reads: number of mapped reads minus number of low mapping quality and duplicate reads.