Assaying Chromatin Accessibility Using ATAC-Seq in Invertebrate Chordate Embryos

Cis-regulatory elements (CREs) are non-coding DNA regions involved in the spatio-temporal regulation of gene expression. Gene regulatory changes drive animal development and play major roles during evolution of animal body plans. Therefore, we believe that determining CREs at different developmental stages and across animal lineages is critical to understand how evolution operates through development. The Assay for Transposase-Accessible Chromatin followed by high-throughput sequencing (ATAC-seq) is a powerful technique for the study of CREs that takes advantage of Tn5 transposase activity. Starting from fewer than 105 cells, in a 1-day procedure, it is possible to detect, at a genome-wide level, CREs located in open chromatin regions with high resolution. Here, we describe a detailed step-by-step ATAC-seq protocol for invertebrate chordate marine embryos. We have successfully applied this technique to amphioxus and two species of tunicate embryos. We also show an easy workflow to analyze data generated with this technique. Moreover, we point out that this method and our bioinformatic pipeline are efficient to detect CREs associated with Wnt signaling pathway by simply using embryos treated with a drug that perturbs this pathway. This approach can be extended to other signaling pathways and also to embryo mutants for critical genes. Our results therefore demonstrate the power of ATAC-seq for the identification of CREs that play essential functions during animal development in a wide range of invertebrate or vertebrate animals.


INTRODUCTION
During embryonic development, cellular cross-talks must be tightly coordinated to allow the proper formation of the different tissues in the developing organism. These cross-talks are mediated by a limited number of pleiotropic signaling pathways that precisely control gene expression. Developmental genes are among the main targets of these signaling pathways. Accordingly, the expression of these genes is precisely regulated in space and time by the combined action of different pathways. However, how developmental genes integrate these multiple inputs is largely unexplored.
Cis-regulatory elements (CREs) are non-coding DNA regions involved in the regulation of gene expression. These CREs, mainly constituted of enhancers and promoters, contain DNA binding sites for transcription factors (TFs). TFs binding at CREs facilitate the recruitment of the RNA polymerase II machinery at the promoter of target genes, triggering their expression. Signaling pathways operate through different families of TFs. Therefore, identification of the CREs bound by these TFs that respond to signaling pathways would help to unravel how developmental genes integrate these signals (Spitz and Furlong, 2012;Long et al., 2016).
In the last years, the identification of CREs has been boosted by the emergence of next-generation sequencing. Among the new available techniques, we selected the Assay for Transposase-Accessible Chromatin followed by high-throughput sequencing (ATAC-seq; Buenrostro et al., 2013). ATAC-seq is a simple, easy-to-use, and efficient technique for identification of CREs actively bound by TFs in most tissues and organs of different species (Bogdanović et al., 2016;Sebé-Pedrós et al., 2016;Kittelmann et al., 2018;Marlétaz et al., 2018;Ruiz et al., 2018;Madgwick et al., 2019). The assay is based on the hyperactive Tn5 transposase, which preferentially integrates into open chromatin regions, where it cuts DNA into fragments and simultaneously attaches sequencing Illumina adapters to the fragment ends, facilitating the direct construction of the sequencing library ( Figure 1A). Sequencing reads are used to infer regions of increased accessibility such as promoters or enhancers, discriminating them from regions occupied by nucleosomes. Unlike ChIP-seq and DNAse-seq, ATACseq enables obtaining, in a 1-day procedure, DNA libraries with genome-wide information from fewer than 10 5 cells. Thus, this technique permits identification of open chromatin regions, offering a much more precise and realistic picture of how functional CREs are distributed in the genome. Moreover, applying this protocol to embryos at specific embryonic stages or to isolated populations of cells allows efficient exploration of the dynamics of regulatory activity across the different developmental stages and tissues. Finally, comparing orthologous regulatory regions in related species will give us the opportunity to find functionally conserved enhancers that are maintained at equivalent genomic positions, even though their sequences have diverged. Consequently, these studies move us closer to studying how changes at CREs are associated with gene expression differences during development and how they contribute to the evolution of novel morphological traits.
Here, we provide a step-by-step protocol with the aim of unifying species-specific procedures recently published (Marlétaz et al., 2018;Madgwick et al., 2019). We also provide detailed information about the handling of embryos, such as in vitro fertilization, and about how to obtain the optimal cell lysates specific for each organism. Starting from the cell lysate, the subsequent protocol steps are common for all the organisms used. Then, we propose an easy-to-follow workflow for basic bioinformatic analysis applicable to biological questions, such as identifying differential ATAC-seq signal across development or in embryos with perturbed signaling FIGURE 1 | (A) ATAC-seq cartoon reaction. The Tn5 transposase (dark pink) inserts two sequencing adapters (yellow and blue) only in accessible regions, between nucleosomes and binding sites for proteins (green), such as transcription factors (TF). Promoter regions (P) and transcription start sites (TSS) are also considered open chromatin regions. Tn5 generates sequencing fragments that can be amplified by PCR. (B) ATAC-seq tracks in the six3 region from chordates embryos at late gastrula stage. ATAC-seq tracks are marked in orange while the gene model is represented in blue. Mapped sequenced reads, <130 bp, identify discrete open chromatin regions mostly upstream the body gene region.
pathways. To demonstrate the robustness of our analysis, we used recently published ATAC-seq data from wild-type (wt) Ciona embryos at the 112-cell stage and embryos treated with CHIR-99021 (Madgwick et al., 2019), a pharmacological activator of the Wnt/β-catenin pathway (Naujok et al., 2014;Wu et al., 2015). This pathway has an ancestral function during primary body axis formation in both bilaterians and non-bilaterians (Petersen and Reddien, 2009;Imai et al., 2000;Creyghton et al., 2010;Loh et al., 2016). Our novel analysis reveals that the ATAC-seq peaks that are more accessible in the embryos treated with the Wnt agonist are enriched for TFs binding motifs (TFBMs) linked with this pathway. Accordingly, these potential CREs are associated with genes that play critical roles during development.
Our results strongly suggest that combining ATAC-seq with embryo perturbation experiments is a powerful method for identification of biological significant CREs critical for embryo development in multiple animal models.

Branchiostoma lanceolatum
Branchiostoma lanceolatum adults were collected during their breading season in the coast of Argelès-sur-mer, France, according to the previously described method (Fuentes et al., 2004).

Ciona intestinalis and Phallusia mammillata
Adult animals of Ciona intestinalis (previously C. intestinalis type B; Caputi et al., 2007) and Phallusia mammillata were collected from the wild by the marine service provided by Centre de Ressources Biologiques Marines in Roscoff, France.

Solutions
Attention: all solutions described are prepared with milli-Q quality water, unless otherwise specified.
To prepare 1 L of ASWH, dissolve 29.22 g NaCl, 0.67 g KCl, 1.11 g CaCl 2 , 2.33 g MgCl 2 , 2.95 g MgSO 4 , and 0.18 g NaHCO 3 in 1 L of water. Next, add 1 ml of HEPES. Sterilize through 0.2-µm PES membrane and add 0.05 g/L of gentamycin. Comments: ASWH may be kept at 18 • C to use it with ascidian embryos.
• Filtered seawater: filter natural seawater through bottle top filter with 0.45-µm PES membrane.
Comments: filtered seawater may be kept at 19 • C to use it with B. lanceolatum embryos.
Comments: 1-ml aliquots can be long term stored at −20 • C. The stock solution must be diluted 1:100 with distillated water to reach 10×.
Tips: IGEPAL CA-630 is a detergent difficult to dissolve. Use a platform rocker or vertical rotator to help to dissolve the detergent. Some hours are needed to obtain a homogeneous solution.
Comments: 20% IGEPAL CA-630 is stored for 1 week at room temperature.
To prepare 10 ml of tagmentation buffer, mix 100 µl of Tris-HCl 2 M pH 7.5, 100 µl MgCl 2 1 M, and 6 ml of water. Finally, add 2 ml N,N-dimethylformamide and increase the volume up to 10 ml with water. Sterilize the solution by filtration using a 0.2-µm cellulose acetate syringe filter. Tips: TAPS-NaOH 2 M, pH 8.5, can be used as an alternative to Tris-HCl to prepare the buffer. Comments: 2× tagmentation buffer may be kept at −20 • C no longer than 6 months.
Comments: this solution may be kept at −20 • C for longterm storage.
C. intestinalis and P. mammillata C. intestinalis and P. mammillata genome are available for users in http://www.aniseed.cnrs.fr/ ( Table 1). For the analysis described in the results section, Apr. 2011; Kyoto KH/ci3 Ciona robusta was used. Genome and the transcription start site (TSS) information were downloaded from UCSC Genome Browser 10 .

Procedure
In vitro Fertilization

B. lanceolatum samples
To obtain B. lanceolatum eggs and fertilize them, follow the protocol previously described (Fuentes et al., 2004(Fuentes et al., , 2007. Around 30 min after fertilization, collect the embryos in Petri dishes coated with a thin layer of 0.8% agarose and filled with filtered seawater. Dechorionate embryos manually by pipetting them toward the border of the dish. Incubate embryos at 19 • C. Check the development of the embryos 3 h after fertilization. Gently transfer healthy embryos to a new Petri dish with filtered seawater using tips with cut ends, previously incubated for several hours in filtered seawater.

C. intestinalis and P. mammillata samples
To obtain ascidian embryos, incubate eggs from both species with activated sperm for 8 min. Phallusia eggs needed to be dechorionated prior fertilization by incubation in 0.1% trypsin in ASWH, while Ciona eggs were dechorionated after fertilization (Christiaen et al., 2009). Keep developing embryos at 18 • C in ASWH in a Petri dish coated with a thin layer of 1% agarose melted in ASWH.

Collection of Embryos
We recommend starting ATAC-seq experiments with 100,000-180,000 cells for amphioxus and tunicate species. Use half of 7 http://www.pantherdb.org/ 8 https://www.r-project.org/ 9 http://www.htslib.org/ 10 https://genome-euro.ucsc.edu/cgi-bin/hgTables?hgsid=231102165_ vTWE7h7DGoTv0QlthOAGgKbTLi14 the cells (50,000-90,000 cells) for the actual tagmentation step and use the rest of the cells to estimate the number of nuclei in each sample. Optimizing the number of cells to use for the transposition is crucial, since it strongly affects library quality. In fact, using too few cells per experiment leads to overtransposition, while using too many cells produces incomplete transposition and over-sized DNA fragments.

B. lanceolatum samples
Keep embryos at 19 • C until they reach the desired developmental stage. Take the appropriate number of embryos according to their developmental stage ( Table 2) and transfer them to a 1.5 ml microtube.

C. instestinalis and P. mammillata samples
(i) Keep Ciona and Phallusia embryos at 18 • C until they reach the desired stage (Hotta et al., 2007). Collect embryos from the Petri dish and transfer them to a 1.5-ml microtube with 200 µl of ASWH at the bottom. To get a sufficient number of cells, we estimated that the number of dechorionated embryos in a packed volume of 100 µl was in the order of 10,000-20,000. Table 3 shows the volume of embryos collected depending on the developmental stage.
Attention: avoid air bubbles while pipetting since exposure of embryos to air dismantles embryonic structures.
(ii) Swirl the dish to group embryos, collect them using a glass Pasteur pipette (see Note 1), and transfer them to the 1.5-ml microtube with ASWH.
Attention: from here, keep samples on ice, and use gloves and filter tips.

B. lanceolatum samples
(i) Centrifuge the sample at maximum speed at 4 • C for 1 min and carefully discard supernatant.  (ii) Add 50 µl of cold lysis buffer and lyse the cells by pipetting up and down with P200 micropipette. For optimal cell lysis, proceed immediately to next step. (iii) Transfer 8 µl of lysate in three new 1.5-ml microtubes and centrifuge them at 500 × g at 4 • C for 10 min. During this 10 min, use the remaining 26 µl to count the cells for transposition: add 2.6 µl of 10× DAPI to 26 µl of the remaining sample and place 10-12 µl of the mix in a cell counting chamber and count cells in a compound fluorescence microscope using UV light. (iv) Remove supernatant carefully. Use the amount of sample needed to reach the optimal number of cells for the assay and continue with tagmentation step.

C. instestinalis and P. mammillata samples
Immediately place the microtube in a cold centrifuge and spin it at 500 × g and 4 • C for 5 min to pellet the embryos. Remove the supernatant carefully.
(i) Wash twice with 200 µl of ASWH by spinning down the cells at 500 × g and 4 • C for 5 min. (ii) Resuspend the pellet in 100 µl of cold lysis buffer and pipet up and down with a P200 micropipette. The lysate should be clear. Lysing Ciona and Phallusia embryos by pipetting can be difficult at early stages like 16-cell and 32-cell stages due to the large amount of yolk per cell. This may take around 3 min of vigorous pipetting with a P200 micropipette. For optimal cell lysis, incubate the cells no longer than indicated and proceed immediately to the next step. (iii) Transfer 50, 20, and 10 µl of the lysate into new 1.5ml microtubes and centrifuge them at 500 × g at 4 • C for 10 min. Meanwhile, use the remaining 20 µl to count the cells to estimate the number of cells to use in the transposition reaction: add 2 µl of 10× DAPI to 20 µl of the remaining sample. Load the mix in a cell counting chamber and count cells in a compound fluorescence microscope using UV light. (iv) Remove supernatant carefully with a micropipette. Use the aliquots required (50, 20, and 10 µl) to reach 50,000-90,000 nuclei required in the tagmentation step.
Attention: at this step, the nuclear pellet is hard to see. Avoid touching the bottom of the microtube, where the nuclei are pelleted, to not remove them.
The following procedures can be applied to all samples, regardless of species.
(4) Purify fragmented DNA by using the Qiagen MinElute PCR purification kit, following the manufacturer's instructions. Firstly, add 2.5 µl of 3 M sodium acetate pH 5.3 to the sample (see Note 2) prior to purification, then continue with the instructions. Elute tagmented DNA in 20 µl of Elution Buffer (see Note 3).
Comment: purified DNA fragments can be stored at −20 • C or immediately subjected to the library preparation step (Steps 5 or 6).

Library Preparation
(5) (Optional step) Determine the optimal PCR cycle number for the final ATAC library preparation. The optimal number of cycles is the Ct number obtained performing a qPCR plus one cycle more. Ct number is increased one cycle more to scale reaction and keep ATAC library preparation in exponential phase (see Notes 4, 5).

Quality Controls
(9) Use 1 µl of amplified DNA library to measure DNA sample concentration using Qubit dsDNA BR Assay kit with 500-µl thin-walled tubes. Follow manufacturer's instructions (see Note 6). (10) Run 2-5 µl of the amplified DNA library on 2% agarose gel to check tagmentation. A smear covering 100 bp to 1 kb should be observed. Depending on the quality of the library, besides nucleosome-free fragments (<200 bp), a mono-(200 bp) and di-nucleosome (400 bp) band pattern may be observed.

DNA Library Sequencing
(11) The required sequencing depth for the generation of a genome-wide profile with good footprinting signal depends on the genome size of the organism under study. In our experiments, we have used between 40 and 100 million paired-end reads per DNA library.

Data Analysis
The data used was obtained from B. lanceolatum (GSE106428), C. intestinalis (PRJNA474983), and P. mammillata datasets (PRJNA474750). After sequencing, align paired-end reads without adapter sequences against the reference genome using Bowtie2 (Langmead and Salzberg, 2012) with -X 2000 -nomixed -no-unal parameters. This procedure allows the retention of reads that are separated <2 kb. Then, remove PCR artifacts and duplicates using the tool rmdup, available in the Samtools toolkit (Li et al., 2009). In order to detect the position where transposase binds to the DNA, corresponding to accessible chromatin, read start sites need to be offset by +4 or by -5 bp in the plus or minus strand, respectively. Select read pairs that have an insert <130 bp, since they correspond to nucleosome-free reads. Next, generate BigWig files using genomecov from Bedtools (Quinlan and Hall, 2010) and wigtoBigWig tool from UCSC. These files can be uploaded to a genome browser, in order to explore the data. Call peaks using filtered read files using the MACS2 tool (Zhang et al., 2008). The parameters of this peak calling are -nomodel -shift -45 -extsize 100 and the genome size of the correspondent organism. For each of these peak callings, select the first 500,000 peaks ranked by p-value. Use the irreproducible discovery rate (IDR) framework (Li et al., 2011) for the identification of high confidence peaks based on replicate information, with the following parameters: -inputfile-type narrowpeak -rank p.value -soft-idr-threshold 0.1 and the peak calling coming from MACS2. Select high confidence peaks with and IDR global value of 0.01 for further analysis.
The proportion of reads inside peaks can be calculated with intersectBed from Bedtools (Quinlan and Hall, 2010), using the high confidence set of peaks as reference and the filtered reads as query, using the parameter -c.
Use the R package DESeq2 (Love et al., 2014) to assess differences between the accessibility of peaks between wt and a test condition. For each comparison, merge all the high confidence peaks of each condition that is going to be analyzed. Then, compute the number of reads inside the peaks for each experimental replicate. Select those peaks that show a p-value under 0.05 as differentially accessible peaks. Then, differentially accessible peaks can be associated with their putative target genes using "Genomic Regions Enrichment of Annotations Tool" (GREAT; McLean et al., 2010) or the closestBED tool (Quinlan and Hall, 2010) utility from BEDtools. For the analysis described in the section "Results, " we used the latter one with the parameters -D ref -iu -non-amecheck -k 1. The consecutive gene ontology (GO) analysis can be performed with PANTHER (Mi et al., 2019). Finally, to find TFBM enrichment in ATAC-seq peaks use the script FindMotifsGenome.pl from HOMER software (Heinz et al., 2010), selecting the set of desired TFBMs with the parameter -mset. As background model, merge high confidence peaks from all ATAC-seq datasets.

ANTICIPATED RESULTS
ATAC-seq assays have been performed in wt embryos of amphioxus and tunicates at different developmental stages mentioned in Tables 2, 3, respectively. ATAC-seq data can be easily visualized in a genome browser. For instance, ATACseq tracks generated in amphioxus and the equivalent stage in tunicate embryos are shown in Figure 1B. Interestingly, the ATAC-seq signal in the six3 region identified discrete regions in all three cases. There are more prominent accessible chromatin regions upstream of the six3 gene rather than downstream. In all cases, an ATAC signal is overlapping the TSS or promoter region of the six3 gene. The number of peaks per experiment and the percentage of mapped reads in these peaks are shown in Table 4. In B. lanceolatum samples, the number of peaks was between 16,000 and 48,000, while in both tunicate species, it was between 3,000 and 10,000. The percentage of nucleosome-free reads found in peaks was quite variable in all samples of the three species, but similar between most replicates.
We also provide an example of using ATAC-seq for the identification of CREs controlled by the Wnt signaling pathway. For that, we used recently generated ATAC-seq libraries from Ciona embryos at the 112-cell stage as a control and embryos treated with 4 µM CHIR-99021, a GSK3 inhibitor acting as an activator of the canonical Wnt pathway (Madgwick et al., 2019). Here, we showed further bioinformatic analysis that allows the identification of differential ATAC-seq peaks between control and treated embryos, the TFBMs enrichment in those peaks, and the GO of the associated genes.
By genome-wide searching for peaks that are differentially accessible between untreated and treated embryos upon Wnt activation, we find 57 more accessible and 40 less accessible ATAC-seq peaks. In fact, we identified two discrete regions in the nkx2.1 locus, which were significantly affected by the treatment (Figure 2A). GO analysis of the genes associated with more accessible chromatin regions underlined the role of Wnt activation in embryo development, as expected. In contrast, genes associated with less accessible ATAC-seq peaks were mostly related to metabolic processes ( Figure 2B). Our GO analysis was consistent with differentially expressed gene analysis obtained with another GSK3 inhibitor, known as BIO (Wu et al., 2015). Enrichment analysis, used to predict potential TFBMs in differentially accessible ATAC-seq peaks, showed that Fox, Tcf, and Nkx TFBMs were the most represented in the set of more accessible ATAC-seq peaks ( Figure 2C). The enrichment of Tcf, which is an effector TF of the canonical Wnt signaling pathway, validating that the treatment indeed promoted this pathway. On the contrary, Gata motifs were found enriched in the set of less accessible peaks (Figure 2C), confirming the previously reported antagonism between Gata and Wnt in Ciona (Imai et al., 2016).  In orange: GO term bars related to genes associated with chromatin regions significantly more accessible in embryos where the activity of Wnt pathway was increased. In brown: GO term bars related to genes associated with more accessible chromatin regions in wt embryos. (C) HOMER results for peaks differentially represented between wt and the perturbed condition. In the orange box: known motif enrichment results of peaks significantly more represented when Wnt/β-catenin signaling pathway was increased. In the brown box: known motif enrichment results of peaks significantly more represented in wt embryos.

CONCLUSION
Here, we present a comprehensive ATAC-seq protocol that is highly efficient for the identification of cis-regulatory regions operating during embryogenesis in several marine species.
The main advantages of ATAC-seq are the short experimental time and the low amount of starting material needed. We show that this protocol nicely works in embryos from three species, B. lanceolatum, C. intestinalis, and P. mammillata. Moreover, protocols for embryo collection and for experimental procedure are described in detail. Our results pointed out the robustness of the experimental procedure in marine invertebrate embryos and the great utility of the ATAC-seq technique.
Since we have succeeded in generating data from the different species tested at diverse developmental stages, we consider this protocol applicable to other species and/or stages. Indeed, in our laboratory, we have successfully applied this method on two different echinoderms and one hemichordate species. Additionally, we present some bioinformatic tools that can be used to analyze and visualize ATAC-seq data ( Figure 1B) and, more importantly, identify regulatory regions that respond to alterations in signaling pathways (Figure 2). Our work demonstrates that performing ATAC-seq in embryos treated with drugs that perturb signaling pathways is a powerful strategy for the genome-wide identification of CREs mediating the activity of different signaling pathways. Indeed, our laboratory is currently successfully applying this approach to study multiple signaling pathways in several invertebrate and vertebrate species. Therefore, one of the major goals of this protocol is to facilitate and encourage the community to use this approach to study their favorite systems and pathways.

Notes
(1) Glass Pasteur pipettes are previously dipped in milli-Q water overnight in order to avoid embryos sticking to the glass. (2) 3 M sodium acetate, pH 5.3, decreases the pH of the samples. This helps to recover a high quantity of DNA fragments during the purification step, according to the MinElute PCR purification kit manufacturer's instructions. It is important to not add pH indicator in the binding buffer, because it interferes with downstream sequencing. (3) Warming the Elution Buffer at 37 • C before use is recommended. (4) To obtain good quality ATAC-seq libraries in this work, we determined that 13-15 PCR cycles were optimal. However, to avoid experimental variations and PCR artifacts, it is recommended to determine the optimal PCR cycle number in every experiment independently. (5) A complete list of PCR primers complementary to Nextera adaptors is available in Supplementary Table 1 of Buenrostro et al. (2013). In order to multiplex different samples in the same sequencing lane, variable barcoded reverse primers need to be used. (6) A minimum concentration of 20 ng/µl is recommended for next-generation sequencing.

DATA AVAILABILITY STATEMENT
All datasets generated for this study are included in the article/supplementary material.

AUTHOR CONTRIBUTIONS
SJ-G, SB, HE, and JG-S developed the protocol for B. lanceolatum samples. MM, PL, and JG-S designed the protocol for C. intestinalis and P. mammillata, and conceived and designed experiments in C. intestinalis embryos. MM and AM performed the ATAC-seq assays in Ciona and Phallusia embryos. SJ-G and MM executed the bioinformatic analysis of B. lanceolatum, C. intestinalis, and P. mammillata ATAC-seq data. MM performed the computational analysis of differential ATAC-seq data of Ciona and data interpretation, and wrote the draft of the manuscript. SJ-G wrote the sections related to the ATAC-seq assays in B. lanceolatum samples. All authors contributed to the manuscript revision.  -2016-0687 to the Department of Gene regulation and morphogenesis of Centro Andaluz de Biología del Desarrollo) and by the EMBO short-term fellowship. The laboratory of HE and SB was supported by the Centre National de la Recherche Scientifique (CNRS) and the Agence Nationale de la Recherche (Grant No. ANR-16-CE12-0008-01).