A Wipe-Based Stool Collection and Preservation Kit for Microbiome Community Profiling

While a range of methods for stool collection exist, many require complicated, self-directed protocols and stool transfer. In this study, we introduce and validate a novel, wipe-based approach to fecal sample collection and stabilization for metagenomics analysis. A total of 72 samples were collected across four different preservation types: freezing at -20°C, room temperature storage, a commercial DNA preservation kit, and a dissolvable wipe used with DESS (dimethyl sulfoxide, ethylenediaminetetraacetic acid, sodium chloride) solution. These samples were sequenced and analyzed for taxonomic abundance metrics, bacterial metabolic pathway classification, and diversity analysis. Overall, the DESS wipe results validated the use of a wipe-based capture method to collect stool samples for microbiome analysis, showing an R2 of 0.96 for species across all kingdoms, as well as exhibiting a maintenance of Shannon diversity (3.1-3.3) and species richness (151-159) compared to frozen samples. Moreover, DESS showed comparable performance to the commercially available preservation kit (R2 of 0.98), and samples consistently clustered by subject across each method. These data support that the DESS wipe method can be used for stable, room temperature collection and transport of human stool specimens.


INTRODUCTION
There is a complex interplay between diet, the microbiome, and the metabolome that plays a central role in human biology and health (1)(2)(3)(4)(5). The detection and identification of biomolecules produced in microbial communities from samples is widely used for monitoring disease and aspects of overall health (6). Microbial surveillance of the gut is necessary to capture the relationship between diet, the microbiota, and microbially produced metabolites. The dynamic nature of the taxonomic communities with respect to dietary influences requires collection methods that can be easily and quickly utilized by the subjects for increased compliance and longitudinal at-home profiling. Stable transportation and delivery of biomolecules is also critical for the analysis of these samples. As such, low cost and efficient collection, storage, and delivery of biomolecules are critical for studies of the diet-microbiome relationship and the field of medical diagnosis.
For human gut microbiome analysis, recent advances in sequencing techniques and bioinformatics have increased our knowledge of the complex microbial communities and their interactions. It is now well established that these microbes play important roles in relation to inflammation (7), metabolic disease (8,9), mental disorders (10,11), aging (12,13), and several other diseases and health conditions (14)(15)(16)(17)(18). However, different approaches to sample processing can introduce human error variability or technical biases through inappropriate sample handling or storage. For example, fecal microbiota sequencing profiles have been shown to change significantly during ambient temperature storage after 48 hours (19,20). While performing nucleic acid extraction on fresh samples immediately after collection is impractical, freezing and storing samples (without using stabilization buffer) at −20°C, −80°C, or below, is considered to be standard practice when preserving microbial composition for sequence-based analysis at clinical settings (21)(22)(23)(24). However, this is difficult to achieve in many situations, such as sampling in remote areas, and thus may dramatically increase the costs of such studies. While some studies have investigated in detail the rapid deterioration of fecal samples that have been stored at room temperature for several days prior to lab processing (19,(25)(26)(27), there are few methods to address such issues.
Previously, authors have examined various collection methods and stabilization reagents (i.e., methods that include: no additive, 95% ethanol, RNAlater Stabilization Solution, fecal occult blood test cards, and fecal immunochemical test tubes). And although the stability of all these methods and the technical reproducibility were relatively high, the intraclass correlation coefficients were below 0.6 for metrics related to relative abundance (28). In another comparative analysis of the following collection methods: immediate freezing at −20°C without preservative, OMNIgene GUT, 95% ethanol, RNAlater, and Flinders Technology Associates (FTA) cards, the authors concluded that although all methods were comparable to immediate freezing without preservative, there were differences in gut microbiome metrics and specific species abundances (29).
In addition to these technical limitations and potential loss of data analysis accuracy after sequencing, the stool specimen collection itself can be challenging, and many individuals find the process difficult and not user-friendly (30), limiting adoption of microbiome testing and reducing the amount of relevant data that is collected to link observations to diet and phenotypes such as immune responses. Challenges include embarrassment, fear of results, concerns around hygiene and contamination, discretion and privacy, and lack of information. A 3-year randomized trial of 997 participants found that discomfort with the collection of a stool sample is the most frequently cited barrier for participation in fecal test-based screening. Furthermore, the study found that having a choice of screening methods significantly increases (13% vs. 43%) patient adherence (31).
A 2016 study, spanning 15 individuals and over 1,200 samples, provided the most comprehensive view to date of storage and stabilization effects on stool. It suggested that 95% ethanol can preserve samples sufficiently well at ambient temperatures for periods of up to 8 weeks, and includes the types of variation often encountered under field conditions, such as freeze-thaw cycles and high-temperature fluctuations. In addition, a solution containing dimethyl sulphoxide, disodium EDTA, and saturated NaCl (DESS) was originally used for various applications in the preservation entire soil/sediment samples or as a storage medium for microbial community analysis (32)(33)(34). Such preserved material can be easily stored for months at room temperature and provide an efficient, cost-effective method with widespread applications for microbiome studies.
To address such technical errors and biases in sample collection methods, as well as to enhance the user experience of stool sample collection, we have designed a practical and userfriendly fecal sample collection kit that includes a dissolvable wipe (e.g., ethanol-soluble film), in which the biological sample is dissolved in a DNA stabilizing solution (DESS). The film, solution, and biological sample are disposed of in a sealable container. Therefore, people can collect their fecal samples as easily as using toilet paper after defecation. The DNA stabilizing solution ensures that the microbiomic community structure is well-preserved during ambient temperature transportation and storage, and the microbiome DNA is extracted from the fixed microbes and used for further laboratory analysis when the samples arrive at the laboratory. In addition, the dissolvable characters of the wipe ensure that all the microbes contact the stabilizing solution adequately when the sample is collected. The primary objective of this study is to assess the extent to which our novel approach to fecal sample collection and stabilization could maintain microbiota composition compared to the following methods: immediate freezing (-20°C), preservation with a commercially available kit, and storage at room temperature (RT).

Study Summary
The study was set up to directly compare the quality of metagenomic sequencing of samples collected by the microbiome wipe against a current commercially available approach, positive controls (snap frozen at -20°C), and negative controls (room temperature without any stabilization) ( Figure 1). Six subjects were recruited to participate in the study to validate the wipe capture method. Two males and four females enrolled in the study (Supplemental Table 1). The median age was 42 ± 10.8. Three subjects were white, non-Hispanic and three subjects were black, non-Hispanic. The average BMI across the cohort was 27.8 ± 7.6 kg/m 2 . Four preservation methods were used to process the samples for metagenomics sequencing: freezing (-20°C), room temperature storage, a commercial preservation kit (Zymo DNA shield; room temperature), and DESS DNA preservation (room temperature). Three replicates per subject for each preservative were collected for a total of 72 Hua et al. samples. A total of 71 samples were successfully sequenced with an average of 8 million sequencing reads and a range of 3 to 38 million reads. One sample from the commercial preservation kit cohort was removed from the analysis due to sequencing failure (no library). The amount of DNA captured by wipe was found to be comparable to other collection and extraction methods. The average DNA yields from extraction were 98.2 ± 49.1ng/uL for DESS, 44.6 ± 42.7ng/uL for the commercial preservation kit, 286 ± 175.8ng/uL for the -20°C samples, and 122.3 ± 95.1ng/uL for the room temperature samples (Supplemental Table 2).

Sample Similarity
A t-SNE analysis across sample types and subjects showed clear clustering by each individual across the cohort, wherein each subject was isolated and separated from one another based on their unique microbiome signature ( Figure 2). Interestingly, wipe samples in the DESS clustered more closely with the frozen samples and the commercial preservation kit's samples. Meanwhile, in most subjects, the negative control samples that were stored at room temperature cluster together separately from the other preservation types. Supplementals Figures 1, 2 further demonstrates this finding in a dendrogram and PCoA plot respectively, showing clustering by subject and divergence of room temperature samples compared to the other sample types.

Taxonomic Profiles
Taxonomic assignments of reads to each domain of life were then examined for their relative distributions across the sample types. As expected, Bacteria was the predominant domain (>99% relative abundance) captured by the microbiome analysis across all subjects ( Figure 3A). Subject 1 had some more hits to Archaea (<l%) than others, Subjects 4 and 6 had some samples with Eukaryota hits, and Subject 5 had some samples with viral hits (most <1%) ( Figure 3B). Commensal gut flora including Firmicutes, Bacteroidetes, and Actinobacteria, were the top phyla across all samples, with some room-temperature samples also having Proteobacteria, particularly in Subject 4. Overall, no significant changes were found in phylum, class, order, family, genus ranks for any of the collection methods compared to the -20°C samples after correcting for subject and taxa in question (linear multivariate model, p > 0.5 for all collection methods). On species level only the room temperature samples showed significant difference compared to the frozen sample (p = 0.033), with the wipe sample and commercial sample not showing any significant differences (p > 0.5 for both). Generalized unifrac distance within the subject's -20°C samples (intra-group) to their samples collected by the wipe or commercial kit (inter-group) were not statistically significantly different (p > 0.1), whereas the room temperature samples had significant differences (p = 2.18x10 -18 ).
Figures 3C, D highlight the correlation of the subject's microbiome profile across different domains comparing wipe DESS preparation to frozen and the commercial kit to frozen, respectively (p < 2.2x10 -16 for the sample Pearson correlation based on a t distribution). There is an increased relative abundance of human DNA seen in wipe samples compared to the commercial kit ( Figure 3C), however, this is expected due to increased skin contact with the wipe and still negligible compared to the predominance of reads matching to bacteria ( Figure 3A).

Diversity Metrics
The metagenomic data were then examined for two metrics of species diversity (Figure 4). The Shannon index metric showed a similar range (2.7-3.8) across all sample types, but the wipe in DESS (median 3.2) had more comparable levels to the commercial preservation method (3.1) and gold standard frozen samples (3.2), than the room temperature storage (2.9). There were no statistically significant differences between the Shannon diversity of the -20°C samples to the wipe or the commercially available kit samples (p > 0.5), with the room temperature samples showing significantly decreased entropy (p = 0.013). The median species richness (151-159), however, was more comparable across all preservation techniques ( Figure 4B) and there were no significant differences between the -20°C samples and the other collection methods (p > 0.3 for all groups). Supplemental Table 3 has a statistical summary of these diversity metrics including means and standards of deviation.

Intra-and Inter-Sample Comparisons
Taxonomic profiles comparing the different preservation methods showed that DESS has a very strong correlation with the frozen samples and compares favorably to the currently used commercial kit ( Figure 5). The Pearson correlation of taxa log abundance with intragroup and inter-group comparisons. DESS was found to be very similar to the -20°C frozen samples when considering replicate-toreplicate variability (frozen-to-frozen correlation = 0.92, frozen-to-DESS correlation = 0.91) ( Figure 5A). Furthermore, Figures 5B, C highlight Pearson correlations calculated by median log10 relative abundances and median HUMAnN functional pathway scores, respectively (Supplemental Figures 3, 4). These analyses demonstrate strong intra-sample correlation across the wipe samples (p < 2.2x10 -22 ), as well as strong inter-sample correlation between the wipe and frozen samples (p < 2.2x10 -22 ). These correlations are even comparable to the correlation found between the commercial preservation and frozen samples ( Figures 5B, C). Figures 5D, E highlight each subject's taxonomic relative abundance comparing wipe to frozen and commercial preservation to frozen samples respectively. They further demonstrate that the wipe in DESS preservation method has a high positive correlation with the taxonomic profiles of the gold standard frozen preservation, and is comparable to the commercial preservation method.

DISCUSSION
This study demonstrates the use of a wipe-based capture method to collect stool samples for microbiome/metagenomics taxonomic and functional metabolite analysis, making collection from large numbers of people simpler and more user-friendly. The DESS wipe preservation method showed comparable performance to a commercial DNA/RNA preservation kit, and was also very similar to the gold standard frozen samples for most metrics, namely the taxonomic classification abundances had strong correlation (R 2 > 0.96, p = 2.2x10 -22 of the species abundances), functional pathway classification (R 2 > 0.99, p = 2.2x10 -22 of the pathway scores), and had no significant differences of alpha diversity (p > 0.5) nor beta diversity (p > 0.1, generalized unifrac distance of wipe to the frozen sample, compared against the intra-group variability in the frozen samples). Both the DESS and the commercial preservation method showed significantly higher Shannon diversity compared to the room temperature negative control (p < 0.05).
Although the quality and significance of standard microbiome metrics are comparable across the wipe method, gold standard, and other commercial methods, further validation and better understanding of the bacterial to human DNA ratio in a broader population can be addressed in future studies. This will involve including non-healthy subjects such as samples from people with gut conditions (i.e. bloody diarrhea, IBS, blood in the stool, colorectal cancer, hemorrhoids, etc.) where human DNA is more present in the stool (35,36), to further assess the performance of the wipe and integrity of microbiome analysis. Recruiting subjects with GI conditions such as constipation and diarrhea will further test the efficacy of the wipe. Moreover, subjects with different disease statuses and infections will be important to test, specifically patients with irritable bowel syndrome, IBD, Clostridium difficile infection, etc. There are also other collection methods and  preservation techniques to compare with the wipe method for future studies (37). Finally, RNA preservation and isolation for metatranscriptomics analysis poses its own set of unique challenges (38) and future studies will be needed to assess the wipe capture in DESS preservation for RNAseq analysis. This study provides validation evidence for a wipe-based collection and RT transport method for gut microbiome sampling and metagenomics sequencing analysis. Such a method may enable easier access to sampling, testing, and metagenomics implementation in clinical trials, home use, or even in remote environments, especially given the stability of the method when shipped at room temperature. Indeed, wipe-based collection and processing offers a more user-friendly approach to collecting stool samples for microbiome analysis. Its ease-of-use design and simple instructions (just wipe and place into the tube) should enable easy integration with commercial stool collection kits and future biomedical studies and trials. Importantly, simplified collection protocols that eliminate tasks that most people do not like (such as scooping their own feces for sampling) provides the opportunity to greatly increase adoption of microbiome testing in clinics and at-home testing. Increasing adoption generates enhanced data resources to the scientific community for discovery. Indeed, tools and methods such as these can be applied to help deploy metagenomics analysis approaches for a wide range of both research and clinical applications to learn more about the interplay between  diets, microbiota, bacterial metabolites and host to elucidate the role this complex system plays in intestinal health and disease.

Study Design
A total of 6 healthy subjects were enrolled in this pilot study. The inclusion criteria to be enrolled in the study included: age >18; Bristol Stool Scale type 3 and 4 (normal), agree to collect and donate the feces, and the ability to understand and write English. Exclusion criteria included people with constipation, slightly dry, or diarrhea feces (Bristol Stool Scale types 1-2, 5-7), pregnant or breastfeeding females, history of alcohol, drug, or medication abuse, known allergies to any substance in the study product, current diagnosis of inflammatory bowel disease (Crohn's Disease or Ulcerative Colitis), and currently taking any medication that may interfere with defecation.

Sample Collection and Processing
Fecal collection kits were created and mailed to enrolled subjects with clear instructions on sample collection. A total of 12 samples were collected by each subject yielding a total of 72 samples to be processed. Four preservation methods were used to process the samples for metagenomics sequencing: freezing (snap frozen at -20°C), room temperature storage, Zymo DNA shield kit (room temperature), and DESS DNA preservation (room temperature) (Supplemental Table 2). The DESS wipe kit utilizes a sterile DESS solution made up of 0.25M disodium EDTA pH 8.0, 20% dimethyl sulfoxide (DMSO), and saturated sodium chloride (NaCl). We have utilized DESS because in previous microbial community analyses, DESS has been shown to preserve the structure of microbial communities better than other preservatives [Ref. 1 belo (32). In another study, although all preservation methods examined for bacterial community sampling were biased towards G+C DNA rich microorganisms, DESS (within the liquid-based category) outperformed the cardbased methods (33). Moreover, studies have demonstrated that the constituents of DESS including DMSO and EDTA are frequently used in the deactivation of microorganisms (36). The wipe sheet is a polyvinyl alcohol film, 8cm x 8cm, similar size as/a little bit smaller than a piece of toilet paper. The volume of preservative is 30 mL. We advised participants to ensure that the size of each sample is around 1 peanut size and shake well until the wipe and stool particles are fully dissolved/dispersed. All the samples are shipped at room temperature except the samples meant for freezing which were shipped on dry ice. It took several days to up to a week for the shipment. When the samples arrived in the lab, all the samples were put into a fridge (4°C) except for the frozen samples which were put in a -20°C freezer.  Trimmomatic (39), and then aligned to human genome reference using bwa (40). Taxonomic annotation was performed by utilizing KrakenUniq (41) and subsequently Bracken (42) on a database that includes all bacterial, archaeal, viral, fungal references from RefSeq along with human reference. The lowest common ancestor taxonomic annotations were adjusted within the lineage until at least 10% of the unique k-mers belong to a specific clade and not its parent, then filtered for at least 10 reads and a minimum Bracken adjusted relative abundance of 0.005%. Pearson correlation was calculated as taking the log abundances of the species (or other relevant ranks) and comparing these between two samples. Summary Pearson correlation coefficients were calculated by taking the median of all groupwise comparisons of the minimum Bracken different replicates across the different collection methods, these were calculated within each subject individually. The functional annotations for genes were performed by using HUMAnN3 with the UniRef90 clusters, and summarized as MetaCyc v19.1 pathways by HUMAnN3 (43). The alpha diversity analysis was performed using Shannon entropy and species richness on the Bracken results of each replicate. Beta diversity was calculated as generalized unifrac distance between the samples (44). Statistical analyses of beta diversity were performed by comparing the withingroup distance to inter-group distances. The distance matrix from generalized unifrac metric were fed into t-distributed stochastic neighbor embedding (t-SNE) to project the samples into a 2 dimensional space (45).

IRB
The study protocol was approved by the Institutional Review Board of the University of Maryland Baltimore (Protocol # HP-00087571). The study participants provided their written informed consent before enrolling in the study.

DATA AVAILABILITY STATEMENT
All shotgun metagenomics data associated with the study are deposited into NCBI Sequencing Read Archive under the project accession PRJNA824068.

ETHICS STATEMENT
The study protocol was approved by the Institutional Review Board of the University of Maryland Baltimore (Protocol # HP-00087571). The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
HH, CM, BZ, JD, and CM designed the study. HH, BZ, and CD'A coordinated the IRB and subject recruitment. HH, CM, EA, LL, NR, NP, BZ, CM were involved in data analysis, interpretation, and preparation of the manuscript. All authors approved and contributed to writing the manuscript. All authors contributed to the article and approved the submitted version.