Challenges and insights in the exploration of the low abundance human ocular surface microbiome

Purpose The low microbial abundance on the ocular surface results in challenges in the characterization of its microbiome. The purpose of this study was to reveal factors introducing bias in the pipeline from sample collection to data analysis of low-abundance microbiomes. Methods Lower conjunctiva and lower lid swabs were collected from six participants using either standard cotton or flocked nylon swabs. Microbial DNA was isolated with two different kits (with or without prior host DNA depletion and mechanical lysis), followed by whole-metagenome shotgun sequencing with a high sequencing depth set at 60 million reads per sample. The relative microbial compositions were generated using two different tools, MetaPhlAn 3 and Kraken2. Results The total amount of extracted DNA was increased by using nylon flocked swabs on the lower conjunctiva. In total, 269 microbial species were detected. The most abundant bacterial phyla were Actinobacteria, Firmicutes and Proteobacteria. Depending on the DNA extraction kit and tool used for profiling, the microbial composition and the relative abundance of viruses varied. Conclusion The microbial composition on the ocular surface is not dependent on the swab type, but on the DNA extraction method and profiling tool. These factors have to be considered in further studies about the ocular surface microbiome and other sparsely colonized microbiomes in order to improve data reproducibility. Understanding challenges and biases in the characterization of the ocular surface microbiome may set the basis for microbiome-altering interventions for treatment of ocular surface associated diseases.


Introduction
The human ocular surface has long been thought to be sterile due to the presence of antimicrobial components in the tear film (McDermott, 2013) and the continuous motion of the eye lids (Shovlin et al., 2013). First investigations of conjunctiva swab cultures 92 years ago described 43% of samples as absolutely sterile. The predominating genus in the remaining cultures was Staphylococcus (Keilty, 1930). The idea that the ocular surface is sterile or only periodically colonized during infections has changed with the introduction of modern sequencing technologies such as 16S rRNA gene sequencing and whole-metagenome shotgun sequencing (Doan et al., 2016). A resident microbiome on the ocular surface, as found in other mucosal sites throughout the body, has been described recently (Human Microbiome Project C, 2012a; Human Microbiome Project C, 2012b). Previous studies using either 16S rRNA gene sequencing or whole-metagenome shotgun sequencing have shown that the three dominant microbial phyla of the ocular surface microbiome (OSM) are Proteobacteria, Firmicutes and Actinobacteria (Dong et al., 2011; Zysset-Burri et al., 2021), with highly variable abundances between individuals. To our current knowledge, the taxonomic composition of the OSM is influenced by environmental and demographic factors such as age (Zhou et al., 2014; Cavuoto et al., 2018b; Cavuoto et al., 2018a) but not by sex (Zhou et al., 2014; Ozkan et al., 2017; Cavuoto et al., 2018b; Cavuoto et al., 2018a; Suzuki et al., 2020). To date, the impact of DNA extraction kits on OSM data has not been explored with respect to microbial composition using whole-metagenome shotgun sequencing. Previous studies investigated the effects of swabbing technique (soft versus deep) (Dong et al., 2011), topical anesthetics (Delbeke et al., 2022) and temporal stability (Ozkan et al., 2017) on the outcome of OSM taxonomic profiling. It has been postulated that even the intraocular environment has its own microbiome in healthy and diseased eyes (Deng et al., 2021). Moreover, it has been suggested that, in addition to the tear film, eye lid motion and antimicrobial peptides, the OSM itself contributes to ocular surface health. The OSM's contribution is mediated by selective immune tolerance to commensal-specific compounds (Ueta and Kinoshita, 2010) and colonization resistance (St Leger et al., 2017), showing the importance of the OSM in maintaining a healthy ocular surface and preventing infectious and inflammatory diseases.
Low microbial biomass combined with low levels of microbial DNA on the ocular surface leads to challenges similar to those in the characterization of blood or placental microbiomes, with contamination outcompeting the biological signal (Glassing et al., 2016; Lauder et al., 2016). A bacterial density of 0.06 bacteria per human cell was measured by broad-range 16S rRNA gene qPCR in conjunctival swab samples, around 200 times less than on the buccal mucosa (12 bacteria per human cell) and > 250 times less than on facial skin (16 bacteria per human cell) (Doan et al., 2016).
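The fold differences quoted above follow directly from the reported densities. A minimal Python sketch of the arithmetic, using only the three values cited from Doan et al. (2016); the variable and function names are illustrative:

```python
# Bacteria per human cell as cited in the text (Doan et al., 2016).
density = {"conjunctiva": 0.06, "buccal mucosa": 12.0, "facial skin": 16.0}

def fold_difference(site: str, reference: str = "conjunctiva") -> float:
    """How many times denser `site` is compared with `reference`."""
    return density[site] / density[reference]

for site in ("buccal mucosa", "facial skin"):
    print(f"{site}: {fold_difference(site):.0f}x the conjunctival density")
```

The ratios come out at about 200 and about 267, matching "around 200 times" and "> 250 times".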
Even with this low biomass density, modern sequencing techniques are able to determine microbial compositions, but they suffer increasingly from contamination as the microbial concentration decreases. This was shown by Brandt and Albertsen (2018), who analyzed pure water spiked with Escherichia coli at varying concentrations by 16S rRNA sequencing; samples with 10 bacterial cells/ml showed 8% contaminating sequences. To circumvent these issues, we tested the following settings: sample collection using two different swab types (standard cotton versus flocked nylon swabs), isolation of microbial DNA by two different DNA extraction kits, quantification of bacterial 16S rRNA gene DNA with qPCR, an increased sequencing depth compared to earlier studies and inclusion of positive and no template controls (NTC) for each individual step of the pipeline. The two DNA extraction kits were chosen due to their use in previous studies (Omega; Zysset-Burri et al., 2021) and their availability, combination of mechanical and enzymatic lysis, enzymatic host DNA depletion and low input material (Qiagen). While enzymatic host DNA depletion may not be the most efficient method to deplete host DNA compared to paramagnetic beads and DNA methylation traps (Ganda et al., 2021), a more than three- to ten-fold increase in relative bacterial DNA was observed (Marotz et al., 2018; Heravi et al., 2020). The main goal of this study was to compare the impact of different sampling and analysis methods on the overall structure of the OSM, while the described species themselves may not be of clinical relevance.

Recruitment and study design
This study was approved by the Ethics Committee of the Canton of Bern (ClinicalTrials.gov: NCT04658238). The procedures followed the principles of the Declaration of Helsinki and the International Ethical Guidelines for Biomedical Research involving Human Subjects. Each study subject was informed about the procedures and purpose of the study and signed informed consent before study enrollment. All study subjects were enrolled at the Department of Ophthalmology of the Inselspital, University Hospital Bern in Bern, Switzerland. Inclusion criteria consisted of willingness to sign an informed consent and an age of 18 years or older. Subjects were excluded if they were not willing or not able to sign an informed consent, were younger than 18 years of age, had received systemic or topical antibiotics in the last three months, were using medical eye drops or had undergone ocular surgery within the last three months. The recruited study cohort consisted of six females with a mean age of 48 years (SD = 11.46) at sampling.

Sample collection
Ocular swab samples were collected from six subjects with healthy eyes at four time points over the course of two weeks for each DNA extraction kit. The first and third samplings were performed with flocked nylon swabs (FLOQSwabs #518CS01, Copan, Brescia, Italy), the second and fourth samplings with standard cotton swabs (Catalogue #1501256, Applimed SA, Chatel-St-Denis, Switzerland). A gap of two to three days was kept between samplings to allow recovery (Figure 1). The ocular surfaces of participants were anesthetized with one drop of Tetracaine 1% solution per eye (Tetracaine 1% SDU Faure, Théa, Clermont-Ferrand, France). Conjunctiva swabs were taken by swabbing over the lower conjunctiva three times, counter-rotating the swab to the direction of movement. After conjunctiva swab collection, the Meibomian glands of the lower eye lid were expressed from caudal to cranial using a sterile cotton swab before the lid swab collection. Lower lid swabs were collected by swabbing three times over the lower lid, counter-rotating the swab to the direction of movement to increase the contact area between swab head and eye lid.

DNA extraction
Whole DNA was extracted with two different kits, the QIAamp DNA Microbiome Kit (51704) from QIAGEN (Hilden, Germany) and the E.Z.N.A. MicroElute Genomic DNA Kit (D3096-02) from Omega Bio-Tek (Norcross, USA). These two kits are further referred to as Qiagen and Omega, respectively.
Omega and Qiagen DNA extractions were performed according to the provided protocol for swabs with minor changes (Supplementary material S1). The Qiagen extraction kits were further split into Qiagen1 and Qiagen2 in data analysis, representing two different batches of kits with two different lot numbers. In extractions with the Qiagen1 kit, the enzymatic host DNA depletion did not work as well as intended. Therefore, this sampling was repeated with the same subjects and same sampling methods approximately four months after the initial Qiagen1 sampling and seven months after the Omega sampling, receiving the name Qiagen2.
DNA extraction eluates were measured on a spectrophotometer (NanoDrop 1000, Thermo Scientific, Waltham, Massachusetts, USA) to assess solvent and salt contaminations prior to storage at -20°C.
DNA from 18 aliquots of a positive control with known bacterial composition (ZymoBIOMICS Microbial Community Standard, D6300, Irvine, CA, USA) was extracted using both DNA extraction kits with the same protocol used in OSM DNA extractions. Additionally, eluates from these microbial community standard DNA extractions were diluted to the same DNA concentration found in OSM samples. Dilutions were made with the elution buffer of the Omega or Qiagen kit, respectively. This provided information about the accuracy of the sequencing pipeline at low DNA concentrations.

Quantification of bacterial DNA
Bacterial DNA content was assessed using qPCR of the 16S rRNA gene. The primers used in qPCR were taken from Galazzo et al., 2020 (primer pair 16S-341_F CCTACGGGNGGCWGCAG and 16S-805_R GACTACHVGGGTATCTAATCC). As a master mix, iTaq Universal SYBR Green Supermix (Bio Rad, Hercules, CA, USA) was used in a 10 µl reaction with a primer concentration of 300 nM. All samples were measured in triplicate and run on a CFX Connect Real-Time System (Bio Rad, Hercules, CA, USA). The PCR amplification program started with a 3 min denaturation at 95°C, followed by 40 two-step amplification cycles of 95°C for 5 s and 60°C for 30 s. Melting curves were retrieved at the end of the amplification cycles and used to confirm amplification of the desired product. NTCs as well as a positive control stool sample with a high concentration of DNA were included in each plate. Cq values were normalized to those of the highly concentrated positive stool sample control.
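How such a normalization could look in practice is sketched below. The exact formula used in the study is not stated, so this assumes a simple delta-Cq approach: triplicates are averaged and the mean Cq of the positive control on the same plate is subtracted. All Cq values in the example are invented for illustration.

```python
from statistics import mean

def delta_cq(sample_cqs, control_cqs):
    """Mean Cq of a sample minus mean Cq of the plate's positive control."""
    return mean(sample_cqs) - mean(control_cqs)

def relative_quantity(dcq, efficiency=2.0):
    """Template amount relative to the control, assuming the given
    per-cycle amplification factor (2.0 = perfect doubling)."""
    return efficiency ** (-dcq)

# Hypothetical triplicates (not data from the study).
swab = [31.2, 31.5, 31.0]
stool_control = [14.9, 15.1, 15.0]

dcq = delta_cq(swab, stool_control)
print(f"dCq = {dcq:.2f}, relative 16S quantity = {relative_quantity(dcq):.2e}")
```

A larger delta-Cq means less 16S rRNA gene template relative to the control, since each additional cycle corresponds to roughly a halving of the starting amount.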

Library preparation and metagenomic DNA sequencing
DNA concentration and quality were assessed prior to sequencing using a fluorometer (QubitFlex Fluorometer, Qubit 1X dsDNA HS Assay Kit #Q33231, Thermo Fisher, Waltham, Massachusetts, USA). Due to the low DNA concentrations, the sequencing libraries in all samples were amplified with 12 PCR cycles. In samples with a library concentration below 1 nM after the initial PCR amplification (20.18% in total, 2.38% in Omega, 31.88% in Qiagen), the amplification was repeated with a total of 18 PCR cycles. If the libraries still did not meet the requirements, the samples were excluded from further analysis (7.02%). Libraries were prepared and sequenced by the Next Generation Sequencing Platform of the University of Bern, Switzerland. Sequencing libraries were prepared using the Illumina DNA Prep, (M) Tagmentation kit (#20018705, Illumina, San Diego, CA, USA), with four index sets (IDT for Illumina DNA/RNA UD Indexes Set A, B, C and D, #20027312, #20027214, #20042666, #20042667, Illumina). Samples were sequenced on an Illumina NovaSeq 6000 using S4 flow cells. Paired-end reads of 150 bp length were selected for this project and a sequencing depth of 60 million reads per sample was aimed for.
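The 1 nM threshold refers to molar concentration, whereas fluorometers report mass concentration. Below is a minimal sketch of the commonly used conversion for double-stranded DNA (about 660 g/mol per base pair), together with the re-amplification rule described above. The mean fragment length and the example concentration are assumptions, not values from the study.

```python
def library_molarity_nm(conc_ng_per_ul: float, mean_fragment_bp: float) -> float:
    """Convert a dsDNA mass concentration (ng/µl) to molarity (nM),
    using ~660 g/mol per base pair."""
    return conc_ng_per_ul * 1e6 / (660.0 * mean_fragment_bp)

def next_step(molarity_nm: float, cycles_used: int, threshold_nm: float = 1.0) -> str:
    """Decision rule paraphrased from the text: 12 cycles first,
    one repeat with 18 cycles, otherwise exclusion."""
    if molarity_nm >= threshold_nm:
        return "sequence"
    return "repeat with 18 cycles" if cycles_used == 12 else "exclude"

# Hypothetical library: 0.2 ng/µl at an assumed mean fragment length of 450 bp.
molarity = library_molarity_nm(0.2, 450)
print(f"{molarity:.2f} nM ->", next_step(molarity, cycles_used=12))
```

Because molarity depends on fragment length, two libraries with the same mass concentration can fall on different sides of the threshold.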
Taxonomic annotation of sequencing reads was performed using the Metagenomic Phylogenetic Analysis tool (MetaPhlAn 3, version 3.0.14) with ChocoPhlAn 3 (version mpa_v30_CHOCOPhlAn_201901) as reference pangenome database (Beghini et al., 2021). Alternatively, Kraken2 (version kraken2_2.0.9beta) with the relative abundance estimation tool Bracken (version bracken_2.6.0) was used to observe the effect of different taxonomic annotation methods (Lu et al., 2017; Wood et al., 2019). The relative abundances of the annotated reads were calculated to determine the taxonomic composition of the OSM.
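The last step, turning per-taxon read assignments into relative abundances, amounts to dividing each count by the sample total. A minimal sketch with invented counts; it illustrates the calculation only and does not reproduce the internals of either profiling tool.

```python
def relative_abundance(read_counts: dict) -> dict:
    """Scale per-taxon read counts to percentages that sum to 100."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no classified reads in sample")
    return {taxon: 100.0 * n / total for taxon, n in read_counts.items()}

# Invented counts for three phyla (not data from the study).
counts = {"Actinobacteria": 600, "Firmicutes": 300, "Proteobacteria": 100}
for taxon, pct in relative_abundance(counts).items():
    print(f"{taxon}: {pct:.1f}%")
```

Since the values are proportions of the classified reads, an increase in one taxon necessarily lowers the relative abundance of all others, which matters when comparing kits that recover different groups of organisms.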

Statistical analysis
Graphical representations of sequencing data and statistical analyses were produced using R (Version 4.2.1) and the R package ggplot2 (Version 3.3.6). The R package MaAsLin2 (Version 1.10.0) was used to create a mixed-effects model with the DNA extraction method, swab type and sequencing run as fixed effects and with the study ID and age as random effects. Log transformation and normalization of the data were disabled. Subject IDs were set as random effect in MaAsLin2 to account for temporal dependence.
Statistical comparisons were performed with the following tests: pairwise Wilcoxon rank sum tests, Student's t-tests, one-way ANOVAs or PERMANOVAs (R package vegan (Version 2.6.2)). PCA and MaAsLin2 analyses are based on relative taxonomic abundance tables at species level. PCA was performed with the R package vegan (Version 2.6.2).

Amount of extracted DNA
The total amount of extracted DNA in ocular swabs differed depending on the swab type, sampling location and DNA extraction kit (Figure 2). Independent of location and swab type, less DNA was extracted by the Qiagen compared to the Omega kit. The samples isolated by the Qiagen kit did not differ in DNA concentration from negative extraction controls according to Qubit fluorometer measurements. In the Omega kit, conjunctiva swab samples contained more DNA compared to lid swab samples using cotton swabs (Wilcoxon rank sum test, p-value = 0.001). Additionally, there was an increase in the amount of DNA in conjunctiva samples sampled with nylon flocked swabs compared to lid samples sampled with cotton swabs (Wilcoxon rank sum test, p-value = 0.001).

FIGURE 1 Graphical overview of sampling procedure. This timeline was repeated three times with each DNA extraction kit (Omega, Qiagen without host DNA depletion, Qiagen). Ocular surface swabs were taken alternately using flocked nylon swabs or cotton swabs. Ocular swabbing was performed under local anesthesia (Tetracaine 1%).

Quantification of bacterial DNA
qPCR of 16S rRNA genes was performed for lid and conjunctiva samples from both DNA extraction kits (Figure 3). Samples extracted with the Qiagen kit could not be distinguished from NTCs and were therefore omitted from the analysis. All Omega samples (except lid samples using cotton swabs) differed from NTCs in their Cq-values. A difference between flocked conjunctiva and cotton lid swabs was found (one-way ANOVA, p-value = 0.006).
The composition of the OSM was dependent on the DNA extraction kit, whereas no differences were found dependent on swab type (Figures 5, 6). Due to the absence of differences from swab type, samples using both types of swabs were analyzed together.

Comparison of DNA extraction kits
OSM samples did not differ in overall species diversity (assessed with the Shannon diversity index) depending on the DNA extraction kit (Figure 7). A principal component analysis (PCA) showed differences in taxonomic composition in conjunctiva samples according to the extraction kit (Omega or Qiagen, using either cotton or flocked nylon swabs) (Figure 5). The same analysis for lid swabs can be found in the supplementary data (Figure S1). The taxonomic composition differed between all samples (Figure 5 and [...] (13.76% (SD = 13.57)), Corynebacterium mastitidis (6.93% (SD = 17.17)) and Acinetobacter junii (2.71% (SD = 5.08)) (Figures 8B, C). A more detailed graphical representation of the relative abundances of each single measurement in lid and conjunctiva samples can be found in Figures S2 and S3.
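For reference, the Shannon diversity index mentioned above is H' = -sum(p_i * ln p_i) over the relative abundances p_i. A minimal sketch, assuming natural logarithms (the default of the diversity function in the vegan package); the example profiles are invented.

```python
import math

def shannon_index(abundances) -> float:
    """Shannon diversity H' = -sum(p * ln p); zero entries are skipped
    and the input is renormalized to sum to 1."""
    total = sum(abundances)
    props = [a / total for a in abundances if a > 0]
    return -sum(p * math.log(p) for p in props)

even = [25, 25, 25, 25]      # four equally abundant species
skewed = [97, 1, 1, 1]       # one dominant species, same species count
print(round(shannon_index(even), 3), round(shannon_index(skewed), 3))
```

Both example communities contain four species, yet the even one scores higher: the index reflects evenness as well as the number of species, so equal Shannon values between kits do not by themselves imply equal compositions.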

Positive controls
In order to investigate systematic biases of the DNA extraction kits, technical replicates of a positive control with known microbial composition were processed. All ten expected microbial species were found in extractions of both DNA extraction kits. However, the relative abundances differed from the expected values by up to 51.67% in Qiagen extractions versus 33.39% in Omega extractions. Omega extractions underestimated the presence of Listeria monocytogenes, Bacillus subtilis and the two fungal species Saccharomyces cerevisiae and Cryptococcus neoformans, while overestimating the abundance of Lactobacillus fermentum. Extractions with the Qiagen kit underestimated Bacillus subtilis, Enterococcus faecalis, Salmonella enterica, Escherichia coli and Pseudomonas aeruginosa, while overestimating Staphylococcus aureus and Listeria monocytogenes. Both fungal species were present at a representative level (Figure 9A). In a second approach, the same sequencing reads were analyzed with Kraken2 instead of MetaPhlAn 3, which also resulted in the detection of all microbial species in the positive control in both kits, but relative abundances differed (Figure 9B). The stability of the sequencing pipeline at low DNA concentration was assessed by dilution of positive controls to the DNA concentration measured in lid and conjunctiva samples. The relative microbial composition of these positive controls was not affected by dilution (Figure 10).
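One way to quantify such deviations is to compare each observed relative abundance with its theoretical value. The sketch below assumes the vendor's stated genomic DNA composition for the mock community (12% for each of the eight bacteria, 2% for each of the two yeasts) and uses invented observed values. The text does not specify whether the reported percentages are absolute or relative differences, so both are computed.

```python
# Assumed theoretical genomic DNA composition (%) of the mock community.
BACTERIA = ["Pseudomonas aeruginosa", "Escherichia coli", "Salmonella enterica",
            "Lactobacillus fermentum", "Enterococcus faecalis",
            "Staphylococcus aureus", "Listeria monocytogenes", "Bacillus subtilis"]
YEASTS = ["Saccharomyces cerevisiae", "Cryptococcus neoformans"]
expected = {**{s: 12.0 for s in BACTERIA}, **{s: 2.0 for s in YEASTS}}

def deviations(observed: dict) -> dict:
    """Per species: (absolute difference in percentage points,
    relative difference in % of the expected value)."""
    out = {}
    for species, exp in expected.items():
        obs = observed.get(species, 0.0)
        out[species] = (obs - exp, 100.0 * (obs - exp) / exp)
    return out

# Invented observation: everything as expected except two species.
observed = dict(expected)
observed["Staphylococcus aureus"] = 18.0
observed["Escherichia coli"] = 6.0

for species, (abs_diff, rel_diff) in deviations(observed).items():
    if abs_diff != 0:
        print(f"{species}: {abs_diff:+.1f} points ({rel_diff:+.0f}% of expected)")
```

The distinction matters most for the low-abundance yeasts: a shift of one percentage point is a 50% relative deviation for a species expected at 2%, but only about 8% for one expected at 12%.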

Discussion
With the introduction of modern sequencing technologies, the OSM has been described in much more detail compared to traditional culture techniques. However, due to the low microbial abundance, the characterization of the OSM poses many challenges. Unlike intestinal microbiome samples, OSM samples are much more prone to contamination introduced in the course of sample collection to data analysis. This is a consequence of low abundant microbes and the resulting high host DNA contaminations. These contaminations may arise from unsterile sampling material, improper techniques or the reagents used for DNA extraction and/or library preparation (Laurence et al., 2014; Salter et al., 2014; Ozkan et al., 2017). In order to account for potential contamination, the inclusion of negative and positive controls for each step of the pipeline is essential. The sequencing depth was set at 60 million reads per sample in order to counteract the filtering of high numbers of host DNA reads. This leaves more microbial reads for analysis, increasing statistical power and decreasing the number of undetected species (Pereira-Marques et al., 2019), but raising sequencing costs. Other sources of technical bias in the characterization of the OSM such as swabbing pressure (Dong et al., 2011), as well as sources of variability such as age (Zhou et al., 2014; Cavuoto et al., 2018b; Cavuoto et al., 2018a), contact lens wearing (Green et al., 2008; Shin et al., 2016; Stapleton et al., 2017), ocular surface diseases such as blepharitis (Lee et al., 2012), Meibomian gland dysfunction (Watters et al., 2017) or keratitis (Tuzhikov et al., 2013; Song et al., 2015) and the sample source (Ozkan et al., 2018) have been discussed. In this project, we focused on the effect of swab type, sampling location, different DNA extraction kits and taxonomic profiling tools on the taxonomic profile of the OSM using whole-metagenome shotgun sequencing.
The extracted amount of DNA differed between lid and conjunctiva samples, as well as depending on the swab type, with flocked nylon swabs collecting more microbes than cotton swabs as previously described (Wise et al., 2021). It is worth noting that, especially for conjunctiva swabs under local anesthesia, all study subjects preferred the cotton swabs for sample collection due to their softer head and less irritation of the conjunctiva. Since less material could be isolated from lid samples, the conjunctiva is the preferred location for ocular surface swabbing, especially in low abundant microbiomes such as the OSM.

FIGURE 6 Heat map showing significant associations between metadata and microbial species. Correlations of conjunctiva (A) and lid (B) samples. Associations in taxonomic composition were found between DNA extraction kits and sequencing run.

FIGURE 7 Shannon diversity of the samples extracted with either Omega or Qiagen kit. There was no difference in Shannon diversity observed between the two kits (Student's t-test, p = 0.3664).

The results in Figure 3 show that bacterial quantification is possible if the DNA concentration is above a certain threshold. In samples extracted with the Qiagen kit, the DNA concentration was below this threshold (Figure 2). Another method to estimate the absolute abundance of microbes in the sample is the use of internal standards (ISDs, spike-in controls) (Harrison et al., 2021). Bacterial quantification is important in association studies between the OSM and inflammatory ocular diseases since the bacterial load may be associated with disease development (Graham et al., 2007). Results of 16S rRNA gene qPCR have to be interpreted cautiously, as the copy number of the 16S rRNA gene is not constant across bacterial genomes, ranging from one up to fifteen copies (Vetrovsky and Baldrian, 2013). Thus, a qPCR signal is dependent on microbial community composition.
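The dependence of the qPCR signal on community composition can be illustrated with a small calculation: two communities with the same number of cells yield different numbers of 16S rRNA gene copies if their members carry different copy numbers. The taxon names and copy numbers below are placeholders chosen within the one-to-fifteen range quoted above, not measured values.

```python
def total_16s_copies(cell_counts: dict, copies_per_genome: dict) -> int:
    """Total 16S rRNA gene copies contributed by a community."""
    return sum(n * copies_per_genome[taxon] for taxon, n in cell_counts.items())

# Placeholder copy numbers within the 1-15 range cited in the text.
copies = {"taxon_low": 1, "taxon_mid": 5, "taxon_high": 15}

# Two hypothetical communities of 1000 cells each.
community_a = {"taxon_low": 900, "taxon_mid": 100, "taxon_high": 0}
community_b = {"taxon_low": 0, "taxon_mid": 100, "taxon_high": 900}

a = total_16s_copies(community_a, copies)
b = total_16s_copies(community_b, copies)
print(a, b, f"{b / a:.1f}-fold difference at equal cell numbers")
```

In this constructed example the gene copy totals differ tenfold although the cell numbers are identical, which is why a Cq difference alone cannot be read as a difference in bacterial load.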
The microbial compositions of conjunctiva and lid swabs were assessed by whole-metagenome shotgun sequencing, allowing the detection of viral and eukaryotic species in addition to bacterial species. Both fungal and viral communities have been shown to be part of the ocular surface and may contribute to the health of the underlying tissue (Doan et al., 2016; Wen et al., 2017; Shivaji et al., 2019; Shivaji, 2022), making whole-metagenome shotgun sequencing the preferred sequencing choice for the OSM.
Consistent with other studies, the main bacterial phyla present on the OSM are Actinobacteria, Firmicutes and Proteobacteria, while Bacteroidetes seems to be more prevalent in studies employing 16S rRNA gene sequencing (Zhou et al., 2014; Doan et al., 2016; Huang et al., 2016; Ozkan et al., 2017; Ham et al., 2018; Dong et al., 2019; Li S. et al., 2019; Li Z. et al., 2019; Yau et al., 2019; Andersson et al., 2021; Kang et al., 2021; Liang et al., 2021; Zhang et al., 2021; Zysset-Burri et al., 2021; Fu et al., 2022). The observed overlap of the main constituents of the OSM reinforces the validity of the presented method to characterize the OSM. Shannon diversity did not differ between the two DNA extraction kits. This result is consistent with a previous study showing no differences in Shannon diversity depending on the extraction method in four out of five different DNA extraction kits (Wagner Mackenzie et al., 2015). While the overall observed structure does not change, the DNA extraction method introduces bias into the relative abundances, impeding cross-comparison between studies not employing the same protocols.
In a recently published paper, Delbeke et al. could not generate sequencing libraries for 16S rRNA sequencing when using DNA extraction kits containing a host DNA depletion step (Delbeke et al., 2023). Our data suggest that the OSM can be characterized without the use of 16S rRNA sequencing even on samples where host DNA has been removed during DNA extraction, reinforcing the use of whole-metagenome shotgun sequencing in OSM research. While whole-metagenome shotgun sequencing does not involve a specific amplification of target DNA, amplification bias could not be eliminated in our approach since the concentration of input microbial DNA was low (especially when using the Qiagen extraction kit for DNA isolation) and thus libraries had to be amplified during library preparation. Jones et al. showed that PCR cycles used for library amplification may lead to a taxonomic bias in bacterial mock communities (Jones et al., 2015). A recent study on the intestinal virome showed that this effect could only be observed by investigating rare viruses (Hsieh et al., 2021). This effect on rare species may be more pronounced in an environment with low microbial abundance such as the OSM.
There is an increased relative abundance of viral DNA in conjunctiva compared to lid swabs, regardless of the DNA extraction kit (Figure 4). Although this finding is consistent with previous studies from our lab, the relative abundance of viral DNA is increased in both locations in the current data (Zysset-Burri et al., 2021). This general increase in viral reads may be due to the upgrade from MetaPhlAn 2 to MetaPhlAn 3, since the database of MetaPhlAn 3 contains more than twice as many species as the database of MetaPhlAn 2 (Truong et al., 2015; Beghini et al., 2021). Furthermore, previous studies showed that different taxonomic profiling tools do not result in equal relative abundances (McIntyre et al., 2017). There are differences in taxonomic output depending on the underlying database as well as on the algorithm used (marker-based versus k-mer-based approach) (Miossec et al., 2020). Since low-abundant species are identified less accurately in marker-based approaches (Miossec et al., 2020), the more resource-intensive k-mer profilers may be appropriate. An interesting tool was presented by Metwally et al., combining different taxonomic identification methods by weighted voting. This approach resulted in more accurate results than the individual tools alone (Metwally et al., 2016). Additionally, it cannot be determined if viral DNA originates from proviruses and/or entire viruses. Since the viral to bacterial ratio varied depending on the DNA extraction kit (Figure 4), we suppose that at least some of the detected viruses are not proviruses.
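To put a number on how far two profilers disagree for the same sample, a dissimilarity measure such as Bray-Curtis can be computed between their relative abundance profiles. This is an illustrative sketch, not an analysis performed in the study, and both profiles are invented.

```python
def bray_curtis(profile_a: dict, profile_b: dict) -> float:
    """Bray-Curtis dissimilarity between two abundance profiles
    (0 = identical, 1 = no shared taxa)."""
    taxa = set(profile_a) | set(profile_b)
    diff = sum(abs(profile_a.get(t, 0.0) - profile_b.get(t, 0.0)) for t in taxa)
    total = sum(profile_a.values()) + sum(profile_b.values())
    return diff / total

# Invented relative abundances (%) for one sample from two profilers.
marker_based = {"species_1": 60.0, "species_2": 30.0, "virus_1": 10.0}
kmer_based = {"species_1": 50.0, "species_2": 30.0, "species_3": 15.0, "virus_1": 5.0}
print(round(bray_curtis(marker_based, kmer_based), 3))
```

Taxa reported by only one tool enter the sum with an abundance of zero in the other profile, so database differences contribute to the dissimilarity in the same way as abundance differences.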
By comparing ocular surface swabs from the same subjects, we showed that the DNA extraction kit had an effect on the microbial composition, while different swab types did not change the composition (Figures 5 and 6). This kit effect may result from the enzymatic host DNA depletion and mechanical lysis via ceramic bead beating steps included in the Qiagen kit. Both steps have been shown to influence the relative abundances of the microbial community (Nandakumar and Marten, 2002; Costea et al., 2017).
In accordance with previous studies, a bias against gram-negative bacteria in positive controls (Figure 9) as well as a decrease in the total amount of extracted DNA (Figure 2) were observed if microbial DNA was isolated with the Qiagen kit (Horz et al., 2008). This bias most likely originates from the host DNA depletion step, since the three gram-negative bacteria in the positive control, Salmonella enterica, Escherichia coli and Pseudomonas aeruginosa, are underrepresented. This may be due to the mechanism of enzymatic host DNA depletion. To exclude host DNA, host cells are enzymatically lysed while bacterial cells remain intact, with subsequent degradation of the released host DNA. Gram-negative bacteria may be more susceptible to this host cell lysis and thus may not stay intact during treatment. Another explanation may be the partial lysis of susceptible bacteria in the storage solution before host DNA depletion. The two fungal species in the positive control were only found in samples extracted by the Qiagen kit. This may be due to the combination of mechanical and chemical lysis applied in the Qiagen kit, a treatment combination which was shown to increase the detection of fungal species (Janowski et al., 2019). An increased variance in microbial relative abundance due to low DNA concentration could be ruled out by diluting positive controls to the concentration of the OSM samples (Figure 10). Another source of bias in OSM characterization is errors during microbial annotation. Potential misidentification of microbial species can be observed in our data (Figure 9, see Bacillus intestinalis and Bacillus subtilis). Another case of potential taxonomic misidentification can be seen in Figure 8. Even if a high prevalence of Herpesviridae DNA on the ocular surface is possible in our cohort, it is unlikely that it originates from the Cyprinid herpesvirus 1 or 3 of the carp family. These misidentifications may be an artefact due to the database-dependent matching of marker sequences, in our case specific for MetaPhlAn 3. Reads that do not exactly match all markers of a given species are assigned to the species with the highest overlap. Depending on the database and sequencing method, error rates can reach up to 17% (Edgar, 2018). This may be due to taxonomic mismatching or errors in the curation of the database (Lydon and Lipp, 2018). Data generated by whole-metagenome shotgun sequencing are generally less prone to annotation errors in lower taxonomic ranks compared to 16S rRNA sequencing (Escobar-Zepeda et al., 2018), making it the preferred tool for research where species identification is important.
Limitations of the study include the small sample size and the lack of longitudinal data to assess the temporal stability of the OSM. However, since there is a consistent microbial distribution in all study subjects (Figure 4), we suppose that the OSM is stable over the course of sampling in this cohort. Further, the use of anesthetic eye drops, such as tetracaine or oxybuprocaine, is known to inhibit bacterial growth (Labetoulle et al., 2002). However, a more recent study could not detect a change in overall sequencing results of the OSM after the application of a topical anesthetic (Delbeke et al., 2022).

Conclusions
This study highlights challenges in the characterization of low abundant microbiomes including the sampling procedure, the selection of the DNA extraction method and the taxonomic profiling tool. Additionally, essential practices such as the inclusion of NTCs and internal standards were investigated for OSM samples which are prone to host and environmental contaminations. Since pipeline optimization is a continuous process and there is no single pipeline that fits all low abundance microbiomes, additional biases will be discovered and have to be accounted for in future projects. Thus, although certain biases during sampling, DNA extraction and sequencing cannot be avoided, careful planning of the pipeline for further research including low abundant microbiomes is crucial.

FIGURE 2 DNA concentration of swab extractions. All Omega samples, except the negative controls, showed significantly higher DNA concentrations compared to any group extracted with the Qiagen kit. Conj, conjunctiva; NC, No-template control. * = p-value < 0.05.

[...] Figure S1) that have not been extracted with the same kit. PERMANOVAs with 1000 permutations were performed on conjunctiva samples (p-values: CO vs CQ = 0.001, CO vs FQ = 0.002, CQ vs FO = 0.001, FO vs FQ = 0.001) and lid samples (p-values: CO vs CQ = 0.001, CO vs FQ = 0.001, CQ vs FO = 0.001, FO vs FQ = 0.001). Samples using different swab types but the same DNA extraction kit did not differ in the taxonomic composition, except for CO vs FO in conjunctiva samples (PERMANOVA, 1000 permutations, p-value = 0.022).
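The logic behind the PERMANOVA p-values reported above can be sketched in a few lines: a pseudo-F statistic is computed from pairwise distances, and group labels are then shuffled to see how often a value at least as large arises by chance. This is a simplified one-factor illustration using squared Euclidean distances and invented profiles; the distance measure used in the study is not stated, and the analysis itself was run with the vegan package rather than with code like this.

```python
import random
from itertools import combinations

def sq_dist(u, v):
    """Squared Euclidean distance between two abundance vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def pseudo_f(samples, labels):
    """One-way PERMANOVA pseudo-F from pairwise squared distances."""
    n = len(samples)
    groups = sorted(set(labels))
    ss_total = sum(sq_dist(samples[i], samples[j])
                   for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for g in groups:
        idx = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(sq_dist(samples[i], samples[j])
                         for i, j in combinations(idx, 2)) / len(idx)
    ss_among = ss_total - ss_within
    return (ss_among / (len(groups) - 1)) / (ss_within / (n - len(groups)))

def permanova(samples, labels, n_perm=1000, seed=0):
    """Return (pseudo-F, permutation p-value) for the given grouping."""
    rng = random.Random(seed)
    f_obs = pseudo_f(samples, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(samples, shuffled) >= f_obs:
            hits += 1
    # The observed labelling counts as one permutation, hence the +1.
    return f_obs, (hits + 1) / (n_perm + 1)

# Invented three-taxon profiles for two hypothetical extraction kits.
kit_a = [[0.70, 0.20, 0.10], [0.65, 0.25, 0.10], [0.72, 0.18, 0.10],
         [0.68, 0.22, 0.10], [0.66, 0.20, 0.14], [0.71, 0.21, 0.08]]
kit_b = [[0.30, 0.50, 0.20], [0.35, 0.45, 0.20], [0.28, 0.52, 0.20],
         [0.33, 0.47, 0.20], [0.31, 0.49, 0.20], [0.29, 0.50, 0.21]]
f_stat, p_value = permanova(kit_a + kit_b, ["A"] * 6 + ["B"] * 6)
print(f"pseudo-F = {f_stat:.1f}, p = {p_value:.3f}")
```

When the p-value is computed this way, it cannot fall below 1/(n_perm + 1), which is about 0.001 for 1000 permutations. A reported value of 0.001 therefore marks the resolution limit of the test rather than an exact probability.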

FIGURE 3 qPCR data of quantification of 16S rRNA genes in samples extracted with the Omega kit. The data were normalized to a positive stool sample control showing stable results over all plates. Note that a lower Cq-value corresponds to a higher 16S rRNA concentration. Cq, Quantification cycle; NC, No-template control; PC, Positive control. * = p-value < 0.05.

FIGURE 4 Taxonomic composition at phylum level. The average compositions for each study subject plus the overall mean of the subjects per DNA extraction kit in (A) conjunctiva samples and (B) lid samples are shown.

FIGURE 5 PCA of taxonomic composition in conjunctiva samples. The centroids of the ellipses (95% confidence interval for a multivariate t-distribution) clustered according to the DNA extraction kit. CO, Cotton swabs Omega; CQ, Cotton swabs Qiagen; FO, Flocked swabs Omega; FQ, Flocked swabs Qiagen.

FIGURE 8 Relative taxonomic composition at species level at over 5% abundance. The mean microbial abundances are shown for each subject. Additionally, the mean abundances across all subjects are shown as mean. The DNA extraction kits used are (A) Omega, (B) Qiagen1 and (C) Qiagen2. Taxonomic annotation was performed using MetaPhlAn 3.