Nasopharyngeal Microbial Communities of Patients Infected With SARS-CoV-2 That Developed COVID-19

Background SARS-CoV-2 is an RNA virus causing COVID-19. The clinical characteristics and epidemiology of COVID-19 have been extensively investigated, however, only one study so far focused on the patient’s nasopharynx microbiota. In this study we investigated the nasopharynx microbial community of patients that developed different severity levels of COVID-19. We performed 16S ribosomal DNA sequencing from nasopharyngeal swab samples obtained from SARS-CoV-2 positive (56) and negative (18) patients in the province of Alicante (Spain) in their first visit to the hospital. Positive SARS-CoV-2 patients were observed and later categorized in mild (symptomatic without hospitalization), moderate (hospitalization), and severe (admission to ICU). We compared the microbiota diversity and OTU composition among severity groups and built bacterial co-abundance networks for each group. Results Statistical analysis indicated differences in the nasopharyngeal microbiome of COVID19 patients. 62 OTUs were found exclusively in SARS-CoV-2 positive patients, mostly classified as members of the phylum Bacteroidota (18) and Firmicutes (25). OTUs classified as Prevotella were found to be significantly more abundant in patients that developed more severe COVID-19. Furthermore, co-abundance analysis indicated a loss of network complexity among samples from patients that later developed more severe symptoms. Conclusion Our study shows that the nasopharyngeal microbiome of COVID-19 patients showed differences in the composition of specific OTUs and complexity of co-abundance networks. Taxa with differential abundances among groups could serve as biomarkers for COVID-19 severity. Nevertheless, further studies with larger sample sizes should be conducted to validate these results.


IMPORTANCE
This work has studied the microbiota of the nasopharyngeal tract in COVID-19 patients using advanced techniques of molecular microbiology. Diverse microorganisms, most of which are harmless or even beneficial to the host, colonize the nasopharynx. However, changes in this microbiota could be related with different diseases such as cancer, gastrointestinal pathologies or even COVID-19. This study investigated the microbiota from patients with COVID-19, in order to determine its associations with pathology severity. The obtained results revealed several bacterial taxa with differential abundance among COVID-19 cases. This study represents the first step to understand the associations between microbiota composition and the severity of COVID-19.

BACKGROUND
Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2) is a positive-sense single-stranded RNA virus causing Coronavirus Disease 2019 (COVID-19) (Andersen et al., 2020). On January 30, 2020, the World Health Organization (WHO) declared the COVID-19 outbreak as "public health emergency of international concern" and 2 months later on March 11th as a pandemic. The SARS-CoV-2 virus was first reported in central city of Wuhan, Hubei province, China, and presented 70% of sequence similarity with the SARS-CoV-1 virus (Zhu et al., 2020) and 96% of sequence similarity with a bat coronavirus, although the exact source has yet to be elucidated. While the most common symptoms are fever, cough and dyspnoea, the disease can cause other less frequent clinical manifestations such as myalgia, headaches, breathlessness, fatigue and nausea (Guan et al., 2020).
Viruses and bacteria are often present in the respiratory tract of healthy and asymptomatic individuals. Microaspiration of aerosols and direct mucosal dispersal is responsible for a constant inflow of microbes and viruses toward lower airways. Disease and inflammatory processes that lead to the emergence of anaerobic zones, or mucus accumulation in the alveoli can drastically change the microbial community of the airways (Huffnagle et al., 2017). For example, in diseased individuals, the lung microbiota composition undergoes a decrease in diversity (Kalantar et al., 2019) accompanied by a shift in the dominant taxa: from Bacteroidota to Gammaproteobacteria, a class that includes many respiratory pathogens.
Although the clinical characteristics and epidemiology of COVID-19 have been described (Chen N. et al., 2020;Rothan and Byrareddy, 2020;, studies focused on the associations between the patient's microbiota and the onset of the disease are still limited. This pilot study aims to characterize the nasopharynx microbial community of SARS-CoV-2 infected patients. We investigated samples from a control group of SARS-CoV-2 negative patients and three groups of SARS-CoV-2 positive patients, divided according to disease severity: one group of symptomatic patients that did not require hospitalization, a second group of patients that were admitted to conventional hospitalization facilities, and a third group of patients that required admission to the ICU.

Patients and Experimental Design
Fifty six nasopharyngeal microbiota samples from SARS-CoV-2 positive patients and 18 samples from SARS-CoV-2 negative patients were collected during March and April of 2020 in the Emergency Service of Hospital General Universitario de Alicante (HGUA). The samples were collected in a sterile tube containing Viral Transport Medium using a nasal swap, and they were stored at -80 • C until the DNA/RNA extraction. Cobas SARS-CoV-2 qPCR Test for the Cobas 6800 System (Roche Molecular Systems, Branchburg, NJ, United States) was used to detect the presence of SARS-CoV-2 and to obtain de Ct Values (Poljak et al., 2020). The Table 1 describes the groups of patients regarding sex, age, and clinical variables.
Patients were first classified based on SARS-CoV-2 presence, and then regarding on later developments (hospital admission and severity). All samples were obtained before the onset of severe symptoms, and before any treatment was administered to the patients. Following these criteria, four groups were established: group 0: SARS-CoV-2 negative patients (n = 18); group 1: mild COVID-19 symptoms but no later hospital admission (n = 19); group 2: severe COVID-19 symptoms followed by hospital admission (n = 18); and group 3: patients with severe COVID-19 symptoms which were eventually admitted into intensive care units (ICU) (n = 19). Protocols were developed in accordance with the national ethical and legal standards, and following the guidelines established in the Declaration of Helsinki (

DNA Isolation and Sequencing
DNA from nasopharyngeal swab samples was isolated using the QIAamp DNA Mini Kit (QIAgen) following the protocol recommended by the manufacturer. Sequencing libraries were prepared according to the 16S Metagenomic Sequencing Library Preparation protocol distributed by Illumina. Briefly, the sequence spanning the hypervariable regions V3 and V4 of the 16S rRNA gene was amplified through PCR. Amplicons were quantified using a Qubit 4 Fluorometer (Qubit dsDNA HS Assay Kit) validated by 4200 TapeStation (Agilent company). Amplicons were sequenced with Illumina MiSeq System using the 2 × 300 bp cartridge. The quality of raw sequences was assessed by FastQC software.

Taxonomic Classification of Amplicon Sequences
Paired end reads of 300 bp were generated with an average overlap of 140 bp. Sequences were trimmed using trimmomatic (Bolger et al., 2014) to clip out low-quality nucleotides at the read ends and to remove sequences that, after trimming, one of the paired-end reads was smaller than 150 bp and the resulting paired reads were merged using casper (Kwon et al., 2014), generating individual fragments of about 460 bp. Given the uneven coverage between samples, the number of individual reads was standardized to 20,000 per sample, removing samples that did not reach this sequencing depth (9 samples). Total number of reads and cycle threshold (Ct) per sample is shown in Supplementary Table S1. Merged amplicon sequences were grouped in operational taxonomic units (OTUs) using cd-hit (Fu et al., 2012) with an identity of 97%. Sequences were queried against small subunits (16S) rRNA genes of the SILVA database (version 138, Reference subset) (Quast et al., 2013) using the SILVA Incremental Aligner (SINA) software (Pruesse et al., 2012) for taxonomic classification. Sequences with low identity (<70%) to any reference 16S rRNA gene or classified as eukaryotic were excluded from further analysis.

Testing for Differences in Taxonomic Composition Among Patient Groups
We sought to determine how different samples were grouped according to their OTU composition. To that end, the Vegan (v 2.5-6) package available in R (v 3.6.3) was used to perform a non-metric multidimensional scaling (NMDS) analysis on Bray-Curtis dissimilarity measures among samples based on relative OTU abundances (i.e., percentages). The relative abundances of OTUs were also used to test for statistically significant differences among severity groups. Group OTU compositions were compared through ANOSIM. Next, Similarity Percentage (SIMPER) analysis was used to determine which OTUs were responsible for driving the differences in community composition among groups. For this analysis, all six possible pairwise combinations of severity groups were tested.

OTU Association With COVID-19 Severity
To infer associations between the severity of COVID-19 and the nasopharynx microbiota, general linear models (GLM) were built using the R package MaAsLin2 with centered log-transformed (CLR) OTUs counts as the dependent variable, and the severity group (with group 0 and group 1 as references), as independent variable. Due to the differences of age and sex between groups, we adjusted by sex and age. Only OTUs that presented a prevalence of 20% over the sample space were considered. The resulting p-values were adjusted for multiple testing using the Benjamini-Hochberg method (BH). The analysis was replicated using QIIME2 (Bolyen et al., 2019) together with LEfSE (Linear discriminant analysis Effect Size) and included as a Supplementary Material.

Co-abundance Networks for COVID-19 Severity Groups
Fastpar (Watts et al., 2019), a multi-thread implementation of the SparCC algorithm (Friedman and Alm, 2012), was used to generate co-abundance networks among OTUs of each of the four severity groups with default parameters (50 iterations and correlation threshold of 0.2) and 1,000 bootstrap iterations to infer significance. Results were processed using an in-house ipython notebook to generate network matrices for visualization with Cytoscape 3.8 (Shannon et al., 2003). The network matrices were loaded in the Cytoscape 3.8 software, and connections filtered by p-value (≤0.05) and correlation (≤0.6 or ≥0.6).

Study Set of COVID-19
Seventy-four patients were included in this pilot study to assess associations between the nasopharyngeal microbiota composition and the severity of the COVID-19 disease. However, only 65 samples remained after quality coverage control (see section "Materials and Methods"). Data including age, sex, diagnosis, hospital admission, and disease severity were registered (Supplementary Table S2). Sixteen patients belonged to the negative control (group 0, no-SARS-CoV-2), whereas the remaining patients were classified into three groups (group 1, 2, and 3) according to the severity (see section "Materials and Methods"). The average age of the patients ranged from 48.1 years old (group 0) to 66.8 years old (group 2) and around 49% of them were diagnosed with pneumonia ( Table 1).
A total of 62 OTUs were found exclusively in SARS-CoV-2 positive patients (at a minimum of three samples). Most of these OTUs were classified as members of the phylum Bacteroidota (18) and Firmicutes (25). Notably, the most common genera among the OTUs found exclusively on COVID-19 positive patients were Prevotella (13), followed by Leptotrichia (4) and Streptococcus (4). Samples were compared based on the relative abundances of OTUs. This analysis revealed that samples did not cluster according to the severity group neither by hierarchical clustering (Figures 1A,B) or NMDS ( Figure 1C). Nevertheless, the differences in OTU composition among severity groups were significant according to ANOSIM (R = 0.046, p = 0.036).
SIMPER analysis revealed that 25 OTUs were responsible for approximately 70% (p-value 0.04) of the differences in community composition between severity groups 1 and 3 (Supplementary Table S4). These OTUs were classified as members of the phyla Bacteroidotas, Firmicutes, Fusobacteriota, and Proteobacteria. Eleven OTUs had higher average abundance among samples from severity group 1, among which were included three OTUs classified as members of the genus Veillonella. On the other hand, 14 OTUs were more abundant among samples from severity group 3, among which were included four OTUs classified as Prevotella.

Multiple OTUs Display Differential Abundance According to COVID-19 Severity
Using MaAsLin2 and group 0 as a reference, we identified a total of 10 significant associations between bacterial OTUs and patient severity (p < 0.05, q < 0.25), corrected for age and sex. Among those, 9 were positively associated (8 in group 2 and 1 in group 3 when contrasted with group 0) and 1 negatively associated (in group 3 contrasted with group 0) (Supplementary Table S5 and Figure 2A). Of the OTUs positively associated with severity, three were classified as members of the genus Prevotella (OTUs 4, 14, and 16). Due to the heterogeneity of group 0, we also decided to investigate the differences within the SARS-CoV-2 positive patients, using group 1 as reference. The GLM model showed just 1 significant OTU (OTU 16), a Prevotella also found to be significantly associated with severity in the first model (Supplementary Table S5 and Figure 2B). In the same line, the results obtained with QIIME2 using the q2-composition plugin, and with LEfSE method showed that Prevotella sp. was associated with patient severity (Supplementary Figure S2). We did not find any OTUs significantly different between groups 1 and 0. Figure 2A shows the coefficients for all the significant OTUs found by both GLMs and Figure 2B shows OTU 16 CLR transformed counts for all severity groups.

Co-abundance Networks for COVID-19 Severity Groups
In order to investigate how OTUs correlate within the different groups, we generated a total of 4 co-abundance networks, one for each severity group. For the severity group 0, the SARS-CoV-2 negative group, the network displayed 118 nodes with 179 edges. Regarding the other three severity groups, ranging from mild to high severity, the complexity of the network decreased with the increase of severity. The network for patients with mild symptoms (group 1) had 137 nodes with 457 edges, while the network for patients with severe symptoms but not admitted in ICU (group 2) had 129 nodes with 171 edges. The network for severe patients admitted in ICU (group 3) had 100 nodes and 148 edges. In the network of severity group 1, OTU 16 (Prevotella, associated with severity in two GLMs) displayed 18 co-abundant OTUs connected in the network in first degree (Figure 3). Among these connections, ten were negative associations while eight were positive. Most of these connections with OTU 16 were absent from networks of severity groups 2 and 3. Only 3 and 2 first degree connections remained in each of these networks, respectively (Supplementary Figure S3).

DISCUSSION
In this preliminary study, the analysis of the taxonomic composition of the samples showed differences between patients that developed different onsets of COVID-19. Previously, only one study was conducted to investigate the nasopharynx microbiota of COVID-19 patients (Maio et al., 2020). This study did not find any differences between patients and controls. However, this study was conducted only in two groups, the control and mild COVID-19 patients.  In our study, we found subtle changes in nasopharyngeal community composition, meaning that they are restricted to few taxa out of the complete meta-community. Nevertheless, there are detectable and significant changes among OTU abundances. These changes could be linked to the different severity groups, as we identified both taxa that were present exclusively among COVID-19 positive patients as well as those whose abundance was significantly higher or lower among different severity groups. Not only this, but also the complexity of co-abundance networks (which can be taken as a proxy for potential interactions between taxa), was decreased among patients that developed more severe cases of COVID-19. Below we discuss the mechanisms by which specific microbes might play a role in either enhancing or decreasing the severity of COVID-19.

Potential Associations Between Bacterial Taxa and COVID-19 Severity
Among the OTUs positively associated with COVID-19 severity, three were classified as members of the genus Prevotella, and one to a closely related genus, Alloprevotella. A recent study showed that Prevotella proteins can promote viral infection through multiple interactions with NF-κB signaling pathway, which is also involved in COVID-19 severity (Khan and Khan, 2020). The genus Prevotella is usually considered commensal and, as such, rarely involved in infections. However, some strains have been identified as opportunistic pathogens in chronic infections, abscesses and anaerobic pneumonia (Brook, 1998;Brook, 2004;Nagy, 2010;Larsen, 2017). The role of some strains of Prevotella in chronic mucosal inflammation has been demonstrated. They are involved with augmented T helper type 17 (Th17)-mediated mucosal inflammation, through activation of Toll-like receptor 2, followed by production of cytokines by antigen-presenting cells, including interleukin-23 (IL-23) and IL-1 (Larsen, 2017). The severe symptoms of COVID-19 are associated with cytokine storms, many of which are involved in TH17 type responses (Wu and Yang, 2020). The significant association of Prevotella sp. and disease severity observed here suggests a possible link between Prevotella sp. and the COVID-19 through the activation of immunity signaling pathways that modulate inflammation, and this link should be further explored.

Reduced Network Complexity Among Patients Who Later Developed More Severe COVID-19
Several studies demonstrated the usefulness of co-abundance networks to elucidate changes in the microbiota associated with human diseases (Faust et al., 2012;Greenblum et al., 2012;Wang et al., 2016;Chen L. et al., 2020). By switching from individual OTU associations to a community interaction approach it is possible to attain a better understanding of the dynamic of microbiota/phenotype associations, revealing microbial consortia (and not only an OTU) that might be collectively influencing the host phenotype. Our linear models showed OTU 16 (Prevotella sp.) as an important OTU associated with severity. This OTU had the highest number of connections in the network, followed by OTU 9 (Veillonella sp.). Of the four networks generated, the severity group 1 network showed the higher number of interactions with this OTU. Ecological networking, in vitro, and clinical studies showed that Prevotella sp. and Veillonella sp. are keystone species in the microbiota during airway disease progression, especially in diseases associated with mucus accumulation such as cystic fibrosis (Flynn et al., 2016;Quinn et al., 2016;Silveira et al., 2020). These anaerobes are efficient at degrading mucin molecules on the airway mucosa, releasing by-products that enable the colonization and growth of pathogenic bacteria that are poor at degrading mucus for growth (Flynn et al., 2020). In COVID-19 patients, Prevotella sp. and Veillonella sp. could have a similar role due to the decreased mucociliary clearance caused by the viral infection (Robinot et al., 2020). Lower rates of clearing increase the residence time of Prevotella sp. and Veillonella sp. in the airways, likely increasing their mucus metabolism and enabling further colonization by pathogenic bacteria that may cause pneumonia.
OTU 96, classified as Dolosigranulum sp., was identified in the group 1 network and displayed a negative association with OTU 16 (Prevotella sp.) as a first-degree neighbor (Figure 3). OTU 96 did not pass the q-value threshold established for the GLMs but shows significant p-value (0.003 in the model comparing group 2 and group 0, and 0.02 in the model comparing group 2 and group 1 as reference). The only species currently described in this genus is Dolosigranulum pigrum, which is commonly found in the nasopharynx microbiota and is predicted to benefit the host through protection against pneumococcal colonization (Biesbroek et al., 2014;Bomar et al., 2016) and through protection against inflammation damage (Moyano et al., 2020). One study also found a lower abundance of Dolosigranulum in children with Influenza A Virus compared to healthy children (Wen et al., 2018). In addition, a study reported that patients with their airway microbiota dominated by Corynebacterium and Dolosigranulum experienced the lowest rates of early loss of asthma control and have a longer time to develop at least 2 episodes (Zhou et al., 2019). We did not identify Corynebacterium directly connected to OTU 16 (Prevotella sp.), but OTU 78, classified as Corynebacterium is positively associated with OTU 96 (0.7479, p = 0.001) in the co-abundance network from group 1 (Figure 3), indicating that in asymptomatic patients those two taxa are forming a consortium that might protect from disease development. This "consortium" was also implicated in resistance to recurrent ear infections and it was proposed as a probiotic candidate for upper respiratory tract infections (Lappan and Peacock, 2019). The reason that we did not have lower q-value in our GLM for those two taxa could be the lack of power due to the small size of our study. Thus, these associations warrant further investigation.

LIMITATIONS
The major limitation of our study is the small sample size. With only about 15 samples per severity group it is difficult to find statistically significant associations between microbiota composition and disease severity. Nevertheless, this limitation is more likely to lead to false negatives than to false positives. We also cannot rule out confounding factors that might explain the differences between groups, and the fact that many patients of the control group were diagnosed with other respiratory infections or pneumonia, which could have influenced their microbiota in a manner unrelated to COVID-19. Another important limitation is the fact that we performed amplicon rather than whole genome shotgun sequencing. This leads to three issues. First, some of the bacterial diversity is lost due to the fact that the selected primers do not amplify the entirety of bacterial diversity. Second, some genomes have more than a single copy of the 16S operon, which can lead to an overestimation of their abundance in the samples. Third, without metagenomes (and metagenomeassembled genomes) we could not make inferences about the presence of virulence factors and other features of the genomes of the microbes in our samples. We resorted to 16S amplification because our non-invasive approach to collect samples yields low DNA amounts that are inadequate for sequencing. However, as far as we know, this is a unique pilot study in the field. The aim is to be able to transfer useful results to help clinical practice in the fight against the virus and to optimize all the protocols and analyses for a second analysis in which the sample size will be much larger. We are currently working on collecting more samples and optimizing protocols that will allow us to obtain whole genome shotgun sequencing from them.

CONCLUSION
Our data provides preliminary evidence of significant differences in the composition of the upper airway microbiota according to COVID-19 severity, suggesting potential biomarkers of disease severity. While the richness and diversity indexes did not show significant differences among groups, specific taxa were significantly associated with disease development. We also demonstrated that the complexity of the co-abundance network is decreased in patients who came to develop severe cases of the disease, indicating that the interactions between the taxa are also relevant to this process. Further studies will be necessary to shed light on the molecular mechanisms that give rise to these associations. Finally, we make no claim that the differences in microbiota composition reported here are the cause of COVID-19 severity. Nevertheless, the significant associations found between these variables suggests that the role of the microbiota on the onset of disease severity warrants further investigation.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/ Supplementary Material. Raw data was deposited to the National Center for Biotechnology Information Sequence Read Archive under BioProject accession number PRJNA673585.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Ethic Committee of Clinical Research of Alicante University General Hospital (Ref. CEIm: PI2020-052). Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

AUTHOR CONTRIBUTIONS
JR conceived the study. MPV, IV, CM-P, and EM collected the data. RRCC, BGNA, CS, JH-M, and FHC analyzed the data. MPV, RRCC, ML-P, FHC, BGNA, LCAR, HA, and JR wrote the manuscript. All authors reviewed and approved the final version of the manuscript.