Proteome and Microbiome Mapping of Human Gingival Tissue in Health and Disease

Efforts to map gingival tissue proteomes and microbiomes have been hampered by lack of sufficient tissue extraction methods. The pressure cycling technology (PCT) is an emerging platform for reproducible tissue homogenisation and improved sequence retrieval coverage. Therefore, we employed PCT to characterise the proteome and microbiome profiles in healthy and diseased gingival tissue. Healthy and diseased contralateral gingival tissue samples (total n = 10) were collected from five systemically healthy individuals (51.6 ± 4.3 years) with generalised chronic periodontitis. The tissues were then lysed and digested using a Barocycler, proteins were prepared and submitted for mass spectrometric analysis and microbiome DNA for 16S rRNA profiling analysis. Overall, 1,366 human proteins were quantified (false discovery rate 0.22%), of which 69 proteins were differentially expressed (≥2 peptides and p < 0.05, 62 up, 7 down) in periodontally diseased sites, compared to healthy sites. These were primarily extracellular or vesicle-associated proteins, with functions in molecular transport. On the microbiome level, 362 species-level operational taxonomic units were identified. Of those, 14 predominant species accounted for >80% of the total relative abundance, whereas 11 proved to be significantly different between healthy and diseased sites. Among them, Treponema sp. HMT253 and Fusobacterium naviforme and were associated with disease sites and strongly interacted (r > 0.7) with 30 and 6 up-regulated proteins, respectively. Healthy-site associated strains Streptococcus vestibularis, Veillonella dispar, Selenomonas sp. HMT478 and Leptotrichia sp. HMT417 showed strong negative interactions (r < −0.7) with 31, 21, 9, and 18 up-regulated proteins, respectively. In contrast the down-regulated proteins did not show strong interactions with the regulated bacteria. The present study identified the proteomic and intra-tissue microbiome profile of human gingiva by employing a PCT-assisted workflow. This is the first report demonstrating the feasibility to analyse full proteome profiles of gingival tissues in both healthy and disease sites, while deciphering the tissue site-specific microbiome signatures.


INTRODUCTION
As a consequence of the microbial challenge, periodontitis causes destruction of underline connecting tissue, including gingival epithelial layer, which builds a barrier to the external challenge. Thus, microbial invasion of the periodontal tissues may take place during the respective pathological processes (Colombo et al., 2007). In earlier transmission electron microscopy studies, invasion of spirochetes and other microorganisms were evident in the gingival epithelium and connective tissues, especially in patients with acute necrotising gingivitis (Listgarten, 1965;Courtois et al., 1983). This tissue invasive feature is different from that of endocytosis by non-phagocytic host cells, through which bacteria evade phagocytic elimination by the immune system. However, bacterial invasion has traditionally been considered to take place at relatively late stages. Based on this rationale, most periodontal microbiome studies have been focused on the characterisation of biofilms. It is plausible that, at least in part, bacterial invasion is involved in the pathogenesis of periodontitis. Interestingly, one study showed that bacteria could form a biofilm-like structure within the gingival tissue (Baek et al., 2018). Further, 16s rRNA profiling analysis has shown that Fusobacterium nucleatum and Porphyromonas gingivalis were highly enriched within the tissue compared with the plaque (Baek et al., 2018). Although most available microbiome studies were mainly focused on the bacterial plaque (biofilm), an overall microbiome map directly derived from human gingival tissues is necessary to draw the whole picture for understanding this virulence aspect of periodontal disease.
With the help of mass spectrometry, researchers can identify thousands of proteins for a given sample in a single run , which is ideal for delivering a snapshot of protein regulations within the gingival tissue. However, attempts to map gingival tissue proteomes have been hampered by lacking sufficient protein extraction workflows. Bertoldi et al. (2013) reported 13 gingival proteins differentially regulated in diseased sites, compared with their neighboring inter-proximal healthy sites. This included the upregulation of annexin A2, actin cytoplasmic 1, carbonic anhydrase 1 and 2; Ig kappa chain C region and flavinreductase as well as downregulation of 4-3-3 protein sigma and zeta/delta, heat-shock protein beta-1, triosephosphateisomerase, peroxiredoxin-1, fatty acid-binding protein-epidermal, and galectin-7 in pathological tissues. Monari et al. identified 32 different protein spots and elevation of S100A9, 14-3-3 protein zeta/delta, Heat shock protein beta-1 and Galectin-7 in gingival tissues from periodontal patients compared with those from healthy individuals (Monari et al., 2015). Whereas, Yaprak et al. (2018) identified 47 proteins from healthy gingival tissue, including 14-3-3 protein sigma, S100A9 and Galectin-7, which also identified in the works of Bertoldi et al. and Monari et al. Yet, although transcriptomic and proteomic patterns are rarely similar (Wang et al., 2017), transcriptomic analysis of gingival tissues has identified as many as 12,744 expressing genes (Demmer et al., 2008), indicating that there is plenty of space for the improvement for proteomic identification.
A sensitive pressure cycling technology (PCT)-assistant workflow with proven efficiency in gingival tissue disruption (Bao et al., 2019) was used in this study. The present study aimed to concomitantly characterise the gingival tissue proteome and microbiome of systematically healthy individuals with periodontitis, by comparing healthy and diseased sites. Labelfree quantitative proteomics and 16SrRNA gene sequencing platforms were applied to dissect the relationship between bacterial abundance and protein regulation among these gingival tissues.

Proteome Profiles of Gingival Tissue Samples Cluster Based on Clinical State
The gingival tissue proteome charted in this study derived from 10 gingival tissues (one healthy and one diseased site per individual) obtained from 5 individuals with stage III periodontitis. Prevalence of teeth with one or more sites with probing pocket depth (PPD) > 5 mm and PPD > 5 mm were % 44 ± 5.3 and % 35.6 ± 6.3. Approximately 82% of sites with PPD > 5 mm had bleeding on probing (BOP). The mean PPD and clinical attachment loss (CAL) scores of the sampled diseased sites were significantly higher than the healthy ones [p < 0.05, PPD (mm): 2.2 ± 0.8 vs. 7.0 ± 0.7, p < 0.001, CAL (mm): O vs. 8.0 ± 0.7, p < 0.0001].
Following a PCT-assisted label-free quantification work-flow, we obtained an overview of the gingiva proteome of 1,369 proteins (including 2 contaminant and 3 decoy proteins), with a protein false discovery rate (FDR) of 0.22%. Each quantified protein consisted of at least two unique peptides identified and quantified (Appendix Table 1). Although unsupervised hierarchical clustering analysis of the tissue proteomes could not distinguish healthy from diseased sites based on their normalised abundances ( Figure 1A), this became possible by the utilisation of sPLS-DA ( Figure 1B). Considering that protein regulation among individuals may vary, we assessed the differentially expressed protein levels by comparing intra-individually healthy and diseased sites, using paired t-test. Of all quantified proteins, 62 qualified as higher [log2 (FC) ≥ 0, P ≤ 0.05], whereas only 7 qualified as lower [log2 (FC) ≤ 0, P ≤ 0.05] in diseased sites compared to the respective healthy sites ( Figure 1C, Appendix Table 2).

Gene Ontology (GO) Analysis of the Regulated Proteins in the Gingival Tissue
The GO functions of differentially expressed proteins were annotated using the METACORE online software (https:// portal.genego.com, Thomson Reuters). The top enriched GO terms for localisations, processes and molecular functions of all 69 regulated proteins were recognised and ranked according to their statistical significance ( Table 1, Appendix Table 3). The major cellular localisation of the regulated proteins was "extracellular"-or "vesicle"-related (Table 1A). For instance, "extracellular exosome, " "extracellular vesicle, " and "extracellular organelle" were the top three enriched terms ( Table 1A). Many of these proteins were linked to "localisation" or "transport" processes (e.g., "cellular localisation, " intracellular FIGURE 1 | Proteome of gingival tissues from healthy and diseased sites. (A) The heatmap of normalised abundance for identified and quantified proteins in gingival tissue. Samples isolated from healthy sites were highlighted in blue colour, while samples isolated from diseased sites were highlighted in red. (B) The sPLS-DA plots represented the normalised abundance of all the 1,366 proteins from healthy (blue triangles) or diseased sites (red triangles). (C) 62 proteins were upregulated [log2 (FC) ≥ 0, P ≤ 0.05] in the disease compared with healthy sites (red dots), and 7 were downregulated [log2 (FC) ≤ 0, P ≤ 0.05]. (D) String analysis for the interaction between regulated proteins. Network established using STRING with interaction confident scores more than 0.9. The upregulated proteins were labelled in black, while the downregulated proteins were labelled in blue. The methods for acquiring the different protein-protein interactions were illustrated by different lines. The interaction confirmed by the curated database and experimental results were shown in blue and purple line, respectively. The interaction predicated by gene neighborhood, gene fusions, and gene co-occurrence were shown in green, red, and dark blue lines, respectively. While interaction was determined by text-mining, co-expression and protein homology were shown in yellow, black, and light blue lines, respectively. Healthy sites: H. Diseased sites: D. protein transport) ( Table 1B), whereas their top molecular functions belonged to the "structural constituent of ribosome" and "structural molecule activity" (Table 1C).
To further understand their inter-relationships, the proteinprotein interactions were analysed using STRING (https:// string-db.org/) (Appendix Table 4). When applying the highest confidence score (0.9), 40 among the 69 regulated proteins, formed 73 pairs of known such interactions, as illustrated by string networks (Figure 1D). The largest cluster of protein interactions consisted of 10 different proteins, which were mainly ribosomal ones. The secondlargest cluster identified consisted of 8 proteins with assigned macromolecular transport properties (e.g., Rasrelated proteins, general vesicular transport factor p115). In addition, among the interactions illustrated in the network, some were identified between actin and other intracellularstructural proteins, or between five dehydrogenases and aminotransferases, as well as more sparse interactions of only two or three proteins.

Microbiome Profiles of Gingival Tissue Samples Cluster Based on Clinical State
Our next approach was to examine microorganisms present in the gingival tissues by establishing a microbial catalogue from 450,668 sequences that binned within 97% sequence identity from all ten-tissue samples (Appendix Table 5). On average, more than 119609.8 reads were identified from each sample, with a standard deviation of 77658.61. To analyse the alpha diversity, rarefaction curves were plotted based on the observed OTUs (Figure 2A), with calculated coverages for disease and health of 99.32 and 99.50%, respectively ( Figure 2B), while the inverse Simpson diversity for disease and health were 7.18 and 5.05, respectively ( Figure 2C). Furthermore, 362 non-rare OTUs were discovered among all ten-samples (data are available via ENA), with no significant differences in the number of detected OTUs between disease and health (P = 0.277, 77, and 74 average OTUs for diseased and healthy sites, respectively) using paired t-test ( Figure 2D). To visualise differences in community structure between the groups, an NMDS plot of the thetayc distance was generated ( Figure 2E), yielding sample clustering based on sites (healthy tissue vs. diseased tissue), not by sample pairing. Such sample clustering indicated the presence of different OTUs between diseased and healthy tissues. In addition, unsupervised hierarchical clustering analysis of OTU abundance also pointed to global microbiome differences between the two types of gingival tissues, where samples were also clustered according to tissue type ( Figure 2F).

DISCUSSION
Different proteome and microbiome studies have been performed to understand periodontal diseases, yet few have focused on gingival tissues. Earlier studies indicated that the microbial content of the periodontal pocket in non-human primates or human determines the gene expression patterns in the gingival tissues (Papapanou et al., 2009;Ebersole et al., 2020). In the present study, we successfully applied a contemporary PCT-assisted workflow to dissect the gingival proteome and microbiome of both diseased and healthy sites from patients with periodontitis. The tissue recipient sites included both maxilla and mandible, which may exhibit different degrees of keratinisation. It should also be acknowledged that the stringent requirement  Table 7). Positive correlations were indicated with red lines, while negative correlations were indicated with blue lines. Only correlations more than 0.7 were showed. The levels of expression in healthy sites were indicated in green line, while the levels of expression in inflammatory sites were indicated in red. Healthy sites: H. Diseased sites: D.
FIGURE 4 | Illustration of the sample sits for gingival tissue. Both pocket epithelium and underlying connective tissue were included in the gingival specimen taken from the approximal sites of the selected teeth (grey; tooth, pink; gingival connective tissue, purple-red; gingival epithelium, brown-beige; alveolar bone).
of obtaining both a healthy and a diseased tissue specimen from the same donor prohibited us from limiting further the sampling criteria to either jaw, else it would have been more cumbersome to identify suitable patients.
The study also presents the most comprehensive quantitative proteome map of human gingival tissue to date, by quantifying 1,366 proteins, while earlier proteome analyses quantified <50 proteins at a time. Interestingly, only 69 of over 1,300 quantified proteins were significantly differentially expressed, whereas many of the inflammation-related proteins, such as Ig gamma-1 chain C region and Protein S100-A9, were abundantly expressed at both sites. The fact that the global proteomic profiles were similar between healthy and diseased sites denotes that protein composition of clinically healthy gingival tissue in the periodontal patient may have already been altered, even in the absence of evident clinical signs of inflammation at those sites. Of note, we used a data-dependent acquisition (DDA) strategy in this study. DDA only samples a subset of the most abundant ions detected during the first MS scan for the further fragmentation and sequential MS scans, while discards the rest. Hence, many proteins with demonstrated roles in periodontitis, including various cytokines, might have been masked due to their lower abundance. The high biological variabilities among patients may also contribute to low number of differentially expressed proteins. Nevertheless, this observation on overall protein regulations is consistent with transcriptomic changes observed in gingival tissue in periodontitis (Kebschull and Papapanou, 2011). Furthermore, the prediction of the protein functions shows that most of the regulated proteins were localised in the extracellular space. Similarly, increased numbers of secreted proteins were previously identified in human experimental gingivitis  and murine ligature-induced periodontitis (Bao et al., 2019). Hence, it is not surprising to find that many of the regulated proteins had transport-related functions. We also observed increases in the ribosome-related proteins (e.g., 60S ribosomal protein L37a, 40S ribosomal protein S25, 40S ribosomal protein S8, etc.), indicating ribosomal biosynthetic activity in the inflamed sites (Zhou et al., 2015).
Under homeostatic conditions, the host is in a balanced relationship with commensal oral species or potential pathobionts (Hajishengallis and Lamont, 2014). Hence, the concomitant study of the proteome and the microbiome is high relevant. One the microbiome aspect, 14 species accounted for 80% of the abundances, only 11 significantly differentiated between health and disease, six of which belonging to the rarely abundant group. This denotes that both high and low abundance species are to be considered in the future for defining signatures of target organisms capable of distinguishing between clinical health and disease. The oral microbiome data obtained in this study from gingival tissues identified potentially invasive species of the periodontium. Nevertheless, it cannot be definitively confirmed that the detected bacteria were all actual tissue invaders, and not superficial persisters after the washing steps. Yet, only a limited portion of the tissue interface has been in direct contact with the biofilm, which is dispersed during the homogenisation process of the specimen, thus down-playing the representation of the non-invaded species. Although their precise effects on the tissue may currently be unclear, different species were earlier found to co-exist within the gingival tissue (Baek et al., 2018). The relative abundances of different taxa may denote their potential roles in health and disease, or their invasive capacity. Treponema sp. HMT 253 and F. naviforme were significantly increased in diseased compared to healthy sites. For Treponema sp. HMT253, the only information currently available on this as-yet-uncultured species is its 16S rRNA gene clone library, derived from dental plaque of subjects with periodontitis and acute necrotising ulcerative gingivitis (Dewhirst et al., 2000). It is closely related to Treponema denticola, a potential pathogen implicated in periodontal disease (Dewhirst et al., 2010). Different models have shown that T. denticola is able to invade the epithelium and basement membrane (Grenier et al., 1990;Lux et al., 2001;Chi et al., 2003), as well as to secrete a chymotrypsin-like protease that can digest host components including type IV collagen, laminin and fibronectin (Grenier et al., 1990). T. denticola dentilisin was also reported to disrupt the epithelial cell monolayer (Chi et al., 2003). Fusobacterium naviforme (formerly F. nucleatum ssp. naviforme) has been identified and isolated from subgingival plaque samples (Colombo et al., 2009). Based on phylogenetic analysis (Dewhirst et al., 2010), it is expected to display functional similarities F. nucleatum. Although Treponema sp. HMT 253 and F. naviforme were not usually found as an abundant constituent of the subgingival biofilm in patients with periodontitis, their abundance within the tissue documented in this study suggests a greater invasion potential than their cultivated and characterised relatives. Of further note, other species that were elevated in diseased compared to healthy tissue, but did not reach statistical significance, are worth mentioning. Such were F. alocis and Fretibacterium sp. HMT 358, suggesting that they are potentially invasive of the gingival tissues. Both F. alocis and Fretibacterium sp. have been increasingly associated with periodontal disease. Fretibacterium sp. belongs to the phylum Synergistetes and is shown to be increased in the saliva of patients with periodontitis (Belibasakis et al., 2013) and in dental biofilms of patients with ANUG (Baumgartner et al., 2012), an invasive form of periodontal disease.
Some potentially invasive but generally less pathogenic species, including S. vestibularis, V. dispar, and Selenomonas sp. HMT 478, were enriched in healthy sites. Even though originally identified in the oral cavity (Whiley and Hardie, 1988), the presence of S. vestibularis has not been reported in periodontitis, but in other infectious diseases (Duan et al., 2017;Yilmaz et al., 2017). V. dispar is found in subgingival plaque from chronic periodontitis patients (Moon et al., 2015) and plays an import role when saliva is the main nutritional source of oral biofilm (Kolenbrander, 2011). The presence of Selenomonas spp. are reported in the salivary (Duan et al., 2017) or dental plaque microbiome (Paster et al., 2001;Faveri et al., 2008) of periodontal patients, but at a lower prevalence compared with other putative pathogens (Goncalves et al., 2012). Previous studies have shown that Streptococcus , Veillonella (Kolenbrander, 2011) andSelenomonas (Goncalves et al., 2012) contribute to the structural organisation of oral biofilm. Streptococcus spp. have the potential to colonise or invade the gingival tissue, but with no known association to gingival inflammation, which is well in line with our findings. It should be noted that, although diseased sites showed higher abundances of Treponema sp. HTM 253, most healthy sites fostered a fairly high proportion of this species. Perhaps this species allows other less invasive microorganisms like Veillonella spp. and Streptococcus spp. to penetrate the tissue barrier.
Thus far there has been inconsistent evidence to support or exclude the invasive properties of oral species in the pathogenesis of periodontal disease (Mendes et al., 2015). From an epidemiological perspective, there is sufficient evidence on the role of specific species as etiological agents of periodontitis, but the disease may be better understood as dysbiotic inflammation resulting from the concerted interaction of correlations (Lopez et al., 2015;Hajishengallis and Lamont, 2016). The interactome analysis of the present study indicates that groups of significantly elevated species and proteins tend to correlate with one another in health or disease. Information derived from such studies may decipher biological signatures in periodontal disease, which will help us understand its etiopathogenesis on the tissue level and may confer future diagnostic and prognostic value (Belibasakis and Mylonakis, 2015).

Study Population and Design
Gingival tissues (n = 10) were collected from two sites (one healthy and one diseased) of each five systematically healthy individuals with stage III periodontitis (age range from 45 to 56 years with a mean age 51.6 ± 4.5 years, F:M: 2:3). The study was approved by the Ethics Committee of Ege University (number 17-11.1/34) and conducted following the guidelines of the World Medical Association Declaration of Helsinki. Patients first attended the Department of Oral Diagnosis and Radiology for the completion of clinical and radiological examination procedures and were then directed to the specialised within the University Dental Clinics, for further assessment and treatment. Exclusion criteria were the use of tobacco products, presence of cardiovascular and respiratory diseases, diabetes mellitus, HIV infection, systemic inflammatory conditions or non-plaqueinduced oral inflammatory conditions, immunosuppressive chemotherapy, and current pregnancy or lactation. None of the patients had a history of periodontal therapy or had taken medication such as antibiotics or anti-inflammatory drugs that could affect their periodontal status for at least 6 months prior to the study. Informed consent was obtained from all participants. Full-mouth and site-specific periodontal parameters including PPD, CAL, dichotomous presence of BOP, and plaque for each patient were recorded. The full-mouth means PPD (mm) and CAL (mm) were 5.1 ± 0.4, 5.7 ± 0.4, respectively. The fullmouth mean plaque and BOP scores were 72.0 ± 6.7 and 75.0 ± 5.0, respectively.

Collection of Gingival Tissue Samples
Gingival tissue samples, including both pocket epithelium and underlying connective tissue, were taken from the approximal sites of the selected teeth prior to non-surgical periodontal therapy (Figure 4). The "healthy sites" had no clinical signs of gingival inflammation (no BOP), exhibited a PPD of ≤ 3 mm and had no radiographic evidence of alveolar bone loss and no CAL. These healthy tissues were sampled when the premolars were scheduled to have periodontal crown lengthening surgery. The "diseased sites" showed BOP, had an interproximal PPD of ≥6 mm, and a concomitant CAL of ≥6 mm. Two gingival tissue samples from each participant were obtained and washed with sterile normal saline solution to remove any blood or detached biofilms on the tissue surface. Tissues were then placed in a sterile tube containing a tissue protectant solution (RNAlater, Sigma-Aldrich) and stored at +4 • C overnight, before long-term storage at −70 • C until later usage.

Protein Extraction and Digestion
Gingival tissues were washed three times, each for 5 min in PBS to remove any residues prior to lysis. The tissues were then lysed and digested using a Barocycler NEP2320 (Pressure BioSciences) at 33 • C as described previously (Bao et al., 2019). In brief, 2.5 to 3 mg of samples (n = 10) were placed in MicroTubes (Pressure BioSciences) and lysed with a 60-cycle barocyling process. The exacted proteins were then reduced and alkylated using tris (2-carboxyethyl) phosphine (Sigma) and iodoacetamide (Sigma). Later, extracted proteins were digested using Lys-C (Wako) at an enzyme-to-protein (estimated to be 10% of the wet sample weight) ratio of a 1:45 with a 45-cycle barocyling process. These resultant solutions were further diluted and then digested again using trypsin (Promega) at an enzymeto-protein ratio of 1:50 with a 90-cycle process. Each barocylingcycle mentioned above consisted of a 50 s ultra-high pressure phase (45, 20, 20 thousand pounds per square inch (KPSI) for 60-, 45-, 90-cycle process, respectively) followed with a 10 s ambient pressure phase in each cycle. Resultant solutions were acidified by trifluoroacetic acid (TFA) (Sigma) to a final concentration of 0.8% w/v, desalted using reverse-phase cartridges Finisterre SPE C18 (Wicom International AG), dried with vacuum centrifuged, and kept in −20 • C freezer until further use.

Mass Spectrometry (MS) and Data Analysis
All frozen peptides were reconstituted in 3% acetonitrile ACN in 0.1% formic acid and adjusted to 0.5 µg/µl using NanoDrop 1000 spectrophotometer (WITEC AG). One microlitre of desalted peptide was analysed on an Orbitrap Fusion mass spectrometer (Thermo Fisher Scientific) for proteomic analysis as described previously .
Label-free quantification was performed by Progenesis QI for proteomics (Non-linear Dynamics) as described previously . In brief, .raw files of individual samples were aligned with a pooled sample to create a Mascot files (.mgf). This .mgf files was searched with Mascot (version 2.4.1, Matrix Science) using the following search parameters: precursor tolerance: ± 10 ppm; fragment ion tolerance: ± 0.6 Da; enzyme: trypsin; maximum missed cleavages: 2; fixed medication: carbamidomethylation of cysteine; variable modification: deamidated (NQ), oxidation (M) and acetylation on protein N-termini. Data were searched against a FASTA file (40,510 sequences and 22,667,481 residues), consisting of the human proteome from UniProt (isoforms included, retrieved December 9th 2016), contaminant database from the FGCZ, the resulting.dat file was imported into Scaffold v4.0 (Proteome Software) to generate spectrum report, with protein false discovery rate (FDR) of 10%, minimal one peptide and peptide FDR of 5%. Finally, the spectrum report was imported back into Progenesis QI for identifying the quantified proteins. The "within-subject" option was used for experiment design set up. Proteins with a minimum of two unique quantified peptides and a significant ANOVA p-value smaller than 0.05 were considered as differentially regulated ones.

Data Clustering and Heat Maps for Regulated Proteins
Unsupervised clustering analysis and heat maps of regulated proteins were generated using the R software (R: A Language and Environment for Statistical Computing, R Development Core Team) in particular the Quantable packages (https://cran. r-project.org/web/packages/quantable/index.html) to obtain a global visualisation and regulation trends of protein profiles. Apparent outliers were excluded from the quantification. Sparse Partial Least Squares Discriminant Analysis (sPLS-DA) was used to visualise the similarity between healthy and inflammation sites using the SPLS package (Chun and Keles, 2010).

Functional Analysis of the Regulated Proteins
The regulated proteins were subjected to Metacore online database (29th May 2019) for "gene ontology (GO) enrichment analysis." Enriched GO terms were recognised and ranked according to their statistical significance (-log2P value), using a hypergeometric distribution.

Sample Disruption and DNA Extraction
Other gingival tissues (approximal 2.5 to 3 mg, n = 10) were lysed using a Barocycler NEP2320 (Pressure BioSciences) with only a 60-cycle barocyling process (each consisting 50 s at 45 KPSI followed by 10 s at ambient pressure) at 33 • C for DNA extraction. The genomic DNA from lysates was extracted using the GenElute TM Bacterial Genomic DNA Kit (Sigma) and stored at −20 • C until further use.

Sample Preparation for 16S rRNA Gene Sequencing
The hypervariable regions 7 to 9 (V7-9) of the 16S rRNA gene were amplified in the first round of PCR from isolated genomic DNA using universal bacterial primers that also contained the Illumina Truseq primer binding site (Appendix Table 8). Amplification reactions were performed in a total volume of 25 µl containing 5X KAPA HiFi Buffer, 10 mM KAPA dNTP Mix, 0.5 U KAPA HiFi DNA Polymerase (KAPA Biosystems), 4 µM of primers ordered from Microsynth (Balgach), and 22.4 ng DNA diluted in DNA-free water. The PCR amplification was performed on a Verity thermocycler (Thermo Fisher Scientific) with the following cycling conditions: 95 • C for 5 min, 25 cycles at 98 • C each for 20 s, 70 • C for 30 s, 72 • C for 30 s and a final extension at 72 • C for 10 min. The PCR reactions were run on a 2% agarose gel, the amplicon band was cut and extracted using MinElute Gel Extraction Kit (Qiagen) and eluted in 50 µl DNase-free water.
In the second round of PCR, the remaining Illumina Truseq adaptors together with dual indexing Truseq barcodes were incorporated into the previously amplified material (Appendix Table 9). Amplification reactions were performed in a total volume of 50 µl containing 5X KAPA HiFi Buffer, 10 mM KAPA dNTP Mix, 1 U KAPA HiFi DNA Polymerase (KAPA Biosystems), 4 µM of the primers ordered from Microsynth (Balgach) and 22.5 ng of the previously amplified material diluted in DNA-free water. PCR amplification was performed on a Verity thermocycler (Thermo Fisher Scientific) with the following cycling conditions: 98 • C for 5 min, five cycles at 98 • C each for 30 s, 54 • C for 30 s, 72 • C for 30 s and a final extension at 72 • C for 5 min. PCR products were gel-purified and eluted in 50 µl DNase-free water. The quality and quantity of resulting amplicon libraries were validated using Qubit R (1.0, Invitrogen) Fluorometer and the Tapestation (Agilent). The amplicons from the different samples were normalised to 4 nM in Tris-Cl 10 mM, pH8.5 with 0.1% Tween 20 and as they contain dual indexes, they were equimolarly pooled and paired-end sequenced in a Miseq Illumina Instrument (Illumina CA) using a 600cycle V3 kit.

Processing and Taxonomic Classification of 16S rRNA Gene Reads
MiSeq paired-end (PE) reads were first filtered based on average quality (>= Q20) using Trimmomatic (version 0.36) (Bolger et al., 2014). Quality checked PE reads were processed using mothur (version 1.38.1) (Schloss et al., 2009), following the MiSeq SOP (Standard Operation Protocol) (https://www. mothur.org/wiki/MiSeq_SOP). In detail, the quality-filtered PE reads were joined into contig sequences. Identical sequences were merged and the counts of all unique sequences were recorded. Unique sequences were aligned guided by the Silva bacterial 16S reference alignment (Release 102) (Quast et al., 2013). After alignment, the bulk of the sequences started at position 34,476 and ended at position 43,116 of the reference alignment. Sequences aligned at the different start and/or stop sites, as well as sequences with homopolymers longer than 8 nt were filtered out. Sites containing only gap characters were also removed. Sequences were pre-clustered allowing for up to three base differences. Chimaera sequences were removed using the UCHIME algorithm (Edgar et al., 2011). Sequences were initially classified by comparing them to the mothur-formatted RDP training set (v.9), with cutoff values set at genus level (Cole et al., 2014). This taxonomic information was used to remove undesired contaminants (Chloroplast, Mitochondria, unknown Archaea and Eukaryota) and to split the sequences into 16S genus bins (taxlevel=6) and one un-classified bin. Each of the 166 bins was then clustered into operational taxonomic units (OTUs) using the single-linkage clustering algorithm implemented in hpc-clust (Matias Rodrigues and von Mering, 2014), with 97% sequence similarity as the cutoff. The mothur-compatible OTU list was prepared using the utility script "makeotus_mothur" in the same software and imported into mothur for OTUbased analysis. To taxonomically classify OTUs, representative sequences (the most abundant) were compared against the Human Oral Microbiome Database (HOMD) (Dewhirst et al., 2010) using BLASTN in ncbi-blast-2.6.0+ (Altschul et al., 1990). The taxonomy of the best match (with >96.99% homology) was assigned to the corresponding OTU. If the representative sequence had <97% homology to the HOMD reference, the genus name was used to taxonomically designate the OTU.

Microbiome Data Analysis
In all analysis where normalisation was applied, standardised datasets were generated by randomly selecting 3,597 sequences 1,000 times from each sample. To analyse the alpha diversity of the samples, rarefaction curves describing the number of OTUs observed as a function of sampling effort were plotted. The numbers of sequences, the sample coverage, the number of observed OTUs, and the Inverse Simpson diversity were calculated. To compare the membership and structure of the samples between groups, distance matrices for the classical Jaccard (1908) and Yue and Clayton theta values (Yue and Clayton, 2005) were calculated. The distance matrices were also visualised using NMDS (non-metric multidimensional scaling) (Clarke, 1993). The R software, in particular, the packages "gplots" and "stats, " were used to generate unsupervised clustering analysis and heatmaps of non-rare OTUs (abundance ≥ 50 across all samples). Differentially represented OTUs were evaluated via paired Student's t-test using their relative abundances p < 0.05. Benjamini-Hochberg corrected P-values and power calculations were provided for each OTUs.

Correlations Between Microbiome and Protein Datasets
To visualise the correlation between differentially presented OTUs and proteins, Circos plots and cluster-imagine maps were generated for r values (Pair-wise variable associations for canonical correlation analysis correlation between variables, defined by a generalisation of the cosine angle between the center of the circle and each variable point; Gonzalez et al., 2012) using the mixOmics package in R (Rohart et al., 2017).

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found here: https://www.ebi.ac.uk/ena, PRJEB27125 (ERP109167); http:// www.proteomexchange.org/, PXD018029.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Ethics Committee of Ege University (number 17-11.1/34). The participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
KB contributed to data acquisition, analysis, interpretation, drafted, and critically revised the manuscript. XL, LP, and WQ contributed to data analysis and interpretation as well as critically revised the manuscript. NS, JG, PD, and GH critically revised the manuscript. PG and GE contributed to conception and design, data acquisition, and critically revised the manuscript. NB and GB contributed to conception and design, data interpretation, drafted, and critically revised the manuscript. All authors gave final approval and agreed to be accountable for all aspects of the work.