Detection of Alpha, Beta, Gamma, and Unclassified Human Papillomaviruses in Cervical Cancer Samples From Mexican Women

Background: Cervical cancer (CC) is associated to high-risk human papillomavirus (HPV) infections, for this reason it is crucial to have sensitive and accurate HPV diagnostic tests. To date, most research is focused on HPVs within the Alphapapillomavirus (α-PVs) genus and little attention has been paid to cervical infections with other HPV genotypes, like those of the Betapapillomavirus (β-PVs) and Gammapapillomavirus (γ-PVs) genera. The aim of this study was to determine the HPV genotypes from different genera in women with CC using Next-Generation Sequencing (NGS). Methods: The study comprised 48 HPV positive CC samples evaluated with the Linear Array HPV Genotyping test and individually sequenced by 454 NGS using PGMY09/11 and FAP primers. To determine the HPV genotypes present in each sample, the obtained sequences were compared with all HPV L1 gene reference sequences from the Papillomavirus Episteme database (PaVE). Moreover, 50 HPV positive low-grade cervical lesion samples individually genotyped with NGS were also included to determine the genotypes present preferentially in CC patients. Results: Among the 48 CC samples, 68.75% consisted of multiple HPV infections, 51 different genotypes were detected, of which 7 are still unclassified, 28 belong to α-PVs (6, 11, 16, 18, 26, 30, 33, 35, 39, 42, 43, 44, 45, 51, 52, 53, 54, 59, 62, 66, 68, 69, 70, 71, 74, 81, 102, 114), 10 to β-PVs (5, 12, 21, 37, 38b, 47, 80, 107, 118, 122), and 6 to γ-PVs (101, 103, 123, 135, 147, 214). Among them, HPV16 was the most prevalent genotype (54.2%), followed by HPV18 (16.7%), HPV38b (14.6%), and HPVs 52/62/80 (8.3%). Some genotypes were exclusively found in CC when compared with Cervical Intraepithelial Neoplasia grade 1 (CIN1) samples, such as HPVs 5, 18, 38b, 107, 122, FA39, FA116, mSK_120, and mSK_136. Conclusions: This work demonstrates the great diversity of HPV genotypes detected by combining PGMY and FAP primers with NGS in cervical swabs. The relatively high attribution of β- and γ- PVs in CC samples suggest their possible role as carcinogenic cofactors, but deeper studies need to be performed to determine if they have transforming properties and the significance of HPV-coinfections.


INTRODUCTION
Human Papillomaviruses (HPVs) are small circular doublestranded DNA viruses that infect mucosal or cutaneous epithelium and that in most of the cases persist asymptomatically during the host's lifetime. However, certain HPV infections can cause clinical manifestations that range from benign lesions to malignant tumors such as skin or cervical cancers (Bravo and Félez-Sánchez, 2015). HPVs typically contain 8 genes in their 8 kb genome, including the L1 gene that encodes the major capsid protein and is used for virus classification as its nucleotide sequence is well-conserved among HPVs. A papillomavirus genotype is a full-length genome whose L1 gene sequence differs in at least 10% from that of any other papillomavirus (Bernard et al., 2010). HPVs are grouped in five major phylogenetic genera referred to as Alpha (α)-, Beta(β)-, Gamma(γ )-, Mu(µ)-, and Nu(ν)-papillomaviruses, where viruses that belong to different genera have <60% similarity within the L1 gene. Within a genus, HPVs are classified in species that share between 60 and 70% of similarity (Bzhalava et al., 2015). According to the International HPV Reference Center (Karolinska Institute), 226 HPV genotypes have been described to date, but new HPVs are continuously being found 1 . It is worth noting that the number of HPVs belonging to the β-and γ -PV genera has rapidly increased in the last years (Mühr et al., 2018). Concerning the α-PV genus, it contains the carcinogenic HPVs called highrisk (hr-HPVs). If those viruses persist over time in their site of infection, they can cause cervical cancer (CC), one of the most common cancers in women worldwide, as well as cancers of vagina, vulva, penis, anus, and oropharynx (Schiffman et al., 2016). HPV16 is the predominant genotype causing invasive CC and other HPV-related cancers worldwide (Guan et al., 2012). Therefore, to efficiently detect HPV16 and the other genotypes, HPV screening programs in reach of most of the target-population must be implemented. Apart from primary prevention with HPV vaccination, screening for cervical lesions using cytology and HPV tests is a necessary approach (Bosch et al., 2016). Even though the great majority of α-PVs infect mucosa in the anogenital epithelia, some viral genotypes from this genus were originally described as cutaneous, together with β-and γ -PVs. However, regarding β-PVs, it has been difficult to determine their individual role in cutaneous malignancies due to their high diversity and their dissemination throughout many tissues such as the skin, the oral cavity, the nasal mucosa, and the anogenital region (Nunes et al., 2018).
In recent years, the development of highly sensitive HPV detection methods has allowed the identification of a large number of new HPVs and their ubiquity in samples from different origin. For research purposes, sensitive, and specific technologies that cover a broad range of HPV genotypes have been developed (Nilyanimit et al., 2018). More precisely, the use of PGMY and FAP primers couple with Next-Generation Sequencing (NGS) strategies detect α-, β-, and γ -PVs simultaneously in different tissues have helped to broaden the knowledge in this field (Flores-Miramontes et al., 2015;Sias et al., 2019). The aim of this work was to study the HPV genotypes present in cervical samples from women diagnosed specifically with CC using NGS with primer pairs (both PGMY11/09 and FAP) that allow amplification not only of α-PVs, but also of multiple β-and γ -PVs.

Sample Collection
In this study, more than 50 CC samples were collected from women who attended a medical examination at the Hospital de Ginecología y Obstetricia, Centro Médico Nacional de Occidente, IMSS in Guadalajara, Jalisco, Mexico. Additionally, more than 100 CIN1 (Cervical Intraepithelial Neoplasia grade 1) samples were collected both in the hospital mentioned above, as in the Hospital General de Zona # 12, Lic. Benito Juárez, IMSS, in Mérida, Yucatán. Sample recruitment was done by gynecologists throughout 2016-2017, by collecting the cervical swabs in Preserv-Cyt medium solution Hologic,Inc.,Marlborough,MA,USA). All diagnoses were confirmed by immunohistopathological analysis. It is meaningful to mention that the samples were collected only from women who had not previously had any type of treatment.
This study was approved by the Ethics and Research Committees of the Instituto Mexicano del Seguro Social (IMSS), with registration number R-2014-785-036. All women provided written informed consent before the sample collection.

DNA Extraction and Linear Array HPV Genotyping Test
Total DNA was extracted from each cervical sample using the AmpliLute Liquid Media Extraction Amplicor kit (Cat. no. 03750540190; Roche Molecular Systems, Inc., Branchburg, NJ, USA.). HPV positivity was determined by the Linear Array HPV Genotyping Test (Cat. no. 03378179190, Roche Molecular Systems) according to the manufacturer's instructions. Two CC samples and six CIN1 samples were excluded due to low DNA concentrations.

HPV Genotyping by 454 Next-Generation Sequencing
Considering the positivity to HPV by Linear Array, 48 CC samples were selected to perform NGS. Additionally, two groups of CIN1 samples were chosen: (1) a group of 23 samples in which each sample had coinfection with at least 3 genotypes (named "multiple-infection group") and (2) a group of 27 samples that were HPV negative by Linear Array, but positive by FAP primers amplification (named "PGMY-HPV-neg group").
First, the DNA of each sample was amplified by two independent conventional PCRs with PGMY09/11 primers on one side and FAP59/64 primers on the other side (Forslund et al., 1999;Gravitt et al., 2000). Then, a second PCR reaction was performed using PGMY09/11 on one side and FAP6085/64 primer pairs on the other side, coupled to a universal tail sequence as previously described (Flores-Miramontes et al., 2015). PCR products were quantified using the Qubit dsDNA HS (Cat. no. Q32854, Life Technologies, Eugene, OR, USA), diluted to 5 × 10 9 molecules per microliter and 454 MIDs from Multiplicom NV CFTR kits individual barcodes were added to each sample (Cat. nos. Molecular Diagnostics,Niel,Belgium). Afterward, samples were purified with the Agentcourt AMPure XP beads (Cat. no. A63880, Beckman Coulter Genomics, Danvers, MA, USA) and evaluated for their quality with the Agilent DNA 1000 kit (Cat. no. 5067-1504, Agilent Technologies, Santa Clara, CA, USA) on the 2100 Bionalyzer (Cat. no. G2939BA, Agilent Technologies). Finally, 454 NGS was performed using GS Junior Titanium kits in a GS Junior Sequencer, following the manufacturer's instructions (Cat. no. 05996562001, 05996589001, 05996597001, and 05996619001, Roche Diagnostics, Basel, Switzerland).

Data Analysis
The sequencing data were first analyzed with the FastQC tool to determine the quality of the sequences 2 . Then, to identify specific HPV genotype sequences present in each cervical sample, the Roche GS Reference Mapper v3.0 software was used, taking as references all the L1 sequences reported in the Papillomavirus Episteme (PaVE) database (Van Doorslaer et al., 2012). The parameters considered for mapping were the following: trimming of adapters and primers, 95% of minimum overlap identity, Phred score ≥20, minimum read length of 150 bp and exclusion of all the repeated reads.
Additionally, to compare the results obtained with the Roche GS Reference Mapper v3.0 software, the following pipeline was performed: (1) the AlienTrimer v. 0.4.0 software was used to remove all adapter sequences and specific primers (Criscuolo and Brisse, 2013), (2) Printseq v. 0.20.4 software was utilized to filter all sequences with a Phred Score >20, remove sequences <150 nucleotides, and trim 12 additional nucleotides from the 5 ′ end (Schmieder and Edwards, 2011), (3) to mapping all reads both Bowtie 2 (Langmead and Salzberg, 2012) and Megablast (Zhang et al., 2000;Morgulis et al., 2008) were used.
All HPV raw sequence reads were deposited in the NCBI BioProject repository with the following accession numbers: PRJNA506700 (for CC samples), PRJNA506685 (for CIN1 samples with multiple HPV infections), and PRJNA506459 (for PGMY-HPV-negative CIN1 samples). HPVs that appeared in a sample with only 1 read and <150 bp or sequences that aligned to an HPV genotype with <95% of identity (checked with Nucleotide BLAST) were excluded.

HPV Genotype Prevalence and Attribution
The prevalence of the different HPV genotypes that are present in the CC samples under study is calculated by dividing the total number of type-specific positive samples by the total number of samples (in this work, 48 samples that are HPV positive). However, when multiple HPV infections are present, as it is the case in this study, the prevalence could probably over-or under-estimate the attribution of individual HPV genotypes. The contribution of each HPV genotype in multiple infections to lesion development is being investigated, but it is likely that some HPVs play a more dominant role than others (Insinga et al., 2008). Therefore, the relative attribution of each HPV was estimated in the 48 CC samples.
To calculate the relative attribution based on the presence of one or more HPVs, a value of 1 was assigned to samples with single infection, while in coinfections, a proportional fraction value (based on the number of HPVs detected in the same sample) was assigned. For example: assuming that in one sample there are 4 HPVs, a value of 0.25 was assigned to each HPV genotype. The global relative attribution of each HPV was obtained by adding all these values assigned to that specific HPV in all samples and dividing it by the total number of samples that amplified a sequence with PGMY (n = 41) or FAP (n = 39).

HPV Tree Construction
Two different analysis were carried out using the MEGA X software (Kumar et al., 2018), one including only the 41 HPV consensus sequences amplified with FAP primers (≈350 nucleotides) and the other containing the 19 HPV consensus sequences amplified with PGMY (≈450 nucleotides). The analyses were performed separately because the set of primers binds in different L1 positions. The sequences were aligned using the MUSCLE algorithm (Edgar, 2004), then the Maximum likelihood Statistical method and partial deletion was chosen to construct the phylogenetic tree. The radial form was selected.
To summarize the data, our findings are schematized in Figure 1, in which the global number of reads from each HPV genotype are depicted, including the number of patients in which each genotype was found. The HPV genotypes with the greatest number of total reads were: 16 (with 10746 reads), 38b (5172), 80 (4871), 52 (33047), and FA116 (3036).

Unclassified HPV Genotypes
To predict to which genus belong each one of the seven unclassified HPV genotypes detected in the CC samples, a phylogenetic analysis was performed using all the consensus sequences obtained by NGS (as described in the materials and methods section). As visualized in Figure 4, three of the seven genotypes were found to be very closely related to β2 and β1 species (HPV-FA116, HPV-mSK_220, and HPV-FA39), while HPV-mSK_136, HPV-mSK_221/EV03c40, HPV-mSK_220, and HPV-mSK_036/ w11C24 were found to belong to the γ-genera.

DISCUSSION
Accurate HPV genotyping is necessary for an effective screening of the cervical injuries and for epidemiological studies. Although there is already a wide overview of the circulating HPVs worldwide, detection methods have been focused only on α-PVs. In order to understand which HPVs infect the cervical mucosa of Mexican women that have developed CC, HPV genotyping was performed in 48 CC samples by using different primer pairs (PGMY09/11 and FAP6085/64) that amplified not only α-but also β and γ -PVs, followed by Next-Generation Sequencing. Additionally, CIN1 samples were also included to contrast the HPV genotypes that are circulating in both pathological groups.
This study contributes to the crescent evidence that βand γ -PVs are also present in CC samples. Recently, in a study performed in cervical samples among asymptomatic    Frontiers in Cellular and Infection Microbiology | www.frontiersin.org FIGURE 2 | HPV prevalence by using NGS determined in CC samples. Percentage of total HPV type-specific prevalence (blue), prevalence observed by using PGMY primers (pink), or FAP primers (purple). women from Brazil (Ludwig/McGill Cohort Study), Sichero et al. determined β-PVs by a type-specific, multiplex genotyping PCR assay (amplifying E7 gene), followed by HPV genotyping using a bead-based Luminex technology; and reported that β-PV infections were more frequent than those observed with α-PVs, however, authors assume that detection of β-PVs in the FIGURE 4 | Phylogenetic tree built with the 42 HPV consensus sequences obtained only by FAP primers amplification in CC samples. Analysis was performed by MEGA X software as described in materials and methods; the genera and species from each group are included. The symbol (?) shows all those sequences still unclassified.
cervix may be unrelated to an active infectious process (Sichero et al., 2017). Whether β-, or γ -PVs could have a relevant role as cofactors in the carcinogenic progression or a first infection with high-risk α-PVs is needed, are questions that need to be addressed, as further discussed. In our study, in CC samples, a high percentage of β-, γ -, and unclassified PVs genera were found (36.9, 10.4, and 14.6%, respectively). Additionally, using NGS, a great proportion of coinfections were observed (68.8%). In comparison, when the same 48 samples were genotyped with the Linear Array HPV Genotyping Test, only 35.4% were found in coinfections ( Table 1). This data present a similar trend to what was previously found by our research group, in which, by using the Linear Array HPV Genotyping test, only 26% of the CC samples showed coinfections (Aguilar-Lemarroy et al., 2015). Some other studies in which NGS has not been used, have also reported a low rate of coinfections in CC samples, as described by Senapaty et al., who found 22.67% of coinfections in 172 CC cases (Senapati et al., 2017), or as described by Pimenoff et al., where they analyzed 8780 HPV DNA-positive invasive CC cases worldwide and found that 6.7% contained multiple HPVs (Pimenoff et al., 2019). These contrasting differences show that most of the single infections detected with commercially tests (such as the Linear Array HPV Genotyping Test) are indeed multiple infections that should be evaluated with a more precise HPV detection method.
Regarding the prevalence in CC samples, HPV 16 was the most common detected genotype (26 out of 48) found in 7 cases FIGURE 5 | Frequency of the HPV genotypes found with NGS in CIN1 samples. The samples are divided into two groups: the "Multiple-infection group" and the "PGMY-HPV-neg group." In the first group, NGS was performed from amplicons using the two sets of primers, PGMY and FAP; in the 2nd group they were only amplified with FAP primers.
(27%) as a single infection, and in 19 cases (73%) as a multiple infection. HPV16 had the higher prevalence and attribution, as reported worldwide by many authors and particularly in a study comprising 8977 HPV positive invasive CC samples from 38 countries in Europe, North America, central South America, Africa, Asia, and Oceania (de Sanjose et al., 2010). We found this genotype mainly together with HPV 38b (n = 6), HPV 62 (n = 3), and HPVs 18/52/122 (n = 2). In addition, we also determined that HPV 38b and 80 are common in the Mexican population with CC, interestingly, both genotypes belong to β2 species. Until now, β-PVs are considered mainly to have a cutaneous tropism even if they have been detected in other anatomical sites apart from the skin, and they are suspected to promote nonmelanoma skin cancer development together with ultraviolet radiation (Tommasino, 2017).
Particularly, HPV 38 E6 and E7 were the first oncoproteins from β-species that were shown to display transforming properties (Caldeira et al., 2003). Although many HPV variants have been described, HPV subtypes are not commonly reported. A subtype of HPV 38 with 96% L1 sequence similarity to reference HPV 38 was found in 2006 and called HPV 38b (subtype FA125) (Hazard et al., 2006). Among the 48 CC samples analyzed in this study, 7 were positive to HPV 38b subtype (8.33% attribution), showing that this is the subtype that is present in Mexican samples and that its prevalence is almost as high as the one of HPV 18.
Interestingly, Kazemian et al. sought to identify new viruscancer associations by searching RNA sequencing data sets from more than 2,000 patients, encompassing 21 cancers from The Cancer Genome Atlas (TCGA), for the presence of viral sequences. Unexpectedly, they found sequences matching HPV 38 in 32 out of 168 (19%) RNA-Seq samples of uterine corpus endometrioid carcinoma (Kazemian et al., 2015), from those, authors detected that some sequences corresponded to HPV 38b. Despite authors mention that these data could be derived from a possible cross-contamination of the samples from The Cancer Genome Atlas Database, the fact that these samples were really infected with HPV 38 cannot be ruled out. In this way, HPV 38 could not only be infecting skin and cervix but could also be present in some other pathologies such as endometrial cancer. In addition, Hampras et al. analyzed 87 men samples from the HIM project and determined that HPV 38 was the most prevalent β-PV found in genital skin (32.2%) (Hampras et al., 2017).
Regarding HPV 80, we found that it ranks the 4th place in attribution (7.26%- Figure 3), even though there are very few studies that have reported the presence of this genotype. HPV 80 was isolated, full sequenced and characterized from histologically normal skin in 1998 by Delius et al. (Delius et al., 1998). Recently, its presence was detected in a study that aimed to know the nasal virome diversity from indigenous children at South American Villages (Altan et al., 2019). HPV 80 has also been observed in 10% of actinic keratosis samples (Nindl et al., 2007) and in a HIV-positive man from South Africa (Meiring et al., 2017).
On the other hand, as depicted in Figure 2, it is worth mentioning that only 11 out of the 51 genotypes detected (HPV 16,18,26,33,52,53,54,59,62,70, and 81-all of them belonging to α-PVs) were amplified with both sets of primers (PGMY09/11 and FAP). However, concerning the HPVs 18, 52, and 62, FAP primers were more efficient, suggesting that there are some variants that can only be detected by FAP primers. In addition, 31 genotypes can be exclusively detected by using FAP primers (including 7 unclassified HPVs), while only 9 genotypes exclusively amplified with PGMY primers. It is noteworthy that some HPVs that are supposed to amplify with PGMY, amplified only with FAP, as in the case of HPVs 6, 11, and 42, meaning that the prevalence of these genotypes could be underestimated. Therefore, as many genotypes are only amplified and sequenced with PGMY or FAP primers, the use of both sets of primers is a good way to detect a broader range of HPV genotypes present in CC samples. Also, some sequences that amplified by NGS were not included in this study because they showed an identity inferior to 80% to any HPV reference genotype or unclassified HPV reported sequences, suggesting that they could be new HPV genotypes not described to date.
By comparing four HPV genotyping methods, Nilyanimit et al. conclude that although NGS is expensive and complex, it detects multiple HPVs with high sensitivity and propose this method as an alternative for diagnostic HPV genotyping in certain situations (Nilyanimit et al., 2018). Derived from our studies, we also propose to use NGS (using at least PGMY and FAP primers) to determine the presence of a greater number of genotypes, including β-and γ -PVs in precursor cervical lesions and in CC, in order to determine the most prevalent genotypes in each geographical region.
By comparing the genotypes detected in CC with those determined in CIN1 samples, very promising observations were found. Firstly, it should be noted that in the CIN1 samples that were HPV negative with Linear Array (PGMY-HPV-neg group), HPV 58 was the most common genotype identified, being present in 63% of the samples analyzed. In contrast, this genotype was not found in any CC sample. The fact that PGMY amplification-based tests could not detect this genotype may be due to the variability reported for this genotype in Mexico and in some regions of the world (Raiol et al., 2009;Yue et al., 2013;Chen et al., 2016;Conde-Ferraez et al., 2017). HPVs 30, 214, and 11 were also common in this group. In a study carried out in 109 cervical specimens from South African HIV-positive women, HPV 30 was present in 14.6% of the samples (Meiring et al., 2012); on the other hand, HPV 214 (a γ -PV) has been recently reported to be present in penile swabs in South Africa (Murahwa et al., 2019).
Furthermore, another very interesting observation derived from our analysis is that some genotypes were exclusively found in the CC samples, such as: 5, 26, 38b, 39, 68, 107, 122, 135, FA39, FA116, mSK_120 and mSK_136 and mSK_221, among others. It should be noted that several of them belong to β2 and β1 species (as seen in Figure 4). β2 species have also been prevalent in men who have sex with men  and in head and neck squamous cell carcinoma (Agalliu et al., 2016;Sabol et al., 2016), The limited knowledge about the epidemiology of genital β-and γ -PVs infection makes it difficult to resolve their possible role as carcinogenetic co-factors increasing the effect of α-PVs at these anatomical sites.
The limitation of this study is that the number of cases should be increased, including CIN2 and CIN3 samples as well as the follow-up of patients with CIN1. Currently, we are recruiting more patients with cervical cancer and cervical precursor lesions to broaden our search for HPVs belonging to genera other than α-papillomaviruses.

CONCLUSION
This study highlights the presence of a greater number of HPV genotypes than reported until now in cervical cancer, that belong not only to alpha-PV genus, but also to the beta and gamma genera, including HPV genotypes that have not been classified yet. In Mexico, many of them have even greater prevalence and attribution than many other HPVs reported as high risk, such as 38b, 80, 107, 5, mSK_120, mSK_136, FA39, 122, and FA116, among others.
It will be essential to develop more accurate and specific diagnostic methods that allow the detection of a greater number of HPV genotypes adapted to the different geographical regions. Additionally, it will be crucial to determine if those most prevalent genotypes have oncogenic properties and to unravel the contribution of different genera HPV coinfections to cancer.

DATA AVAILABILITY STATEMENT
The datasets generated for this study can be found in the NCBI BioProject repository with the following accession numbers: PRJNA506700 (for CC samples), PRJNA506685 (for CIN1 samples with multiple HPV infections), and PRJNA506459 (for PGMY-HPV-negative CIN1 samples).

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Comisión Nacional de Investigación Científica, Instituto Mexicano del Seguro Social, Project No. F-CNIC-2014-35. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
JC-R, VV-R, YL-H, and PP-S were involved in patient interviews and cervical sample recruitment. VV-R, PP-S, and AM-P carried out the Linear Array HPV genotyping tests. MF-M and DO performed PCR assays and the NGS experiments. MF-M, IB, and AW did the data analyses. MM-S performed the histopathological diagnosis of all patients. CA-I wrote the first draft of the manuscript. AA-L and LJ-S designed the study, supervised all experiments and analyses, and completed the manuscript. All authors read, suggested modifications, and approved the final manuscript.