ORIGINAL RESEARCH article

Front. Immunol., 13 December 2022

Sec. Cancer Immunity and Immunotherapy

Volume 13 - 2022 | https://doi.org/10.3389/fimmu.2022.1032716

Proteogenomic identification of Hepatitis B virus (HBV) genotype-specific HLA-I restricted peptides from HBV-positive patient liver tissues

  • MS

    Mayank Srivastava *

  • RC

    Richard Copin

  • AC

    Augustine Choy

  • AZ

    Anbo Zhou

  • OO

    Olav Olsen

  • SW

    Sarah Wolf

  • DS

    Darshit Shah

  • AR

    Anna Rye-Weller

  • LC

    Lisa Chen

  • NC

    Newton Chan

  • AC

    Angel Coppola

  • KL

    Kathryn Lanza

  • NN

    Nicole Negron

  • MN

    Min Ni

  • GS

    Gurinder S. Atwal

  • CA

    Christos A. Kyratsous

  • WO

    William Olson

  • RS

    Robert Salzler *

  • Regeneron Pharmaceuticals Inc., Tarrytown, NY, United States

Abstract

The presentation of virus-derived peptides by HLA class I molecules on the surface of an infected cell and the recognition of these HLA-peptide complexes by, and subsequent activation of, CD8+ cytotoxic T cells provides an important mechanism for immune protection against viruses. Recent advances in proteogenomics have allowed researchers to discover a growing number of unique HLA-restricted viral peptides, resulting in a rapidly expanding repertoire of targets for immunotherapeutics (i.e. bispecific antibodies, engineered T-cell receptors (TCRs), chimeric antigen receptor T-cells (CAR-Ts)) to infected tissues. However, genomic variability between viral strains, such as Hepatitis-B virus (HBV), in combination with differences in patient HLA alleles, make it difficult to develop therapeutics against these targets. To address this challenge, we developed a novel proteogenomics approach for generating patient-specific databases that enable the identification of viral peptides based on the viral transcriptomes sequenced from individual patient liver samples. We also utilized DNA sequencing of patient samples to identify HLA genotypes and assist in target selection. Liver samples from 48 HBV infected patients, primarily from Asia, were examined to reconstruct patient-specific HBV genomes, identify regions within the human chromosomes targeted by HBV integrations and obtain a comprehensive view of HBV peptide epitopes using our HLA class-I (HLA-I) immunopeptidomics discovery platform. Two previously reported HLA associated HBV-derived peptides, HLA-A02 binder FLLTRILTI (S194-202) from the large surface antigen and HLA-A11 binder STLPETTVVRR (C141-151) from the capsid protein were validated by our discovery platform, but both were detected at very low frequencies. In addition, we identified and validated, using heavy peptide analogues, novel strain-specific HBV-HLA associated peptides, such as GSLPQEHIVQK (P606-616) and variants. Overall, our novel approach can guide the development of bispecific antibody, TCR-T, or CAR-T based therapeutics for the treatment of HBV-related HCC and inform vaccine development.

Introduction

Chronic HBV infection affects ~250 million people worldwide and can result in serious complications including acute and chronic hepatitis, cirrhosis, and hepatocellular carcinoma (HCC) (). Vaccination against HBV is effective at preventing infection, but there is currently no cure for patients chronically infected with the virus. Prolonged exposure of the liver to high viral loads results in persistent inflammation, which can stimulate fibrogenic and cirrhotic processes that can eventually lead to HCC. Additionally, the integration of HBV DNA fragments into the host genome can introduce genetic damage and chromosomal instability that can further contribute to the development of HCC (). Globally, it is estimated that chronic HBV infections account for ~50% of primary liver cancers ().

The persistence of HBV infection, or its control, is determined by the host immune response, which, following HBV evasion of the innate immune system, is primarily mediated by the adaptive immune system. Cytotoxic CD8+ T cells protect a host against intracellular pathogens and tumors (). As such, cytotoxic CD8+ T cells play a pivotal role in defending against HBV infection, and are primarily responsible for clearing virus in acute infections (), as well as preventing development of HCC. In cases of chronic HBV infection, HBV-specific CD8+ T cells are frequently undetectable, which is thought to result from the chronic exposure of the liver to high antigenic loads and tolerogenic hepatocytes ().

Because of the critical role of CD8+ T cells in defending against viral infections, the cytotoxic T cell activity against infected cells can be utilized as an effective therapeutic approach. Cytotoxic T-cells mediate adaptive immunity by recognizing HLA-peptide complexes presented on the surface of infected cells via interactions with the T cell receptors (TCRs) expressed on their surface. These HLA-peptide complexes are utilized as targets for developing different immunotherapeutics including bispecific antibodies and CAR-Ts, but a lack of broadly expressed epitopes has made it difficult to implement these therapeutic strategies. Early attempts at identifying HBV-specific T cells typically utilize various T cell stimulation/proliferation assays (, ). However, the presence of these T cells does not necessarily mean that the peptide/HLA that they recognize is present HBV/HCC tumor cells. The most direct way to identify targetable HBV-HLA epitopes in HBV-HCC is to perform immunopeptidomics on these cells.

The emerging field of proteogenomics has been successfully applied to identify unique HLA-associated mutant peptides in cancer, characterizing HLA-presented epitopes that are expressed specifically by various tumors types (). Previous work on identifying HBV-HLA epitopes in HCC has been performed, identifying several HBV peptides (, ). The greatest challenge to identifying these epitopes from HBV-HCC tissue is that HBV covalently closed circular DNA (cccDNA), the template for viral replication, is frequently absent from tumor cells. Instead, tumor cells contain fragments of HBV DNA that have been inserted into the host genome () that can be transcribed and, ultimately, result in the expression of viral antigens on the cell surface. This poses a challenge for identifying HBV epitopes as the exact regions of the HBV genome that are integrated can differ from patient to patient (). Further complicating this effort is the existence of thousands of HBV strains distributed across 10 different HBV genotypes (genotype A to J) (, ), which differ from each other by at least 8% at the nucleotide level. These challenges emphasize the need for a new proteogenomic approach for comprehensive identification of HBV epitopes from patients. Here, we developed a novel proteogenomic approach to detect HLA-associated HBV peptides from HBV infected cells. We focused our efforts on characterizing the repertoire of HLA-I-HBV peptides that are expressed in the livers of infected patients, as this approach would likely provide the most accurate representation of HBV epitopes in the population. To achieve this, we performed bulk RNA sequencing of HBV-infected liver samples from Asian populations () to generate patient-specific HBV genome and construct associated-transciptome databases that were used to inform HLA-I-HBV peptide discovery from the same liver tissue. HLA-I associated peptides were purified from homogenized liver samples using immunoenrichment of HLA-I complexes and identified by subsequent mass spectrometric analysis (). Our analysis identified previously reported HBV antigens derived from the large surface antigen and polymerase proteins as well as a novel HLA-A11 associated polymerase peptide. The polymerase peptide and its associated sequence variants collectively were found in a majority of our HLA-A11+ patients. We demonstrate that our proteogenomic discovery platform provides a new avenue for identifying patient specific HBV genome and associated-peptides as targets for anti-HBV related-HCC immunotherapy.

Experimental section

HBV-positive HCC tissue procurement

All patient tissues were purchased from Proteogenex or BioIVT in fresh frozen form. The patient details including age, sex and diagnostic tests are provided in Supplementary Table S1. Tissues were cryopreserved until sample preparation.

RNA sequencing

Total RNA was extracted from human liver tissue using MagMAX kit (ThermoFisher). Strand-specific RNA-seq libraries were prepared from 1 µg RNA using KAPA stranded mRNA-Seq Kit (KAPA Biosystems). Twelve-cycle PCR was performed to amplify libraries. The amplified libraries were size-selected at 400~600 bp using PippinHT (Sage Science). Sequencing was performed on Illumina HiSeq®2500 (Illumina) by multiplexed paired-read run with 2X100 cycles.

Mapping of patient liver RNA sequences to HBV reference genome

Bulk RNA-seq reads were aligned to reference genome ayw (NC_003977.2) using minimap2 (v2.17) (). Alignments were then sorted by coordinates and quality controlled by samtools flagstats (v1.9) () and bedtools genomeCoverageBed (v2.17.0) () inspection. Duplicates were then marked and removed by the Picard toolkit (v2.18.2) (https://github.com/broadinstitute/picard).

Workflow to reconstruct HBV genomes from patient liver RNAseq data

Paired-end Illumina RNA reads from each sample were de novo assembled into large contigs using megahit () (options: –min-count 3 –k-min 27 –k-max 127 –prune-level 2) and mapped using BLAST to HBV reference genome, ayw (NC_003977.2) to select HBV specific sequences. BLAST parameters for sequence comparisons included outfmt ‘7 std sgi stitle’; minimum E value = 0.001; cost to open a gap = 5; cost to extend a gap = 2; length of best perfect match = 11; reward for a nucleotide match = 2; reward for a nucleotide mis-match = -3. Contigs without BLAST matches were discarded as well as BLAST results with E values greater than 0.001, percentage identity below 79% or alignment length of less than 50 nucleotides. Overlapping contigs were merged using custom scripts and final sequence that covered the entire or partial length of the reference sequence with the highest identity was selected. The final sequence was mapped against the reference genome to extract and translate all coding sequences. Non-redundant protein sequences were then added to customized databases for peptide identification by mass-spectrometry.

Workflow to identify integration sites from patient liver RNAseq data

Contigs matching the HBV reference genome (ayw) were extracted and mapped against human reference genome (GRCh38) using BLAST with parameters described above. Contigs matching both HBV and human (hybrid contigs) were collected for annotation. The HBV and human integration sites of the hybrid contigs were annotated against the genome features in HBV ayw Genbank file and human GRCh38 GTF file.

HLA genotyping for patient samples

Sample Preparation and Sequencing: DNA sample quantity was determined by fluorescence (Life Technologies) and quality was assessed by running 25 ng of sample on a 1% pre-cast agarose gel (Life Technologies). Samples with high molecular weight gDNA with a majority of the DNA fragments greater than 20 kb and a concentration of no less than 10 ng/ul passed the quality assessment. The DNA samples were normalized to 10 ng/ul and 50 ng was used to amplify the full-length HLA amplicons. The targets were amplified in three pools of variant tolerant primers optimized for similar binding temperatures and PCR conditions with LA Taq DNA Polymerase (Takara Bio). The resulting amplicon pools were combined equimolarly as determined by automated capillary electrophoresis (Agilent) and fluorescence (Life Technologies). DNA libraries were prepared for Illumina-based sequencing with a custom NEB kit (New England Biolabs). The amplicons were enzymatically fragmented to a mean insert size of 250 bp and universal adapters were ligated onto the DNA fragments. Unique 10 base pair barcode sequences were added to the DNA fragments during PCR with NEBNext Ultra II Q5 Master Mix (New England Biolabs, M0544L) to facilitate highly multiplexed sequencing. The samples were pooled and sequenced using 150 base pair paired-end sequencing on an Illumina Nextseq 500.

Data Analysis: Upon completion of sequencing, raw data from each Illumina Nextseq run was gathered in local buffer storage and uploaded to a local high-performance computing platform for automated analysis. The FASTQ-formatted reads were converted from the BCL files and assigned to samples identified by specific barcodes using the bcl2fastq conversion software (Illumina Inc., San Diego, CA). All the reads in sample-specific FASTQ files were subjected to HLA typing analysis using an updated version of PHLAT program () with the reference sequences consisting of GRCh38 genomic sequences and HLA type reference sequences in the IPD-IMGT/HLA database v3.30.0 ().

Tissue lysis, and HLA affinity enrichment

The tissues were pulverized using SPEX SamplePrep Freezer/Mill Dual-Chamber Cryogenix Grinder (Cole-Parmer) in liquid nitrogen, and lysed in ice-cold lysis buffer (1% NP-40, 150 mM NaCl, 50 mM Tris-HCl pH 8.0 and 10 mM EDTA pH 8.0) supplemented with HALT protease and phosphatase inhibitors (Thermo Scientific, cat. 78442) on ice using sonication.

Anti-HLA Class I (W6/32) was conjugated to NHS-sepharose beads by overnight incubation in coupling buffer (0.2 M NaHCO3 + 0.5 M NaCl pH 8.3) in 4°C. Reaction was quenched with 0.1 M Tris-HCl pH 8.5, and beads were washed with the Tris-HCl solution and 0.1 M acetate buffer.

The pre-cleared tissue lysate was passed through a column packed with 1 ml of HLA Class-I beads bed under gravity. The column was subsequently washed with Seppro Dilution Buffer (Sigma, cat. S4199) and 20 mM Tris-HCl pH 8, and HLA-peptide complexes were eluted with 0.1 M glycine pH 2.7.

Sample preparation for mass spectrometry

The glycine elute was loaded onto the C18 Sep-Pak (Waters, cat. WAT054960), followed by selective elution of peptides by 30% ACN/0.1% TFA. The peptides were further cleaned up by ziptip (C18 resin), and analyzed by nano-LC-MS/MS.

LC-MS/MS

HLA peptides as above were loaded onto a nanoViper Acclaim PepMap100 C18 trap column (75 μm i.d. × 2 cm, 3 μm, 100 Å, Thermo) and were separated using a nanoViper Acclaim PepMap RSLC C18 column (75 μm i.d. × 25 cm, 2 μm, 100 Å, Thermo) heated to 40°C and retrofitted with a New Objective SilicaTip (7 cm) with a distal conductive coating at the inlet end of the emitter. The gradient was delivered by an EASY-nLC 1200 HPLC system (Thermo) at 300 nL/min. The following 120-minute elution gradient with mobile phase A (Water/0.1% formic acid) and B (80% Acetonitrile/0.1% formic acid) was used: 3% B at 3 min, linear to 35% B at 100 min, and linear to 45% B at 123 min. The peptides eluted from the column were ionized via Flex ion source at 1.9 kV and analyzed by the Thermo Fusion Lumos Tribrid mass spectrometer (Thermo) using Xcalibur 4.1.31.9 (Thermo). Data acquisition was performed in data-dependent mode, where survey scans were carried out in the high field Orbitrap analyzer (range of m/z 300-1500 at a resolution of 60,000) with the automatic gain control target of 4.0E5 and maximal ion fill time of 100 ms. The MS/MS analyses were performed by 1.2 m/z precursor ion isolation with the quadrupole, applying normalized HCD (higher-energy collisional dissociation) collision energy of 32%, and analysis of fragment ions in the Orbitrap at a resolution of 15,000. Dynamic exclusion window was set to 6 seconds, monoisotopic precursor selection (MIPS) to peptide, maximum injection time to 100 ms, and charge states unknown. +1-+4 charge states were included and the advanced peak determination was toggled on. For FAIMS-enabled experiments, the settings were identical except the FAIMS device was placed between the nanoelectrospray source and the mass spectrometer. FAIMS separations were performed with the following settings: inner and outer electrode temperature set to 100°C (except where noted), FAIMS carrier gas flow of 5.0 L/min, asymmetric waveform with DV −5000 V, entrance plate voltage 250 V, and CV settling time of 25 ms. The FAIMS carrier gas is N2, and the ion separation gap is 1.5 mm. The noted CVs were applied to the FAIMS electrodes. For external stepping or single CV experiments, the selected CV was applied to all scans throughout the analysis. For internal CV stepping experiments, each of the selected CVs was applied to sequential survey scans and MS/MS cycles (1 s); the MS/MS CV was always paired with the appropriate CV from the corresponding survey scan.

Identification of HBV peptides from HLA-I immunopeptidomics of patient liver samples

All mass spectrometry raw files were searched against a consolidated database of human UniProtKB (homo sapiens) and HBV protein sequences, obtained from RNAseq of patient liver samples as described above, with PEAKS DB search engine (). PEAKSX+ (PEAKS Studio 10.5, Bioinformatics Solutions Inc.) was used for De novo-assisted database search with precursor mass tolerance of 8 ppm, and fragment ion tolerance of 0.02 Da. Enzyme selectivity was set to none, and methionine oxidation as the only variable modification with two maximum allowed modifications per peptide. The search was performed with a 5% false-discovery rate (FDR) at peptide level, and peptides were further filtered based on -logP score of 20 (corresponding to 1% FDR).

Peptide validation using synthetic heavy peptide analogues

The HLA-eluted peptides were mixed with synthetic peptides (50 fmol) containing heavy Leucine (13C(6)15N(1), +7.0172) and analyzed by LC-MS/MS. The identity of endogenous peptides was confirmed by comparing the elution and fragementation profile with the corresponding heavy peptides. Furthermore, the abundance of endogenous peptides was estimated based on the intensity of heavy and endogenous peptides, using Avogadro’s number. For this calculation, we assumed a 50% sample loss during the sample preparation, and 100 million cells as the sample amount.

Results

Generation of RNAseq databases from HBV infected patient samples

Comparative genomic analyses are often guided by a limited number of ‘reference’ genomes. As of August 2022, the National Center for Biotechnology and Information (NCBI) database contains a repository of more than 14,300 HBV genome sequences classified into at least ten distinct phylogenetic lineages (A to J https://www.ncbi.nlm.nih.gov/nuccore). While each lineage is represented by a single HBV genome, these reference sequences do not capture the full breadth of genomic diversity present in nature. This lack of coverage can negatively impact identification of new viral peptides as the sequence of patient isolates can greatly diverge from the reference genome sequence used for the comparative analysis.

To circumvent this problem, we employed a novel RNAseq-based approach to generate a database of patient-specific HBV genome sequences directly reconstituted from the transcribed virus RNA sequences amplified from patient liver samples (Figure 1A). Total human RNA reads are first converted into large contigs using a de-novo assembly approach, and contigs with loose homology to HBV reference genomes are marked as HBV-specific sequences. HBV contigs are then grouped into two categories based on the percentage of the contig sequence covering the HBV reference. Fully-mapped contigs to HBV reference are re-arranged in a reference free fashion to assemble patient-specific HBV genomic sequences. Contigs with hybrid matches to the HBV reference are mapped against the human genome and the hybrid contigs matching both HBV and human sequences are selected. The patient-specific viral sequence is then used to extract and translate all HBV coding sequences (Figure 1B). Using this approach, we identified a total of 80 unique coding sequences (large S= 14, middle S= 14, small S= 16, pre-capsid= 6, capsid= 12, X= 16 and polymerase= 2) retrieved from virus isolates that are representative of four phylogenetic lineages (A, B, C, D) (Figure 1C). In accordance with the reported prevalence of HBV genotypes in Asian populations, HBV genotypes B and C were most commonly identified, having been found in 24 of the 48 samples (12 each). HBV genotypes A and D were detected in only one patient sample each (Table 1). RNA sequencing data for 18 of the samples had low HBV coverage, suggesting either an absence of HBV virus in these samples, possibly due to viral latency or clearance, or low HBV expression.

Figure 1

Table 1

Sample IDDiagnosis (from vendor)HBV status from vendorHBV Peptides by MS (5% FDR, 9-12 residues)Number of HBV peptides detected by MS (5% FDR, 9-12 residues)HBV Genotype from RNAseqVirus reads
Liv1_09945T1(3)liver cancerHBVYes4D11865
Liv2_0000032532CirrhosisHBVNo0low coverage13
Liv3_0000040312CirrhosisHBVYes3A5205
Liv4_0000031295CirrhosisHBVNo0low coverage68
Liv5_0000049186Chronic inflammationHBVYes12poor QC0
Liv6_0000049219HepatitisHBVYes1poor QC0
Liv7_ILS37316FT2HCCHBVNo0low coverage26
Liv8_ILS30987D2HCCHBVYes6C14633
Liv9_ILS24999D2HCCHBVYes5B9423
Liv10_ILS39940FT1HCCHBVYes4poor QC0
Liv11_ILS31761D1HCCHBVYes3C9935
Liv12_ILS22791D2HCCHBVYes7C26154
Liv13_ILS23304D2HCCHBVNo0low coverage31
Liv14_ILS20947D3HCCHBVYes2C2070
Liv15_ILS21034D4HCCHBVYes4B5369
Liv16_ILS42208FT1HCCHBVYes4poor QC0
Liv17_ILS22914D2HCCHBVYes1B12853
Liv18_ILS20954D2HCCHBVNo0low coverage23
Liv19_ILS21955D2HCCHBVYes1C3909
Liv20_ILS10922D04HCCHBVYes11B26200
Liv21_499956PFHCCHBVYes2B9010
Liv22_524614VFHCCHBVNo0low coverage30
Liv23_ILS35989FT2HCCHBVNo0low coverage4
Liv24_ILS42195FT1HCCHBVNo0low coverage6
Liv25_ILS45824FT1HCCHBVYes4B43486
Liv26_ILS36725FT2HCCHBVNo0low coverage6
Liv27_ILS37352FT1HCCHBVYes2B15594
Liv28_ILS50526FA1Hepatitis, chronic (Diseased)HBVYes4C22294
Liv29_ILS50526FT2HCCHBVNo0low coverage30
Liv30_ILS50523FA1Cirrhosis (Diseased)HBVYes6B18692
Liv31_ILS50523FT2HCCHBVYes5B14716
Liv32_1176935FHCCHBVNo0low coverage1
Liv33_1176559FHCCHBVNo0low coverage18
Liv34_1182970FHCCHBVNo0low coverage88
Liv35_1181680FHCCHBVYes4B9563
Liv36_1182812FHCCHBVNo0low coverage990
Liv37_1189452FHCCHBVYes2B1468
Liv38_1193117FHCCHBVYes9B57394
Liv39_1192879FHCCHBVNo0low coverage25
Liv40_405295F2HCCHBVYes1C52627
Liv41_ILS32961D2HCCHBVNo0low coverage1
Liv42_ILS33989D2HCC and CholangiocarcinomaHBVNo0low coverage17
Liv43_ILS22892S1HCCHBVYes2C9174
Liv44_ILS24966D3-DS2HCC and CholangiocarcinomaHBVYes3C2980
Liv45_ILS30194D2HCCHBVYes3C4486
Liv46_ILS32365D1HCCHBVNo0low coverage39
Liv47_ILS32963D1HCCHBVYes1C675
Liv48_ILS36088FT2HCCHBVYes1C16030

The Patient sample ID, diagnosis from vendor, and HBV serological test results for all the patient samples analyzed, along with number of HBV unique peptides detected by mass spectrometry and HBV genotype identified from RNAseq results.

Read mapping analysis showed that the depth of coverage of virus reads was high and consistent in the genomic region encoding the S and X proteins, while coverage for the rest of the genome contained significant gaps (Figure 1B). Additionally, these reads often terminate in the middle of an HBV gene. These gaps and partial reads suggest that the reads could be coming from RNA transcribed from partially integrated HBV DNA sequences instead of cccDNA. Of note, the N-terminal region of the genome-spanning polymerase gene was poorly covered, whereas the C-terminal portion of the polymerase gene was well detected. For the majority of HBV+ samples, we also identified hybrid contigs combining both HBV and human sequences. These sequences are expected to result from integration of HBV DNA into the coding sequence of a human gene. The genomic location of the HBV sequence in these junctions is referred to as “breakpoints”. We mapped the HBV breakpoints found in RNAseq reads with HBV/human junctions and found that the majority of HBV breakpoints occurred between nucleotide positions 1601-1900 in the HBV genome (Figure 1D) while no clear pattern was observed for the host genome (Supplementary Table S2). This data is consistent with a previous report by Yang et al (), suggesting that at least some of the detected HBV sequences are coming from integrated DNA and not cccDNA. The reconstitution of HBV genome and the HBV sequences from patient liver samples were utilized to discover HBV peptides presented on the surface of HCC tumors.

Characterization of the immunopeptidome of HBV-infected patient samples

HLA Class-I complexes were immunopurified from homogenized hepatocellular carcinoma (HCC) liver tissue samples from 48 HBV seropositive patients (Proteogenex and BioIVT, Table 1, Supplementary Table S1) using an anti-HLA Class-I antibody (W6/32). Western blotting of liver lysates, flow throughs, and enrichment elutions demonstrated a near complete capture of HLA-I by W6/32 linked sepharose beads (Supplementary Figure S1). Following elution of the complexes, HLA-I bound peptides were identified using an Orbitrap Fusion Lumos mass spectrometer coupled to a nanoLC system. The raw mass spectrometry (MS) data files were searched against a combined database of human UniProt and HBV sequences derived from the RNA HBV reads from each patient sample. On average, approximately 8,500 total peptides were detected from liver samples from variably aged patients (Supplementary Figure S2). The distribution of peptide lengths isolated from the majority of liver tissue samples was consistent with those expected for Class I HLA enrichment, with peptides nine residues in length (9-mers) being the most abundant, followed by peptides ten (10-mers) and eleven (11-mers) residues in length (Figure 2A). Peptides with nine to twelve residues constituted over 50% of the total peptides isolated (Supplementary Figure S2).

Figure 2

High percentage of identified peptides are predicted to bind HLA using NetMHC pan 4.0 analysis

The polymorphic nature of HLA-I has a profound effect on the epitopes presented on the tissue surface. Hence, DNA sequencing of liver tissue samples was performed to determine the HLA genotype (Table 2), which was used to predict the binding of detected peptides by NetMHC pan 4.0 (). We found that a very high percentage (50 – 100% for most of the samples) of the detected 9-mer peptides were predicted binders of the identified HLA-I in the corresponding patient samples (Figure 2B; Supplementary Table S3). The percent of nonamers that bound to the patient HLAs exceeded 100% for some samples, owing to the promiscuity in peptide binding predicted by NetMHC Pan 4.0.

Table 2

HLA-AHLA-BHLA-C
Sample IDAllele 1Allele 2Allele 1Allele 2Allele 1Allele 2
Liv1_09945T1(3)*0201*2402*3801*3901*1203*1203
Liv2_0000032532*0301*0301*1516*5301*0401*1402
Liv3_0000040312*1101*2402*2706*3802*0702*0702
Liv4_0000031295*2402*2402*1301*4001*0304*0702
Liv5_0000049186*0201*0301*0702*3801*0401*0702
Liv6_0000049219*1101*3001*1302*4001*0602*0702
Liv7_ILS37316FT2*0203*0206*1525*5502*0702*1203
Liv8_ILS30987D2*1101*1101*1502*3501*0303*0801
Liv9_ILS24999D2*0203*3303*3802*4803*0702*0801
Liv10_ILS39940FT1*1101*3001*0702*3802*0702*0702
Liv11_ILS31761D1*0203*1101*1502*4601*0102*0801
Liv12_ILS22791D2*0201*1101*3901*5101*0702*1402
Liv13_ILS23304D2*1101*2402*3505*4601*0102*0401
Liv14_ILS20947D3*1101*1101*1502*3802*0702*0801
Liv15_ILS21034D4*1101*3303*1502*5801*0302*1402
Liv16_ILS42208FT1HLA genotype couldn’t be identified
Liv17_ILS22914D2*1101*3101*1301*5102*0304*1502
Liv18_ILS20954D2*0201*7402*1502*5101*0801*1402
Liv19_ILS21955D2*0203*3101*1525*5101*0403*1602
Liv20_ILS10922D04*02264*3303*1802*5801*0302*0704
Liv21_499956PF*021*111*131*552*012*034
Liv22_524614VF*111*111*182*382*043*072
Liv23_ILS35989FT2*027*111*382*461*0112*072
Liv24_ILS42195FT1*111*291*075*351*033*155
Liv25_ILS45824FT1*242*242*401*541*012*043
Liv26_ILS36725FT2*011*333*131*571*034*062
Liv27_ILS37352FT1*111*111*382*511*072*142
Liv28_ILS50526FA1*111*333*1525*355*041*043
Liv29_ILS50526FT2*111*333*1525*355*041*043
Liv30_ILS50523FA1*11*333*152*581*032*081
Liv31_ILS50523FT2*11*333*152*581*032*081
Liv32_1176935F*026*111*152*152*072*081
Liv33_1176559F*242*261*382*541*012*072
Liv34_1182970F*111*291*152*152*043*081
Liv35_1181680F*111*333*443*564*076*081
Liv36_1182812F*111*261*443*564*076*081
Liv37_1189452F*111*242*1512*1525*033*043
Liv38_1193117F*111*243*152*152*043*081
Liv39_1192879F*111*111*152*461*012*081
Liv40_405295F2*027*0265*382*461*0112*072
Liv41_ILS32961D2*011*027*461*571*012*062
Liv42_ILS33989D2*021*242*152*1512*033*081
Liv43_ILS22892S1*027*333*461*581*012*032
Liv44_ILS24966D3-DS2*023*333*131*581*032*034
Liv45_ILS30194D2*023*242*1512*401*033*072
Liv46_ILS32365D1*027*243*151*1512*013*033
Liv47_ILS32963D1*111*311*511*512*142*152
Liv48_ILS36088FT2*023*111*1521*564*012*043

HLA class I alleles identified for all HBV-infected patient samples from DNA sequencing.

Symbol * is the separator and it generally is associated with the first two digits (out of 4 digits) in HLA allele. The first two digits define the HLA type.

We next performed a motif analysis of detected peptides to identify overrepresented residues at different peptide positions using Seq2Logo () (Supplementary Figure S3). As expected, anchor residues at position 2 and 9, which specify HLA-I binding, were significantly overrepresented in all the tissue samples, with residue identities corresponding to the HLA-I genotype present. For example, hydrophobic anchor residues (i.e. Leu and Val) characteristic of HLA-A02 binders were enriched at positions 2 and 9 of peptides isolated from HLA-A02+ patient samples. Together, the high percentage of predicted HLA binders and the overrepresentation of characteristic anchor residues in isolated peptides provide a high degree of confidence in the HLA-I enrichment and peptide identifications.

Identification of unique HLA-associated HBV peptides using HBV genotype-specific databases

We reasoned that the number of HLA-I-presented HBV peptides would depend on the degree of HBV infection, proteasomal processing of antigen, and tissue sample quality. Indeed, the number of HBV specific peptide sequences identified from infected livers varied between samples and generally showed low sequence coverage of HBV proteins (Table 1; Figure 1B). In total, we identified 54 unique HBV peptides with lengths of nine to twelve residues using a PEAKS ion cut off score of 20 (Table 3). These peptides were detected from patient samples with diverse HLA genotype (Figure 3A). The most frequent HLA allele in the patient samples analyzed was HLA-A11 followed by HLA-B15 and HLA-A02. A majority of the viral peptides detected were NetMHC pan 4.0 predicted binders of the HLA-I allele present in the corresponding patient sample (Figure 3B; Supplementary Table S4). The large surface antigen protein was the source of a few strong HLA-A02 predicted binders (NetMHC pan 4.0) that were identified: GLSPTVWLSV (S348-357), GMLPVCPLL (S276-284), FLLTRILTI (S194-202) and GMLPVCPLI (S276-284), as well as a weak binder NILSPFMPLL (S370-379) and were all found at a relatively low frequencies (Table 3). In contrast, some HLA-A11 associated peptides: STLPETTVVRR (C141-151) from capsid protein and SAISSTFSK from polymerase were found at relatively higher frequencies (25-30%) (Figure 3C). Notably, we detected novel HLA-A11 restricted HBV-genotype specific polymerase peptides GSLPQEHIVQK (P606-616) and GSLPQEHIIQK (P606-616) from genotype B, and GTLPQEHIVHK (P606-616) and GTLPQEHIVQK (P606-616) from genotype C. The former peptide variant is predicted to be a weak HLA-A11 binder, while the latter three peptides are predicted to be strong binders of HLA-A11 by NetMHC pan 4.0 (Figure 3B). More importantly, this polymerase peptide and its variants (GSLPQEHIVQK, GSLPQEHIIQK, GTLPQEHIVHK and GTLPQEHIVQK) have a combined frequency of ~74% (14 out of 19 samples) in HLA-A11 tissue samples (Figure 3C, Supplementary Table S5). Notably, no HBV peptides were detected from any of the 18 tissues from which RNA sequencing had no or low HBV coverage, further validating our proteogenomic approach and highlighting a good correlation between the two complementary methods.

Table 3

Sample IDHBV Peptides (5% FDR)IonScore (-logP)Lengthm.zHBV Protein
Liv1_09945T1(3)LTIPQSLDSW4310580.302S
TRILTIPQSL3310571.3472S
FVGLSPTVWL2010559.8152S
MENITSGFLGPL4612639.8233S
Liv2_0000032532Not Detected
Liv3_0000040312LTIPQSLDSW3310580.303S
TRILTIPQSL2210571.3493S
MENITSGFLGPL3712639.823S
Liv4_0000031295Not Detected
Liv5_0000049186VGLSPTVWL399971.5576S
FVGLSPTVW3891005.5411S
ITSGFLGPL369904.514S
LTIPQSLDSW48101159.6012S
NITSGFLGPL41101018.5593S
TRILTIPQSL4010571.3502S
FVGLSPTVWL3310559.8156S
ITSGFLGPLL2910509.3024S
LTRILTIPQSL4311627.8931S
ENITSGFLGPL3211574.3026S
WTSLNFLGGTTV4712648.3353S
MENITSGFLGPL32121278.6306S
Liv6_0000049219TRILTIPQSL2910571.3505S
Liv7_ILS37316FT2Not Detected
Liv8_ILS30987D2HLYSHPIIL409546.8131Polymerase
LPQEHIVHK349367.545Polymerase
LPETTVVRR309535.8181Polymerase
TASPISSIF319922.4841S
GTLPQEHIVHK5511629.8475Polymerase
STLPETTVVRR3211420.2415Polymerase
Liv9_ILS24999D2IASGLLGPL259840.5204S
FVGLSPTVWL2810559.8152S
NILSPFMPLL2310572.8248S
MENIASGLLGPL4412607.8264S
IASGLLGPLLVL3112583.3806S
Liv10_ILS39940FT1SAISSTFSK299464.2422S
KPRKGMGTNL3510551.3126
GSLPQEHIVQK4711618.3405Polymerase
STLPETTVVRR2711420.241C
Liv11_ILS31761D1QSPTSNHSL409970.4584S
GLSPTVWLSV3110529.7979S
GTLPQEHIVQK5011625.3479Polymerase
Liv12_ILS22791D2LPSDFFPSI2291022.5152C
GMLPVCPLL349942.5134S
FLLTRILTI259545.3545S
GMLPVCPLI349942.5134S
TRILTIPQSL3210571.3494S
GTLPQEHIVQK5411625.3477Polymerase
STLPETTVVRR3011629.8578Polymerase
Liv13_ILS23304D2Not Detected
Liv14_ILS20947D3GTLPQEHIVHK4211420.2349Polymerase
STLPETTVVRR3311420.2417Polymerase
Liv15_ILS21034D4TVSAISSTF329912.4692S
TIPQSLDSW219523.7634S
GSLPQEHIVQK3611618.3413Polymerase
KILTIPQSLDSW5912700.8957S
Liv16_ILS42208FT1TASAISSTF359884.4364S
IPIPSSWAF3891017.5414S
LPYRPTTGR429530.7983Polymerase
FSLTKILTIPQ3411630.8829S
Liv17_ILS22914D2GSLPQEHIVQK4411618.3427Polymerase
Liv18_ILS20954D2Not Detected
Liv19_ILS21955D2GVWIRTPPAYR3411439.2469C
Liv20_ILS10922D04PQSLDSWLT249523.7632S
LPYRPTTGR249354.2018Polymerase
LTIPQSLDSW4510580.3043S
NILSPFMPLL4010572.8265S
FVGLSPTVWL3210559.8168S
FSLTKILTIPQ3911630.8831S
LTIPQSLDSWW2311673.3439S
KILTIPQSLDSW4712700.8918S
MENIASGLLGPL4312607.8276S
IASGLLGPLLVL4212583.3824S
FSLTKILTIPQS3312674.3987S
Liv21_499956PFGSLPQEHIVQK1911412.5624Polymerase
SAISSTFSK359464.242S
Liv22_524614VFNot Detected
Liv23_ILS35989FT2Not Detected
Liv24_ILS42195FT1Not Detected
Liv25_ILS45824FT1KYTSFPWLL379577.8159Polymerase
YPALMPLYA279519.7706Polymerase
LYAAVTNFL239506.2791Polymerase
MENIASGLLGPL3412607.8266S
Liv26_ILS36725FT2Not Detected
Liv27_ILS37352FT1GSLPQEHIVQK4111412.5654Polymerase
SAISSTFSK359464.2455S
Liv28_ILS50526FA1STLPETTVVRR3311420.2422Polymerase
LPFRPTTGR279348.8699Polymerase
ASRELVVSY239512.2773C
GTLPQEHIVHK3711420.2355Polymerase
Liv29_ILS50526FT2Not Detected
Liv30_ILS50523FA1MENIASGLLGPL3812607.8255S
LTIPQSLDSW3710580.3024S
SAISSTFSK339464.2425S
VGLSPTVWL339486.2813S
FVGLSPTVWL3210559.8156S
FVGLSPTVW279503.273S
Liv31_ILS50523FT2MENIASGLLGPL3612607.8255S
SAISSTFSK359464.2427S
LTIPQSLDSW3410580.3034S
TCIPIPSSW319502.2487S
FVGLSPTVWL2510559.8158S
Liv32_1176935FNot Detected
Liv33_1176559FNot Detected
Liv34_1182970FNot Detected
Liv35_1181680FSTISSTFSK319479.2466S
FVGLSPTVWL3010559.8156S
GSLPQEHIVQK2311412.5617Polymerase
MENIASGLLGPL2612607.8254S
Liv36_1182812FNot Detected
Liv37_1189452FFVGLSPTVWL2410559.8156S
MENIASGLLGPL2312607.8256S
Liv38_1193117FLQDPRVRAL359534.3187S
FVGLSPTVW259503.2733S
HLYSHPIIL369546.8134S
ASRELVVSY309512.2762C
LQDPRVRALY3210410.9025S
FVGLSPTVWL3110559.815S
STLPETTVVRR3811420.2418Polymerase
GSLPQEHIIQK3511417.2343Polymerase
MENIASGLLGPL3912607.826S
Liv39_1192879FNot Detected
Liv40_405295F2FVGLSPTVWL2910559.8152Polymerase
Liv41_ILS32961D2Not Detected
Liv42_ILS33989D2Not Detected
Liv43_ILS22892S1LTIPQSLDSW3010580.3041S
FVGLSPTVWL2410559.8157S
Liv44_ILS24966D3-DS2FVGLSPTVW299503.2723S
LTIPQSLDSW3610580.3026S
FVGLSPTVWL3110559.8146S
Liv45_ILS30194D2FLLTRILTI239545.354S
TASPISSIF219461.7469S
TASPLSSIF219461.7469S
Liv46_ILS32365D1Not Detected
Liv47_ILS32963D1GTLPQEHIVHK3811420.2349Polymerase
Liv48_ILS36088FT2GTLPQEHIVHK4211420.2341Polymerase

HBV peptides and proteins detected by search against patient-specific databases.

Peptide GSLPQEHIVQK, detected once with -logP of 19 was considered as a real identification (highlighted in red for sample Liv21_499956PF). For all other peptides, a score cut-off of 20 was used.

The Mass Spectrometer raw files were searched against a consolidated database of human UniProtKB and HBV protein sequences, obtained from RNAseq of patient liver samples using PEAKSX+. Peptides were initially filtered by 5% False-Discovery Rate (FDR) followed by -logP of 20 as minimum cut-off score. Decoy peptides were removed. Only peptides of length 9 – 12 are shown in the table.

Figure 3

Having discovered novel HBV Polymerase variant peptides, which are absent in Uniprot and other public databases, for the first time utilizing RNAseq of patient samples, we next aimed to validate the peptide identities using synthetic peptide analogues. HBV peptides predicted to bind HLA-A02 and HLA-A11 by NetMHC, including FLLTRILTI (S194-202), STLPETTVVRR (C141-151), GSLPQEHIVQK (P606-616) and GTLPQEHIVHK (P606-616), were validated by retention time alignment and matching the fragmentation patterns of peptides with their synthetic heavy versions (Figure 4 and Supplementary Table S6). The synthetic peptides used for this analysis differ from the HBV peptides by a heavy Leucine (13C(6)15N(1), +7.0172) and were mixed with HLA-eluted peptides prior to LC-MS/MS analysis. All peptides tested eluted at the same retention time and in the same MS1 scan as their heavy analogues in the LC-MS/MS run (Supplementary Figure S4) and also exhibited a similar fragmentation profile. The extracted ion chromatogram was also utilized to estimate the expression level or the copy number of some of the target peptides, including the polymerase peptide variants (copy number of ~100 – 500/cell) from the intensity of corresponding heavy peptide analogue parent ions. Even at relatively low surface presentation, we could differentiate between the polymerase peptide P606-616 variants based on the fragment ions (Figure 4). These HLA-I restricted HBV peptides provide novel targets for developing a potential immunotherapy.

Figure 4

Discussion

T cell based therapeutics are a promising treatment for HBV-HCC. As such, the identification of appropriate peptide targets for these T cell therapeutics is critically important. Here we report the identification of HBV peptides from HBV+ HCC patient samples. We first reconstructed the specific HBV genome present in each patient sample using RNAseq. Then using these patient-specific sequences, we were able to analyze the LC-MS/MS data from each patient sample using an appropriate reference sequence that would account for any patient specific mutations. Some of the identified peptides have been previously reported (i.e. STLPETTVVRR (C141-151)) (), while others are novel. We detected 45 previously unreported peptides out of 54 total identified HLA-I restricted HBV peptides (Supplementary Figure S5) (). Of particular interest, we were able to identify a novel peptide GSLPQEHIVQK (POL606-616) and its variants from HBV polymerase. This peptide is a predicted HLA-A11 binder and is the most frequent HBV epitope found, appearing (together with its variants) in 74% of HLA-A11+ patient samples.

The reconstruction of the HBV genomes present in each sample also provides valuable insight to what regions of the HBV genome are integrated into hepatocytes during HCC development. We discovered that the region of the genome encompassing the small surface antigen protein, X protein, and the latter half of the polymerase protein was frequently detected. Conversely, the region encompassing the capsid protein and the first half of the polymerase protein was either not detected or only partially detected. This is in agreement with previous reports of integration of HBV genome fragments rather than the entire genome. Understanding what is frequently integrated is important to choosing a peptide target for T cell based therapeutics as a peptide that falls in a region of the HBV genome that is not commonly integrated into HCC cells would not be ideal.

Together, the combined use of the RNAseq data and LC/MS-MS data reported here demonstrate a sound approach to finding and choosing appropriate peptide targets for T cell based therapeutics. The ideal peptide target would be highly abundant on the surface of cells, found commonly across patients, conserved across multiple HBV genotypes, and fall in a region of the HBV genome that is commonly integrated in HCC. Using the methods reported here, we identified a peptide that meets some of these criteria. The POL606-616 peptide was frequently found in the HLA-A11 patient samples and does fall in the HBV genome region that is commonly integrated. Its conservation across HBV genotypes is not ideal as there is usually 2-3 amino acid changes across the different variants detected. However, it may be possible to design T cell based therapeutics around the regions of this peptide that are conserved. As we use this approach to examine more patient samples representing different HLA alleles, we may be able to find other peptides that fit more of these criteria.

Statements

Data availability statement

The mass spectrometry immunopeptidomics data have been deposited to the ProteomeXchange Consortium via the PRIDE () partner repository with the dataset identifier PXD037270.

Author contributions

MS, RC, AuC, OO, GA, CK, WO, and RS conceptualized the study. MS performed HLA immunoaffinity enrichment from patient samples, prepared samples for Mass Spectrometry, and analyzed data. RC conceptualized the strategy to identify HBV reads from bulk RNA sequencing of patient liver samples, and reconstruct HBV genomes. RC, AZ, and AuC performed read mapping analysis and also identification of integration sites. SW, NC, AnC, KL, NN, and MN performed DNA and RNA sequencing and HLA genotyping. OO, GA, CK, WO, and RS analyzed data and supervised the project. MS, RC, and AuC drafted the manuscript with contribution from OO and editing by WO and RS. All authors contributed to the article and approved the submitted version.

Conflict of interest

Regeneron authors own options and/or stock of the company. The novel proteogenomics method and unique HBV peptides detected in the study have been described in a pending provisional patent application. GA, CK, and WO are officers of Regeneron.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

This study received funding from Regeneron Pharmaceuticals, Inc. Regeneron Pharmaceuticals was involved in the study design, collection, analysis, interpretation of data, and the writing of this article.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fimmu.2022.1032716/full#supplementary-material

References

  • 1

    TsaiKNKuoCFOuJJ. Mechanisms of hepatitis b virus persistence. Trends Microbiol (2018) 26(1):3342. doi: 10.1016/j.tim.2017.07.006

  • 2

    BousaliMPapatheodoridisGParaskevisDKaramitrosT. Hepatitis b virus DNA integration, chronic infections and hepatocellular carcinoma. Microorganisms (2021) 9(8):1787. doi: 10.3390/microorganisms9081787

  • 3

    LiuL. Clinical features of hepatocellular carcinoma with hepatitis b virus among patients on nucleos(t) ide analog therapy. Infect Agent Cancer (2020) 15:8. doi: 10.1186/s13027-020-0277-y

  • 4

    MengFZhenSSongB. HBV-specific CD4+ cytotoxic T cells in hepatocellular carcinoma are less cytolytic toward tumor cells and suppress CD8+ T cell-mediated antitumor immunity. Apmis (2017) 125(8):743–51. doi: 10.1111/apm.12704

  • 5

    ThimmeRWielandSSteigerCGhrayebJReimannKAPurcellRHet al. CD8(+) T cells mediate viral clearance and disease pathogenesis during acute hepatitis b virus infection. J Virol (2003) 77(1):6876. doi: 10.1128/JVI.77.1.68-76.2003

  • 6

    FerrariC. HBV and the immune response. Liver Int (2015) 35 Suppl 1:121–8. doi: 10.1111/liv.12749

  • 7

    ChenAWangLZhangJZouLJiaZZhouWet al. H-2 kd-restricted hepatitis b virus-derived epitope whose specific CD8+ T lymphocytes can produce gamma interferon without cytotoxicity. J Virol (2005) 79(9):5568–76. doi: 10.1128/JVI.79.9.5568-5576.2005

  • 8

    Brinck-JensenNSVorup-JensenTLeutscherPDErikstrupCPetersenE. Immunogenicity of twenty peptides representing epitopes of the hepatitis b core and surface antigens by IFN-gamma response in chronic and resolved HBV. BMC Immunol (2015) 16:65. doi: 10.1186/s12865-015-0127-7

  • 9

    Bassani-SternbergMBräunleinEKlarREngleitnerTSinitcynPAudehmSet al. Direct identification of clinically relevant neoepitopes presented on native human melanoma tissue by mass spectrometry. Nat Commun (2016) 7:13404. doi: 10.1038/ncomms13404

  • 10

    ChongCMüllerMPakHHarnettDHuberFGrunDet al. Integrated proteogenomic deep sequencing and analytics accurately identify non-canonical peptides in tumor immunopeptidomes. Nat Commun (2020) 11(1):1293. doi: 10.1038/s41467-020-14968-9

  • 11

    QiYAMaityTKCultraroCMMisraVZhangXAdeCet al. Proteogenomic analysis unveils the HLA class I-presented immunopeptidome in melanoma and EGFR-mutant lung adenocarcinoma. Mol Cell Proteomics (2021) 20:100136. doi: 10.1016/j.mcpro.2021.100136

  • 12

    CleyleJHardyMPMinatiRCourcellesMDuretteCLanoixJet al. Immunopeptidomic analyses of colorectal cancers with and without microsatellite instability. Mol Cell Proteomics (2022) 21(5):100228. doi: 10.1016/j.mcpro.2022.100228

  • 13

    de BeijerMTABezstarostiKLuijtenRDoffWASBoorPPCPietermanRFAet al. Immunopeptidome of hepatocytes isolated from patients with HBV infection and hepatocellular carcinoma. JHEP Rep (2022) 4(11):100576. doi: 10.1016/j.jhepr.2022.100576

  • 14

    DongLQPengLHMaLJLiuDBZhangSLuoSZet al. Heterogeneous immunogenomic features and distinct escape mechanisms in multifocal hepatocellular carcinoma. J Hepatol (2020) 72(5):896908. doi: 10.1016/j.jhep.2019.12.014

  • 15

    TantiwetrueangdetAPanvichianRSornmayuraPSueangoenNLeelaudomlipiS. Reduced HBV cccDNA and HBsAg in HBV-associated hepatocellular carcinoma tissues. Med Oncol (2018) 35(10):127. doi: 10.1007/s12032-018-1191-7

  • 16

    TanATYangNLee KrishnamoorthyTOeiVChuaAZhaoXet al. Use of expression profiles of HBV-DNA integrated into genomes of hepatocellular carcinoma cells to select T cells for immunotherapy. Gastroenterology (2019) 156(6):186276.e9. doi: 10.1053/j.gastro.2019.01.251

  • 17

    KaoJ-HChenD-S. HBV genotypes: epidemiology and implications regarding natural history. Curr Hepatitis Rep (2006) 5(1):513. doi: 10.1007/s11901-006-0017-9

  • 18

    SunbulM. Hepatitis b virus genotypes: global distribution and clinical importance. World J Gastroenterol (2014) 20(18):5427–34. doi: 10.3748/wjg.v20.i18.5427

  • 19

    SchweitzerAHornJMikolajczykRTKrauseGOttJJ. Estimations of worldwide prevalence of chronic hepatitis b virus infection: a systematic review of data published between 1965 and 2013. Lancet (2015) 386(10003):1546–55. doi: 10.1016/S0140-6736(15)61412-X

  • 20

    PurcellAWRamarathinamSHTernetteN. Mass spectrometry-based identification of MHC-bound peptides for immunopeptidomics. Nat Protoc (2019) 14(6):1687–707. doi: 10.1038/s41596-019-0133-y

  • 21

    LiH. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics (2018) 34(18):3094–100. doi: 10.1093/bioinformatics/bty191

  • 22

    LiHHandsakerBWysokerAFennellTRuanJHomerNet al. The sequence Alignment/Map format and SAMtools. Bioinformatics (2009) 25(16):2078–9. doi: 10.1093/bioinformatics/btp352

  • 23

    QuinlanARHallIM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics (2010) 26(6):841–2. doi: 10.1093/bioinformatics/btq033

  • 24

    LiDLiuC-MLuoRSadakaneKLamT-W. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de bruijn graph. Bioinformatics (2015) 31(10):1674–6. doi: 10.1093/bioinformatics/btv033

  • 25

    BaiYuNiMCooperBWeiYiFuryW. Inference of high resolution HLA types using genome-wide RNA or DNA sequencing reads. BMC Genomics (2014) 15(1):325. doi: 10.1186/1471-2164-15-325

  • 26

    RobinsonJHalliwellJAHayhurstJDFlicekPParhamPMarshSGE. The IPD and IMGT/HLA database: allele variant databases. Nucleic Acids Res (2015) 43:D423–31. doi: 10.1093/nar/gku1161

  • 27

    MaBZhangKHendrieCLiangCLiMDoherty-KirbyAet al. PEAKS: powerful software for peptide de novo sequencing by tandem mass spectrometry. Rapid Communication Mass Spectrometry (2003) 17(20):2337–42. doi: 10.1002/rcm.1196

  • 28

    YangLYeSZhaoXJiLZhangYZhouPet al. Molecular characterization of HBV DNA integration in patients with hepatitis and hepatocellular carcinoma. J Cancer (2018) 9(18):3225–35. doi: 10.7150/jca.26052

  • 29

    JurtzVPaulSAndreattaMMarcatiliPPetersBNielsenM. NetMHCpan-4.0: Improved peptide-MHC class I interaction predictions integrating eluted ligand and peptide binding affinity data. J Immunol (2017) 199(9):3360–8. doi: 10.4049/jimmunol.1700893

  • 30

    ThomsenMCNielsenM. Seq2Logo: a method for construction and visualization of amino acid binding motifs and sequence profiles including sequence weighting, pseudo counts and two-sided representation of amino acid enrichment and depletion. Nucleic Acids Res (2012) 40(Web Server issue):W281–7. doi: 10.1093/nar/gks469

  • 31

    MissaleGRedekerAPersonJFowlerPGuilhotSSchlichtHJet al. HLA-A31- and HLA-Aw68-restricted cytotoxic T cell responses to a single hepatitis b virus nucleocapsid epitope during acute viral hepatitis. J Exp Med (1993) 177(3):751–62. doi: 10.1084/jem.177.3.751

  • 32

    WuYDingYShenC. A systematic review of T cell epitopes defined from the proteome of hepatitis b virus. Vaccines (Basel) (2022) 10(2):257. doi: 10.3390/vaccines10020257

  • 33

    Perez-RiverolYBaiJBandlaCGarcía-SeisdedosDHewapathiranaSKamatchinathanSet al. The PRIDE database resources in 2022: a hub for mass spectrometry-based proteomics evidences. Nucleic Acids Res (2022) 50(D1):D543d552. doi: 10.1093/nar/gkab1038

Summary

Keywords

proteogenomics, Hepatitis B, hepatocellular carcinoma, immunopeptidomics, human leukocyte antigen (HLA)

Citation

Srivastava M, Copin R, Choy A, Zhou A, Olsen O, Wolf S, Shah D, Rye-Weller A, Chen L, Chan N, Coppola A, Lanza K, Negron N, Ni M, Atwal GS, Kyratsous CA, Olson W and Salzler R (2022) Proteogenomic identification of Hepatitis B virus (HBV) genotype-specific HLA-I restricted peptides from HBV-positive patient liver tissues. Front. Immunol. 13:1032716. doi: 10.3389/fimmu.2022.1032716

Received

31 August 2022

Accepted

22 November 2022

Published

13 December 2022

Volume

13 - 2022

Edited by

Ning Wei, Albert Einstein College of Medicine, United States

Reviewed by

Anthony Wayne Purcell, Monash University, Australia; Nicola Ternette, University of Oxford, United Kingdom

Updates

Copyright

*Correspondence: Mayank Srivastava, ; Robert Salzler,

This article was submitted to Cancer Immunity and Immunotherapy, a section of the journal Frontiers in Immunology

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.

Outline

Figures

Cite article

Copy to clipboard


Export citation file


Share article

Article metrics