Identification of AIDS-Associated Kaposi Sarcoma: A Functional Genomics Approach

Background: Kaposi sarcoma-associated herpesvirus (KSHV) is one of the most common causal agents of Kaposi sarcoma (KS) in individuals with HIV infection. The virus has gained attention over the past few decades because of its remarkable pathogenic mechanisms. A group of genes, ORF71, ORF72, and ORF73, is expressed as polycistronic mRNAs, and the functions of ORF71 and ORF72 in KSHV have already been reported in the literature. However, the function of ORF73 has remained a mystery. The aim of this study is to conduct comprehensive exploratory experiments to clarify the role of ORF73 in KSHV pathology and to discover markers of AIDS-associated KSHV-induced KS using bioinformatic approaches. Methods and Results: We searched for homologues of ORF-73 and attempted to predict protein-protein interactions (PPI) based on GeneCards and UniProtKB, utilizing Position-Specific Iterated BLAST (PSI-BLAST). We applied Gene Ontology (GO) and KEGG pathway analyses to identify highly conserved regions between ORF-73 and p53, to help us identify potential markers with predominant hits and interactions in the KEGG pathway associated with host apoptosis and cell arrest. The protein p53 was selected because it is an important tumor suppressor antigen. To identify the potential roles of the candidate markers at the molecular level, we used PSIPRED, keeping the conserved domains as the major parameters, to predict secondary structures. We based the FUGUE interpretation of the sequence-structure comparisons on distant homology, where the scores for matching amino acids and for the insertions/deletions (indels) detected were based on structures compared against the FUGUE database of structural profiles. We also calculated the compatibility scores of the sequence alignments accordingly. Based on the PSI-BLAST homologues, we checked the disordered structures predicted using PSIPRED and DISOPRED to develop a hidden Markov model (HMM). We further applied these HMMs based on the alignment of constructed 3D models between the known structure and the HMM of our sequence. Moreover, stable homology and structurally conserved domains suggested that ORF-73 may be an important prognostic marker for AIDS-associated KS. Conclusion: Collectively, similar variants of ORF-73 markers involved in the immune response may interact with targeted host proteins, as predicted by our computational analysis. This work also suggests the existence of potential conformational changes that need to be explored further to help elucidate the role of immune signaling during KS, towards the development of therapeutic applications.

INTRODUCTION
Pre-existing human immunodeficiency virus (HIV) infection affects the immune system, increasing the risk of developing Kaposi sarcoma (KS). Since the discovery of Kaposi sarcoma-associated herpesvirus (KSHV), also termed human herpesvirus 8 (HHV8), tumor development and oncogenesis have been associated with the co-expression of different genes (Barré-Sinoussi et al., 1983; Gelmann et al., 1983). KS is a common type of cancer associated with blood vessels and lymph nodes. Soon after the discovery of HIV-1, scientists discovered a gamma-herpesvirus in KS lesions (Chang et al., 1994). Now that the full KSHV genome has been sequenced, it fulfils the modern version of Koch's postulates, linking KS cancer initiation to the oncogenic virus (Russo et al., 1996; zur Hausen, 2001). KSHV is a key viral pathogen in cancer biology affecting humans, and its discovery promoted clinical and epidemiological research into viral oncology (Chang et al., 1994). However, many questions remain unanswered due to the significant mortality and rapid morbidity of those affected by HIV-1 and KSHV (Parkin, 2006; Sinfield et al., 2007; Dittmer and Damania, 2019; Gaur et al., 2019).
KS was named after Dr. Moritz Kaposi, a prominent Hungarian dermatologist, who described KS as an 'idiopathic pigmented sarcoma of the skin' in 1872 (Kaposi, 1872). The evolved gamma-herpesviruses have been classified into many subfamilies (Roizman et al., 1981) and produce many viral gene products capable of subverting the normal cellular machinery through processes involving apoptosis, cell cycle progression, antiviral responses, and immune surveillance, resulting in alterations in master cell signaling pathways that establish a persistent host infection. The double-stranded KSHV genome (124-174 kb) is enclosed in an icosahedral capsid composed of 162 capsomeres, with many of its ORFs being conserved in alpha- and beta-herpesviruses but absent from other herpesviruses.
KSHV belongs to the genus Rhadinovirus (gamma-2-herpesviruses), as does herpesvirus saimiri (HVS); therefore, similarities between the ORFs of KSHV and HVS may influence the pathogenesis of KS (Schäfer et al., 2003). The HVS genome exists as a stable non-integrated circular episome in altered human and simian T cells. A group of genes, ORF71, ORF72, and ORF73, is located at the right end of the L-DNA and is expressed as polycistronic mRNAs (Fickenscher et al., 1996). Initial studies showed that both KSHV and HVS ORF71 encode the anti-apoptotic FLICE inhibitory protein (vFLIP) (Thome et al., 1997), although HVS ORF71 is not required for viral replication, transformation, or pathogenicity (Glykofrydes et al., 2000). ORF72 produces a v-cyclin D homolog that is important for the transformation of human T lymphocytes (Ensser et al., 2001). However, the function of ORF73 has remained a mystery. Therefore, developing and conducting comprehensive exploratory experiments to clarify the role of ORF73 in KSHV pathology is important.
Typically, the phenotypic features of KS initially appear on the face, legs, or feet as painless red spots but, in severe cases, lesions also appear in the lungs and digestive tract (Bhutani et al., 2015; Yarchoan et al., 2015). KSHV is considered an oncogenic human virus (Martin et al., 1998). People with weak immune systems are more susceptible. Even with the availability of highly active antiretroviral therapy (HAART), the prevalence of AIDS-associated KS has not declined significantly (Nguyen et al., 2008). Although KSHV infection is important for the onset of KS, additional factors must be present to allow the establishment of lesions. The chance of infection is one in 100,000 among the general population, but around one in 20 among HIV-infected individuals (La Ferla et al., 2013). The chance of acquiring the infection was one in three among HIV-infected individuals before the introduction of HAART (Beral et al., 1990; Gallo, 1998). Epidemiological observations of incidence rates in endemic areas suggest that HIV-negative individuals with KSHV infections never develop KS, owing to the role of immunological host factors including immune-response genes and genetic polymorphisms of inflammatory modulators (Cottoni et al., 2004; Gazouli et al., 2004; Dorak et al., 2005).
KSHV infection of endothelial and/or hematopoietic progenitors (Della Bella et al., 2008) alters their morphology (Moses et al., 1999), growth rate, gene expression (Flore et al., 1998; Ciufo et al., 2001), and glucose metabolism (Delgado et al., 2010), leading to the development of KS. Antibody titers specific for KSHV correlate with viral load. Among individuals with a low viral load, antibody titers may be too low for current serological assays to detect. Identification of circulating biomarkers in KSHV-associated disease may help in predicting clinical outcomes (Aka et al., 2015). Immune modulatory and evasion proteins of KSHV modulate cellular responses associated with complement activation, autophagy, IFN family signaling, chemokines, natural killer cells, and apoptosis (Liang et al., 2008). They are located in the tegument, a protein-rich layer surrounding the viral capsid. The tegument proteins identified include ORF21, ORF33, ORF45, ORF63, ORF64, ORF73, and ORF75. Among these, the roles of ORF63 and ORF64 in immune evasion have been elucidated (Zhu et al., 2005; Gregory et al., 2011). We focused on identifying the role of ORF73 in KSHV. The ORF73 gene encodes the HHV-LANA1 viral protein, which has been linked with AIDS-associated KS, indicating an association between HIV and ORF73. For our computational study, we hypothesized that ORF-73 is a viral proliferation factor, based on studies of KS and of its interactions with the host gene p53 (Woodberry et al., 2005). ORF-73 is important for host-cell apoptosis through the p53 signaling pathway, with p53 acting downstream of ORF-73, which illustrates the molecular mechanism of this key biomarker associated with KS (Duus et al., 2004).
The variability in KS lesions observed in histopathological assays includes spindle cell hemangiomas, cutaneous angiosarcomas, vascular leiomyomas, and fibrous histiocytomas (Hunt et al., 2004). Endothelial biomarkers, such as CD31 and CD34, bcl-2, c-kit, Ki-67, and p53, have been used to distinguish nonvascular spindle sarcomas from angiosarcomas (Weeden, 2002; Fukunaga, 2005). Hence, investigating the HHV latency-associated nuclear antigen-1 (LANA-1) viral protein encoded by ORF-73 is important for identifying markers of AIDS-associated KS. Studying its interactions may also help in the development of preventive strategies and therapeutic options against KS. In this study, we used advanced bioinformatics tools and approaches to identify KS markers (Supplementary Figure 1).

Selection of Markers
We used publicly available databases, including the National Center for Biotechnology Information (NCBI), GeneCards (Hou et al., 2017), and UniProtKB, to identify potential markers of KS and selected the most specific ones using "Kaposi's sarcoma" as a keyword. Human protein markers were then run through a BLAST search for homologous sequences. We extracted ORF-73 sequences from the NCBI database using the accession number AAC57158.1. The exact URLs of the database searches used to identify markers associated with KS were: GeneCards, https://genecards.weizmann.ac.il/v3/index.php?path=/Search/keyword/kaposi%20sarcoma%20markers/0/20; UniProtKB, https://www.uniprot.org/uniprot/?query=kaposi+sarcoma&sort=score; and NCBI, https://www.ncbi.nlm.nih.gov/protein/?term=ORF-73%20kaposi%20sarcoma.
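The keyword searches described in this section can be rebuilt programmatically. The following is a minimal sketch, assuming the URL patterns quoted in the text still resolve; the helper name `build_search_urls` is hypothetical, and the patterns are copied from the text rather than taken from any documented API.

```python
from urllib.parse import quote, quote_plus


def build_search_urls(genecards_keyword, uniprot_query, ncbi_term):
    """Rebuild the three search URLs from their keywords.

    The URL layouts mirror those quoted in the text; they are not taken
    from official API documentation and may have changed since the study.
    """
    return {
        # GeneCards encodes spaces as %20 inside the path.
        "GeneCards": (
            "https://genecards.weizmann.ac.il/v3/index.php?path=/Search/keyword/"
            + quote(genecards_keyword)
            + "/0/20"
        ),
        # The UniProtKB query string encodes spaces as '+'.
        "UniProtKB": (
            "https://www.uniprot.org/uniprot/?query="
            + quote_plus(uniprot_query)
            + "&sort=score"
        ),
        # The NCBI protein search encodes spaces as %20.
        "NCBI": "https://www.ncbi.nlm.nih.gov/protein/?term=" + quote(ncbi_term),
    }


if __name__ == "__main__":
    urls = build_search_urls(
        "kaposi sarcoma markers", "kaposi sarcoma", "ORF-73 kaposi sarcoma"
    )
    for database, url in urls.items():
        print(database, url)
```

Keeping the keywords separate from the URL templates makes it easier to record exactly which query produced which marker list.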

Bioinformatics: Sequence Computational Analysis
We used publicly available internet-based protein search tools and bioinformatics programs with default settings, unless otherwise stated in the text, for the analysis. We tested the selected protein sequences to identify conserved domains using NCBI and BLAST algorithms, and we used the PSIPRED program to predict protein secondary structure from the conserved domain sequences. We further ran a position-specific iterated BLAST (PSI-BLAST) search to build a position-specific scoring matrix (PSSM), which can be used to predict the secondary structure of the input sequences (Majerciak et al., 2015). The PSSM was used to predict the secondary structures of the selected conserved domains and was derived from a multiple sequence alignment of related proteins spanning a variety of organisms, which reveals sequence regions containing the same, or similar, patterns of amino acids. We submitted the primary sequence of ORF-73 to FUGUE to assess sequence-structure homology by identifying distant sequence-structure homologues and alignments, comparing amino acid insertions/deletions (Shi et al., 2001). We used BLASTp and PSI-BLAST (non-redundant protein databases) for pattern-specific profiling (Bujnicki and Rychlewski, 2001).
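To illustrate what a position-specific scoring matrix captures, the sketch below builds a toy log-odds PSSM from a small gap-free alignment. It is a simplified stand-in, not PSI-BLAST's procedure: it assumes a uniform background frequency and a flat pseudocount, whereas PSI-BLAST uses sequence weighting and substitution-matrix-derived pseudocounts. The function names and the example alignment are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment, pseudocount=1.0):
    """Return one dict per alignment column mapping residue -> log2-odds score."""
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    pssm = []
    for col in range(length):
        # Count residues in this column, ignoring gaps and unknown symbols.
        counts = Counter(seq[col] for seq in alignment if seq[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        column = {}
        for aa in AMINO_ACIDS:
            freq = (counts[aa] + pseudocount) / total
            column[aa] = math.log2(freq / background)
        pssm.append(column)
    return pssm


def score_sequence(pssm, sequence):
    """Sum the per-position scores of a sequence of the same length as the PSSM."""
    if len(sequence) != len(pssm):
        raise ValueError("sequence length must match the PSSM length")
    return sum(column.get(aa, 0.0) for column, aa in zip(pssm, sequence))


if __name__ == "__main__":
    toy_alignment = ["ACD", "ACE", "ACD"]  # illustrative data only
    pssm = build_pssm(toy_alignment)
    print(score_sequence(pssm, "ACD"), score_sequence(pssm, "WWW"))
```

Residues that recur in a column receive positive scores and unseen residues receive negative ones, which is why a profile can detect distant homologues that a single-sequence search misses.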

Gene Ontology and Pathway Enrichment Analysis
We chose the ORF-73 target effector for a Gene Ontology (GO) search. GO is a hierarchical graph-based annotation system in which terms closer to the root describe more general information, while those farther from the root provide more specific information about a given GO category. All the GO terms associated with a protein sequence were obtained from the GO database. We performed KEGG network pathway enrichment analysis by collecting data on related genomes and their disease-associated pathways (Yan et al., 2013), and we set P < 0.05 as the cut-off criterion.
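Enrichment P values of this kind are commonly computed with a one-sided hypergeometric test. The sketch below shows that calculation under this assumption; the text does not state which test the enrichment tools applied, and the function name and the counts in the example are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, hits):
    """One-sided hypergeometric P value, P(X >= hits).

    population: number of genes in the background set
    annotated:  background genes annotated to the term or pathway
    sample:     number of genes in the query list
    hits:       query genes annotated to the term or pathway
    """
    if not (0 <= annotated <= population and 0 <= sample <= population):
        raise ValueError("annotated and sample must lie within the population")
    if not (0 <= hits <= min(annotated, sample)):
        raise ValueError("hits cannot exceed annotated or sample")

    upper = min(annotated, sample)
    total = comb(population, sample)
    # math.comb returns 0 when k > n, so impossible terms drop out naturally.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Illustrative counts only, not values from the study.
    p = enrichment_p_value(population=20, annotated=5, sample=5, hits=5)
    print(p, p < 0.05)
```

A term would pass the cut-off used here when the returned value is below 0.05; correction for multiple testing is a separate step that this sketch does not cover.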

Protein-Protein Interaction (PPI) Network Analysis
We used the online Search Tool for the Retrieval of Interacting Genes (STRING) (Franceschini et al., 2013) and GeneMANIA (https://genemania.org/) to analyze interactions associated with KS among the proteins encoded by the differentially expressed genes (DEGs). The GeneMANIA algorithm consists of two parts: a linear regression-based algorithm that calculates a composite functional association network from multiple networks derived from different data sources, and a label propagation algorithm that predicts gene function from the composite network. We employed keywords such as ORF73 to determine interacting partners. This was pursued using the downstream regulator p53 as an apoptosis marker during pathogenesis in the host. Moreover, the marker protein was used for the transient interaction study.
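The two-stage idea of combining networks and then propagating labels can be sketched in a few lines. This is a simplified illustration and not the GeneMANIA implementation: the network weights are supplied by hand instead of being fitted by regression, and the propagation is a plain iterative scheme on a row-normalized matrix. All names and the toy network are hypothetical.

```python
def combine_networks(networks, weights):
    """Weighted sum of equally sized adjacency matrices (lists of lists)."""
    if len(networks) != len(weights) or not networks:
        raise ValueError("need one weight per network")
    n = len(networks[0])
    combined = [[0.0] * n for _ in range(n)]
    for network, weight in zip(networks, weights):
        for i in range(n):
            for j in range(n):
                combined[i][j] += weight * network[i][j]
    return combined


def propagate_labels(adjacency, seeds, alpha=0.8, iterations=50):
    """Spread seed scores over the network.

    Each round sets score[i] to alpha times the weighted mean of the
    neighbours' scores plus (1 - alpha) times the node's own seed value.
    """
    n = len(adjacency)
    row_sums = [sum(row) or 1.0 for row in adjacency]  # avoid division by zero
    scores = list(seeds)
    for _ in range(iterations):
        scores = [
            alpha * sum(adjacency[i][j] * scores[j] for j in range(n)) / row_sums[i]
            + (1 - alpha) * seeds[i]
            for i in range(n)
        ]
    return scores


if __name__ == "__main__":
    # Toy chain 0-1-2 plus an isolated node 3; node 0 is the query (seed).
    chain = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
    composite = combine_networks([chain], [1.0])
    print(propagate_labels(composite, [1.0, 0.0, 0.0, 0.0]))
```

Nodes closer to the seed end up with higher scores, which is the intuition behind ranking candidate interacting partners of a query protein.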

PPI Biochemical Analysis
We immobilized His-tagged, GST-tagged, or biotin-tagged bait proteins on an affinity resin and incubated them with solution-expressed prey proteins. We then captured the bound bait and pulled down interacting proteins from the cell lysate flow-through. Subsequently, we used mass spectrometry (MS) or Western blots to confirm interactions. Using this technique, we determined interacting protein partners of relevant proteins (Einarson, 2001; Arifuzzaman et al., 2006).

Homology Search and KS Marker Identification
Annotation-based searches for KS-associated markers in the UniProtKB database returned about 137 entries, which we then screened to find those with computationally annotated data. The GeneCards search engine reported about 369 KS markers, each with a relevance score. Table 1 lists the markers with the top ten scores.
We found 61 ORF-73 marker homologous hits related to the family of human gammaherpesvirus 8, with varied E-values. Of these, we used only the most identical sequence, AAC57158.1, for our computational analyses (sequence identity was measured as the number of matched residues divided by the length of the aligned region). A search for proteins similar to the selected marker ORF-73 resulted in 8 protein accessions (ORF21, ORF33, ORF45, ORF63, ORF64, and ORF75) and 2 CDS regions (accession numbers AAC57158.1 and AAC55944.1).
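The identity measure used to rank the hits, matched residues divided by the length of the aligned region, can be written as a short function. This is a minimal sketch assuming two pre-aligned sequences of equal length with `-` as the gap character; the function name and example strings are hypothetical, and BLAST's own reporting may treat gaps differently.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity: identical positions divided by aligned columns.

    Columns where both sequences have a gap are skipped; a column with a
    gap in only one sequence counts towards the aligned length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    aligned_columns = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue
        aligned_columns += 1
        if a == b:
            matches += 1
    if aligned_columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / aligned_columns


if __name__ == "__main__":
    print(percent_identity("ACDE", "ACDF"))  # 75.0
```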

Domain Prediction and Structural Profile
We looked for conserved domains in the marker protein ORF-73 based on hypothetical domain sequences, using a literature review and NCBI's Conserved Domain Database (CDD). To identify potential marker roles at the molecular level, we focused on its predicted secondary structure. Therefore, we searched for a hypothetical protein having a conserved domain, used accession number AAC5744 of gi.1633572 in an NCBI domain search, and found only one significant hypothetical conserved domain (PHA03169) with the same accession number (Figure 1). We then used PSIPRED to predict the secondary structure, noted the conserved domains (Figure 2), and highlighted the regions with different markers to predict the secondary structures. The FUGUE interpretation of the sequence-structure comparison was based on distant homology, where the scores for matching amino acids and for the insertions/deletions (indels) detected were based on structures compared against the FUGUE database of structural profiles, and we calculated the compatibility scores of the sequence alignment accordingly (Table 2). Using PSI-BLAST, we confined the search for HHV latency-associated nuclear antigen homology to ORF-73 homologs. For the DNA-binding viral protein associated with HHV-8 LANA, 134 residues (12% of the sequence) were modelled with 100% confidence based on the single highest-scoring template, c4k2jB (Figures 3 and 4). A total of 598 residues (51%) could be modelled at >90% confidence using multiple templates. We submitted the top-ranking model of the protein (c4k2jB, 100.0% confidence) to the 3DLigandSite server (Wass et al., 2010) to predict potential binding sites. Based on the PSI-BLAST homologues, the predicted disordered structures were checked using PSIPRED (Jones, 1999) and DISOPRED (Jones and Cozzetto, 2015) to generate a hidden Markov model (HMM). The models were based on the alignment of the constructed 3D models between the known structure and the HMM of our sequence, predicting three states: α-helix, β-strand, or coil ("SS" indicates the predicted confidence; middle orange, yellow, and green indicate the confidence of prediction).

Gene Expression and Pathway Prediction
The exclusive over-expression of HHV-8 LANA-1 in KS confirms significant sensitivity and specificity. The domain is conserved in HHV-8 and ORF-73, suggesting that it is expressed during viral latency and allowing it to interact with p53, thereby inducing the apoptosis pathway. Evidence from another study indicates abnormal expression of p53 in the nodular region and metastatic lesions of angiosarcomas, rather than in the primary lesion (Yee-Lin et al., 2018). To account for this, we examined p53 in KS with reference to the KEGG database model of herpesvirus-associated infection, so as to understand immune evasion through a detailed pathway demonstrating the dominant role of p53 in KSHV (Figure 5). The tumor suppressor antigen p53 depends on cellular conditions to induce arrest of cell growth and to control cell division. This process inhibits cyclin-dependent kinases and is mediated by the expression of BAX and FAS antigens or by the repression of Bcl-2 expression (Kanashiro et al., 2003). Addressing the markers involved in cell-cycle arrest is important for understanding the molecular evolution of KS and for working towards its eradication. We examined PPIs to explore the complex biochemical interactions and molecular functions of the proteins of interest with cellular components, as reported in Table 3. Table 3 also presents the functional enrichment of p53, including its biological processes, molecular functions, and cellular components. The effector p53 is directly involved in the arrest of G1/S cell-cycle progression from normal to cancerous cells (Chen, 2016). Analysis of PPI with STRING gave a PPI enrichment p-value of 1.31e−05, indicating that the network has significantly more interactions than expected, with 11 nodes, 47 edges, an average node degree of 8.55, and an average local clustering coefficient of 0.919 (Figure 6). The functions of the tumor protein p53 are associated with various expression levels during oncogenesis. GeneMANIA predicted various valuable functions of the query protein and the interacting partners associated with it (Figure 7).
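The STRING summary statistics reported above are related by simple graph arithmetic: every edge contributes to the degree of two nodes, so the average node degree is 2 × 47 / 11 ≈ 8.55. The sketch below reproduces that figure and shows how a local clustering coefficient is defined for one node; the function names and the toy triangle graph are hypothetical, and the adjacency of the actual network is not reproduced here.

```python
def average_node_degree(n_nodes, n_edges):
    """Average degree of an undirected graph: each edge touches two nodes."""
    if n_nodes <= 0:
        raise ValueError("graph must contain at least one node")
    return 2.0 * n_edges / n_nodes


def local_clustering(adjacency, node):
    """Fraction of a node's neighbour pairs that are themselves connected.

    adjacency maps each node to the set of its neighbours (undirected).
    """
    neighbours = adjacency[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(
        1 for u in neighbours for v in neighbours if u < v and v in adjacency[u]
    )
    return 2.0 * links / (k * (k - 1))


if __name__ == "__main__":
    # Node and edge counts as reported in the text.
    print(round(average_node_degree(11, 47), 2))  # 8.55

    # Toy triangle: every neighbour pair is connected, so the coefficient is 1.0.
    triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
    print(local_clustering(triangle, "a"))  # 1.0
```

A complete graph on 11 nodes would have 55 edges, so 47 edges corresponds to a densely connected network, consistent with the high clustering coefficient reported.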

Pulldown Strategy and Protein Interaction Prediction for Biomarker Selection
Pull-down assays serve as a complementary method to further validate the predicted interactions quantitatively, towards understanding their dissociation constants, the relative binding of proteins, and their direct binding sites. However, this is beyond the scope of this study. We believe the following recommendations should be followed by researchers investigating transient protein interactions. First, determining protein solubility is essential: if the prey protein is at too high a concentration, it will not be sufficiently soluble. Second, shortening the incubation time and adjusting the buffer conditions help prevent prey protein degradation. Third, as a control, the prey protein should be checked with beads to which no bait protein is bound. Fourth, conducting all assays at a constant temperature of 4°C should be considered if a variation in Kd is found between repeated experiments. The tumor suppressor antigen p53 depends on specific cellular conditions to induce arrest of cell growth and to control cell division (Pucci et al., 2000; Chen, 2016).
Our network analysis (entry N00170, class nt06164) showed the involvement of LANA and other effector markers in KS conditions and helped elucidate their mechanisms of action (Figure 8, Table 4). Therefore, we suggest that ORF-73 is an important protein that may be a useful biomarker for AIDS-related KS. Studies have suggested a linkage between ORF-73 and host apoptosis through p53 signaling pathways (Tornesello et al., 2018), which could represent a molecular mechanism for the predicted markers associated with KS. Our study identified KS-associated markers that are reported to trigger cancer. ORF-73 encodes the LANA-1 viral protein of KSHV, which is linked with AIDS-associated KS through its interaction with several cellular processes, including cell apoptosis (through p53) and inhibition of downstream transcriptional activity. The association between HIV and ORF73 can be inferred from these findings.

DISCUSSION
Many viral genes in KSHV are homologous to host cellular genes (Swanton et al., 1997). Searches of PubMed, Google Scholar, and Scopus confirmed the key diagnostic markers for KS based on the available literature. Our computational study of these markers revealed their importance and evolutionary role in human cancer biology. LANA-1 imparts important immunogenic effects to KSHV, and it specifically interacts with many cellular pathways, including that of cell apoptosis (through its interaction with p53 and repression of downstream transcripts; see Table 4). This induces oncogenesis by targeting the pRb-E2F transcriptional regulatory pathway (Radkov et al., 2000). The protein homologues identified through our search were structurally different from each other. Therefore, we analyzed selected proteins and compared them using homology searches for the selected domains to demonstrate interactions with other host proteins that trigger and induce cancer in individuals with immunosuppression (Kersse et al., 2011). Hypermutation and conserved structural sequence similarities help to maintain key aspects of secondary and tertiary structures, which was consistent with the computational analyses in our study (Huang et al., 2002). Figure 5 shows the KSHV infection pathway from KEGG. We highlighted the reference pathway using a red box, which shows that LANA is associated with the p53 signaling pathway. A BLAST homology search confirmed an ORF-73 marker interaction during herpesvirus pathogenesis. The results of the STRING and KEGG searches suggested that ORF-73 interacts with host p53.
ORF-73 is not the only protein marker implicated in KS pathology, but much about it remains unknown. It is used as a marker for KSHV; in particular, its protein folding and motifs are important for marker assessment, as observed in the pattern of structural domains in the selected sequence analyzed with PSIPRED. The network-based analysis of pathogenic interactions between LANA and host p53 was supported by the STRING and FUGUE tools. The predicted sequence motifs indicate detailed interactions that are conserved in the subfamilies of the herpesviruses, as shown in the KEGG pathway, with notable mechanisms described in the literature (Schulz, 2000; Direkze and Laman, 2004; Sharma-Walia et al., 2004; Mesri et al., 2010). However, the markers associated with KS need to be incorporated into comprehensive clinical cohort studies, designed using differential protein purification techniques and evidence-based knowledge of protein interactions with bait proteins, to develop practical medical applications in the future.
Like all other herpesviruses, KSHV displays latent and lytic replication cycles, each characterized by the expression of particular viral genes. The latency genes LANA, v-FLIP, v-cyclin, and Kaposins A, B, and C facilitate the establishment of the virus in its host and its survival against host immune mechanisms. During latency, proteins expressed as K1, K15, vIL6, vGPCR, vIRFs, and vCCLs participate in the inflammatory and angiogenic processes evident in KS lesions. Many other lytic and latent viral proteins are involved in the transformation of KSHV host cells into malignant cells. Also, Bcl-2 is one of the major KS progression factors, and TP53 and c-myc have a role in the progression of disease. KS pathology is interconnected with immune modulation effects such as cell cycle arrest in the host cell, which is required for pathogenic conditions and is mitigated by modulating key factors such as LANA. Likewise, measuring the expression level and identifying the function of the encoded protein products is important for understanding the pathogenesis of KS. We used a methodology similar to that in co-immunoprecipitation (Co-IP) experiments because of our ligand's affinity to capture the strongest interacting proteins (Lapetina and Gil-Henn, 2017). MS identifies subunits and helps explore the structural information associated with the protein of interest (Byrum et al., 2012). Dynamic PPI machines assemble and disassemble in response to ever-changing inter-, intra-, and extracellular cues; characterizing them is a preliminary step towards understanding the structure of proteins, determining their functions, and identifying the relevant pathways of interacting proteins (Einarson, 2001; Vikis and Guan, 2004; Einarson et al., 2007). An important reason for selecting ORF-73 in this study is that it encodes the LANA protein, which contains a distinct domain with a putative nuclear localization signal (NLS) and whose product has been shown to interact with cellular proteins including p53, pRb, and ATF4/CREB2.

FIGURE 5 | The Kaposi sarcoma-associated herpesvirus infection pathway from KEGG. The reference pathway highlighted with a red box shows that LANA is associated with the p53 signaling pathway, which supports the predicted role of the ORF-73 protein as a KS-associated marker protein.

LANA also modulates the transcriptional activity of the HIV-1 long terminal repeat, and understanding how ORF-73 appears to prevent the activity of KS-associated genes is needed to design a preventive strategy (Schäfer et al., 2003). Our findings may help researchers planning cancer prevention strategies, but we used common computational analyses alone, and future studies with expression and interaction analyses are needed to confirm our results and generate treatment options for KS.

CONCLUSION
Our computational studies found that ORF-73 is involved in host apoptosis through p53 signaling pathways and is a key marker associated with Kaposi sarcoma. This study also identified potential KS-associated genes that are reported to trigger cancer and suggested mechanisms of interaction that may help researchers develop prevention strategies.

ETHICS STATEMENT
We retrieved all data from publicly available resources and we required no ethical approvals for dissemination of this purely academic information.