Hepatitis E virus RNA replication polyprotein: taking structural biology seriously

COPYRIGHT © 2023 Fieulaine, Tubiana and Bressanelli. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.


Hepatitis E virus (HEV) infects approximately 20 million individuals each year all around the world, in both developing and industrialized countries. It leads to 40,000-70,000 deaths annually, especially among immunocompromised patients and pregnant women. Although HEV is recognized as a major public health issue with zoonotic potential, no specific treatment is available. Indeed, characterization of the HEV life cycle is hampered by the lack of efficient infectious cell culture systems or in vivo models. A better knowledge of HEV virology is therefore needed. By providing descriptions of the three-dimensional structures of viral proteins at the atomic level, structural biology can be a powerful tool to understand viral replication and help develop specific antivirals. In this comment, we describe how both experimental and advanced computational structural biology help to decipher HEV virology and make a case for heeding its lessons.
HEV pORF1 is the replication polyprotein encoded by open reading frame 1. It contains domains that ensure the synthesis of new viral RNA genomes in infected cells. As for other single-stranded positive-sense RNA viruses [(+)RNA viruses], it encodes the viral replication complex (prominently, the RNA polymerase) and presumably includes domain(s) that allow targeting to and remodeling of a specific host endomembrane to shelter the replication complex. pORF1 organization was delineated promptly after virus identification in the early 1980s (Khuroo, 1980; Balayan et al., 1983) and genome sequencing in the 1990s (Reyes et al., 1990; Tam et al., 1991). In 1992, Koonin et al. used sequence-based computational tools to perform sequence alignments with some closely related (+)RNA viruses belonging to the Alphavirus-like superfamily and define domain boundaries of pORF1 (Koonin et al., 1992). They tentatively proposed six domains embedded in pORF1. In decreasing order of confidence, these are the aforementioned RNA-dependent RNA polymerase (RdRp, residues 1,207-1,693 for the genotype 1 (gt1) strain they analyzed), an RNA helicase (HEL, residues 960-1,192), a methyltransferase (Met, residues 56-240), a Y domain (residues 219-433), an X domain (residues 784-942) and a papain-like cysteine protease (PCP, residues 434-592). A proline-rich hypervariable region is sometimes considered a seventh domain. As the authors mentioned at the time, the confidence index for the putative PCP domain prediction was very low, and they proposed a protease in HEV pORF1 mainly because one was already known in other animal (+)RNA viruses. Since then, the HEV community has taken to referring to HEV pORF1 residues 434-592 as "the PCP domain" or "the HEV protease."
However, structural biology has recently shown that this initial assignment was erroneous, both with an experimental X-ray crystal structure and with groundbreaking computational tools, including those based on artificial intelligence (AI). As summarized below, it is now clear that HEV pORF1 does not contain any protease domain, either in residues 434-592 or anywhere else. First, in 2019 Proudfoot et al. solved the crystal structure of the 510-690 fragment of a gt1 HEV pORF1 (Proudfoot et al., 2019). Structurally, this fragment is unequivocally a member of a large protein family known as fatty acid binding proteins. Even though the role of this fatty acid binding domain (FABD)-like domain during the viral life cycle is not yet known, it fits the definition of a protein domain, i.e., a region of a protein that is self-stabilizing and folds, functions and evolves independently from the rest. Thus, because this independently folding domain spans part of the putative PCP region, it has been established since 2019 that residues 434-592 cannot be a protease domain.
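The boundary argument above can be checked with simple interval arithmetic. The following is a minimal sketch in Python, not part of the original analyses: the residue ranges are those quoted in the text (Koonin et al., 1992; Proudfoot et al., 2019), while the names `KOONIN_1992`, `FABD_LIKE_FRAGMENT`, `overlap` and `overlaps_with` are hypothetical helpers introduced here for illustration.

```python
from typing import Dict, Tuple

Range = Tuple[int, int]  # 1-based, inclusive residue numbering

# Domain boundaries proposed by Koonin et al. (1992) for a gt1 strain,
# as quoted in the text.
KOONIN_1992: Dict[str, Range] = {
    "Met": (56, 240),
    "Y": (219, 433),
    "PCP": (434, 592),
    "X": (784, 942),
    "HEL": (960, 1192),
    "RdRp": (1207, 1693),
}

# Fragment crystallized by Proudfoot et al. (2019), as quoted in the text.
FABD_LIKE_FRAGMENT: Range = (510, 690)


def overlap(a: Range, b: Range) -> int:
    """Number of residues shared by two inclusive ranges (0 if disjoint)."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start + 1)


def overlaps_with(fragment: Range, domains: Dict[str, Range]) -> Dict[str, int]:
    """Map each domain that overlaps the fragment to the shared residue count."""
    shared = {}
    for name, residues in domains.items():
        n = overlap(fragment, residues)
        if n > 0:
            shared[name] = n
    return shared


shared = overlaps_with(FABD_LIKE_FRAGMENT, KOONIN_1992)
print(shared)  # {'PCP': 83}
```

Under these boundaries, the crystallized fragment shares 83 residues (510-592) with the 159-residue putative PCP region and none with the other predicted domains, which is why the folded FABD-like domain is incompatible with the original PCP assignment.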
In recent years, the AI-based AlphaFold2 (AF2) tool (Jumper et al., 2021) revolutionized the field of sequence-based protein structure prediction, providing, from protein sequences alone, structural models with high accuracy and very good estimation of the error in the coordinates (Jumper et al., 2021; Tunyasuvunakool et al., 2021). Three groups, including ours, have used AF2 to generate accurate structural models of HEV pORF1, either in its full-length form (Fieulaine et al., 2023; LeDesma et al., 2023) or segmented into two overlapping fragments [1-1,250 and 1,000-1,708 (Goulet et al., 2022)]. All three groups independently obtained similar models, exhibiting five domains with very high confidence scores and a long disordered region corresponding to the hypervariable, proline-rich region (Figure 1A). [...] is that pORF1 functions could be regulated not by proteolytic processing but rather by structural flexibility of different motifs: indeed, both the N- and C-terminal α-helices of MetY seem to alternate between unfolded and folded states, the C-terminal extension of the FABD-like domain could open to allow the binding of yet unidentified ligand(s), and the RdRp could alternate between different conformations, especially in its fingertips motif (Fieulaine et al., 2023). In this respect, decade-long studies on the distantly related flock house virus, whose counterpart to HEV pORF1 is not cleaved, have recently culminated in remarkable structural and cellular work (Zhan et al., 2023). This latter work establishes how flexible linkers allowing large conformational switches can be used to build a replication complex harboring all major functions for (+)RNA virus replication through formation of a large oligomeric ring of the uncleaved replication polyprotein.
We are at a watershed in HEV biology in which too many researchers still refer to the HEV PCP, despite the hard data we just outlined. This situation is very well exemplified by the work of LeDesma et al. recently published in eLife (LeDesma et al., 2023). These authors set out to probe the function of the putative PCP and also reached the conclusion that it is not a protease, that pORF1 is likely not cleaved, and that regulating pORF1 structure is crucial for its functions (Dearborn et al., 2023). Importantly, several cysteines, including Cys483, which was initially proposed to be the catalytic cysteine of the HEV protease, do play a role during viral replication. These cysteines are found in a hexa-Cys motif (CxCx₁₁CCx₈CxC in region 457-483) that is likely to bind divalent cations, most probably zinc (Dearborn et al., 2023; LeDesma et al., 2023). Strikingly, this motif corresponds to the 461-477 α-helix located at the C-terminus of the MetY domain (Figure 1B) that we proposed would play a central role during MetY oligomerization and binding to yet unidentified membranes (Fieulaine et al., 2023).
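The hexa-Cys motif quoted above can be written as a regular expression, which makes its length and position explicit. The following is a minimal sketch in Python: the function `find_hexa_cys` is a hypothetical helper, and the test sequence is synthetic (poly-alanine with the six cysteines placed by hand), not an actual HEV pORF1 sequence.

```python
import re

# CxCx11CCx8CxC: C, any residue, C, 11 residues, CC, 8 residues,
# C, any residue, C. The pattern spans 27 residues in total, which
# matches the length of region 457-483 quoted in the text.
HEXA_CYS = re.compile(r"C.C.{11}CC.{8}C.C")


def find_hexa_cys(sequence: str):
    """Return 1-based inclusive (start, end) positions of non-overlapping matches."""
    return [(m.start() + 1, m.end()) for m in HEXA_CYS.finditer(sequence)]


# Synthetic example (assumption: NOT a real HEV sequence). The motif is
# placed so that it starts at position 457, as in the text.
toy_motif = "C" + "A" + "C" + "A" * 11 + "CC" + "A" * 8 + "C" + "A" + "C"
toy_sequence = "A" * 456 + toy_motif + "A" * 20

print(len(toy_motif))               # 27
print(find_hexa_cys(toy_sequence))  # [(457, 483)]
```

With the motif starting at residue 457, its last cysteine falls at position 483, consistent with Cys483 being one of the six cysteines of the motif.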
In light of these new data, we think it is time to stop using the terms "HEV protease" or "PCP domain," as the corresponding region is now established to be neither a PCP nor even a domain, and to heed the contribution of structural biology in probing the actual functions of this part of HEV pORF1.

Author contributions
SF: Conceptualization, Funding acquisition, Writing-original draft, Writing-review and editing. TT: Visualization, Writing-review and editing. SB: Conceptualization, Funding acquisition, Writing-review and editing.

Funding
This research received specific grants from ANRS (ECTZ105819 and ECTZ188022) and a postdoctoral fellowship (ECTZ189696 to TT).