Molecular Mechanisms Behind Anti SARS-CoV-2 Action of Lactoferrin

Despite the huge effort to contain the infection, the novel SARS-CoV-2 coronavirus has rapidly become pandemic, mainly due to its extremely high human-to-human transmission capability, and a surprisingly high viral charge of symptom-less people. While the seek for a vaccine is still ongoing, promising results have been obtained with antiviral compounds. In particular, lactoferrin is regarded to have beneficial effects both in preventing and soothing the infection. Here, we explore the possible molecular mechanisms with which lactoferrin interferes with SARS-CoV-2 cell invasion, preventing attachment and/or entry of the virus. To this aim, we search for possible interactions lactoferrin may have with virus structural proteins and host receptors. Representing the molecular iso-electron surface of proteins in terms of 2D-Zernike descriptors, we 1) identified putative regions on the lactoferrin surface able to bind sialic acid present on the host cell membrane, sheltering the cell from the virus attachment; 2) showed that no significant shape complementarity is present between lactoferrin and the ACE2 receptor, while 3) two high complementarity regions are found on the N- and C-terminal domains of the SARS-CoV-2 spike protein, hinting at a possible competition between lactoferrin and ACE2 for the binding to the spike protein.


I. INTRODUCTION
Lactoferrin (Lf) is a versatile glycoprotein, which plays a key role in many biological functions [1].In this work, we focus on the Lf as a crucial player in natural immunity, since it has been proposed to play a strong antiviral activity against a wide range of RNA and DNA viruses [2][3][4][5][6].Lf is composed of a single chain of about 700 residues folded into two symmetrical lobes.Each lobe possesses a metal-binding site, able to bind iron but also other ions like Cu 2+ , Zn 2+ and M n 3+ [7,8].
This protein is present in saliva, tears, seminal fluid, white blood cells, and milk of mammals [9].From its discovery in 1939 [10,11] lactoferrin has been identified as the most important iron-binding protein in milk.Besides, in recent years, lactoferrin has been found involved in a multitude of biological processes.In fact, despite the name, the iron cargo capacity of Lf is not the prominent activity exerted by this molecule.Instead, it performs antioxidant, anti-inflammatory, and anticancer activities [8,12], together with a broad antimicrobial action against bacteria and fungi.The latter activity, in particular, is due to Lf's ability to reversibly bind two atoms of iron with high affinity in the presence of bicarbonate.The iron-free form of Lf, apo-lactoferrin (apoLf), deprives bacteria of iron, thus inhibiting their metabolic activities in vivo.
Besides all the aforementioned activities, Lf has been demonstrated to prevent infection of a wide range of diverse viral species [13].
Many viruses make use of glycans, such as sialic acid (SIA) or glycosaminoglycan, like heparan sulfate (HF) as attachment factors.See, for example, [14] for details on glycans.When the contact between the virus particle and these receptors is established, they roll toward their spe-cific viral receptor and subsequently enter the host cell, for instance by fusing with the host cell membrane [15].
While the interaction between Lf and HF has been observed [16], studies on its interaction with sialic acid derivatives are still missing.On the other hand, Lf has been reported to interact with virus structural proteins, S, M, and E [17].
In general, depending on the specifics of the virus, lactoferrin prevents infection of the target cell by either (i) interfering with the attachment factor or (ii) by binding to host cell molecules that the virus uses as a receptor or co-receptor (competition) or (iii) by direct binding to virus particles, as described for herpesvirus [18], polioand rotavirus [19,20], and possibly human immunodeficiency virus [21].See, for example, [22] for a more detailed discussion.
While we are writing this article, a novel virus, first observed in the autumn of 2019, has rapidly become pandemic.This virus, called SARS-CoV-2, belongs to the coronavirus family and causes severe acute respiratory syndrome [23,24], somewhat similar to those caused by two other coronaviruses, SARS-CoV and MERS-CoV, which crossed species in 2002-2004 [25,26] and 2012 [27].In fact, SARS-CoV-2, similarly to SARS-CoV and MERS-CoV, attacks the lower respiratory system, thus provoking viral pneumonia.However, this infection can also lead to effects on the gastrointestinal system, heart, kidney, liver, and central nervous system [24,28,29].
As for SARS-CoV [30][31][32], recent in vivo experiments confirmed that also SARS-CoV-2 cell entry is mediated by high-affinity interactions between the receptorbinding domain (RBD) of the virus S glycoprotein and the human-host Angiotensin-converting enzyme 2 (ACE2) receptor [33].The spike protein is located on the virus envelope and promotes the host attachment and fusion between the viral and cellular membrane.[34,35].Structural studies determined the structures of such protein both in free form and bound to ACE2 [36].Further studies investigate the possible interaction of SARS-CoV-2 to sialic acids [37][38][39][40] or heparan surfate receptors [39], both considered involved in SARS-CoV-2 as well as in other coronavirus infections [14,16,[41][42][43].
While the use of Lf as an antiviral against previous coronaviruses infections has been poorly investigated, except for [16], where evidence of an effect in the attachment process is shown, new studies are proving the antiviral effect of Lactoferrin against the novel SARS-CoV-2 infection.In particular, administration of a liposomal formulation of Lf to a significant sample of Covid-19 positive patients, has been shown to provide an immediate beneficial effect [44].
Here, we computationally investigate the possible molecular mechanisms behind the observed antiviral action of lactoferrin.In particular, we make use of a recently developed computational protocol based on the 2D Zernike Polynomials, able to rapidly characterize the shape conformation of given protein regions [37].In this framework using a simple pairwise distance, it is possible to evaluate the similarity between 2 protein pockets or the shape complementarity between the binding regions of 2 interacting proteins.
We moreover checked for a possible direct interaction between LF and ACE2 receptor, which could inhibit the interplay between spike and ACE2 binding necessary to the virus infection.
Finally, we investigate the possible interaction between Lf and the 3 proteins present on the SARS-CoV-2 membrane, i.e. the spike (S), membrane (M), and envelope (E) proteins.

II. RESULTS
The entry of the virus inside the host cells requires the occurrence of a sequence of molecular interactions.Sialoside (SIA) and/or heparan sulfate (HF) chains mediate the attachment of the virion to the cell surface.Once in the proximity of the cellular receptor (ACE2), SARS-CoV-2 spike protein binds to the receptor and initiates the internalization process.Figure 1a shows a sketch of the mechanism.The observed antiviral action of Lf may consist of interference in one or more of those steps.We thus investigate in the next three sections the possibility of direct binding between lactoferrin and SIA, ACE2, and SARS-CoV-2 Spike protein, respectively (see Figure 1b).

A. Interaction with SIA
Possible binding regions for sialic acid, the terminal molecule of the SIA receptors on human lactoferrin are investigated on the basis of the procedure described in [37], i.e. we select the portion of the molecular surface of the MERS-CoV spike protein in interaction with sialic acid, experimentally solved in [45] (Figure 2a), and we search for similar patches on the Lf molecular surface.
Within the same strategy, we search for similar patches on the surface of human lactoferrin.
In the 2D Zernike framework, the geometrical shape of a protein surface patch is compactly summarized in a set of ordered numerical descriptors, whose number -121 in our case -modulate the detail of the description.Dealing with ordered numerical descriptors, the comparison between different protein patch can be performed with a Euclidean distance.
Figure 2b and c show the four most geometrically similar patches identified in the Apo and Holo form of Lf.An a posteriori check of the electrostatic potential on the patches, allows us to select only some of the possible solutions identified based on the shape comparison analysis, i.e. the ones having also a similar electrostatic surface with the SIA binding site on the MERS-CoV spike.
The region on Lf surface identified as the most similar to the MERS region interacting with sialic acid, both in shape and in electrostatics, is the one centered on VAL 346.

B. Interaction with ACE2
To check whether the action of lactoferrin can be ascribed to a competition with the virion spike proteins in binding directly the ACE2 receptor (Figure 1), we performed a blind search of the molecular surfaces of both ACE2 and human Lf to identify possible binding regions having a meaningful shape complementarity.Under this hypothesis, if the interaction between the Lf and the ACE2 receptor occurs, Lf could hinder the molecular binding between the spike protein of SARS-CoV-2 and the corresponding ACE2 receptor.Figure 3 shows the molecular surface of the ACE2 receptor colored according to its propensity to bind regions of the Lf protein.
The redder the region, the greater the shape complementarity between that region and another one found on the surface of the putative molecular partner, i.e. holo lactoferrin.As one can see from Figure 3, a complementary region is indeed present, however, it is located far from the binding site of the spike (grey in the figure) and in a part of ACE2 that looks toward the membrane.To better visualize the result, we have represented two points of view of the binding between spike and ACE2, one rotated 180 o with respect to the other.On the other hand, we can see that the ACE2 region interacting with the SARS-CoV-2 spike protein has no low shape complementarity with Lf regions.

C. Interaction with virion membrane proteins
A third possible mechanism at the basis of the observed antiviral activity of Lf could be ascribed to a direct interaction with the membrane proteins present on the virion envelope.In particular, SARS-CoV-2 presents three different kinds of proteins on its membrane, i.e. S, M, and E proteins [46,47].While the 3D structure of the S protein has been determined -even if some loop regions in the S1 sub-unit are not solved -unfortunately, no structures are available for the E and M proteins.Thus, the molecular surfaces of Lf were compared with those of the three proteins in order to check whether an interaction with lactoferrin is possible.For this analysis, we adopted the same computational procedure used in the previous paragraph.
For both S, M, and E proteins, we sample their whole molecular surface and compared all the possible patches with those of Lf.In this way, all molecular surfaces, both membrane proteins, and Lf ones are colored according to the corresponding binding propensity.
E and M presented a possible region of interactions located in the intra-membrane region (data not shown).The most robust and relevant result of this analysis regards the compatibility between the spike and lactoferrin.According to our findings, Lf presents two regions of high complementarity with one portion of the C-terminal domain of the spike S1 subunit and another located in the N-terminal part.
To test the reliability of the found signals, we performed a molecular dynamics simulation of the spike trimer (see Methods for details) and sampled 5 configurations for each of the three chains at equilibrium (see

Figure 4a).
In particular, for each chain, we performed a Principal Component Analysis (PCA) on the frames of the dynamics and thus projected them on the plane identified by the two principal components.Upon clustering these points we obtain 5 subgroups.For each subgroup, we extract the centroidal configuration.Since distant points in the PCA plane correspond to the different 3D structure, picking one point from each identified cluster assured that the selected configurations have high structural differences between them in the explored configuration space (see Figure 4b).
Remarkably, repeating the blind search for complementarity regions on the 15 surfaces of the extracted spike monomers, we found conservation of the signal over the conformational noise.Figure 4c) shows the identified regions and their conservation in the different sampled frames.
Among the found regions, the one involving the spike C-terminal domain is the one with higher shape complementarity.In particular, according to our method, human lactoferrin could interact with the spike protein with the surface region centered in residue ALA 539.Alternatively, the spike protein may interact with the Lf pro-tein using a molecular patch centered in the residue PHE 490.It must be pointed out that the complementarity achieved by these patches is comparable with those of experimentally solved complexes.Indeed, analyzing the shape complementarity of over 4600 x-ray protein-protein complexes (see Methods for details), we have the distribution shown in Figure 4d, where the lower the distance the higher the complementarity of the binding region.Red dotted line shows the complementarity found between the spike and Lt.As it is evident the proposed patch is characterized by values of shape complementarity typical of experimental complexes.
Finally, to further support this result with an independent and external methodology to our approach, we performed a completely blind molecular docking analysis between the spike protein and the Lf protein.To this end, Zdock server was used as a state of the art of molecular docking software [48].As per default settings, only the first 10 docking poses have been selected and the predicted contacts analyzed.In particular, five out of ten poses show bindings involving the spike region involve the region we found with our protocol when the residues are defined in contact if their C-alpha atoms have a distance less than 8 Å.
Very recently, we developed a new representation, based on the 2D Zernike polynomials, which allows an extremely efficient, fast, and completely unsupervised description of the local geometrical shape, allowing for easy comparison between different regions of molecules.Through this compact description, it is possible both to analyze the similarity between 2 different regions -suggesting, for example, a similar ligand for binding regions -and to study the complementarity between interacting surfaces [37].
Here, we used our novel method to shed some light on the molecular mechanisms ruling the observed antiviral action human lactoferrin exerts against SARS-CoV-2 infection [44].
In particular, we focused on the early stages of the infection, i.e. the attachment and entry of the virus to the host cell, when lactoferrin can interfere with the virushost interaction without the need to be internalized in the cell.We thus tried to establish whether lactoferrin could compete with the virus in binding to sialic acid, the sticky end of sialoside chains, which has been suggested to mediate the attachment of SARS-CoV-2 to the host cell.Interestingly, comparing the binding region of sialic acid in the MERS-CoV coronavirus with patches on the lactoferrin surface, we found possible spots on both apo and holo forms of Lf, which could compete in forming low affinity but high avidity interactions.
We then proceeded to test the hypothesis of an inter-action between lactoferrin and the primary SARS-CoV-2 protein receptor, ACE2.A blind search for complementarity regions highlighted a hot-spot in a region that in physiological conditions is oriented toward the membrane, while no significative complementarity is present in the ACE2 region involved in the interaction with the virus spike protein.At last, we analyzed the three membrane proteins on the virus envelope, i.e. the E, M, and S ones.Similarly to ACE-2, both E and M presented possible interacting regions in portions of the surface, that are buried in the virion membrane under normal conditions (data not shown).On the other hand, the spike protein showed two main hot spots, one in the N-terminal domain of the S1 subunit and another in the C-terminal one.Those two regions are robust to molecular noise, as the signal endures using different configurations sampled from a molecular dynamics simulation and each of the three chains of the trimer.Notably, the most complementary region is the one in the C-terminal region, the one involved in the spike-ACE2 interaction.Thus our finding suggests a possible competition between ACE2 and lactoferrin for the binding of the SARS-CoV-2 spike, which may explain the observed antiviral action.

A. Datasets
The protein, whose structures are analyzed in this paper are: • Human lactoferrin, in the apo (PDB id: 1CB6) and holo (PDB id: 1LFG) forms.
To set a reference for the measured complementarities, a dataset of protein-protein complexes experimentally solved in x-ray crystallography is taken from [55].We only selected pair interactions regarding chains with more than 50 residues.The Protein-Protein dataset is therefore composed of 4605 complexes.For each complex, the binding region is identified as the portions of the two protein molecular surfaces distant less than 3 Å.

B. Computation of molecular surfaces
For each protein of the dataset (x-ray structure in PDB format [56]), we use DMS [57] to compute the solventaccessible surface, using a density of 5 points per Å2 and

C. Patch definition and complementarity evaluation
A molecular surface is represented by a set of points in the three-dimensional space.We define a surface patch, as the group of points that fall within a sphere of radius R s = 6 Å, centered on one point of the surface.Once the patch is selected, • we fit a plane that passes through the points and reorient the patch in such a way to have the z-axis perpendicular to the plane and going through the center of the plane.
• we define the angle θ as the largest angle between the perpendicular axis and a secant connecting a given point C on the z-axis to any point of the patch.C is then set in order that θ = 45 • .r is the distance between C and a surface point.
• build a square grid and associate each pixel with the mean r of the points inside it.This 2D function can be expanded on the basis of the Zernike polynomials (see next section).
Once a patch is represented in term of its Zernike descriptors, the similarity between that patch and another one can be simply measured as the Euclidean distance between the invariant vectors.The relative orientation of the patches before the projection in the unitary circle must be considered.In fact, if we search for similar regions we must compare patches that have the same orientation once projected in the 2D plane, i.e. the solventexposed part of the surface must be oriented in the same direction for both patches, for example as the positive z-axis.If instead, we want to assess the complementarity between two patches, we must orient the patches contrariwise, i.e. one patch with the solvent-exposed part toward the positive z-axis ('up') and the other toward the negative z-axis ('down').

D. 2D Zernike polynomials and invariants
Each function of two variables, f (r, φ) (polar coordinates) defined inside the region r < 1 (unitary circle), can be decomposed in the Zernike basis as with being the expansion coefficients, while the complex functions, Z nm (r, φ) are the Zernike polynomials.Each polynomial is composed by a radial and an angular part, where the radial part for any n and m, is given by Since for each couple of polynomials, the following relation holds the complete set of polynomials forms a basis and knowing the set of complex coefficients, {c nm } allows for a univocal reconstruction of the original image (with a resolution that depends on the order of expansion, N = max(n)).Since the modulus of each coefficient (z nm = |c nm |) does not depend on the phase, i.e. it is invariant for rotations around the origin of the unitary circle, the shape similarity between two patches can be assessed by comparing the Zernike invariants of their associated 2D projections.In particular, we measured the similarity between patch i and j as the Euclidean distance between the invariant vectors, i.e.

E. Molecular dynamics simulations
The starting structure of the SARS-CoV-2 spike trimeric complex was taken from the model structure proposed by the I-Tasser server [54].All steps of the simulation were performed using Gromacs 2019.3 [58].Topologies of the system were built using the CHARMM-27 force field [59].The protein was placed in a dodecahedric simulative box, with periodic boundary conditions, filled with 131793 TIP3P water molecules [60].We checked that each atom of the trimer was at least at a distance of 1.1 nm from the box borders.The addition of 3 sodium counterions rendered the systems electroneutral.The final system, consisting of 448572 atoms, was first minimized with 2064 steps of steepest descent.Relaxation of water molecules and thermalization of the system in NVT and NPT environments were run each for 0.1 ns at 2 fs time-step.The temperature was kept constant at 300 K with v-rescale algorithm [61]; the final pressure was fixed at 1 bar with the Parrinello-Rahman algorithm [62] which guarantees a water density of 1004 kg/m 3 , close to the experimental value.LINCS algorithm [63] was used to constraint h-bonds.
Finally, the systems were simulated with a 2 fs timestep for 140 ns in periodic boundary conditions, using a cut-off of 12 Å for the evaluation of short-range non-bonded interactions and the Particle Mesh Ewald method [64] for the long-range electrostatic interactions.

FIG. 1 :
FIG.1: SARS-CoV-2 attachment and entry to host cell in physiological condition and possible actions of lactoferrin.a) Sketch of SARS-CoV-2 initial interactions with the host cell.Sialoside (SIA) and heparan sulfate (HS) glycan chains present on glycoproteins (PG) of the cell membrane are thought to facilitate the attachment of the virion to the cell surface.This favors the establishment of an interaction between the virus spike protein and the ACE2 receptor, which starts the internalization of the virus in the host cell.b) Human lactoferrin has been found to play an antiviral action against SARS-CoV-2 infection although it is not clear whether this action consists in (1) competition for the binding with glycan chains, and/or (2) competition for binding ACE2 receptor and/or (3) direct interaction with one of the proteins in the virion envelope, i.e. with S, M or E proteins.

FIG. 2 :
FIG. 2: Putative sialic-binding regions on human lactoferrin.a) Cartoon representation of the sialic-acid binding region of MERS-CoV (PDB id: 6Q04) and its representation in the Zernike disk (left) and local Coulombic surface (right).b) Four most complementar regions on holo human lactoferrin (PDB id: 1LFG) obtained comparing the lactoferrin molecular surface with the sialic acid binding region of MERS-CoV.c) Same as in b) but for the apo human lactoferrin (PDB id: 1CB6).Molecular surfaces are colored according to the Coulombic potential.

FIG. 3 :
FIG.3: Analysis of the binding between lactoferrin and the ACE2 receptor.Molecular representation of the experimentally solved complex of the ACE2 receptor with SARS-CoV-2 spike protein.The ACE2 molecular surface is colored according to its binding propensity to bind lactoferrin regions.Dark red indicates high binding propensity while white means no interaction.

FIG. 4 :
FIG. 4: Possible interaction between SARS-CoV-2 spike protein and human lactoferrin.a) Root mean square displacement as a function of time of the SARS-CoV-2 spike trimer as provided by the molecular dynamics simulation.b)Clustering analysis of the trimer's A chain in the plane of the two principal components of a PCA analysis over the MD configurations.Five regions of major variability of the chain are identified.c) Binding propensity computed from the Zernike descriptors between the 15 most variable conformations of the chains of the SARS-CoV-2 Spike protein and human lactoferrin.Each residue of the proteins is colored from white to red according to its increasing shape complementarity with the partner.d) Comparison between the complementarity score (red dashed line) of the best binding site (Zernike disks) and the distribution of complementarity scores belonging to 4600 binding regions of experimental complexes.