A Computer-Aided Approach for the Discovery of D-Peptides as Inhibitors of SARS-CoV-2 Main Protease

The SARS-CoV-2 main protease, also known as 3-chymotrypsin-like protease (3CLpro), is a cysteine protease responsible for the cleavage of viral polyproteins pp1a and pp1ab, at least, at eleven conserved sites, which leads to the formation of mature nonstructural proteins essential for the replication of the virus. Due to its essential role, numerous studies have been conducted so far, which have confirmed 3CLpro as an attractive drug target to combat Covid-19 and have reported a vast number of inhibitors and their co-crystal structures. Despite all the ongoing efforts, D-peptides, which possess key advantages over L-peptides as therapeutic agents, have not been explored as potential drug candidates against 3CLpro. The current work fills this gap by reporting an in silico approach for the discovery of D-peptides capable of inhibiting 3CLpro that involves structure-based virtual screening (SBVS) of an in-house library of D-tripeptides and D-tetrapeptides into the protease active site and subsequent rescoring steps, including Molecular Mechanics Generalized-Born Surface Area (MM-GBSA) free energy calculations and molecular dynamics (MD) simulations. In vitro enzymatic assays conducted for the four top-scoring D-tetrapeptides at 20 μM showed that all of them caused 55–85% inhibition of 3CLpro activity, thus highlighting the suitability of the devised approach. Overall, our results present a promising computational strategy to identify D-peptides capable of inhibiting 3CLpro, with broader application in problems involving protein inhibition.


Supplementary
. Cuboid docking box built with Autodock/Vina plugin of Pymol (DeLano, 2002;Seeliger and de Groot, 2010) and used for SBVS seen from A) a top view and B) a side view. The box dimensions in X, Y and Z axes are 19.5, 18.0 and 22.0 Å, respectively. The 3CL pro active site inside the docking box is shown in surface representation. Key residues of the active site and subsites are labeled, the latter being highlighted in bold. The crystal structure of 3CL pro PDB 6Y2E was employed for SBVS.  Figure  1 and shown in Table 1 are included here. D-peptides are termed according to the nomenclature used in Table 1.

Supplementary
Supplementary Figure S2. ΔGeff time profiles along the 110 ns trajectories of 3CL pro in complex with the D-peptides selected as potential inhibitors. Values collected after 40 ns (see dashed lines) were used to calculate the reported ΔGeff mean values for the complexes. D-peptides are labeled as in Table 1. Figure S3. Peptide RMSD time profiles along the 110 ns trajectories of 3CL pro in complex with the D-peptides selected as potential inhibitors. RMSD values were calculated for the Dpeptide heavy atoms after fitting all trajectory frames to the ChT-like domains backbone atoms of 3CL pro in the corresponding starting structures (t=0). Values collected after 90 ns (see dashed lines) were used to calculate the reported peptide RMSD mean values for the complexes. D-peptides are labeled as in Table  1.

Supplementary
Supplementary Figure S4. Number of intermolecular H-bonds between the indicated D-peptides and 3CL pro residues during the 110 ns MD simulations. The geometric criteria to determine the occurrence of an H-bond was a donor-acceptor distance ≤ 3.5 Å and a donor-H-acceptor angle ≥120 ○ . 〈 〉 represents the average number of intermolecular H-bonds formed along each trajectory. Figure S5. Structural representation of 3CL pro in complex with D-peptides proposed as potential inhibitors and with ΔGeff values >-40 kcal/mol. All D-peptides are shown as yellow sticks and their residues are labeled in bold and in the three-letter code. 3CL pro residues forming H-bonds with the peptides plus the catalytic residues H41 and C145 are labeled and represented as cyan sticks. The 3CL pro active site cavity is depicted as a transparent gray surface. H-bonds between the D-peptides and the 3CL pro residues with occupancies >25% during the respective 110 ns MD simulations are displayed as orange dashed lines. Subsites S4 to S2' are labeled in bold and italic. Figure S6. Instantaneous RMSD values and RMSF profiles for the replicate 1 μs MD trajectories of 3CL pro in complex with the experimentally-tested D-peptides. For every complex, from left to right, the RMSD time profiles with respect to all backbone atoms of 3CL pro , the RMSD time profiles with respect to the backbone atoms of 3CL pro ChT-like domains (residues 8 to 183), and the RMF profiles with respect to all backbone atoms in the 3CL pro /D-peptide complexes are represented. The three regions labeled 1, 2 and 3 in the RMSF graphs correspond to the ChT-like domains, domain III and the Dtetrapeptides, respectively. Profiles corresponding to different replicate MD simulations are colored differently (see legend on the right). RMF values were calculated after fitting the trajectories with respect to the backbone atoms of the ChT-like domains. In all cases, the respective starting structures (t=0) were set as reference for fitting. Figure S7. Number of intermolecular H-bonds between the experimentally-tested Dpeptides and 3CL pro residues during replicate 1 μs MD simulations. The geometric criteria to determine the occurrence of an H-bond was a donor-acceptor distance ≤ 3.5 Å and a donor-H-acceptor angle ≥120 ○ . 〈 〉 represents the average number of intermolecular H-bonds formed along each trajectory. Time profiles corresponding to different replicate MD simulations are colored differently (see legend on the right).

Supplementary Figure S8.
Peptide RMSD values along the replicate 1 μs MD simulations of 3CL pro in complex with the experimentally-tested D-tetrapeptides. RMSD values were calculated for the peptide heavy atoms after fitting all trajectory frames with respect to the backbone atoms of 3CL pro residues 8 to 183 (chymotrypsin-like domains) in the corresponding starting structures. Time profiles corresponding to different replicate MD simulations are colored differently (see legend on the right). Figure S9. Main central structures of 3CL pro in complex with the experimentally-tested D-peptides obtained from the replicate 1 μs MD simulations. Symbols in parentheses indicate which minimum of the PC1 vs PC2 FELs ( Figure 6) corresponds to each represented central structure. Different central structures of the same complex are labeled by adding a hyphen and a numerical identifier (0, 1, 2, 3) to the peptide abbreviation. The lower the numerical identifier the higher the size of the cluster to which the central structure belongs. All D-peptides are shown as yellow sticks and their residues are labeled in bold and in the three-letter code. 3CL pro residues forming H-bonds with the peptides plus the catalytic residues H41 and C145 are labeled and represented as cyan sticks. The 3CL pro active site cavity is depicted as a transparent gray surface. H-bonds between the D-peptides and the 3CL pro residues are displayed as orange dashed lines. Subsites S4 to S2' are labeled in bold and italic. Figure S10. Eigenvalues of the 12 eigenvectors corresponding to the concatenated 1 μs replicate MD simulations of 3CL pro in complex with the experimentally-tested D-peptides. PCA was conducted for the D-peptide Cα atoms. The cumulative fluctuations of the first four eigenvectors are shown beside their corresponding points in each graph.

Supplementary Text S1. Thermodynamic integration free energy calculations
Thermodynamic integration (TI) binding free energy (∆ ) calculations were conducted for the main conformations sampled by 4P4 bound to 3CL pro (Supplementary Figure S9) to determine the lowestenergy binding mode. A protocol published by Aldeghi et al. is particularly suitable to identify lowestenergy binding modes of ligands to their receptors (Aldeghi et al., 2016). That protocol was previously modified by us to conduct the MD simulations with pmemd.cuda of Amber 20 (Case et al., 2020), and estimate ∆ values using TI (Hernandez Gonzalez et al., 2021). Briefly, 4P4 was alchemically turned into non-interacting particles, i.e., dummy atoms, both free in solution and complexed with 3CL pro in different starting conformations (Supplementary Figure S9) through a two-stage process. First, its atoms were progressively discharged until they all had zero partial charges. Then, 4P4 van der Waals parameters were turned off thus leading to a dummy ligand. The coupling parameter λ was varied from 0 to 1, with a Δλ=0.05 for both the discharging and van der Waals alchemical transformations. Each λ window was subjected EM, a 1 ns NVT heating and a 2 ns NPT equilibration to reach a temperature of 298.15 K and a pressure of 1 bar. Finally, 30 ns NPT production runs were conducted and the collected snapshots were used for energy calculations. To reduce the computational demand of these simulations, only the ChT-like domains of 3CL pro bound to the different 4P4 conformations were chosen as starting structures. Moreover, hydrogen-mass repartitioning (Hopkins et al., 2015) was employed in order to increase the timestep from 2 to 4 fs during the NPT production runs. Pressure control was achieved by means of the Monte Carlo barostat implemented in Amber 20 (Åqvist et al., 2004;Case et al., 2020). Other details of these MD simulations not mentioned here are identical to those described for conventional MD simulations in Materials and Methods. Free energies were finally calculated using the alchemical-analysis.py program, which performs autocorrelation analysis to determine the associated SEMs (Klimovich et al., 2015).
Four unrestrained 30 ns MD simulations of each 3CL pro /4P4 conformation were performed to calculate ∆ through Eq. S1. The results of the four replicates corresponding to each starting conformation were averaged and SEMs were reported. Figure S11. Harmonic restraints applied to attach 4P4 to 3CL pro active site during TI free energy calculations. Cα atoms chosen to apply the restraints are shown as spheres and labeled (a1 to a3 and b1 to b3). Distance, angular and dihedral restraints are represented as orange, red and green lines, respectively, linking the involved atoms. For brevity's sake, restraints are depicted for only one of the analyzed 3CL pro /4P4 conformations. ) and free in solution (∆ ℎ ): ∆ ℎ = ∆ ℎ − ∆ ℎ + . c Free energy associated with the disappearance of fully decharged 4P4 bound to 3CL pro in the presence of a set of harmonic restraints (∆ + ) and free in solution (∆ ): ∆ = ∆ − ∆ + . d Free energy variation upon the addition of a set of harmonic restraints attaching non-interacting (=dummy) 4P4 to the binding site, calculated through an analytical formula proposed by Boresch et al. (Boresch et al., 2003), that also contains the correction to the molar standard state (volume equal to 1661 Å 3 ). e Free energy variation upon removal of the same set of restraints when the fully interacting D-peptide is bound to the protein, calculated using Eq. S1. f Calculated standard binding free energies for each 3CL pro /4P4 central structure, obtained from the sum of the previous components.