Structure-Based Virtual Screening to Identify Novel Potential Compound as an Alternative to Remdesivir to Overcome the RdRp Protein Mutations in SARS-CoV-2

The number of confirmed COVID-19 cases is rapidly increasing with no direct treatment for the disease. Few repurposed drugs, such as Remdesivir, Chloroquine, Hydroxychloroquine, Lopinavir, and Ritonavir, are being tested against SARS-CoV-2. Remdesivir is the drug of choice for Ebola virus disease and has been authorized for emergency use. This drug acts against SARS-CoV-2 by inhibiting the RNA-dependent-RNA-polymerase (RdRp) of SARS-CoV-2. RdRp of viruses is prone to mutations that confer drug resistance. A recent study by Pachetti et al. in 2020 identified the P323L mutation in the RdRp protein of SARS-CoV-2. In this study, we aimed to determine the potency of lead compounds similar to Remdesivir, which can be used as an alternative when variants of SARS-CoV-2 develop resistance due to RdRp mutations. The initial screening yielded 704 compounds that were 90% similar to the control drug, Remdesivir. On further evaluation through drugability and antiviral inhibition percentage analyses, we shortlisted 32 and seven compounds, respectively. These seven compounds were further analyzed for their molecular interactions, which revealed that all seven compounds interacted with RdRp with higher affinity than Remdesivir under native conditions. However, three compounds failed to interact with the mutant protein with higher affinity than Remdesivir. Dynamic cross-correlation matrix (DCCM) and vector field collective motions analyses were performed to identify the precise movements of docked complexes' residues. Furthermore, the compound SCHEMBL20144212 showed a high affinity for native and mutant proteins and might provide an alternative against SARS-CoV-2 variants that might confer resistance to Remdesivir. Further validations by in vitro and in vivo studies are needed to confirm the efficacy of our lead compounds for their inhibition against SARS-CoV-2.


INTRODUCTION
COVID-19 is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) . In December 2019, China reported its first case of SARS-CoV-2 in Wuhan, Hubei province . The infection was highly contagious, which led to the global spread of the virus in the following months, thus causing an outbreak (Giovanetti et al., 2020;Phan et al., 2020). It was finally declared as a Public Health Emergency of International Concern (PHEIC) by the World Health Organization (WHO) on January 30, 2020, and ultimately named Coronavirus Disease 2019 (COVID-19) on February 12, 2020. COVID-19 is responsible for high mortality worldwide (WHO, 2020a,b). Currently, there are 172, and 61 vaccines are in pre-clinical and clinical development, respectively. While it takes years to develop a new effective drug against a disease, the COVID-19 outbreak has created a global emergency. Hence, drug repositioning and repurposing are given more attention as it is a rapid solution to identify a drug to combat the disease (Li and Clercq, 2020;Morse et al., 2020). Coronaviruses are enveloped viruses about 80-120 nm in diameter with plus-strand (+) RNA as genetic material (Brian and Baric, 2005). Studies have shown that SARS-CoV-2 shares ∼80% genetic similarity with SARS-CoV, while RNA-dependent RNA polymerase (RdRp) has a 96% similarity . RdRp and protease are encoded by the viral RNA and play an essential role in replicating and assembling the virus, and hence preexisting drugs that target these proteins are preferred. Antiviral agents that have been developed to combat coronaviruses are used here as potential drugs for the treatment of COVID-19.
A 20-kb gene encodes the replicase complex. It encompasses several viral genes encoding RdRp, RNA helicase, proteases, and some RNA processing enzymes and cellular proteins that aid in replicating and transcribing the genome of coronavirus occurring in the cytoplasm of the host cell (Koonin and Dolja, 1993;Cheng et al., 2005). RNA synthesis takes place in both a continuous and a discontinuous fashion (Gallagher, 1996). Coronaviruses are plusstrand (+) RNA viruses with a tremendously diverse genome, resulting in a variation in their RNA synthesis machinery (Goldbach, 1987). RdRps exhibit a very high transcription error rate leading to variation in the genome of the virus. During the replication process, RdRps adopts the switching mechanism resulting in the recombination of RNA. This process is also responsible for repairing deleterious mutations in the genome of the viruses, leading to gene rearrangements and new gene acquisition (Strauss and Strauss, 1988;Xu et al., 2003).
Once the virus penetrates the host cell by binding to the angiotensin-converting enzyme 2 (ACE2) receptor, the process of uncoating is initiated that releases the plus-strand (+) RNA into the cytoplasm. The ribosomes present in the host cell's cytoplasm proceed to translate the genomic RNA into a polyprotein. This polyprotein undergoes proteolytic cleavage by the protease enzyme that results in the replicase enzyme production and various viral structural proteins. Approximately 67% of the genomic RNA encodes RdRp that arbitrates the genome's synthesis . The plus-strand (+) RNA can only produce viral protein products and not genetic material by replication. Thus, to achieve genome replication, RdRp first transcribes and replicates the plus-strand (+) RNA, which acts as a template to generate the minus-strand (-) RNA, and then serves as a template for the RdRPs to produce several plus-strand (+) RNAs. Some of this plus-strand (+) RNA is again translated into proteins.
In contrast, the others are enclosed into the capsid to regenerate complete virions released from the cell by exocytosis. RdRP is a popular target for a selective antiviral strategy against coronaviruses because RNA synthesis by RdRP does not occur in mammalian cells (Casais et al., 2001). Figure 1 illustrates the role of RdRp in viral replication and effect of RdRp inhibitors (Figure 1).
Current potential treatments against COVID-19 that have shown promising results are Lopinavir, a known protease inhibitor of coronavirus (Yao T.-T. et al., 2020), the guanosine analog, ribavirin that targets RdRp and was designed to combat the Ebola outbreak (Falzarano et al., 2013), and chloroquine, an antimalarial drug, that has shown antiviral effects by blocking viral fusion with the cell due to increased endosomal pH (Vincent et al., 2005). A derivative of chloroquine, hydroxychloroquine, is more potent than chloroquine, as reported by several studies (Yao X. et al., 2020). Several corticosteroids have also been studied for their coronavirus effects (Russell et al., 2020). Convalescent plasma transfusion is also currently being administered and has been shown to reduce mortality rate (Mair-Jenkins et al., 2015). Among the treatments, Remdesivir has shown promising results in in vitro and animal studies and is currently in phase III trials (Martinez, 2020).
Remdesivir, also known as GS-5734, is a nucleotide prodrug that is metabolized inside the cell into GS-441524 and an adenosine nucleotide analog, phosphorylated into cell-permeable di-and tri-phosphates (Varga et al., 2016;Warren et al., 2016) Viral RdRps can utilize these adenosine triphosphates during genome replication (Sheahan et al., 2017). Remdesivir causes the nascent RNA transcript to terminate prematurely and incorporates mutations into it (Agostini et al., 2018). Studies have shown that remdesivir causes RNA levels to decrease in a dose-dependent manner, and it is 4.5-fold more potent in cells lacking the ExoN proofreading activity. It has also been observed that GS-5734 is more active (∼30 times) than its metabolized form, GS-441524, against the coronaviruses tested (Sheahan et al., 2017). It has been shown that in the presence of GS-441524, there was 5.6-fold resistance in coronavirus mouse hepatitis virus, which was attributed to two mutations, F476L and V553L, in the coding region of nsp12 core polymerase (Agostini et al., 2018). Similar findings were observed in SARS-CoV carrying the mutations F480L and V557L, which are not in the vicinity of the binding site of RdRp; thus, the exact mechanism behind the resistance remains undetermined (Agostini et al., 2018). This raises the concern that resistance may cause the virus to be active and functional, leading to severe disease transmission. Because of its decreased proofreading activity, RdRps have an error rate in the range of 10 −4 −10 −6 errors/nucleotide/replication, which is very high compared to that of DNA polymerases (Eckerle et al., 2010). This high error rate is responsible for the rapid rate of virus evolution, contributing to its adaptability to the new surroundings and expediting its ability to jump between species (Pillay and Zambon, 1998). A recent study by Pachetti et al. in 2020 identified the P323L mutation in the RdRp protein of SARS-CoV-2, which may confer resistance to the drug. In this study, we utilized a virtual screening approach to identify the potential repurposed lead compound that might be efficient in treating COVID-19 due to SARS-CoV-2 variants resistant to Remdesivir.

Remdesivir and COVID19-RdRp Structure Analysis
The 2D and 3D structures of Remdesvir in canonical SMILES and SDF formats were obtained from the PubChem database with CID 121304016. These formats were also converted to PDB using the OpenBabel software (O'Boyle et al., 2011). The 3D structure of the complexes between the non-structural proteins 12, 7, and 8 and Remdesivir was obtained from the Protein Data Bank with PDB ID−7BV2, and the missing amino acids were theoretically built using the Swiss Model Server (Biasini et al., 2014;Wanchao et al., 2020). Further, the P323L mutation reported by Pachetti et al. in 2020 was introduced in the refined protein structure using the SwissPDB Viewer, and energy was minimized using the same method (Guex and Peitsch, 1997;Pachetti et al., 2020).

Virtual Screening
The inbuilt module of PubChem was used to identify compounds with 90% similarity to the control Remdesivir structure. The identified compounds were screened for absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties using the SwissADME server (Daina et al., 2017). The 90% similar compounds that satisfy the Lipinski drugability properties were further used in an in silico antiviral inhibition percentage study using the AntiViral Compound Prediction (AVCpred) server (Qureshi et al., 2017). These filtered compounds were finally subjected to molecular interaction studies against the native and mutant RdRp proteins using the AutoDock standalone package (Morris et al., 2009).

Molecular Interaction Studies
Molecular interaction studies were initiated to understand the differences between the control remdesivir and the virtually similar compounds with RdRp proteins. Hydrogen and charges were added to the RdRp protein, and the torsions were set to Remdesivir and nearly identical compounds. The grid box was placed around the active site (LYS546, SER683, ARG556, THR688, ASP624, SER760, ASN692, ASP761, and ASP762) retrieved from the published crystal structure (Wanchao et al., 2020). The Lamarckian Genetic Algorithm was used to generate the binding pockets and binding affinity of RdRp for Remdesivir or the virtually similar compounds. One hundred different pockets were developed, and those with the best binding affinity were used in the structural visualization.

Dynamic Cross-Correlation Matrix (DCCM) and Collective Motion
The docked complexes were further subjected to the webserver DynOmics ENM to identify the residue cross-correlation matrix. The tool developed with the Elastic network models (ENM) that integrate Anisotropic Network Model (ANM) and GaussianNetwork Model (GNM) (Li et al., 2017). Timecorrelated data was represented as a matrix between the protein atoms i and j (cij) in DCCM. Within the form of a map, typical fluctuations, and standardized correlations among residues are typically shown and represented with the following equation: The range of Cij(n) varies in terms (−1, 1) and analyzes information on the cross-correlation between residue movements i and j (Rader et al., 2005). To determine how mutations affect the internal mechanics of protein  conformations, the Bio3D module incorporated with the R studio was used to quantify residue-residue dynamic crosscorrelation networks. The normal mode, network analysis, and correlation analysis were called with the function "nma(), " "can(), "and "dccm()." The role from these features is usually a cross-correlation matrix of residue-residue. The results were plotted by calling the functions "plot.dccm()" and "pymol.dccm()" (Grant et al., 2006;Scarabelli and Grant, 2013).

Virtual Screening
As an initiative of virtual screening, the compounds with 90% structural similarity to the control drug, Remdesivir, were screened using the inbuilt PubChem module. The screening identified 704 compounds possessing 90% similarity with Remdesivir. These 704 compounds were subjected to in silico ADME analysis using the SwissADME server (Supplementary Table 1). Based on Lipinski's drugability properties, 32 compounds were further filtered from the pool of 704 compounds. Finally, these 32 compounds were subjected to an antiviral inhibition percentage study calculated using the AVCpred server (

Molecular Interaction Studies
Molecular interaction studies were performed using AutoDock to understand the interaction pattern of Remdesivir and its similar compounds with the native and mutant RdRp proteins. It was observed that all seven compounds had a higher binding affinity than Remdesivir with native RdRp. For RdRp protein carrying the P323L mutation, we observed that the compounds 134502628, 70649275, 137648734, and 145074552 possessed higher binding affinity than Remdesivir. The binding affinity of Remdesivir for native and mutant RdRp was −4.5 and −4.2 kcal/mol, respectively. However, the compound with PubChem ID 134502628 showed the highest binding affinity of −7.5 and −7.4 kcal/mol for the native and mutant RdRp proteins, respectively (Tables 2A,B). The detailed interaction comparison between Remdesivir and top-ranked compound, 134502628 (SCHEMBL20144212), against the native and mutant RdRp is shown in

DCCM and Collective Motion
Firstly, we investigated the residue cross-correlation networks and vector field collective motions with the help of the Bio3D module that was integrated within R studio. All four complexes were subjected to the Bio3D module to retrieve the residue cross-correlation networks and collective motions. As a result, the residues from 5 to 120 (α1, α2, α3, β1, β2, and βhairpin) and 830-933 (α40-45, β22, and β23) were obtained with large cross-correlation networks from RdRp-Remdesivir and SCHEMBL20144212 complexes (Figures 3A,C), whereas the P323L-Remdesivir obtained less cross-correlation networks especially at the region from 5 to 120 residues ( Figure 3E). The residues from 830 to 933 exhibited similar networks for all the docked complexes. However, the differences can be seen with the P323L-Remdesivir and P323L-SCHEMBL20144212 complexes, especially at the region from 5 to 120 residues. The P323L-Remdesivir complex exhibited the least residue crosscorrelation; the possible reason for this could be mutation at 323rd position in RdRp. Our results exhibited similar network cross-correlation for P323L-SCHEMBL20144212 ( Figure 3G) when compared to the RdRp-SCHEMBL20144212 complex. The atomic movements of these docked complexes were investigated by representing the vector field collective motions. The collective motions were largely observed in the 5-120 and 830-933 regions, as seen in Figures 3B,D,F,H. Here, we observed the differences in the residues from 5 to 120 from the P323L-Remdesivir complex ( Figure 3F) compared to the other three docked complexes. The arrows on the docked complex of proteins determine the magnitude and direction of the collective motion. Furthermore, we investigated the DCCM of RdRp (with or without mutation) with remdesivir and SCHEMBL20144212 complexes by utilizing the DynOmics tool. DCCM explores the precise movements of residues of proteins, demonstrates strongly correlated (in red) atomic movements between residues, and highlights atomic movements that are strongly anticorrelated (in blue) (Figure 4). The RdRp-SCHEMBL20144212 and P323L-SCHEMBL20144212 complexes (Figures 4B,C) illustrate the strongly correlated motions and less anticorrelation as compared to that of RdRp-Remdesivir and P323L-Remdesivir complexes (Figures 4A,C).

DISCUSSION
COVID-19, caused by SARS-CoV-2, has become a global pandemic and is responsible for about 2,71,954 deaths worldwide, with more than 3.95 million positive cases. COVID-19 has been transmitted to almost 120 countries, and discovering drugs to fight this disease is a great challenge (Worldometer, 2021). The genome of the SARS-CoV-2 virus was immediately sequenced to improve our understanding of the virus, formulate appropriate diagnostic and preventive techniques, and develop therapeutic strategies . The high transmission rate of the virus is responsible for the difficulty of implementing efficient preventive measures. Hence, there is an immediate requirement for a drug to inhibit the virus and eliminate the disease . One of the emerging drugs with promising results is Remdesivir. It is a nucleotide analog, initially developed for the Ebola virus, that targets the virus's RdRp (Gordon et al., 2020). The virus's high mutation rate may introduce changes to RdRp, rendering the drug ineffective (Eckerle et al., 2010). A recent study by Pachetti et al. in 2020 identified the P323L mutation in the RdRp of SARS-CoV-2 (Pachetti et al., 2020). The possible increase in mutations highlights the need to develop a series of drugs whose efficiency is not affected by these mutations. Thus, there is an urgent need to develop novel drugs. The conventional method for drug discovery is expensive and timeconsuming, and therefore the in silico approach is a means to move forward. Our study formulated a computational pipeline to identify a series of compounds similar to Remdesivir, which could be used as an alternative if SARS-CoV-2 develops resistance to Remdesivir due to mutations. Out of 704 compounds, which were 90% similar to Remdesivir, 32 compounds were found to satisfy the rule of 5 (Supplementary Table 1). To assess drugability, these 32 compounds were subjected to in silico antiviral inhibition percentage analysis. We observed that the seven virtually similar compounds possessed a higher antiviral inhibition percentage when compared to Remdesivir. Finally, molecular interaction studies against the native and mutant RdRp were carried out with docking analysis. The interactions revealed that all the identified seven compounds could interact with higher affinity with the native protein than Remdesivir. However, upon introducing the P323L mutation, only four compounds were found to bind with a higher binding affinity than Remdesivir. The binding efficacy of Remdesivir was also found to decrease from −4.5 kcal/mol in the native to −4.2 kcal/mol in the mutant protein. Interestingly, the compound, SCHEMBL20144212, was topmost in the binding affinity for native and mutant proteins ( Table 2). Remdesivir was found to interact with the native RdRp through 14 amino acids and 6 hydrogen bonds. However, the identified novel lead compound, SCHEMBL20144212, was found to interact with native RdRp through 11 amino acids and 7 hydrogen bonds.
Similarly, in interaction with the mutant protein, Remdesivir was found to interact through four amino acids and two hydrogen bonds. However, the novel lead compound, SCHEMBL20144212, was found to interact through 14 amino acids and 6 hydrogen bonds (Table 3, Figures 2A-D). The role of hydrogen bonds and interacting amino acids is crucial for a drug to exhibit its efficacy. We observed higher binding affinity, hydrogen bond interactions, and amino acid interactions with the identified novel lead compound compared to Remdesivir. We have also depicted the detailed pharmacokinetic properties obtained from the pkCSM server in Table 4. These results suggest that SCHEMBL20144212 is more effective against the original or the mutated virus compared with Remdesivir.
It is vital to characterize the protein folding via identifying conformational variances, change in conformational motions owing to mutations, mechanism of ion channel's opening/closing, and protein-ligand binding, as these factors directly related to the protein's stability and function (Grottesi et al., 2005;Yang et al., 2008;Kim et al., 2010;S et al., 2020;Udhaya Kumar et al., 2020). The residue cross-correlation networks were obtained for all the docked complexes, and among them, the P323L-Remdesivir complex obtained less correlative network at the region from 5 to 120 residues ( Figure 3E). A possible reason for this could be a mutation that occurred at position 323; thus, complex resulted in fewer correlative networks. Also, in comparison to our identified lead compound (SCHEMBL20144212) with mutant RdRp (P323L) exhibited large correlative networks and could act as a sturdy inhibitor against mutant RdRp ( Figure 3G). The vector field collective motions were identified for all the docked complexes to explore each complex's dynamic motion and magnitude (Figures 3B,D,F,H). The RdRp and mutant RdRp (P323L) with SCHEMBL20144212 lead compound exhibited a rigid complex as seen in Figures 3D,H. This clearly states that the SCHEMBL20144212 lead compound inhibits the RdRp from binding to other molecules and for the catalysis process, whereas the remdesivir complexes exhibited more flexible regions (Figures 3B,F) (Huber, 1987;Karshikoff et al., 2015). Furthermore, our results from DCCM showed the RdRp-SCHEMBL20144212 and P323L-SCHEMBL20144212 complexes exhibited strongly correlated motions (Figures 4B,D). From the above observations, we strongly believe that the molecular docking of SCHEMBL20144212 lead compound reduces the effect of P323L mutation and could act as an effective inhibitor against SARS-nCoV-2's RdRp.

CONCLUSION
This study was initiated to identify the potential repurposed lead compound to treat COVID-19 in case of possible SARS-CoV-2-virus resistance to Remdesivir due to mutations. The drug inhibits the activity of RdRp protein to a small extent. Identifying the mutation in RdRp of SARS-CoV-2 highlights the possible appearance of mutated viruses that might be resistant to current and future treatments. Therefore, it is necessary to establish a platform to identify compounds similar to but more useful than Remdesivir. Our study identified a total of 704 compounds from the PubChem database, which are 90% similar to Remdesivir. Of these 704 compounds, 32 compounds were found to possess druggability properties, out of these, seven compounds were found to possess higher antiviral inhibition percentages when compared to Remdesivir. Out of these seven compounds analyzed for molecular interaction. The SCHEMBL20144212 compound was found to possess the highest interaction affinity for both the native and mutant RdRp protein.
From the DCCM and vector field collective motion analysis, we demonstrated that our lead compound's molecular docking (SCHEMBL20144212) tolerates the effect of the P323L mutation and could act as an effective inhibitor against SARS-nCoV-2's RdRp. Thus, this compound should be further validated through in vitro experiments as an alternative to Remdesivir to bypass the resistant effect of the P323L mutation. Our study offers a platform for drug repurposing during the time of viral pandemics.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author/s.

AUTHOR CONTRIBUTIONS
TK, UK, HZ, and GD contributed to designing the study and data acquisition, analysis, and interpretation. TK, NS, and UK involved in the acquisition, analysis, and interpreting of the results. TK, UK, and GD contributed to data interpretation, conducted, and drafting the manuscript. GD and HZ supervised the entire study and were involved in study design, acquiring, analyzing, understanding the data, and drafting the manuscript. The manuscript was reviewed and approved by all the authors.

ACKNOWLEDGMENTS
The authors would like to thank the Vellore Institute of Technology, India, and Qatar University, Qatar, for providing the necessary research facilities and encouragement to carry out this work.