ORIGINAL RESEARCH article

Front. Bioinform.

Sec. Drug Discovery in Bioinformatics

QSAR-Guided Discovery of Novel KRAS Inhibitors for Lung Cancer Therapy

Provisionally accepted
Osasan Stephen Adebayo1, Oche Ambrose George2,3*, Daramola Olusola4, Adefolalu Oluwafemi5, Hind A. Alzahrani6, Abdulkarim Hasan6
  • 1Prince Mishari Bin Saud Hospital, Baljurashi, Al-Baha, Al-Baha, Saudi Arabia
  • 2Department of Plant Biology, Faculty of Life Sciences, University of Ilorin, Ilorin, Nigeria
  • 3Centre for Malaria and Other Tropical Diseases Care, University of Ilorin Teaching Hospital, Ilorin, Ilorin, Nigeria
  • 4North Devon Council, Barnstaple, United Kingdom
  • 5Obafemi Awolowo University Teaching Hospital Complex, Ife, Nigeria
  • 6Albaha University, Al Aqiq, Saudi Arabia

The final, formatted version of the article will be published soon.

KRAS mutations are a known driver of lung cancer, yet pharmacological targeting remains a challenge due to the protein's elusive binding sites. In this study, a quantitative structure–activity relationship (QSAR) modeling strategy was employed to predict the pIC₅₀ values of KRAS inhibitors and to guide the discovery of novel lead compounds through de novo design. Chemopy was used to generate molecular descriptors for 62 KRAS inhibitors from the ChEMBL database (CHEMBL4354832). After dimensionality reduction and normalization, five machine learning techniques were used to develop the models: partial least squares (PLS), random forest (RF), stepwise multiple linear regression (MLR), genetic algorithm-optimized multiple linear regression (GA-MLR), and XGBoost. Performance was assessed using the coefficient of determination (R²), root mean square error (RMSE), and mean absolute error (MAE). Permutation-based and SHAP analyses were employed for feature interpretation. PLS outperformed the other models (R² = 0.851; RMSE = 0.292), and RF came second (R² = 0.796). The GA-MLR model, built on eight selected descriptors, was less accurate (R² = 0.677) but strongly interpretable. Virtual screening of 56 de novo–designed molecules using the GA-MLR model identified compound C9 (predicted pIC₅₀ = 8.11) as a promising hit within the model's applicability domain. This integrative QSAR framework identified novel KRAS inhibitor candidates with strong predicted potency.
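For readers less familiar with the quantities quoted in the abstract, the following is a minimal sketch of their conventional definitions, not the authors' code. It assumes the usual convention that pIC₅₀ is the negative base-10 logarithm of IC₅₀ in mol/L; the function names and the three toy observed/predicted values are hypothetical and do not come from the study.

```python
import math


def pic50_to_ic50_nm(pic50):
    """Convert pIC50 (assumed to be -log10 of IC50 in mol/L) to IC50 in nM."""
    return 10 ** (-pic50) * 1e9


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error of the predictions."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error of the predictions."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n


# Hypothetical toy pIC50 values, for illustration only (not study data).
observed = [6.0, 7.0, 8.0]
predicted = [6.5, 7.0, 7.5]

print("R2   =", r2_score(observed, predicted))
print("RMSE =", round(rmse(observed, predicted), 3))
print("MAE  =", round(mae(observed, predicted), 3))

# The predicted pIC50 of 8.11 reported for compound C9, expressed as an IC50.
print("IC50 (nM) =", round(pic50_to_ic50_nm(8.11), 2))
```

Under that convention, a predicted pIC₅₀ of 8.11 corresponds to an IC₅₀ in the single-digit nanomolar range (about 7.8 nM), which is why C9 is described as a potent predicted hit.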

Keywords: KRAS mutations, QSAR, De novo design, GA-MLR model, KRAS inhibitor

Received: 17 Jul 2025; Accepted: 27 Oct 2025.

Copyright: © 2025 Adebayo, George, Olusola, Oluwafemi, Alzahrani and Hasan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Oche Ambrose George, aafaith516@gmail.com

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.