ORIGINAL RESEARCH article
Front. Public Health
Sec. Environmental Health and Exposome
Volume 13 - 2025 | doi: 10.3389/fpubh.2025.1581717
Exploring the Relationship Between Per- and Polyfluoroalkyl Substances Exposure and Rheumatoid Arthritis Risk Using Interpretable Machine Learning
Provisionally accepted
- 1Nanjing Jiangbei Hospital, Affiliated Nanjing Jiangbei Hospital of Xinglin College, Jiangsu, China
- 2Huai’an No. 3 People's Hospital, Huaian Second Clinical College of Xuzhou Medical University, Jiangsu, China
- 3Nanjing University of Chinese Medicine, Nanjing, China
- 4Huai’an TCM Hospital Affiliated to Nanjing University of Chinese Medicine, Jiangsu, China
Rheumatoid arthritis is a chronic autoimmune disease influenced by environmental exposures, including per- and polyfluoroalkyl substances (PFAS). Although previous studies have suggested links between PFAS and rheumatoid arthritis risk, none have used interpretable machine learning models for prediction. This study aimed to develop such a model to assess risk based on PFAS exposure.

We analyzed data from 11,705 participants in the National Health and Nutrition Examination Survey (2003-2018). Twelve machine learning algorithms were evaluated using metrics including area under the curve (AUC), accuracy, sensitivity, specificity, and F1 score. Key predictors were identified using SHapley Additive exPlanations (SHAP). Partial dependence plots and locally weighted scatterplot smoothing (LOWESS) curves were used to examine nonlinear associations and exposure thresholds. A web-based risk calculator was developed to enhance clinical and public health applicability.

CatBoost showed the best performance (AUC: 0.82; accuracy: 74%; F1 score: 0.62) and was selected for further interpretation. SHAP analysis identified perfluorooctane sulfonic acid (PFOS) and 2-(N-Methyl-perfluorooctane sulfonamido) acetic acid (MPAH) as major contributors to risk prediction. PFOS exhibited a U-shaped relationship with increased risk above 15.10 ng/mL, while MPAH showed a risk transition at 0.22 ng/mL. Waterfall plots illustrated the contribution of individual exposures. The interactive web-based calculator allows users to input PFAS levels and receive personalized rheumatoid arthritis risk estimates. It is freely available on Hugging Face Spaces (https://huggingface.co/spaces/Machine199710/RA_ML).

This study demonstrates the potential of machine learning to predict rheumatoid arthritis risk based on PFAS exposure. The identified nonlinear patterns provide insights into environmental contributions to disease risk and may inform future prevention strategies.
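For readers unfamiliar with the evaluation metrics named above, the following is a minimal Python sketch of how AUC, accuracy, sensitivity, specificity, and F1 score can be computed from true labels and predicted probabilities. It is an illustration only, not the authors' pipeline: the function names (`confusion_counts`, `roc_auc`, `evaluate`), the 0.5 decision threshold, and the toy data are assumptions introduced here, and the study's own implementation is not described in the abstract.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) after thresholding predicted probabilities."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random case outscores a random non-case.

    Ties count as half a win (the Mann-Whitney U formulation of AUC).
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def evaluate(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Compute the five metrics listed in the abstract for one model."""
    tp, fp, tn, fn = confusion_counts(y_true, y_score, threshold)
    total = tp + fp + tn + fn
    return {
        "auc": roc_auc(y_true, y_score),
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        "f1": 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0,
    }


# Toy data (hypothetical): 1 = rheumatoid arthritis, 0 = no rheumatoid arthritis.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
toy_metrics = evaluate(toy_labels, toy_scores)
print(toy_metrics)
```

AUC is independent of the decision threshold, whereas accuracy, sensitivity, specificity, and F1 score all change with it, which is one reason a model comparison would typically report AUC alongside the thresholded metrics.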
Keywords: machine learning, rheumatoid arthritis, PFAS, SHAP, environmental pollution
Received: 22 Feb 2025; Accepted: 13 May 2025.
Copyright: © 2025 Li, Xu and Zhang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence:
Zhi Li, Nanjing Jiangbei Hospital, Affiliated Nanjing Jiangbei Hospital of Xinglin College, Jiangsu, China
Ke Zhang, Nanjing University of Chinese Medicine, Nanjing, China
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.