REVIEW article

Front. Immunol., 10 June 2024

Sec. Autoimmune and Autoinflammatory Disorders: Autoinflammatory Disorders

Volume 15 - 2024 | https://doi.org/10.3389/fimmu.2024.1409555

Advancing precision rheumatology: applications of machine learning for rheumatoid arthritis management

  • 1. Department of Rheumatology, Shanghai Guanghua Hospital of Integrative Medicine, Shanghai University of Traditional Chinese Medicine, Shanghai, China

  • 2. Guanghua Clinical Medical College, Shanghai University of Traditional Chinese Medicine, Shanghai, China

  • 3. Institute of Arthritis Research in Integrative Medicine, Shanghai Academy of Traditional Chinese Medicine, Shanghai, China

  • 4. Traditional Chinese Medicine Hospital of Inner Mongolia Autonomous Region, Hohhot, Inner Mongolia Autonomous Region, China

  • 5. Guangxi Key Laboratory for Genomic and Personalized Medicine, Guangxi Collaborative Innovation Center for Genomic and Personalized Medicine, Guangxi Medical University, Nanning, Guangxi, China

  • 6. Department of Urology, Affiliated Tumor Hospital of Guangxi Medical University, Guangxi Medical University, Nanning, Guangxi, China

Abstract

Rheumatoid arthritis (RA) is an autoimmune disease causing progressive joint damage. Early diagnosis and treatment are critical but remain challenging because of the complexity and heterogeneity of RA. Machine learning (ML) techniques may enhance RA management by identifying patterns within multidimensional biomedical data to improve classification, diagnosis, and treatment predictions. In this review, we summarize the applications of ML for RA management. Emerging studies have developed diagnostic and predictive models for RA that draw on a variety of data modalities, including electronic health records, imaging, and multi-omics data. High-performing supervised learning models have achieved an area under the curve (AUC) exceeding 0.85 for identifying RA patients and predicting treatment responses. Unsupervised learning has revealed potential RA subtypes. Ongoing research is integrating multimodal data with deep learning to further improve performance. However, key challenges remain regarding model overfitting, generalizability, validation in clinical settings, and interpretability. Small sample sizes and a lack of testing in diverse populations risk overestimating model performance. Prospective studies evaluating real-world clinical utility are lacking. Enhancing model interpretability is critical for clinician acceptance. In summary, while ML shows promise for transforming RA management through earlier diagnosis and optimized treatment, larger-scale multisite data, prospective clinical validation of interpretable models, and testing across diverse populations are still needed. As these gaps are addressed, ML may pave the way towards precision medicine in RA.

1 Introduction

Rheumatoid arthritis (RA) is a prevalent autoimmune disorder characterized by inflammation and discomfort in numerous small joints, potentially leading to joint deformity and impaired functionality. It ranks among the primary contributors to chronic disability (1). RA also affects other bodily systems, including the cardiovascular and respiratory systems, leading to an elevated susceptibility to conditions such as myocardial infarction, stroke, and pulmonary fibrosis (2, 3). Chronic illness and persistent pain can cause psychological distress, manifesting as symptoms of depression and anxiety (4). Hence, it is imperative to promptly identify individuals with a high susceptibility to RA in order to facilitate early diagnosis and anticipate the potential severity of disease progression. The timely administration of efficacious medications is likewise essential in impeding the advancement of the disease.

The phrase “machine learning (ML)” surged in popularity in the late 1990s in the field of artificial intelligence (5). In the past decade, ML has advanced significantly thanks to the increased availability of data and improvements in algorithms, enabling the identification of complex patterns and correlations within datasets (6). Biomedical data volumes have grown substantially, ranging from molecular details to comprehensive information on the human body system, owing to advances in high-throughput sequencing technologies, electronic health records, and medical imaging (7). Healthcare providers and researchers face a growing number of clinical challenges, leading them to explore ways to enhance decision-making effectiveness, refine personalized treatment strategies, and optimize resource allocation. ML is well positioned to extract valuable patterns and insights from large datasets, potentially automating and improving the efficiency of healthcare decision-making and services. The gradual integration of biomedicine with disciplines such as computational science, mathematics, and statistics has spurred interdisciplinary partnerships and accelerated the application of ML in biomedicine (8). In clinical practice, rheumatoid factor (RF) and anti-citrullinated protein antibody (ACPA) serve as crucial diagnostic biomarkers for RA. However, approximately 20-25% of RA patients are seronegative, which complicates early diagnosis and can delay diagnosis and treatment (9). With the advent and development of biologics, significant progress has been made in the treatment of RA.
Nevertheless, many RA patients exhibit poor responses to drug treatments, failing to achieve sustained remission (10), and currently, it is not possible to predict which treatment drugs will have the best therapeutic effect on individual patients. The accumulation of biomedical big data may provide new insights into better understanding the heterogeneity of RA (11). With the increase in data volume and complexity, traditional statistical analysis methods have become insufficient, especially when dealing with nonlinear relationships and complex interactions between variables (12). These unmet needs pose challenges to the precision medicine of RA. Using ML techniques for data processing and pattern recognition to build predictive models for RA can assist clinicians in making more accurate data-driven decisions (13). Therefore, understanding the prevalent ML algorithms in RA, their effectiveness, and potential applications is crucial. Our study is dedicated to evaluating recent literature on applications of ML in RA classification and outcome prediction, with the goal of offering a dependable benchmark for reference and guiding future research endeavors. By enhancing the utilization of sophisticated modeling in RA and advocating for precision medicine in the field, our work aims to propel advancements in RA treatment and management.

2 ML algorithms to enhance precision rheumatology

ML, a crucial component of artificial intelligence, is divided into two main categories: supervised and unsupervised learning. Supervised learning employs labeled training datasets to identify patterns and relationships. Upon training, the model can predict or classify new data inputs, yielding corresponding results. This method utilizes a range of algorithms, such as logistic regression, random forests, gradient boosting, and decision trees. Each algorithm contributes uniquely to the robustness and accuracy of predictive outcomes, making supervised learning integral to advancements in data-driven research methodologies (14). Supervised learning is divided into two principal methodologies: classification and regression (15). Classification methodologies segregate patients according to distinct characteristics (16). By employing datasets comprising genetic information, gene expression profiles, and clinical indicators from patients with RA, algorithms can be trained to identify RA patients within populations, as well as to ascertain which patients exhibit optimal responses to specific treatments. Regression models, on the other hand, are designed to predict continuous outcomes (17), such as disease activity scores and response rates to treatments in RA patients, thus facilitating personalized monitoring and management to optimize treatment efficacy. In contrast, unsupervised learning explores inherent patterns and relationships in datasets without predetermined labels (18). Clustering algorithms, an exemplary application of unsupervised learning, automatically group data into multiple clusters to maximize intra-cluster similarity and minimize inter-cluster similarity, aiding significantly in RA research by identifying potential patient subgroups who may exhibit favorable responses to specific treatments or distinct disease progression patterns. 
Deep learning, employing Artificial Neural Network (ANN) technologies, enhances the analysis and prediction of complex data through sophisticated non-linear mapping relationships (19). Particularly, Convolutional Neural Networks (CNNs) in deep learning architectures are adept in processing image data (20), enabling automatic feature learning from multiple convolutional layers which assist physicians in identifying early signs of arthritis or disease progression in X-ray or Magnetic Resonance Imaging (MRI) images of RA patients. In summary, supervised and unsupervised learning each serve specific roles, while deep learning technologies enhance the capability of these methods to process complex data, thereby effectively advancing the field of precision rheumatology.
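To make the clustering idea above concrete, the following is a minimal sketch of k-means (Lloyd's algorithm) written with the Python standard library only. The patient values, the two features (a CRP level and a swollen joint count), and the starting centroids are hypothetical and chosen purely for illustration; real analyses would standardize features first and would usually rely on an established library rather than this hand-written loop.

```python
import math

def kmeans(points, centroids, n_iter=20):
    """Lloyd's algorithm: alternate an assignment step and a centroid update."""
    centroids = [tuple(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return labels, centroids

# Hypothetical (CRP in mg/L, swollen joint count) pairs for six patients.
points = [(2.0, 1.0), (3.0, 2.0), (2.5, 1.5), (30.0, 12.0), (28.0, 10.0), (32.0, 11.0)]

# Deterministic start: one low-activity and one high-activity patient.
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

No outcome labels are supplied at any point; the two groups emerge only from distances between patients, which is the defining property of unsupervised learning described above.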

In the preprocessing phase, data cleaning and organization are paramount, involving the removal of duplicates and correction of anomalies (21). Furthermore, feature engineering plays a critical role in supervised learning: through strategic selection and transformation of data, it identifies the predictors (x) that significantly influence the target variable (y). Accurate feature selection enhances not only the precision of the model but also its interpretability. When constructing predictive models, a large number of available features is a common challenge. Even with advanced and efficient algorithms, uninformative features or numerous irrelevant variables can impair model performance. Key feature selection strategies are therefore crucial, including statistical filtering, wrapper methods, and advanced embedded techniques (22-24). For instance, Random Forest assesses feature importance by calculating each feature’s contribution to model accuracy (25), whereas Logistic Regression identifies key influencing factors by analyzing the magnitude and direction of coefficients (26). Through rigorous feature selection, the dimensionality and complexity of the dataset are effectively reduced, thereby enhancing the interpretability and practical application of the predictive model in clinical decision-making (22). For example, identifying RA patients with specific genetic mutations through feature selection has indicated that these individuals respond more positively to methotrexate, a principal drug for RA treatment. This insight assists physicians in devising targeted treatment plans, thereby improving therapeutic outcomes.
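As an illustration of the statistical filtering strategy mentioned above, the sketch below ranks candidate features by the absolute Pearson correlation between each feature and a binary outcome, then keeps the top k. The feature names, values, and the choice of correlation as the filter statistic are assumptions made for this example; the cited studies may have used other statistics, and wrapper or embedded methods work differently.

```python
import math

def pearson(x, y):
    """Pearson correlation; returns 0.0 when either variable has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)

def filter_select(features, target, k):
    """Score every feature independently of any model, then keep the k best."""
    scores = {name: abs(pearson(values, target)) for name, values in features.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores

# Hypothetical data: 1 = RA, 0 = control.
target = [0, 0, 0, 1, 1, 1]
features = {
    "crp": [2, 3, 4, 20, 25, 30],         # strongly associated with the outcome
    "age": [50, 60, 40, 55, 45, 65],      # weakly associated
    "constant": [1, 1, 1, 1, 1, 1],       # carries no information
}

selected, scores = filter_select(features, target, k=2)
print(selected)
```

Because a filter scores each feature on its own, it is fast but blind to interactions between features, which is one reason wrapper and embedded methods are listed alongside it.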

ML algorithms are increasingly recognized as powerful analytical tools in the field of RA research. As depicted in Figure 1, they provide assistance across multiple domains, including diagnosis, disease progression forecasting, prediction of treatment responses, and identification of potential complications. These computational tools are guiding the field towards a more refined and individualized approach, allowing clinicians and researchers to explore the complexities of RA with greater accuracy.

Figure 1

3 ML models in precision diagnosis and therapeutics for RA

A variety of predictive models have been built using ML algorithms in RA research. Table 1 summarizes the performance of these ML models as classifiers across a multitude of data types from various sources. The functions of these classifiers include identifying individuals at risk for RA, diagnosing RA and differentiating its subtypes, discriminating disease activity levels, forecasting treatment outcomes as effective or ineffective, and predicting the presence or absence of comorbidities.

Table 1

| Task | Sample size | Features | ML algorithms | Performance | Ref |
| --- | --- | --- | --- | --- | --- |
| Risk prediction | Training set: RA patients: n = 599, controls: n = 1673; Test set 1: RA: n = 125, controls: n = 349; Test set 2: RA: n = 127, controls: n = 355; Test set 3: RA: n = 127, controls: n = 355 | 9 SNPs | LR, SVM, Naïve Bayes, RF, XGBoost | AUC > 0.9 | (27) |
| | RA or no arthritis: n = 17,366; Training set: n = 8683; Validation set: n = 4342; Test set: n = 4341 | Age, gender, race, high BMI, gout, diabetic, smoked, sleep, blood pressure, patient health questionnaire, income to poverty ratio | Bayes | Validation set: AUC = 0.826; Test set: AUC = 0.805 | (28) |
| | Training cohort: RA: n = 47, non-RA: n = 64; Test cohort: UA: n = 62 | The Leiden prediction rule, 12-gene risk metric | SVM | AUC = 0.84 | (29) |
| | UA: n = 72, RA: n = 8, HD: n = 13 | CpG sites, clinical parameters | LR, SVM, RF | AUC: 0.875-1 | (30) |
| Diagnosis | Hand radiograph images: Training set: RA: n = 256, OA: n = 262, Normal: n = 231, Others: n = 242; Validation set: RA: n = 56, OA: n = 57, Normal: n = 51, Others: n = 53; Test set: RA: n = 56, OA: n = 58, Normal: n = 51, Others: n = 53 | | CNNs | Classification of RA and normal: AUC = 0.97; Classification of RA and OA and normal: Acc = 0.806; Classification of RA and OA and normal and others: Acc = 0.844 | (31) |
| | 1337 RA ultrasound images of 208 patients | | DL | Classification of synovial proliferation or not: Group1/Group2/Group3: AUC = 0.863/0.861/0.886; Classification of healthy and diseased: Group1/Group2/Group3: AUC = 0.848/0.864/0.916 | (32) |
| | Training set: HC: n = 100, RA: n = 100; Validation set: HC: n = 18, RA: n = 20 | Hand images, age, grip force | BayesNet, NaïveBayes, Logistic, k-NN, RF, etc. | Classification of RA and HC: Acc = 0.947, Sen = 0.95, Spe = 0.944, AUC = 0.971 | (33) |
| | Training set: GSE93272, GSE45291, GSE74143, GSE65010, GSE15573, GSE61635, GSE65391, GSE138458, GSE143272, GSE113469, GSE50772; Test set: GSE55457 | 15 key genes | LASSO, SVM, RF, XGBoost, BPNN, CNN | AUC > 0.85 | (34) |
| | GSE93272, GSE17755 | MAPK3, ACTB, ACTG1, VAV2, PTPN6, ACTN1 | LASSO | Training set: AUC = 0.801; Validation set: AUC = 0.979 | (35) |
| | Uninflamed: n = 10; Resolving arthritis: n = 9; Early RA: n = 17; Established RA: n = 12 | Cytokine, chemokine | GMLVQ | RA vs. non-inflamed group: AUC = 0.996; Early RA vs. resolved arthritis group: AUC = 0.764 | (36) |
| | Training set: GSE12021, GSE55235, GSE55457, GSE55584; Validation set: Dataset 1: GSE89408; Dataset 2: GSE77298, GSE153015 | m6A methylation regulators | RF, Rpart, LASSO, XGBoost, LR | Classification of RA and HC: AUC = 0.85 (IGF2BP3), AUC = 0.85 (YTHDC2) | (37) |
| | Serum of 225 RA patients and 100 HC; Discovery set: n = 243; Validation set: n = 82 | 26 metabolites and lipids | LR, RF, SVM | Classification of RA and HC: AUC = 0.91, Sen = 0.897, Spe = 0.906 | (38) |
| | Test cohort: RA: n = 36, OA: n = 18, HC: n = 18; Validation cohort: RA: n = 24, OA: n = 12, HC: n = 12 | 3 groups of differentially expressed proteins | RF | Classification of RA: AUC = 0.9949; Classification of ACPA-positive RA patients: AUC = 0.9913; Classification of ACPA-negative RA patients: AUC = 1.0 | (39) |
| | IBD: n = 14, MS: n = 7, RA: n = 5, JIA: n = 3, SLE: n = 3, T1D: n = 2, BS: n = 2, AS: n = 2, APS: n = 1, PSC: n = 1, MG: n = 1, ReA: n = 1 | Gut microbiome | RF, SVM, XGBoost, Ridge Regression | Classification of RA and IBD: AUC > 0.86; Classification of RA and MS: AUC > 0.96 | (40) |
| | Discovery cohort: 167 RA and 91 controls; Validation cohort: 12 SLE, 32 RA and 32 controls | miR-22-3p, miR-24-3p, miR-96-5p, miR-134-5p, miR-140-3p, miR-627-5p | LASSO, RF, LR | Classification of RA and non-RA: AUC = 0.71; Classification of ACPA-positive RA and others: AUC = 0.73; Classification of ACPA-negative RA and others: AUC = 0.73 | (41) |
| | H&E-stained images of TKR explant synovium (OA: n = 147, RA: n = 60); Training set: n = 166; Test set: n = 41 | 14 pathologist-scored features, computer vision-quantified cell density | RF | Classification of RA and OA: AUC = 0.91 | (42) |
| | 129 synovial tissue samples; RA: n = 123; OA: n = 6 | Histologic scoring | SVM | Classification of the high inflammatory subtype and others: AUC = 0.88; Classification of the low inflammatory subtype and others: AUC = 0.71; Classification of the mixed subtype and others: AUC = 0.59 | (43) |
| Disease activity/imaging progression | Hanyang Bae RA Cohort: No progression: n = 118, Severe progression: n = 120; NARAC Cohort: No progression: n = 68, Severe progression: n = 86 | Genetic and clinical factors | SVM | Classification of radiologic progression and no progression: AUC = 0.7872 | (44) |
| | Ultrasound images from RA patients; Training set: n = 1678; Testing set: n = 322 | | CNN | Distinguishing class 0 from the other classes: AUC = 0.96; Distinguishing class 1 from classes 2 and 3: AUC = 0.94; Distinguishing class 2 from class 3: AUC = 0.93 | (45) |
| | 135 visits from 41 patients | Dose percentage change, the DAS-28 ESR score, ESR, disease duration, CRP, and the duration of remission at study entry | LR, KNN, NB, RF, Stacking-Meta Classifier | Classification of flare yes vs. flare no: AUC: 0.72 - 0.81 | (46) |
| | Stable RA patients: n = 130; Training set: n = 104; Test set: n = 26 | Baseline serum proteomics | LASSO, XGBoost | Classification of flare and remission: AUC = 0.8 | (47) |
| | 2 electronic health record platforms; UH Cohort: n = 578 (Training Cohort : Test Cohort: n = 116); SNH Cohort: n = 242 (Training Cohort: n = 125, Test: n = 117) | Medications, patient demographics, laboratories, and prior measures of disease activity | DL | Classification of controlled and uncontrolled: UH training model tested in UH test cohort: AUC = 0.91; UH training model tested in SNH test cohort: AUC = 0.74 | (48) |
| | 300 RA patients | Laboratory data, medicare claims and medications | LR | Classification of high/moderate and low disease activity/remission: AUC = 0.76 | (49) |
| | Optum dataset: n = 68,608; External validation: IBM CCAE: n = 75,579; IBM MDCD: n = 7,537; IBM MDCR: n = 36,090 | Health service utilization, demographics, prescription claims for immunosuppressants, steroids, DMARDs, pain medications, and other comorbid conditions | Regularized LASSO, LR, RF, GBM | 90-day TAR: AUC (IBM CCAE) = 0.77, AUC (IBM MDCR) = 0.75, AUC (IBM MDCD) = 0.77; 730-day TAR: AUC = 0.71 | (50) |
| Therapeutic response: MTX | All patients with new onset RA; Training cohort: n = 26; Validation cohort: n = 21 | Metagenomic, clinical-pharmacogenetic variables | RF | AUC = 0.84 | (51) |
| | Training dataset: ESPOIR: n = 493, EAC: n = 239; External validation dataset: Treach: n = 138 | DAS28, creatininemia, leucocytes, lymphocytes, AST, ALT, swollen joints count and corticosteroids co-treatment | LR, RF, LightGBM, CatBoost | Training dataset: AUC = 0.73; External validation set: AUC = 0.72 | (52) |
| | 349 RA patients: Training set: n = 279; Test set: n = 70 | 95 haplotypes and 5 non-genetic factors | NN, SVM, LR, EN, RF, Boosted Trees | AUC: 0.776 - 0.828; Sen: 0.656 - 0.813; Spe: 0.684 - 0.868 | (53) |
| | 82 RA patients: good responders: n = 42; poor responders/nonresponders: n = 43 | Gene expression | L2-regularized LR, RF, network-based approach | Predictive utility between 4 weeks and pretreatment: Acc = 0.61, AUC = 0.78; Predictive utility at the 4-week time point: Acc = 0.68, AUC = 0.78 | (54) |
| Therapeutic response: TNFi | Discovery cohort: n = 74 (52 responders and 22 non-responders); Validation cohort: n = 25 (14 responders and 11 non-responders) | Clinical and molecular parameters | LR | AUC = 0.91 | (55) |
| | Training dataset: n = 1892; Testing dataset: n = 680 | Demographic, clinical, and genetic markers | Linear models, CART, SVM, GPR | Training dataset: AUC = 0.66; Testing dataset: AUC = 0.615 | (56) |
| | Synovial tissue samples: RA: n = 256, OA: n = 41, NC: n = 36; Genes: n = 11,769 | Pathway and DEG | NB, DT, KNN, SVM | For infliximab response: Pathway-driven model: AUC = 0.87, AUPR = 0.78; DEG-driven model: AUC = 0.92, AUPR = 0.86 | (57) |
| | 179 RA patients: Training set: n = 141; Validation set: n = 38 | 9 clinical parameters | NN | Response to infliximab: AUC = 0.75 | (58) |
| | Responders: n = 23; Non-responders: n = 16 | Clinical data, flow cytometry measurements, protein measurements and transcriptomics data | Linear, non-linear, kernel-based | Response to TNFi: AUC = 0.81 | (59) |
| | Training set: n = 161; Validation set: n = 118 | DAS28, lymphocytes, ALT, neutrophils, age, weight and ever smoked | LR, RF, XGBoost, CatBoost | Response to etanercept: Training set: AUC = 0.74, Validation set: AUC = 0.70; Response to monoclonal anti-TNF antibodies: Training set: AUC = 0.74, Validation set: AUC = 0.71 | (60) |
| Therapeutic response: other drugs | R4RA synovial biopsies: n = 164 | Gene expression, clinical data and histological data | Elastic net regression, GBM | For rituximab response: AUC = 0.744; For tocilizumab response: AUC = 0.681; For refractory state: AUC = 0.686 | (61) |
| | 1204 patients treated with bDMARDs | Age, rheumatoid factor, ESR, disease duration, CRP | Lasso, Ridge, SVM, RF, XGBoost | Acc: 0.528 - 0.729; AUC: 0.511 - 0.694 | (62) |
| | Training set: n = 625; Independent test set: n = 322 | PtGA | RF, XGBoost, ANN, SVM | Acc = 0.726; AUC = 0.638; F1 score = 0.841 | (63) |
| | Training set: 51 MR and 85 NR; External validation cohort: 35 MR and 47 NR | DAS-28 | CART | Training set: AUC = 0.89, Sen = 0.88, Spe = 0.94; Validation cohort: AUC = 0.82 | (64) |
| Comorbidities | 487 patients diagnosed with RA and osteoporosis; Training set: n = 340; Testing set: n = 147 | Baseline demographic, clinical test indicators | RF, ANN, SVM, XGBoost, DT | Training set: AUC = 0.878; Testing set: AUC = 0.872 | (65) |
| | 2374 RA patients | Clinical features, medication, laboratory results | LR, RF, XGBoost, LightGBM | AUC = 0.75; Acc = 0.68; F1 score = 0.7 | (66) |
| | 2 atherosclerosis and 2 RA datasets | NFIL3, EED, GRK2, MAP3K11, RMI1, TPST1 | LASSO, RF | AUC: 0.723 to 1 | (67) |
| | Training cohort: RA+CHD: n = 294, RA: n = 718; Validation cohort: RA+CHD: n = 70, RA: n = 204 | Age, hypertension, anti-CCP antibody positivity, rheumatoid factor positivity, a high ESR, high CRP levels, and dyslipidemia of LDL-c, TC, triglycerides and HDL-c | GBDT, KNN, LR, RF, XGBoost, SVM | AUC = 0.77; Sen = 0.639; Spe = 0.772 | (68) |
| | RA-ILD: n = 75; RA-non-ILD: n = 78 | Age, KL-6, D-dimer, CA19-9 | LASSO, RF, PLS | AUC = 0.928; Sen = 0.83; Spe = 0.81 | (69) |

Application of ML in RA.

Acc, accuracy; ADA, adaptive boosting; ALT, alanine aminotransferase; AST, aspartate aminotransferase; APS, antiphospholipid syndrome; AS, ankylosing spondylitis; AUPR, area under the precision-recall curve; BMI, body mass index; BS, Behçet’s syndrome; b/tsDMARDs, biologic or targeted synthetic disease modifying antirheumatic drugs; CART, classification and regression tree; CA19-9, carbohydrate antigen 19-9; CCP, cyclic citrullinated peptide; CHD, coronary heart disease; CRP, C-reactive protein; DAS 28, disease activity score-28; DEG, differentially expressed gene; DL, deep learning; DT, decision tree; EN, elastic nets; ESR, erythrocyte sedimentation rate; GBDT, gradient boosting decision tree; GBM, gradient-boosted machine; GPR, Gaussian process regression; HC, healthy control; HDL, high density lipoprotein; IBD, inflammatory bowel disease; ILD, interstitial lung disease; JIA, juvenile idiopathic arthritis; KL-6, Krebs von den Lungen-6; KNN, k-nearest-neighbors; LASSO, least absolute shrinkage and selection operator; LDL, low density lipoprotein; LR, logistic regression; MG, myasthenia gravis; MR, multi-refractory; MS, multiple sclerosis; MTX, methotrexate; Non-ILD, rheumatoid arthritis without interstitial lung disease; NB, naïve Bayes; NN, neural networks; NR, non-refractory; OA, osteoarthritis; OP, osteoporosis; PLS, partial least square; PRS, polygenic risk score; PSC, primary sclerosing cholangitis; PtGA, patient global assessment of disease activity; R, responders; RA, rheumatoid arthritis; ReA, reactive arthritis; RF, random forest; SEN, sensitivity; SLE, systemic lupus erythematosus; SNH, safety-net hospital cohort; SNP, single nucleotide polymorphism; SPE, specificity; SVM, support vector machine; TAR, time at risk; TC, total cholesterol; T1D, type 1 diabetes; TNFi, tumor necrosis factor inhibitor; TKR, total knee replacement; UH, university hospital cohort; XGBoost, eXtreme Gradient Boosting.
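For readers less familiar with the metrics reported in Table 1, the sketch below shows one way AUC, sensitivity (Sen), specificity (Spe), and accuracy (Acc) can be computed from predicted scores. The labels, scores, and the 0.5 decision threshold are hypothetical. AUC is computed here through its rank interpretation, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, which does not depend on any threshold; the other three metrics do.

```python
def roc_auc(y_true, y_score):
    """AUC as P(score of a positive > score of a negative); ties count as 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def confusion_metrics(y_true, y_score, threshold=0.5):
    """Sensitivity, specificity, and accuracy at a fixed decision threshold."""
    pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 0)
    return {
        "Sen": tp / (tp + fn),
        "Spe": tn / (tn + fp),
        "Acc": (tp + tn) / len(y_true),
    }

# Hypothetical classifier output: 1 = RA, 0 = control.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

auc = roc_auc(y_true, y_score)
metrics = confusion_metrics(y_true, y_score)
print(auc, metrics)
```

In this toy example 14 of the 16 positive-negative pairs are ranked correctly, giving an AUC of 0.875, while the threshold-based metrics are all 0.75. The gap illustrates why a high AUC in Table 1 does not by itself fix the sensitivity and specificity a clinician would see at a chosen cut-off.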

3.1 Stratification of RA risk cohorts

Identifying individuals at risk for RA is crucial for early intervention, which has been shown to yield substantially better outcomes when applied during the preclinical stages rather than after the overt development of clinically significant arthritis (70). Specifically, by identifying individuals at high risk and conducting regular medical examinations and monitoring RA-related biomarkers, such as inflammation levels and autoantibodies, early detection of the disease can utilize the ‘window of opportunity’ for therapeutic intervention. Early interventions can help prevent severe radiographic damage and disability, thus significantly improving patient prognosis (71). The exact etiology of RA remains not fully understood; however, it is known that genetic and environmental factors, as well as their interactions, influence the onset and progression of RA (72). ML, as an effective data analysis tool, is capable of processing and interpreting large volumes of diverse data, ranging from genetic factors to lifestyle choices. ML can uncover potential risk patterns within complex genetic and environmental datasets, assisting clinicians in making more accurate disease predictions and risk assessments.

Predictive modeling that harnesses ML techniques to pinpoint individuals at elevated risk for RA falls into two principal domains: forecasting incident risk in asymptomatic persons, and assessing the likelihood that symptomatic patients with undifferentiated arthritis will progress to RA. Detecting RA susceptibility in the general population relies on the analysis of genetic variants alongside common clinical risk indicators such as family history, age, and gender. One study found nine single nucleotide polymorphisms (SNPs) linked to RA; by combining these variants into a risk score and applying ML algorithms, the researchers distinguished RA patients from those without the condition, with five-fold cross-validated AUCs above 0.9 (27). In another study, eleven risk factors for RA were identified from National Health and Nutrition Examination Survey (NHANES) data and used to build a Bayesian logistic regression model, which was refined using a genetic algorithm. The model achieved an AUC of 0.826 on the validation set and 0.805 on the test set (28). These findings highlight the potential of machine learning strategies in predicting populations at risk for RA. Genetic risk scores derived from SNPs can help identify an individual’s potential genetic risks, thereby providing a crucial foundation for personalized medicine (73). However, translating these studies into clinical decision support tools faces obstacles, chief among them ensuring that polygenic risk scores (PRS) apply equally well across populations (74). In practice, PRS show limited transferability among populations, and their clinical utility in RA remains undetermined; improving genetic prediction in admixed individuals will require substantial investment in extensive data collection across diverse ethnic groups and in methodological research (75).
Another critical issue is the interpretability of genetic findings in participants, requiring clinicians to possess the capacity to comprehend and interpret data (76). Furthermore, privacy and security of the involved genetic data must be adequately ensured. Federated learning, as a distributed machine learning technique, aims to achieve collaborative modeling while ensuring data privacy, security, and legal compliance (77). Participants can train their local models using their proprietary data, and through iterative training, each participant contributes to the construction of a global model without sharing their data externally (78). This approach fosters collaboration among multiple medical institutions, facilitating the sharing of model learning outcomes (79).
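The federated workflow described above can be sketched as follows, under stated assumptions. The example uses federated averaging, one common aggregation scheme in which the server averages locally trained weights in proportion to each site's sample count; it is not necessarily the scheme used in the cited work. The two hospitals, their data (a bias term plus one standardized biomarker), the logistic regression model, and all hyperparameters are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One site's training pass: gradient descent on its private data only."""
    w = list(weights)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj / len(X)
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

def federated_average(site_weights, site_sizes):
    """Server step: sample-size-weighted mean of the sites' model weights."""
    total = sum(site_sizes)
    return [
        sum(w[j] * n for w, n in zip(site_weights, site_sizes)) / total
        for j in range(len(site_weights[0]))
    ]

# Hypothetical sites; each row is (bias term, standardized biomarker), label 1 = RA.
sites = {
    "hospital_a": ([[1.0, -1.2], [1.0, -0.8], [1.0, 0.9], [1.0, 1.1]], [0, 0, 1, 1]),
    "hospital_b": ([[1.0, -1.0], [1.0, 1.3], [1.0, 0.7]], [0, 1, 1]),
}

global_w = [0.0, 0.0]
for _ in range(50):  # communication rounds
    # Each site starts from the current global model and trains locally.
    updates = [local_update(global_w, X, y) for X, y in sites.values()]
    sizes = [len(y) for _, y in sites.values()]
    # Only model weights and sample counts reach the server, never patient rows.
    global_w = federated_average(updates, sizes)

print(global_w)
```

The design point is visible in `federated_average`: its inputs are weight vectors and counts, so the aggregation step cannot see individual records. Sharing weights alone does not guarantee privacy, however, and production systems typically add further safeguards.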

The likelihood of individuals with undifferentiated arthritis (UA), who exhibit joint symptoms without fulfilling the full diagnostic criteria, subsequently progressing to RA poses a clinical conundrum. Accurate prediction of this progression can facilitate early diagnosis and intervention for those at risk, while concurrently preventing overtreatment and diminishing both the health repercussions and superfluous healthcare expenditures for those unlikely to develop RA (80). Models are increasingly geared towards the evaluation of dynamic variables, reflecting shifts correlated with disease activity, such as gene expression profiles, epigenetic modifications, and a spectrum of detailed symptomatic and clinical markers.

A notable investigation sought to identify clinically pertinent predictive biomarkers in peripheral blood CD4 T cells from UA patients, employing a support vector machine (SVM) classification model. Integrating the pre-established Leiden prediction rule with a 12-gene risk metric improved prognostic performance for seronegative UA patients from an AUC of 0.74 with the original rule to an AUC of 0.84 (29). A comparative analysis of three distinct ML algorithms revealed that an SVM model, which integrated DNA methylation profiles from 40 CpG sites with clinical parameters including disease activity score (DAS) and rheumatoid factor, effectively distinguished individuals with UA who were predisposed to developing RA within one year, achieving an AUC range of 0.85 to 1 (30).

Contemporary studies report promising predictive performance both in identifying at-risk individuals within the general population and in forecasting RA development in patients with UA. During model training, these studies generally selected the features with the greatest impact on predictive outcomes in order to simplify the model and potentially improve performance and generalizability. More important than performance, however, is the potential for practical clinical application. Future studies will need to examine generalizability by testing models in populations of multiple ethnicities and regions, and to assess accuracy by tracking the progression of individuals to RA in larger prospective cohorts.

3.2 Diagnosis and subtype classification of RA

The diagnostic framework for RA, especially in the context of seronegative RA, is intricate and often obstructed by the absence of potent biomarkers, impeding early detection and management (47). Investigations are thus aimed at the identification of new biomarkers to bridge this gap.

Non-invasive imaging techniques are pivotal in elucidating inflammatory activity and its effects on joint morphology, especially when serological markers are indistinct or inconclusive. These tools are indispensable for both diagnostic purposes and for monitoring treatment efficacy (81). Furthermore, the application of ML algorithms in the analysis of imaging data presents a sophisticated approach to patient classification (82). Üreten K et al. presented a model of a Visual Geometry Group-16 (VGG-16) neural network for hand radiographs augmented by transfer learning to distinguish RA patients from non-RA patients, which achieved an AUC of 0.97 (31). Ultrasound imaging of the metacarpophalangeal joints in RA patients has been categorized for classification purposes, employing a DenseNet-based deep learning model in several regions of interest, significant efficacy was demonstrated in distinguishing between synovial proliferation and healthy and diseased synovium, as evidenced by AUCs exceeding 0.8 (32). Additionally, research has been conducted utilizing hand RGB images and gripforce as features to develop a random forest model with an AUC of 0.97 for distinguishing between individuals with RA and control subjects, thereby offering a supplementary diagnostic tool for RA (33). Image-based predictive models have shown notable performance in research settings, accurately differentiating RA patients from others in various cohorts, thereby contributing to the precision and efficiency of RA diagnosis. These models facilitate the early detection of abnormal changes within the joints, enabling timely intervention and ultimately delaying the progression of RA. However, their clinical application still faces significant challenges. A primary obstacle is the interpretability of the models. 
Owing to the ‘black box’ nature of deep learning models, their decision-making processes are opaque and difficult to comprehend, which may undermine the trust of both physicians and patients in model predictions (83). To address this limitation, several well-established methods can be used: Class Activation Mapping (CAM) highlights the image regions to which the model attends (84); Shapley Additive exPlanations (SHAP) elucidate the global impact of each feature on the model (85); and Local Interpretable Model-agnostic Explanations (LIME) explain the local prediction process for individual samples (86). Collectively, these tools make the model’s decision-making process easier to understand. Future studies should also involve multi-center collaborations to expand image collection, with the aim of further refining and generalizing these diagnostic models.
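To make the idea behind SHAP more concrete, the following minimal sketch computes exact Shapley attributions for a single prediction by averaging each feature's marginal contribution over all feature orderings. It is an illustration only: the scoring function `toy_score`, its three feature names, and the all-zero baseline are hypothetical and are not taken from any of the cited studies, and practical tools such as the SHAP library approximate these values rather than enumerating every ordering.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley attributions for one instance.

    A feature that is "absent" from a coalition is set to its baseline value.
    """
    n = len(instance)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = predict(current)
        for i in order:
            # Switch feature i on and record how much the prediction moves.
            current[i] = instance[i]
            value = predict(current)
            totals[i] += value - previous
            previous = value
    return [t / len(orderings) for t in totals]


def toy_score(x):
    """Hypothetical risk score with an interaction between the first two features."""
    erosion, synovitis, grip = x
    return 2.0 * erosion + 1.0 * synovitis + 0.5 * erosion * synovitis - 0.2 * grip


instance = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_score, instance, baseline)
```

The attributions sum to the difference between the prediction for the instance and the prediction for the baseline, which is the property that makes Shapley-based explanations additive. Here the interaction term is shared equally between the two features involved.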

In RA, both individual analyses and integrative omics studies have accumulated a vast amount of data, providing insights into the mechanisms of RA from multiple perspectives. Genomics identifies genetic variations associated with RA, revealing potential genetic mechanisms influencing gene expression (87). Epigenetic modifications, including DNA methylation, histone modifications, chromatin remodeling, and non-coding RNA, play crucial roles in maintaining normal gene expression patterns. Epigenomics studies these modifications to reveal gene expression and regulatory mechanisms in RA, offering insights into the diverse molecular processes involved (88). Transcriptomics, by analyzing the variations in gene expression under different conditions, provides a detailed elucidation of which genes are upregulated or downregulated in RA. This process not only involves the regulation at the genetic level but also directly affects the production and function of the corresponding proteins (89). Proteomics provides a comprehensive analysis of protein composition, expression levels, and modification states, elucidating the interactions and connections among proteins that may play key roles in RA inflammation and immune response processes (90). Metabolomics provides insights into the shifts in metabolic states and pathways during the progression of RA. These changes are potentially influenced by alterations in gene and protein activities. Furthermore, metabolites themselves can play a modulatory role, affecting gene transcription and protein expression, thereby forming a complex interplay that influences disease dynamics (91). Host genomic variations significantly influence the composition of the gut microbiota, which can synthesize, regulate, or degrade endogenous small molecules or macromolecules, resulting in metabolic changes. 
Metagenomics and related techniques reveal how the gut microbiota contributes to the development of RA by influencing metabolic pathways and modulating the host immune system (92). Omics studies are characterized by the generation of vast, high-dimensional datasets. ML algorithms are widely employed to visualize and process such information: finding patterns, building predictive models, and examining large-scale, multi-omic data to identify biomarkers and pathways implicated in disease progression (93, 94). Existing research has integrated multimodal data and employed various ML algorithms to develop high-performance diagnostic models for RA. Key genes highly correlated with RA phenotypes have been identified by applying weighted gene co-expression network analysis (WGCNA) and differential gene expression (DEG) analysis to microarray datasets of RA blood samples. These genes were used as features to assess the performance of six ML models, five of which demonstrated commendable efficacy (AUC > 0.85) (34). Using microarray datasets of peripheral blood samples from RA patients in the GEO database, a platelet-related signature risk score model comprising six genes was formulated with the Least Absolute Shrinkage and Selection Operator (LASSO) algorithm. The model exhibited AUCs of 0.801 and 0.979 in the training and validation sets, respectively (35). Using the Generalized Matrix Learning Vector Quantization (GMLVQ) method, mRNA expression profiles of cytokines and chemokines from synovial biopsies were analyzed, leading to the identification of two gene sets. These sets were instrumental in generating a model capable of differentiating between arthritis types, with AUCs of 0.996 and 0.764 for distinguishing diagnosed RA from non-inflammatory cases and early-stage RA from self-remitting arthritis, respectively (36). 
By focusing on the expression of 19 N6-methyladenosine (m6A) methylation regulators, diagnostic models have been established to separate RA from non-RA conditions. A subset of these regulators, particularly IGF2BP3 and YTHDC2, demonstrated accuracies and AUCs exceeding 0.8 across most ML models, indicating the potential diagnostic importance of m6A methylation profiles (37). A multi-variable classification model, incorporating 26 metabolites and lipids, was devised utilizing three ML algorithms. The logistic regression model, in particular, stood out for its ability to differentiate seropositive and seronegative RA from normal controls within an independent validation cohort, securing an AUC of 0.91, thus showcasing that a holistic metabolomic and lipidomic approach grounded in Liquid Chromatography-Mass Spectrometry (LC-MS) can effectively segregate RA cases (38). Serum antigens were analyzed in patient cohorts with RA, osteoarthritis (OA), and healthy controls. Subsequently, distinct biomarker sets were identified for the differentiation of RA, ACPA-positive RA, and ACPA-negative RA using feature selection through the Random Forest algorithm. The model demonstrated exceptional performance with AUC values of 0.9949, 0.9913, and 1.0, respectively, establishing a proteomics-based diagnostic model for RA (39). Furthermore, leveraging metagenomic data to predict the microbiomic characteristics of the gut in autoimmune diseases has been demonstrated to discriminate between various types of autoimmune disorders (40).
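Since nearly all of the studies above report performance as an AUC, a brief sketch of what that number measures may be helpful. The function below computes the AUC directly from its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The labels and scores are invented for illustration and do not come from any cited dataset.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = RA, 0 = control; scores are model outputs.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)  # 8 of the 9 positive/negative pairs are ordered correctly
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. Because the measure depends only on the ordering of scores, it says nothing about calibration or about performance at any particular decision threshold.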

Histopathology, a fundamental pillar of disease diagnosis, is the definitive standard for verifying numerous ailments (95). Overlap of symptoms in certain pathologies may obscure the principal etiology responsible for articular manifestations; in such instances, tissue biopsy, particularly of synovial tissue, proves invaluable. Following Total Knee Arthroplasty (TKA), synovial samples from 147 OA and 60 RA individuals were subjected to hematoxylin and eosin (H&E) staining. A random forest algorithm integrating pathologist-derived scores with computer vision-generated cellular density measures was used to construct an optimal discriminative model for OA and RA, which achieved an AUC of 0.91 (42). This serves as a potent discriminative tool for RA assessment. Orange et al. utilized consensus clustering of gene expression data from synovial tissues of patients with RA to identify three distinct synovial subtypes: high-inflammatory, low-inflammatory, and mixed. They subsequently employed a support vector machine (SVM) algorithm to distinguish between these subtypes based on histological features, achieving AUCs of 0.88, 0.71, and 0.59, respectively (43).

Despite the high performance of ML-derived predictive models for RA diagnosis, concerns about potential overfitting due to limited sample sizes, which may exaggerate effect sizes, cannot be overlooked. Independent evaluation of the research methodology, data processing, and outcomes by an external party helps ensure the accuracy and reliability of the findings. Validation of these models in diverse datasets, supplemented by molecular biology experimentation, is imperative for evaluating true diagnostic merit. Predictive models relying on histopathological data encounter additional challenges, including the necessity for manual feature annotation by pathologists and the invasiveness of the procedure, compounded by technical and sample handling issues. External validation is a critical quality control measure, ensuring that model utility and accuracy in diagnosing RA reflect true clinical relevance and potential for widespread application. The diagnosis of RA extends beyond separating RA from healthy subjects or OA patients. Future investigations must address the capacity of model-derived markers to distinguish seronegative RA from other inflammatory arthritides, such as psoriatic arthritis, reactive arthritis, or spondyloarthritis. At the same time, guarding against confounding variables and maintaining diversity within patient cohorts are essential if models are to be broadly applicable.
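The risk of overestimated performance in small, high-dimensional datasets can be illustrated with a short simulation. The sketch below is a hypothetical setup, not a re-analysis of any cited study: it generates a discovery cohort of 10 cases and 10 controls with 500 features of pure noise, selects the feature with the highest AUC, and then re-tests that same feature in an independent noise cohort. The cohort sizes, feature count, and random seed are assumptions chosen for illustration.

```python
import random


def roc_auc(labels, scores):
    """Fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def selection_optimism(n_per_class=10, n_features=500, seed=0):
    """Pick the best of many noise features, then re-test it on new data."""
    rng = random.Random(seed)
    labels = [1] * n_per_class + [0] * n_per_class

    def noise_cohort():
        # One list of values per feature; no feature carries any real signal.
        return [[rng.gauss(0.0, 1.0) for _ in labels] for _ in range(n_features)]

    discovery, validation = noise_cohort(), noise_cohort()
    discovery_aucs = [roc_auc(labels, values) for values in discovery]
    best = max(range(n_features), key=discovery_aucs.__getitem__)
    # Apparent AUC on the data used for selection vs. AUC on independent data.
    return discovery_aucs[best], roc_auc(labels, validation[best])


apparent_auc, validation_auc = selection_optimism()
```

Although no feature is informative by construction, the selected feature usually shows a high apparent AUC in the discovery cohort, whereas its AUC in the independent cohort is expected to scatter around 0.5. This is the gap that external validation is designed to expose.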

3.3 Prediction of disease activity and imaging progression in RA

Radiographic deterioration in RA is characterized by the degree of articular damage and the presence of distinct lesions such as joint space narrowing, bone erosion, and osteoporosis, as revealed through diagnostic imaging modalities including X-rays, magnetic resonance imaging, or computed tomography scans (96). The quantification and prognostication of structural joint impairment traditionally hinge on clinical expertise, underscoring the necessity for an automated, bias-free evaluation method. A study utilizing SVM modeling on cohorts comprising 374 Korean and 399 North American patients with incipient RA identified SNPs correlated with radiographic progression. An integrated model encompassing SNPs with clinical parameters exhibited optimal performance, yielding a mean ten-fold cross-validation AUC of 0.78, providing a more satisfactory distinction between severe and non-severe progression (44).
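For readers less familiar with how a mean ten-fold cross-validation AUC is obtained, the procedure can be sketched as follows. This is a simplified illustration under stated assumptions: the cohort is simulated, and a one-feature stand-in model (which only learns the direction of the feature from the training folds) replaces the SVM used in the cited study. The function names and parameters are hypothetical.

```python
import random


def roc_auc(labels, scores):
    """Fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k, seed=0):
    """Assign sample indices to k folds, keeping class proportions similar."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        for position, index in enumerate(members):
            folds[position % k].append(index)
    return folds


def cross_validated_auc(feature, labels, k=10, seed=0):
    """Mean and per-fold AUC of a one-feature stand-in model."""
    fold_aucs = []
    for test_idx in stratified_folds(labels, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        # "Fit" on the training folds only: learn which direction marks the positive class.
        pos = [feature[i] for i in train_idx if labels[i] == 1]
        neg = [feature[i] for i in train_idx if labels[i] == 0]
        sign = 1.0 if sum(pos) / len(pos) >= sum(neg) / len(neg) else -1.0
        # Evaluate on the held-out fold, which played no part in fitting.
        fold_aucs.append(roc_auc([labels[i] for i in test_idx],
                                 [sign * feature[i] for i in test_idx]))
    return sum(fold_aucs) / len(fold_aucs), fold_aucs


# Simulated cohort (hypothetical): 50 severe progressors (1) and 50 non-severe (0).
rng = random.Random(42)
labels = [1] * 50 + [0] * 50
feature = [rng.gauss(1.0 if y == 1 else 0.0, 1.0) for y in labels]
mean_auc, fold_aucs = cross_validated_auc(feature, labels)
```

Each sample is held out exactly once, so the mean of the ten fold-level AUCs uses all of the data for testing without ever scoring a sample with a model that was fitted on it. Cross-validation of this kind remains an internal estimate and does not replace testing in an external cohort.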

Radiological damage bears a significant association with disease activity in RA, with heightened activity posing an increased risk of osseous impairment. CNNs trained on ultrasound images of RA joints have facilitated the automatic grading of disease activity, achieving an overall classification accuracy of 83.9% (45). Vodencarevic et al. used data from 135 consultations with 41 RA patients to predict flares during tapering of biologic disease-modifying antirheumatic drugs (DMARDs) in patients in remission. They combined multiple ML models to achieve an AUC of 0.81 (46). Furthermore, baseline serum proteomics from 130 stable RA patients in clinical remission was analyzed for biomarkers predictive of future disease flares, employing LASSO and eXtreme Gradient Boosting (XGBoost) algorithms to construct predictive models. The XGBoost model exhibited superior performance in differentiating between relapsed and non-relapsed patients with an AUC of 0.80 (47).

The expansive volume of patient and clinical information held in electronic medical records (EMR) and electronic health records (EHR) constitutes a substantial body of data ripe for investigation (97, 98). Nonetheless, hindrances such as imbalances in the number of records per patient, omissions of pivotal information, and variability in patient conditions and therapeutic outcomes over time contribute to the complex temporal nature of the data (48). Conventional ML techniques are constrained in data pre-processing, time-series analysis, and the handling of intricate relationships (99). Deep learning applied to structured EHR data has been deployed to predict disease activity at subsequent outpatient rheumatology consultations; the model trained on the UH cohort achieved an AUC of 0.91 in internal validation and 0.74 in external cohort testing (48). Feldman et al. sought to improve the precision of RA disease activity evaluation by integrating electronic medical records and claims data, achieving an AUC of 0.76 in discriminating high/moderate disease activity from low disease activity/remission (49). Chandran et al. used the prescription of biologic agents or tofacitinib as a surrogate indicator of disease severity; their model predicted both current and future disease activity and was validated across several databases, with AUCs exceeding 0.7 (50).

The aforementioned results substantiate the viability of employing routinely documented clinical and laboratory data to assess and forecast disease activity in RA. With progressive advancements in information technology, an extensive array of data has become accessible, prompting researchers to explore ML methodologies for extracting RA patient records from electronic health record data, thereby enabling the study of substantial populations at minimal expense. Algorithms trained via ML are increasingly applied to EMR data for clinical investigations. These algorithms function by detecting specifiable patterns in the data associated with RA, yet systematic disparities in EMR data quality present hurdles for model generalizability. Moreover, high-caliber investigations remain limited, and the dependability and transferability of the relevant ML methods are largely undetermined, rendering periodic evaluation of algorithm performance imperative. The current research trend involves the use of thousands of digitally annotated images obtained from large-scale observational studies, clinical trials, and electronic medical records, along with clinical data, to automatically classify and quantify the extent of joint damage and activity scores in RA using ML algorithms (100–102).

3.4 Prediction of RA treatment response

In the realm of RA therapeutics, a plethora of options including nonsteroidal anti-inflammatory drugs (NSAIDs), glucocorticoids, conventional synthetic DMARDs, biologic DMARDs, and oral small molecules have been made available (103). The selection of appropriate treatments continues to challenge clinicians owing to the vast range of alternatives and the prevalent trial-and-error approach in therapeutic prescription, exacerbated by a lack of comprehensive knowledge regarding drug efficacy and safety across distinct patient demographics (53).

Methotrexate (MTX) is the quintessential first-line therapy in RA treatment strategies (104). Artacho et al. investigated whether differences in the gut microbiome across individuals could serve as predictive markers of MTX efficacy in new-onset RA. Fecal samples from 26 new-onset RA patients, procured prior to MTX treatment, were analyzed using 16S ribosomal RNA (16S rRNA) and shotgun sequencing. A predictive model subsequently constructed with random forests showed that response to MTX treatment at 4 months could be anticipated, with an AUC of 0.84, based on gut microbiome characteristics (51). Additional research applied ML algorithms to clinical and biological data from 493 and 239 patients across two cohorts to predict MTX treatment response at 9 months. Notably, the Light Gradient Boosting Machine (LightGBM) model achieved AUCs of 0.73 and 0.72 in the training and external validation sets, respectively (52). Lim et al. analyzed exome sequencing data from 349 RA patients and predicted treatment response to MTX using six ML algorithms. They identified 95 genetic factors and 5 non-genetic factors that influenced response. The predictions had strong performance, with AUCs between 0.776 and 0.828 in the test set (53). Plant et al. performed gene expression profiling of whole blood samples taken from RA patients before and 4 weeks after starting MTX to predict treatment response at 6 months. Application of an L2-regularized logistic regression yielded an AUC of 0.78 (54). The development of these predictive models has contributed significantly towards identifying patients who are more likely to respond favorably to, or may not derive benefit from, MTX treatment.

Anti-tumor necrosis factor (anti-TNF) agents have been established as pivotal second-line therapeutic agents following methotrexate. A prospective multicenter study recruited 104 RA patients and 29 healthy donors to discover predictive biomarkers for anti-TNF treatment using ML. A hybrid model combining clinical and molecular variables achieved a high AUC value of 0.91 (55). The DREAM RA Responder Challenge introduced a novel approach to predicting anti-TNF treatment response by proposing an optimal model that incorporates Gaussian Process Regression (GPR) and integrates demographic, clinical, and genetic markers. This model accurately predicts the Disease Activity Score in patients 24 months post-baseline assessment and categorizes treatment response according to the EULAR response criteria, effectively identifying non-responders to anti-TNF therapy with an AUC of 0.6 in cross-validation data (56). Kim et al. utilized 11 datasets containing 256 synovial tissue samples, integrating RA-associated pathway activation scores and four ML types, and found that the SVM model performed the best, with an AUC of 0.87 using the pathway-driven model and an AUC of 0.9 using the DEG-driven model (57).

Recent research has emphasized the potential benefits of integrating diverse datasets for the purpose of treatment decision-making. ML algorithms have demonstrated efficacy in enhancing the precision of response prediction for TNF inhibitors and MTX. Furthermore, ML methodologies are being increasingly utilized in forecasting treatment responses to a range of other biologic therapies (61–64). Clinical data may be limited by trial design, including inclusion and exclusion criteria. Using deep learning technology for cluster analysis on RA patients has revealed the connection between patient characteristics and treatment response (105). Advancements in spatial omics technologies enable a comprehensive and spatially intact analysis of synovial tissue in RA patients. This approach allows for precise localization of cells, exploration of cellular interactions, assessment of cell type distributions, and identification of disease-associated molecular markers (106). Integrating traditional multi-omics with spatial data, spatial multi-omics elucidates the complexity and dynamics of biological processes across various levels, including their interactions and influences on each other. This approach deepens our understanding of the pathological mechanisms of RA and enhances our knowledge of its spatial heterogeneity (107). The biopsy-driven RA randomized clinical trial (R4RA), which utilizes spatial omics to create synovial biopsy gene maps, provides a paradigm for predicting drug treatment responses and refining therapeutic strategies. This is crucial for achieving personalized medicine and optimizing treatment outcomes. Despite some progress, spatial omics in RA research is still in its early stages. Numerous challenges remain, such as high costs, high demands on sample handling, patient acceptance, ethical issues, and the need for advanced computational tools for data integration (108). 
Overcoming these challenges will be crucial for developing accurate, interpretable, and clinically applicable predictive models. In summary, while there is room to refine the accuracy of these predictions, progress is evident in this area of study. In the future, using larger and more comprehensive datasets, appropriate algorithms, and parameter optimization methods, together with improving model features and validating against independent cohorts, may further improve the discriminative power of predictive models.

3.5 Prediction of comorbidities related to RA

ML is also gaining attention in the prediction of comorbidities associated with RA. Focus within extant research has primarily been oriented towards the identification of risk factors for osteoporosis (65, 66), assessment of cardiovascular risk (67, 68), and the prediction of interstitial lung disease development (69) in individuals with RA. Current models pertaining to comorbidities are limited in both quantity and accuracy, with constraints stemming from various sources, notably the scarcity of comprehensive comorbidity data within RA patient cohort datasets. Furthermore, there is significant variability in data quality across different cohorts. To overcome these obstacles, future research should prioritize the accumulation of larger, more robust datasets and improve integration among diverse data sources. Simultaneously, there is a necessity for the advancement of algorithms with broader applicability, thereby enabling the utilization of ML in the prediction of complications associated with RA.

4 Conclusion and outlook

Integrating data from diverse sources allows ML models to yield more comprehensive and precise predictions for the diagnosis and treatment outcomes of RA. However, more focus and effort are needed to create predictive models for comorbidities related to RA. Recent research has demonstrated the potential of multimodal learning to improve clinical prediction accuracy. Identifying the best-performing model under specific conditions often necessitates an extensive comparative analysis. Beyond frequently used metrics such as AUC, accuracy, sensitivity, specificity, and F1 score, such comparisons should take into account the use of cross-validation, the statistical tests applied, the model’s computational cost, and its data requirements and accessibility. Efforts should be made to improve the clinical operability of models, utilize external datasets from diverse origins for validation, assess the model’s generalizability, monitor its long-term performance, and evaluate its strengths and weaknesses through multidimensional approaches rather than relying on a single performance metric. Although ML models have demonstrated impressive predictive prowess in research settings, it is imperative to establish their practicality and effectiveness in real-world clinical scenarios. To cultivate trust and acceptance among medical practitioners, it is essential to enhance the interpretability of these models. This can be achieved by prioritizing simplicity in experimental design or by employing tools that enhance model interpretability. Last but not least, the privacy and ethical implications of big biological data should be emphasized, and patient data should be protected.
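As a concrete reminder of why a single metric can mislead, the sketch below derives accuracy, sensitivity, specificity, precision, and F1 score from one confusion matrix. The labels and predictions are invented for illustration, and the function name is hypothetical.

```python
def classification_metrics(labels, predictions):
    """Threshold-dependent metrics from binary labels and binary predictions."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),   # recall, true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "precision": ratio(tp, tp + fp),     # positive predictive value
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Hypothetical example: 4 RA patients (1) and 6 controls (0).
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predictions = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = classification_metrics(labels, predictions)
```

In this toy example accuracy is 0.80 while sensitivity is 0.75, so one in four patients is missed despite a seemingly good headline figure. With more imbalanced classes the divergence between metrics can be considerably larger.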

Statements

Author contributions

YMS: Data curation, Visualization, Writing – original draft. MZ: Data curation, Formal analysis, Writing – review & editing. CC: Data curation, Formal analysis, Writing – review & editing. PJ: Data curation, Formal analysis, Writing – review & editing. KW: Data curation, Formal analysis, Writing – review & editing. JZ: Data curation, Formal analysis, Writing – review & editing. YS: Data curation, Formal analysis, Writing – review & editing. YZ: Data curation, Formal analysis, Writing – review & editing. FZ: Data curation, Formal analysis, Writing – review & editing. XL: Data curation, Formal analysis, Writing – review & editing. SG: Conceptualization, Writing – review & editing. FW: Supervision, Writing – review & editing. DH: Funding acquisition, Supervision, Writing – review & editing.

Funding

The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This work was funded by the National Natural Science Funds of China (82074234, 82004166 and 82071756), Shanghai Chinese Medicine Development Office, National Administration of Traditional Chinese Medicine, Regional Chinese Medicine (Specialist) Diagnosis and Treatment Center Construction Project-Rheumatology, State Administration of Traditional Chinese Medicine, Shanghai Municipal Health Commission, East China Region-based Chinese and Western Medicine Joint Disease Specialist Alliance, and Shanghai He Dongyi Famous Chinese Medicine Studio Construction Project (SHGZS-202220).

Acknowledgments

Figure 1 was created by Figdraw (www.figdraw.com).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

  • 1

    Cross M, Smith E, Hoy D, Carmona L, Wolfe F, Vos T, et al. The global burden of rheumatoid arthritis: estimates from the global burden of disease 2010 study. Ann Rheum Dis. (2014) 73:1316–22. doi: 10.1136/annrheumdis-2013-204627

  • 2

    Johnson TM, Sayles HR, Baker JF, George MD, Roul P, Zheng C, et al. Investigating changes in disease activity as a mediator of cardiovascular risk reduction with methotrexate use in rheumatoid arthritis. Ann Rheum Dis. (2021) 80:1385–92. doi: 10.1136/annrheumdis-2021-220125

  • 3

    Redente EF, Aguilar MA, Black BP, Edelman BL, Bahadur AN, Humphries SM, et al. Nintedanib reduces pulmonary fibrosis in a model of rheumatoid arthritis-associated interstitial lung disease. Am J Physiol Lung Cell Mol Physiol. (2018) 314:L998–L1009. doi: 10.1152/ajplung.00304.2017

  • 4

    Ng KJ, Huang KY, Tung CH, Hsu BB, Wu CH, Koo M, et al. Modified rheumatoid arthritis impact of disease (RAID) score, a potential tool for depression and anxiety screening for rheumatoid arthritis. Joint Bone Spine. (2019) 86:805–7. doi: 10.1016/j.jbspin.2019.04.007

  • 5

    Libbrecht MW, Noble WS. Machine learning applications in genetics and genomics. Nat Rev Genet. (2015) 16:321–32. doi: 10.1038/nrg3920

  • 6

    Jordan MI, Mitchell TM. Machine learning: Trends, perspectives, and prospects. Science. (2015) 349:255–60. doi: 10.1126/science.aaa8415

  • 7

    Raghupathi W, Raghupathi V. Big data analytics in healthcare: promise and potential. Health Inf Sci Syst. (2014) 2:3. doi: 10.1186/2047-2501-2-3

  • 8

    Butler KT, Davies DW, Cartwright H, Isayev O, Walsh A. Machine learning for molecular and materials science. Nature. (2018) 559:547–55. doi: 10.1038/s41586-018-0337-2

  • 9

    Coffey CM, Crowson CS, Myasoedova E, Matteson EL, Davis JM 3rd. Evidence of diagnostic and treatment delay in seronegative rheumatoid arthritis: missing the window of opportunity. Mayo Clin Proc. (2019) 94:2241–8. doi: 10.1016/j.mayocp.2019.05.023

  • 10

    Conigliaro P, Triggianese P, De Martino E, Fonti GL, Chimenti MS, Sunzini F, et al. Challenges in the treatment of rheumatoid arthritis. Autoimmun Rev. (2019) 18:706–13. doi: 10.1016/j.autrev.2019.05.007

  • 11

    Zhao J, Guo S, Schrodi SJ, He D. Molecular and cellular heterogeneity in rheumatoid arthritis: mechanisms and clinical implications. Front Immunol. (2021) 12:790122. doi: 10.3389/fimmu.2021.790122

  • 12

    Lo-Ciganic WH, Huang JL, Zhang HH, Weiss JC, Wu Y, Kwoh CK, et al. Evaluation of machine-learning algorithms for predicting opioid overdose risk among medicare beneficiaries with opioid prescriptions. JAMA Netw Open. (2019) 2:e190968. doi: 10.1001/jamanetworkopen.2019.0968

  • 13

    Warnat-Herresthal S, Schultze H, Shastry KL, Manamohan S, Mukherjee S, Garg V, et al. Swarm Learning for decentralized and confidential clinical machine learning. Nature. (2021) 594:265–70. doi: 10.1038/s41586-021-03583-3

  • 14

    Goodswen SJ, Barratt JLN, Kennedy PJ, Kaufer A, Calarco L, Ellis JT. Machine learning and applications in microbiology. FEMS Microbiol Rev. (2021) 45:fuab015. doi: 10.1093/femsre/fuab015

  • 15

    Jiang T, Gradus JL, Rosellini AJ. Supervised machine learning: A brief primer. Behav Ther. (2020) 51:675–87. doi: 10.1016/j.beth.2020.05.002

  • 16

    Gitto S, Cuocolo R, Annovazzi A, Anelli V, Acquasanta M, Cincotta A, et al. CT radiomics-based machine learning classification of atypical cartilaginous tumours and appendicular chondrosarcomas. EBioMedicine. (2021) 68:103407. doi: 10.1016/j.ebiom.2021.103407

  • 17

    Kulin M, Fortuna C, De Poorter E, Deschrijver D, Moerman I. Data-driven design of intelligent wireless networks: an overview and tutorial. Sensors (Basel). (2016) 16:790. doi: 10.3390/s16060790

  • 18

    Williamson DJ, Burn GL, Simoncelli S, Griffié J, Peters R, Davis DM, et al. Machine learning for cluster analysis of localization microscopy data. Nat Commun. (2020) 11:1493. doi: 10.1038/s41467-020-15293-x

  • 19

    Gao T, Lu W. Machine learning toward advanced energy storage devices and systems. iScience. (2020) 24:101936. doi: 10.1016/j.isci.2020.101936

  • 20

    Bajić F, Orel O, Habijan M. A multi-purpose shallow convolutional neural network for chart images. Sensors (Basel). (2022) 22:7695. doi: 10.3390/s22207695

  • 21

    Schwendicke F, Samek W, Krois J. Artificial intelligence in dentistry: chances and challenges. J Dent Res. (2020) 99:769–74. doi: 10.1177/0022034520915714

  • 22

    Peng H, Fan Y. Feature selection by optimizing a lower bound of conditional mutual information. Inf Sci (N Y). (2017) 418-419:652–67. doi: 10.1016/j.ins.2017.08.036

  • 23

    Yang L, Jiang H, Ding X, Liao Z, Wei M, Li J, et al. Modulation of sleep architecture by whole-body static magnetic exposure: A study based on EEG-based automatic sleep staging. Int J Environ Res Public Health. (2022) 19:741. doi: 10.3390/ijerph19020741

  • 24

    Tasci E, Jagasia S, Zhuge Y, Sproull M, Cooley Zgela T, Mackey M, et al. RadWise: A rank-based hybrid feature weighting and selection method for proteomic categorization of chemoirradiation in patients with glioblastoma. Cancers (Basel). (2023) 15:2672. doi: 10.3390/cancers15102672

  • 25

    Liang Y, Zhang ZQ, Liu NN, Wu YN, Gu CL, Wang YL. MAGCNSE: predicting lncRNA-disease associations using multi-view attention graph convolutional network and stacking ensemble model. BMC Bioinf. (2022) 23:189. doi: 10.1186/s12859-022-04715-w

  • 26

    Chen Y, Luo M, Cheng Y, Huang Y, He Q. A nomogram to predict prolonged stay of obesity patients with sepsis in ICU: Relevancy for predictive, personalized, preventive, and participatory healthcare strategies. Front Public Health. (2022) 10:944790. doi: 10.3389/fpubh.2022.944790

  • 27

    Lim AJW, Tyniana CT, Lim LJ, Tan JWL, Koh ET, TTSH Rheumatoid Arthritis Study Group, et al. Robust SNP-based prediction of rheumatoid arthritis through machine-learning-optimized polygenic risk score. J Transl Med. (2023) 21:92. doi: 10.1186/s12967-023-03939-5

  • 28

    Lufkin L, Budišić M, Mondal S, Sur S. A bayesian model to analyze the association of rheumatoid arthritis with risk factors and their interactions. Front Public Health. (2021) 9:693830. doi: 10.3389/fpubh.2021.693830

  • 29

    Pratt AG, Swan DC, Richardson S, Wilson G, Hilkens CM, Young DA, et al. A CD4 T cell gene signature for early rheumatoid arthritis implicates interleukin 6-mediated STAT3 signalling, particularly in anti-citrullinated peptide antibody-negative disease. Ann Rheum Dis. (2012) 71:1374–81. doi: 10.1136/annrheumdis-2011-200968

  • 30

    de la Calle-Fabregat C, Niemantsverdriet E, Cañete JD, Li T, van der Helm-van Mil AHM, Rodríguez-Ubreva J, et al. Prediction of the progression of undifferentiated arthritis to rheumatoid arthritis using DNA methylation profiling. Arthritis Rheumatol. (2021) 73:2229–39. doi: 10.1002/art.41885

  • 31

    Üreten K, Maraş HH. Automated classification of rheumatoid arthritis, osteoarthritis, and normal hand radiographs with deep learning methods. J Digit Imaging. (2022) 35:193–9. doi: 10.1007/s10278-021-00564-w

  • 32

    Wu M, Wu H, Wu L, Cui C, Shi S, Xu J, et al. A deep learning classification of metacarpophalangeal joints synovial proliferation in rheumatoid arthritis by ultrasound images. J Clin Ultrasound. (2022) 50:296–301. doi: 10.1002/jcu.23143

  • 33

    Alarcón-Paredes A, Guzmán-Guzmán IP, Hernández-Rosales DE, Navarro-Zarza JE, Cantillo-Negrete J, Cuevas-Valencia RE, et al. Computer-aided diagnosis based on hand thermal, RGB images, and grip force using artificial intelligence as screening tool for rheumatoid arthritis in women. Med Biol Eng Comput. (2021) 59:287–300. doi: 10.1007/s11517-020-02294-7

  • 34

    XiaoJWangRCaiXYeZ. Coupling of co-expression network analysis and machine learning validation unearthed potential key genes involved in rheumatoid arthritis. Front Genet. (2021) 12:604714. doi: 10.3389/fgene.2021.604714

  • 35

    LiuYJiangHKangTShiXLiuXLiCet al. Platelets-related signature based diagnostic model in rheumatoid arthritis using WGCNA and machine learning. Front Immunol. (2023) 14:1204652. doi: 10.3389/fimmu.2023.1204652

  • 36

    YeoLAdlardNBiehlMJuarezMSmallieTSnowMet al. Expression of chemokines CXCL4 and CXCL7 by synovial macrophages defines an early stage of rheumatoid arthritis. Ann Rheum Dis. (2016) 75:763–71. doi: 10.1136/annrheumdis-2014-206921

  • 37

    GengQCaoXFanDGuXZhangQZhangMet al. Diagnostic gene signatures and aberrant pathway activation based on m6A methylation regulators in rheumatoid arthritis. Front Immunol. (2022) 13:1041284. doi: 10.3389/fimmu.2022.1041284

  • 38

    LuanHGuWLiHWangZLuLKeMet al. Serum metabolomic and lipidomic profiling identifies diagnostic biomarkers for seropositive and seronegative rheumatoid arthritis patients. J Transl Med. (2021) 19:500. doi: 10.1186/s12967-021-03169-7

  • 39

    HanPHouCZhengXCaoLShiXZhangXet al. Serum antigenome profiling reveals diagnostic models for rheumatoid arthritis. Front Immunol. (2022) 13:884462. doi: 10.3389/fimmu.2022.884462

  • 40

    VolkovaARugglesKV. Predictive metagenomic analysis of autoimmune disease identifies robust autoimmunity and disease specific microbial signatures. Front Microbiol. (2021) 12:621310. doi: 10.3389/fmicb.2021.621310

  • 41

    OrmsethMJSolusJFShengQYeFWuQGuoYet al. Development and validation of a microRNA panel to differentiate between patients with rheumatoid arthritis or systemic lupus erythematosus and controls. J Rheumatol. (2020) 47:188–96. doi: 10.3899/jrheum.181029

  • 42

    MehtaBGoodmanSDiCarloEJannat-KhahDGibbonsJABOteroMet al. Machine learning identification of thresholds to discriminate osteoarthritis and rheumatoid arthritis synovial inflammation. Arthritis Res Ther. (2023) 25:31. doi: 10.1186/s13075-023-03008-8

  • 43

    OrangeDEAgiusPDiCarloEFRobineNGeigerHSzymonifkaJet al. Identification of three rheumatoid arthritis disease subtypes by machine learning integration of synovial histologic features and RNA sequencing data. Arthritis Rheumatol. (2018) 70:690–701. doi: 10.1002/art.40428

  • 44

    JooYBKimYParkYKimKRyuJALeeSet al. Biological function integrated prediction of severe radiographic progression in rheumatoid arthritis: a nested case control study. Arthritis Res Ther. (2017) 19:244. doi: 10.1186/s13075-017-1414-x

  • 45

    ChristensenABHJustSAAndersenJKHSavarimuthuTR. Applying cascaded convolutional neural network design further enhances automatic scoring of arthritis disease activity on ultrasound images from rheumatoid arthritis patients. Ann Rheum Dis. (2020) 79:1189–93. doi: 10.1136/annrheumdis-2019-216636

  • 46

    VodencarevicATascilarKHartmannFReiserMHueberAJHaschkaJet al. Advanced machine learning for predicting individual risk of flares in rheumatoid arthritis patients tapering biologic drugs. Arthritis Res Ther. (2021) 23:67. doi: 10.1186/s13075-021-02439-5

  • 47

    O'NeilLJHuPLiuQIslamMMSpicerVRechJet al. Proteomic approaches to defining remission and the risk of relapse in rheumatoid arthritis. Front Immunol. (2021) 12:729681. doi: 10.3389/fimmu.2021.729681

  • 48

    NorgeotBGlicksbergBSTrupinLLituievDGianfrancescoMOskotskyBet al. Assessment of a deep learning model based on electronic health record data to forecast clinical outcomes in patients with rheumatoid arthritis. JAMA Netw Open. (2019) 2:e190606. doi: 10.1001/jamanetworkopen.2019.0606

  • 49

    FeldmanCHYoshidaKXuCFritsMLShadickNAWeinblattMEet al. Supplementing claims data with electronic medical records to improve estimation and classification of rheumatoid arthritis disease activity: A machine learning approach. ACR Open Rheumatol. (2019) 1:552–9. doi: 10.1002/acr2.11068

  • 50

    ChandranURepsJStangPERyanPB. Inferring disease severity in rheumatoid arthritis using predictive modeling in administrative claims databases. PloS One. (2019) 14:e0226255. doi: 10.1371/journal.pone.0226255

  • 51

    ArtachoAIsaacSNayakRFlor-DuroAAlexanderMKooIet al. The pretreatment gut microbiome is associated with lack of response to methotrexate in new-onset rheumatoid arthritis. Arthritis Rheumatol. (2021) 73:931–42. doi: 10.1002/art.41622

  • 52

    DuquesneJBougetVCournèdePHFautrelBGuilleminFde JongPHPet al. Machine learning identifies a profile of inadequate responder to methotrexate in rheumatoid arthritis. Rheumatol (Oxford). (2023) 62:2402–9. doi: 10.1093/rheumatology/keac645

  • 53

    LimAJWLimLJOoiBNSKohETTanJWL, TTSH RA Study Group, et al. Functional coding haplotypes and machine-learning feature elimination identifies predictors of Methotrexate Response in Rheumatoid Arthritis patients. EBioMedicine. (2022) 75:103800. doi: 10.1016/j.ebiom.2021.103800

  • 54

    PlantDMaciejewskiMSmithSNairN, Maximising Therapeutic Utility in Rheumatoid Arthritis Consortium, the RAMS Study Group, HyrichKet al. Profiling of gene expression biomarkers as a classifier of methotrexate nonresponse in patients with rheumatoid arthritis. Arthritis Rheumatol. (2019) 71:678–84. doi: 10.1002/art.40810

  • 55

    Luque-TévarMPerez-SanchezCPatiño-TrivesAMBarbarrojaNArias de la RosaIAbalos-AguileraMCet al. Integrative clinical, molecular, and computational analysis identify novel biomarkers and differential profiles of anti-TNF response in rheumatoid arthritis. Front Immunol. (2021) 12:631662. doi: 10.3389/fimmu.2021.631662

  • 56

    GuanYZhangHQuangDWangZParkerSCJPappasDAet al. Machine learning to predict anti-tumor necrosis factor drug responses of rheumatoid arthritis patients by integrating clinical and genetic markers. Arthritis Rheumatol. (2019) 71:1987–96. doi: 10.1002/art.41056

  • 57

    KimKJKimMAdamopoulosIETagkopoulosI. Compendium of synovial signatures identifies pathologic characteristics for predicting treatment response in rheumatoid arthritis patients. Clin Immunol. (2019) 202:1–10. doi: 10.1016/j.clim.2019.03.002

  • 58

    MiyoshiFHonneKMinotaSOkadaMOgawaNMimuraT. A novel method predicting clinical response using only background clinical data in RA patients before treatment with infliximab. Mod Rheumatol. (2016) 26:813–6. doi: 10.3109/14397595.2016.1168536

  • 59

    YoosufNMaciejewskiMZiemekDJelinskySAFolkersenLMüllerMet al. Early prediction of clinical response to anti-TNF treatment using multi-omics and machine learning in rheumatoid arthritis. Rheumatol (Oxford). (2022) 61:1680–9. doi: 10.1093/rheumatology/keab521

  • 60

    BougetVDuquesneJHasslerSCournèdePHFautrelBGuilleminFet al. Machine learning predicts response to TNF inhibitors in rheumatoid arthritis: results on the ESPOIR and ABIRISK cohorts. RMD Open. (2022) 8:e002442. doi: 10.1136/rmdopen-2022-002442

  • 61

    RivelleseFSuraceAEAGoldmannKSciaccaEÇubukCGiorliGet al. Rituximab versus tocilizumab in rheumatoid arthritis: synovial biopsy-based biomarker analysis of the phase 4 R4RA randomized trial. Nat Med. (2022) 28:1256–68. doi: 10.1038/s41591-022-01789-0

  • 62

    KooBSEunSShinKYoonHHongCKimDHet al. Machine learning model for identifying important clinical features for predicting remission in patients with rheumatoid arthritis treated with biologics. Arthritis Res Ther. (2021) 23:178. doi: 10.1186/s13075-021-02567-y

  • 63

    LeeSKangSEunYWonHHKimHLeeJet al. Machine learning-based prediction model for responses of bDMARDs in patients with rheumatoid arthritis and ankylosing spondylitis. Arthritis Res Ther. (2021) 23:254. doi: 10.1186/s13075-021-02635-3

  • 64

    Novella-NavarroMBenaventDRuiz-EsquideVTorneroCDíaz-AlmirónMChacurCAet al. Predictive model to identify multiple failure to biological therapy in patients with rheumatoid arthritis. Ther Adv Musculoskelet Dis. (2022) 14:1759720X221124028. doi: 10.1177/1759720X221124028

  • 65

    ChenRHuangQChenL. Development and validation of machine learning models for prediction of fracture risk in patients with elderly-onset rheumatoid arthritis. Int J Gen Med. (2022) 15:7817–29. doi: 10.2147/IJGM.S380197

  • 66

    LeeCJooGShinSImHMoonKW. Prediction of osteoporosis in patients with rheumatoid arthritis using machine learning. Sci Rep. (2023) 13:21800. doi: 10.1038/s41598-023-48842-7

  • 67

    LiuFHuangYLiuFWangH. Identification of immune-related genes in diagnosing atherosclerosis with rheumatoid arthritis through bioinformatics analysis and machine learning. Front Immunol. (2023) 14:1126647. doi: 10.3389/fimmu.2023.1126647

  • 68

    WeiTYangBLiuHXinFFuL. Development and validation of a nomogram to predict coronary heart disease in patients with rheumatoid arthritis in northern China. Aging (Albany NY). (2020) 12:3190–204. doi: 10.18632/aging.v12i4

  • 69

    QinYWangYMengFFengMZhaoXGaoCet al. Identification of biomarkers by machine learning classifiers to assist diagnose rheumatoid arthritis-associated interstitial lung disease. Arthritis Res Ther. (2022) 24:115. doi: 10.1186/s13075-022-02800-2

  • 70

    KarlsonEWvan SchaardenburgDvan der Helm-van MilAH. Strategies to predict rheumatoid arthritis development in at-risk populations. Rheumatol (Oxford). (2016) 55:6–15. doi: 10.1093/rheumatology/keu287

  • 71

    BurgersLERazaKvan der Helm-van MilAH. Window of opportunity in rheumatoid arthritis - definitions and supporting evidence: from old to new perspectives. RMD Open. (2019) 5:e000870. doi: 10.1136/rmdopen-2018-000870

  • 72

    HazlewoodGSBarnabeCTomlinsonGMarshallDDevoeDJBombardierC. Methotrexate monotherapy and methotrexate combination therapy with traditional and biologic disease modifying anti-rheumatic drugs for rheumatoid arthritis: A network meta-analysis. Cochrane Database Syst Rev. (2016) 2016:CD010227. doi: 10.1002/14651858.CD010227.pub2

  • 73

    NahonPBamba-FunckJLayeseRTrépoEZucman-RossiJCagnotCet al. Integrating genetic variants into clinical models for hepatocellular carcinoma risk stratification in cirrhosis. J Hepatol. (2023) 78:584–95. doi: 10.1016/j.jhep.2022.11.003

  • 74

    MartinARKanaiMKamataniYOkadaYNealeBMDalyMJ. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat Genet. (2019) 51:584–91. doi: 10.1038/s41588-019-0379-x

  • 75

    RuanYLinYFFengYAChenCYLamMGuoZet al. Improving polygenic prediction in ancestrally diverse populations. Nat Genet. (2022) 54:573–80. doi: 10.1038/s41588-022-01054-7

  • 76

    HaoLKraftPBerrizGFHynesEDKochCKumarPKVet al. Development of a clinical polygenic risk score assay and reporting workflow. Nat Med. (2022) 28:1006–13. doi: 10.1038/s41591-022-01767-6

  • 77

    LiHCaiZWangJTangJDingWLinCTet al. FedTP: federated learning by transformer personalization. IEEE Trans Neural Netw Learn Syst. (2023). doi: 10.1109/TNNLS.2023.3269062

  • 78

    GuXSabrinaFFanZSohailS. A review of privacy enhancement methods for federated learning in healthcare systems. Int J Environ Res Public Health. (2023) 20:6539. doi: 10.3390/ijerph20156539

  • 79

    HaggenmüllerSSchmittMKrieghoff-HenningEHeklerAMaronRCWiesCet al. Federated learning for decentralized artificial intelligence in melanoma diagnostics. JAMA Dermatol. (2024) 160:303–11. doi: 10.1001/jamadermatol.2023.5550

  • 80

    van den BergROhrndorfSKortekaasMCvan der Helm-van MilAHM. What is the value of musculoskeletal ultrasound in patients presenting with arthralgia to predict inflammatory arthritis development? A systematic literature review. Arthritis Res Ther. (2018) 20:228. doi: 10.1186/s13075-018-1715-8

  • 81

    JoJTianCXuGSarazinJSchiopuEGandikotaGet al. Photoacoustic tomography for human musculoskeletal imaging and inflammatory arthritis detection. Photoacoustics. (2018) 12:82–9. doi: 10.1016/j.pacs.2018.07.004

  • 82

    MadaniAArnaoutRMofradMArnaoutR. Fast and accurate view classification of echocardiograms using deep learning. NPJ Digit Med. (2018) 1:6. doi: 10.1038/s41746-017-0013-1

  • 83

    ChenDLiuSKingsburyPSohnSStorlieCBHabermannEBet al. Deep learning and alternative learning strategies for retrospective real-world clinical data. NPJ Digit Med. (2019) 2:43. doi: 10.1038/s41746-019-0122-0

  • 84

    LeiYTianYShanHZhangJWangGKalraMK. Shape and margin-aware lung nodule classification in low-dose CT images via soft activation mapping. Med Image Anal. (2020) 60:101628. doi: 10.1016/j.media.2019.101628

  • 85

    RynazalRFujisawaKShiromaHSalimFMizutaniSShibaSet al. Leveraging explainable AI for gut microbiome-based colorectal cancer classification. Genome Biol. (2023) 24:21. doi: 10.1186/s13059-023-02858-4

  • 86

    LeeWYLeeYLeeSKimYWKimJH. A machine learning approach for recommending herbal formulae with enhanced interpretability and applicability. Biomolecules. (2022) 12:1604. doi: 10.3390/biom12111604

  • 87

    LeeYGChoiSCKangYKimKMKangCSKimC. Constructing a reference genome in a single lab: the possibility to use oxford nanopore technology. Plants (Basel). (2019) 8:270. doi: 10.3390/plants8080270

  • 88

    SunYChenBRDeshpandeA. Epigenetic regulators in the development, maintenance, and therapeutic targeting of acute myeloid leukemia. Front Oncol. (2018) 8:41. doi: 10.3389/fonc.2018.00041

  • 89

    Rodríguez-MolinaJBWestSPassmoreLA. Knowing when to stop: Transcription termination on protein-coding genes by eukaryotic RNAPII. Mol Cell. (2023) 83:404–15. doi: 10.1016/j.molcel.2022.12.021

  • 90

    GravesPRHaysteadTA. Molecular biologist's guide to proteomics. Microbiol Mol Biol Rev. (2002) 66:39–63. doi: 10.1128/MMBR.66.1.39-63.2002

  • 91

    GuoHGuoHZhangLTangZYuXWuJet al. Metabolome and transcriptome association analysis reveals dynamic regulation of purine metabolism and flavonoid synthesis in transdifferentiation during somatic embryogenesis in cotton. Int J Mol Sci. (2019) 20:2070. doi: 10.3390/ijms20092070

  • 92

    SmeekensSPHuttenhowerCRizaAvan de VeerdonkFLZeeuwenPLSchalkwijkJet al. Skin microbiome imbalance in patients with STAT1/STAT3 defects impairs innate host defense responses. J Innate Immun. (2014) 6:253–62. doi: 10.1159/000351912

  • 93

    TarazonaSBalzano-NogueiraLGómez-CabreroDSchmidtAImhofAHankemeierTet al. Harmonization of quality metrics and power calculation in multi-omic studies. Nat Commun. (2020) 11:3092. doi: 10.1038/s41467-020-16937-8

  • 94

    YiDBayerTBadenhorstCPSWuSDoerrMHöhneMet al. Recent trends in biocatalysis. Chem Soc Rev. (2021) 50:8003–49. doi: 10.1039/D0CS01575J

  • 95

    BrownMVMcDunnJEGunstPRSmithEMMilburnMVTroyerDAet al. Cancer detection and biopsy classification using concurrent histopathological and metabolomic analysis of core biopsies. Genome Med. (2012) 4:33. doi: 10.1186/gm332

  • 96

    YangSHollisterAMOrchardEAChaudherySIOstaninDVLokitzSJet al. Quantification of bone changes in a collagen-induced arthritis mouse model by reconstructed three dimensional micro-CT. Biol Proced Online. (2013) 15:8. doi: 10.1186/1480-9222-15-8

  • 97

    LiaoKPKurreemanFLiGDuclosGMurphySGuzmanRet al. Associations of autoantibodies, autoimmune risk alleles, and clinical diagnoses from the electronic medical records in rheumatoid arthritis cases and non-rheumatoid arthritis controls. Arthritis Rheumatol. (2013) 65:571–81. doi: 10.1002/art.37801

  • 98

    KurreemanFLiaoKChibnikLHickeyBStahlEGainerVet al. Genetic basis of autoantibody positive and negative rheumatoid arthritis risk in a multi-ethnic cohort derived from electronic health records. Am J Hum Genet. (2011) 88:57–69. doi: 10.1016/j.ajhg.2010.12.007

  • 99

    LiHGuanY. Multilevel modeling of joint damage in rheumatoid arthritis. Adv Intell Syst. (2022) 4:2200184. doi: 10.1002/aisy.202200184

  • 100

    SunDNguyenTMAllawayRJWangJChungVYuTVet al. RA2-DREAM challenge community. A crowdsourcing approach to develop machine learning models to quantify radiographic joint damage in rheumatoid arthritis. JAMA Netw Open. (2022) 5:e2227423. doi: 10.1001/jamanetworkopen.2022.27423

  • 101

    FiorentinoMCCipollettaEFilippucciEGrassiWFrontoniEMocciaS. A deep-learning framework for metacarpal-head cartilage-thickness estimation in ultrasound rheumatological images. Comput Biol Med. (2022) 141:105117. doi: 10.1016/j.compbiomed.2021.105117

  • 102

    AndersenJKHPedersenJSLaursenMSHoltzKGrauslundJSavarimuthuTRet al. Neural networks for automatic scoring of arthritis disease activity on ultrasound images. RMD Open. (2019) 5:e000891. doi: 10.1136/rmdopen-2018-000891

  • 103

    SinghJAHossainAMudanoASTanjong GhogomuESuarez-AlmazorMEBuchbinderRet al. Biologics or tofacitinib for people with rheumatoid arthritis naive to methotrexate: a systematic review and network meta-analysis. Cochrane Database Syst Rev. (2017) 5:CD012657. doi: 10.1002/14651858

  • 104

    BluettJRiba-GarciaIVerstappenSMMWendlingTOgungbenroKUnwinRDet al. Development and validation of a methotrexate adherence assay. Ann Rheum Dis. (2019) 78:1192–7. doi: 10.1136/annrheumdis-2019-215446

  • 105

    KalweitMBurdenAMBoedeckerJHügleTBurkardT. Patient groups in Rheumatoid arthritis identified by deep learning respond differently to biologic or targeted synthetic DMARDs. PloS Comput Biol. (2023) 19:e1011073. doi: 10.1371/journal.pcbi.1011073

  • 106

    JainSEadonMT. Spatial transcriptomics in health and disease. Nat Rev Nephrol. (2024). doi: 10.1038/s41581-024-00841-1

  • 107

    WuHDixonEEXuanyuanQGuoJYoshimuraYDebashishCet al. High resolution spatial profiling of kidney injury and repair using RNA hybridization-based in situ sequencing. Nat Commun. (2024) 15:1396. doi: 10.1038/s41467-024-45752-8

  • 108

    KiesslingPKuppeC. Spatial multi-omics: novel tools to study the complexity of cardiovascular diseases. Genome Med. (2024) 16:14. doi: 10.1186/s13073-024-01282-y

Keywords

ML, rheumatoid arthritis, precision medicine, diagnosis, treatment

Citation

Shi Y, Zhou M, Chang C, Jiang P, Wei K, Zhao J, Shan Y, Zheng Y, Zhao F, Lv X, Guo S, Wang F and He D (2024) Advancing precision rheumatology: applications of machine learning for rheumatoid arthritis management. Front. Immunol. 15:1409555. doi: 10.3389/fimmu.2024.1409555

Received

30 March 2024

Accepted

24 May 2024

Published

10 June 2024

Volume

15 - 2024

Edited by

Xu-jie Zhou, Peking University, China

Reviewed by

Hiufung Yip, Hong Kong Baptist University, Hong Kong SAR, China

Miha Lavric, University of Maribor, Slovenia

Copyright

*Correspondence: Dongyi He, ; Fubo Wang,

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
