Predictive risk factors before the onset of familial rheumatoid arthritis: the Tatarstan cohort study

Background: A familial history of rheumatoid arthritis (RA) predisposes an individual to develop RA. This study aimed to investigate factors associated with conversion to RA in the Tatarstan cohort. Methods: A total of 144 individuals at risk for familial RA, referred to as pre-RA, were selected 2 years (range: 2–21 years) before conversion to RA and compared with 328 first-degree relatives (FDR) of RA patients who had not converted after ≥2 years of follow-up and with 355 healthy controls (HC). Preclinical parameters and socio-demographic, individual, and HLA genetic factors were analyzed when data were available at the time of enrollment. Results: Compared with the FDR and HC groups, pre-RA individuals were characterized before conversion to RA by the presence of arthralgia, severe morning symptoms, a lower educational level, and rural location. The HLA-DRB1 shared epitope (SE) risk factor was also associated with symmetrical arthralgia and passive smoking. In contrast, alcohol consumption and childlessness in women were protective and associated with the HLA-DRB1*07:01 allele. Conclusion: Before RA onset, a combination of individual and genetic factors characterized those at risk of progressing to RA among relatives of patients with familial RA.


1. Introduction
Susceptibility to rheumatoid arthritis (RA) involves genetic and environmental factors, which explain a 2- to 5-fold familial risk of developing RA as compared to the general population (1,2). In this high-risk population for future development of RA, preclinical symptoms are frequently present, including arthralgia, swelling, and morning stiffness, which are used to define clinically suspect arthralgia (CSA) (3). However, many questions remain since not all relatives of RA patients will develop RA, including those with CSA, which supports the possible involvement of additional parameters such as socio-demographic factors (e.g., age, sex, pregnancy, and education level), individual behavior factors (e.g., active/passive tobacco smoking), and genetic factors [e.g., human leukocyte antigen (HLA)].
Located in Russia at the crossroads of the East and West, Tatarstan has a population characterized by an elevated risk for RA development among first-degree relatives (FDR) of RA patients (9.1 cases/1,000/year) (4). In a previous study, 26 FDR women who developed RA were studied at different time points, showing a relationship between joint symptoms and subsequent susceptibility to upper respiratory tract infection events in the 3 years preceding RA onset (4). In this cohort, herpes simplex virus (HSV) reactivation and oral microbiome changes also characterize familial RA onset (5,6). To proceed further with the analysis of this cohort, 144 RA patients with a familial RA history were included when a rheumatological evaluation and information regarding socio-demographic and individual behavior factors had been recorded at least 2 years before RA onset. Biological factors were tested a posteriori from available samples. Next, pre-RA individuals with familial RA were compared with FDR without RA evolution and healthy controls (HC).
2. Materials and methods

2.1. Subjects
Since the initiation of the cohort in 1997 and as previously described (4), a rheumatological pre-clinical evaluation and a questionnaire regarding socio-demographic and individual behavior factors have been proposed to FDR of RA patients (i.e., children, siblings, and parents), and such evaluation has been extended to second-degree relatives (SDR: grandparents, aunts, uncles, nieces, and nephews) in multicase RA families. Among them, 144 pre-RA individuals were selected based on two criteria: (i) fulfilling the 2010 American College of Rheumatology (ACR)/European League Against Rheumatism (EULAR) criteria at diagnosis (7) or, before 2010, a consensus diagnosis made by three experienced rheumatologists; and (ii) an evaluation conducted ≥2 years before the onset of RA.
Control groups were composed of familial RA individuals who had not progressed to RA (FDR, n = 328) after ≥2 years of follow-up and healthy controls (HC, n = 355). HC included subjects without any signs of chronic disease and no autoinflammatory/autoimmune diseases among first- and second-degree relatives. The study was approved by the Ethical Committee of the Kazan State Medical Academy, Kazan, Russia (Permit no. 1/2002). Consent to conduct studies and to allow publication of the results was received from all the individuals involved in the study according to the legal requirements in Russia.

2.2. Clinical, environmental, and serological factors
At the inclusion visit, joint symptoms were evaluated by a rheumatologist, and the evaluation was completed with magnetic resonance imaging (MRI) in the case of joint symptoms (pain and morning stiffness) in the small joints of the feet and hands. The seven criteria that define CSA were collected from medical records to evaluate CSA status (positivity ≥3): arthralgia (≤1 year), metacarpophalangeal (MCP) arthralgia, morning stiffness ≥60 min, severe symptoms present in the morning, difficulty with making a fist (assessed by testing the strength and the ability to completely close the fist), positive MCP squeeze test, and the presence of FDR with RA (3). In some cases, due to the design of the study that included RA relatives, the "presence of FDR with RA" was not included as a criterion in the CSA score, and this was specified. Before the formulation of the CSA criteria in 2017 and as presented in Table 2, the non-standard assessment of pre-RA activity was performed using 11 criteria that were used a posteriori to evaluate CSA status or were referred to as non-CSA criteria to be included in the analysis (small/large/upper and lower/symmetrical/lower limb arthralgia) (8-12). Moreover, parameters collected at inclusion comprised socio-demographic factors (age, sex, rural residence, childlessness, and education level) and individual behavior factors (fish/alcohol/coffee consumption and active/passive/no tobacco smoking). For statistical purposes, education was dichotomized into low educational level (secondary and high school graduates) and high educational level (university graduates). When serum was available at the time of inclusion, anti-CCP and IgM RF were tested.
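The scoring rule described above (seven criteria, positivity at ≥3, with the option of leaving out the "presence of FDR with RA" criterion) can be sketched as follows. This is a minimal illustration and not the study's own code: the field names and the example record are hypothetical, and a missing criterion is assumed to count as absent.

```python
# The seven CSA criteria listed in the text; the key names are hypothetical labels.
CSA_CRITERIA = (
    "arthralgia_le_1_year",
    "mcp_arthralgia",
    "morning_stiffness_ge_60_min",
    "most_severe_in_morning",
    "difficulty_making_fist",
    "positive_mcp_squeeze_test",
    "fdr_with_ra",
)


def csa_score(record, include_fdr=True):
    """Count the CSA criteria present in one record (a dict of booleans).

    Assumption: a criterion missing from the record counts as absent.
    """
    criteria = [c for c in CSA_CRITERIA if include_fdr or c != "fdr_with_ra"]
    return sum(bool(record.get(c, False)) for c in criteria)


def is_csa_positive(record, include_fdr=True, threshold=3):
    """CSA status is positive when at least `threshold` criteria are met."""
    return csa_score(record, include_fdr) >= threshold


# Hypothetical relative of an RA patient with two joint criteria.
relative = {"arthralgia_le_1_year": True, "mcp_arthralgia": True, "fdr_with_ra": True}
print(csa_score(relative), is_csa_positive(relative))
print(csa_score(relative, include_fdr=False), is_csa_positive(relative, include_fdr=False))
```

In a cohort made up of RA relatives, the familial criterion is met by nearly everyone, so the `include_fdr=False` variant is the one that can separate individuals on joint symptoms alone.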

2.3. Genetic factors
HLA typing was performed by DNA sequencing of loci A, B, C, and DQB1 and exons 2-4 of the DRB1 gene using the Holotype HLA kit (Omixon, Budapest, Hungary). Briefly, the isolated DNA concentration was estimated using an Implen NP80 NanoPhotometer (Fisher Scientific, Pittsburgh, PA), and long-range PCR was then performed to produce amplicons of the five loci. Then, in the process of library preparation, the amplicon pools were fragmented, ends were repaired, and barcodes were ligated. Finally, the sequencing was performed on a MiSeq next-generation sequencing (NGS; Illumina, San Diego, CA) instrument in paired-end read mode. The reads obtained (251 bp) were analyzed using the HLA Twin software (Omixon) with two algorithms in a day-by-day assembly mode and by mapping against the international ImMunoGeneTics (IMGT) database. The following alleles were considered as shared epitope (SE) alleles: DRB1*01:01, *01:02, *04:01, *04:04, *04:05, *04:08, and *10:01 (13).
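As an illustration of how the SE definition above can be applied to typing results, the sketch below counts SE alleles in a DRB1 genotype. It is an assumption-laden example rather than the pipeline used in the study: the function names are hypothetical, alleles are assumed to be written in standard `DRB1*xx:yy[:...]` nomenclature, and matching is done on the first two fields only.

```python
# Two-field names of the DRB1 alleles treated as shared epitope (SE) in the text.
SE_ALLELES = {"01:01", "01:02", "04:01", "04:04", "04:05", "04:08", "10:01"}


def two_field(allele):
    """Reduce e.g. 'HLA-DRB1*04:01:01:01' or 'DRB1 * 04:01' to '04:01'."""
    name = allele.replace(" ", "")
    gene, _, fields = name.partition("*")
    if not gene.endswith("DRB1") or not fields:
        raise ValueError(f"not a DRB1 allele name: {allele!r}")
    return ":".join(fields.split(":")[:2])


def se_copies(genotype):
    """Number of SE alleles (0, 1, or 2) among the DRB1 alleles of one individual."""
    return sum(two_field(a) in SE_ALLELES for a in genotype)


# Hypothetical genotypes, for illustration only.
print(se_copies(["DRB1*04:01:01", "DRB1*07:01"]))
print(se_copies(["HLA-DRB1 * 01:01", "DRB1*10:01:01:01"]))
```

An individual would then be classed as an SE carrier when `se_copies(...) >= 1`; whether carriers of one and two copies should be pooled is a design choice that this sketch leaves open.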

2.4. Statistics
Quantitative results are expressed as the mean and interquartile (IQ) range and compared by pairwise analysis. Categorical data were analyzed using Fisher's exact test; when appropriate, a false discovery rate post-hoc correction was applied, and an odds ratio (OR) with a 95% confidence interval (CI) was determined. All tests and figures were produced with Prism 9.4 (GraphPad Software, La Jolla, CA), with the exception of the radar plots (Microsoft Corp, Redmond, WA).
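For readers who wish to reproduce this type of analysis outside Prism, the three steps (Fisher's exact test, OR with 95% CI, and false discovery rate correction) can be sketched with the Python standard library. Several choices here are assumptions, since the text does not specify them: the CI uses the Woolf (logit) approximation, which may differ from the interval Prism reports, and the correction uses the Benjamini-Hochberg procedure. The counts are hypothetical and are not data from this study.

```python
import math


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell given fixed margins.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf (logit) 95% CI; requires all four cells > 0."""
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


def benjamini_hochberg(pvalues, q=0.05):
    """Return one boolean per p-value: True if it passes the BH threshold at level q."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank / m * q:
            k = rank  # largest rank whose p-value is under its threshold
    passed = [False] * m
    for i in order[:k]:
        passed[i] = True
    return passed


# Hypothetical counts: rows = group (cases, comparators),
# columns = factor present / absent.
table = (30, 20, 15, 35)
print(odds_ratio_ci(*table))
print(fisher_exact_two_sided(*table))
print(benjamini_hochberg([0.001, 0.02, 0.03, 0.5]))
```

The logit interval is undefined when a cell is zero, which is one situation in which an OR is reported as infinite; a different interval method or a continuity correction would be needed there.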

3. Results

3.1. Population characteristics
As presented in Table 1, the 827 individuals included in this study were subdivided into three groups: a pre-RA stage corresponding to 144 individuals having evolved to RA [median: −3 years (IQ: −2/−10 years) from diagnosis]; 328 FDR of RA patients without RA evolution after ≥2 years of follow-up; and 355 HC. Age at inclusion was similar between pre-RA and FDR, while HC were older. The three groups included more women than men, which is related to the design of the Tatarstan cohort that favors the inclusion of women. Pre-RA, FDR, and HC subgroups were similar with regard to ACPA positivity, and RF positivity was lower in HC.

3.2. Clinical risk factors in pre-RA
At the early phase of RA (3, 14), a CSA score of ≥3 was effective in distinguishing pre-RA from HC (p < 10⁻⁴). However, to distinguish pre-RA from FDR, the criterion "presence of FDR with RA" had to be removed from the CSA score (p = 0.0005).
Next, to gain further insight into the relevance of disabilities in assessing which familial RA individuals are at risk of developing RA, 11 clinical parameters were selected from CSA criteria (n = 6) and from additional arthralgia characteristics (n = 5). As presented in Table 2 and Figures 1A, B, all these parameters, except isolated lower limb arthralgia and joint pain in the upper and lower extremities (black line), were effective in distinguishing pre-RA from HC with an OR ranging from 1.6 to infinity. Among them, significance was conserved when applying a post-hoc correction (red line) except for difficulty with making a fist (0.01 < p < 0.05, blue line).
The same analysis performed between pre-RA and FDR (Figure 1C) retrieved three groups of parameters: (i) parameters that are ineffective in discriminating pre-RA from FDR (black line), which included MCP/lower limb arthralgia, morning stiffness of ≥60 min, and difficulty with making a fist; (ii) parameters with low discriminating efficacy that were not confirmed following post-hoc adjustment (0.01 < p < 0.05, blue line), such as small joint arthralgia and a negative squeeze test in MCP joints that reflects local inflammation; and (iii) parameters that remain highly effective in discrimination after post-hoc adjustment (red line), including arthralgia (OR = 1.8, CI: 1.2-2.7), large joint pain (OR = 2.2, CI: 1.4-3.4), upper and lower joint pain (OR = 2.9, CI: 1.7-4.8), symmetrical arthralgia (OR = 4.4, CI: 2.6-7.1), and most severe symptoms occurring in the morning (OR = 3.7, CI: 1.7-8.2). We conclude from this analysis, and from the radar plot comparison that considers the FDR group according to the CSA score (Figures 1A, D), that the most discriminating clinical criterion to distinguish pre-RA from FDR was symmetrical arthralgia.

3.3. Environmental risk factors in pre-RA
Epidemiological studies and, more recently, Mendelian randomization approaches have confirmed the direct relationship between RA development and environmental factors such as educational level and tobacco smoking (2). Accordingly, socio-demographic factors (rural residence, childlessness, and educational level) and individual behavior factors (fish/alcohol/coffee consumption and active/passive tobacco smoking) were considered (Table 2; Figure 2A). Of note, the rate of inclusion for environmental risk factors ranged from 77.8 to 88.9% in pre-RA, 35.4 to 45.4% in FDR, and 49.9 to 58.0% in HC. When pre-RA individuals were compared to HC (0.01 < p, red line), the risk of RA development was higher among those living in rural areas (OR = 4.2; CI: 2.6-6.9) and lower in those reporting alcohol consumption (OR = 0.6, CI: 0.35-0.90) (Figure 2B). When replacing HC with FDR in the analysis (Figure 2C), three risk factors were associated with pre-RA individuals: living in a rural environment (OR = 4.1, CI: 2.4-6.7), a lower educational status (OR = 4.4, CI: 2.6-7.5), and passive tobacco smoking (OR = 2.0, CI: 1.15-3.3). On the other hand, alcohol consumption (OR = 0.5, CI: 0.30-0.85) and childlessness (OR = 0.33, CI: 0.17-0.63) were protective, and no associations were retrieved when considering active tobacco usage and fish/coffee consumption. Among these factors, the radar plot analysis further supports prominent roles for educational attainment, rural location, and passive tobacco usage in discriminating pre-RA from FDR (Figure 2D).

3.4. HLA-DRB1 genetic factors in pre-RA
High-resolution HLA allele class I (A, B, and C) and class II (DQB1 and DRB1) distribution was assessed by NGS in pre-RA (n = 59), in FDR (n = 50), and HC (n = 78) in order to perform association studies between (i) pre-RA vs. HC; (ii) FDR vs. HC; and (iii) pre-RA vs. FDR (Figure 3).
Alleles with a conserved sequence at amino-acid residues 70-74 in the third hypervariable region of HLA-DRB1, referred to as SE, showed a significant association with both pre-RA individuals (OR = 2.7, CI: 1.3-5.6; p = 0.005) and FDR (OR = 2.8, CI: 1.3-5.7; p = 0.005) when compared to HC. On the other hand, a significant negative association was observed with DRB1*07:01 when comparing pre-RA relatives with HC (OR = 0.28, CI: 0.12-0.64; p = 0.002) and with B*35:01 when comparing FDR with HC (OR = 0.23, CI: 0.08-0.67; p = 0.008). As HLA-B*35:01 was associated with FDR but not with pre-RA individuals, this allele was not considered further.

4. Discussion
Results from our study support the concept that RA development in individuals with familial RA case(s) is associated with a CSA score of ≥3, computed without the criterion "presence of FDR with RA," and is driven/controlled by the interplay between environmental/behavioral factors (e.g., educational level, childlessness, alcohol consumption, and passive tobacco smoking) and genetic factors (e.g., HLA-DRB1 SE and *07:01).
The CSA score to identify individuals at the pre-clinical RA stage was established by the EULAR using a longitudinal study of 2 years (15). From such an analysis, it was reported that individuals meeting ≥3 parameters have a 2.1× increased risk of developing RA. However, such an assertion was not confirmed in our high-risk cohort, which is characterized by a long delay (≥2 years) to develop RA within pre-RA. Relatives from familial RA with a CSA score of ≥3 at the baseline were 36.8% in pre-RA vs. 30.0% in FDR. However, when the criterion "presence of FDR with RA" was not considered, the CSA score became effective in discriminating pre-RA from FDR (20.8 vs. 8.8%, respectively). One explanation for these discrepancies is that four out of seven criteria from the CSA score (morning stiffness of ≥60 min, most severe symptoms in the morning, difficulty in making a fist, and a positive squeeze test in MCP) were rarely reported in our pre-RA group (2.8-16%) as compared to the cohorts used to define CSA (43-90%) (3). Indeed, the two most prevalent symptoms retrieved in our study were arthralgia and MCP arthralgia within the pre-RA and FDR groups. Moreover, and similar to our study, it was previously reported that arthralgia was more frequent in FDR than in HC (16), that symmetrical joint involvement carried a high risk among FDR of RA patients with arthralgia (17), and that symmetry takes place as the disease progresses and may be absent at RA onset (18).
Educational attainment represents an important health and social determinant associated positively or negatively with a large panel of mental and somatic diseases (19). Regarding RA, a preventive effect is repeatedly associated with a higher educational level, and such an effect is conserved following adjustment for tobacco smoking, body mass index, and intelligence (20,21). Indeed, in patients with RA, a higher educational level was associated with a better status, including at the baseline a lower number of painful joints, less inflammation, RF seronegativity, and a higher rate of survival for men (22-24). Our study further provides arguments to support that the negative effect of educational attainment on RA development starts early and is independent of HLA status. However, more studies are necessary to better characterize the exact mechanism by which educational attainment controls RA development.
The HLA-DRB1 SE overrepresentation in RA was historically defined through the demonstration that the presence of HLA-DRB1 risk alleles harboring SE leads to the non-proliferative state in co-cultured lymphocytes from RA patients, which was not the case when using lymphocytes from healthy subjects (25-28). Next, it was reported that HLA-DRB1 SE represented the most predominant genetic risk factor associated with RA (29), and this overrepresentation was observed in 55% of the FDR of RA patients as compared to 43% in the HC group from North America (30). This is close to our report, with 61% in pre-RA and 62% in FDR as compared to 37.2% in the HC population. Moreover, it was previously established that associations between HLA-DRB1 SE and tobacco smoking appeared at the CSA stage, while associations with autoantibody positivity (ACPA, RF) and severe disease appeared later at RA onset, as suggested by studies of FDR of RA patients and early-RA patients (31-33). In our study, a slight association with passive tobacco smoking among pre-RA individuals carrying HLA-DRB1 SE was observed, and the absence of association with active smoking can be explained by low tobacco usage in the studied population (<10%). Regarding the association with autoantibodies (ACPA and RF), these parameters were not taken into consideration in this study as the rate of positivity in pre-RA individuals was low at inclusion (ACPA: 8.3% and RF: 12%) and not different from FDR. Moreover, HLA-DRB1 SE status was further associated with symmetrical arthralgia (OR = 2.8) when comparing pre-RA with FDR. Such associations make sense and open new perspectives, as joint symptoms represent key criteria in seropositive RA under progression (17,34).
The protective role of HLA-DRB1*07:01 was previously reported in RA (35, 36). Our study further supports an interplay between HLA-DRB1*07:01 and the immunomodulatory effect associated with alcohol consumption and childlessness in women. The paradoxical and beneficial effect of alcohol on RA was recently reviewed, as well as its mechanism of action on the innate and adaptive immune systems that have predominant roles in RA (37,38). The relationship between childbirth and the risk of RA remains debated in the literature, although it is well-established that RA often improves during pregnancy and that flares can occur postpartum (39). Accordingly, part of the discrepancies in the literature regarding the contribution of pregnancy to RA development may relate to HLA-DRB1*07:01 genetic status and/or individual behavior factors such as tobacco smoking and alcohol consumption, which require more exploration.
The main limitations of this study are its retrospective design with incomplete data sets and a database initiated before the CSA score was created. In particular, factors with a low rate of inclusion (<50%) have to be considered with caution. On the other hand, advantages are related to the homogeneous genetic background reported between pre-RA and FDR individuals, the elevated number of pre-RA individuals included, and the delay of ≥2 years in the follow-up to exclude the pre-RA stage in both pre-RA and FDR groups. However, we could not firmly exclude that some individuals from the FDR subgroup would evolve to RA or that environmental risk factors, such as childlessness and rural location, changed between inclusion and RA development.
In conclusion, a set of clinical, individual, and genetic characteristics to predict RA development in relatives from familial RA was established. More studies are required to determine the predictive accuracy of these parameters when used alone or when combined with autoantibody testing and/or imaging.

FIGURE 1. Clinical spectrum at the inclusion visit in the Tatarstan cohort of healthy controls (HC, n = 355) and rheumatoid arthritis (RA) relatives, the latter being dichotomized according to their progression to RA (pre-RA, n = 144) or as FDR when no evolution was reported during the ≥2 years of follow-up. (A) Prevalence of the RA-associated clinical criteria; see the Materials and methods section for details. FDR individuals were further subdivided according to their clinically suspect arthralgia (CSA) score. (B, C) Odds ratio (OR) and significant p-values (in blue when 0.01 < p < 0.05, and in red when p is below the post-hoc adapted threshold) for the occurrence of the clinical criteria when pre-RA individuals were compared to HC (B) or to FDR (C). (D) Radar plot comparing clinical criteria between pre-RA individuals and FDR according to their CSA status.

FIGURE 2. Socio-demographic and individual behavioral factors compared between healthy controls (HC) and relatives of rheumatoid arthritis (RA) patients having evolved (pre-RA) or not (FDR) to RA. (A) Prevalence of the eight factors in the subgroups; see the Materials and methods section for details. FDR individuals were further subdivided according to their clinically suspect arthralgia (CSA) score. (B, C) Odds ratio (OR) and significant p-values (in blue when 0.01 < p < 0.05, and in red when p is below the post-hoc adapted threshold) for the occurrence of the individual criteria when pre-RA individuals were tested against HC (B) or against FDR (C). (D) Radar plot comparing individual criteria between pre-RA relatives and FDR according to their CSA status.

FIGURE 3. HLA A/B/C/DQB1 and DRB1 allele distribution among relatives of rheumatoid arthritis (RA) patients having evolved to RA (pre-RA) or not (FDR), and among healthy controls (HC). (A) Manhattan plot of the log (p-value) for each HLA allele and the HLA-DRB1 shared epitope (SE) between pre-RA and controls (red), FDR and controls (blue), and pre-RA and FDR (green). The dotted lines represent the significance thresholds with and without a post-hoc false discovery rate adjustment. (B) Allele frequency of HLA-B*35:01, HLA-DRB1 SE, and HLA-DRB1*07:01 among HC, FDR, and pre-RA individuals. (C) Odds ratio (OR) showing the genetic effect of HLA-B*35:01, HLA-DRB1 SE, and HLA-DRB1*07:01 between HC and pre-RA (red) or FDR (blue). The p-values are indicated when significant.

FIGURE 4. Interplay between the HLA-DRB1 shared epitope (SE) and HLA-DRB1*07:01 genetic factors and rheumatoid arthritis (RA)-associated factors in relatives of RA patients according to their evolution to RA (pre-RA) or not (FDR). (A) Prevalence of the RA-associated clinical criteria according to HLA-DRB1 SE and HLA-DRB1*07:01 status in pre-RA and FDR. (B) Influence of HLA-DRB1 SE (red) and HLA-DRB1*07:01 (blue) on eight RA-associated risk factors to discriminate pre-RA from FDR. Data are represented as log (p-value) with a fixed significance threshold. (C) Prevalence of the RA-associated individual factors. (D) Influence of HLA-DRB1 SE (red) and HLA-DRB1*07:01 (blue) on RA-associated individual factors to discriminate pre-RA from FDR.
TABLE 1. Demographic and clinical characteristics of the population studied.
TABLE 2. Characteristics of individuals at inclusion in the Tatarstan cohort.