ORIGINAL RESEARCH article

Front. Public Health, 10 November 2025

Sec. Public Health and Nutrition

Volume 13 - 2025 | https://doi.org/10.3389/fpubh.2025.1694503

Association between volatile organic compound co-exposure and the prevalence of rheumatoid arthritis: a nationwide cross-sectional study


Tian Ren1, Erye Zhou1, Tao Cheng1, Mingjun Wang1, Yufeng Yin1*, Jian Wu1*, Weichang Chen2*
  • 1Department of Rheumatology and Immunology, The First Affiliated Hospital of Soochow University, Suzhou, Jiangsu, China
  • 2Department of Gastroenterology, The First Affiliated Hospital of Soochow University, Suzhou, Jiangsu, China

Background: Environmental contaminants, especially volatile organic compounds (VOCs) and their metabolites (mVOCs), are of significant interest in autoimmune disease research due to their potential immunomodulatory effects. This study aimed to assess the association between urinary mVOCs and the risk of rheumatoid arthritis (RA) in U.S. adults.

Methods: A total of 4,622 adults, including 296 participants with RA, were included in the present study utilizing data from the National Health and Nutrition Examination Survey (NHANES) from 2005 to 2020. Sixteen mVOCs were selected for the analysis while controlling for potential confounders. Weighted logistic regression models were employed to assess the association between individual mVOCs and RA risk. Least absolute shrinkage and selection operator (LASSO) regression was used to select mVOCs and covariates most pertinent to the prevalence of RA for further analyses. Then, weighted quantile sum (WQS) regression and quantile g-computation (qgcomp) models were used to estimate associations between the mVOC mixture and RA. Mediation analyses were performed to examine the effect of inflammatory indices on these relationships.

Results: In single-pollutant models, levels of most mVOCs were greater in the RA patients than in the patients without arthritis. Furthermore, multi-pollutant models unveiled a positive effect of the mVOC mixture on the risk of RA in both WQS regression (OR: 1.37; 95% CI: 1.12, 1.68; P = 0.002) and qgcomp (OR: 1.23; 95% CI: 1.07, 1.49; P = 0.034) models. This effect was notably stronger for female participants. The lymphocyte-to-monocyte ratio (LMR), a surrogate for inflammatory markers, mediated the association between the mVOC mixture and the prevalence of RA with a mediated proportion of 4.65%.

Conclusions: This study supports the substantial connection between VOC co-exposure and the risk of RA, with inflammation potentially acting as a mediator in this relationship.

1 Introduction

Rheumatoid arthritis (RA), one of the most prevalent systemic autoimmune diseases, is characterized by chronic joint inflammation that leads to progressive joint damage, disability, and increased mortality (1). RA has a global prevalence of approximately 0.5% among adults, and women are 2 to 3 times more likely than men to develop this disease (2). Although it can manifest at any age, RA onset is most common between 50 and 59 years of age (2). The disease burden of RA extends beyond joint disorders, with patients exhibiting an increased risk for cardiovascular disease, infections, and psychological disorders, underscoring its significant impact on public health and healthcare systems worldwide (2).

The pathogenesis of RA is not fully understood; however, it is widely accepted as a multifactorial disease in which genetic susceptibility interacts with environmental triggers (3). Although genetic factors contribute substantially to the risk of RA, they do not fully account for the disease occurrence, which implies a critical role for environmental factors (3). Among these, smoking is the most well-documented risk factor (4). Other environmental exposures, such as air pollution, occupational hazards, and microbial agents, have also been associated with the onset and progression of RA (5, 6).

Volatile organic compounds (VOCs) are a diverse group of carbon-based chemicals that readily evaporate at ambient temperature (7). Human exposure to VOCs can occur through various routes, including inhalation, ingestion, and dermal absorption, originating from a wide array of sources, such as industrial emissions, vehicular exhaust, building materials, and use of consumer products (8). Once in the body, VOCs can be metabolized into a range of metabolites (mVOCs), which might exert toxic effects and have been implicated in the pathogenesis of several diseases, especially systemic autoimmune diseases (7–9) and joint disorders (10).

Considering the ubiquitous nature of VOCs and their potential immunomodulatory effects, recent studies have elucidated the intricate relationship between RA and exposure to VOCs; however, these relationships present a complex picture. Research by Lei et al. suggests that certain single mVOCs, such as AMCA and HPMA, may be involved in the pathogenesis of RA (11). On the other hand, a concurrent study by Beidelschies et al. highlighted that environmental toxicants, including polycyclic aromatic hydrocarbons (PAHs) but not VOCs, are associated with an increased risk of RA (12).

While these works have significantly contributed to our understanding of the potential impact of mVOCs on RA, several critical questions remain to be explored to fully clarify the role of these compounds in RA. For instance, as with most environmental exposures, VOC exposures frequently encompass a mixture of different VOCs that can interact with one another, potentially leading to synergistic or interactive impacts on health outcomes (13). Therefore, it is vital to assess the combined effects of these co-exposures to better understand their health impacts. Moreover, delineating the mediating factors, particularly the role of inflammatory processes, is important for obtaining a comprehensive understanding of how environmental exposures contribute to the emergence and progression of RA.

Therefore, our research aimed to bridge current knowledge gaps by analyzing the cumulative impact of mVOC mixtures on RA prevalence in a representative sample of U.S. adults. We also examined whether inflammatory markers mediate the association between the mVOC mixture and increased risk of RA.

2 Methods

2.1 Study design

The data used were obtained from NHANES, a program committed to assessing the health and nutritional status of the civilian population in the U.S. This program employs a complex, multistage sampling method and has been gathering extensive data from a nationally representative sample in two-year cycles since 1999, covering various counties through both mobile examination centers (MECs) and in-home interviews. The Centers for Disease Control and Prevention (CDC) provides comprehensive information on the NHANES methodology, design, and participant recruitment. This investigation was conducted in adherence to the STROBE guidelines for cross-sectional studies, and NHANES received approval from the Research Ethics Review Board at the National Center for Health Statistics (NCHS). Informed consent was obtained from all individual participants involved in the study.

2.2 Study population

The study consolidated data across five NHANES cycles (2005–2006, 2011–2012, 2013–2014, 2015–2016, and 2017–2020), amounting to a total of 55,810 participants. Individuals younger than 20 years were excluded. Those with missing mVOC values or apparent outliers (exceeding the 99th percentile) were also excluded from the analysis to minimize potential bias arising from extreme data. Additional exclusions were made for missing data on covariates, complete blood count and RA status. Following these criteria for inclusion and exclusion, the final analytical sample included 4,622 participants (Supplementary Figure S1).

2.3 Assessment of RA

RA status was assessed using a disease questionnaire administered prior to the physical examination. This involved a computer-assisted face-to-face interview asking participants aged ≥20 years the following question: “Has a doctor or other health professional ever told you that you had arthritis?”. Those who answered affirmatively were then asked “Which type of arthritis was it?”, and those who responded with RA were classified as having RA. Participants with incomplete data regarding arthritis and RA status, as well as those with missing information on relevant covariates, were excluded from the analyses.

2.4 Measurements of mVOCs

Urine specimens were obtained from the participants, who were not required to adhere to any fasting or dietary restrictions. Each specimen was deposited into either polystyrene cryovial tubes or polypropylene centrifuge tubes. A minimum volume of 0.25–0.5 mL was obtained, with an assay-specific aliquot of 50 μL. After collection, the samples were promptly chilled and transferred to a storage facility where they were maintained at −20 °C and at −70 °C until analysis. Urinary mVOCs were quantified using an advanced ultra-performance liquid chromatography (UPLC) system paired with electrospray ionization tandem mass spectrometry (ESI-MS/MS) (14).

Data are reported in concentration units (ng/mL) and were normalized to creatinine levels (μg/g creatinine) to account for urine dilution variability among specimens. A comprehensive outline of the analytical procedures utilized in the laboratory is accessible at the CDC website via NHANES laboratory methods. For instances in which the analyte concentrations fell below the established lower limit of detection (LLOD), the reported values were assigned as the LLOD divided by the square root of two (LLOD/√2) (15). In NHANES, a total of 29 different urinary mVOCs are analyzed. However, we omitted 13 mVOCs from our analysis due to their low detection rate (50% or less) to ensure the representativeness of the data and the reliability of the findings. Ultimately, 16 urinary mVOCs were considered for analysis (Supplementary Tables S1, S2).
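The substitution and normalization rules described above can be sketched in a few lines of Python. This is an illustration only, not the authors' code: the function names, the example values, and the assumption that urinary creatinine is supplied in mg/dL are all hypothetical.

```python
import math

def impute_below_llod(value_ng_ml, llod_ng_ml):
    """Return LLOD/sqrt(2) when a measurement is below the LLOD, else the value."""
    if value_ng_ml < llod_ng_ml:
        return llod_ng_ml / math.sqrt(2)
    return value_ng_ml

def creatinine_adjust(analyte_ng_ml, creatinine_mg_dl):
    """Convert ng/mL to ug/g creatinine, assuming creatinine is given in mg/dL."""
    # 1 mg/dL = 0.01 mg/mL, and ng/mg equals ug/g, hence the factor of 100.
    return analyte_ng_ml / creatinine_mg_dl * 100.0

def detection_rate(values_ng_ml, llod_ng_ml):
    """Fraction of measurements at or above the LLOD."""
    detected = sum(1 for v in values_ng_ml if v >= llod_ng_ml)
    return detected / len(values_ng_ml)

# Hypothetical measurements for a single metabolite (not NHANES data).
raw = [0.5, 3.2, 8.0, 1.1]
llod = 1.0
imputed = [impute_below_llod(v, llod) for v in raw]
adjusted = [creatinine_adjust(v, 120.0) for v in imputed]
# Metabolites detected in 50% of samples or fewer were dropped from the analysis.
keep = detection_rate(raw, llod) > 0.5
```

In this toy example the first value falls below the LLOD and is replaced by LLOD/√2, and the metabolite is retained because three of four samples are detectable.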

2.5 Covariates and inflammatory markers

Covariates were selected due to their established association with arthritis, as indicated by previous studies (10, 11, 16). Selection of covariates included demographic details such as age, sex (male and female), racial/ethnic background (Mexican American, other Hispanic, Non-Hispanic White, Non-Hispanic Black, and other/multiracial), educational attainment (ranging from less than 9th grade to college graduate or higher), marital status (categorized as married/living with partner, widowed/divorced/separated, or never married), family income in relation to the poverty level as indicated by the poverty income ratio (PIR) (0–1.29, 1.3–3.49, and ≥3.5), and status of health insurance coverage (either insured or uninsured). Clinical parameters included body mass index (BMI), which was defined as underweight (<18.5 kg/m2), normal weight (18.5–24.9 kg/m2), overweight (25–29.9 kg/m2), and obese (≥30 kg/m2), along with waist circumference (cm). Comorbid conditions such as hypertension and diabetes were identified through participants' self-reported diagnoses that had been confirmed by a physician. The inflammatory marker utilized in the study was the lymphocyte-to-monocyte ratio (LMR), which was calculated by dividing the number of lymphocytes in the peripheral blood by the number of monocytes.
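As a small illustration of two of the derived variables above, the following Python sketch computes the LMR and assigns the BMI categories. The function names and the example counts are hypothetical, and the BMI cut-offs are written as half-open intervals so that values such as 24.95 are not left unclassified.

```python
def lymphocyte_to_monocyte_ratio(lymphocytes, monocytes):
    """LMR: peripheral lymphocyte count divided by monocyte count (same units)."""
    if monocytes <= 0:
        raise ValueError("monocyte count must be positive")
    return lymphocytes / monocytes

def bmi_category(bmi):
    """Map BMI (kg/m2) to the four categories used as a covariate."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 25:
        return "normal weight"
    if bmi < 30:
        return "overweight"
    return "obese"

# Hypothetical counts in 1000 cells/uL.
lmr = lymphocyte_to_monocyte_ratio(2.1, 0.5)
category = bmi_category(30.86)  # the mean BMI reported for the RA group
```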

2.6 Statistical analyses

Baseline demographic characteristics were compiled and summarized for the overall population, and comparisons were drawn between groups based on the presence of RA or absence of arthritis (RA vs. non-arthritis). Continuous variables are presented as weighted means accompanied by standard deviations (SDs) or, alternatively, medians paired with interquartile ranges (IQRs). To assess differences between groups, independent t-tests and Wilcoxon rank-sum tests were employed as appropriate for the data distribution. Categorical variables were quantified as counts (percentages), and chi-square tests were used to determine the significance of differences observed between groups. All 16 mVOCs were subjected to natural logarithm (ln) transformation to approximate a normal distribution. Following this transformation, the data were standardized and divided into four quartiles (Q1, Q2, Q3, and Q4) to facilitate subsequent analyses.
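The transformation pipeline for the mVOCs (ln transformation, standardization, quartile assignment) might look like the following sketch. It is unweighted and uses only the Python standard library; the authors' R implementation and their exact quantile rule are not given in the text, so the cut points here are an assumption.

```python
import math
import statistics

def ln_standardize(values):
    """Natural-log transform, then centre to mean 0 and scale to SD 1."""
    logs = [math.log(v) for v in values]
    mu = statistics.mean(logs)
    sd = statistics.stdev(logs)
    return [(x - mu) / sd for x in logs]

def quartile_labels(values):
    """Assign 1..4 (Q1..Q4) using the three sample quartile cut points."""
    cuts = statistics.quantiles(values, n=4)
    return [1 + sum(v > c for c in cuts) for v in values]

# Hypothetical creatinine-adjusted concentrations.
conc = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 128.0]
z = ln_standardize(conc)
q = quartile_labels(z)
```

With eight evenly spaced log concentrations, each quartile receives exactly two observations.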

In single-pollutant models, weighted multivariable binary logistic regression analyses were conducted to examine relationships between individual mVOCs and the risk of RA, and odds ratios (ORs) and 95% confidence intervals (CIs) are reported. Restricted cubic spline (RCS) analyses (with 3 automatically selected knots) were also conducted to investigate non-linear relationships. Spearman correlation tests were used to examine interrelationships between mVOCs by assessing the strength and direction of their mutual associations.

Given the significant correlations and multicollinearity observed among the 16 mVOCs, which can produce unstable estimates and reduce the interpretability of traditional regression models, we employed the least absolute shrinkage and selection operator (LASSO) regression (17). This machine learning method is particularly advantageous for our analysis as it performs both variable selection and regularization simultaneously. By applying a penalty function, LASSO shrinks the coefficients of less influential predictors toward zero, allowing us to identify a more robust and parsimonious subset of mVOCs and covariates for the subsequent mixture effect analyses with WQS and qgcomp.
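To make the shrinkage behaviour concrete, here is a minimal pure-Python sketch of L1-penalized logistic regression fitted by proximal gradient descent. It is not the authors' implementation (they worked in R), and the toy data, step size, and penalty values are assumptions chosen so the result can be checked by hand.

```python
import math

def soft_threshold(x, t):
    """Shrink x toward zero by t; anything within [-t, t] becomes exactly 0."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0

def lasso_logistic(X, y, lam, step=0.5, n_iter=500):
    """Minimise mean logistic loss + lam * sum(|beta_j|); intercept unpenalized."""
    n, p = len(X), len(X[0])
    b0, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        g0, grad = 0.0, [0.0] * p
        for row, yi in zip(X, y):
            eta = b0 + sum(b * x for b, x in zip(beta, row))
            resid = 1.0 / (1.0 + math.exp(-eta)) - yi
            g0 += resid / n
            for j in range(p):
                grad[j] += resid * row[j] / n
        b0 -= step * g0
        # Gradient step followed by the L1 proximal (soft-threshold) step.
        beta = [soft_threshold(b - step * g, step * lam)
                for b, g in zip(beta, grad)]
    return b0, beta

# Toy data: x1 agrees with y in 6 of 8 rows; x2 is unrelated to y.
X = [[-1, 1], [-1, -1], [-1, 1], [-1, -1], [1, 1], [1, -1], [1, 1], [1, -1]]
y = [0, 0, 0, 1, 1, 1, 1, 0]
b0, beta = lasso_logistic(X, y, lam=0.05)
```

In this example the uninformative predictor is set exactly to zero, while the informative one is shrunk below its unpenalized value of ln 3; raising the penalty to 0.3 removes both.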

To assess the joint effect of the chemical mixture on RA prevalence, the variables selected by LASSO were incorporated into a weighted quantile sum (WQS) regression analysis. We chose this approach to model a more realistic environmental exposure scenario, as humans are typically exposed to multiple chemicals simultaneously rather than in isolation. The WQS model collapses the high-dimensional set of correlated mVOCs into a single, empirically weighted index. A key advantage of this method is that it not only estimates the overall effect of the mixture but also identifies the individual components that contribute most significantly to this association by examining their respective weights.
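The WQS index itself is a simple weighted sum of quantile scores, with non-negative weights constrained to sum to 1. The sketch below shows only that final step; the weights and scores are hypothetical, and the estimation of the weights (typically by bootstrap in a training set) is not reproduced here.

```python
def wqs_index(quartile_scores, weights):
    """Weighted quantile sum: sum of w_i * q_i with w_i >= 0 and sum(w_i) = 1."""
    if len(quartile_scores) != len(weights):
        raise ValueError("one weight is required per exposure")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * q for w, q in zip(weights, quartile_scores))

# Hypothetical participant: quartile scores coded 0-3 for three metabolites.
scores = [3, 1, 0]
weights = [0.5, 0.3, 0.2]  # hypothetical weights, not the paper's estimates
index = wqs_index(scores, weights)
```

The index then enters the logistic model as a single exposure term, so its coefficient is interpreted as the change in log-odds of RA per one-quartile increase in the whole mixture.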

To address constraints associated with the WQS regression approach, particularly regarding the directionality of associations, we applied the quantile g-computation (qgcomp) model. This method merges the inferential framework of WQS regression with the flexibility of g-computation, avoiding the assumption of directional homogeneity. Notably, it separates the adjustment for confounders from the estimation of effects, which can enhance the clarity of the analysis. Moreover, it allows for a causal inference perspective on the parameters estimated (18, 19). Causal mediation analysis was performed to ascertain whether inflammatory indices act as mediators of the relationship between the mVOC mixture and RA risk, including the extent of such mediation.
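For a linear model without interaction terms, the qgcomp mixture effect is the sum of the coefficients on the quantized exposures, and the weights are each coefficient's share of the total positive or total negative partial effect. The sketch below illustrates that bookkeeping with hypothetical coefficients; it does not fit a model and is not the authors' code.

```python
def qgcomp_summary(coefs):
    """Return (psi, weights) from per-exposure log-odds coefficients.

    psi is the joint effect of raising every exposure by one quantile.
    Weights are shares within the positive and negative directions separately.
    """
    psi = sum(coefs.values())
    pos_total = sum(b for b in coefs.values() if b > 0)
    neg_total = sum(b for b in coefs.values() if b < 0)
    weights = {}
    for name, b in coefs.items():
        if b > 0:
            weights[name] = b / pos_total
        elif b < 0:
            weights[name] = b / neg_total
        else:
            weights[name] = 0.0
    return psi, weights

# Hypothetical coefficients for three quantized exposures.
psi, weights = qgcomp_summary({"A": 0.2, "B": 0.1, "C": -0.1})
```

Because exposures are allowed to pull in opposite directions, the joint effect psi can be smaller than the sum of the positive contributions.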

Statistical estimates were calibrated to accommodate the complex sampling design of NHANES, utilizing the specific sample weights and stratification information that accompany the survey data. However, these same adjustments were not incorporated into the WQS regression or the qgcomp model due to their incompatibility with complex survey designs. Statistical analyses were performed with R software, version 4.3.2 (R Foundation for Statistical Computing, Vienna, Austria), and a two-sided P-value of less than 0.05 was considered to indicate statistical significance.

3 Results

3.1 Demographic characteristics and mVOC levels of the study participants

The study analyzed the baseline characteristics of 4,622 participants, including 2,303 females (48.77%) and 2,319 males (51.23%), with an average age of 44.00 years. In the RA group (n = 296), the average age was slightly older than that in the control group, with a considerable percentage of participants being in the ≥60 years age group. The racial distribution of the RA group revealed a greater percentage of Non-Hispanic White individuals. The mean BMI for the RA group was 30.86 kg/m2, indicating a greater prevalence of overweight and obesity within this subgroup. The RA group also showed different patterns of educational attainment and health insurance coverage than did the non-arthritic group. Moreover, the poverty income ratio was lower in the RA individuals, indicating a potential socioeconomic impact on RA prevalence. Smoking and alcohol consumption rates differed slightly between the groups, with a higher percentage of non-smokers and non-excessive drinkers in the RA group (Table 1). Supplementary Figure S2 shows the histogram of the mVOC distribution across the NHANES cycles. Supplementary Table S3 summarizes the percentile distributions and missing values (percentages) of the 16 studied mVOCs in these cycles.

Table 1. Baseline characteristics of study participants.

3.2 Associations of individual mVOCs with RA

Table 2 shows the concentrations of 16 different mVOCs stratified by the presence of RA. Participants diagnosed with RA exhibited elevated concentrations of the majority of the mVOCs assessed, except for BMA, BPMA, 2MHA and 34MH. The ln-transformed values of the mVOCs according to the status of RA are displayed in Supplementary Table S4. The association between single mVOCs and the prevalence of RA was assessed through weighted logistic regression analysis adjusted for covariates and is presented in Supplementary Table S5. According to the different models, certain mVOCs were significantly associated with the prevalence of RA. For instance, when considering the ln-transformed variable, AAMA showed an increasing trend in ORs across quartiles, with a significant trend in all three models (model 1: P for trend = 0.037; model 2: P for trend = 0.003; model 3: P for trend = 0.006). The continuous form of AAMA was also associated with an increased prevalence of RA (all p < 0.05 in the three models), and a similar trend was found between the prevalence of RA and other mVOCs, including AMCA, CEMA, CYMA, DHBM, HPMA, MHB3, PHGA, and HPMM (all P-values for trend <0.05).

Table 2. Concentrations of mVOCs according to RA status.

Figure 1 shows the RCS analyses used to explore the dose–response relationship between urinary mVOCs and the prevalence of RA. The RCS models revealed specific mVOCs (such as AAMA, AMCA, BMA, CEMA, CYMA, DHBM, HPMA, MHB3, and HPMM) to be significantly associated with RA (P overall <0.05). Notably, several mVOCs, including BMA and DHBM, exhibited a non-linear relationship with RA (P for non-linearity <0.05).

Figure 1. The dose–response relationship between ln-transformed metabolites of volatile organic compounds (mVOCs) and the risk of rheumatoid arthritis (RA) detected using restricted cubic spline (RCS) models. (a) AAMA, (b) AMCA, (c) ATCA, (d) BMA, (e) BPMA, (f) CEMA, (g) CYMA, (h) DHBM, (i) HPMA, (j) HPM2, (k) MADA, (l) 2MHA, (m) 34MH, (n) MHB3, (o) PHGA, and (p) HPMM. All analyses were adjusted for age, sex, race, body mass index (BMI), waist circumference, education, health insurance, marital status, poverty income ratio (PIR), smoking status, alcohol intake, hypertension, and diabetes status. The red line illustrates the odds ratio (OR) of RA, while the red shaded region denotes 95% confidence intervals (CIs). mVOCs, metabolites of volatile organic compounds; RCS, restricted cubic spline; OR, odds ratio; CI, confidence interval; BMI, body mass index; PIR, poverty income ratio.

3.3 Correlations among individual mVOCs

Spearman correlation analysis was also conducted to evaluate relationships between urinary mVOCs. Figure 2 presents a correlation matrix that highlights several mVOCs with significant positive or negative correlations. A strong correlation was identified between mVOCs derived from identical parent compounds: 2MHA and 34MH (metabolites of xylene), with a correlation coefficient (r) of 0.87, and CEMA and HPMA (metabolites of acrolein) (r = 0.80). In addition, our analysis revealed notable correlations between CYMA and MHB3 (r = 0.80), between HPMA and MHB3 (r = 0.83), between HPMA and HPMM (r = 0.85), and between MHB3 and HPMM (r = 0.87). These strong correlations suggest the presence of multicollinearity among these mVOCs.
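For readers unfamiliar with the statistic, Spearman's coefficient is the Pearson correlation computed on ranks, with tied values sharing their average rank. The following standard-library sketch uses hypothetical data and helper names of our own choosing.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical ln-concentrations of two metabolites in five participants.
rho = spearman([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 1.0, 4.0, 3.0, 5.0])
```

Because only ranks are used, any monotone transformation of the concentrations (including the ln transformation) leaves the coefficient unchanged.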

Figure 2. The matrix displays Spearman correlation coefficients (ρ) for the ln-transformed concentrations of 16 mVOCs. Darker shades reflect stronger correlations, with blue indicating positive relationships and red negative ones. mVOCs, metabolites of volatile organic compounds.

3.4 Identification of mVOCs and covariates more relevant to RA

LASSO regression, in which penalty functions are applied to shrink less relevant mVOC coefficients toward zero, was used to effectively select those with more substantial associations with the risk of RA. Supplementary Figure S3 illustrates the relationship between the partial likelihood deviance (binomial deviance) and the log-transformed penalty parameter (λ) based on 10-fold cross-validation in LASSO regression modeling. Supplementary Figure S4 displays a coefficient profile plot produced against the log (λ) sequence. In this study, optimal values for lambda (λ) were determined at the point of minimum deviance. Alongside λ, the selected mVOCs and covariates for each subgroup analysis are comprehensively detailed for different populations in Supplementary Table S6.
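Choosing λ at the point of minimum deviance amounts to averaging the held-out deviance across folds for each candidate value and taking the smallest. A minimal sketch, with hypothetical deviances and only two folds for brevity (the study used 10):

```python
def lambda_min(cv_deviance):
    """Return the lambda whose mean cross-validated deviance is smallest.

    cv_deviance maps each candidate lambda to a list of per-fold deviances.
    """
    mean_dev = {lam: sum(d) / len(d) for lam, d in cv_deviance.items()}
    return min(mean_dev, key=mean_dev.get)

# Hypothetical per-fold binomial deviances for three candidate penalties.
cv = {0.001: [1.10, 1.14], 0.01: [1.02, 1.06], 0.1: [1.08, 1.12]}
best = lambda_min(cv)
```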

3.5 Association of mVOC mixture with RA

The results of WQS regression modeling exploring the combined effect of the mVOC mixture on RA risk in different subgroups of participants are presented in Figure 3. WQS regression revealed a positive association across the entire cohort of participants (OR: 1.37; 95% CI: 1.12, 1.68; P = 0.002), indicating a significantly greater likelihood of RA with increasing levels of the mVOC mixture. Stratified analysis indicated that the association remained significant for females (OR: 1.36; 95% CI: 1.08, 1.70; P = 0.007) and in both age subgroups: 20–60 years (OR: 1.71; 95% CI: 1.27, 2.30; P < 0.001) and ≥60 years (OR: 1.27; 95% CI: 1.01, 1.61; P = 0.038). Conversely, the association did not reach statistical significance in the male subgroup.
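As a consistency check on the overall WQS estimate, the log-odds coefficient, its standard error, and the P-value can be recovered from the reported OR and CI. The sketch assumes a Wald-type interval that is symmetric on the log-odds scale, which the text does not state explicitly; the function name is ours.

```python
import math

def wald_from_or_ci(odds_ratio, lower, upper, z_crit=1.959964):
    """Recover beta, SE, z and two-sided P from an OR and its 95% CI."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = beta / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p

# Overall WQS estimate reported above: OR 1.37 (95% CI: 1.12, 1.68).
beta, se, z, p = wald_from_or_ci(1.37, 1.12, 1.68)
```

Under that assumption the recovered P-value rounds to 0.002, matching the reported figure. Intervals that are not symmetric on the log scale (for example, bootstrap intervals) will not reproduce their P-values this way.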

Figure 3 (image description). Two forest plots of adjusted odds ratios (ORs) with 95% confidence intervals (CIs) and P values across subgroups. Panel (a): overall participants (OR 1.37), females (OR 1.36), males (OR 1.22), ages 20–60 years (OR 1.71), and ≥60 years (OR 1.27). Panel (b): overall participants (OR 1.23), females (OR 1.26), males (OR 1.16), ages 20–60 years (OR 1.30), and ≥60 years (OR 1.14).

Figure 3. Forest plots demonstrating associations between mVOC mixture and the risk of RA according to WQS regression (a) and qgcomp (b) analyses. All analyses were adjusted by the covariates selected by LASSO regressions previously conducted. mVOCs, metabolites of volatile organic compounds; RA, rheumatoid arthritis; WQS, weighted quantile sum; qgcomp, quantile g-computation.

The detailed weights of these mVOCs are presented in Supplementary Figure S5. CYMA emerged as one of the most influential mVOCs for the prevalence of RA, with the highest weights (0.327, 0.822, and 0.395) occurring in the overall population, females, and individuals aged 20–60 years, respectively. This was followed closely by DHBM, CYMA, AMCA, and HPMA in different subpopulations. All of these individual mVOCs were significantly associated with RA in the binary logistic regressions (Supplementary Table S5).

The qgcomp model, which refrains from presupposing a uniform direction for the impact of individual VOC exposures, produced estimated exposure weights that included both positive and negative contributions. The findings are generally consistent with those obtained from the WQS model, suggesting that simultaneous exposure to VOCs is significantly associated with an increased risk of RA across the entire population (OR = 1.23; 95% CI = 1.07, 1.49; P = 0.034) and among female participants (OR = 1.26; 95% CI = 1.03, 1.58; P = 0.046). Notably, metabolites such as AMCA, DHBM, CYMA, and ATCA were identified as the main contributors to the increase in RA risk in the positive direction. The detailed positive and negative weights attributed to each of the mVOCs and the joint effects are illustrated in Supplementary Figures S6, S7.

3.6 Mediation analysis

To further investigate the mechanisms underlying the relationship between mVOCs and the prevalence of RA, mediation analysis was performed with a focus on inflammatory markers. The LMR was used as a representative marker for inflammation. As indicated in Table 3, the LMR played a significant role in mediating the association between mVOCs and RA, with a mediating proportion of 4.65% (95% CI: 4.44, 4.88) (P < 0.001) in the overall participants. Stratified analysis showed that the mediating proportions of LMR were 6.52% (P < 0.001), 0.77% (P < 0.001), 3.39% (P < 0.001), and 1.82% (P < 0.001) among females, males, individuals aged 20-60 years, and those aged 60 years and above, respectively.
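The mediating proportion is the share of the total effect that passes through the mediator, that is, the indirect effect divided by the sum of the indirect and direct effects. The numbers below are hypothetical and were chosen only so that the ratio reproduces the reported 4.65%; the underlying effect estimates appear in Table 3, not here.

```python
def proportion_mediated(indirect_effect, direct_effect):
    """Share of the total effect (indirect + direct) carried by the mediator."""
    total = indirect_effect + direct_effect
    if total == 0:
        raise ValueError("total effect is zero; proportion is undefined")
    return indirect_effect / total

# Hypothetical effects: mixture -> LMR -> RA (indirect) and mixture -> RA (direct).
prop = proportion_mediated(0.0093, 0.1907)
percent = 100 * prop
```

A small mediated proportion such as this indicates that most of the association operates through pathways other than the measured inflammatory index.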

Table 3. Mediating effects of inflammatory factors on the association between mVOCs and the risk of RA.

4 Discussion

Our primary objective was to examine the associations of individual and combined exposure to 16 specific mVOCs with the prevalence of RA in the U.S. adult population. The key findings of our investigation revealed a clear association, with certain mVOCs (such as AAMA, AMCA, CEMA, CYMA, DHBM, HPMA, MHB3, PHGA, and HPMM) being positively linked to the prevalence of RA within the sampled population. Because of the collinearity among different mVOCs, we employed LASSO regression, which simplifies models by penalizing large coefficients, to identify mVOCs more strongly associated with RA prevalence. Additionally, WQS regression and the qgcomp model showed an elevated risk of RA associated with higher levels of the selected mVOC mixture, with CYMA making the largest contribution to this risk. Finally, mediation analysis indicated that the inflammatory index, as represented by the LMR, mediated 4.65% of the association between VOC co-exposure and RA.

Recent literature has extensively documented the adverse effects of environmental contaminants such as VOCs on human health, linking them to oxidative stress and inflammation in pregnancy, childhood asthma, depression, and cancer (13, 20–23). VOCs are predominantly metabolized in the liver by cytochrome P450 into various hydroxylated and ring-opened compounds and are subsequently excreted in urine (24). Consequently, urinary metabolites can serve as biomarkers for estimating exposure to VOCs (25). To date, the link between mVOC co-exposure and the prevalence of RA has not been thoroughly investigated. A study based on NHANES data revealed a significant association between individual mVOCs, including AMCA and HPMA, and the risk of RA (11). However, that study has methodological limitations, chiefly because it did not account for the statistical collinearity among mVOCs. Given that mVOCs represent a broad spectrum of substances, often stemming from similar parent compounds, these limitations might skew understanding of their interplay with RA. In addition, it is crucial to consider the typical scenario in which humans are exposed to a mixture of VOCs rather than to isolated compounds (26, 27). In general, examination of single pollutants in isolation provides limited insight, as it fails to capture potential synergistic or antagonistic interactions that can occur in the pathogenesis of disorders. To address this complexity, our study offers a more realistic and comprehensive assessment of the association between mVOC co-exposure and the risk of RA.

Our results revealed that most individual mVOCs are significantly associated with RA, which is generally consistent with the findings of previous studies (11). More importantly, our findings from multiple-pollutant models also revealed positive associations between the prevalence of RA and the mVOC mixture, with the highest contributors being CYMA, DHBM, AMCA, and ATCA. Stratified analyses further showed that the mVOC mixture was significantly associated with RA prevalence within particular subpopulations, specifically females and individuals aged 20–60 years, which are the demographic groups known to have the highest prevalence rates of RA. Furthermore, despite the lack of direct experimental evidence for causality, our mediation analysis suggests for the first time that inflammatory factors may serve as mediators of the relationship between mVOCs and RA. Recognition of inflammatory markers as mediators emphasizes the importance of the inflammatory response in the pathophysiology of RA. This observation is consistent with the prevailing view of RA as an inflammation-driven disease and paves the way for additional investigations into precise preventive approaches. Research into the direct relationships between mVOCs and the risk of RA is relatively scarce. In the field of other musculoskeletal disorders, Zhou et al. reported notable correlations of urinary concentrations of DHBM, AMCA, and ATCA with bone mineral density and osteoarthritis, suggesting that mVOCs might play a role in altering the bone microenvironment, which may subsequently lead to arthritic inflammation (10, 28, 29).

As mentioned above, inflammation plays a central role in the pathogenesis of RA, and environmental factors, including VOCs, potentially exacerbate these inflammatory pathways and contribute to the disease's onset or progression. Acrylonitrile, the precursor of CYMA, is an important monomer in the organic synthesis industry and is widely used in the production of synthetic fibers, resins, and plastics (30). Acrylonitrile is found mainly in cigarette smoke, followed by drinking water, food, and air, and can be absorbed through ingestion, inhalation of vapors, or dermal contact (31). Research has shown that acrylonitrile can induce an inflammatory response across various cell types, including neuronal cells, testicular cells, oocytes, and gastric mucosal cells (30, 32, 33). This response is characterized by the production of reactive oxygen species (ROS) and the subsequent activation of nuclear factor κB (NF-κB), which are key elements of the compound's cytotoxic effects observed in vitro and may eventually lead to synovitis and to bone and cartilage degradation (34). DHBM, a metabolite of 1,3-butadiene, has been identified as a significant secondary compound associated with RA. Among the environmental sources of 1,3-butadiene, cigarette smoke stands out as a primary contributor, with other sources including emissions from industrial processes, automobile exhaust, and burning of materials such as wood, plastics, and rubber (35). Numerous studies have investigated the deleterious effects of 1,3-butadiene exposure on diseases involving inflammatory components and the respiratory and cardiovascular systems (36, 37). N,N-dimethylformamide is a parent compound of AMCA, and in addition to generating inflammation similar to the aforementioned effects of acrylonitrile, it can induce neutrophil infiltration and activate the NLRP3 inflammasome in the livers of mice, potentially leading to cellular damage (38). Increased urinary AMCA levels may lead to the development of inflammatory and fibrotic lesions in the liver through an imbalance in lipid metabolism and the inflammatory response (39). Another study showed that an increased urinary concentration of AMCA impairs lung function through an increased level of C-reactive protein, a commonly used marker of inflammation (40). Cyanide (the precursor of ATCA) is known primarily for its neurotoxic effects (38). Beyond neurotoxicity, cyanide exposure exerts cytotoxicity in non-neuronal cells, as evidenced by an array of damaging cellular events and inflammatory responses (38, 41). These experimental findings suggest a connection between mVOCs and inflammation.

Our study has several strengths. First, it draws upon data that are nationally representative and come from a sizable cohort, lending strong credibility to our conclusions. Furthermore, we bolstered the robustness of our analysis by examining the impact of individual VOCs and by taking a holistic view of the links between co-exposure to VOCs and RA in the overall population and in subpopulations defined by sex and age. We utilized a suite of statistical techniques, including WQS regression, the qgcomp model, and mediation analysis, paving the way for a more nuanced grasp of how VOC exposure might influence RA prevalence.
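To make the idea behind a weighted quantile sum concrete, the following minimal Python sketch scores each exposure by quartile and combines the scores into a single index. It is an illustration under stated assumptions, not the analysis code of this study: the function names, the toy concentrations, and the hand-picked weights are all hypothetical, and in an actual WQS regression the weights are estimated from the data (typically across bootstrap samples) rather than fixed in advance.

```python
from bisect import bisect_right
from math import isclose
from statistics import quantiles


def quantile_scores(values, n_quantiles=4):
    """Map each value to an integer score 0..n_quantiles-1 (its quantile bin)."""
    cuts = quantiles(values, n=n_quantiles)  # n_quantiles - 1 cut points
    return [bisect_right(cuts, v) for v in values]


def wqs_index(exposures, weights):
    """Weighted sum of quantile scores, one index value per participant.

    exposures: dict of exposure name -> list of concentrations (equal lengths)
    weights:   dict of exposure name -> non-negative weight; weights sum to 1
    """
    if set(exposures) != set(weights):
        raise ValueError("exposures and weights must share the same keys")
    total = sum(weights.values())
    if any(w < 0 for w in weights.values()) or not isclose(total, 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    scores = {name: quantile_scores(vals) for name, vals in exposures.items()}
    n = len(next(iter(scores.values())))
    return [sum(weights[k] * scores[k][i] for k in scores) for i in range(n)]


# Synthetic toy data, not NHANES measurements.
toy = {
    "mvoc_a": [1, 2, 3, 4, 5, 6, 7, 8],
    "mvoc_b": [8, 7, 6, 5, 4, 3, 2, 1],
}
toy_weights = {"mvoc_a": 0.75, "mvoc_b": 0.25}  # hypothetical, fixed by hand
index = wqs_index(toy, toy_weights)
print(index)
```

In a full analysis the resulting index would enter a regression model for RA together with the covariates; the weights then indicate how much each metabolite contributes to the mixture effect.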

While our study provides valuable insights into the environmental determinants of RA, it is not without limitations. First, self-reported RA diagnosis might introduce reporting bias, though NHANES is known for its rigorous data collection standards. Second, the cross-sectional nature of NHANES data precluded us from establishing a causal relationship between VOC exposure and RA prevalence. This one-time assessment may not accurately reflect chronic exposure, and the possibility of reverse causality, particularly in our mediation analysis, cannot be ruled out. Furthermore, our findings are derived from a U.S. population, and thus may not be generalizable to other populations with different genetic backgrounds, lifestyles, or environmental exposure profiles. Therefore, both longitudinal studies and research in more diverse cohorts are warranted to confirm these associations. Third, although our model incorporated key predictors, we cannot dismiss the possibility that unaccounted factors, such as other environmental air pollutants and covariates, including genetic and occupational factors, might skew the results (42, 43). Fourth, from an initial cohort of 55,810 participants in the selected NHANES cycles, 51,188 were excluded primarily due to missing data on urinary mVOCs and other key covariates, resulting in a final analytical sample of 4,622 participants. This substantial exclusion may have introduced selection bias, potentially limiting the generalizability of our findings to the broader population. Finally, the models (LASSO, WQS and qgcomp) used in this study are not currently adapted for complex survey designs, meaning the necessary NHANES sample weights could not be applied, which may compromise the generalizability of our findings to the broader U.S. population.
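The sample flow and the role of survey weights described above can be checked with a few lines of Python. This is a hedged sketch: the participant counts come from the text, whereas the helper names and the four-person toy sample with its weights are hypothetical and serve only to show how a weighted estimate can differ from an unweighted one.

```python
def sample_flow(initial, excluded):
    """Return (analytic sample size, share of the initial cohort retained)."""
    final = initial - excluded
    return final, final / initial


def weighted_prevalence(outcomes, weights):
    """Weighted mean of a 0/1 outcome: sum(w * y) / sum(w)."""
    if len(outcomes) != len(weights):
        raise ValueError("outcomes and weights must have the same length")
    return sum(w * y for w, y in zip(weights, outcomes)) / sum(weights)


# Counts reported in the limitations paragraph.
final_n, retained = sample_flow(55_810, 51_188)
print(final_n, f"{retained:.1%}")

# Hypothetical toy sample: one case who represents few people in the population.
toy_outcomes = [1, 0, 0, 0]
toy_weights = [1, 3, 3, 3]
unweighted = sum(toy_outcomes) / len(toy_outcomes)
weighted = weighted_prevalence(toy_outcomes, toy_weights)
print(unweighted, weighted)
```

Because roughly one in twelve of the initial participants is retained and the mixture models ignore the sampling weights, estimates from the analytic sample need not match what a fully weighted analysis of the target population would give.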

5 Conclusion

In summary, data from this nationwide cross-sectional study support the hypothesis that both individual and combined exposures to VOCs are linked to a higher prevalence of RA. Furthermore, these findings emphasize the significance of inflammatory pathways as potential intermediaries in this connection. Notably, these associations are more pronounced among females and young to middle-aged individuals. Considering the limitations inherent in the present study, prospective cohort studies and experimental research are necessary to verify these associations and clarify the underlying biological mechanisms involved.

Data availability statement

The original contributions presented in the study are included in the article/Supplementary material. The data and analysis code supporting the findings of this study are available at: https://github.com/ruijinyin/Association-between-mVOCs-and-rheumatoid-arthritis. Further inquiries can be directed to the corresponding authors.

Ethics statement

The studies involving humans were approved by the National Center for Health Statistics Ethics Review Board. The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study. Written informed consent was obtained from the individual(s) for the publication of any potentially identifiable images or data included in this article.

Author contributions

TR: Conceptualization, Data curation, Methodology, Resources, Writing – original draft. EZ: Data curation, Methodology, Writing – review & editing. TC: Methodology, Writing – review & editing, Investigation. MW: Writing – review & editing, Data curation, Formal analysis, Visualization. YY: Visualization, Writing – review & editing, Conceptualization, Supervision, Validation. JW: Funding acquisition, Investigation, Project administration, Writing – review & editing. WC: Supervision, Validation, Visualization, Writing – review & editing.

Funding

The author(s) declare that financial support was received for the research and/or publication of this article. This work was supported by Suzhou Science and Technology Development Program (No. SLJ2021009) and Integrated Chinese and Western Medicine Clinical Project for Major and Refractory Diseases (Polymyositis) (No. 2024-3).

Acknowledgments

We would like to express our sincere gratitude to all the investigators, staff, and participants of the NHANES program for their invaluable contributions that made this research possible.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Gen AI was used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpubh.2025.1694503/full#supplementary-material

Abbreviations

AAMA, N-Acetyl-S-(2-carbamoylethyl)-L-cysteine; PHGA, Phenylglyoxylic acid; MHB3, N-Acetyl-S-(4-hydroxy-2-butenyl)-L-cysteine; MADA, Mandelic acid; HPMM, N-Acetyl-S-(3-hydroxypropyl-1-methyl)-L-cysteine; HPMA, N-Acetyl-S-(3-hydroxypropyl)-L-cysteine; HPM2, N-Acetyl-S-(2-hydroxypropyl)-L-cysteine; DHBM, N-Acetyl-S-(3,4-dihydroxybutyl)-L-cysteine; CYMA, N-Acetyl-S-(2-cyanoethyl)-L-cysteine; CEMA, N-Acetyl-S-(2-carboxyethyl)-L-cysteine; BPMA, N-Acetyl-S-(n-propyl)-L-cysteine; BMA, N-Acetyl-S-(benzyl)-L-cysteine; ATCA, 2-Aminothiazoline-4-carboxylic acid; AMCA, N-Acetyl-S-(N-methylcarbamoyl)-L-cysteine; 34MH, 3- and 4-Methylhippuric acid; 2MHA, 2-Methylhippuric acid.

References

1. Smolen JS, Aletaha D, McInnes IB. Rheumatoid arthritis. Lancet. (2016) 388:2023–38. doi: 10.1016/S0140-6736(16)30173-8

2. Smith MH, Berman JR. What is rheumatoid arthritis? JAMA. (2022) 327:1194. doi: 10.1001/jama.2022.0786

3. Scherer HU, Häupl T, Burmester GR. The etiology of rheumatoid arthritis. J Autoimmun. (2020) 110:102400. doi: 10.1016/j.jaut.2019.102400

4. Tang B, Liu Q, Ilar A, Wiebert P, Hägg S, Padyukov L, et al. Occupational inhalable agents constitute major risk factors for rheumatoid arthritis, particularly in the context of genetic predisposition and smoking. Ann Rheum Dis. (2023) 82:316–23. doi: 10.1136/ard-2022-223134

5. Zhang J, Fang XY, Wu J, Fan YG, Leng RX, Liu B, et al. Association of combined exposure to ambient air pollutants, genetic risk, and incident rheumatoid arthritis: a prospective cohort study in the UK Biobank. Environ Health Perspect. (2023) 131:37008. doi: 10.1289/EHP10710

6. Deane KD, Demoruelle MK, Kelmenson LB, Kuhn KA, Norris JM, Holers VM. Genetic and environmental risk factors for rheumatoid arthritis. Best Pract Res Clin Rheumatol. (2017) 31:3–18. doi: 10.1016/j.berh.2017.08.003

7. Kwon JW, Park HW, Kim WJ, Kim MG, Lee SJ. Exposure to volatile organic compounds and airway inflammation. Environ Health. (2018) 17:65. doi: 10.1186/s12940-018-0410-1

8. Chen S, Wan Y, Qian X, Wang A, Mahai G, Li Y, et al. Urinary metabolites of multiple volatile organic compounds, oxidative stress biomarkers, and gestational diabetes mellitus: association analyses. Sci Total Environ. (2023) 875:162370. doi: 10.1016/j.scitotenv.2023.162370

9. Ahmed I, Greenwood R, Costello B, Ratcliffe N, Probert CS. Investigation of faecal volatile organic metabolites as novel diagnostic biomarkers in inflammatory bowel disease. Aliment Pharmacol Ther. (2016) 43:596–611. doi: 10.1111/apt.13522

10. Zhou HL, Di DS, Cui ZB, Zhou TT, Yuan TT, Liu Q, et al. Whole-body aging mediates the association between exposure to volatile organic compounds and osteoarthritis among US middle-to-old-aged adults. Sci Total Environ. (2024) 907:167728. doi: 10.1016/j.scitotenv.2023.167728

11. Lei T, Qian H, Yang J, Hu Y. The exposure to volatile organic chemicals associates positively with rheumatoid arthritis: a cross-sectional study from the NHANES program. Front Immunol. (2023) 14:1098683. doi: 10.3389/fimmu.2023.1098683

12. Beidelschies M, Lopez R, Pizzorno J, Le P, Rothberg MB, Husni ME, et al. Polycyclic aromatic hydrocarbons and risk of rheumatoid arthritis: a cross-sectional analysis of the national health and nutrition examination survey, 2007-2016. BMJ Open. (2023) 13:e071514. doi: 10.1136/bmjopen-2022-071514

13. Li M, Wan Y, Qian X, Wang A, Mahai G, He Z, et al. Urinary metabolites of multiple volatile organic compounds among pregnant women across pregnancy: variability, exposure characteristics, and associations with selected oxidative stress biomarkers. Environ Int. (2023) 173:107816. doi: 10.1016/j.envint.2023.107816

14. Alwis KU, Blount BC, Britt AS, Patel D, Ashley DL. Simultaneous analysis of 28 urinary VOC metabolites using ultra high performance liquid chromatography coupled with electrospray ionization tandem mass spectrometry (UPLC-ESI/MSMS). Anal Chim Acta. (2012) 750:152–60. doi: 10.1016/j.aca.2012.04.009

15. Caudill SP, Schleicher RL, Pirkle JL. Multi-rule quality control for the age-related eye disease study. Stat Med. (2008) 27:4094–106. doi: 10.1002/sim.3222

16. Fang L, Zhao H, Chen Y, Ma Y, Xu S, Xu S, et al. The combined effect of heavy metals and polycyclic aromatic hydrocarbons on arthritis, especially osteoarthritis, in the US adult population. Chemosphere. (2023) 316:137870. doi: 10.1016/j.chemosphere.2023.137870

17. Wyss R, van der Laan M, Gruber S, Shi X, Lee H, Dutcher SK, et al. Targeted learning with an undersmoothed lasso propensity score model for large-scale covariate adjustment in health-care database studies. Am J Epidemiol. (2024) 193:1632–40. doi: 10.1093/aje/kwae023

18. Keil AP, Buckley JP, O'Brien KM, Ferguson KK, Zhao S, White AJ, et al. Quantile-based G-computation approach to addressing the effects of exposure mixtures. Environ Health Perspect. (2020) 128:47004. doi: 10.1289/EHP5838

19. Snowden JM, Rose S, Mortimer KM. Implementation of G-computation on a simulated data set: demonstration of a causal inference technique. Am J Epidemiol. (2011) 173:731–8. doi: 10.1093/aje/kwq472

20. Tang L, Liu M, Tian J. Volatile organic compounds exposure associated with depression among U.S. adults: results from NHANES 2011-2020. Chemosphere. (2024) 349:140690. doi: 10.1016/j.chemosphere.2023.140690

21. Xiong Y, Zhou J, Xing Z, Du K. Cancer risk assessment for exposure to hazardous volatile organic compounds in Calgary, Canada. Chemosphere. (2021) 272:129650. doi: 10.1016/j.chemosphere.2021.129650

22. Kuang H, Li Z, Lv X, Wu P, Tan J, Wu Q, et al. Exposure to volatile organic compounds may be associated with oxidative DNA damage-mediated childhood asthma. Ecotoxicol Environ Saf. (2021) 210:111864. doi: 10.1016/j.ecoenv.2020.111864

23. Zeng L, Yang B, Xiao S, Yan M, Cai Y, Liu B, et al. Species profiles, in-situ photochemistry and health risk of volatile organic compounds in the gasoline service station in China. Sci Total Environ. (2022) 842:156813. doi: 10.1016/j.scitotenv.2022.156813

24. James CA, Xin G, Doty SL, Strand SE. Degradation of low molecular weight volatile organic compounds by plants genetically modified with mammalian cytochrome P450 2E1. Environ Sci Technol. (2008) 42:289–93. doi: 10.1021/es071197z

25. Tan L, Liu Y, Liu J, Liu Z, Shi R. Associations of individual and mixture exposure to volatile organic compounds with metabolic syndrome and its components among US adults. Chemosphere. (2024) 347:140683. doi: 10.1016/j.chemosphere.2023.140683

26. Altomare DF, Di Lena M, Porcelli F, Trizio L, Travaglio E, Tutino M, et al. Exhaled volatile organic compounds identify patients with colorectal cancer. Br J Surg. (2013) 100:144–50. doi: 10.1002/bjs.8942

27. Braun JM, Gennings C, Hauser R, Webster TF. What can epidemiological studies tell us about the impact of chemical mixtures on human health? Environ Health Perspect. (2016) 124:A6–9. doi: 10.1289/ehp.1510569

28. Zhou HL, Su GH, Zhang RY, Di DS, Wang Q. Association of volatile organic compounds co-exposure with bone health indicators and potential mediators. Chemosphere. (2022) 308(Pt 1):136208. doi: 10.1016/j.chemosphere.2022.136208

29. Shaw AT, Gravallese EM. Mediators of inflammation and bone remodeling in rheumatic disease. Semin Cell Dev Biol. (2016) 49:2–10. doi: 10.1016/j.semcdb.2015.10.013

30. Luo YS, He QK, Sun MX, Qiao FX, Liu YC, Xu CL, et al. Acrylonitrile exposure triggers ovarian inflammation and decreases oocyte quality probably via mitochondrial dysfunction induced apoptosis in mice. Chem Biol Interact. (2022) 360:109934. doi: 10.1016/j.cbi.2022.109934

31. Karp EM, Eaton TR, Sànchez i Nogué V, Vorotnikov V, Biddy MJ, Tan ECD, et al. Renewable acrylonitrile production. Science. (2017) 358:1307–10. doi: 10.1126/science.aan1059

32. Caito SW, Yu Y, Aschner M. Differential inflammatory response to acrylonitrile in rat primary astrocytes and microglia. Neurotoxicology. (2014) 42:1–7. doi: 10.1016/j.neuro.2014.02.006

33. Dang Y, Li Z, Wei Q, Zhang R, Xue H, Zhang Y. Protective effect of apigenin on acrylonitrile-induced inflammation and apoptosis in testicular cells via the NF-κB pathway in rats. Inflammation. (2018) 41:1448–59. doi: 10.1007/s10753-018-0791-x

34. Manou-Stathopoulou S, Lewis MJ. Diversity of NF-κB signalling and inflammatory heterogeneity in rheumatic autoimmune disease. Semin Immunol. (2021) 58:101649. doi: 10.1016/j.smim.2022.101649

35. Ott MG. Assessment of 1,3-butadiene epidemiology studies. Environ Health Perspect. (1990) 86:135–41. doi: 10.1289/ehp.9086135

36. Doyle M, Sexton KG, Jeffries H, Jaspers I. Atmospheric photochemical transformations enhance 1,3-butadiene-induced inflammatory responses in human epithelial cells: the role of ozone and other photochemical degradation products. Chem Biol Interact. (2007) 166:163–9. doi: 10.1016/j.cbi.2006.05.016

37. McGraw KE, Riggs DW, Rai S, Navas-Acien A, Xie Z, Lorkiewicz P, et al. Exposure to volatile organic compounds - acrolein, 1,3-butadiene, and crotonaldehyde - is associated with vascular dysfunction. Environ Res. (2021) 196:110903. doi: 10.1016/j.envres.2021.110903

38. Liu H, Li MJ, Zhang XN, Wang S, Li LX, Guo FF, et al. N,N-dimethylformamide-induced acute liver damage is driven by the activation of NLRP3 inflammasome in liver macrophages of mice. Ecotoxicol Environ Saf. (2022) 238:113609. doi: 10.1016/j.ecoenv.2022.113609

39. Xu L, Zhao Q, Luo J, Ma W, Jin Y, Li C, et al. Integration of proteomics, lipidomics, and metabolomics reveals novel metabolic mechanisms underlying N, N-dimethylformamide induced hepatotoxicity. Ecotoxicol Environ Saf. (2020) 205:111166. doi: 10.1016/j.ecoenv.2020.111166

40. Wang B, Yang S, Guo Y, Wan Y, Qiu W, Cheng M, et al. Association of urinary dimethylformamide metabolite with lung function decline: the potential mediating role of systematic inflammation estimated by C-reactive protein. Sci Total Environ. (2020) 726:138604. doi: 10.1016/j.scitotenv.2020.138604

41. Stutz MD, Gangell CL, Berry LJ, Garratt LW, Sheil B, Sly PD. Cyanide in bronchoalveolar lavage is not diagnostic for Pseudomonas aeruginosa in children with cystic fibrosis. Eur Respir J. (2011) 37:553–8. doi: 10.1183/09031936.00024210

42. Kronzer VL, Sparks JA. Occupational inhalants, genetics and the respiratory mucosal paradigm for ACPA-positive rheumatoid arthritis. Ann Rheum Dis. (2023) 82:303–5. doi: 10.1136/ard-2022-223286

43. Gumtorntip W, Kasitanon N, Louthrenoo W, Chattipakorn N, Chattipakorn SC. Potential roles of air pollutants on the induction and aggravation of rheumatoid arthritis: from cell to bedside studies. Environ Pollut. (2023) 334:122181. doi: 10.1016/j.envpol.2023.122181

Keywords: volatile organic compounds, VOCs, mVOCs, rheumatoid arthritis (RA), lymphocyte-to-monocyte ratio (LMR)

Citation: Ren T, Zhou E, Cheng T, Wang M, Yin Y, Wu J and Chen W (2025) Association between volatile organic compound co-exposure and the prevalence of rheumatoid arthritis: a nationwide cross-sectional study. Front. Public Health 13:1694503. doi: 10.3389/fpubh.2025.1694503

Received: 28 August 2025; Accepted: 23 October 2025;
Published: 10 November 2025.

Edited by:

Annika Tillander, Linköping University, Sweden

Reviewed by:

Amir Shahmoradi, University of Texas at Arlington, United States
Étienne Babin, European Food Safety Authority (EFSA), Italy

Copyright © 2025 Ren, Zhou, Cheng, Wang, Yin, Wu and Chen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Yufeng Yin, yinyufeng@suda.edu.cn; Jian Wu, njwujian@163.com; Weichang Chen, weichangchen@126.com
