
REVIEW article

Front. Big Data, 14 November 2025

Sec. Medicine and Public Health

Volume 8 - 2025 | https://doi.org/10.3389/fdata.2025.1621526

Achieving health equity in immune disease: leveraging big data and artificial intelligence in an evolving health system landscape


Stan Kachnowski1,2, Asif H. Khan3*, Shadé Floquet4, Kendal K. Whitlock5, Juan Pablo Wisnivesky6,7, Daniel B. Neill8, Irene Dankwa-Mullan9, Gezzer Ortega10, Moataz Daoud4, Raza Zaheer4, Maia Hightower11, Paul Rowe3
  • 1Healthcare Innovation and Technology Lab, New York, NY, United States
  • 2Columbia Business School, Columbia University, New York, NY, United States
  • 3Sanofi, Morristown, NJ, United States
  • 4Sanofi, Cambridge, MA, United States
  • 5Walgreens Boots Alliance, New York, NY, United States
  • 6Division of General Internal Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, United States
  • 7Division of Pulmonary, Critical Care, and Sleep Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, United States
  • 8Courant Institute of Mathematical Sciences, Department of Computer Science, New York University, New York, NY, United States
  • 9Department of Health Policy and Management, Milken Institute School of Public Health, George Washington University, Washington, DC, United States
  • 10Center for Surgery and Public Health, Department of Surgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, United States
  • 11Veritas Healthcare Insights, Park City, UT, United States

Prevalence of immune diseases is rising, imposing burdens on patients, healthcare providers, and society. Addressing the future impact of immune diseases requires “big data” on global distribution/prevalence, patient demographics, risk factors, biomarkers, and prognosis to inform prevention, diagnosis, and treatment strategies. Big data offer promise by integrating diverse real-world data sources with artificial intelligence (AI) and big data analytics (BDA), yet cautious implementation is vital due to their potential to perpetuate and exacerbate existing biases. In this review, we outline some of the key challenges associated with achieving health equity through the use of big data, AI, and BDA in immune diseases and present potential solutions. For example, political/institutional will and stakeholder engagement are essential, requiring evidence of return on investment, a clear definition of success (including key metrics), and improved communication of unmet needs, disparities in treatments and outcomes, and the benefits of AI and BDA in achieving health equity. Broad representation and engagement are required to foster trust and inclusivity, involving patients and community organizations in study design, data collection, and decision-making processes. Enhancing the technical capabilities and accountability of AI and BDA is also crucial to address data quality and diversity issues, ensuring datasets are of sufficient quality and representative of minoritized populations. Lastly, mitigating biases in AI and BDA is imperative, necessitating robust and iterative fairness assessments, continuous evaluation, and strong governance. Collaborative efforts to overcome these challenges are needed to leverage AI and BDA effectively, including an infrastructure for sharing harmonized big data, to advance health equity in immune diseases through transparent, fair, and impactful data-driven solutions.

1 Introduction

Prevalence and burden of immune diseases, including asthma, atopic dermatitis, rheumatoid arthritis (RA), multiple sclerosis (MS), and inflammatory bowel disease (IBD), are increasing in high-income countries, and recent estimates suggest a prevalence of approximately 1 in 10 individuals for immune diseases (Lerner et al., 2015; Cao et al., 2023; Conrad et al., 2023; Miller, 2023a; Shin et al., 2023; Wang et al., 2023). The rise in these often lifelong, progressive, and incurable immune diseases (Wylezinski et al., 2019) is alarming, and although population growth plays a role, the underlying reasons remain unclear. However, as immune diseases occur in genetically predisposed individuals following exposure to environmental factors (e.g., chemicals, dietary components, gut dysbiosis, and infections) (Vojdani et al., 2014; Pisetsky, 2023), it is likely that evolving environmental exposures may explain the increases in autoimmunity and immune disease (Miller, 2023a). As public health data collection and analysis over the past five decades have improved, environmental factors and occupational exposures have emerged that appear to be unevenly distributed across populations, as evidenced by the socioeconomic and regional disparities underpinning immune diseases (Quinn et al., 2007; Roberts and Erdei, 2020; Conrad et al., 2023; Global Burden of Disease (GBD) 2019 IMID Collaborators, 2023). For example, changes to these exposures/disparities may explain the increasing prevalence of MS among African Americans, particularly women, who have overtaken White individuals as the population at greatest risk (Goonesekera et al., 2024). Outcome disparities are also common in minoritized populations with immune disease and include underdiagnosis, suboptimal treatment, higher morbidity, worse quality of life, and higher mortality (Davis et al., 2021; Global Burden of Disease (GBD) 2019 IMID Collaborators, 2023).

To address the future impact of immune disease, data on the distribution, risk factors (genetic, behavioral, and environmental), and biomarkers have been proposed to enhance disease understanding, develop preventive strategies, and improve diagnosis and treatment (Lerner et al., 2015; Peng et al., 2021; Miller, 2023a). Such evidence can be obtained through “big data,” defined by the seven Vs (Volume, Velocity, Variety, Variability, Veracity, Visualization, and Value) (Batko and Slezak, 2022), that consolidate real-world clinical, research, biometric, patient-reported outcome, social, and financial data. Further, by collecting health, socioeconomic, and sociodemographic data, big data have the potential to improve understanding of health disparities and identify approaches to improve health equity (Galea and Abdalla, 2023). A necessity of big data, owing to its complexity and unstructured sources, is the use of artificial intelligence (AI)-powered big data analytics (BDA), such as machine learning (ML), whereby computers use algorithms to learn from data and improve task performance (e.g., prediction of outcome variables) (Fuller et al., 2017). BDA in healthcare comprises data collection, storage, analysis, data mining, and ML techniques to provide descriptive, predictive, prescriptive, and discovery analytics using large volumes of omics, biomedical, telemedicine, and electronic health record (EHR) data, enabling big data to inform preventive and precision medicine (Bartoloni et al., 2022; Batko and Slezak, 2022).

Although AI and BDA can facilitate the identification and resolution of health inequities (Galea and Abdalla, 2023), and have been used extensively in immune disease to facilitate early diagnosis or prognostic models (Danieli et al., 2024), they can perpetuate inequities if the data are not representative of minoritized populations (Norori et al., 2021; Gurevich et al., 2023). For example, unrepresentative training data or other flawed/biased assumptions may result in algorithmic bias (Gurevich et al., 2023), whereby existing inequities are compounded or amplified by algorithms that erroneously assign patients with different needs, or levels of risk, the same algorithm score (or vice versa) (Obermeyer et al., 2019; Panch et al., 2019; Chin et al., 2023). As AI and BDA have applications across the full spectrum of healthcare (diagnosis and treatment, prognosis/risk stratification, triage, and resource allocation), there is potential for varying levels of benefit and harm (Favaretto et al., 2019; Peng et al., 2021; Batko and Slezak, 2022; Chin et al., 2023; Gurevich et al., 2023; Danieli et al., 2024). The causes of discrimination in data analytics, solutions to discrimination in big data, and barriers to their adoption have been reviewed previously (Favaretto et al., 2019). Additionally, various frameworks and interdisciplinary approaches have been proposed to ensure AI and BDA promote, and do not hinder, health equity (Ibrahim et al., 2020; Clark et al., 2021; Dankwa-Mullan and Weeraratne, 2022; Chin et al., 2023). This review outlines some of the key challenges associated with achieving health equity in immune diseases through the use of big data, AI, and BDA, together with potential solutions.

2 Challenges and solutions in implementing big data to address health equity

As highlighted by a recent systematic literature review, there are underlying challenges to the implementation of AI and BDA within immunology and allergy, including poor data quality and quantity, limited access to shared datasets, geographic bias, the high resource burden of managing complex data, lack of AI model interpretability, inadequate clinician training on AI integration, and ethical concerns around privacy, bias, and regulation (Xiao et al., 2025). We will discuss these issues in relation to health equity, aiming to identify solutions to key challenges such as political/institutional will to implement change [e.g., evidence to support a return on investment (ROI)], community engagement, and technical capabilities of AI and BDA (Galea and Abdalla, 2023) (see Figure 1).

Figure 1
Flowchart titled “Achieving Health Equity Through the Use of Big Data, AI, and BDA in Immune Diseases.” It outlines key challenges in political/institutional will, community engagement, and technical capabilities. Proposed solutions include learning from past approaches, improving trust and data quality, interdisciplinary expertise, standardization, stakeholder communication, addressing the digital divide, bias mitigation, and algorithm fairness. Collaborative efforts are highlighted as essential for advancing health equity through transparent and impactful data-driven solutions.

Figure 1. Overview of key challenges and proposed solutions to achieve health equity through use of big data and BDA in immune diseases. AI, artificial intelligence; BDA, big data analytics; ROI, return on investment; RWD, real-world data.

2.1 Political/institutional will to implement change: challenges and solutions to a lack of interdisciplinary subject-matter experts

The political/institutional will to implement change can be defined as obtaining buy-in from key decision makers, along with the commitment and capacity of healthcare organizations, research institutions, and industry stakeholders to drive meaningful improvements in health equity. Indeed, health institutions play a critical role in either perpetuating or addressing health inequities and must be held accountable for their impact (Chisolm et al., 2023). Given the complexities introduced by shifting political administrations and congressional policies on health equity initiatives, these healthcare institutions are uniquely positioned to drive systemic change (Deloitte Center for Health Solutions, 2025).

2.1.1 Increasing awareness of health equity initiatives to foster multidisciplinary expertise

The key to driving the political/institutional and wider systemic change needed to address health equity in immune disease is better evidence generation, together with clearer communication by subject-matter experts of the benefits of AI and BDA to key stakeholders (e.g., payers, clinicians, and regulators). While there are subject-matter experts in AI/BDA, health equity, or immune diseases, there is understandably a limited pool of experts proficient in all these areas. The benefits of AI and BDA therefore need to be communicated more widely, particularly in relation to health equity, to ensure they are widely adopted and such multidisciplinary experts can be fostered.

One approach could be to increase awareness at educational institutions and offer multidisciplinary undergraduate/postgraduate training of immunologists regarding computational biology, programming, and bioinformatics, among other BDA-related topics (Schultze, 2015). Creation of a common, clear, and shared lexicon, built on existing information (Fuller et al., 2017; American Medical Association and Association of American Medical Colleges, 2021), will also be important to minimize inconsistencies and facilitate the synthesis and comparison of data (Palaniappan et al., 2024) so that the benefits of AI and BDA for health equity can be understood. These definitions could be incorporated into data guidelines and standards to inform multidisciplinary consortia involved in decision-making. For example, the Findable, Accessible, Interoperable, Reusable (FAIR) data principles offer domain-independent guidelines for producers and publishers to enhance data reusability through effective data management and stewardship (Wilkinson et al., 2016). More recently, the Gravity Project has provided consensus-based standards on how social determinants of health (SDoH) data are used and shared (Gravity Project, 2025). In addition to establishing a clear lexicon aligned with existing data guidelines and standards, it is important to define trackable key performance indicators (KPIs) to assess the value of health equity initiatives. Multidisciplinary expertise should be engaged to develop a communication strategy for health equity in big data/BDA, effectively conveying these KPIs and highlighting areas with potential or demonstrated ROI.

2.1.2 Utilizing KPIs to communicate the value of health equity initiatives

Evaluating the role of AI, ML, deep learning, and advanced analytics in promoting health equity for immune disorders requires a multidimensional set of KPIs that must address data inclusivity, diagnostic fairness, patient outcomes, and real-world implementation, with a persistent focus on addressing disparities rather than just improving averages. To ensure KPIs are communicated to and used effectively by key stakeholders, they could be incorporated into data guidelines, standards, and recommendations [e.g., FUTURE-AI (Lekadir et al., 2025)] and disseminated via a new position paper. Example KPIs for determining the success of big data efforts to achieve health equity are summarized in Table 1. A more exhaustive list is available in a white paper from the National Committee for Quality Assurance (Harrington et al., 2021).

Table 1

Table 1. Overview of KPIs for determining the success of AI and BDA in promoting health equity.

In terms of BDA use more broadly, KPIs need to demonstrate minimized deviation from a gold-standard level of representativeness, minimized outcome disparity, and most importantly, usability and scalability within robust healthcare systems. As proposed by Zimmerman and Anderson (2019), multiple metrics for health equity (health inequality, health disparities, and mean health) could be consolidated using an approach based on deviations from the best achievable health, defined by the median experience of the most privileged group (Zimmerman and Anderson, 2019). In addition to the traditional financial ROI metric [(net benefit − cost)/cost], an emerging concept is the creation of a single, blended metric that expands ROI beyond purely financial returns to include concepts such as value and social return on investment (SROI). SROI assigns financial proxy values to non-financial outcomes, making intangible social impacts apparent by demonstrating the broader social value generated from financial investments. Although SROI is recognized by domain experts as a valuable tool for demonstrating social value, awareness across the broader public health field remains limited. There is also a need for standardized methodologies and data reporting practices to ensure its validity and interpretability (Ashton et al., 2024).
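To make these metrics concrete, the sketch below combines a traditional financial ROI, an SROI based on financial proxy values, and a Zimmerman-and-Anderson-style equity gap measured as deviation from the median health score of a reference group. All function names and numbers are hypothetical and purely illustrative; they are not drawn from any cited study.

```python
# Illustrative sketch only: hypothetical functions and values,
# not results from any cited study.
import statistics

def roi(net_benefit: float, cost: float) -> float:
    """Traditional financial ROI: (net benefit - cost) / cost."""
    return (net_benefit - cost) / cost

def sroi(cost: float, proxy_values: dict) -> float:
    """SROI: total financial-proxy value of social outcomes per unit invested."""
    return sum(proxy_values.values()) / cost

def equity_shortfall(group_scores: dict, reference: str) -> dict:
    """Per-group mean deviation from the reference group's median health
    score (the 'best achievable health' benchmark); larger values
    indicate a larger equity gap."""
    best = statistics.median(group_scores[reference])
    return {g: best - statistics.mean(s) for g, s in group_scores.items()}

print(roi(1_500_000, 1_000_000))  # 0.5, i.e., a 50% financial return
print(sroi(1_000_000, {"productivity_gain": 600_000, "avoided_acute_care": 550_000}))
print(equity_shortfall({"group_A": [80, 85, 90], "group_B": [60, 65, 70]},
                       reference="group_A"))
```

Consolidating the three readouts into one blended score would still require the standardized weighting and reporting methodology that Ashton et al. (2024) call for.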

2.2 Political/institutional will to implement change: challenges and solutions to stakeholder buy-in and maintaining investment

2.2.1 Learning from past failures and demonstrating ROI with health equity initiatives

Addressing health inequities in immune diseases can be costly, and it is challenging to quantify an ROI. However, the unmet need is evident. Immune diseases are lifelong and expensive to treat, with direct costs in the US projected to be around $200–300 billion annually (Wylezinski et al., 2019; Miller, 2023b). Further, the estimated costs of health inequities in 2018, based on medical care expenditures, lost productivity, and costs associated with premature death, were approximately $450 billion for minoritized populations in the US (LaVeist et al., 2023). Considering the contribution of health inequities to disease costs, a recent study has demonstrated that there is great potential for ROI (Yerramilli et al., 2024), particularly in light of the well-documented impacts of SDoH (Braveman and Gottlieb, 2014) and the associated gains in health and productivity (Yerramilli et al., 2024). Despite the potential for strategies targeting SDoH to improve health outcomes and generate cost savings, literature on ROI is scarce (Nikpay et al., 2024). In the last 20 years the US government, non-governmental organizations, and corporations have invested over $179 billion in health equity (Aluko et al., 2023) and despite this—according to one independent analysis—many disparity metrics have shown little or no improvement (Zimmerman and Anderson, 2019). Aluko et al. proposed multiple reasons for the failure of previous health equity investments/approaches, including insufficient governance (see Section 2.6), limited workforce and capacity skill sets, and unsupportive data and technology infrastructure. Some of the proposed solutions from the authors included a refocus on longer-term commitments, and investment in BDA platforms capable of understanding, targeting, and tracking disparities over time (Aluko et al., 2023).

By integrating healthcare data from diverse sources, BDA has the potential to enhance clinical decision-support tools and aid development of personalized or population-based services (Schulte and Bohnet-Joschko, 2022). Indeed, big data has impacted patient care for decades by helping insurance companies incentivize preventive care, ultimately leading to reduced acute care costs and improved care equity (Sabet et al., 2023). A scoping review found that three-quarters of papers reporting economic evaluations of BDA for clinical decision-making corroborated expectations of cost savings, ranging from US$126 per patient to over US$500 million for the entire US healthcare system; however, the interpretation of results was limited by a lack of full and properly performed economic evaluations (Bakker et al., 2020). Even in the absence of robust economic/ROI data, investment in BDA remains attractive and has recently been supported by the US Department of Health and Human Services (HHS), who in 2022 pledged US$90 million to identify and reduce health disparities using new data-driven solutions (Sabet et al., 2023). For investment to continue it will be important for the HHS and other institutions to recurrently evaluate their initiatives, using appropriate KPIs to determine value and ROI. In the absence of large long-term cases, smaller initiatives, such as the EHR-enabled rheumatology registry developed by the American College of Rheumatology (Gilvaz and Reginato, 2023), may offer the best opportunities to highlight the potential of BDA on health equity and ROI in the near term.
Alternatively, researchers may be able to investigate the potential for AI-powered insights into health inequalities by registering for established precision medicine initiatives such as the National Institutes of Health-funded “All of Us Research Hub,” which is built on strong privacy and trust principles including governance, transparency, consent, and data quality (All of Us Research Program, 2025). While funding for this program is declining, opportunities remain for funded partnerships that may help identify solutions to health inequalities (All of Us Research Program, 2025); however, such opportunities are limited, and without clear evidence of ROI, competing institutional priorities (e.g., financial sustainability, regulatory compliance, or short-term efficiency gains) may take precedence, to the detriment of health equity.

2.2.2 Integration and tracking of health equity performance indicators

A survey of US healthcare executives found only 36% have a specific budget dedicated to advancing health equity (Accenture, 2022). A more recent survey found that 43% of life sciences executives and 48% of healthcare executives found it challenging to incorporate health equity into their strategic, financial, and operational processes (Deloitte Center for Health Solutions, 2025). Even when institutions commit to implementing big data and health equity initiatives, they often lack the governance structures, mechanisms, and metrics to track and encourage delivery. This is because unlike regulatory compliance, which is tied to financial or legal consequences, health equity efforts driven by big data often remain voluntary and lack clear metrics for accountability. In this regard, the KPIs described in Table 1 could prove valuable for tracking progress against key health equity metrics, facilitating accountability, and assessing the success of related initiatives.

According to one survey, more than 40% of life sciences and healthcare executives had difficulties tracking the progress of health equity initiatives (Deloitte Center for Health Solutions, 2025). Furthermore, 32% of health equity leaders had no data on the impact of health equity initiatives on their organizations' financial indicators. The same survey reported that while health equity leaders have the potential to differentiate between short- and long-term goals, relatively few are involved in decisions related to technology and IT (14%) and use of AI (12%). These findings suggest that the economic models of healthcare delivery and biomedical research may not align with the investments (and ROI) needed to use big data for equity-focused interventions. However, this may change in the future if health equity leaders can identify the factors associated with cost savings and ROI or shift to incorporating SROI and other broader value-based assessments.

Health equity leaders, in particular, are needed to direct policy and investment opportunities. Additionally, considering the lack of performance incentives and the need to incentivize health equity in resource-limited providers, the Agency for Healthcare Research and Quality's stakeholder engagement recommended the development of equity-focused evidence-based quality indicators, use of federal data to develop health equity benchmarks, and development of toolkits to assist healthcare organizations with integrating health equity metrics into their performance management (Chisolm et al., 2023). A standardized set of health equity measures would enable value-based incentive programs to reward strategies that reduce performance gaps by addressing the unique challenges faced by disadvantaged populations, rather than assuming that improvements in overall population outcomes will automatically benefit at-risk groups (Chisolm et al., 2023). Looking ahead, further investigation is needed to understand which incentives have the greatest impact, as well as which groups of stakeholders are best positioned to deliver health equity improvements. A tiered approach to performance incentives has also been proposed to ensure efforts that fall short of key benchmarks are still recognized as progress (Chisolm et al., 2023).

2.2.3 Investment in long-term projects, including big data and AI-powered BDA

As proposed by Aluko et al. (2023), investment in longer-term commitments, such as BDA, will be important to ensure the success of health equity initiatives. However, investment in AI-powered BDA platforms capable of tracking disparities over time is challenging owing to difficulties in collecting sufficient, reliable, and up-to-date information on health disparities. For example, understanding health disparities requires careful consideration of confounding factors, such as healthcare insurance in the US, and the selection of appropriate research questions and populations. To address the unmet need for frequent and granular data collection, particularly regarding SDoH, Sabet et al. discussed the potential benefits of a large national public database of anonymized patient data capable of collecting diverse metrics based on equitable data collection strategies (Sabet et al., 2023). To ensure that the database captures data from marginalized populations, these groups should be included in the process from the early design stages, the design should be adapted for those with low literacy or limited technological proficiency, and investment should be made in technology infrastructure and staff training to prepare for comprehensive data collection (Sabet et al., 2023). Further, recommendations and guidelines are needed to progress the field in an ethical and collaborative manner to ensure data collection and storage methodologies adhere to ethics regulations and data privacy laws, and that findings can be effectively translated into clinical decision-making (Gossec et al., 2020).

To address these challenges, several data standards and principles have been developed, such as the Clinical Data Interchange Standards Consortium and the FAIR Guiding Principles for Scientific Data Management and Stewardship (Wilkinson et al., 2016; cdisc, 2025). As mentioned previously, success of such a database would be predicated on the development and achievement of predefined KPIs (see Table 1). It would also need sufficient data to address the lack of information on rare immune diseases, which would benefit from consolidating information from multiple sources (Peng et al., 2021). A holistic approach to health equity remains difficult due to the fragmentation of patient data across EHR systems, insurance databases, and research cohorts, hindering the development of comprehensive, equity-driven insights. Without institutional commitment to data sharing, achieving health equity will be challenging. Promoting cross-sector collaboration and using data dashboards to deliver insights to researchers and policymakers could be one solution to expedite investment (Sabet et al., 2023) in big-data platforms for immune disease.

2.3 Community engagement: challenges and solutions to data collection

Improving the participation of minoritized groups is essential if AI and BDA are to further health equity. Community engagement will help ensure that minoritized communities with similar socioeconomic status (SES) collaborate with healthcare providers in addressing issues affecting their wellbeing. While complex factors underlie the lack of inclusion of minoritized populations in clinical research (Bibbins-Domingo and Helman, 2022a; Turner et al., 2022), big data has the potential to address these; however, representative data is often lacking. For example, race and ethnicity data are inconsistently recorded in real-world data (RWD)—in one US-based study, as many as 30% of individuals' claims/EHR data had missing race/ethnicity information (Goonesekera et al., 2024). Additionally, an analysis by the UK Office of National Statistics (ONS) found differences in ethnicity data recording between EHR data and the UK census, highlighting consistency issues (Drummond, 2023). Following a desk review of the ONS data, it was found that patient ethnicity data were being incorrectly recorded due to subjective interpretation by medical staff, non-standardized ethnicity response options across healthcare settings, and data quality checks focused on completeness vs. accuracy (Drummond, 2023).
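A first practical step toward quantifying these recording problems is a simple completeness-and-consistency audit of the race/ethnicity fields in an RWD extract. The sketch below is illustrative only; the record layout, field name, and category labels are hypothetical.

```python
# Sketch of a race/ethnicity completeness audit for RWD records.
# The record layout and category labels are hypothetical.
from collections import Counter

MISSING = {"", "unknown", "declined", "not recorded"}

def ethnicity_audit(records: list) -> dict:
    """Return the percentage of records lacking usable ethnicity data and
    the distribution of recorded labels (lowercased, to surface the
    non-standardized response options seen across healthcare settings)."""
    values = [str(r.get("ethnicity") or "").strip().lower() for r in records]
    recorded = [v for v in values if v not in MISSING]
    return {
        "pct_missing": 100 * (len(values) - len(recorded)) / len(values),
        "labels": Counter(recorded),
    }

records = [
    {"ethnicity": "Hispanic or Latino"},
    {"ethnicity": "hispanic/latino"},    # same group, different label
    {"ethnicity": "Not Hispanic or Latino"},
    {"ethnicity": ""},                   # blank entry
    {},                                  # field absent entirely
]
audit = ethnicity_audit(records)
print(audit["pct_missing"])  # 40.0
print(audit["labels"])       # inconsistent labels flagged for harmonization
```

Running such an audit before model training surfaces both the missingness rate and the non-standardized labels that the ONS desk review identified as sources of error.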

2.3.1 Improving participation by fostering data ownership

A potential solution to these issues is to enhance data accuracy by increasing patient ownership, allowing patients to review, edit, or validate their personal information. In the global shift toward paperless healthcare, patient data are increasingly accessible through online portals and mobile applications (e.g., MyChart). These platforms typically incorporate multiple features that have been shown to encourage patient ownership, including multilingual support, consolidation of data across multiple connected systems to prevent inefficiencies (e.g., entry of similar data across multiple platforms), and protection of confidentiality (Peng et al., 2021; Vishwanatha et al., 2023). However, there are potential limitations—while these platforms may allow patients to view their data, they can lack functionality to directly edit or correct inaccuracies related to ethnicity. Additionally, they may introduce errors due to a limited set of standardized ethnicity response options. Feedback from service users and advocacy groups could help refine these systems; however, a global framework for standardizing ethnicity categories may be needed to support future data integration and better identify disparities.

A key concern with engaging patients in data collection is that access to digital health technologies (DHTs) and overall digital literacy, which are key digital determinants of health, can create a digital divide that impacts the representativeness of big data (Ibrahim et al., 2020; Eruchalu et al., 2021; Campanozzi et al., 2023; Chidambaram et al., 2024) and may affect efforts to increase the diversity and accuracy of patient-reported data. For example, despite DHTs being increasingly used by patients and physicians in the management of asthma, their delivery via smartphone applications has been shown to widen the digital divide across SES groups, as not all individuals own smartphones (Kaplan et al., 2023). While such technologies can be used to facilitate earlier diagnosis of asthma (Al Meslamani, 2023), and also atopic dermatitis (Yanagisawa et al., 2023), there is potential for outcome disparities to arise due to earlier diagnosis/DHT use and treatment in groups of higher vs. lower SES. It is therefore important to engage communities, implement strong governance, and enhance public digital literacy to ensure that the digital divide is minimized rather than widened by the adoption of DHTs (Fernandes et al., 2024).

2.3.2 Effectively and transparently communicating the unmet need and potential of AI and BDA

In addition to providing ownership and an infrastructure for patients to validate their data, the AIM-AHEAD (Artificial Intelligence/Machine Learning Consortium to Advance Health Equity and Researcher Diversity) US-based stakeholder listening sessions identified the need to engage communities locally, obtain buy-in for each population, and ensure algorithms are transparent and easily understood (Vishwanatha et al., 2023). Transparency in practices could go a long way—especially in community engagement—toward building capacity and readiness among those who industry needs as volunteers in medical product development. To facilitate this, patients should be provided with multilingual educational materials on how their data are used, who has access, and the short- and long-term benefits of participation and data sharing to optimize BDA outputs and health equity. Short-term benefits include improved patient trust and generalizability of clinical findings. Long-term benefits include greater innovation, improved access to effective medical interventions, reduced health disparities, and lower economic costs (Bibbins-Domingo and Helman, 2022b). This education could also highlight the different disease prevalences among different racial and ethnic groups for relevant immune diseases (Goonesekera et al., 2024), and the aims of health equity to ensure equitable access to care and outcomes. In parallel with initiatives aimed at improving patient engagement and reducing barriers to clinical trial participation, frameworks such as the Clinical trial Diversity Rating should be used to ensure that key stakeholders and regulatory bodies have the data and oversight needed to address remaining challenges (Agboola and Wright, 2024).

With the advent of natural language processing, AI can help improve the quality of patient educational materials by allowing near-instant translation across multiple languages and by simplifying content to improve readability, maintain or improve understandability, and improve actionability (Saatçi et al., 2024; Will et al., 2025). Table 2 presents a selection of case studies on the application of AI and BDA in immunology; however, while these initiatives have the potential to improve health equity, no formal equity evaluations were reported, highlighting the need to track equity-related KPIs in future studies. For example, an ML model mining EHR data for immune-driven traits has been used to identify patients in need of further testing, potentially accelerating diagnosis and treatment (Forrest et al., 2023) and achieving cost savings with earlier diagnoses (Wylezinski et al., 2019). This model also identified a high-risk subgroup that would likely be underdiagnosed based on a lack of testing (Forrest et al., 2023), which is especially useful given the high prevalence of misdiagnoses in immune diseases (Goonesekera et al., 2024). Additional positive examples of AI and BDA being applied to increase health equity in immune diseases, together with appropriate ways of assessing how effective the initiatives have been, would help to increase community engagement.

Table 2. Case studies of AI and BDA in immunology to increase health equity.

2.4 Technical capabilities of AI and BDA: challenges and solutions to data quality and diversity

While big data and BDA may be central to addressing health disparities and providing ROI for stakeholders, inconsistency in the quality and diversity of RWD is a key limitation. Incomplete data are, however, an inherent feature of RWD, which are usually unstructured and unlabeled. Further, as outlined above, data can be missing for minoritized populations, hindering model training and the interpretability and generalizability of findings (Peng et al., 2021). For example, ML models predicting asthma exacerbations in children showed greater algorithmic bias for low-SES populations due to more incomplete EHR data (Juhn et al., 2022).
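As a concrete illustration of how such gaps can be surfaced before any model is trained, the sketch below computes per-group missingness in a toy EHR extract. The field names, SES groups, and values are hypothetical, not drawn from the studies cited above.

```python
# Illustrative check for differential missingness in an EHR extract.
# All field names, groups, and values here are hypothetical.
from collections import defaultdict

def missingness_by_group(records, group_key, fields):
    """Return, for each group, the fraction of field values that are None."""
    missing = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        group = rec[group_key]
        for field in fields:
            total[group] += 1
            if rec.get(field) is None:
                missing[group] += 1
    return {g: missing[g] / total[g] for g in total}

# Hypothetical asthma-related EHR rows; None marks a missing value.
records = [
    {"ses": "high", "spirometry": 1.9, "smoking_status": "never"},
    {"ses": "high", "spirometry": 2.4, "smoking_status": "former"},
    {"ses": "low", "spirometry": None, "smoking_status": None},
    {"ses": "low", "spirometry": 2.1, "smoking_status": None},
]

rates = missingness_by_group(records, "ses", ["spirometry", "smoking_status"])
# Marked differential missingness (here 0% vs. 75%) warns that a model
# trained on complete cases will underserve the more-missing group.
```

Running a screen like this before training makes the "greater algorithmic bias for low-SES populations" failure mode visible as a data problem rather than a model problem.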

Models require rigorous testing across diverse populations and settings; otherwise, they might perform well on one group but fail when applied to a different population due to overfitting and/or lack of external validation (Peng et al., 2021). This is of particular concern in immune diseases that are more prevalent in low-SES populations, such as systemic lupus erythematosus (SLE) (Conrad et al., 2023), and that have complex genetic and environmental triggers (Vojdani, 2014; Pisetsky, 2023) that may impact minoritized communities to a greater extent. Examples include the increased risk of RA and SLE in patients from low-SES groups (Conrad et al., 2023) and hypothesized genetic differences that may explain the earlier onset of immune diseases, including IBD, MS, RA, and SLE, in minoritized populations (Sharma-Oates et al., 2022). While large, representative training datasets can address these issues, as shown by EHR-trained ML diagnostic models for RA and SLE (Forrest et al., 2023) and the EXPRESSO AI model identifying causal genes and potential immune disease-modifying compounds (Wang et al., 2024), minoritized groups remain underrepresented in genome-wide association studies (GWAS) for MS (Jacobs et al., 2022). Data gaps like this may perpetuate inequities and limit the potential of AI and BDA to inform personalized genomic medicine. For example, the lack of representativeness in GWAS may explain why only ~50% of the estimated heritability is currently understood for MS, which is diagnosed and treated earlier in people of European vs. non-European ancestry (Jacobs et al., 2022).

2.4.1 Improving data quality with robust data collection and data harmonization

While community engagement is essential to increase the diversity of data and ensure the damaging effects of bias can be identified and mitigated, robust data collection standards are needed to account for bias and employ tools that address physician biases in diagnosis and measurement of patient outcomes. To expand and plug information gaps, it is essential to collect race, ethnicity, sex, gender, and other social risk factor data from diverse sources and educate stakeholders on the importance of maintaining accurate and complete records. Reviewing supplier data procurement contracts and incorporating bias-handling clauses may help ensure that disparities are actively mitigated. To facilitate improved data quality, a steering committee or leadership structure could be established by thought leaders or a governmental organization, such as the US Department of Health and Human Services (HHS). This committee would inform data collection practices and create a gold standard for representativeness, enabling the assessment of underrepresented intersectional subpopulations within big data. Collaboration is also crucial for improving access to healthcare data and sharing it within diverse communities; there is enormous potential for sharing data within a nationwide database (Sabet et al., 2023).

To ensure RWD from multiple sources (e.g., claims, Centers for Medicare and Medicaid Services, EHR, and demographic data) are useful, it will be important to harmonize and standardize the information by implementing paperless systems, standardizing metrics, and building an infrastructure for sharing data. Universal standards, such as Health Level Seven—Fast Healthcare Interoperability Resources (HL7 FHIR), should be applied to provide a standardized way of formatting and exchanging healthcare data, making it easier for different systems (e.g., EHRs, apps, hospitals, and insurers) to communicate and share data consistently and securely (HL7 FHIR, 2023). However, there are barriers to implementation because of fragmented data, inconsistent coding practices, and interoperability gaps across institutions. Implementing FAIR data principles and adopting standardized vocabularies, such as those from Observational Health Data Sciences and Informatics, can address these issues by enabling consistent data integration and improving usability and data quality (Pezoulas and Fotiadis, 2024). Data harmonization can, however, be operationally complex and costly due to the need to manage data privacy, consent, and compliance with regulations like the Health Insurance Portability and Accountability Act and EU General Data Protection Regulation (GDPR) (Pezoulas and Fotiadis, 2024).
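To make the interoperability point concrete, the snippet below shows a minimal Patient resource shaped according to the published HL7 FHIR R4 schema. The identifiers and values are invented for illustration, and a real exchange would validate the payload against the full specification rather than simply round-tripping it through JSON.

```python
# A minimal, illustrative HL7 FHIR R4 "Patient" resource as JSON.
# Field names follow the published FHIR Patient schema; the values and
# the exchange scenario are hypothetical.
import json

patient = {
    "resourceType": "Patient",
    "id": "example-001",
    "name": [{"family": "Rivera", "given": ["Ana"]}],
    "gender": "female",
    "birthDate": "1984-06-02",
    "communication": [
        {"language": {"coding": [{"system": "urn:ietf:bcp:47", "code": "es"}]}}
    ],
}

# Because every conformant system expects the same shape, a receiving
# EHR, app, or insurer can parse the payload without source-specific
# mapping logic.
payload = json.dumps(patient)
received = json.loads(payload)
```

The design point is that the schema, not bespoke point-to-point agreements, carries the semantics: fields such as `communication.language` travel with standard code systems, which is what makes consistent aggregation across institutions feasible.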

2.4.2 Building big data platforms founded on robust governance and patient consent

Considering the complexities of data protection and regulation, robust governance is needed to ensure that these frameworks enhance the potential of AI and BDA to address health equity, rather than becoming a significant barrier to progress. In an article by Murdoch (2021), regulation was highlighted as a key issue due to EHR data being among the most private and protected forms of information. The article cautioned that regulation and oversight risk falling behind BDA and emphasized the need for technologically enabled methods of communicating and obtaining patient consent, as well as improved data protection and anonymization (Murdoch, 2021).

An example of the challenges faced when developing and accessing robust EHR data at scale is the digital transformation of the UK National Health Service (NHS), one of the largest employers in the world, which received an annual budget exceeding £180 billion in 2023 (NHS Confederation, 2023). Despite its best efforts to go paperless with a digital EHR system, the NHS has missed successive targets of 100% digitization set between 2018 and 2024, and the most recent target of March 2026 has been scrapped with "no set date" now in its place (Clews, 2024; Lovell, 2025). While 90% of NHS trusts have achieved digitization, a notable accomplishment given the organization's scale, the process has been marked by significant delays attributed to difficulties in harmonizing data across primary and secondary care, the slow pace of digital adoption, and a lack of fresh thinking and decisive action (Lovell, 2025), suggesting issues with leadership and strategy as well as technical and financial barriers. In a related incident, General Practitioner leaders voiced concerns about patient consent and data governance in an NHS-funded AI model designed to improve predictive healthcare related to COVID-19 vaccinations, ultimately leading to its termination (Colivicchi, 2025). It is therefore clear that, even with significant investment and time, there are issues to overcome regarding the handling and governance of data; however, such challenges are not insurmountable. For example, AI has also been piloted in the NHS to help identify patients who require proactive outreach to address the risk of non-attendance. This approach aims to help patients from marginalized communities get an appointment that works for them, thereby improving their outcomes, reducing health inequalities, and lowering the costly inefficiencies stemming from missed appointments, all while respecting GDPR and protecting patient data (Deep Medical, 2025). However, as highlighted by Xiao et al. (2025), the potential of AI technologies remains limited by concerns around data privacy, the lack of data-sharing infrastructure, and inconsistent policies, which underscores the need for secure and shared data environments. A privacy-by-design approach has also been recommended by the European Alliance of Associations for Rheumatology (EULAR; formerly the European League Against Rheumatism) and other organizations to ensure that privacy and data protection are embedded at every stage, safeguarding patient information and enabling ethical, compliant, and trustworthy research (Gossec et al., 2020).

2.5 Technical capabilities of AI and BDA: challenges and solutions to sources of bias

ML algorithms and AI are being used to facilitate and support earlier diagnosis and optimal treatment in patients with immune diseases by reviewing clinical characteristics and predicting disease and treatment outcomes (Danieli et al., 2024). Considering the known disparities in the diagnosis, treatment, and outcomes of patients with immune disease (Davis et al., 2021), there is potential for AI and BDA to worsen these due to biases within the source data, training data, or the model outputs (Mehrabi et al., 2021). One example, discussed elsewhere, is the use of race adjustment, which requires consideration of risks prior to its application (Vyas et al., 2020).

2.5.1 Ensuring fairness by detecting and addressing blind spots, anomalies, and sources of bias

To avoid exacerbating existing biases, it is essential to engage data scientists and subject-matter experts collaboratively to ensure fairness in big data and AI-powered BDA (Boykin et al., 2021). Part of this effort is improving the reliability and accuracy of data through systematic identification of data quality issues using anomaly detection techniques (Gaspar et al., 2011; Churová et al., 2021), along with addressing data blind spots (e.g., biased proxy variables) that perpetuate inequities (Obermeyer et al., 2019).
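As a minimal sketch of what such anomaly detection can look like in practice, the example below flags values with large modified z-scores based on the median absolute deviation, a simple robust screen for data-entry errors. The threshold and the lab values are illustrative only and are not taken from the cited studies.

```python
# Robust outlier screen (median absolute deviation) as one basic example
# of anomaly detection for data quality; threshold and values are
# illustrative, not from the cited studies.
from statistics import median

def flag_anomalies(values, threshold=3.5):
    """Return indices whose modified z-score exceeds the threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]

# Hypothetical CRP lab results in mg/L; 250.0 looks like a unit or
# data-entry error rather than a plausible measurement.
crp_mg_per_l = [2.1, 3.4, 1.8, 2.9, 250.0, 2.5]
suspect = flag_anomalies(crp_mg_per_l)
```

The median-based variant is used here rather than a plain mean/standard-deviation z-score because a single extreme value inflates the standard deviation enough to mask itself in small samples.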

Detecting systemic and harmful biases in data, models, and outcomes is also critical (Schwartz et al., 2022). This involves understanding and addressing non-random reasons for missing data, implementing methods to identify non-equitable outcomes, and developing decision-support tools to detect biases in patient-generated data, such as EHRs, predictive models, and decision-making processes (Parikh et al., 2019). Fair ML approaches with RWD should be applied, including bias mitigation techniques for supervised models (Hardt et al., 2016; Huang et al., 2024) and techniques to detect and correct biases across intersectional subpopulations (Zhang and Neill, 2016; Kearns et al., 2018). These techniques help identify metrics leading to equitable outcomes and assess fairness at each step of the algorithm (Suresh and Guttag, 2021; Black et al., 2023).

As an example adapted from the fair ML literature, consider a case where an algorithmic decision-support tool is used to predict the progression of immune disease, and both false positives (incorrectly predicting an increase in severity) and false negatives (incorrectly predicting that severity will not increase) are harmful to patients. Given a concern that a specific protected class defined by a sensitive attribute, such as race or gender (e.g., female patients), is receiving lower-quality predictions, a typical approach (Barocas et al., 2023) is to compare false-positive and false-negative error rates for the protected and non-protected classes and identify any statistically significant discrepancies. If the affected subpopulation is not known a priori, or there is a concern that the bias may affect subgroups defined by multiple data dimensions (e.g., older Black male patients), then techniques such as a bias scan (Zhang and Neill, 2016) can efficiently search across subgroups defined by multiple attributes (race, gender, age, etc.) and identify the subgroups with the most significant error-rate imbalances. If biases are detected, approaches for mitigation include adjustment of decision thresholds to balance the error rates (Hardt et al., 2016), resampling or reweighting the data (Kamiran and Calders, 2012), and relearning the predictive models with additional constraints or penalties to reduce error-rate imbalance (Kamishima et al., 2012; Zafar et al., 2017). Alternatively, the detected biases may inform system-wide changes, such as increasing the amount and quality of data collected for population subgroups for whom the algorithmic decision-support tool is performing poorly.
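The group-wise error-rate comparison described above can be sketched as follows. The labels, predictions, and group names are hypothetical, and a real assessment would add significance testing and far larger samples before concluding that a disparity exists.

```python
# Sketch of group-wise FPR/FNR comparison for a binary severity
# predictor (1 = severity increased). Data and group names are
# hypothetical; this is not the tool from any cited study.

def error_rates(y_true, y_pred):
    """False-positive and false-negative rates for one subgroup."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    negatives = sum(1 for t in y_true if t == 0)
    positives = sum(1 for t in y_true if t == 1)
    fpr = fp / negatives if negatives else 0.0
    fnr = fn / positives if positives else 0.0
    return fpr, fnr

def rates_by_group(rows):
    """rows: (group, y_true, y_pred) triples -> {group: (FPR, FNR)}."""
    groups = {}
    for g, t, p in rows:
        groups.setdefault(g, ([], []))
        groups[g][0].append(t)
        groups[g][1].append(p)
    return {g: error_rates(t, p) for g, (t, p) in groups.items()}

rows = [
    ("female", 1, 0), ("female", 1, 0), ("female", 0, 0), ("female", 1, 1),
    ("male",   1, 1), ("male",   1, 1), ("male",   0, 0), ("male",   0, 1),
]
rates = rates_by_group(rows)
# A much higher FNR for one group (here, missed severity increases for
# female patients) would trigger mitigation, e.g., per-group threshold
# adjustment (Hardt et al., 2016) or reweighting (Kamiran and Calders, 2012).
```

Extending this from a single pre-specified attribute to a search over intersectional subgroups is exactly where bias-scan methods (Zhang and Neill, 2016) replace the brute-force enumeration implied here.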

Together, these approaches help to identify and mitigate bias and to develop fairer ML models by balancing error metrics across subgroups. However, even when accounting for these sources of bias, people can misuse data from algorithms in decision-making by discounting algorithmic recommendations in favor of their own judgment, showing tolerance for algorithmic errors, and struggling to evaluate algorithmic performance accurately. They may also be influenced by irrelevant information, trust inaccurate algorithms, and apply algorithms in ways they were not designed for (Green and Chen, 2019), exemplifying the need for increased participation and improved education of subject-matter experts to ensure that AI and BDA are not misused.

2.6 Ethical, legal, and data governance models

As outlined in the previous sections, insufficient governance underlies several critical challenges, including weak political and institutional will, limited community engagement, and inadequate technical capabilities in AI and BDA. These governance gaps can lead to inefficiencies, breaches of privacy, misuse of data, and premature termination of AI and BDA projects—ultimately reinforcing existing social inequalities. To address these issues, various governance models comprising structured systems of rules, roles, responsibilities, and processes have been developed to guide decision-making. As shown in Table 3, there are numerous examples addressing ethical, legal, and data governance challenges. However, issues with governance, and indeed BDA, persist largely due to limited awareness, the early-stage nature of many AI-based technologies, and the lack of comprehensive, standalone regulatory frameworks (Palaniappan et al., 2024; Papagiannidis et al., 2025).

Table 3. Ethical, legal, and data/algorithmic governance models to address challenges with AI and BDA.

An analysis of global regulatory frameworks revealed that formal regulations are often lacking and that soft-law (i.e., voluntary and unenforced) alternatives prevail, such as guidelines, standards, and codes of conduct (Palaniappan et al., 2024). Currently, there is no duty of transparency in the use of healthcare data (Bartlett, 2024) and no established framework for AI liability in the USA (National Institute of Standards and Technology, 2023). In this regulatory void, AI ethics guidelines have proliferated but have yet to translate into meaningful accountability; vague principles, a lack of enforcement mechanisms, and selective implementation continue to undermine their effectiveness (Bartlett, 2024). Ethical commitments often function more as reputational signals, commonly referred to as "ethics washing," than as governance tools, and the absence of transparency requirements has contributed to the erosion of public trust (Bartlett, 2024).

To address these shortcomings, Bartlett (2024) proposed the involvement of a trusted intermediary, or "data steward," to promote public benefit and assume responsibility for the stewardship of health data and the rights of data subjects. A data steward that operates with moral independence from AI developers could manage data on behalf of beneficiaries, enhancing transparency, legitimacy, and public trust (Bartlett, 2024). However, doing so requires a legal entity capable of ensuring accountability: a data trust. A data trust manages data with institutional, legal, and ethical safeguards while ensuring that stewards remain accountable to beneficiaries. It also helps overcome barriers to the use and sharing of large datasets, offering a more structured and enforceable model of accountability by combining the legal duties of data stewards with participatory oversight (Bartlett, 2024). An example of such an approach is the UK Biobank, a charitable organization that stewards genetic data and whose board of directors acts as charity trustees (i.e., they have oversight and can appoint a data steward) under UK charity law, illustrating how data stewardship can be embedded within a formal governance framework (Hardinges, 2020).

This need for more robust and enforceable governance mechanisms is further illustrated by current regulatory limitations in major jurisdictions. For example, no specific regulatory pathways exist in the USA for AI technologies, which are instead assessed under adapted frameworks designed for traditional medical devices (e.g., Software as a Medical Device) (Palaniappan et al., 2024). This reliance on legacy systems potentially creates oversight gaps, as current regulations may not fully account for the dynamic, adaptive, and autonomous nature of AI (Palaniappan et al., 2024). Without tailored governance frameworks, harm caused by AI systems can persist without redress, particularly given the difficulty of demonstrating such harm due to limited information, inadequate audit trails, and lack of awareness among affected individuals (National Institute of Standards and Technology, 2023). While these challenges remain significant, emerging initiatives such as the draft US-EU voluntary AI code of conduct may offer a path toward greater regulatory convergence and international alignment (Palaniappan et al., 2024).

Importantly, governance deficiencies can also hinder meaningful progress toward health equity. Although many healthcare organizations have expanded their leadership teams to include chief health equity officers or diversity officers, this progress is not always matched by evolution in operating models, governance structures, or budgetary commitments (Aluko et al., 2023). These roles are often under-resourced and under-empowered, limiting their capacity to address systemic disparities (Aluko et al., 2023). Moreover, narrow, business-centric KPIs fail to capture the complex, longitudinal nature of efforts to reduce health disparities, often sidelining equity as a philanthropic afterthought rather than a core strategic goal. A credible health equity strategy needs to be supported by structural change, institutional accountability, and a comprehensive business case that ties equity KPIs to broader organizational success. Embedding these priorities into formal governance and AI/data oversight structures is essential if BDA are to meaningfully contribute to equity rather than exacerbate existing divides.

3 Conclusions

Despite substantial investment in health equity, progress on key metrics has been lacking (Zimmerman and Anderson, 2019; Aluko et al., 2023). In immune diseases, existing disparities risk worsening due to the increasing prevalence and burden of disease (Lerner et al., 2015; Cao et al., 2023; Conrad et al., 2023; Global Burden of Disease (GBD) 2019 IMID Collaborators, 2023; Miller, 2023a). Big data, AI, and BDA offer transformative potential to address these disparities through earlier diagnosis, tailored treatment, and population-level insights. For example, ML models trained on EHR data have already demonstrated the ability to identify patients in need of further testing, with the potential to accelerate diagnosis and treatment (Forrest et al., 2023) and achieve cost savings with earlier diagnosis (Wylezinski et al., 2019). However, realizing this potential requires confronting and learning from past failures and ongoing systemic challenges, such as data quality, governance, and representativeness, both to deliver meaningful ROI for future investors and to provide concrete examples evidencing the ability of AI and BDA to address disparities (Ibrahim et al., 2020; Boykin et al., 2021; Peng et al., 2021; Aluko et al., 2023; Chin et al., 2023; Yerramilli et al., 2024).

A lack of examples evidencing the value and ROI of AI-powered health equity investments in immunology is a key issue and may not be resolved until the central issues discussed in this review (political/institutional will, community engagement, and technical capabilities of AI and BDA) are addressed (Galea and Abdalla, 2023). Underlying these challenges is a lack of robust governance that ensures high-quality, representative data collection through community engagement, standardized data collection practices, and ethical data stewardship, and that supports the development and continuous monitoring of meaningful equity-focused KPIs to foster effective communication strategies that demonstrate tangible health equity benefits and secure sustained investment.

Key recommendations from this review, underpinned by a need for robust governance, include, first, the need for collaboration among subject-matter experts in health equity, data science/BDA, and immune diseases to develop a communication strategy for key stakeholders to secure engagement and investment. This strategy should include a lexicon of terms and KPIs tailored to demonstrate the benefits of big data, AI, and BDA. Second, investors must ensure funds are spent wisely, with robust governance and performance incentives to prevent wastage and encourage buy-in. Longer-term commitments capable of demonstrating ROI should be prioritized, with input from health equity experts, including improving data and technology infrastructure to understand, target, and track disparities over time. Third, improving the quality of source/training data is a priority to ensure AI and BDA can deliver on health equity. This requires community engagement and input from stakeholders/subject-matter experts. Clear communication about the benefits of representative data and transparent practices is essential to gain community buy-in. For data providers, including healthcare providers and pharmacies, improving data quality necessitates education and accountability at the point of collection and could be underpinned by data collection standards and clear KPIs, including a gold standard for representativeness developed by an interdisciplinary steering committee. These standards should include clauses to ensure disparities are mitigated and data on race, ethnicity, and other social risk factors are consistently collected, with accountability if they are not. Fourth, creating an infrastructure for sharing big data requires harmonizing and standardizing data formats and developing tools to identify data quality issues. This includes adopting FAIR data principles and interoperability standards, such as HL7 FHIR, to facilitate secure, consistent data exchange across systems. 
Governance frameworks are also needed to emphasize transparency and regulatory compliance, particularly regarding patient data privacy, to overcome barriers to data access. Fair ML approaches should be used to detect and mitigate bias throughout the algorithmic process, ensuring more equitable outcomes.

In summary, investments in big data, AI, and BDA to improve health equity have the potential to address disparities in immune diseases, but success requires a focus on engagement, collaboration, robust governance, meaningful KPIs, continuous monitoring and evaluation, and iterative fairness assessments to ensure a positive ROI.

Author contributions

SK: Conceptualization, Writing – original draft, Writing – review & editing. AK: Conceptualization, Writing – original draft, Writing – review & editing. SF: Conceptualization, Writing – original draft, Writing – review & editing. KW: Conceptualization, Writing – original draft, Writing – review & editing. JPW: Conceptualization, Writing – original draft, Writing – review & editing. DBN: Conceptualization, Writing – original draft, Writing – review & editing. ID-M: Conceptualization, Writing – original draft, Writing – review & editing. GO: Conceptualization, Writing – original draft, Writing – review & editing. MD: Conceptualization, Writing – original draft, Writing – review & editing. RZ: Conceptualization, Writing – original draft, Writing – review & editing. MH: Conceptualization, Writing – original draft, Writing – review & editing. PR: Conceptualization, Writing – original draft, Writing – review & editing.

Funding

The author(s) declare that financial support was received for the research and/or publication of this article. Sanofi funded the advisory board. The authors received no payment from Sanofi related to the development of this publication. Sanofi had the opportunity to review the publication; however, the authors remain responsible for all content, editorial decisions, and the decision to submit the manuscript.

Acknowledgments

All authors participated in an advisory board organized by Sanofi. The authors would like to thank the following non-author attendees of the advisory board who provided insights and recommendations that have indirectly shaped the content of this article: Brian Smedley of the Urban Institute, Washington, USA, and Owen Garrick of Mayo Clinic, Minnesota, USA. Medical writing support was provided by Sam Mason, PhD of Selene Medical Communications, Macclesfield, UK, which was funded by Sanofi who initiated and sponsored the advisory board meeting where discussions on this topic were initially held.

Conflict of interest

SK, KW, JPW, DBN, ID-M, GO, and MH have served as consultants and participated in advisory boards for Sanofi. JPW has also received honoraria from Sanofi, Banook, PPD, and the AMA, and research grants from Sanofi, Regeneron, and Boehringer Ingelheim. AK, SF, MD, RZ, and PR were employees of Sanofi at the time of writing and may hold Sanofi stocks or shares.

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Gen AI was used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Accenture (2022). Nine Out of Ten Healthcare Executives Say That Health Equity Initiatives Are a Top Business Priority, According to Accenture and HIMSS Insights. Available online at: https://newsroom.accenture.com/news/2022/nine-out-of-ten-healthcare-executives-say-that-health-equity-initiatives-are-a-top-business-priority-according-to-accenture-and-himss-insights (Accessed March 18, 2025).

Agboola, F., and Wright, A. C. (2024). A framework for evaluating the diversity of clinical trials. J. Clin. Epidemiol. 169:111299. doi: 10.1016/j.jclinepi.2024.111299

Agniel, D., Martino, S. C., Burkhart, Q., Hambarsoomian, K., Orr, N., Beckett, M. K., et al. (2021). Incentivizing excellent care to at-risk groups with a health equity summary score. J. Gen. Intern. Med. 36, 1847–1857. doi: 10.1007/s11606-019-05473-x

Al Meslamani, A. Z. (2023). How AI is advancing asthma management? Insights into economic and clinical aspects. J. Med. Econ. 26, 1489–1494. doi: 10.1080/13696998.2023.2277072

All of Us Research Program (2025). All of Us Research Program. Available online at: https://allofus.nih.gov/ (Accessed July 1, 2025).

Aluko, Y., Garfield, S., Kasen, P., and Minta, B. (2023). Why America's Health Equity Investment Has Yielded a Marginal Return. Ernst & Young. Available online at: https://www.ey.com/en_us/insights/health/america-s-health-equity-investment-marginal-return (Accessed February 21, 2025).

American Medical Association and Association of American Medical Colleges (2021). Advancing Health Equity: a Guide to Language Narrative and Concepts. Available online at: https://www.ama-assn.org/system/files/ama-aamc-equity-guide.pdf (Accessed February 21, 2025).

Ashton, K., Cotter-Roberts, A., Clemens, T., Green, L., and Dyakova, M. (2024). Advancing the social return on investment framework to capture the social value of public health interventions: semistructured interviews and a review of scoping reviews. Public Health 226, 122–127. doi: 10.1016/j.puhe.2023.11.004

Bakker, L., Aarts, J., Uyl-de Groot, C., and Redekop, W. (2020). Economic evaluations of big data analytics for clinical decision-making: a scoping review. J. Am. Med. Inform. Assoc. 27, 1466–1475. doi: 10.1093/jamia/ocaa102

Barocas, S., Hardt, M., and Narayanan, A. (2023). Fairness and Machine Learning: Limitations and Opportunities. Cambridge, MA: The MIT Press.

Bartlett, B. (2024). Towards accountable, legitimate and trustworthy ai in healthcare: enhancing ai ethics with effective data stewardship. New Bioeth. 30, 285–309. doi: 10.1080/20502877.2025.2482282

Bartoloni, E., Perricone, C., Cafaro, G., Alunno, A., and Gerli, R. (2022). The facts and fictions of precision medicine in autoimmune diseases: is the machine learning approach the response? Rheumatology 61, 484–485. doi: 10.1093/rheumatology/keab715

Batko, K., and Slezak, A. (2022). The use of big data analytics in healthcare. J. Big Data 9:3. doi: 10.1186/s40537-021-00553-4

Bibbins-Domingo, K., and Helman, A. eds. (2022a). "Barriers to representation of underrepresented and excluded populations in clinical research," in Improving Representation in Clinical Trials and Research: Building Research Equity for Women and Underrepresented Groups (Washington, DC: US National Academies Press).

Bibbins-Domingo, K., and Helman, A. eds. (2022b). "Why diverse representation in clinical research matters and the current state of representation within the clinical research ecosystem," in Improving Representation in Clinical Trials and Research: Building Research Equity for Women and Underrepresented Groups (Washington, DC: US National Academies Press).

Black, E., Naidu, R., Ghani, R., Rodolfa, K., Ho, D., and Heidari, H. (2023). "Toward operationalizing pipeline-aware ML fairness: a research agenda for developing practical guidelines and tools," in Paper presented at Equity and Access in Algorithms, Mechanisms, and Optimization (EAAMO), Boston, MA, USA, October 30, 2023 (Article 36). Available online at: https://dl.acm.org/doi/10.1145/3617694.3623259 (Accessed February 21, 2025).

Bovens, M. (2007). Analysing and assessing accountability: a conceptual framework. Eur. Law J. 13, 447–468. doi: 10.1111/j.1468-0386.2007.00378.x

Boykin, C. M., Dasch, S. T., Rice Jr, V., Lakshminarayanan, V. R., Togun, T. A., and Brown, S. M. (2021). "Opportunities for a more interdisciplinary approach to measuring perceptions of fairness in machine learning," in Paper presented at Equity and Access in Algorithms, Mechanisms, and Optimization (EAAMO) (New York, NY). Available online at: https://dl.acm.org/doi/10.1145/3465416.3483302 (Accessed February 21, 2025).

Braveman, P., and Gottlieb, L. (2014). The social determinants of health: it's time to consider the causes of the causes. Public Health Rep. 129(Suppl. 2), 19–31. doi: 10.1177/00333549141291S206

Bronsdon, C. (2025). Understanding Human Evaluation Metrics in AI: What They Are and How They Work. Available online at: https://galileo.ai/blog/human-evaluation-metrics-ai (Accessed June 19, 2025).

Campanozzi, L. L., Gibelli, F., Bailo, P., Nittari, G., Sirignano, A., and Ricci, G. (2023). The role of digital literacy in achieving health equity in the third millennium society: a literature review. Front. Public Health 11:1109323. doi: 10.3389/fpubh.2023.1109323

PubMed Abstract | Crossref Full Text | Google Scholar

Cao, F., He, Y.-S., Wang, Y., Zha, C.-K., Lu, J.-M., Tao, L.-M., et al. (2023). Global burden and cross-country inequalities in autoimmune diseases from 1990 to 2019. Autoimmun. Rev. 22:103326. doi: 10.1016/j.autrev.2023.103326

PubMed Abstract | Crossref Full Text | Google Scholar

cdisc (2025). Standards. Available online at: https://www.cdisc.org/standards (Accessed June 16, 2025).

Google Scholar

Chaddad, A., Peng, J., Xu, J., and Bouridane, A. (2023). Survey of explainable AI techniques in healthcare. Sensors 23:634. doi: 10.3390/s23020634

PubMed Abstract | Crossref Full Text | Google Scholar

Chen, B., Khodadoust, M. S., Olsson, N., Wagar, L. E., Fast, E., Liu, C. L., et al. (2019). Predicting HLA class II antigen presentation through integrated deep learning. Nat. Biotechnol. 37, 1332–1343. doi: 10.1038/s41587-019-0280-2

PubMed Abstract | Crossref Full Text | Google Scholar

Chidambaram, S., Jain, B., Jain, U., Mwavu, R., Baru, R., Thomas, B., et al. (2024). An introduction to digital determinants of health. PLoS Digit. Health 3:e0000346. doi: 10.1371/journal.pdig.0000346

PubMed Abstract | Crossref Full Text | Google Scholar

Chin, M. H., Afsar-Manesh, N., Bierman, A. S., Chang, C., Colón-Rodríguez, C. J., Dullabh, P., et al. (2023). Guiding principles to address the impact of algorithm bias on racial and ethnic disparities in health and health care. JAMA Netw. Open 6:e2345050. doi: 10.1001/jamanetworkopen.2023.45050

PubMed Abstract | Crossref Full Text | Google Scholar

Chisolm, D. J., Dugan, J. A., Figueroa, J. F., Lane-Fall, M. B., Roby, D. H., Rodriguez, H. P., et al. (2023). Improving health equity through health care systems research. Health Serv. Res. 58(Suppl. 3), 289–299. doi: 10.1111/1475-6773.14192

PubMed Abstract | Crossref Full Text | Google Scholar

Churová, V., Vyškovský, R., Maršálová, K., Kudláček, D., and Schwarz, D. (2021). Anomaly detection algorithm for real-world data and evidence in clinical research: implementation, evaluation, and validation study. JMIR Med. Inf. 9:e27172. doi: 10.2196/27172

PubMed Abstract | Crossref Full Text | Google Scholar

Clark, C. R., Wilkins, C. H., Rodriguez, J. A., Preininger, A. M., Harris, J., DesAutels, S., et al. (2021). Health care equity in the use of advanced analytics and artificial intelligence technologies in primary care. J. Gen. Intern. Med. 36, 3188–3193. doi: 10.1007/s11606-021-06846-x

PubMed Abstract | Crossref Full Text | Google Scholar

Clews, G. (2024). Chancellor promises electronic patient records for all NHS trusts by March 2026. Pharm. J. Available online at: https://pharmaceutical-journal.com/article/news/chancellor-promises-electronic-patient-records-for-all-nhs-trusts-by-march-2026 (Accessed June 16, 2025).

Google Scholar

Colivicchi, A. (2025). NHS England Pauses “Ground-Breaking” AI Project Following GP Data Concerns. Pulse Today. Available online at: https://www.pulsetoday.co.uk/news/technology/nhs-england-pauses-ground-breaking-ai-project-following-gp-data-concerns/ (Accessed June 16, 2025).

Google Scholar

Conrad, N., Misra, S., Verbakel, J. Y., Verbeke, G., Molenberghs, G., Taylor, P. N., et al. (2023). Incidence, prevalence, and co-occurrence of autoimmune disorders over time and by age, sex, and socioeconomic status: a population-based cohort study of 22 million individuals in the UK. Lancet 401, 1878–1890. doi: 10.1016/S0140-6736(23)00457-9

PubMed Abstract | Crossref Full Text | Google Scholar

Cooper, A. F., Moss, E., Laufer, B., and Nissenbaum, H. (2022). “Accountability in an algorithmic society: relationality, responsibility, and robustness in machine learning,” in Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency (New York, NY: Association for Computing Machinery), 864–876.

Google Scholar

Danieli, M. G., Brunetto, S., Gammeri, L., Palmeri, D., Claudi, I., Shoenfeld, Y., et al. (2024). Machine learning application in autoimmune diseases: state of art and future prospectives. Autoimmun. Rev. 23:103496. doi: 10.1016/j.autrev.2023.103496

PubMed Abstract | Crossref Full Text | Google Scholar

Dankwa-Mullan, I., and Weeraratne, D. (2022). Artificial intelligence and machine learning technologies in cancer care: addressing disparities, bias, and data diversity. Cancer Discov. 12, 1423–1427. doi: 10.1158/2159-8290.CD-22-0373

PubMed Abstract | Crossref Full Text | Google Scholar

Davis, C. M., Apter, A. J., Casillas, A., Foggs, M. B., Louisias, M., Morris, E. C., et al. (2021). Health disparities in allergic and immunologic conditions in racial and ethnic underserved populations: a Work Group Report of the AAAAI Committee on the Underserved. J. Allergy Clin. Immunol. 147, 1579–1593. doi: 10.1016/j.jaci.2021.02.034

PubMed Abstract | Crossref Full Text | Google Scholar

Deep Medical (2025). Available online at: https://www.deep-medical.ai/ (Accessed July 1, 2025).

Google Scholar

Deloitte Center for Health Solutions (2025). Health Equity Remains a Business Imperative in the Life Sciences and Health Care Industries. Available online at: https://www2.deloitte.com/us/en/insights/industry/health-care/health-equity-business-imperative-in-2025.html (Accessed March 18, 2025).

Google Scholar

Drummond, R. (2023). How Ethnicity Recording Differs Across Health Data Sources and the Impact on Analysis. Office for National Statistics. Available online at: https://blog.ons.gov.uk/2023/01/16/how-ethnicity-recording-differs-across-health-data-sources-and-the-impact-on-analysis/ (Accessed February 21 2025).

Google Scholar

Eruchalu, C. N., Pichardo, M. S., Bharadwaj, M., Rodriguez, C. B., Rodriguez, J. A., Bergmark, R. W., et al. (2021). The expanding digital divide: digital health access inequities during the COVID-19 pandemic in New York City. J. Urban Health 98, 183–186. doi: 10.1007/s11524-020-00508-9

PubMed Abstract | Crossref Full Text | Google Scholar

Favaretto, M., De Clercq, E., and Elger, B. S. (2019). Big data and discrimination: perils, promises and solutions. A systematic review. J. Big Data 6:12. doi: 10.1186/s40537-019-0177-4

Crossref Full Text | Google Scholar

Fernandes, F. A., Chaltikyan, G., Adib, K., Caton-Peters, H., and Novillo-Ortiz, D. (2024). The role of governance in the digital transformation of healthcare: results of a survey in the WHO Europe Region. Int. J. Med. Inform. 189:105510. doi: 10.1016/j.ijmedinf.2024.105510

PubMed Abstract | Crossref Full Text | Google Scholar

Forrest, I. S., Petrazzini, B. O., Duffy, Á., Park, J. K., O'Neal, A. J., Jordan, D. M., et al. (2023). A machine learning model identifies patients in need of autoimmune disease testing using electronic health records. Nat. Commun. 14:2385. doi: 10.1038/s41467-023-37996-7

PubMed Abstract | Crossref Full Text | Google Scholar

Fuller, D., Buote, R., and Stanley, K. (2017). A glossary for big data in population and public health: discussion and commentary on terminology and research methods. J. Epidemiol. Community Health 71, 1113–1117. doi: 10.1136/jech-2017-209608

PubMed Abstract | Crossref Full Text | Google Scholar

Galea, S., and Abdalla, S. M. (2023). Data to improve global health equity – key challenges. JAMA Health Forum 4:e234433. doi: 10.1001/jamahealthforum.2023.4433

Crossref Full Text | Google Scholar

Gaspar, J., Catumbela, E., Marques, B., and Freitas, A. (2011). A systematic review of outliers detection techniques in medical data – preliminary study. Proc. Int. Conf. Health Inf. 0, 575–582. doi: 10.5220/0003168705750582

Crossref Full Text | Google Scholar

Gebru, T., Morgenstern, J., Vecchione, B., Vaughan, J. W., and Wallach, H. (2021). Datasheets for datasets. Commun. ACM 64, 86–92. doi: 10.1145/3458723

Crossref Full Text | Google Scholar

Gilvaz, V. J., and Reginato, A. M. (2023). Artificial intelligence in rheumatoid arthritis: potential applications and future implications. Front. Med. 10:1280312. doi: 10.3389/fmed.2023.1280312

PubMed Abstract | Crossref Full Text | Google Scholar

Global Burden of Disease (GBD) 2019 IMID Collaborators (2023). Global, regional, and national incidence of six major immune-mediated inflammatory diseases: findings from the Global Burden of Disease study 2019. EClinicalMedicine 64:102193. doi: 10.1016/j.eclinm.2023.102193

Crossref Full Text | Google Scholar

Goonesekera, S. D., Dey, S., Thakur, S., and Davila, E. P. (2024). Racial/ethnic differences in autoimmune disease prevalence in US claims/EHR data. Am. J. Manag. Care 30, e4–e10. doi: 10.37765/ajmc.2024.89488

PubMed Abstract | Crossref Full Text | Google Scholar

Gossec, L., Kedra, J., Servy, H., Pandit, A., Stones, S., Berenbaum, F., et al. (2020). EULAR points to consider for the use of big data in rheumatic and musculoskeletal diseases. Ann. Rheum. Dis. 79, 69–76. doi: 10.1136/annrheumdis-2019-215694

PubMed Abstract | Crossref Full Text | Google Scholar

Gravity Project (2025). Available online at: https://thegravityproject.net/ (Accessed June 16, 2025).

Google Scholar

Green, B., and Chen, Y. (2019). The principles and limits of algorithm-in-the-loop decision making. Proc. ACM Hum. Comput. Interact. 3:50. doi: 10.1145/3359152

Crossref Full Text | Google Scholar

Gurevich, E., El Hassan, B., and El Morr, C. (2023). Equity within AI systems: what can health leaders expect? Healthc. Manage. Forum 36, 119–124. doi: 10.1177/08404704221125368

PubMed Abstract | Crossref Full Text | Google Scholar

Hardinges, J. (2020). Data Trusts in 2020. Open Data Institute. Available online at: https://theodi.org/news-and-events/blog/data-trusts-in-2020/ (Accessed July 1, 2025).

Google Scholar

Hardt, M., Price, E., and Srebro, N. (2016). Equality of opportunity in supervised learning. arXiv 1–22. doi: 10.48550/arXiv.1610.02413

Crossref Full Text | Google Scholar

Harrington, R., Washington, D., Burke, A., Jones-Pool, M., Spaulding, B., and Willits, J. (2021). Evaluating Medicaid's Use of Quality Measurement to Achieve Equity Goals - White Paper. Available online at: https://www.ncqa.org/health-equity/measure-accountability/ (Accessed June 16, 2025).

Google Scholar

HL7 FHIR (2023). FHIR Overview. Available online at: https://www.hl7.org/fhir.html (Accessed June 16, 2025).

Google Scholar

Huang, Y., Guo, J., Chen, W.-H., Lin, H.-Y., Tang, H., Wang, F., et al. (2024). A scoping review of fair machine learning techniques when using real-world data. J. Biomed. Inform. 151:104622. doi: 10.1016/j.jbi.2024.104622

PubMed Abstract | Crossref Full Text | Google Scholar

Ibrahim, S. A., Charlson, M. E., and Neill, D. B. (2020). Big data analytics and the struggle for equity in health care: the promise and perils. Health Equity 4, 99–101. doi: 10.1089/heq.2019.0112

PubMed Abstract | Crossref Full Text | Google Scholar

Jacobs, B. M., Peter, M., Giovannoni, G., Noyce, A. J., Morris, H. R., and Dobson, R. (2022). Towards a global view of multiple sclerosis genetics. Nat. Rev. Neurol. 18, 613–623. doi: 10.1038/s41582-022-00704-y

PubMed Abstract | Crossref Full Text | Google Scholar

Juhn, Y. J., Ryu, E., Wi, C.-I., King, K. S., Malik, M., Romero-Brufau, S., et al. (2022). Assessing socioeconomic bias in machine learning algorithms in health care: a case study of the HOUSES index. J. Am. Med. Inform. Assoc. 29, 1142–1151. doi: 10.1093/jamia/ocac052

PubMed Abstract | Crossref Full Text | Google Scholar

Kamiran, F., and Calders, T. (2012). Data preprocessing techniques for classification without discrimination. Knowl. Inf. Syst. 33, 1–33. doi: 10.1007/s10115-011-0463-8

Crossref Full Text | Google Scholar

Kamishima, T., Akaho, S., Asoh, H., and Sakuma, J. (2012). “Fairness-aware classifier with prejudice remover regularizer,” in Machine Learning and Knowledge Discovery in Databases, eds. P. A. Flach, T. De Bie, and N. Cristianini (Berlin, Heidelberg: Springer), 35–50.

Google Scholar

Kaplan, A., Boivin, M., Bouchard, J., Kim, J., Hayes, S., and Licskai, C. (2023). The emerging role of digital health in the management of asthma. Ther. Adv. Chronic Dis. 14:20406223231209329. doi: 10.1177/20406223231209329

PubMed Abstract | Crossref Full Text | Google Scholar

Kaye, J., Whitley, E. A., Lund, D., Morrison, M., Teare, H., and Melham, K. (2015). Dynamic consent: a patient interface for twenty-first century research networks. Eur. J. Hum. Genet. 23, 141–146. doi: 10.1038/ejhg.2014.71

PubMed Abstract | Crossref Full Text | Google Scholar

Kearns, M., Neel, S., Roth, A., and Steven Wu, Z. (2018). “Preventing fairness gerrymandering: auditing and learning for subgroup fairness,” in Proceedings of the 35th International Conference on Machine Learning. Available online at: https://proceedings.mlr.press/v80/kearns18a.html (Accessed February 21, 2025).

Google Scholar

LaVeist, T. A., Pérez-Stable, E. J., Richard, P., Anderson, A., Isaac, L. A., Santiago, R., et al. (2023). The economic burden of racial, ethnic, and educational health inequities in the US. JAMA 329, 1682–1692. doi: 10.1001/jama.2023.5965

PubMed Abstract | Crossref Full Text | Google Scholar

Lekadir, K., Frangi, A. F., Porras, A. R., Glocker, B., Cintas, C., Langlotz, C. P., et al. (2025). FUTURE-AI: international consensus guideline for trustworthy and deployable artificial intelligence in healthcare. BMJ 388:e081554. doi: 10.1136/bmj-2024-081554

PubMed Abstract | Crossref Full Text | Google Scholar

Lerner, A., Jeremias, P., and Matthias, T. (2015). The world incidence and prevalence of autoimmune diseases is increasing. Int. J. Celiac Dis. 3, 151–155. doi: 10.12691/ijcd-3-4-8

Crossref Full Text | Google Scholar

Lim, D. (2025). Determinants of Socially Responsible AI Governance | Duke Law & Technology Review. Available online at: https://dltr.law.duke.edu/2025/01/27/determinants-of-socially-responsible-ai-governance/ (Accessed June 19, 2025).

Google Scholar

Lindemark, F., Norheim, O. F., and Johansson, K. A. (2014). Making use of equity sensitive QALYs: a case study on identifying the worse off across diseases. Cost Eff. Resour. Alloc. 12:16. doi: 10.1186/1478-7547-12-16

PubMed Abstract | Crossref Full Text | Google Scholar

Lovell, T. (2025). No Set Date for the NHS to be Paperless, Says Amanda Pritchard. Digital Health. Available online at: https://www.digitalhealth.net/2025/01/no-set-date-for-the-nhs-to-be-paperless-says-amanda-pritchard/ (Accessed June 16 2025).

Google Scholar

Lu, X., Yang, C., Liang, L., Hu, G., Zhong, Z., and Jiang, Z. (2024). Artificial intelligence for optimizing recruitment and retention in clinical trials: a scoping review. J. Am. Med. Inform. Assoc. 31, 2749–2759. doi: 10.1093/jamia/ocae243

PubMed Abstract | Crossref Full Text | Google Scholar

Mehrabi, N., Morstatter, F., Saxena, N., Lerman, K., and Galstyan, A. (2021). A survey on bias and fairness in machine learning. ACM Comput. Surv. 54:115. doi: 10.1145/3457607

Crossref Full Text | Google Scholar

Mienye, I. D., Swart, T. G., and Obaido, G. (2024). “Fairness metrics in ai healthcare applications: a review,” in 2024 IEEE International Conference on Information Reuse and Integration for Data Science (IRI) (New York, NY), 284–289.

Google Scholar

Miller, F. W. (2023a). The increasing prevalence of autoimmunity and autoimmune diseases: an urgent call to action for improved understanding, diagnosis, treatment, and prevention. Curr. Opin. Immunol. 80, 102266. doi: 10.1016/j.coi.2022.102266

PubMed Abstract | Crossref Full Text | Google Scholar

Miller, F. W. (2023b). Autoimmunity Has Reached Epidemic Levels. We Need Urgent Action to Address It. Scientific American. Available online at: https://www.scientificamerican.com/article/autoimmunity-has-reached-epidemic-levels-we-need-urgent-action-to-address-it/ (Accessed February 21, 2025).

Google Scholar

Munoz, C., da Costa, K., Modenesi, B., and Koshiyama, A. (2023). Evaluating explainability in machine learning predictions through explainer-agnostic metrics. arXiv.org. Available online at: https://arxiv.org/abs/2302.12094v3 (Accessed June 19 2025).

Google Scholar

Murdoch, B. (2021). Privacy and artificial intelligence: challenges for protecting health information in a new era. BMC Med. Ethics 22:122. doi: 10.1186/s12910-021-00687-3

PubMed Abstract | Crossref Full Text | Google Scholar

National Institute of Standards in Technology (2023). AI Risk Management Framework. Available online at: https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.100-1.pdf (Accessed June 19, 2025).

Google Scholar

National Institutes of Health (2025). All of Us Research Program. Available online at: https://allofus.nih.gov/ (Accessed June 19, 2025).

Google Scholar

National Telecommunications and Information Administration (2024). Liability Rules and Standards. Washington, DC: Liability Rules and Standards. Available online at: https://www.ntia.gov/issues/artificial-intelligence/ai-accountability-policy-report/using-accountability-inputs/liability-rules-and-standards (Accessed June 19, 2025).

Google Scholar

NHS Confederation (2023). Key Statistics on the NHS. Available online at: https://www.nhsconfed.org/articles/key-statistics-nhs (Accessed June 16, 2025).

Google Scholar

Nikpay, S., Zhang, Z., and Karaca-Mandic, P. (2024). Return on investments in social determinants of health interventions: what is the evidence? Health Aff. Sch. 2:qxae114. doi: 10.1093/haschl/qxae114

PubMed Abstract | Crossref Full Text | Google Scholar

Norori, N., Hu, Q., Aellen, F. M., Faraci, F. D., and Tzovara, A. (2021). Addressing bias in big data and AI for health care: a call for open science. Patterns 2:100347. doi: 10.1016/j.patter.2021.100347

PubMed Abstract | Crossref Full Text | Google Scholar

Obermeyer, Z., Powers, B., Vogeli, C., and Mullainathan, S. (2019). Dissecting racial bias in an algorithm used to manage the health of populations. Science 366, 447–453. doi: 10.1126/science.aax2342

PubMed Abstract | Crossref Full Text | Google Scholar

Palaniappan, K., Lin, E. Y. T., and Vogel, S. (2024). Global regulatory frameworks for the use of artificial intelligence (AI) in the healthcare services sector. Healthcare 12:562. doi: 10.3390/healthcare12050562

PubMed Abstract | Crossref Full Text | Google Scholar

Panch, T., Mattie, H., and Atun, R. (2019). Artificial intelligence and algorithmic bias: implications for health systems. J. Glob. Health 9:010318. doi: 10.7189/jogh.09.020318

PubMed Abstract | Crossref Full Text | Google Scholar

Papagiannidis, E., Mikalef, P., and Conboy, K. (2025). Responsible artificial intelligence governance: a review and research framework. J. Strat. Inf. Syst. 34:101885. doi: 10.1016/j.jsis.2024.101885

Crossref Full Text | Google Scholar

Parikh, R. B., Teeple, S., and Navathe, A. S. (2019). Addressing bias in artificial intelligence in health care. JAMA 322:2377. doi: 10.1001/jama.2019.18058

PubMed Abstract | Crossref Full Text | Google Scholar

Peng, J., Jury, E. C., Dönnes, P., and Ciurtin, C. (2021). Machine learning techniques for personalised medicine approaches in immune-mediated chronic inflammatory diseases: applications and challenges. Front. Pharmacol. 12:720694. doi: 10.3389/fphar.2021.720694

PubMed Abstract | Crossref Full Text | Google Scholar

Pezoulas, V. C., and Fotiadis, D. I. (2024). The pivotal role of data harmonization in revolutionizing global healthcare: a framework and a case study. Chatmed 3, 1–10. doi: 10.20517/chatmed.2023.37

Crossref Full Text | Google Scholar

Pisetsky, D. S. (2023). Pathogenesis of autoimmune disease. Nat. Rev. Nephrol. 19, 509–524. doi: 10.1038/s41581-023-00720-1

PubMed Abstract | Crossref Full Text | Google Scholar

Quinn, M. M., Sembajwe, G., Stoddard, A. M., Kriebel, D., Krieger, N., Sorensen, G., et al. (2007). Social disparities in the burden of occupational exposures: results of a cross-sectional study. Am. J. Ind. Med. 50, 861–875. doi: 10.1002/ajim.20529

PubMed Abstract | Crossref Full Text | Google Scholar

Rahman, R. (2025). Federated learning: a survey on privacy-preserving collaborative intelligence. arXiv 1–6. doi: 10.48550/arXiv.2504.17703

Crossref Full Text | Google Scholar

Rajkomar, A., Hardt, M., Howell, M. D., Corrado, G., and Chin, M. H. (2018). Ensuring fairness in machine learning to advance health equity. Ann. Intern. Med. 169, 866–872. doi: 10.7326/M18-1990

PubMed Abstract | Crossref Full Text | Google Scholar

Roberts, M. H., and Erdei, E. (2020). Comparative United States autoimmune disease rates for 2010–2016 by sex, geographic region, and race. Autoimmun. Rev. 19:102423. doi: 10.1016/j.autrev.2019.102423

PubMed Abstract | Crossref Full Text | Google Scholar

Rodríguez-Barroso, N., Stipcich, G., Jiménez-López, D., Ruiz-Millán, J. A., Martínez-Cámara, E., González-Seco, G., et al. (2020). Federated learning and differential privacy: software tools analysis, the Sherpa.ai FL framework and methodological guidelines for preserving data privacy. Inf. Fus. 64, 270–292. doi: 10.1016/j.inffus.2020.07.009

Crossref Full Text | Google Scholar

Saatçi, G., Korkut, S., and Ünsal, A. (2024). The effect of the use of artificial intelligence in the preparation of patient education materials by nursing students on the understandability, actionability and quality of the material: a randomized controlled trial. Nurse Educ. Pract. 81:104186. doi: 10.1016/j.nepr.2024.104186

PubMed Abstract | Crossref Full Text | Google Scholar

Sabet, C., Hammond, A., Ravid, N., Tong, M. S., and Stanford, F. C. (2023). Harnessing big data for health equity through a comprehensive public database and data collection framework. NPJ Digit. Med. 6:91. doi: 10.1038/s41746-023-00844-5

PubMed Abstract | Crossref Full Text | Google Scholar

Schaekermann, M., Spitz, T., Pyles, M., Cole-Lewis, H., Wulczyn, E., Pfohl, S. R., et al. (2024). Health equity assessment of machine learning performance (HEAL): a framework and dermatology AI model case study. EClinicalMedicine 70:102479. doi: 10.1016/j.eclinm.2024.102479

PubMed Abstract | Crossref Full Text | Google Scholar

Schulte, T., and Bohnet-Joschko, S. (2022). How can big data analytics support people-centred and integrated health services: a scoping review. Int. J. Integr. Care 22:23. doi: 10.5334/ijic.5543

PubMed Abstract | Crossref Full Text | Google Scholar

Schultze, J. L. (2015). Teaching “big data” analysis to young immunologists. Nat. Immunol. 16, 902–905. doi: 10.1038/ni.3250

PubMed Abstract | Crossref Full Text | Google Scholar

Schwartz, R., Vassilev, A., Greene, K. K., Perine, L., Burt, A., and Hall, P. (2022). Towards a Standard for Identifying and Managing Bias in Artificial Intelligence. Special Publication (NIST SP) 1270. Available online at: https://www.nist.gov/publications/towards-standard-identifying-and-managing-bias-artificial-intelligence (Accessed February 21, 2025).

Google Scholar

Sharma-Oates, A., Zemedikun, D. T., Kumar, K., Reynolds, J. A., Jain, A., Raza, K., et al. (2022). Early onset of immune-mediated diseases in minority ethnic groups in the UK. BMC Med. 20:346. doi: 10.1186/s12916-022-02544-5

PubMed Abstract | Crossref Full Text | Google Scholar

Shin, Y. H., Hwang, J., Kwon, R., Lee, S. W., Kim, M. S., et al. (2023). Global, regional, and national burden of allergic disorders and their risk factors in 204 countries and territories, from 1990 to 2019: a systematic analysis for the Global Burden of Disease Study 2019. Allergy 78, 2232–2254. doi: 10.1111/all.15807

PubMed Abstract | Crossref Full Text | Google Scholar

Stoyanovich, J., and Howe, B. (2019). Nutritional Labels for Data and Models. A Quarterly bulletin of the Computer Society of the IEEE Technical Committee on Data Engineering, 42. Available online at: https://par.nsf.gov/biblio/10176629-nutritional-labels-data-models (Accessed July 31, 2025).

Google Scholar

Suresh, H., and Guttag, J. (2021). “A framework for understanding sources of harm throughout the machine learning life cycle,” in Paper Presented at Equity and Access in Algorithms, Mechanisms, and Optimization (EAAMO) 2021 (New York, NY). Available online at: https://dl.acm.org/doi/10.1145/3465416.3483305 (Accessed February 21, 2025).

Google Scholar

Turner, B. E., Steinberg, J. R., Weeks, B. T., Rodriguez, F., and Cullen, M. R. (2022). Race/ethnicity reporting and representation in US clinical trials: a cohort study. Lancet Reg. Health Am. 11:100252. doi: 10.1016/j.lana.2022.100252

PubMed Abstract | Crossref Full Text | Google Scholar

Vishwanatha, J. K., Christian, A., Sambamoorthi, U., Thompson, E. L., Stinson, K., and Syed, T. A. (2023). Community perspectives on AI/ML and health equity: AIM-AHEAD nationwide stakeholder listening sessions. PLOS Digit. Health 2:e0000288. doi: 10.1371/journal.pdig.0000288

PubMed Abstract | Crossref Full Text | Google Scholar

Vojdani, A. (2014). A potential link between environmental triggers and autoimmunity. Autoimmune Dis. 2014:437231. doi: 10.1155/2014/437231

PubMed Abstract | Crossref Full Text | Google Scholar

Vojdani, A., Pollard, K. M., and Campbell, A. W. (2014). Environmental triggers and autoimmunity. Autoimmune Dis. 2014:798029. doi: 10.1155/2014/798029

PubMed Abstract | Crossref Full Text | Google Scholar

Vyas, D. A., Eisenstein, L. G., and Jones, D. S. (2020). Hidden in plain sight – reconsidering the use of race correction in clinical algorithms. N. Engl. J. Med. 383, 874–882. doi: 10.1056/NEJMms2004740

PubMed Abstract | Crossref Full Text | Google Scholar

Wang, C., Markus, H., Diwadkar, A. R., Khunsriraksakul, C., Carrel, L., Li, B., et al. (2025). Integrating electronic health records and GWAS summary statistics to predict the progression of autoimmune diseases from preclinical stages. Nat. Commun. 16:180. doi: 10.1038/s41467-024-55636-6

PubMed Abstract | Crossref Full Text | Google Scholar

Wang, L., Khunsriraksakul, C., Markus, H., Chen, D., Zhang, F., Chen, F., et al. (2024). Integrating single cell expression quantitative trait loci summary statistics to understand complex trait risk genes. Nat. Commun. 15:4260. doi: 10.1038/s41467-024-48143-1

PubMed Abstract | Crossref Full Text | Google Scholar

Wang, Z., Li, Y., Gao, Y., Fu, Y., Lin, J., Lei, X., et al. (2023). Global, regional, and national burden of asthma and its attributable risk factors from 1990 to 2019: a systematic analysis for the Global Burden of Disease Study 2019. Respir. Res. 24:169. doi: 10.1186/s12931-023-02475-6

PubMed Abstract | Crossref Full Text | Google Scholar

Wieringa, M. (2020). “What to account for when accounting for algorithms: a systematic literature review on algorithmic accountability,” in Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (New York, NY: Association for Computing Machinery), 1–18.

Google Scholar

Wilkinson, M. D., Dumontier, M., Aalbersberg, I. J. J., Appleton, G., Axton, M., Baak, A., et al. (2016). The FAIR guiding principles for scientific data management and stewardship. Sci. Data 3:160018. doi: 10.1038/sdata.2016.18

PubMed Abstract | Crossref Full Text | Google Scholar

Will, J., Gupta, M., Zaretsky, J., Dowlath, A., Testa, P., and Feldman, J. (2025). Enhancing the readability of online patient education materials using large language models: cross-sectional study. J. Med. Int. Res. 27:e69955. doi: 10.2196/69955

PubMed Abstract | Crossref Full Text | Google Scholar

Wylezinski, L. S., Gray, J. D., Polk, J. B., Harmata, A. J., and Spurlock, C. F. (2019). Illuminating an invisible epidemic: a systemic review of the clinical and economic benefits of early diagnosis and treatment in inflammatory disease and related syndromes. J. Clin. Med. 8:493. doi: 10.3390/jcm8040493

PubMed Abstract | Crossref Full Text | Google Scholar

Xiao, N., Huang, X., Wu, Y., Li, B., Zang, W., Shinwari, K., et al. (2025). Opportunities and challenges with artificial intelligence in allergy and immunology: a bibliometric study. Front. Med. 12:1523902. doi: 10.3389/fmed.2025.1523902

PubMed Abstract | Crossref Full Text | Google Scholar

Yanagisawa, Y., Shido, K., Kojima, K., and Yamasaki, K. (2023). Convolutional neural network-based skin image segmentation model to improve classification of skin diseases in conventional and non-standardized picture images. J. Dermatol. Sci. 109, 30–36. doi: 10.1016/j.jdermsci.2023.01.005

PubMed Abstract | Crossref Full Text | Google Scholar

Yerramilli, P., Chopra, M., and Rasanathan, K. (2024). The cost of inaction on health equity and its social determinants. BMJ Glob. Health 9(Suppl. 1):e012690. doi: 10.1136/bmjgh-2023-012690

PubMed Abstract | Crossref Full Text | Google Scholar

Yoosuf, N., Maciejewski, M., Ziemek, D., Jelinsky, S. A., Folkersen, L., Müller, M., et al. (2022). Early prediction of clinical response to anti-TNF treatment using multi-omics and machine learning in rheumatoid arthritis. Rheumatology 61, 1680–1689. doi: 10.1093/rheumatology/keab521

PubMed Abstract | Crossref Full Text | Google Scholar

Zafar, M. B., Valera, I., Rogriguez, M. G., and Gummadi, K. P. (2017). “Fairness constraints: mechanisms for fair classification,” in Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (PMLR), 962–970. Available online at: https://proceedings.mlr.press/v54/zafar17a.html (Accessed June 16, 2025).

Google Scholar

Zhang, Z., and Neill, D. B. (2016). Identifying significant predictive bias in classifiers. arXiv. 1–5. doi: 10.48550/arXiv.1611.08292

Crossref Full Text | Google Scholar

Zimmerman, F. J., and Anderson, N. W. (2019). Trends in health equity in the United States by race/ethnicity, sex, and income, 1993–2017. JAMA Netw. Open 2:e196386. doi: 10.1001/jamanetworkopen.2019.6386

PubMed Abstract | Crossref Full Text | Google Scholar

Keywords: health equity, AI, machine learning, big data, big data analytics, immunology, immune disease

Citation: Kachnowski S, Khan AH, Floquet S, Whitlock KK, Wisnivesky JP, Neill DB, Dankwa-Mullan I, Ortega G, Daoud M, Zaheer R, Hightower M and Rowe P (2025) Achieving health equity in immune disease: leveraging big data and artificial intelligence in an evolving health system landscape. Front. Big Data 8:1621526. doi: 10.3389/fdata.2025.1621526

Received: 01 May 2025; Accepted: 08 October 2025;
Published: 14 November 2025.

Edited by:

Thomas Hartung, Johns Hopkins University, United States

Reviewed by:

Sotiris Kotsiantis, University of Patras, Greece
Ayse Gul Eker, Kocaeli University, Türkiye

Copyright © 2025 Kachnowski, Khan, Floquet, Whitlock, Wisnivesky, Neill, Dankwa-Mullan, Ortega, Daoud, Zaheer, Hightower and Rowe. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Asif H. Khan, asif.khan@sanofi.com

These authors have contributed equally to this work

ORCID: Asif H. Khan, orcid.org/0000-0001-9158-0116

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.