METHODS article

Front. Disaster Emerg. Med., 28 April 2026

Sec. Emergency Health Services

Volume 4 - 2026 | https://doi.org/10.3389/femer.2026.1748193

Deploying the Medical Informatics Platform for cross-border federated analytics in FERES and eCREAM

  • 1. NeuroDigital@NeuroTech Lab, Department of Clinical Neurosciences, Lausanne University Hospital, Lausanne, Switzerland

  • 2. Athena Research Center, Information Management Systems Institute, Athens, Greece

  • 3. Department of Medical Epidemiology, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Milan, Italy

  • 4. Stroke Center, Neurology Service, Department of Clinical Neurosciences, Lausanne University Hospital and University of Lausanne, Lausanne, Switzerland

  • 5. The European Clinical Research Infrastructure Network ECRIN, Paris, France

Abstract

Emergency medicine generates vast quantities of electronic health record (EHR) data across hospitals and countries, but leveraging these datasets for research and quality improvement is challenging due to privacy regulations, data silos, and heterogeneity of systems. Here, we describe how the Medical Informatics Platform (MIP) operationalizes cross-border federated analytics, combining governance, privacy-preserving data preparation, secure deployment, and federated execution as illustrated through the FERES and eCREAM federations. Each participating site runs a local MIP “node” containing its anonymized dataset behind its firewall; analysis queries are executed locally, and only aggregated results are shared with a central interface. Through this approach, sensitive patient data never leave their site of origin, yet clinicians and researchers can collaboratively analyze large multi-centric datasets in real time via a web-based interface. The MIP provides an intuitive, visualization-rich environment where users can select variables, apply statistical or machine learning algorithms, and interactively review results through charts and graphs. Robust governance and security measures are built in: data remain under the control of the original institutions (who act as data controllers), all datasets are harmonized to common data models and irreversibly anonymized prior to analysis, and the platform enforces strict privacy safeguards to protect against re-identification. The MIP has been deployed in EU-funded initiatives including the Federating European REgistries for Stroke (FERES) project, which is part of the larger EBRAINS initiative, and the eCREAM (enabling Clinical Research in Emergency and Acute care Medicine) retrospective observational multicenter study, allowing cross-border analyses of stroke outcomes and emergency department data while complying with GDPR and national regulations. 
By enabling international EHR collaborations without compromising patient privacy, the MIP shows how electronic records can support cross-border research and quality-improvement analyses in emergency medicine. This manuscript primarily reports the implementation approach and operational blueprint; it is not presented as a clinical outcomes study or a usability evaluation.

1 Introduction

1.1 Rationale and objective

Cross-border clinical research using routinely collected health data remains constrained by legal, ethical, and technical barriers (privacy regulation, institutional control, heterogeneous data structures). Our objective in this manuscript is to report an implementation-focused blueprint for enabling privacy-preserving, cross-border federated analytics with the Medical Informatics Platform (MIP) (1, 2).

1.2 Research question/objective

How can a multinational consortium operationalize federated (“code-to-data”) analyses across institutions while maintaining local control over sensitive patient datasets and meeting governance and privacy requirements?

1.3 Focus and scope

The primary focus of this manuscript is the implementation approach (governance, data preparation/harmonization, secure node deployment and configuration, and federated execution). Platform component descriptions are provided as contextual background only insofar as they are necessary to understand the implementation steps.

1.4 Manuscript organization

We first motivate federated analysis and summarize two deployments (FERES and eCREAM). We then describe core platform components, followed by the implementation workflow and privacy safeguards required to execute federated analyses in practice.

1.5 Federated analysis: a new opportunity for emergency medicine data

EHRs in emergency and acute care hold rich clinical information but remain underused for research and quality improvement due to barriers such as patient data sensitivity, strict data protection laws, data fragmentation across institutions, incompatible formats, and a lack of tools for multi-center analysis (3, 4). In emergency departments (EDs), high volumes and time pressure leave little capacity for research-oriented data curation (5). Most EHRs were built for documentation and accountability, not secondary analysis; they mix heterogeneous structured fields with abundant free text, complicating retrospective studies (6). Even when hospitals want to pool data, sharing patient-level records is often impractical or unlawful (7). Bienzeisler et al. describe a nationwide emergency registry using a federated access system; they note that hospitals preferred to “maintain control over their data” and that federated access was crucial to overcome the “privacy–exploitation barrier” in data sharing (8). In a recent comparative study of data-sharing strategies, Rujano and colleagues observed that centralized models facilitate data linkage, harmonization and interoperability, whereas federated models facilitate scaling up and legal compliance because data remain under the control of the data generator (9). At the same time, we acknowledge that large federated infrastructures still rely heavily on interoperability standards and harmonized semantics to ensure that cross-site analyses are comparable and interpretable. This gap between EHRs as care logs and their potential for insight has driven new approaches that enable collaboration without exposing patient identities.

The MIP exemplifies a privacy-preserving federated approach within the EBRAINS infrastructure (10): each participating site runs a local MIP node behind its own firewall to execute analyses on its anonymized data, and only aggregated results or model parameters are sent back to a central interface. This design keeps patient-level data under local custody (at its original institution or country of origin) and is consistent with other established networks like EHDEN (11). (For a detailed technical description of the MIP architecture, see below in Materials and Equipment).

1.6 Real-world MIP deployments: from stroke registries to emergency care

The MIP is deployed for the FERES project, which unites national stroke registries to enable pan-European analyses of care and outcomes. A 2023 systematic review by Leigh et al. identified dozens of national stroke registries with variable data elements, noting fragmentation of stroke data across Europe (12). Previously confined within national borders, some of these datasets can now support virtual pooled analyses through the MIP without any country relinquishing control (9). Accredited researchers can compare treatment modalities, treatment times, and outcomes across countries and develop predictive models on hundreds of thousands of aggregated cases, while data remain local and small-count results are hidden. FERES, under the coordination of the Lausanne University Hospital, is securing the required ethics and regulatory approvals, with each registry obtaining consent or legal clearance for secondary use and signing the necessary agreements. Its inclusion in the European initiative EBRAINS 2.0 (10) in 2024 highlights FERES as a model of federated clinical research and a means to identify best practices that no single stroke registry could reveal alone (13).

In emergency medicine, the eCREAM project brings together 11 partners across 8 countries to unlock ED EHRs for research. Use Case 1 (UC1) of the project follows a two-step privacy workflow. First, each hospital extracts predefined variables from its ED system (structured fields plus information derived from free text via NLP) and pseudonymizes the data locally: direct identifiers are removed and replaced with site-held codes. The pseudonymized records are securely transferred to a central eCREAM server, where they are cleaned, harmonized, and organized into two cohorts: adult patients with dyspnea and those with transient loss of consciousness (TLoC) (14). Second, after this curation, the combined dataset is irreversibly anonymized on the central server so that no re-identification is possible. The fully anonymized multi-site dataset is then loaded into the MIP environment for federated analysis, enabling authorized investigators to run cross-hospital queries in compliance with GDPR and national regulations while only receiving aggregated results. A detailed step-by-step description of the eCREAM UC1 ETL pipeline (including quality checks) is provided in Section 3.2.

By pairing data use agreements and ethics approvals with MIP's privacy-by-design architecture, eCREAM enables secondary analysis of ED EHR data for questions such as factors influencing admission rates or outcomes across regions. A recent NPJ Digital Medicine study of Germany's national ED registry underscores the potential of sharing ED data: it analyzed 7.9 million ED records across 58 hospitals via a federated authorization system (8). Its authors conclude that continuous EHR sharing (with local control) is key to powering real-time clinical insights and cite the European Health Data Space (EHDS) initiative as motivation.

In the remainder of this manuscript, we explicitly distinguish (i) platform-level MIP components that are generic across federations, from (ii) implementation choices and workflows that are specific to FERES (stroke registries) and to eCREAM (emergency department EHR extraction and curation). The originality of this manuscript lies in consolidating the end-to-end, cross-border implementation blueprint (governance + data preparation + secure deployment + federated execution) and reporting how these elements are operationalized in two active federations.

2 Materials and equipment

This section summarizes the platform components needed to understand the implementation workflow, as described later in Section 3.

2.1 Federated architecture: central and local nodes

The Medical Informatics Platform (MIP) is built on a federated, two-tier architecture consisting of a central node and multiple local (member) nodes. Each participating institution (e.g., a hospital or a research institution) runs its own MIP node locally, behind its institutional firewall, where it stores and processes its data. This means that patient-level data never leaves the institution's servers: all computations on the raw data occur at the local nodes. The central MIP node, on the other hand, does not contain raw data; instead, it serves as the aggregation and coordination point for analyses. When an accredited researcher initiates an analysis via the platform, the central node dispatches the query to all connected local nodes, each of which executes the analysis on its local dataset. Only aggregated results or model parameters (never individual records) are then transmitted back to the central node, which combines them (for example, by averaging model parameters or pooling summary statistics) and presents the federated result to the user. This design preserves data privacy across national boundaries while enabling large-scale analyses, as emphasized by Filippopolitis et al. (2, 15).
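The "pooling summary statistics" pattern described above can be illustrated with a minimal Python sketch. It is not the MIP's code: the names `LocalSummary`, `local_step`, and `central_step`, and the per-site values, are hypothetical and chosen only for illustration. Each node returns a count, a sum, and a sum of squares; the central step derives the pooled mean and sample standard deviation without seeing any individual value.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class LocalSummary:
    """Aggregates a node is allowed to return; no patient-level rows."""
    n: int
    total: float
    total_sq: float


def local_step(values: list[float]) -> LocalSummary:
    """Runs inside a member node, behind the institutional firewall."""
    return LocalSummary(
        n=len(values),
        total=sum(values),
        total_sq=sum(v * v for v in values),
    )


def central_step(summaries: list[LocalSummary]) -> tuple[float, float]:
    """Runs on the central node: pools summaries into mean and sample SD."""
    n = sum(s.n for s in summaries)
    total = sum(s.total for s in summaries)
    total_sq = sum(s.total_sq for s in summaries)
    mean = total / n
    variance = (total_sq - n * mean * mean) / (n - 1)
    return mean, sqrt(variance)


# Hypothetical per-site values (e.g., ages); they are only read by local_step.
site_a = [61.0, 70.0, 58.0]
site_b = [75.0, 66.0]
pooled_mean, pooled_sd = central_step([local_step(site_a), local_step(site_b)])
```

The pooled result is identical to what a centralized computation over all five values would give, which is the property that makes this kind of decomposition attractive for federated descriptive statistics.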

2.2 Web-based user interface for end users

From the user's perspective, a core component of the MIP is its web-based graphical user interface (GUI). The platform's front-end is implemented as a modern web application (built with Node.js and TypeScript frameworks) that runs in any standard browser, requiring no local installation. This design emphasizes usability and accessibility for non-IT users such as clinicians and clinical researchers. Through the GUI, users can easily perform complex analyses by pointing and clicking rather than writing code. The interface allows researchers to browse available variables and schemas of the datasets via effective visualizations like circle packing (Figure 1), configure analysis parameters, perform descriptive analytics and execute statistical or machine learning algorithms on the federated data with just a few steps. Results are returned in real time to the GUI, where they are visualized in intuitive charts, tables, or graphs. The overall goal is to hide the technical complexity of distributed computing behind a user-friendly portal. Mirroring familiar analytics workflows, clinicians, epidemiologists, and health data specialists use the MIP as a conventional analysis tool. They select variables of interest, choose an algorithm (e.g., survival analysis or a machine-learning model), and review the aggregated results. The difference is that, in the MIP, those results are computed simultaneously across data from multiple hospitals.

Figure 1

2.3 Distributed analytics engine (Exareme2)

MIP's compute core, Exareme2, runs analyses across participating sites and merges the results. It decomposes a user's request (e.g., a logistic regression) into jobs sent to each member node; each site computes partial summaries or model coefficients on its own data, and the central service combines these into a single federated result. Because computation happens in parallel and in place, the system handles large datasets and multiple users without centralizing patient-level records (15). As new sites join, capacity scales horizontally: each node processes its share using local resources while the platform coordinates the workflow end-to-end.
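As an illustration of how a logistic regression can be decomposed into per-site jobs and a central combination step, the following Python sketch uses Newton's method with one predictor and an intercept. It is an assumption-laden teaching example, not Exareme2's algorithm or API: the function names `site_job` and `federated_logistic` and the toy data are hypothetical. Each site returns only the summed gradient and information matrix for the current coefficients.

```python
from math import exp

Row = tuple[float, int]  # (predictor x, binary outcome y)


def site_job(rows: list[Row], beta: list[float]) -> tuple[list[float], list[list[float]]]:
    """Runs at a member node: returns summed gradient and information matrix."""
    grad = [0.0, 0.0]
    info = [[0.0, 0.0], [0.0, 0.0]]
    for x, y in rows:
        design = (1.0, x)  # intercept plus one predictor
        p = 1.0 / (1.0 + exp(-(beta[0] + beta[1] * x)))
        w = p * (1.0 - p)
        for i in range(2):
            grad[i] += (y - p) * design[i]
            for j in range(2):
                info[i][j] += w * design[i] * design[j]
    return grad, info


def federated_logistic(sites: list[list[Row]], iterations: int = 25) -> list[float]:
    """Runs centrally: sums the site contributions and takes Newton steps."""
    beta = [0.0, 0.0]
    for _ in range(iterations):
        results = [site_job(rows, beta) for rows in sites]
        g = [sum(r[0][i] for r in results) for i in range(2)]
        h = [[sum(r[1][i][j] for r in results) for j in range(2)] for i in range(2)]
        det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
        # Solve the 2x2 system h * step = g in closed form.
        step = [
            (h[1][1] * g[0] - h[0][1] * g[1]) / det,
            (h[0][0] * g[1] - h[1][0] * g[0]) / det,
        ]
        beta = [beta[0] + step[0], beta[1] + step[1]]
    return beta


# Hypothetical toy data held at two sites.
SITE_A = [(0.0, 0), (1.0, 0), (2.0, 1), (3.0, 0)]
SITE_B = [(1.0, 1), (2.0, 0), (3.0, 1), (4.0, 1)]
beta_fed = federated_logistic([SITE_A, SITE_B])
```

Because the gradient and information matrix are sums over records, adding the per-site contributions gives the same update as fitting on the pooled rows, so the federated coefficients match a centralized fit up to floating-point rounding.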

2.4 Containerization and orchestration (Kubernetes)

The platform's services are packaged as containers and primarily orchestrated on Kubernetes because the MIP must be deployed reproducibly across heterogeneous environments, including centrally hosted infrastructure and hospital-side member nodes (15). Kubernetes provides a common operational substrate that supports declarative Infrastructure-as-Code deployment, easier scaling, federation-specific isolation, rolling updates, and self-healing behavior that improves service continuity. This choice also aligns with the infrastructure available at the Swiss National Supercomputing Center (CSCS), where central MIP services can be deployed through Kubernetes-as-a-Service, while member sites can run lightweight Kubernetes installations on dedicated local servers when needed.

For each federation, the system looks like a small “team”: a central node hosts the web portal with the graphical interface and coordinates the analytics engine in the backend, while one or more worker nodes run the analysis close to the hospitals' anonymized data. Some hospitals participate remotely with their own worker nodes running in separate clusters; secure cross-cluster networking (e.g., Submariner) links these remote workers to the central node as if they were on the same network, while policies ensure data never flows directly between hospitals.

Behind the scenes, common Kubernetes services take care of traffic routing (ingress), security boundaries (network policies), and observability. Central dashboards for logs, metrics, and audit trails monitor the health and compliance of the system without exposing technical details to users. Rolling updates and self-healing keep services available during maintenance. For users, all this machinery is invisible: the MIP web portal loads reliably, and analyses run smoothly. User identity, single sign-on, and permissions are provided by the platform's IAM service, described next.

2.5 Authentication and user access management

Because the MIP spans multiple institutions and countries, strong authentication and authorization ensure that only approved users can access the system, and only within their permissions. The platform uses an integrated Identity and Access Management (IAM) system based on Keycloak (RRID:SCR_021222), providing single sign-on for the MIP. The Keycloak solution supports standard protocols and can federate identities from other providers (e.g., EBRAINS accounts or institutional logins).

In federations such as FERES or eCREAM, users are onboarded through the governance process and are then issued MIP accounts tied to their organizational email and EBRAINS identity (16). Once logged into the web interface, authorization rules determine what each user can do and see, aligning access with project roles and the scope approved by the consortium.

2.6 Data curation

Before any analysis can be performed on the MIP, each participating site must prepare its dataset in a standardized and privacy-preserving format. The data preparation process is a crucial “materials” component of the platform. Each site extracts data from its EHR system, converts it to a tabular format, and removes personal identifiers. It then harmonizes the dataset to the common data format defined by the project's pathology. Detailed procedures for anonymization, quality checks and harmonization are described in the Data Preparation and Harmonization section.

3 Methods

The “methods” reported here describe the implementation workflow required to operationalize a cross-border MIP federation: (1) governance and user accreditation, (2) data preparation/harmonization and privacy transformation, (3) secure node deployment and network configuration, and (4) federated analysis execution with disclosure protection. Section 2 provides the platform component context needed to understand these steps; the focus of Section 3 is the practical implementation procedure and how it is applied in the FERES and eCREAM federations.

3.1 Governance and user accreditation

Each MIP federation is established by a consortium of institutions focused on a specific clinical domain. A governance framework ensures that only authorized, qualified researchers can access the federated network and its data, consistent with evidence from European health data hubs that identify authorization functionality, regulated access mechanisms, and documented provider identification and anonymization practices as core governance features for secondary use (17).

Participating institutions enter into a Data Sharing Agreement (DSA) and/or Data Transfer Agreement (DTA) with the coordinating center to regulate data use and platform deployment. Users are also bound by the EBRAINS access framework and the MIP General Terms and Conditions, which forbid sharing data outside the platform or attempting re-identification (16, 18).

Access to a given MIP federation is granted through a governance committee of that specific federation. When a new user requests access, a committee representing the contributing institutions reviews the application (including the researcher's affiliation and intended use). Upon approval, the user's EBRAINS account is accredited for the specific federation, enabling login to that federation's analysis environment. In FERES, for instance, representatives from each national stroke registry collectively evaluate new user requests in line with the FERES User Charter and the project's ethical approvals. The researcher's home institution must also endorse access and, upon joining the federation, sign the necessary agreements with CHUV to formalize data hosting and analysis rights. This multi-layered accreditation process ensures that data is accessed only for approved research purposes by qualified individuals.

The federation's governance framework is designed to meet legal and ethical requirements across jurisdictions. Each participating site remains the custodian of its data and retains the right to approve analyses and publications involving its patients' data. In practice, all analyses are conducted within the scope defined by the consortium's research protocol. In addition, users are informed of, and agree to, strict privacy and security rules before accessing any data. For example, the MIP General Terms and Conditions mandate that users maintain data confidentiality and acknowledge that any breach (such as attempting to identify a patient) will result in termination of access. Onboarding an individual user onto the MIP completes within approximately 2–8 weeks, depending on the cadence of governance committee review and any required institutional endorsements. However, when an institution joins as a new Federation member, additional institutional or administrative steps at national level are often required (Data Agreements, Ethics, and Clinical Research authorities' approvals). In such cases onboarding may take much longer, on the order of many months, before the member can make its dataset available and engage in Federation activities.

The eCREAM project provides another useful example of tiered access governance. In eCREAM, three MIP environments are planned: (1) use of the public MIP instance hosted on EBRAINS for broader community access to a dedicated anonymized dataset, (2) an eCREAM consortium MIP for accredited project members, and (3) an extended federation for external partner organizations wishing to join the research effort. This structure illustrates how governance instruments can flexibly accommodate open science while protecting sensitive data: general users can explore only a safe, public dataset, whereas consortium members work within a secure private federation. Across all levels, governance and accreditation ensure compliance with regulations and uphold the trust of data-providing hospitals.

3.2 Data preparation and harmonization

Before federated analysis can take place, each participating site must prepare and harmonize its dataset according to a common data model. The consortium defines a unified set of data elements, often based on well-established Common Data Elements (CDEs) in the field, to ensure that clinical variables are directly comparable across sites. In the case of FERES, the project is harmonizing data from national stroke registries in Switzerland, Austria, Greece, Italy, and Ireland into a single schema covering key cerebral stroke variables (e.g., demographics, risk factors, treatments, and outcomes). This process involves mapping different registry schemas to the agreed common data elements (Table 1) and standardizing variable coding (for example, aligning category labels for stroke subtypes and outcome scales across registries). A simple example illustrates how heterogeneous source variables are normalized: in the Swiss Stroke Registry, intravenous thrombolysis is captured in a single variable treat_ivt with values {0 = no, 1 = yes, 2 = started before admission}; during FERES harmonization, this is transformed into two CDEs, acute_treat_ivt (overall IVT performed: yes if treat_ivt ∈ {1, 2}, no if 0) and acute_treat_ivt_preadm (IVT started before admission: yes if treat_ivt = 2, otherwise no), so that both treatment presence and timing can be compared consistently even when other registries encode these concepts differently. Mapping source variables to CDEs (each dataset has more than 200 variables) is a largely manual process that requires both data engineering skills and knowledge of the clinical domain. The MIP team provides technical support during this phase, assisting centers with the creation of data dictionaries and transformation scripts so that local data can be converted into the federation's common format. 
All sites, via the mapping process, collaboratively validate the harmonized data model to ensure that it captures each center's data without ambiguity. The schema of the common data model of the datasets is loaded into the MIP. It becomes both part of an online data catalog with metadata information about the federated cohort and an interactive graphical representation (dendrogram or circle packing) that researchers can navigate to better understand the data and choose the variables they will analyze (Figure 2).
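The treat_ivt recoding described in this section can be written as a small transformation function. The Python sketch below follows the rule stated in the text; the function name `harmonize_ivt`, the "yes"/"no" output labels, and the choice to reject unexpected codes rather than map them to "unknown" are assumptions made for illustration, not the FERES transformation scripts.

```python
def harmonize_ivt(treat_ivt: int) -> dict[str, str]:
    """Map the Swiss Stroke Registry treat_ivt code to the two FERES CDEs.

    Source coding: 0 = no, 1 = yes, 2 = started before admission.
    """
    if treat_ivt not in (0, 1, 2):
        # Assumption: unexpected codes are surfaced for manual review.
        raise ValueError(f"unexpected treat_ivt code: {treat_ivt!r}")
    return {
        "acute_treat_ivt": "yes" if treat_ivt in (1, 2) else "no",
        "acute_treat_ivt_preadm": "yes" if treat_ivt == 2 else "no",
    }


# Example: a patient whose thrombolysis started before admission.
example = harmonize_ivt(2)
```

Splitting one three-level source variable into two binary CDEs keeps "was IVT given" and "was IVT started before admission" separately comparable with registries that record them as distinct fields.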

Table 1

| Code | Name | Concept path | Type | Values | Description |
| --- | --- | --- | --- | --- | --- |
| nihss_adm_score | Admission score | Stroke/Clinical scores/Initial Scores/NIHSS/NIHSS Integer scores/nihss_adm_score | Integer | 0-42 | NIHSS score at admission |
| nihss_ENDi24 | Early Neurological Deterioration in 24 hrs | Stroke/Clinical scores/Initial Scores/NIHSS/NIHSS subscores/nihss_ENDi24 | Nominal | {“0”, “no”}, {“1”, “yes”}, {“9”, “unknown”} | Early Neurological Deterioration of ischemic origin within 24 hours, yes (1) if nihss24h - nihss_adm_score >3 |
| gcs_admiss | Admission GCS detailed | Stroke/Clinical scores/Initial Scores/Other scores/GCS/gcs_admiss | Integer | 3-15 | Glasgow Coma Scale (GCS) at admission |
| age | Age | Stroke/Demographics and event/Demographics/age | Integer | 0-120 | What is the age of the patient at the time of stroke? |
| biol_sex | Sex | Stroke/Demographics and event/Demographics/biol_sex | Nominal | {“1”, “male”}, {“2”, “female”}, {“9”, “unknown”} | What is the biological sex of the patient |
| arrival_ambul | Ambulance | Stroke/Hospitalization/Admission and hospital stay/Transport/arrival_ambul | Nominal | {“0”, “no”}, {“1”, “yes”}, {“9”, “unknown”} | Ambulance |
| arrival_helico | Helicopter | Stroke/Hospitalization/Admission and hospital stay/Transport/arrival_helico | Nominal | {“0”, “no”}, {“1”, “yes”}, {“9”, “unknown”} | Helicopter |
| arrival_other | Other Transport | Stroke/Hospitalization/Admission and hospital stay/Transport/arrival_other | Nominal | {“0”, “no”}, {“1”, “yes”}, {“9”, “unknown”} | Other (self-referral, taxi, relatives) |
| disch_dest | Discharge destination | Stroke/Hospitalization/Discharge/Main discharge information/disch_dest | Nominal | {“1”, “home”}, {“2”, “rehabilitation”}, {“3”, “other acute hospital”}, {“4”, “nursing home, palliative care or other medical facility”}, {“5”, “transferred within neurological unit in same hospital”}, {“6”, “transferred in other department of same hospital”}, {“7”, “transferred to SU/SC of other hospital”}, {“8”, “not applicable since patient died during hospitalization”}, {“9”, “unknown”} | Discharge destination |
| inhosp_death | Death in-hospital | Stroke/Hospitalization/Discharge/Main discharge information/inhosp_death | Nominal | {“0”, “no”}, {“1”, “yes”}, {“9”, “unknown”} | Death in-hospital |
| inhosp_death_cause | Cause of Death | Stroke/Hospitalization/Discharge/Main discharge information/inhosp_death_cause | Nominal | {“0”, “not applicable”}, {“1”, “Fatal stroke or intracranial hemorrage”}, {“2”, “other vascular cause”}, {“3”, “non-vascular cause”}, {“9”, “unknown”} | Cause of death for in-hospital death |

Snapshot of the Common Data Elements representing the hospitalization and follow-up care trajectory of a stroke patient. Death in Hospital variable highlighted.

Selected values in bold.

Figure 2

In the eCREAM project, data are collected from the EHRs of 25 emergency departments (EDs) across Italy, the UK, Poland, and Slovenia. Since these EHRs rely on different software systems, the study variables are captured in highly heterogeneous ways across sites. To ensure that the retrieved information can be used and compared, data harmonization is essential. A key step in this process is the development of a mapping tool that links each hospital's data to the study variables defined in a shared data dictionary. This dictionary specifies the agreed common data elements and their standardized coding; for example, category labels for qualitative variables such as mode of arrival, or units of measurement for quantitative variables such as laboratory results.
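To make the idea of a mapping tool concrete, the sketch below links one site's local columns to study variables in a shared data dictionary, recoding category labels and converting units along the way. Everything in it is hypothetical: the dictionary excerpt, the local column names (`arr_code`, `hb_g_dl`), the local labels, and the function names are invented for illustration and do not describe the eCREAM tool. The only factual element is the unit conversion (1 g/dL = 10 g/L).

```python
from typing import Callable

# Hypothetical excerpt of a shared data dictionary.
DATA_DICTIONARY = {
    "arrival_mode": {"kind": "nominal", "values": {"ambulance", "helicopter", "other"}},
    "hemoglobin": {"kind": "numeric", "unit": "g/L"},
}

# Hypothetical local label set used by one site for mode of arrival.
ARRIVAL_LABELS = {"A": "ambulance", "H": "helicopter", "O": "other"}


def to_grams_per_litre(raw: str) -> float:
    """Convert a haemoglobin value recorded in g/dL to g/L."""
    return float(raw) * 10.0


# Local column name -> (study variable, converter for the raw value).
SITE_MAPPING: dict[str, tuple[str, Callable[[str], object]]] = {
    "arr_code": ("arrival_mode", ARRIVAL_LABELS.get),
    "hb_g_dl": ("hemoglobin", to_grams_per_litre),
}


def map_record(local: dict[str, str]) -> dict[str, object]:
    """Translate one local record into the shared study variables."""
    harmonized: dict[str, object] = {}
    for column, raw in local.items():
        if column not in SITE_MAPPING:
            continue  # columns outside the mapping are not exported
        variable, convert = SITE_MAPPING[column]
        value = convert(raw)
        spec = DATA_DICTIONARY[variable]
        if spec["kind"] == "nominal" and value not in spec["values"]:
            raise ValueError(f"{column}={raw!r} has no mapping to {variable}")
        harmonized[variable] = value
    return harmonized


mapped = map_record({"arr_code": "A", "hb_g_dl": "13.5", "free_text": "..."})
```

Keeping the mapping as data (one table per site) rather than as site-specific code is one way to let the same transformation logic serve hospitals whose EHR software differs.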

For eCREAM UC1, the data processing workflow is implemented as a staged ETL pipeline: (1) extraction of predefined study variables from each ED system (including structured fields and NLP-derived variables from free text where applicable), (2) transformation into a standardized tabular dataset with consistent units and coding aligned to the shared data dictionary, (3) harmonization to ensure that variable meanings and value sets are comparable across sites, (4) privacy transformation (local pseudonymization for secure transfer to the central curation environment, followed by irreversible anonymization after curation), and (5) quality checks performed during curation (schema validity/type checks, missingness summaries for key variables, basic plausibility/range checks, and consistency checks for major cross-field dependencies) before inclusion in analysis-ready cohorts and loading into the MIP environment. Step 1 is where eCREAM differs most from FERES. Structured fields from ED EHR forms are extracted directly, while the free-text “notes” sections (e.g., symptoms, anamnesis, suspected/confirmed diagnoses) are processed with the eCREAM NLP language model to convert clinically relevant narrative into additional structured variables that are stored alongside the native fields in the research database before downstream curation and analysis. This is done because key emergency-care information is frequently documented only in free text and manual abstraction is not feasible at scale, so NLP enrichment improves dataset completeness and supports harmonized, cross-site reuse for research. FERES operates on secondary data already present in each member registry database, so the ETL step mainly exports existing registry tables into a tabular CSV for validation/harmonization. Applying NLP to stroke EHR forms could plausibly further improve data richness and reduce missingness in the stroke registries.
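The four kinds of quality check listed in step (5) can be sketched in a few lines of Python. The example below borrows variable names, ranges, and value sets from Table 1 (a FERES table) purely because they are available in this manuscript; the actual eCREAM checks and dictionary are not specified here, and the `SCHEMA` structure and the `quality_report` function are hypothetical.

```python
# Rules taken from Table 1 for illustration only.
SCHEMA = {
    "age": {"type": int, "min": 0, "max": 120},
    "gcs_admiss": {"type": int, "min": 3, "max": 15},
    "inhosp_death": {"type": str, "allowed": {"0", "1", "9"}},
    "disch_dest": {"type": str, "allowed": {str(code) for code in range(1, 10)}},
}


def quality_report(rows: list[dict]) -> dict:
    """Return missingness counts per variable and a list of flagged cells."""
    missing = {name: 0 for name in SCHEMA}
    issues: list[tuple[int, str, str]] = []
    for index, row in enumerate(rows):
        for name, rule in SCHEMA.items():
            value = row.get(name)
            if value is None:
                missing[name] += 1  # missingness summary
                continue
            if not isinstance(value, rule["type"]):
                issues.append((index, name, "type"))  # schema/type check
                continue
            if "min" in rule and not rule["min"] <= value <= rule["max"]:
                issues.append((index, name, "range"))  # plausibility check
            if "allowed" in rule and value not in rule["allowed"]:
                issues.append((index, name, "value set"))
        # Cross-field consistency: discharge code "8" (patient died during
        # hospitalization) should coincide with inhosp_death == "1".
        if row.get("disch_dest") == "8" and row.get("inhosp_death") != "1":
            issues.append((index, "disch_dest", "consistency"))
    return {"missing": missing, "issues": issues}


# Hypothetical records with one deliberate problem of each kind.
sample_rows = [
    {"age": 71, "gcs_admiss": 15, "inhosp_death": "0", "disch_dest": "1"},
    {"age": 130, "gcs_admiss": None, "inhosp_death": "0", "disch_dest": "8"},
    {"age": "64", "gcs_admiss": 14, "inhosp_death": "1", "disch_dest": "8"},
]
report = quality_report(sample_rows)
```

In this sketch the flagged cells are reported rather than corrected, on the assumption that corrections are decided with the contributing site during curation.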

Crucially, data preparation also includes a robust anonymization step to protect patient privacy before any data is made available for analysis. Each data provider performs irreversible anonymization of its dataset in line with applicable international and national privacy frameworks. Direct personal identifiers (such as names, medical record numbers, or contact details) are removed. Quasi-identifiers (such as dates of birth or rare diagnoses) are generalized or binned to make re-identification not reasonably likely. The Swiss Human Research Ordinance (HRO), which applies to Federation participants from Switzerland, defines anonymization as “the irreversible masking or deletion of all items that would enable the data subject to be identified without disproportionate effort” (19). This requirement is grounded in the broader legal framework established by the Swiss Human Research Act (HRA) (20). In compliance, MIP Federation sites drop or irreversibly mask any data points that could indirectly identify individuals (encryption alone would constitute pseudonymization and therefore remain personal data under the GDPR). For example, detailed admission dates are either converted to month/year or, as in the case of FERES, dropped entirely after the necessary deltas have been calculated (e.g., the “door_ivt_delay” variable represents the time from a patient's arrival to intravenous thrombolysis). For FERES, the resulting datasets are treated as anonymized under GDPR and national laws; even so, no personal data leaves the hospital premises. While the GDPR does not provide an explicit legal definition of anonymization comparable to that of the HRO, Recital 26 clarifies that the principles of data protection do not apply to information “which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable” (7). 
This formulation is conceptually aligned with the HRO's emphasis on irreversibility and disproportionate effort, and with the WP29/EDPB view that anonymization should render re-identification practically impossible, taking into account “all the means reasonably likely to be used” given available technology and context (21).
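The date-handling rule described above (compute the needed time deltas, then drop or coarsen the dates) can be sketched as follows. The source column names `arrival_time` and `ivt_time`, the use of minutes as the unit of door_ivt_delay, and the `admission_month` field are assumptions made for this illustration; only the variable name door_ivt_delay comes from the text.

```python
from datetime import datetime

DATE_FIELDS = ("arrival_time", "ivt_time")  # hypothetical source columns


def anonymize_timing(record: dict, keep_month: bool = False) -> dict:
    """Derive the delay, then remove the exact timestamps from the record."""
    arrival = datetime.fromisoformat(record["arrival_time"])
    out = {key: value for key, value in record.items() if key not in DATE_FIELDS}
    if record.get("ivt_time") is None:
        out["door_ivt_delay"] = None  # no thrombolysis recorded
    else:
        ivt = datetime.fromisoformat(record["ivt_time"])
        # Assumption: the delay is stored in whole minutes.
        out["door_ivt_delay"] = int((ivt - arrival).total_seconds() // 60)
    if keep_month:
        # Alternative mentioned in the text: generalize to month/year.
        out["admission_month"] = arrival.strftime("%Y-%m")
    return out


source = {
    "arrival_time": "2024-03-05T10:12:00",
    "ivt_time": "2024-03-05T10:57:00",
    "age": 71,
}
anonymized = anonymize_timing(source)
```

Removing timestamps in this way addresses only one class of quasi-identifier; whether the resulting dataset qualifies as anonymized still depends on the overall re-identification risk assessment.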

In practice, during anonymization, MIP federations (e.g., eCREAM and FERES) follow best-practice guidelines such as those from the Swiss Personalized Health Network (SPHN) and the UK's NHS anonymization standard ISB1523 (22, 23). This ensures a consistent risk-based approach to privacy across all sites. For instance, SPHN guidance combines rule-based de-identification with a re-identification risk assessment, helping projects document how anonymity is achieved and verified (Table 2). Meanwhile, NHS ISB1523 provides a framework to distinguish identifying vs. non-identifying data and recommends techniques (like cell suppression) to ensure published results do not inadvertently disclose patient identities. Upon completion of the preparation phase, each participating hospital retains an anonymized and harmonized dataset, ready for inclusion in federated analyses.
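Cell suppression, mentioned above as one recommended technique, can be illustrated with a minimal Python sketch. The threshold of 10 is an arbitrary placeholder, since the threshold applied in a given federation is not stated here, and the function name `suppress_small_cells` and the example counts are hypothetical.

```python
from typing import Optional


def suppress_small_cells(
    counts: dict[str, int], threshold: int = 10
) -> dict[str, Optional[int]]:
    """Replace every count below `threshold` with None (primary suppression)."""
    return {cell: (n if n >= threshold else None) for cell, n in counts.items()}


# Hypothetical aggregated result: discharge destination counts.
raw_counts = {"home": 412, "rehabilitation": 97, "nursing home": 4}
released = suppress_small_cells(raw_counts)
```

This sketch performs primary suppression only. If row or column totals are released alongside the table, a hidden cell can sometimes be recovered by subtraction, so a production implementation would also need complementary suppression or rounding.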

Table 2

| Var group | Identifying/quasi-identifying variables | De-identification rule # | De-identification rule description | Selected rule |
|---|---|---|---|---|
| D-01 | Direct identifiers (e.g., name, phone number, social security number, email address, medical record number, license number) | D-01-01 | Identifiers are not used in the project | x |
| | | D-01-02 | Identifiers are replaced by plausible surrogates (e.g., in text reports) | |
| | | D-01-03 | Original values of one or more direct identifiers are kept (*if this rule is selected the data set is not considered de-identified) | |
| D-02 | Patient identifier | D-02-01 | Identifiers are not used in the project; they are dropped at anonymization stage | x |
| | | D-02-02 | Identifiers are replaced by pseudonyms (project specific patient identifier) and the mapping table is securely stored by the data provider | |
| | | D-02-03 | Original values are kept (hospital internal patient identifier) (*if this rule is selected the data set is not considered de-identified) | |
| D-03 | Sample identifier | D-03-01 | Sample identifiers are not used in the project | x |
| | | D-03-02 | Sample identifiers are replaced by pseudonyms (project specific sample identifier) and the mapping table is securely stored by the provider and not accessible by the research team | |
| | | D-03-03 | Original values are kept (hospital internal sample identifier) | |
| D-04 | Administrative case identifier | D-04-01 | Administrative case identifier is not used in the project | x |
| | | D-04-02 | Identifiers are replaced by pseudonym (project specific identifier) and the mapping table is securely stored by the data provider and not accessible by the research team | |
| | | D-04-03 | Original values are kept (hospital internal case identifier) | |
| D-05 | Lab report identifier | D-05-01 | Lab report identifier and lab order identifier are not used in the project | x |
| | | D-05-02 | Identifiers are replaced by pseudonym (project specific identifier) | |
| | | D-05-03 | Original values are kept (hospital internal sample identifier) | |
| D-06 | Dates in the patient record (dates of birth and death excluded) | D-06-01 | Dates are suppressed at anonymization. Variables for time metrics that are essential for the project are calculated before the dates are dropped | x |
| | | D-06-02 | Dates are shifted by a random number of days within +/– 365 days | |
| | | D-06-03 | Dates are shifted by a random number of days within +/– 90 days (one quarter offset to preserve seasonality) (default) | |
| | | D-06-04 | Dates are shifted by a random number of days within +/– 30 days (one month offset to preserve seasonality) | |
| | | D-06-05 | Dates are shifted by a random number of days within +/– 7 days (one week offset) | |
| | | D-06-06 | Original dates are kept | |
| D-07 | Date of birth | D-07-01 | Date of birth concept is suppressed or not used in the project | x |
| | | D-07-02 | Only the year of the original birth date is kept | |
| | | D-07-03 | Only the year and month of the original date of birth are kept | |
| | | D-07-04 | Full original date of birth is kept (dd/mm/yyyy) | |
| | | D-07-05 | Date of birth is shifted by the same rule selected in D-06 | |
| D-08 | Date of death | D-08-01 | Date of death concept is suppressed or not used | x |
| | | D-08-02 | Only the year of the original date of death is kept | |
| | | D-08-03 | Only the year and month of the original date of death are kept | |
| | | D-08-04 | Date of death is shifted by the same rule selected in D-06 (default) | |
| | | D-08-05 | Full original date of death is kept (dd/mm/yyyy) | |
| D-09 | Age | D-09-01 | The age concept is not used in the project | |
| | | D-09-02 | Age is generalized in groups of 5 or more years | |
| | | D-09-03 | Age at arrival is kept (in years), with brackets for 90+ and <18 to avoid outliers | x |
| | | D-09-04 | Original age is kept | |
| D-10 | Profession | D-10-01 | Profession is not used in the project | x |
| | | D-10-02 | Original profession is kept, but replaced by a random profession for identifying ones | |
| D-11 | Location (street, zip code, city, region, country) | D-11-01 | Location is not used in the project | x |
| | | D-11-02 | Location is generalized to the country level | |
| | | D-11-03 | Location is generalized to the region level | |
| | | D-11-04 | Location is generalized to the city level. If cities have fewer than 20,000 inhabitants, cities are replaced by the corresponding region | |
| | | D-11-05 | Location is generalized to the zip level. If zip codes refer to areas with fewer than 20,000 inhabitants, the last 2 numbers of the zip codes are suppressed | |
| | | D-11-06 | The original locations are kept | |
| D-12 | Organizations (data provider organization excluded) | D-12-01 | Organization name is not used in the project | x |
| | | D-12-02 | Organization type is kept (e.g., hospital, clinic, etc.) | |
| | | D-12-03 | Organization name is kept (e.g., University Hospital Basel) | |
| D-13 | Organizational units (data provider organizational unit excluded) | D-13-01 | Organizational unit is not used in the project | x |
| | | D-13-02 | Organizational unit is generalized to the division level (e.g., Neurology, Radiology, Urology, etc.) | |
| | | D-13-03 | Organizational unit is kept (e.g., 328 Kardiologie ME) | |

De-identification rules for anonymizing eCREAM datasets before loading them to the MIP.

Selected rules are marked with an x.
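As an illustration of how rules D-06-01 and D-09-03 from Table 2 might be applied at a site, the following is a minimal sketch and not the projects' actual curation code. Only the `door_ivt_delay` variable name comes from the text; the input field names, the use of minutes as the unit, and the whitelist-style output are assumptions made for the example.

```python
from datetime import datetime


def deidentify_encounter(record: dict) -> dict:
    """Sketch of rules D-06-01 and D-09-03 (field names are hypothetical).

    D-06-01: compute the time metrics the project needs, then drop the dates.
    D-09-03: keep age in years, but bracket the extremes (90+ and <18).
    """
    arrival = datetime.fromisoformat(record["arrival_time"])
    ivt = datetime.fromisoformat(record["ivt_time"])
    # Derive the delta first; minutes as the unit is an assumption.
    door_ivt_delay = int((ivt - arrival).total_seconds() // 60)

    age = record["age"]
    if age >= 90:
        age_out = "90+"
    elif age < 18:
        age_out = "<18"
    else:
        age_out = str(age)

    # Only whitelisted fields are emitted, so timestamps and any direct
    # identifiers present in the input never reach the output row.
    return {"door_ivt_delay": door_ivt_delay, "age": age_out, "sex": record["sex"]}
```

Building the output from an explicit whitelist, rather than deleting known identifier columns, means that an unexpected column in the source extract is dropped by default.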

Although each MIP node stores harmonized, irreversibly anonymized row-level records (one row per patient/encounter) containing no direct identifiers and only a small set of quasi-identifiers (e.g., age and sex), the platform is engineered to disclose aggregate outputs only. To reduce disclosure risk from rare filter combinations, MIP applies built-in small-cohort suppression with a default minimum cell threshold of 10: site-level results based on fewer than 10 individuals are withheld, and the analysis workflows are structured around summaries that make it difficult to reconstruct multi-variable, patient-level profiles even when cohort filters are used. In addition, for some federated datasets (at the local data owner's discretion), k-anonymity/generalization is applied during curation of quasi-identifiers (e.g., age banding and handling of sex) before data are loaded into the MIP. Finally, the FERES and eCREAM deployments operate as private federations (no public access), with tight governance, accredited access, and monitoring, which also informs the selection of relatively low k values when k-anonymity is used at curation. Overall, re-identification risk is mitigated across multiple layers (input anonymization, aggregate-only analytics, output suppression, and controlled access) and is considered extremely low.
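The small-cohort rule described above can be sketched as follows. This is an illustrative reading of the stated policy (withhold site-level results based on fewer than 10 individuals), not MIP's internal code; the function name, the row format, and the use of `None` to represent a withheld cell are assumptions.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Sequence, Tuple

MIN_CELL_SIZE = 10  # default threshold stated in the text


def suppressed_counts(
    rows: Iterable[dict],
    group_by: Sequence[str],
    min_cell: int = MIN_CELL_SIZE,
) -> Dict[Tuple, Optional[int]]:
    """Count rows per combination of `group_by` values at one site.

    Cells with fewer than `min_cell` individuals are withheld (None)
    instead of being returned to the central node.
    """
    counts = Counter(tuple(row[v] for v in group_by) for row in rows)
    return {cell: (n if n >= min_cell else None) for cell, n in counts.items()}
```

For example, a site with 12 female and 3 male patients in a filtered cohort would return `{("F",): 12, ("M",): None}` for `group_by=["sex"]`. This covers primary suppression only; protection against differencing across repeated queries relies on the other layers described in the text.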

3.3 Node deployment and secure network configuration

MIP is deployed as a centrally mediated network with a coordination layer and hospital-run site nodes. As previously described in the Containerization and Orchestration section, the platform is organized as containerized services orchestrated by Kubernetes. This Kubernetes-based design was selected not only for security, but also for operational portability: the same deployment model can run on managed central infrastructure (e.g., CSCS) or on smaller on-premise hospital servers, while preserving reproducibility, isolation, and simpler onboarding of new sites through automated, version-controlled deployment workflows.

Connectivity is outbound-only from each site to a single, well-known central endpoint over an end-to-end encrypted VPN (or secure multi-cluster link), so it fits hospital firewalls and avoids opening inbound ports. Only the minimal ports/services needed for MIP are allowed; network policies block any site-to-site paths, so local nodes exchange messages only with the central hub. Queries and results traverse the encrypted tunnel, and even operational traffic (e.g., cluster monitoring and health checks) remains on private channels. In FERES, for example, the central node at CSCS maintains separate VPN tunnels to each national registry node; even when multiple nodes are co-hosted, they never communicate directly—each uses its own dedicated link to the hub. Only authenticated federation nodes can join, and storage volumes holding anonymized datasets can be encrypted at rest. This design protects against eavesdropping and ensures that sensitive data (even in aggregate) is never exposed on open networks.

Researchers access the system through the central web interface using single-sign-on. They never log into hospital servers or databases; all compute jobs are brokered by the central hub and executed locally under the control of the nodes. Certificates, permissions, and audit logs are maintained so that every analysis can be traced to its origin.

Deployment is iterative. Projects typically begin with one or two pilot nodes to validate connectivity and governance, then onboard additional sites as local IT teams sign off. This staged rollout can accelerate participation, while keeping operational risk low.

3.4 Federated analysis execution

Accredited investigators sign into the central portal, choose the relevant federation, and select variables and analysis modules through the graphical interface. The experience is deliberately simple: clinicians interact with familiar tables and controls rather than code.

When an analysis is launched, the system sends the computation to each participating site: “the code visits the data” (24). Each local node runs the job on its anonymized dataset using the Exareme2 analytics engine and returns only partial aggregates or model updates. Exareme2 was chosen because it is the MIP's native federated analytics engine and is developed within the same extended technical ecosystem as the platform itself, which enables close alignment between platform requirements and analytics implementation. This makes it practical to adapt federated execution flows, privacy controls, and aggregation strategies to the needs of specific statistical or machine-learning methods while preserving the core principle that patient-level data remain local. The central server then combines the returned partial results into a single result; for iterative methods such as federated learning, this send-update-aggregate cycle repeats until convergence.
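To make the local/central split concrete, the sketch below computes a pooled mean and standard deviation from per-site sufficient statistics. It is a generic illustration of the pattern and does not use or reproduce the Exareme2 API; `PartialAggregate`, `local_step`, and `central_step` are hypothetical names.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class PartialAggregate:
    n: int           # number of rows at the site
    total: float     # sum of values
    total_sq: float  # sum of squared values


def local_step(values: Sequence[float]) -> PartialAggregate:
    """Runs at a site node: reduce patient-level values to three sums."""
    return PartialAggregate(len(values), sum(values), sum(v * v for v in values))


def central_step(partials: List[PartialAggregate]) -> Dict[str, float]:
    """Runs at the central node: combine the sums; no row-level data needed."""
    n = sum(p.n for p in partials)
    total = sum(p.total for p in partials)
    total_sq = sum(p.total_sq for p in partials)
    mean = total / n
    variance = (total_sq - n * mean * mean) / (n - 1)  # sample variance
    return {"n": n, "mean": mean, "std": math.sqrt(max(variance, 0.0))}
```

Calling `central_step([local_step(site_a), local_step(site_b)])` gives the same mean and standard deviation as computing them on the pooled values, although only three numbers per site cross the network. The sum-of-squares formula is used for brevity; it can lose precision on large, tightly clustered values.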

Results appear in the interface as summary tables and visualizations. Where a subgroup is too small to report safely, the platform applies disclosure control, so that no individual can be inferred from an output figure.

If allowed by the consortium's rules, aggregated output (i.e., graphs, results tables) can be exported for further reporting after routine review. Throughout, patient-level records remain at their home institutions. Analyses are saved with all specific parameters selected (e.g., datasets and variables selected, filters applied) and can be retrieved and edited at any time. When relevant, a specific analysis can be shared with other users of the federation.

3.5 Security, ethics, privacy safeguards

Security and ethical compliance are cornerstones of the MIP federated approach. As described in the Data Preparation and Harmonization section, all datasets are irreversibly anonymized at the source in alignment with GDPR and relevant national laws, so no personal identifiers or traceable codes enter the MIP nodes. This foundation meets or exceeds the standards set by Swiss law (HRO/HRA) and EU regulations and allows ethics committees to approve secondary use of patient data. In some cases, such as eCREAM, it can also support waivers of individual consent when safeguards are in place (e.g., GDPR Art. 9(2)(j), Art. 5(1)(b), and Recital 50) (7). In practice, in cases like eCREAM, obtaining consent from hundreds of thousands of past ED patients would be impracticable and could introduce bias; therefore, ethics committees granted waivers based on the strong public interest in improving emergency care.

Accordingly, each country's ethics committee either waives consent based on anonymization and public interest or, where national law is more stringent, requires specific patient consent, which the project has to accommodate. As a third possibility, as with the data provided by the Swiss Stroke Registry (SSR), countries or registries can allow anonymized data to be used for research purposes unless a patient expresses clear opposition (“opt-out”). Because patients have the right to withdraw their consent at any time, and because the FERES datasets are fully anonymized with no possibility of re-identifying individual patient records, each update must replace the whole dataset rather than individual records. This careful navigation of ethics requirements ensures that the federated analyses are conducted lawfully and with respect for patient rights across all sites.

On the technical side, the security architecture of the MIP prevents unauthorized data access and ensures data integrity. The use of a VPN and central proxy means that even if a researcher's account were compromised, an attacker could not directly penetrate a hospital's node—they would encounter the central server which is secured and monitored. All user actions within MIP require valid authentication, and session management is strict. Moreover, every query and data access is logged, and these logs can be audited to detect any unusual or disallowed activity. The platform's design inherently limits the scope of data that a user can see: researchers never access row-level data, only aggregated results meeting the privacy threshold. In effect, the MIP behaves as a “safe analysis environment,” analogous to a trusted research environment (TRE), where data never leaves the secure servers and only safe outputs are released. This characterization is consistent with the TRE model as defined by the UK Health Data Research Alliance and with the Five-Safes-oriented controls described for DataSHIELD, a peer federated analytics platform (25, 26).

To further bolster privacy, the MIP enforces built-in disclosure controls (e.g., a minimum cell threshold) so that outputs based on very small groups are not returned; the anonymization workflow and suppression policy are detailed in Data Preparation and Harmonization. Optionally, differential privacy (DP) can be enabled to add calibrated noise when required by a data provider or analysis plan. DP is mainly valuable for settings with heavy, repeated interactive querying, especially when users can perform many high-dimensional stratifications or probe rare subgroups, because privacy risks can accumulate through differencing and composition across multiple outputs. In MIP, which already restricts disclosure to aggregate results, applies small-cohort suppression (default threshold 10), and runs within tightly governed, access-controlled private federations, DP is often unnecessary: it would add implementation complexity and substantial compute overhead, and it can reduce analytical precision without a commensurate reduction in the already very low re-identification risk. Combined with irreversibly anonymized inputs and strict aggregation, MIP's controls ensure no individual can be inferred from released results.
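For readers unfamiliar with what "calibrated noise" means in this context, the following is a minimal sketch of the standard Laplace mechanism applied to a count. It is a textbook illustration, not MIP's optional DP implementation; the function names and the choice of a counting query are assumptions.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw Laplace(0, scale) noise.

    The difference of two independent exponential draws with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Release a count with epsilon-differential privacy.

    Adding or removing one individual changes a count by at most 1
    (sensitivity 1), so the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(1.0 / epsilon, rng)
```

The trade-off discussed above is visible in the scale term: at epsilon = 0.1 the typical noise magnitude is about 10, which is the same order as the default suppression threshold. Under basic sequential composition, answering k such queries at epsilon each consumes a total privacy budget of k times epsilon, which is why DP matters most under heavy repeated querying.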

Ethically, the platform's operation is guided by principles of transparency and accountability. All participating centers sign agreements that clearly stipulate data usage terms, publication rights, and obligations to report any data breaches or incidents. The EBRAINS ecosystem's policies complement those by requiring all users to agree to overarching ethical standards for data use. In line with the FAIR principles (Findable, Accessible, Interoperable, Reusable), the federation ensures that data is used to advance knowledge while respecting the individuals behind the data. When results of federated analyses are shared with the broader community (e.g., via publications or public dashboards), they are released only in aggregated, non-sensitive form, effectively turning raw clinical data into societal benefits without compromising privacy.

In summary, MIP-based federations such as eCREAM and FERES implement a multi-faceted approach to security and ethics: strong governance agreements, rigorous data anonymization, state-of-the-art IT security, automatic privacy-preserving features in analysis, and compliance with legal frameworks across all involved regions.

4 Implementation results

This section reports implementation results and current analytical capabilities of the federations, rather than clinical outcome findings. Operationally, the MIP infrastructure and deployment workflow are in place in both FERES and eCREAM, although the extent of completed cross-site analyses still depends on continued site onboarding, data curation, and investigator use. In FERES, the MIP network links national stroke registries from five European countries. For example, the SSR, a FERES member, has contributed anonymized and harmonized data for 181 clinical variables across 147,000 stroke patient encounters from 2013 through 2024 (27, 28). Contributions from additional registries (Ireland, Italy, Greece, Austria) continue to expand the size and diversity of the federated resource. In eCREAM, a federation is being established across approximately 25 emergency departments in four countries (Italy, Slovenia, Poland and the UK), with data focused on transient loss of consciousness and dyspnea over the 2021–2023 retrospective period. Taken together, these deployments indicate that the federated infrastructure is operational and that analysis-ready resources are being assembled across datasets that would otherwise remain institutionally or nationally isolated.

The current analytical capabilities of the federations range from descriptive summaries to federated statistical and machine-learning analyses. These capabilities make it possible to characterize participating datasets, compare key indicators across countries or hospital networks, and apply methods such as t-tests, ANOVA, correlation matrices, linear and logistic regression, Naïve Bayes classifiers, clustering algorithms, and principal component analysis. Based on the infrastructure now in place, these methods are expected to support a range of cross-site analyses within FERES and eCREAM. In eCREAM, this includes descriptive comparisons of patient populations, care pathways, and outcomes, as well as multivariable prediction models estimating the probability of hospitalization for patients presenting with dyspnea or transient loss of consciousness and adjusted comparisons of hospitalization rates across emergency departments (14). In FERES, regression models such as logistic regression for functional outcomes or mortality, together with machine-learning approaches, can be applied across the federated stroke registries to investigate prognostic factors and trends that may be less apparent in smaller single-country cohorts (Figure 3). More broadly, the federated setting can enable analyses that would be difficult or less informative if datasets remained isolated, particularly between-site comparisons, evaluation of practice variation, study of less frequent subgroups, and model development across more diverse populations. In that sense, the current implementation establishes the operational basis for collaborative analytics and the range of analyses that these federations are now positioned to undertake as onboarding and research use progress.
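Among the methods listed above, logistic regression is a convenient example of an iterative federated computation: each round, sites return only summed gradients, and the coordinator updates the shared coefficients. The sketch below uses plain gradient descent for clarity. It is not the algorithm or API used by the MIP; the function names, learning rate, and fixed number of rounds are assumptions.

```python
import math
from typing import List, Sequence, Tuple

Site = Tuple[List[List[float]], List[int]]  # (feature rows, 0/1 outcomes)


def sigmoid(z: float) -> float:
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def local_gradient(X: Sequence[Sequence[float]], y: Sequence[int],
                   w: Sequence[float]) -> Tuple[List[float], int]:
    """Site node: summed log-loss gradient over local rows, plus row count."""
    grad = [0.0] * len(w)
    for xi, yi in zip(X, y):
        err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) - yi
        for j, xj in enumerate(xi):
            grad[j] += err * xj
    return grad, len(y)


def federated_logreg(sites: Sequence[Site], n_features: int,
                     lr: float = 0.5, rounds: int = 500) -> List[float]:
    """Central node: repeat the send-update-aggregate cycle for `rounds`."""
    w = [0.0] * n_features
    for _ in range(rounds):
        results = [local_gradient(X, y, w) for X, y in sites]
        n_total = sum(n for _, n in results)
        for j in range(n_features):
            w[j] -= lr * sum(g[j] for g, _ in results) / n_total
    return w
```

Because the pooled gradient is the sum of the site gradients, this procedure yields the same coefficients as fitting on the combined rows, up to floating-point rounding, without any row leaving its site. A production engine would typically add a convergence check and faster solvers.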

Figure 3

By enabling such analyses on large, federated datasets, the platform can support the identification of population-level patterns, demographic trends, and differences in outcomes across subgroups that may be difficult to detect in isolated datasets. It can also support more systematic comparisons of care processes and outcomes across institutions and regions. Such analyses may help identify variation in practice, inform quality-improvement efforts, and contribute to future policy or guideline discussions. The federated approach may also broaden the representativeness of the evidence base by enabling participation from smaller or less-resourced centers whose data would otherwise remain absent from multi-site analyses.

5 Discussion

5.1 Federated vs. centralized data sharing models in clinical research

Federated data analysis, as implemented by the Medical Informatics Platform (MIP), offers a fundamentally different approach to multi-centric clinical research compared to traditional centralized data sharing. In the federated model, each institution retains custody of its data behind its firewall, and analyses are performed locally with only aggregate results shared across nodes. This privacy-preserving design means patient-level data never leaves the original storage or country of origin, which directly addresses the data protection barriers that often hinder centralized registries. In contrast, centralized models require pooling patient data into a single repository, raising well-documented concerns about cross-border data transfer, patient confidentiality, regulatory compliance, and loss of local control. Many valuable clinical datasets have historically been underused or siloed precisely because sharing health data across institutions or borders is fraught with legal and ethical challenges (4). The federated approach mitigates these issues by analyzing data in situ. This architecture is especially advantageous for emergency medicine and acute care, where large and diverse datasets are essential, because it enables cross-center analyses without compromising privacy. By bringing the analysis to the data, rather than the data to the analysis, federated networks facilitate multi-center studies in acute care settings that would be logistically or ethically infeasible under a centralized paradigm (24). Other federated platforms, such as DataSHIELD, likewise implement built-in disclosure control mechanisms (e.g., query restrictions and output checks) to mitigate re-identification risk in multi-site analyses, highlighting the shared privacy–utility trade-offs across federated approaches (25).

5.2 Lessons learned from FERES and eCREAM federations

Implementing a federated data network across multiple institutions and countries requires not only technical infrastructure but also significant groundwork in governance, legal agreements, and data standardization. The experiences of the Federating European REgistries for Stroke (FERES) project and the enabling Clinical Research in Emergency and Acute care Medicine (eCREAM) project highlight several critical challenges and lessons:

  • Lengthy legal and governance processes: establishing a federation required extensive legal work. FERES, which is not covered by a single grant agreement, found that negotiating and signing DSAs, DTAs and governance charters with each national registry took much longer than anticipated and involved many stakeholders. Federations should budget sufficient time for these negotiations and seek flexible template agreements and early engagement with legal teams. Establishing a clear governance framework is also essential. Owing to the complex and potentially different decision structure of each national registry, FERES created a governance structure, including a Federated Executive Board and Scientific Board, to oversee strategic, legal, and methodological decisions. While such governance bodies are not legally binding by themselves (the FERES User Charter serves as a “best practice” reference rather than a contract), they foster trust and shared understanding. Regular communication and consensus-building in these boards were critical to keeping the geographically dispersed members aligned. One lesson is that successful federation often involves navigating a complex mix of institutional policies and legal requirements. It greatly helps to involve data protection officers and legal teams at each site early, use flexible agreement templates, and where applicable draw on standardized European guidance such as the European Health Data Space (EHDS; Regulation (EU) 2025/327). The EHDS entered into force in March 2025 and is intended to progressively enable cross-border health data sharing and the secondary use of health data in the EU (29).

  • Data harmonization and integration challenges: a second major hurdle was the harmonization of heterogeneous datasets across sites. In FERES, the participating stroke registries each had their own data schemas, variable definitions, and coding standards, reflecting local clinical practices and legacy database designs. Before any federated analysis could occur, these differences had to be reconciled through a meticulous data curation process. The FERES team developed a Common Data Elements (CDE) schema for stroke through an iterative process aimed at capturing the key variables across registries. Because the registries themselves evolve over time to meet the changing needs of clinicians, researchers, and policymakers, this process requires continual refinement. The resulting common data model must remain both flexible and comprehensive, accommodating each country's dataset while allowing consistent federated analysis across sites. This need to adapt over time adds further work to an already labor-intensive harmonization process. To further support harmonization and analysis development without exposing real patient records, the FERES MIP creates and makes available platform-hosted synthetic stroke datasets for training, demonstration, and prototyping; this mirrors practices in other federated frameworks (e.g., DataSHIELD), where synthetic data are used to develop and validate analysis code while minimizing disclosure risk (25).

  • Mapping and transforming variables: after anonymization, each site's dataset must be mapped to the common CDE definitions, a complex and often painstaking task. In FERES, data engineers worked closely with stroke clinical experts to interpret each local variable and identify its corresponding CDE, accounting for nuances in naming conventions, measurement units, and coding schemes. Units and categorical values were standardized to ensure comparability across sites. In retrospective and multi-national settings, this standardization can be particularly demanding given the diversity of recording systems, languages, and measurement scales. Because the FERES project encompasses large patient cohorts and hundreds of variables (the current schema includes 946 stroke CDEs), much of this conversion had to be automated through scripts that were rigorously validated. A key lesson from this experience is that data harmonization often represents the most critical step in enabling federated analytics. It requires domain expertise, iterative testing, dedicated staff, and at times, compromises on which variables can be reliably compared across all sites. Projects like eCREAM face similar and additional challenges, as emergency department EHRs are notoriously inconsistent and contain both structured and free-text fields. The eCREAM consortium developed natural language processing methods to extract usable structured data from clinical notes, effectively creating new datasets that could be shared or federated. This highlights a broader lesson for future federations: adopting common data standards early on, for instance through established ontologies or data dictionaries, can greatly reduce the burden of post-hoc harmonization. In practice, however, retrospective data for secondary use must often be retrofitted to shared standards, a process that demands substantial effort and can significantly delay the realization of a fully operational federation.

  • Operational and technical hurdles: setting up the technical infrastructure for a federated network was another learning area. The MIP itself relies on containerized services orchestrated via Kubernetes, deployed either on local servers or cloud infrastructure at each site. In practice, not all participating centers had the necessary IT resources on-premises. The coordinating center (CHUV) therefore offered to host some partners' MIP nodes at the Swiss national supercomputing center (CSCS) as isolated virtual machines. This helped less-resourced sites join the federation but introduced the need for DTAs (since data from those sites had to be transferred and stored in Switzerland under CHUV's management). Setting up multiple nodes required creating a secure VPN to link each one to the main server. Working with local IT teams was key for configuring the VPN and managing firewall permissions. The takeaway is that a federated platform's deployment within a healthcare setting can be technically complex, especially when scaling to many nodes. Adequate support and documentation (as provided by the MIP core team) are vital. Nonetheless, once deployed, the MIP has, in our experience, shown stable technical performance and supports a range of statistical and machine-learning algorithms in federated mode. The final operational lesson was the importance of user training and iterative feedback. Both federations developed user charters and need to hold training for investigators on how to use the web-based interface and interpret federated analysis results. Users have to become comfortable with the idea that they cannot see individual patient data, only aggregate patterns.
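The mapping step described in the "Mapping and transforming variables" item above can be pictured as a per-site table of renames and converters. The sketch below is purely illustrative: the local column names, the CDE names, and the category codes are hypothetical and are not taken from the FERES schema. The only domain fact used is the standard glucose conversion (1 mmol/L corresponds to about 18.016 mg/dL).

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical site mapping: local column -> (CDE name, value converter).
SITE_MAPPING: Dict[str, Tuple[str, Callable[[Any], Any]]] = {
    # Unit conversion: mg/dL -> mmol/L for glucose.
    "glucose_mg_dl": ("glucose_mmol_l", lambda v: round(float(v) / 18.016, 2)),
    # Categorical recoding from local codes to shared codes.
    "geschlecht": ("sex", lambda v: {"m": "M", "w": "F"}[str(v).lower()]),
    # Type normalization for a score stored as text.
    "nihss_aufnahme": ("nihss_admission", int),
}


def to_cde(row: Dict[str, Any],
           mapping: Dict[str, Tuple[str, Callable[[Any], Any]]] = SITE_MAPPING,
           ) -> Tuple[Dict[str, Any], List[str]]:
    """Map one local row to CDE names; report columns with no mapping."""
    out: Dict[str, Any] = {}
    unmapped: List[str] = []
    for col, value in row.items():
        if col not in mapping:
            unmapped.append(col)
            continue
        cde_name, convert = mapping[col]
        out[cde_name] = None if value is None else convert(value)
    return out, sorted(unmapped)
```

Returning the list of unmapped columns, instead of silently dropping them, gives clinical experts a concrete worklist when a registry adds or renames variables.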

In summary, FERES and eCREAM demonstrate that while the federated model can surmount many privacy and logistical barriers, it introduces its own challenges in legal negotiation and data preparation. Careful planning, robust governance, and dedicated investment in data curation are essential to fully realize the benefits of federated networks. These insights provide valuable guidance for future projects seeking to connect data providers across institutional and national boundaries within such frameworks.

5.3 Implications for health data governance, clinical research, and emergency medicine

The success to date of the MIP-based federations holds several broader implications for health data governance and clinical research. First, these projects indicate that advanced clinical analyses can be organized across national boundaries in a privacy-preserving manner. This supports a shift in health data governance from strict data isolation toward more controlled forms of cross-site analysis. Historically, concerns over patient privacy and data ownership have made institutions reluctant to share data, especially across borders. The federated approach offers a governance framework where those concerns are addressed not by denying access to data altogether, but by controlling how data is accessed and where it resides. Each data provider remains the ultimate gatekeeper of their information, and usage is governed by consortium agreements and oversight committees. Such a model can support collaborative research while respecting the sovereignty of local data custodians. It exemplifies what a modern data governance strategy can look like: one that enables compliance and collaboration simultaneously (17).

For clinical research, federated networks can support studies that are larger and more representative than isolated single-site analyses. In fields like emergency medicine, where practice patterns may differ widely and outcomes can be influenced by local context, having multi-center data is crucial. The eCREAM project's focus on ED hospitalization rates across Europe is a case in point. By federating or aggregating EHR data from dozens of hospitals, researchers can discern systemic issues (like variability in admission thresholds) that no single-hospital study could detect.

Europe has invested extensively in establishing medical best-practice guidelines through initiatives such as the European Commission's Public Health Best Practice portal (30) and the European Medicines Agency (EMA)'s scientific guidelines (31), which align EU member states on quality, safety, and efficacy standards for medicines and support health promotion and disease prevention. These efforts aim to improve patient care, reduce variation in practice, enhance healthcare quality, and facilitate the translation of research into clinical practice at national, regional, and local levels. Federated analysis of EHR data across multiple countries could provide a useful evidence base for expert decision-making. Moreover, the ability to conduct such analyses without compromising patient privacy encourages more institutions to participate, including those in regions with strict data protection climates. This inclusive engagement is particularly important in emergency medicine research, which often suffers from data gaps and silos. Through federated data governance, even hospitals that cannot export data can still contribute to global knowledge by “bringing the algorithm to the data” on their premises.

Another potential benefit is that this approach may help accelerate clinical research during public health emergencies. While neither FERES nor eCREAM was conducted in an acute epidemic scenario, the model they use could be highly relevant in future pandemics or multinational public health studies. A federated network could, for example, enable rapid analysis of emerging infections or treatment outcomes across countries without waiting to centralize all data, a step that in a fast-moving outbreak might be too slow or raise public mistrust. The federated queries could run behind each hospital's firewall, giving authorities a near-real-time view of aggregate trends (e.g., ICU occupancy, mortality rates, treatment efficacy) with patient privacy intact. Establishing the legal and technical infrastructure ahead of time, as initiatives like the MIP are doing, means a network is in place that can be activated for such urgent analyses. From a health-system perspective, this can be viewed as a form of “data preparedness” for collaborative analytics when time is critical.
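The pattern of running a query behind each firewall and releasing only aggregates can be sketched as follows. This is a minimal illustration under stated assumptions, not the MIP's actual implementation: the function names, the record layout, and the small-cell threshold `MIN_CELL_COUNT` are all hypothetical.

```python
from collections import Counter
from typing import Optional

# Assumed disclosure threshold; the article does not specify one.
MIN_CELL_COUNT = 10


def local_count(records: list[dict], variable: str) -> dict[str, Optional[int]]:
    """Runs at a node: count records per category of `variable`.

    Cells below MIN_CELL_COUNT are released as None (suppressed)
    so that small groups are not disclosed.
    """
    counts = Counter(r[variable] for r in records if variable in r)
    return {
        category: (n if n >= MIN_CELL_COUNT else None)
        for category, n in counts.items()
    }


def merge_counts(node_results: list[dict[str, Optional[int]]]) -> dict[str, int]:
    """Runs at the coordinator: sum the counts that nodes released.

    Suppressed cells contribute nothing, so merged totals are lower
    bounds whenever any node suppressed a category.
    """
    merged: Counter = Counter()
    for result in node_results:
        for category, n in result.items():
            if n is not None:
                merged[category] += n
    return dict(merged)
```

Only the dictionaries returned by `local_count` would cross the site boundary in this sketch; the patient-level `records` never leave the node.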

From a governance perspective, these projects highlight the need to standardize agreements and ethical frameworks for data sharing, in line with OECD guidance that identifies common data-sharing agreements, clear legal bases for secondary use, transparent access controls, and interoperable safeguards as foundational elements of health-data governance across jurisdictions (32). In our federations we had to draft our own charters, agreements, and operating procedures because the approach was new. As federated networks become more common, there may be a shift toward template agreements and more standardized cross-border legal frameworks. The FERES and eCREAM experience suggests that stakeholders are willing to collaborate when given clear assurances. If a common European framework for the secondary use of health data, complete with widely accepted data-sharing and data-use agreement templates, were already in place, many legal and administrative hurdles would be reduced. The European Health Data Space is expected to build such a framework in the coming years, so future consortia may not need 12 to 18 months of negotiations before they can begin (29). These federations also underline the importance of transparency and patient engagement in governance. In these projects, patients' individual data did not leave their country, yet patients are the ultimate beneficiaries of the research. Robust ethical oversight is essential; each eCREAM center obtained local ethics approval or waivers for retrospective use, and results are published for the public good. By demonstrating practical value, such as cross-site identification of best practices and gaps in care, these networks may help strengthen public trust in data-driven research.

For emergency medicine specifically, federated analysis may support more consistent multi-center evidence generation. Emergency care has traditionally faced challenges in multi-center research due to inconsistency in data collection and the hectic nature of ED workflows. The initial lessons from eCREAM suggest that modern tools, including NLP for data extraction and platforms such as the MIP for secure analysis, can help address some of these hurdles. One of the tasks of eCREAM is to develop an ED EHR that already writes data to the program's common requirements, so that standardization happens at the point of capture rather than after the fact. If this practice of having a minimum common base at the EHR level is adopted widely, harmonization across hospitals could become a predictable, light-touch step rather than a demanding, post-hoc exercise. Local flexibility would remain, with sites tailoring templates and adding fields, yet every extension would reference the core schema, so it stays interpretable across sites. Over time, pooling data for research, quality improvement, and cross-border reporting could become routine, a by-product of everyday documentation rather than a bespoke integration project. With this foundation, hospitals can contribute data on common syndromes or procedures (such as the management of chest pain, stroke, or trauma) to federated analyses that benchmark performance and outcomes. This could support a more collaborative learning system among emergency departments internationally. In essence, federated networks can turn the variability among emergency care systems into a learning opportunity, where each site can compare against aggregated outcomes and identify areas for improvement in a confidential, non-punitive manner. This collaborative ethos of sharing insights without sharing raw data could become an important component of future quality-improvement initiatives.
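The idea of a minimum common base with site-specific extensions can be illustrated with a small validation sketch. The field names in `CORE_SCHEMA`, the extension mechanism, and the `validate_record` helper are hypothetical and do not describe the eCREAM data model; they only show how local additions might be checked against a shared core.

```python
# Hypothetical core fields; names and types are illustrative only.
CORE_SCHEMA: dict[str, type] = {
    "visit_id": str,
    "triage_level": int,
    "disposition": str,
}


def validate_record(record: dict, extensions: dict[str, type]) -> list[str]:
    """Return a list of problems; an empty list means the record conforms.

    `extensions` declares the extra fields a site has added on top of
    the core schema, together with their expected types.
    """
    problems: list[str] = []

    # A site extension may add fields but must not redefine core ones.
    for field in extensions:
        if field in CORE_SCHEMA:
            problems.append(f"extension redefines core field: {field}")

    # Every core field must be present with the expected type.
    for field, expected in CORE_SCHEMA.items():
        if field not in record:
            problems.append(f"missing core field: {field}")
        elif not isinstance(record[field], expected):
            problems.append(f"wrong type for core field: {field}")

    # Declared extension fields are optional but type-checked.
    for field, expected in extensions.items():
        if field in record and not isinstance(record[field], expected):
            problems.append(f"wrong type for extension field: {field}")

    # Anything else is undeclared and therefore not interpretable elsewhere.
    allowed = set(CORE_SCHEMA) | set(extensions)
    for field in record:
        if field not in allowed:
            problems.append(f"undeclared field: {field}")

    return problems
```

Under this design, a record that passes validation can always be reduced to its core fields for cross-site analysis, while declared extensions remain available locally.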

6 Conclusion

In conclusion, the Medical Informatics Platform's federated model addresses a longstanding challenge in multicenter health-data research: enabling large-scale analyses while preserving patient privacy and institutional autonomy. Compared with conventional centralized data-sharing approaches, federation can offer practical advantages in governance, compliance, and trust that are particularly relevant for emergency medicine and other domains working with sensitive data across jurisdictions. The experiences of FERES and eCREAM show that this model is feasible in practice, while also making clear that its implementation depends on sustained legal coordination, data harmonization, and organizational commitment. At this stage, the main contribution of these federations lies in establishing the conditions under which previously isolated datasets can be studied collaboratively in a privacy-preserving manner. Looking ahead, the broader significance of this approach lies in its reproducibility and potential scale. As additional hospitals, registries, and national networks join existing federations or establish new ones, the analytical value of these infrastructures is likely to grow through increased diversity, representativeness, and statistical power. Because the MIP is modular and open source, the same model can in principle be adapted to other domains handling sensitive data, including oncology, critical care, rare diseases, psychology and cognitive science, or prospective clinical trials spanning multiple health systems. The continuing maturation of standards, common data models, and supportive policy frameworks such as the EHDS may further reduce the effort required to establish future federations and improve their interoperability. In that perspective, federated data networks may become an important component of learning health systems, supporting continuous knowledge generation from routine care data across institutions and borders (33).

Statements

Data availability statement

The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding authors.

Ethics statement

Each participating institution obtained approval or a waiver from its local Ethics Committee for the secondary use of de-identified clinical data, in accordance with national regulations and institutional policies. All patient datasets were irreversibly anonymized prior to ingestion into the Medical Informatics Platform (MIP) using documented de-identification protocols aligned with the Swiss HRO, European GDPR requirements, Swiss SPHN guidelines, and the UK NHS ISB1523 anonymization standard. This rigorous anonymization process ensured that no personal identifiers remained in the data, upholding patient privacy and data protection norms across all sites.

Author contributions

GM: Writing – original draft, Data curation, Visualization, Conceptualization, Methodology. BS: Conceptualization, Writing – review & editing, Writing – original draft, Project administration. EM: Project administration, Writing – review & editing. PD: Data curation, Writing – review & editing. GB: Writing – review & editing, Supervision, Investigation. PM: Writing – review & editing, Supervision, Investigation. MR: Investigation, Writing – review & editing. GG: Data curation, Writing – review & editing. AS: Data curation, Investigation, Writing – review & editing. VB: Project administration, Writing – review & editing. JD: Software, Writing – review & editing. KF: Writing – review & editing, Software. M-OK: Software, Visualization, Writing – review & editing. T-MK: Software, Writing – review & editing. AG: Software, Writing – review & editing. YI: Supervision, Writing – review & editing. PR: Supervision, Writing – review & editing.

Funding

The author(s) declared that financial support was received for this work and/or its publication. This work was co-funded by the European Union's H2020 Framework Partnership Agreement No. 650003 for the Human Brain Project; the Horizon Europe R&I program through eCREAM, Grant Agreement No. 101057726 and SERI (the Swiss State Secretariat for Education, Research and Innovation) contract number 22.00347; the Horizon Europe R&I program through the EBRAINS2.0 project grant agreement No. 101147319 and SERI contract number 23.00638. Additional funding was provided by the European Academy of Neurology (EAN) for the FERES project. The MIP is a service provided by the EBRAINS Research Infrastructure. None of the funding sources had any direct involvement in the study design, data analysis, or decision to publish.

Acknowledgments

The authors thank the members of the eCREAM consortium and the FERES project, as well as partner institutions and collaborators, for their participation and contributions to this work. We acknowledge Lausanne University Hospital (CHUV), particularly the Neurodigital team within the Department of Clinical Neurosciences, for support in clinical data coordination and federation deployments; the Swiss National Supercomputing Center (CSCS) for computational infrastructure; the AthenaRC MaDgIK team for the continuous development of the MIP; and the EBRAINS technical team for infrastructure services and for hosting the MIP as an EBRAINS service. Finally, we are grateful to colleagues who supported the platform's development, governance, and deployment, including key contributors not listed as co-authors, for their invaluable assistance throughout the FERES, eCREAM, and other MIP-related projects.

Conflict of interest

The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declared that generative AI was used in the creation of this manuscript. During manuscript preparation, we used ChatGPT (product: ChatGPT 5 Pro; model: GPT-5 Pro; provider: OpenAI) to assist with language editing (clarity, grammar, and condensation of repetitive passages), structural suggestions, and selecting and validating references. All content was written, reviewed, verified, and edited by the author(s), who take full responsibility for the accuracy and integrity of the manuscript. No generative-AI tools were used for data collection, statistical analysis, or figure creation.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Keywords

cross-border multicenter research, data harmonization, electronic health records (EHR), emergency department research, federated analytics, Medical Informatics Platform (MIP), privacy-preserving analysis, stroke registries

Citation

Melissargos G, Schaffhauser B, Mailli E, Ducouret P, Bertolini G, Michel P, Rujano MA, Ghilardi GI, Salerno A, Bastic V, Dhallenne J, Filippopolitis K, Katsouli M-O, Karampatsis T-M, Glenis A, Ioannidis Y and Ryvlin P (2026) Deploying the Medical Informatics Platform for cross-border federated analytics in FERES and eCREAM. Front. Disaster Emerg. Med. 4:1748193. doi: 10.3389/femer.2026.1748193

Received

17 November 2025

Revised

23 March 2026

Accepted

27 March 2026

Published

28 April 2026

Volume

4 - 2026

Edited by

Göksu Bozdereli Berikol, Atilim Universitesi Tip Fakultesi, Türkiye

Reviewed by

Buğra Ilhan, Kırıkkale University, Türkiye

Jonas Bienzeisler, University Hospital RWTH Aachen, Germany

Altuğ Kanbakan, Ufuk University Faculty of Medicine, Türkiye

*Correspondence: Georges Melissargos; Birgit Schaffhauser
