PERSPECTIVE article

Front. Med., 23 January 2026

Sec. Intensive Care Medicine and Anesthesiology

Volume 13 - 2026 | https://doi.org/10.3389/fmed.2026.1746867

This article is part of the Research Topic: Data Science in Anesthesiology and Intensive Care

Multimodal data for predictive medicine: algorithmic fusion of clinical data in anesthesiology and intensive care

Sebastian Daniel Boie1*; Niklas Giesa1,2; Maria Sekutowicz1,2,3; Rustam Zhumagambetov4,5; Stefan Haufe1,4,5; Elias Grünewald1; Felix Balzer1
  • 1Institute of Medical Informatics, Charité-Universitätsmedizin Berlin, Berlin, Germany
  • 2Department of Anesthesiology and Operative Intensive Care Medicine (CCM, CVK), Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany
  • 3Berlin Institute of Health at Charité – Universitätsmedizin Berlin, BIH Biomedical Innovation Academy, BIH Charité Junior Clinician Scientist Program, Berlin, Germany
  • 4Physikalisch-Technische Bundesanstalt, Berlin, Germany
  • 5Technische Universität Berlin, Berlin, Germany

Anesthesiology and intensive care medicine are among the most data-rich fields of medicine, where accurate and timely outcome prediction or risk stratification is important. During patient care, heterogeneous data streams, including structured electronic health records, free-text documentation, and high-frequency physiologic time series, are recorded. This provides fertile ground for machine learning (ML) models to make individualized risk predictions. Yet, secondary use of routine data remains difficult due to heterogeneity, missingness, variable granularity, ambiguously defined outcomes, or poor representation of clinical concepts in routine data. Reproducibility and transparency are difficult to achieve with hospital-specific complex data pipelines. New complexities arise when combining different data modalities. This perspective article discusses three common modalities (tabular data, clinical text, and time series) and outlines data modality-specific challenges, data preprocessing strategies, and ML modeling approaches. We examine multimodal fusion strategies through the common taxonomy of early, intermediate, and late fusion. In early fusion, generated features are aggregated into a unified tabular representation, which offers simplicity and often serves as a first baseline prediction model. Intermediate fusion uses modality-specific encoders with shared layers to learn cross-modal dependencies. This strategy yields the most complex and powerful models. Late decision-level fusion combines outputs from modality-optimized models, providing modularity and robustness to missing modalities, which is advantageous for real-time deployment where data arrive asynchronously. The growth of multi-centric datasets and federated infrastructures may enable intermediate-fusion architectures and multimodal foundation models to better capture patient trajectories, supporting risk stratification and personalized therapy in perioperative and intensive care settings.

Introduction

Patients in anesthesiology and intensive care units are among the most closely monitored patients. Different data types, such as continuous physiologic signals, high-frequency device outputs, and detailed electronic health records (EHRs), produce large streams of data (1). Traditional clinical decision-making tools, which are based on guidelines and physician expertise, are increasingly challenged by the volume and complexity of information available today. This environment has created opportunities for the development of machine learning (ML) applications that promise to reshape perioperative and intensive care medicine. Typical use cases for ML are perioperative risk prediction for complications such as acute kidney injury (2, 3) and post-operative delirium (4, 5), real-time physiologic monitoring and early-warning systems for hypotension and deterioration (6, 7), and early detection and outcome prediction in sepsis (8).

It is well known that research based on routine data is difficult owing to complexity, heterogeneity, and data quality problems. Clinical and administrative data is primarily collected for coordinating care and billing, so use for other purposes may lead to difficulties. Clinical workflows are typically not optimized for research-grade data quality. Inconsistencies, missingness, and varying levels of granularity across institutions are equally challenging (9).

Standardization is a large area of research that alleviates some of the pain points. The major standardization efforts are the Observational Medical Outcomes Partnership (OMOP) common data model for standardized storage of observational data (10), Fast Healthcare Interoperability Resources (FHIR) for standardized data exchange (11), the openEHR framework (12) for standardized clinical information representation and HL7 Structured Data Capture (SDC) for standardized data capture.

Missingness patterns are categorized into missing completely at random (MCAR), where missingness is independent of both observed and unobserved values; missing at random (MAR), where missingness depends only on observed values; and missing not at random (MNAR), where missingness depends on unobserved values. Typically, all three patterns are observed in routine data, and assumptions about missingness are generally untestable in practice, requiring sophisticated strategies for imputation by estimating missing values on a training cohort (13–19).
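The idea of estimating imputation values on a training cohort can be illustrated with a minimal sketch. It uses median imputation plus a missingness indicator as a deliberately simple stand-in for the more sophisticated strategies cited above; the variable names (lactate, creatinine) and all values are hypothetical.

```python
from statistics import median

def fit_median_imputer(train_rows, columns):
    """Learn one median per column from the training cohort, ignoring missing values (None)."""
    medians = {}
    for col in columns:
        observed = [row[col] for row in train_rows if row[col] is not None]
        medians[col] = median(observed)
    return medians

def apply_imputer(rows, medians):
    """Fill missing values with the training medians and add a missingness flag per column."""
    imputed = []
    for row in rows:
        new_row = {}
        for col, fill in medians.items():
            missing = row[col] is None
            new_row[col] = fill if missing else row[col]
            new_row[col + "_missing"] = int(missing)
        imputed.append(new_row)
    return imputed

# Hypothetical toy cohort; None marks a value that was not recorded.
train = [
    {"lactate": 1.0, "creatinine": 0.9},
    {"lactate": None, "creatinine": 1.1},
    {"lactate": 3.0, "creatinine": None},
]
test = [{"lactate": None, "creatinine": 2.0}]

medians = fit_median_imputer(train, ["lactate", "creatinine"])
test_imputed = apply_imputer(test, medians)
```

Fitting on the training cohort only keeps information from the test cohort out of the preprocessing step. The indicator columns let a downstream model use the fact that a value was missing, which can matter when missingness is informative.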

Missingness of outcomes is a bigger hurdle than missing predictors. Reliable definitions are challenging for many outcomes of interest, as endpoints are sometimes observed with delay (right-censored at prediction time), ambiguously defined or insufficiently captured. Therefore, formulating clinically meaningful prediction tasks and developing robust models are challenging. Many researchers default to predicting in-hospital mortality or length of stay (20), even though the prediction is rarely actionable, since it is not specific enough to suggest an intervention and the cause can seldom be addressed over a short period of time.

Virtually every treatment episode in anesthesiology and intensive care results in tabular data (e.g., demographic factors, billing information, comorbidities, laboratory results, medication administration), free-text data (e.g., anamnesis, admission and discharge notes) and time-series or waveform signals from monitors and devices (e.g., heart rate, blood pressure) (21). Some patients also have recorded imaging data (e.g., CT, MRI or X-ray) during their treatment episode. Emerging data sources, such as genomics data, wearable data and patient-reported outcomes fit into the existing modalities of tabular, text and time series data.

Combining different data modalities complicates integration and joint analysis. The data modalities are often recorded at irregular intervals (from multiple samples a second for waveform data to daily documentation of risk scores and observations), requiring sophisticated preprocessing and alignment methods (22, 23).
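One simple way to align modalities recorded at different intervals is an "as-of" lookup, in which each high-frequency sample is paired with the most recent low-frequency observation available at that time. The sketch below is illustrative only; the timestamps (hours since admission), the risk score and the heart-rate values are hypothetical.

```python
from bisect import bisect_right

def asof_value(times, values, query_time):
    """Return the most recent value recorded at or before query_time, or None if there is none."""
    idx = bisect_right(times, query_time)
    return values[idx - 1] if idx > 0 else None

# Hypothetical daily risk score and irregularly sampled heart rate.
score_times = [0.0, 24.0, 48.0]   # must be sorted in ascending order
score_values = [2, 5, 3]
hr_times = [1.0, 23.5, 24.0, 30.0]
hr_values = [88, 95, 97, 110]

# Pair every heart-rate sample with the latest score known at that time.
aligned = [
    (t, hr, asof_value(score_times, score_values, t))
    for t, hr in zip(hr_times, hr_values)
]
```

Looking only backwards in time avoids leaking information that was not yet documented at the moment of prediction.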

In this article, we discuss the three common data modalities (tabular data, clinical notes and time series data), highlight challenges and common approaches to analyzing each modality, and describe typical approaches to the joint analysis of modalities for predicting outcomes.

Data modalities and analysis strategies

The three key data modalities (tabular data, text, and time series) may contain overlapping information, but provide distinct insights that are essential for comprehensive patient monitoring and outcome prediction. In this perspective, we focus on the three modalities that are available for every patient in anesthesiology and intensive care.

Tabular data

Tabular data is information stored in predefined fields and formats, organized in tables or key-value pairs, such that each variable has a consistent meaning and data type. Data types can be continuous or categorical.

Common data quality issues of tabular data are the use of different physical units (e.g., height in m or cm) and extreme values or outliers (24). Each datum has an associated timestamp; however, these timestamps can be inaccurate (e.g., due to miscalibrated clocks, daylight saving time adjustments, or diagnoses and procedures being coded only at the end of the hospital stay for billing) (25). To mitigate these quality issues, the following steps have proved effective: manual review of variables to check for consistent units, mapping to a standard vocabulary and unit, applying reasonable bounds, and robust scaling such as z-score standardization (26, 27).
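The mitigation steps for tabular data can be sketched for a single variable as follows. This is a minimal illustration under stated assumptions: the unit labels, the plausibility bounds of 100 to 250 cm, and the example values are hypothetical and would need to be set by domain experts for real data.

```python
from statistics import mean, stdev

def height_to_cm(value, unit):
    """Harmonize height to centimeters; only the two units in this toy example are handled."""
    if unit == "cm":
        return value
    if unit == "m":
        return value * 100.0
    raise ValueError(f"unknown unit: {unit}")

def apply_bounds(value, low, high):
    """Mark implausible values as missing (None) instead of silently keeping them."""
    return value if low <= value <= high else None

def zscore(values):
    """Standardize observed values to zero mean and unit standard deviation."""
    observed = [v for v in values if v is not None]
    mu, sigma = mean(observed), stdev(observed)
    return [None if v is None else (v - mu) / sigma for v in values]

# Hypothetical raw entries, including a likely typing error (1720 cm).
raw = [(1.80, "m"), (165.0, "cm"), (1720.0, "cm"), (1.70, "m")]
heights_cm = [height_to_cm(v, u) for v, u in raw]
bounded = [apply_bounds(h, 100.0, 250.0) for h in heights_cm]
scaled = zscore(bounded)
```

In a prediction setting, the mean and standard deviation would be estimated on the training cohort and reused unchanged for validation and test data.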

A vast body of literature is dedicated to outcome prediction based on tabular data (2, 28–30). Unlike other data modalities, tabular data has no a priori correlation structure to exploit. For other modalities, specific architectures (e.g., CNNs for images, recurrent NNs or Transformers for time series and text) can capitalize on the inherent correlation structure. Nevertheless, correlations exist in tabular data (e.g., between height, weight and diagnosis codes for obesity) and can be learned by models during the training process.

Models for prediction based on tabular data range from simple models like linear regression (continuous prediction) or logistic regression (categorical prediction) to advanced algorithms like tree-based methods (e.g., XGBoost), deep neural networks (e.g., multi-layer perceptrons) and, more recently, neural-network-based foundation models (31). The widely used XGBoost algorithm is often a very strong candidate for high prediction accuracy (32).

Free-text

A substantial part of relevant information is documented in free text (e.g., anamnesis, progress notes). The advantage of free text is that information that does not fit into a predefined tabular format can be documented. For instance, information like clinical reasoning and interpretation of data (e.g., a patient is hypotensive likely due to septic shock) and rationale (e.g., increased norepinephrine due to persistent low MAP) is typically contained in the clinical notes. Free text also allows for detailed patient history documentation, event descriptions, care coordination, nursing observations and other contextual information not captured in tabular data (33, 34). Furthermore, free-text is often preferred by experienced clinicians as it is usually considered to be more practical and time-efficient for data entry as opposed to structured data input.

Analyzing free-text data is arguably more challenging compared to other data types. While implicit standards for clinical notes exist, no rules are enforced, since conditions can be described in many ways, local customs have become established, or the clinical information systems limit their expressiveness. Free-text can be context dependent on other data modalities, contain a mix of tabular (e.g., a specific lab value) and word documentation, non-standard abbreviations, jargon, typos, and multilingual entries (35). The secondary use of free-text data for research is also hampered by legal and privacy considerations since personally identifiable information can be included and is challenging to remove in an automated fashion.

Traditionally, free-text data has been included in predictive models by mapping elements to tabular data using a standardized terminology such as SNOMED CT (36) or by extracting relevant information using either rule-based or model-based strategies such as named-entity recognition (37) and bag-of-words (38). Nowadays, there is growing interest in deep-learning approaches (e.g., BioBERT, ClinicalBERT, Med-PaLM 2) that automatically convert free text into representations relevant for prediction (39, 40).
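As a minimal sketch of the bag-of-words strategy mentioned above, the following code turns notes into count vectors over a shared vocabulary. The tokenizer is a naive assumption (lower-casing and splitting on non-alphanumeric characters), and the two notes are invented examples rather than real clinical text.

```python
import re
from collections import Counter

def tokenize(note):
    """Lower-case the note and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", note.lower())

def bag_of_words(notes):
    """Build a sorted vocabulary and one count vector per note."""
    vocabulary = sorted({tok for note in notes for tok in tokenize(note)})
    rows = []
    for note in notes:
        counts = Counter(tokenize(note))
        rows.append([counts[tok] for tok in vocabulary])
    return vocabulary, rows

# Hypothetical notes.
notes = [
    "Patient hypotensive, norepinephrine increased.",
    "No fever. Patient stable.",
]
vocabulary, matrix = bag_of_words(notes)
```

The resulting matrix has one row per note and one column per vocabulary term, so it can be treated like any other tabular input. Word order and negation are lost, which is one motivation for the deep-learning approaches described above.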

Time series/waveforms

Continuously monitored and recorded parameters are often stored as time series. Since typical parameters (e.g., heart rate, respiratory rate) exhibit periodic patterns, they are also called waveforms. This modality uniquely captures dynamic changes in biosignals of individuals.

These data are not free of artifacts, which arise from sensor disconnection or misplacement, motion, data transmission errors or procedures initiated by the hospital staff. Artifacts can include missing or extreme values beyond physiologically plausible limits. Often a simple last-observation-carried-forward imputation is therefore performed, which has been shown to be competitive for time series data (41).
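Last-observation-carried-forward imputation combined with a plausibility check can be sketched in a few lines. The limits of 20 to 250 beats per minute and the sample values are hypothetical and serve only to illustrate the mechanism.

```python
def locf(values, low, high):
    """Treat None and out-of-range samples as missing, then carry the last valid sample forward."""
    filled, last = [], None
    for v in values:
        if v is not None and low <= v <= high:
            last = v
        filled.append(last)
    return filled

# Hypothetical heart-rate samples with dropouts (None) and artifacts (0 and 310).
heart_rate = [None, 82, 85, 0, None, 310, 90]
cleaned = locf(heart_rate, low=20, high=250)
```

Samples before the first valid observation remain None, because there is nothing to carry forward; such leading gaps need a separate rule.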

High-frequency time series pose unique challenges. Some signals are recorded at 500 samples per second and quickly generate millions of data points per patient and day. Analyzing such data for an entire cohort requires specialized (distributed) algorithms for compression, indexing, storage, and retrieval (42). Common basic approaches are low-pass filtering and downsampling (43). Additionally, detecting signals pertaining to the onset of some event of interest is akin to finding a needle in a haystack.
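A crude combination of low-pass filtering and downsampling is block averaging, where non-overlapping blocks of samples are replaced by their mean. The sketch below assumes a synthetic 500 Hz signal and a reduction factor of 4; it is not a substitute for a properly designed anti-aliasing filter.

```python
def block_average(signal, factor):
    """Average non-overlapping blocks of `factor` samples (boxcar filter plus decimation)."""
    n_blocks = len(signal) // factor  # an incomplete trailing block is dropped
    return [
        sum(signal[i * factor:(i + 1) * factor]) / factor
        for i in range(n_blocks)
    ]

# Synthetic signal: 2 seconds at 500 samples per second, repeating 0, 1, 2, 3.
signal_500hz = [float(i % 4) for i in range(1000)]
signal_125hz = block_average(signal_500hz, 4)
```

Here the fast oscillation is averaged out completely, leaving 250 samples with a constant value of 1.5. This shows both the data reduction and the loss of high-frequency content.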

Traditionally, time series are incorporated into prediction models by aggregation strategies that collapse the signal over a predefined period of time (44). Such strategies are easy to implement and allow researchers to use the same models as for tabular data. However, this approach discards much of the temporal structure. Alternatively, modern approaches include algorithms that process raw time series data directly (e.g., recurrent neural networks that read one value per feature per time step) (45).
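The aggregation strategy can be sketched as follows, collapsing a signal into summary statistics per fixed time window. The window length of 60 minutes, the choice of statistics and the mean arterial pressure values are assumptions made for illustration.

```python
from statistics import mean, pstdev

def summarize_windows(times, values, window):
    """Group samples into fixed-length windows and compute summary statistics per window."""
    buckets = {}
    for t, v in zip(times, values):
        buckets.setdefault(int(t // window), []).append(v)
    return {
        idx: {"mean": mean(vals), "min": min(vals), "max": max(vals), "sd": pstdev(vals)}
        for idx, vals in sorted(buckets.items())
    }

# Hypothetical mean arterial pressure (mmHg) with timestamps in minutes.
times_min = [0, 10, 20, 65, 70, 130]
map_mmhg = [70, 66, 62, 58, 60, 75]
features = summarize_windows(times_min, map_mmhg, window=60)
```

Each window becomes a small set of tabular features. The steady decline within the first hour (70, 66, 62) is reduced to a mean, a range and a standard deviation, so the order of events is no longer visible.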

Combining data modalities

Each of the three data modalities has its own challenges, mitigation strategies, and routinely used algorithms for analysis. While many studies have used multiple modalities for outcome prediction, this is often done by a simple integration strategy neglecting the inherent structure of some modalities. This possibly discards information useful for solving prediction tasks (44, 46).

In the ML community, there is a widely-used taxonomy of early fusion, intermediate fusion, and late fusion of different data modalities for addressing machine learning tasks (47). We discuss this taxonomy in the context of outcome prediction in anesthesiology and intensive care. An overview of the fusion strategies can be seen in Figure 1.

Figure 1
Diagram illustrating three types of data fusion for prediction: A) Early fusion combines structured data, time series, and text into structured data for input into a tree-based model. B) Intermediate fusion processes structured data, time series with RNN, and text with embeddings and attention-based encoders, integrated into a fully connected network (FCN) for prediction. C) Late fusion processes structured data, time series, and text individually with FCN, RNN, and attention encoders, then combines results using majority vote or meta learners for prediction.

Figure 1. Data modality fusion strategies. The left panel (A) shows an example of early fusion, where data modalities are combined prior to model input. The middle panel (B) shows an example of intermediate fusion, where data of different modalities are ingested by bespoke algorithms before being combined by a central model. The right panel (C) shows an example of late fusion, where each modality is used independently in an appropriate model, while only model outputs are combined at the decision stage.

Early fusion

Early fusion combines heterogeneous data at the input level. Commonly, this is done by using tabular data directly, and by extracting features from time series and text (see Figure 1A). Features from time series can be obtained by manually extracting summary statistics (e.g., mean, variance, percentiles) across the entire time series or segments of it. This hand-crafted feature selection is often guided by domain knowledge and therefore needs input from experts (48). Alternatively, there are solutions for automatically extracting hundreds of time series features based on hypothesis testing procedures (49).

In early fusion, text is often incorporated through keyword search or through lexicon-based or regular-expression-based matching (50). This form of information extraction leads to a tabular documentation of relevant entities which can then be used downstream in the prediction task. For these tasks, a range of libraries are available (e.g., spaCy, NLTK, MedCAT) (51, 52). Features are also commonly obtained via embedding models that produce dense vector representations of text.
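Regular-expression-based matching can be sketched with the standard library alone. The patterns and feature names below are hypothetical examples and not a validated lexicon.

```python
import re

# Hypothetical patterns; a real lexicon would be curated by clinical experts.
PATTERNS = {
    "mentions_sepsis": re.compile(r"\b(sepsis|septic)\b", re.IGNORECASE),
    "mentions_vasopressor": re.compile(
        r"\b(norepinephrine|vasopressin|vasopressor)\b", re.IGNORECASE
    ),
}

def extract_flags(note):
    """Return one binary feature per pattern: 1 if the pattern occurs in the note, else 0."""
    return {name: int(bool(pattern.search(note))) for name, pattern in PATTERNS.items()}

flags_a = extract_flags("Hypotensive, likely septic shock; norepinephrine increased.")
flags_b = extract_flags("Stable overnight.")
```

This sketch does not handle negation, so a note stating "no signs of sepsis" would also be flagged. Dedicated clinical NLP tools address this kind of context.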

In early fusion, these representations are concatenated with tabular data and time-series features to form a single, large feature vector, which serves as input to a downstream prediction model. Prediction models that operate on modalities fused in this way are discussed in the tabular data section. However, early fusion can discard important information from individual modalities, potentially reducing performance on downstream prediction tasks.
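The concatenation step can be sketched as follows, assuming that features from each modality have already been extracted into dictionaries. All feature names and values are hypothetical.

```python
def early_fusion_vector(tabular, ts_features, text_features, columns):
    """Concatenate per-modality features into one vector with a fixed column order."""
    merged = {}
    for block in (tabular, ts_features, text_features):
        overlap = merged.keys() & block.keys()
        if overlap:
            raise ValueError(f"duplicate feature names: {sorted(overlap)}")
        merged.update(block)
    return [float(merged[name]) for name in columns]

# Hypothetical features for one patient.
tabular = {"age": 71, "creatinine": 1.4}
ts_features = {"map_mean": 64.5, "map_min": 52.0}
text_features = {"mentions_sepsis": 1}

# The column order is fixed once and reused for every patient.
columns = sorted({**tabular, **ts_features, **text_features})
x = early_fusion_vector(tabular, ts_features, text_features, columns)
```

The resulting vector can be passed to any model for tabular data. A consistent column order across patients is essential, because the downstream model identifies features only by position.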

Intermediate fusion

A joint analysis of tabular, textual, and time-series data substantially increases methodological complexity. It requires aligning heterogeneous data modalities with different temporal resolutions, standardizing variable definitions across modalities, and resolving inconsistencies in documentation practices.

Intermediate fusion integrates data at the representation level. Here, each modality is first processed by a modality-specific encoding model that transforms raw input into a latent feature representation. The encoded representations are then combined in a shared architecture, typically using concatenation, attention, or gating mechanisms that allow the model to learn cross-modal interactions (53, 54).

A common setup involves neural encoders tailored to each modality—for instance, feed-forward networks for tabular data, pre-trained language models such as BERT or ClinicalBERT for text, and Transformer or recurrent architectures for time series (55, 56). The encoded data are then fused in a joint layer, projecting into a shared latent space, where information can be attended to across modalities. A key feature is that the entire architecture, from joint layer to individual modality encoders, can be jointly trained or fine-tuned, such that all components are optimized towards the prediction task of interest. This supports flexible and data-driven representation learning and has become one of the most widely used and powerful strategies in multimodal machine learning (55, 57).
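The structure of such an architecture can be illustrated with a forward pass in plain Python. This is a toy sketch under strong assumptions: all weights and embeddings are fixed, hypothetical numbers, the encoders are far smaller than the models named above, and no training is performed. In practice, a deep learning framework would be used so that all components can be optimized jointly.

```python
import math

def dense(x, weights, bias):
    """Affine layer followed by tanh; `weights` holds one row per output unit."""
    return [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, bias)
    ]

def encode_tabular(x):
    """Feed-forward encoder for two tabular inputs (hypothetical weights)."""
    return dense(x, [[0.5, -0.2], [0.1, 0.3]], [0.0, 0.1])

def encode_time_series(series):
    """Minimal recurrent encoder: the hidden state is updated once per time step."""
    h = [0.0, 0.0]
    for x_t in series:
        h = dense([x_t] + h, [[0.4, 0.3, 0.0], [-0.2, 0.0, 0.5]], [0.0, 0.0])
    return h

# Hypothetical token embeddings standing in for a pre-trained language model.
EMBEDDINGS = {"septic": [0.9, 0.1], "stable": [-0.7, 0.2]}

def encode_text(tokens):
    """Average the embeddings of known tokens; unknown tokens are ignored."""
    known = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not known:
        return [0.0, 0.0]
    return [sum(dim) / len(known) for dim in zip(*known)]

def predict(tabular, series, tokens):
    """Fuse the three latent representations by concatenation and apply a logistic head."""
    z = encode_tabular(tabular) + encode_time_series(series) + encode_text(tokens)
    head = [0.8, -0.3, 0.6, 0.2, 1.0, -0.4]  # hypothetical fusion-layer weights
    logit = sum(w * zi for w, zi in zip(head, z))
    return z, 1.0 / (1.0 + math.exp(-logit))

latent, risk = predict([0.2, -1.0], [0.1, 0.4, -0.3], ["patient", "septic"])
```

The fusion here is plain concatenation followed by a single output layer. Attention or gating mechanisms would replace this step in more expressive architectures.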

Intermediate fusion offers several advantages over early and late fusion. It allows each modality to contribute learned features rather than relying on handcrafted summaries, and it can discover complex nonlinear dependencies between modalities. However, it typically requires more computational resources and training data, and careful design choices are needed to balance the contribution of each modality, to prevent overfitting, and to handle missing modalities during inference (58).

Recent work in clinical prediction models has demonstrated high performance of intermediate fusion architectures. For example, cross-modal attention and gating mechanisms have been successfully applied to integrate physiologic time series with clinical notes and tabular EHR data, yielding improvements in mortality and event prediction tasks (59). Such architectures illustrate the potential of data-driven representation learning to capture intricate patterns of different data modalities.

Late fusion

Late fusion (or decision-level fusion) combines the outputs of separate models trained independently on individual data modalities, for example through weighted averaging, majority voting or stacking. Each modality is modeled using an algorithm most suitable given its characteristics. The predictions are then combined in a decision layer (see Figure 1C). This architecture offers practical advantages. It is modular by design, and individual components can be retrained or replaced without affecting other parts. Late fusion is robust to missing modalities, since predictions are still generated from available data modalities. Therefore, it is particularly applicable to real-time ML in intensive care, where waveform data may be unavailable during certain procedures and textual documentation may occur only after care decisions are taken and executed. Additionally, with this architecture, some interpretability is given by the fact that it becomes apparent which modality and set of clinical parameters lead to a certain prediction.
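Weighted averaging with graceful handling of missing modalities can be sketched as follows. The weights and the per-modality risk estimates are hypothetical; in practice the weights might be tuned on validation data or replaced by a stacking model.

```python
def late_fusion(predictions, weights):
    """Weighted average of per-modality risk estimates, skipping unavailable modalities."""
    available = {m: p for m, p in predictions.items() if p is not None}
    if not available:
        raise ValueError("no modality produced a prediction")
    # Renormalize so that the weights of the available modalities sum to one.
    total = sum(weights[m] for m in available)
    return sum(weights[m] * p for m, p in available.items()) / total

# Hypothetical weights and model outputs.
weights = {"tabular": 0.5, "time_series": 0.3, "text": 0.2}
full = late_fusion({"tabular": 0.2, "time_series": 0.6, "text": 0.4}, weights)
partial = late_fusion({"tabular": 0.2, "time_series": None, "text": 0.4}, weights)
```

When the waveform model provides no output, the remaining weights are renormalized and a prediction is still produced, which reflects the robustness to missing modalities described above.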

The major shortcoming of this strategy is that it cannot learn complex interdependencies between data modalities (e.g., a co-occurrence of an abnormal lab value (tabular), an unusual blood pressure pattern (time series) and a documented observation which may be inconclusive individually but very informative taken together).

Table 1 shows a brief summary of the different fusion strategies with advantages, disadvantages and examples.

Table 1. Summary table of data fusion strategies with advantages, disadvantages and examples.

Clinical perspective

In this perspective article, we discuss key challenges, strategies and architecture design patterns for three common data modalities in anesthesia and intensive care. With complementary data modalities available during different phases of clinical care, the multimodal integration holds substantial potential to enhance understanding of disease trajectories, and to improve ML predictions to support clinical decisions, stratify risk groups, and enable personalized therapy (57, 58).

In anesthesiology and intensive care medicine, tabular, text and time series data are commonly acquired for every patient. Each of these modalities comes with distinct challenges and requirements for preprocessing strategies (60). Multimodal integration demands robust strategies for harmonizing units, ontologies, and timestamps, as well as methods that can gracefully handle incomplete records or entirely missing modalities.

The multimodal ML community has a widely used taxonomy of data fusion strategies, which is directly applicable to ML use cases in anesthesia and intensive care. Since each fusion strategy has advantages and limitations, it is important to carefully consider which strategy is chosen for a given prediction task. We provide guidance below.

Early fusion is simple and powerful when reliable feature extraction strategies exist (e.g., curated time-series summaries and keyword- or entity-based textual features). This approach requires fewer computational resources and less training data than the other strategies. Hence, it is particularly suitable for smaller data sets and single-center studies.

Intermediate fusion (encoder-specific layers for each modality with shared fusion layers) best captures cross-modal interactions and has the potential to yield the strongest predictive performance among strategies (61). It puts higher computational requirements on the model development and requires larger representative data sets. In the future, with national and international data sharing and federated machine learning infrastructures being developed, intermediate fusion models are likely to attain superior performance (58).

Late fusion is modular, robust to missing data modalities, and suitable for real-time deployment where modalities arrive asynchronously. While it cannot learn complex cross-modal dependencies, its strength lies in fault-tolerant architectures with easier-to-achieve real-time capabilities, making this strategy especially suitable for first production pilots.

Machine learning-based (multimodal) predictions are valuable only if they support actions or inform processes. Prediction tasks should be defined by the decisions they enable for interventions in a given time horizon (e.g., vasopressor titration, fluid administration, antibiotic escalation, ICU triage). Outcome choices, such as in-hospital mortality or length of stay, are comparably easy to label but rarely actionable. Designing prediction tasks and evaluation strategies is one of the primary challenges in clinical ML (62). The availability of multiple data modalities adds complexity but will likely yield the highest performance.

Recent multimodal studies illustrate these considerations. Benchmarking work in emergency settings has used late fusion to combine high-frequency waveforms with structured clinical data, demonstrating added value of waveform signals for acute risk stratification (63). Sepsis trajectory models using multivariate temporal EHR data employ early fusion to encode longitudinal predictors and improve continuous risk estimation (64).

Also, studies predicting postoperative deterioration or cardiac events from combined waveform and EHR data largely adopt early fusion, integrating engineered waveform features with structured variables before modeling (65, 66). In deployment contexts, intraoperative hypotension-prediction systems such as the Hypotension Prediction Index exemplify unimodal early fusion of high-frequency waveform features for real-time decision support (6).

Collectively, these studies show that multimodal fusion strategies are already used in clinical research and practice, reinforcing the potential of multimodal ML to enhance real-time risk assessment in emergency and perioperative care.

Conclusion

Multimodal ML applied to tasks in anesthesiology and intensive care is an emerging field with large potential. Early fusion is usually the fastest path to a baseline prediction, late fusion is the most resilient architecture for real-time deployment, and intermediate fusion has the potential to offer the highest predictive performance. With larger multi-centric datasets and national and international federated infrastructures being developed, multimodal foundation models with an intermediate fusion strategy may prove to be transformative for personalized medicine.

Data availability statement

The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.

Author contributions

SB: Visualization, Conceptualization, Writing – review & editing, Methodology, Writing – original draft. NG: Validation, Writing – review & editing. MS: Writing – review & editing. RZ: Writing – review & editing, Validation. SH: Validation, Methodology, Writing – review & editing. EG: Writing – review & editing, Validation. FB: Supervision, Writing – review & editing, Resources.

Funding

The author(s) declared that financial support was not received for this work and/or its publication.

Conflict of interest

The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declared that Generative AI was used in the creation of this manuscript. Generative AI tools were used solely for language editing to improve grammar and clarity. No AI tools were used to generate, analyze, or interpret scientific content. All edits were reviewed by the authors, who take full responsibility for the content of the manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

1. Noteboom, SH, Kho, E, Galanty, M, Sánchez, CI, Ten Bookum, FCP, Veelo, DP, et al. From intensive care monitors to cloud environments: a structured data pipeline for advanced clinical decision support. EBioMedicine. (2025) 111:105529. doi: 10.1016/j.ebiom.2024.105529

Crossref Full Text | Google Scholar

2. Feng, Y, Wang, AY, Jun, M, Pu, L, Weisbord, SD, Bellomo, R, et al. Characterization of risk prediction models for acute kidney injury: a systematic review and meta-analysis. JAMA Netw Open. (2023) 6:e2313359. doi: 10.1001/jamanetworkopen.2023.13359,

PubMed Abstract | Crossref Full Text | Google Scholar

3. Hill, BL, Brown, R, Gabel, E, Rakocz, N, Lee, C, Cannesson, M, et al. An automated machine learning-based model predicts postoperative mortality using readily-extractable preoperative electronic health record data. Br J Anaesth. (2019) 123:877–86. doi: 10.1016/j.bja.2019.07.030,

PubMed Abstract | Crossref Full Text | Google Scholar

4. Giesa, N, Haufe, S, Menk, M, Weiß, B, Spies, CD, Piper, SK, et al. Predicting postoperative delirium assessed by the nursing screening delirium scale in the recovery room for non-cardiac surgeries without craniotomy: a retrospective study using a machine learning approach. PLOS Digit Health. (2024) 3:e0000414. doi: 10.1371/journal.pdig.0000414

Crossref Full Text | Google Scholar

5. Giesa, N, Sekutowicz, M, Rubarth, K, Spies, CD, Balzer, F, Haufe, S, et al. Applying a transformer architecture to intraoperative temporal dynamics improves the prediction of postoperative delirium. Commun Med. (2024) 4:251. doi: 10.1038/s43856-024-00681-x

Crossref Full Text | Google Scholar

6. Mohammadi, I, Firouzabadi, SR, Hosseinpour, M, Akhlaghpasand, M, Hajikarimloo, B, Tavanaei, R, et al. Predictive ability of hypotension prediction index and machine learning methods in intraoperative hypotension: a systematic review and meta-analysis. J Transl Med. (2024) 22:725. doi: 10.1186/s12967-024-05481-4,

PubMed Abstract | Crossref Full Text | Google Scholar

7. Verma, AA, Stukel, TA, Colacci, M, Bell, S, Ailon, J, Friedrich, JO, et al. Clinical evaluation of a machine learning–based early warning system for patient deterioration. CMAJ. (2024) 196:E1027–37. doi: 10.1503/cmaj.240132,

PubMed Abstract | Crossref Full Text | Google Scholar

8. Yadgarov, MY, Landoni, G, Berikashvili, LB, Polyakov, PA, Kadantseva, KK, Smirnova, AV, et al. Early detection of sepsis using machine learning algorithms: a systematic review and network meta-analysis. Front Med. (2024) 11:1491358. doi: 10.3389/fmed.2024.1491358

Crossref Full Text | Google Scholar

9. Johnson, AEW, Ghassemi, MM, Nemati, S, Niehaus, KE, Clifton, D, and Clifford, GD. Machine learning and decision support in critical care. Proc IEEE. (2016) 104:444–66. doi: 10.1109/JPROC.2015.2501978,

PubMed Abstract | Crossref Full Text | Google Scholar

10. Stang, PE, Ryan, PB, Racoosin, JA, Overhage, JM, Hartzema, AG, Reich, C, et al. Advancing the science for active surveillance: rationale and design for the observational medical outcomes partnership. Ann Intern Med. (2010) 153:600–6. doi: 10.7326/0003-4819-153-9-201011020-00010,

PubMed Abstract | Crossref Full Text | Google Scholar

11. Mandel, JC, Kreda, DA, Mandl, KD, Kohane, IS, and Ramoni, RB. SMART on FHIR: a standards-based, interoperable apps platform for electronic health records. J Am Med Inform Assoc. (2016) 23:899–908. doi: 10.1093/jamia/ocv189,

PubMed Abstract | Crossref Full Text | Google Scholar

12. Beale, T, and Heard, S. An ontology-based model of clinical information. Stud Health Technol Inform. (2007) 129:760–4.

Google Scholar

13. Tan, ALM, Getzen, EJ, Hutch, MR, Strasser, ZH, Gutiérrez-Sacristán, A, Le, TT, et al. Informative missingness: what can we learn from patterns in missing laboratory data in the electronic health record? J Biomed Inform. (2023) 139:104306. doi: 10.1016/j.jbi.2023.104306,

PubMed Abstract | Crossref Full Text | Google Scholar

14. Jazayeri, A, Liang, OS, and Yang, CC. Imputation of missing data in electronic health records based on patients’ similarities. J Healthc Inform Res. (2020) 4:295–307. doi: 10.1007/s41666-020-00073-5

15. Getzen, E, Ungar, L, Mowery, D, Jiang, X, and Long, Q. Mining for equitable health: assessing the impact of missing data in electronic health records. J Biomed Inform. (2023) 139:104269. doi: 10.1016/j.jbi.2022.104269

16. Groenwold, RHH. Informative missingness in electronic health record systems: the curse of knowing. Diagn Progn Res. (2020) 4:8. doi: 10.1186/s41512-020-00077-0

17. Weiskopf, NG, Rusanov, A, and Weng, C. Sick patients have more data: the non-random completeness of electronic health records. AMIA Annu Symp Proc. (2013) 2013:1472–7.

18. Afkanpour, M, Hosseinzadeh, E, and Tabesh, H. Identify the most appropriate imputation method for handling missing values in clinical structured datasets: a systematic review. BMC Med Res Methodol. (2024) 24:188. doi: 10.1186/s12874-024-02310-6

19. Austin, PC, White, IR, Lee, DS, and Van Buuren, S. Missing data in clinical research: a tutorial on multiple imputation. Can J Cardiol. (2021) 37:1322–31. doi: 10.1016/j.cjca.2020.11.010

20. Giacobbe, DR, Signori, A, Del Puente, F, Mora, S, Carmisciano, L, Briano, F, et al. Early detection of sepsis with machine learning techniques: a brief clinical perspective. Front Med. (2021) 8:617486. doi: 10.3389/fmed.2021.617486

21. Johnson, AEW, Bulgarelli, L, Shen, L, Gayles, A, Shammout, A, Horng, S, et al. MIMIC-IV, a freely accessible electronic health record dataset. Sci Data. (2023) 10:1. doi: 10.1038/s41597-022-01899-x

22. Khadanga, S, Aggarwal, K, Joty, S, and Srivastava, J. Using clinical notes with time series data for ICU management. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Hong Kong, China: Association for Computational Linguistics; 2019. p. 6431–6436. Available online at: https://www.aclweb.org/anthology/D19-1678 (Accessed September 4, 2025).

23. Papareddy, P, Lobo, TJ, Holub, M, Bouma, H, Maca, J, Strodthoff, N, et al. Transforming sepsis management: AI-driven innovations in early detection and tailored therapies. Crit Care. (2025) 29:366. doi: 10.1186/s13054-025-05588-0

24. Jackson, N, Woods, J, Watkinson, P, Brent, A, Peto, TEA, Walker, AS, et al. The quality of vital signs measurements and value preferences in electronic medical records varies by hospital, specialty, and patient demographics. Sci Rep. (2023) 13:3858. doi: 10.1038/s41598-023-30691-z

25. Albu, E, Gao, S, Stijnen, P, Rademakers, FE, van Bussel, BCT, Collyer, T, et al. Challenges and recommendations for electronic health records data extraction and preparation for dynamic prediction modelling in hospitalized patients—a practical guide. arXiv; 2025. Available online at: http://arxiv.org/abs/2501.10240 (Accessed April 4, 2025).

26. Bennett, N, Plečko, D, Ukor, IF, Meinshausen, N, and Bühlmann, P. ricu: R’s interface to intensive care data. Gigascience. (2022) 12:giad041. doi: 10.1093/gigascience/giad041

27. Boie, SD, Meyer-Eschenbach, F, Schreiber, F, Giesa, N, Barrenetxea, J, Guinemer, C, et al. A scalable approach for critical care data extraction and analysis in an academic medical center. Int J Med Inform. (2024) 192:105611. doi: 10.1016/j.ijmedinf.2024.105611

28. Shamout, F, Zhu, T, and Clifton, DA. Machine learning for clinical outcome prediction. IEEE Rev Biomed Eng. (2021) 14:116–26. doi: 10.1109/RBME.2020.3007816

29. Van Den Berg, T, Heymans, MW, Leone, SS, Vergouw, D, Hayden, JA, Verhagen, AP, et al. Overview of data-synthesis in systematic reviews of studies on outcome prediction models. BMC Med Res Methodol. (2013) 13:42. doi: 10.1186/1471-2288-13-42

30. Hopkins, D, Rickwood, DJ, Hallford, DJ, and Watsford, C. Structured data vs. unstructured data in machine learning prediction models for suicidal behaviors: a systematic review and meta-analysis. Front Digit Health. (2022) 4:945006. doi: 10.3389/fdgth.2022.945006

31. Hollmann, N, Müller, S, Purucker, L, Krishnakumar, A, Körfer, M, Hoo, SB, et al. Accurate predictions on small data with a tabular foundation model. Nature. (2025) 637:319–26. doi: 10.1038/s41586-024-08328-6

32. Mesinovic, M, Watkinson, P, and Zhu, T. Explainable machine learning for predicting ICU mortality in myocardial infarction patients using pseudo-dynamic data. Sci Rep. (2025) 15:27887. doi: 10.1038/s41598-025-13299-3

33. Soguero-Ruiz, C, Hindberg, K, Mora-Jiménez, I, Rojo-Álvarez, JL, Skrøvseth, SO, Godtliebsen, F, et al. Predicting colorectal surgical complications using heterogeneous clinical data and kernel methods. J Biomed Inform. (2016) 61:87–96. doi: 10.1016/j.jbi.2016.03.008

34. Jensen, K, Soguero-Ruiz, C, Oyvind Mikalsen, K, Lindsetmo, RO, Kouskoumvekaki, I, Girolami, M, et al. Analysis of free text in electronic health records for identification of cancer patient trajectories. Sci Rep. (2017) 7:46226. doi: 10.1038/srep46226

35. Névéol, A, Dalianis, H, Velupillai, S, Savova, G, and Zweigenbaum, P. Clinical natural language processing in languages other than English: opportunities and challenges. J Biomed Semantics. (2018) 9:12. doi: 10.1186/s13326-018-0179-8

36. Donnelly, K. SNOMED-CT: the advanced terminology and coding system for eHealth. Stud Health Technol Inform. (2006) 121:279–90.

37. Pilowsky, JK, Choi, JW, Saavedra, A, Daher, M, Nguyen, N, Williams, L, et al. Natural language processing in the intensive care unit: a scoping review. Crit Care Resusc. (2024) 26:210–6. doi: 10.1016/j.ccrj.2024.06.008

38. Shao, Y, Taylor, S, Marshall, N, Morioka, C, and Zeng-Treitler, Q. Clinical text classification with word embedding features vs. bag-of-words features. In: 2018 IEEE International Conference on Big Data (Big Data). Seattle, WA, USA: IEEE; 2018. p. 2874–2878. Available online at: https://ieeexplore.ieee.org/document/8622345/ (Accessed November 13, 2025).

39. Singhal, K, Azizi, S, Tu, T, Mahdavi, SS, Wei, J, Chung, HW, et al. Large language models encode clinical knowledge. Nature. (2023) 620:172–80. doi: 10.1038/s41586-023-06291-2

40. Singhal, K, Tu, T, Gottweis, J, Sayres, R, Wulczyn, E, Amin, M, et al. Toward expert-level medical question answering with large language models. Nat Med. (2025) 31:943–50. doi: 10.1038/s41591-024-03423-7

41. Giesa, N, Zhumagambetov, R, Sekutowicz, M, Sikora, V, Piper, S, Balzer, F, et al. Benchmarking imputation methods on real-world clinical time series with simulated spatio-temporal missingness. In Review; 2025. Available online at: https://www.researchsquare.com/article/rs-7263571/v1 (Accessed October 31, 2025).

42. Goodwin, AJ, Eytan, D, Greer, RW, Mazwi, M, Thommandram, A, Goodfellow, SD, et al. A practical approach to storage and retrieval of high-frequency physiological signals. Physiol Meas. (2020) 41:035008. doi: 10.1088/1361-6579/ab7cb5

43. Rahman, J, Brankovic, A, Tracy, M, and Khanna, S. Exploring computational techniques in preprocessing neonatal physiological signals for detecting adverse outcomes: scoping review. Interact J Med Res. (2024) 13:e46946. doi: 10.2196/46946

44. Tang, S, Davarmanesh, P, Song, Y, Koutra, D, Sjoding, MW, and Wiens, J. Democratizing EHR analyses with FIDDLE: a flexible data-driven preprocessing pipeline for structured clinical data. J Am Med Inform Assoc. (2020) 27:1921–34. doi: 10.1093/jamia/ocaa139

45. Boie, SD, Engelhardt, LJ, Coenen, N, Giesa, N, Rubarth, K, Menk, M, et al. A recurrent neural network model for predicting activated partial thromboplastin time after treatment with heparin: retrospective study. JMIR Med Inform. (2022) 10:e39187. doi: 10.2196/39187

46. Guo, C, Lu, M, and Chen, J. An evaluation of time series summary statistics as features for clinical prediction tasks. BMC Med Inform Decis Mak. (2020) 20:48. doi: 10.1186/s12911-020-1063-x

47. Stahlschmidt, SR, Ulfengborg, B, and Synnergren, J. Multimodal deep learning for biomedical data fusion: a review. Brief Bioinform. (2022) 23:bbab569. doi: 10.1093/bib/bbab569

48. Roe, KD, Jawa, V, Zhang, X, Chute, CG, Epstein, JA, Matelsky, J, et al. Feature engineering with clinical expert knowledge: a case study assessment of machine learning model complexity and performance. PLoS One. (2020) 15:e0231300. doi: 10.1371/journal.pone.0231300

49. Christ, M, Braun, N, Neuffer, J, and Kempa-Liehr, AW. Time series featuRe extraction on basis of scalable hypothesis tests (tsfresh—a Python package). Neurocomputing. (2018) 307:72–7. doi: 10.1016/j.neucom.2018.03.067

50. Liu, L, Blake, V, Barman, M, Gallego, B, Churches, T, Kennedy, G, et al. Using natural language processing to extract information from clinical text in electronic medical records for populating clinical registries: a systematic review. J Am Med Inform Assoc. (2025):ocaf176. doi: 10.1093/jamia/ocaf176

51. Kraljevic, Z, Searle, T, Shek, A, Roguski, L, Noor, K, Bean, D, et al. Multi-domain clinical natural language processing with MedCAT: the medical concept annotation toolkit. Artif Intell Med. (2021) 117:102083. doi: 10.1016/j.artmed.2021.102083

52. Schmitt, X, Kubler, S, Robert, J, Papadakis, M, and LeTraon, Y. A replicable comparison study of NER software: StanfordNLP, NLTK, OpenNLP, SpaCy, gate. In: 2019 Sixth International Conference on Social Networks Analysis, Management and Security (SNAMS). Granada, Spain: IEEE; 2019. p. 338–343. Available online at: https://ieeexplore.ieee.org/document/8931850/ (Accessed October 20, 2025).

53. Li, Y, El Habib Daho, M, Conze, PH, Zeghlache, R, Le Boité, H, Tadayoni, R, et al. A review of deep learning-based information fusion techniques for multimodal medical image classification. Comput Biol Med. (2024) 177:108635. doi: 10.1016/j.compbiomed.2024.108635

54. Huang, SC, Pareek, A, Seyyedi, S, Banerjee, I, and Lungren, MP. Fusion of medical imaging and electronic health records using deep learning: a systematic review and implementation guidelines. NPJ Digit Med. (2020) 3:136. doi: 10.1038/s41746-020-00341-z

55. Yang, H, Kuang, L, and Xia, F. Multimodal temporal-clinical note network for mortality prediction. J Biomed Semantics. (2021) 12:3. doi: 10.1186/s13326-021-00235-3

56. Kotula, CA, Martin, J, Carey, KA, Edelson, DP, Dligach, D, Mayampurath, A, et al. Comparison of multimodal deep learning approaches for predicting clinical deterioration in ward patients: observational cohort study. J Med Internet Res. (2025) 27:e75340. doi: 10.2196/75340

57. Teles, AS, De Moura, IR, Silva, F, Roberts, A, and Stahl, D. EHR-based prediction modelling meets multimodal deep learning: a systematic review of structured and textual data fusion methods. Inf Fusion. (2025) 118:102981. doi: 10.1016/j.inffus.2025.102981

58. Guarrasi, V, Aksu, F, Caruso, CM, Di Feola, F, Rofena, A, Ruffini, F, et al. A systematic review of intermediate fusion in multimodal deep learning for biomedical applications. Image Vis Comput. (2025) 158:105509. doi: 10.1016/j.imavis.2025.105509

59. Pawar, Y, Henriksson, A, Hedberg, P, and Naucler, P. Leveraging clinical BERT in multimodal mortality prediction models for COVID-19. In: 2022 IEEE 35th International Symposium on Computer-Based Medical Systems (CBMS). Shenzhen, China: IEEE; 2022. p. 199–204. Available online at: https://ieeexplore.ieee.org/document/9867036/ (Accessed October 23, 2025).

60. Pollard, TJ, Johnson, AEW, Raffa, JD, Celi, LA, Mark, RG, and Badawi, O. The eICU collaborative research database, a freely available multi-center database for critical care research. Sci Data. (2018) 5:180178. doi: 10.1038/sdata.2018.178

61. Wang, Y, Yin, C, and Zhang, P. Multimodal risk prediction with physiological signals, medical images and clinical notes. Heliyon. (2024) 10:e26772. doi: 10.1016/j.heliyon.2024.e26772

62. Do, DK, Rockenschaub, P, Boie, S, Kumpf, O, Volk, HD, Balzer, F, et al. The impact of evaluation strategy on sepsis prediction model performance metrics in intensive care data. Intensive Care Med. (2025). doi: 10.1101/2025.02.20.25322509

63. Alcaraz, JML, Bouma, H, and Strodthoff, N. Enhancing clinical decision support with physiological waveforms — a multimodal benchmark in emergency care. Comput Biol Med. (2025) 192:110196. doi: 10.1016/j.compbiomed.2025.110196

64. Agor, JK, Li, R, and Özaltın, OY. Septic shock prediction and knowledge discovery through temporal pattern mining. Artif Intell Med. (2022) 132:102406. doi: 10.1016/j.artmed.2022.102406

65. Kim, RB, Alge, OP, Liu, G, Biesterveld, BE, Wakam, G, Williams, AM, et al. Prediction of postoperative cardiac events in multiple surgical cohorts using a multimodal and integrative decision support system. Sci Rep. (2022) 12:11347. doi: 10.1038/s41598-022-15496-w

66. Mathis, MR, Engoren, MC, Williams, AM, Biesterveld, BE, Croteau, AJ, Cai, L, et al. Prediction of postoperative deterioration in cardiac surgery patients using electronic health record and physiologic waveform data. Anesthesiology. (2022) 137:586–601. doi: 10.1097/ALN.0000000000004345

Keywords: anesthesiology and intensive care, artificial intelligence, data science & machine learning, machine learning (ML), multimodal data fusion

Citation: Boie SD, Giesa N, Sekutowicz M, Zhumagambetov R, Haufe S, Grünewald E and Balzer F (2026) Multimodal data for predictive medicine: algorithmic fusion of clinical data in anesthesiology and intensive care. Front. Med. 13:1746867. doi: 10.3389/fmed.2026.1746867

Received: 15 November 2025; Revised: 13 December 2025; Accepted: 07 January 2026; Published: 23 January 2026.

Edited by:

Ata Murat Kaynar, University of Pittsburgh, United States

Reviewed by:

Serban Stoica, University of Bristol, United Kingdom

Copyright © 2026 Boie, Giesa, Sekutowicz, Zhumagambetov, Haufe, Grünewald and Balzer. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Sebastian Daniel Boie, sebastian-daniel.boie@charite.de

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.