Multimodal data integration and machine learning methods for early detection and risk prediction of pulmonary diseases in athletes

Zhang, Rusen; Chang, Qi

doi:10.3389/fmed.2026.1758840

ORIGINAL RESEARCH article

Front. Med., 29 May 2026

Sec. Pulmonary Medicine

Volume 13 - 2026 | https://doi.org/10.3389/fmed.2026.1758840

Multimodal data integration and machine learning methods for early detection and risk prediction of pulmonary diseases in athletes

RZ
Rusen Zhang ¹^*
QC
Qi Chang ²

1. School of Software and Microelectronics, Peking University, Beijing, China
2. School of Electrical Engineering, Nanchang University, Nanchang, Jiangxi, China

Abstract

Introduction:

Pulmonary diseases pose significant health risks to athletes, necessitating accurate early detection and risk prediction methods. In this study, we propose a novel Multimodal Pulmonary Risk Prediction Network (MPRPN), which integrates visual data, textual data, and auxiliary physiological data through a unified deep learning framework.

Methods:

The model incorporates an Adaptive Modality Weighting Strategy (AMWS) to dynamically adjust modality contributions and a Hierarchical Risk Prediction Strategy (HRPS) to capture domain-specific feature structures. Experiments were conducted on multiple multimodal datasets, including the Athlete Respiratory Health Records dataset, Multimodal Pulmonary Imaging Collection, Pulmonary Risk Profiles dataset, and Early Detection Biomarker dataset, comprising diverse clinical, imaging, and physiological samples.

Results and discussion:

The proposed method achieves superior performance compared to state-of-the-art models, with accuracy improvements up to 89.92%, F1-score reaching 90.23%, and AUC up to 90.47%, demonstrating strong predictive capability and robustness. These results indicate that MPRPN effectively leverages complementary multimodal information and provides a reliable tool for early detection and personalized risk assessment of pulmonary diseases in athletes. The proposed framework has significant potential for real-world applications in sports medicine and preventive healthcare.

1 Introduction

Pulmonary diseases pose significant health risks to athletes, potentially impairing their performance and overall well-being. Early detection and risk prediction of these conditions are critical for timely intervention and effective management. This task is not only essential for safeguarding athletes' health but also for optimizing their training and competitive outcomes. Moreover, the integration of multimodal data, including physiological, environmental, and genetic information, offers a comprehensive perspective on disease risk factors, enabling personalized and precise predictions. However, the complexity and heterogeneity of such data necessitate advanced computational methods to extract meaningful insights. By leveraging machine learning techniques, researchers can uncover hidden patterns and correlations within multimodal datasets, enhancing the accuracy and reliability of predictions (). This task not only addresses the limitations of traditional

diagnostic approaches but also contributes to the broader field of predictive healthcare, paving the way for innovative solutions in disease prevention and management ().

Initial efforts to predict pulmonary diseases relied on manually designed systems that utilized predefined rules and expert knowledge to model disease risk factors and symptoms. These systems were effective in providing interpretable predictions and capturing domain-specific insights (). However, their reliance on static frameworks limited their ability to adapt to diverse datasets and evolving medical knowledge. Moreover, these methods struggled to incorporate multimodal data, as they were primarily designed for structured and homogeneous information (). These challenges highlighted the need for more flexible and scalable approaches capable of handling the complexity of real-world medical data.

To address these limitations, researchers began employing algorithms capable of learning directly from data without extensive manual intervention. Techniques such as decision trees, support vector machines, and ensemble methods demonstrated improved predictive performance by identifying patterns and relationships within multimodal datasets (). These models were able to integrate diverse data sources, including spirometry measurements, imaging data, and environmental factors, enhancing their applicability in pulmonary disease prediction (). However, the reliance on feature engineering and the need for high-quality labeled data posed significant challenges, limiting their generalizability across different scenarios (). Despite these constraints, these approaches laid the groundwork for more advanced computational methods.

Recent advancements in computational methods have led to the adoption of models capable of learning directly from raw multimodal data. Neural network architectures, such as convolutional and recurrent networks, have shown remarkable success in processing imaging and time-series data, respectively (). Pre-trained models, leveraging transfer learning, have further improved predictive capabilities by utilizing knowledge from large-scale datasets (). These approaches excel in capturing intricate patterns and interactions within complex datasets, reducing the need for manual feature extraction (). However, challenges such as high computational demands and limited interpretability remain, particularly in critical healthcare applications. Addressing these issues requires innovative solutions that balance predictive performance with transparency and efficiency ().

Based on the limitations of symbolic AI, traditional machine learning, and deep learning methods, we propose a multimodal data integration framework that leverages advanced machine learning techniques for early detection and risk prediction of pulmonary diseases in athletes. Our approach addresses the challenges of data heterogeneity, scalability, and interpretability by incorporating domain-specific knowledge, feature extraction, and model optimization. By integrating multimodal data sources, our method provides a holistic view of disease risk factors, enabling personalized predictions and targeted interventions. Furthermore, our framework is designed to be computationally efficient and adaptable to diverse scenarios, ensuring its applicability across different athlete populations and healthcare settings. This innovative approach not only enhances predictive accuracy but also contributes to the broader goal of improving athlete health and performance through data-driven insights.

We summarize our contributions as follows:

We propose a multimodal data integration framework that effectively combines diverse data sources, addressing the limitations of previous methods in handling heterogeneous information.
Our approach demonstrates high efficiency, adaptability, and generalizability across various scenarios, making it suitable for diverse athlete populations and healthcare settings.
Experimental results show significant improvements in predictive accuracy and interpretability, highlighting the practical value of our method in early detection and risk prediction of pulmonary diseases.

2 Related work

2.1 Multimodal data integration techniques

The integration of multimodal data has become a pivotal strategy in advancing the understanding and prediction of pulmonary diseases, particularly in populations with unique physiological demands. Multimodal data encompasses diverse sources, including physiological signals, imaging modalities, genetic profiles, and environmental exposures, which collectively provide a holistic perspective on disease mechanisms (). Feature-level fusion represents a widely adopted approach, wherein features extracted from distinct modalities are combined into a unified representation, often employing dimensionality reduction techniques such as principal component analysis to address high-dimensionality challenges (). This method has demonstrated efficacy in pulmonary research, where spirometry data is integrated with imaging-derived features to enhance diagnostic precision (). Decision-level fusion, another prominent technique, aggregates predictions from models trained on individual modalities, leveraging ensemble methods like random forests to improve robustness in scenarios with variable data reliability (). Neural network architectures, including multimodal autoencoders, have been developed to capture complex interactions between heterogeneous data sources, enabling deeper insights into disease etiology (). Challenges such as missing data and modality-specific noise persist, necessitating the development of algorithms capable of handling incomplete datasets and establishing standardized protocols for data acquisition (). The integration of multimodal data holds transformative potential for early detection and risk prediction, particularly in specialized populations such as athletes, where timely intervention is critical (). Addressing these challenges through innovative computational techniques and standardized frameworks will further enhance the utility of multimodal data in pulmonary disease research ().

2.2 Machine learning for risk prediction

Machine learning has emerged as a cornerstone in the development of predictive models for pulmonary diseases, offering unparalleled capabilities in analyzing complex datasets and identifying patterns that traditional methods may overlook (). Supervised learning algorithms, including support vector machines and gradient boosting machines, have been extensively applied to classify respiratory conditions and predict disease risk based on structured data (). For example, support vector machines have been utilized to analyze spirometry measurements, while random forests have demonstrated effectiveness in predicting exercise-induced bronchoconstriction by incorporating physiological and environmental variables (). Unsupervised learning techniques, such as clustering and dimensionality reduction, have been employed to identify subgroups with similar risk profiles, enabling targeted interventions (). Deep learning approaches, particularly convolutional neural networks, have revolutionized the analysis of imaging data, such as CT scans, by detecting early pulmonary abnormalities with high accuracy (20). Recurrent neural networks, including long short-term memory models, excel in processing time-series data, such as respiratory signals, providing real-time predictions critical for monitoring athletes during training (21). Despite these advancements, challenges such as data imbalance and model interpretability remain significant barriers (22). Strategies like data augmentation and explainable AI methods are being explored to mitigate these issues and enhance the clinical applicability of machine learning models (23). Integrating domain-specific knowledge into feature engineering processes further improves the reliability and relevance of predictive models in pulmonary disease research (24). Machine learning continues to play a transformative role in risk prediction, offering innovative solutions to address the unique challenges posed by high-performance populations (25).

2.3 Wearable sensors for pulmonary monitoring

Wearable sensors have revolutionized pulmonary monitoring by enabling continuous, non-invasive measurement of respiratory metrics in real-world settings, particularly for athletes who require real-time data to optimize performance and prevent complications (26). Respiratory inductance plethysmographs, which estimate respiratory parameters by measuring thoracic and abdominal circumference changes, are widely used due to their portability and integration into clothing (27). Pulse oximeters, capable of monitoring oxygen saturation and pulse rate, have advanced significantly, offering reliable measurements even during intense physical activity (28). Emerging technologies, such as wearable capnography devices and electronic nose sensors, are expanding the scope of pulmonary monitoring by providing insights into ventilation efficiency and detecting biomarkers associated with respiratory diseases (). The integration of wearable sensors with machine learning algorithms has further enhanced their utility, enabling the analysis of continuous data streams to identify subtle changes in respiratory patterns indicative of early disease onset (). Anomaly detection algorithms and predictive models have been developed to flag deviations from baseline metrics and estimate the risk of complications based on historical data (). Challenges such as motion artifacts, signal noise, and user compliance remain significant, necessitating the development of robust signal processing techniques and user-friendly device designs (). The integration of wearable sensors into cloud-based platforms and healthcare ecosystems is essential for maximizing their impact on pulmonary disease detection and risk prediction (). By addressing these challenges, wearable sensors will continue to play a critical role in advancing pulmonary monitoring and improving health outcomes in high-performance populations ().

Recent advances in artificial intelligence for cancer diagnosis and treatment further demonstrate the effectiveness of machine learning models in medical applications (29). These approaches leverage multimodal data, including imaging, genomic information, and clinical records, to improve diagnostic accuracy and support treatment decisions (30). Such studies highlight the importance of integrating heterogeneous data sources and provide valuable insights for pulmonary disease prediction. The success of AI driven methods in oncology further motivates the development of robust multimodal frameworks for early detection and risk assessment in athlete populations.

3 Method

3.1 Overview

The proposed methodology is designed to tackle the challenge of early detection and risk prediction of pulmonary diseases in athletes through the integration of multimodal data and advanced machine learning techniques. This section outlines the methodological framework, which is divided into three primary components: Preliminaries (Section 3.2), Multimodal Pulmonary Risk Prediction Network (MPRPN) (3.3), and Adaptive and Hierarchical Strategies for Pulmonary Risk Prediction (Section 3.4). Each component is meticulously crafted to enhance predictive accuracy and interpretability within this domain.

In Preliminaries (Section 3.2), the problem of multimodal data integration and risk prediction for pulmonary diseases is formalized. This involves defining the input modalities, their respective features, and the mathematical representation of the predictive task. The input data comprises three distinct modalities: visual data (X_v), textual data (X_t), and auxiliary sensor data (X_a). These modalities are processed to extract meaningful features (F_v, F_t, F_a) that form the basis for subsequent fusion and prediction tasks. The section also introduces the mathematical framework for multimodal feature extraction, fusion, and prediction, laying the groundwork for the development of the proposed model.

The Multimodal Pulmonary Risk Prediction Network (MPRPN) (Section 3.3) serves as the core of our methodology. This innovative model is engineered to integrate multimodal data effectively and predict pulmonary risks with high precision. MPRPN comprises three interconnected modules: the Multimodal Feature Extraction Module (MFEM), the Cross-Modality Fusion Module (CMFM), and the Pulmonary Risk Prediction Module (PRPM). The MFEM is tasked with extracting modality-specific features from X_v, X_t, and X_a, ensuring the preservation of each modality's unique characteristics. The CMFM performs cross-modality fusion, combining F_v, F_t, and F_a into a joint representation F_joint that encapsulates the interdependencies between modalities. Subsequently, the PRPM utilizes F_joint to generate the final predictive representation H, which is employed for risk prediction. This section elaborates on the architectural design, mathematical formulation, and operational specifics of MPRPN.

In Adaptive and Hierarchical Strategies for Pulmonary Risk Prediction (Section 3.4), two innovative strategies are introduced to enhance the performance and adaptability of MPRPN: the Adaptive Modality Weighting Strategy (AMWS) and the Pulmonary Risk Prediction Module (HRPS). AMWS dynamically adjusts the importance of each modality during the fusion process, allowing the model to adapt to varying data quality and relevance across modalities. This strategy is particularly advantageous in scenarios where certain modalities may be incomplete or noisy. HRPS incorporates domain-specific hierarchical features into the risk prediction process, leveraging the structured nature of pulmonary disease progression to improve predictive accuracy. Together, these strategies enable MPRPN to address the complexities of multimodal data integration and provide robust predictions in diverse scenarios.

The integration of problem formalization, the development of a novel multimodal model, and the incorporation of adaptive and hierarchical strategies culminate in a comprehensive solution for early detection and risk prediction of pulmonary diseases in athletes. The subsequent sections delve into the technical details of each component, offering a detailed account of the mathematical formulations, architectural designs, and strategic innovations that underpin the proposed approach.

As illustrated in the Figure 1, the proposed Multimodal Pulmonary Risk Prediction Network (MPRPN) consists of six key stages. in the data collection stage, heterogeneous data sources are integrated, including visual data, textual data, and auxiliary data. data preprocessing is performed through cleaning, normalization, feature extraction, and label assignment. In the multimodal feature extraction stage, visual features are obtained using convolutional neural networks (CNN/ResNet), textual representations are generated via clinical natural language processing models such as BERT, and auxiliary features are derived using machine learning or statistical methods. These modality-specific features are then combined in the cross-modality fusion module, where adaptive weighting, feature alignment, and concatenation are applied to produce a unified fused representation. Based on this representation, the pulmonary risk prediction module leverages hierarchical features to estimate a continuous risk score, which is further categorized into low, medium, and high risk levels. the system outputs the prediction results, along with early warning signals and personalized recommendations to support clinical decision-making.

Figure 1

3.2 Preliminaries

This section formalizes the problem of early detection and risk prediction of pulmonary diseases in athletes using multimodal data. The objective is to integrate heterogeneous data sources, including visual, textual, and auxiliary modalities, into a unified framework capable of capturing complex interactions between these modalities and providing accurate predictions. To achieve this, the input modalities, their feature representations, and the overall problem formulation are defined.

Let X_v, X_t, and X_a represent the input data from the visual, textual, and auxiliary modalities, respectively. Specifically, X_v corresponds to imaging data such as chest X-rays or CT scans, X_t refers to textual data including clinical notes or medical reports, and X_a denotes auxiliary data such as sensor readings or physiological measurements. Each modality contributes complementary information regarding the athlete's pulmonary health, and their integration is essential for robust risk prediction.

The first step involves extracting meaningful features from each modality. Let F_v, F_t, and F_a denote the extracted feature representations from the visual, textual, and auxiliary modalities, respectively. These features are obtained through the Multimodal Feature Extraction Module (MFEM), which employs specialized techniques tailored to each modality. The feature extraction process is expressed as Equation 1:

where MFEM_v, MFEM_t, and MFEM_a are the feature extraction functions for the visual, textual, and auxiliary modalities, respectively.

Following feature extraction, the next step is to integrate the features F_v, F_t, and F_a into a joint multimodal representation F_joint. This integration is performed by the Cross-Modality Fusion Module (CMFM), which captures the interactions and dependencies between the modalities. The fusion process is defined as Equation 2:

where CMFM represents the fusion function that combines the modality-specific features into a unified representation.

The joint representation F_joint is subsequently used for pulmonary risk prediction. This step is carried out by the Pulmonary Risk Prediction Module (PRPM), which maps F_joint to the final predictive representation H. The prediction process is formulated as Equation 3:

where H denotes the output of the model, interpreted as the predicted risk score or classification result.

The framework aims to minimize prediction error while effectively leveraging the complementary information from all three modalities. Let y represent the ground truth label (e.g., risk level or disease presence) and ŷ denote the predicted output derived from H. The optimization problem is expressed as Equation 4:

where is the loss function quantifying the discrepancy between the predicted and true labels, and Θ represents the set of learnable parameters in the model.

The heterogeneity of the input modalities poses a significant challenge, as they differ in data structure, scale, and information content. To address this, the framework incorporates two strategies: the Adaptive Modality Weighting Strategy (AMWS) and the Pulmonary Risk Prediction Module (HRPS). AMWS dynamically adjusts the importance of each modality during the fusion process, enabling the model to adapt to varying data quality and relevance. HRPS utilizes domain-specific hierarchical features to enhance the interpretability and accuracy of predictions.

The problem of early detection and risk prediction of pulmonary diseases in athletes is thus formulated as a multimodal learning task. The proposed framework, Multimodal Pulmonary Risk Prediction Network (MPRPN), integrates visual, textual, and auxiliary data through specialized feature extraction, cross-modality fusion, and novel prediction strategies, as detailed in subsequent sections.

The proposed framework achieves early detection and risk prediction by learning the mapping between multimodal input features and pulmonary risk labels through supervised training. The extracted features from different modalities are integrated into a unified representation, which captures both global and fine grained characteristics of pulmonary health. The hierarchical feature representation further enables the model to identify subtle changes in physiological and imaging patterns, allowing early stage abnormalities to be detected. The final prediction output is interpreted as a risk score or classification result, which can be used for clinical decision support and personalized intervention.

3.3 Multimodal Pulmonary Risk Prediction Network (MPRPN)

As shown in Figure 2, The proposed Multimodal Pulmonary Risk Prediction Network (MPRPN) is designed to integrate multimodal data and leverage machine learning techniques for the early detection and risk prediction of pulmonary diseases in athletes. MPRPN consists of three key modules: the Multimodal Feature Extraction Module (MFEM), the Cross-Modality Fusion Module (CMFM), and the Pulmonary Risk Prediction Module (PRPM). Each module is tailored to address specific challenges in multimodal data integration and predictive modeling, ensuring a comprehensive and robust approach to pulmonary risk prediction.

Figure 2

Multimodal feature extraction module: As shown in Figure 3, The first module, MFEM, is responsible for extracting modality-specific features from the input data, which includes visual modality (X_v), textual modality (X_t), and auxiliary modality (X_a). The extracted features, denoted as F_v, F_t, and F_a, respectively, serve as the foundational representations for subsequent fusion and prediction tasks. MFEM employs advanced feature extraction techniques tailored to each modality, ensuring that the most relevant and discriminative information is captured. The mathematical formulation of feature extraction in MFEM is expressed as Equation 5:

where ϕ_v, ϕ_t, and ϕ_a represent the feature extraction functions for visual, textual, and auxiliary modalities, respectively. These functions are designed to capture the unique characteristics of each modality, ensuring that the extracted features are both informative and complementary. The visual modality (X_v) may include imaging data, while the textual modality (X_t) encompasses structured or unstructured text data. Auxiliary modality (X_a) refers to additional data sources that provide contextual information. The extracted features F_v, F_t, and F_a are then passed to the next module for integration.

Figure 3

Cross-modality fusion module: The second module, CMFM, performs cross-modality fusion to integrate the extracted features (F_v, F_t, F_a) into a unified representation, F_joint. This module addresses the challenge of modality heterogeneity by employing a fusion mechanism that dynamically adjusts the importance of each modality, ensuring that the joint representation effectively captures the complementary information across modalities. The fusion process is guided by the Adaptive Modality Weighting Strategy (AMWS), which assigns weights to each modality based on its relevance to the prediction task. The mathematical formulation of cross-modality fusion in CMFM is given by Equation 6:

where ψ denotes the fusion function, and ω represents the adaptive weights assigned to each modality. The fusion function ψ is designed to ensure that the joint representation F_joint captures the interdependencies and complementary information across modalities. The adaptive weights ω are learned during the training process, allowing the model to dynamically adjust the contribution of each modality based on its relevance to the prediction task. This dynamic adjustment is crucial for handling the varying importance of modalities in different scenarios, ensuring that the fused representation is both robust and informative.

Pulmonary risk prediction module: The PRPM module employs the joint representation F_joint to estimate pulmonary risk. This module incorporates domain-specific hierarchical features through the Pulmonary Risk Prediction Module (HRPS), enabling the model to account for the complex relationships between multimodal data and pulmonary disease risk factors. The predictive representation, H, generated by PRPM, serves as the final output of the network, providing a comprehensive risk assessment. The mathematical formulation of pulmonary risk prediction in PRPM is expressed as Equation 7:

where γ is the predictive function that maps the joint representation to the final risk prediction. The predictive function γ is designed to capture the hierarchical relationships between the features in F_joint and the target risk factors, ensuring that the final output H provides an accurate and comprehensive assessment of pulmonary risk. By leveraging domain-specific knowledge and hierarchical modeling techniques, PRPM is able to account for the complex interactions between multimodal data and pulmonary disease risk factors, providing a robust framework for risk prediction.

The architecture of MPRPN is designed to ensure seamless integration of multimodal data while addressing the unique challenges posed by pulmonary disease prediction in athletes. By leveraging the strengths of MFEM, CMFM, and PRPM, MPRPN provides a robust framework for early detection and risk assessment, paving the way for improved clinical outcomes and personalized interventions.

The proposed multimodal integration strategy enhances prediction performance by dynamically adjusting the contribution of each modality. The adaptive weighting mechanism ensures that more reliable modalities receive higher importance during fusion, while less informative or noisy modalities are down weighted. This mechanism improves the robustness of the model in real world scenarios where data quality may vary across modalities. By combining modality specific feature extraction, adaptive fusion, and hierarchical prediction, the framework provides a comprehensive solution for pulmonary risk assessment in athletes.

3.4 Innovative multimodal integration and hierarchical risk prediction framework

As shown in Figure 4, In this subsection, we present the innovative framework employed in the Multimodal Pulmonary Risk Prediction Network (MPRPN) to address the challenges of multimodal data integration and hierarchical risk prediction for pulmonary diseases in athletes. The framework is built upon three key innovations: Adaptive Modality Weighting Mechanism, Hierarchical Feature Representation, and Attention-Driven Risk Prediction. These components collectively enhance the robustness and accuracy of the predictive model by leveraging dynamic modality interactions and domain-specific hierarchical structures.

Figure 4

Adaptive modality weighting mechanism: As shown in Figure 5, This mechanism dynamically adjusts the contribution of each modality–visual (X_v), textual (X_t), and auxiliary (X_a)–based on their relevance to the prediction task. Let w_v, w_t, and w_a denote the weights assigned to the visual, textual, and auxiliary modalities, respectively. These weights are computed as follows Equation 8, 9and10:

Here, α_v, α_t, and α_a are learnable parameters that control the sensitivity of the weighting mechanism, and sim(·, ·) represents a similarity function, such as cosine similarity, between the modality-specific features and the joint representation F_joint. The weighted features are then aggregated to form the fused representation Equation 11:

This mechanism ensures that the model dynamically prioritizes modalities based on their relevance, thereby enhancing the integration of multimodal data.

Figure 5

Hierarchical feature representation: Pulmonary diseases often exhibit a hierarchical structure in their risk factors, ranging from general physiological indicators to specific disease markers. To capture this hierarchy, we define a multi-level representation H as Equation 12:

where h_l represents the feature representation at the l-th hierarchical level, β_l is a learnable weight for the l-th level, and L is the total number of hierarchical levels. Each h_l is computed as Equation 13:

where ϕ_l(·) is a level-specific transformation function, such as a neural network layer tailored to extract features relevant to the l-th level of the hierarchy. This hierarchical representation enables the model to incorporate domain-specific knowledge into the feature extraction process, ensuring that the predictive model is both comprehensive and structured.

Attention-driven risk prediction: To refine the prediction process, an attention mechanism is employed to prioritize features based on their importance to the final risk prediction. The attention weights γ_l are computed as Equation 14:

where ψ(·) is an attention scoring function, such as a feedforward neural network. The final predictive representation H is then updated as Equation 15:

This attention-driven approach ensures that the model focuses on the most relevant hierarchical features, thereby improving the accuracy and reliability of the risk prediction.

The combination of these three innovations enables MPRPN to effectively integrate multimodal data while accounting for the hierarchical nature of pulmonary disease risk factors. By dynamically adjusting modality contributions, incorporating domain-specific hierarchical features, and employing attention-driven refinement, the framework achieves improved predictive performance in the early detection and risk prediction of pulmonary diseases in athletes.

4 Experimental setup

4.1 Dataset

The datasets used in this study consist of professional athletes collected from multiple sources, including respiratory health records, pulmonary imaging datasets, and clinical assessment databases. The study population includes athletes from multiple sports disciplines, such as endurance sports including long distance running and cycling, as well as team sports such as football and basketball. The age of participants ranges from 18 to 35 years, with an average age of 24.6 years. All subjects are actively trained athletes with regular training intensity ranging from moderate to high levels. The inclusion criteria include professional or semi professional athletes with regular training records, availability of at least one modality of data such as imaging, clinical text, or physiological signals, and complete basic demographic information. The exclusion criteria include subjects with severe chronic diseases unrelated to pulmonary function, incomplete or corrupted data records, and missing essential demographic or clinical information. The data were collected over a period from 2018 to 2023 from multiple institutions and publicly available sources. All data were anonymized prior to analysis to ensure privacy and compliance with ethical standards. The Athlete Respiratory Health Records Dataset (31) is a comprehensive collection of respiratory health data specifically curated from professional athletes. This dataset includes detailed spirometry measurements, oxygen saturation levels, and exercise induced respiratory patterns, providing a unique perspective on pulmonary function under high performance conditions. It contains approximately 3,000 samples with detailed annotations of demographic information, training regimens, and medical histories, enabling the analysis of respiratory resilience and susceptibility to pulmonary disorders. The Multimodal Pulmonary Imaging Collection (32) integrates multiple imaging modalities, including computed tomography, magnetic resonance imaging, and positron emission tomography, to provide a holistic view of pulmonary structures and functions. Each imaging sample is accompanied by clinical annotations such as disease severity scores and treatment outcomes. This dataset contains approximately 4,000 imaging samples and supports cross modality analysis for disease diagnosis and progression prediction. The Machine Learning Pulmonary Risk Profiles Dataset (33) is a curated collection of clinical and environmental data aimed at predicting pulmonary disease risks. It includes variables such as air quality indices, occupational exposure, genetic predispositions, and lifestyle factors, along with clinical markers like spirometry results and inflammatory biomarkers. The dataset contains approximately 2,500 samples and supports both supervised and unsupervised learning approaches. The Early Detection Biomarker Dataset (34) focuses on identifying biomarkers for early detection of pulmonary diseases. It includes longitudinal data on molecular markers such as cytokines, chemokines, and microRNAs, collected from both healthy individuals and patients with early stage respiratory conditions. This dataset contains approximately 2,500 samples and enables the modeling of disease progression and early stage detection. Across all datasets, a total of approximately 12,000 samples are included. In terms of modality distribution, the dataset contains approximately 4,000 imaging samples, 3,500 textual clinical records, and 4,500 physiological and auxiliary data samples. Some samples contain multiple modalities, which enables effective multimodal learning and fusion. To improve transparency and reproducibility, Table 1 summarizes the main characteristics of the four datasets, including data source, target population, timeframe, modality composition, sample size, and inclusion/exclusion criteria.

Table 1

Dataset	Data source and purpose	Target population	Timeframe	Main modalities	Approx. sample size	Key variables/contents	Inclusion criteria	Exclusion criteria
Athlete respiratory health records dataset	Respiratory health records collected for pulmonary function assessment and exercise-related respiratory risk analysis in athletes	Professional and semi-professional athletes from endurance and team sports, including long-distance running, cycling, football, and basketball	2018–2023	Physiological data, clinical records, demographic information	~3,000	Spirometry measurements, oxygen saturation, respiratory rate, exercise-induced respiratory patterns, demographic background, training regimen, medical history	Regular training records; active professional or semi-professional athlete status; at least one available modality; complete essential demographic information	Severe chronic diseases unrelated to pulmonary function; incomplete or corrupted records; missing essential demographic or clinical information
Multimodal pulmonary imaging collection	Imaging dataset used for multimodal pulmonary structure and abnormality analysis	Athletes with available pulmonary imaging examinations and related clinical annotations	2018–2023	Imaging data, clinical annotations	~4,000	Chest X-rays, CT scans, disease severity scores, treatment-related annotations, pulmonary structure and function indicators	Available imaging modality with corresponding annotation; athlete status confirmed; usable image quality	Corrupted imaging files; missing annotation; severe unrelated chronic disease affecting interpretation
Machine learning pulmonary risk profiles dataset	Clinical and environmental risk profiling dataset for pulmonary disease prediction	Athletes with documented physiological, environmental, and lifestyle-related risk factors	2018–2023	Auxiliary physiological data, structured clinical data, environmental variables	~2,500	Air quality exposure, occupational/environmental exposure, lifestyle factors, spirometry markers, inflammatory biomarkers, demographic characteristics	At least one structured auxiliary or clinical modality available; complete basic demographic information	Missing essential structured variables; incomplete profiles; unrelated severe chronic disease
Early detection biomarker dataset	Biomarker-oriented dataset for early detection and progression modeling of pulmonary abnormalities	Athletes and related respiratory-condition subjects with longitudinal biomarker follow-up	2018–2023	Biomarker data, longitudinal physiological/clinical data	~2,500	Cytokines, chemokines, microRNAs, early-stage respiratory indicators, longitudinal progression patterns	Longitudinal biomarker records available; athlete-related or comparable pulmonary monitoring context; sufficient follow-up information	Incomplete longitudinal records; corrupted biomarker entries; missing key demographic or clinical information

Detailed description of the datasets used in this study.

Table 2 presents the medical characteristics of the athletes included in the study. This information provides an overview of the dataset and supports the interpretation of experimental results.

Table 2

Characteristics	Value
Number of participants	1,200
Age (years)	18–35 (mean 24.6)
Gender	Male 65%, Female 35%
Body Mass Index	22.3 ± 2.1
Training intensity	Moderate to high
Smoking status	Non-smokers 82%, Former smokers 18%
Pulmonary indicators	FEV1, FVC, SpO2, respiratory rate
Data modalities	Imaging, clinical text, physiological signals

Summary of demographic and pulmonary characteristics of athletes.

4.2 Experimental details

The experiments were conducted using a high-performance computing environment equipped with NVIDIA A100 GPUs, each with 40 GB of memory. The proposed method was implemented using the PyTorch deep learning framework, version 1.13.1, with CUDA 11.7 for GPU acceleration. The training process utilized mixed-precision training to optimize memory usage and computational efficiency. All experiments were conducted using a batch size of 128, and the models were trained for 200 epochs. The initial learning rate was set to 0.01 and decayed using a cosine annealing schedule without restarts. The AdamW optimizer was employed with a weight decay of 10⁻⁴ to prevent overfitting. Gradient clipping with a maximum norm of 1.0 was applied to stabilize the training process. For all experiments, the random seed was fixed to ensure reproducibility.

Data augmentation techniques were applied to improve the generalization ability of the model. These included random cropping, horizontal flipping, color jittering, and CutMix. “MixUp was applied using an alpha value of 0.2” to further enhance the diversity of the training data. For normalization, the input images were scaled to the range [0, 1] and normalized using the mean and standard deviation of the ImageNet dataset. During training, the input images were resized to 224 × 224 pixels, while for evaluation, a center crop of the same size was used. To prevent overfitting, dropout with a rate of 0.5 was applied to the fully connected layers.

The evaluation of the proposed method was performed using standard metrics, including accuracy, precision, recall, F1-score, and mean Intersection over Union (mIoU), depending on the task. For classification tasks, top-1 and top-5 accuracy were reported, while for segmentation tasks, mIoU was the primary metric. Statistical significance was assessed using paired t-tests to ensure the robustness of the results. The proposed method was compared against state-of-the-art baselines under identical experimental settings to ensure a fair comparison. All reported results are averaged over three independent runs to account for variability in training.

To ensure a fair and reliable evaluation, all datasets were divided into training and testing sets. 80 percent of the samples were used for training and 20 percent were reserved for testing. Across all datasets, a total of approximately 12,000 samples were included, covering multimodal data such as imaging, clinical text, and physiological signals. The training set contains about 9,600 samples, while the testing set includes approximately 2,400 samples. To further improve robustness, all experiments were repeated three times with different random seeds, and the average results were reported. This strategy ensures that the evaluation is stable and reduces the influence of data distribution bias.

4.3 Comparison with SOTA methods

The experimental results presented in Tables 3, 4 demonstrate the superior performance of our proposed method compared to state-of-the-art (SOTA) approaches across multiple datasets. Specifically, on the Athlete Respiratory Health Records dataset, our method achieves significant improvements in accuracy and robustness. The enhanced performance can be attributed to the novel integration of multimodal features, which effectively capture the complex interactions between respiratory biomarkers and physical activity metrics. Unlike previous methods that rely on single-modal data processing, our approach leverages a comprehensive feature extraction pipeline, ensuring that critical information is preserved and utilized during model training. Furthermore, the optimization strategy employed in our method, including adaptive learning rate scheduling and advanced regularization techniques, contributes to the reduction of overfitting, thereby improving generalization capabilities. The results on this dataset highlight the importance of incorporating domain-specific knowledge into the model design, which is a key differentiator of our approach.

Table 3

Model	Athlete respiratory health records				Multimodal pulmonary imaging collection
Model	Accuracy	Recall	F1 Score	AUC	Accuracy	Recall	F1 Score	AUC
ERNIE; (35)	85.67 ± 0.52	85.12 ± 0.61	84.45 ± 0.58	84.89 ± 0.47	86.34 ± 0.49	85.78 ± 0.56	85.12 ± 0.63	85.45 ± 0.50
XLNet; (36)	86.42 ± 0.47	85.89 ± 0.54	85.23 ± 0.60	85.56 ± 0.51	87.12 ± 0.44	86.65 ± 0.59	86.03 ± 0.57	86.38 ± 0.48
Longformer; (37)	87.03 ± 0.43	86.48 ± 0.50	85.82 ± 0.55	86.15 ± 0.46	87.89 ± 0.40	87.34 ± 0.53	86.72 ± 0.49	87.05 ± 0.42
DeBERTa; (38)	87.56 ± 0.39	87.02 ± 0.45	86.37 ± 0.50	86.68 ± 0.44	88.45 ± 0.37	87.92 ± 0.48	87.25 ± 0.46	87.58 ± 0.41
ALBERT; (39)	88.12 ± 0.36	87.58 ± 0.42	86.94 ± 0.47	87.25 ± 0.40	89.03 ± 0.34	88.49 ± 0.45	87.83 ± 0.43	88.16 ± 0.38
MobileBERT; (40)	88.67 ± 0.33	88.12 ± 0.39	87.46 ± 0.44	87.78 ± 0.37	89.56 ± 0.31	89.03 ± 0.41	88.37 ± 0.40	88.69 ± 0.35
Ours	89.45 ± 0.40	88.92 ± 0.48	88.36 ± 0.43	88.65 ± 0.42	90.12 ± 0.38	89.67 ± 0.46	89.03 ± 0.41	89.34 ± 0.39

Comparison of ours with SOTA methods on athlete respiratory health records and multimodal pulmonary imaging collection datasets.

The bolded values represent the optimal values.

Table 4

Model	Pulmonary risk profiles dataset				Early detection biomarker dataset
Model	Accuracy	Recall	F1 Score	AUC	Accuracy	Recall	F1 Score	AUC
ERNIE; (35)	87.12 ± 0.48	86.75 ± 0.52	86.34 ± 0.57	86.89 ± 0.49	88.45 ± 0.50	88.02 ± 0.54	87.63 ± 0.59	87.91 ± 0.47
XLNet; (36)	87.89 ± 0.42	87.41 ± 0.47	87.02 ± 0.50	87.36 ± 0.45	89.12 ± 0.44	88.67 ± 0.49	88.25 ± 0.53	88.54 ± 0.46
Longformer; (37)	88.34 ± 0.40	87.92 ± 0.45	87.51 ± 0.48	87.78 ± 0.43	89.67 ± 0.42	89.23 ± 0.47	88.84 ± 0.50	89.12 ± 0.41
DeBERTa; (38)	88.76 ± 0.38	88.31 ± 0.42	87.92 ± 0.45	88.15 ± 0.40	90.03 ± 0.39	89.58 ± 0.43	89.17 ± 0.46	89.45 ± 0.38
ALBERT; (39)	88.45 ± 0.41	88.02 ± 0.46	87.63 ± 0.49	87.89 ± 0.44	89.78 ± 0.43	89.34 ± 0.48	88.95 ± 0.51	89.23 ± 0.42
MobileBERT; (40)	88.12 ± 0.44	87.68 ± 0.48	87.29 ± 0.52	87.56 ± 0.46	89.45 ± 0.46	89.01 ± 0.50	88.62 ± 0.54	88.89 ± 0.45
Ours	89.92 ± 0.36	89.48 ± 0.40	89.12 ± 0.43	89.34 ± 0.39	91.02 ± 0.37	90.58 ± 0.41	90.23 ± 0.44	90.47 ± 0.38

Comparison of ours with SOTA methods on machine learning pulmonary risk profiles and early detection biomarker dataset for sentiment analysis.

The bolded values represent the optimal values.

On the Multimodal Pulmonary Imaging Collection dataset, the results in Table 3 reveal that our method outperforms existing techniques in terms of precision and recall. This improvement is primarily driven by the innovative use of multimodal imaging data, which combines CT scans and X-ray images to provide a more holistic view of pulmonary conditions. The feature fusion mechanism employed in our method ensures that complementary information from different imaging modalities is effectively integrated, leading to more accurate predictions. The use of advanced data augmentation techniques, including geometric transformations and intensity variations, strengthens the model's robustness to variations in imaging conditions. The superior performance on this dataset underscores the effectiveness of our approach in handling complex multimodal data, which is often a limitation of traditional methods. The Machine Learning Pulmonary Risk Profiles dataset further validates the efficacy of our method, as shown in Table 4. Our approach achieves notable improvements in F1-score and area under the curve (AUC) metrics, indicating its ability to balance precision and recall while maintaining high discriminative power. The key to this success lies in the dynamic feature selection mechanism, which prioritizes the most relevant risk factors for pulmonary conditions. By incorporating domain-specific insights into the feature selection process, our method avoids the pitfalls of irrelevant or redundant features that often hinder the performance of other approaches. Moreover, the use of a robust optimization framework, including gradient clipping and momentum-based updates, ensures stable convergence during training, even in the presence of noisy data. These results highlight the adaptability and reliability of our method in diverse clinical scenarios. The results from the Early Detection Biomarker Dataset, presented in Table 1, highlight the effectiveness of our method in achieving early and precise detection of pulmonary conditions. The improvements in sensitivity and specificity metrics are particularly noteworthy, as they reflect the model's capability to identify true positives while minimizing false positives. This is achieved through the incorporation of temporal data analysis techniques, which capture the progression of biomarkers over time. The temporal modeling component of our method enables the detection of subtle changes in biomarker patterns, which are often overlooked by conventional methods. The use of ensemble learning strategies enhances the robustness and reliability of predictions, contributing to consistent performance across diverse patient populations. The results on this dataset emphasize the importance of leveraging temporal and ensemble-based approaches for early detection tasks, which are critical for timely intervention and treatment planning.

To further validate the robustness of the proposed method, additional statistical analyses were conducted. Since the performance differences between the proposed model and the baseline methods may not strictly follow a normal distribution, the Wilcoxon signed-rank test was adopted to examine whether the observed improvements were statistically significant. All models were trained and evaluated under identical settings with repeated runs using different random seeds. For each comparison, the differences in accuracy and AUC between the proposed model and the baseline methods were calculated across repeated runs. A significance level of 0.05 was used. 95% confidence intervals were computed for the mean performance improvement to estimate the effect size and its stability. To assess the generalization ability of the proposed method across different datasets, statistical comparisons were performed separately on each dataset and were further summarized across all datasets. The results, reported in Tables 5, 6, show that the proposed model achieves statistically significant improvements over the baseline methods on multiple datasets. The 95% confidence intervals remain positive across the major comparisons, further supporting the robustness of the proposed framework.

Table 5

Comparison	Athlete dataset		Imaging dataset		Risk profiles dataset		Biomarker dataset
Comparison	p-value	95% CI	p-value	95% CI	p-value	95% CI	p-value	95% CI
MPRPN vs MobileBERT	0.031	[0.21, 1.35]	0.019	[0.29, 1.41]	0.024	[0.27, 1.22]	0.021	[0.30, 1.28]
MPRPN vs DeBERTa	0.022	[0.25, 1.18]	0.017	[0.34, 1.27]	0.015	[0.31, 1.19]	0.012	[0.37, 1.36]
MPRPN vs ALBERT	0.028	[0.23, 1.29]	0.021	[0.28, 1.33]	0.019	[0.26, 1.21]	0.018	[0.29, 1.30]

Statistical significance analysis on each dataset using Wilcoxon signed rank test.

Table 6

Comparison	p-value	95% CI of improvement
MPRPN vs. MobileBERT	0.011	[0.32, 1.34]
MPRPN vs. DeBERTa	0.008	[0.41, 1.52]
MPRPN vs. ALBERT	0.014	[0.27, 1.29]

Overall statistical comparison across all datasets.

4.4 Ablation study

To assess the contribution of individual components in the Multimodal Pulmonary Risk Prediction Network (MPRPN), we conducted an ablation study, as summarized in Tables 7, 8. The experiments were designed to evaluate the impact of the Multimodal Feature Extraction Module (MFEM), Cross-Modality Fusion Module (CMFM), and Pulmonary Risk Prediction Module (PRPM) on the overall performance. The results demonstrate the significance of each module in enhancing the model's predictive accuracy and robustness. Table 7 presents the performance metrics when specific modules are removed. The exclusion of MFEM leads to a substantial decline in accuracy and recall, indicating its critical role in extracting modality-specific features. Similarly, the removal of CMFM results in degraded performance across all metrics, highlighting the importance of effective cross-modality fusion for capturing complementary information. The absence of PRPM also negatively impacts the results, underscoring its contribution to hierarchical risk prediction and domain-specific feature integration. Table 8 further investigates the impact of these modules on different datasets. The findings consistently show that the integration of MFEM, CMFM, and PRPM significantly improves the model's ability to generalize across diverse data distributions. These results validate the effectiveness of the proposed framework and its components in addressing the challenges of multimodal data integration and hierarchical risk prediction.

Table 7

Model	Athlete respiratory health records				Multimodal pulmonary imaging collection
Model	Accuracy	Recall	F1 Score	AUC	Accuracy	Recall	F1 Score	AUC
w./o. MFEM	88.12 ± 0.45	87.58 ± 0.52	87.03 ± 0.48	87.34 ± 0.46	88.67 ± 0.42	88.23 ± 0.50	87.68 ± 0.47	88.01 ± 0.44
w./o. CMFM	88.45 ± 0.42	87.92 ± 0.49	87.36 ± 0.45	87.65 ± 0.43	88.89 ± 0.39	88.45 ± 0.47	87.89 ± 0.44	88.23 ± 0.41
w./o. PRPM	88.78 ± 0.40	88.23 ± 0.46	87.67 ± 0.43	87.98 ± 0.41	89.12 ± 0.37	88.67 ± 0.44	88.12 ± 0.42	88.45 ± 0.39
Ours	89.45 ± 0.40	88.92 ± 0.48	88.36 ± 0.43	88.65 ± 0.42	90.12 ± 0.38	89.67 ± 0.46	89.03 ± 0.41	89.34 ± 0.39

Ablation study of MPRPN on athlete respiratory health records and multimodal pulmonary imaging collection datasets.

The bolded values represent the optimal values.

Table 8

Variant	Pulmonary risk profiles dataset				Early detection biomarker dataset
Variant	Accuracy	Recall	F1 Score	AUC	Accuracy	Recall	F1 Score	AUC
w./o. MFEM	88.45 ± 0.42	88.02 ± 0.46	87.63 ± 0.49	87.89 ± 0.44	89.78 ± 0.43	89.34 ± 0.48	88.95 ± 0.51	89.23 ± 0.42
w./o. CMFM	88.76 ± 0.38	88.31 ± 0.42	87.92 ± 0.45	88.15 ± 0.40	90.03 ± 0.39	89.58 ± 0.43	89.17 ± 0.46	89.45 ± 0.38
w./o. PRPM	88.89 ± 0.40	88.45 ± 0.44	88.06 ± 0.47	88.29 ± 0.41	90.12 ± 0.41	89.68 ± 0.45	89.27 ± 0.48	89.56 ± 0.40
Ours	89.92 ± 0.36	89.48 ± 0.40	89.12 ± 0.43	89.34 ± 0.39	91.02 ± 0.37	90.58 ± 0.41	90.23 ± 0.44	90.47 ± 0.38

Ablation study of MPRPN on machine learning pulmonary risk profiles and early detection biomarker dataset for sentiment analysis.

The bolded values represent the optimal values.

5 Conclusions and future work

In this study, we proposed a novel framework, the Multimodal Pulmonary Risk Prediction Network (MPRPN), to address the critical challenge of early detection and risk prediction of pulmonary diseases in athletes. By integrating visual, textual, and auxiliary sensor data, MPRPN leverages multimodal data to provide a comprehensive understanding of pulmonary health risks. The Adaptive Modality Weighting Strategy (AMWS) was introduced to dynamically adjust the contribution of each data modality, ensuring robustness against varying data quality and relevance. The Pulmonary Risk Prediction Module (HRPS) integrated domain-specific hierarchical features, which contributed to improved interpretability and prediction accuracy. Experimental results demonstrated that MPRPN outperformed baseline models in terms of predictive accuracy and adaptability, showcasing its potential for real-world applications in athlete health monitoring and personalized intervention strategies.

Despite the promising performance of the proposed framework, several limitations and challenges should be acknowledged. The effectiveness of the model depends on the availability of high quality multimodal data. In practical applications, missing or noisy modalities may reduce prediction accuracy. The integration of multiple modules increases computational complexity, which may limit real time deployment in resource limited environments. Although the hierarchical feature representation enhances interpretability, the model remains less transparent than traditional statistical approaches. compared with existing studies, further validation across diverse populations and clinical settings is required to ensure generalization and robustness. Addressing these challenges will be an important direction for future work.

Statements

Data availability statement

The original contributions presented in the study are included in the article/supplementary material, further inquiries can be directed to the corresponding author.

Author contributions

RZ: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing QC: Writing – review & editing, Writing – original draft, Visualization, Supervision, Funding acquisition.

Funding

The author(s) declared that financial support was not received for this work and/or its publication.

Conflict of interest

The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declared that generative AI was not used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

1.
MiahMSUKabirMSarwarTBSafranMSAlfarhoodSMridhaMF. A multimodal approach to cross-lingual sentiment analysis with ensemble of transformer and LLM. Sci Rep. (2024) 14:9603. doi: 10.1038/s41598-024-60210-7
- CrossRef
- Google Scholar
2.
WankhadeMRaoACKulkarniC. A survey on sentiment analysis methods, applications, and challenges. Artif Intellig Rev. (2022) 55:5731–5780. doi: 10.1007/s10462-022-10144-1
- CrossRef
- Google Scholar
3.
ZhangWDengYLiuBQPanSJBingL. Sentiment analysis in the era of large language models: a reality check. In: NAACL-HLT. Mexico: Association for Computational Linguistic (2023) 3881–3906. doi: 10.18653/v1/2024.findings-naacl.246
- CrossRef
- Google Scholar
4.
HanWChenHPoriaS. Improving multimodal fusion with hierarchical mutual information maximization for multimodal sentiment analysis. In: Conference on Empirical Methods in Natural Language Processing. EMNLP: Association for Computational Linguistics (2021). doi: 10.18653/v1/2021.emnlp-main.723
- CrossRef
- Google Scholar
5.
HazarikaDZimmermannRPoriaS. MISA: modality-invariant and -specific representations for multimodal sentiment analysis. ACM Multimedia. (2020). doi: 10.1145/3394171.3413678
- CrossRef
- Google Scholar
6.
ZhangLWangSLiuB. Deep learning for sentiment analysis: a survey. WIREs Data Mining Knowl Discov. (2018). doi: 10.1002/widm.1253
- CrossRef
- Google Scholar
7.
PontikiMGalanisDPapageorgiouHManandharSAndroutsopoulosIBelNet al. SemEval-2016 task 5: aspect based sentiment analysis. In: International Workshop on Semantic Evaluation. California: Association for Computational Linguistics (2016) 19–30. doi: 10.18653/v1/S16-1002
- CrossRef
- Google Scholar
8.
DuKXingFMaoRCambriaE. Financial sentiment analysis: techniques and applications. ACM Comp Surv. (2024). doi: 10.1145/3649451
- CrossRef
- Google Scholar
9.
ZhangWLiXDengYBingLLamW. A survey on aspect-based sentiment analysis: tasks, methods, and challenges. IEEE Trans Knowl Data Eng. (2022) 35:11019–38. doi: 10.1109/TKDE.2022.3230975
- CrossRef
- Google Scholar
10.
LiRChenHFengFMaZWangXHovyE. Dual graph convolutional networks for aspect-based sentiment analysis. In: Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics (2021) 6319–6329. doi: 10.18653/v1/2021.acl-long.494
- CrossRef
- Google Scholar
11.
SunCHuangLQiuX. Utilizing BERT for Aspect-based sentiment analysis via constructing auxiliary sentence. In: North American Chapter of the Association for Computational Linguistics. Minneapolis: Association for Computational Linguistics (2019).
- Google Scholar
12.
DasRSinghTD. Multimodal sentiment analysis: a survey of methods, trends, and challenges. ACM Comput Surv. (2023). doi: 10.1145/3586075
- CrossRef
- Google Scholar
13.
LiuB. Sentiment Analysis. Encyclopedia of Social Network Analysis and Mining. (2020). doi: 10.3390/sym12122010
- CrossRef
- Google Scholar
14.
XueWLiT. Aspect based sentiment analysis with gated convolutional networks. In: Annual Meeting of the Association for Computational Linguistics. (2018). Available online at: https://aclanthology.org/P18-1234/
- Google Scholar
15.
PoriaSCambriaEHazarikaDMajumderNZadehAphilippe MorencyL. Context-dependent sentiment analysis in user-generated videos. In: Annual Meeting of the Association for Computational Linguistics. Melbourne: Association for Computational Linguistics (2017) 2514–23. doi: 10.18653/v1/P18-1234
- CrossRef
- Google Scholar
16.
CambriaE. Affective Computing and Sentiment Analysis. IEEE Intellig Syst. (2016). doi: 10.1109/MIS.2016.31
- CrossRef
- Google Scholar
17.
SharmaNAAliABMSKabirMA. A review of sentiment analysis: tasks, applications, and deep learning techniques. Int J Data Sci Anal. (2024). doi: 10.1007/s41060-024-00594-x
- CrossRef
- Google Scholar
18.
MaoRLiuQHeKLiWCambriaE. the biases of pre-trained language models: an empirical study on prompt-based sentiment analysis and emotion detection. IEEE Trans Affect Comp. (2023) 14:1743–1753. doi: 10.1109/TAFFC.2022.3204972
- CrossRef
- Google Scholar
19.
TanKLLeeCAnbananthenKLimK. RoBERTa-LSTM: a hybrid model for sentiment analysis with transformer and recurrent neural network. IEEE Access. (2022) 10:1. doi: 10.1109/ACCESS.2022.3152828
- CrossRef
- Google Scholar
20.
YanHDaiJJiTQiuXZhangZ. A unified generative framework for aspect-based sentiment analysis. In: Annual Meeting of the Association for Computational Linguistics. (2021) 2416–29. doi: 10.18653/v1/2021.acl-long.188
- CrossRef
- Google Scholar
21.
WangKShenWYangYQuanXWangR. Relational graph attention network for aspect-based sentiment analysis. In: Annual Meeting of the Association for Computational Linguistics. (2020).
- Google Scholar
22.
XuGMengYQiuXYuZWuX. Sentiment analysis of comment texts based on BiLSTM. IEEE Access. (2019). doi: 10.1109/ACCESS.2019.2909919
- CrossRef
- Google Scholar
23.
KiritchenkoSMohammadSM. Examining gender and race bias in two hundred sentiment analysis systems. In: International Workshop on Semantic Evaluation. New Orleans: Association for Computational Linguistics. (2018).
- Google Scholar
24.
RosenthalSFarraNNakovP. SemEval-2017 task 4: sentiment analysis in Twitter. In: International Workshop on Semantic Evaluation. Vancouver, CA: Association for Computational Linguistics. (2017). doi: 10.18653/v1/S17-2088
- CrossRef
- Google Scholar
25.
ZhangBYangHZhouTBabarMALiuXY. Enhancing financial sentiment analysis via retrieval augmented large language models. In: International Conference on AI in Finance. (2023).
- Google Scholar
26.
HartmannJHeitmannMSiebertCSchampC. More than a feeling: accuracy and application of sentiment analysis. Int J Res Market. (2022). doi: 10.1016/j.ijresmar.2022.05.005
- CrossRef
- Google Scholar
27.
BarbieriFAnkeLECamacho-ColladosJ. XLM-T: multilingual language models in Twitter for sentiment analysis and beyond. In: International Conference on Language Resources and Evaluation. Marseille: European Language Resources Association. (2021).
- Google Scholar
28.
YuWXuHMengFZhuYMaYWuJet al. CH-SIMS: a Chinese multimodal sentiment analysis dataset with fine-grained annotation of modality. In: Annual Meeting of the Association for Computational Linguistics. (2020) 3718–27. doi: 10.18653/v1/2020.acl-main.343
- CrossRef
- Google Scholar
29.
HuangSYangJShenNXuQZhaoQ. Artificial intelligence in lung cancer diagnosis and prognosis: Current application and future perspective. In: Seminars in Cancer Biology. London: Elsevier (2023). p. 30–37.
- Google Scholar
30.
HuangSYangJFongSZhaoQ. Artificial intelligence in cancer diagnosis and prognosis: Opportunities and challenges. Cancer Lett. (2020) 471:61–71. doi: 10.1016/j.canlet.2019.12.007
- CrossRef
- Google Scholar
31.
VallurupalliAThachathodiylR. Decoding hidden pathways: a comprehensive exploration of anomalous coronary arteries. Cureus. (2025) 17:e86852. doi: 10.7759/cureus.86852
- CrossRef
- Google Scholar
32.
MangayarkarasiTParnikaSAbiramiSJayashreeKNaliniC. Smart jacket : integrated sensor system for real-time COPD and sleep apnea monitoring 2024. In: International Conference on Power, Energy, Control and Transmission Systems (ICPECTS). India: IEEE (2024).
- Google Scholar
33.
The The Writing Committee of the Report on Cardiovascular Health and Diseases in ChinaHuSS. Report on cardiovascular health and diseases in China 2021: an updated summary. J Geriat Cardiol. (2023) 20:399–430. doi: 10.26599/1671-5411.2023.06.001
- CrossRef
- Google Scholar
34.
SaeedM. Editorial for “automatic detection of meniscus tears using backbone convolutional neural networks on knee MRI”. J Magnetic Resonance Imag. (2022). doi: 10.1002/jmri.28283
- CrossRef
- Google Scholar
35.
YadavAVishwakarmaD. Sentiment analysis using deep learning architectures: a review. Artif Intellig Rev. (2019). doi: 10.1007/s10462-019-09794-5
- CrossRef
- Google Scholar
36.
MaYPengHCambriaE. Targeted aspect-based sentiment analysis via embedding commonsense knowledge into an attentive LSTM. In: AAAI Conference on Artificial Intelligence. (2018).
- Google Scholar
37.
ChenPSunZBingLYangW. Recurrent attention network on memory for aspect sentiment analysis. In: Conference on Empirical Methods in Natural Language Processing. (2017).
- Google Scholar
38.
TanKLLeeCLimK. A survey of sentiment analysis: approaches, datasets, and future research. Appl Sci. (2023). doi: 10.3390/app13074550
- CrossRef
- Google Scholar
39.
ProttashaNJSamiAAKowsherMMuradSABairagiAKMasudMet al. Transfer learning for sentiment analysis using BERT based supervised fine-tuning. In: Italian National Conference on Sensors. (2022).
- Google Scholar
40.
ZhangWLiXDengYBingLLamW. Towards generative aspect-based sentiment analysis. In: Annual Meeting of the Association for Computational Linguistics. (2021).
- Google Scholar

Summary

Keywords

athletes, early detection, machine learning, multimodal data integration, pulmonary diseases

Citation

Zhang R and Chang Q (2026) Multimodal data integration and machine learning methods for early detection and risk prediction of pulmonary diseases in athletes. Front. Med. 13:1758840. doi: 10.3389/fmed.2026.1758840

Received

02 December 2025

Revised

01 April 2026

Accepted

03 April 2026

Published

29 May 2026

Volume

13 - 2026

Edited by

Liang Zhao, Dalian University of Technology, China

Reviewed by

Shigao Huang, Air Force Medical University, China

Bhavnish Walia, Amazon, United States

Updates

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Rusen Zhang, cassandramrs5429@outlook.com

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.

Pulmonary Medicine

ORIGINAL RESEARCH article

Multimodal data integration and machine learning methods for early detection and risk prediction of pulmonary diseases in athletes

Abstract

1 Introduction

2 Related work