Automatic cardiothoracic ratio calculation based on lung fields abstracted from chest X-ray images without heart segmentation

Introduction: The cardiothoracic ratio (CTR) measured on postero-anterior chest X-ray (P-A CXR) images is one of the most commonly used cardiac measurements and an indicator for the initial evaluation of cardiac diseases. However, the heart is not readily observable on P-A CXR images compared to the lung fields, so radiologists often determine the heart's right and left border points manually from the left and right lung fields adjacent to the heart. Such manual CTR measurement requires experienced radiologists and is time-consuming and laborious. Methods: This article therefore proposes a novel, fully automatic CTR calculation method based on lung fields abstracted from P-A CXR images using convolutional neural networks (CNNs), overcoming the limitations of heart segmentation and avoiding heart segmentation errors. First, the lung field mask images are abstracted from the P-A CXR images by pre-trained CNNs. Second, a novel localization method for the heart's right and left border points is proposed based on the two-dimensional projection morphology of the lung field mask images using graphics. Results: The results show that the mean x-axis distance errors of the CTR's four key points in the test sets T1 (21 × 512 × 512 static P-A CXR images) and T2 (13 × 512 × 512 dynamic P-A CXR images) based on the various pre-trained CNNs are 4.1161 and 3.2116 pixels, respectively. In addition, the mean CTR errors on the test sets T1 and T2 based on the four proposed models are 0.0208 and 0.0180, respectively. Discussion: Our proposed method achieves CTR calculation performance equivalent to the previous CardioNet model, dispenses with heart segmentation, and takes less time. It is therefore practical and feasible and may become an effective tool for the initial evaluation of cardiac diseases.


Introduction
X-ray is the most widely used primary imaging technique for routine chest and bone radiography, as it is widely available, low-cost, fast, and easy to acquire (Liu et al., 2022). Because of its fast imaging (seconds after exposure), X-ray has become the preferred imaging modality for improving work efficiency and facilitating the diagnosis of critically ill and/or emergency patients in clinical practice (Howell, 2016; Seah et al., 2021).
The relationship between the heart and lungs has been a hot topic in clinical and scientific research (Yang et al., 2022). The cardiothoracic ratio (CTR) is one of the most commonly used cardiac measurements and a common indicator for evaluating cardiac enlargement (Saul, 1919; Hada, 1995; Chang et al., 2022). Specifically, various etiologies, such as pathological changes in the heart itself and adaptive responses secondary to hemodynamic changes, lead to left and right heart enlargement and thus increase the CTR. Chest X-rays (CXRs), as an economical and convenient routine examination, can clearly display the chest wall, lung tissue, pulmonary vessels, heart, and thoracic vessels, providing a reliable basis for clinical diagnosis (Howell, 2016; Seah et al., 2021; Liu et al., 2022). Although cardiac enlargement should be diagnosed through echocardiography, follow-up and treatment can be based on posteroanterior (P-A) CXR images, and P-A CXR must be performed for the initial cardiac examination (Hada, 1995). In addition, the CTR is a predictor of heart failure progression in asymptomatic patients with cardiac diseases (Nochioka et al., 2023). Therefore, accurate CTR measurement in these vulnerable populations is crucial for precision healthcare.
Manual CTR measurement based on the P-A CXR image requires experienced radiologists and is time-consuming and laborious. Therefore, with the rapid development of artificial intelligence (AI), such as convolutional neural networks (CNNs), automatic CTR calculation methods and models based on P-A CXR images have been proposed in recent years (Li et al., 2019; Saiviroonporn et al., 2021; Jafar et al., 2022). Li et al. proposed a computer-aided technique that is more reliable and time- and labor-saving than the manual method for CTR calculation. Saiviroonporn et al. verified that an AI-only method could replace manual CTR measurement. Meanwhile, Jafar et al. proposed the CardioNet model, which achieved acceptable accuracy and competitive results across all datasets. However, these methods and models are still limited by heart segmentation. Although heart segmentation techniques based on P-A CXR images have made significant progress (Lyu and Tian, 2023), whether CTR calculation requires the specific morphology and structure of the heart remains to be studied.
Anatomically, the heart is located within the chest cavity between the left and right lungs. Specifically, about one-third of the heart lies to the right of the midline, about two-thirds to the left, and the apex points to the lower left front (Weinhaus and Roberts, 2005). The transverse diameter of the heart in the cardiothoracic ratio refers to the sum of the maximum distances from the left and right cardiac margins to the midline. However, the heart is not prominent in P-A CXR images. Therefore, if heart segmentation is insufficiently precise, segmentation errors may cause automatic CTR calculation based on the P-A CXR images to fail.
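As a minimal illustration of this definition, the CTR can be computed from the x-coordinates of the four key points used later in the paper (E1 and E2 for the right and left heart border points, D2′ and D3′ for the inner thoracic edge points). Treating the cardiac transverse diameter as x(E2) − x(E1) is our reading of the sum-of-distances-to-midline definition, and the pixel coordinates below are invented for the example:

```python
# Sketch of the CTR formula from the four key x-coordinates (pixels).
# E1/E2: right/left heart border points; D2'/D3': inner thoracic edge
# points. The cardiac diameter x(E2) - x(E1) equals the sum of the
# left and right margin-to-midline distances when E1 and E2 straddle
# the midline.

def cardiothoracic_ratio(x_e1, x_e2, x_d2, x_d3):
    """CTR = maximum transverse cardiac diameter / maximum transverse
    thoracic diameter, both measured along the x-axis."""
    cardiac = abs(x_e2 - x_e1)    # heart border to heart border
    thoracic = abs(x_d3 - x_d2)   # inner thoracic edge to inner edge
    return cardiac / thoracic

# Example with made-up pixel coordinates on a 512-pixel-wide image:
print(round(cardiothoracic_ratio(180, 340, 100, 430), 3))  # 0.485
```

A CTR above roughly 0.5 is the conventional screening threshold for cardiac enlargement, which is why small pixel errors in these four coordinates matter.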
Based on the above, a novel, fully automatic CTR calculation method based on the lung field is proposed to overcome the limitations of heart segmentation. Specifically, we train a robust and standard segmentation model of pathological lungs using multi-center training datasets of P-A CXR images and image enhancement techniques to extract the lung fields in P-A CXR images. Then, the CTR is automatically calculated from the lung field using graphics. Our contributions in this paper are briefly described as follows: (1) We propose a fully automatic CTR calculation method based on lung fields abstracted from P-A CXR images using CNNs, which overcomes the limitations of heart segmentation, avoids heart segmentation errors, and takes less time.
(2) We propose a novel localization method of the heart's right and left border points based on the two-dimensional projection morphology of the lung field mask images using graphics.
(3) The proposed automatic CTR calculation method based on lung fields abstracted from the P-A CXR images may become an effective tool for initially evaluating cardiac diseases.
Materials and methods

Materials
Here, 789 (635 + 54 + 72 + 15 + 13) sets of P-A CXR images from public CXR datasets, the Google website, and a P-A CXR video are collected in this study for training the CNNs for lung field segmentation and for automatic CTR calculation. Specifically, 21 P-A CXR images are used as a test set for evaluating the lung field segmentation models. In addition, these 21 P-A CXR images and 13 dynamic P-A CXR images are used to calculate the CTR.
Figure 1 intuitively shows the detailed distribution of these 789 P-A CXR images in each dataset. Specifically, the dataset used in this study includes six sub-datasets (D1-D6). The public dataset D1 (the Shenzhen set, Chest X-ray database) includes 635 static P-A CXR images (324 normal cases and 311 cases with manifestations of tuberculosis). The public dataset D2 (the Shenzhen set, Chest X-ray database) includes 54 static P-A CXR images (47 normal cases and 7 cases with manifestations of tuberculosis). The public datasets D3 [NIAID TB portal program dataset (Online)] and D4 [Kaggle RSNA Pneumonia Detection Challenge (Online)] together include 72 static P-A CXR images (7 cases with manifestations of tuberculosis, 60 cases with manifestations of pneumonia, and 5 normal cases). The public dataset D5 includes 15 static P-A CXR images collected from the Google website.
In addition, dataset D6 includes 13 dynamic P-A CXR images from the CXR video, which was collected from a female participant during free breathing. More specifically, the public CXR datasets D1 and D2 are collected from the website https://www.kaggle.com/datasets/kmader/pulmonary-chest-xray-abnormalities?select=ChinaSet_AllFiles. Meanwhile, the public CXR datasets D3 and D4 are collected from https://data.tbportals.niaid.nih.gov/ and https://www.kaggle.com/c/rsna-pneumonia-detection-challenge/data, respectively.

Methods
Figure 2 intuitively shows the schematic diagram of the automatic cardiothoracic ratio algorithm. Specifically, the proposed fully automatic CTR calculation method based on lung fields abstracted from the P-A CXR images includes lung field segmentation and cardiothoracic ratio calculation. Figure 2A shows that the lung field mask images are abstracted from the P-A CXR images by the trained CNNs with the connected domain (CD) algorithm. Meanwhile, Figure 2B shows the automatic CTR calculation method based on the lung field mask images.

Lung field segmentation
The organ segmentation of medical images based on CNNs has become an indispensable technology for quantitative analysis (Conze et al., 2023; Jiang et al., 2023; Ma et al., 2024). CNNs have even been applied to lung segmentation in rats for measuring lung parenchyma parameters (Yang et al., 2021). In addition, automatic lung field segmentation in routine imaging is a data diversity problem, not a methodology problem (Hofmanninger et al., 2020).
SegNet, U-Net, and improved U-Net variants have been widely applied to organ segmentation in medical images. Based on the above, we train four traditional, basic CNNs to test whether different CNN lung field segmentation models lead to differences in CTR calculation: SegNet (Badrinarayanan et al., 2017), U-Net (Ronneberger et al., 2015), and two improved U-Net variants, ResU-Net++ (Jha et al., 2019) and AttU-Net (Wang et al., 2021).
The training process of the four CNN-based lung field segmentation models is detailed below. First, the lung field label images (ground truth) of the 755 P-A CXR images are labeled and examined by three experienced radiologists using the software Labelme (v5.1.0) and ITK-SNAP (v4.0.2). Second, each CNN is trained on the 755 P-A CXR images (755 × 512 × 512 × 1) with their lung field label images (ground truth). Specifically, the 755 CXR cases include 371 normal cases, 380 abnormal cases with manifestations of tuberculosis (N = 320) and pneumonia (N = 60), and 15 unclear cases. In addition, data augmentation techniques are adopted to avoid overfitting, further improving the robustness and generalization ability of the lung field segmentation models during training (Chlap et al., 2021). The standard cross-entropy loss function is selected to calculate each model's loss and dynamically adjust each CNN's parameters. Finally, the connected domain (CD) algorithm (Zhao et al., 2010) is applied to the lung field mask images generated by each CNN to eliminate spurious masks not connected to the lung field masks.
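The CD post-processing step can be sketched as follows. This is a minimal stand-in, not the authors' implementation (the cited CD algorithm may differ): it keeps the two largest 4-connected foreground components of a binary mask, which removes small spurious blobs a CNN may produce outside the lungs.

```python
import numpy as np

def largest_components(mask, keep=2):
    """Keep the `keep` largest 4-connected foreground components of a
    binary mask; a minimal stand-in for the connected-domain (CD) step
    that removes spurious non-lung blobs from a CNN's output."""
    h, w = mask.shape
    labels = np.zeros((h, w), dtype=int)
    sizes, cur = [], 0
    for i in range(h):
        for j in range(w):
            if mask[i, j] and labels[i, j] == 0:
                # flood-fill a new component and record its size
                cur += 1
                stack, n = [(i, j)], 0
                labels[i, j] = cur
                while stack:
                    y, x = stack.pop()
                    n += 1
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        v, u = y + dy, x + dx
                        if 0 <= v < h and 0 <= u < w and mask[v, u] and labels[v, u] == 0:
                            labels[v, u] = cur
                            stack.append((v, u))
                sizes.append(n)
    if not sizes:
        return mask.astype(bool)
    keep_ids = np.argsort(sizes)[::-1][:keep] + 1
    return np.isin(labels, keep_ids)

# Toy mask: two "lungs" plus a 1-pixel artifact that gets removed.
m = np.zeros((8, 8), dtype=bool)
m[1:6, 1:3] = True   # "right lung" (10 px)
m[1:6, 5:7] = True   # "left lung" (10 px)
m[7, 7] = True       # spurious blob
clean = largest_components(m)
print(clean.sum())   # 20: the artifact pixel is gone
```

In practice a library routine such as OpenCV's connected-component labeling would replace the hand-rolled flood fill, but the filtering logic is the same.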

Experimental environment and evaluation metrics
These four traditional, basic CNNs (SegNet, U-Net, ResU-Net++, and AttU-Net) are trained with PyCharm 2017.3.3 (community edition) on Windows 10 Pro 64-bit with an NVIDIA GeForce GTX 1080 Ti GPU and 16 GB RAM. Then, each CNN's optimal lung field segmentation model is converted from the pth format to the pt format in PyCharm 2017.3.3. Finally, each CNN's optimal lung field segmentation model in the pt format is called from C++ code based on Visual Studio 2017 for lung field segmentation of the 21 static P-A CXR images (test set T1) and 13 dynamic P-A CXR images (test set T2). Similarly, the CTR algorithm is performed automatically in Visual Studio 2017.
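The pth-to-pt conversion step can be sketched with TorchScript tracing; this is a minimal sketch, not the authors' exact code, and the tiny stand-in network below replaces the paper's trained SegNet/U-Net/ResU-Net++/AttU-Net, whose weights would normally be restored from a .pth state dict first:

```python
import os
import tempfile
import torch
import torch.nn as nn

# Stand-in for a trained lung-field CNN (illustrative only).
model = nn.Sequential(nn.Conv2d(1, 1, 3, padding=1), nn.Sigmoid()).eval()

example = torch.rand(1, 1, 512, 512)        # dummy 512x512 grayscale CXR
scripted = torch.jit.trace(model, example)  # freeze graph and weights

path = os.path.join(tempfile.gettempdir(), "lung_seg.pt")
scripted.save(path)  # this .pt file is what C++/libtorch loads via torch::jit::load

# Round-trip check: the reloaded module produces a same-shaped mask map.
out = torch.jit.load(path)(example)
print(out.shape)  # torch.Size([1, 1, 512, 512])
```

On the Visual Studio side, the saved file would be consumed with libtorch's `torch::jit::load`, which is the standard route for calling a traced PyTorch model from C++.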
This study selects six standard evaluation metrics for each lung field segmentation model: accuracy, precision, recall, Dice, intersection over union (IoU), and the median 95th percentile Hausdorff distance (HD) (Yang et al., 2023; Zeng et al., 2023). Four x-axis distance errors are calculated between the detected right and left heart border points and thoracic inner edge points and their ground truths. Furthermore, the error between the calculated CTR and its ground truth is also computed to evaluate the proposed method. Specifically, the x-axis distance and CTR error metrics are calculated by Equations 10 and 11.
x_error = x_DP − x_GT, (10)

CTR_error = CTR_C − CTR_GT, (11)

where x_DP represents the horizontal coordinates x(E1), x(E2), x(D2′), and x(D3′) of the detected points E1, E2, D2′, and D3′, respectively; x_GT represents the ground truth of these horizontal coordinates; and CTR_C and CTR_GT represent the calculated value and the ground truth of the CTR, respectively.
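A short sketch of these two error metrics as code, assuming signed per-point errors (the mean/SD plots in Figures 10, 11 are then built from their absolute values); the detected and ground-truth coordinates below are invented for the example:

```python
# Equations 10-11 as code: signed x-axis distance error per key point
# and the CTR error.

def x_error(x_dp, x_gt):
    return x_dp - x_gt      # Eq. 10: detected minus ground-truth x

def ctr_error(ctr_c, ctr_gt):
    return ctr_c - ctr_gt   # Eq. 11: calculated minus ground-truth CTR

# Example: detected vs. ground-truth x-coordinates of E1, E2, D2', D3'
det = [181, 338, 103, 428]
gt = [180, 340, 100, 430]
errs = [x_error(d, g) for d, g in zip(det, gt)]
print(errs)                                   # [1, -2, 3, -2]
print(sum(abs(e) for e in errs) / len(errs))  # mean absolute error: 2.0
```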

Results
Figures 4, 5 show the visualized lung field segmentation results and evaluation metrics of the test set T1 based on the various trained CNN models. In addition, Figure 6 shows the visualized lung field segmentation results of the test set T2 based on these trained CNN models. These results indicate that all CNNs perform well in lung field segmentation for static and dynamic P-A CXR images.
To quantitatively evaluate the key point detection results of Figures 7-9, Table 1 reports the mean x-axis distance errors of the key points in the test sets T1 and T2 based on the various trained CNN models. In addition, Figures 10, 11 show the visualized mean x-axis distance errors of the key points on the test sets T1 and T2. The mean with SD plots in Figures 10, 11 are drawn from the absolute values of these x-axis distance errors.
Specifically, the mean x-axis distance errors of the key points in the test set T1 based on the four trained CNN models are 3.8213, 4.1788, 4.3808, and 4.0833 pixels, respectively. Meanwhile, the mean x-axis distance errors of the key points in the test set T2 based on the four trained CNN models are 3.4423, 2.7115, 2.7310, and 3.9615 pixels, respectively. The mean x-axis distance errors of the key points in the test sets T1 and T2 over all trained CNN models are 4.1161 and 3.2116 pixels, respectively. The overall mean x-axis distance error of the key points in the test sets T1 and T2 is approximately 3.6639 [≈ (4.1161 + 3.2116)/2] pixels. Therefore, the deviation degree along the x-axis is about 0.72% (3.6639/512) on all 512 × 512 P-A CXR images. Subsequently, Table 2 compares the mean CTR error in the test sets T1 and T2 between the previous method (CardioNet) (Jafar et al., 2022) and our proposed models. In addition, Figure 12 shows the visualized CTR of the test sets T1 and T2 based on the various models.
Table 3 compares the mean segmentation time, mean CTR calculation time, and mean total time on the test sets T1 and T2 between the previous and our proposed models. Specifically, the trained CNN models run on the GPU to segment the lung field and/or heart, and the CTR calculation algorithm based on the lung field and/or heart mask images then runs on the CPU. The previous CardioNet takes more mean segmentation time, mean CTR calculation time, and mean total time on the test sets T1 and T2 than our proposed models. However, when the GPU runs the segmentation task for the first CXR image, it requires additional time to configure and load the corresponding model. For example, it takes 4,897/4,885 ms (CardioNet), 951/979 ms (SegNet), 2,249/2,350 ms (U-Net), 2,182/2,226 ms (ResU-Net++), and 2,144/2,158 ms (AttU-Net) when the GPU runs the segmentation task for the first CXR image of the test set T1/T2.

Discussion
This section conducts the following discussion and points out this study's limitations and future directions based on the experimental results. The CNN lung field segmentation model trained on inspiratory chest computed tomography (CT) images has been applied to the lung field segmentation of both inspiratory and expiratory chest CT images, achieving good performance (Deng et al., 2024; Wang et al., 2024). Similarly, the lung field segmentation models trained on static P-A CXR images also demonstrated good performance in lung field segmentation of dynamic P-A CXR images. This provides a necessary foundation for quantitative analysis of dynamic P-A CXR images, such as CTR calculation. Specifically, CNNs have also played a crucial role in semantic segmentation, where the goal is to assign a class label to each pixel in an image, enabling pixel-level understanding and overcoming the limitations of traditional approaches (Robert, 2024). In addition, a robust and standard segmentation model of pathological lungs is crucial for quantitative analysis of the lungs based on P-A CXR images. However, generalizing lung field segmentation models based on P-A CXR images has always been a significant engineering problem in clinical applications (Rajaraman et al., 2024). The main reason for this problem is the lack of cross-center P-A CXR images and their diversity. Data augmentation enriches the training set of static P-A CXR images and relieves the generalization problem of lung field segmentation models (Hasan et al., 2024; Kiruthika and Khilar, 2024). However, static P-A CXR images from a single center still limit the generalization of lung field segmentation models. Meanwhile, the diversity of pathological static P-A CXR images in the training set is also essential for improving generalization, enabling CNNs to learn more prior knowledge. Cross-center static P-A CXR images, combined with data diversity and data augmentation techniques, may fundamentally solve the generalization problem of lung field segmentation models.

The lung field morphology for automatic and precise CTR calculation
The positional relationship between the lung field and the heart on the P-A CXR images is the basis for calculating the CTR based on the lung field. Specifically, careful observation and analysis of the P-A CXR images show that the right and left cardiophrenic angles C1 and C2 are two relatively prominent points, which makes automatic CTR calculation based on the P-A CXR images possible. Significantly, the right cardiophrenic angle C1 is the farthest point from the line connecting the right apex pulmonis A1 and the costophrenic angle B1, which helps determine the highest point D1 on the right hemi-diaphragm between the costophrenic angle B1 and the right cardiophrenic angle C1. Meanwhile, the right and left heart border points E1 and E2 are closely adjacent to the left and right lung fields and form inward indentations on the opposite sides of the left and right lung fields on the P-A CXR images. Therefore, this facilitates locating the right and left heart border points E1 and E2 on the lung mask edges.
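The farthest-point criterion for C1 can be sketched with the 2D cross-product distance formula; the point coordinates below are illustrative, not measured from a CXR:

```python
import numpy as np

def farthest_from_line(points, a, b):
    """Return the point with maximum perpendicular distance to the
    line through a and b (2D cross-product distance formula)."""
    a = np.asarray(a, float)
    b = np.asarray(b, float)
    pts = np.asarray(points, float)
    ab = b - a
    # |ab x ap| / |ab| gives the perpendicular distance of each point
    dist = np.abs(ab[0] * (pts[:, 1] - a[1]) - ab[1] * (pts[:, 0] - a[0]))
    dist /= np.hypot(ab[0], ab[1])
    return tuple(int(v) for v in pts[np.argmax(dist)])

a1 = (200, 50)    # right apex pulmonis A1 (x, y), illustrative
b1 = (120, 400)   # right costophrenic angle B1, illustrative
edge = [(150, 200), (250, 300), (140, 380)]  # candidate lung-edge points
print(farthest_from_line(edge, a1, b1))      # (250, 300)
```

On a real mask, `edge` would be the ordered right-lung edge points between A1 and B1, and the returned point would be taken as C1.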

Providing the possibility for the analysis and evaluation of dynamic CTRs
Dynamic CTRs can directly reflect the relationship between changes in the maximum transverse diameter of the chest during respiration and the maximum transverse diameter of the heart in different cardiac cycles. The chest can be imaged while breathing is controlled autonomously during a chest X-ray; however, the heartbeat cannot be controlled autonomously while chest X-rays are performed. Therefore, the models developed above may provide evidence for the possibility of analyzing and evaluating dynamic CTRs. Specifically, the corresponding P-A CXR images of different cardiac cycles can be obtained by controlling the breathing state at any moment during the breathing process, such as deep inhalation, deep exhalation, or breath holding. Subsequently, dynamic CTRs for different cardiac cycles can be calculated for clinical analysis and evaluation. Meanwhile, as with inspiratory and expiratory chest CT images (Deng et al., 2024; Wang et al., 2024), P-A CXR images can be obtained during breath holding at deep inhalation and deep exhalation to analyze the difference in CTR between the two states.

Limitations and future research directions
Although we propose an automated CTR calculation technique based on lung field models from an engineering perspective, our research still has certain limitations. First, the pathological lung image types used for training the CNNs are insufficient. Second, we only explain the principle of automatically calculating the CTR from the lung fields using graphics and achieve the detection of dynamic CTRs. However, we do not have sufficient dynamic P-A CXR images to further analyze the association between dynamic CTRs and specific lung or heart diseases from a clinical perspective, for example, differences in dynamic CTRs under different GOLD classifications in chronic pulmonary heart disease caused by chronic obstructive pulmonary disease. Based on the above, we encourage researchers to collect more dynamic P-A CXR images of different lung diseases to further improve the lung field segmentation model and hope to discover more clinically significant facts based on dynamic CTRs.

Conclusion
We propose an automatic CTR calculation model based on lung fields abstracted from P-A CXR images without heart segmentation. First, the lung field mask images are abstracted from the P-A CXR images by the pre-trained CNNs. Second, a novel localization method for the heart's right and left border points is proposed based on the two-dimensional projection morphology of the lung field mask images using graphics. The results show that the mean x-axis distance errors of the CTR's four key points in the test sets T1 and T2 based on these pre-trained CNNs are 4.1161 and 3.2116 pixels, respectively. In addition, the mean CTR errors on the test sets T1 and T2 based on the four proposed models are 0.0208 and 0.0180, respectively. Our proposed model achieves CTR calculation performance equivalent to the previous CardioNet model and takes less time. Therefore, our proposed method is practical and feasible and may become an effective tool for the initial evaluation of cardiac diseases.

FIGURE 1
FIGURE 1 Distribution of the dataset in this study. (A) Distribution of case classification in each dataset and (B) distribution of abnormal cases in each dataset.

FIGURE 2
FIGURE 2 Automatic cardiothoracic ratio algorithm schematic diagram. (A) Lung field segmentation and (B) cardiothoracic ratio calculation.

FIGURE 3
FIGURE 3 Schematic diagram of the automatic cardiothoracic ratio calculation based on the lung edge images.

FIGURE 4
FIGURE 4 Visualized lung field segmentation results of the test set T1 based on various trained CNN models. (A) Twenty-one static P-A CXR images; (B) twenty-one static P-A CXR images with their lung field label images (ground truth); (C) twenty-one static lung field mask images based on SegNet; (D) twenty-one static lung field mask images based on U-Net; (E) twenty-one static lung field mask images based on ResU-Net++; and (F) twenty-one static lung field mask images based on AttU-Net.

Figure 3
Figure 3 intuitively shows the schematic diagram of the automatic cardiothoracic ratio algorithm. First, the right and left lungs are identified based on the lung field mask images. Specifically, the largest and second-largest lung field areas in each lung field mask image are identified as the right and left lung mask images, respectively. Second, edge detection is performed separately on the right and left lung images, obtaining the right and left lung mask edge images. Finally, the key points required for the CTR calculation are located on the lung mask edge images.
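The first two steps can be sketched as follows, assuming the mask already contains exactly two labeled lung blobs; the toy mask and the mask-minus-erosion edge detector stand in for whatever labeling and edge-detection routines the authors actually used:

```python
import numpy as np

def split_and_edges(labels):
    """`labels` is an int image where 1 and 2 mark the two lung blobs
    (e.g. the output of a connected-component pass). Returns (right,
    left) edge masks, with the larger blob treated as the right lung."""
    areas = [(labels == k).sum() for k in (1, 2)]
    order = (1, 2) if areas[0] >= areas[1] else (2, 1)
    edges = []
    for k in order:
        m = labels == k
        interior = m.copy()
        # 4-neighbour erosion: a pixel stays only if all 4 neighbours are set
        interior[1:, :] &= m[:-1, :]
        interior[:-1, :] &= m[1:, :]
        interior[:, 1:] &= m[:, :-1]
        interior[:, :-1] &= m[:, 1:]
        edges.append(m & ~interior)  # edge = mask minus eroded mask
    return edges[0], edges[1]

lab = np.zeros((10, 10), dtype=int)
lab[1:8, 1:5] = 1   # 28-pixel blob -> "right lung"
lab[2:7, 6:9] = 2   # 15-pixel blob -> "left lung"
right_edge, left_edge = split_and_edges(lab)
print(right_edge.sum(), left_edge.sum())  # 18 12
```

The key-point search of the final step then runs along these one-pixel-wide edge curves.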

FIGURE 6
FIGURE 6 Visualized lung field segmentation results of the test set T2 based on various trained CNN models. (A) Thirteen dynamic P-A CXR images; (B) thirteen dynamic lung field mask images based on SegNet; (C) thirteen dynamic lung field mask images based on U-Net; (D) thirteen dynamic lung field mask images based on ResU-Net++; and (E) thirteen dynamic lung field mask images based on AttU-Net.

FIGURE 7
FIGURE 7 Typical key point detection results of a normal case in the test set T1 based on various trained CNN models. (A) SegNet; (B) U-Net; (C) ResU-Net++; and (D) AttU-Net.

FIGURE 8
FIGURE 8 Typical key point detection results of a tuberculosis case in the test set T1 based on various trained CNN models. (A) SegNet; (B) U-Net; (C) ResU-Net++; and (D) AttU-Net.

FIGURE 9
FIGURE 9 Typical key point detection results of a dynamic case in the test set T2 based on various trained CNN models. (A) SegNet; (B) U-Net; (C) ResU-Net++; and (D) AttU-Net.

FIGURE 10
FIGURE 10 Visualized mean distance errors at the x-axis direction of key points on the test set T1. (A) Distribution map of key points' distance errors based on SegNet; (B) mean with an SD plot of key points' distance errors based on SegNet; (C) distribution map of key points' distance errors based on U-Net; (D) mean with an SD plot of key points' distance errors based on U-Net; (E) distribution map of key points' distance errors based on ResU-Net++; (F) mean with an SD plot of key points' distance errors based on ResU-Net++; (G) distribution map of key points' distance errors based on AttU-Net; and (H) mean with an SD plot of key points' distance errors based on AttU-Net.

FIGURE 11
FIGURE 11 Visualized mean distance errors at the x-axis direction of key points on the test set T2. (A) Distribution map of key points' distance errors based on SegNet; (B) mean with an SD plot of key points' distance errors based on SegNet; (C) distribution map of key points' distance errors based on U-Net; (D) mean with an SD plot of key points' distance errors based on U-Net; (E) distribution map of key points' distance errors based on ResU-Net++; (F) mean with an SD plot of key points' distance errors based on ResU-Net++; (G) distribution map of key points' distance errors based on AttU-Net; and (H) mean with an SD plot of key points' distance errors based on AttU-Net.

FIGURE 12
FIGURE 12 Visualized CTR of the test sets T1 and T2 based on various models. (A) CTR distribution map of the test set T1; (B) mean with an SD plot of CTR on the test set T1; (C) CTR distribution map of the test set T2; and (D) mean with an SD plot of CTR on the test set T2.

TABLE 1
Mean distance errors at the x-axis direction of key points in the test sets T1 and T2 based on various trained CNN models.

TABLE 2
Comparison of the mean CTR error in the test sets T1 and T2 based on previous and our proposed models.

TABLE 3
Comparison of the mean segmentation time, mean CTR calculation time, and mean total time of the test sets T1 and T2 based on the previous (Jafar et al., 2022) and our proposed models.

The right and left heart border points E1 and E2 are located on the lung mask edges from the right and left cardiophrenic angles C1 and C2 to the intersection points C1′ and C2′, respectively. Based on the above, the two-dimensional projection morphology of the lung field enables automatic and precise CTR calculation, overcoming the limitations of heart segmentation and avoiding heart segmentation errors. In addition, our proposed models take less time to calculate the CTR because they segment only the lung field, compared with the previous CardioNet model. Our proposed model achieves CTR calculation performance equivalent to the previous CardioNet model (Jafar et al., 2022).