AUTHOR=Ammar Ali, Zhu Libing, Bryan Shep, Yu Nathan Y., Vargas Carlos, Rong Yi, Chen Quan

TITLE=Evaluating the impact of different deface algorithms on deep learning segmentation software performance

JOURNAL=Frontiers in Oncology

VOLUME=Volume 15 - 2025

YEAR=2025

URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2025.1603593

DOI=10.3389/fonc.2025.1603593

ISSN=2234-943X

ABSTRACT=Introduction: Data sharing is essential for advancing research in radiation oncology, particularly for training artificial intelligence (AI) models in medical imaging. However, privacy concerns necessitate de-identification of medical images, including defacing operations to remove facial features. This study evaluates the impact of defacing on AI-driven organ segmentation in head-and-neck (HN) computed tomography (CT) images. Methods: Two defacing algorithms, DeIdentifier and mri_reface_0.3.3, were applied to 50 patient CT scans. Segmentation accuracy was assessed using two commercially available AI segmentation tools, INTContour and AccuContour®, and evaluated using Dice similarity coefficient (DSC), Hausdorff Distance at the 95th percentile (HD95), and Surface Dice Similarity Coefficient (SDSC) with 2 mm tolerance. Dose differences (D0.01cc) were calculated for each structure to evaluate potential clinical implications. Statistical comparisons were made using paired t-tests (p<0.05). Results: Defacing significantly impacted segmentation of on-face structures (e.g., oral cavity, eyes, lacrimal glands) with reduced DSC (<0.9) and higher HD95 (>2.5 mm), while off-face structures (e.g., brainstem, spinal cord) remained largely unaffected (DSC >0.9, HD95 <2 mm). DeIdentifier better preserved Hounsfield Units (HU) and anatomical consistency than mri_reface, which introduced more variability, including HU shifts in air regions. Minor differences in segmentation accuracy were observed between defacing algorithms, with mri_reface showing slightly greater variability. AccuContour showed slightly greater segmentation variability than INTContour, particularly for small or complex structures. Dose distribution analysis revealed minimal differences (<20 cGy) in most structures, with the largest variation observed in the Brainstem (34 cGy), followed by Lips_NRG (28 cGy) and Brain (25 cGy). Conclusion: These findings suggest that while defacing alters segmentation accuracy in on-face regions, its overall impact on off-face structures and radiation therapy planning is minimal. Future work should explore domain adaptation techniques to improve model robustness across defaced and non-defaced datasets, ensuring privacy while maintaining segmentation integrity.