AUTHOR=Chen Shuhang, Wang Zhihuan, Lu Tianye, Zhu Jiayang, Zhang Chunchang, Zeng Xiangming, Wang Jiayi, Shang Zandi
TITLE=Research on global ship cargo capacity prediction based on multi-source heterogeneous data
JOURNAL=Frontiers in Marine Science
VOLUME=Volume 12 - 2025
YEAR=2025
URL=https://www.frontiersin.org/journals/marine-science/articles/10.3389/fmars.2025.1632661
DOI=10.3389/fmars.2025.1632661
ISSN=2296-7745
ABSTRACT=Maritime cargo capacity serves as a critical indicator of port efficiency and regional economic impact, yet reliable data remain constrained by operational and commercial complexities. This study addresses this gap by leveraging maritime big data to compare traditional empirical methods with machine learning approaches, integrating multi-source datasets (ship inbound/outbound records, vessel archives, and AIS data). Results demonstrate that the K-nearest neighbors (KNN) algorithm achieves 88% predictive accuracy on validation data, a 19-percentage-point improvement over conventional methods (69%). While training accuracy reached 95%, anomalous vessel operations in validation samples reduced performance to 88%, revealing the model’s sensitivity to real-world variability and underscoring the need for enhanced data preprocessing. These findings highlight machine learning’s potential to refine cargo capacity estimation while emphasizing the importance of robust data quality frameworks for operational deployment.
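
The abstract names K-nearest neighbors as the best-performing approach but does not give the features, the value of k, the distance metric, or how accuracy was computed. The following is only a minimal sketch of KNN regression under assumptions of our own: two hypothetical pre-scaled vessel features, Euclidean distance, an unweighted mean of the neighbors' cargo values, and accuracy taken as 1 minus the mean absolute percentage error. The data are made up for illustration and are not from the study.

```python
import math


def knn_predict(train_X, train_y, query, k=3):
    """Predict a target as the mean of the k nearest training targets."""
    # Euclidean distance from the query to every training row.
    dists = [(math.dist(row, query), y) for row, y in zip(train_X, train_y)]
    dists.sort(key=lambda pair: pair[0])
    nearest = dists[:k]
    return sum(y for _, y in nearest) / len(nearest)


def accuracy(y_true, y_pred):
    """Assumed metric (not confirmed by the abstract): 1 - MAPE."""
    errors = [abs(p - t) / t for t, p in zip(y_true, y_pred)]
    return 1.0 - sum(errors) / len(errors)


# Hypothetical, pre-scaled features per voyage:
# (deadweight fraction, draught ratio). Values are invented.
train_X = [(0.2, 0.3), (0.25, 0.35), (0.8, 0.9), (0.85, 0.8), (0.5, 0.5)]
train_y = [10.0, 12.0, 50.0, 48.0, 30.0]  # cargo carried, arbitrary units

# Estimate cargo for an unseen voyage from its two closest neighbors.
pred = knn_predict(train_X, train_y, (0.22, 0.32), k=2)
print(pred)
```

Because each prediction is an average over nearby training voyages, a validation voyage with unusual operations has no close neighbors that resemble it, which is one plausible reading of the drop from 95% training accuracy to 88% on validation data.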