AUTHOR=Li Qiucheng , Liu Fang , Zhong Jianfeng , Fang Xiaoling , Zhang Xinyi , Xiong Huizhen , Li Guangyi , Chen Honglei TITLE=Multi-cohort metagenomics reveals strain functional heterogeneity and demonstrates fecal microbial load correction improves colorectal cancer diagnostic models JOURNAL=Frontiers in Microbiology VOLUME=Volume 16 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/microbiology/articles/10.3389/fmicb.2025.1656016 DOI=10.3389/fmicb.2025.1656016 ISSN=1664-302X ABSTRACT=Introduction: Colorectal cancer (CRC) is strongly associated with alterations in the gut microbiome. While numerous studies have examined this association, most focus on genus- or species-level taxonomic classifications, overlooking functional heterogeneity at the strain level. Methods: We integrated 1,123 metagenomic samples from seven global CRC cohorts to conduct multi-level metagenome-wide association studies (MWAS). Fecal microbial load (FML) correction was applied to mitigate technical confounding. We evaluated the performance of taxonomic models at various resolutions (strain, species, and genus levels) in classifying CRC status both within and across cohorts. Results: Strain-level analysis revealed conspecific strains with divergent associations with CRC. For instance, distinct strains of Bacteroides thetaiotaomicron exhibited both protective and risk-increasing effects across different cohorts. Genomic functional annotation suggested potential mechanistic bases for these opposing roles. Correction for FML reduced confounding and significantly improved the performance of within-cohort and cross-cohort CRC classification models. 
Interestingly, genus- and species-level models demonstrated superior predictive robustness compared to strain-level models, likely due to higher microbial abundance and greater cross-population conservation at these taxonomic ranks. Conclusion: Our study underscores the biological relevance of strain-level analysis in elucidating functional diversity within the microbiome. However, higher taxonomic levels provide more robust and clinically translatable diagnostic markers for CRC. Integrating FML correction with multi-level taxonomic profiling enhances both mechanistic insight into microbiome-CRC interactions and the generalizability of diagnostic models across diverse populations.