Analysis of Mutations and Dysregulated Pathways Unravels Carcinogenic Effect and Clinical Actionability of Mutational Processes

Somatic mutations accumulate over time in cancer cells as a consequence of mutational processes. However, the role of mutational processes in carcinogenesis remains poorly understood. Here, we infer the causal relationship between mutational processes and somatic mutations in 5,828 samples spanning 34 cancer subtypes. We found that most mutational processes cause abundant recurrent mutations in cancer genes, whereas ultraviolet exposure and altered activity of the error-prone polymerase are exceptions that produce a large number of recurrent non-driver mutations. Furthermore, some mutations are specifically induced by a particular mutational process, such as IDH1 p.R132H, which is mainly caused by spontaneous deamination of 5-methylcytosine. At the pathway level, clock-like mutational processes extensively trigger mutations that dysregulate cancer signal transduction pathways. In addition, the APOBEC mutational process disrupts the DNA double-strand break repair pathway, and bladder cancer patients with high APOBEC activity, although homologous recombination proficient, show significantly longer overall survival with platinum regimens. These findings help to explain how mutational processes act on the genome to promote carcinogenesis and present novel insights for cancer prevention and treatment: as our results show, APOBEC mutagenesis and homologous recombination deficiency (HRD) synergistically contribute to the clinical benefits of platinum-based treatment.

Correlations between the number of cancer gene mutations and total mutations caused by mutational processes.
The y-axes and x-axes show the numbers of cancer gene mutations and total mutations attributed to each mutational process, respectively. Each panel corresponds to a mutational process in a cancer type or subtype. Each dot represents a sample. The lines represent the linear relationship calculated by robust linear regression, and the 95% confidence intervals for the slopes are shown in lighter gray shading.

Fig. S5. Recurrent mutations are shaped by some mutational processes.
Barplots and word clouds illustrate the recurrent mutation landscapes for SBS3, SBS18, SBS17, SBS12, SBS16, and SBS22. Each barplot depicts a recurrence pattern: mutations are binned by their recurrence frequency, and the height of each bar represents the fraction of mutations in each cancer type or subtype. Mutations inside and outside cancer genes are shown in red and gray, respectively, and their counts are given in parentheses, separated by commas, for each cancer type or subtype. Word clouds show high-frequency mutations occurring at least 6 times in a cancer type or subtype; word size is proportional to the number of mutations, and words are colored by cancer type or subtype. Note that some mutational processes are not found to cause high-frequency mutations. The red numbers indicate that the mutational process causes high-frequency recurrent mutations in a cancer type.
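The recurrence binning described in the barplot caption above can be illustrated with a minimal sketch. It rests on assumptions not stated in the caption: each mutation is counted once per sample, the bar height is the fraction of distinct mutations in each recurrence bin, and the function name `recurrence_profile`, the input layout, and the toy gene `GENE_X` are hypothetical.

```python
from collections import Counter

def recurrence_profile(mutation_calls, cancer_genes):
    """Bin mutations by how many samples carry them (hypothetical helper).

    mutation_calls: iterable of (sample_id, gene, mutation) tuples for one
                    mutational process in one cancer type (assumed layout).
    cancer_genes:   set of gene symbols treated as cancer genes.
    Returns {(recurrence, label): fraction of distinct mutations}.
    """
    # Drop duplicate calls so a mutation counts at most once per sample.
    unique_calls = set(mutation_calls)
    # Recurrence = number of samples carrying the same (gene, mutation).
    recurrence = Counter((gene, mut) for _, gene, mut in unique_calls)

    bins = Counter()
    for (gene, _mut), n_samples in recurrence.items():
        label = "cancer_gene" if gene in cancer_genes else "other"
        bins[(n_samples, label)] += 1

    total = sum(bins.values())
    return {key: count / total for key, count in bins.items()}

# Toy input: GENE_X and its mutation are made up for illustration only.
calls = [
    ("s1", "IDH1", "p.R132H"),
    ("s2", "IDH1", "p.R132H"),
    ("s3", "GENE_X", "p.X1Y"),
]
profile = recurrence_profile(calls, cancer_genes={"IDH1"})
```

In this toy input, IDH1 p.R132H falls in the recurrence-2 bin for cancer genes and the made-up mutation falls in the recurrence-1 bin outside cancer genes.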

Fig. S7. The relation between mutational processes and high-frequency mutations.
The network illustrates the associations between mutational processes and high-frequency mutations. Red circles and blue rectangles correspond to mutations and mutational processes, respectively. The size of each node is proportional to its degree in the network. The thickness of the edge connecting two nodes is proportional to the number of associations in all samples.

Fig. S8. Specific mutations induced by a certain mutational process.
Pie charts show the composition of mutational processes contributing to a mutation in a cancer type; the size of each pie is proportional to the number of mutations. Significant pies (adjusted p < 0.05) are filled by the mutational process. The y coordinate of each pie center reflects the mutation frequency in the corresponding cancer type; the x coordinate is determined by the normalized entropy (see Methods).
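The Methods are not reproduced in this section, so the exact definition of the normalized entropy is not given here. A minimal sketch, assuming it is the Shannon entropy of the per-process contribution to a mutation divided by its maximum possible value, log(K) for K mutational processes (the function name is hypothetical):

```python
import math

def normalized_entropy(contributions):
    """Shannon entropy of a composition, scaled to [0, 1] (assumed definition).

    contributions: non-negative counts or weights, one per mutational process.
    Returns 0 when a single process accounts for everything and 1 when all
    processes contribute equally.
    """
    total = sum(contributions)
    k = len(contributions)
    if total <= 0 or k < 2:
        return 0.0
    # Zero contributions add nothing to the entropy and are skipped.
    probs = [c / total for c in contributions if c > 0]
    entropy = sum(-p * math.log(p) for p in probs)
    return entropy / math.log(k)

single_process = normalized_entropy([12, 0, 0])   # one process dominates
evenly_mixed = normalized_entropy([4, 4, 4])      # equal contributions
```

Under this assumption, a mutation attributed almost entirely to one mutational process has a value near 0, and a mutation with evenly mixed attribution has a value near 1.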

Fig. S9. Overview of identifying pathways affected by the mutational process.
For each mutational process, we first prioritized genes in each cancer type. A ranked hypergeometric test was then used to find enriched pathways. Finally, pathways were obtained by integrating the evidence from all cancer types. This schematic is based on Paczkowska et al. (Nature Communications, 2020).

Fig. S11. Pathways affected by clock-like mutational processes.
Enrichment map of some pathways affected by (A) SBS1, (B) SBS5, and (C) SBS40. Nodes in the network represent pathways and are colored by cancer-type evidence. The node size indicates the number of genes in a pathway. Similar pathways with many common genes are connected.
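The ranked hypergeometric test named in the Fig. S9 caption can be sketched as follows. This is one common formulation, not necessarily the authors' implementation: the test is evaluated on every top-n prefix of the ranked gene list and the smallest p-value is kept. The function names, the background size, and the toy gene labels are assumptions, and the pathway is assumed to lie entirely within the background.

```python
from math import comb

def hypergeom_sf(k, population, successes, draws):
    """P(X >= k) when drawing `draws` genes without replacement."""
    upper = min(successes, draws)
    favorable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favorable / comb(population, draws)

def ranked_hypergeometric(ranked_genes, pathway, background_size):
    """Return (best p-value, prefix length) over all top-n prefixes.

    ranked_genes:    genes ordered from most to least prioritized.
    pathway:         gene set to test for enrichment.
    background_size: number of genes in the background (assumed input).
    """
    pathway = set(pathway)
    best_p, best_n, hits = 1.0, 0, 0
    for n, gene in enumerate(ranked_genes, start=1):
        if gene in pathway:
            hits += 1
        if hits == 0:
            continue  # no overlap yet, so the p-value is 1
        p = hypergeom_sf(hits, background_size, len(pathway), n)
        if p < best_p:
            best_p, best_n = p, n
    return best_p, best_n

# Toy example with placeholder gene labels.
p_value, prefix_len = ranked_hypergeometric(
    ["A", "B", "C", "D"], pathway={"A", "B"}, background_size=10
)
```

In the toy example the strongest enrichment is at the top-2 prefix, where both pathway genes have been seen. Because the minimum is taken over many prefixes, the resulting p-values would still need multiple-testing correction across pathways.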

Fig. S12. Pathways affected by the APOBEC mutational process.
(A) Enrichment map of some pathways affected solely by the APOBEC mutational process. In this map, nodes in the network represent pathways and are colored by cancer-type evidence. The node size indicates the number of genes in a pathway. Similar pathways with many common genes are connected. (B) A scatter plot shows correlations between the log exposure values of the APOBEC mutational signatures (SBS2 and SBS13, x-axes) and the HRD signature (SBS3, y-axes). Each dot represents a sample colored according to cancer type, and the line shows the best estimate of the slope from a mixed-effects model fitted to samples in which the APOBEC mutational signatures and the HRD signature co-occurred. Exposure values of 0 are not log-transformed.