Putative Breast Cancer Driver Mutations in TBX3 Cause Impaired Transcriptional Repression

The closely related T-box transcription factors TBX2 and TBX3 are frequently overexpressed in melanoma and various types of human cancers, in particular, breast cancer. The overexpression of TBX2 and TBX3 can have several cellular effects, among them suppression of senescence, promotion of epithelial–mesenchymal transition, and invasive cell motility. In contrast, loss of function of TBX3 and most other human T-box genes causes developmental haploinsufficiency syndromes. Stephens and colleagues (1), by exome sequencing of breast tumor samples, identified five different mutations in TBX3, all affecting the DNA-binding T-domain. One in-frame deletion of a single amino acid, p.N212delN, was observed twice. Due to the clustering of these mutations to the T-domain and for statistical reasons, TBX3 was inferred to be a driver gene in breast cancer. Since mutations in the T-domain generally cause loss of function and because the tumorigenic action of TBX3 has generally been attributed to overexpression, we determined whether the putative driver mutations had loss- or gain-of-function properties. We tested two in-frame deletions, one missense, and one frameshift mutant protein for DNA-binding in vitro, and for target gene repression in cell culture. In addition, we performed an in silico analysis of somatic TBX mutations in breast cancer, collected in The Cancer Genome Atlas (TCGA). Both the experimental and the in silico analysis indicate that the observed mutations predominantly cause loss of TBX3 function.

Keywords: TBX3, breast cancer, somatic mutations, p21, frameshift mutation, in-frame deletion, driver mutation

Introduction
Cancer is assumed to progress by a type of Darwinian evolution that operates at the cellular rather than the organismic level. Mutations arise in an approximately random fashion. Those that increase the fitness of the affected cell relative to its neighbors (driver mutations) will be positively selected (and vice versa). Most mutations are neutral (passenger mutations) or deleterious from the perspective of the cancer cell. New sequencing techniques allow increasingly fast and affordable acquisition of DNA sequence information from cancer tissue, either of the exome or the complete genome. Genes that in a mutated or misexpressed form can promote cancer development are operationally defined as cancer genes. Most cancer mutations act dominantly (oncogenes), indicating gain-of-function or possibly haploinsufficiency. Less than 10% are assumed to be tumor suppressors, requiring homozygosity of the mutant allele for the tumor-promoting effect. The discrimination between driver and passenger mutations is not trivial (2,3). It is plausible to assume, and is borne out by the mutation pattern of bona fide oncogenes, that in the category of point mutations only particular missense mutations are able to cause an activated gene product. On the other hand, tumor suppressor genes can be inactivated by a higher proportion of missense mutations, in addition to nonsense and frameshift mutations, in a large part of the protein coding region. Vogelstein and colleagues proposed a rule of thumb according to which >20% of all mutations in a gene must be recurrent missense mutations in particular positions to qualify it as an oncogene, while a tumor suppressor gene should contain >20% of inactivating mutations (3). It should be added that only mutations in tissues in which these genes are active should be taken into account.
Although Tbx2 and Tbx3 are often both expressed in the development of a given organ, their expression at tissue and cellular level can diverge in time and space. During mouse embryonic breast development, Tbx2 and Tbx3 are co-expressed in mesenchyme but only Tbx3 is expressed in epithelial tissue (mammary placode and branching mammary duct epithelium). Tbx2 is not expressed in postnatal stages (53), whereas Tbx3 remains expressed throughout breast development (54,55). Although TBX2 and TBX3 share many properties at the molecular and cell biological level, they can be functionally distinct, at least in melanoma cells (17,56).
According to current knowledge based on studies in cell culture and experimental animals, the tumorigenic action of TBX2/3 appears predominantly caused by gain-of-function (transcriptional upregulation or gene amplification). This is supported by the finding that TBX2 or TBX3 are overexpressed/amplified in melanoma and various types of cancer and that the degree of overexpression correlates with invasiveness, distant metastasis, and poor prognosis (11,57-59).
In recent exome analyses of somatic mutations in breast cancer, several mutations in TBX3 were identified. Stephens and colleagues, in an analysis of 350 breast cancer samples, found three short deletions (in each case resulting in the in-frame deletion of one amino acid; N212 was deleted in two cancers) and three truncating mutations. All changes clustered within 50 amino acids of the TBX3 T-domain. Based on the high relative mutation frequency and the apparent non-random clustering of the mutation sites, TBX3 was considered a driver gene (1). Due to the recurrence of ΔN212, TBX3 also fulfils the driver gene definition of Vogelstein et al. (3). Kandoth et al. analyzed the exomes of 12 tumor types present in The Cancer Genome Atlas (TCGA). Using the MuSIC tool (60), they identified TBX3 as a "significantly mutated gene" (SMG) with an increased mutation rate relative to background, in particular in breast cancer. ΔN212 was also identified in their analysis of 763 breast cancer specimens (61).
We here investigate the molecular function of four putative driver mutations identified by Stephens et al. (1) and Kandoth et al. (61) in order to determine whether they have gain- or loss-of-function properties in vitro and in vivo. We, furthermore, analyze the occurrence and distribution of TBX3 mutations in breast cancer in relation to other TBX genes and to other tumors.
Construction of bacterial expression clones: DNA encoding the first 342 amino acids of TBX3 (normal or mutant sequence) was amplified by linker PCR from the corresponding TBX3-3xFLAG/pENTR/D-TOPO clones. Primers 1879 and 1880, containing BamHI and SalI restriction sites, respectively, were used for amplification. These sites were used to clone the amplified fragment into pGEX-2TK (GE Healthcare).
Construction of p21-FLuc (firefly luciferase) expression vector: 2335 bp from the promoter region of the p21 gene (CDKN1A, ENSG00000124762) were amplified from a genomic clone (obtained from G. Spoden) by linker PCR using primers 1808 and 1323. The SV40 promoter was removed from the pGL3-control vector (Promega, Mannheim, Germany) by KpnI/HindIII digest and replaced by the p21 promoter fragment by T4 DNA ligase ligation.

Electrophoretic Mobility Shift Assay of GST-TBX3 Protein Mutants
Electrophoretic mobility shift assays were performed using a double-stranded 5′ digoxigenin (DIG)-labeled 24 base oligonucleotide (AATTTCACACCTAGGTGTGAAATT) containing the Brachyury consensus target site (synthesized by MWG Biotech/Eurofins, Ebersberg, Germany). GST-tagged proteins were incubated with DIG-oligo at 15°C for 1 h in electrophoretic mobility shift assay (EMSA) buffer [20 mM HEPES, 100 mM KCl, 1 mM DTT, 0.25 mM EDTA, 0.01% NP-40, 1 mM MgCl2, 8% glycerol, 100 μg/ml BSA, 1 mg/ml Poly(dI-dC)], then loaded on a 6% acrylamide gel and run in standard 1× TBE, 1% glycerol. The gel was blotted to a positively charged Nylon membrane (Roche, Mannheim, Germany) in a Mini-PROTEAN Tetra Cell (Bio-Rad) in Tris-glycine buffer. Protein-DNA complexes were fixed to the membrane by UV crosslinking. The DIG label was detected by DIG luminescent detection (Roche, Mannheim, Germany).
1 http://www.cube-biotech.com/protocols

Normalization of Repression Activity to Protein Concentration
TBX3 and its mutant derivatives differed in concentration in transfected cells, even though they were expressed from the same type of vector. This departure from uniform expression contributes to differences in repressive activity between the TBX3 variants. Repression was calculated from FLuc activity values by normalizing these values to the control (=100%). These values were subtracted from 100, yielding 0 repression for the control, and values between 0 and 100 for TBX3 and its derivatives. The dose response curves of p21-Luc repression by TBX3 and TBX3dm were approximately linear in the range between 8.3 and 25 ng with about 0.5% change in repression per nanogram of expression vector. Using the measured relative protein levels (Western blot), we extrapolated repression by the mutant proteins to the level observed for normal TBX3.
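The normalization just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' actual calculation: the function names, the example luminescence readings, and the way the slope of about 0.5% per nanogram is applied to a protein-level shortfall are all hypothetical choices made for the example.

```python
def percent_repression(fluc_sample, fluc_control):
    """Repression in percent: control activity is set to 100% and subtracted from 100."""
    return 100.0 - 100.0 * fluc_sample / fluc_control


def adjust_to_reference_level(repression, relative_protein,
                              vector_ng=25.0, slope_per_ng=0.5):
    """Extrapolate repression to the protein level of normal TBX3.

    Assumption: a variant present at `relative_protein` times the TBX3 level
    behaves like vector_ng * relative_protein ng of expression vector, and
    repression changes linearly by `slope_per_ng` percentage points per ng.
    """
    equivalent_ng = vector_ng * relative_protein
    return repression + slope_per_ng * (vector_ng - equivalent_ng)


# Hypothetical luminescence readings (arbitrary units), not measured data
control_fluc = 2000.0
variant_fluc = 800.0

raw = percent_repression(variant_fluc, control_fluc)
# 0.59 is used here as an example relative protein level (variant / normal TBX3)
adjusted = adjust_to_reference_level(raw, relative_protein=0.59)
print(f"raw repression: {raw:.1f}%, adjusted: {adjusted:.1f}%")
```

Under this reading, a variant that accumulates to a lower level than TBX3 receives an upward correction and a more abundant variant a downward one; the linear extrapolation would only be reasonable within the approximately linear dose range.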

Measurement of Firefly Luciferase Activity
In the 96-well format, cells (100 μl) were transfected with 25 ng of reporter vector and 25 ng of TBX expression vector. Forty-eight hours after transfection, the supernatant was removed, cells were washed with PBS and then lysed for 30 min at room temperature in Passive Lysis Buffer (Promega, Mannheim, Germany). Twenty microliters of lysate were measured after addition of 40 μl luciferase agent (Luciferase Assay System, Promega) in white 96-well microplates (Nunc/Thermo Scientific, 136101, Schwerte, Germany) in a GloMax ® 96 Microplate Luminometer (Promega). Premeasurement delay was 2 s, the measurement took 10 s.
For pixel density analysis of fluorographs, several exposures were taken and analyzed using the ImageJ program (footnote 2). (Table S5 in Supplementary Material).

Statistical Analysis
Statistical analysis of cell culture experimental data was performed with GraphPad Prism (Version 5). Inflated p-values due to multiple comparisons between groups were adjusted by the Bonferroni-Holm procedure (64). The following significance levels were used: *<0.05, **<0.01, and ***<0.001. Significance assessment of TBX mutation type distributions in the ICGC database was done by calculating the respective probability for observing the actual or a more extreme distribution of the mutations between the different TBX3 domains. For this, an urn model was used with the number of mutations representing the number of draws. Since mutations can repeatedly occur at the same locus ("drawing with replacement"), a binomial distribution with the ratio of domain length to total TBX3 length as expectation value under the null hypothesis was chosen. The resulting p-values were doubled to correspond to two-tailed tests. Calculations were performed with SAS ® for Windows ® version 9.4 (SAS Institute Inc., Cary, NC, USA).
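The urn model can be illustrated with a short Python sketch. It is only an assumption-laden example, not a reproduction of the SAS calculations: the function names are hypothetical, the "more extreme" direction is taken to be the upper tail (enrichment in the domain), and the numbers in the example are invented.

```python
from math import comb


def binomial_upper_tail(n_mutations, n_in_domain, domain_fraction):
    """P(X >= n_in_domain) for X ~ Binomial(n_mutations, domain_fraction)."""
    return sum(
        comb(n_mutations, k)
        * domain_fraction ** k
        * (1.0 - domain_fraction) ** (n_mutations - k)
        for k in range(n_in_domain, n_mutations + 1)
    )


def two_tailed_p(n_mutations, n_in_domain, domain_length, protein_length):
    """Double the one-sided tail (capped at 1) to approximate a two-tailed test."""
    # Null hypothesis: each mutation hits the domain with probability
    # equal to the domain's share of the total protein length.
    fraction = domain_length / protein_length
    return min(1.0, 2.0 * binomial_upper_tail(n_mutations, n_in_domain, fraction))


# Hypothetical example: 8 of 10 mutations fall in a domain covering 25% of a protein
p_value = two_tailed_p(n_mutations=10, n_in_domain=8,
                       domain_length=100, protein_length=400)
print(f"two-tailed p = {p_value:.2e}")
```

Drawing with replacement is what justifies the binomial rather than a hypergeometric distribution: the probability of hitting the domain stays constant from one mutation to the next.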

Degree of Conservation of Mutated Amino Acids in TBX3
Amino acid conservation is used as one parameter to assess whether a mutation is likely to impart driver properties to the affected protein (65). Stephens and colleagues observed that the TBX3 residues mutated in their collection of breast cancer sequences are fully conserved among orthologous animal Tbx3 proteins (1). Because of the strong phylogenetic conservation of the T-domain, we used a more stringent alignment containing all paralogous human TBX proteins. This alignment (Figure 1) shows that the mutation p.T210delT affects a fully conserved residue.
2 http://imagej.nih.gov/ij/
p.N212delN, which was identified in three cases of breast cancer (1,61), affects a position that is largely conserved, only TBX6 showing a deviating serine. However, in Brachyury (T) and in TBX19, this residue is lacking, suggesting that it is not essential for Tbx function.
The mutation p.H187Y, a potential driver mutation identified in two cases of breast cancer, affects a fully conserved histidine (61).
In addition to missense mutations, Stephens and colleagues described three frameshift mutations in TBX3 whose breakpoints cluster within 30 amino acids of the T-domain (1). As an example of a truncated protein, we also analyzed p.Y163fs2*.

DNA-Binding Properties of Mutant Proteins
We tested DNA binding of the bacterially expressed mutant proteins by EMSA. For all variants, we used an N-terminal fragment (1-342) in order to reduce degradation to which the full-length protein is susceptible. The fragment was derived from the long TBX3 + 2a splice form (67). The 1-342 fragment extends C-terminally for some 50 amino acids beyond the T-domain. The protein length of 342 was chosen because the paralogous proteins TBX2 and TBX3 start to diverge more strongly beyond this position. All fragments contained an N-terminal GST tag (glutathione S-transferase) for affinity purification. Mutations were introduced by site-specific mutagenesis. The mutant proteins were expressed in E. coli and subsequently purified by glutathione affinity chromatography. Figure 2A shows that all proteins were obtained with similar purity. A Western blot of SDS gel-separated proteins (Figure 2B) shows that the fast band at around 30 kDa, which is present in all preparations, was also immunoreactive to anti-GST. The molecular mass of GST is 26.7 kDa, indicating that GST is partly cleaved from TBX3 very close to the TBX3 N-terminus, possibly in the protease cleavage site, which is present in pGEX-expressed proteins at the junction between GST and cargo. The intensity of the GST-TBX3 fusion bands was measured (Figure 2C) so that equivalent amounts of normal and mutant GST-TBX3 fragments could be tested by EMSA for binding to a palindromic Tbx consensus oligonucleotide (68). The normal protein produced a single strong shift band. The binding affinities of p.N212delN and p.T210delT were slightly reduced compared to wild type (Figure 2D). The reduced affinity of p.T210delT and p.N212delN was not statistically significant (Figure 2E), even though we observed the same pattern of attenuated binding in all EMSA gels. Binding of p.H187Y was only obvious when the gel was overexposed.
No DNA binding was observed with p.Y163fs2* (Figures 2D-F) and p.G129A/R130S (TBX3dm, engineered as a binding-deficient mutant, see below) (Figure 2F) under these conditions.

Reporter Gene Repression in Transfected Cells: The Role of DNA Binding
TBX3 was first implicated in tumorigenesis because it was found to suppress cellular senescence, could immortalize primary embryonic fibroblasts, and could transform this cell type in combination with Myc or activated Ras (13,16,35). In the suppression of cellular senescence, transcriptional repression of cyclin-dependent kinase inhibitor 1A (p21) and 2A (in human: p14ARF; in mouse: p19ARF) has a key role. The repression by TBX3 is direct and repression activity of TBX2/3 variants can be tested on p21 promoter constructs in transient transfection assays (32,69). Since the presumptive TBX3 driver mutations affected DNA binding, we first tested whether p21-Luc could still be repressed by a DNA-binding-deficient mutant. We used the TBX3 double mutant p.G129A, R130S (TBX3dm). Single mutations in homologous position have previously been shown to cause loss of DNA binding in other Tbx proteins [TBX5: G80R (70), Omb: R355K (71), see also TBX2 (R122E, R123E) (72)].
In COS-7 cells, the p21-Luc promoter construct was repressed by TBX3 in a dose-dependent fashion. Normal TBX3 caused 80-90% repression of reporter activity at our highest concentration of TBX vector (25 ng/assay). TBX3dm generally reached 60% repression (Figure 3A). Both proteins at 25 ng approached saturation. The dose-dependence curves show that TBX3 reached 60% repression already at a concentration of <1 ng/assay, while TBX3dm required 25 ng/assay to reach this degree of repression. The data indicate that part of the repression caused by TBX3 does not require DNA binding. The mutant p.Y163fs2*, which lacks both an intact DNA-binding domain and the entire C-terminal domain, failed to repress the p21 reporter (Figure S1 in Supplementary Material).
TBX3dm protein abundance in transfected cells was reproducibly lower than that of normal TBX3 (as determined by anti-FLAG immunoblotting: 0.59 ± 0.08, n = 6) (Figures 3B,C and 4B). The repression activity of TBX3dm at 25 ng (60% repression) should, thus, be compared to TBX3 at 15 ng (85% repression) (Figure 3A). Because of the shallow slope in this part of the dose-response curve, TBX3 remains a more effective inhibitor than TBX3dm even when adjusted for differences in protein abundance.

Reporter Gene Repression in Transfected Cells: The Effect of Presumptive Driver Mutations
We used the p21-Luciferase assay to determine how TBX3 activity was affected by the putative driver mutations. Full-length proteins, C-terminally tagged with a FLAG tag, were used, with the exception of p.Y163fs2*, where the untagged truncated protein was tested. As observed above (Figure 3A), the DNA-binding-defective double mutant TBX3dm repressed p21-Luc 1.5- to 2-fold. A similar repression was observed for the other binding-deficient mutant p.Y163fs2*. Surprisingly, the mutants that were less impaired in DNA binding, p.H187Y and p.T210delT, did not differ significantly from TBX3dm. However, the repression activity of p.N212delN was close to that of TBX3 (Figure 4A). While the differences in p21-Luc repression between TBX3dm, Y163fs2*, H187Y, and T210delT were small, the same relative order was observed in all experiments.
We did not normalize the data in Figure 4A to the expression of a co-transfected vector because we noted that the expression of Renilla luciferase from the normalization vector pGL4.74 (pRL-TK, Promega) was also modulated by TBX3 and its mutants. This also holds for other normalization constructs (29). As noted above for TBX3dm, the expression level for mutant proteins differed in a characteristic way from normal TBX3. TBX3dm, H187Y, and T210delT were less abundant than TBX3 (~2×, ~3×, and ~1.5×, respectively), while N212delN was always more abundant (up to 1.5×) (Figure 4B). We normalized the repression to a uniform protein concentration (see Materials and Methods). This correction led to stronger apparent repression activity of the proteins expressed at a lower level than TBX3, and vice versa. Overall, the effect of normalization was small (Figure 4C). The normalization could not be performed for the untagged p.Y163fs2*.

Repression of Endogenous p21 by TBX3 and Mutant Derivatives
Transiently transfected reporter constructs cannot be expected to render the regulatory complexity of the endogenous gene.
We, therefore, determined by real-time qPCR how TBX3 and its mutant derivatives affected endogenous p21 transcription. TBX3 repressed endogenous p21, although to a lesser degree than the p21-Luc reporter (Figure 5). Of the four tested mutants, only the DNA-binding-deficient TBX3dm did not repress the endogenous gene. The three putative driver mutant proteins containing point mutations were as repressive on endogenous p21 as normal TBX3 (the untagged frameshift mutant p.Y163fs2* was not tested). This still held when the p21 expression values were normalized to protein abundance. The repressive effect of the TBX3 mutants, thus, differed between endogenous p21 and the transfected p21-Luc reporter. Mutant TBX3 proteins with residual DNA-binding affinity (H187Y, T210delT, and N212delN) effectively repressed the endogenous p21 gene but TBX3dm failed to do so. However, on transfected p21-Luc, TBX3dm repressed transcription about twofold, similar to the

Differential Expression of Mutant TBX3 Proteins
The expression level of the five FLAG-tagged TBX3 variant proteins, as determined by Western blotting, differed in a characteristic manner (Figure 4B). This pattern was observed in five independent transfection experiments and, thus, was unlikely to be produced by stochastic fluctuations. All TBX3 variants were expressed from the same expression vector. The different protein accumulation which we observed, therefore, could be due to differential protein stability. We determined protein stability of TBX3, H187Y, and N212delN by blocking protein synthesis with cycloheximide (73) and following protein abundance over 12-24 h. Protein levels were normalized to alpha-tubulin, which appeared stable over this period. The TBX3 level declined to 0.73 within 12 h (SD = 0.18, n = 6), whereas H187Y declined to 0.2 (Figures 6A,B), i.e., at least threefold faster than the normal protein. N212delN before addition of cycloheximide was slightly more abundant than TBX3 and was similarly stable (Figures 6C,D); N212delN declined to 0.74 within 12 h (SD = 0.26, n = 3). Thus, differential protein stability did not account for the higher steady state abundance of N212delN. Therefore, we also determined the transcript level of normal and mutant TBX3 in transfected cells by real-time PCR. Surprisingly, transcript abundance of H187Y was slightly lower and that of N212delN significantly higher than that of TBX3, indicating that difference in RNA production or turnover contributes to the differences in protein levels (Figure 6E).

Figure 5 | Repression of the endogenous p21 gene by TBX3 and mutant derivatives. COS-7 cells were transfected with 50 ng of expression vector encoding TBX3 or its mutants (the empty vector was transfected in the control experiment). RNA was isolated and the level of p21 transcripts was determined by real-time qPCR. The p21 level is normalized to the control (1.0, black columns, error bars are SDs, n = 3 with three technical replicates each). To correct for differences in the amount of TBX protein, an adjustment was performed as described in Section "Materials and Methods" (white columns). Only TBX3dm differed significantly in p21 repression from normal TBX3, the three single mutants did not.

Fischer and Pflugfelder
Breast cancer driver mutations Frontiers in Oncology | www.frontiersin.org

Statistical Considerations of Somatic TBX3 Mutations in Cancer Tissues
The somatic mutation database of the International Cancer Genome Consortium (ICGC; https://dcc.icgc.org, release 17) lists 149 somatic mutations in TBX3. In order to draw inferences about the relevance of these mutations, we used the mutations in the other human TBX genes as reference. Table S1 in Supplementary Material shows the distribution of somatic mutations in the 16 human TBX genes in different cancer tissues. The breast cancer data (BRCA-US) derive from the TCGA project, in which somatic mutations from about a thousand donor tissues were analyzed (61,74). It is obvious that in breast cancer, TBX3 has the highest number of mutations of all TBX genes. In Table 1, these numbers are normalized to the total number of mutations in a given TBX gene across the different cancer types. This normalization allows the comparison between different TBX genes. In breast cancer, TBX3 is the most frequently mutated of all TBX genes (17.6%). The mutation rate averaged over all 16 TBX genes is 3.6% corresponding to a 4.7-fold enrichment for TBX3.
In Table S2 in Supplementary Material, the mutation numbers are normalized to the number of mutations in the entire set of TBX genes. This serves to normalize mutations in a given gene to the average mutation frequency in a particular cancer tissue and to the different number of tissue samples analyzed. Breast cancer was conspicuous because 29.3% of all TBX mutations in this tissue were found in TBX3. This is not due to TBX3 being a larger mutational target. TBX3 (13.9 kb) is twofold smaller than the average of TBX genes (29.9 kb). Table 1 and Table S2 in Supplementary Material are sorted by the number of TBX mutations identified in a given cancer project. Results at the top of these tables are more significant than those at the bottom.
Human TBX gene size varies by more than 17-fold (between 6 and 106 kb), whereas variation in protein length is <2-fold. The ICGC data collection also contains intronic mutations such that mutation number per gene will be influenced by gene size. Table 2 gives the number of breast cancer mutations that affect the open reading frame (ORF) of TBX genes. Five types of mutations were considered: frameshift, missense, synonymous, stop codon gained, and in-frame deletion. Compared to the average occurrence of a particular mutation type in TBX genes, frameshift and in-frame deletion mutations in TBX3 were strongly (12× and 16×) and missense mutations moderately (3×) enriched. This enrichment is characteristic for breast cancer. When the entire ICGC data set was analyzed for TBX3 ORF mutations (ratio of complete ICGC to BRCA-US dataset ~12/1), 21 frameshift mutations were found, 15 of which arose in breast cancer (8.5-fold enrichment). This concentration of frameshift TBX3 ORF mutations in breast cancer is highly non-random (p < 10⁻⁷). Similarly, two of five in-frame deletions occurred in the BRCA-US data set. The preponderance of these particular types of mutations in TBX3 versus the other TBX genes points to an important role of TBX3 in breast cancer progression.
The position of breast cancer frameshift mutations in TBX3 was highly non-random. Fourteen of 15 mutations lie in the N-terminal half of TBX3, 10 lie in the T-domain (Figure 7). Both distributions are highly unlikely to arise by chance (p < 10⁻⁴ and p < 2 × 10⁻³, respectively). The frameshift mutation that we analyzed (Y163fs2*), which was identified in a different cancer project (1), also fell into this pattern.
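The frameshift enrichment in breast cancer can be retraced with simple arithmetic. The sketch below is one plausible reading, not the authors' documented procedure: it assumes that, by chance, a TBX3 frameshift mutation lands in BRCA-US with probability 1/12 (the approximate share of that data set in the complete ICGC collection), and the variable names are illustrative.

```python
from math import comb

total_frameshifts = 21    # TBX3 frameshift mutations in the complete ICGC data set
breast_frameshifts = 15   # of these, found in breast cancer (BRCA-US)
brca_share = 1.0 / 12.0   # assumed chance share, from the ~12/1 data set ratio

# Expected number in BRCA-US under the null, and observed/expected ratio
expected_in_breast = total_frameshifts * brca_share
fold_enrichment = breast_frameshifts / expected_in_breast

# Binomial probability of seeing at least 15 of 21 in BRCA-US by chance
p_tail = sum(
    comb(total_frameshifts, k)
    * brca_share ** k
    * (1.0 - brca_share) ** (total_frameshifts - k)
    for k in range(breast_frameshifts, total_frameshifts + 1)
)

print(f"expected: {expected_in_breast:.2f}, fold enrichment: {fold_enrichment:.1f}")
print(f"one-sided binomial tail: {p_tail:.1e}")
```

With these inputs the expected count is 1.75, so 15 observed mutations correspond to a ratio of roughly 8.5, and the tail probability falls well below the 10⁻⁷ bound quoted in the text.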
If breast cancer progression is associated with mutations that cause loss of DNA binding, this should also be reflected in TBX3 missense mutations. In the BRCA-US data set, eight independent missense mutations in TBX3 were identified, H187Y occurring twice. Four of these lie in the T-domain [L112F, W113R, H187Y (2×)] (Table S3 in Supplementary Material). We expect that mutations of fully conserved residues of the T-domain are more likely to disturb DNA binding than mutations of non-conserved residues. L112F, W113R, and H187Y are fully conserved residues in all 16 human TBX proteins. We showed that DNA binding was severely compromised by H187Y (Figure 2C). A mutation affecting the tryptophan corresponding to TBX3 W113 was identified in TBX22 in a patient with X-linked cleft palate (W102C). The mutant TBX22 protein is inactive in DNA binding (75). In the BRCA-US data set, 13 missense mutations lie in the T-domain of non-TBX3 TBX genes. Of these, three affect fully conserved (23%), two functionally conserved (15%), and eight (62%) non-conserved residues. This closely corresponds to the distribution of conservation categories (25, 21, and 64%, respectively). The enrichment for conserved residues as targets of missense mutations in TBX3 supports the idea that loss of TBX3 T-box function is relevant for breast cancer progression. There was also a trend for conservation of the C-terminal TBX3 breast cancer mutations. About one-quarter of C-terminal residues are conserved between mammalian Tbx2 and Tbx3 proteins. Three of four C-terminal TBX3 mutations affected such conserved amino acids.

October 2015 | Volume 5 | Article 244

Protein Structural Considerations of the Investigated Putative Driver Mutations
Coll and colleagues have analyzed the structure of the TBX3 T-domain in complex with a 20-bp palindromic target sequence (76). In this structure, N212 is part of a loop that connects the short β-strands e and e′. The length of this loop varies in different Tbx proteins. As shown in Figure 1, the members of the Tbr1 subfamily (TBR1, TBX21, and EOMES) (77) contain an insertion of three amino acids in this loop. Drosophila Omb, the insect ortholog of Tbx3, also contains such an insertion. Importantly, the e-e′ loop also contains the 20-amino-acid insertion present in the splice variant TBX3 + 2a. TBX3 + 2a is the product of an alternative splice form, which contains the additional 60 bp exon 2a that is found in all mammalian Tbx3 genes (69). Hoogaars and colleagues found no influence of the additional 20 amino acids on DNA-binding in vitro, on the interaction with Nkx2-5, and on the repressor function of Tbx3 on two natural promoters in transfected cells. Similarly, no difference was noted in the interaction with Sox4 (78). The issue is, however, controversial (79,80). Our data show that shortening of the e-e′ loop had little effect on DNA-binding in vitro and target gene repression. While N212 is a strongly conserved residue among T-box proteins (mammalian members of the Tbx6 subfamily carry serine at this position, Figure 1), members of the T subfamily (Brachyury and
In the X-ray structure of the TBX3-DNA complex, T210 is part of the short β-strand e encompassing amino acids 208-211. These four amino acids are completely conserved in the T-box protein family (Figure 1). K208 and N211 interact with the target DNA both in TBX3 and in Xenopus Brachyury (76,81). A structural change in this protein strand is, therefore, expected to have functional consequences. In TBX19, even a conservative mutation of the lysine, corresponding to K208 in TBX3, to arginine abolishes DNA binding and target gene activation (82). The mutation T209I in Tbx20 (T209 in Tbx20 corresponds to T210 in TBX3, see Figure 1) causes a reduction in the activation of an endogenous target gene in an ex vivo assay at low concentration of the mutant protein. However, at high concentration the mutant protein is as effective as normal Tbx20 (83). In this respect, Tbx20-T209I and TBX3-T210delT appear similar. T210delT, too, reduces but does not eliminate affinity for T-box target sequences.
H187 is completely conserved in the T-box protein family. In the TBX3/DNA structure, it lies in the β-strand C′, which is part of the seven-stranded β-barrel that forms the core of the T-domain. C′ does not contact the DNA but the H187Y mutation apparently changes the structure such that the mutant protein no longer effectively binds to target sequences. The same missense mutation at the corresponding position (H125Y) was identified in Tpit (TBX19). Tpit-H125Y does not bind to the consensus palindrome in EMSA and is inactive in target gene activation in a transfection assay (82).

repressor activity in TBX3 Mutant Proteins
In our transient transfection assay, the DNA-binding-deficient TBX3 mutants, TBX3dm (Figures 3A and 4A) and H187Y (Figure 4A), still caused about 40-60% repression of p21-Luc. Like repression by normal TBX3, this repression was dose dependent (Figure 3A). These findings suggest that repression of p21-Luc occurs by two mechanisms. In one, TBX3 acts as an active, DNA-bound repressor; in the other, TBX3 is a corepressor. Repression by TBX2 and TBX3 largely depends on a C-terminal repression domain (12,35,36). In TBX3, this repression domain was shown to be essential for the immortalization of primary embryonic fibroblasts (35). Y163fs2*, which lacks DNA binding and repression domain, did not repress p21-Luc ( Figure  S1 in Supplementary Material). TBX2 has been reported to act as a co-repressor on promoter constructs of two breast cancer tumor suppressor genes (NDRG1 and CST6) by binding to the transcription factor EGR1 (24,53). Thus, there is precedence for both repression mechanisms. While TBX3dm could repress transfected p21-Luc to some extent (Figure 4A), it completely failed to repress the endogenous p21 gene (Figure 5), indicating a higher stringency of the co-repression mechanism in the natural p21 chromosomal context.
Repression of p21-Luc by TBX3 mutant proteins corresponded to their DNA binding in vitro. H187Y, with low DNA binding, did not differ from the DNA-binding-deficient TBX3dm. Repression by T210delT, with intermediate DNA binding, was intermediate between normal TBX3 and TBX3dm, while N212delN, in which DNA binding was barely compromised, repressed nearly as strongly as normal TBX3 (Figure 4A). In contrast, all three point mutants were effective repressors of the endogenous gene, indicating that, in the native chromatin context, a partly functional T-domain sufficed for full repression (Figure 5). The T-domain of T-box proteins is required not only for DNA binding but also for protein-protein interactions. The interaction with many chromosomal proteins is mediated by the T-domain (84-91). Presumably, weak perturbations of the T-domain still allow binding to such factors in the native chromatin context and, thus, effective gene repression by mutant TBX3 proteins with attenuated DNA binding.

Somatic TBX3 Mutations in Breast Cancer
Our analysis of TBX mutations in the ICGC data set showed that in breast cancer (BRCA-US), TBX3 was the most frequently mutated TBX gene among the 16 human paralogs. This is not due to an unusually large target size: TBX3 is smaller than the average TBX gene. We also reduced the target size effect by concentrating on the ORF, whose length varies only about twofold among the TBX genes. We noted that, over the entire set of somatic cancer genome projects, two types of mutations affecting the ORF were significantly enriched in TBX3 compared to the other TBX genes: frameshift mutations and in-frame deletions (Table S4 in Supplementary Material). A large part of these mutations arose in breast cancer (71 and 40%, respectively; compare Table 2 with Table S4A in Supplementary Material). Restricting the analysis to breast cancer also revealed a slight enrichment of missense mutations in TBX3 (Table 2).
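The kind of enrichment comparison described above can be illustrated with a one-sided Fisher's exact test on a 2x2 table (TBX3 versus other TBX genes; mutation type of interest versus all other ORF mutations). This is only a minimal sketch: the source does not state which test was used, the function name is hypothetical, and the counts in the example are placeholders, not values from Table 2 or Table S4.

```python
from math import comb

def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a: mutations of the type of interest in TBX3
    b: all other ORF mutations in TBX3
    c: mutations of the type of interest in the other TBX genes
    d: all other ORF mutations in the other TBX genes
    Returns P(X >= a) under the hypergeometric null (no enrichment).
    """
    row1 = a + b          # all TBX3 mutations
    col1 = a + c          # all mutations of the type of interest
    n = a + b + c + d     # all ORF mutations considered
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, col1)

# Placeholder counts for illustration only (NOT data from the paper).
p_value = fisher_exact_greater(8, 12, 5, 95)
print(f"one-sided p = {p_value:.3g}")
```

Such a test compares the distribution of mutation types only; it does not by itself correct for differences in ORF length between genes.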
Frameshifts cause a truncation of the mutant protein downstream of the mutation site and can be expected to be all the more severe the closer the mutations lie to the N-terminus. In breast cancer, the frameshift mutations in TBX3 showed an extreme bias to the N-terminal half of the protein (Figure 7). With the exception of N673fs, all frameshift mutations can be expected to cause loss of function either by disruption of the DNA-binding domain or of the nuclear localization sequence (R294fs). E307fs, which causes a sequence change just downstream of the T-box, could potentially be a dominant negative mutation (5,28,92,93). We demonstrated loss of function for Y163fs2*, which lacked DNA binding (Figure 2C) and failed to repress p21-Luc (Figure S1 in Supplementary Material). In the case of missense mutations, conclusions with regard to residual function of the mutant proteins cannot be drawn as easily. The enrichment for mutations in conserved residues suggests that most will lead to loss of function.
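A positional bias of frameshift mutations toward the N-terminal half, as described above, could be checked with a simple one-sided binomial test against a uniform null (each mutation equally likely to fall in either half). The sketch below is an illustration under stated assumptions: the function name is hypothetical, the protein length of 723 residues is an assumed value that should be replaced by the length of the isoform actually used, and the positions are only the four frameshifts named in the text, not the full set shown in Figure 7.

```python
from math import comb

def n_terminal_bias_pvalue(positions, protein_length):
    """One-sided binomial test for an excess of mutations in the N-terminal half.

    positions: residue numbers of the frameshift mutations
    protein_length: total number of residues (assumption supplied by the caller)
    Returns (k, n, p): k of n mutations lie in the N-terminal half, and p is
    P(X >= k) for X ~ Binomial(n, 0.5).
    """
    half = protein_length / 2
    n = len(positions)
    k = sum(1 for pos in positions if pos <= half)
    p = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return k, n, p

# Only the frameshifts named in the text: Y163fs, R294fs, E307fs, N673fs.
named_in_text = [163, 294, 307, 673]
ASSUMED_LENGTH = 723  # assumption, not taken from the source
k, n, p = n_terminal_bias_pvalue(named_in_text, ASSUMED_LENGTH)
print(f"{k} of {n} in N-terminal half, one-sided p = {p:.4f}")
```

With only four positions the test has almost no power; it is meant to show the calculation, which becomes informative only when applied to the complete mutation set.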
Our experimental data with the frameshift (Y163fs2*), the missense (H187Y), and the in-frame deletion (T210delT) mutant proteins revealed partial or complete loss of function in vitro (EMSA) and in cell culture (p21-Luc repression). While this loss of function agrees with the expectations from the statistical analysis of TBX3 mutations in breast cancer outlined above, it is in apparent contradiction to several previous observations on the role of TBX3 in oncogenesis. In general, it was overexpression of TBX3 that was found to be associated with oncogenic processes at the organismic, cellular, and gene regulatory level. We outline this in the following, mainly for breast cancer and melanoma, but similar findings were made for other types of cancer (27,94-98).
TBX3 was reported to be overexpressed in breast cancer cell lines and in primary cancer tissue (37,79). An increased level of TBX3 was observed in blood plasma from patients with higher stage breast cancer (99). Similarly, increased expression of TBX3 was seen in melanoma cell lines (4), where TBX3 represses E-cadherin expression and causes increased invasiveness (6,17,18) and tumor formation in vivo (100). A critical role of TBX3 for cell migration was also demonstrated in the MCF-7 breast cancer cell line (17,101).
TBX3 expression also antagonizes replicative and oncogenic senescence by repressing p14(ARF). This was demonstrated with a conditionally transformed mouse neuronal cell line (13) and with primary mouse fibroblasts (16,35) but not yet, to our knowledge, with mammary cells.
TBX3 promotes growth of mammary epithelial cells both in cell culture (26) and in a transgenic mouse, where it causes mammary hyperplasia in the absence of tumor formation (21). In human breast cancer cell lines and primary tissue, TBX3 mediates the effect of estrogen to induce the formation and expansion of cancer stem-like cells (20). The effect on proliferation is, however, dependent on cell type. In the breast cancer cell line MCF-7 and in vertical growth phase melanoma cell lines, both of which express TBX3 endogenously, it is knock-down of TBX3 that promotes proliferation (17). We observed a similar phenomenon for the TBX3 ortholog Omb in the wing imaginal disk of Drosophila, where Omb overexpression antagonizes proliferation in regions of high endogenous Omb and promotes proliferation in regions of low endogenous Omb (and vice versa). Omb overexpression leads to invasive cell motility in all regions of the disk epithelium (51).
Among the TBX3 mutants that we analyzed, only N212delN was not significantly compromised with regard to DNA binding and p21 repression, compared to normal TBX3. Rather, N212delN showed an aspect of gain-of-function: N212delN had higher transcript and protein abundance compared to normal TBX3.