Colorectal Cancer: Molecular Mutations and Polymorphisms

Colorectal cancer (CRC) is one of the major causes of mortality and morbidity, and is the third most common cancer in men and the second most common cancer in women worldwide. The incidence of CRC varies considerably among racially or ethnically defined populations in multiracial/ethnic countries. The tumorigenesis of CRC arises from chromosomal instability (CIN) or microsatellite instability (MIN) and involves various proto-oncogenes, tumor-suppressor genes, and epigenetic changes in the DNA. In this review I focus on the mutations and polymorphisms of important genes of the CIN and MIN pathways that have been implicated in the development of CRC.


INTRODUCTION
Colorectal cancer (CRC), defined as cancerous growth in the colon, rectum, and appendix, is also referred to as colon cancer or large bowel cancer. It is commonly diagnosed in both men and women and represents the third most common form of cancer and the second leading cause of cancer-related death in the western world (Center et al., 2009). Sporadic CRC accounts for about 75% of patients, who have no apparent evidence of having inherited the disorder, whereas patients with a family history of CRC make up the remaining 25%, which suggests a genetic contribution, common exposures among family members, or a combination of both (National Cancer Institute, 2008; see footnote 1). Incidence rates are highest in Western Europe, North America, and Australia, whereas the lowest rates are found in sub-Saharan Africa, South America, and Asia, although rates are increasing in countries adopting western lifestyle and dietary habits (Vainio and Miller, 2003).

TYPES
Depending upon the genetics and the etiology of the disease, CRC is usually categorized into three specific types: sporadic, inherited, or familial.

Sporadic colorectal carcinomas
Sporadic carcinomas, devoid of any familial or inherited predisposition, account for approximately 70% of CRC. Sporadic cancer is common in persons older than 50 years of age, probably as a result of dietary and environmental factors as well as normal aging. Fewer than 10% of patients have an inherited predisposition to colon cancer.

1 http://www.cancer.gov/cancertopics/pdq/genetics/colorectal/healthprofessional/allpages

Inherited colorectal carcinomas
They include those in which colonic polyps are a major manifestation of disease and those in which they are not. The nonpolyposis predominant syndromes include hereditary nonpolyposis CRC (HNPCC) (Lynch syndrome I) and the cancer family syndrome (Lynch syndrome II). Although uncommon, these syndromes provide insight into the biology of all types of CRC.

Familial colorectal carcinomas
The third and least understood pattern of CRC development is known as familial CRC. In affected families, CRC develops too frequently to be considered sporadic but not in a pattern consistent with an inherited syndrome. Up to 25% of all cases of CRC may fall into this category (Paula and Harold, 2002).

CLASSIFICATION AND GRADING
The most common colon cancer cell type is adenocarcinoma, which accounts for 95% of cases. Other, rarer types include lymphoma and squamous cell carcinoma. Cancers on the right side (ascending colon and cecum) tend to be exophytic, that is, the tumor grows outward from one location in the bowel wall. Left-sided tumors tend to be circumferential and can obstruct the bowel much like a napkin ring.
Two classification systems are used for the staging of CRC: Dukes' classification and the TNM (Tumors/Nodes/Metastases) system. Dukes' classification, first proposed by Dukes (1932) and still the most commonly used, identifies the stages as: (A) tumor confined to the intestinal wall; (B) tumor invading through the intestinal wall; (C) lymph node involvement; and (D) distant metastasis (Table 1). There has been a gradual move [...]
[...] advantage to an affected colorectal cell. These genetic changes ultimately result in uninhibited cell growth, proliferation, and clonal tumor development. The additive and cumulative effect of these somatic mutations has been shown to be the cause of sporadic colorectal cancer. The salient features of Vogelstein's model of CRC carcinogenesis for sporadic cancers can be summarized as follows: 1) the mutational activation of oncogenes and/or the inactivation of tumor-suppressor genes results in colorectal carcinogenesis; 2) at least four or five genes of a cell must undergo somatic mutation for the cell to become malignantly transformed; 3) the characteristics of the tumor depend on the accumulation of multiple genetic mutations rather than on the order in which the genes involved are mutated; and 4) the features of the tumorigenic process of colorectal cancer also apply to other solid tumors, such as breast, stomach, and pancreatic cancer.
Two major mechanisms of genomic instability have been identified that give rise to colorectal carcinoma development and progression: chromosomal instability (CIN) and microsatellite instability (MIN).

CHROMOSOMAL INSTABILITY
Chromosomal instability is associated with a series of genetic changes that begin, in some cases, with the loss of one allele at a chromosomal locus (loss of heterozygosity), which may imply the presence of a tumor-suppressor gene at that site. Loss of both alleles at a given locus (homozygous deletion) is an even stronger indicator of the existence of a tumor-suppressor gene. Loss of heterozygosity occurs clonally in both the adenoma-carcinoma sequence and ulcerative colitis associated neoplasia. Many of the affected loci are already associated with one or more known candidate tumor-suppressor genes. These include 3p21 (β-Catenin gene), 5q21 (APC gene), 9p (p16 and p15 genes), 13q (retinoblastoma gene), 17p (p53 gene), 17q (BRCA1 gene), 18q (DCC and SMAD4 genes), and less frequently 16q (E-cadherin gene) (Esteller et al., 2001; Conlin et al., 2005; Hsieh et al., 2005). Familial adenomatous polyposis (FAP) is the hereditary syndrome associated with APC gene mutation (Fearon and Vogelstein, 1990).
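The distinction drawn above between loss of heterozygosity and homozygous deletion can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name `classify_locus` and the allele labels are hypothetical, and real allelotyping works on signal intensities or read counts rather than clean allele sets.

```python
def classify_locus(normal_alleles, tumor_alleles):
    """Classify one locus by comparing alleles seen in normal and tumor tissue.

    Both arguments are collections of allele labels (hypothetical, e.g. "A", "B").
    """
    normal = set(normal_alleles)
    tumor = set(tumor_alleles)
    if not tumor:
        # Neither allele is detectable in the tumor.
        return "homozygous deletion"
    if len(normal) < 2:
        # A locus that is homozygous in normal tissue cannot show LOH.
        return "uninformative"
    if len(tumor) == 1 and tumor < normal:
        # Exactly one of the two constitutional alleles remains.
        return "loss of heterozygosity"
    if tumor == normal:
        return "retained"
    return "other"


if __name__ == "__main__":
    print(classify_locus(["A", "B"], ["A"]))       # one allele lost
    print(classify_locus(["A", "B"], []))          # both alleles lost
    print(classify_locus(["A", "B"], ["A", "B"]))  # both alleles kept
```

The ordering of the checks reflects the text: complete absence of the locus is tested first because it is the stronger indicator, and only loci that are heterozygous in normal tissue are treated as informative for loss of heterozygosity.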

MICROSATELLITE INSTABILITY
It comprises length alterations of oligonucleotide repeat sequences that occur somatically in human tumors. Mutations in DNA mismatch repair (MMR) genes result in a failure to repair errors that occur during DNA replication in repetitive sequences [microsatellites (MSI)], resulting in an accumulation of frameshift mutations in genes that contain MSI. This failure leads to MIN type of tumor and is the hallmark of HNPCC (Boland et al., 1998). MIN is also found in 12-15% of sporadic CRCs. MIN tumors are more frequently right-sided and poorly differentiated, and more often display unusual histological type (mucinous), and marked peri-tumoral and intra-tumoral lymphocytic infiltration (Dolcetti et al., 1999;Benatti et al., 2005). Microsatellite instability also occurs in patients with ulcerative colitis and is fairly common in premalignant (dysplastic) and malignant lesions (21 and 19%, respectively) (Kerr et al., 2001).
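As a rough illustration of what a length alteration of a repeat sequence means, the sketch below counts the longest run of a repeat unit in a normal and a tumor sequence and reports whether the tract length changed and whether the change would shift the reading frame if the tract lay in a coding region. The function names and the example sequences are hypothetical; clinical MIN testing uses marker panels and fragment analysis rather than this kind of string comparison.

```python
import re


def longest_repeat_run(sequence, unit):
    """Return the largest number of consecutive copies of `unit` in `sequence`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", sequence)
    return max((len(run) // len(unit) for run in runs), default=0)


def compare_microsatellite(normal_seq, tumor_seq, unit):
    """Compare the repeat tract length between normal and tumor sequences."""
    normal_n = longest_repeat_run(normal_seq, unit)
    tumor_n = longest_repeat_run(tumor_seq, unit)
    shift_bp = (tumor_n - normal_n) * len(unit)
    return {
        "normal_repeats": normal_n,
        "tumor_repeats": tumor_n,
        "unstable": shift_bp != 0,
        # A length change that is not a multiple of 3 shifts the reading frame.
        "frameshift_if_coding": shift_bp % 3 != 0,
    }


if __name__ == "__main__":
    # Hypothetical mononucleotide tract: (A)10 in normal tissue, (A)9 in tumor.
    normal = "GGC" + "A" * 10 + "TTC"
    tumor = "GGC" + "A" * 9 + "TTC"
    print(compare_microsatellite(normal, tumor, "A"))
```

A one-base contraction of a mononucleotide tract is flagged as a frameshift, which is the mechanism by which MMR deficiency inactivates genes that contain such repeats.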
According to Vogelstein's model of CRC carcinogenesis, the etiology of CRC is multifactorial and is likely to involve the actions of genes at multiple steps of the carcinogenic process. Genes that have been implicated in the pathogenesis of CRC include p53, p16, p14, APC, β-catenin, E-cadherin, Transforming Growth Factor (TGF)-β, SMADs, MLH1, MSH2, MSH6, PMS2, AXIN, STK11, PTEN, DCC, and KRAS (Sayar and Banerjee, 2007).

MOLECULAR MUTATIONS
Mutations, whether point or gross, in a number of tumor-suppressor genes and oncogenes have been implicated in the development of colorectal cancer in populations throughout the world. Several important genes whose mutations have been found in colorectal cancer are discussed briefly below.

TP53
The TP53 gene is located on the short (p) arm of chromosome 17, and 17p deletions are found in 6-25% of colonic adenomas and in as many as 75% of colonic carcinomas (Baker et al., 1989). The TP53 gene encodes a protein that maintains genomic integrity by inducing cell cycle arrest and apoptosis when DNA is damaged (Levine et al., 1991). Mutations in the TP53 gene occur in almost half of all CRCs and have been proposed as a late event in the transition from adenoma to carcinoma (Harris and Hollstein, 1993). According to one view, mutations in p53 increase the half-life of the protein and are often associated with its overexpression in the nucleus (Remvikos et al., 1990). Most mutations in the TP53 gene occur in exons 5-8, in highly conserved regions, and in the three main structural domains of the p53 protein (L2, L3, and loop-sheet-helix) (Borrensen-Dale et al., 1998). These mutations cause the synthesis of a stable protein that loses the ability to bind DNA and to activate target genes (Soussi and Beroud, 2001).

KRAS
The KRAS gene, located at 12p12, consists of six exons spread over 35 kb of genomic DNA; alternative RNA splicing yields two transcripts of 5.5 and 3.8 kb. Ras proteins are small monomeric proteins of 189 amino acids with a molecular weight of 21 kD (Watzinger and Lion, 1999). They function as small GTPases that cycle between inactive guanosine diphosphate (GDP)-bound and active guanosine triphosphate (GTP)-bound conformations (Ras-GDP and Ras-GTP, respectively) (Boguski and McCormick, 1993; Donovan et al., 2002). The human ras family consists of three proto-oncogenes, Harvey (H)-, Kirsten (K)-, and N-ras, all of which possess an intrinsic GTPase activity that is implicated in the regulation of their activity. RAS proteins control multiple pathways in a tissue-specific manner, affecting cell growth, differentiation, and apoptosis (Khosrave-far and Der, 1994). Specific mutations in the RAS genes lead to the formation of constitutively active proteins, which trigger the transduction of proliferative and/or differentiative signals even in the absence of extracellular stimuli (Fearon and Vogelstein, 1990; Schubbert et al., 2007).

BRAF
The BRAF gene is located at 7q34 between the NDUFB2 and MRPS33 genes. It is composed of 18 exons spanning a region of 190 kb. Its mRNA is 2478 bp long, and the Braf protein consists of 766 amino acid residues with a molecular weight of 85 kD. The BRAF gene is a proto-oncogene that belongs to the serine/threonine kinase family. It is also a member of the RAF subfamily, together with the ARAF and RAF1 genes. The Braf protein contains three highly conserved domains: RBD (Ras-binding domain), CRD (cysteine-rich domain), and KD (kinase domain). Within the kinase domain lie two other specific regions: a glycine motif (G-loop) in exon 11 and the activation segment (AS) in exon 15 (Domingo and Schwartz, 2004).
BRAF carries somatic mutations in several types of tumor, predominantly malignant melanoma, sporadic colorectal tumors showing MMR defects and microsatellite instability, low-grade ovarian serous carcinoma, and thyroid papillary cancer. About 80% of these mutations correspond to the hotspot transversion T1799A, which causes the amino acid substitution V600E (Davies et al., 2002). The other 20% comprise a wide range of missense mutations, all of which reside in the glycines of the G-loop in exon 11 or in the AS in exon 15 near V600. The V600E mutation confers transforming activity on cells because it mimics the phosphorylation of T599 and/or S602 in the AS, so that Braf remains constitutively active in a RAS-independent manner (Wan et al., 2004).
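The relationship between the nucleotide change T1799A and the protein change V600E can be checked with a little codon arithmetic. The sketch below is illustrative only: it assumes that position 1799 is counted from the first base of the coding sequence and that the wild-type codon 600 is GTG (an assumption not stated in the text above); the helper names are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
PURINES = {"A", "G"}


def codon_of(cds_position):
    """Map a 1-based coding-sequence position to (codon number, 0-based offset)."""
    return (cds_position - 1) // 3 + 1, (cds_position - 1) % 3


def substitution_class(ref, alt):
    """Purine<->purine or pyrimidine<->pyrimidine is a transition, else a transversion."""
    return "transition" if (ref in PURINES) == (alt in PURINES) else "transversion"


def apply_substitution(codon, offset, alt):
    """Return the codon with the base at `offset` replaced by `alt`."""
    return codon[:offset] + alt + codon[offset + 1:]


codon_number, offset = codon_of(1799)
wild_type = "GTG"  # assumed wild-type codon 600 (valine)
mutant = apply_substitution(wild_type, offset, "A")

if __name__ == "__main__":
    print(codon_number, offset)
    print(wild_type, CODON_TABLE[wild_type], "->", mutant, CODON_TABLE[mutant])
    print(substitution_class("T", "A"))
```

Under these assumptions, position 1799 falls on the middle base of codon 600, and replacing that T with an A changes a valine codon into a glutamate codon, consistent with the V600E designation and with the description of the change as a transversion.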

APC
APC is a classical tumor-suppressor gene located on 5q21 and containing 21 exons. The APC transcript is 9.0 kb in length, and the most common isoform of the Apc protein contains 2843 amino acids with a molecular weight of 310 kD. The Apc protein consists of an oligomerization domain and an armadillo region in the N-terminus, a number of 15- and 20-amino acid repeats in the central portion, and a C-terminus that contains a basic domain and binding sites for EB1 and the human disk large (HDLG) protein. Although APC is an integral part of the Wnt-signaling mechanism, it also plays a role in cell-cell adhesion, stability of the microtubular cytoskeleton, cell cycle regulation, and possibly apoptosis (Fearnhead et al., 2001).
The APC gene product indirectly regulates the transcription of a number of critical cell proliferation genes through its interaction with the transcription factor β-catenin. Apc binding to β-catenin leads to ubiquitin-mediated β-catenin destruction; loss of Apc function increases transcription of β-catenin targets. These targets include cyclin D, C-myc, ephrins, and caspases. Apc also interacts with numerous actin- and microtubule-associated proteins, and Apc itself stabilizes microtubules. Homozygous Apc truncation has been shown to affect chromosome attachment in cultured cells. Roles for Apc in cell migration have been demonstrated in vitro and in mouse models (Hamelin, 1998; Polakis, 2000; Tirnauer, 2005).
In addition to mutational inactivation, hypermethylation of the gene promoter is the other important mechanism associated with gene silencing. Hypermethylation of CpG islands in gene promoters is a frequent epigenetic change in many tumors and is usually associated with the loss of transcription of APC (Esteller et al., 2000; Rowan et al., 2000; Tsuchiya et al., 2000; Virmani et al., 2001; Esteller and Herman, 2002; Kang et al., 2003; Lind et al., 2004; Zare et al., 2009). Hypermethylation of the APC gene promoter has been reported in about 20-48% of human CRCs (Hiltunen et al., 1997; Esteller et al., 2000; Arnold et al., 2004; Lind et al., 2004).

β-Catenin
The β-Catenin gene is located at 3p22-p21.3 and encompasses 23.2 kb of DNA containing 16 exons; its mRNA transcript is about 2343 bp long and encodes a protein of 781 amino acid residues with a molecular weight of 92 kD. The β-Catenin protein contains, from N-terminus to C-terminus, a phosphorylation site for the serine-threonine glycogen synthase kinase-3β (GSK-3β), an α-Catenin binding site, 13 armadillo repeats, and a transactivating domain. β-Catenin is assumed to transactivate mostly unknown target genes, which may stimulate cell proliferation (in which case it acts as an oncogene) or inhibit apoptosis. The β-Catenin level in the cell is regulated by its association with the adenomatous polyposis coli (APC) tumor-suppressor protein, axin, and GSK-3β. Phosphorylation of β-Catenin by the APC-axin-GSK-3β complex leads to its degradation by the ubiquitin-proteasome system (Debuire et al., 2002). β-Catenin is mutated in up to 10% of all sporadic colorectal carcinomas, by point mutations or in-frame deletions affecting the serine and threonine residues that are phosphorylated by GSK-3β (Polakis, 2000). These mutations result in stabilization of β-Catenin and activation of WNT signaling. Mutations in β-Catenin are mutually exclusive with APC aberrations, as both molecules are components of the same pathway (Behrens, 2005).

SMAD4
The SMAD4 gene, also known as MADH4, DPC4, and JIP, is located on the long arm (q) of chromosome 18 at band 21.1. The gene encompasses 49.5 kb of DNA with 13 exons, of which the first two do not code for any amino acid and hence constitute the 5′-UTR of the SMAD4 gene. The SMAD4 mRNA transcript is 3220 nucleotides long (Saffroy et al., 2004). The protein product, Smad4, belongs to the Dwarfin family of proteins, which harbor two conserved amino- and carboxyl-terminal domains known as MH1 and MH2, respectively. In the basal state Smad4 is found mostly as a homo-oligomer, most likely a trimer. It is ubiquitously expressed within the human body. Smad4 is an intracellular mediator of signaling by the TGF-β family and the activin type 1 receptor, and it mediates TGF-β signaling to regulate cell growth and differentiation. TGF-β stimulation leads to phosphorylation and activation of Smad2 and Smad3, which form complexes with Smad4 that accumulate in the nucleus. By interacting with DNA-binding proteins, these Smad complexes then positively or negatively regulate the transcription of target genes (Attisano and Wrana, 2000; Massagué et al., 2000; Wrana, 2000; Attisano and Lee-Hoeflich, 2001; Shi, 2001).
The role of SMAD4 as an important tumor-suppressor gene emerged from an allelotype study of pancreatic adenocarcinoma (Shi, 2001). The tissue restriction of alterations in DPC4, as in many other tumor-suppressor genes, emphasizes the complexity of rate-limiting checkpoints in human tumorigenesis (Schutte et al., 1996).
Smad4 was proposed to be a tumor-suppressor gene whose loss disrupts TGF-β signaling. Mutant Smad4 proteins identified in human carcinomas were found to be impaired in their ability to regulate gene transcription. Most Smad4 gene mutations in human cancer are missense, nonsense, and frameshift mutations in the MAD homology 2 (MH2) region, which interfere with homo-oligomer formation of the Smad4 protein and with hetero-oligomer formation between Smad4 and Smad2 proteins, resulting in disruption of TGF-β signaling (Shi, 2001; Woodford-Richens et al., 2001; Roth et al., 2003).

AXIN
Axis inhibition protein (AXIN1) and its homolog AXIN2 (also known as conductin) are tumor-suppressor genes whose proteins act as master scaffolds. AXIN1 is located on 16p13.3, whereas AXIN2 is located on 17q24.1 (Atlas.org). Axin has pleiotropic effects on various signaling pathways. One of its key functions is to negatively regulate the activity of the WNT pathway by enhancing formation of the β-Catenin destruction complex. The Wnt/Wingless signaling pathway plays an important role in both embryonic development and tumorigenesis (Hong-Tao et al., 2007). The genomic regions containing the AXIN1/2 genes show loss of heterozygosity and rearrangements in a variety of cancers. In addition, somatic point mutations and deletions have been identified in CRC, hepatocellular carcinomas, ovarian endometrioid adenocarcinomas, and hepatoblastomas. Many of these mutations/deletions result in translation of truncated proteins that are likely to be functionally inactive (Lammi et al., 2004).

SINGLE-NUCLEOTIDE POLYMORPHISMS
The human genome contains a massive amount of genetic variation, including insertions/deletions of one or more nucleotides, copy-number variations (CNVs) that can involve DNA sequences of a few kilobases up to millions of bases, and single-nucleotide polymorphisms (SNPs), which are substitutions of a single nucleotide along the DNA (Ionita-Laza et al., 2009; Savas and Liu, 2009). SNPs are the most common form of genetic variation. There are estimated to be >10 million SNPs in the human genome (Miller et al., 2005).
Many molecular and epidemiological studies now focus on the role of SNPs in modulating the risk of various cancers, and a considerable number of them have implicated gene polymorphisms in cancer risk in almost all populations around the world.

TP53 polymorphism
Several polymorphisms have been identified in the TP53 gene, in both non-coding and coding regions (Murphy, 2006; Bojesen and Nordestgaard, 2008; Costa et al., 2008; Whibley et al., 2009). Most of these polymorphisms are SNPs affecting a single base. Within the coding regions of TP53, only two important polymorphisms alter the amino acid sequence of the product (Pietsch et al., 2006); these are located at codon 47 and codon 72 in exon 4. Codon 72 (Arg72Pro), a frequent functional SNP that leads to an arginine-to-proline amino acid change, has been reported by many authors (Thomas et al., 1999; Dumont et al., 2003). Dumont et al. (2003) reported that the Arg72 allele, when homozygous, has an apoptosis-inducing ability 15-fold higher than that of the Pro72 allele. According to Leu et al. (2004), this high apoptosis-inducing ability of the Arg72 allele is due in part to its mitochondrial location, which allows TP53 to interact directly with the pro-apoptotic BAK protein. Studies of the function of this SNP were the basis for testing its impact on the risk and progression of tumors, in which the less apoptotic allele Pro72 was associated with increased risk of tumor development (Marin et al., 2000; Ignaszak-Szczepaniak et al., 2006; Toyama et al., 2007). Codon 47 (Pro47Ser), the second most common polymorphism in TP53, replaces proline with serine and was first identified by Felley-Bosco et al. (1993). Codon 47 encodes proline (CCG) in wild-type p53, but in a small subset of individuals it encodes serine (TCG). The Ser47 polymorphic variant is very rare, with an allele frequency under 5% in populations of African origin (Murphy, 2006; Pietsch et al., 2006; Whibley et al., 2009). In a pioneering study, Li et al. (2005) found that the Ser47 variant, which replaces the proline residue necessary for recognition by proline-directed kinases, is a markedly poorer substrate for phosphorylation.
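To see why the CCG-to-TCG change at codon 47 is the substitution that yields serine, one can enumerate every single-base change of the wild-type codon and classify its effect using the standard genetic code. The sketch below is a minimal illustration; the function names are hypothetical, and only the two codons CCG and TCG come from the text above.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def single_base_neighbours(codon):
    """Yield every codon that differs from `codon` at exactly one position."""
    for pos, ref in enumerate(codon):
        for alt in BASES:
            if alt != ref:
                yield codon[:pos] + alt + codon[pos + 1:]


def classify(ref_codon, alt_codon):
    """Label a codon change as synonymous, missense, or nonsense."""
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if alt_aa == ref_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    return "missense"


WILD_TYPE = "CCG"  # codon 47 in wild-type p53, as given in the text
outcomes = {
    alt: (CODON_TABLE[alt], classify(WILD_TYPE, alt))
    for alt in single_base_neighbours(WILD_TYPE)
}
to_serine = [codon for codon, (aa, _) in outcomes.items() if aa == "S"]

if __name__ == "__main__":
    for codon, (aa, effect) in outcomes.items():
        print(codon, aa, effect)
    print("single-base routes to serine:", to_serine)
```

Of the nine possible single-base changes of CCG, the three at the third position leave proline unchanged, and only the first-position C-to-T change produces a serine codon.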

NQO1 polymorphism
NQO1 is located on chromosome 16q22, is 20 kb in length and has 6 exons and 5 introns. NQO1 is a flavoprotein which functions as a homodimer. The physiological dimer has one catalytic site per monomer. Each monomer consists of 273 amino acids. NQO1 is expressed in human epithelial and endothelial tissues and at high levels throughout many human solid tumors. NQO1 is a mainly cytosolic enzyme although it has also been localized in smaller amounts to mitochondria, endoplasmic reticulum and nucleus (Ross, 2004;Chao et al., 2006). The enzyme is generally considered as a detoxification enzyme because of its ability to reduce reactive quinones and quinone-imines to less reactive and less toxic hydroquinones by its unique ability to use either NADH or NADPH as reducing cofactors (Siegel et al., 2004). Because of its reducing capability NQO1 prevents the generation of semiquinone free radicals and reactive oxygen species with its unique property of transferring two electrons at a time to quinone, thus protecting cells from oxidative damage (Chen et al., 1999;Winski et al., 2002).

CYP2E1 polymorphism
The cytochrome P450 2E1 (CYP2E1) gene is located on 10q26.3. It is 18,754 bp long, consists of nine exons and eight introns, and encodes a 493 amino acid protein. CYP2E1 belongs to the cytochrome P450 superfamily (Wang et al., 2010). It is a natural ethanol-inducible enzyme that is of great interest because of its role in the metabolism and bioactivation of many low molecular weight compounds, including ethanol, acetone, drugs such as acetaminophen, isoniazid, chlorzoxazone, and fluorinated anesthetics, and many procarcinogens such as benzene, N-nitrosoamines, vinyl chloride, and styrene (Guengerich et al., 1991; Kharasch and Thummel, 1993; Ulusoy et al., 2007; Zhou et al., 2010). The CYP2E1 gene contains six restriction fragment length polymorphisms, of which two have drawn much interest: the RsaI polymorphism (CYP2E1*5B; C-1054T substitution) and the 96-bp insertion in its 5′-flanking region (Morita et al., 2008; Wang et al., 2010; Zhou et al., 2010). The RsaI polymorphism has been shown to affect the transcription level of the gene: the variant type of this polymorphic site can enhance transcription and increase the level of CYP2E1 enzymatic activity in vitro (Hayashi et al., 1991). The variant allele of the 96-bp insertion polymorphism was shown to have greater transcriptional activity (Nomura et al., 2003).

MTHFR polymorphism
The MTHFR gene, located on 1p36.22, encompasses 19.3 kb of DNA and is composed of 11 exons. The gene codes for a 74.6-kD protein of 656 amino acids (Saffroy et al., 2005). MTHFR is a cytosolic enzyme that catalyzes the conversion of 5,10-methylene tetrahydrofolate (THF) to 5-methylTHF, a cosubstrate for homocysteine remethylation to methionine with subsequent production of S-adenosyl methionine (SAM), the universal methyl donor in humans, which is required for DNA methylation. The methylation of homocysteine is catalyzed by the enzyme methionine synthase, which requires the cofactor vitamin B12. MTHFR is also linked to the production of dTMP via thymidylate synthase and to purine synthesis and therefore plays a role in the provision of nucleotides essential for DNA synthesis (Wagner, 1995). Thus, a defect in the MTHFR gene can be reflected in the methylation pattern of DNA as well as in its synthesis. Two common functional polymorphisms have been defined in the MTHFR gene: C677T and A1298C. The C677T polymorphism is the more important of the two in regulating the function of the enzyme. It results in an alanine-to-valine substitution at codon 222 of the protein (Frosst et al., 1995). This polymorphism has a profound effect on the MTHFR protein: it not only decreases the thermal stability of the enzyme but also reduces its activity (Cicek et al., 2004). Individuals with the variant Val/Val genotype (TT) have no more than 30% of normal enzyme activity, and heterozygotes (CT) have 65% of normal enzyme activity (Frosst et al., 1995; Kono and Chen, 2005). The substitution also results in lower levels of 5-methyltetrahydrofolate, an accumulation of 5,10-methylenetetrahydrofolate, and increased plasma homocysteine levels (Frosst et al., 1995; Ma et al., 1997; Bagley and Selhub, 1998).
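The genotype-specific activities quoted above can be combined into a rough population-level figure. The sketch below is an illustration only: it assumes Hardy-Weinberg genotype proportions (an assumption not made in the text), treats the "no more than 30%" figure for TT as exactly 30% (so the result is an upper bound), and takes the T allele frequency as a hypothetical input.

```python
# Relative MTHFR activity by C677T genotype, as quoted in the text.
# The TT value is an upper bound ("no more than 30%").
RELATIVE_ACTIVITY = {"CC": 1.00, "CT": 0.65, "TT": 0.30}


def genotype_frequencies(t_allele_freq):
    """Hardy-Weinberg genotype proportions for a given T allele frequency (assumption)."""
    if not 0.0 <= t_allele_freq <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    q = t_allele_freq
    p = 1.0 - q
    return {"CC": p * p, "CT": 2 * p * q, "TT": q * q}


def mean_relative_activity(t_allele_freq):
    """Frequency-weighted mean of relative enzyme activity across genotypes."""
    freqs = genotype_frequencies(t_allele_freq)
    return sum(freqs[g] * RELATIVE_ACTIVITY[g] for g in freqs)


if __name__ == "__main__":
    # Hypothetical allele frequencies, chosen only to show the trend.
    for q in (0.1, 0.3, 0.5):
        print(q, round(mean_relative_activity(q), 3))
```

The weighted mean is simply p²·1.00 + 2pq·0.65 + q²·0.30, so it falls from 1.00 when the T allele is absent toward 0.30 as the T allele approaches fixation.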

CONCLUSION
There is an unresolved debate among geneticists and molecular oncologists as to which of the two pathways, CIN or MIN, initiates tumorigenesis and predisposes an individual to the development of colorectal cancer; it is still not clear which of the two is the first event in carcinogenesis. CIN is generally regarded as the earlier event in tumorigenesis and subsequent tumor progression. In addition, epigenetic silencing of various CIN and MIN pathway genes, which occurs either exclusively or in addition to mutations of these genes, adds a second dimension of complexity to the molecular mechanism of tumor development.
Molecular biologists are also redoubling their efforts to define and characterize the pathways associated with DNA structure and function, such as DNA methylation and chromatin modification, changes in the patterns of mRNA and noncoding RNA expression, and the consequent effects on protein expression and posttranslational modification, all of which are variables affecting CRC tumorigenesis. Because there is more than one pathway to tumorigenesis, it is important for the molecular oncologist to obtain data on age, sex, tumor site, ethnicity, diet, and gut flora when investigating genetic and epigenetic risk factors for CRC, in order to understand the complex interactions among dietary and environmental agents.