Genetic Influences on Cystic Fibrosis Lung Disease Severity

Understanding the causes of variation in clinical manifestations of disease should allow for design of new or improved therapeutic strategies to treat the disease. If variation is caused by genetic differences between individuals, identifying the genes involved should present therapeutic targets, either in the proteins encoded by those genes or the pathways in which they function. The technology to identify and genotype the millions of variants present in the human genome has evolved rapidly over the past two decades. Originally only a small number of polymorphisms in a small number of subjects could be studied realistically, but speed and scope have increased nearly as dramatically as cost has decreased, making it feasible to determine genotypes of hundreds of thousands of polymorphisms in thousands of subjects. Such genetic technology has been applied to cystic fibrosis (CF) to identify genetic variation that alters the outcome of this single gene disorder. Candidate gene strategies to identify these variants, referred to as “modifier genes,” have yielded several genes that act in pathways known to be important in CF, and for these the clinical implications are relatively clear. More recently, whole-genome surveys that probe hundreds of thousands of variants have been carried out and have identified genes and chromosomal regions for which a role in CF is not at all clear. Identification of these genes is exciting, as it provides the possibility for new areas of therapeutic development.


CYSTIC FIBROSIS BACKGROUND
Cystic fibrosis (CF) is the most common lethal autosomal recessive disease in Caucasians, affecting an estimated 1 in 3,300 live-born infants (Davis et al., 1996). Affected individuals have variants in both copies of the 230-kb CF transmembrane conductance regulator gene (CFTR) that result in significant reduction or absence of CFTR function. The CFTR gene is located on the long arm of chromosome 7 at position 7q31 and encodes a 1,480 amino acid protein (Riordan et al., 1989; Rommens et al., 1989) with cAMP-dependent anion channel activity (Bear et al., 1992) found in the apical membranes of epithelial cells in the lungs, olfactory sinuses, pancreas, intestines, vas deferens, and sweat ducts, as well as nonepithelial cells such as immune cells (myeloid cells and lymphocytes) and various muscle cell types (Yoshimura et al., 1991; Krauss et al., 1992; McDonald et al., 1992; Dong et al., 1995; Moss et al., 2000; Robert et al., 2005; Di et al., 2006; Vandebrouck et al., 2006; Divangahi et al., 2009; Lamhonwah et al., 2010). Low or absent CFTR function in the airway epithelium results not only in decreased chloride permeability, but also in increased sodium absorption across the epithelium, impairing hydration of the airway mucosal surface and resulting in thick, sticky mucus and an environment in which bacteria thrive. Thus, typical clinical features of CF include chronic infection and inflammation of the airways. Accordingly, a hallmark characteristic of the CF airways is progressive bronchiectasis; this destruction and dilation of the airways is the primary cause of morbidity and mortality of CF patients. In addition to the airway manifestations, most CF patients will experience exocrine pancreatic insufficiency, males are most often sterile, and other co-morbidities such as liver disease and diabetes are common as well.
Previously considered almost exclusively a pediatric disease, CF now carries a predicted median survival of nearly 40 years for babies born with the disease (Cystic Fibrosis Foundation Patient Registry, 2009).

HETEROGENEITY OF CFTR
To date, over 1,800 CF-associated mutations have been described, and the effects of these mutations have been grouped into six general classes based on the consequence to CFTR message and/or protein (Zielenski, 2000). The classes comprise complete absence of full-length, functional CFTR protein (class I), proteins that do not traffic to the membrane well due to misfolding (class II), proteins that reach the membrane but do not respond to activation stimuli such as phosphorylation (class III), proteins that reach the membrane and activate, but do not conduct anions sufficiently to prevent disease (class IV), mutations that reduce the amount of functional CFTR, such as through altered gene expression or protein trafficking (class V), and proteins that are unstable and experience increased turnover in the plasma membrane (class VI). It should be noted that these classes are not mutually exclusive, as a single change may have multiple effects on the protein.
Given the diversity of mutations, it is perhaps not surprising that there is a wide range of phenotypic variability in CF simply due to variation in CFTR. Many reports of correlations between CFTR genotype and clinical phenotype exist (Kerem et al., 1990a; Stuhrmann et al., 1991; The Cystic Fibrosis Genotype-Phenotype Consortium, 1993; Tsui and Durie, 1997; Zielenski, 2000). The most extensive catalog to date was compiled as an international effort and currently includes data on over 35,000 patients. Because most CF mutations are rare, surveying such a large number of individuals makes it possible to assess the phenotypic effects associated with a genotype more reliably than by extrapolating from individual cases.
In addition to CFTR genotype, there is evidence that gender contributes to phenotypic variability (Davis, 1999). Females are reported to have a reduced median survival age (by approximately 3 years), an earlier average age of Pseudomonas aeruginosa infection in the lungs, greater rates of pulmonary decline, and elevated resting energy expenditure when compared to males (Demko et al., 1995; Corey et al., 1997; Allen et al., 2003). Although some current studies replicate these findings (Barr et al., 2011; Reid et al., 2011), others show no evidence of a gender gap and propose that phenotypic variability could be attributed to non-uniformity of care or the need to account for other factors such as body habitus, presence of diabetes, or the finding that females are more likely to be diagnosed later in life than males (Widerman et al., 2000; Milla et al., 2005; Rodman et al., 2005; Verma et al., 2005; Stern et al., 2008; Fogarty et al., 2012).

GENOMIC HETEROGENEITY AND CLINICAL VARIATION
Even among patients with the same CFTR genotype, there is a wide range of phenotypic variability (Kerem et al., 1990a; Tsui and Durie, 1997). Perhaps most notably, there is remarkable variation of pulmonary phenotype, with some patients maintaining normal lung function well into adolescence and adulthood while others do quite poorly even at a very young age (Kerem et al., 1990a). Understanding the causes of this variation is important, as it provides insight into developing new therapies or improving existing ones.
Clearly environmental factors contribute to clinical variation; exposure to tobacco smoke, bacterial infections, and socioeconomic status have all been implicated as having detrimental effects on pulmonary phenotype of CF patients (Kerem et al., 1990b; Rubin, 1990; Corey and Farewell, 1996; Schechter et al., 2001; O'Connor et al., 2003), while improvement of nutritional status through aggressive treatment has been associated with improvements in pulmonary phenotype (Steinkamp and von der Hardt, 1994). Each of the environmental sources of clinical variation provides a potential intervention point, but it is also clear that there are heritable sources of variation as well (Mekus et al., 2000; Vanscoy et al., 2007), and these may provide insight into even more therapeutic targets.

EVIDENCE OF GENETIC MODIFIERS OF DISEASE
Human twin and sibling studies have been useful in verifying the role of modifier genes, and quantifying their contribution to phenotypic variation. Mekus et al. (2000) found in a survey of 277 sibling pairs, with 29 monozygous and 12 dizygous pairs, that a combined index of lung function and body mass was more concordant among monozygous twins (sharing 100% of genetic material) than dizygous twins or other sibling pairs (sharing 50% of genetic material), pointing to a genetic etiology of variation. Similarly, Vanscoy et al. (2007) examined the pulmonary phenotype of 57 twin pairs and 231 sibling pairs with CF. Lung function measurements were significantly more concordant between monozygous twins than dizygous twins, also indicating the presence of genetic modifiers. The similarity in lung function between sibling pairs was compared to the similarity in lung function in unrelated patients, and again was found to be more similar. Heritability estimates were calculated from these data, and it was determined that non-CFTR genetic variation could account for approximately 50-80% of the pulmonary phenotypic variability in CF patients with the same CFTR genotype (homozygous F508del) (Vanscoy et al., 2007).
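As a rough illustration of how twin concordance can be turned into a heritability estimate, the sketch below applies Falconer's formula, h² = 2(r_MZ − r_DZ), to a pair of within-pair correlations. This is an assumed, textbook approach and not necessarily the method used in the cited studies; the correlation values are hypothetical placeholders, not data from Mekus et al. (2000) or Vanscoy et al. (2007).

```python
def falconer_heritability(r_mz: float, r_dz: float) -> float:
    """Estimate heritability from twin correlations (Falconer's formula).

    r_mz: phenotypic correlation within monozygous twin pairs
    r_dz: phenotypic correlation within dizygous twin pairs
    """
    for r in (r_mz, r_dz):
        if not -1.0 <= r <= 1.0:
            raise ValueError("correlations must lie in [-1, 1]")
    h2 = 2.0 * (r_mz - r_dz)
    # Clamp to the interpretable range [0, 1]
    return max(0.0, min(1.0, h2))


# Hypothetical correlations chosen only for illustration
example_h2 = falconer_heritability(r_mz=0.80, r_dz=0.50)
print(f"estimated heritability: {example_h2:.2f}")
```

The formula rests on the expectation that monozygous twins share all of their genetic material and dizygous twins about half, so doubling the difference in correlation approximates the genetic share of the variance.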

GENETIC APPROACHES
With a genetic component established, the next task was to identify the genes responsible. There are two fundamental strategies by which to accomplish this. One requires family information and is often referred to as linkage analysis. Through this approach, one determines whether a polymorphism's genotype is concordant in siblings with similar clinical profiles and discordant in siblings whose clinical features differ, or whether it shows no pattern. The other approach is association, which determines whether particular alleles of a polymorphism are distributed randomly among patients or have skewed distributions that track with clinical characteristics. These two approaches are outlined in Figure 1, and the findings that these strategies have produced are listed in Table 1, with several examples described in more detail below.
The vast majority of studies have been of the association design, predominantly due to the small number of families with multiple affected children. These studies have evolved over time: cost and time restricted most early studies to screening for potential disease-modifying genes by candidate gene approaches, later studies utilized array-based methods, and whole-genome sequencing will soon be the state of the art. These three approaches are compared in Figure 2.
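To make the association design concrete, the following minimal sketch tests whether two alleles are distributed differently between severely and mildly affected groups, using a Pearson chi-square test on a 2x2 table of allele counts. The choice of test and the counts are assumptions for illustration; the studies reviewed here may have used other statistics.

```python
import math


def allelic_association(severe_a: int, severe_b: int,
                        mild_a: int, mild_b: int) -> tuple[float, float]:
    """Pearson chi-square test (1 df) on a 2x2 table of allele counts.

    Rows are phenotype groups (severe, mild); columns are alleles (A, B).
    Returns (chi-square statistic, p-value).
    """
    table = [[severe_a, severe_b], [mild_a, mild_b]]
    row_totals = [sum(row) for row in table]
    col_totals = [severe_a + mild_a, severe_b + mild_b]
    total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one allele")

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical allele counts (two alleles per subject), for illustration only
chi2, p = allelic_association(severe_a=140, severe_b=60, mild_a=60, mild_b=140)
print(f"chi2 = {chi2:.1f}, p = {p:.2e}")
```

In this invented example the two alleles are equally frequent overall but skewed between the groups, which is the pattern an association study is designed to detect.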

PHENOTYPIC CONSIDERATIONS
As lung disease is the major source of CF-related mortality, most studies have focused on some measure of lung function as a phenotype to examine for association. As most CF care centers carry out standard pulmonary function tests, spirometry has most commonly been used. Other tests may, in fact, be more specific for particular modifying functions, such as lung clearance index, but these are not as widely used and thus less practical for multi-center studies.

CANDIDATE GENES
Candidate genes are those suspected to have a role in some aspect of CF pathophysiology and variants in those genes are then tested for association with disease manifestations. Those traits may be represented by a continuum of values (lung disease severity, for example) or discrete traits, such as the occurrence of intestinal obstruction. Candidate gene selections for study involved many areas because of the complex pathophysiology of CF, including bacterial infections, inflammation, and lung remodeling/deterioration. This approach yielded multiple reports of putative modifiers of the CF pulmonary phenotype. For example, mannose-binding lectin (MBL), a gene involved in innate immunity, was one of the first potential modifier genes described. Low-expressing MBL alleles were found to associate with a more severe pulmonary disease course than those with higher expression (Garred et al., 1999). HLA haplotypes were also investigated as modifiers due to the role of the genes in this complex in innate defense and inflammation. Carriers of the HLA II DR7 haplotype were found to have a higher incidence of P. aeruginosa colonization (Aron et al., 1999).
FIGURE 1 | Linkage analysis tracks alleles of polymorphisms through families to determine if an allele is linked to a phenotype. In this example, alleles of gene 1, 1A, 1B, 1C, and 1D, track with severity (black, severe; gray, mild), showing concordant genotypes between siblings with similar phenotypes (left pedigree) and discordant genotypes when phenotypes are dissimilar (right pedigree). In contrast, genotype and phenotype show no relationship at polymorphism 2. Association studies examine a population of unrelated individuals to determine if particular alleles of a polymorphism are found in different proportions, depending on the disease profile. In the example here, alleles 1A and 1B have equal frequencies in the population, but 1A is much higher in the severely affected subjects (black) and 1B higher in the mildly affected subset (gray).
Polymorphisms within cytokines and other inflammatory mediators were also investigated as potential modifiers of CF pulmonary disease due to their role in immune response. Tumor necrosis factor alpha (TNFα) is a pro-inflammatory cytokine that is stimulated by NF-κB as a first line of defense against infection. The minor allele of a TNFα promoter polymorphism associated with worse pulmonary function in a small set of CF patients (Hull and Thomson, 1998). Interestingly, the TNFα minor allele that associated with a worse CF prognosis was also associated with an increase in mRNA expression level when measured using a reporter construct (Wilson et al., 1992). Interleukin-10 (IL-10), an anti-inflammatory cytokine, was also investigated. As with TNFα, an IL-10 promoter polymorphism was associated with differences in IL-10 expression. In this case, the lower expressing IL-10 allele was associated with worse CF disease. These studies supported a model in which higher levels of the pro-inflammatory cytokine TNFα and lower levels of the anti-inflammatory cytokine IL-10 contribute to more severe CF lung disease.

CHALLENGES OF EARLY CANDIDATE GENE MODIFIER STUDIES
Early studies that attempted to identify potential modifiers were challenged by small numbers of study subjects. Typically, pulmonary function data using standard spirometry are not available on children younger than age 6, and multiple measures over time are needed to assess a subject's trajectory, as an indicator of current and future disease severity. Nonetheless, numerous studies compared pulmonary function of subjects over a range of ages, statistically adjusting for age. Younger patients were included in order to maximize participation, but epidemiologic studies indicated that much of the pulmonary phenotypic variability was not present until after puberty (Zemel et al., 2000).
An additional constraint is that not all mutations in CFTR have the same consequences on protein function and thus it is likely to confound interpretation if CFTR genotype is not accounted for. Consequently, after limiting to patients with sufficient lung function measurements and comparable CFTR genotypes, the number of available subjects is low, making it unfeasible for any single center to carry out an association study that would have the statistical power to detect anything but a very major effect of a modifier gene.

CONSORTIUM APPROACHES
The ability to effectively carry out genetic studies is limited by numbers of subjects. As a means to increase numbers, the European CF Twin and Sibling Study mentioned earlier was conceived and compared morphometric and pulmonary function indices of sib pairs. Using lung function measurements from patients in North America and Europe, this study was the first to compare lung function using a CF population for reference (Mekus et al., 2000).

The association of MBL2 deficiency alleles with indicators of pulmonary disease severity was replicated in a population of 298 adults, but refuted in a population of 260 children.
The Trevisiol et al. (2005) study replicated an association of MBL2 deficiency alleles with pulmonary function, but not with P. aeruginosa colonization. *The study by Arkwright et al. (2000) found the severe variant at codon 10 to be T/T, but the study by Drumm et al. (2005) found the severe variant to be C/C at codon 10. A more detailed discussion of the TGFB1 association with CF can be found in the text.

FIGURE 2 | Candidate gene approaches (A) have only involved a few variants in one to several dozen genes. Given a genome of roughly 25,000 genes, this represents a very small sampling (∼0.01% or less). GWAS (B) samples a much larger component of the genome, probing more than 90% of the genes, but it still only examines less than 5% of the over 50 million reference SNPs (http://www.ncbi.nlm.nih.gov/mailman/pipermail/dbsnp-announce/2012q2/000123.html) curated as of June, 2012. As costs come down, exome (not shown) and whole-genome sequencing (C) provide the potential to capture all variation in study subjects.
severe patients for differences in allele or genotype frequencies of single nucleotide polymorphisms (SNPs) or other gene-associated variants as markers of potential modifier genes. Phenotypic categories of disease severity were defined using a patient's forced expiratory volume in 1 s (FEV1), a pulmonary function index based on age, sex, and height, and used clinically to monitor CF disease progression and therapeutic efficacy. Subjects with FEV1 values in the upper quintile were classified as "mild" and those in the lower quintile as "severe." Those subjects surviving beyond the age of 34 were classified as mild regardless of pulmonary function, as they represented the upper quintile of their birth cohort (Schluchter, 1992; Schluchter et al., 2002). DNA was obtained from these individuals and genotyped for a variety of variants in or near genes that were considered candidate modifiers.
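The classification rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: lung function is represented here as a hypothetical percentile (0 to 100) within the reference distribution, the "intermediate" label for subjects outside both quintiles is invented for the example, and the field names are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Subject:
    subject_id: str
    fev1_percentile: float  # assumed position in the reference distribution, 0-100
    age_years: float


def classify_severity(subject: Subject,
                      long_survivor_age: float = 34.0) -> str:
    """Return 'mild', 'severe', or 'intermediate' for one subject."""
    # Subjects surviving beyond the cutoff age count as mild
    # regardless of pulmonary function
    if subject.age_years > long_survivor_age:
        return "mild"
    if subject.fev1_percentile >= 80.0:   # upper quintile
        return "mild"
    if subject.fev1_percentile <= 20.0:   # lower quintile
        return "severe"
    return "intermediate"                 # hypothetical label, not from the source


# Hypothetical subjects, for illustration only
cohort = [
    Subject("s1", fev1_percentile=90.0, age_years=20.0),
    Subject("s2", fev1_percentile=10.0, age_years=20.0),
    Subject("s3", fev1_percentile=10.0, age_years=40.0),
]
for s in cohort:
    print(s.subject_id, classify_severity(s))
```

Checking the survival rule first reflects the statement that long-term survivors are treated as mild whatever their lung function.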
In the initial candidate gene approach, 1,064 SNPs were tested in over 300 genes/gene regions that were chosen in the following ways: (1) they were SNPs that had previously been reported in the literature as associating with CF phenotype, (2) they were SNPs that were reportedly associated with similar pulmonary disease phenotypes, or (3) they were in genes that were known to play a key role in CF pathophysiology (Drumm et al., 2005).
Experience using this approach has shed light on the challenges involved in conducting modifier studies. Early studies struggled to achieve statistical power due to small sample sizes. Long and Langley (1999) calculated that the sample size must include at least 500 individuals in order to detect a causative polymorphism and for its association to be replicable. To accommodate the ability to replicate and maximize power, the GMS expanded to a North American Consortium that included a family-based genetic study at the Johns Hopkins University and a population-based study of Canadian CF patients led by investigators at the University of Toronto and the Hospital for Sick Children (Taylor et al., 2006). This consortium grew from the need to increase sample size and carry out replication studies, and it demonstrated its utility in a report showing that variants in the TGFB1 gene associate with pulmonary disease (Drumm et al., 2005) (discussed in more detail below).
The union of the three large studies provided a cohort of unprecedented size for studying modifiers of a single gene disorder, but also presented logistical issues, as each group had developed its own methods for assessing pulmonary phenotypes. Kulich et al. (2005) generated CF-specific reference equations for FEV1 that compare a CF subject's lung function to that of CF subjects of the same age, sex, and height, a more appropriate reference than the non-CF population. Those values, adjusted for survival, were used to develop a phenotypic index that all three designs could incorporate.
The candidate gene approach showed the effectiveness of genetic studies, but a limitation is that it does not identify genetic locations other than those suspected to influence disease. That is, it will not detect modifying genes or pathways beyond those involved in our limited understanding of the disease. Understanding the functional effects of a modifier and its protein product fuels future studies to provide mechanistic insight into disease pathophysiology and how it might be dealt with (Cutting, 2010).

ASSOCIATING GENES AND INSIGHT INTO THEIR MODIFYING MECHANISMS
One of the powerful attributes of genetics is that it allows one to identify clinically relevant genes, proteins, or pathways by virtue of the effect that variation in the gene produces on a clinical trait. However, the mechanisms by which genetic variation acts on the phenotype are not necessarily obvious. Thus, for any associating gene an obligatory step is to carry out functional studies to understand how it imparts its effect on disease presentation or outcome. Some examples are given below.

ASSOCIATING GENES: MBL
Mannose-binding lectin is a serum protein involved in innate immunity. MBL enhances phagocytosis of infectious organisms, especially during infancy, when adaptive immune response is immature (Eisen and Minchinton, 2003). Variant alleles that decrease MBL serum levels increase risk for many different infections (Garred et al., 1995, 1997; Summerfield et al., 1995, 1997) and have been shown to play a role in autoimmune diseases (Davies et al., 1995; Graudal et al., 1998). MBL has been suggested to regulate inflammatory responses, perhaps by delaying one of the first steps in inflammation or by reducing the levels of inflammatory cytokines (Jack et al., 2001). MBL is an attractive CF modifier candidate because it protects against infection and has some role in modulating inflammation.
Three amino acid substitutions in exon 1 (alleles B, C, and D) each contribute to decreased MBL plasma concentrations and are collectively referred to as 0, or null, alleles, with the functional allele, containing none of the above variants, designated A. There are also variants with quantitative effects on mRNA expression, termed X, that result in low MBL serum levels. Genotypes resulting in low MBL levels are designated low-producing or deficient, but there are also genotype combinations associated with high and intermediate serum levels of MBL. Following the rationale that MBL protects against bacterial infection or somehow suppresses inflammation, MBL deficiency alleles would be predicted to associate with more severe CF lung disease.
In support of such a model, Garred et al. (1999) found that patients with higher expression MBL genotypes had a higher FEV1 and forced vital capacity (FVC). In other words, there was an additive effect of poor pulmonary function in the presence of an X allele. After further analysis, the cumulative adverse effects of low expression alleles were restricted to patients with chronic P. aeruginosa and were more pronounced in adults. MBL deficiency did not significantly associate with chronic colonization of P. aeruginosa. A study by Gabolde et al. found that cirrhosis of the liver was more common in CF patients carrying deficiency alleles, but other sources are conflicting about the association with CF liver disease (Gabolde et al., 2001; Bartlett et al., 2009; Tomaiuolo et al., 2009).
Several studies agree that MBL low expression alleles associate with lung function (Gabolde et al., 1999; Davies et al., 2004; Yarden et al., 2004; Trevisiol et al., 2005; Choi et al., 2006; Buranawuti et al., 2007; Dorfman et al., 2008), but there is no consensus as to whether this effect is only seen in patients colonized with P. aeruginosa, and whether a heterozygous genotype is sufficient to cause such impairment. Two studies found an association with chronic P. aeruginosa colonization (Trevisiol et al., 2005; McDougal et al., 2010), whereas others failed to detect an association between MBL alleles and colonization of any kind. Buranawuti et al. (2007) found that MBL high expression alleles predicted survival; the null genotype was underrepresented in adult populations and overrepresented in patients who died late in adolescence. This is consistent with multiple observations that the adverse effect of deficiency alleles is more pronounced in adults (Garred et al., 1999; Yarden et al., 2004; Buranawuti et al., 2007). In fact, a study by Davies et al. (2004) found no association between pulmonary function and MBL genotype in children. Despite replications, not all studies have detected associations between MBL alleles and lung disease severity (Carlsson et al., 2005; Drumm et al., 2005; Faria et al., 2009; McDougal et al., 2010).

ASSOCIATING GENES: TGFB1
As alluded to above, the first significant association identified by the consortium approach demonstrated that severity of pulmonary disease tracked with variants in the TGFB1 gene (Drumm et al., 2005). TGFB1 encodes transforming growth factor beta-1 (TGFβ1), a protein with complex function that is involved in several cellular processes, from differentiation and proliferation to innate immunity, and has been studied in relation to many disorders including Alzheimer's disease, cancer, Marfan syndrome, and heart disease (Waltenberger et al., 1993; Yamamoto et al., 1993; Dickson et al., 2005; Brooke et al., 2008). Interest in investigating TGFβ1 as a potential modifier of CF pulmonary disease stemmed from both its biologic plausibility and its identification as a modifier of asthma and chronic obstructive pulmonary disease (COPD) (Pulleyn et al., 2001; Celedon et al., 2004; Silverman et al., 2004; Wu et al., 2004).
TGFβ1 is biologically relevant to CF for several reasons. Leukocytes secrete TGFβ1 in response to infectious agents. TGFβ1 participates in the immune process by regulating the production of cytokines, and is generally thought to be pro-inflammatory in nature (Omer et al., 2003). TGFβ1 also increases the formation of extracellular tissue during injury repair by increasing production of connective tissue through altered gene regulation (Bartram and Speer, 2004). Post-injury repair in the lung is a delicate balance; inadequate remodeling leads to poor wound healing, whereas excessive remodeling leads to pathogenic fibrosis and scarring. There is strong evidence to suggest that the difference between these outcomes is at least in part related to TGFβ1 expression levels (Bartram and Speer, 2004).
Variation in TGFβ1 has been shown to modify asthma and COPD. A variant in the promoter region (C-509T), thought to be associated with increased TGFβ1 expression, was studied as a potential contributor to asthma disease severity. In two separate studies, homozygosity for the T allele (associated with increased TGFβ1 production) was found to be more common among severe asthmatics when compared to mild asthmatics or healthy controls (Pulleyn et al., 2001; Silverman et al., 2004). Variation in codon 10 was studied in patients with COPD. In this case, the allele associated with increased TGFβ1 production was found more commonly in control patients, suggesting a protective role for TGFβ1 in COPD (Wu et al., 2004). Contrasting with associations found in asthma patients, the T allele of -509 was more prevalent in those with mild COPD (Celedon et al., 2004).
The TGFβ1 variants that have been implicated in other airway diseases have become a source of interest in CF as well. A study by Arkwright et al. (2000) found that the T allele (high producer genotype) in codon 10 associated with more rapid deterioration in lung function, while the genotype at codon 25 did not correlate with survival or lung function. Another study confirmed the codon 10 association found by Arkwright but, interestingly, it was the C allele (low producer genotype) that prevailed in severe patients (Drumm et al., 2005). This finding, replicated in a second population of 498 patients, is counterintuitive given the protective role of TGFβ1 in COPD. The same study by Drumm et al. found that the -509 T allele also associated with a severe pulmonary phenotype, which is the same adverse effect seen in asthma populations. There have been several attempts to resolve these conflicting data (Arkwright et al., 2000, 2003; Drumm et al., 2005; Brazova et al., 2006; Buranawuti et al., 2007; Bremer et al., 2008; Corvol et al., 2008; Faria et al., 2009), but only one study has used a relatively large cohort to accommodate the statistical power needed. It found that a haplotype of a 3′ C allele (rs8179181), -509 C, and codon 10 T associated with improved lung function to a greater degree than any SNP alone (Bremer et al., 2008). It would appear from these studies that CF more closely mimics the type of disease seen in asthma and that the same polymorphisms may be protective or adverse, depending on the genetic and environmental context.
Gu et al. (2009) applied a novel strategy, pooling equal amounts of DNA from similarly affected subjects into "mild" and "severe" pools, and examined 320 patients in the GMS population (160 with severe lung disease, 160 with mild lung disease) at much lower cost and in much less time than the other efforts. By quantifying the signal for each allele (rather than a yes/no output), the genotyping arrays were used to estimate allele frequencies in the pools. Discordant allele frequencies were identified between the pools using this strategy (Gu et al., 2009) and indicated that alleles of IFRD1 may contribute to pulmonary disease severity. In a subsequent study, however, IFRD1 variants did not significantly associate with lung disease.
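The pooling idea can be sketched as follows: if the array reports a quantitative signal for each allele, the fraction of signal attributable to one allele gives a rough estimate of its frequency in the pool, and the estimates for the two pools can then be compared. The function names and intensity values below are hypothetical, and the sketch omits any correction for unequal signal efficiency between alleles, which a real analysis would likely require; it is not a reconstruction of the procedure used by Gu et al. (2009).

```python
def pooled_allele_frequency(signal_a: float, signal_b: float) -> float:
    """Estimate the frequency of allele A in a DNA pool from array signals."""
    total = signal_a + signal_b
    if signal_a < 0 or signal_b < 0 or total <= 0:
        raise ValueError("signals must be non-negative with a positive sum")
    return signal_a / total


def frequency_difference(mild_signals, severe_signals):
    """Difference in estimated allele A frequency (severe minus mild)."""
    return (pooled_allele_frequency(*severe_signals)
            - pooled_allele_frequency(*mild_signals))


# Hypothetical probe intensities (allele A, allele B) for one SNP
mild_pool = (300.0, 700.0)
severe_pool = (450.0, 550.0)
delta = frequency_difference(mild_pool, severe_pool)
print(f"allele A frequency difference: {delta:+.2f}")
```

SNPs with the largest differences between pools would then be candidates for follow-up by individual genotyping.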

ASSOCIATING GENES: IFRD1
The IFRD1 protein acts in a histone deacetylase (HDAC)-dependent manner to regulate gene expression (Vietor et al., 2002) and the IFRD1 gene is up-regulated during cell differentiation and regeneration in response to stress (Vietor and Huber, 2007). Previous studies found high expression in human blood cells (SymAtlas, 2008) and Gu et al. found highest expression in neutrophils, where up-regulation occurs during the final differentiation steps (Ehrnhoefer, 2009; Gu et al., 2009). The authors suggested that IFRD1 modulates CF lung disease through the regulation of neutrophil effector function, but that other explanations, involving different cell types, should not be ignored.

GENOME-WIDE ASSOCIATION STUDIES
The cost of large-scale genotyping had fallen more than 1,000-fold since these studies were initiated, although genome sequencing was still well out of range in price and feasibility. Thus, it became feasible to consider whole-genome, or genome-wide, association studies (GWAS). A GWAS would rapidly interrogate hundreds of thousands of SNPs for association in large populations (Manolio, 2010) without bias imposed by pre-existing models and provide the opportunity to identify novel genes, regulatory loci, and pathways not previously considered. The disadvantage of testing so many variants is that there are statistical penalties that increase as the number of comparisons rises, and thus power is a major limitation (Cutting, 2010). This is less of a concern if the effect of a locus is large, but as common population variants are being examined in these studies, it is likely that the effects of any one locus are not large, with each perhaps accounting for only a few percent of the variation (Long and Langley, 1999). It is important to understand that these studies are conceptually analogous to those designed to find disease-causing genes, but such genes would have major effects if they do, in fact, cause disease.
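The statistical penalty for multiple comparisons can be illustrated with a Bonferroni correction, in which the per-test significance threshold is the desired family-wise error rate divided by the number of tests. The Bonferroni method and the 0.05 error rate are assumptions chosen for simplicity; the text does not state which correction the studies applied. The test counts echo the 1,064 candidate SNPs and the roughly 500,000 GWAS variants mentioned in this review.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test threshold that keeps the family-wise error rate at alpha."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Compare a single test, a candidate gene panel, and a GWAS-scale array
for n_snps in (1, 1_064, 500_000):
    threshold = bonferroni_threshold(0.05, n_snps)
    print(f"{n_snps:>7} tests -> p < {threshold:.1e}")
```

Because the threshold shrinks in proportion to the number of variants tested, a modifier with a modest effect needs a much larger cohort to reach significance in a genome-wide scan than in a candidate gene study.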

GWAS-IDENTIFIED ASSOCIATIONS
In a combined GWAS and family-based (linkage) study, 3,467 CF patients were tested for associations between lung disease severity and more than half a million SNPs. To accommodate the various study designs and data acquisition protocols, yet another method to examine pulmonary function was developed, using age-specific CF percentile values of FEV1 (Kulich et al., 2005; Taylor et al., 2011) and accounting for mortality and longitudinal changes. With this phenotype and over 500,000 common genetic variants to assess for association, two new loci, one on chromosome 11p13 and one on chromosome 20q13, were identified as having variants that associate with lung function in CF.
The region on chromosome 11p13 of most significant association lies between two annotated genes, APIP and EHF. APIP encodes Apaf-1-interacting protein and EHF is a member of the epithelial-specific Ets transcription factors, both of which provide interesting candidates as disease modifiers, but through very different models, all of which must yet be worked out. It is important to understand that despite the power of genetics to identify such disease-relevant locations in the genome, it does not provide information regarding mechanisms and these must be examined empirically. APIP, for example, has been shown to suppress apoptosis in the presence of hypoxia (Cho et al., 2007), a context experienced by CF tissues. At this point, it is not clear if the adverse allele provides less or greater activity than the protective allele, but one could construct models either way. For example, one hypothesis is that excessive anti-apoptotic activity, resulting from increased APIP, could prolong neutrophilic inflammation and therefore lead to more severe lung disease. Similarly, EHF is reported to serve as a regulator of epithelial cell differentiation under conditions of stress and inflammation (Tugores et al., 2001; Wright et al., 2011) and thus could be modeled to have very important effects during airway development or remodeling from disease-related damage. Finally, it must be considered that the modifying locus could be working at a distance, involving a regulatory site such as a transcriptional enhancer or non-coding RNA.
The other associating region, on chromosome 20, was detected by linkage analysis and then refined by association. The linkage signal spans several genes, including MC3R, encoding the melanocortin-3 receptor; CBLN4, encoding cerebellin-like 4; CASS4, encoding Crk-associated substrate scaffolding (CASS) 4; and AURKA, encoding Aurora kinase A. With the exception of MC3R, which is a receptor involved in metabolic control, models to explain the other candidates are not presently clear.
Certainly functional studies will help sort out which genes in these associating intervals are responsible for their modifying effects, but these findings illustrate both the power and some of the challenges of genetic studies. On one hand, the unbiased approach provides the opportunity to identify novel disease modulators, but on the other hand, identifying the source of the modifying effect and the mechanisms through which it acts are challenging tasks.

THE IMPACT OF DISEASE-MODIFYING GENES
The implications of disease-modifying genes are multiple. First, understanding the genetic contribution to phenotypic variation has the potential to provide insight into prognosis. Second, understanding the mechanisms by which these genes and their alleles exert their effects will likely suggest new therapeutic approaches or ways to optimize existing ones. Third, it opens the door to personalized medicine, as a given patient's treatment regimen could conceivably be developed around a genetic profile. Using inflammation as an example, one could imagine a patient whose modifier panel predicts a lessened inflammatory response, and another patient whose modifier panel predicts a heightened inflammatory response. Inflammation is part of the immune response that is necessary to fight infection; however, its prolonged state in CF patients can cause lung damage. The patient with the heightened response may benefit from anti-inflammatory drugs earlier, and the patient with the reduced inflammatory response may benefit from increased antibiotic usage. Both are common treatments for CF, but they may be used more beneficially with the help of modifier identification and mechanistic understanding.

SUMMARY
Cystic fibrosis is a simple Mendelian disorder with complex clinical manifestations that are consequences of CFTR genotype, environmental factors (Boyle, 2007), and heterogeneity throughout the entire genome. The discovery of genetic modifiers may help account for the broad spectrum of disease severity observed in patients, especially those with the same CFTR genotype. Modifying loci identified thus far each appear to contribute only a small percentage to the overall disease profile, and thus it is likely that the combination of these variants in different permutations shapes an individual's outcome, an outcome that is also significantly influenced by non-genetic factors, as well as by the interaction of genetic and non-genetic factors. There are few genes whose modifying effects withstand the test of replication, and further studies must elucidate the role of each one in CF. Additional research on gene-environment and gene-gene interactions will certainly demonstrate how complex these genetic effects are. With the careful use of candidate gene approaches and now genome-wide scans (and soon whole-genome sequencing), it is realistic to believe that modifiers of CF disease will be identified and that interventions tailored around an individual's genetic profile will be developed from them. This fine-tuning of therapeutic strategies could contribute to better quality of life and, ultimately, improved survival in CF.