Quantifying the Predictive Accuracy of a Polygenic Risk Score for Predicting Incident Cancer Cases : Application to the CARTaGENE Cohort

With the increasing use of polygenic risk scores (PRS) there is a need for adapted methods to evaluate the predictivity of these tools. In this work, we propose a new pseudo-R2 criterion to evaluate PRS predictive accuracy for time-to-event data. This new criterion is related to the score statistic derived under a two-component mixture model. It evaluates the effect of the PRS on both the propensity to experience the event and on the dynamic of the event among the susceptible subjects. Simulation results show that our index has good properties. We compared our index to other implemented pseudo-R2 for survival data. Along with our index, two other indices have comparable good behavior when the PRS has a non-null propensity effect, and our index is the only one to detect when the PRS has only a dynamic effect. We evaluated the 5-year predictivity of an 18-single-nucleotide-polymorphism PRS for incident breast cancer cases on the CARTaGENE cohort using several pseudo-R2 indices. We report that our index, which summarizes both a propensity and a dynamic effect, had the highest predictive accuracy. In conclusion, our proposed pseudo-R2 is easy to implement and well suited to evaluate PRS for predicting incident events in cohort studies.


INTRODUCTION
With the power of genotyping technologies, genome-wide association studies (GWAS) focusing on complex diseases have identified a large number of genetic variants associated with various traits of interest (e.g., diabetes, cardiovascular diseases, cancer...) (Buniello et al., 2019). These large GWAS have provided various lists of disease-related single nucleotide polymorphisms (SNP) together with their effect size estimates (McCarthy et al., 2008). Based on these findings, there has been a growing interest recently in deriving polygenic risk scores (PRS) in order to provide individual risk predictions for various phenotypes (Torkamani et al., 2018). Broadly speaking, for each individual, a classical polygenic score consists of a linear combination of the trait-associated alleles carried by the subject and weighted by their effect sizes. The list of these risk alleles and their corresponding weights are obtained from published GWAS. Keeping in mind classical modeling assumptions of linearity, additivity, and lack of interactions, these PRS provide some estimates of the probabilistic susceptibility of an individual to experience the disease and can be used for disease risk stratification (Torkamani et al., 2018). As a result, in recent years, a burgeoning literature has started to focus on the evaluation of published PRS (for a few: International Multiple Sclerosis Genetics Consortium et al., 2010;Machiela et al., 2011;Khera et al., 2018).
The predictive accuracy of these PRS are usually assessed by measures such as the coefficient of determination (noted as R 2 ) which is interpreted as the proportion of variation in the phenotype that is explained by the PRS. For continuous outcomes, the coefficient of determination is well-defined and unique, however its extension to other outcomes such as binary outcomes is not straightforward. By analogy with the linear model and from different perspectives, various generalizations of the R 2 have been proposed (e.g., Hu et al., 2006). They are usually referred to as pseudo-R 2 .
In practice, most of the pseudo-R 2 that are used for assessing the predictive accuracy of PRS focus on binary outcomes. This is the case for cancer susceptibility where most of the PRS studies have analyzed cancers as a binary outcome (affected/not affected), irrespective of the age of onset. However, as the occurrence of cancers is strongly influenced by age, prediction modeling should not neglect the dynamic of cancer occurrence over time. Thus, the assessment of predictive accuracy of PRS for cancer susceptibility based upon epidemiological cohorts should use timeto-event data and provide predictivity for a defined risk projection interval.
For such quantification with time-to-event data, a large spectrum of pseudo-R 2 have been proposed, most of them relying on the classical Cox proportional hazards model (Cox, 1972). These pseudo-R 2 fall mainly into the categories of explained randomness (entropy-based) or explained variation (variancebased). This latter framework corresponds to the proportion of the outcome variance that is explained by the studied covariates. The estimates rely either on comparing empirical survival functions with and without covariates or on statistical quantities which are directly or indirectly related to the likelihood function (O'Quigley, 2008;Flandre et al., 2017). In practice, most of these pseudo-R 2 have to maximize the log (partial) likelihood of the full model. However, it is sometimes not a straightforward issue, particularly for complex non-proportional survival models.
We had to face such issue in a recent study relying upon the Quebec population-based cohort CARTaGENE where our main objective was to evaluate the 5-year predictivity of a published PRS for breast cancer. To assess the 5-year predictivity, we had to rely upon a non-standard survival model that considers age as the time scale and takes into account that a proportion of the individuals are not susceptible to develop the disease within 5 years. Such survival model belongs to the class of survival mixture models where the population under study is a mixture of individuals with those at risk for experiencing the event within 5 years and those who are not at risk (Maller and Zhou, 1996). The motivation behind the use of this survival model is that at a specified age and over a 5-year horizon a woman can be either susceptible or nonsusceptible to experience breast cancer and her PRS may be related to the propensity for experiencing the event. Moreover, among the susceptible ones, some of them may experience the event earlier than others and their PRS may be linked to the dynamic of the event. In this context, we know that the classical survival models are not well suited for quantifying both effects.
This issue prompted us to derive a pseudo-R 2 which relies on time-to-event data and quantifies the 5-year predictivity accuracy of PRS. This new criterion extends a previous work (Rouam et al., 2010) on pseudo-R 2 that was restricted to classical proper survival models. The score statistic derived from the partial likelihood under an entry-age-stratified age-scaled working survival model enables the calculation of the pseudo-R 2 . This criterion quantifies the predictive ability of the studied factor to separate subject outcomes on both the probability of experiencing the event and the occurrence dynamic.
In this work, we present this new criterion and report the results obtained from a simulation study. We show its practical interest for evaluating the accuracy of published PRS for predicting incident breast cancer cases in the following 5 years using the Quebec population-based cohort CARTaGENE.

Materials: The CARTaGENE Study
CARTaGENE (www.cartagene.qc.ca) is a population cohort consisting of 43,037 Quebec residents aged between 40 and 69 years at recruitment (Awadalla et al., 2013). Enrollment of participants began in July 2009 and was carried out in two phases in six metropolitan areas. On their enrollment date, each participant filled out a questionnaire about health, lifestyle, individual, and familial history of disease, and prescribed medication. Only the women were included in the breast cancer study (n = 23,797 To comprehensively define women with a breast cancer (invasive or in situ), we used an algorithm based on a previous report from the Institut National de Santé Publique du Québec (INSPQ) (Théberge et al., 2003). Using the Breast Cancer Registry, we retrieved the incidence date of histologically confirmed breast cancers. Then, we selected all women having an abnormal mammography and retrieved, if available, the incidence date from the MED-ÉCHO database for women with at least two claims in 2 years or one hospitalization with the corresponding ICD-9 (174, 2330) and ICD-10 (C50, D05) codes.
For the breast cancer study, only the women with genetic data available and who did not have a breast cancer before their enrollment in the CARTaGENE cohort were selected (n = 4,554).

Polygenic Risk Scores
Based on the study of Evans et al. (2017), we computed a PRS for each woman. This PRS uses 18 SNP that have been shown to be associated with breast cancer risk in general European populations together with the published per-allele ORs. Each PRS is the linear combination of the number of risk alleles for each of the 18 SNP weighted by their corresponding log odds-ratios.
Genotyping data has been generated through different projects and using different chips and platforms (Illumina Omni 2.5M, Illumina Infinium Global Screening Array, and Affymetrix Axiom UK biobank). Among the 18 SNP composing the Evans PRS, three were not available in our study and nine had missing data. To impute the missing SNP, we used the Michigan Imputation Server with the Minimac4 algorithm (Das et al., 2016). Imputation reference panel was the HRC r1.1 2016 European population, and the phasing was performed with Eagle v2.4 (Loh et al., 2016). A genetic quality control (QC) was made before the imputation. After imputation, QC was performed on individuals based on the Anderson et al. protocol (Anderson et al., 2010), using all individuals genotyped with the Illumina Infinium Global Screening Array. Individuals with a call rate lower than 95% and a heterozygosity higher than three standard deviations were removed. In pairs of individuals with an identity by state (IBD) higher than 0.1875, the individual with the lowest call rate was removed. To remove participants with divergent European ancestries, we used the first two principal components with the HapMap phase 3 reference panel (The International HapMap 3 Consortium, 2010).

Outcomes
The outcome was the age at occurrence of breast cancer. Patients without breast cancer occurrence were censored. Censoring time was age at the end of the 5-year study period (administrative censoring) or age at death.
For taking into account the age effect, our survival model was stratified by age at cohort entry with four groups: [39][40][41][42][43][44][45][46][47][48][49][50] In this work, we considered a survival model with age as the time scale and stratification on age at entry in the study. Let A 0 denote a random variable that corresponds to the age at which the individual free of disease enters the cohort. Let A be the age at which the individual is experiencing the event of interest with A = A 0 + T ⋆ where T ⋆ is the event time (i.e., the time elapsed between the enrollment in the cohort and the date of event). Let C be the age at which the individual is censored with C = A 0 +C ⋆ where C ⋆ is the censoring time (i.e., the time elapsed between the enrollment in the cohort and the date of analysis or last follow-up). Conditionally on A 0 , let the random variables A and C assumed to satisfy the condition of independent censoring (Fleming and Harrington, 2005).
In the following, we consider J strata for age at entry. For a subject i in stratum j (j = 1, .., J), let X ij = min(A ij , C ij ) be the observed follow-up time on the age scale, δ ij = 1 (X ij =A ij ) the indicator of event and Y ij (s) = 1 (X ij ≥s) the indicator of being at risk for the event at age s. Y ij (s) = 1 indicates that subject i in stratum j is at risk just before time s, Y ij (s) = 0 otherwise. Let Z ij be the value of the PRS computed for individual i in stratum j. For each individual, the observed data consists of (X ij , δ ij , A 0ij , Z ij ).
Since we focus on a 5-year projection interval, we have to take into account that some individuals who enter the cohort are nonsusceptible subjects for the event of interest whereas the other are susceptible subjects who may experience the event by the end of the 5 years or may be censored prior to experiencing the event.
Thus, we consider the following two-part semi-parametric multiplicative model whose marginal survival distribution is expressed as: where exp −θ j e αz is the probability of being a non-susceptible individual for a subject in stratum j. This latter quantity (sometimes called the tail defect) depends upon the age at entry and the PRS. Here, θ j > 0 is an unknown positive agestratum parameter and α an unknown parameter of interest which quantifies the impact of the PRS on the propensity of being a non-susceptible individual.
In this model, exp − 0j (a)e βz is the conditional survival distribution of time-to-event for stratum j for those who are susceptible to experience the event within the projection interval. Here, 0j (a) is an unspecified conditional baseline cumulative hazard function for stratum j and β is an unknown parameter of interest which quantifies the impact of the PRS on the dynamic of the event's occurrence.
In the following, we re-write the survival model presented above in terms of the multiplicative hazard functions as: where λ 0j (a) is the first derivative of 0j (a). From this multiplicative model, a partial likelihood can be written as follows: with n j the number of individuals in stratum j and N = n 1 + ... + n J the total number of individuals. Here, S j (a) and 0j (a) are the marginal survival function and the conditional baseline cumulative hazard function for stratum j, respectively.

Scores Components
From what precedes, we can easily obtain the two components of the log partial likelihood score function evaluated under the null hypothesis of no PRS effect H 0 : β = α = 0: where R ij is the risk set of stratum j including the individual that experienced the event at time a i . Here s) are the weights for the two components, respectively. The nuisance parameters, e −θ j , 0j and S j (s) are the tail defect, the conditional cumulative hazard function and the marginal survival distribution for stratum j computed under the null hypothesis.
From a biological perspective, the quantity U 1ij can be linked to differences in the propensity of being a non-susceptible individual and U 2ij to differences in the dynamic of the occurrence of the event of interest for susceptible individuals.

Scores as Measures of Separability
Following a previous work (Rouam et al., 2010), the nonnull quantities U 1ij and U 2ij computed at event time can be reformulated as two measures of separability that quantify the ability of the PRS to separate individuals from the stratum j who experience the event at time a i from those who are still at risk.
The first quantity U 1ij can be re-written as: In like manner, the second one is as: where R * i is the risk set of stratum j without the individual that experienced the event at time a i .
The two non-null components U 1ij and U 2ij can be interpreted as a weighted differences between the mean value of the genomic score for the individuals who experience the event at time a i and the weighted mean for those who are still at risk (i.e., the mixture of individuals who do not experience the event at time a i ). Differences close to zero indicate a weak or null separability. Large differences indicate that the individuals are well-separated.

Shifted Score Components
In the following, we introduce the shifted scores (or robust scores) W 1ij and W 2ij derived from the seminal work of Lin and Wei (1989). The shifts take into account the dependence between the individuals scores U 1ij and U 2ij . The shifted scores W 1ij and W 2ij are independent and identically distributed (Lachin, 2011) and are as: The practical expressions of these latter quantities are obtained by plugging the two estimatesŵ 1 (s) andŵ 2 (s) where we replace S j (s) by the left-continuous version of the Fleming-Harrington estimator obtained under H 0 using the Nelson-Aalen estimator. The nuisance parameter θ is estimated byθ j = −log(1 − S j (t max )) where t max is the last observed failure time. The shiftsŨ 1ij = E(Û 1ij ) andŨ 2ij =Ê(Û 1ij ) are weighted average of the score calculated at times s prior to time a i . In the following, as we focus on separability measures we will consider only the shifted score components associated with event times that we denoted as W * 1ij and W * 2ij .

Pseudo-R 2 Criterion
In the following we derive a pseudo-R 2 criteria which is interpreted as the proportion of variation of separability that is explained by the PRS. Since the classical score contributions (U ij ) are based on a partial rather than a full likelihood, they are not independently and identically distributed. However, the shifted score contributions (W ij ) introduced by Lin and Wei (1989) are independent and identically distributed. As our pseudo-R 2 quantifies the proportion of variance explained by the PRS, it relies on variance estimates and this latter condition is important for estimation purposes. Moreover, since our individual score contributions can be expressed as differences between the means of the PRS of the group of patients observed experiencing the event of interest and the group of those observed not experiencing the event, the considered shifted score contributions are those calculated at each occurrence of the event.
We recall that the quantities W * 1ij and W * 2ij represent the measures of separability that are calculated at an event time. We denote k j the number of event times for stratum j. The total number of events is denoted by K. In practice, we use the generalized variance of the two-dimensional random vector W = (W * 1 , W * 2 ) that is defined as the determinant of its variancecovariance matrix. Then, we derive a pseudo-R 2 which is based on the relative difference between an estimate of the variancecovariance matrix of the shifted scores computed under the null hypothesis (H 0 : no effect of the PRS; E[W] = 0) and an estimate calculated under the alternative hypothesis (H 1 : effect of the PRS; Under the null hypothesis (H 0 : α = β = 0), we have: Then (omitting the term 1 K−1 ) we have : Under the alternative hypothesis (H 1 ), we have: is the pseudo-R 2 criterion. This quantity measures the global predictive accuracy of the PRS. As shown from the Supplementary Material and the simulation study, the pseudo-R 2 is unit-less, ranges from zero to one, increases with the effect related with either the tail defect proportion or the dynamic of the occurrence of the event of interest.

Simulation-Based Studies
The objective of this section is to evaluate the behavior of the proposed index for different levels of tail defect, values of parameters α and β, sample sizes n and percentages of censoring. We compare the values of to those of the pseudo-R 2 proposed by Nagelkerke (N) based on a transformation of the partial likelihood ratio test (Nagelkerke, 1991) (Rouam et al., 2010) based on the robust score statistic. These indexes are implemented in the R packages "survAUC" (Potapov et al., 2015) (for N, XO, and OXS), "PHeval" (Chauvel, 2018) (for OF), and "survival" (Therneau, 2015) (for RMB).
For the sake of simplicity and without loss of generality, we evaluated the behavior of the test with only one stratum for most of the simulation scenarios. However, in order to check the behavior of the indice with strata, we performed an additional simulation scheme in a case with two strata.

Simulation Scheme
Survival times were generated according to the two-part survival model presented in the previous section with baseline conditional cumulative hazard function 0 (s) = s. The baseline probability of being a non-susceptible was chosen such as exp(−θ ) was equal to 0.3, 0.5, or 0.7. For each subject, we simulated a variable Z (its PRS) from a standard Normal distribution [N (0, 1)].
In order to see the behaviors of the different indices and in particular if they were able to attain values close to one, the following configurations were considered for the hazard ratio (HR) values e α and e β (termed as "propensity effect" and "dynamic effect", respectively): 1 (no effect), 1.5, 2, 2.5, 3, 4, 5, 10, 20, 50, or 500. These values explore a large range of effects from small [log(1.5) ≈ 0.41] to huge [log(500) ≈ 6.21] for α and β. The number of subjects was 500.
For investigating the robustness of the proposed pseudo-R 2 to model misspecification, we performed simulations with survival times generated according to an improper Gompertz distribution such as S(s|Z = z) = exp(−θ e αz (1 − exp(−se βz )).
For investigating the robustness of the indice to the distribution of the covariate, we simulated a variable Z from a Student distribution with ten degrees of freedom. The effect of censoring was investigated by generating independently censoring times from a uniform distribution over [0, u]. Values for u were computed from the chosen percentage of censoring and from the parameters of the considered distributions. The percentage of censoring refers to the percentage of censored FIGURE 2 | Comparison of the evolution of different pseudo-R 2 indices according to the effect of the evaluated polygenic risk score on the dynamic (β) when the criterion has no effect on the propensity (α), for 500 subjects, a tail defect of 70% and no censoring.
observations without the fraction of non-susceptible subjects. Here, 20% censoring were considered.
For investigating the robustness of the proposed pseudo-R 2 to the number of subjects, we performed additional simulations with 200, 500, and 1, 000 subjects. We also performed additional simulations where we generated a stratification variable from a Bernoulli distribution with parameter 0.5.
For each configuration, 1, 000 replications were performed. Figure 1 displays the behavior of , OF, OXS, XO, N, and RMB according to α and β for a tail defect of 70% with 500 subjects and no censoring. Our simulations showed that the value of increases with the value of the strength of the PRS' effect (through the hazards ratio e α and e β ). In our simulation study, ranges from 0 when both α and β are null, to near 1 for very large effects of α and β. In the range of effects presented in this paper, the maximum value reached by is 0.96 when both α and β take value of 6.21 (corresponding to a HR of 500) (see Figure 1A). We can see that is able to quantify an effect on the dynamic (β) when there is no propensity effect (α) related to the PRS. For example, when α is null and β = 0.92 (corresponding to a HR of e β = 2.5) we have = 0.52. The other studied indices are not able to quantify a dynamic effect when there is no propensity effect (see Figures 1A,C-F, 2).

Simulation Results
When the propensity effect is not null, all indices, including ours, are able to quantify the mixture of both effects. The different configurations show that , OF, and OXS lead to higher values than XO and N. Figure 3 displays the values of the indices according to the propensity and dynamic effects, when both the parameters α and β have the same value. has the highest values when α = β ≤ 0.91 (e α = e β ≤ 2.5) and ranks in third place for higher values of α and β. Figure 4 shows that quantifies both the propensity effect and the dynamic effect. However, the reported predictive accuracy is not symmetrical and is higher for the propensity effect (α) as compared to the dynamic effect (β). For example, when α is null and β = 0.92 (HR of 2.5), = 0.21, whereas when α = 0.92 and β is null, then = 0.42 (see Figure 4). Table 1 displays the results obtained with the pseudo-R 2 s for uncensored cases with various tail defects. These results show that the proposed pseudo-R 2 is the only one able to quantify the dynamic effect. As an example, for a tail defect of 30% with e α = 1.25 and e β = 2.50, its value is of 37% whereas the highest value for the other indices is 16% (with OXS). The same behavior of the proposed pseudo-R 2 is observed for the different values of the tail defect. Table 2 displays the results obtained for the proposed pseudo-R 2 under various tail defects and simulation schemes: (i) Two-part survival model with no censoring and Normally distributed explanatory variables, (ii) Two-part survival model with 20% censoring and Normally distributed explanatory variables, (iii) Two-part survival model with no censoring and Student distributed explanatory variables, (iv) Improper Gompertz model with no censoring and Normally distributed explanatory variables.
In case of 20% censoring, simulation results show that the mean and standard error values of the proposed indice are slightly increased. When looking to the simulation results under an improper Gompertz model, simulation results show that the proposed indice is only slightly affected by model misspecification. For a Student distribution, we can see that the estimated mean values of the proposed indice are very close to the Normal distribution. Table 3 shows that the estimated mean and standard error values of the proposed indice are slightly increased when the number of subjects decreases. It is worth noting that the slight bias observed for the mean values with censored or reduced samples is linked to the fact that our pseudo-R 2 uses the estimate of the tail defect. In these situations, the dispersion of this nuisance parameter increases.
As expected, results obtained for the proposed pseudo-R 2 s with and without strata are the same (Table 4).
It should be noted that comparison of the values of the pseudo-R 2 across different tail defect values are difficult to interpret and can be misleading since the dynamic/propensity effects should be interpreted conditionally upon the defective survival distributions that are not normalized to one but to different values according to the defect. This is also the case for the other indices that increase when the proportion of susceptible subjects increase. For example, with e α = 1.5 and e β = 2.5, values of the OXS are 29, 25, and 20% for a tail defect of 30, 50, and 70%, respectively. FIGURE 3 | Comparison of the evolution of different pseudo-R 2 indices according to the effect of the evaluated polygenic risk score on the dynamic (β) and the propensity (α), when α = β, for 500 subjects, a tail defect of 70% and no censoring.
FIGURE 4 | Display of the behavior of the pseudo-R 2 according to the effect of the evaluated polygenic risk score on the dynamic (β) or the propensity (α), when α or β is null, for 500 subjects, a tail defect of 70% and no censoring.

CARTaGENE Results
Among the 4,554 women selected for the analysis, 60 (1.32%) had breast cancer during the 5 years of follow-up. The PRS' mean (Evan's score) was higher among participants with a diagnosed breast cancer within the 5 years (0.57) than for those free of event at 5 years (0.44).   Figures 5A,B, the distribution of the PRS can be considered close to the Normal distribution. From Figure 5C, we can see that higher the PRS, higher the risk of breast cancer is. Table 5 shows the results for and the five pseudo-R 2 indices. For these latter, pseudo-R 2 measures are computed upon an entry-age-stratified age-scaled Cox survival model. was the highest pseudo-R 2 (17.8%), while OF, RMB, and OXS were equal to 12.0, 12.1, and 14.0%, respectively. The indices XO and N had values of 0.21 and 1.01%, respectively.

DISCUSSION
The evaluation of the PRS for clinical prediction has received a lot of attention these last years. For binary or time-to-event traits, prediction accuracy of PRS is usually assessed using likelihoodbased pseudo-R 2 criteria that can be complicated to compute for complex models. In this work, we have proposed a novel pseudo-R 2 test for assessing the accuracy prediction over a time period of PRS for both the probability of occurrence of the event of interest and the dynamic. This criterion is easy to compute since it avoids maximizing Cox's partial likelihood under the alternative.
As seen from the simulation study, our pseudo-R 2 showed good performances as compared to classical indices based on the Cox proportional hazards model. As expected, it showed  , 2005), and by Rouam et al. (2010). In contrast, the coefficient proposed by Nagelkerke (1991) that is frequently used in the literature showed poor results. Simulations results have shown that the proposed indice is only barely affected by model misspecification and covariate distribution. It slightly increases with reduced sample size or censoring. Based on our simulation study, when a dynamical effect related to the PRS is expected, we recommend using our proposed pseudo-R 2 . In other cases, our index can be used together with the OF and OXS indices since they show good performances. In contrast, we do not recommend the use of the Nagelkerke index for quantifying predictive accuracy of PRS for time to event outcomes. Notwithstanding the good performance of the proposed criteria, it has some shortcomings that should be mentioned. We should first keep in mind that pseudo-R 2 s are always model-based criteria. In our case, our pseudo-R 2 measures the proportion of variability in the outcome that is explained by the PRS under the assumption that the PRS may be linked to the dynamic and/or the propensity of the occurrence of the event. In this work, it relies on a two-component survival model which is potentially prone to misspecification. Thus, such underlying assumption of a mixture model should be discussed before considering the use of the proposed pseudo-R 2 . We do not claim that our model represents the reality but that it is a useful approximation of what we suppose the effects (propensity/dynamic) are. Moreover, it is also worth noting that pseudo-R 2 s and even the classical R 2 are not robust to outlier observations and non-normal distributions. Thus, we should warn practitioners to be cautious and examine the PRS distribution. Here, we have considered pseudo-R 2 criteria since our main interest was to focus on the percentage of variation in the outcome explained by the PRS. Time-dependent ROC curves (with incident cases/dynamic controls; Heagerty and Zheng, 2005) could also have been considered if the main objective was to evaluate the prognostic potential of the PRS by focusing on the correct classification rates. Moreover, it should be noted that our pseudo-R 2 is not designed for more complex situations such as those with marker-dependent censoring. We plan to do further work in this direction.
This novel pseudo-R 2 was used to evaluate the 5-years predictive accuracy of the PRS proposed by Evans et al. (2017) for breast cancer occurrence on the CARTaGENE cohort. In this study, we reported a pseudo-R 2 of 17.8% for our novel index which is higher than the values reported by OF and OXS indices. Based on the results from our simulation study, we hypothesize that the PRS has both a propensity effect and a dynamical effect. This is not surprising since the selected SNP are located within genes that encode for proteins that are involved in important processes such as cell growth and division. It is worth noting that for this analysis we have considered an age-dependent model with four strata. Here, our pseudo-R 2 provides a global predictive measure that average all differences for both the propensity and dynamic effects taking into account the stratum of age at entry. Other strategies can be considered and easily implemented. We should keep in mind that the majority of PRS were developed and evaluated from case-control designs which raises some issues about misspecification when applying these results for time-toevent prediction (Lambert et al., 2019). With large prospective cohorts, we may expect to see in a close future more published polygenic hazard scores.
Finally, we think that the proposed novel pseudo-R 2 , which is easy to implement with standard softwares, is worth being used to evaluate PRS for predicting incident events in cohort studies.

DATA AVAILABILITY STATEMENT
The data analyzed in this study were collected in the context of the CARTaGENE cohort. The data are available upon official request and ethical approval. The dataset cannot be made publicly available but can be obtained by interested researchers upon request to the CARTaGENE team at CHU Sainte-Justine. The requests can be made using the CARTaGENE online application at sdas.cartagene.qc.ca, by contacting the CARTaGENE team at access@cartagene.qc.ca or by calling 514-345-2156.

ETHICS STATEMENT
CARTaGENE has obtained ethics approval by the CHU Sainte-Justine under the reference: MP-21-2011-345, 3297. The latest annual ethics renewal was granted on September 13, 2019.

AUTHOR CONTRIBUTIONS
JD and PB developed the original statistic. PB coordinated the project and is JD's Ph.D. thesis advisor. JD, RJ, and PB analyzed the data. TD, YP, CL, and NN participated in the collection of the data. JD, RJ, TD, YP, CL, NN, and PB participated in writing the original draft. All authors read and approved the final manuscript.

FUNDING
The research of the first author was supported by a grant from University Paris-Saclay (France). The funder had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.