Composite Measure of Physiological Dysregulation as a Predictor of Mortality: The Long Life Family Study

Biological aging results in changes in an organism that accumulate over age in a complex fashion across different regulatory systems, and their cumulative effect manifests in increased physiological dysregulation (PD) and declining robustness and resilience that increase risks of health disorders and death. Several composite measures involving multiple biomarkers that capture complex effects of aging have been proposed. We applied one such approach, the Mahalanobis distance (DM), to baseline measurements of various biomarkers (inflammation, hematological, diabetes-associated, lipids, endocrine, renal) in 3,279 participants from the Long Life Family Study (LLFS) with complete biomarker data. We used DM to estimate the level of PD by summarizing information about multiple deviations of biomarkers from specified “norms” in the reference population (here, LLFS participants younger than 60 years at baseline). An increase in DM was associated with significantly higher mortality risk (hazard ratio per standard deviation of DM: 1.42; 95% confidence interval: [1.30, 1.54]), even after adjustment for a composite measure summarizing 85 health-related deficits (disabilities, diseases, less severe symptoms), age, and other covariates. Such composite measures significantly improved mortality predictions especially in the subsample of participants from families enriched for exceptional longevity (the areas under the receiver operating characteristic curves are 0.88 vs. 0.85, in models with and without the composite measures, p = 2.9 × 10^−5). Sensitivity analyses confirmed that our conclusions are not sensitive to different aspects of computational procedures. Our findings provide the first evidence of association of PD with mortality and its predictive performance in a unique sample selected for exceptional familial longevity.


INTRODUCTION
Traditional demographic analyses based on population life tables provide useful insights into historical patterns of change in mortality, survival curves, and life expectancy, which can also be used to predict future trends in these characteristics in the entire population or in specific subpopulations. However, such "aggregated" predictions describe an "average" individual from a (sub)population and may yield little information about mortality risk and remaining life expectancy for a particular individual, which are determined by that individual's unique history of exposures to various risk factors during the life course, by genetic makeup, or by the interaction of these risk factors and genetics. Therefore, although age is an important risk factor for mortality and a determinant of remaining life expectancy, individuals of the same age can have very diverse and unique characteristics that affect their current health status and future risks of health deterioration and mortality. Measurements of different physiological and other variables (biomarkers) provide an additional opportunity for personalized predictions of morbidity and mortality risks, as they can reflect individual age-related changes occurring at the molecular and cellular levels in different organs and tissues that result in individual-specific rates of physiological dysregulation, health deterioration, and mortality risks. Composite measures based on multiple biomarkers of different physiological systems [see e.g., (1-4) and recent reviews (5,6)] can capture the complex effect of aging on different regulatory systems and its relation to morbidity and mortality.
Recently, the statistical (Mahalanobis) distance (denoted as D M ) constructed for the joint distribution of multiple biomarkers was suggested as a composite measure that can represent the level of physiological dysregulation in an organism (2) and aging-related declines in robustness and resilience (7). This measure was associated with mortality and aging-related outcomes in numerous studies [see e.g., (2,7-12)]. In this paper, we constructed D M using measurements of multiple biomarkers collected at the baseline visit of the Long Life Family Study (LLFS) to test whether the level of PD is associated with mortality in this study, and whether it improves mortality predictions compared to models with age and common individual risk factors. The LLFS is a unique study that enrolled participants from families selected for exceptional familial longevity (13), along with their spouses. The LLFS participants from the probands' generation have much better survival chances than their age peers from the general population, so that the survival curves for the LLFS participants are shifted to the right compared to population survival functions computed from respective cohort life tables (14). Hence, unlike all previous studies applying D M , this paper investigates whether D M is a useful predictor of mortality in persons with much lower mortality risk than the general population (and who, accordingly, have much higher remaining life expectancy than that estimated from population-based cohort life tables).

Data
The LLFS is a family-based, longitudinal study of healthy aging and longevity that enrolled participants at field centers in the US (Boston, New York, Pittsburgh) and Denmark. During the baseline visit in 2006-2009, more than 4,900 participants were enrolled from families determined to have exceptional longevity according to the Family Longevity Selection Score (FLoSS) (13). Details on study eligibility criteria are described elsewhere (15). Socio-demographic variables, data on past medical history and current medical conditions, medications use, physical and cognitive functioning, and blood samples were collected via in-person visits and phone questionnaires for all subjects at the time of enrollment (15). Blood assays were centrally processed at a Laboratory Core (University of Minnesota) and protocols were standardized, monitored and coordinated through a Data Management Coordinating Center (Washington University, St. Louis). Written informed consent was obtained from all subjects following protocols approved by the respective field center's Institutional Review Boards (IRBs). In this paper, we performed secondary analyses of LLFS data collected at all field centers. This study was approved by the Duke Health IRB.
LLFS participants were followed up annually to track their vital and health status. The analyses reported in this paper used the March 11, 2019 release of LLFS data with the latest recorded date of death on February 13, 2019. Ages at the baseline visit were validated using dates of birth from official documents (such as birth certificate or driver's license) (16). Ages at death were computed from available dates of birth and death. Ages at censoring for those who did not die within the follow-up period were determined from dates of birth and the last follow-up in the March 11, 2019 release of LLFS data. We also computed prevalence (i.e., the disease status at the baseline) and incidence (i.e., new cases reported during the follow-up) of major diseases available in the study, such as Alzheimer's disease (AD) or dementia, cancer, cardiovascular diseases (CVD), and diabetes, for the entire sample and for the reference population used in construction of D M [see section Construction of the Cumulative Measure of Physiological Dysregulation (D M )]. Information on health conditions was collected during the interviews from study participants or proxies (if a participant was unable to provide an answer). At the baseline, the question asked was "Please respond 'yes' or 'no' if you have EVER been told by a doctor that you had this condition." Similar questions were asked during follow-up interviews ("Please respond 'yes' or 'no' if you have EVER been told by a doctor that you had this condition since we last interviewed you on ...").
Using responses to such questions about specific diseases [AD or dementia: Alzheimer's Disease or Dementia; cancer: All cancer sites; CVD: Myocardial Infarction, Heart Attack, Coronary Angioplasty, Coronary Artery Bypass Grafting, (Congestive) Heart Failure, Stroke, Cerebrovascular Accident, Transient Ischemic Attack, or Mini-Stroke; diabetes: Diabetes] from the baseline and follow-up interviews, we computed the numbers of prevalent cases at the baseline and the numbers of new cases reported since the baseline.
In addition to association and predictive analyses with D M described in section Association of D M with Mortality and Predictive Performance Analyses, we conducted descriptive analyses of the original variables as well as D M in the entire sample and in generation (probands' and offspring) and spouse groups (probands, their siblings and offspring, and their spouses). Specifically, we computed empirical characteristics (means, standard deviations, ranges, correlations with age, percentages) for various variables used in the analyses (see Table 1). Relevant tests (one-way ANOVA, t-test, chi-square test) were used to provide statistical inference (see section Descriptive Analyses). We also computed the Kaplan-Meier estimates of survival curves (conditional at age 80 years) for subsamples of participants grouped by the quartiles of the distribution of D M in the analyzed sample. Age at the baseline or age 80 years (whichever was larger) was used as the left truncation variable for these analyses. The quartiles were computed separately by sex and in the entire sample as reported in respective figures (see section Descriptive Analyses). The 95% confidence intervals for the survival curves were computed based on respective estimates of cumulative hazards.
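The left-truncated Kaplan-Meier estimate described above can be illustrated with a minimal sketch. It is not the code used in the study (which relied on SAS and R); the function names and toy ages are hypothetical, and confidence intervals are omitted. The key point is that a participant enters the risk set only after the truncation age, taken here as the larger of baseline age and 80 years.

```python
def truncation_age(baseline_age, floor=80.0):
    """Left-truncation age: baseline age or the conditioning age, whichever is larger."""
    return max(baseline_age, floor)


def km_left_truncated(entry, exit_, died):
    """Kaplan-Meier estimate with delayed entry (left truncation).

    entry[i] -- age at which subject i comes under observation
    exit_[i] -- age at death or censoring
    died[i]  -- True if exit_[i] is a death, False if censored
    Returns a list of (age, survival) pairs at each distinct death age.
    """
    # Only deaths that occur after the subject's entry age contribute.
    death_ages = sorted({x for e, x, d in zip(entry, exit_, died) if d and e < x})
    surv, curve = 1.0, []
    for t in death_ages:
        # Risk set at age t: entered before t and not yet dead or censored.
        at_risk = sum(1 for e, x in zip(entry, exit_) if e < t <= x)
        deaths = sum(1 for e, x, d in zip(entry, exit_, died)
                     if d and x == t and e < t)
        surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve


# Toy data (hypothetical): baseline ages, exit ages, death indicators.
baseline = [75.0, 82.0, 85.0, 78.0]
exit_age = [83.0, 90.0, 88.0, 79.0]
died = [True, False, True, True]
entry = [truncation_age(a) for a in baseline]
curve = km_left_truncated(entry, exit_age, died)
```

In this toy example the fourth participant dies before age 80 and therefore never enters the risk set, which is what conditioning the curves at age 80 implies.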

Construction of the Cumulative Measure of Physiological Dysregulation (D M )
The statistical (Mahalanobis) distance (17,18) constructed for the joint distribution of biomarkers was recently suggested in the literature as an approach to construct a composite measure (denoted as D M ) that reflects physiological dysregulation in the aging body (2,9,11). It is designed to measure how "aberrant each individual's profile is with respect to the overall average (centroid) of the reference population" (10) that represents the "normal" physiological state. Such a "reference population" can be either a subsample of the same study or it can come from an external study. Here we constructed D M using baseline observations of 19 biomarkers that were used in the study of biomarker signatures of aging in LLFS (19). The list of biomarkers (which includes inflammation, hematological, diabetes-associated, lipid, endocrine, and renal biomarkers) along with their descriptive statistics (means, standard deviations, correlation with age) is presented in Table 1. The initial sample contained 4,938 individuals participating in LLFS visit 1. The notes under Table 1 contain information about numbers of missing observations of these biomarkers and other variables used in the analyses. After exclusion of individuals with at least one missing value of respective variables, the resulting sample used in construction and analyses of D M included 3,279 participants (1,815 females, 1,464 males, 886 probands/siblings, 128 spouses of probands/siblings, 1,691 offspring, and 574 offspring spouses). Further, in the Cox regression analyses described below, we removed 19 individuals who were lost to follow-up right after visit 1 (their age at censoring was set to age at baseline) so that the resulting sample size for the Cox regression model was 3,260 individuals. See also section Sensitivity Analyses regarding analyses using multiple imputation of missing values.
Observed values of each biomarker were transformed using the Box-Cox transformation and standardized so that the transformed biomarkers are all on the same scale (with a zero mean and a unit variance). When a variable had zeros for some individuals, all records for that variable were shifted by adding 0.1, so that the Box-Cox transformation could be applied. We used individuals younger than 60 years at the baseline as a "reference population." This cutoff age produced a reasonably large "reference population" for the current analyses (1,361; 815 females, 546 males). Computations of the means and variance-covariance matrix in the "reference population" [which are needed for construction of D M (2)] were performed separately for females and males. The resulting D M was also transformed using the Box-Cox transformation (see also description of additional computations in section Sensitivity Analyses). Table S1 provides characteristics of the reference population used for construction of D M .
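The construction above can be sketched in a few functions. This is an illustrative sketch rather than the study's code: the function names are hypothetical, the Box-Cox parameter `lam` is taken as given (the text does not state how it was chosen), and the reference covariance matrix is assumed to be non-singular. For a transformed biomarker vector x, reference mean mu, and reference covariance S, the distance is D M = sqrt((x - mu)' S^-1 (x - mu)).

```python
import math


def box_cox(x, lam, shift=0.0):
    """Box-Cox transform of a positive value; `shift` (e.g. 0.1) handles zeros."""
    x = x + shift
    return math.log(x) if lam == 0 else (x ** lam - 1.0) / lam


def mean_and_cov(rows):
    """Column means and sample covariance matrix of equal-length rows."""
    n, p = len(rows), len(rows[0])
    mu = [sum(r[j] for r in rows) / n for j in range(p)]
    cov = [[sum((r[j] - mu[j]) * (r[k] - mu[k]) for r in rows) / (n - 1)
            for k in range(p)] for j in range(p)]
    return mu, cov


def solve(a, b):
    """Solve a @ z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (m[i][n] - sum(m[i][j] * z[j] for j in range(i + 1, n))) / m[i][i]
    return z


def mahalanobis(x, mu, cov):
    """Distance of profile x from the reference centroid mu under covariance cov."""
    d = [xi - mi for xi, mi in zip(x, mu)]
    z = solve(cov, d)  # z = cov^{-1} d without forming the inverse explicitly
    return math.sqrt(sum(di * zi for di, zi in zip(d, z)))


# Toy reference sample (hypothetical values for two transformed biomarkers).
reference = [[0.1, -0.2], [-0.3, 0.4], [0.5, 0.1], [-0.2, -0.4], [0.0, 0.3]]
mu_ref, cov_ref = mean_and_cov(reference)
d_m = mahalanobis([1.0, -1.0], mu_ref, cov_ref)
```

Because the reference means and covariance were computed separately for females and males, `mean_and_cov` would be called once per sex, and each participant would be scored against the reference of their own sex.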

Association of D M With Mortality and Predictive Performance Analyses
We fitted the Cox proportional hazards models with adjustment for related individuals (sandwich estimator) to follow-up data on mortality in the entire LLFS sample. We also performed analyses stratified by generation (probands' generation and offspring generation) and spouse status (probands, their siblings and offspring, and their spouses). Age was used as a time variable with age at visit 1 included as the left truncation variable in the model. The models were adjusted for the following covariates (in addition to D M ): sex (1-male, 0-female), field center (four levels: Boston, Denmark, New York, Pittsburgh; Denmark was used as the reference category), education (1-below high school, 0-otherwise), smoking (smoked >100 cigarettes in lifetime: yes [1]/no [0]), medication use (anti-diabetic, lipid-lowering, anti-hypertensive) (1-used, 0-did not use), fasting (1-≥8 h, 0-otherwise), and an 85-item deficits index (DI) (20). The DI (also known as a frailty index) aggregates a number of various health traits into a single measure and is computed as the number of failed or abnormal traits (or "deficits") divided by the total number of traits measured in an individual at the respective age (21,22). An important advantage of the DI is that it can be constructed using the set of variables available in a specific dataset, as its properties are weakly sensitive to the selection of a specific set of variables, as shown in different studies [see e.g., (23-26)]. To construct the DI in the LLFS, we used health-related variables collected in LLFS that cover major health dimensions such as disability, cognition, morbidities, depression, physical performance, etc. Dichotomous variables were recoded as 1-deficit, 0-no deficit. Non-dichotomous variables were recoded as outlined in Kulminski et al. (20). The list of 85 variables used in the DI is provided in Table 1 in Kulminski et al. (20).
The DI is constructed as a sum of the recoded variables divided by the number of variables measured in the respective individual. We computed receiver operating characteristic (ROC) curves and areas under the ROC curves (AUC) in logistic regression models with a binary indicator of death (1-died during the follow-up, 0-alive) as the outcome for four combinations of D M and DI variables (both D M and DI, DI only, D M only, none) used as covariates. All models were adjusted for other covariates specified above (sex, field center, education, smoking, medication use, fasting, and age). We did these calculations in the entire sample and also performed analyses stratified by generation and spouse status. Leave-one-out cross-validation was used for model evaluation in all calculations. See also description of additional computations in section Sensitivity Analyses.
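A minimal sketch of the DI computation follows. It is illustrative only: the function name is hypothetical, and it assumes each recoded item takes a value between 0 and 1, with `None` marking an item that was not measured. The optional `max_missing_fraction` argument mirrors the sensitivity analysis in which the DI is computed only for individuals with fewer than 20% of items missing.

```python
def deficit_index(items, max_missing_fraction=None):
    """Deficit index: sum of recoded deficits / number of items measured.

    items -- recoded values (1 = deficit, 0 = no deficit; intermediate values
             are assumed to lie in [0, 1]); None marks an unmeasured item.
    max_missing_fraction -- if given, return None unless the share of
             unmeasured items is strictly below this value.
    """
    measured = [v for v in items if v is not None]
    if not measured:
        return None
    missing_fraction = 1.0 - len(measured) / len(items)
    if max_missing_fraction is not None and missing_fraction >= max_missing_fraction:
        return None
    return sum(measured) / len(measured)


# Toy example (hypothetical): five items, one not measured.
di = deficit_index([1, 0, None, 1, 0])
```

Dividing by the number of measured items rather than by the full list length keeps the index comparable across individuals with different amounts of missing data.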
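The AUC comparison described above can be sketched as follows. This is an illustrative sketch with hypothetical names and toy numbers: it assumes the predicted death probabilities have already been obtained out of sample (for example, from leave-one-out fits of the logistic models) and computes the AUC through its rank interpretation, the probability that a randomly chosen decedent has a higher predicted risk than a randomly chosen survivor.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count one half).

    scores -- predicted probabilities of death
    labels -- 1 if the participant died during follow-up, 0 otherwise
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy out-of-sample predictions (hypothetical) for four participants.
died = [1, 1, 0, 0]
pred_reference = [0.6, 0.3, 0.4, 0.2]   # model without D M and DI
pred_full = [0.7, 0.5, 0.4, 0.2]        # model with D M and DI

auc_reference = auc(pred_reference, died)
auc_full = auc(pred_full, died)
d_auc = auc_full - auc_reference        # difference tested against zero
```

The pairwise form is quadratic in sample size, which is acceptable for a sketch; testing whether the difference in AUC is zero requires a method for correlated ROC curves and is not shown here.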
Statistical analyses, data preparation, and visualization were done in SAS 9.4 (SAS/STAT 14.3) and R 3.5.0.

Sensitivity Analyses
We performed sensitivity analyses to check whether our conclusions are sensitive to different aspects of the computation procedures, which might hypothetically affect the results. First, we considered different sets of biomarkers in computations of D M . We added the biomarkers used in our previous applications of D M in the Framingham Heart Study (7,12) to the list of the original 19 biomarkers from Sebastiani et al. (19). We also created D M variants focusing on the subsets of biomarkers with absolute values of correlations with age exceeding specific thresholds (0.05, 0.1, 0.15, 0.2) and removing highly correlated biomarkers (one of a pair of biomarkers with absolute value of correlation between the biomarkers exceeding 0.8). We also computed separate D M variants selecting biomarkers positively and negatively correlated with age. Second, we estimated the models using the original (non-transformed) values of D M . Third, we repeated the analyses focusing on the subsample of whites (which constitute the majority of the LLFS sample, 99%). Fourth, we modified the method of computation of the reference population changing the threshold (<65 and <70 years) and also computing means and variance/covariance matrices separately in the US and Danish subsamples. We also repeated computations excluding individuals with prevalent diseases (cancer, CVD, diabetes, AD or dementia) at the baseline [to focus on healthier reference populations, as discussed in (10)] and/or spouses (as the spousal groups are relatively small and spouses may also tend to share health habits). Fifth, we followed the common practice in the DI literature [e.g., (27)] and calculated DI only for individuals in whom < 20% of the respective variables were missing. Sixth, we repeated the analyses using multiple imputation (MI) for biomarkers and other covariates with missing data (see notes under Table 1). 
We performed MI using the R-package mice (28) and SAS/STAT PROC MI/MIANALYZE (as needed for different analyses) in two scenarios: (a) we imputed (Box-Cox transformed) individual biomarkers under the assumption of multivariate normality using respective approaches (the joint modeling in mice, MCMC in SAS); (b) we imputed both (Box-Cox transformed) individual biomarkers and other covariates with missing values (education, smoking, anti-diabetic drugs, anti-hypertensive drugs, lipid-lowering drugs, fasting; see notes under Table 1) using fully conditional specification (29). The results using both approaches were similar; therefore, we report only the latter approach. We generated 25 datasets with imputed values of biomarkers and other covariates and computed D M in each dataset using the observed and imputed data. Then we repeated the Cox model and the ROC/AUC analyses in each dataset and pooled respective estimates (the regression parameter estimates and differences between AUCs) using the standard tools implemented in the software to make statistical inference from imputed data. Seventh, we estimated the Cox model with D M included as a categorical variable quantifying the quartiles of D M with the first quartile as a reference category (see note under Table 2 about the proportionality of hazards assumption). Eighth, we recalculated the ROC/AUC analyses taking into account the relatedness between individuals (probands, their siblings, and offspring) using SAS/STAT PROC GENMOD. The results were nearly identical to those from SAS/STAT PROC LOGISTIC, which did not make such adjustments; therefore, only the latter are reported in the text. Ninth, we repeated computations excluding individuals who died within a short period of time (1 and 2 years) since the baseline to focus on predicting more distant events (considering the hypothetical possibility of reverse causation in cases when deaths occurred shortly after the measurements of biomarkers).
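The pooling step across imputed datasets can be illustrated with Rubin's rules, the standard combining rules for multiply imputed data. The sketch below is not the study's code (pooling there was done with the tools built into mice and SAS); the function name and toy numbers are hypothetical, and it assumes one estimate (for example, a log hazard ratio for D M ) and its squared standard error are available from each imputed dataset.

```python
import math


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and variances with Rubin's rules.

    estimates -- point estimates, one per imputed dataset
    variances -- squared standard errors, one per imputed dataset
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    q_bar = sum(estimates) / m                               # pooled estimate
    within = sum(variances) / m                              # within-imputation variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)  # between-imputation variance
    total = within + (1.0 + 1.0 / m) * between               # total variance
    return q_bar, math.sqrt(total)


# Toy example (hypothetical): log hazard ratios from three imputed datasets.
log_hr, se = pool_rubin([0.30, 0.40, 0.35], [0.01, 0.01, 0.01])
pooled_hr = math.exp(log_hr)  # back-transform to the hazard ratio scale
```

Pooling is done on the log hazard ratio scale, where the estimates are approximately normal, and the result is exponentiated afterwards.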
The proportion of females in the "Probands' Spouses" group is higher than in the other groups (possibly because females have better survival than males so that female spouses have higher chances to be included in this group). The proportion of participants from US field centers is higher in the probands' generation reflecting the sample recruitment specifics of the study. There are differences in proportions of low educated participants and smokers between the probands' and offspring generations that reflect the cohort/time trends in education and smoking patterns in the contemporary populations. Medication use also differs between the generations and it is more prevalent in the older groups (the probands' generation). All differences between the groups for the characteristics described above are significant (p = 0.0003 for lipid-lowering medication use; p < 0.0001 for all other) except for anti-diabetic medication use (p = 0.07). The follow-up period since the baseline is relatively long in this study [e.g., the mean follow-up period for alive participants in LLFS is similar to the mean follow up in the Women's Health and Aging Study used in (2)]. As expected, the mean follow-up period is larger in the younger groups (the offspring generation) and the mean time until death is smaller in the older groups (the probands' generation). Also expectedly, the mean follow-up time for those who survived is larger than the mean time until death for those who died, in all groups (p < 0.0001 in all cases described above).

Descriptive Analyses
Participants from the older groups (the probands' generation) had higher prevalence of major diseases (cancer, CVD, diabetes, and AD or dementia; see section Data; p-values for differences between the groups: p < 0.0001 for cancer, CVD, and AD or dementia, p = 0.0003 for diabetes). However, differences in incidence of new cases of these diseases did not follow the uniform pattern. While the proportions of new cases of CVD and AD or dementia were higher in the probands' generation, the proportions of new cancer cases did not differ substantially between the groups and the proportions of new diabetes cases tended to be higher in the offspring generation. All differences between the groups were significant (p < 0.0001) except for cancer incidence (p = 0.5). Table 1 also presents descriptive statistics for the 19 biomarkers used in computations of D M . We note that the biomarkers selected for this study were those from Sebastiani et al. (19) which were found to change with age in that study. Accordingly, all these biomarkers were highly correlated with age (p < 0.0001) in our analysis and their mean values changed, respectively, in the older and younger groups (p < 0.0001). We note however, that these results are purely descriptive and do not explore how multiple factors (except age) may contribute to such differences (or the absence of those) between the groups. We take into account appropriate variables in the regression analyses presented in the next sections. Figure 1 shows violin plots for D M distribution in the total sample and by groups. As one would expect (considering the fact that the offspring generation is much younger than the probands' one), D M distributions differ substantially between the generation groups and participants from the probands' generation show a higher level of dysregulation (that is, larger D M ) compared to offspring. 
We also investigated whether the level of physiological dysregulation (D M values) at the baseline differentiates individuals according to their subsequent survival chances. Figure 2 displays the Kaplan-Meier estimates of the survival functions (conditional at age 80 years) for the strata defined by the quartiles of D M distribution (computed separately for females and males). The figure shows that females and males with the lowest level of dysregulation (i.e., the first quartile of D M ) have the best survival chances whereas those with the highest level of dysregulation (i.e., the fourth quartile of D M ) have the worst survival, and those from the middle quartiles are in between these two extremes. Figure S1 shows the same curves for the combined sample of females and males. We note that such figures may provide some simple empirical evidence about the relationship between D M and mortality; however, additional analyses are needed to take into account relevant factors (covariates) that can confound the observed association of D M with mortality. Such analyses are presented in the next section.
Table 2 shows the results of the Cox proportional hazards models fitted to D M measured at the baseline visit and follow-up information on mortality in LLFS participants (total sample and stratified analyses by generation and/or spouse status; see notes under the table). Analyses of the total LLFS sample showed that higher D M values are associated with higher mortality risk: hazard ratio (HR) per standard deviation (SD) of D M is 1.42 (95% confidence interval, CI: [1.30, 1.54]). Similar associations were observed in strata by generation and/or spouse status, with HRs per SD of D M ranging from 1.22 to 1.75 (however, the results were nonsignificant for spouses in the offspring generation, which had the smallest number of deaths among all strata). Table 3 compares the performance of different models in predictions of mortality during the follow-up in LLFS, for the total sample and in the strata (same as in Table 2).
The table shows the estimates of the areas under the receiver operating characteristic curves (AUC) for the reference model, which includes age and other covariates (see section Association of D M with Mortality and Predictive Performance Analyses) but does not include D M and DI, the estimates of AUC in the models with D M and/or DI (along with age and other covariates), and the differences between AUC (dAUC) in these models and the reference model. The analyses indicated that addition of D M and/or DI significantly improves the predictive performance of the models compared to the reference model in the total sample (p-values for the null hypothesis dAUC = 0 are 1.5 × 10^−5, 4.05 × 10^−3, and 9.19 × 10^−5, for the models with D M +DI, D M , and DI, respectively). Analyses in generation and spouse status strata revealed that the largest increase in AUC was observed for non-spouses from the probands' generation in the model including D M and DI (dAUC = 0.032, p = 2.87 × 10^−5). Similarly, for the model with D M , the largest increase in AUC was observed in the same stratum. The models with DI (based on 85 health-related deficits) and D M (based on 19 biomarkers) produced similar dAUCs in this case. Figure 3 displays the AUCs for all four models in this stratum. We also observed that in some cases (the offspring generation in the model with D M and spouses in each generation in all models) differences between AUCs in the reference model and in the models with D M and/or DI did not reach statistical significance.

Sensitivity Analyses
We ran different sensitivity analyses (see section Sensitivity Analyses) in the total sample, which confirmed the observations. In sum, our extensive sensitivity analyses confirmed that our conclusions are not sensitive to different aspects of the computation procedures.

DISCUSSION
In this paper, we confirmed that the composite measure of physiological dysregulation (D M ) is associated with mortality in LLFS (with larger D M associated with increased mortality risk), similar to other studies (2,7,9,12). We also showed that addition of D M significantly improves mortality predictions compared to the reference model (containing age, sex, and other relevant covariates) in the total LLFS sample. We also found that the largest improvement in predictive performance was observed for non-spouses from the probands' generation, that is, members of families selected for exceptional longevity, who have better survival than their age peers from the general population (14). The present work is the first study that explored the association of D M with mortality and its predictive performance in such a unique sample. As we showed in Yashin et al. (14), the improved predictions of lifespans based on applications of the deficit index (DI) (20) resulted in detection of additional signals in genome-wide association studies (GWAS) of longevity which were not observable in GWAS with actual ages at death of deceased individuals. Importantly, we showed that the benefits of using predicted vs. observed lifespan data in the GWAS of human longevity are most noticeable for shorter follow-up periods, which is the case for many contemporary studies collecting genetic data, including LLFS. As the results of the present work indicate, including D M in predictive models can provide further benefits for GWAS of human longevity. There are additional opportunities for improving the power of such studies if appropriate methods are used (31).
As other studies showed, D M is associated not only with mortality but also with other health- and aging-related outcomes (7,9,11,32). In particular, as discussed in our recent study (7), D M can be a promising indicator of declining robustness and resilience during aging, and may precede clinical manifestation of not just one but many diseases even in the absence of strong clinical diagnostic markers pointing to a specific pathology. Given that, D M could be an especially useful predictor of mortality among the elderly without major chronic diseases. In Arbeev et al. (12), we implemented D M in the framework of the stochastic process model (SPM) of aging (33), which allowed us to observe regularities in dynamic characteristics of trajectories of D M in relation to different aging-related characteristics such as decline in stress resistance and adaptive capacity, and to evaluate how such characteristics might be associated with an increase in mortality risk with age. The LLFS provides opportunities to perform similar analyses in a unique sample of individuals from families enriched for exceptional survival who not only have better survival (14) but also have better health and functioning (15) than a general population. Applications of SPM to analyses of D M in this unique sample and comparison with other studies can help reveal which particular aging-related characteristics differ in individuals with exceptional health and lifespan compared to average individuals and how these differences can propagate to the observed differences in morbidity and mortality risks. Applications of this model will also provide opportunities to take into account varying strength of association of biomarkers with mortality at different ages in construction of the composite measures.
In addition, the SPM versions developed for analyses of genetic data (34,35) can be applied to find genetic factors associated with various hidden aging-related mechanisms (e.g., decline in adaptive capacity and stress resistance, allostatic adaptation) which are not directly observed in the data but can be estimated by this model using longitudinal measurements of biomarkers and follow-up data on mortality or morbidity.
In addition to composite biomarkers such as D M, other approaches have been suggested in the literature to quantify biological aging, which can shed light on different aspects of the aging process (32,36). The upcoming collection of extensive omics information (whole genome sequencing, methylomics, transcriptomics, metabolomics, proteomics) for the LLFS participants should open new perspectives for comprehensive evaluation of potential biological mechanisms and pathways related to exceptional longevity and delayed aging in this unique sample. We also note that these future studies need to be accompanied by relevant methodological developments that take into account specifics of the data (e.g., informative missingness, multi-generational sample, longitudinal omics profiles) to generate valid statistical inferences.
This study has several limitations. We analyzed a unique sample (LLFS) that was selected for exceptional longevity (which was the goal of this study), and the LLFS participants also have better health and function in several domains compared to other cohorts (15). Therefore, the results are not generalizable to the general population. However, the association of D M with mortality has already been established in several other studies with health and survival patterns closer to those of the general population [e.g., (2,7,9,12)]. The sample analyzed in our study is predominantly white (Table 1). Thus, applications to studies with sizable samples of other racial and ethnic groups are necessary to confirm the results for such groups. As we do not have verified information on causes of death for deceased participants, we cannot exclude the possibility that some participants had non-natural causes of death (such as accidents or homicides) which are not related to D M. However, we note that most participants in our study are very old and these causes are not among the leading causes of death at such ages. In this study, we analyzed only one (baseline) measurement of D M. Even though the observed associations were still strong despite a relatively long follow-up period since baseline, future analyses of repeated measurements of biomarkers will allow investigating associations of dynamic characteristics of trajectories of cumulative biomarkers (such as D M) with mortality and exploring genetic underpinnings of such dynamics. This requires applying advanced statistical tools to appropriately handle methodological challenges in such analyses, and this is a subject of our ongoing research.
Although this study was not performed in a clinical setting with patients' data, there is potential for applications of D M in such settings, as we discussed in our prior research (7). Blood test results and other relevant measures are routinely collected from patients, and they can be used to construct a D M that can inform health practitioners about an underlying transition to an unhealthy state even in the absence of specific pathological values of individual biomarkers. Importantly, there is no "pre-defined" set of biomarkers that needs to be included in such a measure. Therefore, it can be constructed from available biomarkers (e.g., standard laboratory tests) without incurring additional costs for data collection. As we showed in sensitivity analyses, the associations with mortality and the improved predictive performance were observed for different subsets of biomarkers used in D M (some of those sets were parsimonious ones with just a few biomarkers). Even though the concept of a statistical distance measure computed from biomarkers has shown its usefulness in several applications, this is still an active area of research. In particular, how to specify such a distance while accounting for non-linear patterns of change of many biomarkers with age is a subject of our ongoing research.
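To illustrate how such a measure could be constructed from whatever biomarkers are available, the following is a minimal sketch, not the code used in this study. It computes the Mahalanobis distance of each individual's biomarker vector from the mean of a reference group, using the reference group's covariance matrix. The function name `mahalanobis_dm`, the synthetic data, and the absence of any biomarker transformation or standardization are assumptions made for illustration only.

```python
import numpy as np


def mahalanobis_dm(biomarkers, reference):
    """Mahalanobis distance of each row of `biomarkers` from the reference mean.

    biomarkers: array of shape (n_individuals, n_biomarkers)
    reference:  array of shape (n_reference, n_biomarkers), e.g. measurements
                from a reference group (hypothetical input for this sketch)
    """
    biomarkers = np.asarray(biomarkers, dtype=float)
    reference = np.asarray(reference, dtype=float)

    # Reference "norms": mean vector and sample covariance of the reference group.
    mu = reference.mean(axis=0)
    cov = np.atleast_2d(np.cov(reference, rowvar=False))

    # Pseudo-inverse keeps the sketch usable if biomarkers are collinear.
    cov_inv = np.linalg.pinv(cov)

    # Quadratic form diff^T * cov_inv * diff, evaluated row by row.
    diff = biomarkers - mu
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)

    # Guard against tiny negative values caused by floating-point error.
    return np.sqrt(np.maximum(d2, 0.0))


# Synthetic illustration only: 200 reference individuals, 5 biomarkers.
rng = np.random.default_rng(0)
reference_group = rng.normal(size=(200, 5))
new_individuals = rng.normal(loc=0.5, size=(10, 5))
dm_values = mahalanobis_dm(new_individuals, reference_group)
```

Because the function takes any set of columns, the same call works for a full panel or for a parsimonious subset of a few biomarkers, provided that the same columns are used for the individuals and for the reference group.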

DATA AVAILABILITY STATEMENT
The Long Life Family Study (LLFS) data used in this study were provided by the LLFS Data Management and Coordinating Center (DMCC), Washington University, St. Louis: https://wustl.edu/. All questions regarding access to the LLFS data should be addressed to Professor Michael Province: mprovince@wustl.edu.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Duke Health IRB. The patients/participants provided their written informed consent to participate in this study.