Age-Related Decline of Speech Perception

Hearing loss is one of the most common disorders worldwide. It affects communicative abilities in all age groups. However, it is well known that elderly people suffer more frequently from hearing loss. Two different model approaches were employed: A generalised linear model and a random forest regression model were used to quantify the relationship between pure-tone hearing loss, age, and speech perception. Both models were applied to a large clinical data set of 19,801 ears, covering all degrees of hearing loss. They allow the estimation of age-related decline in speech recognition for different types of audiograms. Our results show that speech scores depend on the specific type of hearing loss and life decade. We found age effects for all degrees of hearing loss. A deterioration in speech recognition of up to 25 percentage points across the whole life span was observed for constant pure-tone thresholds. The largest decrease was 10 percentage points per life decade. This age-related decline in speech recognition cannot be explained by elevated hearing thresholds as measured by pure-tone audiometry.


INTRODUCTION
More than 5% of the world's population, approximately 460 million people, suffer from disabling hearing loss (WHO, 2021). Hearing disability is associated with reduced speech perception and, in consequence, reduced communication function. Hearing deteriorates with age (Zwaardemaker, 1891; Technical Committee ISO/TC 43 Acoustics, 2017). The ISO standard describes the age-dependent, frequency-specific loss (ISO 7029:2017). The slope of the decline increases with age and frequency: while for 250 Hz the decline is of the order of 1 dB per decade in the fourth life decade, about 20 dB per decade can be observed for 6,000 Hz in the eighth life decade. While the ISO standard provides detailed information about the relationship between age and pure-tone sensitivity loss (PTSL), it makes no reference to speech recognition.
Given our ageing society and the prevalence of age-related hearing loss (ARHL), it is clear that hearing loss is a common public-health issue of increasing importance in the near future (WHO, 2021). Individuals with ARHL experience social withdrawal (Pronk et al., 2011), mental and physical decline (Shukla et al., 2020), and poorer quality of life (Davis et al., 2007).
Speech perception deficits in hearing-impaired people are mainly attributable to decreased audibility of the speech signal over part or all of the speech frequency range. Within Carhart's (1951) framework for word recognition in quiet, this was referred to as loss of acuity. Additionally, Carhart introduced a second component which stems from impaired processing of the audible speech signal, resulting in a loss of clarity. Plomp (1978) referred to these components of hearing loss as attenuation (class A) and distortion (class D), respectively. The attenuation component can be assessed by pure-tone audiometry. The distortion component describes the impact of reduced temporal and frequency resolution. It is thought that the distortion component explains the deterioration of speech recognition that is not accounted for by attenuation, i.e., by pure-tone thresholds. Both attenuation and distortion are part of ARHL (van Rooij et al., 1989).
A large number of studies have focussed upon hearing in the elderly and have investigated PTSL and speech perception. However, the interpretation of these results remains challenging, as pure-tone thresholds change substantially with increasing age. Hence, it is necessary to correct for the effect of PTSL when investigating the effect of age on speech perception. One of the first attempts to do this was described by Jerger (1973) in a report on speech recognition in a large group of older subjects. He analysed scores from the clinical records of 2,162 patients. With subjects grouped according to age and average hearing loss at 0.5, 1 and 2 kHz, results suggested that speech recognition, defined as the maximum score (WRS max) obtained by using a monosyllabic word list, declines above the age of sixty. In particular, he found that age had an effect on speech recognition of approximately 4% per life decade for individuals with mild hearing loss, but that it had a greater effect (e.g., 10% per decade) upon those with higher degrees of hearing loss. Unfortunately, he did not report on hearing loss at higher frequencies. It has long been known that hearing thresholds at these higher frequencies are particularly poor in older subjects (Zwaardemaker, 1891; Technical Committee ISO/TC 43 Acoustics, 2017).
Several studies have revealed that deterioration in speech understanding occurs in addition to deterioration in hearing sensitivity and includes components beyond elevated hearing thresholds (Bergman et al., 1976; Jerger and Hayes, 1977; Marshall and Bacon, 1981; Pedersen et al., 1991; Divenyi and Haupt, 1997; Kronlachner et al., 2018).
Some authors (Dubno et al., 1997; Humes, 2007) have highlighted the challenge of separating varying auditory thresholds from age, a factor affecting all sensory modalities (Humes and Young, 2016). In recent studies, speech recognition and its relation to age were investigated either by correcting for PTSL (Hoppe et al., 2014; Müller et al., 2016) or by using a longitudinal study design (Dubno et al., 2008). In a clinical population, Hoppe et al. (2014) investigated speech recognition with hearing aids and WRS max for different age groups in relation to average hearing loss at 0.5, 1, 2, and 4 kHz (4FPTA). They found a monotonic decrease in speech recognition with increasing age and a significant drop of about 2-4% per decade. This drop was attributed to age-dependent distortion. Müller et al. (2016) likewise investigated WRS max as a function of age. After correcting for 4FPTA they found a significant, though smaller, drop of about 2-3% per decade for people aged above 70 years. Neither study included hearing thresholds beyond 4 kHz, and therefore a small overestimation of the influence of age cannot be excluded. However, Dubno et al. (2008) found a larger effect, around 7-8% per life decade. They performed a longitudinal study including 256 subjects with age-related hearing loss, aged 50-82 years, over a period of 3-15 years. The speech recognition scores were corrected for changes in hearing thresholds during the observation phase; this was done by using the individuals' articulation index as an importance-weighted metric for speech audibility. Unfortunately, longitudinal studies suffer from other disadvantages relating to population size, loss to follow-up etc., and their duration can approach the limits of the clinician's working life span. The special characteristics of the study population and methods (neither the WRS max nor hearing-aid scores were measured) differ from the studies mentioned above. This impedes a direct comparison with the above-mentioned studies, so the differing results do not imply a contradiction amongst them.
In summary, increased PTSL is the most common expression of ARHL. However, there is evidence that a number of other auditory functions are affected as well (Profant et al., 2019). These functions decline with increasing age and the PTSL does not predict speech recognition sufficiently well.
The goal of this study is to describe the relationship between hearing loss, age, and speech recognition by means of a machine-learning algorithm (Random Forest Regression, RFR; Breiman, 2001). RFR is an algorithm that uses an ensemble method of decision-tree-based regressions to determine a response from a set of input variables. It does not rely on any particular assumptions regarding data distribution. This algorithm is applied to a large data set from routine clinical audiometry in order to investigate the influence of age. The result is a representation of the relationship between pure-tone thresholds and age on the input side and speech recognition on the target side. The model reflects the influence of the age-related distortion component on speech perception.
Additionally, the results of the RFR model will be compared with those of a generalised linear model (GLM) approach. In contrast to the RFR, the GLM requires assumptions about the qualitative relation between input and target variables, whereas the RFR does not need a pre-defined equation framework.
In order to categorise pure-tone thresholds, standard audiograms as proposed by Bisgaard et al. (2010) are used as model input. Both derived models (the RFR and GLM) will be applied to these standard audiograms.

MATERIALS AND METHODS
Audiometric data were retrieved from a clinical data base at the Audiological Department of Erlangen University Hospital. From the routine audiometric measurements, pure-tone thresholds for both bone and air conduction were extracted. Additionally, speech recognition scores for monosyllabic word lists of 20 items for each presentation level of the Freiburg Test (Hahlbrock, 1957) were evaluated. The complete discrimination function, ranging from 65 dB SPL up to 120 dB SPL was measured. All measurements had been conducted in clinical routine in sound-shielded booths with clinical class A audiometers (AT900 / AT1000 AURITEC Medizindiagnostische Systeme GmbH, Hamburg, Germany). Approval for this study was received from the Institutional Review Board of the University of Erlangen (Ref. No. 162_17 Bc). All methods were carried out in accordance with relevant guidelines and regulations.

Data Preparation
Among 91,991 patients who underwent audiometry at our centre from 2002 to 2020 we identified 53,782 adults aged at least 18 years at the time of first investigation. Initially, the data were screened for repeated measurements. Only the first audiometric assessment of each patient was retained. Subsequently, the data from 107,564 ears (hereinafter "cases") were checked for a complete set of air and bone conduction thresholds. After removal of incomplete data sets there remained 107,010 cases. In the next step, cases with missing or incomplete speech audiometry data were deleted, whereafter 26,324 cases remained. The data were then screened for cases of mixed hearing loss; the latter was defined as a difference between air and bone conduction thresholds greater than 10 dB for frequencies within the range 0.5-3 kHz. After removal of mixed-hearing-loss cases, the remaining 19,929 cases were checked for inconsistent results (<1%) caused e.g., by simulation or lack of collaboration on the part of the patient. If, within the discrimination function for monosyllabic words, a score larger than zero was observed while the presentation level was below the hearing threshold, the data set for that case was not used. For some cases it was observed that the measurement of the discrimination function had not been fully completed, so that a score of 100% was not reached, with the presentation level well (>15 dB) below the discomfort level. Those cases were removed as well. The 19,801 cases (19,801 ears of 12,040 patients) finally remaining were used for model-building and for error analysis.
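Two of the screening steps described above, retaining only the first assessment per ear and excluding mixed hearing loss, can be sketched as follows. This is a minimal illustration and not the code used in the study: the record layout (`patient_id`, `ear`, `date`) and the function names are hypothetical, and the sketch assumes that a single frequency within 0.5-3 kHz with an air-bone gap above 10 dB is sufficient to flag a case.

```python
def has_mixed_hearing_loss(air, bone, max_gap_db=10, lo_khz=0.5, hi_khz=3.0):
    """Flag a case whose air-bone gap exceeds max_gap_db within lo_khz..hi_khz.

    air, bone: dicts mapping test frequency in kHz to threshold in dB HL.
    Assumption: one offending frequency is enough to flag the case.
    """
    return any(
        air[f] - bone[f] > max_gap_db
        for f in air
        if lo_khz <= f <= hi_khz and f in bone
    )


def keep_first_assessment(records):
    """Keep the earliest record for each (patient_id, ear) pair.

    records: list of dicts with hypothetical keys 'patient_id', 'ear', 'date'
    (ISO date strings sort chronologically).
    """
    first = {}
    for rec in sorted(records, key=lambda r: r["date"]):
        first.setdefault((rec["patient_id"], rec["ear"]), rec)
    return list(first.values())
```

The remaining criteria (scores above zero below threshold, incomplete discrimination functions) depend on details of the level calibration and are therefore not sketched here.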
WRS 65 describes speech perception at a typical conversational level. While WRS 65 is primarily dependent on attenuation and reflects the loss of speech perception ability in everyday life, WRS max describes the maximum information that can be processed by the auditory system. The difference WRS max - WRS 65 can be used to estimate the acceptance of acoustic amplification (Halpin and Rauch, 2012).
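As an illustration, the three target variables can be read off a measured discrimination function as sketched below. The function name and the data layout are hypothetical. The sketch also assumes that, where the maximum score is reached at several levels, L max is taken as the lowest of them; the text does not specify this.

```python
def summarise_discrimination_function(scores_by_level):
    """Derive WRS 65, WRS max and L max from one discrimination function.

    scores_by_level: dict mapping presentation level (dB SPL) to the
    monosyllabic word score (%) measured at that level.
    Assumption: ties for the maximum score resolve to the lowest level.
    """
    wrs_65 = scores_by_level.get(65)  # None if 65 dB SPL was not measured
    wrs_max = max(scores_by_level.values())
    l_max = min(lvl for lvl, score in scores_by_level.items() if score == wrs_max)
    return wrs_65, wrs_max, l_max


# Hypothetical example values, not taken from the study data.
wrs_65, wrs_max, l_max = summarise_discrimination_function(
    {65: 40, 80: 75, 95: 90, 110: 90}
)
headroom = wrs_max - wrs_65  # the difference WRS max - WRS 65
```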

Model Setup
For data analysis, model calculation, statistics and figures, the software Matlab R2019B including the Statistics and Machine Learning Toolbox V11.6 (The Mathworks Inc., Natick, Massachusetts) was used. Data were rounded before the RFR model calculation: hearing thresholds to 5 dB and the patients' ages to life decades. Two models (GLM and RFR) were used to describe the relationship between age and PTSL as input variables and speech recognition variables (WRS 65 , WRS max and L max ) as target variables. Equation 1 describes the applied GLM for the target variables WRS 65 and WRS max . Equation 2 describes the GLM for L max . In these equations, PTSL i refers to the air-conduction hearing thresholds at the test frequencies 125 Hz to 8 kHz as mentioned above. In order to represent correctly the overall data distribution according to age and 4FPTA, a stratified fivefold cross-validation was applied. In detail, both models, the RFR and GLM, were trained with 80% of the data (training group). The models were then tested in the remaining 20% of the study population (test group). Before group assignment, the data sets were sorted according to 4FPTA and age. Subsequently, every fifth data set was assigned to the test group. This procedure was repeated five times with disjoint test sets. The pure-tone thresholds at all frequencies and the patients' age were input variables, while the WRS 65 , WRS max and L max were targets. For each of the three output variables a separate model was built. As a parameter for optimisation and for estimating the RFR performance, the median absolute error (MAE, resulting from measured minus predicted score) was used as cost function for both the training group and the test group. The MAE of the test group varied by up to 25% for different parameters.
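The stratified fivefold assignment and the error measure can be sketched as follows. This is an illustrative Python sketch rather than the Matlab code used for the analysis. The names are hypothetical, and the sketch assumes that the cases are sorted by 4FPTA first and by age second.

```python
from statistics import median

FOUR_FREQS_HZ = (500, 1000, 2000, 4000)


def four_fpta(thresholds):
    """Average air-conduction threshold (dB HL) at 0.5, 1, 2 and 4 kHz."""
    return sum(thresholds[f] for f in FOUR_FREQS_HZ) / len(FOUR_FREQS_HZ)


def stratified_folds(cases, k=5):
    """Return k (train, test) pairs; every k-th sorted case forms a test set.

    cases: list of dicts with hypothetical keys 'thresholds' and 'age'.
    Assumption: sort by 4FPTA first, then by age.
    """
    ordered = sorted(cases, key=lambda c: (four_fpta(c["thresholds"]), c["age"]))
    folds = []
    for offset in range(k):
        test = ordered[offset::k]
        train = [c for i, c in enumerate(ordered) if i % k != offset]
        folds.append((train, test))
    return folds


def median_absolute_error(measured, predicted):
    """Median of |measured - predicted| over all cases."""
    return median(abs(m - p) for m, p in zip(measured, predicted))
```

Because the cases are sorted before every fifth one is drawn, each test set spans the full range of 4FPTA and age, and the five test sets are disjoint by construction.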
For a large range (50-1,000) of the number of learning cycles (equivalent to number of decision trees) the resulting MAE varied by less than 10%. Finally, a value of 100 for the number of learning cycles was used. A small effect on the MAE was found for the other parameters as well. In summary, the following values were used for the Matlab function "fitrensemble()": "MergeLeaves" = off, the decision tree does not merge leaves. "MinLeafSize" = 5, the minimum number of observations per leaf. "MinParentSize" = 10, the minimum number of observations per branch node. "NumVariablesToSample" = square root of the number of predictors for classification. "PredictorSelection" = allsplits, selects the split predictor that maximises the split-criterion gain over all possible splits of all predictors. The number of nodes per binary decision tree, one result of the model calculation, varied for each model: around 2,150 for WRS max , around 2,700 for WRS 65 , and around 3,650 for L max .
The RFR and GLM were applied to Bisgaard standard audiograms. These standard audiograms are well established and widely used for audiological investigations (e.g., Tu et al., 2021;van Beurden et al., 2021). They are based on a large clinical data base. The standard set comprises ten standard audiograms (see Figure 1) covering a frequency range of 250 Hz to 6,000 Hz. Flat and moderately sloping (N 1 -N 7 ) and steep (S 1 -S 3 ) audiograms are considered. Higher indices correspond to greater PTSL.

RESULTS
Figures 2, 3 depict the basic characteristics of the clinical population investigated. The stacked bar plot (Figure 2) shows the case distribution in our clinical population (N = 19,801). The mean ages of the different groups were 50, 61, 66, 65, and 59 years for WHO 0 , WHO 1 , WHO 2 , WHO 3 and WHO 4 . The vast majority (77%) of cases involved persons between 40 and 80 years of age. The subjects aged 40-80 years dominated all WHO grades except WHO 0 . The smallest data coverage with respect to age and hearing loss was observed for very young adults in the WHO 4 group and for subjects above 80 years of age in the WHO 0 group.
The speech audiometric results for the model's target scores, WRS 65 and WRS max , are shown in Figures 3A,B, respectively. For both measures the median decreased with increasing degrees of hearing loss. The Kruskal-Wallis test yielded significant group effects for WRS 65 (χ² = 15.055, p < 10⁻¹⁵, df = 4) and WRS max (χ² = 11.873, p < 10⁻¹⁵, df = 4). The interquartile ranges for WRS 65 were 5, 25, 50, 0, and 0% for WHO 0 , WHO 1 , WHO 2 , WHO 3 , and WHO 4 , respectively. The interquartile ranges for WRS max were 0, 0, 25, 40, and 30% for the corresponding WHO groups. The variability for WRS 65 was largest for WHO 1 , while for WRS max the largest variability was found for WHO 3 . In this rather rough classification the interpretation of some outliers may benefit from additional information about the specific configuration of hearing loss. In particular, the WHO classification employs the hearing thresholds at only four frequencies, while other frequencies are not considered. The lowest quartile of the WHO 0 cases shows a WRS 65 lower than 95%. In this subgroup the mean threshold for high frequencies (>4 kHz) was 48 dB HL, while for the WHO 0 cases with WRS 65 above 95% it was 25 dB HL.

GLM and RFR
Tables 1-3 show the derived GLM parameters β for each target variable including statistical parameters. For the word recognition scores, WRS 65 and WRS max , the lowest frequency (125 Hz) did not contribute significantly to the model output. None of the other frequencies provided a consistent picture. For L max all but one frequency (750 Hz) contributed significantly to the target variable. For the subject's age the GLM revealed a significant effect on all target variables. For comparison, the permutation feature importance of the RFR is added in the right-hand column of Tables 1-3. Larger values for a feature indicate a greater impact on the target variable. Table 4 summarises the performance of the model as assessed by MAE for both the training and the test group by means of fivefold cross-validation. The results are given separately for the GLM and the RFR model. Owing to the composition of our study population the WHO 0 is by far the largest group. The MAE of this group would have dominated the overall summary. For this reason, Table 4 shows the error estimation for each grade of hearing loss separately. Evidently, there was a great variation of the MAE among the WHO groups. With the RFR the largest errors were observed in WHO 2 for the WRS 65 group and in WHO 3 and WHO 4 for WRS max . For those WHO groups the MAE of the training and test groups differed by a factor of 1.5 to 1.7. Unlike the RFR, the GLM yielded comparable MAE for the training and test groups.
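The idea behind permutation feature importance can be illustrated with a generic sketch: the values of one input variable are shuffled across cases, and the resulting increase in prediction error is taken as that variable's importance. The Python sketch below is model-agnostic and uses the median absolute error as the error measure. It is not the toolbox routine behind Tables 1-3, which may differ in detail (for instance in the error measure or the use of out-of-bag samples), and all names are hypothetical.

```python
import random
from statistics import median


def median_absolute_error(measured, predicted):
    """Median of |measured - predicted| over all cases."""
    return median(abs(m - p) for m, p in zip(measured, predicted))


def permutation_importance(predict, rows, targets, n_repeats=10, seed=0):
    """Mean increase in error after shuffling each input column in turn.

    predict: callable mapping one row (list of inputs) to a predicted score.
    rows:    list of rows, each a list of input values (thresholds, age, ...).
    targets: measured scores, one per row.
    """
    rng = random.Random(seed)
    baseline = median_absolute_error(targets, [predict(r) for r in rows])
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # break the link between column j and the target
            permuted = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            error = median_absolute_error(targets, [predict(r) for r in permuted])
            increases.append(error - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances
```

An input that the model ignores yields an importance of zero, since shuffling it leaves every prediction unchanged.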

Application of the Model
One possible application of the model is shown in Figure 4. The model input was one of the standard audiograms (N 1 -N 7 , S 1 -S 3 ) and the subjects' age was varied between 18 and 99 years. Owing to the relation between age and hearing thresholds, our population contained hardly any subjects aged > 85 years with N 1 or S 1 audiograms. Therefore, this range was excluded from model calculations. Figures 4A,D show that both models indicate a decrease in WRS 65 with increasing age of up to 20 percentage points across the whole life span. The GLM suggests a rather constant decline of speech recognition over the life span. The RFR, on the other hand, yields specific periods with different amounts of age-dependent decline. The largest decrease was observed for N 3 in the fifth life decade with 10 percentage points per decade. The RFR results become even more complex if the WRS max and L max are considered, as shown in Figures 4B,C, respectively. For all audiogram types except N 6 , the presentation level for WRS max increases with age. A considerable decrease in score can be observed in N 6 , accompanied by a slight but significant decrease of L max . For the N 4 and S 3 types the RFR model gives a significant decrease in WRS max which is somewhat offset by an increased presentation level L max for these types. For all other types the WRS max does not change with age. However, for these types the RFR model results in an increased presentation level. In comparison, the GLM output indicates a decline for WRS max over age while L max increases for all audiogram types. For both models a decrease of up to 25 percentage points across the whole life span was observed.
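This application, holding the audiogram fixed while varying age, can be sketched as follows. The sketch is illustrative only: `toy_predict` is a hypothetical stand-in for a trained model, its coefficients are invented, and the flat 20 dB HL audiogram is not one of the Bisgaard standard audiograms.

```python
def age_sweep(predict, audiogram, ages):
    """Predicted score for one fixed audiogram at each of the given ages."""
    return [predict(audiogram, age) for age in ages]


def change_per_decade(ages, scores):
    """Score change between consecutive ages, scaled to a 10-year step."""
    pairs = list(zip(ages, scores))
    return [
        10 * (s1 - s0) / (a1 - a0)
        for (a0, s0), (a1, s1) in zip(pairs, pairs[1:])
    ]


def toy_predict(audiogram, age):
    """Hypothetical stand-in for a trained model (invented coefficients)."""
    mean_threshold = sum(audiogram.values()) / len(audiogram)
    return max(0.0, 100.0 - 0.5 * mean_threshold - 0.2 * (age - 20))


# Hypothetical flat audiogram (Hz -> dB HL), held constant across ages.
flat_audiogram = {250: 20, 500: 20, 1000: 20, 2000: 20, 4000: 20, 6000: 20}
ages = [20, 30, 40]
scores = age_sweep(toy_predict, flat_audiogram, ages)
rates = change_per_decade(ages, scores)
```

With a trained model in place of `toy_predict`, the per-decade rates show in which life decades the predicted decline is concentrated for a given audiogram type.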

DISCUSSION
The analysis of a large clinical database allows the description of the age-related decline of speech perception in detail. In comparison with previous studies, more detailed information about the time course and amount of degradation was achieved by means of RFR. Both models, the GLM and the RFR, describe an age-related decline in speech recognition after being corrected for PTSL. The GLM is based on predefined hypotheses and confirms significant age effects. Inevitably, the relationship between age and speech scores follows the underlying functional relations. The GLM results in an age-related decline for WRS max of about 3-4% per decade for N 4 -N 6 , and S 3 . For all other audiogram types smaller effects were found owing to saturation effects. This is in concordance with previous studies (Jerger, 1973; Dubno et al., 2008). WRS 65 decreases at a rate of up to 2.5% per decade for mild hearing losses, i.e., N 2/3 and S 2 .
For the other audiogram types the GLM yielded smaller rates of decline. Owing to the lower presentation level of 65 dB SPL, floor effects were observed even for moderate hearing losses, i.e., N 4−7 . The RFR model yielded more specific information about the time course and rate of decline. Additionally, the RFR model allows the quantitative description of the two basic effects of hearing loss and its relation to age: on the one hand the impact of the attenuation component of ARHL, and on the other hand the impact of the distortion component of ARHL. This could be achieved by keeping constant the model input variables representing PTSL (attenuation), and by modifying the model input variable representing age. It therefore offers the opportunity to overcome a bias that was inherent in previous investigations (Jerger, 1973; Marshall and Bacon, 1981; Dubno et al., 1997; Hoppe et al., 2014; Müller et al., 2016) by isolating age-related hearing threshold elevation from age-related decline in speech recognition as such. This study should not be misunderstood as an attempt to predict speech recognition scores on the basis of PTSL. These scores have to be measured individually. The large variability of individual scores necessitates speech audiometry. The purpose of the model in this study was to analyse the impact of age for larger patient populations with respect to specific audiogram types. It can be seen in Figure 4 that those age-related changes are present for the entire duration of adulthood. However, apart from the fact that higher age relates to lower speech recognition scores, no common quantitative trend, for any age groups or PTSL, can be discerned. This may be regarded as the major outcome of the RFR model calculations. The measurable age-related decline in speech recognition depends on the age range considered, the specific audiogram, and the specific application of speech audiometry.
Owing to saturation effects of the WRS 65 measured at typical conversation level, we observed the largest age effect for moderate hearing losses (N 3 -type audiograms). For the WRS max measured at substantially higher levels, the largest effects were observed for audiogram types corresponding to severe hearing losses (N 4 , N 5 , N 6 ). This result of the RFR is in agreement with the findings of Jerger (1973). Even though the variability in his data is considerable (as in our data), one may conclude that a stronger age-related decline can be observed for later life decades and greater hearing loss. Additionally, Jerger's data indicated that the onset of age-related decline may already occur at a younger age. This is in line with our results: for N 6 , for example, the RFR model yielded the strongest decline in WRS max , of 20% per decade, around the fifth and sixth life decades.
According to the RFR, the decrease in the WRS max was counterbalanced by an increased presentation level for all audiogram types except N 6 . The N 6 -type audiogram showed the largest age-related decline in speech recognition. The decreased tolerance of higher presentation levels may have contributed to this decline. This might reflect certain underlying pathomechanisms that are more likely to be present in patients with this audiogram type compared with others. Complementary to attenuation and distortion, a causal and more differentiated breakdown with respect to presbyacusis was proposed early on. Finally, five main types were proposed, namely sensory, neural, metabolic, mechanical, and vascular presbyacusis (Schuknecht, 1964; Johnsson and Hawkins, 1972). This was complemented by the term central presbyacusis in order to reserve the term neural for degeneration of the cochlear nerve. Sensory presbyacusis is congruent with the attenuation component and is, as pointed out above, represented by the audiogram type as a fixed parameter in Figure 4. The effects of all the other types of presbyacusis are included in the specific relationships between age and the WRSs and L max . Moreover, the specific and different root causes may potentially explain why, for some degrees of hearing loss, different changes in speech perception occur in different life decades. However, possible interactions between, or even independent mechanisms of, the main types of presbyacusis are still not completely understood (Bao and Ohlemiller, 2010; Profant et al., 2019).
It is not possible to confirm all these explanatory hypotheses by retrospective data analyses, a fact that clearly underlines the limits of our study design. We found differences in age effects in comparison with some of the studies referred to above. This is partly due to the neglect of hearing loss at higher frequencies for the elderly in those studies. On the other hand, for some hearing losses and audiogram types, this study may underestimate age effects, as ceiling effects of speech tests in quiet are included. Another aspect of this study is the inclusion of a considerable number of subjects with mild hearing loss, as seen in group S 1 . Even in that group, age effects play a part. Especially the WRS 65 illustrates how everyday communicative ability in quiet might be already affected by mild to moderate hearing loss in a population in which the use of hearing aids does not reach the penetration level needed (Halpin and Rauch, 2012).
Other possible applications of the RFR model are related to acoustic amplification with hearing aids: As shown in Figure 4, in all groups except N 6 , the level for best speech recognition (L max ) increases with age at about 0.5 dB per decade. This may indicate that older people may benefit from larger sound pressure levels for speech recognition, i.e., greater amplification, when provided with a hearing aid. As far as we know, current amplification strategies do not take this into account. On the other hand, one has to consider that in some pathologies more amplification might be detrimental rather than beneficial (Halpin and Rauch, 2012).
The age dependence of the WRS max found in our study may be used to improve studies evaluating the outcome of hearing aid use: the WRS max or an equivalent measure is often used as reference for the measurement of successful hearing aid provision or other acoustic amplification (Halpin and Rauch, 2012; Hoppe et al., 2014; Maier et al., 2018a,b), for investigation of age-related changes in cognition (Kronlachner et al., 2018), and for speech-perception-related studies in general (Müller J. et al., 2017). A consideration of both age and specific audiogram type could potentially decrease the variability of results. Furthermore, the functional relation between audiogram types and speech perception as presented here can be used to link epidemiological studies on hearing loss (Sohn and Jörgenshaus, 2001; von Gablenz et al., 2017, 2020; Chang et al., 2019; Löhler et al., 2019; Cantuaria et al., 2021) with speech recognition.

Comparison of the Two Model Approaches
The need for pre-defined hypotheses may be considered a weakness of the GLM, as all model results inevitably follow the underlying analytical equations. If an effect for certain audiogram types is found, the GLM yields a smooth decline over all life decades. The RFR is able to take varying rates of decline in different life decades into account if variation indeed takes place in the study population. Overall, as shown in Table 4, for most of the WHO groups the RFR yielded smaller MAE for the test groups compared with the MAE yielded by the GLM. However, the differences obtained between MAE in the training and test groups by RFR indicate some degree of overfitting. This was not the case for the GLM.
The impact of audiometric test frequencies on the calculated WRS is different for the two model approaches. The GLM is less suitable to reflect the impact of low and high frequency hearing loss for all WHO groups. In cases with mild hearing loss higher frequencies have a greater impact: Typically, the low frequencies show low variability and fail to explain the variability in the scores. Vice versa, for cases with severe hearing loss the PTSL for high frequencies are already near or at the audiometer limits. Consequently, the GLM explains the variability in the scores by utilising PTSL in the low-frequency range. As a result for all WHO groups, the GLM suggests that there is no effect of the highest and lowest test frequencies (Tables 1, 2). Some other findings, such as the absence of an effect at 750 Hz on the WRS max in Table 2, can be considered as typical signs of an overdetermined system. The measurement at 750 Hz does not provide any additional information compared with the adjacent frequencies and vice versa. A priori, there is no audiological rationale for removing single test frequencies.

Limitations of the Study
An important limitation of this study is the restriction to a specific language and test. However, with respect to other languages and speech material the comparison of recent studies (Holden et al., 2013;Hoppe et al., 2019) suggests that the test we used is comparable to the English Consonant-Vowel-Nucleus-Consonant (CNC) test (Causey et al., 1984).
Secondly, the outdated but established calibration procedure for the Freiburg monosyllable test at 65 dB SPL (Holube et al., 2019) is roughly comparable to a level of 60 dB A . Consequently, L max should be corrected by about 5 dB for a comparison e.g., with CNC results.
A disadvantage of binary decision trees is the high chance of overfitting. The use of a random-forest method decreases this risk. However, a factor of up to 1.7 between the MAEs in the test group as compared with the training group still indicates some degree of overfitting. Even the considerable size of the study population and the clustering of input variables do not entirely prevent this risk. Additionally, there are some intrinsic sources of unexplained variability. Even after thorough data-cleaning as described above, the population may still have included mild cases of aggravation, simulation or dissimulation. There was also a small number of cases with retrocochlear lesions. This number can be estimated as less than 0.5% in our population by comparison with our patient files and the reported incidence (Lin et al., 2005). The unilateral processing of the data without the contralateral status as an additional input variable is a potential shortcoming and should therefore be the subject of future studies.
An RFR model inevitably reflects the characteristics of the clinical population that contributed to the training. The group characteristics differ from those of their peers outside a clinic. Finally, the model reflects the statistical characteristics of a population, and not causal relationships.

CONCLUSION
A random-forest regression model allowed the estimation of age-related decline of speech recognition in quiet, separated from the effect of pure-tone sensitivity loss. Noticeable declines were found across the whole duration of adulthood and for all audiogram types. Model calculations resulted in a decrease of up to 25 percentage points in word recognition scores across the whole life span. Depending on the specific hearing loss, the RFR model indicated a maximum decline of up to 10 percentage points in certain life decades. The decline can be attributed to an increased distortion component related to presbyacusis which is not represented by pure-tone audiometry. The careful derivation of working hypotheses from our data has the potential to provide greater insight into the relationships between pure-tone sensitivity loss, specific audiogram types and age.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Ethik-Kommission, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU). The patients/participants provided their written informed consent to participate in this study.